Tuesday, February 26, 2008

PWR Flaws

College Hockey News released its first Bracket ABCs article, following the ups and downs of the Pairwise Rankings.

First off, let me say that I'm a huge fan of this system for selecting the NCAA tournament field. I think the transparency with which the field is selected is great for college hockey. I do have issues with a few flaws in the system, though. Last year, I mentioned the problems with the TUC Cliff between the 25th and 26th teams in the RPI rankings. This year, they added a new twist to the PWR, which has created a new flaw.

Starting this year, a team has to play at least 10 games against Teams Under Consideration in order for that part of a comparison to count. I think the reasoning behind the rule was to make sure smaller conference teams didn't sneak by with one or two TUC wins, and win every comparison with a perfect winning percentage. It looks as though the exact opposite is happening, as illustrated by CHN's analysis of Princeton:
Finally, we reach poor Princeton. Yes, it sure looked magical for the Tigers after last weekend's sweep of Cornell and Colgate. You looked at the Pairwise and — WOW! — the Tigers were 12th. Princeton's only made the NCAAs once, winning the 1998 ECAC tournament. But then, you look at the Pairwise on Monday, and Princeton is down at No. 16, tied with Boston University. After not having played a game. Princeton got screwed, you say? No — Princeton got TUCed.

You see, after Saturday, Princeton was 2-7 against Teams Under Consideration, meaning that component was not yet counting in any of Princeton's comparisons with other teams. That was a good thing — for Princeton. Clearly, though, once the Tigers played another game against a TUC, things were going to change — because Princeton was barely winning three comparisons to WCHA teams that it would instantly lose once the Record vs. TUC kicked in.

Well, Princeton didn't have to wait until Friday for this to happen. Thanks to the Pairwise quirks, after Sunday's games, Cornell became a TUC. With that, Princeton's record against TUC — adding in two WINS against Cornell this season — improved the Tigers' Record vs. TUC to 4-7, which kicked that component into play, and made them LOSE the comparisons with three WCHA teams. You dizzy yet? Ahhh, the fun has only just begun.
I realize that the immediate comeback to this is: "But the PWR is only supposed to be used on the last day of the season! Princeton will probably have more than 10 games against TUCs by then!" Maybe so. But what if they didn't? It's not out of the realm of possibility that a team could go 0-9 against TUCs during the season, but not suffer any consequences because they don't have that tenth game.

One solution to this problem that I've heard before, and I think would work, is that if a team doesn't reach 10 games against TUCs, every game short of 10 counts as a loss. For example, if Princeton was 2-7-0 against TUCs, it would be counted as 2-8-0. It would work the same way if a team did well against TUCs, but couldn't schedule enough. If a team was 8-0-0 against TUCs, it would be counted as 8-2-0 and they'd probably still win that category on most comparisons.
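To make the arithmetic concrete, here is a rough Python sketch of that padding idea. The function names, the 10-game default, and the treatment of a tie as half a win are my own assumptions for illustration; this is not how the actual PWR is computed.

```python
def padded_tuc_record(wins, losses, ties=0, min_games=10):
    """Pad a record vs. TUCs with losses until it reaches min_games.

    Hypothetical helper: every game short of min_games counts as a loss.
    """
    played = wins + losses + ties
    shortfall = max(0, min_games - played)
    return wins, losses + shortfall, ties


def win_pct(wins, losses, ties=0):
    """Winning percentage, counting a tie as half a win (an assumption)."""
    games = wins + losses + ties
    return (wins + 0.5 * ties) / games if games else 0.0


# The two examples from the post, plus a team already past 10 games.
print(padded_tuc_record(2, 7, 0))   # 2-7-0 is counted as 2-8-0
print(padded_tuc_record(8, 0, 0))   # 8-0-0 is counted as 8-2-0
print(padded_tuc_record(4, 7, 0))   # 11 games played, so nothing is added
```

A team that already has 10 or more games against TUCs is left alone, so the padding only ever hurts teams that come up short.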

This would guarantee that teams wouldn't benefit from not playing enough games against TUCs. It would probably make the early season drafts of the PWR look like an absolute mess, but like I said, the system is really only meant to be used one day per year.

13 comments:

CHRE said...

In general, I think the only team capable of winning four straight games against other top teams (the champion) has been selected to the field every year since this method has been applied. That's good.

However, in principle I don't agree with any system that has number cliffs like "25" or "10". A sliding scale system of some other sort would be preferable.

Anonymous said...

Those are good points, but what if a team is 5-4 vs TUC and by counting that missing game as a loss (dropping them to a 5-5 record), suddenly they lose a lot of comparisons that they would have won otherwise?

What team is going to be able to swallow that?

I understand what your short-term fix is trying to accomplish. However, if adding those additional games as losses has a material impact, then it just wouldn't work.

Perhaps you make the sample set 5 games as opposed to 10? What is the number that makes the win / loss record statistically meaningful / relevant?

Moe said...

KRACH

EOD

Anonymous said...

Chris, I agree with you - but we can't lose sight of the fact that college hockey has by far the best system to rank the teams.

I personally would not make an automatic berth for any league - let them compete on an equal basis.

Still - it is a lot better than other NCAA sports.

Anonymous said...

KRACH is the better methodology...no cliffs to fall off of, no 10 TUC games requirement, no TUC 25 counts and 26 doesn't. But of course the NCAA will continue to use the more illogical ranking system...that being Pairwise. Or PairUnwise.

Deejer

Chris said...

I'm not a fan of KRACH just because I think it places way too much emphasis on strength of schedule. Every year, WCHA teams end up way too high just because they all have a strength of schedule in the top 10 in the country.

Right now, Alaska-Anchorage is 7-18-7 and is still 26th in KRACH. Michigan Tech is 5 games below .500, but still 16th in KRACH, which would be right on the bubble for the NCAA tournament.

It just seems to place too much emphasis on who you play, and not how you play against them.

Anonymous said...

Chris

How would Michigan Tech fare against Princeton's league?

I think we know the answer to that.

Anonymous said...

Good stuff Chris, and thanks for pointing out the article. I hope everyone reads it all and please leave comments on what I might've missed. ...

Anyway, as an old ECAC guy but also a supporter of KRACH, as much as it pains me to say, I think KRACH is pretty accurate in assessing Michigan Tech, etc... I don't think KRACH over-counts strength of schedule. KRACH doesn't really count anything. KRACH just is. It's a recursive, iterative method of figuring out who would beat who ... like the old Team A beat Team B beat Team C thing, etc... but on a much more complicated scale.
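The "recursive, iterative" idea can be sketched in a few lines of Python. KRACH is usually described as an application of the Bradley-Terry model, and the sketch below is a generic Bradley-Terry fit under that assumption, not the published KRACH code. The function name, the iteration count, and the decision to ignore ties are all simplifications made for illustration.

```python
def bradley_terry_sketch(results, iterations=200):
    """Fit simple Bradley-Terry ratings from (winner, loser) pairs.

    Each pass re-estimates a team's rating from its win total and the
    current ratings of the teams it played, then rescales the ratings
    so they average 1.0. Ties are ignored in this sketch.
    """
    teams = sorted({team for game in results for team in game})
    wins = {t: 0 for t in teams}
    games = {}  # (team_a, team_b) in sorted order -> games played
    for winner, loser in results:
        wins[winner] += 1
        key = tuple(sorted((winner, loser)))
        games[key] = games.get(key, 0) + 1

    rating = {t: 1.0 for t in teams}
    for _ in range(iterations):
        updated = {}
        for t in teams:
            denom = 0.0
            for (a, b), n in games.items():
                if t in (a, b):
                    other = b if t == a else a
                    denom += n / (rating[t] + rating[other])
            updated[t] = wins[t] / denom
        total = sum(updated.values())
        rating = {t: v * len(teams) / total for t, v in updated.items()}
    return rating


# Made-up results: A takes 2 of 3 from B and C, B takes 2 of 3 from C.
made_up = [("A", "B"), ("A", "B"), ("B", "A"),
           ("B", "C"), ("B", "C"), ("C", "B"),
           ("A", "C"), ("A", "C"), ("C", "A")]
ratings = bradley_terry_sketch(made_up)
print(sorted(ratings, key=ratings.get, reverse=True))
```

Under this model, the estimated chance that one team beats another is its rating divided by the sum of the two ratings, which is why who you beat matters as much as how often you win.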

Anyway, Chris, I like your TUC idea. But remember, the penalty is supposed to be that the category doesn't count - because it's supposed to guard against a weak team going 3-0 in TUC games. So that's already the penalty.

Anonymous said...

The problem with Krach is that it penalizes a team for something it can't control - what conference it is in and hence, the quality of its schedule.

That is why Krach will never be adopted.

In theory, it has a lot of merit, but for the reason above, will never work.

Unknown said...

"But remember, the penalty is supposed to be that the category doesn't count - because it's supposed to guard against a weak team going 3-0 in TUC games. So that's already the penalty."

It also throws out the category if one team is 0-9 against TUC and another team is 10-0 against TUC (you can insert a less extreme example if you want).

I think the way the rule would be executed is that the category remains a tie, unless a team would still have an advantage if you assumed they lost every game up to 10 (if they're not yet up to 10 games) and you assumed the other team won every game up to 10 (if they're not up to 10 games yet).
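As a rough Python sketch of that rule (the function names and the win-loss-only records are assumptions for illustration, with ties left out to keep it short):

```python
def tuc_bounds(wins, losses, min_games=10):
    """Worst-case and best-case winning percentage vs. TUCs.

    Worst case: every game short of min_games is a loss.
    Best case: every game short of min_games is a win.
    """
    played = wins + losses
    shortfall = max(0, min_games - played)
    total = played + shortfall
    if total == 0:
        return 0.0, 0.0
    return wins / total, (wins + shortfall) / total


def compare_tuc(team_a, team_b, min_games=10):
    """Return "A", "B", or "tie" for the record-vs-TUC category.

    A team takes the category only if its worst case still beats
    the other team's best case. Otherwise the category stays a tie.
    """
    a_worst, a_best = tuc_bounds(*team_a, min_games=min_games)
    b_worst, b_best = tuc_bounds(*team_b, min_games=min_games)
    if a_worst > b_best:
        return "A"
    if b_worst > a_best:
        return "B"
    return "tie"


print(compare_tuc((10, 0), (0, 9)))  # the extreme example above
print(compare_tuc((3, 0), (5, 5)))   # the weak team going 3-0
```

Once both teams have 10 or more games, the worst and best cases are the same number, so this collapses to an ordinary comparison of winning percentages.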

Anonymous said...

It's not true that KRACH "penalizes" you for being in another conference -- and certainly not any more than any other system that has strength of sked built into it does - such as the current Pairwise system. ... If you win your games, you are not "penalized"

Anonymous said...

The problem with the PWR is that we keep tweaking it to get the results we want, and the system has no coherent rationale for why it ranks teams the way it does.

I think KRACH is far superior. I also think the concerns about SOS are not really about SOS, but rather fears about one conference dominating things as far as at large selection goes.

If that's the case, the solution should be something like capping the total number of teams allowed from one conference at 5, rather than trying to distort the rankings so the results fit preconceived notions of what should happen.

Anonymous said...

As the CHN poster says, KRACH does NOT penalize a team for being in a weaker conference...that is the strong point of KRACH. If you win the games on your schedule, no matter what schedule you play (either a heavy ECAC schedule or a WCHA schedule), the KRACH ranking will not differentiate.