Potshot #2

Menick is reaching a flawed conclusion, in my opinion, not because his reasoning is unsound (try not to faint from shock) but because his underlying assumptions are.

Assumption #1: My pref sheet is based on paradigms.

Hah, as if. The nether regions of the pref sheet are sometimes based on a reading, or the mere existence, of a paradigm. The meat of the pref sheet, however, is based on firsthand experience with judges. We keep a written log of our RFDs in our team Dropbox; I read those regularly to adjust our pref sheets, because that’s actual data, and not just the random assertions and opinions a paradigm represents. I’ve never read the paradigms of our top judges; I rely instead on being in the room when they say “I vote on theory” and ignore their assertion that they never will.

As such I find judges, the ones I prefer anyway, rather predictable.  When my debaters lose a round, they get an L; when they win one, they get a W, and we can talk about how to make the former turn into the latter.   99 times out of 100 I can figure out why they lost a given judge and most of those times, it’s because they ignored something I told them to do beforehand.  The other 1 time, the pref sheet probably changes.  My pref sheet isn’t about converting Ls into Ws; it’s about winning when we do win the debate, and being able to coach beforehand.

Remember, señor Menick, that the folks you’re hearing from are those coming into tab suggesting that you find them a better panel, as if you hadn’t already thought to try.  It takes someone rather unfamiliar with the way things work to imagine that tab rooms put out horrendous panels and withhold good ones until asked.  In short, you’re hearing only from people who don’t know what they’re doing, and drawing general conclusions from that data.  Talk about your flimsy evidence.

Assumption #2: Good pref sheets can win all the rounds!!!

A good chunk of rounds are yours no matter the panel; a good chunk of rounds likewise are impossible to win. If you’re a senior with ten bids against a sophomore with ten cards, you’re going to win unless you do something tragic, even in front of a 1-5 judge. The pref sheet isn’t about that; it’s about helping you in the marginal debates. Which debates are marginal depends on what your level is; a younger team should have different prefs than an older one.

Assumption #3: The W/L is the only concern of the pref sheet

This assumption is the most important of the set. The number I hang on a judge isn’t entirely about the likelihood they’ll vote for us. It’s just as much informed by the type of debate that judge would like to hear. If you don’t like debating theory, then you de-pref the theoriest of the theory judges, even if one or two of them are likely to vote for you anyway. The aim here is not to win rounds you would have lost otherwise, but to have debates you find enjoyable and are prepared for.

NStar was a mighty LARPer, and was most at home with DAs and CPs and such.  If some framework-happy sophomore in her first varsity tournament came along, and hit him round 2 in front of a 1-2 judge in her favor, a judge who positively loves philosophical framework debate, she’d still be toast.   But she could argue the kind of debate she wanted, and he could not, despite being better at it.  So even with the W, a harm is caused.  It’s sometimes an unavoidable harm, but it’s one the pref system is designed to minimize; and it’s one that blunt, imprecise tiers minimize less well.

Note that it is not a harm that NStar had to debate framework; it’s simply relatively unfair that one debater got to steer the debate into her own home turf and the other didn’t.  A better outcome is a judge who likes yet a third style of debate, and so both debaters have to adapt equally.  That is, after all, the idea behind mutuality, and an argument for the maximal mutuality possible.

At a wider level, too, I’ll pref differently for younger debaters.  Some judges are not as good for us stylistically, but they’re great educators and can give excellent feedback.  I won’t stake a junior or senior’s last bid round on being able to adapt to them in front of their favoritest debaters; but I might take the chance to get a good post-round for a student who is going to lose those rounds anyway.

Assumption #4: Teams’ opinions don’t matter

Take away everything else, denounce it as the foul lies of a dirty Papist or what have you, and you’re still left with a final thought: the perception of fairness sometimes matters as much as the reality. If a rating is entirely about unfounded, imprecise opinions (which I would assert might be true for some people’s pref sheets but isn’t of mine), those opinions still matter. A kid who walks into a debate feeling she has no shot to win because of the judge likely will not win, even if she actually had a decent shot after all.

There are a lot of reasons to de-prefer a judge; there’s the judge who might actually like your style a great deal but freaks you out, or the college freshman judge who you have a crush on and can’t string two words together in front of. There are a host of considerations that go into a pref sheet, and some of them are opaque to the tabber; if we’re going to have the tool at all, it may as well be as good as you can make it.

RFD

Finally, to steal a page from debate: there is no offense in the round for less precise categories. I’ve given a half dozen or so positive reasons for more precision; thus far all I’ve heard in reply is doubt whether the precision achieves a real effect. But without any affirmative reason to prefer blunter categories, who cares? Why is a tournament better for having 4 categories instead of 8 or 12 or 16? All I’ve heard argued is defense: doubt that the 8-category tournament is better than the 4-category one. There’s no argument on my flow of anything being harmed by having smaller, more numerous categories. Sure, an 8-tier system will have more 1-2 matchups, but if it makes you feel better, all those 1-2 matchups would be 1-1s in a 4-tier system; and there’ll be fewer of them on the pairing than there would be hiding underneath those 1-1s. Better for the debaters and the judges both, though perhaps worse for perfectionist tabbers. Speaking as a perfectionist tabber, I can say my sensibilities are surely an unimportant factor here.

In the absence of offense for the pref sheet with larger categories, then, I’m sticking to, and advocating for, more of ’em.

I’ll have my tiers and eat them too

Menick contends that fewer tiers are fine, thanks. I’m definitely in the more-is-better school with categories. Using fewer categories needlessly throws away information that tab could use to make more mutual pairings, is why.

I think the point of difference is that he defines a “mutual” matchup as one where the judge sits in the same category on both debaters’ sheets, and anything else as imperfect. But the real picture is murkier than that; simply making the categories larger and thereby forcing me to rate more judges in a category does not magically make all those judges equally preferred to my debaters.

Suppose one tournament has 4 categories and another 8; and the first delivers 100% mutual matchups, and the second has a number of one-offs on the pairing. The second tournament will have delivered the more mutual judging. The 4-category tournament will have many more matchups that are just as non-mutual as those 8-category one-offs. They will be mutual only nominally, to the eye of the tabber. There will be more of them, too, as the tab system no longer knows which matchups would have been 1-2s in an 8-category system, and so it can do nothing to minimize them.
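
To make that concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the judge names, the ratings, and the assumption that a 4-tier sheet is simply an 8-tier sheet with adjacent tiers merged. It only counts the one-offs that a 4-tier pairing would display as perfectly mutual.

```python
def collapse(tier8: int) -> int:
    """Merge adjacent 8-tier ratings into 4 tiers: 1-2 -> 1, 3-4 -> 2, 5-6 -> 3, 7-8 -> 4.

    This mapping is an assumption made for illustration only.
    """
    return (tier8 + 1) // 2


# Hypothetical data: judge -> (first debater's 8-tier rating, second debater's 8-tier rating)
ratings = {
    "Judge A": (1, 1),  # mutual under both systems
    "Judge B": (1, 2),  # one-off in 8 tiers, looks like a 1-1 in 4 tiers
    "Judge C": (2, 3),  # one-off under both systems
    "Judge D": (3, 4),  # one-off in 8 tiers, looks like a 2-2 in 4 tiers
    "Judge E": (4, 4),  # mutual under both systems
}


def hidden_one_offs(ratings: dict) -> list:
    """Return the judges who look mutual in 4 tiers but are one-offs in 8 tiers."""
    hidden = []
    for judge, (first, second) in ratings.items():
        looks_mutual_in_4 = collapse(first) == collapse(second)
        actually_differs = first != second
        if looks_mutual_in_4 and actually_differs:
            hidden.append(judge)
    return hidden


print(hidden_one_offs(ratings))
```

With these made-up numbers, Judges B and D are the matchups the 4-tier tabber would see as clean and the 8-tier tabber would see as one-offs. The preference gap is the same in both tournaments; only its visibility changes.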

When you place a 1-2 judge in an 8 category tournament, you know what you’re doing and you know there’s no better choice.  If that same tournament used 4 tiers, then you might place that same judge into the same debate, despite there being a more mutual option which is concealed by the broader, less precise categories.  The pairing looks prettier, but at the expense of the missing data which might make it fairer.  Choosing tier sizes should not be about satisfying the OCD of tab staff.
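
A rough sketch of how the concealment plays out when a judge is actually placed. The selection rule here (smallest gap between the two ratings first, then the best combined rating) is my own stand-in, not how any particular tab program works, and the two judges and their ratings are invented.

```python
def collapse(tier8: int) -> int:
    """Merge adjacent 8-tier ratings into 4 tiers (assumed mapping, for illustration)."""
    return (tier8 + 1) // 2


# Hypothetical candidates: (name, first debater's 8-tier rating, second debater's 8-tier rating)
candidates = [
    ("Judge X", 1, 2),  # a one-off in 8 tiers
    ("Judge Y", 2, 2),  # fully mutual in 8 tiers
]


def score_8(candidate):
    """Lower is better: mutuality gap first, then combined preference."""
    _, first, second = candidate
    return (abs(first - second), first + second)


def score_4(candidate):
    """The same rule, but tab only sees the collapsed 4-tier ratings."""
    _, first, second = candidate
    a, b = collapse(first), collapse(second)
    return (abs(a - b), a + b)


# min() returns the first candidate among ties, so a tie is broken by list order alone.
pick_with_8_tiers = min(candidates, key=score_8)
pick_with_4_tiers = min(candidates, key=score_4)

print(pick_with_8_tiers[0])
print(pick_with_4_tiers[0])
```

Under 8 tiers the rule can tell the two judges apart and takes Judge Y. Under 4 tiers both collapse to a 1-1, the scores tie, and the pick falls to whoever happens to be listed first, here the less mutual Judge X.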

Pairing mutual matchups is becoming an automatic process, and there’s little reason to deprive this process of additional data.  The point of expanding to more categories while necessarily growing more permissive of one-off matchups is not to increase the numbers of those matchups, but to minimize the number of hidden ones.  So from a tabbing perspective there is little defense in my mind to using blunter, less precise categories.

It may be there’s a limit past which coaches are unable to make distinctions in the judging pool.  That limit is higher than 9, for me: I certainly could have filled out a pref sheet at Lex with clear distinctions between tiers; all my 1s would have been preferred to the 2s, all 2s over the 3s, etc.  The distinctions would have carried real information well down my sheet, and if I were going to the tournament, I’d want the tab staff to have that information in pairing my judging.

College debate has operated for some years under the premise that the limit of reasonable distinctions does not exist, and rates judging ordinally at most tournaments.  I think the high school community largely hasn’t followed suit only because our software wouldn’t permit it, not for any inherent reason.  At worst, finer distinctions do no harm; if you really don’t have any point of difference between 6s and 7s on your sheet, then just randomly assign them between the two ratings.  Given that I am a coach who makes those distinctions, I don’t see why I should sacrifice them because another coach does not.

So in short, I’m a computer programmer, and a data person, and I don’t see much value in sacrificing more data for less.