A theory of theory

A is the interpretation.

Most theory is terrible, and never should be run.   Theory as a strategy is harmful to debate.

B, the violation, is self-evident.

C is the ground.

Judges are routinely voting for things they hate, because the debaters present them little choice.     Theory is everyone’s villain: nobody refers to a theory heavy debate as a classic. We speak of rounds “devolving” to theory battles, designating them for a lower plane of evolution. It leads to unhappy judges, lowered speaker points, and unsatisfying rounds – all assertions that need little warrant.

Theory doesn’t win. Sure, it wins rounds – a lot of them. But it doesn’t tend to win tournaments. Debaters who resort to theory a lot are the underperformers – the debaters who never seem to reach the level of success their skill would suggest for them. The big championships tend to be won by the debaters who engage in it least. Theory can win when both debaters do it, as the judge wishes to be elsewhere while signing the ballot. Theory can win when the debater using it is much better versed in it than their opponent – a round which the theory debater would have won anyway. It can also win in the cheap shot round – throwing a trick out there, a snake hidden in the weeds, to snatch a victory from a better debater. The last approach is seductive to sophomores, struggling in their first varsity rounds. It also only works for sophomores – once a debater does it enough, they cease to catch anyone unawares, as their opponents grow alert to the threat.

Theory doesn’t help LD. The more theory has grown in the last four years, the more LD participation numbers have dropped. Theory is not useful beyond debate. What little it does teach – logic, extemping arguments – substantive discussion teaches better. Theory could easily drive students away – it’s boring. It’s a skill that will give them nothing past LD.     We’re left with the debaters who would have stuck around anyway – debaters who are glad to win theory because they’re in it to win, and don’t especially care about how they get there.     Debaters run it as a time sink, which crowds out actual substantive debate by definition.

Theory encourages more abusive affirmatives in the first place. If every debate is just going to devolve to theory anyway, there’s little penalty to breaking realistic norms with intent. Why not run an abusive, shifting and non-topical plan, when you’re going to have to win a theory debate anyway? May as well start off with a lead on substance.     This year, I hear a lot of angst at the rise of critical race theory arguments or other non-topical cases based on identity, which some LDers have imported from policy. I wonder how an LD debater who runs mutually exclusive theory interpretations can possibly object to abandoning topical debate in favor of identity arguments, when what it’s really replacing is theory games involving invented rules.

Theory blocks access to LD.     It’s totally opaque in most cases, as ground arguments speed on by incomprehensibly; I rarely even bother trying to flow it, given I can’t understand and don’t pretend to care. The local debater or debater trying out LD for the first time is just blown out of the round, and then figures they should look at PF or mock trial. There’s nothing wrong with PF or mock trial, but there’s something wrong when someone who really loves philosophy and would be happiest in LD settles for them because they can’t make headway against theory.

Theory is the preserve of those who can afford camp. Research about topical literature is available to all. Research about identity and performance is likewise available to all.     Camp makes arguing these things easier, but it’s not necessary.     Theory, however, can be learned nowhere else.     It rose in part so camps could justify their cost – it’s the only way, short of rigging the topic votes, that a camp can provide arguments guaranteed to be useful in the coming school year.     But their utility comes at a cost; since there’s no external way to learn about theory or practice it, beyond the bounds of a large coaching staff or affording camp, it becomes a gateway issue, a hurdle to those who have neither. It’s hard to teach oneself substantive debate and philosophy, but the internet and the library do afford the chance. It’s impossible to teach oneself theory, since it’s all about technique, and most of that technique is about freezing your opponent out of rounds in the first place.

Theory prevents the formation of actual norms in the community. If we had the occasional theory everyone asserts is necessary – some viable limits on the topic, and the approaches that affirmatives and negatives take with it – then the argument would hold. But in a world where debaters are constantly inventing rules mid-round and accusing their opponents of violating them – when the violation comes ahead of the interpretation – it’s impossible to settle on actual norms. It’s further impossible when the educators are removed from the question. Judges are admonished not to intervene, which means we’re unable to use the debate round as a platform to help establish those norms and get past most of the frivolous theory out there.     Theory can never reach an actual answer in the round; if we did, the debaters who rely on it would just move the goalposts.

Theory has no impact debate. Education and fairness are rarely sketched out arguments, but instead are watchwords, talismans invoked but not explained. Rarely are LD theory impacts actually tailored to the violation; instead they are rote incantations with little value beyond their ritualistic necessity.

Theory is impossible to judge, and to train judges in.     Without a reference to the rest of the world, there’s no way a judge can gauge theory arguments on anything other than crosshatched tallies of argument quantity. I can tell you whether an economic argument or a moral one has internal sense; I cannot do the same of theory arguments. Debaters complain about random outcomes to theory debates, and then those same debaters become judges and understand – now only too late to run something else as a debater.

D, of course, is the impacts.

Theory hurts fairness, freezing out the debater without money or resources even further by pinning debates on esoteric nonsense that gives automatic wins to those who invoke it. It makes preparation infinite, as you can never prepare for the invented rules of your opponent. It excludes people without the time or the inclination to learn material that never will be useful again.

Theory hurts education. It displaces topical debate, a lot of it. It displaces substantive non-topical debate, too. It lets negatives who haven’t prepared enough get away with using it as a filler. It prevents both sides from having to think about responding to novel arguments, to engage in the crucial skill of applying evidence and reasoning in a way they hadn’t thought of to answer a new position.     It encourages frivolous affs who know full well nothing will be extended.     And it reduces the numbers of debaters, and even programs in LD in the first place.

The last impact is a personal one. If theory keeps being a dominant part of LD, then LD will cease being a dominant presence in my life. Among the many major impacts is a minor one – it’s boring me to tears. I’ll coach something else, if at all, and even recommend that Lexington stop doing it. It’s a waste of time, effort and money to play in this self-referential sandbox. I’m not sure why I do it even now. If it lasts much longer, I won’t, and I’ll steer others away from it as well. It doesn’t help matters that next year’s policy topic is one I am really interested in and have technical expertise in. This minor impact becomes major because I’m not alone in feeling that way.

E is the alternative. OK, so this just became a K.     You’re going to have to cope.

Without some theory, we go back to the land of eighty-three NIBs, of floating advocacy, of made-up evidence, or whatever else got us started down the path. But the status quo means the solution has become worse than the illness. So we require means to keep the limits without the excess.

So I propose we add one rule to theory that can sweep aside many others: every interpretation should be warranted with a card.     Before a debater may run theory in a round, they should first justify the interpretation and standard on real grounds in public writing, or have a coach do the same.

That solves many of the harms above. It allows for rules to be fleshed out in an open arena, devoid of the competitive pressures, time limits and necessity to vote a round entails. It could be two competing theory interpretations are both wrong – a judge still must vote for one of them, but in an open forum, the audience may easily reject both.     Therefore, bad rules or norms can be winnowed out. A good proposed norm will stand the scrutiny of many voices, while a harmful or spurious rule will quickly grow a list of arguments against it.

It allows for adult participation in the argument. Adults have no voice in the course of a debate, which is proper – but adults should have a voice in the formation of norms, which itself is the curriculum of debate in a real way.     If theory must be cited, then a coach can generate those citations, or argue against them as easily as a debater.

Publication is no bar to anyone; there’s essentially infinite space on the debate web, and few of the sites aren’t looking for content. Getting a coherent theory article published should be possible for anyone. And once online, they become a resource to those who can’t afford the tuition and travel of camp; a debater can self-educate on theory, and prepare for a circuit tournament from a local league. Theory cards would have to carry the same citations as any other, and the ground and impact level debate would be already developed within those cards.

About the only harm is that it would limit what you could do in a round when something truly bizarre and objectionable emerges. In that case, you might lose a round – a somewhat less serious harm than debate practices eating at the very fabric of the event.     Or, you’d have to think about the arguments raised and the parallels to evidence and theory already established – which would, incidentally, be a critical educational goal of debate in the first place.     Independent thinking isn’t so bad, once you get used to it.

Semiotics

I actually wouldn’t mind ordinals; the idea you’d spend hours figuring out distinctions between your 75th and 76th judge is spurious.   If you can’t distinguish between a couple judges, then it doesn’t matter what order you put them in: just put one as 75 and one as 76 and move on with life.   It’s easier in practice than you’d think.   Plus, Tabroom actually has rather nice features built around ordinals; it can auto-generate a pref sheet for you based on your past ratings of a judge as a starting point.

And you’ll pardon me if I casually shrug off the conservative objections to Something New from the group of tabbers whose original schedule for switching over to online tabbing & balloting would have had us trying it out for the first time in early 2016.

But to see the point I’m trying to make about narrower band categories, you start with an ordinal sheet; the “true” sheet of how a given debater prefers judging.   A true ranking of preferences won’t necessarily be a smooth linear progression; it’ll have little clumps.   There’ll be a nest of three judges in the topmost spot, and then maybe a clear 4th, then maybe 3-4 judges tied for 5th, and so on.   But the size and location of those clumps will be random enough that the pref sheet is essentially a gradient.

A tiered system necessarily imposes arbitrary boundaries on that gradient, turning it into bands. The fewer (and wider) the bands, the more information is lost. Menick says that when you rate a judge a 2, they’re a 2, and thus are magically mutual with every other 2. But plonking a label on a judge doesn’t make it so. The judge could be my favorite 2 and your least favorite 2, in which case the judge isn’t very mutual at all; there’s a lot of slope between our opinions. The most mutual judge, in fact, may be my favorite 2 and your least favorite 1, being separated only by one notch made more significant by the random chance of the tournament policy.
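To put numbers on the "favorite 2 versus least favorite 2" point, here is a minimal sketch. It assumes a hypothetical pool of 100 judges cut into 4 equal tiers; the function names `tier_of` and `gaps` and the specific ranks are invented for illustration and are not anything Tabroom exposes.

```python
import math


def tier_of(rank, pool_size, n_tiers):
    """Map a 1-based ordinal rank to a 1-based tier, assuming equal-sized tiers."""
    tier_size = math.ceil(pool_size / n_tiers)
    return (rank - 1) // tier_size + 1


def gaps(my_rank, your_rank, pool_size=100, n_tiers=4):
    """Return (tier gap, ordinal gap) for one judge as ranked by two debaters."""
    tier_gap = abs(tier_of(my_rank, pool_size, n_tiers)
                   - tier_of(your_rank, pool_size, n_tiers))
    ordinal_gap = abs(my_rank - your_rank)
    return tier_gap, ordinal_gap


# Judge X: my favorite 2 (rank 26) and your least favorite 2 (rank 50).
print(gaps(26, 50))  # (0, 24): looks mutual, but 24 places apart

# Judge Y: my favorite 2 (rank 26) and your least favorite 1 (rank 25).
print(gaps(26, 25))  # (1, 1): looks one-off, but only 1 place apart
```

Under these assumed numbers the "mutual" 2-2 judge is 24 places apart on the underlying sheets, while the 1-2 judge is a single place apart.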

Look at tournaments that require you to rate 25% of the field a 1 and 25% a 2. You’ve now rated half your judges in the top two tiers; and you’re going to get all kinds of judge matchups where the gap in preferences may differ by as much as 24% of the field. A 1-2 matchup in that case could be killer. It’s not uncommon, and not difficult, for such tournaments to put out pairings with all 1-1 matchups, maybe a few 2-2s. These matchups sometimes really stink; you end up with your opponent’s favoritest judge in the world when it’s the person you held your nose and plunked a 1 on because you really needed just one more. I’ve been on the wrong end of those pairings.

Now take Lex, where 11% were 1, and 11% were 2.   Lex’s 1-2 pairings are a little more precise; they all fall within the top 22% of your pref sheet.   Further, by drawing a line in the middle of that pool of judges, you can minimize their number.   There’s nothing a tournament requiring 25% of the field to rank 1 can do to minimize judge pairings that may be 24% of the field off on people’s actual preferences; it doesn’t have the data, and so doesn’t know that another 1-1 might actually be far more mutual in the debaters’ actual preferences.   Lex didn’t have any matchups 25% of the field off, and minimized the number of pairings that are 11-22% of the field off.     It’s more mutual, by which I mean more reflective of the actual preferences underlying the numbers on the pref sheet, not the fake mutuality of categories, which conflate the signifier and the signified.
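The arithmetic behind those percentages can be sketched quickly. This assumes a hypothetical pool of 100 judges and equal-sized top tiers, so that a percentage of the field is the same thing as a count of ranks; `worst_case_gaps` is a made-up helper, not part of any tab software.

```python
def worst_case_gaps(pool_size, tier_share):
    """Largest ordinal gap a pairing can hide, given the share of the pool per tier.

    Returns (same_tier, adjacent):
      same_tier: top of a tier versus the bottom of that same tier
      adjacent:  top of tier 1 versus the bottom of tier 2
    """
    tier_size = round(pool_size * tier_share)
    same_tier = tier_size - 1
    adjacent = 2 * tier_size - 1
    return same_tier, adjacent


# 25% of the field in each tier: a "mutual" 1-1 can hide a 24-place gap.
print(worst_case_gaps(100, 0.25))  # (24, 49)

# 11% of the field in each tier, as at Lex: a 1-1 hides at most 10 places,
# and even a 1-2 stays within the top 22 places of the sheet.
print(worst_case_gaps(100, 0.11))  # (10, 21)
```

The narrower band does not make one-offs disappear; it caps how bad any single pairing can be and lets tab see which ones to avoid.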

I don’t buy the argument that new schools are going to be screwed by not figuring this out; that’s an argument to abandon prefs, not an argument to make them blunt versus narrow.   It’s also non-unique harm; new schools have to figure out theory, framework, impacts, cards, and spreading too.   It’s not the pref sheet keeping rookies out of the bid round.   A new school could get a judge that really favors their style and disfavors mine, and chances are my kid will beat them anyway, if only because I will know the judges’ preferences and the new coach won’t.   “You have to be smart to do that!” is true of nearly everything in debate, and should rather remain so.   Plus, LD is way more open to new schools and disruption than policy is; new schools are competitive on the national circuit all the time.   The answer to this is helping and teaching new schools how debate operates, not changing how debate should operate.   I put my money where my mouth is there, too: I’ve filled out pref sheets for lots of other schools.

In the end the debate is immaterial, I believe.   Whatever the merits, I think the trendline towards greater precision is inevitable.   TRPC limited LD to 6 while Policy could choose 6 or 9; 6 remained the norm in LD but in Policy most tournaments use 9 — in other words, both ended up using the largest number of tiers they could.   College policy, thanks to the good professors Larson and Bruschke, had software that did ordinal prefs years ago, ordinals they have used, with few exceptions.   Now that Tabroom has lifted the arbitrary 6/9 limits, I think high school debate, for high stakes tournaments anyway, will move towards ordinals, no matter what we say or do.   I embrace it, but even if I didn’t, I would think it’s going there anyway.     It’s a natural progression outwards from strikes to prefs, when you think about it; strikes are simply a really inaccurate pref sheet.   So I’d say, practice your ordinals…

…and just wait until I explain the percentile system to you.

Answering innumeracy with data

He still doesn’t understand my point about increased mutuality, but I’ll write that up when it’s not   12:30 AM on a Tuesday.

For now:

  • Percentage of 1-off judges at Lexington:     8.46%
  • Percentage of 1-offs at Columbia:   8.33%

Man, really blew mutuality to shreds with those 9 tiers at Lex, didn’t we?

At Columbia, 12.5% of the VLD pool did not pref; at Lexington it was only 9%, thus making the job harder at Lexington to boot.     As we say in the business, “No Link.”

 

Potshot #2

Menick is reaching a flawed conclusion, in my opinion, not because his reasoning is unsound (try not to faint from shock) but because his underlying assumptions are.

Assumption #1: My pref sheet is based on paradigms.

Hah, as if.   The nether regions of the pref sheet are sometimes based on a reading, or the mere existence, of a paradigm.   The meat of the pref sheet, however, is based on first hand experience with judges.   We keep a written log of our RFDs in our team Dropbox; I read those regularly to adjust our pref sheets, because that’s actual data, and not just random assertions and opinions like the paradigm represents.     I’ve never read the paradigms of our top judges; I rely instead on being in the room when they say “I vote on theory” and ignore their assertion that they never will.

As such I find judges, the ones I prefer anyway, rather predictable.   When my debaters lose a round, they get an L; when they win one, they get a W, and we can talk about how to make the former turn into the latter.     99 times out of 100 I can figure out why they lost a given judge and most of those times, it’s because they ignored something I told them to do beforehand.   The other 1 time, the pref sheet probably changes.   My pref sheet isn’t about converting Ls into Ws; it’s about winning when we do win the debate, and being able to coach beforehand.

Remember, señor Menick, that the folks you’re hearing from are those coming into tab suggesting that you find them a better panel, as if you hadn’t already thought to try.   It takes someone rather unfamiliar with the way things work to imagine that tab rooms put out horrendous panels and withhold good ones until asked.   In short, you’re hearing only from people who don’t know what they’re doing, and drawing general conclusions from that data.   Talk about your flimsy evidence.

Assumption #2: Good pref sheets can win all the rounds!!!

A good chunk of rounds are yours no matter the panel; a good chunk of rounds likewise are impossible to win.   If you’re a senior with ten bids against a sophomore with ten cards, you’re going to win unless you do something tragic, even in front of a 1-5 judge.   The pref sheet isn’t about that; it’s about helping you in the marginal debates.   Which debates are marginal depends on what your level is; a younger team should have different prefs than an older.

Assumption #3: The W/L is the only concern of the pref sheet

This assumption is the most important of the set.   The number I hang on a judge isn’t entirely about the likelihood they’ll vote for us.   It’s just as much informed by the type of debate that judge would like to hear.   If you don’t like debating theory, then you de-pref the theoriest of the theory judges, even if one or two of them is likely to vote for you anyway.   The aim here is not to win rounds you would have lost otherwise, but to have debates you find enjoyable and are prepared for.

NStar was a mighty LARPer, and was most at home with DAs and CPs and such.   If some framework-happy sophomore in her first varsity tournament came along, and hit him round 2 in front of a 1-2 judge in her favor, a judge who positively loves philosophical framework debate, she’d still be toast.     But she could argue the kind of debate she wanted, and he could not, despite being better at it.   So even with the W, a harm is caused.   It’s sometimes an unavoidable harm, but it’s one the pref system is designed to minimize; and it’s one that blunt, imprecise tiers minimize less well.

Note that it is not a harm that NStar had to debate framework; it’s simply relatively unfair that one debater got to steer the debate into her own home turf and the other didn’t.   A better outcome is a judge who likes yet a third style of debate, and so both debaters have to adapt equally.   That is, after all, the idea behind mutuality, and an argument for the maximal mutuality possible.

At a wider level, too, I’ll pref differently for younger debaters.   Some judges are not as good for us stylistically, but they’re great educators and can give excellent feedback.   I won’t stake a junior or senior’s last bid round on being able to adapt to them in front of their favoritest debaters; but I might take the chance to get a good post-round for a student who is going to lose those rounds anyway.

Assumption #4: Teams’ opinions don’t matter

Take away everything else, denounce it as the foul lies of a dirty Papist or what have you; and you’re still left with a final thought: the perception of fairness sometimes matters as much as the reality.   If a rating is entirely about unfounded imprecise opinions — which I would assert might be true for some people’s pref sheets but isn’t of mine — those opinions still matter.   A kid walking into a debate where she feels she has no shot to win because of the judge, likely will not, even if she actually had a decent shot after all.

There are a lot of reasons to de-prefer a judge; there’s the judge who might actually like your style a great deal but freaks you out, or the college freshman judge who you have a crush on and can’t string two words together in front of. There are a host of considerations that go into a pref sheet, and some of them are opaque to the tabber; if we’re going to have the tool at all, it may as well be as good as you can make it.

RFD

Finally, to steal a page from debate, there is no offense in the round for less precise categories. I’ve given a half dozen or so positive reasons for more precision; thus far all I’ve heard in reply is doubt whether the precision achieves a real effect. But without any affirmative reason to prefer blunter categories, who cares? Why is a tournament better for having 4 categories instead of 8 or 12 or 16? All I’ve heard argued is defense: doubt that the 8 category tournament is better than the 4 category one. There’s no argument on my flow of anything being harmed by having smaller, more numerous categories. Sure, an 8 tier system will have more 1-2 matchups, but if it makes you feel better, all those 1-2 matchups would be 1-1s in a 4 tier system; and there’ll be fewer of them on the pairing than there would be hiding underneath those 1-1s. Better for the debaters and the judges both; though perhaps worse for perfectionist tabbers. Speaking as a perfectionist tabber, surely my sensibilities are an unimportant factor here.

In the absence of offense for the pref sheet with fewer, larger categories, then, I’m sticking to, and advocating for, more of ’em.

I’ll have my tiers and eat them too

Menick contends that fewer tiers are fine, thanks. I’m definitely in the more-is-better school with categories. Using fewer categories needlessly throws away information that could be used by tab to make more mutual pairings, is why.

I think the point of difference is that he defines a “mutual” matchup as a matchup where both debaters have rated the judge in the same category, and anything else as imperfect. But the real picture is murkier than that; simply making the categories larger and thereby forcing me to rate more judges in a category does not magically make all those judges equally preferred by my debaters.

Suppose one tournament has 4 categories and another 8; and the first delivers 100% mutual matchups, and the second has a number of one-offs on the pairing. The second tournament will have delivered the more mutual judging. The 4-category tournament will have many more matchups that are just as non-mutual as those 8 category one-offs. They will only nominally be mutual to the eye of the tabber. There will be more of them, too, as the tab system no longer knows which matchups would have been 1-2s in an 8 category system, and so it can do nothing to minimize them.

When you place a 1-2 judge in an 8 category tournament, you know what you’re doing and you know there’s no better choice.   If that same tournament used 4 tiers, then you might place that same judge into the same debate, despite there being a more mutual option which is concealed by the broader, less precise categories.   The pairing looks prettier, but at the expense of the missing data which might make it fairer.   Choosing tier sizes should not be about satisfying the OCD of tab staff.
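Here is a rough sketch of what "concealed" means, assuming a hypothetical pool of 16 judges, equal-sized tiers, and two debaters whose sheets differ on only two judges. The helper `hidden_one_offs` and the judge labels are invented for illustration; real tab software weighs far more than this.

```python
def tier_of(rank, pool_size, n_tiers):
    """Map a 1-based ordinal rank to a 1-based tier, assuming equal-sized tiers."""
    tier_size = -(-pool_size // n_tiers)  # ceiling division
    return (rank - 1) // tier_size + 1


def hidden_one_offs(ranks_a, ranks_b, coarse=4, fine=8):
    """Judges who look mutual with `coarse` tiers but are off with `fine` tiers."""
    pool = len(ranks_a)
    hidden = []
    for judge, rank_a in ranks_a.items():
        rank_b = ranks_b[judge]
        coarse_gap = abs(tier_of(rank_a, pool, coarse) - tier_of(rank_b, pool, coarse))
        fine_gap = abs(tier_of(rank_a, pool, fine) - tier_of(rank_b, pool, fine))
        if coarse_gap == 0 and fine_gap > 0:
            hidden.append(judge)
    return hidden


# Debater A ranks judges J1..J16 in order; debater B swaps J1 and J4.
ranks_a = {f"J{i}": i for i in range(1, 17)}
ranks_b = dict(ranks_a)
ranks_b["J1"], ranks_b["J4"] = 4, 1

# With 4 tiers both judges are "1-1"; with 8 tiers they show up as 1-2s.
print(hidden_one_offs(ranks_a, ranks_b))  # ['J1', 'J4']
```

With 4 tiers the tabber sees J1, J2, J3 and J4 as interchangeable 1-1s and may place J1 or J4; with 8 tiers the same data steers the pairing toward J2 or J3.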

Pairing mutual matchups is becoming an automatic process, and there’s little reason to deprive this process of additional data.   The point of expanding to more categories while necessarily growing more permissive of one-off matchups is not to increase the numbers of those matchups, but to minimize the number of hidden ones.   So from a tabbing perspective there is little defense in my mind to using blunter, less precise categories.

It may be there’s a limit past which coaches are unable to make distinctions in the judging pool.   That limit is higher than 9, for me: I certainly could have filled out a pref sheet at Lex with clear distinctions between tiers; all my 1s would have been preferred to the 2s, all 2s over the 3s, etc.   The distinctions would have carried real information well down my sheet, and if I were going to the tournament, I’d want the tab staff to have that information in pairing my judging.

College debate has operated for some years under the premise that the limit of reasonable distinctions does not exist, and rates judging ordinally at most tournaments.   I think the high school community largely hasn’t followed suit only because our software wouldn’t permit it, not for any inherent reason.   At worst, finer distinctions do no harm; if you really don’t have any point of difference between 6s and 7s on your sheet, then just randomly assign them between the two ratings.   Given that I am a coach who makes those distinctions, I don’t see why I should sacrifice them because another coach does not.

So in short, I’m a computer programmer, and a data person, and I don’t see much value in sacrificing more data for less.