An Introduction to Determinants: 5 Problems with Measuring Merit
Hierarchical Culture Selection Pt. 2
In The Cultural Microbiome I offered a primer on the idea of hierarchical cultural selection. To build towards mechanisms of action for hierarchical culture selection, we must first discuss determinants. While determinants exist outside of meritocratic processes, the best way to frame an initial investigation into them is via the paths our society has already laid to evaluate them.
Part 1: Why would you want to measure merit?
There are generally two reasons to measure merit: to predict whether someone will do well, and to evaluate whether someone did well. This maps to the two types of meritocratic process: predictive and evaluative.
Predictive Meritocratic Processes
Meritocratic processes like hiring are predictive. They set out to evaluate merit under the assumption that correct prediction will lead to financial gain. The hitch is that the merit they are screening for can only be evaluated against future performance. So instead they use predictors, which we’ll talk about in Part 3, to act as heuristics for which candidates will prove most deserving in the future. Dating, financial forecasting, and admissions are also predictive meritocratic processes.
Evaluative Meritocratic Processes
Meritocratic processes like competition are evaluative. They analyze existing data on performance to determine which candidates are superior. Performance reviews, criticism (as in literary criticism), and most legal trials are also evaluative meritocratic processes.
No matter what, meritocratic processes rely on qualities to which they assign meaning as representative of skill, value, or nature. These qualities are the determining factors in whether or not a candidate is deemed to pass muster, so I call them “determinants.”
As I said in the introduction, determinants do not only belong in the realm of meritocracy: they are also the factors that determine the success of an organism in reproduction, of a person in a career, or of a theory in adoption. Meritocratic processes offer us the best space in which to evaluate them, because that’s what we as humans design filtering mechanisms to do: measure determinants of success!
While the intention of meritocratic processes is pure, they often trip over themselves in choosing determinants and deciding what they actually mean. Even in the simplest of cases, it’s easy to run into problems when interpreting a determinant.
Part 2: Direct Determinants
To start, let’s talk about direct determinants of merit- what you see is what you get; the determinant itself is the merit. How do we know Usain Bolt is the fastest man in the world? We measured everyone who seemed to be a contender for the title, and he came out on top. This straightforward metric is really nice for determining merit. There’s only one way to prove that someone else is more meritorious¹ of the title of fastest man: beat Usain Bolt’s time.
Let’s do a thought experiment:
Is it possible that there is someone in the history of the world that is or was faster than Bolt?
Yes.
We did not test every single person on the planet, nor has everyone who ever lived been tested. This brings us to problem #1 with measuring merit:
Problem #1 with measuring merit:
Availability of information.
Bolt is deserving of the title of “fastest” because he is the fastest we have measured- we can never know for certain if he truly is the GOAT. We’re not done with direct determinants yet, though; there’s another, more insidious problem here, and it’s definitional.
Problem #2 with measuring merit:
Definition of merit.
What does “fastest” mean? Is Usain Bolt faster at running a marathon than the top long-distance runner? If not, why does he deserve the title of fastest rather than them? This is an operationalization problem- the more abstract your assertion of merit, the more room for subjective influence on the determinant used to evaluate that merit.
In Donohue, Levitt, Roe, and Wade we saw this problem on display in the social sciences, where one hypothesis can be proven or disproven depending on which empirical measurements you use as determinants in operationalizing your theory, and on which data sets you then evaluate those determinants. Some may say “fastest means top speed” while others might say “fastest means minimum completion time.”
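To make the stakes concrete, here’s a toy sketch (the names and numbers are hypothetical, not real athletes’ data) showing how two perfectly reasonable operationalizations of “fastest” crown different winners from the same data:

```python
# Hypothetical data: same two runners, two definitions of "fastest."
runners = {
    "Sprinter":   {"top_speed_kmh": 44.7, "marathon_hours": 3.9},
    "Marathoner": {"top_speed_kmh": 37.0, "marathon_hours": 2.1},
}

# Operationalization A: fastest = highest recorded top speed.
fastest_by_speed = max(runners, key=lambda r: runners[r]["top_speed_kmh"])

# Operationalization B: fastest = minimum completion time over a fixed distance.
fastest_by_time = min(runners, key=lambda r: runners[r]["marathon_hours"])

print(fastest_by_speed)  # Sprinter
print(fastest_by_time)   # Marathoner: same data, different "fastest"
```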
In the case of Bolt, we can reduce this effect by being more specific in our claims of merit:
Usain Bolt is the fastest man alive: he holds the record for the fastest recorded running speed of any human, about 27.8 mph (44.72 km/h).
Usain Bolt is the fastest man alive: he holds the record for fastest recorded 100m dash at 9.58 seconds.
Now of course, you could be even more pedantic, like Randall Munroe, and ask the ridiculous question: two legs, or all fours? That question introduces the third and final problem for measuring direct determinants.
Problem #3 with measuring merit:
Scope of claim.
It’s easy to see that rampant qualification of determinants quickly extinguishes the romance and drama of titles like “fastest man alive” by turning them into “man with fastest two-legged recorded running speed.” I like to think of the addition of useless qualifiers to determinants as “baseball stats-ifying,” since it’s reminiscent of some of the more inane and arbitrary statistics dreamt up in sabermetrics.
Take, for instance, LIPS (late-inning pressure situation), which requires that:
The game must be in the seventh inning or later.
The batter’s team trails by no more than three runs, is tied, or is ahead by one run.
The batter’s team can also be down by four runs if the bases are loaded.
LIPS is still a one-to-one determinant; its sample size is just so vanishingly small for the average batter as to make it a worthless piece of data to collect.
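For the curious, the criteria above reduce to a short predicate. A minimal sketch (the function name and score convention are mine, not an official implementation):

```python
def is_lips(inning: int, run_diff: int, bases_loaded: bool) -> bool:
    """Late-inning pressure situation, per the criteria above.

    run_diff is the batting team's score minus the opponent's
    (negative when trailing).
    """
    if inning < 7:
        return False
    if -3 <= run_diff <= 1:  # trailing by up to 3, tied, or ahead by 1
        return True
    return run_diff == -4 and bases_loaded  # down 4 with the bases loaded

print(is_lips(inning=8, run_diff=-2, bases_loaded=False))  # True
print(is_lips(inning=5, run_diff=-2, bases_loaded=False))  # False: too early
```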
I may not be able to match Joey Chestnut’s “fastest hot dog eater alive” title, but can he match my “fastest Chicago-dog eater alive in Washington D.C. after midnight on a Thursday in November” title? Didn’t think so.
So definition of merit, scope of claim, and availability of information are the biggest issues when measuring direct determinants of merit. What about indirect determinants? Hoo boy….
Part 3: Indirect Determinants
Not every meritocratic process has the privilege of evaluating the same thing it is measuring for. There are a few different shapes that indirect determinants of merit can take:
Subjective Determinants
Subjective determinants eschew operationalization entirely. This is because the qualities they attempt to measure can’t meaningfully be assigned a number without also understanding the context of that number. The quality of art is subjective- if I told you Guernica was a 6, you might say
“6 out of what?” To which I’d reply:
“6 out of 7.”
Alright, so now you know what I think about Guernica in an abstract way. How do you plug my subjective judgement into a meritocratic process meaningfully? Well, you either ask me to rate a bunch of other art on the same scale of 7, or you ask a bunch of other people to rate Guernica on a scale of 7. Either way, the initial data point isn’t enough information to operationalize artistic quality without calibration.
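Here’s a minimal sketch of the first kind of calibration, with hypothetical ratings and a simple z-score: each rater’s scores are re-expressed relative to their own average, which makes them comparable across raters:

```python
import statistics

# Hypothetical ratings from two raters on a shared list of artworks.
ratings = {
    "me":  {"Guernica": 6, "Starry Night": 5, "The Scream": 3},
    "you": {"Guernica": 4, "Starry Night": 4, "The Scream": 2},
}

def calibrate(scores: dict) -> dict:
    """Z-score a single rater's scores against their own mean and spread."""
    mu = statistics.mean(scores.values())
    sigma = statistics.stdev(scores.values())
    return {art: round((s - mu) / sigma, 2) for art, s in scores.items()}

# "6 out of 7" becomes "how far above my own average," which is
# comparable across raters in a way the raw numbers are not.
for rater, scores in ratings.items():
    print(rater, calibrate(scores))
```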
The panel of judges in Olympic gymnastics is a good example of using both methods of subjective calibration mentioned above, as the judges all review the same candidates on the same scoring rubric and scale. This brings in problem #4:
Problem #4 with measuring merit:
Subjectivity must be calibrated away.
If you decide to calibrate away subjectivity with a group, you’re going to need to select that group- often using some form of meritocratic process, which puts us in a Catch-22: how do you select a group to calibrate away subjectivity when your selections themselves are subjective?
Partite Determinants
If you want to evaluate something unmeasurable like programming ability, you might instead decide to break down the topic you are evaluating merit for into multiple determinants like:
coding fluency,
ability to communicate ideas,
implementation of tests,
knowledge of algorithms and data structures.
When you get hired as a developer, you go through a technical interview in which you’re scored on these and other categories so the employer can get an idea of how good you’ll be at your job. This is a partite determinant: it attempts to operationalize merit and ability by breaking them down into constituent measurable qualities.
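In code, a partite determinant often looks like a weighted rollup. A minimal sketch, where the sub-determinants, weights, and 0-10 scale are my own assumptions rather than any real interview rubric:

```python
# Hypothetical weights: each one is itself a subjective judgment call.
WEIGHTS = {
    "coding_fluency": 0.4,
    "communication":  0.2,
    "testing":        0.2,
    "algorithms":     0.2,
}

def overall_score(sub_scores: dict) -> float:
    """Roll sub-determinant scores (each 0-10) into one master number."""
    return sum(WEIGHTS[k] * sub_scores[k] for k in WEIGHTS)

candidate = {"coding_fluency": 8, "communication": 6, "testing": 7, "algorithms": 9}
print(overall_score(candidate))  # 7.6
```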
Each of these sub-determinants has all of the problems with determinants mentioned above, with the added confounding factor of then being rolled up into a master judgement of merit. Definition and scope are compounding problems in partite determinants- how do we prove the measurable determinants we decided on are truly related to the unmeasurable one? And even if they are, how do we know the best arenas in which to test them? Here we arrive at problem #5:
Problem #5 with measuring merit:
As merit becomes more complex, it becomes more subjective.
Time-Delayed Determinants (Predictors)
Time-delayed determinants, or predictors, are those which operationalize a quality that can’t be measured yet. Instead, a measurable quality is chosen now as a leading indicator of future merit. Predictors themselves don’t have any unique inherent flaws, and in some ways they are better than the other indirect determinants, because they can often simply be direct determinants with an intermission.
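A quick sketch of what “an intermission” means in practice, with made-up numbers: score candidates at hiring time, wait, then check how well those scores tracked the merit you could finally measure directly:

```python
from statistics import correlation  # Python 3.10+

# Made-up data: predictor scores at hiring vs. performance a year later.
interview_scores  = [7.6, 5.4, 8.9, 6.1, 7.0]
review_scores_1yr = [6.8, 5.9, 8.2, 4.9, 7.4]

# Once the future arrives, the predictor can be validated like any
# direct determinant; a strong positive correlation means it worked.
print(round(correlation(interview_scores, review_scores_1yr), 2))
```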
Conclusions
Today we have explored the different types of determinants, and used the lens of meritocracy to highlight issues that exist in measuring them. If you have any questions, don’t hesitate to reach out to me in the comments or on Twitter.
I’m trying to move away from aggressive self promotion on subreddits, so if you liked this article, it would be a huge help to me if you shared it with a community that you think might enjoy it.
¹ God, I hate the English language. Why can it not just be “meritous”? Why “meritorious”? The first sounds like a reasonable word that obeys the standard rules of grammar; the second sounds like a word someone made up to sound fancy. Of course, the second is the one in the dictionary.