Surrogation Nation
Why we should give every metric an expiration date.
Ce qui est simple est toujours faux. Ce qui ne l'est pas est inutilisable.
What is simple is always wrong. What is not is unusable.
- Paul Valéry
In order to decide as a group, we must reason about the unknowable. In order to reason about the unknowable, we must approximate it with the knowable. In reasoning about the knowable, humans have the unfortunate tendency to forget that there was ever anything unknown to reason about at all.
We really need to land on a phrase for this bad habit of ours. I quite like the term “surrogation,” despite its provincial upbringing in managerial research. We also need a term for the subjects of surrogation; I call them “inbred metrics” further down the page, but only for want of a more descriptive term. To drive home my point that we need to agree on language, and that this is a far-reaching and endemic problem, here are 13 Wikipedia articles from different fields addressing surrogation:
Surrogation - A measure of a construct of interest evolves to replace that construct.
Campbell's Law - The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor.
Map-Territory Relation - A map is not the territory it represents, but, if correct, it has a similar structure to the territory, which accounts for its usefulness.
McNamara Fallacy - Making a decision based solely on quantitative observations (or metrics) and ignoring all others.
Goodhart's Law - "When a measure becomes a target, it ceases to be a good measure."
Lucas Critique (economics) - It is naive to try to predict the effects of a change in economic policy entirely on the basis of relationships observed in historical data.
Reification (fallacy) - The error of treating something that is not concrete, such as an idea, as a concrete thing.
Overfitting - The production of an analysis that corresponds too closely or exactly to a particular set of data, and may therefore fail to fit to additional data or predict future observations reliably.
Faulty Generalization - A conclusion is drawn about all or many instances of a phenomenon on the basis of one or a few instances of that phenomenon.
Black swan problem // Problem of induction (philosophy) - Rejection of knowledge that goes beyond a mere collection of observations.
Perverse Incentive // Cobra Effect - An incentive that has an unintended and undesirable result that is contrary to the intentions of its designers.
Instrumental convergence (The Paperclip Problem) - The tendency to pursue unbounded instrumental goals.
If you’re a Baudrillard fan, this may be giving you Simulacra flashbacks. Simulacra are copies that have no original, the same as inbred metrics. If you haven’t read Simulacra, the tl;dr is that we gradually replace the real with the simulated until the simulated fully subsumes the real, and we are unable to tell the difference between a copy of reality and reality itself.
The postmodern project is founded on the graves of intellectuals who forgot their metrics were meant to show deeper truth. I could go on about how we practice idolatry in the modern day, scientism, verificationism, logical positivism, etc, etc, etc. However, we’ve been out in this field beating a strawman Office Space style for 30 years and it’s time to let go.
Abstraction is always wrong, but it’s often right enough to get an edge on nature. In my article on professionalism and stereotyping I attempt to ask if the concepts are useful, and really only show that they’re useful to those who decide to use them, oftentimes to perpetuate unfair systems. How can we use knowledge of inbred metrics to benefit all humans, not just those with systemic power?
Expiry
“On Exactitude in Science,” in Jorge Luis Borges, Collected Fictions, translated by Andrew Hurley.
...In that Empire, the Art of Cartography attained such Perfection that the map of a single Province occupied the entirety of a City, and the map of the Empire, the entirety of a Province. In time, those Unconscionable Maps no longer satisfied, and the Cartographers Guilds struck a Map of the Empire whose size was that of the Empire, and which coincided point for point with it. The following Generations, who were not so fond of the Study of Cartography as their Forebears had been, saw that that vast Map was Useless, and not without some Pitilessness was it, that they delivered it up to the Inclemencies of Sun and Winters. In the Deserts of the West, still today, there are Tattered Ruins of that Map, inhabited by Animals and Beggars; in all the Land there is no other Relic of the Disciplines of Geography.
—Suárez Miranda, Viajes de varones prudentes, Libro IV, Cap. XLV, Lérida, 1658
I don’t really care for this Borges story, but it drives the point home in fewer words than I ever will. Maintenance of models/metrics/simulacra is pointless: they are too complex to stay in lockstep with reality as we age and our understanding develops. Empires, seasons, lives, metrics - all things wane.
What I’d like to propose is that some need help to wane a little faster. Let’s start giving metrics (and hopefully, much, much more) expiration dates.
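To make the proposal concrete, here is a toy sketch of what a metric with a built-in expiration date might look like. This is illustrative Python, not an existing library; `Metric`, `stands_in_for`, and `lifespan` are names I made up:

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

@dataclass
class Metric:
    """A surrogate measure that must be re-justified before its expiry date."""
    name: str
    stands_in_for: str   # the unknowable construct this metric approximates
    created: date
    lifespan: timedelta  # how long until the surrogate must be re-examined

    @property
    def expires(self) -> date:
        return self.created + self.lifespan

    def is_expired(self, today: Optional[date] = None) -> bool:
        return (today or date.today()) >= self.expires

# JIF, had it shipped with an expiration date, would be long past due.
jif = Metric(
    name="journal impact factor",
    stands_in_for="scholarly merit of a journal",
    created=date(1975, 1, 1),
    lifespan=timedelta(days=5 * 365),  # five years, then re-justify or retire
)
assert jif.is_expired()
```

The bookkeeping itself is trivial; the value is that an expiry date forces someone to periodically ask what the metric was standing in for, and whether it still does.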
I present to you the San Francisco Declaration on Research Assessment
From 1975 on, academia went nuts for JIF.
No, not this JIF, this JIF (eat your heart out, Faith Hill):
JIF (journal impact factor) is an equation that gives the ratio of citations to publications for a journal over a two-year period. JIF was devised by Eugene Garfield in 1975. This has made many people very angry and has been widely regarded as a bad move.
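In concrete terms, the standard two-year formulation works out to (notation mine):

$$\mathrm{JIF}_y \;=\; \frac{\text{citations received in year } y \text{ to items published in years } y{-}1 \text{ and } y{-}2}{\text{number of citable items published in years } y{-}1 \text{ and } y{-}2}$$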
While originally invented as a tool to help university librarians to decide which journals to purchase, the impact factor soon became used as a measure for judging academic success.
Everyone forgot all about the (literally) poor librarians. Not long after Garfield created his ill-fated metric, the woozle effect was in full swing, and the number of citations mattered more than merit.
Thankfully, research is beginning to come to its senses. In the past decade a sea of voices has shouted down JIF, and all its nutty goodness. The SF Declaration on Research Assessment is admirable in its resistance to the stranglehold of an inbred metric on an industry.
However, other contemporaries push further. Principle #10 of the Leiden Manifesto hits the most important nail on the head: “Scrutinize indicators regularly and update them.”
Conclusion: Embrace the lie
We all know that art is not truth. Art is a lie that makes us realize truth, at least the truth that is given us to understand. The artist must know the manner whereby to convince others of the truthfulness of his lies.
- Pablo Picasso
When I sat down to write this piece, I initially wanted to make an argument for fighting metric rot with model validation. I’ve changed my mind. Don’t adjust metrics, dismantle them. Acknowledge that the abstractions and concepts of 30 years past have strayed too far from reality to still be of use. Call metrics what they are: useful lies.
After publishing my clumsy article satirizing strong longtermism, I spent a lot of time arguing with redditors about the distinction between their belief in the possibility of space colonization and the actual possibility itself. Hint: there isn’t one.
Arguing about the viability of future technology is a lot like overfitting a metric. It isn’t just a waste of time, it’s a vain indulgence. It’s a refusal to admit that the disconnect between our models, our projections, our estimations and reality is greater than we can overcome.
We cannot know the future. We can only wait and see if our best guess was good enough. Why scold the prophet while begging for new prophecy? Don’t dabble in idolatry, dive in.
Shoot the false prophet, and find a new one.
Bonus round
Some ways in which we trick ourselves into believing the unknown is known:
Operationalization (Social sciences) - A process of defining the measurement of a phenomenon which is not directly measurable.
Proxy (statistics) - A variable that is not in itself directly relevant, but that serves in place of an unobservable or immeasurable variable.
Policy-based evidence making - The commissioning of research in order to support a policy which has already been decided upon.
Streetlight Effect // Drunkard’s Search - When people only search for something where it is easiest to look.
Some follow-on effects:
Bonini’s Paradox - As a model of a complex system becomes more complete, it becomes less understandable.
Counterfactual definiteness (Quantum mechanics) - The ability to speak "meaningfully" of the definiteness of the results of measurements that have not been or cannot be performed.
Observer effect (physics) - The disturbance of an observed system by the act of observation.