Ah, review scores. They’re the flame that draws the moth-like lazy readers who want anywhere from 500 to 2,000 words summed up in a single number. If that sentence doesn’t convey their inherent problem — and years of watching the Olympics or receiving grades on exams haven’t clued you in — then consider how much grey area exists between one and 100.
Any form of media, from books to games, is not made of the same stuff as a 100-point exam, either. If each test question is worth one point, then figuring out the grade the student deserves is an easy enough calculation. It’s when the teacher starts awarding half-points and quarter-points that you storm over to her desk and demand an explanation.
As a reviewer of various things, I assign scores. Outlets tend to use their own criteria, forming a total out of 10 or 100, for example, or maybe even adopting letter grades. Even so, what a “9” represents on one website is not the same somewhere else, even though we like to treat it as such on aggregate sites like Metacritic.
I’ve switched over to letter grades (A through F) when reviewing for pleasure because it’s familiar and refreshingly straightforward. I don’t have to worry about how a 9 is minutely different from an 8 when a 6 is in a separate league of awfulness. Ironically, the grading system we turn to for simplicity has poisoned how we measure quality. (And worse, I can’t seem to stop throwing in pluses and minuses. Help!)
It’s hard to determine the best score. We get so bogged down by what’s “fair” that we have no idea what fair means, but the truth of the matter is that no one will ever agree on a single verdict. You’re always wrong to someone. There is no perfect number because the numbers mean little to begin with.
This is what gives writers grief: Numbers do carry weight. Readers don’t read reviews as much as they hunt for the end result and then go back and read the evidence, not to learn or better understand but for ammunition to attack the critic with.
You might think otherwise, but reviewers really do care. In video games, I obsess over a score more than I should and usually end up regretting it later no matter what choice I make. But we do weigh the pros and cons, shuffle around the numbers in our heads, and put forth a greater effort than a lot of people seem to believe.
The source of the problem is not the system itself but how we’re inclined to interpret it. Even the places that crack down, making a point to stretch their legs in all that numerical space, can’t seem to budge out of a certain range.
For example, all Game Informer loves to give is 8s, and all Polygon feels comfortable with is 6s. You can’t count too many high scores without seeing an immediate, almost predictable drop, as though the staff is meeting some sort of quota for the month.
I love both those sites dearly, but what we need is to do away with the 10-point system entirely. No one possesses the sort of reflex that enables them to operate well within that range because we were all raised on the same grading system in America, where As and Bs are good and everything else is shameful. Better not show that one to your parents.
If we’re all so awkward at assigning scores that we’re making it up as we go along, why even bother to have them? Well, they’re expected. Mandatory. A part of the industry and the universal language for “good” and “bad” product. A thumbs-up and thumbs-down are too basic; we need more flexibility to allow for degrees of quality.
That concept just gets out of control.
What’s worse is when we take back scores under pressure, like Polygon did recently with one of its articles. What happened is this: Electronic Arts released SimCity on its always-online service Origin, and early reviewers had no problem playing around in the game, which is apparently quite good. Then launch day came, and the public servers were fired up, and no one could access the game they had paid $60 for.
So Polygon downgraded its score from a very generous 9.5 to an 8.0 and then to a 4.0. You can see the history and the reasons given for it on the review page.
This withdrawal seems like more of an attempt to be a voice of the people than to present an honest assessment. But game “journalists” aren’t gamers’ friends even though we’re gamers ourselves. When you change a score, you’re giving readers a reason to doubt your credibility. It’s not uncommon for a game experience to change before or after release, mostly due to performance-enhancing or bug-fixing patches, but reviewers rate a game based on the state in which they played it, not external factors that take effect afterward.
And moving from an 8 to a 4 is a mental reconfiguration equal to a geographic landslide.
The bottom line is that connectivity is a temporary issue. Even Blizzard’s Diablo III failed to launch when it was supposed to, but servers do not make a game. As embarrassing as it is, shouldn’t Polygon be obligated to restore its original outcome once these problems are completely resolved and the experience is, well, good again? Amazingly good, apparently. A 9.5 is pretty outstanding.
SimCity itself doesn’t sound like a low-quality game to me, but unfortunately, the business doesn’t separate review scores into sections: one for the game and one for server maintenance. Maybe it should.