Should “data” be singular or plural?

From The Economist:

For more than a millennium after the fall of Rome, educated Europeans were distinguished by their knowledge of Latin. One of the three subjects of the trivium—the basic tier of a classical education, itself based on a Roman model—was Latin grammar. Europeans have long since stopped writing primarily in Latin, but learned people are still expected to be able to deduce that to “decimate” means to destroy a tenth of something (a mutinous legion was punished in this way), or sprinkle annus mirabilis and mutatis mutandis into their speech.

It is not for lack of knowledge of, or affection for, Latin that The Economist marks a change this week. The reform involves one of the most curiously polarising issues an ending on a foreign word has ever generated in English. We will now allow singular use of data alongside the plural. Specifically, when considered as a concept—as in “data is the new oil”—the singular will be acceptable, as well as when the data in question is considered as a mass (“the data on this mobile-phone plan is insufficient”). However, when data points are considered as a group of pieces of information, the plural should still be used: “data from the National Oceanic and Atmospheric Administration indicate the hottest summer of all time”.

Data, as every child at a grammar school once knew, is the plural of Latin’s datum, “something given”. Originally that plural sense was carried over into English. But already in 1702, the Oxford English Dictionary records, came the first appearance of singular data, in an astronomy textbook. This was almost 60 years after plural data was first recorded.

The rise of computing has changed the balance. While an 18th-century scholar’s data might be a single column of numbers, today’s computers quickly manage billions of bytes. Data points begin to seem like the water molecules in the ocean and so, in such contexts, to be perceived as a mass. Singular data is now more common than the plural in books, and far more prevalent on the web.

Data is hardly the first foreign word to undergo grammatical change in English. The nearest equivalent is agenda, an old plural of agendum, “something to be acted on”. Once those collected agenda started being thought of as a list, the English singular was born. (Candelabra, stamina and insignia were all Latin plurals, too.) The Economist’s style guide prescribes a list of Latin -um words in English that pluralise with -a (memoranda, strata), but many more that violate Latin grammar and take -ums (forums, stadiums, ultimatums). It demonstrates that those words are now English; Latin rules need not apply.

Those who oppose singular data argue that the word refers to a set of numbers. Yet the properties of the thing itself are not a reliable guide to a term’s grammar. Go to a shop where dried goods are sold from barrels and note rice (a singular) next to lentils (a plural), and wheat (singular) next to oats (plural). Head to the pasta section and see what happens to other languages’ words in English: spaghetti and lasagne, both Italian plurals, are singular when served up in English.

Link to the rest at The Economist

13 thoughts on “Should “data” be singular or plural?”

  1. Jeeze.
    “How many angels can dance on the head of a pin?”
    And it’s not even a slow news week!

    They say it themselves: data is a collective. More importantly, data is only *useful* as a collective.
    Hardly anybody uses datum because a single point of data is useless. (And that only in academia, where one point is enough to spawn an entire paper.)

    As the saying goes: Once is happenstance, twice is coincidence. It takes thrice to actually mean something, be it enemy action or a plan. 😉

  2. While I’m all in favor of realism in evaluating the evolution of word forms as the natural language rolls along (even for formal usage), I think this is well overstating the case for “data, datum”. Yes, we have accepted “data” in casual use as a singular sometimes, but it’s still also accepted as a plural. What has really happened is that we are in the (lengthy) process of dropping “datum”, and making “data” acceptable as both the singular and plural (collective) form (cf. “sheep”).

    In usage, singulars of accepted Latin neuter plurals like this will survive for quite a while rhetorically as pointed insults, e.g., “In all this mass of alarm there was only one datum that seemed relevant…” [As we would use “factoid” in a derisory sense.]

    And the phrase “All the data are in support of…” isn’t going anywhere.

    All sorts of things can happen to grammatical distinctions from learned sources, and sometimes it depends on when and from where they first enter the destination language. We retain some of them inconsistently (cherub, cherubs, (cherubin), cherubim — via French/Latin/Greek/Hebrew), and others become confused. For example: “Stadium” (amphitheatre) is derived from “Stadia” (measurement), because the length of a stadium is (to quote the OED): “An ancient Greek and Roman measure of length, varying according to time and place, but most commonly equal to 600 Greek or Roman feet, or one-eighth of a Roman mile. (In the English Bible rendered by furlong.)” The plurals of Latin Stadium (Stadii) and Stadia (Stadia) are distinct, adding to the confusion, since all of us dimly-recollecting Latinists would normally guess at “Stadia” as the plural of neuter “Stadium”, but we would be wrong.

    Changes in grammatical number happen to us even in our native English via natural inheritance rather than learned language. Indo-European had not only a Singular and a Plural for grammatical number, but also a Dual form, for things that come in pairs (surviving variously in the daughter languages). For English, this is where the “n” comes from in “eyen”, a now-archaic/dialectal version of “eyes”, and the “r” in “door” (linguistic evidence that doors were perhaps a pair of “doorposts” when the word was established, rather than a single panel).

  3. The article nods in the direction of understanding the issue, but doesn’t quite make it. The issue is not whether “data” is singular or plural, but whether it is a mass noun or a count noun. English mass nouns are grammatically singular. The “data is plural!” crowd is asserting, usually without understanding it, that “data” cannot be used as a mass noun. Yet it clearly is used this way, and has been for a long time. Why is that wrongety wrong wrong? We are never quite told. Is there a rule that a word borrowed from a foreign language must retain its grammatical properties? A casual browse through any good dictionary will show that this is not, and never has been, a rule of English grammar. It isn’t a rule even if we limit it to familiar European languages.

    What actually happened is that “data” randomly won the spin of the grammatical peeve wheel. People get upset about it because they were told they should, not because there is any principled basis for them to.

  4. I’m glad they’re catching up 35 years later. Everyone has known since 1987 that Data is forever singular. And we’re not going to delve into “the company are” versus “the company is” any time soon, either, because there isn’t even the excuse of “it’s a dead language so we can do whatever the hell we want with it.” (Although after their performance yesterday, it’s definitely “Manchester United are” — there’s nothing unified or singular about them.)

  5. Living languages evolve under pressure of usability. Even the French academy has come to bow before the inevitable and grudgingly accept neologisms and (grrr) anglicisms.

    English has long been mutable in vocabulary and grammar as usage mandates and even seriously “offensive” terms like “ain’t” endure because they serve a purpose. (As recently pointed out by “them’s the breaks”.) Whether Latin, French, Spanish, German, or Japanese, English will happily adopt any word or usage that serves a purpose. And it is that mutability that is propelling it as the global lingua franca.

    Usage rules. Offensive though it might be to the old school.

    Tomorrow’s English (say a century or three) will very likely be as unintelligible to us as Old English is to most people. So what? It’ll be their world and their needs setting the rules.

    (That’s one thing FIREFLY got right.)

    • What they’re really upset about is that they had an elite edge (dimly remembered Latin neuter plurals) and the hoi polloi(*) aren’t using their words right.

      This is the classic case of a middle-brow intellectual: bright enough to remember there is a status usage they believe they should know, dim enough not to remember how it really works in Latin, proud that they can exhibit (what remains of) their education to brag about their (self-perceived) class, and way too unsophisticated to understand the basis of their “rule”, either in Latin or in how living languages like English work (much less the history of their native tongue and its evolution over time). This is the class that memorizes rules; they don’t understand them (much less extend/refine them), they’re not curious about where they come from, but they got a B on the test and that’s all that counts.

      (*) Yes, “the hoi polloi” is redundant (“the the people”) but it’s how the phrase keeps being used in English quotations. You can see how I dread being corrected by a “karen”. (And to think I’ve lived to see the day that my own name got co-opted as a byname for a scold…)

      • Ah, my sister-in-law shares your name, so I am forbidden to use it as a descriptive.

        Since I do not know anyone with the name, I’ve fallen back upon the old-school “Mrs. Grundy” for such people.

      • They can always fall back on using the native foreign language pronunciation for foreign cities. An anglicized pronunciation is so gauche and unrefined. And the use of English words derived from French (Latin) is still a priority for many.

    • Apropos of the French Academy…

      I learned my French before there were such things as consumer computers (late 70s-early 80s).

      I well remember salmon fishing with my husband in the Quebec region early in my tech career, in a rural Quebecois-speaking-only area. When the guides wanted to know my profession, I tried out “computer” in a (Parisian) French accent, to no avail (and since even “hamburger” was still iffy in French, I wasn’t entirely surprised). Gestures and descriptions didn’t work — they’d never seen small ones.

      Not until I returned to the States did I discover the word I wanted was “ordinateur”.

      We used to amuse ourselves on the long drives back from fishing trips in French Canada by trying to carry on long conversations in French with each other. Since my grasp on the spoken language was glib but fading, and my husband’s was more or less non-existent, it generally devolved into Pepe le Pew vaudeville before we got south of Maine.
