For someone who doesn’t know any grammar, I can be a bit of a grammar Nazi sometimes. And one of my pet peeves is when people use the word data in the singular. No! Data are!

Or so I used to believe.

But recently I came across possibly the most sensible compromise between the “is” and the “are” crowds, articulated by Grammar Girl, Mignon Fogarty.

The compromise hinges on the hair-splitting difference between the so-called count nouns and mass nouns. Count nouns are those which you can count (e.g., I have five books, you have 75 CDs, he got 50 million votes). You have more or fewer of them, and you may say you have many of them. Mass nouns are those which cannot be made plural and cannot be used to count (e.g., you don’t ask for two cups of coffees, or two pieces of chalks). You have more or less of it, and you use it in a sentence with much, as in “how much coffee would you like”. (Of course, English being what it is, you can also say “how many coffees are you ordering”, which actually is shorthand for “how many cups of coffee …” and is thus implicitly pointing to a count noun.)

An easy way to tell these two types of nouns apart is to ask yourself how many or how much. If it makes sense to ask how many there are of a noun, as in how many cars or how many people, then it’s a count noun. If, however, it makes more sense to ask how much there is of a noun, as in how much butter or how much rain, then it’s a mass noun.

The use of many and much parallels the use of fewer and less: many and fewer are used with count nouns (like items in a grocery cart) and much and less are used with mass nouns, like tea or bacon.

The trick now is to realize that data can be both a count noun and a mass noun. If you use it as a count noun, it is always plural, and you are using it in lieu of the word “facts”. The literal translation of datum from the original Latin is “a thing given”, hence data is “things given”; it refers to a quantity. If instead you are using it in lieu of the word “information”, that makes it a mass noun, and it becomes singular. “The data are compelling” uses it as a count noun, just like “the facts are compelling”, whereas “the data is compelling” uses it as a mass noun, just like “the information is compelling”.

The count noun datum and its plural data, meaning “a given fact or assumption,” were adopted from Latin into English by the seventeenth century (2); however, it wasn’t till the late nineteenth century that data took on the modern sense of facts and figures. This shift in meaning also led some to start treating data as a mass noun.

She goes on to give some good advice:

So if data is correct as both a count noun and as a mass noun, which should you use? It comes down to style and personal preference. Many academic and scientific fields, as well as many publishers and newspapers, still insist on the plural count noun use of data.

Just be aware that if you do write or edit for a publisher or in a discipline that insists on plural data, you should make sure the surrounding words properly reflect the plural treatment of the word data. Even if you don’t have a style guide insisting on the plural usage but you decide to use it anyway because you like Latin plurals, be sure to do it consistently throughout the document — in other words, don’t mix up your datas, using it as a count noun in one place and as a mass noun in another.

I, for one, am willing to accept a ceasefire in the data wars.

