'Data' - singular or plural

We work with data all the time, but we're never sure whether to use the word as a singular or a plural noun. Whichever way we jump we are sure to upset somebody. If we treat 'data' as a plural noun and say 'the data are stored', we sound old-fashioned to young developers. The alternative is to treat it as a singular noun and say 'the data is retrieved', but that makes us sound slapdash to senior staff.

Whatever we do will be wrong, so we tend to hide our personal views and avoid the ambiguity of the bare word 'data' altogether. Instead we use a term that is obviously singular or obviously plural and say 'the data files are stored' or 'a data item is retrieved'. Privately we treat 'data' as singular, but over Christmas I came across a line of argument which might give us the confidence to come clean in public.

The Latin inheritance

The problem of course comes from the Latin word 'datum' - a known fact. The plural in Latin is 'data' and this is the word that programmers have adopted to describe a collection of known facts. English speakers seem to be happier to treat this plural noun as though it were singular and this is what upsets the pedants.

We've now found two other words to upset the pedants. 'Stamina' is one: it's the plural of 'stamen', a thread or sinew. But 'agenda' is the killer. If anyone complains about our saying 'the data is', I shall remind them that 'agenda' is the plural form of 'agendum'. It means 'items to be discussed', and I shall inform the complainant that they must treat the word properly as a plural. I won't let them say 'an agenda', but I will insist that they say 'a few agenda' or even 'some agenda'. Furthermore, they won't be allowed to put something 'on' an agenda; they'll have to put it 'in' or 'amongst' the agenda.

We have, however, given up on the argument about 'media' and 'medium'. It's the same Latin problem: 'media' is plural, so a single CD can't be storage media. But this doesn't seem to upset the purists in the same way as 'data' vs 'datum'.