All Tweets Will Soon Be Available to Researchers

May 27, 2014 | Andy Cush

Twitter is a treasure trove of information about the people who use it. With hundreds of millions of tweets posted every day, researchers can use the social network to analyze everything from the emotional landscape of a particular geographic region to the linguistic characteristics of a city’s neighborhoods.

Twitter’s API, however, only searches about one percent of the service’s vast library of tweets, meaning that researchers don’t have easy access to the entire archive. That may soon change. According to Scientific American, the company will begin making every single public tweet, dating back to Twitter’s beginnings in 2006, available for scientific work.

As Scientific American points out, that raises some ethical questions:

Will Twitter retain any legal rights to scientific findings? Is the use of Twitter as a research tool ethical, given that its users do not intend to contribute to research?

The first is interesting, but there’s an easy answer for the second. As the recent dustup surrounding BuzzFeed’s use of tweets from rape survivors in an article about sexual assault showed, Twitter is and has always been a public platform that’s owned and operated by a for-profit company. If you’re worried about who’s reading or otherwise using your tweets after you post them, you might consider making your feed private (or shutting it down altogether).

However, just because you can access someone’s tweets doesn’t mean you should use them without sensitivity. In February, two epidemiologists from Virginia Tech outlined reasonable guidelines for using Twitter data:

To address these concerns, Caitlin Rivers and Bryan Lewis, computational epidemiologists at Virginia Tech, published guidelines for the ethical use of Twitter data in February. Among other things, they suggest that scientists never reveal screen names and make research objectives publicly available. For example, although it is considered ethical to collect information from public spaces—and Twitter is a public space—it would be unethical to share identifying details about a single user without his or her consent. Rivers and Lewis argue that it is crucial for scientists to consider and protect users’ privacy as Twitter-based research projects multiply.