Tools for Archiving Social Media
This guide presents methods of collecting Twitter data using tools built by DocNow, a collaborative effort between Shift Design, Inc., the University of Maryland, and the University of Virginia, with funding from the Andrew W. Mellon Foundation. We use DocNow's tools for collecting Twitter data because of the project's strong commitment to ethical practices in the collection, use, and preservation of social media content.
Twarc for Twitter Data Collection
Twarc is a command-line tool that downloads tweets using Twitter’s API. APIs, or application programming interfaces, are ways that organizations, whether Twitter or the Census Bureau, provide more direct, programmatic access to their data. APIs also often limit how much data you can gather in a given period of time; these are called rate limits. Twarc handles the Twitter API’s rate limits for you.
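To make the idea of a rate limit concrete, here is a minimal Python sketch of what "handling rate limits" can look like in general. It is not Twarc's actual code. The `fetch` callable is a hypothetical stand-in for a single API request, and the use of HTTP status 429 and an `x-rate-limit-reset` header are assumptions about how the server signals that you need to wait.

```python
import time


def fetch_with_rate_limit(fetch, sleep=time.sleep, now=time.time, max_attempts=5):
    """Call fetch() until it succeeds, pausing whenever a rate limit is reported.

    fetch is a hypothetical callable that makes one API request and
    returns a tuple of (status_code, headers, body).
    """
    for _ in range(max_attempts):
        status, headers, body = fetch()
        if status != 429:  # 429 is the HTTP status for "Too Many Requests"
            return body
        # Assumption: the header holds the time (seconds since the epoch)
        # at which the limit resets. Fall back to a 60-second wait if absent.
        reset_at = float(headers.get("x-rate-limit-reset", now() + 60))
        sleep(max(0.0, reset_at - now()))
    raise RuntimeError("still rate limited after several attempts")
```

With a tool like Twarc you do not need to write this kind of loop yourself; the sketch only illustrates why a long collection may pause for several minutes and then resume on its own.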
The following pages provide instructions on installing and using Twarc. Parts of this guide may become outdated as Twitter’s developer site changes, so please treat it as a general guideline. For troubleshooting with Twarc, please contact the developers of DocNow, and join the conversation with the DocNow community of scholars, students, and archivists.
This page was published: April 25, 2019