INCA: Infrastructure for content analysis


We present INCA (short for INfrastructure for Content Analysis), a Python module for collecting, storing, processing, and analyzing a wide variety of media content, including but not limited to news, political debates, social media, forums, and customer reviews. Using Elasticsearch as a database backend and Celery for task management, it makes automated content analysis (ACA) scalable. INCA’s main objective is to enable and promote an integrated workflow. It focuses on the reusability of data, processors, and analyses, making all steps of ACA accessible to social scientists without requiring advanced programming skills. Here, we present the aim, implementation, and recommended workflow for INCA.

In IEEE 14th International Conference on e-Science
Felicia Loecherbach
Assistant Professor of Political Communication and Journalism

My research interests include understanding online news consumption, drawing on theories from political communication and journalism. I use computational methods to study digital trace data. I publish my research and tools exclusively as open source.