Over the last few years, I have been working on quite a number of research tools, mainly for the data-driven analysis of social media platforms. My main goal is to gain a deeper understanding of the logics embedded in these platforms and their APIs, and I think that writing research tools is an excellent way to pursue this kind of exploration. Nothing beats first-hand experience.
Depending on availability, I am interested in participating in research software development on a freelance basis. If you are looking for consulting and/or programming capacity in the area of data analysis, web APIs, (interactive) visualization or data mining (machine learning, clustering, etc.), drop me a line at email@example.com.
Check out other tools developed by the Digital Methods Initiative here.
These tools are “as is” software; no support is provided. Most tools, however, have basic descriptions or FAQ sections. There are also a number of instruction videos on my YouTube channel. Most of the code is available on GitHub.
High-quality bug reports are much appreciated. Please send bug reports to firstname.lastname@example.org. I do not reply to inquiries on other channels. If you are interested in acquiring paid technical support, please also contact email@example.com.
The Digital Methods Initiative Twitter Capture and Analysis Toolset, developed with Erik Borra and Emile den Tex, provides various ways to retrieve and collect tweets from Twitter, along with a number of modules to analyze tweet collections. Requires server installation.
A tool that has extracted data from different sections of the Facebook platform – in particular groups and pages – for research purposes since 2010. Due to API and ToS changes, the feature set has changed over the years.
A simple tool that gets posts tagged with a specific term and creates tabular statistics and co-tag networks.
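To give an idea of what a co-tag network is, here is a minimal Python sketch. It is not the tool's actual code: the function name `cotag_edges` and the sample posts are made up for illustration, and it assumes each post has already been reduced to a list of tags. Two tags are connected whenever they appear on the same post, and the edge weight counts how often that happens.

```python
from collections import Counter
from itertools import combinations


def cotag_edges(posts):
    """Count how often each unordered pair of tags appears on the same post."""
    edges = Counter()
    for tags in posts:
        # Lowercase, deduplicate and sort, so that (a, b) and (b, a)
        # always end up as the same edge.
        unique = sorted({tag.lower() for tag in tags})
        for a, b in combinations(unique, 2):
            edges[(a, b)] += 1
    return edges


# Hypothetical sample data: one list of tags per post.
posts = [
    ["cats", "cute", "pets"],
    ["cats", "pets"],
    ["dogs", "pets", "Cute"],
]

edges = cotag_edges(posts)
for (a, b), weight in edges.most_common():
    print(f"{a} -- {b}: {weight}")
```

The resulting weighted edge list can be written to a file and loaded into a graph analysis program.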
A simple tool that gets media from Instagram tagged with a specific term or posted around a specific location and creates tabular statistics and co-tag networks.
Since Instagram has changed its API regulations, this tool no longer works.
A collection of simple tools for extracting data from the YouTube platform via the YouTube API v3.
A tool for analyzing textual data stored in timestamped lines of text (e.g. files from Netvizz, DMI-TCAT, etc.). Provides fast text searching and some statistical and visual text analysis. Work in progress. Requires server installation.
A visualization tool for analyzing changes in ordered lists (e.g. rankings) over time.
Calculates cosine similarity between lists of quantified variables (i.e. feature vectors) and outputs a similarity network.
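As a rough illustration of the principle, the following Python sketch computes the cosine similarity between named feature vectors and keeps the pairs that pass a cutoff as network edges. The function names, the sample vectors and the `threshold` parameter are assumptions made for this example; the tool itself may select and weight edges differently.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity: dot product divided by the product of the norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an all-zero vector is treated as similar to nothing
    return dot / (norm_u * norm_v)


def similarity_network(vectors, threshold=0.5):
    """Return (name_a, name_b, score) edges with a score at or above threshold."""
    edges = []
    for (name_a, u), (name_b, v) in combinations(sorted(vectors.items()), 2):
        score = cosine(u, v)
        if score >= threshold:
            edges.append((name_a, name_b, score))
    return edges


# Hypothetical feature vectors, e.g. counts for three variables per item.
vectors = {
    "a": [1, 0, 1],
    "b": [2, 0, 2],
    "c": [0, 3, 0],
}

network = similarity_network(vectors)
print(network)
```

Because cosine similarity only looks at the angle between vectors, "a" and "b" count as nearly identical even though their absolute values differ, while "c" shares no variables with either and stays unconnected.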
Input tags and values in wordle format to produce an HTML tag cloud or tag list.
Another small text analysis tool for emoji statistics and bigram/collocation extraction.
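Bigram extraction, in its simplest form, means counting pairs of adjacent words. The Python sketch below shows that basic idea under a few assumptions: the function name `bigram_counts` and the sample lines are hypothetical, tokenization is a plain regular expression, and only raw frequencies are counted, without the statistical association measures a fuller collocation analysis might use.

```python
import re
from collections import Counter


def bigram_counts(lines):
    """Count adjacent word pairs, never pairing words across two lines."""
    counts = Counter()
    for line in lines:
        tokens = re.findall(r"\w+", line.lower())
        # zip pairs every token with its successor within the same line.
        counts.update(zip(tokens, tokens[1:]))
    return counts


# Hypothetical sample input: one text per line.
lines = [
    "Social media data is messy",
    "social media platforms change",
    "Media data is everywhere",
]

counts = bigram_counts(lines)
for (first, second), n in counts.most_common(3):
    print(f"{first} {second}: {n}")
```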
A simple PHP script for using Google's Vision API. Takes a comma- or tab-separated file containing a column with image URLs as input, sends images to the Vision API and puts the detected annotations back into the list.
A series of PHP scripts to scrape and analyze pipermail list archives.
This is a collection of PHP command line scripts to grab data from Reddit and transform it into CSV files.
A (possibly growing) collection of basic Python scripts that interface common data with more complex forms of text processing.
Script that uses browser automation to click through the YouTube web interface and download the transcript file. A basic example for starting with Selenium.
Creates networks of related artists, based on data from Spotify.