Data science cat and dog

Andrew Russell Green

Research, data science and software portfolio

Wikifunctions UX

Mixed-methods, generative UX/design research to support new coding features on Wikipedia.

Skills used
UX/design research
Mixed-methods research
Research methodology
Product recommendations
Interviewing
Statistics
Data visualization
SQL
Python
Writing

This study was requested by the Wikimedia Foundation to help design a feature to add content from a new site to Wikipedia.

The new site, called Wikifunctions, is a wiki for computer code. The requesting team planned to add a feature to Wikipedia to let volunteers use code from Wikifunctions directly in Wikipedia articles.

The team had a general notion of how this might work and why it would be useful. Also, they expected that adding this feature to Wikipedia would bring them closer to another, more ambitious goal: the creation of Abstract Wikipedia, a language-independent version of Wikipedia.

Still, a lot of details needed to be filled in before work on the new feature could begin.

We first defined the project's scope and approach. We then conducted semi-structured or unstructured interviews with 27 volunteers in 13 countries, and followed up with quantitative analyses of data from Wikipedia servers.

This led to numerous product recommendations, as well as new insights about the role of technical Wikipedia volunteers, and about how Wikipedias in different languages evolve and diverge.

Approach

This was exploratory, generative research: we sought to answer open-ended questions and generate product recommendations. The central research question was: how could volunteers use Wikifunctions to improve their workflows and effectiveness when contributing to Wikipedia?

To explore this question from a community perspective, we interviewed highly engaged volunteers. This resembles the anthropological technique of working with key informants who have deep knowledge of their communities.

Following the interviews, we analyzed data from Wikimedia servers to see if we could confirm what we’d been told. We also studied online documents created by volunteers.

We focused mainly on Wikipedias in languages that had previously been selected as priorities for collaboration by the requesting product team.

Results

Our research confirmed that, as expected, Wikipedia volunteers have many issues with existing code features on Wikipedia. We recommended using Wikifunctions to help address some of these problems.

The most common workflow that volunteers told us about was translating articles from English to their language using a system called the Content Translation Tool (CX). However, the subsequent quantitative analysis showed that this workflow just happened to be unusually common for the communities selected for the research.

The plots below are from the quantitative part of this study. The first plot shows the growth of CX usage on the Wikipedias included in the study. The second shows the distribution, across all Wikipedias, of the percentage of new articles created using CX. The third plot illustrates how, across four different sets of Wikipedias, this percentage can actually imply quite different total numbers of article creations.
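To make the last point concrete, here is a minimal arithmetic sketch in Python. The function name and all figures are hypothetical illustrations, not data or code from the study; they only show how an identical CX share can correspond to very different article counts.

```python
def cx_article_count(total_new_articles, cx_share_percent):
    """Number of new articles created with CX, given a total and a CX share in percent."""
    return total_new_articles * cx_share_percent / 100


# Hypothetical wikis with the same CX share but very different sizes
# (made-up numbers, not results from the study).
small_wiki_cx = cx_article_count(total_new_articles=500, cx_share_percent=20)
large_wiki_cx = cx_article_count(total_new_articles=50_000, cx_share_percent=20)

print(f"Small wiki: {small_wiki_cx:.0f} CX articles")
print(f"Large wiki: {large_wiki_cx:.0f} CX articles")
```

In this made-up case, the same 20% share differs by a factor of 100 in absolute article creations, which is why the percentage alone can be misleading when comparing sets of Wikipedias.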

Another qualitative finding that we examined quantitatively was the co-evolution of Wikipedia coding features, articles and editorial policy. While we were not able to confirm this phenomenon directly, the quantitative analysis is at least consistent with it.

Below, the first plot shows the distribution of the number of templates (currently the site's main coding system) across Wikipedias. The second is a matrix of correlations (Pearson correlation coefficients of log-transformed values) across Wikipedias for the amount of on-wiki code, number of articles, number of policy pages, and a few other metrics.
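As a rough illustration of how such a correlation matrix can be computed, here is a minimal Python sketch using only the standard library. It is not the code from the study's notebooks: the function names, the metric names and all values are hypothetical, and it assumes every value is strictly positive so that the logarithm is defined.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def log_correlation_matrix(metrics):
    """Pairwise Pearson correlations of log10-transformed metrics.

    `metrics` maps a metric name to a list of positive values, one per wiki.
    """
    logged = {name: [math.log10(v) for v in values] for name, values in metrics.items()}
    names = list(logged)
    return {a: {b: pearson(logged[a], logged[b]) for b in names} for a in names}


# Made-up values for five hypothetical wikis (not data from the study).
example_metrics = {
    "templates": [120, 950, 4_300, 18_000, 76_000],
    "articles": [2_000, 15_000, 90_000, 400_000, 1_500_000],
    "policy_pages": [5, 12, 40, 35, 210],
}

matrix = log_correlation_matrix(example_metrics)
for name, row in matrix.items():
    print(name, {other: round(r, 2) for other, r in row.items()})
```

The choice of logarithm base does not affect the result, since changing the base only rescales the values and the Pearson coefficient is invariant to linear rescaling. Log-transforming first is a common choice when metrics span several orders of magnitude, as wiki sizes typically do.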

Code and Report

The Jupyter notebooks used in the quantitative analysis are here and here. The full report on the study is available here.