Diachronic study of Wikipedia articles

This work focuses on the evolution of Wikipedia articles and on the obstacles to studying it. Through a series of case studies, the aim is to identify and/or build methods and tools that facilitate the collection, analysis and visualization of the numerous versions that make up the history of a given article. We are interested in the content of the articles, but also in the links that connect them, in the references they draw on, and in the exchanges that take place on their discussion pages.

Key Concepts

Mediation of knowledge, Temporality of media, Commons-based peer production, Distant reading

Media studied

Wikipedia articles, version history, discussion pages, links between articles, references used

Types of deliverables

Paper at a scientific conference, Article in a scientific journal, Computer program, Data sets

Methodology

While the diachronic study of Wikipedia is not new, it is most often based on qualitative analyses, necessarily limited to a small number of articles and time points. To overcome these limitations, this project builds on work applying distant reading to version histories. Initial studies led to the design and development of Historic Graphs (https://framagit.org/retrodev/historic-graphs), a tool that leverages MediaWiki APIs to build dynamic graphs showing the evolution of links between articles over time. Because of the considerable (and growing) density of the encyclopedia’s link network, especially around articles with spatial or temporal themes (years, country names, etc.), such graphs nevertheless remain difficult to analyze. Other developments are envisaged to facilitate the analysis of the joint evolution of a limited number of articles. More qualitative approaches have also been explored, through the study of the references cited or the comparison with the temporalities of other media (press, scientific literature).
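To make the idea of tracking links across versions more concrete, here is a minimal Python sketch. It is not the Historic Graphs implementation: the function names, the regular expression, the English Wikipedia endpoint and the User-Agent string are all assumptions chosen for illustration. It requests revisions through the MediaWiki Action API (`action=query`, `prop=revisions`), extracts the wikilinks from each revision's wikitext, and lists the links added and removed from one version to the next.

```python
import json
import re
import urllib.parse
import urllib.request

# Assumption: English Wikipedia endpoint; any MediaWiki api.php URL would do.
API_URL = "https://en.wikipedia.org/w/api.php"

# Matches [[Target]], [[Target|label]] and [[Target#Section]]; captures the target only.
WIKILINK = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|[^\[\]]*)?\]\]")


def extract_links(wikitext):
    """Return the set of article titles linked from a revision's wikitext."""
    links = set()
    for match in WIKILINK.finditer(wikitext):
        target = match.group(1).strip().replace("_", " ")
        # Rough heuristic: skip other namespaces such as File: or Category:.
        if target and ":" not in target:
            links.add(target[0].upper() + target[1:])
    return links


def link_changes(snapshots):
    """Given (timestamp, wikitext) pairs in chronological order, return
    (timestamp, links_added, links_removed) for each snapshot."""
    changes = []
    previous = set()
    for timestamp, wikitext in snapshots:
        current = extract_links(wikitext)
        changes.append(
            (timestamp, sorted(current - previous), sorted(previous - current))
        )
        previous = current
    return changes


def fetch_revisions(title, limit=10):
    """Fetch up to `limit` revisions of `title`, oldest first, with wikitext."""
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "timestamp|content",
        "rvslots": "main",
        "rvdir": "newer",
        "rvlimit": str(limit),
        "format": "json",
        "formatversion": "2",
    }
    request = urllib.request.Request(
        API_URL + "?" + urllib.parse.urlencode(params),
        headers={"User-Agent": "diachronic-sketch/0.1 (illustrative example)"},
    )
    with urllib.request.urlopen(request) as response:
        page = json.load(response)["query"]["pages"][0]
    return [
        (rev["timestamp"], rev["slots"]["main"].get("content", ""))
        for rev in page.get("revisions", [])
    ]
```

Calling `link_changes(fetch_revisions("Artificial intelligence"))` would give, for each retrieved revision, the links that appeared and disappeared. A regular expression is only an approximation: links generated by templates are missed, which is one reason a real tool may prefer parsed link data to raw wikitext.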

Expected results

This project aims to determine the conditions under which Wikipedia can constitute a relevant source for studying the evolution of certain concepts and of their representations in the public sphere over a given period. This work should also clarify the place the encyclopedia occupies within the media ecosystem that, in turn, feeds it and feeds on it. To that end, it will resonate with much of the current research on the recomposition of this ecosystem, of the actors that make it up, and of the principles on which its legitimacy rests.

Key dates

2018: beginning of the project and development of the first versions of Historic Graphs
2019: presentation of the tool and of a first case study (November 13, 2015 attacks in Paris) at the H2PTM conference, with a discussion of the limits of graphs: Segault, A. (2019). Historic Graphs: dynamic mapping of links on Wikipedia. H2PTM 2019, From hypertext to digital humanities.
2021: presentation of a case study (including the qualitative dimension and the comparison of temporalities) on the representation of artificial intelligence at the H2PTM conference: Bouchereau, A., & Segault, A. (2021). Évolution des représentations de l’intelligence artificielle : De quelle(s) IA Wikipédia est-elle l’encyclopédie. H2PTM 2021, Information : Enjeux et nouveaux défis.
• Access to the datasets: https://framagit.org/retrodev/historic-graphs/-/tree/master/examples/h2pt https://framagit.org/bouchereaua/ia-wikipedia-h2ptm21

Other members involved

Aymeric Bouchereau (UPEC / Lab’Urba)