MIP – Innovative Pedagogical Modules / Digital explorations of the INA archives

The project, conducted in collaboration with Marta Severo and Antonin Segault, professor and lecturer at the University of Paris Nanterre, focuses on a different theme in each edition (the imaginary of computer science, the body, Europe, Sport…). Using data sets, some of which have been processed by the AI tools of the INA’s Research Department (transcription, speech and face detection, similarity recognition, etc.), participants must produce analyses that renew the scientific exploitation of heritage collections. Over 4 consecutive days, organized in groups and in a “Sprint” format, the participants alternate work sessions between the Learning Lab of the University of Paris Nanterre and the Inathèque.

Key concepts

Work in groups, data

Study materials

Data sets drawn from the Inathèque’s holdings, which cover the full scope of radio, television, and the Web since the origin of each medium: 22 million hours of radio and television, 60 million documentary records, 118 billion archived Web URLs, and almost 3 billion posts from social network accounts.

Types of deliverables

Presentations by groups of students on the data sets they have worked on and their analyses of them

Methodology

The students, divided into groups, analyze data selected from the Inathèque’s collections and apply processing to them. They use mining and visualization tools that reveal more clearly the salient features, and even the gaps, in these massive data sets. Before the event, INA’s research engineers also apply search, visualization, or transcription processing that they have developed to some of these corpora. In a short time (one week), the groups organize themselves to make these data sets “speak” through the processing they have applied during the sprint. On the last day of the sprint, they produce analyses and a report.

Expected results

Within a short time, a combined quantitative and qualitative exploitation and analysis of data by groups of students. Making the INA’s data “speak”.

Key dates

• 2020: First edition on the theme of the imaginary of computer science (see reports https://inatheque.hypotheses.org/20467 and https://inatheque.hypotheses.org/20461)
• 2021: Second edition on the theme of the body (see presentation https://inatheque.hypotheses.org/22479 and reports https://inatheque.hypotheses.org/23014 and https://inatheque.hypotheses.org/23148)
• 2022: Third edition on the theme of Europe (see presentation https://inatheque.hypotheses.org/25145)
• 2023: Fourth edition to come on the theme of Sport