This project presents the R code used to prepare and analyse data coming from a lesson recording.
The data is composed of the transcript of a teacher’s discourse during an informatics lesson, with some comments.
It is part of a PhD project in educational sciences.
This file presents the R code for the analysis of a lesson where one single teacher is working and talking.
For data protection reasons, the data is not available in this repository. We only provide the R code, to give an idea of the process set up to analyse the data.
Apart from the RStudio project file itself (`lesson_discourse_1t.Rproj`), the main file is a Quarto document: `class_id_lesson_id.qmd`.
Because we reuse this project to analyse different lessons, most of the code is centralised in several .R files that this Quarto document imports:
- `01_libraries.R` imports all the packages needed in the project
- `01bis_import_initial_variables.R` imports the variables provided by another analysis (dataframes called `classes`, `lessons`, `discourses` and `teachers`, which centralise all the variables related to the classes, the lessons, the discourses and the teachers)
- `02_import.R` imports transcripts and comments as .csv files coming from the Trint platform
- `03_end_of_intro.R` cleans the data, merges discourse and comments, creates the main Quanteda `corpus` and `tokens` objects and computes textual statistics
- `04_splitting_lemmatisation_tokenisation_dfm_creation.R` splits the corpus into segments for the clustering, lemmatises with spaCy, adds all the variables to the corpus as `docvars()` and creates a document-feature matrix (Quanteda `dfm` object)
- `05_plot_frequencies.R` plots the most frequent words in the lexicon
- `06_reinert_clustering.R` computes the clustering with the `rainette` package, identifies the biggest clusters and plots the evolution of clusters over time
- `07_alt_creation_and_ca.R` creates an aggregated lexical table (ALT), computes a Correspondence Analysis (CA) using `CA()` from the `FactoMineR` package, plots it and prepares tables showing the elements most related to the first two axes of the CA
- `08_export.R` exports the Quanteda `corpus` and `tokens` objects for the global analysis grouping all the lessons, as well as all the computed variables as dataframes
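
Since the data itself cannot be shared, the following minimal sketch illustrates the core Quanteda steps named in the list (corpus, tokens and dfm creation, then a frequency count) on a toy data frame. The column names (`segment_id`, `text`, `minute`) and the three sentences are hypothetical placeholders, not the project's actual variables or data, and the real scripts additionally segment, lemmatise and cluster the text.

```r
library(quanteda)

# Toy stand-in for the cleaned transcript (hypothetical columns and values)
discourse <- data.frame(
  segment_id = c("s1", "s2", "s3"),
  text = c(
    "Today we write a loop in Python.",
    "A loop repeats the same instructions.",
    "Open your editor and test the loop."
  ),
  minute = c(1, 2, 3)
)

# Build the corpus; columns other than the id and text become docvars
corp <- corpus(discourse, docid_field = "segment_id", text_field = "text")

# Tokenise, dropping punctuation
toks <- tokens(corp, remove_punct = TRUE)

# Document-feature matrix (features are lowercased by default)
dfm_lesson <- dfm(toks)

# Most frequent features, as a starting point for a frequency plot
top_words <- topfeatures(dfm_lesson, n = 5)
print(top_words)
```

The document-level variables stay attached throughout, so `docvars(dfm_lesson, "minute")` can later be used to relate lexical features to the timing of the lesson.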