Blog post Part of series: Artificial Intelligence in educational research and practice

Natural language processing: A tool for microgenetic analysis

Florence R. Sullivan, Professor of Learning Technology at University of Massachusetts
7 Jan 2020

Microgenetic analysis is a Vygotskyan educational research method that supports uncovering student sense-making activity during collaborative learning. Investigating learning using microgenetic analysis requires paying close attention to the social interactions, speech acts and use of tools within the learning environment in order to understand the genesis of conceptual development in children. While microgenetic analysis is a powerful educational research method, it is difficult to employ with large datasets; the close, detailed nature of the work makes wide application impractical. However, advances in artificial intelligence have led to computational methods that have the potential to support microgenetic analysis of larger datasets.

In our paper, ‘Exploring the potential of natural language processing to support microgenetic analysis of collaborative learning discussions’ (Sullivan & Keith, 2019), we provide a detailed account of how we deployed a natural language processing (NLP) approach known as parts of speech (POS) analysis to assist in microgenetic data analysis. A word’s part of speech is its grammatical role in a sentence (noun, verb, adverb, preposition, and so on). We ground our use of the POS NLP method in Bakhtin’s (1986) theory of speech genres and Goffman’s (1974) notion of social frameworks. Bakhtin characterises speech genres as relatively stable types of utterances occurring within a particular sphere of human activity. Meanwhile, Goffman notes that participants in a specific, culturally recognisable activity share a social framework for the types of interaction that may unfold in that activity. This social framework helps to guide interaction.
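To make the idea of a POS tag concrete, here is a minimal sketch of tagging by dictionary lookup. The lexicon, the tag names and the example utterance are all invented for illustration; this is not the tagger used in the paper, and a real analysis would rely on a trained tagger from an NLP library rather than a hand-built word list.

```python
# Toy lookup tagger: maps each word to a coarse part-of-speech tag.
# TOY_LEXICON and the example utterance are hypothetical, not study data.
TOY_LEXICON = {
    "the": "DET",
    "it": "PRON",
    "sensor": "NOUN",
    "robot": "NOUN",
    "line": "NOUN",
    "sees": "VERB",
    "follows": "VERB",
    "black": "ADJ",
}


def tag_utterance(utterance):
    """Return (word, tag) pairs; words missing from the lexicon get tag 'X'."""
    words = utterance.lower().split()
    return [(word, TOY_LEXICON.get(word, "X")) for word in words]


tagged = tag_utterance("The sensor sees the black line")
for word, tag in tagged:
    print(word, tag)
```

The point of the sketch is only that each word in an utterance is replaced by a grammatical label, so that later steps can work with patterns of labels rather than with the words themselves.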

In prior work (Sullivan, 2011), we developed a qualitative model of student problem-solving activity with robotics that we term the troubleshooting cycle (TSC) (writing, testing, discussing, debugging, re-writing, re-testing). The TSC is a relatively regular and stable feature of student activity while solving robotics problems. We argue that within this bounded sphere of robotics learning activity, the utterances that accompany the TSC activity will likewise be stable and specific. Our goal in this study was to identify grammatical clusters that perform specific types of ‘work’ within the group towards solving the robotics challenge.

The participants in our study were a group of three 12-year-old students in a sixth-grade science class. Our dataset consists of a 30-minute segment of collaborative problem solving we had already analysed by hand (Sullivan, 2011). Our research goal was to investigate the development of conceptual understanding of the role of a light sensor in solving a line-following robotics challenge. We sought to replicate our prior research findings with the aid of the new POS NLP method. To do so, we clustered words at the level of the bigram and the trigram. We selected these n-gram configurations because, arguably, they are the smallest levels at which complete utterances might be made. Halliday and Matthiessen (2014) point out that while the clause is the smallest semantic unit in the English language, clauses are made up of smaller grammatical units that also have meaning, including the nominal group, the verbal group, the adverbial group, and the prepositional phrase. Importantly, the theme of a clause will be carried by one of these smaller structural elements (p. 92).
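As a rough illustration of what working at the level of the bigram and trigram involves, the sketch below slides a window of two or three tags across an already tagged utterance. The tag sequence is hand-written and hypothetical, and the helper name `pos_ngrams` is ours for this example; it is not taken from the paper.

```python
from collections import Counter


def pos_ngrams(tags, n):
    """Return every run of n consecutive POS tags in one utterance."""
    return [tuple(tags[i:i + n]) for i in range(len(tags) - n + 1)]


# Hypothetical hand-tagged utterance: "the sensor sees the black line"
tags = ["DET", "NOUN", "VERB", "DET", "ADJ", "NOUN"]

bigrams = pos_ngrams(tags, 2)
trigrams = pos_ngrams(tags, 3)

# Counting identical n-grams shows which grammatical patterns recur.
bigram_counts = Counter(bigrams)

print(bigrams)
print(trigrams)
```

An utterance shorter than n tags simply yields no n-grams, so one-word turns contribute nothing at the bigram level.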

Through a deliberative process, we assigned problem-solving codes to specific POS bigrams and trigrams. We then developed a temporal view of how these n-grams clustered over the 30-minute segment. This allowed us to visually identify robust periods of problem-solving discussions, which we then subjected to deeper analysis. Through this deeper analysis, we identified a trajectory of improving understanding of the role and function of the light sensor in the activity, partially replicating prior results.
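One way to picture the coding and temporal view described above is as a lookup followed by counting per time window. The sketch below is a simplification under several assumptions: the code names in `CODEBOOK`, the timestamps in `events` and the one-minute bin width are all invented for illustration and are not the codes or data from the study.

```python
from collections import Counter, defaultdict

# Hypothetical mapping from POS n-grams to problem-solving codes.
CODEBOOK = {
    ("NOUN", "VERB"): "observation",
    ("VERB", "DET", "NOUN"): "proposal",
}

# Hypothetical records: (seconds from start of segment, POS n-gram).
events = [
    (30, ("NOUN", "VERB")),
    (45, ("DET", "NOUN")),
    (70, ("VERB", "DET", "NOUN")),
    (100, ("NOUN", "VERB")),
    (400, ("NOUN", "VERB")),
]


def code_counts_by_minute(events, codebook):
    """Count coded n-grams per one-minute bin; uncoded n-grams are skipped."""
    bins = defaultdict(Counter)
    for seconds, ngram in events:
        code = codebook.get(ngram)
        if code is not None:
            bins[seconds // 60][code] += 1
    return bins


timeline = code_counts_by_minute(events, CODEBOOK)
for minute in sorted(timeline):
    print(minute, dict(timeline[minute]))
```

Minutes with many coded n-grams would stand out in a plot of these counts, which is the kind of visual cue that could point a researcher to stretches of discussion worth reading closely by hand.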

Our work demonstrates that AI techniques such as POS NLP can aid researchers in conducting microgenetic analysis and in extending the approach to larger datasets.

This blog is based on the article ‘Exploring the potential of natural language processing to support microgenetic analysis of collaborative learning discussions’ by Florence Sullivan and P. Kevin Keith, published in the British Journal of Educational Technology. It has been made free-to-view until 31 January 2020, courtesy of our publishing partners, Wiley.

References

Bakhtin, M. M. (1986). The problem of speech genres. In V.W. McGee, Trans., C. Emerson, & M. Holquist (Eds.), Speech genres and other late essays (pp. 60–102). Austin, TX: University of Texas Press.

Goffman, E. (1974). Frame analysis: An essay on the organization of experience. New York, NY: Harper and Row.

Halliday, M. A. K., & Matthiessen, C. M. I. M. (2014). Halliday’s introduction to functional grammar (4th ed.). New York, NY: Routledge Press.

Sullivan, F. R. (2011). Serious and playful inquiry: Epistemological aspects of collaborative creativity. Journal of Educational Technology and Society, 14(1), 55–65.

Sullivan, F. R., & Keith, P. K. (2019). Exploring the potential of natural language processing to support microgenetic analysis of collaborative learning discussions. British Journal of Educational Technology, 50(6), 3047–3063. https://doi.org/10.1111/bjet.12875

Dr Florence R. Sullivan is a professor of learning technology and chair of the Teacher Education and Curriculum Studies department in the College of Education at the University of Massachusetts, Amherst. She is the author of Creativity, technology, and learning: Theory for classroom practice, published by Routledge Press in 2017. Dr Sullivan currently serves as an associate editor for the interdisciplinary journal ACM Transactions on Computing Education. Her research focusses on middle school (ages 11–14) students’ collaborative learning with constructionist-based computational media, including LEGO robotics and Scratch. She has published over 30 papers on this topic in the last 10 years. She is especially interested in researching effective curricular and pedagogical means of supporting girls’ and other underrepresented groups’ learning with these computational media. Her research work is supported by the US-based National Science Foundation.