
By analysing the learning process, one can understand how student behaviours are related to learning outcomes. When assessing the learning process, researchers have shown multimodal learning analytics to provide better predictions than single data streams in individual learning due to each unimodal measure providing different information (Cukurova, Kent, & Luckin, 2019; Giannakos, Sharma, Pappas, Kostakos, & Velloso, 2019). Moreover, by including temporal aspects of the data, in which data points are collected across multiple time points (such as for each 10-second window), rather than only counts or averages, one can understand the correlations and impacts around the change in behaviours (Csanadi, Eagan, Kollar, Shaffer, & Fischer, 2018), which may lead to better predictions. In our recent article, ‘Temporal analysis of multimodal data to predict collaborative learning outcomes’, published in the British Journal of Educational Technology (Olsen, Sharma, Rummel, & Aleven, 2020), we investigate how multimodal data can aid in understanding the temporal inter-relationship of variables explaining learning from the collaborative process. The work expands our understanding of the use of multimodal learning analytics for collaborative learning beyond what unimodal data can provide and systematically assesses the benefits of different data streams in a temporal analysis.

A systematic comparison of data streams

Multimodal data does not refer to a specific combination of data. Rather, it refers to any combination of multiple types of data. For example, multimodal data may consist of audio, gaze and log data, or EEG and dialogue data, all collected from the same participant. On the other hand, if only one of these streams is collected, such as audio, this is unimodal data, even if multiple measures are derived from it, such as tempo or energy from audio. Consequently, which data streams are collected, and how many, can affect how beneficial the use of multimodal data may be. We analysed multimodal data collected from 25 dyads of 9–11-year-old students as they collaborated using a fractions intelligent tutoring system, which is a system that provides step-by-step and adaptive instructional support to students. Using data streams that spanned time scales (Newell, 1990) – in other words, that measured interactions at a biological, cognitive or social level – we investigated how different combinations of data streams impacted the prediction of learning gains and post-test scores. Specifically, we assessed the relation of gaze, tutor log, audio (speech at the signal level) and dialogue (speech at the content level) data.
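To make the idea of a systematic comparison concrete, the following minimal sketch enumerates every unimodal and multimodal combination that can be formed from the four data streams named above. It is an illustration only: the function name and stream labels are our own, and the sketch does not claim to reproduce the exact set of combinations evaluated in the article.

```python
from itertools import combinations

# The four data streams described in the text (labels are illustrative).
STREAMS = ["gaze", "log", "audio", "dialogue"]


def stream_combinations(streams):
    """Return every non-empty combination of the given data streams."""
    return [
        combo
        for size in range(1, len(streams) + 1)
        for combo in combinations(streams, size)
    ]


combos = stream_combinations(STREAMS)
unimodal = [c for c in combos if len(c) == 1]    # a single stream
multimodal = [c for c in combos if len(c) > 1]   # two or more streams

print(len(unimodal), "unimodal and", len(multimodal), "multimodal combinations")
```

Each combination could then be used to train the same kind of predictive model, so that any differences in accuracy can be attributed to the streams included rather than to the modelling approach.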

Expanding data in time and type

When we removed the temporal aspect of the process data by using only counts or averages for the different measures, we found few relations between the process data and learning gains. However, our temporal analysis, in which we analysed each of the measures in 120-second windows, made it clear that these relationships do exist and may simply have been masked when we used counts and averages. We saw that addressing the temporal aspect of the data provides more information, although not equally for every data stream. The variables that are measured at a smaller time scale, such as the gaze and audio measures, provided a more accurate prediction of learning gains than the measures at a higher time scale, such as the log data.
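The masking effect of averages can be illustrated with a small sketch. The data below are hypothetical values invented for illustration, not values from the study, and the function name is our own. Two dyads have the same overall average on a measure, yet their 120-second window averages show opposite trends over the session.

```python
from statistics import mean


def window_means(samples, window_s=120.0):
    """Average (timestamp_in_seconds, value) samples within fixed-length windows.

    Windows that contain no samples are skipped.
    """
    buckets = {}
    for t, value in samples:
        buckets.setdefault(int(t // window_s), []).append(value)
    return [mean(buckets[i]) for i in sorted(buckets)]


# Hypothetical samples of one measure for two dyads (made-up numbers).
rising = [(0, 0.2), (60, 0.2), (120, 0.5), (180, 0.5), (240, 0.8), (300, 0.8)]
falling = [(0, 0.8), (60, 0.8), (120, 0.5), (180, 0.5), (240, 0.2), (300, 0.2)]

# A single average per dyad cannot tell the two apart.
overall_rising = mean(v for _, v in rising)
overall_falling = mean(v for _, v in falling)

# The windowed averages keep the change over time.
print(overall_rising, window_means(rising))
print(overall_falling, window_means(falling))
```

A model that only sees the overall averages receives identical input for both dyads, whereas the windowed values preserve the direction of change that a temporal analysis can use.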


As with expanding the analysis across time by considering the temporal aspects, we also found benefits of expanding the data across type through a multimodal analysis, supporting previous research (Vrzakova, Amon, Stewart, Duran, & D’Mello, 2020). However, this is not without a caveat. It is not enough to just have multimodal data, as some of our combinations actually predicted learning gains less accurately than the unimodal data. What data is combined matters. We saw that combining data streams of different time scales is beneficial for predicting learning gains. One explanation for the benefit of the different time scales may be that they provide information on different dimensions. The benefit may come less from combining multimodal data in itself, and more from the unique information each data stream brings – with the time scales being one dimension to consider.

This blog is based on the article ‘Temporal analysis of multimodal data to predict collaborative learning outcomes’ by Jennifer Olsen, Kshitij Sharma, Nikol Rummel and Vincent Aleven, published in the British Journal of Educational Technology. It has been made free-to-view for those without a subscription for a limited period, courtesy of our publisher, Wiley.


Csanadi, A., Eagan, B., Kollar, I., Shaffer, D. W., & Fischer, F. (2018). When coding-and-counting is not enough: Using epistemic network analysis (ENA) to analyze verbal data in CSCL research. International Journal of Computer-Supported Collaborative Learning, 13(4), 419–438.

Cukurova, M., Kent, C., & Luckin, R. (2019). Artificial intelligence and multimodal data in the service of human decision-making: A case study in debate tutoring. British Journal of Educational Technology, 50(6), 3032–3046.

Giannakos, M. N., Sharma, K., Pappas, I. O., Kostakos, V., & Velloso, E. (2019). Multimodal data as a means to understand the learning experience. International Journal of Information Management, 48, 108–119.

Newell, A. (1990). Unified theories of cognition. Cambridge, MA: Harvard University Press.

Olsen, J. K., Sharma, K., Rummel, N., & Aleven, V. (2020). Temporal analysis of multimodal data to predict collaborative learning outcomes. British Journal of Educational Technology.

Vrzakova, H., Amon, M. J., Stewart, A., Duran, N. D., & D’Mello, S. K. (2020). Focused or stuck together: Multimodal patterns reveal triads’ performance in collaborative problem solving. In LAK 2020 Conference Proceedings – Celebrating 10 years of LAK: Shaping the Future of the Field – 10th International Conference on Learning Analytics and Knowledge (pp. 295–304). (ACM International Conference Proceeding Series). Association for Computing Machinery.
