A Text Mining Approach to Measuring Student Emotion as an Early Warning Indicator (NSF DUE 1161222)
Project Overview
3 Datasets
We characterize incoming college students’ preparation for success in STEM fields along two axes: their interest and their proficiency. The overall goal of our efforts is to move students towards the high interest/high proficiency quadrant.
Our data analysis process utilizes a variety of datasets, each describing a different facet of the students in focus. Below are three of the main categories we have used in the past.
2 ePortfolios
ELECTRONIC PORTFOLIOS
4 Prior Work
Of the three main data categories, our current research focuses on electronic portfolios. An electronic portfolio is a collection of electronic evidence assembled and managed by the students themselves. This evidence can be provided in a rich variety of formats (e.g., text, images, multimedia files, blog entries, and hyperlinks).
[Table fragment: "Students not retained", "TOTAL", 48]
Positive and negative emotion scores derived from a text analysis of student ePortfolio reflections at mid-semester (a) and at the end of the semester (b) using the LIWC tool [1].
[Figure labels: Chemical E..., Mechanical Engineering, Civil Engineering, Environmental Engineering, Computer Science, Computer Engineering, Biomolecular Engineering, Applied Math]
Future Work & Opportunities
[Figure labels: "IDENTIFIED BY PERFORMANCE"; groups "Stayers" and "Leavers"]
ADMISSIONS DATA
However, the measurement of the arousal and valence of student emotions as a predictor of outcome shows promise.
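As a rough illustration of dictionary-based emotion scoring, the sketch below computes positive and negative scores as the percentage of words in a reflection that match a category word list, which is the general form of score that LIWC reports. The `POSITIVE` and `NEGATIVE` word sets, the `tokenize` and `emotion_scores` helpers, and the sample reflection are all hypothetical stand-ins: the actual LIWC dictionaries are proprietary and much larger, and this is not the project's own pipeline.

```python
import re

# Hypothetical toy lexicons for illustration only; the real LIWC
# dictionaries are proprietary and far larger.
POSITIVE = {"love", "enjoy", "excited", "great", "fun"}
NEGATIVE = {"hate", "worried", "boring", "hard", "stress"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def emotion_scores(text):
    """Return (positive, negative) scores as percentages of all words."""
    words = tokenize(text)
    if not words:
        return 0.0, 0.0
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return 100.0 * pos / len(words), 100.0 * neg / len(words)


# Made-up reflection text, not project data.
reflection = "I love design and enjoy solving problems, but calculus is hard."
pos_score, neg_score = emotion_scores(reflection)
print(f"positive={pos_score:.1f} negative={neg_score:.1f}")
```

Scoring each student's mid-semester and end-of-semester reflections this way would give one (positive, negative) pair per student per time point, which could then be compared between groups.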
Current and future research includes: (1) applying the methods used in this research to a larger data set; (2) deploying an early intervention plan based on student disengagement alerts and predictive metrics derived from the quantitative and qualitative data gathered from student ePortfolios; and (3) evaluating the predictive value of other text mining methodologies (i.e., part-of-speech analysis, concordances, named entity extraction, summarization, classification, and clustering).
5 Word Frequency
Word clouds representing a word frequency analysis of the end of semester ePortfolio student reflections for “leavers” and “stayers” respectively.
Our preliminary results show that word frequency counts alone are an ineffective, or at best insufficient, predictor of outcome. While there appeared to be a slight difference in the distribution of words used by “Leavers” and “Stayers”, the inferred information value of word frequency appears to be low.
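A word frequency comparison of this kind can be sketched in a few lines. The example below is a minimal illustration, assuming a small hypothetical stop-word list and made-up reflections; the helper names `word_frequencies` and `relative_frequencies` are introduced here for illustration and do not come from the project. It counts words per group, normalizes to relative frequencies, and ranks words by how much their usage differs between groups.

```python
import re
from collections import Counter

# Hypothetical stop-word list for illustration only.
STOP_WORDS = {"the", "a", "an", "and", "to", "of", "i", "is", "in", "it", "that"}


def word_frequencies(reflections):
    """Count non-stop-word tokens across a list of reflection texts."""
    counts = Counter()
    for text in reflections:
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return counts


def relative_frequencies(counts):
    """Convert raw counts into each word's share of the group's total."""
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}


# Made-up example reflections, not project data.
stayers = ["I enjoy engineering and design", "Engineering is a challenge I enjoy"]
leavers = ["Engineering is broad", "I want to study something different"]

stayer_freq = relative_frequencies(word_frequencies(stayers))
leaver_freq = relative_frequencies(word_frequencies(leavers))

# Rank words by the absolute gap in relative frequency between groups
# (ties broken alphabetically).
vocab = set(stayer_freq) | set(leaver_freq)
gaps = sorted(
    vocab,
    key=lambda w: (-abs(stayer_freq.get(w, 0) - leaver_freq.get(w, 0)), w),
)
for w in gaps[:5]:
    print(w, round(stayer_freq.get(w, 0), 3), round(leaver_freq.get(w, 0), 3))
```

Small gaps across most of the vocabulary would be consistent with the observation above that word frequency alone carries little information about outcome.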
ACADEMIC PERFORMANCE
Take Home Message
Methods
• “What does it mean to be an engineer? How does engineering fit into your interests?”
1 Goal
Nitesh V. Chawla
• “Engineering is a very broad field of study. What is it about engineering that interests you?”
Everaldo Aguiar, G. Alex Ambrose, Victoria Goodrich
College of Engineering, University of Notre Dame
[Figure labels: Mathematics, Mathematics and Computing]
In previous work, we described the use of quantitative electronic portfolio data as a proxy for student engagement and showed that it can be predictive of student retention. This research highlights our ongoing work exploring how the valence of positive or negative emotions in student reflections can serve as an early warning indicator of student disengagement. Our work is based on student reflections on the following two questions, asked in the middle and at the end of the semester, respectively:
Frederick Nwanganga
[Figure axes: Positive Emotion Score; Negative Emotion Score]
[1] J. W. Pennebaker, C. K. Chung, M. Ireland, A. Gonzales, and R. J. Booth. The Development and Psychometric Properties of LIWC2007. Austin, Texas, 2007.