Thesis Defence: Colton Aarts (Master of Science in Computer Science)

Location: Library (5-140D) and/or Zoom
Campus: Prince George, Online

You are encouraged to attend the defence. The details of the defence and attendance information are included below:

Date: April 11, 2025
Time: 12:00 PM to 2:00 PM (PT)

Defence mode: Hybrid 
In-Person Attendance: Library (5-140D)
Virtual Attendance: via Zoom  

LINK TO JOIN: Please contact the Office of Graduate Administration for information regarding remote attendance for online defences. 

To ensure the defence proceeds without interruption, please mute your audio and turn off your video on entry, and take care not to share your screen inadvertently. The meeting will be locked to entry 5 minutes after it begins, so please ensure you are on time.

Thesis entitled: COMBINING ACTIVE LEARNING AND DATA AUGMENTATION FOR SENTIMENT ANALYSIS   

Abstract: Creating a sentiment analysis classifier requires a large amount of labelled training data. Labelling this data is an expensive and time-consuming process. Because of this, reducing the amount of labelled data required leads to classifiers that are cheaper to train and more accessible to all disciplines. Many different methods can be used to reduce the amount of labelled data. For this research, we focused on combining active learning and lexical expansion techniques.

By combining these two techniques, this research examined an underutilized area of study. Active learning focuses on letting the classifier select the data to learn from, while lexical expansion creates more data for the classifier. While there are a large number of different techniques in both fields, little work has been done to combine them. We felt this was a natural progression for these techniques as they complement each other well. The active learning technique will select the data to be labelled, and the lexical expansion technique will generate high-quality artificial data from this hand-selected information. In addition to combining these techniques, we examined how different neural network structures would interact with our new technique.

Our research found that the combination of active learning and lexical expansion improved the performance of our classifiers for very small amounts of data. We found a significant difference between the performance of our two classifiers. While there was an improvement at low levels of training data, at higher levels we found that the combined techniques did not offer any improvement over the active learning technique alone. Overall, we found potential benefits to combining the two techniques, and future research is required to understand how best to leverage these improvements.

Defence Committee:  
Chair: Dr. Deborah Roberts
Supervisor: Dr. Fan Jiang
Committee Member: Dr. Liang Chen
Committee Member: Dr. Kafui Monu
External Examiner: Dr. Jianhui Zhou

Contact Information

Graduate Administration in the Office of the Registrar, University of Northern British Columbia