Predicting Programming Language Preferences from Big5 Personality Traits
Abstract
This thesis explores the relationship between the Big Five personality traits (BPT) and programming language preferences. We collect data from a random sample of N = 820 users of Twitter (now X) and Stack Overflow (SO). By analyzing users’ social media activity and Stack Overflow profiles, we aim to predict their preferred programming languages from their BPT. We first cross-link each user’s Twitter and SO profiles. We then collect user features (e.g., BPT and word embeddings of tweets) from Twitter and programming preferences (e.g., programming tags, reputation, questions, and answers) from SO. Next, we apply various machine learning (ML) and deep learning (DL) techniques to predict users’ programming language preferences from their BPT. We also report several insights about the Twitter and SO platforms, including how reputation and question-asking and answering behavior are associated with users’ BPT. The results indicate significant predictive capability, with an accuracy of 78%. This demonstrates that personality traits, as captured by the BPT, can be a strong indicator of programming language preference. The findings suggest potential applications in personalized learning environments, career guidance, and team composition in software development projects. This research contributes to the growing body of knowledge at the intersection of psychology and computer science, highlighting the impact of personality on technical choices. Future work could expand on these findings by exploring additional personality frameworks and incorporating larger and more diverse datasets.
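To make the prediction task concrete, the following is a minimal sketch of mapping a five-dimensional BPT vector to a language label. It is not the thesis's method: the abstract does not name its models, so a simple nearest-centroid classifier is swapped in for illustration. The trait order, the [0, 1] scaling, the function names `fit_centroids` and `predict`, and the toy rows are all assumptions, and none of the numbers come from the thesis data.

```python
from collections import defaultdict
from math import dist

# Assumed trait order (illustrative only): openness, conscientiousness,
# extraversion, agreeableness, neuroticism, each scaled to [0, 1].


def fit_centroids(samples):
    """Average the trait vectors observed for each language label."""
    grouped = defaultdict(list)
    for traits, language in samples:
        grouped[language].append(traits)
    return {
        language: tuple(sum(column) / len(column) for column in zip(*vectors))
        for language, vectors in grouped.items()
    }


def predict(centroids, traits):
    """Return the label whose centroid is nearest in Euclidean distance."""
    return min(centroids, key=lambda language: dist(centroids[language], traits))


# Made-up toy rows for illustration; NOT data from the thesis.
toy_samples = [
    ((0.9, 0.4, 0.3, 0.5, 0.2), "python"),
    ((0.8, 0.5, 0.4, 0.6, 0.3), "python"),
    ((0.3, 0.9, 0.6, 0.4, 0.5), "java"),
    ((0.2, 0.8, 0.7, 0.5, 0.4), "java"),
]

centroids = fit_centroids(toy_samples)
predicted = predict(centroids, (0.85, 0.45, 0.35, 0.55, 0.25))
```

In a real pipeline the trait vectors would be joined with other features, such as tweet embeddings, and accuracy would be measured on held-out users rather than on hand-picked examples.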
Collections
- M.Sc Thesis/Project