TY - GEN
T1 - An Experience Report on Technical Debt in Pull Requests
T2 - 16th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, ESEM 2022
AU - Karmakar, Shubhashis
AU - Codabux, Zadia
AU - Vidoni, Melina
N1 - Publisher Copyright: © 2022 Association for Computing Machinery.
PY - 2022/9/19
Y1 - 2022/9/19
N2 - Background: GitHub is a collaborative platform for global software development, where Pull Requests (PRs) are essential to bridge code changes with version control. However, developers often trade software quality for faster implementation, incurring Technical Debt (TD). When developers undertake reviewers' roles and evaluate PRs, they can often detect TD instances, leading to either PR rejection or discussions. Aims: We investigated whether Pull Request Comments (PRCs) indicate TD by assessing three large-scale repositories: Spark, Kafka, and React. Method: We combined manual classification with automated detection using machine learning and deep learning models. Results: We classified two datasets and found that 37.7% and 38.7% of PRCs indicate TD, respectively. Our best model achieved F1 = 0.85 when classifying TD during the validation phase. Conclusions: We faced several challenges during this process, which may hint that TD in PRCs is discussed differently from other software artifacts (e.g., code comments, commits, issues, or discussion forums). Thus, we present challenges and lessons learned to assist researchers in pursuing this area of research.
AB - Background: GitHub is a collaborative platform for global software development, where Pull Requests (PRs) are essential to bridge code changes with version control. However, developers often trade software quality for faster implementation, incurring Technical Debt (TD). When developers undertake reviewers' roles and evaluate PRs, they can often detect TD instances, leading to either PR rejection or discussions. Aims: We investigated whether Pull Request Comments (PRCs) indicate TD by assessing three large-scale repositories: Spark, Kafka, and React. Method: We combined manual classification with automated detection using machine learning and deep learning models. Results: We classified two datasets and found that 37.7% and 38.7% of PRCs indicate TD, respectively. Our best model achieved F1 = 0.85 when classifying TD during the validation phase. Conclusions: We faced several challenges during this process, which may hint that TD in PRCs is discussed differently from other software artifacts (e.g., code comments, commits, issues, or discussion forums). Thus, we present challenges and lessons learned to assist researchers in pursuing this area of research.
KW - Mining Software Repositories
KW - Pull Request Comments
KW - Technical Debt
UR - http://www.scopus.com/inward/record.url?scp=85139844046&partnerID=8YFLogxK
U2 - 10.1145/3544902.3546637
DO - 10.1145/3544902.3546637
M3 - Conference contribution
T3 - International Symposium on Empirical Software Engineering and Measurement
SP - 295
EP - 300
BT - Proceedings of the 16th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, ESEM 2022
A2 - Madeiral, Fernanda
A2 - Lassenius, Casper
A2 - Conte, Tayana
A2 - Mannisto, Tomi
PB - IEEE Computer Society
Y2 - 18 September 2022 through 23 September 2022
ER -