An empirical study on using multi-labels for issues in GitHub

Jindae Kim, Seonah Lee

Research output: Contribution to journal › Article › peer-review

11 Scopus citations

Abstract

On GitHub, one of the most successful services for software project hosting, labels are used to represent various information and decisions about reported issues. However, previous studies on labels were limited to simple statistics or to label recommendations for issue types. In this paper, we aim to provide a better understanding of labels and their usage in software development, focusing in particular on the use of multiple and custom labels on issues. To analyze label usage, we collected software project data and label usage information from GitHub. We then quantitatively investigated the performance of projects that use multi-label features, and qualitatively investigated the categories of multi-labels and how multi-labels are used within these categories. Our results show that multi-labels are common in the majority of software projects and that projects using multi-label features manage their issues more effectively. In addition, our results reveal the different types of information represented by labels, which relate to features, development, and issues. These findings can inform future studies on issue management and help in developing labeling techniques that reduce the burden of issue management.

Original language: English
Pages (from-to): 134984-134997
Number of pages: 14
Journal: IEEE Access
Volume: 9
State: Published - 2021

Keywords

  • GitHub
  • Issue labels
  • Issue management
  • Issue tracking system
  • Multi-labels
  • Open source software
  • Software maintenance
