Data underlying the BSc project: "An analysis of Java release practices on GitHub"

Description

This dataset is packaged as a tar.zst archive containing:

  • A list of all Java repositories on GitHub, in CSV format
  • The POM.xml file from those repositories, if there was one at the root of the repo
  • A sample of 500 000 repositories that:
      • Have been searched recursively for POM.xml files
      • Of those that have a POM.xml file, an 'effective' POM.xml has been created
      • Of those that have distribution repositories configured, GitHub workflow files if they exist
  • A report.json file that contains aggregate information of the sample
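
The recursive search for POM.xml files mentioned above can be illustrated with a short sketch. This is not the scraper included in the dataset. It is a minimal Python example under stated assumptions: the function names find_pom_files and has_root_pom are hypothetical, file names are matched case-insensitively (Maven's conventional name is pom.xml), and the .git directory is skipped.

```python
import os
from pathlib import Path


def find_pom_files(repo_root):
    """Return every file named pom.xml (case-insensitive) below repo_root, sorted."""
    matches = []
    for dirpath, dirnames, filenames in os.walk(repo_root):
        # Assumption: Git metadata is not part of the search, so prune it in place.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            # Assumption: match "pom.xml" and "POM.xml" alike.
            if name.lower() == "pom.xml":
                matches.append(Path(dirpath) / name)
    return sorted(matches)


def has_root_pom(repo_root):
    """True if a pom.xml sits directly at the repository root."""
    return any(p.parent == Path(repo_root) for p in find_pom_files(repo_root))
```

Run on a checked-out repository, find_pom_files corresponds to the recursive search applied to the sample, while has_root_pom corresponds to the root-only check applied to the full repository list.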

The scraper written to retrieve this data is also included.

This dataset was created for a Computer Science Bachelor Research Project titled "An analysis of Java release practices on GitHub" by Vivian Roest.

Programming languages
  • Other 55%
  • Rust 43%
  • XML 2%
License
  • CC0-1.0
Packages
data.4tu.nl

Contributors

Member of community

4TU