bigcode-fetcher 0.1.2


# bigcode-fetcher
A utility to search and fetch code from GitHub.
This tool was built to make it easy to create datasets for repository analysis.
It works in two phases: `search` finds repositories using the GitHub API
and saves the result in a JSON file, and `download` fetches all the repositories
listed in that JSON file.
## Install
This tool can be installed by running
` pip install bigcode-fetcher `
or by fetching this repository and running
` pip install . `
in this directory.
## Usage
### search command
By default, the utility searches for repositories fulfilling the following conditions:

- size between 1M and 100M
- stars count > 10
- non-viral license (MIT, Apache-2.0, MPL-2.0, BSD-2-Clause, BSD-3-Clause, BSD-4-Clause, MS-PL)

and retrieves the first 100 projects, ordered by number of stars.
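
As a rough illustration of what these defaults correspond to, the sketch below builds a GitHub repository search URL using the public search qualifiers (`size`, `stars`, `license`). It is not taken from the bigcode-fetcher source: the function names `build_query` and `build_url` are hypothetical, the size bounds assume that "1M" and "100M" map to 1000 and 100000 kilobytes (the unit of the `size` qualifier), and only a single license is handled per query.

```python
from urllib.parse import urlencode

# Public GitHub REST endpoint for repository search.
SEARCH_URL = "https://api.github.com/search/repositories"


def build_query(license_key, min_size_kb=1000, max_size_kb=100_000,
                min_stars=10, language=None, user=None):
    """Build a search query string from the documented defaults.

    Hypothetical helper: the size bounds in kilobytes are an assumption
    about how "1M" and "100M" are interpreted.
    """
    parts = [
        f"size:{min_size_kb}..{max_size_kb}",
        f"stars:>{min_stars}",
        f"license:{license_key}",
    ]
    if language:
        parts.append(f"language:{language}")
    if user:
        parts.append(f"user:{user}")
    return " ".join(parts)


def build_url(query, per_page=100):
    """Return a search URL sorted by stars, most starred first."""
    params = {"q": query, "sort": "stars", "order": "desc", "per_page": per_page}
    return f"{SEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_url(build_query("mit", language="Java")))
```

Covering the full list of licenses would take either one query per license or a combined query; how the tool itself does this is not stated here.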
To avoid API rate limiting, an access token can be provided either with the `--token`
CLI argument or with the `GITHUB_TOKEN` environment variable.
See the help for all the options:
` bigcode-fetcher search -h `
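
The token lookup described above can be sketched as follows. This is an illustration, not the tool's actual code: `resolve_token` and `auth_headers` are hypothetical names, and giving the CLI argument precedence over `GITHUB_TOKEN` is an assumption. The `Authorization: Bearer` header is the form accepted by the GitHub REST API.

```python
import os


def resolve_token(cli_token=None, environ=None):
    """Return the token from the CLI argument, else from GITHUB_TOKEN, else None.

    Assumption: the CLI argument wins when both are set.
    """
    env = os.environ if environ is None else environ
    return cli_token or env.get("GITHUB_TOKEN") or None


def auth_headers(token):
    """Build request headers; unauthenticated requests simply omit the token."""
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return headers
```

Without a token, requests still work but fall under GitHub's lower unauthenticated rate limit.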
#### Example
Search for all Apache commons projects written in Java
` mkdir -p apache-common-projects `
` bigcode-fetcher search --language Java --user apache --stars '>0' --keyword commons --max-repos 500 -o apache-common-projects/apache-commons.json `
### download command
This command simply runs `git clone` on every repository listed in the
JSON file generated by the search command.
To reduce the download size, only the latest revision is fetched by default (i.e. `git clone --depth 1`). This can be disabled by passing the `--full` flag.
USERNAME/REPO will be fetched into OUTPUT_DIR/USERNAME/REPO, where
OUTPUT_DIR is set by the `--output` option.
The command will ignore the project if the directory already exists,
so running the command multiple times is safe, and recommended to make
sure all repositories have been fetched.
See the help for more information:
` bigcode-fetcher download -h `
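
The behaviour described above (shallow clone by default, one directory per USERNAME/REPO, existing directories skipped) can be sketched as below. This is a minimal illustration under stated assumptions, not the tool's implementation: `clone_command` and `download_all` are hypothetical names, and the `full_name` and `clone_url` keys are assumed to match the fields returned by the GitHub API, since the layout of the JSON file is not described here.

```python
import os
import subprocess


def clone_command(full_name, clone_url, output_dir, full=False):
    """Return the git command for one repository, or None if it can be skipped.

    full_name is "USERNAME/REPO"; the target is OUTPUT_DIR/USERNAME/REPO.
    """
    target = os.path.join(output_dir, *full_name.split("/"))
    if os.path.exists(target):
        # Already fetched on a previous run, so re-running is safe.
        return None
    cmd = ["git", "clone"]
    if not full:
        # Fetch only the latest revision to reduce the download size.
        cmd += ["--depth", "1"]
    return cmd + [clone_url, target]


def download_all(repos, output_dir, full=False):
    """Clone every repository in repos (assumed: dicts with full_name and clone_url)."""
    for repo in repos:
        cmd = clone_command(repo["full_name"], repo["clone_url"], output_dir, full)
        if cmd is not None:
            # check=False: a failed clone does not stop the remaining downloads.
            subprocess.run(cmd, check=False)
```

Because a failed clone leaves no completed directory in this sketch's happy path, simply running it again retries whatever is missing, which mirrors the recommendation to run the command more than once.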
#### Example
Download all the Apache commons projects found by the search above
` mkdir -p apache-common-projects/repositories `
` bigcode-fetcher download -i apache-common-projects/apache-commons.json -o apache-common-projects/repositories `

License:

For personal and professional use. You cannot resell or redistribute these repositories in their original state.
