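A module to crawl GitHub for Infrastructure-as-Code repositories and scripts

Detailed description of the iacminer Python project


iac-miner

A mining tool written in Python to mine software repositories for Infrastructure-as-Code.

Available on PyPI: pip install iacminer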

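API usage

First, export your GitHub access token to an environment variable named GITHUB_ACCESS_TOKEN, using:

export GITHUB_ACCESS_TOKEN='yourtokenhere'   (on Linux)

This variable holds the token used to access the GitHub API, so the token does not have to be hardcoded in your code.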
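Reading the token can fail silently if the variable was never exported, because os.getenv simply returns None. The following is a minimal sketch of a guard around that lookup; the helper name load_github_token is hypothetical and not part of iacminer.

```python
import os


def load_github_token(var_name: str = "GITHUB_ACCESS_TOKEN") -> str:
    """Return the token stored in the environment variable `var_name`.

    Hypothetical helper (not part of iacminer): raises RuntimeError when the
    variable is unset or blank, instead of passing None on to a miner.
    """
    token = os.getenv(var_name, "").strip()
    if not token:
        raise RuntimeError(f"Environment variable {var_name} is not set")
    return token
```

The returned string could then be passed wherever the examples use os.getenv('GITHUB_ACCESS_TOKEN').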

Mine GitHub

import os
from datetime import datetime
from iacminer.miners.github import GithubMiner

miner = GithubMiner(
    access_token=os.getenv('GITHUB_ACCESS_TOKEN'),
    date_from=datetime.strptime('2020-01-01 00:00:00', '%Y-%m-%d %H:%M:%S'),
    date_to=datetime.strptime('2020-01-02 00:00:00', '%Y-%m-%d %H:%M:%S'),
    push_after=datetime.strptime('2020-06-07 00:00:00', '%Y-%m-%d %H:%M:%S'),
    min_stars=<int>,                   # (default = 0)
    min_releases=<int>,                # (default = 0)
    min_watchers=<int>,                # (default = 0)
    min_issues=<int>,                  # (default = 0)
    primary_language=<str|None>,       # e.g., 'python' (default = None)
    include_fork=<True|False>          # (default = False)
)

for repository in miner.mine():
    print(repository)
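The date arguments above are built with repeated datetime.strptime calls. As a rough sketch, the parsing and a basic sanity check could be factored into one place; the helper build_date_window and the constant DATE_FORMAT are hypothetical names, and only the keyword names date_from and date_to come from the example above.

```python
from datetime import datetime

# Same format string used in the GithubMiner example.
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"


def build_date_window(date_from: str, date_to: str) -> dict:
    """Parse two date strings and check that they form a valid interval.

    Hypothetical helper: returns a dict keyed by the keyword names shown
    in the example (date_from, date_to).
    """
    start = datetime.strptime(date_from, DATE_FORMAT)
    end = datetime.strptime(date_to, DATE_FORMAT)
    if start >= end:
        raise ValueError("date_from must be earlier than date_to")
    return {"date_from": start, "date_to": end}


window = build_date_window("2020-01-01 00:00:00", "2020-01-02 00:00:00")
```

Assuming the constructor accepts these keywords as shown, the dict could be unpacked into the call, e.g. GithubMiner(access_token=..., **window).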

Mine a repository

^{pr2}$

Alternatively, run all the previous methods at once and extract metrics from the labeled files at each release:

import os
from iacminer.miners.repository import RepositoryMiner

miner = RepositoryMiner(
    token=os.getenv('GITHUB_ACCESS_TOKEN'),
    path_to_repo='path/to/cloned/repository',
    branch='development',              # Optional (default='master')
    owner='radon-h2020',               # Optional (default=None)
    repo='radon-iac-miner'             # Optional (default=None)
)

for metrics in miner.mine():
    print(metrics)

Combining GithubMiner and RepositoryMiner

import os
from iacminer.miners.github import GithubMiner
from iacminer.miners.repository import RepositoryMiner

gh_miner = GithubMiner(
    access_token=os.getenv('GITHUB_ACCESS_TOKEN'),
    min_stars=<int>,                   # Optional (default 0)
    min_issues=<int>,                  # Optional (default 0)
)

for repository in gh_miner.mine():
    print(repository)

    repo_miner = RepositoryMiner(
        token=os.getenv('GITHUB_ACCESS_TOKEN'),
        path_to_repo='path/to/cloned/repository',
        branch='development',          # Optional (default='master')
        owner='radon-h2020',           # Optional (default=None)
        repo='radon-iac-miner'         # Optional (default=None)
    )

    # Mine repository as previous example ...

Command-line usage

usage: iac-miner [-h] [-v] {mine-github,mine-repository} ...

A Python library to crawl GitHub for Infrastructure-as-Code based repositories
and mine those repositories to identify fixing commits and label defect-prone
files.

positional arguments:
  {mine-github,mine-repository}
    mine-github         Mine repositories from Github
    mine-repository     Mine a single repository

optional arguments:
  -h, --help            show this help message and exit
  -v, --version         show program's version number and exit

iac-miner mine-github

usage: iac-miner mine-github [-h] [--from DATE_FROM] [--to DATE_TO]
                             [--pushed-after DATE_PUSH]
                             [--iac-languages [{ansible,chef,puppet,all} [{ansible,chef,puppet,all} ...]]]
                             [--include-fork] [--min-issues MIN_ISSUES]
                             [--min-releases MIN_RELEASES]
                             [--min-stars MIN_STARS]
                             [--min-watchers MIN_WATCHERS]
                             [--primary-language PRIMARY_LANGUAGE] [--verbose]
                             dest tmp_clones_folder

positional arguments:
  dest                  destination folder to save results
  tmp_clones_folder     path to temporary clone the repositories for the
                        analysis

optional arguments:
  -h, --help            show this help message and exit
  --from DATE_FROM      start searching from this date (default: 2014-01-01
                        00:00:00)
  --to DATE_TO          search up to this date (default: 2020-01-01 00:00:00)
  --pushed-after DATE_PUSH
                        search up to this date (default: 2019-01-01 00:00:00)
  --iac-languages [{ansible,chef,puppet,all} [{ansible,chef,puppet,all} ...]]
                        only repositories with this language(s) will be
                        analyzed (default: all)
  --include-fork        whether to include forked repositories (default:
                        False)
  --min-issues MIN_ISSUES
                        minimum number of issues (default: 0)
  --min-releases MIN_RELEASES
                        minimum number of releases (default: 0)
  --min-stars MIN_STARS
                        minimum number of stars (default: 0)
  --min-watchers MIN_WATCHERS
                        minimum number of watchers (default: 0)
  --primary-language PRIMARY_LANGUAGE
                        the primary language of the repository (default: None)
  --verbose             whether to output results (default: False)
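When the command is driven from a script, it can help to assemble the argument list programmatically. The sketch below only builds the argument list from a few of the flags listed in the help text above; the function name build_mine_github_command and the example paths are hypothetical, and the command itself is not executed here.

```python
import shlex


def build_mine_github_command(dest, tmp_clones_folder, *, min_stars=None,
                              min_releases=None, include_fork=False,
                              verbose=False):
    """Assemble an `iac-miner mine-github` argument list.

    Hypothetical helper: it covers only a subset of the documented flags
    and places the two positional arguments last, as in the usage line.
    """
    cmd = ["iac-miner", "mine-github"]
    if min_stars is not None:
        cmd += ["--min-stars", str(min_stars)]
    if min_releases is not None:
        cmd += ["--min-releases", str(min_releases)]
    if include_fork:
        cmd.append("--include-fork")
    if verbose:
        cmd.append("--verbose")
    cmd += [dest, tmp_clones_folder]
    return cmd


# Example paths are placeholders chosen for illustration.
cmd = build_mine_github_command("results", "/tmp/clones",
                                min_stars=10, include_fork=True)
print(shlex.join(cmd))
```

The resulting list could be handed to subprocess.run(cmd, check=True) once iac-miner is installed.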

iac-miner mine-repository

usage: iac-miner mine-repository [-h] [--branch REPO_OWNER]
                                 [--owner REPO_OWNER] [--name REPO_NAME]
                                 [--verbose]
                                 path_to_repo dest

positional arguments:
  path_to_repo         Name of the repository (owner/name).
  dest                 Destination folder to save results.

optional arguments:
  -h, --help           show this help message and exit
  --branch REPO_OWNER  the repository's default branch (default: master)
  --owner REPO_OWNER   the repository's owner (default: None)
  --name REPO_NAME     the repository's name (default: None)
  --verbose            whether to output results (default: False)

Current release

[0.1.3]

  • The mine-repository option is now supported
