Perceval

Send Sir Perceval on a quest to retrieve and gather data from software repositories.

Usage

usage: perceval [-g] <backend> [<args>] | --help | --version | --list

Send Sir Perceval on a quest to retrieve and gather data from software
repositories.

Repositories are reached using specific backends. The most common backends
are:

    askbot           Fetch questions and answers from Askbot site
    bugzilla         Fetch bugs from a Bugzilla server
    bugzillarest     Fetch bugs from a Bugzilla server (>=5.0) using its REST API
    confluence       Fetch contents from a Confluence server
    discourse        Fetch posts from Discourse site
    dockerhub        Fetch repository data from Docker Hub site
    gerrit           Fetch reviews from a Gerrit server
    git              Fetch commits from Git
    github           Fetch issues, pull requests and repository information from GitHub
    gitlab           Fetch issues, merge requests from GitLab
    gitter           Fetch messages from a Gitter room
    googlehits       Fetch hits from Google API
    groupsio         Fetch messages from Groups.io
    hyperkitty       Fetch messages from a HyperKitty archiver
    jenkins          Fetch builds from a Jenkins server
    jira             Fetch issues from JIRA issue tracker
    launchpad        Fetch issues from Launchpad issue tracker
    mattermost       Fetch posts from a Mattermost server
    mbox             Fetch messages from MBox files
    mediawiki        Fetch pages and revisions from a MediaWiki site
    meetup           Fetch events from a Meetup group
    nntp             Fetch articles from a NNTP news group
    pagure           Fetch issues from Pagure
    phabricator      Fetch tasks from a Phabricator site
    pipermail        Fetch messages from a Pipermail archiver
    redmine          Fetch issues from a Redmine server
    rocketchat       Fetch messages from a Rocket.Chat channel
    rss              Fetch entries from a RSS feed server
    slack            Fetch messages from a Slack channel
    stackexchange    Fetch questions from StackExchange sites
    supybot          Fetch messages from Supybot log files
    telegram         Fetch messages from the Telegram server
    twitter          Fetch tweets from the Twitter Search API

optional arguments:
  -h, --help            show this help message and exit
  -v, --version         show version
  -g, --debug           set debug mode on
  -l, --list            show available backends

Run 'perceval <backend> --help' to get information about a specific backend.
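The usage line above can also be driven from a script. The following is a minimal sketch, not part of Perceval itself: `build_command` and `iter_json_documents` are hypothetical helper names, and the parser assumes that Perceval writes each retrieved item to standard output as a JSON document, one after another. Not every backend accepts `--from-date`, so check `perceval <backend> --help` before passing it.

```python
import json
from typing import Iterator, List, Optional


def build_command(backend: str, *args: str, from_date: Optional[str] = None,
                  debug: bool = False) -> List[str]:
    """Assemble an argv list shaped like: perceval [-g] <backend> [<args>]."""
    cmd = ["perceval"]
    if debug:
        cmd.append("-g")  # debug mode, as listed in the help text
    cmd.append(backend)
    cmd.extend(args)
    if from_date is not None:
        # Assumption: the chosen backend supports --from-date
        cmd.extend(["--from-date", from_date])
    return cmd


def iter_json_documents(text: str) -> Iterator[dict]:
    """Yield each document from a string of concatenated JSON documents."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it here
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        doc, pos = decoder.raw_decode(text, pos)
        yield doc
```

To run the command, the list returned by `build_command` can be passed to `subprocess.run(..., capture_output=True, text=True, check=True)` and the captured `stdout` fed to `iter_json_documents`.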

Requirements

  • Python >= 3.7
  • Poetry >= 1.2
  • git
  • build-essential

You will also need some other libraries to run the tool; the full list of dependencies is in the pyproject.toml file.
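As a quick sanity check before installing, the requirements listed above can be partially verified from Python. This is only a sketch under stated assumptions: `missing_requirements` is a hypothetical helper, it looks for the `git` and `poetry` executables on the PATH, and it does not check the Poetry version or the presence of build-essential.

```python
import shutil
import sys
from typing import List, Tuple


def missing_requirements(python_version: Tuple[int, int] = sys.version_info[:2],
                         tools: Tuple[str, ...] = ("git", "poetry")) -> List[str]:
    """Return the names of requirements that could not be found."""
    missing = []
    if python_version < (3, 7):
        missing.append("python>=3.7")
    for tool in tools:
        # shutil.which returns None when the executable is not on the PATH
        if shutil.which(tool) is None:
            missing.append(tool)
    return missing


if __name__ == "__main__":
    problems = missing_requirements()
    print("All checked requirements found" if not problems else f"Missing: {problems}")
```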

How to install

  • build-essential

build-essential is a package that contains a set of tools to compile and build software. It is required to work with Debian packages.

$ sudo apt-get install build-essential

  • git

Git is a version control system that allows you to keep track of changes in your code. It is required to work with Git repositories.

$ sudo apt-get install git

Installation

There are several ways to install Perceval on your system: from packages, from source code using Poetry or pip, or using Docker.

PyPI

Perceval can be installed using pip, a tool for installing Python packages. To do so, run the following command:

$ pip install perceval

Source code

To install from the source code you will need to clone the repository first:

$ git clone https://github.com/chaoss/grimoirelab-perceval
$ cd grimoirelab-perceval

Then use pip or Poetry to install the package along with its dependencies.

Pip

To install the package from the local directory, run the following command:

$ pip install .

If you are a developer, install Perceval in editable mode:

$ pip install -e .

Poetry

We use Poetry for dependency management and packaging. You can install it by following its documentation. Once it is installed, you can install Perceval and its dependencies in a project-isolated environment using:

$ poetry install

To spawn a new shell within the virtual environment, use:

$ poetry shell

Docker

A Perceval Docker image is available at DockerHub.

Detailed information on how to run and/or build this image can be found here.

Documentation

Documentation is generated automatically and published on the Perceval ReadTheDocs site.

References

If you use Perceval in your research papers, please cite "Perceval: software project data at your will" (a pre-print is available):

APA style

Dueñas, S., Cosentino, V., Robles, G., & Gonzalez-Barahona, J. M. (2018, May). Perceval: software project data at your will. In Proceedings of the 40th International Conference on Software Engineering: Companion Proceedings (pp. 1-4). ACM.

BibTeX

@inproceedings{duenas2018perceval,
  title={Perceval: software project data at your will},
  author={Due{\~n}as, Santiago and Cosentino, Valerio and Robles, Gregorio and Gonzalez-Barahona, Jesus M},
  booktitle={Proceedings of the 40th International Conference on Software Engineering: Companion Proceedings},
  pages={1--4},
  year={2018},
  organization={ACM}
}

Examples

Askbot

$ perceval askbot 'http://askbot.org/' --from-date '2016-01-01'

Bugzilla

To fetch bugs from Bugzilla, you have two options:

a) Use the traditional backend

$ perceval bugzilla 'https://bugzilla.redhat.com/' --backend-user user --backend-password pass --from-date '2016-01-01'

b) Use the REST API backend for Bugzilla 5.0 (or higher) servers. We strongly recommend this backend when fetching data from servers running version 5.0 or higher because the retrieval process is much faster.

$ perceval bugzillarest 'https://bugzilla.mozilla.org/' --backend-user user --backend-password pass --from-date '2016-01-01'

Confluence

$ perceval confluence 'https://wiki.opnfv.org/' --from-date '2016-01-01'

Discourse

$ perceval discourse 'https://foro.mozilla-hispano.org/' --from-date '2016-01-01'

Docker Hub

$ perceval dockerhub grimoirelab perceval

Gerrit

To run the Gerrit backend, you will need an authorized SSH private key loaded into an SSH agent:

$ eval `ssh-agent -s`
$ ssh-add ~/.ssh/id_rsa
Identity added: /home/user/.ssh/id_rsa (/home/user/.ssh/id_rsa)

To run the backend, execute the following command:

$ perceval gerrit --user user 'review.openstack.org' --from-date '2016-01-01'
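Because `eval` of the `ssh-agent -s` output exports `SSH_AUTH_SOCK` into the current shell, a script that launches the Gerrit backend can check for that variable first. The sketch below assumes nothing about Perceval's internals; `ssh_agent_socket` is a hypothetical helper that only reports whether an agent socket is advertised in the environment. It does not confirm that a key has been added (`ssh-add -l` lists the loaded keys).

```python
import os
from typing import Mapping, Optional


def ssh_agent_socket(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the agent socket path from the environment, or None if unset."""
    env = os.environ if env is None else env
    sock = env.get("SSH_AUTH_SOCK")
    # Treat an empty value the same as a missing variable
    return sock if sock else None


if __name__ == "__main__":
    if ssh_agent_socket() is None:
        print("No SSH agent found: run ssh-agent and ssh-add first")
    else:
        print("SSH agent socket:", ssh_agent_socket())
```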

Git

To run this backend, execute the following command. Note that Git must be installed on your system.

$ perceval git 'https://github.com/chaoss/grimoirelab-perceval.git' --from-date '2016-01-01'

To run the backend against a private git repository, you must pass the credentials directly in the URL:

$ perceval git https://<username>:<password>@repository-url

For example, for private GitHub repositories:

$ perceval git https://<username>:<api-token>@github.com/chaoss/grimoirelab-perceval
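Usernames, passwords, and tokens may contain characters such as `@` or `/` that would break the URL if inserted verbatim. A minimal sketch of building such a URL with percent-encoding follows; `with_credentials` is a hypothetical helper, not a Perceval function, and it handles only plain http(s) URLs with ordinary host names.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def with_credentials(repo_url: str, username: str, secret: str) -> str:
    """Return repo_url with percent-encoded credentials embedded in it."""
    parts = urlsplit(repo_url)
    if parts.scheme not in ("http", "https"):
        raise ValueError("expected an http(s) URL")
    # safe='' makes quote() encode every reserved character, including '/' and '@'
    userinfo = f"{quote(username, safe='')}:{quote(secret, safe='')}"
    host = parts.hostname or ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    return urlunsplit((parts.scheme, f"{userinfo}@{host}",
                       parts.path, parts.query, parts.fragment))


if __name__ == "__main__":
    print(with_credentials("https://github.com/chaoss/grimoirelab-perceval",
                           "user", "p@ss/word"))
```

The resulting string can be passed as the repository argument to `perceval git`. Keep in mind that a URL containing credentials may be recorded in shell history or process listings.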

The Git backend can also work with a Git log file as input. We recommend using the following command to get the most complete log file.

$ git log --raw --numstat --pretty=fuller --decorate=full --parents --reverse --topo-order -M -C -c --remotes=origin --all > /tmp/gitlog.log

Then, to run the backend, execute either of the following commands:

$ perceval git --git-log '/tmp/gitlog.log' 'file:///myrepo.git'

or

$ perceval git '/tmp/gitlog.log'

GitHub

$ perceval github elastic logstash --from-date '2016-01-01'

The GitHub backend accepts the categories issue, pull_request and repository, which select the kind of data to fetch.

$ perceval github --category issue elastic logstash
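When the output of several runs is combined, it can help to tally items per category. The sketch below assumes that each item emitted by Perceval is a JSON object carrying a top-level `category` field; `count_by_category` is a hypothetical helper and the sample items are made up for illustration.

```python
from collections import Counter
from typing import Iterable, Mapping


def count_by_category(items: Iterable[Mapping]) -> Counter:
    """Count items per category; items without the field are counted as 'unknown'."""
    return Counter(item.get("category", "unknown") for item in items)


if __name__ == "__main__":
    # Made-up items that mimic only the assumed 'category' field
    sample = [
        {"category": "issue"},
        {"category": "pull_request"},
        {"category": "issue"},
    ]
    for category, total in count_by_category(sample).items():
        print(category, total)
```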

GitLab

$ perceval gitlab fdroid fdroiddata -t $GITLAB_TOKEN --from-date '2016-01-01'

Gitter

$ perceval gitter -t 'abcdefghi' --from-date '2020-03-18' 'jenkinsci' 'jenkins'

GoogleHits

$ perceval googlehits "bitergia grimoirelab"

Groups.io

$ perceval groupsio 'updates' -e '<[email protected]>' -p 'my-password' --from-date '2016-01-01'

To fetch the data from a group, you must first subscribe to it via the Groups.io website. To list the names of the groups you are subscribed to, you can use the following script: https://gist.github.com/valeriocos/ad33a0b9b2d13a8336230c8c59df3c55

HyperKitty

$ perceval hyperkitty 'https://lists.mailman3.org/archives/list/[email protected]' --from-date 2017-01-01

Jenkins

$ perceval jenkins 'https://build.opnfv.org/ci/'

JIRA

$ perceval jira 'https://tickets.puppetlabs.com' --project PUP --from-date '2016-01-01'

Launchpad

$ perceval launchpad ubuntu --from-date '2016-01-01'

Mattermost

$ perceval mattermost 'http://mattermost.example.com' jgw7jdmjkjf19ffkwnw59i5f9e --from-date '2016-01-01' -t 'abcdefghijk'

MBox

$ perceval mbox 'http://example.com' /tmp/mboxes/

MediaWiki

$ perceval mediawiki 'https://wiki.mozilla.org' --from-date '2016-06-30'

Meetup

$ perceval meetup 'Software-Development-Analytics' --from-date '2016-06-01' -t abcdefghijk

NNTP

$ perceval nntp 'news.mozilla.org' 'mozilla.dev.project-link' --offset 10

Pagure

$ perceval pagure '389-ds-base' --from-date '2020-03-06'

Phabricator

$ perceval phabricator 'https://secure.phabricator.com/' -t 123456789abcefe

Pipermail

$ perceval pipermail 'https://mail.gnome.org/archives/libart-hackers/'

Pipermail is also able to fetch data from Apache's mod_mbox interface:

$ perceval pipermail 'http://mail-archives.apache.org/mod_mbox/httpd-dev/'

Redmine

$ perceval redmine 'https://www.redmine.org/' --from-date '2016-01-01' -t abcdefghijk

Rocket.Chat

The Rocket.Chat backend needs an API token and a user ID to authenticate to the server.

$ perceval rocketchat -t 'abchdefghij' -u '1234abcd' --from-date '2020-05-02' https://open.rocket.chat general

RSS

$ perceval rss 'https://blog.bitergia.com/feed/'

Slack

The Slack backend requires an API token for authentication. Slack apps can be used to generate and configure this API token. The scopes required by a Slack app for the backend are channels:history, channels:read and users:read. To learn more about Slack apps and their integration, please refer to the Slack apps documentation. For more information about the scopes required by a Slack app, please refer to the Scopes and permissions documentation.

A script is also available to generate an OAuth2 token to access the Slack API.

$ perceval slack C0001 --from-date 2016-01-12 -t abcedefghijk

StackExchange

$ perceval stackexchange --site stackoverflow --tagged python --from-date '2016-01-01' -t abcdabcdabcdabcd

Supybot

$ perceval supybot 'http://channel.example.com' /tmp/supybot/

Telegram

The Telegram backend needs an API token to authenticate the bot. In addition, to fetch messages from a group or channel, the bot's privacy settings must be disabled. To learn how to create a bot, obtain its token and configure it, please read the Telegram Bots documentation pages.

Note that the messages are available on the Telegram server until the bot fetches them, but they will not be kept longer than 24 hours.

$ perceval telegram mybot -t 12345678abcdefgh --chats 1 2 -10

Twitter

The Twitter backend needs a bearer token to authenticate the requests. It can be obtained using the code available on GitHub Gist: https://gist.github.com/valeriocos/7d4d28f72f53fbce49f1512ba77ef5f6

$ perceval twitter grimoirelab -t 12345678abcdefgh

Community Backends

Some backends are implemented in separate repositories and are not merged into chaoss/grimoirelab-perceval for long-term maintenance reasons. Please feel free to check these backends and contact their maintainers with any issues or questions related to them.

Running tests

Perceval comes with a comprehensive list of unit tests. To run them, in addition to the dependencies installed with Perceval, you need httpretty.

License

Licensed under GNU General Public License (GPL), version 3 or later.
