  • Stars: 121
  • Rank 293,924 (Top 6%)
  • Language: Jupyter Notebook
  • License: MIT License
  • Created almost 9 years ago
  • Updated about 2 years ago

Repository Details

A scraper for the data made available by the Italian Senate, and a cluster analysis to detect similar amendments.

senato.py

Automated Clustering of Similar Amendments in the Italian Senate

The problem

The Italian Senate is under a Denial-of-Service attack. Software is being used to generate millions of amendments to block the passing of certain laws. The amendments are generated using a black-hat technique that produces several variations of a given text. This puts a huge strain on the Senate, which has to discuss and vote on the individual amendments, effectively bringing proceedings to a standstill.

The solution

The Italian Senate makes its data available publicly. An automated clustering analysis can be performed on these data to eliminate what are essentially duplicate amendments and reduce the total number of amendments that have to be considered.

clusters.png

senato.py is a scraper for data from the Senate. The data can be analysed using the Jupyter notebook provided in this repository.
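
To make the idea of grouping near-duplicate amendments concrete, here is a minimal sketch in Python. It is not taken from the notebook in this repository, whose actual method is not described here. It assumes one simple approach: compare texts by the Jaccard similarity of their word 3-grams ("shingles") and merge similar texts with a union-find structure. The function names, the 0.5 threshold, and the sample texts are all hypothetical and chosen only for illustration.

```python
import re
from itertools import combinations


def shingles(text, size=3):
    """Return the set of word n-grams (shingles) found in a text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < size:
        return {tuple(words)}
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster_amendments(texts, threshold=0.5, size=3):
    """Group texts whose shingle sets have Jaccard similarity >= threshold.

    Union-find makes membership transitive (single-linkage clustering).
    Every pair is compared, so the cost grows quadratically with the
    number of texts; this is a sketch, not a scalable implementation.
    """
    parent = list(range(len(texts)))

    def find(i):
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    sets = [shingles(t, size) for t in texts]
    for i, j in combinations(range(len(texts)), 2):
        if jaccard(sets[i], sets[j]) >= threshold:
            parent[find(j)] = find(i)

    clusters = {}
    for i in range(len(texts)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Invented sample texts (not real amendments): the first two differ by one word.
sample_texts = [
    "In paragraph 1, replace the word two with the word three.",
    "In paragraph 1, replace the word two with the word four.",
    "Delete article 5 entirely.",
]

if __name__ == "__main__":
    for group in cluster_amendments(sample_texts):
        print([sample_texts[i] for i in group])
```

Each returned cluster is a list of indices into the input, so only one representative per cluster would need to be examined. For millions of texts the all-pairs comparison would be too slow, and an approximate technique such as MinHash would likely be needed instead.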

Installation and Usage

  1. Clone this repository: git clone https://github.com/jacquerie/senato.py.git
  2. Install the dependencies: cd senato.py && pip install -r requirements.txt
  3. Fetch the amendments by running the scraper: scrapy crawl cirinna
  4. Examine the analysis by running the notebook: jupyter notebook cirinna.ipynb

About senato.py

senato.py is authored by Jacopo Notarstefano (@Jaconotar). You can learn more about it by watching this short "lightning talk" given by Jacopo at CERN on 17 June 2016.

License

MIT
