  • Stars: 125
  • Rank: 286,335 (Top 6 %)
  • Language: Python
  • License: MIT License
  • Created almost 8 years ago
  • Updated almost 4 years ago

Repository Details

Scrape GSoC organisations using a single script.

GSoC Organisation Scraper

Makes life easier by scraping GSoC organisations instead of searching for each one by name. It also shows the number of times an organisation has appeared in GSoC. Built with the Python Requests library and BeautifulSoup.
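As a rough illustration of the "number of times an organisation has appeared" idea, here is a minimal sketch. It is not taken from scrape.py: the function name count_appearances, the year-to-names input shape, and the sample data are all assumptions made for this example, and the step that fetches and parses the pages with Requests and BeautifulSoup is left out.

```python
from collections import Counter


def count_appearances(orgs_by_year):
    """Count in how many years each organisation name appears.

    orgs_by_year is a hypothetical dict of year -> list of organisation
    names, standing in for whatever the scraper collects per GSoC year.
    """
    counts = Counter()
    for names in orgs_by_year.values():
        # A set ensures an organisation listed twice in one year counts once.
        counts.update(set(names))
    return counts


# Made-up sample data, for illustration only.
sample = {
    2015: ["Org A", "Org B"],
    2016: ["Org A", "Org C", "Org C"],
    2017: ["Org A", "Org B"],
}
appearances = count_appearances(sample)
```

The result behaves like a dict, so appearances["Org A"] gives the count for a single organisation.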

Use Python 2.7.

Requirements :

  • BeautifulSoup
  • Requests

Instructions :

# Clone this repository
git clone https://github.com/rohithasrk/GSoC-Organisation-Scraper.git

# Go into the repository
cd GSoC-Organisation-Scraper

# Install dependencies
[sudo] pip2 install -r requirements.txt

# Run the app without passing a technology as a command-line argument
python2 scrape.py

# Enter the technology of preference when prompted.
# Example: python

# Run the app with a technology passed as a command-line argument
python2 scrape.py javascript

# To store the output in a text file, use output redirection
python2 scrape.py ruby > ruby_orgs
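
The two ways of choosing a technology shown above (command-line argument or prompt) could be handled along these lines. This is only a sketch under assumptions: get_technology is a hypothetical helper, not a function from scrape.py, and lowercasing the input is a choice made here for illustration.

```python
# Pick the line-reading function that exists on this interpreter.
try:
    read_line = raw_input  # Python 2
except NameError:
    read_line = input  # Python 3


def get_technology(argv, prompt=read_line):
    """Return the technology from argv if given, otherwise ask the user.

    argv is expected to look like sys.argv: the script name first, then
    any arguments. Lowercasing and stripping are assumptions, not
    documented behaviour of the original script.
    """
    if len(argv) > 1:
        return argv[1].strip().lower()
    return prompt("Enter the technology of preference: ").strip().lower()
```

In a script this would typically be called as get_technology(sys.argv) after importing sys. The prompt parameter is there so the function can be exercised without a terminal.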

Screenshots :

When searching for javascript and ruby, some of the results look as shown below.

[Screenshot: Python orgs 1]

[Screenshot: Python orgs 2]

TODOs :

  • Make the code run faster.
  • Remove duplicate results.
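
The second TODO item, read here as removing repeated entries from the output, could be approached roughly as follows. This is a sketch only: unique_in_order is a hypothetical helper, and treating names that differ only in case or surrounding whitespace as the same organisation is an assumption.

```python
def unique_in_order(items):
    """Return items with repeats removed, keeping first-seen order."""
    seen = set()
    result = []
    for item in items:
        # Assumption: compare names case-insensitively, ignoring outer spaces.
        key = item.strip().lower()
        if key not in seen:
            seen.add(key)
            result.append(item)
    return result


# Made-up sample data, for illustration only.
cleaned = unique_in_order(["Org A", "Org B", "org a ", "Org C", "Org B"])
```

Keeping the first occurrence preserves the order in which organisations were scraped.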

Contributing :

  • Fork the repo.
  • Create a new branch named <your_feature>.
  • Commit changes and make a PR.
  • PRs are welcome.

This program uses PyTerm-Colors: https://github.com/vinamarora8/PyTerm-Colors.git