A universal package of scraper scripts for humans

Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. Contributing
  5. License
  6. Sponsors
  7. Contact
  8. Acknowledgements

About The Project

Scrapera is a Chromedriver-free package that provides scraper scripts for the most commonly used machine learning and data science domains. It scrapes public API endpoints directly and asynchronously, which removes the heavy browser overhead and makes Scrapera fast and robust to DOM changes. Scrapera currently supports the following crawlers:

  • Images
  • Text
  • Audio
  • Videos
  • Miscellaneous
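
To illustrate the idea of scraping API endpoints asynchronously instead of driving a browser, here is a minimal sketch. It is not Scrapera's actual implementation: `fetch_json` and `scrape_all` are hypothetical names, and the network request is replaced by a short `asyncio.sleep` placeholder so the example runs offline.

```python
import asyncio


async def fetch_json(endpoint: str) -> dict:
    """Hypothetical stand-in for an HTTP request to a public API endpoint."""
    # Assumption: a real scraper would await an HTTP client call here.
    await asyncio.sleep(0.01)
    return {"endpoint": endpoint, "status": "ok"}


async def scrape_all(endpoints, max_concurrent=5):
    """Fetch all endpoints concurrently, at most max_concurrent at a time."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def bounded(endpoint):
        # The semaphore caps how many requests are in flight at once.
        async with semaphore:
            return await fetch_json(endpoint)

    # gather returns results in the same order as the input endpoints.
    return await asyncio.gather(*(bounded(e) for e in endpoints))


if __name__ == "__main__":
    results = asyncio.run(scrape_all(["/images", "/text", "/audio"]))
    for result in results:
        print(result)
```

Because no browser has to start or render a page, the cost per request in such a design is mostly the network round trip.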

  • The main aim of this package is to group common scraping tasks in one place, so that ML researchers and engineers can focus on their models rather than on the data collection process.

    DISCLAIMER: The owner and contributors take no responsibility for misuse of data obtained through Scrapera. Contact the owner if any module provided by Scrapera violates copyright terms.

    Prerequisites

    The prerequisites can be installed separately from the requirements.txt file:

    pip install -r requirements.txt

    Installation

    Scrapera is built with Python 3 and can be installed directly with pip:

    pip install scrapera

    Alternatively, to install the latest version directly from GitHub, run:

    pip install git+https://github.com/DarshanDeshpande/Scrapera.git

    Usage

    To use any sub-module, you just need to import, instantiate, and execute:

    from scrapera.video.vimeo import VimeoScraper
    scraper = VimeoScraper()
    scraper.scrape('https://vimeo.com/191955190', '540p')

    For more examples, please refer to the individual test folders in the respective modules.
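
When scraping several URLs in one run, it can help to keep going after a failure and collect the errors for a later issue report. The sketch below is a hypothetical helper, not part of Scrapera. It only assumes a scraper object with the `scrape(url, quality)` call shown above, and it further assumes that a failed scrape raises an exception, which this README does not confirm.

```python
def scrape_many(scraper, urls, quality="540p"):
    """Call scraper.scrape on each URL and record which ones failed.

    Hypothetical helper: `scraper` is any object exposing
    scrape(url, quality), such as the VimeoScraper instance shown above.
    """
    succeeded = []
    failed = {}
    for url in urls:
        try:
            scraper.scrape(url, quality)
            succeeded.append(url)
        except Exception as exc:
            # Assumption: failures surface as exceptions.
            failed[url] = repr(exc)
    return succeeded, failed


# Example usage (requires scrapera to be installed):
# from scrapera.video.vimeo import VimeoScraper
# ok, bad = scrape_many(VimeoScraper(), ['https://vimeo.com/191955190'])
```

The `failed` mapping pairs each URL with the error it produced, which is the kind of detail worth including when raising an issue.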

    Contributing

    Scrapera welcomes any and all contributions and scraper requests. Please raise an issue if a scraper fails. Feel free to fork the repository and add your own scrapers to help the community!
    For more guidelines, refer to CONTRIBUTING.

    License

    Distributed under the MIT License. See LICENSE for more information.

    Sponsors

    Contact

    Feel free to reach out with any issues or requests related to Scrapera.

    Darshan Deshpande (Owner) - Email | LinkedIn

    Acknowledgements
