instascrape: powerful Instagram data scraping toolkit
Note: This module is no longer actively maintained.
DISCLAIMER:
Instagram has gotten increasingly strict with scraping and using this library can result in getting flagged for botting AND POSSIBLE DISABLING OF YOUR INSTAGRAM ACCOUNT. This is a research project and I am not responsible for how you use it. Independently, the library is designed to be responsible and respectful and it is up to you to decide what you do with it. I don't claim any responsibility if your Instagram account is affected by how you use this library.
What is it?
instascrape is a lightweight Python package that provides an expressive and flexible API for scraping Instagram data. It is geared towards being a high-level building block on the data scientist's toolchain and can be seamlessly integrated and extended with industry standard tools for web scraping, data science, and analysis.
Key features
Here are a few of the things that instascrape
does well:
- Powerful, object-oriented scraping tools for profiles, posts, hashtags, reels, and IGTV
- Scrapes HTML, BeautifulSoup, and JSON
- Download content to your computer as png, jpg, mp4, and mp3
- Dynamically retrieve HTML embed code for posts
- Expressive and consistent API for concise and elegant code
- Designed for seamless integration with Selenium, Pandas, and other industry standard tools for data collection and analysis
- Lightweight; no boilerplate or configurations necessary
- The only hard dependencies are Requests and Beautiful Soup
Table of Contents
💻 Installation
Minimum Python version
This library currently requires Python 3.7 or higher.
pip
Install from PyPI using
$ pip3 install insta-scrape
WARNING: make sure you install insta-scrape and not a package with a similar name!
🔎 Sample Usage
All top-level, ready-to-use features can be imported using:
from instascrape import *
instascrape uses clean, consistent, and expressive syntax to make the developer experience as painless as possible.
# Instantiate the scraper objects
google = Profile('https://www.instagram.com/google/')
google_post = Post('https://www.instagram.com/p/CG0UU3ylXnv/')
google_hashtag = Hashtag('https://www.instagram.com/explore/tags/google/')
# Scrape their respective data
google.scrape()
google_post.scrape()
google_hashtag.scrape()
print(google.followers)
print(google_post['hashtags'])
print(google_hashtag.amount_of_posts)
>>> 12262794
>>> ['growwithgoogle']
>>> 9053408
See the Scraped data points section of the Wiki for a complete list of the scraped attributes provided by each scraper.
📚 Documentation
The official documentation can be found on Read The Docs
📰 Blog Posts
Check out blog posts on the official site or DEV for ideas and tutorials!
- Scrape data from Instagram with instascrape
- Visualizing Instagram engagement with instascrape
- Exploratory data analysis of Instagram using instascrape and Python
- Creating a scatter matrix of Instagram data using Python
- Downloading an Instagram profile's recent photos using Python
- Scraping 25,000 data points from Joe Biden's Instagram using instascrape
- Compare major tech Instagram page's with instascrape
- Tracking an Instagram posts engagement in real time with instascrape
- Dynamically generate embeddable Instagram HTML with instascrape
- Scraping an Instagram location tag with instascrape
- Scraping Instagram reels with instascrape
- Scraping IGTV data with instascrape
- Scraping 10,000 data points from Donald Trump's Instagram with Python
🙏 Contributing
All contributions, bug reports, bug fixes, documentation improvements, enhancements, and ideas are welcome!
Feel free to open an Issue, check out existing Issues, or start a discussion.
Beginners to open source are highly encouraged to participate and ask questions if you're unsure what to do/where to start
🕸️ Dependencies
💳 License
This library operates under the MIT license.
❔ Support
Check out the FAQ
Reach out to me if you want to connect or have any questions and I will do my best to get back to you
- Email:
- Twitter:
- Personal contact form: