• Stars: 368
• Rank: 115,958 (Top 3%)
• Language: Python
• Created: almost 8 years ago
• Updated: about 1 year ago

Repository Details

A Google search results crawler: get the Google search results that you need.

magic_google

1. What is magic_google?

magic_google is an easy-to-use Google search crawler that lets you extract whatever you need from the results page.

While crawling, keep in mind that Google limits requests per IP address and may respond with warning or exception pages, so I suggest pausing between requests and using your own proxy IPs.
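
As a rough sketch of the proxy advice above: the example in section 2 passes proxies as a list of dicts with 'http' and 'https' keys. Assuming your proxies are plain HTTP proxies given as host:port strings, a small helper could build that list. The function name `build_proxies` is hypothetical and is not part of magic_google.

```python
def build_proxies(addresses):
    """Turn 'host:port' strings into a list of {'http': ..., 'https': ...} dicts.

    Assumption: every proxy is reached over plain HTTP, as in the README example.
    """
    proxies = []
    for address in addresses:
        url = "http://" + address.strip()
        # The same proxy URL is used for both HTTP and HTTPS traffic.
        proxies.append({"http": url, "https": url})
    return proxies
```

The returned list has the same shape as the PROXIES value in the example, so it could be passed to MagicGoogle in the same way.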

PHP version: MagicGoogle

2. How to use

Run

pip install magic_google
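# Or install the latest version from GitHub
pip install git+https://github.com/howie6879/magic_google.git
# Or clone the repository and edit google_search.py
git clone https://github.com/howie6879/magic_google.git
cd magic_google
vim google_search.py
# Or install from the cloned source
python setup.py install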

Example

from magic_google import MagicGoogle
import pprint

# Use PROXIES = None to connect without a proxy
PROXIES = [{
    'http': 'http://192.168.2.207:1080',
    'https': 'http://192.168.2.207:1080'
}]

# MagicGoogle() with no arguments also works
mg = MagicGoogle(PROXIES)

# Crawl the whole results page
result = mg.search_page(query='python')

# Crawl the result URLs
for url in mg.search_url(query='python'):
    pprint.pprint(url)

# Output
# 'https://www.python.org/'
# 'https://www.python.org/downloads/'
# 'https://www.python.org/about/gettingstarted/'
# 'https://docs.python.org/2/tutorial/'
# 'https://docs.python.org/'
# 'https://en.wikipedia.org/wiki/Python_(programming_language)'
# 'https://www.codecademy.com/courses/introduction-to-python-6WeG3/0?curriculum_id=4f89dab3d788890003000096'
# 'https://www.codecademy.com/learn/python'
# 'https://developers.google.com/edu/python/'
# 'https://learnpythonthehardway.org/book/'
# 'https://www.continuum.io/downloads'

# Get a dict with the keys 'title', 'url' and 'text' for each result
for i in mg.search(query='python', num=1):
    pprint.pprint(i)

# Output
# {'text': 'The official home of the Python Programming Language.',
# 'title': 'Welcome to Python .org',
# 'url': 'https://www.python.org/'}

See google_search.py for a complete example.

If you need to run a large number of queries but only have one IP address, I suggest pausing 5 to 30 seconds between requests.
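
The pause suggested above could be sketched as follows. This is a minimal illustration, not part of magic_google: `run_throttled` and its parameters are hypothetical names, and `search` stands for any callable that takes a query string.

```python
import random
import time


def run_throttled(queries, search, min_delay=5.0, max_delay=30.0, sleep=time.sleep):
    """Call search(query) for each query, pausing a random interval between calls.

    `search` is a hypothetical callable supplied by the caller.
    `sleep` is injectable so the pause can be replaced in tests.
    """
    results = {}
    for index, query in enumerate(queries):
        if index > 0:
            # Wait 5 to 30 seconds (by default) before every request after the first.
            sleep(random.uniform(min_delay, max_delay))
        results[query] = search(query)
    return results
```

With the `mg` object from the example, one possible `search` argument would be `lambda q: list(mg.search_url(query=q))`. A random delay is used rather than a fixed one so that requests do not arrive at a regular rhythm.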

If searches always return empty results, the likely reason is that Google is redirecting your requests to its block page, with a response like this:

<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>302 Moved</TITLE></HEAD><BODY>
<H1>302 Moved</H1>
The document has moved
<A HREF="https://ipv4.google.com/sorry/index?continue=https://www.google.me/s****">here</A>.
</BODY></HTML>
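
A response like the one above could be detected with a simple text check. This is only a sketch: `looks_blocked` is a hypothetical helper, and it assumes you have the response body as a string (the README does not state what `search_page` returns).

```python
def looks_blocked(html):
    """Return True if the body resembles the '302 Moved' redirect to Google's sorry page."""
    if not html:
        return False
    text = html.lower()
    # Both markers appear in the sample response: the 302 title and the /sorry/ link.
    return "302 moved" in text and "/sorry/" in text
```

When this check returns True, pausing longer or switching to another proxy before retrying is probably the sensible reaction.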

More Repositories

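1. owllook (Python, 2,549 stars): owllook, a novel search engine
2. ruia (Python, 1,730 stars): Async Python 3.6+ web scraping micro-framework based on asyncio
3. mlhub123 (852 stars): A collection of machine learning and deep learning website resources (Machine Learning Resources)
4. liuli (Python, 852 stars): Build a multi-source, clean and personalized reading environment in one stop.
5. weekly (Python, 593 stars): Lao Hu's information technology weekly ❤️, a record of valuable information I came across this week, covering outstanding projects, software, tutorials, websites, and more.
6. Sanic-For-Pythoneer (Python, 387 stars): 📚 A Sanic tutorial, published as a small open-source book
7. NIYT (Go, 154 stars): Read the novel in your terminal - NIYT
8. examiner (Python, 143 stars): Monitors the operating system's notification center (WeChat, DingTalk, QQ, or any app with notifications enabled) and lets you write matching handler scripts
9. owllook_api (Go, 130 stars): owllook, a simple and elegant novel API 🎉
10. ITBooks (Python, 106 stars): Get IT books for free from ebook websites such as allitebooks, digilibraries, etc.
11. owllook_gui (Python, 86 stars): A simple and elegant novel monitoring tool 🎉
12. hproxy (Python, 66 stars): hproxy - Asynchronous IP proxy pool for crawlers, aims to make getting proxies as convenient as possible.
13. talospider (Python, 54 stars): talospider - A simple, lightweight scraping micro-framework
14. pylab (Jupyter Notebook, 51 stars): Python-related study notes on machine learning, algorithms, advanced books, and documentation; blog: https://www.howie6879.cn
15. getNews (Python, 44 stars): Internet news recommendation system (myNews), an entry in the enterprise-topic track of the 2016 National Computer Design Competition
16. w2b (Python, 40 stars): Automatically parses articles received in WeChat and syncs them to Bear
17. k8s_note (29 stars): k8s study notes
18. php-google (PHP, 29 stars): Google search results crawler, get google search results that you need - php
19. anan (Python, 24 stars): An An, a parenting and medical Q&A bot
20. book_swop (21 stars): A plan for giving away and exchanging second-hand books
21. sanic_annotation (Python, 18 stars): Annotated Sanic source code, for learning
22. coolshell_qa (Python, 17 stars): CoolShell blog backup and a ChatGPT-based Q&A bot
23. instdd (Python, 15 stars): Instagram Photos Download - Save Instagram photos and videos online
24. mac-soft (15 stars): A record of software projects I have used or come across while using macOS
25. howie6879.github.io (HTML, 9 stars): Just do your best
26. py_project_template (Python, 7 stars): Python project template for you
27. weeklyhub (5 stars): Gathers curated, high-quality technical weeklies to give you a high-quality information feed
28. leaf (Python, 5 stars): A CLI tool for hiding the application's icon in the macOS Dock.
29. monkey (Python, 5 stars): Search engine for programmers
30. importData (Python, 4 stars): Imports data in formats such as CSV, XLS, and JSON into MySQL
31. expire (Python, 4 stars): Expire aims to make using cache as convenient as possible.
32. gpt123.ai-daily (Python, 3 stars): Lao Hu's daily ChatGPT news feed
33. ml_note (Python, 3 stars): My machine learning notes
34. liuli_backup (HTML, 1 star): Archive of articles from the Liuli reading environment
35. Mastering-Python (Jupyter Notebook, 1 star): Mastering Python, notes from reading Python-related books
36. nand2tetris (Scilab, 1 star): ✍️ The Elements of Computing Systems: building a modern computer from scratch
37. howie6879 (1 star)
38. vim_config (Vim Script, 1 star): vim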