  • Stars: 179
  • Rank: 214,039 (top 5%)
  • Language: JavaScript
  • Created over 8 years ago
  • Updated over 8 years ago

Repository Details

Web crawler for zhihu.com

# Zhihu relationship network crawler

DEMO


# Usage

1. Initialize

git clone https://github.com/starkwang/Spider.git && cd Spider

npm run init

2. Configure

Using server.config.example.js and spider.config.example.js as references, create your own server.config.js and spider.config.js.

3. Build and start

npm run build

npm run start // Server runs at localhost:3000

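# Configuration

1. spider.config.js

  • cookie [string] (required): your own Zhihu cookie
  • _xsrf [string] (required): your own Zhihu _xsrf token
  • concurrency [number] (optional): number of concurrent requests, 3 by default

Because Zhihu's API is rather unstable, a concurrency value that is too high may cause the crawler to hang. On a poor network connection, setting it to 2 or 1 is recommended.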
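As a rough illustration of the options above, a spider.config.js might look like the sketch below. The CommonJS export shape, the placeholder values, and the `normalizeSpiderConfig` helper are assumptions made for illustration and are not taken from the repository; spider.config.example.js remains the authoritative template.

```javascript
// Hypothetical sketch of spider.config.js. The real file shape is defined by
// spider.config.example.js in the repository; this only mirrors the documented options.
const spiderConfig = {
    cookie: 'PASTE_YOUR_ZHIHU_COOKIE_HERE', // required, string
    _xsrf: 'PASTE_YOUR_XSRF_TOKEN_HERE',    // required, string
    concurrency: 2                          // optional, number (documented default: 3)
};

// Hypothetical helper (not part of the project): checks the required fields
// and applies the documented default for concurrency.
function normalizeSpiderConfig(config) {
    for (const key of ['cookie', '_xsrf']) {
        if (typeof config[key] !== 'string' || config[key].length === 0) {
            throw new Error(`spider.config.js: "${key}" must be a non-empty string`);
        }
    }
    const concurrency = config.concurrency === undefined ? 3 : config.concurrency;
    if (!Number.isInteger(concurrency) || concurrency < 1) {
        throw new Error('spider.config.js: "concurrency" must be a positive integer');
    }
    return { cookie: config.cookie, _xsrf: config._xsrf, concurrency };
}

module.exports = spiderConfig;
```

Validating the file up front is a design choice of this sketch: a missing cookie then fails immediately with a readable message rather than as a rejected request later on.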

2. server.config.js

  • socketPort [number] (required): port used for the WebSocket server
  • httpPort [number] (required): port used for the HTTP server
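A server.config.js along these lines would match the two options above. This is only a sketch: the CommonJS export, the socketPort value of 3001, and the `checkServerConfig` helper are assumptions for illustration, and the check that the two ports differ assumes the WebSocket and HTTP servers listen separately. The httpPort of 3000 matches the address mentioned in the usage steps.

```javascript
// Hypothetical sketch of server.config.js; see server.config.example.js
// in the repository for the real shape.
const serverConfig = {
    socketPort: 3001, // required, number: WebSocket port (example value, an assumption)
    httpPort: 3000    // required, number: HTTP port
};

// Hypothetical helper (not part of the project): both ports must be valid
// TCP port numbers, and this sketch assumes they must not collide.
function checkServerConfig(config) {
    const isPort = (value) => Number.isInteger(value) && value > 0 && value < 65536;
    for (const key of ['socketPort', 'httpPort']) {
        if (!isPort(config[key])) {
            throw new Error(`server.config.js: "${key}" must be an integer between 1 and 65535`);
        }
    }
    if (config.socketPort === config.httpPort) {
        throw new Error('server.config.js: socketPort and httpPort must differ');
    }
    return config;
}

module.exports = serverConfig;
```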

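### Appendix: how to obtain the cookie and _xsrf values

Open the followers page of any Zhihu user, for example https://www.zhihu.com/people/starkwei/followers

Open the browser's developer console and select the Network tab: DEMO

Scroll down the page. More followers load automatically, and you can see that several requests are made to the /node/ProfileFollowersListV2 endpoint: DEMO

Open the request details; the Cookie and _xsrf values are in there: DEMO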
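If the Cookie header you copied happens to contain an `_xsrf` entry (an assumption; the token may instead appear only in the request's form data), a small helper can pull it out of the raw string. The `parseCookieHeader` function and the sample cookie value below are made up for illustration.

```javascript
// Hypothetical helper: split a raw Cookie request header into name/value pairs.
function parseCookieHeader(header) {
    const pairs = {};
    for (const part of header.split(';')) {
        const index = part.indexOf('=');
        if (index === -1) continue; // skip fragments without a name=value shape
        const name = part.slice(0, index).trim();
        const value = part.slice(index + 1).trim(); // keep any later '=' inside the value
        if (name) pairs[name] = value;
    }
    return pairs;
}

// Made-up sample value; a real Cookie header is much longer.
const rawCookie = 'q_c1=abc123; _xsrf=0123456789abcdef; z_c0="token=value"';
const { _xsrf } = parseCookieHeader(rawCookie);
console.log(_xsrf); // 0123456789abcdef
```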


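# Known bugs and limitations

  1. For "big V" accounts with a very large number of followers, crawling is too slow.
  2. When you yourself are among the mutual follows, the relationship chains that involve you cannot be crawled.
  3. Failed or timed-out requests are not retried, which may leave some data missing.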
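One possible way to address item 3 is to wrap each request in a timeout plus a bounded retry loop. The sketch below is not the repository's code: `withTimeout`, `withRetry`, and their option names are hypothetical, and `task` stands in for whatever function performs a single request and returns a promise.

```javascript
// Hypothetical helper: reject if `promise` has not settled within `ms` milliseconds.
function withTimeout(promise, ms) {
    let timer;
    const timeout = new Promise((_, reject) => {
        timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
    });
    // Whichever settles first wins; the timer is cleared either way.
    return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Hypothetical helper: run `task` up to `retries + 1` times, pausing `delayMs`
// between attempts, and rethrow the last error if every attempt fails.
async function withRetry(task, { retries = 3, timeoutMs = 5000, delayMs = 500 } = {}) {
    let lastError;
    for (let attempt = 0; attempt <= retries; attempt++) {
        try {
            return await withTimeout(task(attempt), timeoutMs);
        } catch (error) {
            lastError = error;
            if (attempt < retries) {
                await new Promise((resolve) => setTimeout(resolve, delayMs));
            }
        }
    }
    throw lastError;
}
```

A call such as `withRetry(() => fetchFollowers(user), { retries: 2 })` would then retry a failed page fetch twice before giving up; `fetchFollowers` is a placeholder name here. Note that a timed-out attempt is abandoned rather than cancelled, so the underlying request may still complete in the background.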
