  • Stars
    39
  • Rank 693,563 (Top 14 %)
  • Language
    Python
  • License
    MIT License
  • Created almost 6 years ago
  • Updated about 5 years ago

Repository Details

Uses Screaming Frog's Internal HTML export (with text extraction enabled) and a shingling algorithm to measure content duplication across the pages of a crawled site.
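
The repository's own code is not shown on this page, so the following is only a minimal sketch of how shingle-based duplicate detection generally works. It assumes word-level shingles compared with Jaccard similarity; the function names (`shingles`, `jaccard`, `duplication_report`), the shingle size `k`, the `threshold`, and the `pages` dictionary standing in for text extracted from a crawl export are all hypothetical.

```python
import re
from itertools import combinations


def shingles(text, k=5):
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        # Text shorter than one window: treat the whole text as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: |intersection| / |union|."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def duplication_report(pages, k=5, threshold=0.5):
    """Compare every pair of pages and return those at or above the threshold.

    `pages` maps a URL to its extracted text (hypothetical input shape).
    Results are (url_a, url_b, score) tuples, most similar first.
    """
    sets = {url: shingles(text, k) for url, text in pages.items()}
    report = []
    for u, v in combinations(sorted(sets), 2):
        score = jaccard(sets[u], sets[v])
        if score >= threshold:
            report.append((u, v, score))
    return sorted(report, key=lambda row: row[2], reverse=True)
```

Comparing every pair is quadratic in the number of pages, which is acceptable for a small crawl; larger sites would typically need an approximation such as MinHash, which this sketch does not attempt.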

More Repositories

1

tech-seo-crawler

Build a small, three-domain internet using GitHub Pages and Wikipedia, and construct a crawler to crawl, render, and index it.
Python
67
star
2

ForecastGA

A Python tool to forecast Google Analytics data using several popular time series models.
Python
42
star
3

gsc-logger

Google Search Console Logger for Google App Engine
Python
42
star
4

querycat

A sample repo that uses the Apriori and FP-Growth algorithms to produce categories for queries, and BERT for PoP change visualization.
Jupyter Notebook
37
star
5

ghost-material

Materialize Theme For Ghost.js
JavaScript
36
star
6

glove-to-word2vec

Converts GloVe vectors into word2vec format for easy use with Gensim
Python
32
star
7

iCodeSEO

Repo for Content for iCodeSEO.dev
23
star
8

NodeRank

Content Extraction using the PageRank algorithm to find the element containing the best content.
Jupyter Notebook
12
star
9

Taxonomy

Jupyter Notebook
9
star
10

MassStructuredDataTester

Mass URL Checker for Google's Structured Data Testing Tool
Python
7
star
11

Nozzle2BigQuery

Local or Google Cloud Function for pulling Nozzle ranking data into BigQuery
Python
6
star
12

geneticML

Automatically refine Python code to meet specified objectives.
Python
6
star
13

Google-Data

Python
5
star
14

chord-rnn

Character and Word Level RNN, LSTM, GRU
Lua
4
star
15

WayDiffer

WayDiffer is a Streamlit application that compares website versions archived in the Wayback Machine.
Python
4
star
16

CrUX-Queries

3
star
17

Npath

Exploring path sequences in GA4 BigQuery data
Python
2
star
18

page-analytics-to-csv

Download Google Page Analytics to CSV
JavaScript
2
star
19

GA4Map

Jupyter Notebook
1
star
20

schema-playground

1
star
21

ecs-fargate-taskqueue

Uses AWS Lambda and Fargate for exposing an API for long running tasks.
Python
1
star
22

SF-Issues-to-Pages

Python
1
star
23

daily-seo

Uses a Gemini model to auto-curate content from 20+ individual user and website feeds. Feeds are set up with the rss.app tool because it supports pulling Twitter posts for specified users. The script runs via a GitHub Action every four hours.
Python
1
star