• Stars: 1
• Language: Clojure
• Created over 11 years ago
• Updated over 10 years ago

Repository Details

Clojure wrapper around a Java library to read warc files.

More Repositories

1. pegasus
   🐎✈️ Pegasus is a scalable, modular, polite web-crawler for Clojure
   Clojure, 258 stars

2. Listener
   Detect calls of attention in the surroundings
   Python, 52 stars

3. clj-lmdb
   Clojure wrapper for lmdb
   Clojure, 36 stars

4. subotai
   Subotai brings routines for extracting information from HTML documents to clojure
   Clojure, 25 stars

5. fort-knox
   A disk-backed core.cache implementation based on LMDB
   Clojure, 23 stars

6. sleipnir
   A simple, performant web-crawler for clojure
   Clojure, 17 stars

7. clojure-manifold
   Manifold learning algorithms in clojure
   Clojure, 15 stars

8. polyglot-toolbox
   Polyglot skipgram embeddings, and their many health benefits
   Python, 11 stars

9. vad_python
   A solid VAD in Python
   Python, 9 stars

10. VAD-py
    Webrtc VAD in Python
    C, 9 stars

11. JPredict
    Applying ML Techniques to Predict Drawn Japanese Characters. Currently Hiragana is implemented
    C#, 8 stars

12. robust_pcp
    Robust Principal Component Pursuit
    Python, 7 stars

13. clojure_scraping_overview
    XPath and enlive
    Clojure, 7 stars

14. tinywm-rkt
    TinyWM Implementation in Racket
    Racket, 6 stars

15. tree-edit-distance
    An implementation of a tree-edit-distance algorithm for structure-based clustering in clojure
    Clojure, 5 stars

16. kublai
    Truncated matrix decompositions for core.matrix
    Clojure, 4 stars

17. sutime-clojure
    A wrapper around the Time NER Tagger in Stanford Core NLP Suite.
    Clojure, 3 stars

18. enlive-helper
    A more powerful html-resource for use with enlive's functions
    Clojure, 3 stars

19. clj-heritrix
    Clojure implementation of the heritrix REST API
    Clojure, 2 stars

20. structural_similarity
    Compare html documents for similarity in structure (or template)
    Clojure, 2 stars

21. probabilistic-counting
    Cardinality estimation algorithms in clojure
    Clojure, 2 stars

22. crawler
    ephemeral content finder
    Clojure, 2 stars

23. clj-named-leveldb
    named databases for leveldb using one simple hack they don't want you to know
    Clojure, 1 star

24. trec
    Trec Federated Search Track
    Python, 1 star

25. satcharitra
    Clojure, 1 star

26. pegasus-examples
    Pegasus Examples
    Clojure, 1 star

27. clj-spectral
    Spectral algorithms in clojure targeting core.matrix
    Clojure, 1 star

28. pgm-indian-buffet-process
    Scribe Notes for CMU 10-708 Lecture on Indian Buffet Process
    1 star

29. racket-whistlepig
    Racket bindings to the whistlepig engine
    Racket, 1 star

30. clojure-kindle-highlights
    Scrape the kindle highlights webpage and download the highlights for a book from there.
    Clojure, 1 star

31. geojson3d
    3d Render GeoJsons
    JavaScript, 1 star

32. heritrix-clojure
    Heritrix API implementation in clojure (a bit of a kludge at the moment)
    Clojure, 1 star

33. index-page-crawler
    Follow pagination and get pages
    Clojure, 1 star

34. web-corpus
    Clueweb web corpus pipeline
    Clojure, 1 star

35. india_in_data
    India in data source, datasets, materials
    1 star

36. consistent-hashing
    Consistent hashing implementation in clojure
    Java, 1 star

37. clj-dimension
    Algorithms to study and reduce dimensions of datasets
    Clojure, 1 star