Atom/RSS Feed Parsing for Clojure

feedparser-clj

Parse RSS/Atom feeds with a simple, Clojure-friendly API. Uses the Java ROME library and wraps its results in StructMaps.

Status

Usable for parsing and exploring feeds. No escaping of potentially-malicious content is performed, and we've inherited any quirks that ROME itself has.
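
Because feed content is returned unescaped, callers that render it into HTML need to escape it themselves. The following is a minimal sketch of one way to do that for text meant to be shown as plain text; escape-html is a hypothetical helper and is not part of feedparser-clj.

```clojure
(require '[clojure.string :as string])

;; Hypothetical helper (not part of feedparser-clj): escape the characters
;; that are significant in HTML before rendering untrusted feed text.
(defn escape-html [s]
  (string/escape (str s)
                 {\& "&amp;"
                  \< "&lt;"
                  \> "&gt;"
                  \" "&quot;"
                  \' "&#39;"}))

(escape-html "<script>alert(1)</script>")
;; => "&lt;script&gt;alert(1)&lt;/script&gt;"
```

This only covers the case where the value is displayed as text. Entry bodies that must keep their HTML markup would need a real sanitizer instead.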

Supports the following syndication formats:

  • RSS 0.90
  • RSS 0.91 Netscape
  • RSS 0.91 Userland
  • RSS 0.92
  • RSS 0.93
  • RSS 0.94
  • RSS 1.0
  • RSS 2.0
  • Atom 0.3
  • Atom 1.0

Usage

For details on the supported feed types and the meaning of each field, the ROME javadocs (under com.sun.syndication.feed.synd) are a good resource.

The library exposes a single function, parse-feed, which takes a feed URL and returns a StructMap containing the feed's structure and content.

The following REPL session illustrates the capabilities and usage of feedparser-clj.

Load the package into your namespace:

user=> (ns user (:require [feedparser-clj.core :refer [parse-feed]] [clojure.string :as string]))

Retrieve and parse a feed:

user=> (def f (parse-feed "http://gregheartsfield.com/atom.xml"))

parse-feed also accepts a java.io.InputStream for reading from a file or other sources (see clojure.java.io/input-stream):

;; Contents of resources/feed.rss
<rss>
  ...
</rss>

user=> (def f (with-open
                [feed-stream (-> "feed.rss"
                                 clojure.java.io/resource
                                 clojure.java.io/input-stream)]
                (parse-feed feed-stream)))
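
The same InputStream route can be used for feed XML that is already in memory, for example in tests. The sketch below is illustrative only: string->stream and sample-xml are hypothetical names, and the call to parse-feed is left as a comment because it needs feedparser-clj on the classpath.

```clojure
(import 'java.io.ByteArrayInputStream)

;; Hypothetical helper: wrap a string of feed XML in an InputStream.
(defn string->stream [^String xml]
  (ByteArrayInputStream. (.getBytes xml "UTF-8")))

;; Hypothetical sample document used only for illustration.
(def sample-xml
  (str "<rss version=\"2.0\"><channel>"
       "<title>Example</title>"
       "<link>http://example.com/</link>"
       "<description>Sample feed</description>"
       "</channel></rss>"))

;; With feedparser-clj loaded, the stream can be handed to parse-feed:
;; (with-open [in (string->stream sample-xml)]
;;   (parse-feed in))
```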

f is now a map that can be accessed by key to retrieve feed information:

user=> (keys f)
(:authors :author :categories :contributors :copyright :description :encoding :entries :feed-type :image :language :link :entry-links :published-date :title :uri)

A key applied to the feed gives the value, or nil if it was not defined for the feed.

user=> (:title f)
"Greg Heartsfield"

Feed/entry ID or GUID can be obtained with the :uri key:

user=> (:uri f)
"http://gregheartsfield.com/"

Some feed attributes are maps themselves (like :image) or lists of structs (like :entries and :authors):

user=> (map :email (:authors f))
("[email protected]")

Check how many entries are in the feed:

user=> (count (:entries f))
18

Determine the feed type:

user=> (:feed-type f)
"atom_1.0"
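
If code needs to branch on the feed format, the feed-type string can be reduced to a keyword. This is a rough sketch that assumes feed types follow the family_version pattern seen above; the "rss_" prefix is an assumption about ROME's naming, feed-family is a hypothetical helper, and string/starts-with? requires Clojure 1.8 or later.

```clojure
(require '[clojure.string :as string])

;; Hypothetical helper: reduce a feed-type string such as "atom_1.0"
;; to a keyword naming its format family.
(defn feed-family [feed-type]
  (cond
    (nil? feed-type)                        nil
    (string/starts-with? feed-type "atom_") :atom
    (string/starts-with? feed-type "rss_")  :rss   ; assumed prefix
    :else                                   :unknown))

(feed-family "atom_1.0")
;; => :atom
```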

Look at the first few entry titles:

user=> (map :title (take 3 (:entries f)))
("Version Control Diagrams with TikZ" "Introducing cabal2doap" "hS3, with ByteString")

Find the most recently updated entry's title:

user=> (first (map :title (reverse (sort-by :updated-date (:entries f)))))
"Version Control Diagrams with TikZ"
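
Some feeds leave :updated-date unset on some entries. The sketch below shows one way to fall back to another date when sorting. It runs on hypothetical sample maps rather than a real feed, and it assumes entries also carry a :published-date key; entry-date and latest-title are illustrative names.

```clojure
;; Hypothetical sample data shaped like feed entries; real entries
;; would come from (:entries f).
(def dated-entries
  [{:title "Old post"   :updated-date (java.util.Date. 1000000)}
   {:title "No update"  :updated-date nil
    :published-date (java.util.Date. 3000000)}
   {:title "Newer post" :updated-date (java.util.Date. 2000000)}])

;; Sort key: prefer :updated-date, fall back to :published-date (assumed key).
(defn entry-date [entry]
  (or (:updated-date entry) (:published-date entry)))

;; Title of the entry with the latest date, or nil for an empty collection.
(defn latest-title [entries]
  (:title (last (sort-by entry-date entries))))

(latest-title dated-entries)
;; => "No update"
```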

Compute what percentage of entries have the word "haskell" in the body (uses clojure.string):

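user=> (let [es (:entries f)]
         ;; string/includes? takes the string first, then the substring;
         ;; str guards against entries with no content body
         (* 100.0 (/ (count (filter #(string/includes?
                                       (str (:value (first (:contents %))))
                                       "haskell")
                                    es))
                     (count es))))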
55.55555555555556
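
The same calculation can be wrapped in a reusable function. The following sketch runs on hypothetical sample maps shaped like feed entries, matches case-insensitively (unlike the session above), and tolerates entries with no content; percent-mentioning is an illustrative name, not part of the library.

```clojure
(require '[clojure.string :as string])

;; Hypothetical sample data shaped like feed entries.
(def body-entries
  [{:contents [{:value "Notes on Haskell type classes"}]}
   {:contents [{:value "A post about Clojure"}]}
   {:contents []}
   {:contents [{:value "More haskell"}]}])

;; Percentage of entries whose first content body mentions `word`,
;; ignoring case. Returns 0.0 for an empty collection.
(defn percent-mentioning [word entries]
  (if (empty? entries)
    0.0
    (let [needle (string/lower-case word)
          hit?   (fn [entry]
                   (string/includes?
                    (string/lower-case (str (:value (first (:contents entry)))))
                    needle))]
      (* 100.0 (/ (count (filter hit? entries)) (count entries))))))

(percent-mentioning "haskell" body-entries)
;; => 50.0
```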

Installation

This library uses the Leiningen build tool.

ROME and JDOM are required dependencies, which may have to be manually retrieved and installed with Maven. After that, simply clone this repository, and run:

lein install

License

Distributed under the BSD-3 License.

Copyright

Copyright (C) 2010 Greg Heartsfield
