View, capture, and archive Web pages in Org-mode

org-web-tools

This file contains library functions and commands useful for retrieving web page content and processing it into Org-mode content.

For example, you can copy a URL to the clipboard or kill-ring, then run a command that downloads the page, isolates the “readable” content with eww-readable, converts it to Org-mode content with Pandoc, and displays it in an Org-mode buffer. Another command does all of that but inserts it as an Org entry instead of displaying it in a new buffer.

Installation

Requirements

  • Emacs 25.1 or later.
  • Commands that process HTML into Org require Pandoc. Note: The output of current Pandoc versions differs substantially from that of older versions, which may still be present in stable Linux distros. If you encounter any issues, please install a more recent version of Pandoc.
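
As a minimal init-file sketch of the Pandoc requirement above, the following checks whether a pandoc executable can be found and warns if it cannot. It relies only on the built-in executable-find and display-warning; the warning text is illustrative, and the check does not verify the Pandoc version.

```elisp
;; Init-file sketch: warn early if Pandoc is missing.
;; `executable-find' returns nil when the program is not on `exec-path'.
;; This only checks for presence, not for a sufficiently recent version.
(unless (executable-find "pandoc")
  (display-warning 'org-web-tools
                   "Pandoc not found; commands that convert HTML to Org will not work"
                   :warning))
```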

MELPA

If you installed from MELPA, just run one of the commands below. If you want to use any of the functions in your own code, you should (require 'org-web-tools).

Manual

Install dash.el, esxml, request, and s.el. Then require this package in your init file:

(require 'org-web-tools)

Usage

Commands

  • org-web-tools-insert-link-for-url: Insert an Org-mode link to the URL in the clipboard or kill-ring. Downloads the page to get the HTML title.
  • org-web-tools-insert-web-page-as-entry: Insert the web page for the URL in the clipboard or kill-ring as an Org-mode entry, as a sibling heading of the current entry.
  • org-web-tools-read-url-as-org: Display the web page for the URL in the clipboard or kill-ring as Org-mode text in a new buffer, processed with eww-readable.
  • org-web-tools-convert-links-to-page-entries: Convert all URLs and Org links in current Org entry to Org headings, each containing the web page content of that URL, converted to Org-mode text and processed with eww-readable. This should be called on an entry that solely contains a list of URLs or links.
  • org-web-tools-archive-attach: Download archive of page at URL and attach with org-attach. If CHOOSE-FN is non-nil (interactively, with universal prefix), prompt for the archive function to use. If VIEW is non-nil (interactively, with two universal prefixes), view the archive immediately after attaching. (See also org-board).
  • org-web-tools-archive-view: Open Zip file archive of web page. Extracts to a temp directory and opens with browse-url-default-browser. Note: the extracted files are left on-disk in the temp directory.
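
The commands above have no default key bindings mentioned here, so one way to make them convenient is to bind them in your init file. The fragment below is a sketch under assumptions: the "C-c w" prefix and the individual keys are hypothetical choices, not bindings provided by the package.

```elisp
;; Init-file sketch. The "C-c w ..." keys are hypothetical; pick any that
;; are free in your configuration.
(require 'org-web-tools)

;; Commands that insert into the current Org buffer are bound in Org only.
(with-eval-after-load 'org
  (define-key org-mode-map (kbd "C-c w l") #'org-web-tools-insert-link-for-url)
  (define-key org-mode-map (kbd "C-c w e") #'org-web-tools-insert-web-page-as-entry)
  (define-key org-mode-map (kbd "C-c w a") #'org-web-tools-archive-attach))

;; Reading a page in a new buffer is useful anywhere, so bind it globally.
(global-set-key (kbd "C-c w r") #'org-web-tools-read-url-as-org)
```

With bindings like these, copying a URL in the browser and pressing the chosen key in an Org buffer inserts the link or the page entry at point.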

Functions

These are used in the commands above and may be useful in building your own commands.

  • org-web-tools--dom-to-html: Return parsed HTML DOM as an HTML string. Note: This is an approximation and is not necessarily correct HTML (e.g. IMG tags may be rendered with a closing “</img>” tag).
  • org-web-tools--eww-readable: Return “readable” part of HTML with title.
  • org-web-tools--get-url: Return content for URL as string.
  • org-web-tools--html-title: Return title of HTML page.
  • org-web-tools--html-to-org-with-pandoc: Return string of HTML converted to Org with Pandoc. When SELECTOR is non-nil, the HTML is filtered using esxml-query SELECTOR and re-rendered to HTML with org-web-tools--dom-to-html, which see.
  • org-web-tools--url-as-readable-org: Return string containing Org entry of URL’s web page content. Content is processed with eww-readable and Pandoc. Entry will be a top-level heading, with article contents below a second-level “Article” heading, and a timestamp in the first-level entry for writing comments.
  • org-web-tools--demote-headings-below: Demote all headings in buffer so the highest level is below LEVEL.
  • org-web-tools--get-first-url: Return URL in clipboard, or first URL in the kill-ring, or nil if none.
  • org-web-tools--read-url: Return a URL by searching at point, then in clipboard, then in kill-ring, and finally prompting the user.
  • org-web-tools--read-org-bracket-link: Return (TARGET . DESCRIPTION) for Org bracket LINK or next link on current line.
  • org-web-tools--remove-dos-crlf: Remove all DOS CRLF (^M) in buffer.

Changelog

1.2-pre

Compatibility

Fixed

  • org-web-tools--org-link-for-url now returns the URL if the HTML page has no title tag. This avoids an error, e.g. when used in an Org capture template.

Improvements

  • Archiving tools:
    • Can use multiple functions to attempt archiving.
    • Associated options control retry attempts, delays, and fallbacks to other functions.
    • Functions to archive Web pages with wget and tar:
      • Function org-web-tools-archive--wget-tar archives a URL’s Web page, including page resources.
      • Function org-web-tools-archive--wget-tar-html-only archives a URL’s HTML only.
    • Command org-web-tools-archive-view handles both zip and tar archives.
    • The default settings attempt to archive with archive.is and, if that fails after retrying for 75 seconds, fall back to using wget and tar.

1.1.2

Fixed

  • Only test non-nil items in org-web-tools--get-first-url. This makes it work properly in non-GUI Emacs sessions. (Thanks to Ben Sima for reporting.)

1.1.1

Fixed

  • Require org-attach.

1.1

Additions

  • Command org-web-tools-attach-url-archive.
  • Command org-web-tools-view-archive.
  • Function org-web-tools--read-url.

1.0.1

Changes

  • Remove all property drawers that contain the CUSTOM_ID property from Pandoc output.

1.0

  • First declared stable release.

Development

Contributions and suggestions are welcome.

License

GPLv3
