A script to convert a WordPress XML dump to Markdown files

WordPress to Markdown Exporter

Update: I don't have much time to maintain this project, but I would really appreciate community help. If you are looking for an open source project to contribute to, this is a great opportunity. Pull requests are very much appreciated, both by me and by migrating WordPress users.

A Python script to convert a WordPress XML dump to a set of plain text/Markdown files. It is intended for migration from WordPress to the public-static website generator, but could also be helpful as a general-purpose WordPress content processor.

Installation

The script can be installed with the following command:

pip install git+https://github.com/dreikanter/wp2md

This installs wp2md along with its dependencies.

Usage

Export WordPress data to an XML file (Tools → Export → All content):

Then run the following command:

wp2md -d /export/path/ wordpress-dump.xml

Where /export/path/ is the directory where post and page files will be generated, and wordpress-dump.xml is the XML file exported by WordPress.
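If the conversion needs to be driven from another Python script (for example, to process several dumps in a row), the command above could be wrapped roughly as follows. This is only a sketch: build_wp2md_command and run_wp2md are hypothetical helper names, not part of wp2md, and the code assumes the wp2md executable is already installed and available on PATH. Only the -d and -v options listed in the help output are used.

```python
import subprocess


def build_wp2md_command(dump_path, dest_dir, verbose=False):
    """Build the argument list for a wp2md call (hypothetical helper)."""
    cmd = ["wp2md", "-d", dest_dir]
    if verbose:
        cmd.append("-v")  # verbose logging
    cmd.append(dump_path)  # the source XML dump is the positional argument
    return cmd


def run_wp2md(dump_path, dest_dir, verbose=False):
    """Run wp2md and raise CalledProcessError if it exits with an error."""
    # Assumes wp2md is installed and on PATH.
    return subprocess.run(build_wp2md_command(dump_path, dest_dir, verbose), check=True)
```

Calling run_wp2md("wordpress-dump.xml", "/export/path/") would then be equivalent to the command line shown above.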

Use the --help parameter to see the complete list of command line options:

usage: wp2md [options] source

Export WordPress XML dump to markdown files

positional arguments:
  source      source XML dump exported from WordPress

optional arguments:
  -h, --help  show this help message and exit
  -v          verbose logging
  -l FILE     log to file
  -d PATH     destination path for generated files
  -u FMT      <pubDate> date/time parsing format
  -o FMT      <wp:post_date> and <wp:post_date_gmt> parsing format
  -f FMT      date/time fields format for exported data
  -p FMT      date prefix format for generated files
  -m          preprocess content with Markdown (helpful for MD input)
  -n LEN      post name (slug) length limit for file naming
  -r          generate reference links instead of inline
  -ps PATH    post files path (see docs for variable names)
  -pg PATH    page files path
  -dr PATH    draft files path
  -url        keep absolute URLs in hrefs and image srcs
  -b URL      base URL to subtract from hrefs (default is the root)

The output

The script generates a separate file for each post, page, and draft, and groups them according to a configurable directory structure. By default, posts are grouped into directories named by year, and pages are stored directly in the output folder.

You can specify a different directory structure and file naming pattern using the -ps, -pg, and -dr parameters for posts, pages, and drafts respectively. For example, -ps {year}/{month}/{day}/{title}.md will produce date-based subfolders for blog posts.
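To illustrate how such a pattern maps a post to a file path, here is a minimal sketch that expands the placeholders with Python's str.format. It is not wp2md's actual implementation: expand_pattern is a hypothetical function, only the four variables from the example above are handled, and zero-padded month and day values are an assumption (see the project docs for the real variable names and formats).

```python
from datetime import datetime
from pathlib import PurePosixPath


def expand_pattern(pattern, post_date, title):
    """Expand a -ps style pattern into a relative path (illustrative only)."""
    values = {
        "year": f"{post_date:%Y}",
        "month": f"{post_date:%m}",  # assumed zero-padded
        "day": f"{post_date:%d}",    # assumed zero-padded
        "title": title,
    }
    return PurePosixPath(pattern.format(**values))


path = expand_pattern(
    "{year}/{month}/{day}/{title}.md",
    datetime(2011, 11, 23, 22, 10, 35),
    "yandex-subbotnik",
)
print(path)
```

Under these assumptions, a post dated 2011-11-23 would land in the 2011/11/23/ subfolder.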

Each exported file has a straightforward structure intended for further processing with the public-static website generator: an INI-like header followed by the Markdown-formatted post (or page) contents:

title: Я.Субботник в Санкт-Петербурге, 3 декабря
link: http://paradigm.ru/yandex-subbotni
creator: admin
description: 
post_id: 635
post_date: 2011-11-23 22:10:35
post_date_gmt: 2011-11-23 19:10:35
comment_status: open
post_name: yandex-subbotnik
status: publish
post_type: post

# Я.Субботник в Санкт-Петербурге, 3 декабря

Я.Субботник в Санкт-Петербурге пройдет 3 декабря в [офисе Яндекса](http://company.yandex.ru/contacts/spb/).
...

If the post contains comments, they will be included below.
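For further processing, an exported file can be split back into its header fields and Markdown body. The sketch below assumes the layout shown in the example: one "key: value" pair per line, with the header ending at the first blank line. The function name parse_exported is hypothetical and not part of wp2md, and any comments appended to the post simply stay in the body.

```python
def parse_exported(text):
    """Split an exported file into (header fields, Markdown body).

    Assumes "key: value" header lines terminated by the first blank line.
    """
    header, _, body = text.partition("\n\n")
    fields = {}
    for line in header.splitlines():
        # Split on the first colon only, so URLs in values stay intact.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields, body


sample = (
    "title: Hello\n"
    "link: http://example.com/hello\n"
    "description: \n"
    "post_id: 635\n"
    "\n"
    "# Hello\n"
    "\n"
    "Body text.\n"
)
fields, body = parse_exported(sample)
```

All values are returned as strings; converting fields such as post_id or post_date to richer types is left to the caller.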

Copyright and licensing

Copyright © 2013 by Alex Musayev.
License: GNU (see LICENSE).

Project home: https://github.com/dreikanter/wp2md.
