  • Language
    Python
  • License
    Do What The F*ck ...
  • Created almost 12 years ago
  • Updated about 1 year ago

Repository Details

PyYAML-based module to produce a bit more pretty and readable YAML-serialized data

pretty-yaml (or pyaml)

PyYAML-based python module to produce a bit more pretty and human-readable YAML-serialized data.

This module is for serialization only, see ruamel.yaml module for literate YAML parsing (keeping track of comments, spacing, line/column numbers of values, etc).

(side-note: to dump stuff parsed by ruamel.yaml with this module, use only YAML(typ='safe') there)

It's a small module, and for projects that only need part of its functionality, I'd recommend copy-pasting that part in, instead of adding a janky dependency.

Repository URLs:

Warning

The prime goal of this module is to produce human-readable output that can be easily diff'ed, manipulated and re-used, but maybe with occasional issues.

So please do not rely on it to produce output that can always be deserialized exactly to what was exported; use PyYAML directly for that (but maybe with options from the next section).
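
One way to see whether this caveat matters for a given dataset is to dump it and load it straight back. A minimal sketch, assuming a hypothetical roundtrips() helper (not part of pyaml or PyYAML) and any dump function that returns a string:

```python
import yaml

def roundtrips(data, dump=yaml.safe_dump):
    """Return True if loading the dumped YAML gives back an equal object.

    `dump` is any callable that takes the data and returns a YAML string
    (hypothetical helper, for illustration only).
    """
    return yaml.safe_load(dump(data)) == data

# Plain dicts/lists/scalars survive a PyYAML round-trip
print(roundtrips({'a': [1, 2.5, None, True], 'b': 'text'}))  # True

# Tuples are emitted as YAML sequences, so they come back as lists
print(roundtrips({'point': (1, 2)}))  # False
```

Any other dump function that returns a string can be passed as the dump argument, which makes this usable as a quick check in unit tests.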

What this module does and why

YAML is generally a nice and easy format to read if it was written by humans.

PyYAML can do a fairly decent job of making stuff readable, and the best combination of parameters for such output that I've seen so far is probably this one:

>>> import sys, yaml
>>> m = [123, 45.67, {1: None, 2: False}, 'some text']
>>> data = dict(a='asldnsa\nasldpáknsa\n', b='whatever text', ma=m, mb=m)
>>> yaml.safe_dump(data, sys.stdout, allow_unicode=True, default_flow_style=False)
a: 'asldnsa

  asldpáknsa

  '
b: whatever text
ma: &id001
- 123
- 45.67
- 1: null
  2: false
- some text
mb: *id001
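
The &id001 / *id001 pair above appears because ma and mb point at the same list object, which PyYAML marks with an anchor and an alias rather than writing it out twice. A small sketch of that behavior, using only PyYAML and the standard library:

```python
import copy
import yaml

m = [123, 45.67, {1: None, 2: False}, 'some text']
data = dict(ma=m, mb=m)  # both keys reference the same list object

out = yaml.safe_dump(data, default_flow_style=False)
print(out)

# Loading the document back restores the sharing as well
loaded = yaml.safe_load(out)
print(loaded['ma'] is loaded['mb'])

# Equal-but-distinct deep copies are written out in full, with no anchors
copies = yaml.safe_dump(
    dict(ma=copy.deepcopy(m), mb=copy.deepcopy(m)), default_flow_style=False)
print('&id' in copies)
```

Note that a shallow copy would not be enough here, as the nested dict would still be shared and get its own anchor.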

pyaml (this module) tries to improve on that a bit, with the following tweaks:

  • Most human-friendly representation options in PyYAML (that I know of) are used as defaults.

  • Dump "null" values as empty values, if possible, which have the same meaning but reduce visual clutter and are easier to edit.

  • Dicts, sets, OrderedDicts, defaultdicts, namedtuples, enums, dataclasses, etc are represented as their safe YAML-compatible base (like int, list or mapping), with mappings key-sorted by default for more diff-friendly output.

  • Use shorter and simpler yes/no for booleans.

  • List items get indented, as they should be.

  • Attempt is made to pick more readable string representation styles, depending on the value, e.g.:

    >>> yaml.safe_dump(cert, sys.stdout)
    cert: '-----BEGIN CERTIFICATE-----
    
      MIIH3jCCBcagAwIBAgIJAJi7AjQ4Z87OMA0GCSqGSIb3DQEBCwUAMIHBMRcwFQYD
    
      VQQKFA52YWxlcm9uLm5vX2lzcDEeMBwGA1UECxMVQ2VydGlmaWNhdGUgQXV0aG9y
    ...
    
    >>> pyaml.p(cert)
    cert: |
      -----BEGIN CERTIFICATE-----
      MIIH3jCCBcagAwIBAgIJAJi7AjQ4Z87OMA0GCSqGSIb3DQEBCwUAMIHBMRcwFQYD
      VQQKFA52YWxlcm9uLm5vX2lzcDEeMBwGA1UECxMVQ2VydGlmaWNhdGUgQXV0aG9y
    ...
    
  • "force_embed" option (default=yes) to avoid having &id stuff scattered all over the output. Might be more useful to disable it in some specific cases though.

  • "&id" anchors, if used, get labels from the keys they get attached to, not just meaningless enumerators.

  • "string_val_style" option to only apply to strings that are values, not keys, i.e.:

    >>> pyaml.p(data, string_val_style='"')
    key: "value\nasldpáknsa\n"
    >>> yaml.safe_dump(data, sys.stdout, allow_unicode=True, default_style='"')
    "key": "value\nasldpáknsa\n"
    
  • Add vertical spacing (empty lines) between keys on different depths, to visually separate long YAML sections in the output and make it easier to navigate.

  • Discard end-of-document "..." indicators for simple values.

Result for the (rather meaningless) example above:

>>> pyaml.p(data, force_embed=False, vspacing=dict(split_lines=10))

a: |
  asldnsa
  asldpáknsa

b: whatever text

ma: &ma
  - 123
  - 45.67
  - 1:
    2: no
  - some text

mb: *ma

(force_embed=False enabled deduplication with &ma anchor, vspacing is adjusted to split even this tiny output)
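
Several of the tweaks listed above can be approximated with plain PyYAML, which might be useful when copy-pasting only part of the functionality. The following is a rough sketch and not pyaml's actual implementation; PrettyDumper and pretty_dump are hypothetical names, and only empty nulls, yes/no booleans, block-style multiline strings, indented list items and always-embedded duplicates are covered:

```python
import yaml

class PrettyDumper(yaml.SafeDumper):
    """SafeDumper with a few readability tweaks (illustrative sketch only)."""

    def increase_indent(self, flow=False, indentless=False):
        # Never use "indentless" sequences, so list items get indented under their key
        return super().increase_indent(flow, False)

    def ignore_aliases(self, data):
        # Always embed repeated objects instead of emitting &id001 / *id001.
        # Caveat: self-referencing (recursive) data will recurse forever with this.
        return True

def _represent_none(dumper, data):
    # Empty scalar instead of "null"
    return dumper.represent_scalar('tag:yaml.org,2002:null', '')

def _represent_bool(dumper, data):
    # YAML 1.1 yes/no instead of true/false
    return dumper.represent_scalar('tag:yaml.org,2002:bool', 'yes' if data else 'no')

def _represent_str(dumper, data):
    # Ask for literal block style ("|") for multiline strings, default style otherwise
    style = '|' if '\n' in data else None
    return dumper.represent_scalar('tag:yaml.org,2002:str', data, style=style)

PrettyDumper.add_representer(type(None), _represent_none)
PrettyDumper.add_representer(bool, _represent_bool)
PrettyDumper.add_representer(str, _represent_str)

def pretty_dump(data, **kws):
    """Dump data to a YAML string using PrettyDumper."""
    kws.setdefault('default_flow_style', False)
    kws.setdefault('allow_unicode', True)
    return yaml.dump(data, Dumper=PrettyDumper, **kws)

m = [123, 45.67, {1: None, 2: False}, 'some text']
data = dict(a='asldnsa\nasldpáknsa\n', b='whatever text', ma=m, mb=m)
print(pretty_dump(data))
```

This leaves out vertical spacing, key-derived anchor names and handling of non-basic types, so it is only a starting point.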


Extended example:

>>> pyaml.dump(data, vspacing=dict(split_lines=10))

destination:

  encoding:
    xz:
      enabled: yes
      min_size: 5120
      options:
      path_filter:
        - \.(gz|bz2|t[gb]z2?|xz|lzma|7z|zip|rar)$
        - \.(rpm|deb|iso)$
        - \.(jpe?g|gif|png|mov|avi|ogg|mkv|webm|mp[34g]|flv|flac|ape|pdf|djvu)$
        - \.(sqlite3?|fossil|fsl)$
        - \.git/objects/[0-9a-f]+/[0-9a-f]+$

  result:
    append_to_file:
    append_to_lafs_dir:
    print_to_stdout: yes

  url: http://localhost:3456/uri

filter:
  - /(CVS|RCS|SCCS|_darcs|\{arch\})/$
  - /\.(git|hg|bzr|svn|cvs)(/|ignore|attributes|tags)?$
  - /=(RELEASE-ID|meta-update|update)$

http:
  ca_certs_files: /etc/ssl/certs/ca-certificates.crt
  debug_requests: no
  request_pool_options:
    cachedConnectionTimeout: 600
    maxPersistentPerHost: 10
    retryAutomatically: yes

logging:

  formatters:
    basic:
      datefmt: '%Y-%m-%d %H:%M:%S'
      format: '%(asctime)s :: %(name)s :: %(levelname)s: %(message)s'

  handlers:
    console:
      class: logging.StreamHandler
      formatter: basic
      level: custom
      stream: ext://sys.stderr

  loggers:
    twisted:
      handlers:
        - console
      level: 0

  root:
    handlers:
      - console
    level: custom

Note that unless there are many moderately wide and deep trees of data, which are expected to be read and edited by people, it might be preferable to use PyYAML directly, as it won't introduce another (rather pointless in that case) dependency and a point of failure.

Some Tricks

  • Pretty-print any yaml or json (yaml subset) file from the shell:

    % python -m pyaml /path/to/some/file.yaml
    % curl -s https://www.githubstatus.com/api/v2/summary.json | python -m pyaml
    
  • Process and replace json/yaml file in-place:

    % python -m pyaml -r file-with-json.data
    
  • Easier "debug printf" for more complex data (all funcs below are aliases to same thing):

    pyaml.p(stuff)
    pyaml.pprint(my_data)
    pyaml.pprint('----- HOW DOES THAT BREAKS!?!?', input_data, some_var, more_stuff)
    pyaml.print(data, file=sys.stderr) # "from __future__ import print_function" is only needed on Python 2
    
  • Force all string values to a certain style (see info on these in PyYAML docs):

    pyaml.dump(many_weird_strings, string_val_style='|')
    pyaml.dump(multiline_words, string_val_style='>')
    pyaml.dump(no_want_quotes, string_val_style='plain')
    

    Using pyaml.add_representer() (note *p*yaml) as suggested in this SO thread (or github-issue-7) should also work.

  • Control indent and width of the results:

    pyaml.dump(wide_and_deep, indent=4, width=120)
    

    These are actually keywords for PyYAML Emitter (passed to it from Dumper), see more info on these in PyYAML docs.

  • Dump multiple yaml documents into a file: pyaml.dump_all([data1, data2, data3], dst_file)

    explicit_start=True is implied, unless explicit_start=False is passed.
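
For projects that prefer not to add the dependency, the in-place rewrite trick above can be roughly approximated with PyYAML alone. A minimal sketch, assuming a hypothetical reformat_in_place() helper; it is not what "python -m pyaml -r" does internally, and its output will be plain PyYAML style rather than pyaml style:

```python
import os
import tempfile
import yaml

def reformat_in_place(path, **dump_kws):
    """Re-serialize a JSON or YAML file as block-style YAML, replacing it atomically.

    Extra keywords (e.g. indent, width) are passed through to yaml.safe_dump().
    """
    with open(path, encoding='utf-8') as src:
        data = yaml.safe_load(src)  # assumes the JSON input also parses as YAML
    dump_kws.setdefault('default_flow_style', False)
    dump_kws.setdefault('allow_unicode', True)
    # Temp file goes into the same directory, so os.replace() stays on one filesystem
    tmp_dir = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=tmp_dir, suffix='.tmp')
    try:
        with os.fdopen(fd, 'w', encoding='utf-8') as dst:
            yaml.safe_dump(data, dst, **dump_kws)
        os.replace(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)  # do not leave a partial temp file behind
        raise
    return data
```

Calling reformat_in_place('file-with-json.data', indent=4, width=120) would then rewrite that file with the given indent and width, leaving the original untouched if parsing or dumping fails.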

Installation

It's a regular Python 3.8+ module/package, published on PyPI (as pyaml).

Module uses PyYAML for processing of the actual YAML files and should pull it in as a dependency.

Dependency on the unidecode module is optional and should only be necessary when the force_embed=False keyword is used and the serialized data contains same-id objects or recursion.

Using pip is how you generally install it, usually coupled with venv usage (which will also provide "pip" tool itself):

% pip install pyaml

Current-git version can be installed like this:

% pip install git+https://github.com/mk-fg/pretty-yaml

pip will default to installing into the currently-active venv, then the user's home directory (under ~/.local/lib/python...), and maybe system-wide when running as root (only useful in specialized environments like docker containers).

There are many other python packaging tools - pipenv, poetry, pdm, etc - use whatever is most suitable for specific project/environment.

More general info on python packaging can be found at packaging.python.org.

When changing code, unit tests can be run with python -m unittest discover from the local repository checkout.
