w3lib
Overview
This is a Python library of web-related functions, such as:
- remove comments or tags from HTML snippets
- extract base url from HTML snippets
- translate entities on HTML strings
- convert raw HTTP headers to dicts and vice-versa
- construct HTTP auth header
- convert HTML pages to unicode
- sanitize urls (like browsers do)
- extract arguments from urls
Requirements
Python 3.7+
Install
pip install w3lib
Documentation
See http://w3lib.readthedocs.org/
License
The w3lib library is licensed under the BSD license.