• Stars: 473
• Rank: 92,832 (Top 2 %)
• Language: HTML
• License: MIT License
• Created over 10 years ago
• Updated about 1 year ago

Repository Details

A PHP component to convert HTML into a plain text format

html2text is a very simple script that uses DOM methods to convert HTML into a format similar to what a browser would render. It is well suited to places where you need a quick text representation. For example:

<html>
<title>Ignored Title</title>
<body>
  <h1>Hello, World!</h1>

  <p>This is some e-mail content.
  Even though it has whitespace and newlines, the e-mail converter
  will handle it correctly.

  <p>Even mismatched tags.</p>

  <div>A div</div>
  <div>Another div</div>
  <div>A div<div>within a div</div></div>

  <a href="http://foo.com">A link</a>

</body>
</html>

Will be converted into:

Hello, World!

This is some e-mail content. Even though it has whitespace and newlines, the e-mail converter will handle it correctly.

Even mismatched tags.

A div
Another div
A div
within a div

[A link](http://foo.com)

See the original blog post or the related StackOverflow answer.
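
To make the idea of "using DOM methods" concrete, here is a minimal sketch of a DOM walk that turns a few tags into text. It is an illustration only, not the library's implementation: the function names html_to_text_sketch and render_node_sketch are hypothetical, only headings, paragraphs, divs and links are handled, and the spacing it produces is not guaranteed to match html2text's output. It also assumes ASCII input, since character set handling is left out.

```php
<?php
// Illustrative sketch only: NOT how html2text itself is implemented.
// Both function names below are hypothetical.

function html_to_text_sketch(string $html): string
{
    if (trim($html) === '') {
        return '';
    }
    $doc = new DOMDocument();
    // Collect parser warnings internally so mismatched tags do not emit notices.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($doc->documentElement === null) {
        return '';
    }
    $text = render_node_sketch($doc->documentElement);
    // Drop spaces around line breaks, then collapse runs of blank lines.
    $text = preg_replace("/[ \t]*\n[ \t]*/", "\n", $text);
    $text = preg_replace("/\n{3,}/", "\n\n", $text);
    return trim($text);
}

function render_node_sketch(DOMNode $node): string
{
    if ($node instanceof DOMText) {
        // Collapse whitespace the way a browser does for normal text.
        return preg_replace('/\s+/', ' ', $node->nodeValue);
    }
    if (!($node instanceof DOMElement)) {
        return ''; // comments, processing instructions, etc.
    }
    $name = strtolower($node->nodeName);
    if (in_array($name, array('head', 'title', 'script', 'style'), true)) {
        return ''; // content that a browser would not render in the page body
    }
    $inner = '';
    foreach ($node->childNodes as $child) {
        $inner .= render_node_sketch($child);
    }
    if ($name === 'p' || preg_match('/^h[1-6]$/', $name)) {
        return "\n\n" . trim($inner) . "\n\n";
    }
    if ($name === 'div') {
        return "\n" . trim($inner) . "\n";
    }
    if ($name === 'a' && $node->getAttribute('href') !== '') {
        return '[' . trim($inner) . '](' . $node->getAttribute('href') . ')';
    }
    return $inner;
}

echo html_to_text_sketch('<h1>Hello</h1><p>One   two</p><a href="http://foo.com">A link</a>'), "\n";
```

For real use, prefer the library itself: it covers many more elements and edge cases than this sketch does.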

Installing

You can use Composer to add the package to your project:

{
  "require": {
    "soundasleep/html2text": "~1.1"
  }
}

And then use it quite simply:

$text = \Soundasleep\Html2Text::convert($html);

You can also include the supplied html2text.php and use $text = convert_html_to_text($html); instead.

Options

Option          Default   Description
ignore_errors   false     Set to true to ignore any XML parsing errors.
drop_links      false     Set to true to not render links as [My Link](http://foo.com), but rather just My Link.
char_set        'auto'    Specify a character set. Pass multiple character sets (comma-separated) to detect the encoding; the default is ASCII,UTF-8.

Pass along options as a second argument to convert, for example:

$options = array(
  'ignore_errors' => true,
  // other options go here
);
$text = \Soundasleep\Html2Text::convert($html, $options);
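
As a further sketch, several options can be combined in a small wrapper. The helper name html_to_plain_text and its $keepLinks parameter are hypothetical, chosen here for illustration; the option keys come from the table above. The code assumes Composer's autoloader (vendor/autoload.php) has already been loaded by the caller.

```php
<?php
// Hypothetical wrapper around the library call; not part of html2text.
// Assumes vendor/autoload.php has already been required by the caller.

function html_to_plain_text(string $html, bool $keepLinks = true): string
{
    $options = array(
        'ignore_errors' => true,        // tolerate XML parsing errors
        'drop_links'    => !$keepLinks, // true renders just the link text
    );
    return \Soundasleep\Html2Text::convert($html, $options);
}
```

Calling html_to_plain_text($html, false) would then request link text without the URLs, per the drop_links description above.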

Tests

Some very basic tests are provided in the tests/ directory. Run them with composer install && vendor/bin/phpunit.

Troubleshooting

Class 'DOMDocument' not found

You need to install the PHP XML extension that matches your PHP version, for example: apt-get install php7.4-xml
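
A quick way to confirm whether the extension is available is a small check script, run with the same PHP binary that serves your application. This is a sketch; the function name dom_extension_available is hypothetical.

```php
<?php
// Hypothetical helper: reports whether DOMDocument can be used.

function dom_extension_available(): bool
{
    // The "dom" extension provides the DOMDocument class.
    return extension_loaded('dom') && class_exists('DOMDocument');
}

$version = PHP_MAJOR_VERSION . '.' . PHP_MINOR_VERSION;
if (dom_extension_available()) {
    echo "DOMDocument is available on PHP {$version}\n";
} else {
    echo "DOMDocument is missing on PHP {$version}; install the XML extension for this version\n";
}
```

Note that the command-line and web server PHP installations can load different extensions, so run the check in the environment where the error occurs.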

License

html2text is licensed under the MIT License, making it suitable for both Eclipse and GPL projects.

Other versions

Also see html2text_ruby, a Ruby implementation.
