  • Stars: 368
  • Rank: 115,913 (Top 3%)
  • Language: Kotlin
  • License: Apache License 2.0
  • Created over 1 year ago
  • Updated 3 months ago

Repository Details

Ksoup is a lightweight Kotlin Multiplatform library for parsing HTML, extracting HTML tags, attributes, and text, and encoding and decoding HTML entities.

Ksoup - Kotlin Multiplatform HTML Parser

Features

  • Parse HTML from String
  • Extract HTML tags, attributes, and text
  • Encode and decode HTML entities
  • Lightweight and does not depend on any other library
  • Kotlin Multiplatform support
  • Fast and efficient
  • Unit tested

Installation

Maven Central

Add the dependency below to your module's build.gradle.kts or build.gradle file:

Kotlin version     Ksoup version
1.9.0              0.2.0
1.8.22 or lower    0.1.4

val version = "0.2.0"

// For parsing HTML
implementation("com.mohamedrejeb.ksoup:ksoup-html:$version")

// Only for encoding and decoding HTML entities 
implementation("com.mohamedrejeb.ksoup:ksoup-entities:$version")
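
In a Kotlin Multiplatform module, the dependency usually belongs in the commonMain source set. The following is a minimal sketch of a module-level build.gradle.kts, assuming the Kotlin Multiplatform Gradle plugin is already applied and your targets are configured elsewhere; only the artifact coordinates and version come from the snippet above.

```kotlin
// build.gradle.kts (module level)
// Sketch only: assumes the Kotlin Multiplatform plugin is applied
// and targets are declared elsewhere in this file.
kotlin {
    sourceSets {
        val commonMain by getting {
            dependencies {
                // HTML parser module, available to all targets
                implementation("com.mohamedrejeb.ksoup:ksoup-html:0.2.0")
            }
        }
    }
}
```

The entities module can be added in the same dependencies block if you only need encoding and decoding.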

Usage

Parsing HTML

To parse HTML from a String, use the KsoupHtmlParser class. You can optionally provide an implementation of the KsoupHtmlHandler interface and a KsoupHtmlOptions object; if you omit either one, the default is used.

KsoupHtmlParser

You can create a parser with KsoupHtmlParser(). It exposes several methods: for example, write parses a String, and end closes the parser when you are done:

val ksoupHtmlParser = KsoupHtmlParser()

// String to parse
val html = "<h1>My Heading</h1>"

// Pass the HTML to the parser (It is going to parse the HTML and call the callbacks)
ksoupHtmlParser.write(html)

// Close the parser when you are done
ksoupHtmlParser.end()

KsoupHtmlHandler

You can implement the KsoupHtmlHandler interface directly or use KsoupHtmlHandler.Builder():

// Implement `KsoupHtmlHandler` interface
val firstHandler = object : KsoupHtmlHandler {
    override fun onOpenTag(name: String, attributes: Map<String, String>, isImplied: Boolean) {
        println("Open tag: $name")
    }
}

// Use `KsoupHtmlHandler.Builder()`
val secondHandler = KsoupHtmlHandler
    .Builder()
    .onOpenTag { name, attributes, isImplied ->
        println("Open tag: $name")
    }
    .build()

There are several methods that you can override. For example, if you just want to extract the text from the HTML, you can override the onText method:

// String to parse
val html = """
    <html>
        <head>
            <title>My Title</title>
        </head>
        <body>
            <h1>My Heading</h1>
            <p>My paragraph.</p>
        </body>
    </html>
""".trimIndent()

// String to store the extracted text
var string = ""

// Create a handler
val handler = KsoupHtmlHandler
    .Builder()
    .onText { text ->
        string += text
    }
    .build()

// Create a parser
val ksoupHtmlParser = KsoupHtmlParser(
    handler = handler,
)

// Pass the HTML to the parser (It is going to parse the HTML and call the callbacks)
ksoupHtmlParser.write(html)

// Close the parser when you are done
ksoupHtmlParser.end()

You can also use onOpenTag and onCloseTag to know when a tag is opened or closed, which is useful for scraping data from a website or powering a rich text editor. In addition, onComment is called when a comment is found in the HTML, and onAttribute is called when an attribute is found in a tag.
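
As an illustration of the scraping use case, the sketch below collects the href of every anchor tag. LinkCollector is a hypothetical helper, not part of Ksoup; it is kept free of Ksoup imports so it can be tried without the library, and its method mirrors the first two parameters of the onOpenTag callback shown earlier.

```kotlin
// Hypothetical helper (not part of Ksoup): records the href of every <a> tag it is told about.
class LinkCollector {
    private val collected = mutableListOf<String>()

    // Read-only view of the links seen so far, in the order they were reported.
    val links: List<String> get() = collected

    // Intended to be called from an onOpenTag callback with the tag name and its attributes.
    fun onOpenTag(name: String, attributes: Map<String, String>) {
        if (name.equals("a", ignoreCase = true)) {
            // Anchors without an href attribute are ignored.
            attributes["href"]?.let { collected.add(it) }
        }
    }
}
```

With Ksoup on the classpath, the collector can be wired in through the builder, for example KsoupHtmlHandler.Builder().onOpenTag { name, attributes, _ -> collector.onOpenTag(name, attributes) }.build(), and the resulting handler passed to KsoupHtmlParser(handler = handler) as in the text extraction example.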

KsoupHtmlOptions

You can also pass a KsoupHtmlOptions object to the parser to change its behavior. For example, you can disable the decoding of HTML entities, which is enabled by default:

val options = KsoupHtmlOptions(
    decodeEntities = false,
)

Encoding and Decoding HTML Entities

You can use the KsoupEntities class to encode and decode HTML entities:

// Encode HTML entities
val encoded = KsoupEntities.encodeHtml("Hello & World") // returns: Hello &amp; World

// Decode HTML entities
val decoded = KsoupEntities.decodeHtml("Hello &amp; World") // returns: Hello & World

KsoupEntities also provides methods to encode and decode only XML entities or only HTML4 entities. The KsoupEntities class is available in the ksoup-entities module.

Both encodeHtml and decodeHtml methods support all HTML5 entities, XML entities, and HTML4 entities.
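
To make the XML-only case concrete, here is a hand-rolled sketch covering the five predefined XML entities. It is not Ksoup's implementation, and the names encodeXmlSketch and decodeXmlSketch are hypothetical; it only illustrates what such encoding and decoding does.

```kotlin
// The five predefined XML entities, keyed by the character they stand for.
val xmlEntities: Map<Char, String> = mapOf(
    '&' to "&amp;",
    '<' to "&lt;",
    '>' to "&gt;",
    '"' to "&quot;",
    '\'' to "&apos;",
)

// Reverse lookup: entity text to the decoded character (as a String).
val xmlDecodeTable: Map<String, String> =
    xmlEntities.entries.associate { (ch, entity) -> entity to ch.toString() }

// Matches exactly the five entities above.
val xmlEntityPattern = Regex("&(amp|lt|gt|quot|apos);")

// Replaces each special character with its entity; other characters pass through unchanged.
fun encodeXmlSketch(input: String): String = buildString {
    for (ch in input) append(xmlEntities[ch] ?: ch.toString())
}

// Replaces each known entity in a single pass, so "&amp;lt;" decodes to "&lt;" rather than "<".
fun decodeXmlSketch(input: String): String =
    xmlEntityPattern.replace(input) { match -> xmlDecodeTable.getValue(match.value) }
```

Decoding in a single regex pass avoids double-decoding text that was itself escaped, which is easy to get wrong with chained string replacements.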

Coming Features

  • Add clear documentation
  • Add Markdown parser

Contribution

If you've found an error in this library, please file an issue.
Feel free to help out by sending a pull request ❤️.

Code of Conduct

Find this library useful? ❤️

Support it by joining stargazers for this repository. ⭐
Also, follow me on GitHub for more libraries! 🤩

License

Copyright 2023 Mohamed Rejeb

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

   http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
