Utf8StringInterpolation


Successor of ZString; UTF8-based, zero-allocation, high-performance String Interpolation and StringBuilder.

C# 10.0 Improved Interpolated Strings achieve extremely high performance in string generation by deconstructing format strings at compile time and emitting the most optimal calls. However, this applies only to String (UTF16) and cannot be applied to the generation of UTF8 strings. Utf8StringInterpolation generates byte[] as UTF8 and writes directly to IBufferWriter<byte> with zero allocation and at peak performance, thanks to a custom InterpolatedStringHandler optimized for UTF8 writing, all while retaining the user-friendly syntax and compiler support of interpolated strings. With the addition of IUtf8SpanFormattable in .NET 8, UTF8 writing is also optimized for numerous value types.

using System.Buffers;
using System.IO.Pipelines;
using Utf8StringInterpolation;

// Create a UTF8-encoded byte[] directly (no intermediate string, no re-encoding).
byte[] utf8 = Utf8String.Format($"Hello, {name}, Your id is {id}!");

// Write to an IBufferWriter<byte> (for example, ASP.NET HttpResponse.BodyWriter).
var bufferWriter = new ArrayBufferWriter<byte>();
Utf8String.Format(bufferWriter, $"Today is {DateTime.Now:yyyy-MM-dd}"); // format specifiers are supported

// Write to a FileStream directly.
using var fs = File.OpenWrite("foo.txt");
var pipeWriter = PipeWriter.Create(fs);
Utf8String.Format(pipeWriter, $"Foo: {id,10} {name,-5}"); // alignment is supported
await pipeWriter.FlushAsync(); // a stream-backed PipeWriter only writes to the stream when flushed

// like a StringBuilder
var writer = Utf8String.CreateWriter(bufferWriter);
writer.Append("My Name...");
writer.AppendFormat($"is...? {name}");
writer.AppendLine();
writer.Flush();

// Join, Concat methods
var seq = Enumerable.Range(1, 10);
byte[] utf8seq = Utf8String.Join(", ", seq);

Modern C# treats ReadOnlySpan<byte> as Utf8. Additionally, modern C# writes to the output using IBufferWriter<byte>. Utf8StringInterpolation provides a variety of utilities and writers optimized for use with them. In .NET 8, the new IUtf8SpanFormattable is used to directly write values to UTF8.
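The two building blocks named above can be tried without this library. The following is a minimal sketch using only the .NET base class library (it assumes C# 11 or later for the `u8` literal); the variable names are illustrative and nothing here is part of Utf8StringInterpolation itself.

```csharp
using System;
using System.Buffers;
using System.Text;

// A UTF8 string literal (C# 11): the compiler emits the bytes, so nothing is encoded at run time.
ReadOnlySpan<byte> greeting = "Hello, "u8;

var bufferWriter = new ArrayBufferWriter<byte>();

// The IBufferWriter<byte> pattern: request a span, fill it, then report how much was written.
Span<byte> destination = bufferWriter.GetSpan(greeting.Length);
greeting.CopyTo(destination);
bufferWriter.Advance(greeting.Length);

// Encode a runtime string straight into the writer's buffer, without a temporary byte[].
string name = "world";
int maxBytes = Encoding.UTF8.GetMaxByteCount(name.Length);
int written = Encoding.UTF8.GetBytes(name, bufferWriter.GetSpan(maxBytes));
bufferWriter.Advance(written);

// Decode only to inspect the result.
string roundTrip = Encoding.UTF8.GetString(bufferWriter.WrittenSpan);
Console.WriteLine(roundTrip);
```

The library's writers follow the same GetSpan/Advance pattern, but choose when to call Advance for you.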

Getting Started

This library is distributed via NuGet and supports .NET Standard 2.0, .NET Standard 2.1, .NET 6 (.NET 7), and .NET 8 or above.

PM> Install-Package Utf8StringInterpolation

// `Format(ref Utf8StringWriter)` accepts an interpolated string.
Utf8String.Format($"Hello, {name}, Your id is {id}!");

// The interpolated string compiles to something like the following.
var writer = new Utf8StringWriter(literalLength: 21, formattedCount: 2);
writer.AppendLiteral("Hello, ");
writer.AppendFormatted<T>(name);
writer.AppendLiteral(", Your id is ");
writer.AppendFormatted<T>(id);
writer.AppendLiteral("!");

// The internal struct writer writes values to UTF8 directly, without boxing.
[InterpolatedStringHandler]
public ref struct Utf8StringWriter<TBufferWriter> where TBufferWriter : IBufferWriter<byte>
{
    TBufferWriter bufferWriter; // when the buffer is full, advance and request more
    Span<byte> buffer;          // current write buffer

    public void AppendLiteral(string value)
    {
        // encode the string literal into the UTF8 buffer directly
        var bytesWritten = Encoding.UTF8.GetBytes(value, buffer);
        buffer = buffer.Slice(bytesWritten);
    }

    public void AppendFormatted<T>(T value, int alignment = 0, string? format = null)
        where T : IUtf8SpanFormattable
    {
        // write the value into the UTF8 buffer directly, growing the buffer until it fits
        int bytesWritten;
        while (!value.TryFormat(buffer, out bytesWritten, format, null))
        {
            Grow();
        }
        buffer = buffer.Slice(bytesWritten);
    }
}

The actual Utf8StringWriter accepts various types, uses the Utf8Formatter in .NET 6 environments that do not support IUtf8SpanFormattable, and is designed with optimizations such as more efficient buffer usage.
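The grow-and-retry idea behind AppendFormatted can be reproduced with the base class library alone. This is a hedged sketch, not the library's implementation: `AppendUtf8` is a hypothetical helper name, the starting size hint of 16 is arbitrary, and it assumes .NET 8 or later, where `IUtf8SpanFormattable` exists.

```csharp
using System;
using System.Buffers;
using System.Globalization;
using System.Text;

// Hypothetical helper (not part of Utf8StringInterpolation): formats one value as UTF8
// directly into an IBufferWriter<byte>. The generic constraint avoids boxing value types.
static void AppendUtf8<T>(IBufferWriter<byte> writer, T value, ReadOnlySpan<char> format = default)
    where T : IUtf8SpanFormattable
{
    int sizeHint = 16; // arbitrary starting size for this sketch
    while (true)
    {
        Span<byte> destination = writer.GetSpan(sizeHint);
        if (value.TryFormat(destination, out int bytesWritten, format, CultureInfo.InvariantCulture))
        {
            writer.Advance(bytesWritten); // commit only the bytes actually produced
            return;
        }
        sizeHint = destination.Length * 2; // too small: ask for a larger span and retry
    }
}

var bufferWriter = new ArrayBufferWriter<byte>();
AppendUtf8(bufferWriter, 12345);
AppendUtf8(bufferWriter, 0.5, "F2");

// Decode only to inspect the result.
string text = Encoding.UTF8.GetString(bufferWriter.WrittenSpan);
Console.WriteLine(text);
```

Passing the invariant culture keeps the output independent of the machine's regional settings; a real writer would take the provider from its caller.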

Utf8String methods

The entry point is Utf8StringInterpolation.Utf8String. You can use its static methods or create a writer.

When the argument is ref Utf8StringWriter, you can pass a string interpolation expression like $"{...}".

Utf8String.Format

Format is a one-shot method. The Format that takes an IBufferWriter<byte> performs writing based on the bufferWriter and finally flushes (Advance). The Format that returns a byte[] writes using its internally pooled ArrayBufferWriter and generates a final byte[]. TryFormat writes to the specified Span<byte>, and if the length is insufficient, it returns false.

byte[] Format(ref Utf8StringWriter format)
void Format<TBufferWriter>(TBufferWriter bufferWriter, ref Utf8StringWriter format)
    where TBufferWriter : IBufferWriter<byte>
bool TryFormat(Span<byte> destination, out int bytesWritten, ref Utf8StringWriter format)

Utf8String.Concat, Join

Utf8String.Concat and Utf8String.Join are similar to String.Concat and String.Join, but they write everything in UTF8. Like Format, there are two overloads: one that writes to IBufferWriter<byte> and another that writes to the internally pooled ArrayBufferWriter and then returns a byte[].

// Concat overloads, return byte[] or receive IBufferWriter<byte>.
byte[] Concat(params string?[] values)
void Concat<TBufferWriter>(TBufferWriter bufferWriter, params string?[] values)
    where TBufferWriter : IBufferWriter<byte>
byte[] Concat<T>(IEnumerable<T> values)
void Concat<TBufferWriter, T>(TBufferWriter bufferWriter, IEnumerable<T> values)
    where TBufferWriter : IBufferWriter<byte>
// Join overloads, return byte[] or receive IBufferWriter<byte>.
byte[] Join(string separator, params string?[] values)
void Join<TBufferWriter>(TBufferWriter bufferWriter, string separator, params string?[] values)
    where TBufferWriter : IBufferWriter<byte>
byte[] Join<T>(string separator, IEnumerable<T> values)
void Join<TBufferWriter, T>(TBufferWriter bufferWriter, string separator, IEnumerable<T> values)
    where TBufferWriter : IBufferWriter<byte>

Utf8String.CreateWriter

Utf8String.CreateWriter allows you to obtain a Utf8StringWriter that can write continuously, similar to StringBuilder. Each Append*** method can be used in a manner similar to StringBuilder. However, unlike StringBuilder, the buffer is managed by the provided BufferWriter, which is why it's named Writer. For performance reasons, the Append methods don't always flush (Advance) to the internal BufferWriter. By manually calling Flush, you ensure that Advance is invoked. Additionally, Dispose also calls Flush.

var writer = Utf8String.CreateWriter(bufferWriter);

// Call the append methods.
writer.Append("foo");
writer.AppendFormat($"bar {Guid.NewGuid()}");

// Finally, call Flush (or Dispose).
writer.Flush();

When writing larger messages, it's advisable to periodically call Flush. At those times, for instance, if using a PipeWriter, you can invoke pipeWriter.FlushAsync() to stream the write operations without holding onto a large buffer internally.
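A possible shape for that periodic-flush pattern is sketched below. It relies only on the calls shown in this README (`Utf8String.CreateWriter`, `AppendFormat`, `AppendLine`, `Flush`) plus `System.IO.Pipelines`; the `WriteChunk` helper, the chunk size, and the `MemoryStream` target are assumptions made for illustration. The writer is kept inside a synchronous helper because the sketch earlier in this README declares it as a ref struct, which cannot live across an `await`.

```csharp
using System;
using System.IO;
using System.IO.Pipelines;
using System.Threading.Tasks;
using Utf8StringInterpolation;

// Hypothetical helper: writes one chunk of lines, then hands the bytes to the PipeWriter.
static void WriteChunk(PipeWriter pipeWriter, int start, int count)
{
    var writer = Utf8String.CreateWriter(pipeWriter);
    for (int i = start; i < start + count; i++)
    {
        writer.AppendFormat($"line {i}");
        writer.AppendLine();
    }
    writer.Flush(); // Advance the PipeWriter by the bytes written in this chunk
}

using var stream = new MemoryStream();
PipeWriter pipe = PipeWriter.Create(stream, new StreamPipeWriterOptions(leaveOpen: true));

const int chunkSize = 1000; // arbitrary for this sketch
for (int start = 0; start < 10_000; start += chunkSize)
{
    WriteChunk(pipe, start, chunkSize);
    await pipe.FlushAsync(); // push this chunk to the stream instead of buffering everything
}
await pipe.CompleteAsync();

Console.WriteLine($"{stream.Length} bytes written");
```

How large a chunk should be depends on the workload; the point is only that memory held by the PipeWriter stays bounded by one chunk.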

CreateWriter has two overloads. One takes an IBufferWriter<byte>, and the other uses the internally pooled buffer.

Utf8StringWriter<TBufferWriter> CreateWriter<TBufferWriter>(TBufferWriter bufferWriter, IFormatProvider? formatProvider = null)
    where TBufferWriter : IBufferWriter<byte>

Utf8StringBuffer CreateWriter(out Utf8StringWriter<ArrayBufferWriter<byte>> stringWriter, IFormatProvider? formatProvider = null)

Utf8StringBuffer is convenient because it uses an internally pooled buffer. Therefore, when you just want to obtain a byte[], for example, there's no need to separately prepare or manage an IBufferWriter<byte>.

// The buffer must be disposed after use (a using declaration is recommended).
using var buffer = Utf8String.CreateWriter(out var writer);

// Call the append methods.
writer.Append("foo");
writer.AppendFormat($"bar {Guid.NewGuid()}");

// Finally, call Flush (there is no need to call Dispose on the writer).
writer.Flush();

// Copy the written data to a new byte[].
var bytes = buffer.ToArray();

// Or copy it to another IBufferWriter<byte>, or read it as a ReadOnlySpan<byte>.
buffer.CopyTo(otherBufferWriter);
var writtenData = buffer.WrittenSpan;

Utf8StringWriter

AppendLiteral and AppendFormatted are called from compiler-generated code.

void AppendLiteral(string s)
void AppendWhitespace(int count)
void Append(string? s)
void Append(char c)
void Append(char c, int repeatCount)
void AppendUtf8(scoped ReadOnlySpan<byte> utf8String)
void AppendFormatted(scoped ReadOnlySpan<byte> utf8String)
void AppendFormatted(scoped ReadOnlySpan<char> s)
void AppendFormatted(string value, int alignment = 0, string? format = null)
void AppendFormatted<T>(T value, int alignment = 0, string? format = null)
void AppendFormat(ref Utf8StringWriter format) // extension method
void AppendLine(ref Utf8StringWriter format)   // extension method
void AppendLine()
void AppendLine(string s)
void Flush()
void Dispose() // calls Flush and releases the references to the buffer and bufferWriter

Utf8StringBuffer

Utf8StringBuffer is obtained from the Utf8String.CreateWriter overload that has an out parameter. Since it holds an internally pooled buffer, you must call Dispose to release the buffer once you are done with it.

int WrittenCount { get; }
ReadOnlySpan<byte> WrittenSpan { get; }
ReadOnlyMemory<byte> WrittenMemory { get; }
byte[] ToArray()
void CopyTo<TBufferWriter>(TBufferWriter bufferWriter)
    where TBufferWriter : IBufferWriter<byte>
void Dispose()

Format strings

The formatting in string interpolation can use alignment and format just like regular .NET. In .NET 8, all formatting follows the standard format. For more details, please refer to the .NET documentation on formatting-types.

However, this is only the case for .NET 8 and above, where IUtf8SpanFormattable is implemented. In .NET Standard 2.1, .NET 6 (and .NET 7), UTF8 writing of values is performed using Utf8Formatter.TryFormat. This requires a specific format called StandardFormat, which is not always compatible with the regular .NET format strings. The supported format strings are listed in the Remarks section of the TryFormat documentation. The affected types are bool, byte, Decimal, Double, Guid, Int16, Int32, Int64, SByte, Single, UInt16, UInt32, UInt64.

Exceptionally, DateTime, DateTimeOffset, and TimeSpan can use regular format specifiers even in .NET Standard 2.1 and .NET 6 (and .NET 7). This special accommodation was made because StandardFormat only allows for four patterns, which was found to be too limiting.

// .NET 8 supports all numeric custom format strings, but .NET Standard 2.1 and .NET 6 (.NET 7) do not.
Utf8String.Format($"Double value is {123.456789:.###}");

// DateTime, DateTimeOffset, and TimeSpan support custom format strings on all target platforms.
// https://learn.microsoft.com/en-us/dotnet/standard/base-types/custom-date-and-time-format-strings
Utf8String.Format($"Today is {DateTime.Now:yyyy-MM-dd}");
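To see why StandardFormat is more restrictive than regular format strings, the base class library types can be exercised directly. The sketch below uses only `System.Buffers.Text.Utf8Formatter` and `System.Buffers.StandardFormat`; it does not show how this library maps format strings internally, only what the underlying API accepts.

```csharp
using System;
using System.Buffers;
using System.Buffers.Text;
using System.Text;

Span<byte> buffer = stackalloc byte[64];

// A StandardFormat is a single symbol plus an optional precision: here 'F' with 2 decimals.
Utf8Formatter.TryFormat(1234.5, buffer, out int written1, new StandardFormat('F', 2));
string fixedPoint = Encoding.UTF8.GetString(buffer.Slice(0, written1));

// "X4" fits that shape (symbol 'X', precision 4), so it parses and formats.
StandardFormat hexFormat = StandardFormat.Parse("X4");
Utf8Formatter.TryFormat(255, buffer, out int written2, hexFormat);
string hex = Encoding.UTF8.GetString(buffer.Slice(0, written2));

// A custom numeric pattern such as ".###" is not a symbol followed by digits, so it is rejected.
bool canParseCustom = StandardFormat.TryParse(".###", out _);

Console.WriteLine($"{fixedPoint} {hex} {canParseCustom}");
```

This is the reason the custom numeric pattern in the example above is only expected to work where IUtf8SpanFormattable is available.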

Unity

Cysharp/ZLogger uses Utf8StringInterpolation and supports Unity. See its instructions for details.

License

This library is licensed under the MIT License.
