• This repository has been archived on 01/Dec/2021
  • Stars: 1,296
  • Rank: 34,795 (Top 0.8 %)
  • Language: Go
  • License: Apache License 2.0
  • Created about 7 years ago
  • Updated over 2 years ago

c2goasm: C to Go Assembly

Introduction

This is a tool to convert assembly as generated by a C/C++ compiler into Golang assembly. It is meant to be used in combination with asm2plan9s in order to automatically generate pure Go wrappers for C/C++ code (that may for instance take advantage of compiler SIMD intrinsics or template<> code).

Mode of operation:

$ c2goasm -a /path/to/some/great/c-code.s /path/to/now/great/golang-code_amd64.s

You can optionally nicely format the code using asmfmt by passing in an -f flag.

This project was developed as part of a Go wrapper around Simd. However, it should also work with other projects and libraries. Keep in mind, though, that it is not intended to 'port' a complete C/C++ project in a single action, but rather to do so on a case-by-case basis per function/source file (and to create accompanying high-level Go code that calls into the assembly code).

Command line options

$ c2goasm --help
Usage of c2goasm:
  -a	Immediately invoke asm2plan9s
  -c	Compact byte codes
  -f	Format using asmfmt
  -s	Strip comments

A simple example

Here is a simple C function doing an AVX2 intrinsics computation:

#include <immintrin.h>

void MultiplyAndAdd(float* arg1, float* arg2, float* arg3, float* result) {
    __m256 vec1 = _mm256_load_ps(arg1);
    __m256 vec2 = _mm256_load_ps(arg2);
    __m256 vec3 = _mm256_load_ps(arg3);
    __m256 res  = _mm256_fmadd_ps(vec1, vec2, vec3);
    _mm256_storeu_ps(result, res);
}

Compiling into assembly gives the following

__ZN14MultiplyAndAddEPfS1_S1_S1_: ## @_ZN14MultiplyAndAddEPfS1_S1_S1_
## BB#0:
        push          rbp
        mov           rbp, rsp
        vmovups       ymm0, ymmword ptr [rdi]
        vmovups       ymm1, ymmword ptr [rsi]
        vfmadd213ps   ymm1, ymm0, ymmword ptr [rdx]
        vmovups       ymmword ptr [rcx], ymm1
        pop           rbp
        vzeroupper
        ret

Running c2goasm will generate the following Go assembly (e.g. saved in MultiplyAndAdd_amd64.s)

//+build !noasm !appengine
// AUTO-GENERATED BY C2GOASM -- DO NOT EDIT

TEXT ·_MultiplyAndAdd(SB), $0-32

	MOVQ vec1+0(FP), DI
	MOVQ vec2+8(FP), SI
	MOVQ vec3+16(FP), DX
	MOVQ result+24(FP), CX

	LONG $0x0710fcc5             // vmovups    ymm0, yword [rdi]
	LONG $0x0e10fcc5             // vmovups    ymm1, yword [rsi]
	LONG $0xa87de2c4; BYTE $0x0a // vfmadd213ps    ymm1, ymm0, yword [rdx]
	LONG $0x0911fcc5             // vmovups    yword [rcx], ymm1

	VZEROUPPER
	RET

This needs to be accompanied by the following Go code (in MultiplyAndAdd_amd64.go)

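import "unsafe"

//go:noescape
func _MultiplyAndAdd(vec1, vec2, vec3, result unsafe.Pointer)

// MultiplyAndAdd is the high-level wrapper; Object and its getters stand in for your own type.
func MultiplyAndAdd(someObj Object) {
	_MultiplyAndAdd(someObj.GetVec1(), someObj.GetVec2(), someObj.GetVec3(), someObj.GetResult())
}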

As you may have gathered, the _amd64.go file needs to be in place so that the argument names can be derived from the Go declaration (and so that go vet succeeds).
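
When wrapping a routine like this, it can help to keep a pure-Go reference next to it for testing the assembly output. The following is a minimal sketch, not part of c2goasm: the name multiplyAndAddGeneric and the fixed-size array signature are assumptions chosen to mirror the eight float32 lanes of one AVX2 register. Note that the FMA instruction rounds once, so for arbitrary inputs the last bit may differ from an unfused multiply and add.

```go
package main

import "fmt"

// multiplyAndAddGeneric is a hypothetical pure-Go reference for the
// assembly routine: result[i] = arg1[i]*arg2[i] + arg3[i] for 8 lanes.
func multiplyAndAddGeneric(arg1, arg2, arg3, result *[8]float32) {
	for i := range result {
		result[i] = arg1[i]*arg2[i] + arg3[i]
	}
}

func main() {
	a := [8]float32{1, 2, 3, 4, 5, 6, 7, 8}
	b := [8]float32{2, 2, 2, 2, 2, 2, 2, 2}
	c := [8]float32{0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5}
	var r [8]float32

	multiplyAndAddGeneric(&a, &b, &c, &r)
	fmt.Println(r) // [2.5 4.5 6.5 8.5 10.5 12.5 14.5 16.5]

	// With the generated assembly in place, the same arrays could be passed as
	// _MultiplyAndAdd(unsafe.Pointer(&a), unsafe.Pointer(&b), unsafe.Pointer(&c), unsafe.Pointer(&r))
}
```

The inputs here are exactly representable in float32, so the fused and unfused results agree.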

Benchmark against cgo

We have run benchmarks of c2goasm versus cgo for both Go version 1.7.5 and 1.8.1. You can find the c2goasm benchmark test in test/ and the cgo test in cgocmp/ respectively. Here are the results for both versions:

$ benchcmp ../cgocmp/cgo-1.7.5.out c2goasm.out 
benchmark                      old ns/op     new ns/op     delta
BenchmarkMultiplyAndAdd-12     382           10.9          -97.15%
$ benchcmp ../cgocmp/cgo-1.8.1.out c2goasm.out 
benchmark                      old ns/op     new ns/op     delta
BenchmarkMultiplyAndAdd-12     236           10.9          -95.38%

As you can see, Go 1.8 improved significantly on 1.7.5 (the cgo call took 38.2% less time), but cgo is still about 20x slower than calling directly into the assembly code as wrapped by c2goasm.
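
The percentages quoted above follow directly from the ns/op figures. As a small worked check (the helper name delta is an assumption, the formula is the usual relative change):

```go
package main

import "fmt"

// delta returns the relative change from oldNs to newNs in percent,
// i.e. (new - old) / old * 100.
func delta(oldNs, newNs float64) float64 {
	return (newNs - oldNs) / oldNs * 100
}

func main() {
	fmt.Printf("cgo 1.7.5 -> c2goasm: %.2f%%\n", delta(382, 10.9)) // -97.15%
	fmt.Printf("cgo 1.8.1 -> c2goasm: %.2f%%\n", delta(236, 10.9)) // -95.38%
	fmt.Printf("cgo 1.7.5 -> cgo 1.8.1: %.2f%%\n", delta(382, 236)) // -38.22%
	fmt.Printf("cgo 1.8.1 / c2goasm: %.1fx\n", 236/10.9)            // 21.7x
}
```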

Converted projects

Internals

The basic process is to set up the stack and registers (in the prologue) the way the C code expects them, and upon exiting the subroutine (in the epilogue) to revert back to the Go world and pass a return value back if required. In more detail:

  • Define assembly subroutine with proper golang decoration in terms of needed stack space and overall size of arguments plus return value.
  • Function arguments are loaded from the golang stack into registers and prior to starting the C code any arguments beyond 6 are stored in C stack space.
  • Stack space is reserved and set up for the C code. Depending on the C code, the stack pointer may be aligned on a certain boundary (especially needed for code that takes advantage of SIMD instructions such as AVX etc.).
  • A constants table is generated (if needed) and any rip-based references are replaced with proper offsets to where Go will put the table.
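
The arithmetic behind these steps can be sketched in a few lines of Go. This is an illustration under stated assumptions, not c2goasm's actual implementation: the helper names are hypothetical, every argument and the return value are taken to be 8 bytes, the register order is the System V AMD64 one for integer arguments, and the "stack+N" offsets are purely illustrative.

```go
package main

import "fmt"

// Integer argument registers of the System V AMD64 calling convention,
// written with Go assembler register names.
var sysvArgRegs = []string{"DI", "SI", "DX", "CX", "R8", "R9"}

// argLocation reports where the i-th (0-based) 64-bit argument goes:
// a register for the first six, an illustrative stack slot afterwards.
func argLocation(i int) string {
	if i < len(sysvArgRegs) {
		return sysvArgRegs[i]
	}
	return fmt.Sprintf("stack+%d", (i-len(sysvArgRegs))*8)
}

// textArgSize is the size in bytes of all arguments plus an optional
// return value, assuming 8 bytes each (the second number in "$0-32").
func textArgSize(nArgs int, hasRet bool) int {
	size := nArgs * 8
	if hasRet {
		size += 8
	}
	return size
}

// alignDown rounds sp down to a multiple of align (a power of two),
// e.g. 32 for code using 256-bit AVX loads and stores.
func alignDown(sp, align uintptr) uintptr {
	return sp &^ (align - 1)
}

func main() {
	fmt.Printf("TEXT ·_MultiplyAndAdd(SB), $0-%d\n", textArgSize(4, false))
	for i := 0; i < 8; i++ {
		fmt.Printf("arg%d -> %s\n", i, argLocation(i))
	}
	fmt.Printf("%#x\n", alignDown(0x7ffd1238, 32)) // 0x7ffd1220
}
```

For the four-pointer example earlier, textArgSize(4, false) gives 32, which matches the $0-32 in the generated TEXT directive.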

Limitations

  • Arguments need (for now) to be 64-bit size, meaning either a value or a pointer (this requirement will be lifted)
  • Maximum number of 14 arguments (hard limit -- if you hit this maybe you should rethink your api anyway...)
  • Generally no call statements (thus inline your C code) with a couple of exceptions for functions such as memset and memcpy (see clib_amd64.s)

Generate assembly from C/C++

For projects using cmake, for example, here is how to see a list of assembly targets

$ make help | grep "\.s"

To see the actual command to generate the assembly

$ make -n SimdAvx2BgraToGray.s

Supported golang architectures

For now only the AMD64 architecture is supported. ARM64 should work in a similar fashion, but support is lacking at the moment.

Compatible compilers

The following compilers have been tested:

  • clang (Apple LLVM version) on OSX/darwin
  • clang on linux

Compiler flags:

-masm=intel -mno-red-zone -mstackrealign -mllvm -inline-threshold=1000 -fno-asynchronous-unwind-tables -fno-exceptions -fno-rtti

Flag                               Explanation
-masm=intel                        Output Intel syntax for assembly
-mno-red-zone                      Do not write below stack pointer (avoid red zone)
-mstackrealign                     Use explicit stack initialization
-mllvm -inline-threshold=1000      Higher limit for inlining heuristic (default=225)
-fno-asynchronous-unwind-tables    Do not generate unwind tables (for debug purposes)
-fno-exceptions                    Disable exception handling
-fno-rtti                          Disable run-time type information

The following flags are only available in clang -cc1 frontend mode (see below):

Flag                               Explanation
-fno-jump-tables                   Do not use jump tables, as may be generated for switch statements

clang vs clang -cc1

As per the clang FAQ, clang -cc1 is the frontend, and clang is a (mostly GCC compatible) driver for the frontend. To see all options that the driver passes on to the frontend, use -### like this:

$ clang -### -c hello.c
"/usr/lib/llvm/bin/clang" "-cc1" "-triple" "x86_64-pc-linux-gnu" etc. etc. etc.

Command line flags for clang

To see all command line flags use either clang --help or clang --help-hidden for the clang driver or clang -cc1 -help for the frontend.

Further optimization and fine tuning

Using the LLVM optimizer (opt) you can further optimize the code generation. Use opt -help or opt -help-hidden for all available options.

An option can be passed in via clang using the -mllvm <value> option, such as -mllvm -inline-threshold=1000 as discussed above.

Also LLVM allows you to tune specific functions via function attributes like define void @f() alwaysinline norecurse { ... }.

What about GCC support?

For now GCC code will not work out of the box. However there is no reason why GCC should not work fundamentally (PRs are welcome).

Resources

License

c2goasm is released under the Apache License v2.0. You can find the complete text in the file LICENSE.

Contributing

Contributions are welcome, please send PRs for any enhancements.
