
RcppSimdJson: Rcpp Bindings for the simdjson Header Library

Motivation

simdjson by Daniel Lemire (with contributions by Geoff Langdale, John Keiser and many others) is an engineering marvel. Through very clever use of SIMD instructions, it manages to parse JSON files faster than disk access. Wut? Yes, you read that right: parallel processing with so little overhead that the net throughput is limited only by disk speed.

Moreover, it is implemented in neat modern C++ and can be accessed as a header-only library. (Well, one library in two files, really.) Which makes R packaging easy and convenient and compelling. So here we are.

For further introduction, see the arXiv paper by Langdale and Lemire (out/to appear in VLDB Journal 28(6) as well) and/or the video of the recent talk by Daniel Lemire at QCon (voted best talk).

Example

jsonfile <- system.file("jsonexamples", "twitter.json", package="RcppSimdJson")
library(RcppSimdJson)
validateJSON(jsonfile)                  # validate a JSON file
res <- fload(jsonfile)                  # parse a JSON file

Comparison

A simple parsing benchmark against four other R-accessible JSON parsers:

R> res
Unit: milliseconds
     expr      min       lq     mean   median       uq       max neval  cld
 simdjson  1.87118  2.03252  2.24351  2.17228  2.27756   6.57145   100 a
  jsonify  8.91694  9.20124  9.58652  9.46077  9.73692  13.41707   100  b
  RJSONIO 10.49187 11.09410 11.69109 11.42555 11.95780  17.93653   100  b
   ndjson 27.04830 28.62251 31.44330 29.51343 32.05847 146.88221   100   c
 jsonlite 34.93334 36.54784 38.67843 37.74890 40.19555  46.32444   100    d
R>
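Timings of this kind can be gathered with the microbenchmark package. The following is a minimal sketch rather than the script that produced the table above: it assumes the microbenchmark and jsonlite packages are installed, compares only two of the five parsers, and reads the bundled example file into a string first so that both parsers receive the same in-memory input.

```r
library(microbenchmark)

# Example file shipped with the package (the same one used in the Example section)
jsonfile <- system.file("jsonexamples", "twitter.json", package = "RcppSimdJson")

# Read the file once so that disk access is not part of the timing
json <- paste(readLines(jsonfile, warn = FALSE), collapse = "\n")

# Time both parsers on the identical string, 100 evaluations each
res <- microbenchmark(
  simdjson = RcppSimdJson::fparse(json),
  jsonlite = jsonlite::fromJSON(json),
  times = 100L
)
print(res)
```

Absolute numbers depend on hardware, compiler, and package versions, so only the relative ordering should be compared with the table.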

Status

All three major OSs are supported, and JSON can be parsed from file and string under a variety of settings. A C++17 compiler is required for ease of setup (though the upstream library can fall back to older compilers; one can edit src/Makevars accordingly if need be).
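Parsing from a string rather than a file can be sketched with `fparse()`, which the package exports alongside `fload()`. The JSON literal below is made up for illustration, and the snippet assumes RcppSimdJson is installed.

```r
library(RcppSimdJson)

# Hypothetical JSON string, invented for this illustration
json <- '{"name": "simdjson", "stars": 113, "tags": ["json", "simd"]}'

# fparse() takes JSON held in a character vector; fload() takes a file path
parsed <- fparse(json)

# Inspect the structure of the resulting R object
str(parsed)
```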

Contributing

Any problems, bug reports, or feature requests for the package can be submitted and handled most conveniently as GitHub issues in the repository.

Before submitting pull requests, it is frequently preferable to first discuss need and scope in such an issue ticket. See the file Contributing.md (in the Rcpp repo) for a brief discussion.

See Also

For standard JSON work on R, as well as for other nicely done C++ libraries, consider these:

Author

For the R package, Dirk Eddelbuettel and Brendan Knapp.

For everything pertaining to simdjson, Daniel Lemire (and many contributors).
