• Stars: 3
• Rank: 3,963,521 (Top 79 %)
• Language: C++
• License: Other
• Created over 6 years ago
• Updated over 5 years ago

Repository Details

Layrub is a runtime data placement strategy for extreme-scale network training. It is developed on BVLC Caffe and achieves memory savings of more than 50% over BVLC Caffe.

More Repositories

1

YiTu

YiTu is an easy-to-use runtime that fully exploits the hybrid parallelism of different hardware (e.g., GPUs) to efficiently support the execution of various kinds of graph algorithms (e.g., GNNs).
Python
353
star
2

VulDeePecker

VulDeePecker: A Deep Learning-Based System for Vulnerability Detection
C
289
star
3

naturalcc

NaturalCC: An Open-Source Toolkit for Code Intelligence
Python
266
star
4

Android-Container

A method for running Linux containers (Docker) on the Android platform by migrating containers from x86-based Ubuntu to ARM-based Android.
Python
187
star
5

SCVDT

Source Code Vulnerability Detection Tools (SCVDT) provides a vulnerable code database, a vulnerability detection service for Java and C/C++ programs, and other security services.
C
109
star
6

AdvCLIP

The implementation of our ACM MM 2023 paper "AdvCLIP: Downstream-agnostic Adversarial Examples in Multimodal Contrastive Learning"
Python
80
star
7

AMT-GAN

The official implementation of our CVPR 2022 paper "Protecting Facial Privacy: Generating Adversarial Identity Masks via Style-robust Makeup Transfer".
Python
78
star
8

AdvEncoder

The implementation of our ICCV 2023 paper "Downstream-agnostic Adversarial Examples"
Python
68
star
9

Tensorflow-RDMA

TensorFlow is a computational library using data flow graphs for scalable machine learning. Tensorflow-RDMA is an implementation over RDMA that achieves about a 4.5x speedup on two nodes compared with TCP/IP.
C++
60
star
10

awesome-code-intelligence

57
star
11

BadHash

The official implementation of BadHash
Python
56
star
12

XGCN_library

Python
52
star
13

HSCC

HSCC is implemented on the zsim-nvmain hybrid simulator and provides the following functions: (1) memory management simulation (such as MemoryNode, Zone, and Buddy Allocator); (2) TLB, page table, and reversed page table simulation; (3) an implementation of SHMA, a hierarchical hybrid DRAM/NVM memory system that brings DRAM caching issues to the software level; (4) support for multiple DRAM-NVM hybrid architectures.
C++
51
star
14

HME

HME is a hybrid memory emulator for studying the performance and energy characteristics of upcoming NVM technologies. HME exploits features available in commodity NUMA architectures to emulate two kinds of memory: fast, local DRAM, and slower, remote NVM on other NUMA nodes. HME can emulate a wide range of NVM latencies and bandwidths by injecting different memory access delays on the remote NUMA nodes. To help programmers and researchers evaluate the impact of NVM on application performance, a high-level programming interface is also provided to allocate memory from NVM or DRAM nodes.
49
star
15

VulCNN

C++
46
star
16

MavenEcoSysResearch

Python
43
star
17

Libdroid

A unikernel-based runtime for mobile computation offloading under Mobile Fog Computing or Mobile Edge Computing scenarios.
Java
38
star
18

Frog

Frog performs asynchronous graph processing on GPUs with a hybrid coloring model. The fundamental idea is based on the Pareto principle (or 80-20 rule) as it applies to coloring algorithms, which we observed across a large number of real graph coloring cases.
Cuda
37
star
19

DCF

Dynamic Cuckoo Filter (DCF) is a succinct data structure for approximate set representation and membership testing on large-scale dynamic data sets. DCF supports item insertion, deletion, and query, and can flexibly adjust its capacity. DCF reduces the memory space of the state-of-the-art Dynamic Bloom Filter by 75% and improves the speed of insert/query/delete operations by 30% to 80%.
C++
37
star
20

Graphchallenge21

Graph Challenge 2021
Cuda
27
star
21

pFedSD

The implementation of "Personalized Edge Intelligence via Federated Self-Knowledge Distillation".
Python
20
star
22

Pensieve

Pensieve is a skewness-aware multi-version graph processing system that exploits the time locality of graph version access and leverages a differentiated graph storage strategy.
C++
20
star
23

TeCo

The official implementation of our CVPR 2023 paper "Detecting Backdoors During the Inference Stage Based on Corruption Robustness Consistency".
Python
19
star
24

HierGAT

The implementation of "Entity Resolution via Hierarchical Graph Attention Network"
Python
18
star
25

Rattrap

Rattrap is a container-based cloud platform for mobile code offloading that provides mobile code runtime environments through Cloud Android Container. In this framework, the cloud runtime is neither a VM nor a JVM. We use OS-level virtualization, Linux Containers (LXC), as the runtime for mobile code. To run Android code on an x86 GNU/Linux server, we have modified the Android source code and the Linux kernel it uses. The modification work is based on the Android-x86 project. With our effort, Android OS can finally run in ordinary Linux containers!
Makefile
18
star
26

Horae

Horae is a graph stream summarization structure for efficient temporal range queries. Horae can handle temporal queries with arbitrary and elastic ranges while guaranteeing one-sided and controllable errors. More to the point, Horae provides a worst-case query time of O(log L), where L is the length of the query range. Horae leverages multi-layer storage and the Binary Range Decomposition (BRD) algorithm to decompose a temporal range query into a logarithmic number of time interval queries and executes these queries in the corresponding layers.
C++
18
star
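The Horae description above mentions decomposing a range query into a logarithmic number of interval queries. The following is a minimal sketch of one generic way to do that, splitting an inclusive integer range into aligned power-of-two blocks. The function name `binary_range_decomposition` and the `(start, size)` output format are hypothetical and chosen for illustration; this is not taken from the Horae code and may differ from its actual BRD algorithm.

```python
def binary_range_decomposition(start, end):
    """Split the inclusive range [start, end] into aligned power-of-two blocks.

    Each block is returned as (block_start, block_size), where block_size is a
    power of two and block_start is a multiple of block_size. Illustrative
    sketch only; not the Horae implementation.
    """
    blocks = []
    while start <= end:
        if start > 0:
            # Largest power of two that divides start (its lowest set bit).
            size = start & -start
        else:
            # Zero is aligned to every power of two; begin with one that
            # is certainly larger than the remaining range.
            size = 1 << (end + 1).bit_length()
        # Shrink the block until it fits inside the remaining range.
        while start + size - 1 > end:
            size >>= 1
        blocks.append((start, size))
        start += size
    return blocks


if __name__ == "__main__":
    # A block of size 2**k could be answered by layer k of a multi-layer store.
    for block_start, size in binary_range_decomposition(3, 10):
        print(block_start, size)
```

Under this scheme a range of length L yields at most about 2·log2(L) blocks, which is where a logarithmic query cost would come from.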
27

gengar

Gengar is a distributed shared hybrid memory pool with RDMA support. Gengar allows applications to access remote DRAM/NVM in a large, global memory space through a client/server model.
JavaScript
18
star
28

RETIA

Released codes of the RETIA model.
Python
16
star
29

PathEval

This is an evaluation set for the problem of directed/targeted test input generation. We use it to benchmark the ability of Large Language Models for generating inputs to reach a certain code location or produce a particular result.
C
15
star
30

FastJoin

A scalable distributed stream join system
Java
14
star
31

PStream

PStream is a popularity-aware differentiated distributed stream processing system, which identifies the popularity of keys in the stream data and uses a differentiated partitioning scheme. PStream greatly outperforms Storm on skewed data in terms of throughput and processing latency.
Java
14
star
32

RGraph

RGraph is an RDMA-assisted asynchronous distributed graph processing system. RGraph distributes edges into two parts to isolate master and mirror vertices. RGraph exploits the asymmetry of RDMA to accelerate the one-to-many communication between master and mirror vertices. Comprehensive experiments show that, compared to the existing design PowerGraph, RGraph reduces the execution time by up to 81%.
C++
14
star
33

LDCF

LDCF is a novel, efficient approximate set representation structure for large-scale dynamic data sets. LDCF uses a novel multi-level tree structure and reduces the worst-case insertion and membership testing times from O(N) to O(1).
C++
14
star
34

mioDB

MioDB: Devouring Data Byte-addressable LSM-based KV Stores for Hybrid Memory
C++
13
star
35

TransferAttackSurrogates

The official code of the IEEE S&P 2024 paper "Why Does Little Robustness Help? A Further Step Towards Understanding Adversarial Transferability". We study how to train surrogate models for boosting transfer attacks.
Python
13
star
36

Simois

Simois is a scalable distributed stream join system, which supports efficient join operations on two streams with highly skewed data distributions. Simois supports the completeness of the join results and greatly outperforms existing stream join systems in terms of system throughput and average processing latency.
Java
13
star
37

HistFuzz

A practical fuzzing tool for SMT solvers
SMT
11
star
38

BCF

Better Choice Cuckoo Filter (BCF) is an efficient approximate set representation data structure. Unlike the standard Cuckoo Filter (CF), BCF leverages the principle of the power of two choices to select the better candidate bucket during insertion. BCF reduces the average number of relocations of the state-of-the-art CF by 35%.
C++
11
star
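The BCF description above refers to the power of two choices when picking a candidate bucket. The sketch below illustrates that idea on a tiny cuckoo-filter-style table, under the assumption that "better" means "currently less loaded"; the actual BCF selection criterion may differ. All names (`NUM_BUCKETS`, `BUCKET_SIZE`, `candidates`, `insert`, `contains`) are hypothetical, and relocation of existing fingerprints is deliberately left out.

```python
import hashlib

NUM_BUCKETS = 8   # hypothetical size; a power of two keeps XOR indexing in range
BUCKET_SIZE = 4   # hypothetical number of fingerprint slots per bucket


def _hash(data: bytes) -> int:
    """Deterministic 64-bit hash built from SHA-256."""
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


def candidates(item: str):
    """Return (fingerprint, bucket1, bucket2) for an item."""
    fp = (_hash(b"fp" + item.encode()) % 255) + 1      # 1..255, never zero
    i1 = _hash(item.encode()) % NUM_BUCKETS
    i2 = (i1 ^ _hash(bytes([fp]))) % NUM_BUCKETS        # partial-key cuckoo hashing
    return fp, i1, i2


def insert(buckets, item: str) -> bool:
    """Insert into the less-loaded of the two candidate buckets.

    Assumption: 'better bucket' is interpreted as 'fewer stored fingerprints'.
    """
    fp, i1, i2 = candidates(item)
    target = i1 if len(buckets[i1]) <= len(buckets[i2]) else i2
    if len(buckets[target]) < BUCKET_SIZE:
        buckets[target].append(fp)
        return True
    # Both candidates are full; a full filter would relocate entries here.
    return False


def contains(buckets, item: str) -> bool:
    """Approximate membership test (false positives are possible)."""
    fp, i1, i2 = candidates(item)
    return fp in buckets[i1] or fp in buckets[i2]


if __name__ == "__main__":
    table = [[] for _ in range(NUM_BUCKETS)]
    insert(table, "apple")
    print(contains(table, "apple"))
```

Preferring the emptier bucket spreads fingerprints more evenly, which is the intuition behind needing fewer relocations than always trying the first candidate.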
39

ScalaBFS

A Scalable BFS Accelerator on FPGA-HBM Platform
Scala
10
star
40

GraphInstruct

The benchmark proposed in the paper "GraphInstruct: Empowering Large Language Models with Graph Understanding and Reasoning Capability"
Python
10
star
41

Argus

Argus is a novel RDMA-assisted job scheduler that achieves high resource utilization by fully exploiting the structural features of stage dependencies. Comprehensive experiments using large-scale traces collected from the real world show that Argus reduces job completion time and job makespan by 21% and 20%, respectively, compared to RDMA-Spark.
Scala
10
star
42

DGraph

DGraph is a system for directed graph processing that takes advantage of the strongly connected component structure. On this system, most graph partitions are able to reach convergence in order and need to be loaded into main memory exactly once, yielding much lower data access cost and faster convergence.
C++
10
star
43

HCB-pHCB

Python
9
star
44

PathGraph

PathGraph is a path-centric graph processing system for fast iterative computation on large graphs with billions of edges. Large-scale graph analysis applications typically involve datasets of massive scale. Most existing approaches address the iterative graph computation problem by programming and executing graph computation using either vertex-centric or edge-centric approaches. PathGraph takes a path-centric approach instead.
C++
9
star
45

Whale

Whale is a novel RDMA-assisted distributed stream processing system (DSPS) with efficient one-to-many data partitioning. Whale explores a novel RDMA-assisted stream multicast mechanism and a new worker-oriented communication mechanism. We implement Whale on top of Apache Storm and evaluate it using experiments with large-scale datasets. The results show that Whale achieves a 56.6x improvement in system throughput and a 97% reduction in processing latency compared to existing designs.
Java
9
star
46

DHUNET

Released code of the DHU-NET model published in ICDM2022.
Python
8
star
47

Attack_PTMC

The dataset, source code and the results of our ESEC/FSE 2023 paper "An Extensive Study on Adversarial Attack against Pre-trained Models of Code".
Python
8
star
48

ShareRender

ShareRender is a cloud gaming system that enables fine-grained resource sharing at the frame-level. Existing cloud gaming systems suffer from low GPU utilization in the virtualized environment. Moreover, GPU resources are scheduled in units of virtual machines (VMs) and this kind of coarse-grained scheduling at the VM-level fails to fully exploit GPU processing capacity. ShareRender offloads graphics workloads within VMs directly to GPUs, bypassing GPU virtualization. For each game running in a VM, ShareRender starts a graphics wrapper to intercept frame rendering requests and assign them to render agents responsible for frame rendering on GPUs. Thanks to the flexible workload assignment among multiple render agents, ShareRender enables fine-grained resource sharing at the frame-level to significantly improve GPU utilization. If you want to know more about ShareRender, please refer to our paper in Multimedia 2017. Wei Zhang, Xiaofei Liao, Peng Li, Hai Jin, Li Lin, "ShareRender: Bypassing GPU Virtualization to Enable Fine-grained Resource Sharing for Cloud Gaming". In Proceedings of ACM International Conference on Multimedia (MM'17), Mountain View, CA, 2017.
C++
8
star
49

PHunter

Java
7
star
50

FeatureIndistinguishableAttack

Implementation of ACM CCS 2021 paper "Feature-Indistinguishable Attack to Circumvent Trapdoor-enabled Defense".
Python
7
star
51

Ares

Ares is a high-performance and fault-tolerant distributed stream processing system, which considers both system performance and fault tolerance during task allocation and uses a game-theoretic approach to obtain an optimal scheduler for task allocation. Ares greatly outperforms Storm in terms of system throughput and average processing latency.
Java
7
star
52

Amain

Detecting Semantic Code Clones by Building AST-based Markov Chains Model
Python
7
star
53

MorphDAG-prototype

Released code of the MorphDAG prototype (version 1.0)
Go
7
star
54

PRDMA

pRDMA proposes persistent RPC designs. Persistent RPCs use several hardware-supported RDMA Flush primitives to decouple the data persisting from the complicated RPC processing. Also, pRDMA implements several RPC transmission models of state-of-the-art RPC work for performance comparison.
C
7
star
55

MXNet-G

MXNet-G is a deep learning framework designed based on MXNet (https://mxnet.incubator.apache.org/index.html). It allows you to train models with a novel distributed SGD (Stochastic Gradient Descent) algorithm named Grouping-SGD. A new parallelization scheme named GSP (Grouping Synchronous Parallel) is used in Grouping-SGD for distributed deep learning on heterogeneous clusters.
C++
7
star
56

HMCached

HMCached is an in-memory K-V store built on a hybrid DRAM/NVM system. HMCached utilizes an application-level data access counting mechanism to identify frequently-accessed (hotspot) objects (i.e., K-V pairs) in NVM, and migrates them to fast DRAM to reduce the costly NVM accesses. We also propose an NVM-friendly index structure to store the frequently-updated portion of object metadata in DRAM, and thus further mitigate the NVM accesses. Moreover, we propose a benefit-aware memory reassignment policy to address the slab calcification problem in slab-based K-V store systems, and significantly improve the benefit gain from the DRAM.
C
7
star
57

BlockSim

A blockchain network simulator, which can be used for blockchain network protocol verification.
Java
7
star
58

TripeBit

TripleBit is designed based on two important observations. First, it is important to design a storage structure that can directly and efficiently query the RDF graph. This motivates us to design a compact storage and index structure in TripleBit. Second, in order to truly scale the RDF query processor, we need efficient index structures and query evaluation algorithms to minimize the size of intermediate results generated when evaluating queries, especially complex join queries. This leads us to the design decision that we should not only reduce the size of indexes, but also minimize the number of indexes used in query evaluation.
C++
7
star
59

TreeCen

Python
6
star
60

VulLLM

An implementation of the ACL 2024 Findings paper "Generalization-Enhanced Code Vulnerability Detection via Multi-Task Instruction Fine-Tuning".
Python
6
star
61

FedGKD

Python
6
star
62

vdgraph

Python
6
star
63

Patrol

Promela
6
star
64

Auxo

Auxo is a scalable and efficient framework for graph stream summarization
C++
6
star
65

DarkSAM

The implementation of our NeurIPS 2024 paper "DarkSAM: Fooling Segment Anything Model to Segment Nothing".
6
star
66

Nezha

An efficient concurrency control mechanism towards DAG-based blockchains
Go
6
star
67

VulBG

Python
6
star
68

AdvHash

The official implementation of our ACM MM 2021 paper "AdvHash: Set-to-set Targeted Attack on Deep Hashing with One Single Adversarial Patch".
Python
6
star
69

NightWatch

NightWatch is an extension to the memory management system that provides general, transparent, and low-overhead cache pollution control. NightWatch divides memory mapping into two types: restrictive mapping and open mapping. Restrictive mapping is used to restrict the pollution effect of poor-locality data, while open mapping is used for cache-friendly data. When a malloc request arrives, NightWatch predicts the access locality of the memory to be allocated, determines the proper cache demand, and selects the right mapping type for the request. NightWatch is based on the observation that data within the same memory chunk, or chunks within the same allocation context, often share similar locality properties. NightWatch embodies this observation by monitoring current cache locality online to predict future behavior and proactively restricting potential cache polluters.
C++
6
star
70

MHSim

C++
5
star
71

HomDroid

5
star
72

Gen-AF

The implementation of our IEEE S&P 2024 paper "Securely Fine-tuning Pre-trained Encoders Against Adversarial Examples".
Python
5
star
73

PointCRT

PointCRT: Detecting Backdoor in 3D Point Cloud via Corruption Robustness (MM '23)
Python
5
star
74

SSDUPplus

The new version of SSDUP, an optimized SSD Burst Buffer for HPC by Traffic Detection
C
5
star
75

AndroidX

AndroidX is a customizable execution runtime environment for running Android applications on clouds
C
5
star
76

SubInfer

The source code and supplementary materials of the paper "An Efficient Subgraph-inferring Framework for Large-scale Heterogeneous Graphs"
Python
5
star
77

Scube

Scube is an efficient summarization structure for skewed graph streams. Two factors contribute to the efficiency of Scube. First, Scube proposes a space- and computation-efficient probabilistic counting scheme to identify high-degree nodes in a graph stream. Second, Scube differentiates the storage strategy for the edges associated with high-degree nodes by dynamically allocating multiple rows or columns. We conduct comprehensive experiments to evaluate the performance of Scube on large-scale real-world datasets. The results show that Scube reduces the query latency over a graph stream by 48%-99% while achieving acceptable query accuracy compared to state-of-the-art designs.
C++
5
star
78

SSDUP

SSDUP is a traffic-aware burst buffer for HPC systems, which detects the randomness in HPC I/O write operations and flushes the SSD buffer in a pipelined mode that overlaps the SSD flush phase with the write phase.
C
5
star
79

streambox

Python
4
star
80

PandaKit

Kotlin
4
star
81

GradientsScrutinizer

Python
4
star
82

DynamicTG

Java
4
star
83

GoPie

Go
4
star
84

FL_Bug_Study

The data, source code and the results of our ESEC/FSE 2023 paper "Understanding the Bug Characteristics and Fix Strategies of Federated Learning Systems".
Python
4
star
85

Mammoth

Mammoth is a new MapReduce system that aims to improve MapReduce performance using global memory management. We have conducted extensive experiments comparing it against the native Hadoop platform. The results show that Mammoth can reduce the total job execution time by 40% in typical cases, without requiring any modifications to Hadoop programs. When a system is short of memory, the performance improvement can be up to 5 times, as observed for CPU- and I/O-intensive applications such as PageRank. Given the growing importance of supporting large-scale data processing and analysis, and the proven success of the MapReduce platform, the Mammoth system has promising potential and impact.
Java
4
star
86

LiveRender

LiveRender is an open source cloud gaming system based on graphics streaming. LiveRender intercepts the D3D graphics commands and migrates them from the server to the client. We use several compression techniques to reduce the data transmission of graphics streaming, and so LiveRender provides a better experience of cloud gaming.
C
4
star
87

ACStor

In virtualized data centers, the access of virtual disk images (VDIs) is critical for the overall system performance. As the system scales up to a large number of running VMs, the overall network traffic would become unbalanced with hot spots on some VMs inevitably, leading to I/O performance degradation when accessing the VMs. We propose an adaptive and collaborative VDI storage system (ACStor) to resolve the above performance issue, which can dynamically balance the traffic workloads in accessing VDI chunks based on the run-time network state.
C
4
star
88

LoomIO

LoomIO is an object-level coordination system for distributed file systems. It adopts a wait-free design that lets interfering object requests self-organize and obtain an optimized scheduling decision. Currently, LoomIO is implemented and integrated in Ceph.
C++
4
star
89

JOpFuzzer

Java
4
star
90

LDV

A Lightweight DAG-Based Blockchain for Vehicular Social Networks
Go
3
star
91

ChartStamp

The implementation of ACM MM 2022 paper "ChartStamp: Robust Chart Encoding for Real-World Applications".
C++
3
star
92

CausalNET

Python
3
star
93

StateDiver

Python
3
star
94

Minix-Container-Support

Minix kernel hacking
C
3
star
95

E2CF

The entry-extensible cuckoo filter (E2CF) is an approximate set representation structure, which supports entry-level extension and avoids many discrete memory accesses in a query.
C++
3
star
96

FastResponse

C
3
star
97

Robin

The implementation of our ASE 2023 paper "Robin: A Novel Method to Produce Robust Interpreters for Deep Learning-Based Code Classifiers".
Python
3
star
98

COVID19-Dataset

The labeled news data about coronavirus pneumonia.
3
star
99

FJoin

SystemVerilog
3
star
100

LambdaMisuse

Jupyter Notebook
3
star