  • Stars
    194
  • Rank 200,219 (Top 4 %)
  • Language
    Shell
  • License
    Other
  • Created about 13 years ago
  • Updated almost 6 years ago

Repository Details

dcp is a distributed file copy program that automatically distributes and dynamically balances work equally across nodes in a large distributed system without centralized state.

dcp

A tool to copy file(s) in parallel on a distributed system.

SYNOPSIS

dcp [cCdfhpRrUv] [--] source_file target_file
dcp [cCdfhpRrUv] [--] source_file ... target_directory

DESCRIPTION

dcp is a file copy tool in the spirit of cp(1) that evenly distributes work across a large cluster without any centralized state. It is designed for copying files which are located on a distributed parallel file system. The method used in the file copy process is a self-stabilization algorithm which enables per-node autonomous processing and a token passing scheme to detect termination.
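
The combination of autonomous per-node processing and a token used to detect termination can be illustrated with a small single-process simulation. This is only a sketch of a classic ring-token scheme (an idle rank forwards the token, and a rank that has given work away marks the token as "black"); it is not taken from the dcp or LibCircle sources, and the function name `simulate`, the work-splitting rule, and the rank counts are assumptions made for illustration.

```python
import random
from collections import deque

def simulate(num_ranks=4, initial_items=50, seed=0):
    """Simulate decentralized work sharing with ring-token termination detection."""
    rng = random.Random(seed)
    queues = [deque() for _ in range(num_ranks)]
    queues[0].extend(range(initial_items))  # all work starts on rank 0
    black = [False] * num_ranks   # a rank turns black when it gives work away
    processed = [0] * num_ranks
    token_at, token_black, launched = 0, False, False

    while True:
        # Work phase: busy ranks process one item, idle ranks ask a random peer.
        for rank in range(num_ranks):
            if queues[rank]:
                queues[rank].popleft()
                processed[rank] += 1
                continue
            donor = rng.randrange(num_ranks)
            if donor != rank and len(queues[donor]) > 1:
                # The donor hands over half of its queue and remembers that it did.
                for _ in range(len(queues[donor]) // 2):
                    queues[rank].append(queues[donor].pop())
                black[donor] = True

        # Token phase: only an idle rank forwards the token.
        if queues[token_at]:
            continue
        if token_at == 0:
            if launched and not token_black and not black[0]:
                return processed  # a clean round: every rank is idle
            # Start (or restart) a round with a white token.
            token_black, black[0], launched = False, False, True
        else:
            # A black rank taints the token, then resets its own colour.
            token_black = token_black or black[token_at]
            black[token_at] = False
        token_at = (token_at + 1) % num_ranks

if __name__ == "__main__":
    counts = simulate()
    print("items processed per rank:", counts, "total:", sum(counts))
```

No rank holds a global view of the work: each one only inspects its own queue and the colour of the token it is handed, which is the property the description above refers to as operating without centralized state.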

PREREQUISITES

An MPI environment is required (such as Open MPI's mpirun(1)) as well as the self-stabilization library known as LibCircle.
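
Because dcp runs under an MPI launcher, an invocation typically wraps the dcp command line in the launcher's own. The helper below is a hypothetical sketch that only assembles such a command line without running it. It assumes Open MPI's `mpirun -np <count>` form; the helper name `build_dcp_command`, the process count, and the paths are invented for illustration, and other MPI environments may use a different launcher or flags.

```python
import shlex

def build_dcp_command(num_procs, sources, target, flags=()):
    """Assemble an argv list for launching dcp under mpirun (not executed here)."""
    if not sources:
        raise ValueError("at least one source is required")
    # "--" marks the end of options, as shown in the SYNOPSIS.
    return ["mpirun", "-np", str(num_procs), "dcp", *flags, "--", *sources, target]

# Example paths and process count are placeholders.
argv = build_dcp_command(8, ["/scratch/in"], "/archive/out", flags=["-R", "-p"])
print(shlex.join(argv))
```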

OPTIONS

-c, --conditional

When copying a source directory to a destination directory, copy the source directory over the destination directory. The default behavior is to copy the source directory inside the destination directory.
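
The difference between the two behaviors comes down to where the top-level source directory lands. The following sketch models that rule as described above; `resolve_target` is a hypothetical helper, not part of dcp, and it assumes POSIX-style paths.

```python
import posixpath

def resolve_target(source_dir, dest_dir, conditional=False):
    """Return the path that the top-level source directory is copied to."""
    name = posixpath.basename(source_dir.rstrip("/"))
    # With -c the source is copied over the destination itself;
    # by default it is placed inside the destination.
    return dest_dir if conditional else posixpath.join(dest_dir, name)

print(resolve_target("/data/run1", "/backup"))                    # default
print(resolve_target("/data/run1", "/backup", conditional=True))  # with -c
```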

-C, --skip-compare

Skip the compare operation to confirm file integrity. When using this option, a file integrity check, such as md5sum, should be performed after the file(s) have been copied.
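
A post-copy integrity check of the kind suggested above could look like the following. This is a minimal sketch using Python's standard `hashlib` rather than the md5sum command itself; the function names `md5_of` and `copies_match` are assumptions, and on a large parallel file system such a check would normally be distributed rather than run serially.

```python
import hashlib

def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def copies_match(source_path, copied_path):
    """Return True when both files have the same MD5 digest."""
    return md5_of(source_path) == md5_of(copied_path)
```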

-d level, --debug=level

Specify the level of debug information to output. Level may be one of: fatal, err, warn, info, or dbg. Increasingly verbose debug levels include the output of less verbose debug levels.
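
The inclusion rule can be read as an ordering of the levels from least to most verbose. The sketch below models that ordering; `LEVELS` and `is_emitted` are illustrative names and do not come from the dcp sources.

```python
# Levels listed from least verbose to most verbose, as in the option text.
LEVELS = ["fatal", "err", "warn", "info", "dbg"]

def is_emitted(configured_level, message_level):
    """A message is shown when its level is no more verbose than the configured one."""
    return LEVELS.index(message_level) <= LEVELS.index(configured_level)

# With --debug=warn, fatal/err/warn messages appear; info and dbg do not.
print([level for level in LEVELS if is_emitted("warn", level)])
```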

-f, --force

Remove existing destination files if creation or truncation fails. If the destination filesystem is specified to be unreliable (-U, --unreliable-filesystem), this option may lower performance since each failure will cause the entire file to be invalidated and copied again.

-h, --help

Print a brief message listing the dcp(1) options and usage.

-p, --preserve

Preserve the original files' owner, group, permissions (including the setuid and setgid bits), time of last modification and time of last access. In case duplication of owner or group fails, the setuid and setgid bits are cleared.

-R, --recursive

Copy directories recursively, and do the right thing when objects other than ordinary files or directories are encountered.

-r, --recursive-unspecified

Copy directories recursively, and ignore objects other than ordinary files or directories.

-U, --unreliable-filesystem

If the filesystem is very unreliable, this option may be used to always retry an operation when a failure occurs. If failures are permanent, this option will cause an infinite loop. Specifying this option when force is enabled (-f, --force) may lower performance.
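
The retry behavior, and the reason a permanent failure never terminates, can be seen in a small sketch. This is not dcp's code: `retry_until_success` and its optional `max_attempts` bound are hypothetical, and leaving the bound unset corresponds to the unbounded retry described above.

```python
def retry_until_success(operation, max_attempts=None):
    """Call operation() until it succeeds; loops forever if max_attempts is None
    and the failure is permanent."""
    attempts = 0
    while True:
        attempts += 1
        try:
            return operation(), attempts
        except OSError:
            if max_attempts is not None and attempts >= max_attempts:
                raise

# A stand-in for a flaky filesystem call that fails twice, then succeeds.
failures_left = [2]

def flaky_write():
    if failures_left[0] > 0:
        failures_left[0] -= 1
        raise OSError("transient I/O error")
    return "ok"

result = retry_until_success(flaky_write)
print(result)
```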

-v, --version

Print version information and exit.

Known bugs

When the force option is specified and truncation fails, the copy and truncation will be stuck in an infinite loop until the truncation operation returns with success.

The maximum supported filename length for any file transferred is approximately 4068 characters. This may be less than the number of characters that your operating system supports.
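
One way to find affected files before starting a copy is to scan the source tree for over-long names. The sketch below assumes the limit applies to the full path string and uses the approximate figure quoted above; `find_long_paths` is a hypothetical helper, and the exact rule dcp applies is not stated here.

```python
import os

MAX_NAME_LENGTH = 4068  # approximate limit quoted above

def find_long_paths(root, limit=MAX_NAME_LENGTH):
    """Return every path under root whose length exceeds limit."""
    too_long = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if len(path) > limit:
                too_long.append(path)
    return too_long
```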

RPM Creation

First, check the Build Status. If all tests are passing, create an rpm using the following instructions:

  1. rpmbuild -ta dcp-<version>.tar.gz
  2. rpm --install <the appropriate RPM files>

Contributions

Please view the HACKING.md file for more information on how to contribute to dcp.

COPYING

See the included COPYING file for additional information.

More Repositories

  1. ior (C, 371 stars): IOR and mdtest
  2. charliecloud (Shell, 310 stars): Lightweight user-defined software stacks for high-performance computing.
  3. mpifileutils (C, 164 stars): File utilities designed for scalability and performance.
  4. libcircle (C, 97 stars): An API to provide an efficient distributed queue on a cluster. Libcircle is currently used in production to quickly traverse and perform operations on a file tree which contains several hundred-million file nodes.
  5. Spindle (Makefile, 95 stars): Scalable dynamic library and python loading in HPC environments
  6. MPI-Examples (92 stars): Some example MPI programs
  7. xpmem (C, 85 stars): Linux Cross-Memory Attach
  8. pavilion2 (Python, 43 stars): Pavilion is a Python 3 (3.5+) based framework for running and analyzing tests targeting HPC systems.
  9. libhio (C, 20 stars): libhio is a library intended for writing data to hierarchical data store systems.
  10. libdftw (Shell, 19 stars): A distributed and decentralized filesystem treewalk function, similar to the interface of Linux's ftw(3). libdftw automatically and dynamically balances the treewalk workload across many nodes in a large distributed system.
  11. mpimemu (C, 18 stars): MPI Memory Consumption Utilities
  12. cluster-school (Shell, 14 stars): LANL Supercomputing Institute curriculum
  13. supermagic (C, 14 stars): Very simple MPI sanity code. Nothing more, nothing less.
  14. hybridize (Perl, 13 stars): Generate an optimal rootfs hybridize list of files that should be symlinked to NFS mount and not required before NFS mount happens
  15. iptablesbuild (Perl, 13 stars): iptablesbuild is effectively a configuration manager for iptables. It is intended to manage iptables configurations in a centralized location for multiple systems.
  16. give (Python, 13 stars): A tool to transfer permission of files to others in a linux-based environment.
  17. Parallel-coreutils (C, 12 stars): Parallelized gnu-coreutils
  18. sprintstatf (C, 10 stars): Print a stat struct using a method similar to sprintf(3).
  19. hpc-collab (Shell, 10 stars): This project provides provisioned HPC cluster models using underlying virtualization mechanisms.
  20. pexec (Perl, 9 stars): parallel execution command, on host or across a cluster, run commands, copy, etc
  21. lustre (C, 8 stars): Yet another branch of lustre.
  22. gnawts (Python, 8 stars): A Splunk app for fast detangling of supercomputer logs.
  23. quo-vadis (C++, 7 stars): A cross-stack coordination layer to dynamically map runtime components to hardware resources
  24. dpusm (C, 7 stars): Data Processing Unit Services Module
  25. rma-mt (C, 7 stars)
  26. OpenLorenz (JavaScript, 5 stars): Web-based HPC dashboard and more
  27. hxhim (C++, 5 stars)
  28. genpxe (Perl, 5 stars): generate cluster pxe files from a flat config file
  29. clusterscripts (Shell, 4 stars): useful? cluster. scripts!
  30. ethcfg (Perl, 4 stars): Perform external ethernet interface configuration and hostnames
  31. nrd (Go, 4 stars): Neighborless Route Detection (NRD) is a utility that dynamically manages ECMP/MultiPath routes by listening for OSPF Hello packets.
  32. trinity_net_tests (C, 3 stars): low level ugni based network tests for Trinity
  33. ppsst (Shell, 3 stars): Prerequisites, Packages, Services, Sanity check tools
  34. mpi_sessions_code_sandbox (C, 2 stars): sandbox for exploring concepts for MPI Sessions
  35. batsched4 (C++, 1 star)
  36. shasta_wrapper (Shell, 1 star): Scripting to simplify the administration of HPE Cray EX Systems
  37. cce-mpi-openmpi-1.6.4 (C, 1 star): CCE Open MPI 1.6.4
  38. rca-mesh-coords (C, 1 star): An application that uses RCA to get the mesh coordinates of NIDs within an allocation.
  39. scality-dl (Shell, 1 star): Collection of scripts specifically for starting diskless Scality 4.3.7
  40. perceus-reload (Perl, 1 star): HOWTO dump and reload a PERCEUS db from a flat-file configuration
  41. cce-mpi-openmpi-1.7.1 (C, 1 star): CCE Open MPI 1.7.1
  42. openmpi-plat (1 star): Open MPI platform files
  43. ACES-fs-acceptance (Python, 1 star): These are test plans and scripts or code input to execute file system acceptance tests for ACES systems.
  44. hpc.github.io (JavaScript, 1 star)
  45. simulator (Python, 1 star)
  46. Evalys-LANL (Python, 1 star)
  47. BatsimGantt-LANL (Python, 1 star)
  48. batsim4 (C++, 1 star)