  • Stars: 134
  • Rank: 270,967 (Top 6 %)
  • Language: Go
  • License: Apache License 2.0
  • Created almost 11 years ago
  • Updated over 3 years ago

Repository Details

A Go client binding for Hadoop HDFS using WebHDFS.

gowfs

gowfs is a Go binding for Hadoop HDFS via its WebHDFS interface. It provides typed access to remote HDFS resources via Go's JSON marshaling system. gowfs follows the WebHDFS JSON protocol outlined in http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/WebHDFS.html. It has been tested with the Apache Hadoop 2.x.x series.

GoDoc Package Documentation

GoDoc documentation - https://godoc.org/github.com/vladimirvivien/gowfs

Usage

go get github.com/vladimirvivien/gowfs
import "github.com/vladimirvivien/gowfs"
...
fs, err := gowfs.NewFileSystem(gowfs.Configuration{Addr: "localhost:50070", User: "hdfs"})
if err != nil {
	log.Fatal(err)
}
checksum, err := fs.GetFileChecksum(gowfs.Path{Name: "location/to/file"})
if err != nil {
	log.Fatal(err)
}
fmt.Println(checksum)

Run HDFS Test

For examples of the API in use, see the test-hdfs directory. Compile that code and run it against a live HDFS deployment. See https://github.com/vladimirvivien/gowfs/tree/master/test-hdfs.

HDFS Setup

  • Enable the dfs.webhdfs.enabled property in your hdfs-site.xml.
  • Ensure the hadoop.http.staticuser.user property is set in your core-site.xml.
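
One way to check the setup above is to request the status of the root directory over HTTP and confirm that the namenode answers with JSON. The sketch below only builds such a URL using the standard library. The /webhdfs/v1 prefix and the op and user.name query parameters come from the WebHDFS REST documentation; the helper name webhdfsURL is hypothetical and is not part of gowfs.

```go
package main

import (
	"fmt"
	"net/url"
)

// webhdfsURL is a hypothetical helper (not part of gowfs). It builds a WebHDFS
// REST URL for the given namenode address, HDFS path, operation, and user.
func webhdfsURL(addr, hdfsPath, op, user string) string {
	q := url.Values{}
	q.Set("op", op)
	q.Set("user.name", user) // "SIMPLE" security mode identifies the caller this way
	u := url.URL{
		Scheme:   "http",
		Host:     addr,
		Path:     "/webhdfs/v1" + hdfsPath,
		RawQuery: q.Encode(),
	}
	return u.String()
}

func main() {
	// Paste the printed URL into curl or a browser to probe the namenode.
	fmt.Println(webhdfsURL("localhost:50070", "/", "GETFILESTATUS", "hdfs"))
}
```

If WebHDFS is disabled or the address is wrong, the request typically fails before any JSON is returned, which makes this a quick first diagnostic.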

API Overview

gowfs lets you access HDFS resources via two structs, FileSystem and FsShell. Use FileSystem for low-level calls. FsShell is designed to provide a higher level of abstraction and integration with the local file system.

FileSystem API

Configuration{} Struct

Use the Configuration{} struct to specify parameters for the file system. You can create a configuration either with a Configuration{} literal or with NewConfiguration() to start from defaults.

conf := *gowfs.NewConfiguration()
conf.Addr = "localhost:50070"
conf.User = "hdfs"
conf.ConnectionTimeout = time.Second * 15
conf.DisableKeepAlives = false

FileSystem{} Struct

Create a FileSystem{} value before calling any of its functions. You create the FileSystem by passing a Configuration value to NewFileSystem(), as shown below.

fs, err := gowfs.NewFileSystem(conf)

Now you are ready to communicate with HDFS.

Create File

FileSystem.Create() creates and stores a remote file on the HDFS server. See https://godoc.org/github.com/vladimirvivien/gowfs#FileSystem.Create

ok, err := fs.Create(
	bytes.NewBufferString("Hello webhdfs users!"),
	gowfs.Path{Name: "/remote/file"},
	false,
	0,
	0,
	0700,
	0,
)

Open HDFS File

Use FileSystem.Open() to open and read a remote file from HDFS. See https://godoc.org/github.com/vladimirvivien/gowfs#FileSystem.Open

data, err := fs.Open(gowfs.Path{Name:"/remote/file"}, 0, 512, 2048)
...
rcvdData, _ := ioutil.ReadAll(data)
fmt.Println(string(rcvdData))

Append to File

To append to an existing HDFS file, use FileSystem.Append(). See https://godoc.org/github.com/vladimirvivien/gowfs#FileSystem.Append

ok, err := fs.Append(
    bytes.NewBufferString("Hello webhdfs users!"),
    gowfs.Path{Name:"/remote/file"}, 4096)

Rename File

Use FileSystem.Rename() to rename HDFS resources. See https://godoc.org/github.com/vladimirvivien/gowfs#FileSystem.Rename

ok, err := fs.Rename(gowfs.Path{Name: "/old/name"}, gowfs.Path{Name: "/new/name"})

Delete HDFS Resources

To delete an HDFS resource (file/directory), use FileSystem.Delete(). See https://godoc.org/github.com/vladimirvivien/gowfs#FileSystem.Delete

ok, err := fs.Delete(gowfs.Path{Name:"/remote/file/todelete"}, false)

File Status

You can get status about an existing HDFS resource using FileSystem.GetFileStatus(). See https://godoc.org/github.com/vladimirvivien/gowfs#FileSystem.GetFileStatus

fileStatus, err := fs.GetFileStatus(gowfs.Path{Name:"/remote/file"})

GetFileStatus() returns a value of type FileStatus, a struct that describes the remote file.

type FileStatus struct {
	AccesTime        int64
	BlockSize        int64
	Group            string
	Length           int64
	ModificationTime int64
	Owner            string
	PathSuffix       string
	Permission       string
	Replication      int64
	Type             string
}

You can get a list of file stats using FileSystem.ListStatus().

stats, err := fs.ListStatus(gowfs.Path{Name:"/remote/directory"})
for _, stat := range stats {
    fmt.Println(stat.PathSuffix, stat.Length)
}
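
To illustrate what sits behind these calls, the sketch below decodes a LISTSTATUS-style JSON response with encoding/json. The JSON layout (a FileStatuses object wrapping a FileStatus array, with modificationTime in milliseconds since the Unix epoch) follows the WebHDFS documentation. The Go types, the helper parseListStatus, and the sample values are illustrative assumptions and are not gowfs's own definitions.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// fileStatus mirrors a subset of the FileStatus JSON object from the WebHDFS
// documentation. It is an illustrative type, not the gowfs FileStatus struct.
type fileStatus struct {
	Length           int64  `json:"length"`
	ModificationTime int64  `json:"modificationTime"` // milliseconds since the Unix epoch
	Owner            string `json:"owner"`
	PathSuffix       string `json:"pathSuffix"`
	Permission       string `json:"permission"` // octal string such as "644"
	Type             string `json:"type"`       // "FILE" or "DIRECTORY"
}

// listStatusResponse matches the outer wrapper of a LISTSTATUS response.
type listStatusResponse struct {
	FileStatuses struct {
		FileStatus []fileStatus `json:"FileStatus"`
	} `json:"FileStatuses"`
}

// parseListStatus is a hypothetical helper that extracts the entries.
func parseListStatus(body []byte) ([]fileStatus, error) {
	var resp listStatusResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, err
	}
	return resp.FileStatuses.FileStatus, nil
}

// sample is made-up data shaped like a LISTSTATUS response.
const sample = `{"FileStatuses":{"FileStatus":[
  {"length":24930,"modificationTime":1320171722771,"owner":"webuser","pathSuffix":"a.patch","permission":"644","type":"FILE"},
  {"length":0,"modificationTime":1320895981256,"owner":"webuser","pathSuffix":"bar","permission":"711","type":"DIRECTORY"}
]}}`

func main() {
	stats, err := parseListStatus([]byte(sample))
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	for _, s := range stats {
		modified := time.UnixMilli(s.ModificationTime).UTC()
		fmt.Println(s.Type, s.PathSuffix, s.Length, modified.Format(time.RFC3339))
	}
}
```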

FsShell Examples

Create the FsShell

To create an FsShell, you need to have an existing instance of FileSystem.

shell := gowfs.FsShell{FileSystem:fs}

FsShell.Put()

Use Put() to upload a local file to an HDFS file system. See https://godoc.org/github.com/vladimirvivien/gowfs#FsShell.PutOne

ok, err := shell.Put("local/file/name", "hdfs/file/path", true)

FsShell.Get()

Use Get() to copy a remote HDFS file to the local file system. See https://godoc.org/github.com/vladimirvivien/gowfs#FsShell.Get

ok, err := shell.Get("hdfs/file/path", "local/file/name")

FsShell.AppendToFile()

Append local files to a remote HDFS file or directory. See https://godoc.org/github.com/vladimirvivien/gowfs#FsShell.AppendToFile

ok, err := shell.AppendToFile([]string{"local/file/1", "local/file/2"}, "remote/hdfs/path")

FsShell.Chown()

Change the owner of remote HDFS files. See https://godoc.org/github.com/vladimirvivien/gowfs#FsShell.Chown

ok, err := shell.Chown([]string{"/remote/hdfs/file"}, "owner2")

FsShell.Chgrp()

Change the group of remote HDFS files. See https://godoc.org/github.com/vladimirvivien/gowfs#FsShell.Chgrp

ok, err := shell.Chgrp([]string{"/remote/hdfs/file"}, "superduper")

FsShell.Chmod()

Change the file mode of remote HDFS files. See https://godoc.org/github.com/vladimirvivien/gowfs#FsShell.Chmod

ok, err := shell.Chmod([]string{"/remote/hdfs/file/"}, 0744)
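
The leading zero in 0744 matters: Go reads it as an octal literal, and WebHDFS expects the permission as an octal string such as 744. The sketch below shows the conversion with the standard library and what happens if the zero is dropped. The helper permissionParam is hypothetical and is not part of gowfs.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// permissionParam is a hypothetical helper (not part of gowfs). It renders the
// permission bits of mode as the octal string used in WebHDFS query parameters.
func permissionParam(mode os.FileMode) string {
	return strconv.FormatUint(uint64(mode.Perm()), 8)
}

func main() {
	fmt.Println(permissionParam(0744)) // prints "744": rwxr--r--
	fmt.Println(permissionParam(744))  // prints "350": decimal 744 is not the intended mode
}
```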

Limitations

  1. Only the "SIMPLE" security mode is supported.
  2. No support for Kerberos (none planned right now).
  3. No SSL support yet.

References

  1. WebHDFS API - http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/WebHDFS.html
  2. FileSystemShell - http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/FileSystemShell.html#getmerge
