• This repository has been archived on 30/Mar/2021
  • Language
    Java
  • License
    Apache License 2.0
  • Created almost 11 years ago
  • Updated over 3 years ago

Repository Details

Amazon Kinesis Connector Library

The Amazon Kinesis Connector Library helps Java developers integrate Amazon Kinesis with other AWS and non-AWS services. The current version of the library provides connectors for Amazon DynamoDB, Amazon Redshift, Amazon S3, and Elasticsearch. The library also includes sample connectors of each type, plus Apache Ant build files for running the samples.

Requirements

  • Amazon Kinesis Client Library: In order to use the Amazon Kinesis Connector Library, you'll also need the Amazon Kinesis Client Library.
  • Java 1.7: The Amazon Kinesis Client Library requires Java 1.7 (Java SE 7) or later.
  • Elasticsearch 1.2.1: The Elasticsearch connector depends on Elasticsearch 1.2.1.
  • SQL driver (Amazon Redshift only): If you're using an Amazon Redshift connector, you'll need a driver that will allow your SQL client to connect to your Amazon Redshift cluster. For more information, see Download the Client Tools and the Drivers in the Amazon Redshift Getting Started Guide.

Overview

Each Amazon Kinesis connector application is a pipeline that determines how records from an Amazon Kinesis stream will be handled. Records are retrieved from the stream, transformed according to a user-defined data model, buffered for batch processing, and then emitted to the appropriate AWS service.

A connector pipeline uses the following interfaces:

  • IKinesisConnectorPipeline: The pipeline implementation itself.
  • ITransformer: Defines the transformation of records from the Amazon Kinesis stream to suit the user-defined data model. Includes methods for custom serializers and deserializers.
  • IFilter: Defines a method for excluding irrelevant records from processing.
  • IBuffer: Defines a system for batching the set of records to be processed. The application can specify three thresholds: number of records, total byte count, and time. When any one of these thresholds is crossed, the buffer is flushed and the data is emitted to the destination.
  • IEmitter: Defines a method that makes client calls to other AWS services and persists the records stored in the buffer. The records can also be sent to another Amazon Kinesis stream.
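The three flush thresholds described for IBuffer can be illustrated with a small stand-alone sketch. ThresholdBuffer and its method names are hypothetical and are not the library's IBuffer or BasicMemoryBuffer API. The current time is passed in explicitly so the time threshold is easy to follow, and the choice to never flush an empty buffer is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a threshold-based buffer; not the library's IBuffer. */
final class ThresholdBuffer {
    private final int maxRecords;
    private final long maxBytes;
    private final long maxAgeMillis;

    private final List<byte[]> records = new ArrayList<>();
    private long byteCount = 0;
    private long lastFlushMillis;

    ThresholdBuffer(int maxRecords, long maxBytes, long maxAgeMillis, long nowMillis) {
        this.maxRecords = maxRecords;
        this.maxBytes = maxBytes;
        this.maxAgeMillis = maxAgeMillis;
        this.lastFlushMillis = nowMillis;
    }

    /** Stores one record and updates the running byte count. */
    void add(byte[] record) {
        records.add(record);
        byteCount += record.length;
    }

    /** True when the buffer is non-empty and any one of the three thresholds is reached. */
    boolean shouldFlush(long nowMillis) {
        return !records.isEmpty()
                && (records.size() >= maxRecords
                    || byteCount >= maxBytes
                    || nowMillis - lastFlushMillis >= maxAgeMillis);
    }

    /** Returns the buffered records and resets the counters and the timer. */
    List<byte[]> flush(long nowMillis) {
        List<byte[]> batch = new ArrayList<>(records);
        records.clear();
        byteCount = 0;
        lastFlushMillis = nowMillis;
        return batch;
    }

    public static void main(String[] args) {
        // Flush after 3 records, 1024 bytes, or 60 seconds, whichever comes first.
        ThresholdBuffer buffer = new ThresholdBuffer(3, 1024, 60_000, 0);
        buffer.add(new byte[10]);
        System.out.println(buffer.shouldFlush(1_000)); // still below every threshold
        buffer.add(new byte[10]);
        buffer.add(new byte[10]);
        System.out.println(buffer.shouldFlush(2_000)); // record-count threshold reached
        System.out.println(buffer.flush(2_000).size());
    }
}
```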

Each connector depends on the implementation of KinesisConnectorRecordProcessor to manage the pipeline. The KinesisConnectorRecordProcessor class implements the IRecordProcessor interface in the Amazon Kinesis Client Library.
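The record flow described in this overview (transform, filter, buffer, emit) can be sketched end to end. The interfaces below are simplified, hypothetical stand-ins; they do not reproduce the signatures of the library's ITransformer, IFilter, IBuffer, or IEmitter, and the buffer is reduced to a record-count threshold. The sketch uses Java 8 lambdas for brevity.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Simplified illustration of a connector pipeline; all types are hypothetical stand-ins. */
final class PipelineSketch {
    interface Transformer<T, U> {
        T toClass(byte[] rawRecord); // stream bytes -> data model
        U fromClass(T record);       // data model -> destination format
    }

    interface Filter<T> {
        boolean keepRecord(T record);
    }

    interface Emitter<U> {
        void emit(List<U> batch);
    }

    /** Runs records through transform, filter, buffer, and emit. */
    static <T, U> void process(List<byte[]> rawRecords, Transformer<T, U> transformer,
                               Filter<T> filter, int batchSize, Emitter<U> emitter) {
        List<U> buffer = new ArrayList<>();
        for (byte[] raw : rawRecords) {
            T model = transformer.toClass(raw);
            if (!filter.keepRecord(model)) {
                continue; // excluded by the filter
            }
            buffer.add(transformer.fromClass(model));
            if (buffer.size() >= batchSize) {
                emitter.emit(new ArrayList<>(buffer));
                buffer.clear();
            }
        }
        if (!buffer.isEmpty()) {
            emitter.emit(new ArrayList<>(buffer)); // flush whatever remains
        }
    }

    public static void main(String[] args) {
        List<byte[]> raw = new ArrayList<>();
        for (String s : new String[] {"alice", "", "bob", "carol"}) {
            raw.add(s.getBytes(StandardCharsets.UTF_8));
        }
        Transformer<String, String> upperCase = new Transformer<String, String>() {
            @Override
            public String toClass(byte[] rawRecord) {
                return new String(rawRecord, StandardCharsets.UTF_8);
            }

            @Override
            public String fromClass(String record) {
                return record.toUpperCase();
            }
        };
        // Drop empty records, batch two at a time, and print each emitted batch.
        process(raw, upperCase, s -> !s.isEmpty(), 2, batch -> System.out.println(batch));
    }
}
```

In the library itself, this orchestration is the job of KinesisConnectorRecordProcessor; the loop above only mirrors the order of the stages.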

Implementation Highlights

The library includes implementations for use with Amazon DynamoDB, Amazon Redshift, Amazon S3, and Elasticsearch. This section provides a few notes about each connector type. For full details, see the samples and the Javadoc.

kinesis.connectors.dynamodb

  • DynamoDBTransformer: Implement the fromClass method to map your data model to a format that's compatible with the AmazonDynamoDB client (Map<String,AttributeValue>).
  • For more information on Amazon DynamoDB formats and putting items, see Working with Items Using the AWS SDK for Java Low-Level API in the Amazon DynamoDB Developer Guide.

kinesis.connectors.redshift

  • RedshiftTransformer: Implement the toDelimitedString method to output a delimited-string representation of your data model. The string must be compatible with an Amazon Redshift COPY command.
  • For more information about Amazon Redshift copy operations and manifests, see COPY and Using a manifest to specify data files in the Amazon Redshift Developer Guide.
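A delimited-string conversion of the kind toDelimitedString is expected to produce might look like the sketch below. The User model, its fields, and the pipe delimiter are assumptions made for illustration, and the class does not extend the library's RedshiftTransformer. The sketch does not escape delimiter characters that appear inside field values, so real data may need extra handling to stay compatible with COPY.

```java
/** Hypothetical data model and delimited-string conversion; not the library's RedshiftTransformer. */
final class DelimitedStringSketch {
    /** Illustrative record type with three columns. */
    static final class User {
        final int id;
        final String name;
        final String city;

        User(int id, String name, String city) {
            this.id = id;
            this.name = name;
            this.city = city;
        }
    }

    /** Joins the fields with the delimiter and terminates the row with a newline. */
    static String toDelimitedString(User user, char delimiter) {
        return new StringBuilder()
                .append(user.id).append(delimiter)
                .append(user.name).append(delimiter)
                .append(user.city).append('\n')
                .toString();
    }

    public static void main(String[] args) {
        // prints 1|alice|Seattle
        System.out.print(toDelimitedString(new User(1, "alice", "Seattle"), '|'));
    }
}
```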

kinesis.connectors.s3

  • S3Emitter: This class writes the buffer contents to a single file in Amazon S3. The file name is determined by the Amazon Kinesis sequence numbers of the first and last records in the buffer. For more information about sequence numbers, see Add Data to a Stream in the Amazon Kinesis Developer Guide.
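As a rough illustration of naming a file after the first and last sequence numbers in a buffer, consider the sketch below. The hyphen-separated format and the fileName helper are assumptions; the text above only states that the name is determined by those two sequence numbers. The sequence numbers shown are short placeholders.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper; the exact naming scheme used by S3Emitter may differ. */
final class S3FileNameSketch {
    /** Builds a name from the first and last sequence numbers, in buffer order. */
    static String fileName(List<String> sequenceNumbers) {
        if (sequenceNumbers.isEmpty()) {
            throw new IllegalArgumentException("buffer is empty");
        }
        String first = sequenceNumbers.get(0);
        String last = sequenceNumbers.get(sequenceNumbers.size() - 1);
        return first + "-" + last;
    }

    public static void main(String[] args) {
        // prints 1001-1003
        System.out.println(fileName(Arrays.asList("1001", "1002", "1003")));
    }
}
```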

kinesis.connectors.elasticsearch

  • KinesisMessageModelElasticsearchTransformer: This class provides an implementation for fromClass by transforming the record into JSON format and setting the index, type, and id to use for Elasticsearch.
  • BatchedKinesisMessageModelElasticsearchTransformer: This class extends KinesisMessageModelElasticsearchTransformer. If you batch events before putting data into Kinesis, this class will help you unpack the events before loading them into Elasticsearch.

Configuration

Set the following variables (common to all connector types) in kinesis.connectors.KinesisConnectorConfiguration:

  • AWSCredentialsProvider: Specify the implementation of AWSCredentialsProvider that supplies your AWS credentials.
  • APP_NAME: The Amazon Kinesis application name (not the connector application name) for use with kinesis.clientlibrary.lib.worker.KinesisClientLibConfiguration. For more information, see Developing Record Consumer Applications in the Amazon Kinesis Developer Guide.
  • KINESIS_ENDPOINT and KINESIS_INPUT_STREAM: The endpoint and name of the Kinesis stream that contains the data you're connecting to other AWS services.

Service-specific configuration variables are set in the respective emitter implementations (e.g., kinesis.connectors.dynamodb.DynamoDBEmitter).
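The common settings can be supplied as key-value properties and checked before a connector starts. The sketch below uses only java.util.Properties. The key names (appName, kinesisEndpoint, kinesisInputStream) and all values are assumptions for illustration, so compare them with the .properties file of the sample you are running; how the values reach KinesisConnectorConfiguration is shown in the samples.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

/** Loads and validates connector settings; key names here are assumed, not confirmed. */
final class ConnectorPropertiesSketch {
    // Placeholder content in .properties syntax.
    static final String EXAMPLE =
            "appName = myConnectorApp\n"
          + "kinesisEndpoint = https://kinesis.us-east-1.amazonaws.com\n"
          + "kinesisInputStream = myInputStream\n";

    /** Parses .properties-formatted text. */
    static Properties load(String text) throws IOException {
        Properties properties = new Properties();
        properties.load(new StringReader(text));
        return properties;
    }

    /** Fails fast if a required setting is absent or blank. */
    static String require(Properties properties, String key) {
        String value = properties.getProperty(key);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalArgumentException("Missing required property: " + key);
        }
        return value.trim();
    }

    public static void main(String[] args) throws IOException {
        Properties properties = load(EXAMPLE);
        System.out.println(require(properties, "appName"));
        System.out.println(require(properties, "kinesisEndpoint"));
        System.out.println(require(properties, "kinesisInputStream"));
    }
}
```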

Samples

The samples folder contains common classes for all the samples. The subfolders contain implementations of the pipeline and executor classes, along with Apache Ant build.xml files for running the samples.

Each sample uses the following files:

  • StreamSource.java: A simple application that sends records to an Amazon Kinesis stream.
  • users.txt: JSON records that are parsed line by line by the StreamSource program; the basis of KinesisMessageModel.
  • KinesisMessageModel.java: The data model for the users.txt records.
  • KinesisConnectorExecutor.java: An abstract implementation of an Amazon Kinesis connector application, which includes these features:
    • Configures the constructor, using the samples.utils package and the .properties file in the sample subfolder.
    • Provides the getKinesisConnectorRecordProcessorFactory() method, which is implemented by the executors in the sample subfolders; each executor returns an instance of a factory configured with the appropriate pipeline.
    • Provides a run() method for spawning a worker thread that uses the result of getKinesisConnectorRecordProcessorFactory().
  • .properties: The service-specific key-value properties for configuring the connector.
  • <service/type>Pipeline: The implementation of IKinesisConnectorPipeline for the sample. Each pipeline class returns a service-specific transformer and emitter, as well as simple buffer and filter implementations (BasicMemoryBuffer and AllPassFilter).
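The relationship between the abstract executor and the per-sample executors follows a template-method pattern, which can be sketched without the library. ExecutorSketch, ProcessorFactory, and getProcessorFactory() are hypothetical stand-ins for KinesisConnectorExecutor, the record processor factory, and getKinesisConnectorRecordProcessorFactory(); the real classes depend on the Amazon Kinesis Client Library and are more involved.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Template-method sketch of an executor; all names are hypothetical stand-ins. */
abstract class ExecutorSketch implements Runnable {
    /** Stand-in for a record processor factory. */
    interface ProcessorFactory {
        Runnable createProcessor();
    }

    /** Each concrete executor supplies a factory wired to its own pipeline. */
    protected abstract ProcessorFactory getProcessorFactory();

    /** Starts a worker thread that runs the processor, then waits for it to finish. */
    @Override
    public void run() {
        Thread worker = new Thread(getProcessorFactory().createProcessor(), "connector-worker");
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
    }

    public static void main(String[] args) {
        AtomicInteger processed = new AtomicInteger();
        // A sample-specific executor only has to provide its factory.
        ExecutorSketch executor = new ExecutorSketch() {
            @Override
            protected ProcessorFactory getProcessorFactory() {
                return () -> processed::incrementAndGet;
            }
        };
        executor.run();
        System.out.println("batches processed: " + processed.get());
    }
}
```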

Running a Sample

To run a sample, complete these steps:

  1. Edit the *.properties file, adding your AWS credentials and any necessary AWS resource configurations.
  2. Confirm that the required AWS resources exist, or set the flags in the *.properties file to indicate that resources should be created when the sample is run.
  3. Build the samples using Maven:
    cd samples
    mvn package
  4. After the build completes, use the scripts in target/appassembler/bin to start each of the samples.

Release Notes

Release 1.3.0 (November 17, 2016)

  • Upgraded the Amazon Kinesis Client Library to version 1.7.2.
  • Upgraded the AWS Java SDK to 1.11.14.
  • Migrated the sample to use Maven for dependency management and execution.
  • Maven Artifact Signing Change

Release 1.2.0 (June 23, 2015)

  • Upgraded KCL to 1.4.0
  • Added a pipelined record processor that decouples the Amazon Kinesis GetRecords() API call from IRecordProcessor's processRecords() call for efficiency.

Release 1.1.2 (May 27, 2015)

  • Upgraded AWS SDK to 1.9, KCL to 1.3.0
  • Added pom.xml file

Release 1.1.1 (Sep 11, 2014)

  • Added connector to Elasticsearch

Release 1.1 (June 30, 2014)

  • Added time threshold to IBuffer
  • Added region name support

Related Resources

Amazon Kinesis Developer Guide
Amazon Kinesis API Reference

Amazon DynamoDB Developer Guide
Amazon DynamoDB API Reference

Amazon Redshift Documentation

Amazon S3 Documentation and Videos

Elasticsearch

License

This library is licensed under the Apache 2.0 License.
