• Stars
    star
    1,181
  • Rank 39,604 (Top 0.8 %)
  • Language
    Python
  • License
    Apache License 2.0
  • Created over 7 years ago
  • Updated over 1 year ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

Sequence-to-sequence framework with a focus on Neural Machine Translation based on PyTorch

Sockeye

PyPI version GitHub license GitHub issues Documentation Status Torch Nightly

Sockeye is an open-source sequence-to-sequence framework for Neural Machine Translation built on PyTorch. It implements distributed training and optimized inference for state-of-the-art models, powering Amazon Translate and other MT applications. Recent developments and changes are tracked in our CHANGELOG.

For a quickstart guide to training a standard NMT model on any size of data, see the WMT 2014 English-German tutorial.

For questions and issue reports, please file an issue on GitHub.

Version 3.1.x: PyTorch only

With version 3.1.x, we remove support for MXNet 2.x. Models trained with PyTorch and Sockeye 3.0.x remain compatible with Sockeye 3.1.x. Models trained with 2.3.x (using MXNet) and converted to PyTorch with Sockeye 3.0.x's conversion tool can NOT be used with Sockeye 3.1.x.

Version 3.0.0: Concurrent PyTorch and MXNet support

Starting with version 3.0.0, Sockeye is also based on PyTorch. We maintain backwards compatibility with MXNet models of version 2.3.x with 3.0.x. If MXNet 2.x is installed, Sockeye can run both with PyTorch or MXNet.

All models trained with 2.3.x (using MXNet) can be converted to models running with PyTorch using the converter CLI (sockeye.mx_to_pt). This will create a PyTorch parameter file (<model>/params.best) and backup the existing MXNet parameter file to <model>/params.best.mx. Note that this only applies to fully-trained models that are to be used for inference. Continued training of an MXNet model with PyTorch is not supported (because we do not convert training and optimizer states). sockeye.mx_to_pt requires MXNet to be installed into the environment.

All CLIs of Version 3.0.0 now use PyTorch by default, e.g. sockeye-{train,translate,score}. MXNet-based CLIs/modules are still operational and accessible via sockeye-{train,translate,score}-mx.

Sockeye 3 can be installed and run without MXNet, but if installed, an extended test suite is executed to ensure equivalence between PyTorch and MXNet models. Note that running Sockeye 3.0.0 with MXNet requires MXNet 2.x to be installed (pip install --pre -f https://dist.mxnet.io/python 'mxnet>=2.0.0b2021')

Installation

Download the current version of Sockeye:

git clone https://github.com/awslabs/sockeye.git

Install the sockeye module and its dependencies:

cd sockeye && pip3 install --editable .

For faster GPU training, install NVIDIA Apex. NVIDIA also provides PyTorch Docker containers that include Apex.

Documentation

Older versions

  • Sockeye 3.0, based on PyTorch & MXNet 2.x is available in the sockeye_30 branch.
  • Sockeye 2.x, based on the MXNet Gluon API, is available in the sockeye_2 branch.
  • Sockeye 1.x, based on the MXNet Module API, is available in the sockeye_1 branch.

Citation

For more information about Sockeye, see our papers (BibTeX).

Sockeye 3.x

Felix Hieber, Michael Denkowski, Tobias Domhan, Barbara Darques Barros, Celina Dong Ye, Xing Niu, Cuong Hoang, Ke Tran, Benjamin Hsu, Maria Nadejde, Surafel Lakew, Prashant Mathur, Anna Currey, Marcello Federico. Sockeye 3: Fast Neural Machine Translation with PyTorch. ArXiv e-prints.

Sockeye 2.x

Tobias Domhan, Michael Denkowski, David Vilar, Xing Niu, Felix Hieber, Kenneth Heafield. The Sockeye 2 Neural Machine Translation Toolkit at AMTA 2020. Proceedings of the 14th Conference of the Association for Machine Translation in the Americas (AMTA'20).

Felix Hieber, Tobias Domhan, Michael Denkowski, David Vilar. Sockeye 2: A Toolkit for Neural Machine Translation. Proceedings of the 22nd Annual Conference of the European Association for Machine Translation, Project Track (EAMT'20).

Sockeye 1.x

Felix Hieber, Tobias Domhan, Michael Denkowski, David Vilar, Artem Sokolov, Ann Clifton, Matt Post. The Sockeye Neural Machine Translation Toolkit at AMTA 2018. Proceedings of the 13th Conference of the Association for Machine Translation in the Americas (AMTA'18).

Felix Hieber, Tobias Domhan, Michael Denkowski, David Vilar, Artem Sokolov, Ann Clifton and Matt Post. 2017. Sockeye: A Toolkit for Neural Machine Translation. ArXiv e-prints.

Research with Sockeye

Sockeye has been used for both academic and industrial research. A list of known publications that use Sockeye is shown below. If you know more, please let us know or submit a pull request (last updated: May 2022).

2023

  • Zhang, Xuan, Kevin Duh, Paul McNamee. "A Hyperparameter Optimization Toolkit for Neural Machine Translation Research". Proceedings of ACL (2023).

2022

  • Currey, Anna, Maria NΔƒdejde, Raghavendra Pappagari, Mia Mayer, Stanislas Lauly, Xing Niu, Benjamin Hsu, Georgiana Dinu. "MT-GenEval: A Counterfactual and Contextual Dataset for Evaluating Gender Accuracy in Machine Translation". Proceedings of EMNLP (2022).
  • Domhan, Tobias, Eva Hasler, Ke Tran, Sony Trenous, Bill Byrne and Felix Hieber. "The Devil is in the Details: On the Pitfalls of Vocabulary Selection in Neural Machine Translation". Proceedings of NAACL-HLT (2022)
  • Fischer, Lukas, Patricia Scheurer, Raphael Schwitter, Martin Volk. "Machine Translation of 16th Century Letters from Latin to German". Workshop on Language Technologies for Historical and Ancient Languages (2022).
  • Knowles, Rebecca, Patrick Littell. "Translation Memories as Baselines for Low-Resource Machine Translation". Proceedings of LREC (2022)
  • McNamee, Paul, Kevin Duh. "The Multilingual Microblog Translation Corpus: Improving and Evaluating Translation of User-Generated Text". Proceedings of LREC (2022)
  • Nadejde Maria, Anna Currey, Benjamin Hsu, Xing Niu, Marcello Federico, Georgiana Dinu. "CoCoA-MT: A Dataset and Benchmark for Contrastive Controlled MT with Application to Formality". Proceedings of NAACL (2022).
  • Weller-Di Marco, Marion, Matthias Huck, Alexander Fraser. "Modeling Target-Side Morphology in Neural Machine Translation: A Comparison of Strategies ". arXiv preprint arXiv:2203.13550 (2022)

2021

  • Bergmanis, Toms, Mārcis Pinnis. "Facilitating Terminology Translation with Target Lemma Annotations". arXiv preprint arXiv:2101.10035 (2021)
  • Briakou, Eleftheria, Marine Carpuat. "Beyond Noise: Mitigating the Impact of Fine-grained Semantic Divergences on Neural Machine Translation". arXiv preprint arXiv:2105.15087 (2021)
  • Hasler, Eva, Tobias Domhan, Sony Trenous, Ke Tran, Bill Byrne, Felix Hieber. "Improving the Quality Trade-Off for Neural Machine Translation Multi-Domain Adaptation". Proceedings of EMNLP (2021)
  • Tang, Gongbo, Philipp RΓΆnchen, Rico Sennrich, Joakim Nivre. "Revisiting Negation in Neural Machine Translation". Transactions of the Association for Computation Linguistics 9 (2021)
  • Vu, Thuy, Alessandro Moschitti. "Machine Translation Customization via Automatic Training Data Selection from the Web". arXiv preprint arXiv:2102.1024 (2021)
  • Xu, Weijia, Marine Carpuat. "EDITOR: An Edit-Based Transformer with Repositioning for Neural Machine Translation with Soft Lexical Constraints." Transactions of the Association for Computation Linguistics 9 (2021)
  • MΓΌller, Mathias, Rico Sennrich. "Understanding the Properties of Minimum Bayes Risk Decoding in Neural Machine Translation". Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers) (2021)
  • PopoviΔ‡, Maja, Alberto Poncelas. "On Machine Translation of User Reviews." Proceedings of RANLP (2021)
  • PopoviΔ‡, Maja. "On nature and causes of observed MT errors." Proceedings of the 18th MT Summit (Volume 1: Research Track) (2021)
  • Jain, Nishtha, Maja PopoviΔ‡, Declan Groves, Eva Vanmassenhove. "Generating Gender Augmented Data for NLP." Proceedings of the 3rd Workshop on Gender Bias in Natural Language Processing (2021)
  • Vilar, David, Marcello Federico. "A Statistical Extension of Byte-Pair Encoding." Proceedings of IWSLT (2021)

2020

  • Dinu, Georgiana, Prashant Mathur, Marcello Federico, Stanislas Lauly, Yaser Al-Onaizan. "Joint translation and unit conversion for end-to-end localization." Proceedings of IWSLT (2020)
  • Exel, Miriam, Bianka Buschbeck, Lauritz Brandt, Simona Doneva. "Terminology-Constrained Neural Machine Translation at SAP". Proceedings of EAMT (2020).
  • Hisamoto, Sorami, Matt Post, Kevin Duh. "Membership Inference Attacks on Sequence-to-Sequence Models: Is My Data In Your Machine Translation System?" Transactions of the Association for Computational Linguistics, Volume 8 (2020)
  • Naradowsky, Jason, Xuan Zhan, Kevin Duh. "Machine Translation System Selection from Bandit Feedback." arXiv preprint arXiv:2002.09646 (2020)
  • Niu, Xing, Prashant Mathur, Georgiana Dinu, Yaser Al-Onaizan. "Evaluating Robustness to Input Perturbations for Neural Machine Translation". arXiv preprint arXiv:2005.00580 (2020)
  • Niu, Xing, Marine Carpuat. "Controlling Neural Machine Translation Formality with Synthetic Supervision." Proceedings of AAAI (2020)
  • Keung, Phillip, Julian Salazar, Yichao Liu, Noah A. Smith. "Unsupervised Bitext Mining and Translation via Self-Trained Contextual Embeddings." arXiv preprint arXiv:2010.07761 (2020).
  • Sokolov, Alex, Tracy Rohlin, Ariya Rastrow. "Neural Machine Translation for Multilingual Grapheme-to-Phoneme Conversion." arXiv preprint arXiv:2006.14194 (2020)
  • Stafanovičs, ArtΕ«rs, Toms Bergmanis, Mārcis Pinnis. "Mitigating Gender Bias in Machine Translation with Target Gender Annotations." arXiv preprint arXiv:2010.06203 (2020)
  • Stojanovski, Dario, Alexander Fraser. "Addressing Zero-Resource Domains Using Document-Level Context in Neural Machine Translation." arXiv preprint arXiv preprint arXiv:2004.14927 (2020)
  • Stojanovski, Dario, Benno Krojer, Denis Peskov, Alexander Fraser. "ContraCAT: Contrastive Coreference Analytical Templates for Machine Translation". Proceedings of COLING (2020)
  • Zhang, Xuan, Kevin Duh. "Reproducible and Efficient Benchmarks for Hyperparameter Optimization of Neural Machine Translation Systems." Transactions of the Association for Computational Linguistics, Volume 8 (2020)
  • Swe Zin Moe, Ye Kyaw Thu, Hnin Aye Thant, Nandar Win Min, and Thepchai Supnithi, "Unsupervised Neural Machine Translation between Myanmar Sign Language and Myanmar Language", Journal of Intelligent Informatics and Smart Technology, April 1st Issue, 2020, pp. 53-61. (Submitted December 21, 2019; accepted March 6, 2020; revised March 16, 2020; published online April 30, 2020)
  • Thazin Myint Oo, Ye Kyaw Thu, Khin Mar Soe and Thepchai Supnithi, "Neural Machine Translation between Myanmar (Burmese) and Dawei (Tavoyan)", In Proceedings of the 18th International Conference on Computer Applications (ICCA 2020), Feb 27-28, 2020, Yangon, Myanmar, pp. 219-227
  • MΓΌller, Mathias, Annette Rios, Rico Sennrich. "Domain Robustness in Neural Machine Translation." Proceedings of AMTA (2020)
  • Rios, Annette, Mathias MΓΌller, Rico Sennrich. "Subword Segmentation and a Single Bridge Language Affect Zero-Shot Neural Machine Translation." Proceedings of the 5th WMT: Research Papers (2020)
  • PopoviΔ‡, Maja, Alberto Poncelas. "Neural Machine Translation between similar South-Slavic languages." Proceedings of the 5th WMT: Research Papers (2020)
  • PopoviΔ‡, Maja, Alberto Poncelas. "Extracting correctly aligned segments from unclean parallel data using character n-gram matching." Proceedings of Conference on Language Technologies & Digital Humanities (JTDH 2020).
  • PopoviΔ‡, Maja, Alberto Poncelas, Marija Brkic, Andy Way. "Neural Machine Translation for translating into Croatian and Serbian." Proceedings of the 7th Workshop on NLP for Similar Languages, Varieties and Dialects (2020)

2019

  • Agrawal, Sweta, Marine Carpuat. "Controlling Text Complexity in Neural Machine Translation." Proceedings of EMNLP (2019)
  • Beck, Daniel, Trevor Cohn, Gholamreza Haffari. "Neural Speech Translation using Lattice Transformations and Graph Networks." Proceedings of TextGraphs-13 (EMNLP 2019)
  • Currey, Anna, Kenneth Heafield. "Zero-Resource Neural Machine Translation with Monolingual Pivot Data." Proceedings of EMNLP (2019)
  • Gupta, Prabhakar, Mayank Sharma. "Unsupervised Translation Quality Estimation for Digital Entertainment Content Subtitles." IEEE International Journal of Semantic Computing (2019)
  • Hu, J. Edward, Huda Khayrallah, Ryan Culkin, Patrick Xia, Tongfei Chen, Matt Post, and Benjamin Van Durme. "Improved Lexically Constrained Decoding for Translation and Monolingual Rewriting." Proceedings of NAACL-HLT (2019)
  • Rosendahl, Jan, Christian Herold, Yunsu Kim, Miguel GraΓ§a,Weiyue Wang, Parnia Bahar, Yingbo Gao and Hermann Ney β€œThe RWTH Aachen University Machine Translation Systems for WMT 2019” Proceedings of the 4th WMT: Research Papers (2019)
  • Thompson, Brian, Jeremy Gwinnup, Huda Khayrallah, Kevin Duh, and Philipp Koehn. "Overcoming catastrophic forgetting during domain adaptation of neural machine translation." Proceedings of NAACL-HLT 2019 (2019)
  • TΓ€ttar, Andre, Elizaveta Korotkova, Mark Fishel β€œUniversity of Tartu’s Multilingual Multi-domain WMT19 News Translation Shared Task Submission” Proceedings of 4th WMT: Research Papers (2019)
  • Thazin Myint Oo, Ye Kyaw Thu and Khin Mar Soe, "Neural Machine Translation between Myanmar (Burmese) and Rakhine (Arakanese)", In Proceedings of the Sixth Workshop on NLP for Similar Languages, Varieties and Dialects, NAACL-2019, June 7th 2019, Minneapolis, United States, pp. 80-88

2018

  • Domhan, Tobias. "How Much Attention Do You Need? A Granular Analysis of Neural Machine Translation Architectures". Proceedings of 56th ACL (2018)
  • Kim, Yunsu, Yingbo Gao, and Hermann Ney. "Effective Cross-lingual Transfer of Neural Machine Translation Models without Shared Vocabularies." arXiv preprint arXiv:1905.05475 (2019)
  • Korotkova, Elizaveta, Maksym Del, and Mark Fishel. "Monolingual and Cross-lingual Zero-shot Style Transfer." arXiv preprint arXiv:1808.00179 (2018)
  • Niu, Xing, Michael Denkowski, and Marine Carpuat. "Bi-directional neural machine translation with synthetic parallel data." arXiv preprint arXiv:1805.11213 (2018)
  • Niu, Xing, Sudha Rao, and Marine Carpuat. "Multi-Task Neural Models for Translating Between Styles Within and Across Languages." COLING (2018)
  • Post, Matt and David Vilar. "Fast Lexically Constrained Decoding with Dynamic Beam Allocation for Neural Machine Translation." Proceedings of NAACL-HLT (2018)
  • Schamper, Julian, Jan Rosendahl, Parnia Bahar, Yunsu Kim, Arne Nix, and Hermann Ney. "The RWTH Aachen University Supervised Machine Translation Systems for WMT 2018." Proceedings of the 3rd WMT: Shared Task Papers (2018)
  • Schulz, Philip, Wilker Aziz, and Trevor Cohn. "A stochastic decoder for neural machine translation." arXiv preprint arXiv:1805.10844 (2018)
  • Tamer, Alkouli, Gabriel Bretschner, and Hermann Ney. "On The Alignment Problem In Multi-Head Attention-Based Neural Machine Translation." Proceedings of the 3rd WMT: Research Papers (2018)
  • Tang, Gongbo, Rico Sennrich, and Joakim Nivre. "An Analysis of Attention Mechanisms: The Case of Word Sense Disambiguation in Neural Machine Translation." Proceedings of 3rd WMT: Research Papers (2018)
  • Thompson, Brian, Huda Khayrallah, Antonios Anastasopoulos, Arya McCarthy, Kevin Duh, Rebecca Marvin, Paul McNamee, Jeremy Gwinnup, Tim Anderson, and Philipp Koehn. "Freezing Subnetworks to Analyze Domain Adaptation in Neural Machine Translation." arXiv preprint arXiv:1809.05218 (2018)
  • Vilar, David. "Learning Hidden Unit Contribution for Adapting Neural Machine Translation Models." Proceedings of NAACL-HLT (2018)
  • Vyas, Yogarshi, Xing Niu and Marine Carpuat β€œIdentifying Semantic Divergences in Parallel Text without Annotations”. Proceedings of NAACL-HLT (2018)
  • Wang, Weiyue, Derui Zhu, Tamer Alkhouli, Zixuan Gan, and Hermann Ney. "Neural Hidden Markov Model for Machine Translation". Proceedings of 56th ACL (2018)
  • Zhang, Xuan, Gaurav Kumar, Huda Khayrallah, Kenton Murray, Jeremy Gwinnup, Marianna J Martindale, Paul McNamee, Kevin Duh, and Marine Carpuat. "An Empirical Exploration of Curriculum Learning for Neural Machine Translation." arXiv preprint arXiv:1811.00739 (2018)
  • Swe Zin Moe, Ye Kyaw Thu, Hnin Aye Thant and Nandar Win Min, "Neural Machine Translation between Myanmar Sign Language and Myanmar Written Text", In the second Regional Conference on Optical character recognition and Natural language processing technologies for ASEAN languages 2018 (ONA 2018), December 13-14, 2018, Phnom Penh, Cambodia.
  • Tang, Gongbo, Mathias MΓΌller, Annette Rios and Rico Sennrich. "Why Self-attention? A Targeted Evaluation of Neural Machine Translation Architectures." Proceedings of EMNLP (2018)

2017

  • Domhan, Tobias and Felix Hieber. "Using target-side monolingual data for neural machine translation through multi-task learning." Proceedings of EMNLP (2017).

More Repositories

1

git-secrets

Prevents you from committing secrets and credentials into git repositories
Shell
11,616
star
2

llrt

LLRT (Low Latency Runtime) is an experimental, lightweight JavaScript runtime designed to address the growing demand for fast and efficient Serverless applications.
JavaScript
8,074
star
3

aws-shell

An integrated shell for working with the AWS CLI.
Python
7,182
star
4

mountpoint-s3

A simple, high-throughput file client for mounting an Amazon S3 bucket as a local file system.
Rust
4,475
star
5

autogluon

AutoGluon: AutoML for Image, Text, and Tabular Data
Python
4,348
star
6

gluonts

Probabilistic time series modeling in Python
Python
3,686
star
7

aws-sdk-rust

AWS SDK for the Rust Programming Language
Rust
3,014
star
8

deequ

Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
Scala
2,871
star
9

aws-lambda-rust-runtime

A Rust runtime for AWS Lambda
Rust
2,829
star
10

amazon-redshift-utils

Amazon Redshift Utils contains utilities, scripts and view which are useful in a Redshift environment
Python
2,643
star
11

diagram-maker

A library to display an interactive editor for any graph-like data.
TypeScript
2,359
star
12

amazon-ecr-credential-helper

Automatically gets credentials for Amazon ECR on docker push/docker pull
Go
2,261
star
13

amazon-eks-ami

Packer configuration for building a custom EKS AMI
Shell
2,164
star
14

aws-lambda-powertools-python

A developer toolkit to implement Serverless best practices and increase developer velocity.
Python
2,148
star
15

aws-well-architected-labs

Hands on labs and code to help you learn, measure, and build using architectural best practices.
Python
1,834
star
16

aws-config-rules

[Node, Python, Java] Repository of sample Custom Rules for AWS Config.
Python
1,473
star
17

smithy

Smithy is a protocol-agnostic interface definition language and set of tools for generating clients, servers, and documentation for any programming language.
Java
1,356
star
18

aws-support-tools

Tools and sample code provided by AWS Premium Support.
Python
1,290
star
19

open-data-registry

A registry of publicly available datasets on AWS
Python
1,199
star
20

aws-lambda-powertools-typescript

Powertools is a developer toolkit to implement Serverless best practices and increase developer velocity.
TypeScript
1,179
star
21

dgl-ke

High performance, easy-to-use, and scalable package for learning large-scale knowledge graph embeddings.
Python
1,144
star
22

aws-sdk-ios-samples

This repository has samples that demonstrate various aspects of the AWS SDK for iOS, you can get the SDK source on Github https://github.com/aws-amplify/aws-sdk-ios/
Swift
1,038
star
23

amazon-kinesis-video-streams-webrtc-sdk-c

Amazon Kinesis Video Streams Webrtc SDK is for developers to install and customize realtime communication between devices and enable secure streaming of video, audio to Kinesis Video Streams.
C
1,031
star
24

aws-sdk-android-samples

This repository has samples that demonstrate various aspects of the AWS SDK for Android, you can get the SDK source on Github https://github.com/aws-amplify/aws-sdk-android/
Java
1,018
star
25

aws-solutions-constructs

The AWS Solutions Constructs Library is an open-source extension of the AWS Cloud Development Kit (AWS CDK) that provides multi-service, well-architected patterns for quickly defining solutions
TypeScript
1,013
star
26

aws-lambda-go-api-proxy

lambda-go-api-proxy makes it easy to port APIs written with Go frameworks such as Gin (https://gin-gonic.github.io/gin/ ) to AWS Lambda and Amazon API Gateway.
Go
1,005
star
27

aws-cfn-template-flip

Tool for converting AWS CloudFormation templates between JSON and YAML formats.
Python
991
star
28

eks-node-viewer

EKS Node Viewer
Go
947
star
29

multi-model-server

Multi Model Server is a tool for serving neural net models for inference
Java
936
star
30

ec2-spot-labs

Collection of tools and code examples to demonstrate best practices in using Amazon EC2 Spot Instances.
Jupyter Notebook
905
star
31

aws-mobile-appsync-sdk-js

JavaScript library files for Offline, Sync, Sigv4. includes support for React Native
TypeScript
902
star
32

aws-saas-boost

AWS SaaS Boost is a ready-to-use toolset that removes the complexity of successfully running SaaS workloads in the AWS cloud.
Java
901
star
33

fargatecli

CLI for AWS Fargate
Go
891
star
34

fortuna

A Library for Uncertainty Quantification.
Python
882
star
35

aws-api-gateway-developer-portal

A Serverless Developer Portal for easily publishing and cataloging APIs
JavaScript
879
star
36

ecs-refarch-continuous-deployment

ECS Reference Architecture for creating a flexible and scalable deployment pipeline to Amazon ECS using AWS CodePipeline
Shell
842
star
37

dynamodb-data-mapper-js

A schema-based data mapper for Amazon DynamoDB.
TypeScript
818
star
38

goformation

GoFormation is a Go library for working with CloudFormation templates.
Go
812
star
39

flowgger

A fast data collector in Rust
Rust
796
star
40

aws-js-s3-explorer

AWS JavaScript S3 Explorer is a JavaScript application that uses AWS's JavaScript SDK and S3 APIs to make the contents of an S3 bucket easy to browse via a web browser.
HTML
771
star
41

aws-icons-for-plantuml

PlantUML sprites, macros, and other includes for Amazon Web Services services and resources
Python
737
star
42

aws-devops-essential

In few hours, quickly learn how to effectively leverage various AWS services to improve developer productivity and reduce the overall time to market for new product capabilities.
Shell
674
star
43

aws-apigateway-lambda-authorizer-blueprints

Blueprints and examples for Lambda-based custom Authorizers for use in API Gateway.
C#
660
star
44

amazon-ecs-nodejs-microservices

Reference architecture that shows how to take a Node.js application, containerize it, and deploy it as microservices on Amazon Elastic Container Service.
Shell
650
star
45

aws-deployment-framework

The AWS Deployment Framework (ADF) is an extensive and flexible framework to manage and deploy resources across multiple AWS accounts and regions based on AWS Organizations.
Python
636
star
46

amazon-kinesis-client

Client library for Amazon Kinesis
Java
621
star
47

aws-lambda-web-adapter

Run web applications on AWS Lambda
Rust
610
star
48

dgl-lifesci

Python package for graph neural networks in chemistry and biology
Python
594
star
49

data-on-eks

DoEKS is a tool to build, deploy and scale Data & ML Platforms on Amazon EKS
HCL
590
star
50

aws-security-automation

Collection of scripts and resources for DevSecOps and Automated Incident Response Security
Python
585
star
51

aws-glue-libs

AWS Glue Libraries are additions and enhancements to Spark for ETL operations.
Python
565
star
52

python-deequ

Python API for Deequ
Python
535
star
53

aws-athena-query-federation

The Amazon Athena Query Federation SDK allows you to customize Amazon Athena with your own data sources and code.
Java
507
star
54

amazon-dynamodb-lock-client

The AmazonDynamoDBLockClient is a general purpose distributed locking library built on top of DynamoDB. It supports both coarse-grained and fine-grained locking.
Java
469
star
55

shuttle

Shuttle is a library for testing concurrent Rust code
Rust
465
star
56

ami-builder-packer

An example of an AMI Builder using CI/CD with AWS CodePipeline, AWS CodeBuild, Hashicorp Packer and Ansible.
465
star
57

route53-dynamic-dns-with-lambda

A Dynamic DNS system built with API Gateway, Lambda & Route 53.
Python
461
star
58

aws-servicebroker

AWS Service Broker
Python
461
star
59

diagram-as-code

Diagram-as-code for AWS architecture.
Go
459
star
60

amazon-ecs-local-container-endpoints

A container that provides local versions of the ECS Task Metadata Endpoint and ECS Task IAM Roles Endpoint.
Go
456
star
61

datawig

Imputation of missing values in tables.
JavaScript
454
star
62

aws-jwt-verify

JS library for verifying JWTs signed by Amazon Cognito, and any OIDC-compatible IDP that signs JWTs with RS256, RS384, and RS512
TypeScript
452
star
63

aws-config-rdk

The AWS Config Rules Development Kit helps developers set up, author and test custom Config rules. It contains scripts to enable AWS Config, create a Config rule and test it with sample ConfigurationItems.
Python
444
star
64

ecs-refarch-service-discovery

An EC2 Container Service Reference Architecture for providing Service Discovery to containers using CloudWatch Events, Lambda and Route 53 private hosted zones.
Go
444
star
65

ssosync

Populate AWS SSO directly with your G Suite users and groups using either a CLI or AWS Lambda
Go
443
star
66

handwritten-text-recognition-for-apache-mxnet

This repository lets you train neural networks models for performing end-to-end full-page handwriting recognition using the Apache MXNet deep learning frameworks on the IAM Dataset.
Jupyter Notebook
442
star
67

awscli-aliases

Repository for AWS CLI aliases.
437
star
68

snapchange

Lightweight fuzzing of a memory snapshot using KVM
Rust
436
star
69

threat-composer

A simple threat modeling tool to help humans to reduce time-to-value when threat modeling
TypeScript
426
star
70

aws-security-assessment-solution

An AWS tool to help you create a point in time assessment of your AWS account using Prowler and Scout as well as optional AWS developed ransomware checks.
423
star
71

lambda-refarch-mapreduce

This repo presents a reference architecture for running serverless MapReduce jobs. This has been implemented using AWS Lambda and Amazon S3.
JavaScript
422
star
72

aws-lambda-cpp

C++ implementation of the AWS Lambda runtime
C++
409
star
73

pgbouncer-fast-switchover

Adds query routing and rewriting extensions to pgbouncer
C
396
star
74

aws-sdk-kotlin

Multiplatform AWS SDK for Kotlin
Kotlin
392
star
75

aws-cloudsaga

AWS CloudSaga - Simulate security events in AWS
Python
389
star
76

amazon-kinesis-producer

Amazon Kinesis Producer Library
C++
385
star
77

soci-snapshotter

Go
383
star
78

serverless-photo-recognition

A collection of 3 lambda functions that are invoked by Amazon S3 or Amazon API Gateway to analyze uploaded images with Amazon Rekognition and save picture labels to ElasticSearch (written in Kotlin)
Kotlin
378
star
79

amazon-sagemaker-workshop

Amazon SageMaker workshops: Introduction, TensorFlow in SageMaker, and more
Jupyter Notebook
378
star
80

serverless-rules

Compilation of rules to validate infrastructure-as-code templates against recommended practices for serverless applications.
Go
378
star
81

logstash-output-amazon_es

Logstash output plugin to sign and export logstash events to Amazon Elasticsearch Service
Ruby
374
star
82

kinesis-aggregation

AWS libraries/modules for working with Kinesis aggregated record data
Java
370
star
83

smithy-rs

Code generation for the AWS SDK for Rust, as well as server and generic smithy client generation.
Rust
369
star
84

syne-tune

Large scale and asynchronous Hyperparameter and Architecture Optimization at your fingertips.
Python
367
star
85

graphstorm

Enterprise graph machine learning framework for billion-scale graphs for ML scientists and data scientists.
Python
366
star
86

dynamodb-transactions

Java
354
star
87

amazon-kinesis-client-python

Amazon Kinesis Client Library for Python
Python
354
star
88

aws-sigv4-proxy

This project signs and proxies HTTP requests with Sigv4
Go
351
star
89

aws-serverless-data-lake-framework

Enterprise-grade, production-hardened, serverless data lake on AWS
Python
349
star
90

amazon-kinesis-agent

Continuously monitors a set of log files and sends new data to the Amazon Kinesis Stream and Amazon Kinesis Firehose in near-real-time.
Java
342
star
91

rds-snapshot-tool

The Snapshot Tool for Amazon RDS automates the task of creating manual snapshots, copying them into a different account and a different region, and deleting them after a specified number of days
Python
337
star
92

amazon-kinesis-scaling-utils

The Kinesis Scaling Utility is designed to give you the ability to scale Amazon Kinesis Streams in the same way that you scale EC2 Auto Scaling groups – up or down by a count or as a percentage of the total fleet. You can also simply scale to an exact number of Shards. There is no requirement for you to manage the allocation of the keyspace to Shards when using this API, as it is done automatically.
Java
333
star
93

amazon-kinesis-video-streams-producer-sdk-cpp

Amazon Kinesis Video Streams Producer SDK for C++ is for developers to install and customize for their connected camera and other devices to securely stream video, audio, and time-encoded data to Kinesis Video Streams.
C++
332
star
94

landing-zone-accelerator-on-aws

Deploy a multi-account cloud foundation to support highly-regulated workloads and complex compliance requirements.
TypeScript
330
star
95

statelint

A Ruby gem that provides a command-line validator for Amazon States Language JSON files.
Ruby
330
star
96

generative-ai-cdk-constructs

AWS Generative AI CDK Constructs are sample implementations of AWS CDK for common generative AI patterns.
TypeScript
327
star
97

route53-infima

Library for managing service-level fault isolation using Amazon Route 53.
Java
326
star
98

aws-automated-incident-response-and-forensics

326
star
99

mxboard

Logging MXNet data for visualization in TensorBoard.
Python
326
star
100

crossplane-on-eks

Crossplane bespoke composition blueprints for AWS resources
HCL
319
star