• Stars
    star
    232
  • Rank 172,847 (Top 4 %)
  • Language
    Python
  • License
    Apache License 2.0
  • Created about 7 years ago
  • Updated over 1 year ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

Handwritten Korean Character Recognition with TensorFlow and Android

WARNING: This repository is no longer maintained ⚠️

This repository will not be updated. The repository will be kept available in read-only mode.

Handwritten Korean Character Recognition with TensorFlow and Android

Read this in other languages: 한국어,日本語.

Hangul, the Korean alphabet, has 19 consonant and 21 vowel letters. Combinations of these letters give a total of 11,172 possible Hangul syllables/characters. However, only a small subset of these are typically used.

This code pattern will cover the creation process of an Android application that will utilize a TensorFlow model trained to recognize Korean syllables. In this application, users will be able to draw a Korean syllable on their mobile device, and the application will attempt to infer what the character is by using the trained model. Furthermore, users will be able to form words or sentences in the application which they can then translate using the Watson Language Translator service.

Demo App

The following steps will be covered:

  1. Generating image data using free Hangul-supported fonts found online and elastic distortion.
  2. Converting images to TFRecords format to be used for input and training of the model.
  3. Training and saving the model.
  4. Using the saved model in a simple Android application.
  5. Connecting the Watson Language Translator service to translate the characters.

architecture

Flow

  1. The user downloads several Korean fonts to use for data generation.
  2. The images generated from the fonts are fed into a TensorFlow model for training.
  3. The user draws a Korean character on their Android device.
  4. The drawn character is recognized using the previously trained TensorFlow model and the Android TensorFlow Inference Interface.
  5. A string of the classified Korean characters is sent to the Watson Language Translator service to retrieve an English translation.

Included Components

  • Watson Language Translator: An IBM Cloud service that converts text input in one language into a destination language for the end user using background from domain-specific models.
  • TensorFlow: An open-source software library for Machine Intelligence.
  • Android: An open-source mobile operating system based on the Linux kernel.

Featured Technologies

  • Artificial Intelligence: Cognitive technologies that can understand, reason, learn, and interact like humans.
  • Mobile: An environment to develop apps and enable engagements that are designed specifically for mobile users.

Watch the Video

Steps

Run locally

Follow these steps to setup and run this code pattern. The steps are described in detail below.

  1. Clone the repo
  2. Install dependencies
  3. Generate Image Data
  4. Convert Images to TFRecords
  5. Train the Model
  6. Try Out the Model
  7. Create the Android Application

1. Clone the repo

Clone the tensorflow-hangul-recognition locally. In a terminal, run:

git clone https://github.com/IBM/tensorflow-hangul-recognition

Now go to the cloned repo directory:

cd tensorflow-hangul-recognition

2. Install dependencies

The general recommendation for Python development is to use a virtual environment (venv). To install and initialize a virtual environment, use the venv module on Python 3 (you install the virtualenv library for Python 2.7):

# Create the virtual environment using Python. Use one of the two commands depending on your Python version.
# Note, it may be named python3 on your system.

$ python -m venv mytestenv       # Python 3.X
$ virtualenv mytestenv           # Python 2.X

# Now source the virtual environment. Use one of the two commands depending on your OS.

$ source mytestenv/bin/activate  # Mac or Linux
$ ./mytestenv/Scripts/activate   # Windows PowerShell

Install the Python requirements for this code pattern. Run:

pip install -r requirements.txt

Note: If you have an Nvidia GPU and want to use it in training, then you will need to install tensorflow-gpu instead of tensorflow. Details for installation can be found here.

TIP 💡 To terminate the virtual environment use the deactivate command.

3. Generate Image Data

In order to train a decent model, having copious amounts of data is necessary. However, getting a large enough dataset of actual handwritten Korean characters is challenging to find and cumbersome to create.

One way to deal with this data issue is to programmatically generate the data yourself, taking advantage of the abundance of Korean font files found online. So, that is exactly what we will be doing.

Provided in the tools directory of this repo is hangul-image-generator.py. This script will use fonts found in the fonts directory to create several images for each character provided in the given labels file. The default labels file is 2350-common-hangul.txt which contains 2350 frequent characters derived from the KS X 1001 encoding. Other label files are 256-common-hangul.txt and 512-common-hangul.txt. These were adapted from the top 6000 Korean words compiled by the National Institute of Korean Language listed here. If you don't have a powerful machine to train on, using a smaller label set can help reduce the amount of model training time later on.

The fonts folder is currently empty, so before you can generate the Hangul dataset, you must first download several font files as described in the fonts directory README. For my dataset, I used around 40 different font files, but more can always be used to improve your dataset, especially if you get several uniquely stylized ones. Once your fonts directory is populated, then you can proceed with the actual image generation with hangul-image-generator.py.

Optional flags for this are:

  • --label-file for specifying a different label file (perhaps with less characters). Default is ./labels/2350-common-hangul.txt.
  • --font-dir for specifying a different fonts directory. Default is ./fonts.
  • --output-dir for specifying the output directory to store generated images. Default is ./image-data.

Now run it, specifying your chosen label file:

python ./tools/hangul-image-generator.py --label-file <your label file path>

Depending on how many labels and fonts there are, this script may take a while to complete. In order to bolster the dataset, three random elastic distortions are also performed on each generated character image. An example is shown below, with the original character displayed first, followed by the elastic distortions.

Normal Image Distorted Image 1 Distorted Image 2 Distorted Image 3

Once the script is done, the output directory will contain a hangul-images folder which will hold all the 64x64 JPEG images. The output directory will also contain a labels-map.csv file which will map all the image paths to their corresponding labels.

4. Convert Images to TFRecords

The TensorFlow standard input format is TFRecords, which is a binary format that we can use to store raw image data and their labels in one place. In order to better feed in data to a TensorFlow model, let's first create several TFRecords files from our images. A script is provided that will do this for us.

This script will first read in all the image and label data based on the labels-map.csv file that was generated above. Then it will partition the data so that we have a training set and also a testing set (15% testing, 85% training). By default, the training set will be saved into multiple files/shards (three) so as not to end up with one gigantic file, but this can be configured with a CLI argument, --num-shards-train, depending on your data set size.

Optional flags for this script are:

  • --image-label-csv for specifying the CSV file that maps image paths to labels. Default is ./image-data/labels-map.csv
  • --label-file for specifying the labels that correspond to your training set. This is used by the script to determine the number of classes. Default is ./labels/2350-common-hangul.txt.
  • --output-dir for specifying the output directory to store TFRecords files. Default is ./tfrecords-output.
  • --num-shards-train for specifying the number of shards to divide training set TFRecords into. Default is 3.
  • --num-shards-test for specifying the number of shards to divide testing set TFRecords into. Default is 1.

To run the script, you can simply do:

python ./tools/convert-to-tfrecords.py --label-file <your label file path>

Once this script has completed, you should have sharded TFRecords files in the output directory ./tfrecords-output.

$ ls ./tfrecords-output
test1.tfrecords    train1.tfrecords    train2.tfrecords    train3.tfrecords

5. Train the Model

Now that we have a lot of data, it is time to actually use it. In the root of the project is hangul_model.py. This script will handle creating an input pipeline for reading in TFRecords files and producing random batches of images and labels. Next, a convolutional neural network (CNN) is defined, and training is performed. The training process will continuously feed in batches of images and labels to the CNN to find the optimal weight and biases for correctly classifying each character. After training, the model is exported so that it can be used in our Android application.

The model here is similar to the MNIST model described on the TensorFlow website. A third convolutional layer is added to extract more features to help classify for the much greater number of classes.

Optional flags for this script are:

  • --label-file for specifying the labels that correspond to your training set. This is used by the script to determine the number of classes to classify for. Default is ./labels/2350-common-hangul.txt.
  • --tfrecords-dir for specifying the directory containing the TFRecords shards. Default is ./tfrecords-output.
  • --output-dir for specifying the output directory to store model checkpoints, graphs, and Protocol Buffer files. Default is ./saved-model.
  • --num-train-epochs for specifying the number of epochs to train for. This is the number of complete passes through the training dataset. Definitely try tuning this parameter to improve model performance on your dataset. Default is 15 epochs.

To run the training, simply do the following from the root of the project:

python ./hangul_model.py --label-file <your label file path> --num-train-epochs <num>

Depending on how many images you have, this will likely take a long time to train (several hours to maybe even a day), especially if only training on a laptop. If you have access to GPUs, these will definitely help speed things up, and you should certainly install the TensorFlow version with GPU support (supported on Ubuntu and Windows only).

On my Windows desktop computer with an Nvidia GTX 1080 graphics card, training about 320,000 images with the script defaults took just a bit over two hours. Training on my MacBook Pro would probably take over 20 times that long.

One alternative is to use a reduced label set (i.e. 256 vs 2350 Hangul characters) which can reduce the computational complexity quite a bit.

As the script runs, you should hopefully see the printed training accuracies grow towards 1.0, and you should also see a respectable testing accuracy after the training. When the script completes, the exported model we should use will be saved, by default, as ./saved-model/optimized_hangul_tensorflow.pb. This is a Protocol Buffer file which represents a serialized version of our model with all the learned weights and biases. This specific one is optimized for inference-only usage.

6. Try Out the Model

Before we jump into making an Android application with our newly saved model, let's first try it out. Provided is a script that will load your model and use it for inference on a given image. Try it out on images of your own, or download some of the sample images below. Just make sure each image is 64x64 pixels with a black background and white character color.

Optional flags for this are:

  • --label-file for specifying a different label file. This is used to map indices in the one-hot label representations to actual characters. Default is ./labels/2350-common-hangul.txt.
  • --graph-file for specifying your saved model file. Default is ./saved-model/optimized_hangul_tensorflow.pb.

Run it like so:

python ./tools/classify-hangul.py <Image Path> --label-file <your label file path>

Sample Images:

Sample Image 1 Sample Image 2 Sample Image 3 Sample Image 4 Sample Image 5

After running the script, you should see the top five predictions and their corresponding scores. Hopefully the top prediction matches what your character actually is.

Note: If running this script on Windows, in order for the Korean characters to be displayed on the console, you must first change the active code page to support UTF-8. Just run:

chcp 65001

Then you must change the console font to be one that supports Korean text (like Batang, Dotum, or Gulim).

7. Create the Android Application

With the saved model, a simple Android application can be created that will be able to classify handwritten Hangul that a user has drawn. A completed application has already been included in ./hangul-tensordroid.

Set up the project

The easiest way to try the app out yourself is to use Android Studio. This will take care of a lot of the Android dependencies right inside the IDE.

After downloading and installing Android Studio, perform the following steps:

  1. Launch Android Studio
  2. A Welcome to Android Studio window should appear, so here, click on Open an existing Android Studio project. If this window does not appear, then just go to File > Open... in the top menu.
  3. In the file browser, navigate to and click on the ./hangul-tensordroid directory of this project, and then press OK.

After building and initializing, the project should now be usable from within Android Studio. When Gradle builds the project for the first time, you might find that there are some dependency issues, but these are easily resolvable in Android Studio by clicking on the error prompt links to install the dependencies.

In Android Studio, you can easily see the project structure from the side menu.

Android Project Structure

The java folder contains all the java source code for the app. Expanding this shows that we have just four java files:

  1. MainActivity.java is the main launch point of the application and will handle the setup and button pressing logic.
  2. PaintView.java is the class that enables the user to draw Korean characters in a BitMap on the screen.
  3. HangulClassifier.java handles loading our pre-trained model and connecting it with the TensorFlow Inference Interface which we can use to pass in images for classification.
  4. HangulTranslator.java interfaces with the Watson Language Translator API to get English translations for our text.

In its current state, the provided Android application uses the 2350-common-hangul.txt label files and already has a pre-trained model trained on about 320,000 images from 40 fonts. These are located in the assets folder of the project, ./hangul-tensordroid/app/src/main/assets/. If you want to switch out the model or labels file, simply place them in this directory. You must then specify the names of these files in MainActivity.java, ./hangul-tensordroid/app/src/main/java/ibm/tf/hangul/MainActivity.java, by simply changing the values of the constants LABEL_FILE and MODEL_FILE located at the top of the class.

If you want to enable translation support, you must do the following:

  1. Create an IBM Cloud account here.
  2. Create the Watson Language Translator service.
  3. Get Translator service credentials. Credentials should have been automatically created. You can retrieve them by clicking on the Language Translator service under the Services section of your IBM Cloud dashboard.
  4. Update ./hangul-tensordroid/app/src/main/res/values/translate_api.xml with the apikey and url retrieved in step 3.

Run the application

When you are ready to build and run the application, click on the green arrow button at the top of Android Studio.

Android Studio Run Button

This should prompt a window to Select Deployment Target. If you have an actual Android device, feel free to plug it into your computer using USB. More info can be found here. If you do not have an Android device, you can alternatively use an emulator. In the Select Deployment Target window, click on Create New Virtual Device. Then just follow the wizard, selecting a device definition and image (preferably an image with API level 21 or above). After the virtual device has been created, you can now select it when running the application.

After selecting a device, the application will automatically build, install, and then launch on the device.

Try drawing in the application to see how well the model recognizes your Hangul writing.

Links

Learn more

  • Artificial Intelligence Code Patterns: Enjoyed this Code Pattern? Check out our other AI Code Patterns.
  • AI and Data Code Pattern Playlist: Bookmark our playlist with all of our Code Pattern videos
  • With Watson: Want to take your Watson app to the next level? Looking to utilize Watson Brand assets? Join the With Watson program to leverage exclusive brand, marketing, and tech resources to amplify and accelerate your Watson embedded commercial solution.

License

This code pattern is licensed under the Apache License, Version 2. Separate third-party code objects invoked within this code pattern are licensed by their respective providers pursuant to their own separate licenses. Contributions are subject to the Developer Certificate of Origin, Version 1.1 and the Apache License, Version 2.

Apache License FAQ

More Repositories

1

sarama

Sarama is a Go library for Apache Kafka.
Go
11,359
star
2

plex

The package of IBM’s typeface, IBM Plex.
CSS
9,603
star
3

css-gridish

Automatically build your grid design’s CSS Grid code, CSS Flexbox fallback code, Sketch artboards, and Chrome extension.
CSS
2,253
star
4

openapi-to-graphql

Translate APIs described by OpenAPI Specifications (OAS) into GraphQL
TypeScript
1,609
star
5

fp-go

functional programming library for golang
Go
1,550
star
6

Project_CodeNet

This repository is to support contributions for tools for the Project CodeNet dataset hosted in DAX
Python
1,537
star
7

fhe-toolkit-linux

IBM Fully Homomorphic Encryption Toolkit For Linux. This toolkit is a Linux based Docker container that demonstrates computing on encrypted data without decrypting it! The toolkit ships with two demos including a fully encrypted Machine Learning inference with a Neural Network and a Privacy-Preserving key-value search.
C++
1,436
star
8

pytorch-seq2seq

An open source framework for seq2seq models in PyTorch.
Python
1,431
star
9

ibm.github.io

IBM Open Source at GitHub
JavaScript
1,106
star
10

Dromedary

Dromedary: towards helpful, ethical and reliable LLMs.
Python
1,104
star
11

MicroscoPy

An open-source, motorized, and modular microscope built using LEGO bricks, Arduino, Raspberry Pi and 3D printing.
Python
1,102
star
12

MAX-Image-Resolution-Enhancer

Upscale an image by a factor of 4, while generating photo-realistic details.
Python
863
star
13

differential-privacy-library

Diffprivlib: The IBM Differential Privacy Library
Python
819
star
14

elasticsearch-spark-recommender

Use Jupyter Notebooks to demonstrate how to build a Recommender with Apache Spark & Elasticsearch
Jupyter Notebook
806
star
15

build-blockchain-insurance-app

Sample insurance application using Hyperledger Fabric
JavaScript
719
star
16

FfDL

Fabric for Deep Learning (FfDL, pronounced fiddle) is a Deep Learning Platform offering TensorFlow, Caffe, PyTorch etc. as a Service on Kubernetes
Go
676
star
17

spring-boot-microservices-on-kubernetes

In this code we demonstrate how a simple Spring Boot application can be deployed on top of Kubernetes. This application, Office Space, mimicks the fictitious app idea from Michael Bolton in the movie "Office Space".
JavaScript
548
star
18

cloud-native-starter

Cloud Native Starter for Java/Jakarta EE based Microservices on Kubernetes and Istio
Shell
516
star
19

openapi-validator

Configurable and extensible validator/linter for OpenAPI documents
JavaScript
496
star
20

federated-learning-lib

A library for federated learning (a distributed machine learning process) in an enterprise environment.
Python
495
star
21

clai

Command Line Artificial Intelligence or CLAI is an open-sourced project from IBM Research aimed to bring the power of AI to the command line interface.
Python
476
star
22

nicedoc.io

pretty README as service.
JavaScript
473
star
23

import-tracker

Python utility for tracking third party dependencies within a library
Python
457
star
24

mac-ibm-enrollment-app

The Mac@IBM enrollment app makes setting up macOS with Jamf Pro more intuitive for users and easier for IT. The application offers IT admins the ability to gather additional information about their users during setup, allows users to customize their enrollment by selecting apps or bundles of apps to install during setup, and provides users with next steps when enrollment is complete.
Swift
455
star
25

mobx-react-router

Keep your MobX state in sync with react-router
JavaScript
440
star
26

EvolveGCN

Code for EvolveGCN: Evolving Graph Convolutional Networks for Dynamic Graphs
Python
384
star
27

fhe-toolkit-macos

IBM Homomorphic Encryption Toolkit For MacOS
C++
358
star
28

AutoMLPipeline.jl

A package that makes it trivial to create and evaluate machine learning pipeline architectures.
HTML
355
star
29

aihwkit

IBM Analog Hardware Acceleration Kit
Jupyter Notebook
352
star
30

graphql-query-generator

Randomly generates GraphQL queries from a GraphQL schema
TypeScript
337
star
31

zshot

Zero and Few shot named entity & relationships recognition
Python
336
star
32

lale

Library for Semi-Automated Data Science
Python
333
star
33

portieris

A Kubernetes Admission Controller for verifying image trust.
Go
330
star
34

FedMA

Code for Federated Learning with Matched Averaging, ICLR 2020.
Python
326
star
35

BluePic

WARNING: This repository is no longer maintained ⚠️ This repository will not be updated. The repository will be kept available in read-only mode.
Swift
325
star
36

evote

A voting application that leverages Hyperledger Fabric and the IBM Blockchain Platform to record and tally ballots.
JavaScript
320
star
37

TabFormer

Code & Data for "Tabular Transformers for Modeling Multivariate Time Series" (ICASSP, 2021)
Python
319
star
38

powerai-counting-cars

Run a Jupyter Notebook to detect, track, and count cars in a video using Maximo Visual Insights (formerly PowerAI Vision) and OpenCV
Jupyter Notebook
317
star
39

blockchain-network-on-kubernetes

Demonstrates the steps involved in setting up your business network on Hyperledger Fabric using Kubernetes APIs on IBM Cloud Kubernetes Service.
Shell
305
star
40

charts

The IBM/charts repository provides helm charts for IBM and Third Party middleware.
Smarty
297
star
41

IBM-Z-zOS

The helpful and handy location for finding and sharing z/OS files, which are not included in the product.
REXX
296
star
42

mac-ibm-notifications

macOS agent used to display custom notifications and alerts to the end user.
Swift
294
star
43

blockchain-application-using-fabric-java-sdk

Create and Deploy a Blockchain Network using Hyperledger Fabric SDK Java
Java
290
star
44

MAX-Object-Detector

Localize and identify multiple objects in a single image.
Python
286
star
45

design-kit

The IBM Design kit is a collection of tools aimed to help you design and prototype experiences faster, with confidence and thoughtfulness. This kit is based on the IBM Design System. Also, you may use this documentation to create add-on libraries to the IBM Design System or submit bugs to the current system.
272
star
46

AccDNN

A compiler from AI model to RTL (Verilog) accelerator in FPGA hardware with auto design space exploration.
Verilog
270
star
47

deploy-ibm-cloud-private

Instructions and Code required to install IBM Cloud Private
HCL
263
star
48

audit-ci

Audit NPM, Yarn, PNPM, and Bun dependencies in continuous integration environments, preventing integration if vulnerabilities are found at or above a configurable threshold while ignoring allowlisted advisories
TypeScript
261
star
49

vue-a11y-calendar

Accessible, internationalized Vue calendar
JavaScript
253
star
50

UQ360

Uncertainty Quantification 360 (UQ360) is an extensible open-source toolkit that can help you estimate, communicate and use uncertainty in machine learning model predictions.
Python
252
star
51

watson-banking-chatbot

A chatbot for banking that uses the Watson Assistant, Discovery, Natural Language Understanding and Tone Analyzer services.
JavaScript
250
star
52

ibm-generative-ai

IBM-Generative-AI is a Python library built on IBM's large language model REST interface to seamlessly integrate and extend this service in Python programs.
Python
246
star
53

Kubernetes-container-service-GitLab-sample

This code shows how a common multi-component GitLab can be deployed on Kubernetes cluster. Each component (NGINX, Ruby on Rails, Redis, PostgreSQL, and more) runs in a separate container or group of containers.
Shell
243
star
54

transition-amr-parser

SoTA Abstract Meaning Representation (AMR) parsing with word-node alignments in Pytorch. Includes checkpoints and other tools such as statistical significance Smatch.
Python
241
star
55

molformer

Repository for MolFormer
Jupyter Notebook
228
star
56

BlockchainNetwork-CompositeJourney

Part 1 in a series of patterns showing the building blocks of a Blockchain application
Shell
227
star
57

LNN

A `Neural = Symbolic` framework for sound and complete weighted real-value logic
Python
225
star
58

pytorchpipe

PyTorchPipe (PTP) is a component-oriented framework for rapid prototyping and training of computational pipelines combining vision and language
Python
223
star
59

Graph2Seq

Graph2Seq is a simple code for building a graph-encoder and sequence-decoder for NLP and other AI/ML/DL tasks.
Python
219
star
60

ModuleFormer

ModuleFormer is a MoE-based architecture that includes two different types of experts: stick-breaking attention heads and feedforward experts. We released a collection of ModuleFormer-based Language Models (MoLM) ranging in scale from 4 billion to 8 billion parameters.
Python
219
star
61

data-prep-kit

Open source project for data preparation of LLM application builders
Jupyter Notebook
217
star
62

Scalable-WordPress-deployment-on-Kubernetes

This code showcases the full power of Kubernetes clusters and shows how can we deploy the world's most popular website framework on top of world's most popular container orchestration platform.
Shell
214
star
63

janusgraph-utils

Develop a graph database app using JanusGraph
Java
207
star
64

tensorflow-large-model-support

Large Model Support in Tensorflow
201
star
65

Scalable-Cassandra-deployment-on-Kubernetes

In this code we provide a full roadmap the deployment of a multi-node scalable Cassandra cluster on Kubernetes. Cassandra understands that it is running within a cluster manager, and uses this cluster management infrastructure to help implement the application. Kubernetes concepts like Replication Controller, StatefulSets etc. are leveraged to deploy either non-persistent or persistent Cassandra clusters on Kubernetes cluster.
Shell
195
star
66

adaptive-federated-learning

Code for paper "Adaptive Federated Learning in Resource Constrained Edge Computing Systems"
Python
193
star
67

action-recognition-pytorch

This is the pytorch implementation of some representative action recognition approaches including I3D, S3D, TSN and TAM.
Python
193
star
68

gantt-chart

IBM Gantt Chart Component, integrable in Vanilla, jQuery, or React Framework.
JavaScript
193
star
69

api-samples

Samples code that uses QRadar API's
Python
192
star
70

cdfsl-benchmark

(ECCV 2020) Cross-Domain Few-Shot Learning Benchmarking System
Python
190
star
71

kube101

Kubernetes 101 workshop (https://ibm.github.io/kube101/)
Shell
181
star
72

CrossViT

Official implementation of CrossViT. https://arxiv.org/abs/2103.14899
Python
180
star
73

rl-testbed-for-energyplus

Reinforcement Learning Testbed for Power Consumption Optimization using EnergyPlus
Python
180
star
74

browser-functions

A lightweight serverless platform that uses Web Browsers as execution engines
JavaScript
180
star
75

pwa-lit-template

A template for building Progressive Web Applications using Lit and Vaadin Router.
TypeScript
178
star
76

fastfit

FastFit ⚡ When LLMs are Unfit Use FastFit ⚡ Fast and Effective Text Classification with Many Classes
Python
174
star
77

AMLSim

The AMLSim project is intended to provide a multi-agent based simulator that generates synthetic banking transaction data together with a set of known money laundering patterns - mainly for the purpose of testing machine learning models and graph algorithms. We welcome you to enhance this effort since the data set related to money laundering is critical to advance detection capabilities of money laundering activities.
Python
170
star
78

socket-io

A Socket.IO client for C#
C#
169
star
79

tfjs-web-app

A TensorFlow.js Progressive Web App for Offline Visual Recognition
JavaScript
164
star
80

spark-tpc-ds-performance-test

Use the TPC-DS benchmark to test Spark SQL performance
TSQL
160
star
81

simulai

A toolkit with data-driven pipelines for physics-informed machine learning.
Python
157
star
82

watson-online-store

Learn how to use Watson Assistant and Watson Discovery. This application demonstrates a simple abstraction of a chatbot interacting with a Cloudant NoSQL database, using a Slack UI.
HTML
156
star
83

unitxt

🦄 Unitxt: a python library for getting data fired up and set for training and evaluation
Python
155
star
84

istio101

Istio 101 workshop (https://ibm.github.io/istio101/)
Shell
154
star
85

Medical-Blockchain

A healthcare data management platform built on blockchain that stores medical data off-chain
Vue
150
star
86

terratorch

a Python toolkit for fine-tuning Geospatial Foundation Models (GFMs).
Python
148
star
87

node-odbc

ODBC bindings for node
JavaScript
146
star
88

taxinomitis

Source code for Machine Learning for Kids site
JavaScript
143
star
89

watson-assistant-slots-intro

A Chatbot for ordering a pizza that demonstrates how using the IBM Watson Assistant Slots feature, one can fill out an order, form, or profile.
JavaScript
143
star
90

tsfm

Foundation Models for Time Series
Jupyter Notebook
143
star
91

SALMON

Self-Alignment with Principle-Following Reward Models
Python
142
star
92

ipfs-social-proof

IPFS Social Proof: A decentralized identity and social proof system
JavaScript
142
star
93

kgi-slot-filling

This is the code for our KILT leaderboard submissions (KGI + Re2G models).
Python
141
star
94

etcd-java

Alternative etcd3 java client
Java
141
star
95

regression-transformer

Regression Transformer (2023; Nature Machine Intelligence)
Python
140
star
96

deploy-react-kubernetes

Built for developers who are interested in learning how to deploy a React application on Kubernetes, this pattern uses the React and Redux framework and calls the OMDb API to look up movie information based on user input. This pattern can be built and run on both Docker and Kubernetes.
JavaScript
139
star
97

probabilistic-federated-neural-matching

Bayesian Nonparametric Federated Learning of Neural Networks
Python
137
star
98

innovate-digital-bank

This repository contains instructions to build a digital bank composed of a set of microservices that communicate with each other. Using Nodejs, Express, MongoDB and deployed to a Kubernetes cluster on IBM Cloud.
JavaScript
137
star
99

core-dump-handler

Save core dumps from a Kubernetes Service or RedHat OpenShift to an S3 protocol compatible object store
Rust
136
star
100

KubeflowDojo

Repository to hold code, instructions, demos and pointers to presentation assets for Kubeflow Dojo
Jupyter Notebook
133
star