The HMDA Submission backend applications.

HMDA Platform

Introduction

The Home Mortgage Disclosure Act (HMDA) Platform is a regulatory technology application that financial institutions use to submit mortgage information as described in the Filing Instruction Guide (FIG). The HMDA Platform parses the data submitted by mortgage lending institutions and validates it against the edits (Syntactical, Validity, Quality, and Macro, as per the instructions in the FIG) before the data is submitted. The HMDA Platform supports quarterly and yearly filing periods. For detailed information on the Home Mortgage Disclosure Act (HMDA), check out the About HMDA page on the CFPB website.

Please watch this short video to view how HMDA Platform transforms the data upload, validation, and submission process.

Linked Projects

Project | Repo Link | Description
Frontend | https://github.com/cfpb/hmda-frontend | ReactJS Front-end repository powering the HMDA Platform
HMDA-Help | https://github.com/cfpb/hmda-help | ReactJS Front-end repository powering HMDA Help - used to resolve and troubleshoot issues in filing
LARFT | https://github.com/cfpb/hmda-platform-larft | Repo for the Public Facing LAR formatting tool
HMDA Test Files | https://github.com/cfpb/hmda-test-files | Repo for automatically generating various different test files for HMDA Data
HMDA Census | https://github.com/cfpb/hmda-census | ETL for geographic and Census data used by the HMDA Platform
HMDA Data Science | https://github.com/cfpb/HMDA_Data_Science_Kit | Repo for HMDA Data science work as well as Spark codebase for Public Facing A&D Reports

Contents

TS and LAR File Specs

The data is submitted in a flat, pipe (|) delimited TXT file. The file has two parts: the Transmittal Sheet (TS), which is the first line of the file, and the Loan Application Register (LAR), which is all remaining lines of the file. Below are the links to the file specifications for data collected in years 2018 - current.
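
As a rough illustration of the layout described above, the following Python sketch splits a submission into its TS and LAR parts. The function name `split_submission` and the sample field values are hypothetical placeholders and are not taken from the FIG; real files carry many more fields per line.

```python
from typing import List, Tuple


def split_submission(text: str) -> Tuple[List[str], List[List[str]]]:
    """Split a pipe-delimited submission into TS fields and LAR rows.

    The first non-empty line is treated as the TS; every remaining
    non-empty line is treated as one LAR row.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        raise ValueError("submission is empty")
    ts_fields = lines[0].split("|")
    lar_rows = [line.split("|") for line in lines[1:]]
    return ts_fields, lar_rows


# Hypothetical three-line file: one TS line followed by two LAR lines.
sample = "1|Example Bank|2018\n2|LEI-PLACEHOLDER|loan-a\n2|LEI-PLACEHOLDER|loan-b\n"
ts, lars = split_submission(sample)
print(len(ts), len(lars))
```

Splitting on the first line rather than on a field value keeps the sketch independent of any particular field layout, which is defined by the file specifications.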

End-to-End filing GIF

The hmda-frontend uses Cypress to test the end-to-end filing process from the end user perspective. The GIF below shows the automated filing process via Cypress, with no human intervention.

Cypress automated filing test

Technical Overview

This repository contains the code for the entirety of the public facing HMDA Platform backend. This platform has been designed to accommodate the needs of the HMDA filing process by financial institutions, as well as the data management, publication, aggregation, reporting, analyzing, visualizing, and downloading the HMDA data set.

The HMDA Platform follows a loosely coupled, event-driven microservices architecture with API-first (API Documentation) design principles. The entire platform is built on open source frameworks and remains cloud vendor agnostic.

Microservices

The code base contained in this repository includes the following microservices that work together in support of the HMDA Platform.

  • HMDA Platform: The entire backend API for the public facing filing platform. It processes the uploaded TXT files and validates them in a non-blocking, streaming I/O fashion. The APIs are built to process text files of various sizes, from small (a few lines) to large (1.5M+ lines), simultaneously without impeding the scalability or availability of the platform. The platform contains code for customizable data edits, a Domain Specific Language (DSL) for coding the data edits, and for submitting events to Kafka topics.

  • Check Digit: The entire backend API for the public facing check digit tool. The Check Digit tool is used to (1) generate a two-character check digit based on a Legal Entity Identifier (LEI) and (2) validate that a check digit is calculated correctly for any complete Universal Loan Identifier (ULI). These APIs are built to process multi-row CSV files as well as one-time requests.

  • Institutions API: Read only API for fetching details about an LEI. This microservice also listens to events put on the institutions-api Kafka topic for creating, updating, and deleting institution data from PostgreSQL.

  • Data Publisher: This microservice runs on a scheduled basis to make internal / external data available for research purposes via object stores such as S3. The schedule for the job is configurable via a Kubernetes ConfigMap.

  • Ratespread: Public facing API for the ratespread calculator. This calculator provides rate spreads for HMDA reportable loans with a final action date on or after January 1st, 2018. This API supports streaming CSV uploads as well as one-time calculations.

  • Modified LAR: Event driven service of modified-lar reports. Each time a filer successfully submits the data, the modified-lar micro-service generates a modified-lar report and puts it in the public object store (e.g. S3). Any re-submissions automatically re-generate new modified-lar reports.

  • IRS Publisher: Event driven service of irs-disclosure-reports. Each time a filer successfully submits the data, the irs-publisher microservice generates the IRS report.

  • HMDA Reporting: Real-time, public facing API for getting information (LEI number, institution name, and year) on LEIs who have successfully submitted their data.

  • HMDA Analytics: Event driven service to insert, update, and delete information in PostgreSQL each time there is a successful submission. The inserted data is mapped to the Census data to provide information for MSA/MDs. It also adds race, sex, and ethnicity categorization to the data.

  • HMDA Dashboard: Authenticated APIs to view realtime analytics for the filings happening on the platform. The dashboard includes summarized statistics, data trends, and supports data visualizations via frontend.

  • Rate Limit: Rate limiter service working in sync with Ambassador to limit the number of times in a given time period that the API can be called. If the rate limit is reached, a 503 error code is sent.

  • HMDA Data Browser: Public facing API for HMDA Data Browser. This API makes the entire dataset available for summarized statistics, deep analysis, as well as geographic map layout.

  • Email Service: Event driven service to send an automated email to the filer on each successful submission.
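
The check digit calculation mentioned in the Check Digit item above can be sketched in Python. This is a minimal sketch that assumes the mod 97 procedure described in Regulation C for a string made of an LEI followed by a loan identifier (letters mapped to 10 through 35, two zeros appended, remainder subtracted from 98). It is not the platform's own Scala implementation, and the function names and the sample identifier are illustrative.

```python
def _to_digits(value: str) -> str:
    """Replace each character with its base-36 value (0-9 stay, A=10 ... Z=35)."""
    if not value.isascii() or not value.isalnum():
        raise ValueError("only ASCII letters and digits are allowed")
    return "".join(str(int(ch, 36)) for ch in value)


def generate_check_digit(lei_and_loan_id: str) -> str:
    """Return the two-character check digit for an LEI plus loan identifier."""
    remainder = int(_to_digits(lei_and_loan_id) + "00") % 97
    return f"{98 - remainder:02d}"


def validate_uli(uli: str) -> bool:
    """A complete ULI is valid when its numeric form leaves remainder 1 mod 97."""
    return int(_to_digits(uli)) % 97 == 1


# Illustrative identifier: a 20-character LEI followed by a loan identifier.
base = "10Bx939c5543TqA1144M999143X"
check = generate_check_digit(base)
print(check)                        # 38
print(validate_uli(base + check))   # True
print(validate_uli(base + "39"))    # False
```

Letters are treated case-insensitively here because `int(ch, 36)` maps `x` and `X` to the same value; whether the platform normalizes case the same way is an assumption.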

HMDA Platform Technical Architecture

The image below shows the cloud vendor agnostic technical architecture for the HMDA Platform.

HMDA Data Browser Technical Architecture

Please view the README for HMDA Data Browser

Running with sbt

The HMDA Platform can run locally using sbt with an embedded Cassandra and embedded Kafka. To get started:

git clone https://github.com/cfpb/hmda-platform.git
cd hmda-platform
export CASSANDRA_CLUSTER_HOSTS=localhost
export APP_PORT=2551
sbt
[...]
sbt:hmda-root> project hmda-platform
sbt:hmda-platform> reStart

Access locally built platform

hmda-admin-api
hmda-filing-api
hmda-public-api

Build hmda-platform Docker image

The Docker image is built via the Docker plugin of sbt-native-packager:

sbt -batch clean hmda-platform/docker:publishLocal

The image can be built without running tests using:

sbt "project hmda-platform" dockerPublishLocalSkipTests

One-line Cloud Deployment to Dev/Prod

The platform and all of the related microservices explained above are deployed on Kubernetes using Helm. Each deployment is a single Helm command. Below is an example for the deployment of the hmda-platform:

helm upgrade --install --force \
--namespace=default \
--values=kubernetes/hmda-platform/values.yaml \
--set image.repository=hmda/hmda-platform \
--set image.tag=<tag name> \
--set image.pullPolicy=Always \
hmda-platform \
kubernetes/hmda-platform

Docker Hub

All of the containers built by the HMDA Platform are released publicly via Docker Hub: https://hub.docker.com/u/hmda

One-line Local Development Environment (No Auth)

The platform and its dependency services (Kafka, Cassandra, and PostgreSQL) can run locally using Docker Compose.

# Bring up hmda-platform, hmda-analytics, institutions-api
docker-compose up

The entire filing platform can be spun up using a one-line command. This locally running instance of the platform requires no authentication.

# Bring up the hmda-platform
docker-compose up hmda-platform

Additionally, there are several environment variables that can be configured. The platform uses sensible defaults for each one, but they can be overridden if required:

CASSANDRA_CLUSTER_HOSTS
CASSANDRA_CLUSTER_DC
CASSANDRA_CLUSTER_USERNAME
CASSANDRA_CLUSTER_PASSWORD
CASSANDRA_JOURNAL_KEYSPACE
CASSANDRA_SNAPSHOT_KEYSPACE
KAFKA_CLUSTER_HOSTS
APP_PORT
HMDA_HTTP_PORT
HMDA_HTTP_ADMIN_PORT
HMDA_HTTP_PUBLIC_PORT
MANAGEMENT_PORT
HMDA_CASSANDRA_LOCAL_PORT
HMDA_LOCAL_KAFKA_PORT
HMDA_LOCAL_ZK_PORT
WS_PORT
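
The override behavior can be pictured with a small Python sketch: a variable set in the environment wins, otherwise a default is used. This only illustrates the pattern and is not how the platform itself loads configuration. The two defaults shown are simply the values exported in the "Running with sbt" steps and are assumptions; the platform's real defaults may differ. The function name `resolve_settings` is hypothetical.

```python
import os
from typing import Dict, Mapping, Optional

# Assumed defaults for illustration only (values taken from the sbt steps).
ASSUMED_DEFAULTS = {
    "CASSANDRA_CLUSTER_HOSTS": "localhost",
    "APP_PORT": "2551",
}


def resolve_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return each setting from the environment, falling back to its default."""
    source = os.environ if env is None else env
    return {name: source.get(name, default) for name, default in ASSUMED_DEFAULTS.items()}


# Overriding only APP_PORT leaves the other setting at its assumed default.
print(resolve_settings({"APP_PORT": "2552"}))
```

Passing a mapping instead of reading `os.environ` directly makes the lookup easy to exercise without changing the real environment.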

Automated Testing

The HMDA Platform takes a rigorous automated testing approach. In addition to Travis and CodeCov, we've prepared a suite of Newman test scripts that perform end-to-end testing of the APIs on a recurring basis. The testing process for Newman is containerized and runs as a Kubernetes CronJob to act as a monitoring and alerting system. The platform and microservices are also load tested using Locust.

Postman Collection

In addition to using Newman for our internal testing, we've created a HMDA Postman collection that makes it easier for users to perform an end-to-end filing of HMDA Data, including upload, parsing data, flagging edits, resolving edits, and submitting data when S/V edits are resolved.

API Documentation

The HMDA Platform Public API Documentation is hosted in the HMDA Platform API Docs repo and deployed to GitHub Pages using the gh-pages branch.

Sprint Cadence

Our team works in two-week sprints. The sprints are managed as Project Boards. Backlog grooming happens every two weeks as part of Sprint Planning and Sprint Retrospectives.

Code Formatting

Our team uses Scalafmt to format our codebase.

Development Process

Below are the steps the development team follows to fix issues, develop new features, etc.

  1. Create a fork of this repository
  2. Work in a branch of the fork
  3. Create a PR to merge into master
  4. The PR is automatically built, tested, and linted using: Travis, Snyk, and CodeCov
  5. Manual review is performed in addition to ensuring the above automatic scans are positive
  6. The PR is deployed to development servers to be checked using Newman
  7. The PR is merged only by a separate member in the dev team

Contributing

CFPB is developing the HMDA Platform in the open to maximize transparency and encourage third party contributions. If you want to contribute, please read and abide by the terms of the License for this project. Pull Requests are always welcome.

Issues

We use GitHub issues in this repository to track features, bugs, and enhancements to the software.

Open source licensing info

  1. TERMS
  2. LICENSE
  3. CFPB Source Code Policy

Credits and references

Related projects
