Predicting Remaining Useful Life
The setup is a common one: we have a single table of sensor observations recorded over time. Because collecting and storing sensor readings is easier than ever, many industries already hold data that naturally poses time-series problems, so it is important to be able to work with data in this form. Featuretools has built-in functionality for handling time-varying data.
We'll demonstrate an end-to-end workflow using the Turbofan Engine Degradation Simulation Data Set from NASA. This notebook shows a quick way to predict the Remaining Useful Life (RUL) of an engine, starting from a single dataframe of time-series data. The notebook has three sections:
- Understand the data
- Generate features
- Make predictions with Machine Learning
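To make the setup concrete, here is a minimal pandas sketch of the two ideas behind the workflow: labeling each observation with its RUL, and building features from only the history available at a given point in time. It is not the notebook's code and does not use Featuretools. The column names (`engine_no`, `cycle`, `sensor_1`), the toy values, and both helper functions are hypothetical, and the labeling assumes each engine in the table is observed until it fails, so its last recorded cycle marks the failure.

```python
import pandas as pd

# Toy stand-in for the sensor table: one row per engine per operating cycle.
# Column names and values are illustrative assumptions, not the real dataset.
observations = pd.DataFrame({
    "engine_no": [1, 1, 1, 1, 2, 2, 2],
    "cycle": [1, 2, 3, 4, 1, 2, 3],
    "sensor_1": [518.7, 518.9, 519.4, 520.1, 491.2, 491.9, 493.0],
})


def add_rul(df):
    """Label each row with the cycles left before the engine's last recorded cycle."""
    last_cycle = df.groupby("engine_no")["cycle"].transform("max")
    return df.assign(rul=last_cycle - df["cycle"])


def features_at_cutoff(df, cutoff_cycle):
    """Aggregate each engine's sensor history using only rows at or before the cutoff."""
    history = df[df["cycle"] <= cutoff_cycle]
    return history.groupby("engine_no")["sensor_1"].agg(["mean", "max", "last"])


labeled = add_rul(observations)
features = features_at_cutoff(observations, cutoff_cycle=2)
print(labeled)
print(features)
```

Restricting the aggregation to rows at or before the cutoff keeps information from later cycles out of the features, which is the main thing to get right when working with time-varying data.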
To run the notebooks, you need to download the data yourself. Download and unzip the file from https://ti.arc.nasa.gov/c/6/, then create a 'data' directory and place the unzipped files in it.
Highlights
- Quickly build an end-to-end workflow using time-series data
- Find interesting automatically generated features
- Explore an advanced notebook that uses custom primitives and hyper-parameter tuning
Running the tutorial
- Clone the repo:
  git clone https://github.com/Featuretools/predict-remaining-useful-life.git
- Install the requirements:
  pip install -r requirements.txt
  You will also need to install graphviz for this demo. Please install graphviz according to the instructions in the Featuretools documentation.
- Download the data:
  The data is from the NASA Turbofan Engine Degradation Simulation Data Set and is available from the download link given above.
  To run the notebooks, place the following files in the 'data' directory: train_FD004.txt, test_FD004.txt, RUL_FD004.txt
- Run the Tutorials notebooks:
  jupyter notebook
The utils.py script contains a number of useful helper functions.
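As a quick sanity check on the download step, the following sketch reports which of the three expected files are missing from the 'data' directory before you launch the notebooks. The function `missing_data_files` is a hypothetical helper written for this example; it is not one of the functions in utils.py.

```python
from pathlib import Path

# File names come from the download step; 'data' is the directory it names.
EXPECTED_FILES = ("train_FD004.txt", "test_FD004.txt", "RUL_FD004.txt")


def missing_data_files(data_dir="data"):
    """Return the expected file names that are not present in data_dir."""
    root = Path(data_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_data_files()
    if missing:
        print("Missing from 'data':", ", ".join(missing))
    else:
        print("All data files found.")
```

Run it from the repository root so that the relative 'data' path resolves to the directory you created.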
Feature Labs
Featuretools is an open source project created by Feature Labs. To see the other open source projects we're working on, visit Feature Labs Open Source. If building impactful data science pipelines is important to you or your business, please get in touch.
Contact
Any questions can be directed to [email protected]