Localize your cat at home with BLE beacon, ESP32s, and Machine Learning
tl;dr / abstract. System which can be used for localization the position of a cat in a building using bluetooth low energy (BLE) beacons attached to the object, a set of cheap ESP32 detectors, and Machine Learning models.
- How it works
- Hardware setup
- Software setup
- Training and testing of ML models
- Detecting
- Tips and tricks
- Improvements of the accuracy
- Further directions
- Acknowledgements
How it works
This is an overwiev of a pipeline for creating an inhouse cat locator. Actually, it can be applied to any animal (including humans) or object, and any building. The system works as follows:
- The cat with a small BLE beacon, is emiting BLE signals
- BLE signal is detected by ESP32s located here and there; they are measuring he signal strength of the BLE beacon.
- Each ESP32 sends data to the server (database)
- The python program is fetching the last measurements from all ESP32 detectors (i.e., signal strenght values)
- Using trained machine learning models it predicts the location of the cat
๐ก The challenge here is to use a number of detectors which is significantly lower than the number of rooms and make ML do the rest.
๐ก This is not a ready-to-use set of programs. As each situation is different (i.e, hardware setup, house size, number of rooms etc), please treat this repo as a framework to build and customize your own system.
Hardware setup
- Moving object to track. A cat. A dog is also fine. Be careful with fishes. Can be more than one. I used two cats. ๐ฐ ~ 0 $
- BLE transmitters. A BLE beacon. Anything will be good. I used NotiOne buttons, they work well and has an additional functionality (not used here). But something cheaper will work too. ๐ฐ n x 10-20 $
- BLE detectors. I used four ESP32. I guess 4-6 for moderate house/flat should be enough. One can use also modern raspberry pi with BLE. ๐ฐ n x 4 $
- Data processing. A server/computer. I worked with amd64 Debian, but any Raspberry Pi should be also good. Can be remote. Can be done in a microcontroller (see: further directions)
- WiFi Connectivity (ESPs to the server)
Optional:
- LCD display showing the current location of the cat. I'm using TTGO1
Software setup
BLE beacon
Nothing to configure nor code here :-)
Detectors
ESP32
I provide a simple C/Arduino code for the detector. Of course it can do much more (my ESPs also measure temperature, humidity and other parameters):
Many things can be tuned and optimized (eg array of MAC addreses instead of single variables, reading MAC addresses from the server etc).
Raspberry Pi
(this solution was not thoroughly tested!)
Raspberry Pi with a BLE capability can be included in the set of detectors. The commands here are:
btmon &
hcitool lescan --duplicates | grep MAC
and next sending to the database.
On the server
0. Data source
We need a server with a database. I use influx, but any will be good (MySQL or even text files!)
1. Python environment
I used Python 3.8.
- libraries: scikit and pandas, and whatever is needed to fetch data from the database
- please see a sample conda environment file to see what is my setup
- remember that the trained ML models must be next used in the same environment (the same scikit and numpy version!); this can be guaranteed by using the dedicated cond environment
Sample conda environment can be impotred by issuing (this is a single time procedure):
conda env create -f cat-localizer.yml
and activated by:
conda activate cat-localizer
Training of ML models can be done on any computer (eg laptop) and next models transfered to the main server. Remember to keep the same modules version (use conda environment). See the next section on model training.
2. Program for cat localization prediction
When models are trained and ready (see the next section), we use python program to fetch last BLE signal strength data and - basing on this - predict localization of the cat. Please see the section Detecting for more info and code samples.
Training and testing of ML models
Data collection
Here we collect data for training ML algorithms. This means we have to collect signal strength data from all sensors WITH associated localization of the beacon. I.e, lets assume we have four ESP32 sensors, we need to create table like this one:
measurement id | time | target | esp32-attic | esp32-room_k | esp32-office | esp32-doors |
---|---|---|---|---|---|---|
0 | 2021-01-03 15:32 | office | -71.0 | -110.0 | -71.5 | -83.0 |
1 | 2021-01-03 15:33 | office | -73.0 | -110.0 | -82.5 | -82.5 |
2 | 2021-01-03 15:34 | office | -81.0 | -110.0 | -80.0 | -87.5 |
3 | 2021-01-03 15:35 | office | -75.0 | -110.0 | -86.5 | -86.5 |
4 | 2021-01-03 15:36 | office | -72.5 | -110.0 | -78.5 | -81.0 |
... | ... | ... | ... | ... | ... | ... |
236 | 2021-01-03 21:56 | room_k | -100.5 | -67.5 | -110.0 | -91.5 |
237 | 2021-01-03 21:57 | room_k | -86.0 | -60.0 | -110.0 | -91.5 |
238 | 2021-01-03 21:58 | room_k | -88.5 | -93.5 | -110.0 | -91.5 |
239 | 2021-01-03 21:59 | room_k | -86.0 | -75.0 | -110.0 | -91.5 |
240 | 2021-01-03 22:00 | room_k | -87.0 | -72.0 | -110.0 | -84.5 |
where each row (measurement) represent RSSI values read from beacon at a given location at the same time. Please see jupyter notebook for the data preprocessing.
When this table is ready, we can go further and train ML models.
The number of data points (i.e, different locations of BLE beacon) for each room is variable and depends on the room size. In my case it was like here:
room no of measurements
bathroom_d 11
bathroom_g 30
living_room 101
office 38
room_k 32
room_m 28
Remember to sample space available for your cats, so not only floors, but also tables, wardrobes, or chandelier etc ;-)
Algorithms evaluation and models training
Now we will probe several preprocessing algorithms together with several Machine Learning methods. We don't know which one will be the best in our case, so we will probe them all in a cross-validation experiment.
Please see jupyter notebook for the exact pipeline.
The final models will be build with at 75% of the dataset (training), while 25% will seve as a testing dataset.
Conclusions
The ML algos cv results is shown here (balanced accuracy values are shown):
we can see that in our case, the MLP Classifier with Min-Max Scaler gives the best results.
Let's see if we can go any further with some features engeneering:
The nswer is no. The best accuracy here is the same as an initial setup (0.83), so we will stay with the MLP Classifier with Min-Max Scaler.
The final models build with MLP Classifier with Min-Max Scaler gives the balanced accuracy equal 0.88. Very good! If we look at the confusion matrix:
we will see that (believe me or not) all miss-labelled predictions are for locations which are close to each other (eg. room_m is close in space to the office etc.). So this makes a perfect sense.
Detecting
For the detection you can use the code snippet in the section 6 of jupyter notebook or the code of the simple program. The program uses numpy array as a data source, but it can be read from the file or fetched from the database.
Tips and tricks
- gathering data for training ML. I used 1-minutes interval of collecting data. This gave me time to change the position of beacons in the room. Measurements were automatically stored in the database (influx). I noted the time span of collectiong data in a given room and then fetched data from db for this time period, assigned a room label, and saved in a csv file.
- for collection of data for a flat surfaces (floors) you can use devices such as roomba with beacons attached at the top of it. Works fine!
Improvements of the accuracy
- Optimization of the position of ESP32 detectors, to cover whole space and give unique signal patterns for each rooms
- Optimization of ML algorithms and parameters
- Increasing number of ESP32 detectors (but this is not a chalenge :))
- (?) Take into account the battery level of the BLE beacon, as this could have impact on the transmitting power. I'm not sure.
Further directions
- after some time of collecting the localization data for objects, we may create another ML model for predicting the localization of object in time (i.e., if it is Monday morning, the cat is near the fridge, if it is Saturday afternoon, the cat is sleeping in the bathroom, etc). This could involve another variables, like room temperature, humidity etc.
- move the prediction step to (another) microcontroller (ESP32, ESP8266) with such (cool!) project: https://github.com/eloquentarduino/micromlgen (no need to have a dedicated server for predictions)
Acknowledgements
- wikimedia commons and openclipart library for cool graphics (especially the cat!)