  • Stars: 231
  • Rank: 167,733 (Top 4%)
  • Language: Shell
  • License: MIT License
  • Created: over 1 year ago
  • Updated: about 1 year ago

Repository Details

Raspberry Pi Cluster automation

Raspberry Pi Cluster

[Image: Turing Pi 2 - Raspberry Pi Compute Module Cluster]

This repository contains examples and automation used in various Raspberry Pi clustering scenarios, as seen on Jeff Geerling's YouTube channel.

[Image: DeskPi Super6c Mini ITX Raspberry Pi Compute Module Cluster]

The inspiration for this project was my first Pi cluster, the Raspberry Pi Dramble, which is still running in my basement to this day!

Usage

  1. Make sure you have Ansible installed.
  2. Copy the example.hosts.ini inventory file to hosts.ini. Make sure it has the control_plane and nodes configured correctly (for my examples I named my nodes node[1-4].local).
  3. Copy the example.config.yml file to config.yml, and modify the variables to your liking.
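
For steps 2 and 3, that boils down to copying the two example files and then editing the copies (assuming you run the commands from your clone of this repository):

cp example.hosts.ini hosts.ini
cp example.config.yml config.yml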

Raspberry Pi Setup

I am running Raspberry Pi OS on various Pi clusters. You can run this on any Pi cluster, but I tend to use Compute Modules without eMMC ('Lite' versions), and I often boot each node from a 32 GB SanDisk Extreme microSD card. For some setups (like when I run the Compute Blade or DeskPi Super6c), I boot off NVMe SSDs instead.

In every case, I flashed Raspberry Pi OS (64-bit, lite) to the storage devices using Raspberry Pi Imager.

To make network discovery and integration easier, I edit the advanced configuration in Imager, and set the following options:

  • Set hostname: node1.local (set to 2 for node 2, 3 for node 3, etc.)
  • Enable SSH: 'Allow public-key', and paste in my public SSH key(s)
  • Configure wifi: (ONLY on node 1, if desired) enter SSID and password for local WiFi network

After setting all those options, making sure only node 1 has WiFi configured, and the hostname is unique to each node (and matches what is in hosts.ini), I inserted the microSD cards into the respective Pis, or installed the NVMe SSDs into the correct slots, and booted the cluster.

SSH connection test

To test the SSH connection from my Ansible controller (my main workstation, where I'm running all the playbooks), I connected to each server individually, and accepted the hostkey:
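
Something like the following, assuming the node[1-4].local hostnames and the default pi user from above (accept each host key when prompted, then log out again):

ssh pi@node1.local
ssh pi@node2.local
ssh pi@node3.local
ssh pi@node4.local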

This ensures Ansible will also be able to connect via SSH in the following steps. You can test Ansible's connection with:

ansible all -m ping

It should respond with a 'SUCCESS' message for each node.
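
The output for each node looks roughly like this (a sketch; the exact fields vary a little between Ansible versions):

node1.local | SUCCESS => {
    "changed": false,
    "ping": "pong"
}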

Storage Configuration

Warning: This playbook is configured to set up a ZFS mirror volume on node 3, with two drives connected to the built-in SATA ports on the Turing Pi 2.

It is not yet genericized for other use cases (e.g. boards that are not the Turing Pi 2).

This playbook will create a storage location on node 3 by default. You can switch between the two storage configurations by setting the storage_type variable to either filesystem (the default) or zfs in your config.yml file.

Filesystem Storage

If using filesystem (storage_type: filesystem), make sure to use the appropriate storage_nfs_dir variable in config.yml.
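
For example, your config.yml might contain something like the following (illustrative values only; the directory path here is hypothetical, so check example.config.yml for the actual defaults):

storage_type: filesystem
storage_nfs_dir: /home/pi/nfs   # hypothetical path; set this to the directory you want shared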

ZFS Storage

If using ZFS (storage_type: zfs), you should have two drives available on node 3, /dev/sda and /dev/sdb, that can be pooled into a mirror. Make sure your two SATA drives are wiped:

pi@node3:~ $ sudo wipefs --all --force /dev/sda?; sudo wipefs --all --force /dev/sda
pi@node3:~ $ sudo wipefs --all --force /dev/sdb?; sudo wipefs --all --force /dev/sdb

If you run lsblk, you should see sda and sdb have no partitions, and are ready to use:

pi@node3:~ $ lsblk
NAME        MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda           8:0    0  1.8T  0 disk 
sdb           8:16   0  1.8T  0 disk 

You should also make sure the storage_nfs_dir variable is set appropriately for ZFS in your config.yml.

This ZFS layout was configured originally for the Turing Pi 2 board, which has two built-in SATA ports connected directly to node 3. In the future, the configuration may be genericized a bit better.
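
Once the playbook has created the pool on node 3, you can sanity-check the mirror with the standard ZFS commands (the pool and dataset names shown depend on the playbook's configuration):

pi@node3:~ $ sudo zpool status
pi@node3:~ $ sudo zfs list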

Ceph Storage Configuration

You could also run Ceph on a Pi cluster; see the storage configuration playbook inside the ceph directory.

This configuration is not yet integrated into the general K3s setup.

Cluster configuration and K3s installation

Run the playbook:

ansible-playbook main.yml

At the end of the playbook, there should be an instance of Drupal running on the cluster. If you log into node 1, you should be able to access it with curl localhost.

Alternatively, if you have SSH tunnelling configured (see later section), you could access http://[your-vps-ip-or-hostname]:8080/ and you'd see the site.

You can also log into node 1, switch to the root user account (sudo su), then use kubectl to manage the cluster (e.g. view Drupal pods with kubectl get pods -n drupal).

The Kubernetes Ingress object for Drupal (how HTTP requests from outside the cluster make it to Drupal) can be found by running kubectl get ingress -n drupal. Take the IP address or hostname there and enter it in your browser on a computer on the same network, and voila! You should see Drupal's installer.

K3s' kubeconfig file is located at /etc/rancher/k3s/k3s.yaml. If you'd like to manage the cluster from other hosts (or using a tool like Lens), copy the contents of that file, replacing localhost with the IP address or hostname of the control plane node, and paste the contents into a file ~/.kube/config.
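
One way to do that from the Ansible controller, assuming node1.local is the control plane node and the pi user has passwordless sudo (adjust the sed pattern if your k3s.yaml lists 127.0.0.1 instead of localhost):

ssh pi@node1.local "sudo cat /etc/rancher/k3s/k3s.yaml" | sed "s/localhost/node1.local/" > ~/.kube/config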

Upgrading the cluster

Run the upgrade playbook:

ansible-playbook upgrade.yml

Monitoring the cluster

Prometheus and Grafana are used for monitoring. Grafana can be accessed via port forwarding (or you could choose to expose it another way).

To access Grafana:

  1. Make sure you set up a valid ~/.kube/config file (see 'K3s installation' above).
  2. Run kubectl port-forward service/cluster-monitoring-grafana :80
  3. Grab the port that's output, and browse to localhost:[port], and bingo! Grafana.

The default login is admin / prom-operator, but you can also get the secret with kubectl get secret cluster-monitoring-grafana -o jsonpath="{.data.admin-password}" | base64 -D.

You can then explore all the Kubernetes and Pi-related dashboards under Dashboards in the 'General' folder.

Benchmarking the cluster

See the README file within the benchmarks folder.

Shutting down the cluster

The safest way to shut down the cluster is to run the following command:

ansible all -m community.general.shutdown -b

Note: If using the SSH tunnel, you might want to run the command first on nodes 2-4, then on node 1. So first run ansible 'all:!control_plane' [...], then run it again just for control_plane.
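
Spelled out with the module invocation from above, that would look something like:

ansible 'all:!control_plane' -m community.general.shutdown -b
ansible control_plane -m community.general.shutdown -b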

Then after you confirm the nodes are shut down (with K3s running, it can take a few minutes), press the cluster's power button (or yank the Ethernet cables if using PoE) to power down all Pis physically. Then you can switch off or disconnect your power supply.

Static network configuration (optional, but recommended)

Because I use my cluster both on-premises and remotely (via a 4G LTE modem connected to the first Pi), I set it up on its own subnet (10.1.1.x). You can change the subnet that's used via the ipv4_subnet_prefix variable in config.yml.

To configure the local network for the Pi cluster (this is optional; you can still use the rest of the configurations without a custom local network), run the playbook:

ansible-playbook networking.yml

After running the playbook, the Pis will still be accessible over their former DHCP-assigned IP addresses until they are rebooted. After the nodes are rebooted, you will need to make sure your workstation is connected to an interface using the same subnet as the cluster (e.g. 10.1.1.x).

Note: After the networking changes are made, since this playbook uses DNS names (e.g. node1.local) instead of IP addresses, your computer will still be able to connect to the nodes directly, assuming your network has IPv6 support. Pinging the nodes on their new IP addresses will not work, however. For better network compatibility, it's recommended you set up a separate network interface on the Ansible controller that's on the same subnet as the Pis in the cluster:

On my Mac, I connected a second network interface and manually configured its IP address as 10.1.1.10, with subnet mask 255.255.255.0, and that way I could still access all the nodes via IP address or their hostnames (e.g. node2.local).
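
On a Linux workstation, the equivalent is adding a second address on the cluster subnet to whichever interface is wired to the cluster's switch (eth1 here is just a placeholder for that interface):

sudo ip addr add 10.1.1.10/24 dev eth1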

Because the cluster subnet needs its own router, node 1 is configured as a router, using wlan0 as the primary interface for Internet traffic by default. The other nodes get their Internet access through node 1.

Switch between 4G LTE and WiFi (optional)

The network configuration defaults to an active_internet_interface of wlan0, meaning node 1 will route all Internet traffic for the cluster through its WiFi interface.

Assuming you have a working 4G card in slot 1, you can switch node 1 to route through an alternate interface (e.g. usb0):

  1. Set active_internet_interface: "usb0" in your config.yml
  2. Run the networking playbook again: ansible-playbook networking.yml

You can switch back and forth between interfaces using the steps above.
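
To confirm which interface node 1 is currently routing through, you can check its default route; wlan0 or usb0 should appear, depending on the active_internet_interface setting:

pi@node1:~ $ ip route show default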

Reverse SSH and HTTP tunnel configuration (optional)

For my own experimentation, I ran my Pi cluster 'off-grid', using a 4G LTE modem, as mentioned above.

Because my mobile network provider uses CG-NAT, there is no way to remotely access the cluster, or serve web traffic to the public internet from it, at least not out of the box.

I am using a reverse SSH tunnel to enable direct remote SSH and HTTP access. To set that up, I configured a VPS I run to use TCP Forwarding (see this blog post for details), and I configured an SSH key so node 1 could connect to my VPS (e.g. ssh my-vps-username@my-vps-hostname-or-ip).

Then I set the reverse_tunnel_enable variable to true in my config.yml, and configured the VPS username and hostname options.

Doing that and running the main.yml playbook configures autossh on node 1, and will try to get a connection through to the VPS on ports 2222 (to node 1's port 22) and 8080 (to node 1's port 80).

After that's done, you should be able to log into the cluster through your VPS with a command like:

$ ssh -p 2222 pi@[my-vps-hostname]
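
And, if the HTTP tunnel is up, you should be able to reach the Drupal site through the VPS as well (assuming nothing else on the VPS is already using port 8080):

$ curl http://[my-vps-hostname]:8080/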

Note: If autossh isn't working, it could be that it didn't exit cleanly, and a tunnel is still reserving the port on the remote VPS. That's often the case if you run sudo systemctl status autossh and see messages like Warning: remote port forwarding failed for listen port 2222.

In that case, log into the remote VPS and run pgrep ssh | xargs kill to kill off all active SSH sessions, then autossh should pick back up again.

Warning: Use this feature at your own risk. Security is your own responsibility, and for better protection, you should probably avoid directly exposing your cluster (e.g. by disabling the GatewayPorts option, so you can only access the cluster while already logged into your VPS).

Caveats

These playbooks are used in both production and test clusters, but security is always your responsibility. If you want to use any of this configuration in production, take ownership of it and understand how it works so you don't wake up to a hacked Pi cluster one day!

Author

The repository was created in 2023 by Jeff Geerling, author of Ansible for DevOps, Ansible for Kubernetes, and Kubernetes 101.

More Repositories

1

ansible-for-devops

Ansible for DevOps examples.
Python
7,475
star
2

mac-dev-playbook

Mac setup and configuration via Ansible.
Shell
5,735
star
3

internet-pi

Raspberry Pi config for all things Internet.
Jinja
3,690
star
4

macos-virtualbox-vm

Instructions and script to help you create a VirtualBox VM running macOS.
Shell
2,519
star
5

ansible-vagrant-examples

Ansible examples using Vagrant to deploy to local VMs.
2,034
star
6

raspberry-pi-dramble

DEPRECATED - Raspberry Pi Kubernetes cluster that runs HA/HP Drupal 8
Shell
1,656
star
7

ansible-role-docker

Ansible Role - Docker
1,515
star
8

drupal-vm

A VM for Drupal development
Jinja
1,371
star
9

pi-webcam

Automation to configure a Raspberry Pi as a USB OTG webcam
1,322
star
10

internet-monitoring

Monitor your network and internet speed with Docker & Prometheus
1,237
star
11

raspberry-pi-pcie-devices

Raspberry Pi PCI Express device compatibility database
HTML
1,230
star
12

ansible-role-mysql

Ansible Role - MySQL
Jinja
988
star
13

ansible-role-jenkins

Ansible Role - Jenkins CI
Groovy
802
star
14

ansible-role-nginx

Ansible Role - Nginx
Jinja
774
star
15

ansible-role-security

Ansible Role - Security
Jinja
706
star
16

ansible-role-certbot

Ansible Role - Certbot (for Let's Encrypt)
Shell
698
star
17

ansible-role-gitlab

Ansible Role - GitLab
Jinja
653
star
18

packer-boxes

Jeff Geerling's Packer build configurations for Vagrant boxes.
Shell
635
star
19

ansible-for-kubernetes

Ansible and Kubernetes examples from Ansible for Kubernetes Book
Shell
612
star
20

my-backup-plan

How I back up all my data.
Shell
576
star
21

dotfiles

My configuration. Minimalist, but helps save a few thousand keystrokes a day.
Shell
540
star
22

ansible-role-firewall

Ansible Role - iptables Firewall configuration.
Shell
481
star
23

ansible-role-postgresql

Ansible Role - PostgreSQL
Jinja
478
star
24

kubernetes-101

Kubernetes 101 - by Jeff Geerling
HTML
477
star
25

ansible-role-php

Ansible Role - PHP
Jinja
468
star
26

ansible-role-kubernetes

Ansible Role - Kubernetes
Jinja
425
star
27

Ping

A PHP class to ping hosts.
PHP
415
star
28

ansible-role-apache

Ansible Role - Apache 2.x.
Jinja
392
star
29

ansible-role-nodejs

Ansible Role - Node.js
Jinja
386
star
30

turing-pi-cluster

DEPRECATED - Turing Pi cluster configuration for Raspberry Pi Compute Modules
Jinja
347
star
31

awx-container

Ansible Container project that manages the lifecycle of AWX on Docker.
297
star
32

ansible-role-java

Ansible Role - Java
Jinja
293
star
33

ansible-role-ntp

Ansible Role - NTP
Jinja
290
star
34

pi-timelapse

Time-lapse app for Raspberry Pi computers.
Python
278
star
35

temperature-monitor

Raspberry Pi-based home temperature monitoring network.
JavaScript
256
star
36

ansible-role-awx

DEPRECATED Ansible Role - AWX
230
star
37

ansible-role-homebrew

Ansible Role - Homebrew (MOVED to geerlingguy.mac collection)
Shell
228
star
38

ansible-role-redis

Ansible Role - Redis
Jinja
228
star
39

packer-centos-7

This build has been moved - see README.md
223
star
40

ansible-collection-mac

Collection of macOS automation tools for Ansible.
Shell
220
star
41

ansible-role-dotfiles

Ansible Role - Easy and flexible dotfile installation.
214
star
42

ansible-role-nfs

Ansible Role - NFS
Jinja
207
star
43

ansible-role-haproxy

Ansible Role - HAProxy
Jinja
194
star
44

ansible-role-git

Ansible Role - Git
188
star
45

ansible-role-repo-epel

Ansible Role - EPEL Repository for RHEL/CentOS
175
star
46

ansible-role-composer

Ansible Role - Composer PHP Dependency Manager
Jinja
175
star
47

ansible-role-pip

Ansible Role - Pip (for Python)
174
star
48

turing-pi-2-cluster

DEPRECATED - Turing Pi 2 Cluster
Jinja
174
star
49

ansible-role-logstash

Ansible Role - Logstash
Jinja
168
star
50

ansible-role-elasticsearch

Ansible Role - Elasticsearch
Jinja
167
star
51

pico-w-garage-door-sensor

Wireless garage door sensor for Home Assistant powered by Raspberry Pi Pico W
Python
166
star
52

drupal-for-kubernetes

Drupal Example Site for Kubernetes
PHP
149
star
53

ansible-role-filebeat

Ansible Role - Filebeat for ELK stack
Jinja
137
star
54

ansible-role-backup

Ansible Role - Backup for simple servers
Shell
135
star
55

ansible-role-ansible

Ansible Role - Ansible
130
star
56

airgradient-prometheus

AirGradient Prometheus exporter.
C++
116
star
57

ansible-role-kibana

Ansible Role - Kibana
Jinja
115
star
58

ansible-role-swap

Ansible Role - Swap
107
star
59

obs-task-list-overlay

An HTML and Node.js-based task list overlay for OBS.
JavaScript
105
star
60

drupal-pi

Drupal on Docker on a Raspberry Pi. Pi Dramble's little brother.
Jinja
105
star
61

ansible-role-glusterfs

Ansible Role - GlusterFS
104
star
62

ansible-role-raspberry-pi

Configures a Raspberry Pi (running Raspbian).
102
star
63

docker-centos7-ansible

CentOS 7 Docker container for Ansible playbook and role testing.
Dockerfile
101
star
64

packer-ubuntu-1804

This build has been moved - see README.md
101
star
65

ansible-role-solr

Ansible Role - Apache Solr
Shell
97
star
66

ansible-mastodon

Mastodon installation on a single server using Ansible.
Jinja
95
star
67

sbc-reviews

Jeff Geerling's SBC review data - Raspberry Pi, Radxa, Orange Pi, etc.
93
star
68

ansible-role-supervisor

Ansible Role - Supervisor
Shell
92
star
69

ansible-role-docker_arm

Ansible Role - Docker for ARM and Pi
91
star
70

ansible-requirements-updater

Update your requirements.yml with this grisly Ansible playbook.
90
star
71

ansible-role-php-versions

Ansible Role - PHP Versions
88
star
72

ansible-role-ruby

Ansible Role - Ruby
Shell
86
star
73

JJG-Ansible-Windows

[DEPRECATED] Windows shell provisioning script to bootstrap Ansible from within a Vagrant VM.
Shell
85
star
74

ansible-role-mas

Ansible Role - Mac App Store CLI (MOVED to geerlingguy.mac collection)
84
star
75

ansible-role-drupal

Ansible Role - Drupal
84
star
76

ansible-role-postfix

Ansible Role - Postfix
82
star
77

tower-operator

DEPRECATED: This project was moved and renamed to: https://github.com/ansible/awx-operator
Dockerfile
82
star
78

ansible-role-varnish

Ansible Role - Varnish HTTP accelerator
Jinja
82
star
79

packer-ubuntu-1404

DEPRECATED - Packer Example - Ubuntu 14.04 Vagrant Box using Ansible provisioner
Shell
82
star
80

pi4gpu

Raspberry Pi GPU Carrier Board
OpenSCAD
82
star
81

pi-camera

A Raspberry Pi Camera
Python
79
star
82

pi-router

Raspberry Pi-based OpenWRT router config for 4G/5G Waveshare Dual Ethernet board.
Dockerfile
79
star
83

packer-centos-6

This build has been moved - see README.md
79
star
84

docker-ubuntu1804-ansible

Ubuntu 18.04 LTS (Bionic) Docker container for Ansible playbook and role testing.
Dockerfile
78
star
85

top500-benchmark

Automated Top500 benchmark for clusters or single nodes.
Jinja
76
star
86

deskpi-super6c-cluster

DEPRECATED - DeskPi Super6c 6-node Raspberry Pi CM4 Cluster
76
star
87

docker-examples

There are many like it, but this one is mine.
Python
75
star
88

youtube-10k-pods

10,000 Kubernetes Pods for 10,000 Subscribers
74
star
89

docker-ubuntu2004-ansible

Ubuntu 20.04 LTS (Focal Fossa) Docker container for Ansible playbook and role testing.
Dockerfile
74
star
90

pi-bell-slapper

The King of Ding. Internet-connected Raspberry Pi-based notification bell.
OpenSCAD
74
star
91

baby-safe-temp

Safe temperature monitor for baby's room. Made for Raspberry Pi Pico.
Python
74
star
92

ansible-role-memcached

Ansible Role - Memcached
Jinja
68
star
93

ansible-role-node_exporter

Ansible role - Node exporter
Jinja
66
star
94

ansible-role-mailhog

Ansible Role - MailHog for catching and viewing emails
Shell
62
star
95

molecule-playbook-testing

This is an example from the Ansible 101 livestream
62
star
96

ansible-role-ssh-chroot-jail

Ansible Role - SSH chroot jail config
Shell
60
star
97

pi-nvr

Raspberry Pi NVR for home CCTV recording.
Dockerfile
58
star
98

Imap

Simple wrapper class for PHP's IMAP-related email functions.
PHP
57
star
99

Request

A simple PHP HTTP request class.
PHP
56
star
100

ansible-role-samba

Ansible Role - Samba
55
star