Dan Manners' Homelab
All of the READMEs are in a state of flux at the moment. I'm refactoring much of the repository, but I'm happy to answer any questions in the k8s@Home Discord server or directly on Discord! Feel free to reach me at danmanners with any questions, or at [email protected]!
Current status: BETA (but highly stable)
This project aims to use industry-standard tooling and practices both to perform its functions and to act as a repository that people can reference for their own learning and work.
🔍 Features
- Easy-to-replicate GitOps
- Modularity; make it easy to add/remove components
- Hybrid Multi-Cloud
- External DNS updates
- Automagic cert management
- In-Cluster Container Registry
- Monitoring and alerting 🚧
💡 Current Tech Stack
Name | Description |
---|---|
ArgoCD | GitOps for Kubernetes |
AWS | Cloud Provider |
Blocky | Fast and lightweight DNS proxy as ad-blocker |
Buildah | Container Building |
Cert-Manager | Certificate Manager |
Cilium | CNI utilizing eBPF for Observability and Security |
CloudNativePG | Kubernetes operator covering lifecycle of HA PostgreSQL Clusters |
CSI-Driver-NFS | Kubernetes NFS Driver for persistent storage |
Dex IDP | Federated OIDC |
External-DNS | Configure and manage External DNS servers |
GitHub | Popular Code Management through Git |
Grafana | Metrics Visualization |
Harbor | Open Source Container Registry |
Helm | Kubernetes Package Management |
Jenkins | Open-Source Automation Server |
Kubernetes | Container Orchestration |
Let's Encrypt | Free TLS certificates |
Maddy | Composable all-in-one mail server |
MetalLB | Kubernetes bare-metal Load Balancer |
Mozilla SOPS | Simple and flexible tool for managing secrets |
Podman | Container and Pod management |
Prometheus | Metrics and Data Collection |
Python | Python Programming Language |
Raspberry Pi | Baremetal ARM SoC Hardware! |
SonarQube | Static code analysis |
Sonatype Nexus-OSS | Manage binaries and build artifacts |
Tekton | Cloud-Native CI/CD |
Ubuntu | Operating System |
Uptime Kuma | Fancy self-hosted system monitoring |
WikiJS | Open-Source Wiki/Documentation Service |
Deployment Order of Operations
Identifying Problems, Troubleshooting Steps, and more
Below are a few things that may help when troubleshooting or getting things up and running.
Traffic is not getting from the edge (cloud) nodes to the on-prem cluster networking
You can validate whether your remote traffic is making it on site by using dig inside of the netshoot container:
kubectl run temp-troubleshooting \
--rm -it -n default \
--overrides='{"apiVersion":"v1","spec":{"nodeSelector":{"kubernetes.io/hostname":"talos-aws-grav01"}}}' \
--pod-running-timeout 3m \
--image=docker.io/nicolaka/netshoot:latest \
--command -- /bin/bash
Then, you can validate that you can reach CoreDNS or another pod/service IP from your remote node.
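As a minimal sketch of that check, the helper below wraps dig so that a broken path fails quickly instead of hanging. The function name `check_dns` is hypothetical, and it assumes you are already at the bash prompt inside the netshoot pod.

```shell
# Run inside the netshoot pod started above.
# check_dns NAME [SERVER]: resolve NAME with dig, optionally against a
# specific DNS server IP instead of the pod's configured resolver.
check_dns() {
  local name="$1"
  local server="${2:-}"
  local answer
  # +time and +tries keep an unreachable resolver from stalling the shell.
  # dig exits non-zero when no server replies, and prints nothing with
  # +short when the name has no records, so both conditions are checked.
  if answer="$(dig +short +time=2 +tries=1 ${server:+"@$server"} "$name")" && [ -n "$answer" ]; then
    echo "OK   $name -> $answer"
  else
    echo "FAIL $name"
    return 1
  fi
}
```

For example, `check_dns kubernetes.default.svc.cluster.local` resolves the API server's service name through the pod's resolver, while `check_dns kubernetes.default.svc.cluster.local 10.96.0.10` queries one DNS service IP directly. Both the `cluster.local` domain and the `10.96.0.10` address are assumptions here; substitute the values from your own cluster.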
If you can prove it is not working, you may want to restart all of Cilium:
kubectl rollout restart -n kube-system daemonset cilium
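If it helps to know when the restart has actually finished before re-testing, the two steps can be combined. This is only a sketch: the function name `restart_cilium` and the five-minute timeout are assumptions, not something this repository defines.

```shell
# restart_cilium: restart the Cilium DaemonSet, then block until every
# agent pod has been replaced or the timeout expires.
restart_cilium() {
  kubectl rollout restart -n kube-system daemonset cilium &&
    kubectl rollout status -n kube-system daemonset cilium --timeout=5m
}
```

Once `restart_cilium` returns successfully, repeat the dig test from the remote node to confirm whether traffic is flowing again.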
To-Do Items
- Ensure that ALL services are tagged for the appropriate hardware (arm64 or amd64) to ensure runtime success
  - Alternatively, ensure that all containers are built for multi-architecture.
- Ensure that ALL application and service subdirectories have READMEs explaining what they're doing and what someone else may need to modify for their own environment
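For the hardware-tagging item above, one way to experiment is to pin a workload with a nodeSelector on the well-known `kubernetes.io/arch` node label. The sketch below is illustrative only: `pin_arch` is a hypothetical helper, and in a GitOps setup the same nodeSelector would normally be committed to the manifest rather than patched live, since a live patch may be reverted on the next sync.

```shell
# pin_arch NAMESPACE DEPLOYMENT ARCH
# Merge a nodeSelector into the Deployment's pod template so its pods
# only schedule onto nodes of the given CPU architecture (arm64 or amd64).
pin_arch() {
  local ns="$1" deploy="$2" arch="$3"
  kubectl patch deployment "$deploy" -n "$ns" --type merge \
    -p "{\"spec\":{\"template\":{\"spec\":{\"nodeSelector\":{\"kubernetes.io/arch\":\"$arch\"}}}}}"
}
```

Running `kubectl get nodes -L kubernetes.io/arch` first shows which architecture each node reports, which makes it easier to decide what to pass as ARCH.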
Gratitude and Thanks
This README redesign was inspired by several other homelab repos, individuals, and communities.
Individuals
Communities
The DevOps Lounge
K8s-at-Home
Without the inspiration and help of these individuals and communities, I don't think my own project would be nearly as far along. Make sure to check out their projects as well!