How Excloud builds Kubernetes clusters from scratch — no kubeadm, no magic. Every template, every certificate, every line explained.
k8sapi is the Kubernetes-facing API for the Excloud cloud platform. When a user says "I want a Kubernetes cluster," this service does all the heavy lifting:
Calls computeapi to provision virtual machines that become control plane nodes.
Creates a full PKI — a Certificate Authority and every cert each component needs to communicate securely.
Uses 16 Go templates to generate systemd units, static pod manifests, kubeconfigs, CNI config, and shell scripts.
Packages everything into JSON that VMs fetch and use to self-assemble into a working Kubernetes cluster.
Key: This does NOT use kubeadm. Everything is built from raw binaries and hand-crafted configuration — full control from first principles.
The concepts you need to understand the bootstrap process.
The control plane makes all the decisions. It has four components:
The front door. Every single request — from users, nodes, other components — goes through here. It validates requests, authenticates callers, and stores state in etcd.
The corrector. Runs dozens of control loops. Each watches a resource type and ensures real state matches desired state. If you want 3 replicas but only 2 exist, it creates another.
The planner. When a new pod needs to run, it picks the best node based on resources, affinity rules, and constraints.
The memory. A distributed key-value database storing all cluster state. If etcd dies, the cluster loses its memory.
Kubelet runs on every machine. It receives instructions from the API server ("run this pod") and ensures containers are running. Its superpower: it can run static pods by watching a directory on disk (/etc/kubernetes/manifests/). Any YAML placed there becomes a running pod — no API server needed.
Problem: the API server runs as a pod, but pods are managed by the API server. How do you start it?
Answer: static pods. Kubelet watches a directory and creates pods from YAML files on disk, without any API server involvement. In this project, three control plane components (kube-apiserver, kube-controller-manager, kube-scheduler) run as static pods. Their manifests are rendered by k8sapi and placed on disk before kubelet starts.
Why not etcd too? Etcd runs as a systemd service (not a static pod). Systemd gives more reliable lifecycle management for this critical component.
A container is a lightweight, isolated process with its dependencies bundled. A pod wraps one or more containers that share networking (they can reach each other via localhost) and storage. Most pods have a single container. The container runtime here is containerd.
Pods are ephemeral — they come and go, and their IPs change. A Service provides a stable virtual IP (ClusterIP) that load-balances traffic to the right pods. Key services here: kubernetes (API server, at 10.96.0.1) and kube-dns (CoreDNS, at 10.96.0.10).
Kubernetes delegates pod networking to a CNI plugin. The CNI assigns IPs, sets up routes between nodes, and enforces network policies. This project uses Cilium, but during bootstrap uses a temporary bridge CNI (because Cilium itself runs as pods, and pods need networking to start — another chicken-and-egg).
While CNI handles pod-to-pod traffic, kube-proxy handles service routing. It runs on every node as a DaemonSet and programs iptables rules to intercept traffic to service ClusterIPs and redirect to actual pod IPs.
The cluster's internal DNS server. Lets pods find services by name (e.g., my-service.default.svc.cluster.local). Runs as a Deployment with 2 replicas, served at 10.96.0.10. Kubelet configures all pods to use this IP for DNS.
K8s uses TLS certificates for all component-to-component communication. A Certificate Authority (CA) signs all certs. Because everyone trusts the CA, they can trust each other. Certificate fields are used for identity: Common Name (CN) = username, Organization (O) = group. The admin cert has CN=cluster-admin, O=system:masters — full root access.
A YAML file telling a K8s client: (1) where the API server is, (2) who you are (your client cert), (3) how to verify the server (the CA cert). This project generates 5 kubeconfigs — one per component.
Role-Based Access Control maps identities to permissions. Built-in groups: system:masters (full admin), system:nodes (kubelet permissions), system:node-proxier (kube-proxy), etc.
In one sentence: User calls k8sapi → creates VMs → VMs boot, ask IMDS "who am I?" → call back to k8sapi for bootstrap bundle → install everything → Kubernetes cluster.
k8sapi shares a PostgreSQL database with other Excloud services:
| Column | Type | Purpose |
|---|---|---|
| id | bigserial PK | Auto-incremented cluster ID |
| org_id | bigint | Owning organization |
| project_id | bigint | Project (hardcoded 1 for now) |
| ca_cert | text | PEM-encoded CA certificate |
| priv_key | text | PEM-encoded CA private key |
| service_acc_cert | text | Service Account public key |
| service_acc_priv_key | text | Service Account private key |
| Column | Type | Purpose |
|---|---|---|
| id | bigserial PK | Node record ID |
| kubecluster_id | bigint FK | Which cluster this belongs to |
| vm_id | bigint | The VM running this node |
| vm_ip | text | Private IP |
| cert / priv_key | text | Node's TLS cert and key |
| Column | Type | Purpose |
|---|---|---|
| hash | text | SHA-256 of token |
| vm_id | bigint | Which VM |
| org_id | bigint | Which org |
| expiry_at | timestamptz | Expiration |
k8sapi JOINs these three tables to resolve a VM's network addresses (private IPv4, IPv6, public IPv4).
The main endpoint. Requires bearer token auth. Creates VMs, generates PKI, returns admin kubeconfig.
// Request
{ "control_plane_count": 1, // 1-3 (default 1)
"control_plane_image_id": 42, // required
"control_plane_instance_type": "e2.medium",
"subnet_id": 5, // required
"zone_id": 1, // default 1
"security_group_ids": [], // auto-created if empty
"allocate_public_ipv4": true,
"ssh_pubkey": "ssh-ed25519 ...",
"root_volume_size_gib": 20 }
// Response
{ "cluster_id": 42,
"name": "exk8s-5-2a",
"kubeconfig": "apiVersion: v1...",
"control_plane_vm_ids": [101],
"control_plane_dns_name": "exk8s-5-2a.k8s.excloud.co.in",
"service_cidr": "10.96.0.0/12",
"pod_cidr": "172.16.0.0/12",
"dns_configured": true }
Called by VMs themselves. Requires dual auth: IMDS token (X-exc-imds-token header) + bearer token. A VM can only fetch its own bundle (VM ID must match IMDS token). Returns JSON with all files, certs, and configs.
200 OK if healthy. 418 during shutdown (for LB drain).
Auto-generated API documentation.
The complete journey from API call to running cluster.
- Name pattern: exk8s-{org}-{id_hex}; DNS: exk8s-{org}-{id_hex}.k8s.excloud.co.in
- Each VM calls GET /bootstrap/control-plane/{node_id} with both tokens
- Alternative: scripts/bake-control-plane-image.sh does this at image build time, making boot much faster.
- Readiness: kubectl get --raw=/readyz, polled for up to 10 min

After Phase 7: etcd, kube-apiserver, kube-controller-manager, kube-scheduler, kubelet, Cilium, kube-proxy, CoreDNS — all running. Cluster is fully operational.
All 16 templates with every line explained. This is the most detailed section.
The very first script that runs when a VM boots. Injected as cloud-init userdata. Fetches the bootstrap bundle and kicks off installation.
Go data passed: struct{ BootstrapBaseURL string }
#!/usr/bin/env bash
set -euo pipefail # exit on error, undefined vars, pipe failures
mkdir -p /var/log/exk8s # create log directory
exec > >(tee -a /var/log/exk8s/bootstrap-userdata.log) 2>&1 # log all output to file AND stdout
IMDS_BASE_URL="${EXC_IMDS_BASE_URL:-http://imdsapi.excloud.in}" # IMDS endpoint (overridable)
BOOTSTRAP_BASE_URL="{{ .BootstrapBaseURL }}" # k8sapi URL, templated by Go
# Install curl + jq if missing (needed for HTTP + JSON parsing)
if ! command -v curl >/dev/null 2>&1 || ! command -v jq >/dev/null 2>&1; then
if command -v apt-get >/dev/null 2>&1; then
export DEBIAN_FRONTEND=noninteractive
apt-get update -y && apt-get install -y curl jq
fi
fi
# Poll IMDS for a session token — up to 120 attempts (10 minutes)
# The token proves "I am this specific VM in this specific org"
for _ in $(seq 1 120); do
IMDS_TOKEN="$(curl -fsS "${IMDS_BASE_URL}/token" || true)"
[[ -n "${IMDS_TOKEN}" ]] && break
sleep 5
done
[[ -z "${IMDS_TOKEN:-}" ]] && echo "failed to fetch imds token" && exit 1
# Get node identity (our VM ID) from IMDS
NODE_ID="$(curl -fsS -H "X-exc-imds-token: ${IMDS_TOKEN}" \
"${IMDS_BASE_URL}/latest/identity/node-identity" | jq -r '.node_id')"
# Get an org-level access token from IMDS (second auth factor)
BOOTSTRAP_ACCESS_TOKEN="$(curl -fsS -H "X-exc-imds-token: ${IMDS_TOKEN}" \
"${IMDS_BASE_URL}/latest/identity/access-token" | jq -r '.access_token')"
# Fetch bootstrap bundle from k8sapi with BOTH tokens
TMP_JSON="$(mktemp)" && trap 'rm -f "$TMP_JSON"' EXIT
for _ in $(seq 1 120); do
curl -fsS -H "X-exc-imds-token: ${IMDS_TOKEN}" \
-H "Authorization: Bearer ${BOOTSTRAP_ACCESS_TOKEN}" \
"${BOOTSTRAP_BASE_URL}/bootstrap/control-plane/${NODE_ID}" > "$TMP_JSON" && break
sleep 5
done
# Write every file from the bundle to disk
jq -r '.files | to_entries[] | @base64' "$TMP_JSON" | while IFS= read -r entry; do
key="$(printf '%s' "$entry" | base64 --decode | jq -r '.key')" # file path
value="$(printf '%s' "$entry" | base64 --decode | jq -r '.value')" # file content
mkdir -p "$(dirname "$key")" && printf '%s' "$value" > "$key"
done
# Make scripts executable
chmod +x /usr/local/bin/exk8s-*.sh 2>/dev/null || true
# Install prereqs if not already present (skip if image was pre-baked)
if ! command -v kubelet >/dev/null 2>&1; then
/usr/local/bin/exk8s-install-control-plane-prereqs.sh
fi
systemctl daemon-reload || true
/usr/local/bin/exk8s-bootstrap-control-plane.sh # Start everything
#!/usr/bin/env bash
set -euo pipefail
mkdir -p /var/log/exk8s
exec > >(tee -a /var/log/exk8s/bootstrap-userdata.log) 2>&1
IMDS_BASE_URL="${EXC_IMDS_BASE_URL:-http://imdsapi.excloud.in}"
BOOTSTRAP_BASE_URL="https://k8sapi.excloud.in"
# ... rest is pure bash, no more template variables ...
Key-value environment file sourced by all shell scripts. Contains every cluster parameter.
Go data: ControlPlaneNodeBundle fields directly
- CLUSTER_NAME: exk8s-5-2a. Used in component configs.
- KUBERNETES_VERSION: v1.35.3. Used to download correct binaries.
- CLUSTER_DOMAIN: cluster.local. The DNS domain for services.
- SERVICE_CIDR: 10.96.0.0/12. IP range for Service ClusterIPs.
- SERVICE_CLUSTER_IP: 10.96.0.1. First usable IP, reserved for the kubernetes service.
- DNS_SERVICE_IP: 10.96.0.10. CoreDNS Service IP. Kubelet points pods here for DNS.
- POD_CIDR: 172.16.0.0/12. IP range for pod IPs.
- APISERVER_ENDPOINT: https://127.0.0.1:6443. Local API server endpoint.
- NODE_NAME: the node's hostname (e.g. cp-1.cluster-42.internal).
- ETCD_INITIAL_CLUSTER: e.g. cp-1=https://10.0.0.5:2380,cp-2=https://10.0.0.6:2380. Tells etcd who its peers are.
- IS_INITIAL_CONTROL_PLANE: true for the first node only. Controls whether the addon installer runs.

CLUSTER_NAME=exk8s-5-2a
KUBERNETES_VERSION=v1.35.3
CLUSTER_DOMAIN=cluster.local
SERVICE_CIDR=10.96.0.0/12
SERVICE_CLUSTER_IP=10.96.0.1
DNS_SERVICE_IP=10.96.0.10
POD_CIDR=172.16.0.0/12
APISERVER_ENDPOINT=https://127.0.0.1:6443
CONTROL_PLANE_DNS_NAME=exk8s-5-2a.k8s.excloud.co.in
NODE_NAME=cp-1.cluster-42.internal
NODE_IPV4=10.0.1.5
NODE_IPV6=
ETCD_INITIAL_CLUSTER=cp-1.cluster-42.internal=https://10.0.1.5:2380,cp-2.cluster-42.internal=https://10.0.1.6:2380,cp-3.cluster-42.internal=https://10.0.1.7:2380
IS_INITIAL_CONTROL_PLANE=true
Systemd unit that runs etcd — the cluster's distributed database.
Go data: struct{ Name, Addr, InitialCluster string }
Example InitialCluster: cp-1=https://10.0.0.5:2380,cp-2=https://10.0.0.6:2380.

[Unit]
Description=etcd
Documentation=https://etcd.io
After=network-online.target
Wants=network-online.target
[Service]
Type=notify
ExecStart=/usr/local/bin/etcd \
--name=cp-1.cluster-42.internal \
--data-dir=/var/lib/etcd \
--listen-client-urls=https://10.0.1.5:2379,https://127.0.0.1:2379 \
--advertise-client-urls=https://10.0.1.5:2379 \
--listen-peer-urls=https://10.0.1.5:2380 \
--initial-advertise-peer-urls=https://10.0.1.5:2380 \
--initial-cluster=cp-1.cluster-42.internal=https://10.0.1.5:2380,cp-2.cluster-42.internal=https://10.0.1.6:2380,cp-3.cluster-42.internal=https://10.0.1.7:2380 \
--initial-cluster-state=new \
--trusted-ca-file=/etc/exk8s/pki/ca.pem \
--cert-file=/etc/exk8s/pki/etcd-peer.pem \
--key-file=/etc/exk8s/pki/etcd-peer-key.pem \
--client-cert-auth \
--peer-trusted-ca-file=/etc/exk8s/pki/ca.pem \
--peer-cert-file=/etc/exk8s/pki/etcd-peer.pem \
--peer-key-file=/etc/exk8s/pki/etcd-peer-key.pem \
--peer-client-cert-auth
Restart=always
RestartSec=5
LimitNOFILE=40000
[Install]
WantedBy=multi-user.target
Systemd unit for kubelet — the node agent.
Go data: struct{ Hostname, NodeAddress string }
The - prefix on EnvironmentFile means "don't fail if the file is missing."

[Unit]
Description=Kubernetes Kubelet
Documentation=https://kubernetes.io/docs/
After=network-online.target containerd.service
Wants=network-online.target
[Service]
EnvironmentFile=-/etc/exk8s/config/cluster.env
ExecStart=/usr/local/bin/kubelet \
--config=/var/lib/kubelet/config.yaml \
--kubeconfig=/etc/exk8s/kubeconfig/kubelet.kubeconfig \
--hostname-override=cp-1.cluster-42.internal \
--node-ip=10.0.1.5
Restart=always
RestartSec=5
[Install]
WantedBy=multi-user.target
KubeletConfiguration — detailed settings for the kubelet agent.
Go data: struct{ DNSServiceIP, ClusterDomain, NodeAddress string }
ClusterDomain is cluster.local, the search domain appended to short DNS names.

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
authentication:
anonymous:
enabled: false
x509:
clientCAFile: /etc/exk8s/pki/ca.pem
webhook:
enabled: true
authorization:
mode: Webhook
cgroupDriver: systemd
clusterDNS:
- 10.96.0.10
clusterDomain: cluster.local
containerRuntimeEndpoint: unix:///run/containerd/containerd.sock
failSwapOn: false
healthzBindAddress: 127.0.0.1
readOnlyPort: 0
rotateCertificates: false
serverTLSBootstrap: false
staticPodPath: /etc/kubernetes/manifests
tlsCertFile: /etc/exk8s/pki/kubelet-client.pem
tlsPrivateKeyFile: /etc/exk8s/pki/kubelet-client-key.pem
address: 10.0.1.5
Generic kubeconfig template used 5 times (admin, kubelet, kube-proxy, controller-manager, scheduler).
Go data: struct{ ClusterName, CAPEM, APIEndpoint, UserName, CertPEM, KeyPEM string }
Note: b64 is a custom template function registered in renderer.go that base64-encodes strings.
apiVersion: v1
kind: Config
clusters:
- name: exk8s-5-2a
cluster:
certificate-authority-data: LS0tLS1CRUdJTi... (base64 of CA PEM)
server: https://127.0.0.1:6443
users:
- name: cluster-admin
user:
client-certificate-data: LS0tLS1CRUdJTi... (base64 of admin cert PEM)
client-key-data: LS0tLS1CRUdJTi... (base64 of admin key PEM)
contexts:
- name: cluster-admin
context:
cluster: exk8s-5-2a
user: cluster-admin
current-context: cluster-admin
The most critical manifest — the API server static pod. This is what kubelet starts from /etc/kubernetes/manifests/.
Go data: struct{ KubernetesVersion, APIAddress, EtcdServers, ControlPlaneDNSName, ServiceCIDR string }
Example EtcdServers entry: https://10.0.0.5:2379.

apiVersion: v1
kind: Pod
metadata:
name: kube-apiserver
namespace: kube-system
spec:
hostNetwork: true
priorityClassName: system-node-critical
containers:
- name: kube-apiserver
image: registry.k8s.io/kube-apiserver:v1.35.3
command:
- kube-apiserver
- --advertise-address=10.0.1.5
- --allow-privileged=true
- --authorization-mode=Node,RBAC
- --client-ca-file=/etc/exk8s/pki/ca.pem
- --enable-admission-plugins=NodeRestriction
- --etcd-cafile=/etc/exk8s/pki/ca.pem
- --etcd-certfile=/etc/exk8s/pki/apiserver-etcd-client.pem
- --etcd-keyfile=/etc/exk8s/pki/apiserver-etcd-client-key.pem
- --etcd-servers=https://10.0.1.5:2379,https://10.0.1.6:2379,https://10.0.1.7:2379
- --kubelet-client-certificate=/etc/exk8s/pki/apiserver-kubelet-client.pem
- --kubelet-client-key=/etc/exk8s/pki/apiserver-kubelet-client-key.pem
- --kubelet-preferred-address-types=InternalIP,Hostname
- --proxy-client-cert-file=/etc/exk8s/pki/admin.pem
- --proxy-client-key-file=/etc/exk8s/pki/admin-key.pem
- --requestheader-client-ca-file=/etc/exk8s/pki/ca.pem
- --requestheader-allowed-names=cluster-admin
- --secure-port=6443
- --service-account-issuer=https://exk8s-5-2a.k8s.excloud.co.in:6443
- --service-account-key-file=/etc/exk8s/pki/service-account.pem
- --service-account-signing-key-file=/etc/exk8s/pki/service-account-key.pem
- --service-cluster-ip-range=10.96.0.0/12
- --tls-cert-file=/etc/exk8s/pki/apiserver.pem
- --tls-private-key-file=/etc/exk8s/pki/apiserver-key.pem
- --v=2
livenessProbe:
httpGet:
host: 127.0.0.1
path: /livez
port: 6443
scheme: HTTPS
initialDelaySeconds: 15
periodSeconds: 10
readinessProbe:
httpGet:
host: 127.0.0.1
path: /readyz
port: 6443
scheme: HTTPS
periodSeconds: 5
volumeMounts:
- name: exk8s-pki
mountPath: /etc/exk8s/pki
readOnly: true
volumes:
- name: exk8s-pki
hostPath:
path: /etc/exk8s/pki
type: DirectoryOrCreate
Controller manager — runs reconciliation loops.
Go data: struct{ KubernetesVersion, ClusterName, PodCIDR string }
apiVersion: v1
kind: Pod
metadata:
name: kube-controller-manager
namespace: kube-system
spec:
hostNetwork: true
priorityClassName: system-node-critical
containers:
- name: kube-controller-manager
image: registry.k8s.io/kube-controller-manager:v1.35.3
command:
- kube-controller-manager
- --authentication-kubeconfig=/etc/exk8s/kubeconfig/controller-manager.kubeconfig
- --authorization-kubeconfig=/etc/exk8s/kubeconfig/controller-manager.kubeconfig
- --bind-address=127.0.0.1
- --cluster-name=exk8s-5-2a
- --cluster-cidr=172.16.0.0/12
- --configure-cloud-routes=false
- --kubeconfig=/etc/exk8s/kubeconfig/controller-manager.kubeconfig
- --leader-elect=true
- --root-ca-file=/etc/exk8s/pki/ca.pem
- --service-account-private-key-file=/etc/exk8s/pki/service-account-key.pem
- --use-service-account-credentials=true
- --v=2
volumeMounts:
- name: exk8s-pki
mountPath: /etc/exk8s/pki
readOnly: true
- name: exk8s-kubeconfig
mountPath: /etc/exk8s/kubeconfig
readOnly: true
volumes:
- name: exk8s-pki
hostPath:
path: /etc/exk8s/pki
type: DirectoryOrCreate
- name: exk8s-kubeconfig
hostPath:
path: /etc/exk8s/kubeconfig
type: DirectoryOrCreate
Scheduler — assigns pods to nodes. The simplest static pod.
Go data: struct{ KubernetesVersion string }
apiVersion: v1
kind: Pod
metadata:
name: kube-scheduler
namespace: kube-system
spec:
hostNetwork: true
priorityClassName: system-node-critical
containers:
- name: kube-scheduler
image: registry.k8s.io/kube-scheduler:v1.35.3
command:
- kube-scheduler
- --authentication-kubeconfig=/etc/exk8s/kubeconfig/scheduler.kubeconfig
- --authorization-kubeconfig=/etc/exk8s/kubeconfig/scheduler.kubeconfig
- --bind-address=127.0.0.1
- --kubeconfig=/etc/exk8s/kubeconfig/scheduler.kubeconfig
- --leader-elect=true
- --v=2
volumeMounts:
- name: exk8s-kubeconfig
mountPath: /etc/exk8s/kubeconfig
readOnly: true
volumes:
- name: exk8s-kubeconfig
hostPath:
path: /etc/exk8s/kubeconfig
type: DirectoryOrCreate
kube-proxy DaemonSet — runs on every node, handles service → pod routing via iptables.
Go data: struct{ KubernetesVersion, PodCIDR string }
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: kube-proxy
namespace: kube-system
labels:
k8s-app: kube-proxy
spec:
selector:
matchLabels:
k8s-app: kube-proxy
template:
metadata:
labels:
k8s-app: kube-proxy
spec:
hostNetwork: true
priorityClassName: system-node-critical
tolerations:
- key: CriticalAddonsOnly
operator: Exists
- key: node-role.kubernetes.io/control-plane
operator: Exists
effect: NoSchedule
- key: node-role.kubernetes.io/master
operator: Exists
effect: NoSchedule
containers:
- name: kube-proxy
image: registry.k8s.io/kube-proxy:v1.35.3
imagePullPolicy: IfNotPresent
command:
- kube-proxy
- --kubeconfig=/etc/exk8s/kubeconfig/kube-proxy.kubeconfig
- --cluster-cidr=172.16.0.0/12
- --hostname-override=$(NODE_NAME)
- --proxy-mode=iptables
env:
- name: NODE_NAME
valueFrom:
fieldRef:
fieldPath: spec.nodeName
securityContext:
privileged: true
volumeMounts:
- name: kubeconfig
mountPath: /etc/exk8s/kubeconfig/kube-proxy.kubeconfig
readOnly: true
- name: lib-modules
mountPath: /lib/modules
readOnly: true
volumes:
- name: kubeconfig
hostPath:
path: /etc/exk8s/kubeconfig/kube-proxy.kubeconfig
type: File
- name: lib-modules
hostPath:
path: /lib/modules
type: Directory
updateStrategy:
rollingUpdate:
maxUnavailable: 1
type: RollingUpdate
Full CoreDNS manifest — 5 Kubernetes resources in one file. The cluster's DNS server.
Go data: struct{ ClusterDomain, DNSServiceIP string }
CoreDNS resolves *.cluster.local names from the K8s API.

# ServiceAccount + ClusterRole + ClusterRoleBinding (RBAC setup)
---
apiVersion: v1
kind: ConfigMap
metadata:
name: coredns
namespace: kube-system
data:
Corefile: |-
.:53 {
errors
health { lameduck 5s }
ready
kubernetes cluster.local in-addr.arpa ip6.arpa {
pods insecure
fallthrough in-addr.arpa ip6.arpa
ttl 30
}
prometheus 0.0.0.0:9153
forward . /etc/resolv.conf
cache 30
loop
reload
loadbalance
}
---
apiVersion: v1
kind: Service
metadata:
name: kube-dns
namespace: kube-system
spec:
selector: { k8s-app: kube-dns }
clusterIP: 10.96.0.10
ports:
- { name: dns, port: 53, protocol: UDP }
- { name: dns-tcp, port: 53, protocol: TCP }
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: coredns
namespace: kube-system
spec:
replicas: 2
# ... (tolerations, dnsPolicy: Default, probes, resources, volume mounts)
Configuration for Cilium CNI installation via Helm.
Go data: struct{ K8sServiceHost, PodCIDR string }
k8sServiceHost: 10.0.1.5
k8sServicePort: 6443
ipv4:
enabled: true
ipv6:
enabled: false
kubeProxyReplacement: false
operator:
replicas: 1
routingMode: tunnel
tunnelProtocol: vxlan
enableIPv4Masquerade: true
ipam:
mode: cluster-pool
operator:
clusterPoolIPv4PodCIDRList:
- 172.16.0.0/12
Temporary bridge CNI — basic pod networking before Cilium takes over.
Go data: struct{ PodCIDR string }
The bridge device is cni0; it connects all pod veth interfaces.

{
"cniVersion": "0.4.0",
"name": "exk8s",
"plugins": [
{
"type": "bridge",
"bridge": "cni0",
"isDefaultGateway": true,
"ipMasq": true,
"hairpinMode": true,
"ipam": {
"type": "host-local",
"ranges": [[{ "subnet": "172.16.0.0/12" }]],
"routes": [{ "dst": "0.0.0.0/0" }]
}
},
{ "type": "portmap", "capabilities": { "portMappings": true } },
{ "type": "firewall" }
]
}
Installs all runtime dependencies — containerd, K8s binaries, etcd.
Go data: struct{ KubeVersion, EtcdVersion string }
Key operations:
- apt install containerd + networking tools
- Set SystemdCgroup = true in the containerd config (required by K8s)
- Symlink /usr/lib/cni → /opt/cni/bin
- Download K8s binaries from dl.k8s.io with retry logic
- Download etcd + etcdctl

#!/usr/bin/env bash
set -euo pipefail
export DEBIAN_FRONTEND=noninteractive
# ... architecture detection ...
apt-get update -y
apt-get install -y ca-certificates conntrack containerd containernetworking-plugins \
curl ebtables ethtool iproute2 iptables socat wget
# Configure containerd for systemd cgroups
containerd config default > /etc/containerd/config.toml
sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
ln -sfn /usr/lib/cni /opt/cni/bin
# Download K8s binaries (version rendered from Go)
KUBE_VERSION=v1.35.3
ETCD_VERSION=v3.6.10
download_binary "https://dl.k8s.io/v1.35.3/bin/linux/amd64/kube-apiserver" /usr/local/bin/kube-apiserver
download_binary "https://dl.k8s.io/v1.35.3/bin/linux/amd64/kube-controller-manager" /usr/local/bin/kube-controller-manager
download_binary "https://dl.k8s.io/v1.35.3/bin/linux/amd64/kube-scheduler" /usr/local/bin/kube-scheduler
download_binary "https://dl.k8s.io/v1.35.3/bin/linux/amd64/kubelet" /usr/local/bin/kubelet
download_binary "https://dl.k8s.io/v1.35.3/bin/linux/amd64/kubectl" /usr/local/bin/kubectl
# Download etcd
curl -fL "https://github.com/etcd-io/etcd/releases/download/v3.6.10/etcd-v3.6.10-linux-amd64.tar.gz" ...
mkdir -p /etc/exk8s/pki /etc/exk8s/kubeconfig /etc/exk8s/config /etc/exk8s/manifests \
/etc/kubernetes/manifests /var/lib/etcd /var/lib/kubelet
The main bootstrap runner. Sources config, verifies binaries, starts all services.
Go data: nil (no template variables — logic is all bash)
Key operations:
- Source cluster.env with set -a (auto-export all vars)
- systemctl enable --now containerd, etcd, kubelet
- If IS_INITIAL_CONTROL_PLANE=true: run the addon installer

#!/usr/bin/env bash
set -euo pipefail
if [[ -f /etc/exk8s/config/cluster.env ]]; then
set -a # auto-export all sourced vars
. /etc/exk8s/config/cluster.env
set +a
fi
# Verify every required binary is installed
for bin in containerd etcd kube-apiserver kube-controller-manager kube-scheduler kubelet; do
if ! command -v "$bin" >/dev/null 2>&1; then
echo "missing required binary: $bin"
exit 1
fi
done
mkdir -p /etc/kubernetes/manifests /var/lib/etcd /var/lib/kubelet /etc/exk8s/config
systemctl daemon-reload
systemctl enable containerd || true
systemctl restart containerd || true
systemctl enable etcd
systemctl enable kubelet
systemctl restart etcd # Start the cluster database
systemctl restart kubelet # Start the node agent (picks up static pod manifests)
# Only the first control plane node installs cluster addons
if [[ "${IS_INITIAL_CONTROL_PLANE:-false}" == "true" ]] \
&& [[ -x /usr/local/bin/exk8s-install-cluster-addons.sh ]]; then
/usr/local/bin/exk8s-install-cluster-addons.sh
fi
Installs cluster-wide addons. Only runs on the first control plane node.
Go data: nil
Key operations:
- Exit early unless IS_INITIAL_CONTROL_PLANE=true
- Wait for the API server (kubectl get --raw=/readyz)
- helm upgrade --install cilium from an OCI chart with --wait --timeout 10m
- kubectl apply the kube-proxy DaemonSet + wait for rollout
- kubectl apply the CoreDNS Deployment + wait for rollout

#!/usr/bin/env bash
set -euo pipefail
# Source cluster config
if [[ -f /etc/exk8s/config/cluster.env ]]; then
set -a && . /etc/exk8s/config/cluster.env && set +a
fi
# Only run on the initial control plane node
if [[ "${IS_INITIAL_CONTROL_PLANE:-false}" != "true" ]]; then
exit 0
fi
export KUBECONFIG=/etc/exk8s/kubeconfig/admin.kubeconfig
# Wait for API server to be ready (up to 10 minutes)
for _ in $(seq 1 120); do
kubectl get --raw=/readyz >/dev/null 2>&1 && break
sleep 5
done
# Install Helm (downloads v3.19.2 for correct architecture)
install_helm # ... (function body downloads and extracts helm binary)
# Install Cilium CNI via Helm
helm upgrade --install cilium oci://quay.io/cilium/charts/cilium \
--version 1.19.2 \
--namespace kube-system \
--create-namespace \
--values /etc/exk8s/config/cilium-values.yaml \
--wait --timeout 10m
# Apply kube-proxy DaemonSet
kubectl apply -f /etc/exk8s/manifests/kube-proxy.yaml
kubectl -n kube-system rollout status daemonset/kube-proxy --timeout=10m
# Apply CoreDNS Deployment
kubectl apply -f /etc/exk8s/manifests/coredns.yaml
kubectl -n kube-system rollout status deployment/coredns --timeout=10m
Every TLS connection and which cert authenticates it.
All certificates are signed by a single Cluster CA. Every component trusts the CA, so they trust each other.
Service Account keys are different: a raw RSA keypair, not certificates. The API server signs service account JWTs with the private key (--service-account-signing-key-file) and verifies them with the public key (--service-account-key-file); the controller-manager also holds the private key (--service-account-private-key-file) so it can mint tokens.
Virtual IPs for K8s Services. Intercepted by kube-proxy iptables rules — never hit a real interface.
Real IPs assigned to pods by Cilium. Pods communicate directly. Cilium uses VXLAN tunneling between nodes.
Boot: Bridge CNI (cni0) provides basic same-node networking. After addons install: Cilium takes over with VXLAN tunneling, network policies, and cluster-wide IPAM. The bridge config is effectively replaced.
The API server listens on 6443. The security group allows this port from 0.0.0.0/0 so users can connect from anywhere. All other ports are restricted to within the subnet.
Everything from raw binaries. Full control over cluster topology, cert management, component versions, and systemd integration. Tradeoff: every systemd unit and manifest must be hand-crafted.
All certs generated by k8sapi, never by VMs. Centralizes trust — VMs never have the CA key. Certs are delivered via the authenticated bootstrap API.
IMDS token (proves "I am VM X") + bearer token (proves "I belong to org Y"). Prevents both external and internal impersonation. A VM can only fetch its own bundle.
DNS failures don't block cluster creation. Users can always connect via IP. The dns_configured flag reports status.
Solves the chicken-and-egg: pods need networking, but the CNI plugin (Cilium) runs as pods. Bridge CNI provides minimal networking until Cilium is installed.
First VM (allNodes[0]) installs cluster addons. Others skip it. Prevents duplicate installations in multi-node setups.
k8sapi reads from computeapi's tables (vms, interfaces). Avoids inter-service API calls for VM identity lookup. Tradeoff: schema coupling.