excloud / k8sapi

The Complete Kubernetes Bootstrap Guide

How Excloud builds Kubernetes clusters from scratch — no kubeadm, no magic. Every template, every certificate, every line explained.

16 templates · 7 phases · 8 cert pairs · 30+ files rendered
01

What is This Project?

k8sapi is the Kubernetes-facing API for the Excloud cloud platform. When a user says "I want a Kubernetes cluster," this service does all the heavy lifting:

1. Creates VMs

Calls computeapi to provision virtual machines that become control plane nodes.

2. Generates Certificates

Creates a full PKI — a Certificate Authority and every cert each component needs to communicate securely.

3. Renders Config

Uses 16 Go templates to generate systemd units, static pod manifests, kubeconfigs, CNI config, and shell scripts.

4. Delivers Bundles

Packages everything into JSON that VMs fetch and use to self-assemble into a working Kubernetes cluster.

Key: This does NOT use kubeadm. Everything is built from raw binaries and hand-crafted configuration — full control from first principles.

02

Kubernetes 101

The concepts you need to understand the bootstrap process.

Control Plane — The Brain

The control plane makes all the decisions. It has four components:

kube-apiserver

The front door. Every single request — from users, nodes, other components — goes through here. It validates requests, authenticates callers, and stores state in etcd.

kube-controller-manager

The corrector. Runs dozens of control loops. Each watches a resource type and ensures real state matches desired state. If you want 3 replicas but only 2 exist, it creates another.

kube-scheduler

The planner. When a new pod needs to run, it picks the best node based on resources, affinity rules, and constraints.

etcd

The memory. A distributed key-value database storing all cluster state. If etcd dies, the cluster loses its memory.

An analogy: a hospital. The API server is the front desk (everyone checks in through it). The controller-manager is the specialists (each monitors a system and intervenes when something is wrong). The scheduler is admissions (decides which room each patient goes to). And etcd is the medical records (if they're gone, nobody knows what's happening).

kubelet — The Node Agent

Kubelet runs on every machine. It receives instructions from the API server ("run this pod") and ensures containers are running. Its superpower: it can run static pods by watching a directory on disk (/etc/kubernetes/manifests/). Any YAML placed there becomes a running pod — no API server needed.

Static Pods — The Chicken-and-Egg Solution

Problem: the API server runs as a pod, but pods are managed by the API server. How do you start it?

Answer: static pods. Kubelet watches a directory and creates pods from YAML files on disk, without any API server involvement. In this project, three control plane components (kube-apiserver, kube-controller-manager, kube-scheduler) run as static pods. Their manifests are rendered by k8sapi and placed on disk before kubelet starts.

Why not etcd too? Etcd runs as a systemd service (not a static pod). Systemd gives more reliable lifecycle management for this critical component.

Pods & Containers

A container is a lightweight, isolated process with its dependencies bundled. A pod wraps one or more containers that share networking (they can reach each other via localhost) and storage. Most pods have a single container. The container runtime here is containerd.

Services

Pods are ephemeral — they come and go, and their IPs change. A Service provides a stable virtual IP (ClusterIP) that load-balances traffic to the right pods. Key services here: kubernetes (API server, at 10.96.0.1) and kube-dns (CoreDNS, at 10.96.0.10).

CNI — Pod Networking

Kubernetes delegates pod networking to a CNI plugin. The CNI assigns IPs, sets up routes between nodes, and enforces network policies. This project uses Cilium, but during bootstrap uses a temporary bridge CNI (because Cilium itself runs as pods, and pods need networking to start — another chicken-and-egg).

kube-proxy — Service Networking

While CNI handles pod-to-pod traffic, kube-proxy handles service routing. It runs on every node as a DaemonSet and programs iptables rules to intercept traffic to service ClusterIPs and redirect to actual pod IPs.

CoreDNS — Cluster DNS

The cluster's internal DNS server. Lets pods find services by name (e.g., my-service.default.svc.cluster.local). Runs as a Deployment with 2 replicas, served at 10.96.0.10. Kubelet configures all pods to use this IP for DNS.

PKI — Certificates Everywhere

K8s uses TLS certificates for all component-to-component communication. A Certificate Authority (CA) signs all certs. Because everyone trusts the CA, they can trust each other. Certificate fields are used for identity: Common Name (CN) = username, Organization (O) = group. The admin cert has CN=cluster-admin, O=system:masters — full root access.

kubeconfig

A YAML file telling a K8s client: (1) where the API server is, (2) who you are (your client cert), (3) how to verify the server (the CA cert). This project generates 5 kubeconfigs — one per component.

RBAC

Role-Based Access Control maps identities to permissions. Built-in groups: system:masters (full admin), system:nodes (kubelet permissions), system:node-proxier (kube-proxy), etc.

03

Architecture & Database

System Diagram

User / Client → k8sapi → computeapi (create VMs), dnsapi (DNS records), PostgreSQL (state)
VM boots → IMDS (identity) → k8sapi (bundle) → K8s Cluster!

In one sentence: User calls k8sapi → creates VMs → VMs boot, ask IMDS "who am I?" → call back to k8sapi for bootstrap bundle → install everything → Kubernetes cluster.

Database Tables

k8sapi shares a PostgreSQL database with other Excloud services:

kubeclusters — k8sapi owns

Column | Type | Purpose
id | bigserial PK | Auto-incremented cluster ID
org_id | bigint | Owning organization
project_id | bigint | Project (hardcoded 1 for now)
ca_cert | text | PEM-encoded CA certificate
priv_key | text | PEM-encoded CA private key
service_acc_cert | text | Service Account public key
service_acc_priv_key | text | Service Account private key

etcdnodes / apiservernodes — k8sapi owns

Column | Type | Purpose
id | bigserial PK | Node record ID
kubecluster_id | bigint FK | Which cluster this belongs to
vm_id | bigint | The VM running this node
vm_ip | text | Private IP
cert / priv_key | text | Node's TLS cert and key

imds_tokens — imdsapi owns, k8sapi reads

Column | Type | Purpose
hash | text | SHA-256 of token
vm_id | bigint | Which VM
org_id | bigint | Which org
expiry_at | timestamptz | Expiration

vms + interfaces + public_ipv4_allocations — computeapi owns, k8sapi reads

k8sapi JOINs these three tables to resolve a VM's network addresses (private IPv4, IPv6, public IPv4).

04

API Endpoints


POST /clusters - Create a K8s cluster

The main endpoint. Requires bearer token auth. Creates VMs, generates PKI, returns admin kubeconfig.

// Request
{ "control_plane_count": 1,       // 1-3 (default 1)
  "control_plane_image_id": 42,   // required
  "control_plane_instance_type": "e2.medium",
  "subnet_id": 5,                 // required
  "zone_id": 1,                   // default 1
  "security_group_ids": [],       // auto-created if empty
  "allocate_public_ipv4": true,
  "ssh_pubkey": "ssh-ed25519 ...",
  "root_volume_size_gib": 20 }

// Response
{ "cluster_id": 42,
  "name": "exk8s-5-2a",
  "kubeconfig": "apiVersion: v1...",
  "control_plane_vm_ids": [101],
  "control_plane_dns_name": "exk8s-5-2a.k8s.excloud.co.in",
  "service_cidr": "10.96.0.0/12",
  "pod_cidr": "172.16.0.0/12",
  "dns_configured": true }
GET /bootstrap/control-plane/:vm_id - Bootstrap bundle for a VM

Called by VMs themselves. Requires dual auth: IMDS token (X-exc-imds-token header) + bearer token. A VM can only fetch its own bundle (VM ID must match IMDS token). Returns JSON with all files, certs, and configs.

GET /health - Health check

200 OK if healthy. 418 during shutdown (for LB drain).

GET /docs - Swagger UI

Auto-generated API documentation.

05

The 7-Phase Bootstrap Flow

The complete journey from API call to running cluster.

POST /clusters → Gen CA + certs → Create VMs → Return kubeconfig → VM boots → IMDS token → Fetch bundle → Write files → Install prereqs → Start services → Install addons → Cluster running!
Phase 1: Cluster Creation
API-side • POST /clusters • clusterscreate.go
  1. Auth middleware validates bearer token
  2. Validate inputs — control_plane_count (1-3), image ID, instance type, subnet ID required
  3. Generate Cluster CA — 4096-bit RSA, 10-year validity. Signs ALL other certs. Root of trust.
    CreateCA() in certs.go
  4. Generate Service Account keypair — separate RSA keypair for signing/verifying pod JWT tokens
  5. BEGIN database transaction
  6. INSERT kubeclusters — stores CA + SA keys, gets cluster ID
  7. Derive names — cluster: exk8s-{org}-{id_hex}, DNS: exk8s-{org}-{id_hex}.k8s.excloud.co.in
  8. Create security group if needed — all traffic within subnet + port 6443 from 0.0.0.0/0
  9. Create VMs via computeapi — each with bootstrap-userdata.sh injected as cloud-init
  10. Resolve VM IPs — JOIN vms + interfaces + public_ipv4_allocations
  11. Generate etcd peer cert per VM — for etcd node-to-node mTLS
  12. Generate apiserver cert per VM — HTTPS serving cert with DNS/IP SANs
  13. INSERT etcdnodes + apiservernodes
  14. Generate admin cert — CN=cluster-admin, O=system:masters (root user)
  15. Render admin kubeconfig — points to external DNS name
  16. COMMIT transaction
  17. Create DNS A records — best-effort, non-blocking
  18. Return response — cluster ID, kubeconfig, VM IDs, DNS name
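The naming rule in step 7 can be reproduced with plain fmt formatting; a sketch assuming %x hex encoding of the cluster ID:

```go
package main

import "fmt"

// clusterName derives exk8s-{org}-{id_hex}: org ID in decimal,
// cluster ID in lowercase hex (42 -> "2a").
func clusterName(orgID, clusterID int64) string {
	return fmt.Sprintf("exk8s-%d-%x", orgID, clusterID)
}

// clusterDNSName appends the managed zone to get the external endpoint.
func clusterDNSName(orgID, clusterID int64) string {
	return clusterName(orgID, clusterID) + ".k8s.excloud.co.in"
}

func main() {
	// org 5, cluster ID 42 reproduces the names used in this guide's examples
	fmt.Println(clusterName(5, 42))    // exk8s-5-2a
	fmt.Println(clusterDNSName(5, 42)) // exk8s-5-2a.k8s.excloud.co.in
}
```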
Phase 2: VM Boot
Inside the VM • cloud-init • bootstrap-userdata.sh
  1. cloud-init runs bootstrap-userdata.sh — the script injected as VM userdata
  2. Install curl + jq if missing
  3. Poll IMDS for session token — up to 120 attempts (10 min). Proves "I am VM X in org Y"
  4. Fetch node identity from IMDS — gets the VM's node_id
  5. Fetch access token from IMDS — org-level bearer token for API auth
  6. Fetch bootstrap bundle from k8sapi - GET /bootstrap/control-plane/{node_id} with both tokens
Phase 3: Bundle Delivery
API-side • GET /bootstrap/control-plane/:vm_id • bootstrapcontrolplane.go
  1. Validate IMDS token — hash lookup in imds_tokens, check expiry
  2. Validate bearer token — via token cache
  3. Verify VM ID match — URL's VM ID must match IMDS token's VM ID (anti-impersonation)
  4. Look up cluster data from DB
  5. Parse cluster CA from stored PEM
  6. Generate 5 more certs — admin, apiserver-etcd-client, apiserver-kubelet-client, kube-proxy, kubelet-client
  7. BuildControlPlaneNodeBundle() — renders all templates, assembles file map
  8. Return JSON bundle
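Step 1's lookup works because only a digest is stored: hash the presented token and match that against imds_tokens.hash. A minimal sketch (exact query shape assumed):

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// tokenHash computes the hex SHA-256 digest stored in imds_tokens.hash;
// the raw token itself never touches the database.
func tokenHash(token string) string {
	sum := sha256.Sum256([]byte(token))
	return hex.EncodeToString(sum[:])
}

func main() {
	// Validation then becomes a single indexed lookup, e.g.:
	//   SELECT vm_id, org_id, expiry_at FROM imds_tokens WHERE hash = $1
	fmt.Println(tokenHash("example-imds-token"))
}
```

Storing only the hash means a database leak doesn't leak usable tokens.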
Phase 4: File Installation
Inside the VM • bootstrap-userdata.sh (continued)
  1. Write every file to disk — iterates bundle's files map, base64 decodes, writes to path
  2. chmod +x scripts
  3. Run exk8s-install-control-plane-prereqs.sh (Phase 5)
  4. systemctl daemon-reload
  5. Run exk8s-bootstrap-control-plane.sh (Phase 6)
Phase 5: Installing Prerequisites
Inside the VM • exk8s-install-control-plane-prereqs.sh
  1. apt install — containerd, conntrack, ebtables, ethtool, iproute2, iptables, socat, wget, curl, ca-certificates, containernetworking-plugins
  2. Configure containerd — enable SystemdCgroup (required for K8s)
  3. Symlink CNI plugins — /usr/lib/cni → /opt/cni/bin
  4. Download K8s binaries — kube-apiserver, kube-controller-manager, kube-scheduler, kubelet, kubectl (v1.35.3)
  5. Download etcd + etcdctl — v3.6.10
  6. Create directories

Alternative: scripts/bake-control-plane-image.sh does this at image build time, making boot much faster.

Phase 6: Starting Services
Inside the VM • exk8s-bootstrap-control-plane.sh
  1. Source cluster.env — loads all cluster variables
  2. Verify all binaries exist
  3. Start containerd — the container runtime
  4. Start etcd — the cluster database
  5. Start kubelet — scans /etc/kubernetes/manifests/, starts kube-apiserver, kube-controller-manager, kube-scheduler as static pods
  6. If initial control plane: run addon installer (Phase 7)
Phase 7: Installing Cluster Addons
Inside the VM • exk8s-install-cluster-addons.sh • FIRST NODE ONLY
  1. Wait for API server readiness — polls kubectl get --raw=/readyz up to 10 min
  2. Install Helm
  3. Install Cilium CNI via Helm — takes over from bridge CNI
  4. Apply kube-proxy DaemonSet — wait for rollout
  5. Apply CoreDNS Deployment — wait for rollout

After Phase 7: etcd, kube-apiserver, kube-controller-manager, kube-scheduler, kubelet, Cilium, kube-proxy, CoreDNS — all running. Cluster is fully operational.

06

Every File in the Bootstrap Bundle

PKI Certificates — /etc/exk8s/pki/

ca.pem + ca-key.pem
Cluster CA. Root of trust. Signs every other cert. The key can forge any identity.
service-account.pem + service-account-key.pem
Service Account keypair. API server uses the key to sign JWT tokens for pods. Controller-manager uses the public key to verify them.
etcd-peer.pem + etcd-peer-key.pem
Etcd peer cert. Etcd node-to-node communication (replication, leader election). Both client + server auth.
apiserver.pem + apiserver-key.pem
API server TLS cert. Makes HTTPS work. SANs: hostname, cluster DNS name, kubernetes.*, node IPs, 10.96.0.1, 127.0.0.1.
admin.pem + admin-key.pem
Admin client cert. CN=cluster-admin, O=system:masters. The root user.
apiserver-etcd-client.pem + key
apiserver → etcd. Client cert for API server to connect to etcd.
apiserver-kubelet-client.pem + key
apiserver → kubelet. For fetching logs, exec, port-forward. O=system:masters.
kubelet-client.pem + key
kubelet → apiserver. CN=system:node:{hostname}, O=system:nodes.

Kubeconfigs — /etc/exk8s/kubeconfig/

admin.kubeconfig
For kubectl. Points to 127.0.0.1:6443. cluster-admin access.
kubelet.kubeconfig
For kubelet. Identity: system:node:{hostname}.
kube-proxy.kubeconfig
For kube-proxy. Identity: system:kube-proxy.
controller-manager.kubeconfig
For controller-manager. 127.0.0.1:6443 (same-node).
scheduler.kubeconfig
For scheduler. 127.0.0.1:6443.

Systemd Units, Static Pods, Configs, Scripts

/etc/systemd/system/etcd.service
Runs etcd with mTLS, data dir, listen URLs, initial cluster.
/etc/systemd/system/kubelet.service
Runs kubelet with config, kubeconfig, hostname override.
/etc/kubernetes/manifests/kube-apiserver.yaml
Static pod. RBAC+Node auth, etcd mTLS, SA issuer, TLS cert.
/etc/kubernetes/manifests/kube-controller-manager.yaml
Static pod. Cluster CIDR, SA key, root CA.
/etc/kubernetes/manifests/kube-scheduler.yaml
Static pod. Kubeconfig, leader election.
/etc/exk8s/config/cluster.env
All cluster params as env vars.
/etc/exk8s/config/cilium-values.yaml
Helm values for Cilium CNI.
/var/lib/kubelet/config.yaml
KubeletConfiguration: auth, cgroup, runtime, DNS.
/etc/cni/net.d/10-exk8s.conflist
Temporary bridge CNI. Replaced by Cilium.
/etc/exk8s/manifests/kube-proxy.yaml
kube-proxy DaemonSet manifest.
/etc/exk8s/manifests/coredns.yaml
CoreDNS full manifest (SA, RBAC, ConfigMap, Service, Deployment).
/usr/local/bin/exk8s-install-control-plane-prereqs.sh
Installs containerd, K8s binaries, etcd.
/usr/local/bin/exk8s-bootstrap-control-plane.sh
Starts all services.
/usr/local/bin/exk8s-install-cluster-addons.sh
Installs Cilium, kube-proxy, CoreDNS (initial node only).
07

Every Template — Line by Line

All 16 templates with every line explained. This is the most detailed section.

1. bootstrap-userdata.sh (cloud-init)

The very first script that runs when a VM boots. Injected as cloud-init userdata. Fetches the bootstrap bundle and kicks off installation.

Go data passed: struct{ BootstrapBaseURL string }

#!/usr/bin/env bash
set -euo pipefail                                    # exit on error, undefined vars, pipe failures

mkdir -p /var/log/exk8s                              # create log directory
exec > >(tee -a /var/log/exk8s/bootstrap-userdata.log) 2>&1  # log all output to file AND stdout

IMDS_BASE_URL="${EXC_IMDS_BASE_URL:-http://imdsapi.excloud.in}"  # IMDS endpoint (overridable)
BOOTSTRAP_BASE_URL="{{ .BootstrapBaseURL }}"         # k8sapi URL, templated by Go

# Install curl + jq if missing (needed for HTTP + JSON parsing)
if ! command -v curl >/dev/null 2>&1 || ! command -v jq >/dev/null 2>&1; then
  if command -v apt-get >/dev/null 2>&1; then
    export DEBIAN_FRONTEND=noninteractive
    apt-get update -y && apt-get install -y curl jq
  fi
fi

# Poll IMDS for a session token — up to 120 attempts (10 minutes)
# The token proves "I am this specific VM in this specific org"
for _ in $(seq 1 120); do
  IMDS_TOKEN="$(curl -fsS "${IMDS_BASE_URL}/token" || true)"
  [[ -n "${IMDS_TOKEN}" ]] && break
  sleep 5
done
[[ -z "${IMDS_TOKEN:-}" ]] && echo "failed to fetch imds token" && exit 1

# Get node identity (our VM ID) from IMDS
NODE_ID="$(curl -fsS -H "X-exc-imds-token: ${IMDS_TOKEN}" \
  "${IMDS_BASE_URL}/latest/identity/node-identity" | jq -r '.node_id')"

# Get an org-level access token from IMDS (second auth factor)
BOOTSTRAP_ACCESS_TOKEN="$(curl -fsS -H "X-exc-imds-token: ${IMDS_TOKEN}" \
  "${IMDS_BASE_URL}/latest/identity/access-token" | jq -r '.access_token')"

# Fetch bootstrap bundle from k8sapi with BOTH tokens
TMP_JSON="$(mktemp)" && trap 'rm -f "$TMP_JSON"' EXIT
for _ in $(seq 1 120); do
  curl -fsS -H "X-exc-imds-token: ${IMDS_TOKEN}" \
    -H "Authorization: Bearer ${BOOTSTRAP_ACCESS_TOKEN}" \
    "${BOOTSTRAP_BASE_URL}/bootstrap/control-plane/${NODE_ID}" > "$TMP_JSON" && break
  sleep 5
done

# Write every file from the bundle to disk
jq -r '.files | to_entries[] | @base64' "$TMP_JSON" | while IFS= read -r entry; do
  key="$(printf '%s' "$entry" | base64 --decode | jq -r '.key')"    # file path
  value="$(printf '%s' "$entry" | base64 --decode | jq -r '.value')"  # file content
  mkdir -p "$(dirname "$key")" && printf '%s' "$value" > "$key"
done

# Make scripts executable
chmod +x /usr/local/bin/exk8s-*.sh 2>/dev/null || true

# Install prereqs if not already present (skip if image was pre-baked)
if ! command -v kubelet >/dev/null 2>&1; then
  /usr/local/bin/exk8s-install-control-plane-prereqs.sh
fi

systemctl daemon-reload || true
/usr/local/bin/exk8s-bootstrap-control-plane.sh  # Start everything
Rendered output (example with BootstrapBaseURL = "https://k8sapi.excloud.in")
#!/usr/bin/env bash
set -euo pipefail

mkdir -p /var/log/exk8s
exec > >(tee -a /var/log/exk8s/bootstrap-userdata.log) 2>&1

IMDS_BASE_URL="${EXC_IMDS_BASE_URL:-http://imdsapi.excloud.in}"
BOOTSTRAP_BASE_URL="https://k8sapi.excloud.in"

# ... rest is pure bash, no more template variables ...
2. cluster.env (config)

Key-value environment file sourced by all shell scripts. Contains every cluster parameter.

Go data: ControlPlaneNodeBundle fields directly

  • CLUSTER_NAME={{ .ClusterName }} - e.g. exk8s-5-2a. Used in component configs.
  • KUBERNETES_VERSION={{ .KubernetesVersion }} - e.g. v1.35.3. Used to download correct binaries.
  • CLUSTER_DOMAIN={{ .ClusterDomain }} - cluster.local. The DNS domain for services.
  • SERVICE_CIDR={{ .ServiceCIDR }} - 10.96.0.0/12. IP range for Service ClusterIPs.
  • SERVICE_CLUSTER_IP={{ .ServiceClusterIP }} - 10.96.0.1. First usable IP, reserved for the kubernetes service.
  • DNS_SERVICE_IP={{ .DNSServiceIP }} - 10.96.0.10. CoreDNS Service IP. Kubelet points pods here for DNS.
  • POD_CIDR={{ .PodCIDR }} - 172.16.0.0/12. IP range for pod IPs.
  • APISERVER_ENDPOINT={{ .APIServerEndpoint }} - https://127.0.0.1:6443. Local API server endpoint.
  • CONTROL_PLANE_DNS_NAME={{ .ControlPlaneDNSName }} - External DNS name for the cluster.
  • NODE_NAME={{ .NodeName }} - This node's hostname (e.g. cp-1.cluster-42.internal).
  • NODE_IPV4={{ .NodeIPv4 }} - This node's private IPv4 address.
  • NODE_IPV6={{ .NodeIPv6 }} - This node's IPv6 address (may be empty).
  • ETCD_INITIAL_CLUSTER={{ .EtcdInitialCluster }} - e.g. cp-1=https://10.0.0.5:2380,cp-2=https://10.0.0.6:2380. Tells etcd who its peers are.
  • IS_INITIAL_CONTROL_PLANE={{ .IsInitialControlPlane }} - true for the first node only. Controls whether the addon installer runs.
Rendered output (example: cluster exk8s-5-2a, 3 nodes)
CLUSTER_NAME=exk8s-5-2a
KUBERNETES_VERSION=v1.35.3
CLUSTER_DOMAIN=cluster.local
SERVICE_CIDR=10.96.0.0/12
SERVICE_CLUSTER_IP=10.96.0.1
DNS_SERVICE_IP=10.96.0.10
POD_CIDR=172.16.0.0/12
APISERVER_ENDPOINT=https://127.0.0.1:6443
CONTROL_PLANE_DNS_NAME=exk8s-5-2a.k8s.excloud.co.in
NODE_NAME=cp-1.cluster-42.internal
NODE_IPV4=10.0.1.5
NODE_IPV6=
ETCD_INITIAL_CLUSTER=cp-1.cluster-42.internal=https://10.0.1.5:2380,cp-2.cluster-42.internal=https://10.0.1.6:2380,cp-3.cluster-42.internal=https://10.0.1.7:2380
IS_INITIAL_CONTROL_PLANE=true
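The ETCD_INITIAL_CLUSTER value above is just a joined list of name=peer-URL pairs; a sketch of building it (struct shape assumed, not k8sapi's actual types):

```go
package main

import (
	"fmt"
	"strings"
)

// node carries the two fields the join string needs.
type node struct{ name, ip string }

// etcdInitialCluster builds the name=https://ip:2380 peer list that
// ETCD_INITIAL_CLUSTER (and etcd's --initial-cluster flag) expects.
func etcdInitialCluster(nodes []node) string {
	parts := make([]string, 0, len(nodes))
	for _, n := range nodes {
		parts = append(parts, fmt.Sprintf("%s=https://%s:2380", n.name, n.ip))
	}
	return strings.Join(parts, ",")
}

func main() {
	nodes := []node{
		{"cp-1.cluster-42.internal", "10.0.1.5"},
		{"cp-2.cluster-42.internal", "10.0.1.6"},
		{"cp-3.cluster-42.internal", "10.0.1.7"},
	}
	fmt.Println(etcdInitialCluster(nodes))
}
```

Every node gets the same string, which is how each etcd member learns the full peer set before the cluster exists.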
3. etcd.service (systemd)

Systemd unit that runs etcd — the cluster's distributed database.

Go data: struct{ Name, Addr, InitialCluster string }

  • After=network-online.target - Wait for network before starting.
  • Type=notify - etcd notifies systemd when ready (sd_notify).
  • --name={{ .Name }} - Node name in the etcd cluster (matches hostname).
  • --data-dir=/var/lib/etcd - Where etcd stores its database on disk.
  • --listen-client-urls=https://{{ .Addr }}:2379,https://127.0.0.1:2379 - Accept client connections (like from kube-apiserver) on the node's IP and localhost.
  • --advertise-client-urls=https://{{ .Addr }}:2379 - Tell other nodes "connect to me at this URL."
  • --listen-peer-urls=https://{{ .Addr }}:2380 - Listen for peer connections (replication) on port 2380.
  • --initial-advertise-peer-urls=https://{{ .Addr }}:2380 - Tell peers "my peer endpoint is here."
  • --initial-cluster={{ .InitialCluster }} - Full peer list, e.g. cp-1=https://10.0.0.5:2380,cp-2=https://10.0.0.6:2380.
  • --initial-cluster-state=new - Starting a brand new cluster (not joining an existing one).
  • --trusted-ca-file, --cert-file, --key-file - Client-facing mTLS: CA for verifying clients, server cert/key for TLS.
  • --client-cert-auth - Require clients (kube-apiserver) to present a valid certificate.
  • --peer-trusted-ca-file, --peer-cert-file, --peer-key-file - Peer-facing mTLS: all etcd nodes verify each other.
  • --peer-client-cert-auth - Require peer nodes to present valid certs.
  • Restart=always, RestartSec=5 - Auto-restart on crash after 5 seconds.
  • LimitNOFILE=40000 - etcd needs many open file descriptors for its database.
Rendered output (example: node cp-1 at 10.0.1.5, 3-node cluster)
[Unit]
Description=etcd
Documentation=https://etcd.io
After=network-online.target
Wants=network-online.target

[Service]
Type=notify
ExecStart=/usr/local/bin/etcd \
  --name=cp-1.cluster-42.internal \
  --data-dir=/var/lib/etcd \
  --listen-client-urls=https://10.0.1.5:2379,https://127.0.0.1:2379 \
  --advertise-client-urls=https://10.0.1.5:2379 \
  --listen-peer-urls=https://10.0.1.5:2380 \
  --initial-advertise-peer-urls=https://10.0.1.5:2380 \
  --initial-cluster=cp-1.cluster-42.internal=https://10.0.1.5:2380,cp-2.cluster-42.internal=https://10.0.1.6:2380,cp-3.cluster-42.internal=https://10.0.1.7:2380 \
  --initial-cluster-state=new \
  --trusted-ca-file=/etc/exk8s/pki/ca.pem \
  --cert-file=/etc/exk8s/pki/etcd-peer.pem \
  --key-file=/etc/exk8s/pki/etcd-peer-key.pem \
  --client-cert-auth \
  --peer-trusted-ca-file=/etc/exk8s/pki/ca.pem \
  --peer-cert-file=/etc/exk8s/pki/etcd-peer.pem \
  --peer-key-file=/etc/exk8s/pki/etcd-peer-key.pem \
  --peer-client-cert-auth
Restart=always
RestartSec=5
LimitNOFILE=40000

[Install]
WantedBy=multi-user.target
4. kubelet.service (systemd)

Systemd unit for kubelet — the node agent.

Go data: struct{ Hostname, NodeAddress string }

  • After=containerd.service - Containerd must be running first (kubelet needs it to run containers).
  • EnvironmentFile=-/etc/exk8s/config/cluster.env - Load cluster vars. The - prefix means "don't fail if missing."
  • --config=/var/lib/kubelet/config.yaml - Detailed kubelet configuration file.
  • --kubeconfig=/etc/exk8s/kubeconfig/kubelet.kubeconfig - How to connect to and authenticate with the API server.
  • --hostname-override={{ .Hostname }} - Override the system hostname with the cluster-assigned name.
  • --node-ip={{ .NodeAddress }} - Tell K8s "my IP is this", important when the VM has multiple interfaces.
Rendered output (example: node cp-1 at 10.0.1.5)
[Unit]
Description=Kubernetes Kubelet
Documentation=https://kubernetes.io/docs/
After=network-online.target containerd.service
Wants=network-online.target

[Service]
EnvironmentFile=-/etc/exk8s/config/cluster.env
ExecStart=/usr/local/bin/kubelet \
  --config=/var/lib/kubelet/config.yaml \
  --kubeconfig=/etc/exk8s/kubeconfig/kubelet.kubeconfig \
  --hostname-override=cp-1.cluster-42.internal \
  --node-ip=10.0.1.5
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
5. kubelet-config.yaml (config)

KubeletConfiguration — detailed settings for the kubelet agent.

Go data: struct{ DNSServiceIP, ClusterDomain, NodeAddress string }

  • authentication.anonymous.enabled: false - Don't allow unauthenticated requests to kubelet's API.
  • authentication.x509.clientCAFile: /etc/exk8s/pki/ca.pem - Verify client certs against the cluster CA.
  • authentication.webhook.enabled: true - Also accept bearer tokens (validated by the API server).
  • authorization.mode: Webhook - Ask the API server "is this user allowed?" for each request.
  • cgroupDriver: systemd - Use systemd for cgroup management (must match containerd's setting).
  • clusterDNS: [{{ .DNSServiceIP }}] - Tell pods to use this IP for DNS resolution (10.96.0.10 = CoreDNS).
  • clusterDomain: {{ .ClusterDomain }} - cluster.local. The search domain appended to short DNS names.
  • containerRuntimeEndpoint: unix:///run/containerd/containerd.sock - Connect to containerd via this Unix socket.
  • failSwapOn: false - Don't crash if swap is enabled (some VMs have swap).
  • readOnlyPort: 0 - Disable the unauthenticated read-only port (security).
  • rotateCertificates: false - Don't auto-rotate certs (managed externally by k8sapi).
  • staticPodPath: /etc/kubernetes/manifests - Watch this directory for static pod manifests.
  • tlsCertFile / tlsPrivateKeyFile - Kubelet's own TLS cert for serving its HTTPS API.
  • address: {{ .NodeAddress }} - Bind kubelet's API to this IP.
Rendered output (example: node at 10.0.1.5)
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
authentication:
  anonymous:
    enabled: false
  x509:
    clientCAFile: /etc/exk8s/pki/ca.pem
  webhook:
    enabled: true
authorization:
  mode: Webhook
cgroupDriver: systemd
clusterDNS:
- 10.96.0.10
clusterDomain: cluster.local
containerRuntimeEndpoint: unix:///run/containerd/containerd.sock
failSwapOn: false
healthzBindAddress: 127.0.0.1
readOnlyPort: 0
rotateCertificates: false
serverTLSBootstrap: false
staticPodPath: /etc/kubernetes/manifests
tlsCertFile: /etc/exk8s/pki/kubelet-client.pem
tlsPrivateKeyFile: /etc/exk8s/pki/kubelet-client-key.pem
address: 10.0.1.5
6. kubeconfig (auth)

Generic kubeconfig template used 5 times (admin, kubelet, kube-proxy, controller-manager, scheduler).

Go data: struct{ ClusterName, CAPEM, APIEndpoint, UserName, CertPEM, KeyPEM string }

  • certificate-authority-data: {{ b64 .CAPEM }} - Base64-encoded CA cert. Used to verify the API server is legitimate.
  • server: {{ .APIEndpoint }} - API server URL. 127.0.0.1:6443 for local components, external DNS for admin.
  • client-certificate-data: {{ b64 .CertPEM }} - Base64-encoded client cert. This is your identity.
  • client-key-data: {{ b64 .KeyPEM }} - Base64-encoded client key. Proves you own the cert.
  • current-context: {{ .UserName }} - Which context is active. Matches the user name.

Note: b64 is a custom template function registered in renderer.go that base64-encodes strings.
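A sketch of how such a helper is wired into Go's text/template (the actual registration in renderer.go may differ):

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strings"
	"text/template"
)

// renderCAData renders one kubeconfig line using a custom b64 template
// function; Funcs must be registered before Parse sees the template text.
func renderCAData(caPEM string) string {
	funcs := template.FuncMap{
		"b64": func(s string) string {
			return base64.StdEncoding.EncodeToString([]byte(s))
		},
	}
	tmpl := template.Must(template.New("kubeconfig").Funcs(funcs).
		Parse("certificate-authority-data: {{ b64 .CAPEM }}"))

	var out strings.Builder
	_ = tmpl.Execute(&out, struct{ CAPEM string }{caPEM})
	return out.String()
}

func main() {
	fmt.Println(renderCAData("-----BEGIN CERTIFICATE-----\n..."))
}
```

Encoding inside the template keeps the Go structs holding plain PEM, so the same cert data can feed both file writes and kubeconfig rendering.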

Rendered output (example: admin.kubeconfig for cluster exk8s-5-2a)
apiVersion: v1
kind: Config
clusters:
- name: exk8s-5-2a
  cluster:
    certificate-authority-data: LS0tLS1CRUdJTi... (base64 of CA PEM)
    server: https://127.0.0.1:6443
users:
- name: cluster-admin
  user:
    client-certificate-data: LS0tLS1CRUdJTi... (base64 of admin cert PEM)
    client-key-data: LS0tLS1CRUdJTi... (base64 of admin key PEM)
contexts:
- name: cluster-admin
  context:
    cluster: exk8s-5-2a
    user: cluster-admin
current-context: cluster-admin
7. kube-apiserver.yaml (static pod)

The most critical manifest — the API server static pod. This is what kubelet starts from /etc/kubernetes/manifests/.

Go data: struct{ KubernetesVersion, APIAddress, EtcdServers, ControlPlaneDNSName, ServiceCIDR string }

  • hostNetwork: true - Use the host's network directly (not the pod network). Required because CNI isn't ready yet.
  • priorityClassName: system-node-critical - Never evict this pod. Highest priority.
  • --advertise-address={{ .APIAddress }} - Tell clients "connect to me at this IP."
  • --allow-privileged=true - Allow privileged containers (needed by kube-proxy, Cilium).
  • --authorization-mode=Node,RBAC - Two authorizers: Node (restricts kubelets to their own resources) and RBAC (role-based access).
  • --client-ca-file=/etc/exk8s/pki/ca.pem - Verify client certificates against the cluster CA.
  • --enable-admission-plugins=NodeRestriction - Prevents nodes from modifying other nodes' resources.
  • --etcd-cafile, --etcd-certfile, --etcd-keyfile - mTLS for connecting to etcd. The API server must prove its identity to etcd.
  • --etcd-servers={{ .EtcdServers }} - Comma-separated list of etcd endpoints, e.g. https://10.0.0.5:2379.
  • --kubelet-client-certificate/key - For API server → kubelet connections (logs, exec, port-forward).
  • --kubelet-preferred-address-types=InternalIP,Hostname - Prefer internal IP when connecting to kubelets.
  • --proxy-client-cert-file/key - For aggregated API server proxying (extension API servers).
  • --requestheader-client-ca-file - CA for validating front-proxy client certs.
  • --secure-port=6443 - The standard K8s API port.
  • --service-account-issuer=https://{{ .ControlPlaneDNSName }}:6443 - The issuer URL in SA JWT tokens. Must match for OIDC discovery.
  • --service-account-key-file - Public key to verify SA tokens.
  • --service-account-signing-key-file - Private key to sign SA tokens.
  • --service-cluster-ip-range={{ .ServiceCIDR }} - IP range for ClusterIP services.
  • --tls-cert-file, --tls-private-key-file - The API server's own HTTPS certificate and key.
  • livenessProbe: /livez, readinessProbe: /readyz - kubelet uses these to know if the API server is healthy.
  • volumeMounts: /etc/exk8s/pki (readOnly) - Mount the certificate directory into the pod.
Rendered output (example: node at 10.0.1.5, 3-node cluster, DNS exk8s-5-2a.k8s.excloud.co.in)
apiVersion: v1
kind: Pod
metadata:
  name: kube-apiserver
  namespace: kube-system
spec:
  hostNetwork: true
  priorityClassName: system-node-critical
  containers:
  - name: kube-apiserver
    image: registry.k8s.io/kube-apiserver:v1.35.3
    command:
    - kube-apiserver
    - --advertise-address=10.0.1.5
    - --allow-privileged=true
    - --authorization-mode=Node,RBAC
    - --client-ca-file=/etc/exk8s/pki/ca.pem
    - --enable-admission-plugins=NodeRestriction
    - --etcd-cafile=/etc/exk8s/pki/ca.pem
    - --etcd-certfile=/etc/exk8s/pki/apiserver-etcd-client.pem
    - --etcd-keyfile=/etc/exk8s/pki/apiserver-etcd-client-key.pem
    - --etcd-servers=https://10.0.1.5:2379,https://10.0.1.6:2379,https://10.0.1.7:2379
    - --kubelet-client-certificate=/etc/exk8s/pki/apiserver-kubelet-client.pem
    - --kubelet-client-key=/etc/exk8s/pki/apiserver-kubelet-client-key.pem
    - --kubelet-preferred-address-types=InternalIP,Hostname
    - --proxy-client-cert-file=/etc/exk8s/pki/admin.pem
    - --proxy-client-key-file=/etc/exk8s/pki/admin-key.pem
    - --requestheader-client-ca-file=/etc/exk8s/pki/ca.pem
    - --requestheader-allowed-names=cluster-admin
    - --secure-port=6443
    - --service-account-issuer=https://exk8s-5-2a.k8s.excloud.co.in:6443
    - --service-account-key-file=/etc/exk8s/pki/service-account.pem
    - --service-account-signing-key-file=/etc/exk8s/pki/service-account-key.pem
    - --service-cluster-ip-range=10.96.0.0/12
    - --tls-cert-file=/etc/exk8s/pki/apiserver.pem
    - --tls-private-key-file=/etc/exk8s/pki/apiserver-key.pem
    - --v=2
    livenessProbe:
      httpGet:
        host: 127.0.0.1
        path: /livez
        port: 6443
        scheme: HTTPS
      initialDelaySeconds: 15
      periodSeconds: 10
    readinessProbe:
      httpGet:
        host: 127.0.0.1
        path: /readyz
        port: 6443
        scheme: HTTPS
      periodSeconds: 5
    volumeMounts:
    - name: exk8s-pki
      mountPath: /etc/exk8s/pki
      readOnly: true
  volumes:
  - name: exk8s-pki
    hostPath:
      path: /etc/exk8s/pki
      type: DirectoryOrCreate
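Manifests like the one above come out of ordinary Go text/template rendering. A minimal sketch of that mechanism, with an illustrative struct and a trimmed template (the struct fields and function names here are assumptions for illustration, not the project's actual code):

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// APIServerData mirrors the shape of the per-template data structs
// the guide describes; field names are illustrative.
type APIServerData struct {
	KubernetesVersion string
	AdvertiseAddress  string
}

// A trimmed-down stand-in for kube-apiserver.yaml.tmpl.
const manifestTmpl = `image: registry.k8s.io/kube-apiserver:{{ .KubernetesVersion }}
command:
- kube-apiserver
- --advertise-address={{ .AdvertiseAddress }}
`

// render executes the template against the data struct, the same
// basic mechanism used to produce every rendered file.
func render(d APIServerData) (string, error) {
	t, err := template.New("apiserver").Parse(manifestTmpl)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := t.Execute(&buf, d); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, _ := render(APIServerData{KubernetesVersion: "v1.35.3", AdvertiseAddress: "10.0.1.5"})
	fmt.Print(out)
}
```

rendered_files.go reportedly wraps this pattern in one function per file, each calling renderTemplate() with the matching struct.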
8. kube-controller-manager.yaml — static pod

Controller manager — runs reconciliation loops.

Go data: struct{ KubernetesVersion, ClusterName, PodCIDR string }

  • --bind-address=127.0.0.1 — Only listen locally (not exposed externally).
  • --cluster-cidr={{ .PodCIDR }} — Tells the IPAM controller what pod IP range to manage.
  • --configure-cloud-routes=false — Don't try to configure cloud provider routes (handled by Cilium).
  • --leader-elect=true — In multi-node setups, only one controller-manager is active at a time.
  • --root-ca-file=/etc/exk8s/pki/ca.pem — CA cert to include in service account tokens.
  • --service-account-private-key-file — Key for signing service account tokens.
  • --use-service-account-credentials=true — Each controller runs with its own service account (better security).
  • Mounts: /etc/exk8s/pki + /etc/exk8s/kubeconfig — Needs both certs and its kubeconfig.
Rendered output (example: cluster exk8s-5-2a)
apiVersion: v1
kind: Pod
metadata:
  name: kube-controller-manager
  namespace: kube-system
spec:
  hostNetwork: true
  priorityClassName: system-node-critical
  containers:
  - name: kube-controller-manager
    image: registry.k8s.io/kube-controller-manager:v1.35.3
    command:
    - kube-controller-manager
    - --authentication-kubeconfig=/etc/exk8s/kubeconfig/controller-manager.kubeconfig
    - --authorization-kubeconfig=/etc/exk8s/kubeconfig/controller-manager.kubeconfig
    - --bind-address=127.0.0.1
    - --cluster-name=exk8s-5-2a
    - --cluster-cidr=172.16.0.0/12
    - --configure-cloud-routes=false
    - --kubeconfig=/etc/exk8s/kubeconfig/controller-manager.kubeconfig
    - --leader-elect=true
    - --root-ca-file=/etc/exk8s/pki/ca.pem
    - --service-account-private-key-file=/etc/exk8s/pki/service-account-key.pem
    - --use-service-account-credentials=true
    - --v=2
    volumeMounts:
    - name: exk8s-pki
      mountPath: /etc/exk8s/pki
      readOnly: true
    - name: exk8s-kubeconfig
      mountPath: /etc/exk8s/kubeconfig
      readOnly: true
  volumes:
  - name: exk8s-pki
    hostPath:
      path: /etc/exk8s/pki
      type: DirectoryOrCreate
  - name: exk8s-kubeconfig
    hostPath:
      path: /etc/exk8s/kubeconfig
      type: DirectoryOrCreate
9. kube-scheduler.yaml — static pod

Scheduler — assigns pods to nodes. The simplest static pod.

Go data: struct{ KubernetesVersion string }

  • --bind-address=127.0.0.1 — Local only.
  • --leader-elect=true — One active scheduler in multi-node setups.
  • Mounts: /etc/exk8s/kubeconfig only — Scheduler doesn't need certs directly, just its kubeconfig.
Rendered output
apiVersion: v1
kind: Pod
metadata:
  name: kube-scheduler
  namespace: kube-system
spec:
  hostNetwork: true
  priorityClassName: system-node-critical
  containers:
  - name: kube-scheduler
    image: registry.k8s.io/kube-scheduler:v1.35.3
    command:
    - kube-scheduler
    - --authentication-kubeconfig=/etc/exk8s/kubeconfig/scheduler.kubeconfig
    - --authorization-kubeconfig=/etc/exk8s/kubeconfig/scheduler.kubeconfig
    - --bind-address=127.0.0.1
    - --kubeconfig=/etc/exk8s/kubeconfig/scheduler.kubeconfig
    - --leader-elect=true
    - --v=2
    volumeMounts:
    - name: exk8s-kubeconfig
      mountPath: /etc/exk8s/kubeconfig
      readOnly: true
  volumes:
  - name: exk8s-kubeconfig
    hostPath:
      path: /etc/exk8s/kubeconfig
      type: DirectoryOrCreate
10. kube-proxy.yaml — DaemonSet

kube-proxy DaemonSet — runs on every node, handles service → pod routing via iptables.

Go data: struct{ KubernetesVersion, PodCIDR string }

  • kind: DaemonSet — One copy per node automatically.
  • hostNetwork: true — Needs host network to manage iptables rules.
  • tolerations: control-plane, master — Run even on control plane nodes (which normally repel workloads).
  • --proxy-mode=iptables — Use iptables for service routing (vs ipvs or nftables).
  • --hostname-override=$(NODE_NAME) — Get node name from downward API (spec.nodeName).
  • securityContext: privileged: true — Needs privilege to modify iptables.
  • Volume: /lib/modules — Access to kernel modules for iptables.
Rendered output
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-proxy
  namespace: kube-system
  labels:
    k8s-app: kube-proxy
spec:
  selector:
    matchLabels:
      k8s-app: kube-proxy
  template:
    metadata:
      labels:
        k8s-app: kube-proxy
    spec:
      hostNetwork: true
      priorityClassName: system-node-critical
      tolerations:
      - key: CriticalAddonsOnly
        operator: Exists
      - key: node-role.kubernetes.io/control-plane
        operator: Exists
        effect: NoSchedule
      - key: node-role.kubernetes.io/master
        operator: Exists
        effect: NoSchedule
      containers:
      - name: kube-proxy
        image: registry.k8s.io/kube-proxy:v1.35.3
        imagePullPolicy: IfNotPresent
        command:
        - kube-proxy
        - --kubeconfig=/etc/exk8s/kubeconfig/kube-proxy.kubeconfig
        - --cluster-cidr=172.16.0.0/12
        - --hostname-override=$(NODE_NAME)
        - --proxy-mode=iptables
        env:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        securityContext:
          privileged: true
        volumeMounts:
        - name: kubeconfig
          mountPath: /etc/exk8s/kubeconfig/kube-proxy.kubeconfig
          readOnly: true
        - name: lib-modules
          mountPath: /lib/modules
          readOnly: true
      volumes:
      - name: kubeconfig
        hostPath:
          path: /etc/exk8s/kubeconfig/kube-proxy.kubeconfig
          type: File
      - name: lib-modules
        hostPath:
          path: /lib/modules
          type: Directory
  updateStrategy:
    rollingUpdate:
      maxUnavailable: 1
    type: RollingUpdate
11. coredns.yaml — Deployment + Service

Full CoreDNS manifest — 5 Kubernetes resources in one file. The cluster's DNS server.

Go data: struct{ ClusterDomain, DNSServiceIP string }

  • ServiceAccount: coredns — Identity for CoreDNS pods to authenticate to the API server.
  • ConfigMap: Corefile — CoreDNS config. Plugins: errors, health (lameduck 5s), ready, kubernetes (watches cluster), prometheus (metrics on :9153), forward (upstream to /etc/resolv.conf), cache 30s, loop detection, auto-reload, loadbalance.
  • kubernetes {{ .ClusterDomain }} — Answer DNS queries for *.cluster.local from the K8s API.
  • ClusterRole: system:coredns — Permissions: list/watch endpoints, services, pods, namespaces, endpointslices.
  • Service: kube-dns, clusterIP: {{ .DNSServiceIP }} — Fixed IP (10.96.0.10). All pods use this for DNS. Ports: 53/UDP, 53/TCP, 9153/TCP (metrics).
  • Deployment: 2 replicas — Redundancy: if one pod dies, DNS still works.
  • dnsPolicy: Default — CoreDNS itself uses host DNS (not cluster DNS — it IS cluster DNS).
  • livenessProbe: /health:8080 — Restart if the CoreDNS process is unhealthy.
  • readinessProbe: /ready:8181 — Don't send traffic until ready.
  • resources: 100m CPU, 70-170Mi memory — Resource limits prevent runaway consumption.
Rendered output (abridged — full manifest is 154 lines, showing key sections)
# ServiceAccount + ClusterRole + ClusterRoleBinding (RBAC setup)
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |-
    .:53 {
        errors
        health { lameduck 5s }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
            ttl 30
        }
        prometheus 0.0.0.0:9153
        forward . /etc/resolv.conf
        cache 30
        loop
        reload
        loadbalance
    }
---
apiVersion: v1
kind: Service
metadata:
  name: kube-dns
  namespace: kube-system
spec:
  selector: { k8s-app: kube-dns }
  clusterIP: 10.96.0.10
  ports:
  - { name: dns, port: 53, protocol: UDP }
  - { name: dns-tcp, port: 53, protocol: TCP }
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: coredns
  namespace: kube-system
spec:
  replicas: 2
  # ... (tolerations, dnsPolicy: Default, probes, resources, volume mounts)
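Pods reach Services through names CoreDNS synthesizes from the cluster domain; a Service is answered at <service>.<namespace>.svc.<cluster-domain>. A small sketch of how such names are assembled (the helper name is illustrative):

```go
package main

import "fmt"

// serviceFQDN builds the DNS name CoreDNS answers for a Service,
// using the cluster domain the kubernetes plugin is configured with.
func serviceFQDN(service, namespace, clusterDomain string) string {
	return fmt.Sprintf("%s.%s.svc.%s", service, namespace, clusterDomain)
}

func main() {
	// The kube-dns Service itself, under the default cluster.local domain.
	fmt.Println(serviceFQDN("kube-dns", "kube-system", "cluster.local"))
}
```

A pod's resolv.conf points at 10.96.0.10 with search domains, so short names like `kube-dns.kube-system` expand to this FQDN.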
12. cilium-values.yaml — Helm values

Configuration for Cilium CNI installation via Helm.

Go data: struct{ K8sServiceHost, PodCIDR string }

  • k8sServiceHost: {{ .K8sServiceHost }} — API server IP for Cilium to connect to. Uses the first node's IP (not a ClusterIP, since Cilium IS the networking).
  • k8sServicePort: 6443 — Standard API server port.
  • ipv4.enabled: true / ipv6.enabled: false — IPv4-only for now.
  • kubeProxyReplacement: false — Run alongside kube-proxy (not replacing it).
  • routingMode: tunnel / tunnelProtocol: vxlan — Use VXLAN encapsulation for cross-node pod traffic.
  • enableIPv4Masquerade: true — NAT pod traffic going outside the cluster.
  • ipam.mode: cluster-pool — Cilium manages pod IPs from the pod CIDR.
  • clusterPoolIPv4PodCIDRList: [{{ .PodCIDR }}] — The actual IP range for pods.
Rendered output (example: first node at 10.0.1.5)
k8sServiceHost: 10.0.1.5
k8sServicePort: 6443
ipv4:
  enabled: true
ipv6:
  enabled: false
kubeProxyReplacement: false
operator:
  replicas: 1
routingMode: tunnel
tunnelProtocol: vxlan
enableIPv4Masquerade: true
ipam:
  mode: cluster-pool
  operator:
    clusterPoolIPv4PodCIDRList:
      - 172.16.0.0/12
13. single-node-bridge-cni.conflist — CNI

Temporary bridge CNI — basic pod networking before Cilium takes over.

Go data: struct{ PodCIDR string }

  • "type": "bridge", "bridge": "cni0" — Create a Linux bridge named cni0. Connects all pod veth interfaces.
  • "isDefaultGateway": true — The bridge is the default gateway for pods.
  • "ipMasq": true — NAT pod traffic going to the internet.
  • "hairpinMode": true — Allow a pod to reach itself via a service IP.
  • "ipam.type": "host-local" — Allocate IPs from a local pool (not distributed like Cilium).
  • "subnet": "{{ .PodCIDR }}" — IP range for pods.
  • "type": "portmap" — Plugin for hostPort mappings.
  • "type": "firewall" — Basic iptables firewall integration.
Rendered output
{
  "cniVersion": "0.4.0",
  "name": "exk8s",
  "plugins": [
    {
      "type": "bridge",
      "bridge": "cni0",
      "isDefaultGateway": true,
      "ipMasq": true,
      "hairpinMode": true,
      "ipam": {
        "type": "host-local",
        "ranges": [[{ "subnet": "172.16.0.0/12" }]],
        "routes": [{ "dst": "0.0.0.0/0" }]
      }
    },
    { "type": "portmap", "capabilities": { "portMappings": true } },
    { "type": "firewall" }
  ]
}
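Since this conflist is template output, it is worth sanity-checking the rendered JSON before shipping it in a bundle. A sketch that parses the document and extracts the plugin chain (the helper name and the check itself are illustrative, not part of k8sapi):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// conflist is a minimal view of a CNI conflist: just the fields the
// bridge shim relies on.
type conflist struct {
	CNIVersion string `json:"cniVersion"`
	Name       string `json:"name"`
	Plugins    []struct {
		Type string `json:"type"`
	} `json:"plugins"`
}

// pluginTypes extracts the plugin chain in order; chain order matters
// to the container runtime (bridge first, then portmap, firewall).
func pluginTypes(data []byte) ([]string, error) {
	var c conflist
	if err := json.Unmarshal(data, &c); err != nil {
		return nil, err
	}
	types := make([]string, 0, len(c.Plugins))
	for _, p := range c.Plugins {
		types = append(types, p.Type)
	}
	return types, nil
}

func main() {
	rendered := []byte(`{"cniVersion":"0.4.0","name":"exk8s","plugins":[{"type":"bridge"},{"type":"portmap"},{"type":"firewall"}]}`)
	types, err := pluginTypes(rendered)
	if err != nil {
		panic(err)
	}
	fmt.Println(types)
}
```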
14. exk8s-install-control-plane-prereqs.sh — script

Installs all runtime dependencies — containerd, K8s binaries, etcd.

Go data: struct{ KubeVersion, EtcdVersion string }

Key operations:

  • Detects architecture (amd64 or arm64)
  • apt install containerd + networking tools
  • Configures containerd with SystemdCgroup = true (required by K8s)
  • Symlinks /usr/lib/cni → /opt/cni/bin
  • Downloads 5 K8s binaries from dl.k8s.io with retry logic
  • Downloads etcd tarball from GitHub, extracts etcd + etcdctl
  • Creates all required directories
Rendered output (key lines — only {{ .KubeVersion }} and {{ .EtcdVersion }} are templated)
#!/usr/bin/env bash
set -euo pipefail
export DEBIAN_FRONTEND=noninteractive

# ... architecture detection ...

apt-get update -y
apt-get install -y ca-certificates conntrack containerd containernetworking-plugins \
  curl ebtables ethtool iproute2 iptables socat wget

# Configure containerd for systemd cgroups
containerd config default > /etc/containerd/config.toml
sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
ln -sfn /usr/lib/cni /opt/cni/bin

# Download K8s binaries (version rendered from Go)
KUBE_VERSION=v1.35.3
ETCD_VERSION=v3.6.10

download_binary "https://dl.k8s.io/v1.35.3/bin/linux/amd64/kube-apiserver" /usr/local/bin/kube-apiserver
download_binary "https://dl.k8s.io/v1.35.3/bin/linux/amd64/kube-controller-manager" /usr/local/bin/kube-controller-manager
download_binary "https://dl.k8s.io/v1.35.3/bin/linux/amd64/kube-scheduler" /usr/local/bin/kube-scheduler
download_binary "https://dl.k8s.io/v1.35.3/bin/linux/amd64/kubelet" /usr/local/bin/kubelet
download_binary "https://dl.k8s.io/v1.35.3/bin/linux/amd64/kubectl" /usr/local/bin/kubectl

# Download etcd
curl -fL "https://github.com/etcd-io/etcd/releases/download/v3.6.10/etcd-v3.6.10-linux-amd64.tar.gz" ...

mkdir -p /etc/exk8s/pki /etc/exk8s/kubeconfig /etc/exk8s/config /etc/exk8s/manifests \
  /etc/kubernetes/manifests /var/lib/etcd /var/lib/kubelet
15. exk8s-bootstrap-control-plane.sh — script

The main bootstrap runner. Sources config, verifies binaries, starts all services.

Go data: nil (no template variables — logic is all bash)

Key operations:

  • Source cluster.env with set -a (auto-export all vars)
  • Verify all 6 binaries exist (containerd, etcd, kube-apiserver, kube-controller-manager, kube-scheduler, kubelet)
  • systemctl enable --now containerd, etcd, kubelet
  • If IS_INITIAL_CONTROL_PLANE=true: run addon installer
Rendered output (no template variables — this is pure bash)
#!/usr/bin/env bash
set -euo pipefail

if [[ -f /etc/exk8s/config/cluster.env ]]; then
  set -a                          # auto-export all sourced vars
  . /etc/exk8s/config/cluster.env
  set +a
fi

# Verify every required binary is installed
for bin in containerd etcd kube-apiserver kube-controller-manager kube-scheduler kubelet; do
  if ! command -v "$bin" >/dev/null 2>&1; then
    echo "missing required binary: $bin"
    exit 1
  fi
done

mkdir -p /etc/kubernetes/manifests /var/lib/etcd /var/lib/kubelet /etc/exk8s/config
systemctl daemon-reload
systemctl enable containerd || true
systemctl restart containerd || true
systemctl enable etcd
systemctl enable kubelet
systemctl restart etcd          # Start the cluster database
systemctl restart kubelet       # Start the node agent (picks up static pod manifests)

# Only the first control plane node installs cluster addons
if [[ "${IS_INITIAL_CONTROL_PLANE:-false}" == "true" ]] \
   && [[ -x /usr/local/bin/exk8s-install-cluster-addons.sh ]]; then
  /usr/local/bin/exk8s-install-cluster-addons.sh
fi
16. exk8s-install-cluster-addons.sh — script

Installs cluster-wide addons. Only runs on the first control plane node.

Go data: nil

Key operations:

  • Exit immediately if not IS_INITIAL_CONTROL_PLANE=true
  • Wait up to 10 minutes for API server readiness (kubectl get --raw=/readyz)
  • Install Helm v3.19.2 (if not present)
  • helm upgrade --install cilium from OCI chart with --wait --timeout 10m
  • kubectl apply kube-proxy DaemonSet + wait for rollout
  • kubectl apply CoreDNS Deployment + wait for rollout
Rendered output (no template variables — pure bash)
#!/usr/bin/env bash
set -euo pipefail

# Source cluster config
if [[ -f /etc/exk8s/config/cluster.env ]]; then
  set -a && . /etc/exk8s/config/cluster.env && set +a
fi

# Only run on the initial control plane node
if [[ "${IS_INITIAL_CONTROL_PLANE:-false}" != "true" ]]; then
  exit 0
fi

export KUBECONFIG=/etc/exk8s/kubeconfig/admin.kubeconfig

# Wait for API server to be ready (up to 10 minutes)
for _ in $(seq 1 120); do
  kubectl get --raw=/readyz >/dev/null 2>&1 && break
  sleep 5
done

# Install Helm (downloads v3.19.2 for correct architecture)
install_helm  # ... (function body downloads and extracts helm binary)

# Install Cilium CNI via Helm
helm upgrade --install cilium oci://quay.io/cilium/charts/cilium \
  --version 1.19.2 \
  --namespace kube-system \
  --create-namespace \
  --values /etc/exk8s/config/cilium-values.yaml \
  --wait --timeout 10m

# Apply kube-proxy DaemonSet
kubectl apply -f /etc/exk8s/manifests/kube-proxy.yaml
kubectl -n kube-system rollout status daemonset/kube-proxy --timeout=10m

# Apply CoreDNS Deployment
kubectl apply -f /etc/exk8s/manifests/coredns.yaml
kubectl -n kube-system rollout status deployment/coredns --timeout=10m
08

Certificate Map

Every TLS connection and which cert authenticates it.

All certificates are signed by a single Cluster CA. Every component trusts the CA, so they trust each other.

  • kubectl → apiserver — admin.pem
  • apiserver → etcd — apiserver-etcd-client.pem
  • apiserver → kubelet — apiserver-kubelet-client.pem
  • kubelet → apiserver — kubelet-client.pem
  • kube-proxy → apiserver — (via kubeconfig)
  • etcd node → etcd node — etcd-peer.pem
  • HTTPS clients → apiserver — apiserver.pem (TLS)

Service Account keys are different — a raw RSA keypair, not certificates. API server signs JWT tokens with sa.key. Controller-manager verifies with sa.pub.

09

Network Architecture

10.96.0.0/12
Service CIDR

Virtual IPs for K8s Services. Intercepted by kube-proxy iptables rules — never hit a real interface.

10.96.0.1
kubernetes service (API server)
10.96.0.10
kube-dns service (CoreDNS)
172.16.0.0/12
Pod CIDR

Real IPs assigned to pods by Cilium. Pods communicate directly. Cilium uses VXLAN tunneling between nodes.

CNI Transition

Boot: Bridge CNI (cni0) provides basic same-node networking. After addons install: Cilium takes over with VXLAN tunneling, network policies, and cluster-wide IPAM. The bridge config is effectively replaced.

Port 6443

The API server listens on 6443. The security group allows this port from 0.0.0.0/0 so users can connect from anywhere. All other ports are restricted to within the subnet.

10

Every Source File

Click groups to expand.

cmd/ — Entry Points
cmd/k8sapi/main.go — Entry point: healthcheck, start server, graceful shutdown.
cmd/k8sapi/init.go — init(): validate env, init OTEL, DB pool, token cache.
cmd/gen-openapi/main.go — CLI to generate OpenAPI spec JSON.
internal/handlers/ — Request Handlers
clusters/create.go — POST /clusters: validate, gen CA, create VMs, gen certs, return kubeconfig (~400 lines).
bootstrap/controlplane.go — GET /bootstrap/control-plane/:vm_id: auth VM, gen certs, build bundle.
bootstrap/auth.go — Dual auth: IMDS token + bearer token validation.
health.go — 200 OK or 418 if shutting down.
favicon.go — Serves embedded favicon.ico.
globalerrorhandler.go — Catches errors, logs + OTEL span.
internal/bootstrap/ — Bundle & Certs
bundle.go — BuildControlPlaneNodeBundle(): assembles all files. FirstUsableIP(), RenderKubeconfig(), buildEtcdInitialCluster().
certs.go — CreateCA(), CreateSignedCert(), CreateRSAKeyPairPEM(), ParseCAPair().
dns.go — ClusterName(), ControlPlaneDNSName(), ControlPlaneNodeName().
rendered_files.go — 14 functions that call renderTemplate() with data structs.
userdata.go — ControlPlaneBootstrapUserData(): renders bootstrap-userdata.sh.
templates/renderer.go — go:embed *.tmpl, parse at init, b64 helper, Render().
templates/*.tmpl (16 files) — All templates — see Section 7 for line-by-line analysis.
internal/repository/ — Database
kubeclusters.go — CreateKubeCluster(), GetKubeClusterByID().
etcdnodes.go — CreateEtcdNode(), GetEtcdNodeByVMID(), ListEtcdNodesForCluster().
apiservernodes.go — CreateAPIServerNode(), GetAPIServerNodeByVMID(), ListAPIServerNodesForCluster().
tokens.go — Token validation + creation.
imdstokens.go — IMDS token validation (read-only).
vms.go — GetVMIdentityByID(): JOIN vms + interfaces + public_ipv4_allocations.
internal/computeclient/ + dnsclient/ — HTTP Clients
computeclient/client.go — CreateVM(), CreateSecurityGroup(), GetSubnet(). Go generics helpers.
dnsclient/client.go — CreateDNSRecord(). Uses DNS_SERVICE_TOKEN env var.
internal/db/, healthcheck/, shutdown/, utils/, otel/ — Infrastructure
db/*.go (8 files) — pgx driver with full OTEL tracing on every query/batch/connection.
healthcheck/healthcheck.go — Polls DB + OTEL health every second.
shutdown/handleshutdown.go — SIGINT/SIGTERM handler with 5s drain wait.
constants/constants.go — All constants: CIDRs, versions, DNS zone, base URL.
11

Key Design Decisions

No kubeadm

Everything from raw binaries. Full control over cluster topology, cert management, component versions, and systemd integration. Tradeoff: every systemd unit and manifest must be hand-crafted.

Server-side PKI

All certs generated by k8sapi, never by VMs. Centralizes trust — VMs never have the CA key. Certs are delivered via the authenticated bootstrap API.

Dual authentication for bootstrap

IMDS token (proves "I am VM X") + bearer token (proves "I belong to org Y"). Prevents both external and internal impersonation. A VM can only fetch its own bundle.

Best-effort DNS

DNS failures don't block cluster creation. Users can always connect via IP. The dns_configured flag reports status.

Bridge CNI bootstrap shim

Solves the chicken-and-egg: pods need networking, but the CNI plugin (Cilium) runs as pods. Bridge CNI provides minimal networking until Cilium is installed.

Initial control plane concept

First VM (allNodes[0]) installs cluster addons. Others skip it. Prevents duplicate installations in multi-node setups.

Shared database

k8sapi reads from computeapi's tables (vms, interfaces). Avoids inter-service API calls for VM identity lookup. Tradeoff: schema coupling.