My CoreOS / Kubernetes Journey: Kubernetes

I need to learn and deploy Kubernetes quickly. I have a good bit of experience with Docker and docker-compose, and have read of the wonders of Kubernetes and of course watched Kelsey Hightower's Kubernetes videos. But I have never actually used it. So here is how I did that, part the third.

What I Do For Fun On A Saturday

That. Was. Painful. I should have done more documentation as I went along, and I really hope I am able to capture what all went down yesterday. I went to bed at 2am with a more-or-less done Kubernetes install that did not work at all because the pods absolutely could not speak to one another.

The official CoreOS guide CoreOS + Kubernetes Step By Step is unfortunately outdated, and downright incorrect. At every step of the way, I was faced with weird errors that went unanswered on the Internet, except with someone official saying "Why not just pay Google?" No, thank you, this is about NOT running my applications on Someone Else's Computer.

I have dealt with poorly documented software before. No stranger to reading source code in languages I don't know. That's fine, and sometimes one can take the opportunity to contribute back to a project's documentation. (The price we pay for Free software is that if you don't like something about it, it's your fault for not fixing it.) But when you call your software open source and then intentionally make the documentation worse than useless, leading users down wrong paths and wasting my time... THAT is unforgivable. Hey Google, don't be evil.

I am not saying that I can write great documentation, but at least the information you find here will be correct as of Container Linux 1855.5.0 and Kubernetes 1.12.2. The deployment process is complicated but I am thinking about scripting it, and then devising a test that can tell me when my script fails, so that I can come back and update this documentation.

This will be modeled off of the incorrect and misleading CoreOS guide, with errors corrected where I can. I am not guaranteeing complete correctness or best practices, just showing what worked and hopefully saving you a little bit of time.

With experience, hopefully I can come back and provide better information.

Strap in, this is a long one!

Setup Guide for Kubernetes 1.12.x on Container Linux by CoreOS

This document describes how to set up Kubernetes in single-master mode, with one Master and two Workers. Kubernetes Pods will still function on Worker nodes with no Master, but you won't be able to do any of the cool dynamic configuration stuff or interact with the platform at all.

First, some terminology. If you are totally new to Kubernetes terminology, go read the Standardized Glossary first and then come back here.

A Pod is the smallest unit of Kubernetes. It contains one or more containers and represents a self-contained application unit. It might not be an entire application, but it is the smallest piece of an application that needs to be grouped together in order to work. For example, your application might expose a REST API, but also a monitoring service that consumes the same API. These two things should always be deployed together. The database, however, can be passed in externally and does not need to be grouped with your application.

A Service describes how to access an application, and can describe ports and load-balancers. It fronts one or more Pods, and basically represents the front-end API of an application that could span many Pods.

The Pod Network is where your Pods will be assigned IP addresses. We are going to use Flannel in this guide, and its job is to provide these IP addresses and the inter-pod communication layer.

The Service Network is where your Services will be exposed. It is an internal, private network. It is not managed by Flannel; it is exposed on each individual node by kube-proxy. Different pods can use this private network to access each other's services. This seems overly complicated, but the decoupling provided is the magic that allows Kubernetes to just schedule your pods to run where-the-hell-ever without you (or your developers) having to clearly plan your cluster design in advance. If anyone reading this has ever used Veritas.... yeah, that's a powerful tool, but this is so refreshing.
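
If the Pod/Service split still feels abstract, here is a throwaway exercise you can run once the cluster at the end of this page is working. The name "echo" and the nginx image are just placeholders; the point is that the Pods land on the Pod Network and the Service gets an IP on the Service Network.

kubectl run echo --image=nginx --port=80       # Deployment; its Pods get Pod Network IPs (10.2.x.x below)
kubectl get pods -o wide                       # the IP column shows the Flannel-assigned addresses
kubectl expose deployment echo --port=80       # Service; gets a ClusterIP on the Service Network (10.3.x.x below)
kubectl get svc echo                           # that CLUSTER-IP is reachable from any Pod via kube-proxy
kubectl delete svc,deployment echo             # clean up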

Determine your Constants

I am going to provide a whole bunch of templates below. These are mainly from the CoreOS website but have been corrected to function on a current system as of November 2018. Also, any routable IPs that I show are representative of my cluster; I will try to point out where you need to make changes, and where you can safely keep my values.

First, let's bring forward the table of CoreOS nodes and add a little more information about each system's role. Up to this point, they have all been equal peers, but now we are going with a master/worker setup and need to make some decisions. I made kube0 the master; all real running applications will exist only on kube1 and kube2.

linode   reverse dns hostname   private ip        public ip       role
kube0    kube0.jefftickle.com   192.168.227.219   66.228.62.28    master
kube1    kube1.jefftickle.com   192.168.228.91    45.79.202.82    worker
kube2    kube2.jefftickle.com   192.168.225.159   23.239.19.243   worker

For our constants, I am going to be setting environment variables for a Linux shell like bash. This makes it easy to use envsubst to later switch them out in the template files. You can put these in a file, maybe kube.cfg, and then source that file into your shell by typing source kube.cfg.

You are going to need a list of all your etcd endpoints, which in our case run on the same nodes but are listening on the Linode Private IP space:

ETCD_ENDPOINTS=https://192.168.227.219:2379,https://192.168.228.91:2379,https://192.168.225.159:2379

We have decided which host is to be the master (kube0) and need to reference its Public IP in a number of places:

MASTER_HOST=66.228.62.28

Determine an IP space for the Pod Network. This will be managed by Flannel and will be the main IP address assigned to every container built by Kubernetes. Make sure it doesn't conflict with any other network your containers will need to access, and that there is absolutely no overlap with the Service Network.

POD_NETWORK=10.2.0.0/16

Determine an IP space for the Service Network. This will NOT be managed by Flannel; it will be exposed to each individual node by kube-proxy.

SERVICE_NETWORK=10.3.0.0/16

Kubernetes has an internal API that can be accessed by Pods. It must be the first IP address in the SERVICE_NETWORK range; it becomes the ClusterIP of the built-in kubernetes Service, which is how Pods reach the apiserver. We'll see it show up in kubectl output later.

K8S_SERVICE_IP=10.3.0.1

Kubernetes also has an internal DNS that allows pods to access one another by name. This IP must be in the SERVICE_NETWORK, and the same IP must be configured on all worker nodes. Every other IP can be configured dynamically, because this will provide a name service!

DNS_SERVICE_IP=10.3.0.10
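
For reference, here is everything above collected into one kube.cfg, along with the envsubst trick I mentioned. The variables need to be exported for envsubst to see them, and the template filename is just an example; any file containing ${MASTER_HOST}-style placeholders will do.

# kube.cfg
export ETCD_ENDPOINTS=https://192.168.227.219:2379,https://192.168.228.91:2379,https://192.168.225.159:2379
export MASTER_HOST=66.228.62.28
export POD_NETWORK=10.2.0.0/16
export SERVICE_NETWORK=10.3.0.0/16
export K8S_SERVICE_IP=10.3.0.1
export DNS_SERVICE_IP=10.3.0.10

# render a template into a real config file
source kube.cfg
envsubst < kubelet.service.template > kubelet.service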

Certificate Authority

Once again, we need a certificate authority to authenticate all nodes and clients within our Kubernetes cluster. This is similar to what was done for etcd, but we will make a CA just for Kubernetes that will remain separate from etcd.

Also at this point the official documentation actually uses my preferred tool, openssl. Sorry about that cfssl mess earlier!

Create a new directory to work in...

[jwt@inara kubernetes-scripts]$ mkdir ca-kube
[jwt@inara kubernetes-scripts]$ cd ca-kube

Create a new certificate authority which will be used to sign the rest of these certificates.

[jwt@inara ca-kube]$ openssl genrsa -out ca-key.pem 2048
[jwt@inara ca-kube]$ openssl req -x509 -new -nodes \
                         -key ca-key.pem \
                         -days 3650 \
                         -out ca.pem \
                         -subj "/CN=kube-ca"

Create a file openssl.cnf with the following contents. This will only be used on the Master node, so the DNS information and IP addresses below should reflect that. Please make sure to change the values under [alt_names] to reflect your setup (IP.1 is K8S_SERVICE_IP and IP.2 is the Master's public IP). TODO... if we were making a multi-master setup, we would also need the load-balancer front end information here.

[req]
req_extensions = v3_req
distinguished_name = req_distinguished_name
[req_distinguished_name]
[ v3_req ]
basicConstraints = CA:FALSE
keyUsage = nonRepudiation, digitalSignature, keyEncipherment
subjectAltName = @alt_names
[alt_names]
DNS.1 = kube0
DNS.2 = kube0.local
DNS.3 = kube0.jefftickle.com
IP.1 = 10.3.0.1
IP.2 = 66.228.62.28

And now create the API Server Keypair.

[jwt@inara ca-kube]$ openssl genrsa -out apiserver-key.pem 2048
[jwt@inara ca-kube]$ openssl req -new \
                         -key apiserver-key.pem \
                         -out apiserver.csr \
                         -subj "/CN=kube-apiserver" \
                         -config openssl.cnf
[jwt@inara ca-kube]$ openssl x509 -req \
                         -in apiserver.csr \
                         -CA ca.pem \
                         -CAkey ca-key.pem \
                         -CAcreateserial \
                         -out apiserver.pem \
                         -days 365 \
                         -extensions v3_req \
                         -extfile openssl.cnf

Create a file worker-openssl.cnf with the following contents. This will be used on the worker nodes, and it takes an environment variable to specify the IP address of the worker. You shouldn't need to change anything from this template since we will leverage the environment variable below.

[ req ]
req_extensions = v3_req
distinguished_name = req_distinguished_name
[ req_distinguished_name ]
[ v3_req ]
basicConstraints = CA:FALSE
keyUsage = nonRepudiation, digitalSignature, keyEncipherment
subjectAltName = @alt_names
[ alt_names ]
IP.1 = $ENV::WORKER_IP

Create a keypair for kube1:

[jwt@inara ca-kube]$ openssl genrsa -out kube1-key.pem 2048
[jwt@inara ca-kube]$ WORKER_IP=45.79.202.82 openssl req -new \
                         -key kube1-key.pem \
                         -out kube1.csr \
                         -subj "/CN=kube1.jefftickle.com" \
                         -config worker-openssl.cnf
[jwt@inara ca-kube]$ WORKER_IP=45.79.202.82 openssl x509 -req \
                         -in kube1.csr \
                         -CA ca.pem \
                         -CAkey ca-key.pem \
                         -CAcreateserial \
                         -out kube1.pem \
                         -days 365 \
                         -extensions v3_req \
                         -extfile worker-openssl.cnf

And create a keypair for kube2:

[jwt@inara ca-kube]$ openssl genrsa -out kube2-key.pem 2048
[jwt@inara ca-kube]$ WORKER_IP=23.239.19.243 openssl req -new \
                         -key kube2-key.pem \
                         -out kube2.csr \
                         -subj "/CN=kube2.jefftickle.com" \
                         -config worker-openssl.cnf
[jwt@inara ca-kube]$ WORKER_IP=23.239.19.243 openssl x509 -req \
                         -in kube2.csr \
                         -CA ca.pem \
                         -CAkey ca-key.pem \
                         -CAcreateserial \
                         -out kube2.pem \
                         -days 365 \
                         -extensions v3_req \
                         -extfile worker-openssl.cnf
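
Before moving on, it is worth checking that the SANs actually made it into each certificate; a missing IP here turns into inscrutable TLS errors much later. Something along these lines will show them:

[jwt@inara ca-kube]$ openssl x509 -in apiserver.pem -noout -text | grep -A1 "Subject Alternative Name"
[jwt@inara ca-kube]$ openssl x509 -in kube1.pem -noout -text | grep -A1 "Subject Alternative Name"
[jwt@inara ca-kube]$ openssl x509 -in kube2.pem -noout -text | grep -A1 "Subject Alternative Name"

You should see the DNS names and both IPs on the apiserver certificate, and each worker's public IP on its own certificate.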

Finally, you will need a keypair later to access the cluster using kubectl. Keep this CA around, because each user of the system should have a unique client certificate for authentication. No AltNames are needed for this one, fortunately (just a CN), so don't worry about the openssl.cnf.

[jwt@inara ca-kube]$ openssl genrsa -out admin-key.pem 2048
[jwt@inara ca-kube]$ openssl req -new \
                         -key admin-key.pem \
                         -out admin.csr \
                         -subj "/CN=kube-admin"
[jwt@inara ca-kube]$ openssl x509 -req \
                         -in admin.csr \
                         -CA ca.pem \
                         -CAkey ca-key.pem \
                         -CAcreateserial \
                         -out admin.pem \
                         -days 365

You should now have a whole bunch of crypto files:

file                 purpose
openssl.cnf          Master Node OpenSSL conf (keep in case of new master)
worker-openssl.cnf   Worker Node OpenSSL conf (keep in case of new worker)
ca-key.pem           Kubernetes Certificate Authority Private Key (BIG SECRET)
ca.pem               Kubernetes CA Certificate (not secret)
apiserver.csr        Master Node (API Server) Cert Request (keep for regenerating later)
apiserver-key.pem    Master Node (API Server) Private Key (BIG SECRET)
apiserver.pem        Master Node (API Server) Certificate (not secret)
kube1.csr            Worker Node Cert Request (keep for regenerating later)
kube1-key.pem        Worker Node Private Key (BIG SECRET)
kube1.pem            Worker Node Certificate (not secret)
kube2.csr            Worker Node Cert Request (keep for regenerating later)
kube2-key.pem        Worker Node Private Key (BIG SECRET)
kube2.pem            Worker Node Certificate (not secret)
admin.csr            kubectl Admin Cert Request (keep for regenerating later)
admin-key.pem        kubectl Admin Private Key (BIG SECRET)
admin.pem            kubectl Admin Certificate (not secret)

Spoiler alert: we're going to be copying our CA to everything, and each cert/key pair to its specific system!

Create Scaffolding on Master Node

For this section we will ONLY be working on the Master node, kube0. We will set up Flannel, Docker, and Kubelet, which are the foundation of Kubernetes. Kubernetes itself actually runs as a series of pods managed by Kubelet; their containers run on Docker and communicate with each other over Flannel.

Let's start by copying our crypto assets and establishing an ssh connection:

[jwt@inara ca-kube]$ scp ca.pem apiserver-key.pem apiserver.pem core@kube0.jefftickle.com:~/
[jwt@inara ca-kube]$ ssh core@kube0.jefftickle.com
Container Linux by CoreOS stable (1855.5.0)
core@kube0 ~ $

Configure Flannel and Docker first. Flannel needs to talk to etcd, and it needs to start before Docker, and Docker needs to be told not to mess with the network.

Create /etc/flannel/options.env, which is an environment file that systemd will use to bootstrap flannel. We need to make it aware of etcd certificates from episode 1. Note, these are NOT the certificates we just copied up.

core@kube0 ~ $ sudo mkdir /etc/flannel
core@kube0 ~ $ vim /etc/flannel/options.env
FLANNELD_IFACE=192.168.227.219
FLANNELD_ETCD_ENDPOINTS=https://192.168.227.219:2379,https://192.168.228.91:2379,https://192.168.225.159:2379
FLANNELD_ETCD_KEYFILE=/etc/ssl/certs/kube0-etcd.key
FLANNELD_ETCD_CERTFILE=/etc/ssl/certs/kube0-etcd.crt
FLANNELD_ETCD_CAFILE=/etc/ssl/certs/ca-etcd.crt

Create a systemd drop-in to symlink this configuration to the right place

core@kube0 ~ $ sudo mkdir /etc/systemd/system/flanneld.service.d
core@kube0 ~ $ sudo vim  /etc/systemd/system/flanneld.service.d/40-ExecStartPre-symlink.conf
[Service]
ExecStartPre=/usr/bin/ln -sf /etc/flannel/options.env /run/flannel/options.env

Create a systemd drop-in to make Docker start after Flannel. Docker also needs to not mess with the network, which is what the docker_opts_cni.env environment file configures.

core@kube0 ~ $ sudo mkdir /etc/systemd/system/docker.service.d
core@kube0 ~ $ sudo vim /etc/systemd/system/docker.service.d/40-flannel.conf
[Unit]
Requires=flanneld.service
After=flanneld.service
[Service]
EnvironmentFile=/etc/kubernetes/cni/docker_opts_cni.env
core@kube0 ~ $ sudo mkdir -p /etc/kubernetes/cni
core@kube0 ~ $ sudo vim /etc/kubernetes/cni/docker_opts_cni.env
DOCKER_OPT_BIP=""
DOCKER_OPT_IPMASQ=""

Create a systemd unit to start Kubelet. Kubelet is the agent that starts and stops pods for Kubernetes. It is sort of the CPU on which Kubernetes runs. It takes instructions from local manifest files on disk (which we will put in /etc/kubernetes/manifests shortly), and from the kube-apiserver pod that it hosts (nice chicken-and-egg problem, eh?)

Note: you need to change the --hostname-override to be the Public IP address of the Master Node. You may need to change --cluster_dns if you chose a different DNS_SERVICE_IP in the first section of this document.

Put this in /etc/systemd/system/kubelet.service

[Service]
Environment=KUBELET_IMAGE=docker://gcr.io/google-containers/hyperkube:v1.12.2
Environment="RKT_RUN_ARGS=--uuid-file-save=/run/kubelet/pod.uuid \
  --volume var-log,kind=host,source=/var/log \
  --mount volume=var-log,target=/var/log \
  --volume dns,kind=host,source=/etc/resolv.conf \
  --mount volume=dns,target=/etc/resolv.conf \
  --insecure-options=image"
ExecStartPre=/usr/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/usr/bin/mkdir -p /var/log/containers
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/run/kubelet/pod.uuid
ExecStart=/usr/lib/coreos/kubelet-wrapper \
  --register-schedulable=false \
  --cni-conf-dir=/etc/kubernetes/cni/net.d \
  --network-plugin=cni \
  --container-runtime=docker \
  --allow-privileged=true \
  --pod-manifest-path=/etc/kubernetes/manifests \
  --hostname-override=66.228.62.28 \
  --cluster_dns=10.3.0.10 \
  --cluster_domain=cluster.local
ExecStop=/usr/bin/rkt stop --uuid-file=/run/kubelet/pod.uuid
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target

Now we need to set up a little etcd configuration for Flannel so that it knows what we want for networking. From a kube node, run this, switching out the Network section for the POD_NETWORK decided above; and replacing the 192.168 address with an etcd endpoint from your set.

core@kube0 ~ $ curl -X PUT \
    --cacert /etc/ssl/certs/ca-etcd.crt \
    --cert   /etc/ssl/certs/kube0-etcd.crt \
    --key    /etc/ssl/certs/kube0-etcd.key \
    -d "value={\"Network\":\"10.2.0.0/16\",\"Backend\":{\"Type\":\"vxlan\"}}" \
    "https://192.168.227.219:2379/v2/keys/coreos.com/network/config"

You can verify this was written with:

core@kube0 ~ $ curl \
    --cacert /etc/ssl/certs/ca-etcd.crt \
    --cert   /etc/ssl/certs/kube0-etcd.crt \
    --key    /etc/ssl/certs/kube0-etcd.key \
    https://192.168.227.219:2379/v2/keys/coreos.com/network/config
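
If you prefer etcdctl over raw curl, it can read back the same key; this is a sketch using the v2 API flags, which is the API flannel speaks:

core@kube0 ~ $ etcdctl \
    --endpoints  https://192.168.227.219:2379 \
    --ca-file    /etc/ssl/certs/ca-etcd.crt \
    --cert-file  /etc/ssl/certs/kube0-etcd.crt \
    --key-file   /etc/ssl/certs/kube0-etcd.key \
    get /coreos.com/network/config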

At this point, we can start the scaffolding that will hold up the things that actually are Kubernetes. Flannel provides cross-node networking, etcd provides shared key-value store, Docker provides a runtime environment, and Kubelet manages the things running in the Docker environment. Kubelet will watch /etc/kubernetes/manifests (created by above systemd script) and will run any pods described therein; and in the next step we will put some manifest files there. We've already started etcd, so now enable and start flanneld, docker, and kubelet.

core@kube0 ~ $ sudo systemctl daemon-reload
core@kube0 ~ $ sudo systemctl enable flanneld
Created symlink /etc/systemd/system/multi-user.target.wants/flanneld.service → /usr/lib/systemd/system/flanneld.service.
core@kube0 ~ $ sudo systemctl enable docker
Created symlink /etc/systemd/system/multi-user.target.wants/docker.service → /run/systemd/system/docker.service.
core@kube0 ~ $ sudo systemctl enable kubelet
Created symlink /etc/systemd/system/multi-user.target.wants/kubelet.service → /etc/systemd/system/kubelet.service.
core@kube0 ~ $ sudo systemctl start flanneld
core@kube0 ~ $ sudo systemctl start docker
core@kube0 ~ $ sudo systemctl start kubelet
core@kube0 ~ $ sudo journalctl -xef

Watch the log outputs for error messages. If I have made a mistake in this documentation, please let me know. Be on the lookout for errors from etcd saying a client has a bad certificate; that was something I struggled with, but this config should have it all correct now.
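
If journalctl -xef is too much of a firehose, you can scope it to just the three units we care about:

core@kube0 ~ $ sudo journalctl -u flanneld -u docker -u kubelet -f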

It is also worthwhile to see what rkt is running, since it runs all the scaffolding for us:

core@kube0 ~ $ rkt list
UUID     APP       IMAGE NAME                                 STATE       ...
10a00a48 etcd      quay.io/coreos/etcd:v3.3.9                 running     ...
13dca84c flannel   quay.io/coreos/flannel:v0.10.0             running     ...
a382aec5 hyperkube gcr.io/google-containers/hyperkube:v1.12.2 running     ...

Deploy Kubernetes Master

We're really in it now. After this step, you will be running Kubernetes. Not in a very robust fashion, but pods will be deployed and speaking to one another.

This should be rather straightforward. We will put some Kubernetes Manifests into the /etc/kubernetes/manifests path and Kubelet will run them. It's okay if they start a little out of order; they keep on retrying until they make the connection or fail entirely. It really is quite an impressive, robust system.

At this point, we need the keys and certificates that we generated earlier. Let's go ahead and put them into place.

core@kube0 ~ $ sudo mkdir -p /etc/kubernetes/ssl
core@kube0 ~ $ sudo cp ca.pem apiserver-key.pem apiserver.pem /etc/kubernetes/ssl/

Let's start with the apiserver. This is the front-end interface to Kubernetes. It is what kubectl speaks to, and it is how the worker nodes get their instructions.

Note that you will need to change out --etcd-servers below, and --advertise-address should be the Public IP of the Master Node. If you decided on a different SERVICE_NETWORK, change it in --service-cluster-ip-range.

Create /etc/kubernetes/manifests/kube-apiserver.yaml with these contents:

apiVersion: v1
kind: Pod
metadata:
  name: kube-apiserver
  namespace: kube-system
spec:
  hostNetwork: true
  containers:
  - name: kube-apiserver
    image: gcr.io/google-containers/hyperkube:v1.12.2
    command:
    - /hyperkube
    - apiserver
    - --bind-address=0.0.0.0
    - --etcd-servers=https://192.168.227.219:2379,https://192.168.228.91:2379,https://192.168.225.159:2379
    - --allow-privileged=true
    - --service-cluster-ip-range=10.3.0.0/16
    - --secure-port=443
    - --advertise-address=66.228.62.28
    - --admission-control=NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,ResourceQuota
    - --tls-cert-file=/etc/kubernetes/ssl/apiserver.pem
    - --tls-private-key-file=/etc/kubernetes/ssl/apiserver-key.pem
    - --client-ca-file=/etc/kubernetes/ssl/ca.pem
    - --service-account-key-file=/etc/kubernetes/ssl/apiserver-key.pem
    - --runtime-config=extensions/v1beta1/networkpolicies=true
    - --anonymous-auth=false
    - --etcd-cafile=/etc/etcd/certs/ca-etcd.crt
    - --etcd-certfile=/etc/etcd/certs/kube0-etcd.crt
    - --etcd-keyfile=/etc/etcd/certs/kube0-etcd.key
    livenessProbe:
      httpGet:
        host: 127.0.0.1
        port: 8080
        path: /healthz
      initialDelaySeconds: 120
      timeoutSeconds: 120
    ports:
    - containerPort: 443
      hostPort: 443
      name: https
    - containerPort: 8080
      hostPort: 8080
      name: local
    volumeMounts:
    - mountPath: /etc/kubernetes/ssl
      name: ssl-certs-kubernetes
      readOnly: true
    - mountPath: /etc/ssl/certs
      name: ssl-certs-host
      readOnly: true
    - mountPath: /etc/etcd/certs
      name: ssl-certs-etcd
      readOnly: true
  volumes:
  - hostPath:
      path: /etc/kubernetes/ssl
    name: ssl-certs-kubernetes
  - hostPath:
      path: /usr/share/ca-certificates
    name: ssl-certs-host
  - hostPath:
      path: /etc/ssl/certs
    name: ssl-certs-etcd

Next up is the kube-proxy. The Kubernetes Proxy takes Service traffic and sends it to the right place in the cluster, taking instructions from the apiserver. You should not need to change anything in this file.

Create /etc/kubernetes/manifests/kube-proxy.yaml with these contents:

apiVersion: v1
kind: Pod
metadata:
  name: kube-proxy
  namespace: kube-system
spec:
  hostNetwork: true
  containers:
  - name: kube-proxy
    image: gcr.io/google-containers/hyperkube:v1.12.2
    command:
    - /hyperkube
    - proxy
    - --master=http://127.0.0.1:8080
    securityContext:
      privileged: true
    volumeMounts:
    - mountPath: /etc/ssl/certs
      name: ssl-certs-host
      readOnly: true
  volumes:
  - hostPath:
      path: /usr/share/ca-certificates
    name: ssl-certs-host

Now we deploy the controller manager. This is the heart of Kubernetes: it runs the control loops that continuously drive the actual state of the system toward the desired state. It ultimately is the thing that makes decisions. You should not need to change anything in this file.

Create /etc/kubernetes/manifests/kube-controller-manager.yaml with these contents:

apiVersion: v1
kind: Pod
metadata:
  name: kube-controller-manager
  namespace: kube-system
spec:
  hostNetwork: true
  containers:
  - name: kube-controller-manager
    image: gcr.io/google-containers/hyperkube:v1.12.2
    command:
    - /hyperkube
    - controller-manager
    - --master=http://127.0.0.1:8080
    - --leader-elect=true
    - --service-account-private-key-file=/etc/kubernetes/ssl/apiserver-key.pem
    - --root-ca-file=/etc/kubernetes/ssl/ca.pem
    resources:
      requests:
        cpu: 200m
    livenessProbe:
      httpGet:
        host: 127.0.0.1
        path: /healthz
        port: 10252
      initialDelaySeconds: 15
      timeoutSeconds: 15
    volumeMounts:
    - mountPath: /etc/kubernetes/ssl
      name: ssl-certs-kubernetes
      readOnly: true
    - mountPath: /etc/ssl/certs
      name: ssl-certs-host
      readOnly: true
  volumes:
  - hostPath:
      path: /etc/kubernetes/ssl
    name: ssl-certs-kubernetes
  - hostPath:
      path: /usr/share/ca-certificates
    name: ssl-certs-host

And finally we will deploy the scheduler. This is the thing that figures out where and when to put your pods, based on resource requests, affinity, and what needs to be close to what. This is the other heart of Kubernetes, I guess. You should not need to change anything in this file.

Create /etc/kubernetes/manifests/kube-scheduler.yaml with these contents:

apiVersion: v1
kind: Pod
metadata:
  name: kube-scheduler
  namespace: kube-system
spec:
  hostNetwork: true
  containers:
  - name: kube-scheduler
    image: gcr.io/google-containers/hyperkube:v1.12.2
    command:
    - /hyperkube
    - scheduler
    - --master=http://127.0.0.1:8080
    - --leader-elect=true
    resources:
      requests:
        cpu: 100m
    livenessProbe:
      httpGet:
        host: 127.0.0.1
        path: /healthz
        port: 10251
      initialDelaySeconds: 15
      timeoutSeconds: 15

And that's all there is to it. Kubelet now has all of the information it needs to start creating these pods and bring up Kubernetes proper. You can check journalctl -xef for errors, or to make sure things are communicating properly. Please be aware that it does take some time to start up, maybe a minute or two. If you check your Docker output, you should see four containers called 'pause', and maybe some kube* containers. Each 'pause' container simply holds a pod's shared network namespace so the other containers in that pod can join it. When things are up and running properly, you should see something like this from Docker:

core@kube0 /etc/kubernetes/manifests $ sudo docker ps -a
CONTAINER ID        IMAGE                  COMMAND                  ...
208b5b8ac00b        3e4497624dd0           "/hyperkube proxy --…"   ...
c6d468f94270        3e4497624dd0           "/hyperkube apiserve…"   ...
c06f2316adb2        3e4497624dd0           "/hyperkube controll…"   ...
ed18b56facbc        3e4497624dd0           "/hyperkube schedule…"   ...
9081f7807cf2        k8s.gcr.io/pause:3.1   "/pause"                 ...
22682417a8d1        k8s.gcr.io/pause:3.1   "/pause"                 ...
f3c1ba750da2        k8s.gcr.io/pause:3.1   "/pause"                 ...
3b7e6ca7721a        k8s.gcr.io/pause:3.1   "/pause"                 ...
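
Each control plane component also answers a health check on the master's loopback interface, on the same ports the livenessProbes above use. Each of these should come back with a plain "ok":

core@kube0 ~ $ curl -s http://127.0.0.1:8080/healthz    # apiserver (local insecure port)
core@kube0 ~ $ curl -s http://127.0.0.1:10252/healthz   # controller-manager
core@kube0 ~ $ curl -s http://127.0.0.1:10251/healthz   # scheduler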

Deploy Kubernetes Workers

Kubernetes Workers are a good bit simpler than the Master. They need Flannel, Docker, and Kubelet, and those things are configured in the same way as on the master. The only Kubernetes Pod that is deployed on a Worker is the kube-proxy pod. Much of this is literally copypasta from above.

Please be aware that you need to do this N times where N is the number of Worker nodes. I am only going to list the values for kube1 in my cluster. So once you complete this section, start it over on kube2. Or be smarter than me and script it... but good luck if you have had the bright idea of using Ansible, yeah, that don't work so good when the remote OS has no Python.

Let's start by copying our crypto assets and establishing an ssh connection:

[jwt@inara ca-kube]$ scp ca.pem kube1-key.pem kube1.pem core@kube1.jefftickle.com:~/
[jwt@inara ca-kube]$ ssh core@kube1.jefftickle.com
Container Linux by CoreOS stable (1855.5.0)
core@kube1 ~ $

Configure Flannel and Docker.

core@kube1 ~ $ sudo mkdir /etc/flannel
core@kube1 ~ $ vim /etc/flannel/options.env
FLANNELD_IFACE=192.168.228.91
FLANNELD_ETCD_ENDPOINTS=https://192.168.227.219:2379,https://192.168.228.91:2379,https://192.168.225.159:2379
FLANNELD_ETCD_KEYFILE=/etc/ssl/certs/kube1-etcd.key
FLANNELD_ETCD_CERTFILE=/etc/ssl/certs/kube1-etcd.crt
FLANNELD_ETCD_CAFILE=/etc/ssl/certs/ca-etcd.crt

Create a systemd drop-in to symlink this configuration to the right place

core@kube1 ~ $ sudo mkdir /etc/systemd/system/flanneld.service.d
core@kube1 ~ $ sudo vim  /etc/systemd/system/flanneld.service.d/40-ExecStartPre-symlink.conf
[Service]
ExecStartPre=/usr/bin/ln -sf /etc/flannel/options.env /run/flannel/options.env

Create a systemd drop-in to make Docker start after Flannel and set the Docker network options.

core@kube1 ~ $ sudo mkdir /etc/systemd/system/docker.service.d
core@kube1 ~ $ sudo vim /etc/systemd/system/docker.service.d/40-flannel.conf
[Unit]
Requires=flanneld.service
After=flanneld.service
[Service]
EnvironmentFile=/etc/kubernetes/cni/docker_opts_cni.env
core@kube1 ~ $ sudo mkdir -p /etc/kubernetes/cni
core@kube1 ~ $ sudo vim /etc/kubernetes/cni/docker_opts_cni.env
DOCKER_OPT_BIP=""
DOCKER_OPT_IPMASQ=""

Create a systemd unit to start Kubelet.

Note: the --hostname-override here is the Public IP address of this particular node, so make sure to change that out for kube1 and kube2. Also, unlike the master, do not carry over --register-schedulable=false; the whole point of a worker is to be schedulable. Everything else is the same as before.

Put this in /etc/systemd/system/kubelet.service

[Service]
Environment=KUBELET_IMAGE=docker://gcr.io/google-containers/hyperkube:v1.12.2
Environment="RKT_RUN_ARGS=--uuid-file-save=/run/kubelet/pod.uuid \
  --volume var-log,kind=host,source=/var/log \
  --mount volume=var-log,target=/var/log \
  --volume dns,kind=host,source=/etc/resolv.conf \
  --mount volume=dns,target=/etc/resolv.conf \
  --insecure-options=image"
ExecStartPre=/usr/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/usr/bin/mkdir -p /var/log/containers
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/run/kubelet/pod.uuid
ExecStart=/usr/lib/coreos/kubelet-wrapper \
  --cni-conf-dir=/etc/kubernetes/cni/net.d \
  --network-plugin=cni \
  --container-runtime=docker \
  --allow-privileged=true \
  --pod-manifest-path=/etc/kubernetes/manifests \
  --hostname-override=45.79.202.82 \
  --cluster_dns=10.3.0.10 \
  --cluster_domain=cluster.local
ExecStop=/usr/bin/rkt stop --uuid-file=/run/kubelet/pod.uuid
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target

Note that at this point in the Master instructions, we set up some etcd configuration for Flannel. We do NOT need to do that here, since etcd is all replicated out.

It's time to start the scaffolding services!

core@kube1 ~ $ sudo systemctl daemon-reload
core@kube1 ~ $ sudo systemctl enable flanneld
Created symlink /etc/systemd/system/multi-user.target.wants/flanneld.service → /usr/lib/systemd/system/flanneld.service.
core@kube1 ~ $ sudo systemctl enable docker
Created symlink /etc/systemd/system/multi-user.target.wants/docker.service → /run/systemd/system/docker.service.
core@kube1 ~ $ sudo systemctl enable kubelet
Created symlink /etc/systemd/system/multi-user.target.wants/kubelet.service → /etc/systemd/system/kubelet.service.
core@kube1 ~ $ sudo systemctl start flanneld
core@kube1 ~ $ sudo systemctl start docker
core@kube1 ~ $ sudo systemctl start kubelet
core@kube1 ~ $ sudo journalctl -xef

Watch the log outputs for error messages. Check rkt to see what is running:

core@kube1 ~ $ rkt list
UUID     APP       IMAGE NAME                                 STATE       ...
2113c8a5 hyperkube gcr.io/google-containers/hyperkube:v1.12.2 running     ...
6d9c8f8e etcd      quay.io/coreos/etcd:v3.3.9                 running     ...
98502536 flannel   quay.io/coreos/flannel:v0.10.0             running     ...

And now that the scaffolding is going, let's deploy the kube-proxy pod. This is a little different from before; we need to make a kubeconfig file that tells it how to talk to the master. This is the same kind of kubeconfig that you will eventually use on the client, although it's a less manual process on the client.
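
One step that is easy to miss because the master section had it much earlier: the certificates we just scp'd up need to land where the kubeconfig and manifest below expect them. Mirroring what we did on kube0 (use kube2's files on kube2):

core@kube1 ~ $ sudo mkdir -p /etc/kubernetes/ssl
core@kube1 ~ $ sudo cp ca.pem kube1-key.pem kube1.pem /etc/kubernetes/ssl/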

First create the kubeconfig file so that it will be there when kube-proxy starts. Put these contents into /etc/kubernetes/worker-kubeconfig.yaml. Note the different worker certificate name for the particular node that you are on, and that server: should be https:// followed by the public IP of the master.

apiVersion: v1
kind: Config
clusters:
- name: local
  cluster:
    certificate-authority: /etc/kubernetes/ssl/ca.pem
    server: https://66.228.62.28
users:
- name: kubelet
  user:
    client-certificate: /etc/kubernetes/ssl/kube1.pem
    client-key: /etc/kubernetes/ssl/kube1-key.pem
contexts:
- context:
    cluster: local
    user: kubelet
  name: kubelet-context
current-context: kubelet-context
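
Before writing the kube-proxy manifest, I like to prove that this client certificate can actually reach the apiserver; if this fails, kube-proxy was never going to work either. Something like the following should return a small JSON blob with the server version (paths and address as configured above):

core@kube1 ~ $ curl \
    --cacert /etc/kubernetes/ssl/ca.pem \
    --cert   /etc/kubernetes/ssl/kube1.pem \
    --key    /etc/kubernetes/ssl/kube1-key.pem \
    https://66.228.62.28/version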

Put these contents into /etc/kubernetes/manifests/kube-proxy.yaml, you shouldn't need to change anything:

apiVersion: v1
kind: Pod
metadata:
  name: kube-proxy
  namespace: kube-system
spec:
  hostNetwork: true
  containers:
  - name: kube-proxy
    image: gcr.io/google-containers/hyperkube:v1.12.2
    command:
    - /hyperkube
    - proxy
    - --kubeconfig=/etc/kubernetes/worker-kubeconfig.yaml
    securityContext:
      privileged: true
    volumeMounts:
    - mountPath: /etc/ssl/certs
      name: "ssl-certs"
    - mountPath: /etc/kubernetes/worker-kubeconfig.yaml
      name: "kubeconfig"
      readOnly: true
    - mountPath: /etc/kubernetes/ssl
      name: "etc-kube-ssl"
      readOnly: true
  volumes:
  - name: "ssl-certs"
    hostPath:
      path: "/usr/share/ca-certificates"
  - name: "kubeconfig"
    hostPath:
      path: "/etc/kubernetes/worker-kubeconfig.yaml"
  - name: "etc-kube-ssl"
    hostPath:
      path: "/etc/kubernetes/ssl"

It may take a couple minutes to start, but after you create this file, kubelet should get to starting the pod. Check journalctl -xef for errors and docker ps -a for containers.
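
If the pod seems stuck, the proxy container's logs usually say why (bad cert path, wrong master address, and so on). The name filter below is just a convenience; plain docker ps shows the full container names:

core@kube1 ~ $ sudo docker ps --filter name=kube-proxy
core@kube1 ~ $ sudo docker logs $(sudo docker ps -q --filter name=kube-proxy | head -n1)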

Configure kubectl

Kubectl is the cool tool you always see people using to interact with Kubernetes; and if nothing is going wrong, it is really all that you need. Let's set that up on our workstation so that we can deploy some apps.

Note, this is not done on a server. You do this on your client workstation.

Install kubectl either through a package manager, or the disgusting and very untrustworthy way that has become popular of late:

[jwt@inara ~]$ wget https://storage.googleapis.com/kubernetes-release/release/v1.12.2/bin/linux/amd64/kubectl

Change 'linux' to 'darwin' above if you are doing this on a mac. On Windows Subsystem for Linux, leave it linux.

Make it executable and put it somewhere friendly in your path. By the way, it is a really bad idea to just download something from the Internet, make it executable, and then run it. I am really, really sick of people distributing software in this way. I mean at least hash the binary if nothing else, people.

[jwt@inara ~]$ chmod +x kubectl
[jwt@inara ~]$ sudo mv kubectl /usr/local/bin/

Now go to the directory where you made certificates at the beginning of this long page, or just use absolute paths below. We need to configure the cluster, the admin, and the system context, and then choose the system context. Also, make sure to change kube0.jefftickle.com for the public interface of your cluster. You can use an IP as well if you don't have DNS.

[jwt@inara ~]$ cd projects/kubernetes-scripts/ca-kube
[jwt@inara ca-kube]$ kubectl config set-cluster default-cluster \
    --server=https://kube0.jefftickle.com \
    --certificate-authority=ca.pem
[jwt@inara ca-kube]$ kubectl config set-credentials default-admin \
    --certificate-authority=ca.pem \
    --client-key=admin-key.pem \
    --client-certificate=admin.pem
[jwt@inara ca-kube]$ kubectl config set-context default-system \
    --cluster=default-cluster \
    --user=default-admin
[jwt@inara ca-kube]$ kubectl config use-context default-system
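
A couple of harmless commands confirm that kubectl is pointed where you think it is before you start querying anything:

[jwt@inara ca-kube]$ kubectl config current-context
[jwt@inara ca-kube]$ kubectl cluster-info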

And you're configured! Try using kubectl to connect to your cluster:

[jwt@inara ca-kube]$ kubectl get nodes
NAME            STATUS    AGE       VERSION
23.239.19.243   Ready     14h       v1.12.2
45.79.202.82    Ready     14h       v1.12.2

[jwt@inara ca-kube]$ kubectl get all --all-namespaces
NAMESPACE     NAME                          READY    STATUS    RESTARTS   AGE
kube-system   po/kube-proxy-23.239.19.243   1/1      Running   0          5h
kube-system   po/kube-proxy-45.79.202.82    1/1      Running   0          5h

NAMESPACE     NAME              CLUSTER-IP   EXTERNAL-IP   PORT(S)    AGE
default       svc/kubernetes    10.3.0.1     <none>        443/TCP    17h

Deploy Kubernetes Add-Ons

The CoreOS documentation has changed since last night, and it is still wrong. How can this be? Someone is actually updating it, and it's straight up incorrect and does not work. I discovered this because they removed a feature. Last night, on the add-ons page, they gave instructions on how to install the Kubernetes Dashboard. Now... gone. I don't get it. Did someone notice I was hitting their pages like crazy and try to throw me off even further from having success? The mind boggles.

Anyway, the core Kubernetes system is running and you can deploy applications. Kubernetes ships some "Add-ons" that will live in the kube-system namespace (namespace is a totally new term here; in short, it just partitions the cluster's resources, and kube-system is where the system's own components live. TODO: clean all this up).

One of these applications is kube-dns, which lets us interact between pods by name. The other is kube-dashboard, which gives us a real pretty web-based interface to look into the Kubernetes system.

Installing these things is pretty easy. You create the pod manifest, use kubectl to bring it up, and everything should just work. Unfortunately, last night I ran into an issue where the pods could not connect to the apiserver. I spent two hours beating my head against the wall (and repeating commands because I forgot about that damn kube-system namespace) and gave up and went to bed.

This morning, I realized that I had only actually done the reverse DNS instructions from episode 0 on kube0, not kube1 or kube2; and none of the systems had automatically picked up their hostnames. So make sure to use hostnamectl to set the proper hostnames for each system, and make sure to get those reverse DNS lookups correct before moving along. If you're here and haven't done that yet, I believe everything to this point will work without it. Now's the time to clean that up.

But, once that's done, this is really easy. If you decided to use a DNS_SERVICE_IP other than 10.3.0.10, then make sure to change the value of clusterIP: below. Otherwise just copypasta this into dns-addon.yaml:

apiVersion: v1
kind: Service
metadata:
  name: kube-dns
  namespace: kube-system
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    kubernetes.io/name: "KubeDNS"
spec:
  selector:
    k8s-app: kube-dns
  clusterIP: 10.3.0.10
  ports:
  - name: dns
    port: 53
    protocol: UDP
  - name: dns-tcp
    port: 53
    protocol: TCP


---


apiVersion: v1
kind: ReplicationController
metadata:
  name: kube-dns-v20
  namespace: kube-system
  labels:
    k8s-app: kube-dns
    version: v20
    kubernetes.io/cluster-service: "true"
spec:
  replicas: 1
  selector:
    k8s-app: kube-dns
    version: v20
  template:
    metadata:
      labels:
        k8s-app: kube-dns
        version: v20
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
        scheduler.alpha.kubernetes.io/tolerations: '[{"key":"CriticalAddonsOnly", "operator":"Exists"}]'
    spec:
      containers:
      - name: kubedns
        image: gcr.io/google_containers/kubedns-amd64:1.8
        resources:
          limits:
            memory: 170Mi
          requests:
            cpu: 100m
            memory: 70Mi
        livenessProbe:
          httpGet:
            path: /healthz-kubedns
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 60
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 5
        readinessProbe:
          httpGet:
            path: /readiness
            port: 8081
            scheme: HTTP
          initialDelaySeconds: 3
          timeoutSeconds: 5
        args:
        - --domain=cluster.local.
        - --dns-port=10053
        ports:
        - containerPort: 10053
          name: dns-local
          protocol: UDP
        - containerPort: 10053
          name: dns-tcp-local
          protocol: TCP
      - name: dnsmasq
        image: gcr.io/google_containers/kube-dnsmasq-amd64:1.4
        livenessProbe:
          httpGet:
            path: /healthz-dnsmasq
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 60
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 5
        args:
        - --cache-size=1000
        - --no-resolv
        - --server=127.0.0.1#10053
        - --log-facility=-
        ports:
        - containerPort: 53
          name: dns
          protocol: UDP
        - containerPort: 53
          name: dns-tcp
          protocol: TCP
      - name: healthz
        image: gcr.io/google_containers/exechealthz-amd64:1.2
        resources:
          limits:
            memory: 50Mi
          requests:
            cpu: 10m
            memory: 50Mi
        args:
        - --cmd=nslookup kubernetes.default.svc.cluster.local 127.0.0.1 >/dev/null
        - --url=/healthz-dnsmasq
        - --cmd=nslookup kubernetes.default.svc.cluster.local 127.0.0.1:10053 >/dev/null
        - --url=/healthz-kubedns
        - --port=8080
        - --quiet
        ports:
        - containerPort: 8080
          protocol: TCP
      dnsPolicy: Default

This is a very complicated version of a Kubernetes manifest. We'll be making simpler ones on the next episode.

Start the DNS add-on:

[jwt@inara kubernetes-scripts]$ kubectl create -f dns-addon.yaml
[jwt@inara kubernetes-scripts]$ kubectl get pods --namespace=kube-system
NAME                                READY     STATUS    RESTARTS   AGE
kube-dns-v20-5gtf8                  3/3       Running   12         13h
kube-proxy-23.239.19.243            1/1       Running   0          5h
kube-proxy-45.79.202.82             1/1       Running   0          5h

See all those restarts on mine? It was in the hundreds this morning, and I destroyed and recreated the service, and then figured out the reverse DNS issue. Boy.
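
If you want to convince yourself the resolver actually answers before moving on, a throwaway pod can do a lookup. This is a sketch; "dnstest" is an arbitrary name, and I reach for busybox:1.28 specifically because nslookup is known to be broken in later busybox images:

[jwt@inara kubernetes-scripts]$ kubectl run -it --rm dnstest \
    --image=busybox:1.28 --restart=Never \
    -- nslookup kubernetes.default.svc.cluster.local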

So that's DNS, but we can't really do anything with it right now. Let's install something that we can actually see running and feel successful. The kubernetes dashboard is a neat tool and I'll let you explore it on your own and won't spoil it with screenshots.

Create a file kube-dashboard-rc.yaml. This is the Replication Controller. We'll make another one in a minute that describes the Service.

apiVersion: v1
kind: ReplicationController
metadata:
  name: kubernetes-dashboard-v1.6.0
  namespace: kube-system
  labels:
    k8s-app: kubernetes-dashboard
    version: v1.6.0
    kubernetes.io/cluster-service: "true"
spec:
  replicas: 1
  selector:
    k8s-app: kubernetes-dashboard
  template:
    metadata:
      labels:
        k8s-app: kubernetes-dashboard
        version: v1.6.0
        kubernetes.io/cluster-service: "true"
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
        scheduler.alpha.kubernetes.io/tolerations: '[{"key":"CriticalAddonsOnly", "operator":"Exists"}]'
    spec:
      containers:
      - name: kubernetes-dashboard
        image: gcr.io/google_containers/kubernetes-dashboard-amd64:v1.6.0
        resources:
          limits:
            cpu: 100m
            memory: 50Mi
          requests:
            cpu: 100m
            memory: 50Mi
        ports:
        - containerPort: 9090
        livenessProbe:
          httpGet:
            path: /
            port: 9090
          initialDelaySeconds: 30
          timeoutSeconds: 30

Now make a file called kube-dashboard-svc.yaml:

apiVersion: v1
kind: Service
metadata:
  name: kubernetes-dashboard
  namespace: kube-system
  labels:
    k8s-app: kubernetes-dashboard
    kubernetes.io/cluster-service: "true"
spec:
  selector:
    k8s-app: kubernetes-dashboard
  ports:
  - port: 80
    targetPort: 9090

And we will start them with kubectl:

[jwt@inara kubernetes-scripts]$ kubectl create -f kube-dashboard-rc.yaml
[jwt@inara kubernetes-scripts]$ kubectl create -f kube-dashboard-svc.yaml

And verify that they are running (sometimes it takes a minute to start, so don't worry):

[jwt@inara kubernetes-scripts]$ kubectl get pods --namespace=kube-system
NAME                                READY     STATUS    RESTARTS   AGE
kube-dns-v20-5gtf8                  3/3       Running   12         13h
kube-proxy-23.239.19.243            1/1       Running   0          5h
kube-proxy-45.79.202.82             1/1       Running   0          5h
kubernetes-dashboard-v1.6.0-cv7x6   1/1       Running   8          14h
[jwt@inara kubernetes-scripts]$ kubectl get all --all-namespaces
NAMESPACE     NAME                                   READY     STATUS   ...
kube-system   po/kube-dns-v20-5gtf8                  3/3       Running  ...
kube-system   po/kube-proxy-23.239.19.243            1/1       Running  ...
kube-system   po/kube-proxy-45.79.202.82             1/1       Running  ...
kube-system   po/kubernetes-dashboard-v1.6.0-cv7x6   1/1       Running  ...

NAMESPACE     NAME                             DESIRED   CURRENT   READY   ...
kube-system   rc/kube-dns-v20                  1         1         1       ...
kube-system   rc/kubernetes-dashboard-v1.6.0   1         1         1       ...

NAMESPACE     NAME                       CLUSTER-IP   EXTERNAL-IP   PORT(S)   ...
default       svc/kubernetes             10.3.0.1     <none>        443/TCP   ...
kube-system   svc/kube-dns               10.3.0.10    <none>        53/UDP,53/TCP
kube-system   svc/kubernetes-dashboard   10.3.194.6   <none>        80/TCP    ...

If it looks like this, and everything is Ready and Running with no errors... then you have done it, my friend: you have bootstrapped a Kubernetes cluster. Now all the actually well-documented parts about it should work, until you run your nodes out of resources anyway!

Let's take a look at that dashboard; it really gave me quite the sense of accomplishment to be able to interact with a working application at long last. Here is another cool feature of kubectl: it can forward an internal port to your local system quickly and easily. (Substitute your own dashboard pod name from the kubectl get pods output above.) You don't need to worry about securing the dashboard on the front-end, because it has no accessible front-end unless you already have the correct authentication certs!

kubectl port-forward kubernetes-dashboard-v1.6.0-cv7x6 9090 --namespace=kube-system

Now open your web browser and navigate to http://localhost:9090. If you see a Kubernetes dashboard..... well done! If not.... good luck troubleshooting!

Next Time

On the next episode we will begin to use Kubernetes to deploy a real public-facing application.