Random notes on Kubernetes

In what continues to be very off-brand content for me (I'm really a kernel developer, I promise!), here are some random notes on some k8s stuff that will hopefully help someone. I'll keep this post as a running log of interesting stuff as I study for the CKA exam.

Deploying k8s with the Canonical Distribution of Kubernetes (CDK)

Does your deployment hang forever with kubernetes-master "Waiting for kube-system pods to start"?

I was using a local deployment with LXD, and I needed to do the following 3 things:

1. Disable swap on the host. I used juju ssh to connect to a kubernetes-master unit and ran kubectl get nodes | less -S, which showed that the nodes weren't starting the container runtime. Using juju ssh to connect to a node and running sudo kubectl, I found out that the container runtime was refusing to start because the host had swap enabled! After disabling it with sudo swapoff -a on the host, I was able to make progress and get a different error.
2. Don't use ZFS for the LXD storage pool. From /var/log/syslog on the worker unit, I found that the kubelet was stuck in an infinite start loop because docker was trying to use ZFS and the ZFS tools weren't installed. Installing zfsutils-linux let it get further, but only far enough to show that docker was, for some reason, trying to use ZFS inside the container. That would never work, as the ZFS pool wasn't being made available inside the container. So at this point I razed my local install to the ground and rebuilt without ZFS.
3. Get proxy settings right. This is a bit specific to my environment, as I have a proxy that allows access to package repositories but not to the internet as a whole. I needed to run juju model-config http-proxy="http://whatever" (and again with https-proxy), but I also had to set the entire lxdbr0 subnet as no-proxy with juju model-config no-proxy="$(printf '%s,' 10.172.217.{1..255}).lxd", followed by lxc restart --all. (Update: see below!)
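
The swap problem in particular is easy to check for before deploying. Here is a minimal sketch of such a check, assuming a Linux host where /proc/swaps lists active swap devices (one header line, then one line per device). The swap_enabled helper is a name I made up for illustration; it is not part of juju or CDK.

```shell
#!/bin/sh
# Hypothetical helper: exits 0 if the given swaps file lists at least one
# active swap device. Defaults to /proc/swaps, so this is Linux-only.
swap_enabled() {
    swaps_file="${1:-/proc/swaps}"
    # Skip the header line and count the non-empty lines that remain.
    devices="$(tail -n +2 "$swaps_file" 2>/dev/null | grep -c . || true)"
    [ "${devices:-0}" -gt 0 ]
}

if swap_enabled; then
    echo "swap is enabled: run 'sudo swapoff -a' on the host before deploying"
else
    echo "no active swap found"
fi
```

Note that swapoff -a only lasts until the next reboot, so the check is worth re-running if you rebuild the deployment later.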

After these 3 steps, juju seems to think everything is deployed. Yay! What we can learn from this is that even with a pretty fine-tuned distribution like CDK, deploying k8s on anything other than really well-trodden paths (e.g. GKE or AWS) is likely to be an interesting exercise, and that the plumbing of error info through juju has a ways to go still.

Why does my network not work?

I tried to deploy a pod. The network didn't seem to work. I looked at my logs on a node. I saw:

kubelet.daemon[1279]: E0522 13:42:56.378989 1279 streamwatcher.go:109] Unable to decode an event from the watch stream: stream error: stream ID 201; INTERNAL_ERROR

I realised that on my containers, doing stuff like curl http://localhost:<random port> went out to the proxy. So I changed my no-proxy config:

juju model-config no-proxy="$(printf '%s,' 10.172.217.{1..255}).lxd,localhost,127.0.0.1"
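
The {1..255} brace expansion in that command only works in bash and similar shells. Below is a rough sketch of building the same list with a plain POSIX loop, in case it needs to run somewhere else. The build_no_proxy function and the APPLY variable are names I invented for this example, and 10.172.217 is simply the lxdbr0 subnet on my machine; yours will almost certainly differ.

```shell
#!/bin/sh
# Hypothetical helper: print a comma-separated no-proxy list covering
# <prefix>.1 through <prefix>.255, followed by .lxd and the loopback names.
build_no_proxy() {
    prefix="$1"
    list=""
    i=1
    while [ "$i" -le 255 ]; do
        list="${list}${prefix}.${i},"
        i=$((i + 1))
    done
    # The list already ends in a comma, so this yields "...255,.lxd,localhost,127.0.0.1".
    printf '%s.lxd,localhost,127.0.0.1\n' "$list"
}

NO_PROXY_LIST="$(build_no_proxy 10.172.217)"

# APPLY is an assumption of this sketch: only touch the model when asked to.
if [ "${APPLY:-0}" = 1 ]; then
    juju model-config no-proxy="$NO_PROXY_LIST"
else
    printf '%s\n' "$NO_PROXY_LIST"
fi
```

Run it as-is to inspect the list, or with APPLY=1 to hand the list to juju model-config.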

This still didn't fix things. Removing the proxy entirely - temporarily at least - does make the network work, at the cost of making downloading new images impossible.

More updates to come when I figure out a good long-term solution to this.


Daniel's Notes
