Random notes on Kubernetes

In what continues to be very off-brand content for me (I'm really a kernel developer, I promise!), here are some random notes on some k8s stuff that will hopefully help someone. I'll keep this post as a running log of interesting stuff as I study for the CKA exam.

Deploying k8s with the Canonical Distribution of Kubernetes (CDK)

Does your deployment hang forever with kubernetes-master "Waiting for kube-system pods to start"?

I was using a local deployment with LXD, and I needed to do the following 3 things:
  1. Disable swap on the host. In my case, I used juju ssh to connect to a kubernetes-master unit, then kubectl get nodes|less -S. This informed me that the nodes weren't starting the container runtime. Using juju ssh to connect to a node and running sudo kubectl, I found out that the container runtime was refusing to start because the host had swap memory enabled! After disabling that with sudo swapoff -a on the host, I was able to make progress and get a different error.
  2. Don't use ZFS for the LXD storage pool. Then, from /var/log/syslog on the worker unit, I found the kubelet was in an infinite start loop because docker was trying to use ZFS and the ZFS tools weren't installed. Installing zfsutils-linux allowed this to progress, but all it did was show that for some reason docker was trying to use ZFS inside the container, which would never work, as the ZFS pool wasn't being made available inside the container. So at this point I razed my local install to the ground and rebuilt without ZFS.
  3. Get proxy settings right. This is a bit specific to my environment, as I have a proxy that allows access to package repositories but not the internet as a whole. I ended up needing to do juju model-config http-proxy="http://whatever" (and again with https-proxy), but I also ended up having to set the entire lxdbr0 subnet as no-proxy with juju model-config no-proxy="$(printf '%s,' 10.172.217.{1..255}).lxd" and lxc restart --all. (Update: see below!)
After these 3 steps, juju seems to think everything is deployed. Yay! What we can learn from this is that even with a pretty fine-tuned distribution like CDK, deploying k8s on anything other than really well-trodden paths (e.g. GKE or AWS) is likely to be an interesting exercise, and that the plumbing of error info through juju has a ways to go still.
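If you want to catch the swap problem from step 1 before deploying, one option is to look at /proc/swaps on the host. This is only a sketch: swap_active is a name made up for illustration, and it assumes the usual Linux layout of one header line followed by one line per active swap area.

```shell
# Hypothetical helper (not part of juju or CDK): succeeds if the given
# swaps table lists at least one active swap area.
# Defaults to /proc/swaps; a different path can be passed for testing.
swap_active() {
    swaps_file="${1:-/proc/swaps}"
    # Line 1 is the column header; any later line is an active swap area.
    awk 'NR > 1 { found = 1 } END { exit !found }' "$swaps_file"
}
```

On the host, something like `swap_active && sudo swapoff -a` would then only turn swap off when there is some to turn off.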

Why does my network not work?

I tried to deploy a pod. The network didn't seem to work. I looked at my logs on a node. I saw:

kubelet.daemon[1279]: E0522 13:42:56.378989    1279 streamwatcher.go:109] Unable to decode an event from the watch stream: stream error: stream ID 201; INTERNAL_ERROR

I realised that on my containers, doing stuff like curl http://localhost:<random port> went out to the proxy. So I changed my no-proxy config:

juju model-config no-proxy="$(printf '%s,' 10.172.217.{1..255}).lxd,localhost,127.0.0.1" 

This still didn't fix things. Removing the proxy entirely - temporarily at least - does make the network work, at the cost of making downloading new images impossible.
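The no-proxy value above relies on bash brace expansion and gets unwieldy as entries are added. Here is a rough sketch of building the same string with a small function instead. build_no_proxy is a hypothetical helper name, and it assumes (as in my setup) that the bridge subnet is a /24 where hosts 1 to 255 should all bypass the proxy.

```shell
# Hypothetical helper: print a comma-separated no-proxy list covering
# <prefix>.1 through <prefix>.255, followed by any extra entries given.
# The body runs in a subshell so its variables do not leak into the caller.
build_no_proxy() (
    prefix="$1"
    shift
    list=""
    i=1
    while [ "$i" -le 255 ]; do
        list="${list}${prefix}.${i},"
        i=$((i + 1))
    done
    # Append the extra entries (domain suffixes, localhost, and so on).
    for extra in "$@"; do
        list="${list}${extra},"
    done
    # Drop the trailing comma before printing.
    printf '%s\n' "${list%,}"
)
```

With that defined, the command above becomes juju model-config no-proxy="$(build_no_proxy 10.172.217 .lxd localhost 127.0.0.1)". It produces the same string, so it doesn't fix the underlying network problem either; it just makes the list easier to change.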

More updates to come when I figure out a good long-term solution to this.
