k3s in LXC on Proxmox

On the host

Ensure the br_netfilter module is loaded (the file below only exists when it is)

cat /proc/sys/net/bridge/bridge-nf-call-iptables
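
If that file is missing, br_netfilter isn't loaded. A minimal way to load it (plus overlay, which also comes up in the comments below) and persist both across reboots; the modules-load.d file name here is just my choice:

modprobe br_netfilter
modprobe overlay

cat <<'EOF' > /etc/modules-load.d/k3s.conf
br_netfilter
overlay
EOF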

Disable swap

sysctl vm.swappiness=0
swapoff -a
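
If you want these to survive a reboot, something along these lines should work (standard Debian/Proxmox paths assumed; double-check /etc/fstab afterwards):

echo 'vm.swappiness=0' >> /etc/sysctl.conf
sed -i '/\sswap\s/ s/^/#/' /etc/fstab   # comment out any swap entries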

Enable IP Forwarding

The first time I tried to get this working, once the cluster was up, the traefik pods were in CrashLoopBackOff because IP forwarding was disabled. Since LXC containers share the host's kernel, we need to enable this on the host.

echo 'net.ipv4.ip_forward=1' >> /etc/sysctl.conf
sysctl --system
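
You can confirm the setting took effect with:

sysctl net.ipv4.ip_forward
# net.ipv4.ip_forward = 1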

Create the k3s container

Uncheck unprivileged container

[screenshot: general.png]

Set swap to 0

[screenshot: memory.png]

Enable DHCP

[screenshot: network.png]

Results

[screenshot: confirm.png]
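
If you prefer the CLI over the GUI, a roughly equivalent container can be created with pct; the container ID, template file name, storage names and sizes below are placeholders, so adjust them to your environment:

pct create 200 local:vztmpl/ubuntu-20.04-standard_20.04-1_amd64.tar.gz \
  --hostname k3s \
  --cores 2 --memory 4096 --swap 0 \
  --net0 name=eth0,bridge=vmbr0,ip=dhcp \
  --rootfs local-lvm:16 \
  --unprivileged 0 \
  --features nesting=1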

Back on the Host

Edit the config file for the container (/etc/pve/lxc/$ID.conf) and add the following:

lxc.apparmor.profile: unconfined
lxc.cgroup.devices.allow: a
lxc.cap.drop:
lxc.mount.auto: "proc:rw sys:rw"
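
One way to append these from the shell (replace $ID with your container ID; the container picks the changes up on its next start):

cat <<'EOF' >> /etc/pve/lxc/$ID.conf
lxc.apparmor.profile: unconfined
lxc.cgroup.devices.allow: a
lxc.cap.drop:
lxc.mount.auto: "proc:rw sys:rw"
EOF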

In the container

/etc/rc.local

/etc/rc.local doesn't exist in the default Ubuntu 20.04 LXC template provided by Proxmox. Create it with these contents:

#!/bin/sh -e

# Kubeadm 1.15 needs /dev/kmsg, which isn't present in LXC; a symlink to /dev/console works instead.
# see: https://github.com/kubernetes-sigs/kind/issues/662
if [ ! -e /dev/kmsg ]; then
    ln -s /dev/console /dev/kmsg
fi

# https://medium.com/@kvaps/run-kubernetes-in-lxc-container-f04aa94b6c9c
mount --make-rshared /

Then run this:

chmod +x /etc/rc.local
reboot
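
After the reboot you can sanity-check both tweaks from inside the container (findmnt ships with util-linux and should be present in the Ubuntu template):

ls -l /dev/kmsg                   # should be a symlink to /dev/console
findmnt -o TARGET,PROPAGATION /   # / should be listed as shared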

Installing k8s

k3sup Installation

Assuming $HOME/bin is in your PATH:

curl -sLS https://get.k3sup.dev | sh
mv k3sup ~/bin/k3sup && chmod +x ~/bin/k3sup
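
A quick sanity check that the binary is installed and on your PATH:

k3sup version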

k8s Installation

k3sup install --ip $CONTAINER_IP --user root
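
If you later add more containers as agent nodes, k3sup can join them to this server; $AGENT_IP here is just a placeholder for the new node's address:

k3sup join --ip $AGENT_IP --server-ip $CONTAINER_IP --user root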

Test

KUBECONFIG=kubeconfig kubectl get pods --all-namespaces
NAMESPACE     NAME                                     READY   STATUS      RESTARTS   AGE
kube-system   metrics-server-7566d596c8-zm7tj          1/1     Running     0          69m
kube-system   local-path-provisioner-6d59f47c7-ldbcl   1/1     Running     0          69m
kube-system   helm-install-traefik-glt48               0/1     Completed   0          69m
kube-system   coredns-7944c66d8d-67lxp                 1/1     Running     0          69m
kube-system   traefik-758cd5fc85-wzcst                 1/1     Running     0          68m
kube-system   svclb-traefik-cwd9h                      2/2     Running     0          42m

References

https://github.com/kubernetes-sigs/kind/issues/662
https://medium.com/@kvaps/run-kubernetes-in-lxc-container-f04aa94b6c9c

@jmturner

[INFO] systemd: Starting k3s
Created symlink /etc/systemd/system/multi-user.target.wants/k3s.service → /etc/systemd/system/k3s.service.

Error: error received processing command: Process exited with status 127

Forgive me for reviving an old thread but for those who still need this answer:

k3s is getting confused starting as root as it does not know it's in an unprivileged container and thus things aren't working. Add the following lines to ExecStart in the systemd file /etc/systemd/systemd/k3s.service (and then run systemctl daemon-reload) to get k3s running. This answer comes from k3s-io issue 4249:

--kubelet-arg=feature-gates=KubeletInUserNamespace=true \
--kube-controller-manager-arg=feature-gates=KubeletInUserNamespace=true \
--kube-apiserver-arg=feature-gates=KubeletInUserNamespace=true \
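
For reference, after appending those flags the ExecStart block of the generated unit ends up looking roughly like this (the binary path and any other arguments come from your own install, so treat it as a sketch):

ExecStart=/usr/local/bin/k3s \
    server \
    --kubelet-arg=feature-gates=KubeletInUserNamespace=true \
    --kube-controller-manager-arg=feature-gates=KubeletInUserNamespace=true \
    --kube-apiserver-arg=feature-gates=KubeletInUserNamespace=true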

@rsik commented Feb 15, 2023

k3s is getting confused starting as root as it does not know it's in an unprivileged container and thus things aren't working. Add the following lines to ExecStart in the systemd file /etc/systemd/systemd/k3s.service (and then run systemctl daemon-reload) to get k3s running. This answer comes from k3s-io issue 4249:

Small correction to your post: /etc/systemd/systemd/k3s.service -> /etc/systemd/system/k3s.service.

I used k3sup as normal (and received the 127 error), edited the service file (as above), then ran:

k3sup install --skip-install --host host.domain.tld --sudo=false --user root --ssh-key ~/.ssh/ssh_key

and all is well!

I am not sure if there is a way to pass a service file on the first install, but this worked for me - thank you @jmturner

@ky-bd commented Jun 19, 2023

I managed to start k3s in an unprivileged LXC container. I added the following to the CT conf file (also don't forget to check unprivileged container, or set unprivileged: 1 in the config):

lxc.cap.drop:
lxc.apparmor.profile: unconfined
lxc.mount.auto: proc:rw sys:rw cgroup:rw
lxc.cgroup2.devices.allow: c 10:200 rwm

modprobe / lsmod for br_netfilter might fail because it's compiled into the kernel rather than built as a loadable module. You can check this with grep 'BRIDGE_NETFILTER' /boot/config-$(uname -r). The overlay module needs to be loaded on the Proxmox host side as well.
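
A minimal check/load sequence on the Proxmox host might look like this (standard paths assumed):

grep 'BRIDGE_NETFILTER' /boot/config-$(uname -r)   # =y means built into the kernel, =m means loadable module
lsmod | grep -E 'br_netfilter|overlay'             # see what is currently loaded
modprobe overlay                                    # load overlay if it is not already present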

Then use k3sup to install (I'm actually installing and joining the k3s server to an existing cluster; modify this command as you need):

k3sup.exe join --host host.example.com --user root --ssh-key path_to_key --server-ip xxx.xxx.xxx.xxx --server --sudo=false

k3sup reported that the installation succeeded, though the k3s server didn't actually start properly. Add these options to /etc/systemd/system/k3s.service ( https://gist.github.com/triangletodd/02f595cd4c0dc9aac5f7763ca2264185?permalink_comment_id=4466758#gistcomment-4466758 )

--kubelet-arg=feature-gates=KubeletInUserNamespace=true \
--kube-controller-manager-arg=feature-gates=KubeletInUserNamespace=true \
--kube-apiserver-arg=feature-gates=KubeletInUserNamespace=true \

Then restart k3s:

systemctl daemon-reload
systemctl restart k3s

Now k3s should be running fine, with no need to run k3sup a second time. You can check it with systemctl status k3s and kubectl get nodes -A -o wide.

@glassman81

I managed to start k3s in an unprivileged LXC container. I added the following to the CT conf file (also don't forget to check unprivileged container, or set unprivileged: 1 in the config):

lxc.cap.drop:
lxc.apparmor.profile: unconfined
lxc.mount.auto: proc:rw sys:rw cgroup:rw
lxc.cgroup2.devices.allow: c 10:200 rwm

I was able to use unprivileged containers too, but I'm not sure cgroup:rw is necessary. I didn't use it, but everything seems to be working.

@glassman81 commented Jul 10, 2023

I was able to use unprivileged containers too, but I'm not sure cgroup:rw is necessary. I didn't use it, but everything seems to be working.

Scratch that. There are too many apps that error out when trying to do this unprivileged, like rancher.

2023/07/10 06:33:16 [INFO] Applying CRD machinesets.cluster.x-k8s.io
2023/07/10 06:33:23 [FATAL] error running the jail command: exit status 2

Privileged works though.

@ky-bd commented Jul 11, 2023

I was able to use unprivileged containers too, but I'm not sure cgroup:rw is necessary. I didn't use it, but everything seems to be working.

Scratch that. There are too many apps that error out when trying to do this unprivileged, like rancher.

2023/07/10 06:33:16 [INFO] Applying CRD machinesets.cluster.x-k8s.io
2023/07/10 06:33:23 [FATAL] error running the jail command: exit status 2

Privileged works though.

Yeah, I found that unprivileged LXC failed to mount block devices, so Longhorn and probably other CSI drivers won't work. I gave it up and just turned to VMs though.

@glassman81

I was able to use unprivileged containers too, but I'm not sure cgroup:rw is necessary. I didn't use it, but everything seems to be working.

Scratch that. There are too many apps that error out when trying to do this unprivileged, like rancher.

2023/07/10 06:33:16 [INFO] Applying CRD machinesets.cluster.x-k8s.io
2023/07/10 06:33:23 [FATAL] error running the jail command: exit status 2

Privileged works though.

Yeah, I found that unprivileged LXC failed to mount block devices, so Longhorn and probably other CSI drivers won't work. I gave it up and just turned to VMs though.

I'm having the same problem even with privileged LXCs. Longhorn goes through this process of constantly attaching/detaching when the frontend is a block device. When it's iSCSI, it doesn't even attempt to attach, though I think that's because the CSI driver doesn't support iSCSI mode.

Did you ever get Longhorn to work with privileged LXCs, or did it just not work all around?

@glassman81

Well, it seems in its current state, longhorn won't work with LXCs:

longhorn/longhorn#2585
longhorn/longhorn#3866

This is not to say that it can't, just that someone hasn't figured it out yet. Maybe if someone like @timothystewart6 is interested (hopefully), he can have a go at it. His pretty awesome work led me here in the first place, so I can only hope.

@ky-bd commented Jul 14, 2023

Well, it seems in its current state, longhorn won't work with LXCs:

longhorn/longhorn#2585 longhorn/longhorn#3866

This is not to say that it can't, just that someone hasn't figured it out yet. Maybe if someone like @timothystewart6 is interested (hopefully), he can have a go at it. His pretty awesome work led me here in the first place, so I can only hope.

I read those issues before, and that's part of the reason why I gave up before trying privileged LXC.
