# FAQ / Nice to know

## Issues with BTRFS

- As [@jaredallard](https://github.com/jaredallard) [pointed out](https://github.com/rancher/k3d/pull/48), people running `k3d` on a system with **btrfs** may need to mount `/dev/mapper` into the nodes for the setup to work.
  - This will do: `#!bash k3d cluster create CLUSTER_NAME -v /dev/mapper:/dev/mapper`

## Issues with ZFS

- k3s currently has [no support for ZFS](https://github.com/rancher/k3s/issues/66) and thus, creating multi-server setups (e.g. `#!bash k3d cluster create multiserver --servers 3`) fails, because the initializing server node (server flag `--cluster-init`) errors out with the following log:

  ```bash
  starting kubernetes: preparing server: start cluster and https: raft_init(): io: create I/O capabilities probe file: posix_allocate: operation not supported on socket
  ```

- This issue can be worked around by providing docker with a different filesystem (that's also better for docker-in-docker stuff); see the sketch below.
- A possible solution can be found here: [https://github.com/rancher/k3s/issues/1688#issuecomment-619570374](https://github.com/rancher/k3s/issues/1688#issuecomment-619570374)
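A minimal sketch of that kind of workaround, assuming a ZFS pool named `rpool` and Docker's default data root `/var/lib/docker` (both are placeholders, adjust them to your setup): back Docker's data root with an ext4-formatted zvol, so docker itself no longer runs on ZFS.

```bash
# Assumption: a ZFS pool named "rpool" exists and docker uses the
# default data root /var/lib/docker - adjust both to your setup.
sudo zfs create -V 50G rpool/docker            # create a 50G zvol
sudo mkfs.ext4 /dev/zvol/rpool/docker          # format it with ext4 (non-ZFS)
sudo systemctl stop docker
sudo mount /dev/zvol/rpool/docker /var/lib/docker
sudo systemctl start docker                    # docker now runs on ext4
# note: the new data root starts out empty, existing images must be re-pulled
```

Add the mount to `/etc/fstab` to make it persistent across reboots.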
## Pods evicted due to lack of disk space

- Pods go to evicted state after doing X
  - Related issues: [#133 - Pods evicted due to `NodeHasDiskPressure`](https://github.com/rancher/k3d/issues/133) (collection of #119 and #130)
- Background: somehow docker runs out of space for the k3d node containers, which triggers a hard eviction in the kubelet
- Possible [fix/workaround by @zer0def](https://github.com/rancher/k3d/issues/133#issuecomment-549065666):
  - use a docker storage driver which cleans up properly (e.g. overlay2)
  - clean up or expand the docker root filesystem
  - change the kubelet's eviction thresholds upon cluster creation:

    ```bash
    k3d cluster create \
      --k3s-agent-arg '--kubelet-arg=eviction-hard=imagefs.available<1%,nodefs.available<1%' \
      --k3s-agent-arg '--kubelet-arg=eviction-minimum-reclaim=imagefs.available=1%,nodefs.available=1%'
    ```

## Restarting a multi-server cluster or the initializing server node fails

- What you do: You create a cluster with more than one server node and later you either stop `server-0` or stop/start the whole cluster
- What fails: After the restart, you cannot connect to the cluster anymore and `kubectl` will give you a lot of errors
- What causes this issue: it's a [known issue with dqlite in `k3s`](https://github.com/rancher/k3s/issues/1391) which doesn't allow the initializing server node to go down
- What's the solution: Hopefully, this will be solved by the planned [replacement of dqlite with embedded etcd in k3s](https://github.com/rancher/k3s/pull/1770)
- Related issues: [#262](https://github.com/rancher/k3d/issues/262)

## Passing additional arguments/flags to k3s (and on to e.g. the kube-apiserver)

- The problem: passing a feature flag to the Kubernetes API Server running inside k3s
- Example: you want to enable the EphemeralContainers feature flag in Kubernetes
- Solution: `#!bash k3d cluster create --k3s-server-arg '--kube-apiserver-arg=feature-gates=EphemeralContainers=true'`
  - **Note**: Be aware of where the flags require dashes (`--`) and where not:
    - the k3s flag (`--kube-apiserver-arg`) has the dashes
    - the kube-apiserver flag `feature-gates` doesn't have them (k3s adds them internally)
- Second example:

  ```bash
  k3d cluster create k3d-one \
    --k3s-server-arg --cluster-cidr="10.118.0.0/17" \
    --k3s-server-arg --service-cidr="10.118.128.0/17" \
    --k3s-server-arg --disable=servicelb \
    --k3s-server-arg --disable=traefik \
    --verbose
  ```

- **Note**: There are many ways to use the `"` and `'` quotes; just be aware that sometimes shells also try to interpret/interpolate parts of the commands

## How to access services (like a database) running on my Docker Host Machine

- As of version v3.1.0, we're injecting the `host.k3d.internal` entry into the k3d containers (k3s nodes) and into the CoreDNS ConfigMap, enabling you to access your host system from inside k3d nodes and pods by referring to it as `host.k3d.internal`
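As a quick check that the entry resolves from inside the cluster (the pod name and image below are arbitrary example choices):

```bash
# Resolve host.k3d.internal from a throwaway pod
kubectl run -it --rm dns-test --image=busybox --restart=Never -- \
  nslookup host.k3d.internal

# A service on the Docker host, e.g. a database listening on port 5432,
# is then reachable from pods at host.k3d.internal:5432
```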
## Running behind a corporate proxy

Running k3d behind a corporate proxy can lead to some issues with k3d that have already been reported in more than one issue.
Some can be fixed by passing the `HTTP_PROXY` environment variables to k3d, some have to be fixed in docker's `daemon.json` file and some are as easy as adding a volume mount.

## Pods fail to start: `x509: certificate signed by unknown authority`

- Example Error Message:

  ```bash
  Failed to create pod sandbox: rpc error: code = Unknown desc = failed to get sandbox image "docker.io/rancher/pause:3.1": failed to pull image "docker.io/rancher/pause:3.1": failed to pull and unpack image "docker.io/rancher/pause:3.1": failed to resolve reference "docker.io/rancher/pause:3.1": failed to do request: Head https://registry-1.docker.io/v2/rancher/pause/manifests/3.1: x509: certificate signed by unknown authority
  ```

- Problem: inside the container, the certificate of the corporate proxy cannot be validated
- Possible Solution: Mounting the CA Certificate from your host into the node containers at start time via `k3d cluster create --volume /path/to/your/certs.crt:/etc/ssl/certs/yourcert.crt`
- Issue: [rancher/k3d#535](https://github.com/rancher/k3d/discussions/535#discussioncomment-474982)

## Spurious PID entries in `/proc` after deleting `k3d` cluster with shared mounts

- When you perform cluster create and delete operations multiple times with the **same cluster name** and **shared volume mounts**, it was observed that `grep k3d /proc/*/mountinfo` shows many spurious entries
- Problem: due to the above, you'll at times see `no space left on device: unknown` when a pod is scheduled to one of the nodes
- If you observe anything of the above sort, you can check for inaccessible file systems and unmount them with the command below (note: please remove `xargs umount -l` and check the diff output first):
  - `diff <(df -ha | grep pods | awk '{print $NF}') <(df -h | grep pods | awk '{print $NF}') | awk '{print $2}' | xargs umount -l`
- As per the conversation on [rancher/k3d#594](https://github.com/rancher/k3d/issues/594#issuecomment-837900646), this issue wasn't reported/known earlier, so there's a good chance it's not universal.

## [SOLVED] Nodes fail to start or get stuck in `NotReady` state with log `nf_conntrack_max: permission denied`

### Problem

- When: This happens when creating a new cluster with k3d on a Linux system with a kernel version >= 5.12.2 (or others that received the same change, e.g. >= 5.11.19)
  - the node(s) stop or get stuck with a log line like this: `F0516 05:05:31.782902 7 server.go:495] open /proc/sys/net/netfilter/nf_conntrack_max: permission denied`
- Why: The issue was introduced by a change in the Linux kernel ([Changelog 5.12.2](https://cdn.kernel.org/pub/linux/kernel/v5.x/ChangeLog-5.12.2): [Commit](https://github.com/torvalds/linux/commit/671c54ea8c7ff47bd88444f3fffb65bf9799ce43)), which changed the netfilter conntrack behavior in a way that `kube-proxy` is not able to set the `nf_conntrack_max` value anymore

### Workaround

- Workaround: as a workaround, we can tell `kube-proxy` to not even try to set this value:

  ```bash
  k3d cluster create \
    --k3s-server-arg "--kube-proxy-arg=conntrack-max-per-core=0" \
    --k3s-agent-arg "--kube-proxy-arg=conntrack-max-per-core=0" \
    --image rancher/k3s:v1.20.6-k3s1
  ```

### Fix

- **Note**: k3d v4.4.5 already uses rancher/k3s:v1.21.1-k3s1 as the new default k3s image, so no workaround is needed there! The underlying issue was fixed "upstream" in k3s itself in [rancher/k3s#3337](https://github.com/k3s-io/k3s/pull/3337) and backported to k3s versions as low as v1.18.
- **The fix was released and backported in k3s, so you don't need to use the workaround when using one of the following k3s versions (or later ones)**
  - v1.18.19-k3s1 ([rancher/k3s#3344](https://github.com/k3s-io/k3s/pull/3344))
  - v1.19.11-k3s1 ([rancher/k3s#3343](https://github.com/k3s-io/k3s/pull/3343))
  - v1.20.7-k3s1 ([rancher/k3s#3342](https://github.com/k3s-io/k3s/pull/3342))
  - v1.21.1-k3s1 ([rancher/k3s#3341](https://github.com/k3s-io/k3s/pull/3341))
- Issue Reference: [rancher/k3d#607](https://github.com/rancher/k3d/issues/607)

## DockerHub Pull Rate Limit

### Problem

You're deploying something to the cluster using an image from DockerHub and the image fails to be pulled, with a `429` response code and a message saying `You have reached your pull rate limit. You may increase the limit by authenticating and upgrading`.

### Cause

This is caused by DockerHub's pull rate limit, which limits pulls by unauthenticated/anonymous users to 100 per 6 hours and pulls by authenticated (non-paying) users to 200 per 6 hours (as of the time of writing).

### Solution

a) use images from a private registry, e.g. one configured as a pull-through cache for DockerHub (see the sketch at the end of this section)
b) use a different public registry without such limitations, if the same image is stored there
c) authenticate containerd inside k3s/k3d to use your DockerHub user

#### (c) Authenticate Containerd against DockerHub

1. Create a registry configuration file for containerd:

   ```yaml
   # saved as e.g. $HOME/registries.yaml
   configs:
     "docker.io":
       auth:
         username: "$USERNAME"
         password: "$PASSWORD"
   ```

2. Create a k3d cluster using that config:

   ```bash
   k3d cluster create --registry-config $HOME/registries.yaml
   ```

3. Profit. That's it. In the test for this, we pulled the same image 120 times in a row (confirmed that the pull count went up) without being rate limited (as a normal, non-paying user).
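For option (a), one possible setup (not k3d-specific; the container name, host port and the plain `registry:2` image are just example choices) is a local pull-through cache:

```bash
# Run a local registry acting as a pull-through cache for DockerHub
# (container name and host port are arbitrary examples)
docker run -d --name registry-mirror -p 5000:5000 \
  -e REGISTRY_PROXY_REMOTEURL=https://registry-1.docker.io \
  registry:2
```

The k3s nodes would then need a `mirrors:` entry for `docker.io` in the registry configuration file pointing at this cache, passed to the cluster via `--registry-config` as shown in (c).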