=================
== CJ Virtucio ==
=================

Rootless crictl

k3s containerd

Running commands like crictl on rootless k3s can be a little tricky. The file paths are different, because nothing lives in the locations that would normally be associated with a root process. I ran into this discussion, which shed light on how it works. I’m not as familiar with Linux namespaces and how container processes operate on them, so I don’t think I’d be able to do the concepts justice. What I can do, however, is show how you can automate calling crictl in a rootless k3s setup.

Let’s start our script with a shebang:

#!/usr/bin/env bash

Instead of hard-coding /bin/bash, we use the env command and pass bash as its first argument. env looks bash up on the caller’s PATH, so the script runs with whichever bash comes first there, instead of an executable at a specific path. This makes our script more portable, since bash is not installed at the same location on every system, and anyone can decide what version of bash to use by adjusting their PATH before running the script.
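If you want to see which bash the shebang would pick on your machine, here is a minimal sketch. It assumes bash is on your PATH; the variable name resolved_bash is just for illustration.

```bash
#!/usr/bin/env bash

# `type -P` searches PATH for an executable, which is the same kind of
# lookup `env bash` performs when the script is launched.
resolved_bash="$(type -P bash)"

echo "env would run: ${resolved_bash}"
echo "running version: ${BASH_VERSION}"
```

Reordering PATH changes the first line of output, which is exactly the flexibility the shebang gives us.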

We can then start working on the script proper. This is subjective, but I personally prefer hiding the logic within a main function:

#!/usr/bin/env bash

function main {
  : # placeholder; bash rejects an empty function body
}

main "$@"

Here we pass all arguments to the main function. We’re not using them yet; we’ll work on that later.
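The quotes around "$@" matter. A small sketch of the difference, using hypothetical helper functions that only count what they receive:

```bash
#!/usr/bin/env bash

# Prints how many arguments it was given.
function count_args {
  echo "$#"
}

# Forwards arguments with quotes: each argument stays intact.
function main_quoted {
  count_args "$@"
}

# Forwards arguments without quotes: arguments are re-split on whitespace.
function main_unquoted {
  # shellcheck disable=SC2068
  count_args $@
}

main_quoted "a b" c     # prints 2
main_unquoted "a b" c   # prints 3
```

With the quoted form, something like a crictl argument containing spaces reaches the command as one argument, which is what we want when we forward arguments later.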

We want to shift left, so we’ll add set -eo pipefail:

#!/usr/bin/env bash

function main {
  set -eo pipefail
}

main "$@"

This calls the set builtin, which changes shell options for the running process. The -e flag makes the script exit as soon as a command fails, i.e. returns a non-zero status. The -o pipefail option changes how a pipeline’s exit status is determined. By default a pipeline only reports the status of its last command; with pipefail, the pipeline fails if any command in it fails. For instance, given this pipeline:

stat /tmp/does_not_exist | grep foo

the stat will fail. The grep still runs, on empty input, but with pipefail the pipeline counts as failed because stat failed, no matter how the last command exits. And because -e is set, the script will exit.
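To see what pipefail actually changes, here is a small sketch you can run on its own. It uses false as a stand-in for a failing command and cat as a last command that always succeeds; set -e is left out on purpose so the script survives long enough to print both results.

```bash
#!/usr/bin/env bash

# Without pipefail the pipeline reports the status of the last command (cat).
set +o pipefail
false | cat
status_without=$?

# With pipefail the failure of `false` is reported for the whole pipeline.
set -o pipefail
false | cat
status_with=$?
set +o pipefail

echo "without pipefail: ${status_without}"   # prints 0
echo "with pipefail: ${status_with}"         # prints 1
```

Note that both commands in the pipeline run in either case; only the reported exit status differs.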

Next we grab the PID of the k3s process:

#!/usr/bin/env bash

function main {
  set -eo pipefail

  local main_pid
  main_pid="$(systemctl show --property MainPID --user k3s-rootless | cut -d '=' -f2 | tr -d '\n')"
}

main "$@"

Let’s break down this pipeline into its components.

If you follow the instructions for setting up rootless k3s, you should have k3s running as a user-level systemd unit. systemctl show will print the field passed to the --property flag. --user tells systemctl to look for the unit among the user-level units.

We then pass the STDOUT of that command to cut. cut -d '=' -f2 will split each line of the output on the delimiter passed to -d, and print out the field specified by -f. Because the output of our systemctl show command is a key/value pair delimited by =, we ask for the value, which is the second field. We then remove the trailing newline with tr -d '\n'. Command substitution already strips trailing newlines, so the tr is a harmless extra safeguard.
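Here is a sketch of that extraction on a sample line, so you can try it without a running k3s unit. The value 4242 is made up; on a real system the line would come from the systemctl show command above. The second variant uses parameter expansion, which is an alternative and not what the script uses.

```bash
#!/usr/bin/env bash

# Made-up sample of what `systemctl show --property MainPID ...` prints.
sample_output='MainPID=4242'

# Same extraction as in the script: split on '=' and keep the second field.
pid_from_cut="$(printf '%s\n' "${sample_output}" | cut -d '=' -f2 | tr -d '\n')"

# Alternative: strip everything up to and including the first '='.
pid_from_expansion="${sample_output#*=}"

echo "cut: ${pid_from_cut}"
echo "expansion: ${pid_from_expansion}"
```

Both print the same value here; the parameter expansion just avoids spawning extra processes.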

The next thing we need to do is find the PID of one of the child processes run by k3s. The GitHub discussion referenced above talks about this in more detail; essentially, k3s runs a child process that has its own namespace. The socket we want exists there, and not in the namespace of the parent process running k3s. /proc is a directory in Linux containing a subdirectory for every running process, named after its PID. Each one has a file called status, which holds some metadata about the process.

We will loop over every status file in every PID’s directory within the /proc directory. We will check each status file’s PPid, which specifies the PID of the parent process. If this PID matches the PID of our k3s process, we check if the Name field in the status file is set to exe. If that matches, we’ve found our child PID.

  local exe_pid
  for f in /proc/*/status; do
    local parent_pid
    parent_pid="$(grep PPid "${f}" | awk '{print $2}' | tr -d '\n')"
    if [[ "${main_pid}" -ne "${parent_pid}" ]]; then
      continue
    fi

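I’m sure by now you’re seeing a recurring pattern in how variables are declared. We use the local builtin to ensure that the variable is scoped to the function. Declaring first and assigning on a separate line also means a failing command substitution is not hidden behind local’s own exit status.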
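A short sketch of both effects, scoping and exit status, with throwaway function names that are not part of the script:

```bash
#!/usr/bin/env bash

function scope_demo {
  local scoped="inside"   # exists only while scope_demo runs
  leaked="inside"         # no local, so this overwrites the global
}

scoped="outside"
leaked="outside"
scope_demo
echo "scoped=${scoped} leaked=${leaked}"   # prints scoped=outside leaked=inside

# Declaring and assigning in one step reports the status of `local` itself.
function combined {
  local value="$(false)"
  echo "$?"
}

# Declaring first and assigning after reports the status of the command.
function split {
  local value
  value="$(false)"
  echo "$?"
}

combined   # prints 0
split      # prints 1
```

With set -e in effect, only the split form would stop the script when the command substitution fails.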

We declare the exe_pid variable, which will hold the PID of our desired child process, exe. We then iterate through the status files with a glob, /proc/*/status.

We grab the PPid value from the status file. First we run grep to print out the line with that field; then we pass it through awk to print the value using the expression {print $2}, i.e. the second column in the input to awk. Again, we remove the trailing newline with tr.

We then validate the value against the parent PID, continuing if they don’t match.
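If you’d like to experiment with the parsing without touching /proc, here is a sketch that reads fields from a sample file laid out like a status file. The helper status_field and the PID values are hypothetical. It uses a single awk call instead of grep plus awk, which is a variation on the script’s approach and not what the script itself does.

```bash
#!/usr/bin/env bash

# Hypothetical helper: print the value of one field from a status-style file.
function status_field {
  local field="$1"
  local file="$2"
  # Compare the whole first column, so "Pid" cannot match "PPid" or "TracerPid".
  awk -v key="${field}:" '$1 == key { print $2; exit }' "${file}"
}

# Build a small sample file with made-up values.
sample="$(mktemp)"
printf 'Name:\texe\nState:\tS (sleeping)\nPid:\t4300\nPPid:\t4242\nTracerPid:\t0\n' > "${sample}"

sample_name="$(status_field Name "${sample}")"
sample_pid="$(status_field Pid "${sample}")"
sample_ppid="$(status_field PPid "${sample}")"
rm -f "${sample}"

echo "name=${sample_name} pid=${sample_pid} ppid=${sample_ppid}"
```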

The code is fairly similar for grabbing the name of the process:

    local child_process_name
    child_process_name="$(grep Name: "${f}" | awk '{print $2}' | tr -d '\n')"
    if [[ "${child_process_name}" != "exe" ]]; then
      continue
    fi

And then we set the exe_pid variable if the child process’s name is exe:

    exe_pid="$(grep -E '^Pid:' "${f}" | awk '{print $2}' | tr -d '\n')"
    break
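Putting the pieces of the loop together, here is one way the search could be wrapped into a function. This is a sketch, not the script itself: the name find_child_pid and the optional third argument (a directory to scan instead of /proc, handy for trying it out on fake data) are my own additions. It also skips status files that disappear mid-scan, since processes can exit while we iterate.

```bash
#!/usr/bin/env bash

# Hypothetical helper: print the PID of the first process whose parent is
# parent_pid and whose name is exactly child_name. Returns 1 if none is found.
function find_child_pid {
  local parent_pid="$1"
  local child_name="$2"
  local proc_root="${3:-/proc}"

  local f
  for f in "${proc_root}"/[0-9]*/status; do
    # A process may have exited since the glob was expanded.
    [[ -r "${f}" ]] || continue

    local found
    found="$(awk -v ppid="${parent_pid}" -v name="${child_name}" '
      $1 == "Name:" { n = $2 }
      $1 == "Pid:"  { p = $2 }
      $1 == "PPid:" { pp = $2 }
      END { if (pp == ppid && n == name) print p }
    ' "${f}" 2>/dev/null || true)"

    if [[ -n "${found}" ]]; then
      printf '%s\n' "${found}"
      return 0
    fi
  done

  return 1
}

# Usage inside main would look something like:
#   exe_pid="$(find_child_pid "${main_pid}" exe)"
```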

The same GitHub discussion also talks about where the socket file is located. Specifically, it’s at /proc/<exe PID>/root/run/k3s/containerd/containerd.sock. So we just pass a unix URI with that path to our crictl command, along with the arguments that were passed to the main function:

  k3s crictl --runtime-endpoint "unix:///proc/${exe_pid}/root/run/k3s/containerd/containerd.sock" "$@"
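If the child PID was not found, crictl will fail with an error about the endpoint that may not be obvious. A sketch of a guard you could add before the call follows. The two helper functions and the PID 4300 are hypothetical; only the socket path comes from the discussion.

```bash
#!/usr/bin/env bash

# Hypothetical helper: build the endpoint URI for a given child PID.
function containerd_endpoint {
  local pid="$1"
  printf 'unix:///proc/%s/root/run/k3s/containerd/containerd.sock' "${pid}"
}

# Hypothetical guard: succeeds only if the endpoint points at a socket file.
function endpoint_is_socket {
  local endpoint="$1"
  local path="${endpoint#unix://}"   # strip the scheme, keep the absolute path
  [[ -S "${path}" ]]
}

endpoint="$(containerd_endpoint 4300)"
echo "${endpoint}"

if ! endpoint_is_socket "${endpoint}"; then
  echo "no containerd socket at ${endpoint#unix://}" >&2
fi
```

In the real script you would pass exe_pid instead of the made-up PID, and exit with a non-zero status when the guard fails.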

And that concludes our rootless crictl script.

In summary, our rootless crictl script does quite a few things. We ensure the script runs with the bash found on the caller’s PATH, thanks to a shebang that uses /usr/bin/env. We shift left by setting the -e and -o pipefail flags. We grab the PID of the k3s process, grab the PID of the child exe process, then use that PID for the path to our desired containerd socket. Finally, we pass the path to the socket to our k3s crictl call, as well as all the arguments passed to main, which receives all the arguments passed to our script:

[k8s@cjvautomation ~]$ ~/bin/rootless_crictl.sh info | grep '"lastCNILoadStatus"'
  "lastCNILoadStatus": "OK",
[k8s@cjvautomation ~]$

The complete script can be found here.