Kubernetes Resource Setting in a local K3D Cluster

Klaus Hofrichter
Jan 1, 2022 · 10 min read

Resource management for CPU and memory in Kubernetes seems to be dark magic — everyone wants it, there is a lot of uncertainty, and no simple recipe. This article provides tools to get a handle on the topic.


A series of previous articles describes a growing K3D Kubernetes environment featuring a simple NodeJS application with a lot of instrumentation around it, including Prometheus/Grafana/Alertmanager. This is a local setup for experimentation and education; the current article discusses how to set CPU and memory resources for all workloads running in the cluster.

Just to recap, the setup built through the scripts in the repository includes a local K3D cluster, a simple NodeJS application, and the monitoring stack with Prometheus, Grafana, and Alertmanager.

There are a few additions covered in this article:

  • Goldilocks to gather some suggestions for resource settings from the live system, plus some supporting scripts,
  • a bash script to extract current resource settings of all components, and
  • a bash script to write back these settings after manipulation.

Other changes are related to version updates for Helm charts and other releases, and minor bug fixes. All of the workloads now have suggested resource settings, so you can have 100% resource coverage.

Update February 13, 2022: There is a newer version of the software here with several updates.

What you need to bring, how to install everything

If you are using Windows, you can use the Windows Subsystem for Linux (WSL) for a clean installation. A recent Windows 10 or 11 system will do fine, as will an Ubuntu-like Linux machine, ideally with 8 GB or more of RAM. A basic understanding of bash, Docker, Kubernetes, Grafana, and NodeJS is expected. Optionally, you will need a Slack account for receiving alert messages and a Grafana Cloud account for the full setup.

All source code is available at Github via https://github.com/klaushofrichter/resources. The source code includes bash scripts and NodeJS code. As always, you should inspect code like this before you execute it on your machine, to make sure that no bad things happen.

The Windows Subsystem for Linux setup process is described in a previous article. You should do the same steps for this setup, except using the newer repository. There are two new configuration options in config.sh:

export GOLDILOCKS_ENABLE="yes"  # or "no"
...
export RESOURCEPATCH="yes" # or "no"

If GOLDILOCKS_ENABLE is set to yes, Goldilocks is deployed as part of the start.sh script, but it is optional. You can deploy Goldilocks at any time later with ./goldilocks-deploy.sh.

RESOURCEPATCH="yes" causes resources to be patched via kubectl patch wherever there is no mechanism for setting them in Helm values files or manifests. With yes, all components end up with resource requests and limits set; otherwise, default settings are used, or none at all.
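For illustration, such a patch could look like the one below; the namespace, deployment, and container names (myapp) are placeholders and not the names actually used by the scripts in this repository:

$ kubectl -n myapp patch deployment myapp --patch \
  '{"spec":{"template":{"spec":{"containers":[{"name":"myapp","resources":{"requests":{"cpu":"50m","memory":"64Mi"},"limits":{"cpu":"200m","memory":"128Mi"}}}]}}}}'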

TL;DR

You may need to change config.sh to enable or disable what you are looking for, but you can also leave it as it is in the repository to begin with. ./setup.sh installs the necessary tools on Linux or WSL, but check out the article for details. Then call ./start.sh and let it build the cluster, which takes about 10 minutes. After that, here are the short steps to manage resources across the cluster:

  • ./resources-get.sh: queries all workloads (deployments, stateful sets, and daemon sets) and collects various data, including the resource settings, both limits and requests. By default, a file called resources.csv is generated; you can give another filename as an argument.
  • You can view the CSV file and manipulate the resource columns as you see fit (more about that later in the article). Some statistics are generated by another script called ./resources-check.sh, but this is optional.
  • ./resources-apply.sh: takes the CSV file and applies the resource settings using kubectl patch. By default, resources.csv is used as the data source, but a different filename can be given as an argument, e.g. if you have an edited version of the CSV. Running this script restarts the containers whose values have changed.

That’s what this is all about: a central place (resources.csv) for all resource settings.

Resources.csv

You can open the resources.csv file in a spreadsheet for viewing and editing; here is a slightly formatted view in a spreadsheet program:

resources.csv

The CSV is generated within resources-get.sh by a small golang template, resources.go (learn more about golang templates here). It’s perhaps not widely known, but you can apply golang templates with kubectl in this way:

$ kubectl get deployment,daemonset,statefulset --all-namespaces -o go-template-file=./resources.go
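To get a feel for the mechanism, here is a much smaller inline template; this simplified variant only lists the namespace, workload name, and container names, while the real resources.go also emits the resource columns and the container index:

$ kubectl get deployment --all-namespaces -o go-template='{{range .items}}{{.metadata.namespace}},{{.metadata.name}}{{range .spec.template.spec.containers}},{{.name}}{{end}}{{"\n"}}{{end}}'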

This go-template-file call goes through all deployments, daemon sets, and stateful sets to produce the CSV output, with filtering applied by resources.go. Here are the columns:

  • kind: the workload type,
  • namespace: the namespace that the workload object is in,
  • name: the name of the workload,
  • container-name: the name of the container within the workload; there can be multiple containers. Note that the query only captures active containers, so init containers that are no longer active do not show up. This also means that transient and scheduled containers may escape this query.
  • container-index: this is the index of the container spec in the workload object. This index makes it easier to apply values back to the right container later.
  • replicas: this is the number of replicas active at the time of the query. The value may help with calculations later, but this is a difficult topic due to its dynamic nature. Note also that this does not really work for daemon sets, as these “scale” with the number of nodes, which is not considered here at all.
  • cpu-request, memory-request, cpu-limit, memory-limit: these four columns can be edited. Note that the values are taken directly from the specifications, i.e. they may contain units. Check out the Kubernetes documentation for details, and be mindful of the difference between lowercase m (milli, 1/1,000) and uppercase M (mega, 1,000,000).
  • location-hint: this is a reference to where this workload was defined. This is hardcoded in resources-location.json, and might save you some time when you want to change the resource allocation, for example in a values file. The location-hint tells you which file to edit so that your custom settings are used when you call start.sh again.
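To make the format concrete, a row for a hypothetical deployment could look like the following, with the columns in the order described above; all names and values here are made up for illustration:

deployment,myapp,myapp,myapp,0,1,50m,64Mi,200m,128Mi,myapp.yaml.template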

You can run a small helper script resources-check.sh to see some statistics about the CSV file, but the output there may not fully reflect reality, e.g. because it does not take the number of nodes into account.

Note that the resources.csv file does not need to include all objects when using resources-apply.sh: you can delete the lines for workloads that you do not want to touch, e.g. everything in the kube-system namespace.
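A quick, if somewhat crude, way to do that on the command line is to filter the file before applying it; this simply drops every line that contains the string kube-system, which is good enough as long as no other column happens to contain that string:

$ grep -v "kube-system" resources.csv > resources-edited.csv
$ ./resources-apply.sh resources-edited.csv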

Measuring Resource Consumption

There are many ways to measure resource consumption and to find the values that are best to use. Our environment here is pretty limited as it is a local installation on a single host, but there are still things to discover.

Before doing measurements, it may be a good idea to generate some traffic, so that all system components are doing something, logs are created, and the NodeJS application does some work. A simple way to do this is to use the app-traffic.sh script:

$ ./app-traffic.sh 0 0 1

This call generates unlimited API calls with a delay of up to one second between the calls (details about this script are in the middle of this article and in the header of the script itself). This is not a lot of action but keeps the system a bit busy. While this is running, you want to look at Grafana: Grafana offers many dashboards that show resource consumption, including some that are pre-installed in the current cluster. Visit this local Grafana dashboard to get started:

This dashboard shows the overall situation in the “headlines” section: current utilization, requests, and limits for both memory and CPU. There is a nice graph, and you can drill down in the table below the chart to see details of the individual namespaces, then the pods, and inside the pods, the containers.

Looking at the detail of a single pod is where you can make a judgment about the proper settings:

In the example above, we see a spike in CPU usage of a pod in the Compute Resources / Pod dashboard because the traffic generation was changed for a short time by using this call:

$ ./app-traffic.sh 10000 0 0 

This generates 10,000 API calls as fast as possible, which takes about two minutes on my machine. Given that “load”, we broke through the red requests/cpu line, but did not reach the limits/cpu line. If you have SLACK_ENABLE configured, you probably got some alert messages, or you can see alerts at http://localhost:8080/alert/. If you anticipate that you would not get more than 10,000 calls within 2 minutes, the requests and limits settings in this case seem like a good fit.
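Besides Grafana, a quick command-line view of live consumption is available through the metrics API; K3s bundles metrics-server, so this should work in this cluster without further setup:

$ kubectl top pod --all-namespaces --containers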

The alerts related to the CPU load are configured in am-values.yaml.template, see this article for details. The alerts there are pre-configured to use the percentage of actual resource use relative to the request and limit, so you can create alerts that tell you when certain constraints are hit, such as going beyond the resource requests, or worse, coming close to the limits.
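The underlying idea is a ratio of actual usage to the requested or limited amount. For reference, a hand-rolled version of such a ratio can be sent to the Prometheus API directly; the namespaces, service name, and port used below (monitoring, prometheus-operated, myapp, 9090) are assumptions that may differ in this setup:

$ kubectl -n monitoring port-forward svc/prometheus-operated 9090 &
$ curl -s 'http://localhost:9090/api/v1/query' --data-urlencode 'query=sum(rate(container_cpu_usage_seconds_total{namespace="myapp"}[5m])) / sum(kube_pod_container_resource_requests{namespace="myapp",resource="cpu"})'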

Optimizing Resource Assignments

With Grafana alone you can optimize resources by looking at every single component listed in resources.csv and coming up with the best values for your needs. The resources.csv file also shows where to place the values that you would like to use, either in Helm values or in after-deploy patches. You can make changes there, or change values in resources.csv and apply them with ./resources-apply.sh. Note that applying changes with resources-apply.sh will cause the affected workloads to restart.

Also of note, you can use kubectl set resources, as described here, instead of kubectl patch.
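A sketch of that variant, again with placeholder names (myapp) rather than the actual workload and container names from this repository:

$ kubectl -n myapp set resources deployment myapp -c myapp --requests=cpu=50m,memory=64Mi --limits=cpu=200m,memory=128Mi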

The resource values that are coded in this repository come from a manual trial-and-error approach using Grafana and simulated loads as shown above. Here is an article that discusses this approach (using a different service, though).

The question remains: what are good values to use? There is a tool called Goldilocks that provides recommendations. If you do not have GOLDILOCKS_ENABLE="yes" in config.sh, you can deploy the tool now with ./goldilocks-deploy.sh, and after a few minutes go to http://localhost:31082 to see the Goldilocks dashboard.
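Goldilocks only inspects namespaces that carry its enabling label. The deploy script in this repository presumably takes care of that, but for reference, enabling a namespace by hand (myapp is a placeholder) looks roughly like this:

$ kubectl label namespace myapp goldilocks.fairwinds.com/enabled=true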

The Goldilocks dashboard shows all namespaces, and you can navigate down to a specific deployment and see what the recommendations are. The recommendations come from another tool that is used by Goldilocks, the Vertical Pod Autoscaler.

Goldilocks dashboard showing recommendations

The recommendation is a “starting point”. In practice, for our purpose, the “burstable QoS” recommendation seems too high and the “guaranteed QoS” too low, but this certainly depends on actual load patterns.

If you want to try the Goldilocks recommendations, please use ./goldilocks-recommendation.sh. This generates two CSV files, one for each of the cases mentioned above, based on the Goldilocks “summary” function. The CSV files can be used like this to apply the values:

$ ./resources-apply.sh goldilocks-bqos.csv

or

$ ./resources-apply.sh goldilocks-gqos.csv

Calling either of the above will take a while, as each workload needs to restart when its resource specification changes, and the recommendations are very likely different from the current settings.

Consequences of Setting Resource Assignments

Resource settings do not just provide an opportunity for graphs and alerts; there are real consequences to setting them too high or too low. Instead of re-explaining, here is a collection of some documentation about this topic:

Where to go from here

As always, this setup invites you to do your own experiments. The next steps could include some automatic optimization of resource assignments by measuring what is going on in the cluster and generating suitable resources.csv definitions, similar to the Goldilocks approach. The installation here installs the Vertical Pod Autoscaler as a sub-chart of the Goldilocks Helm chart. A separate installation would offer more options to customize the VPA, including taking advantage of Prometheus data for resource estimations; this is something for future consideration.

There are also many more facilities to manage resources within Kubernetes itself that are not discussed in this article at all, including ResourceQuotas for namespaces and LimitRanges.
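As a small illustration of the first of these, a namespace-wide cap can be created with a single command; the namespace name and the numbers below are placeholders only:

$ kubectl -n myapp create quota myapp-quota --hard=requests.cpu=1,requests.memory=1Gi,limits.cpu=2,limits.memory=2Gi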

Finally, an additional topic is doing something proactive with the measurements, such as scaling. There are tools that support this type of automation as part of a Continuous Optimization pipeline.
