Kubernetes Metrics-Server & HPA

Apr 13, 2024

Overview

The Metrics Server is an integral component of Kubernetes’ autoscaling infrastructure, efficiently gathering container resource metrics from Kubelets.

It exposes these metrics through the Metrics API in the Kubernetes apiserver, where they are consumed by the Horizontal Pod Autoscaler (HPA) and the Vertical Pod Autoscaler (VPA). The Metrics API can also be queried with kubectl top, which makes it easier to debug autoscaling configurations.

Caution

The Metrics Server is meant for autoscaling only. Do not use it to forward metrics to monitoring solutions. If you need to feed a monitoring solution, collect metrics directly from the Kubelet’s /metrics/resource endpoint instead.

Advantages

The Metrics Server presents several advantages:

It offers a unified deployment suitable for the majority of clusters.
It enables rapid autoscaling by gathering metrics at 15-second intervals.
It is resource efficient, using about 1 millicore of CPU and 2 MB of memory for each node in the cluster.
It boasts scalable support, accommodating clusters of up to 5,000 nodes.

Use Cases

Metrics Server proves valuable in the following scenarios:

Horizontal Pod Autoscaling based on CPU and memory metrics.
Vertical Pod Autoscaling, that is, automatically adjusting or recommending resource allocations for containers.

Avoid employing Metrics Server under the following circumstances:

Outside Kubernetes environments.
When precise resource usage metrics are essential.
For horizontal autoscaling based on resources other than CPU and memory.

Requirements

Your cluster’s network configuration must allow the following communication:

Allow the control plane node to access the Metrics Server’s pod IP and port 10250 (or node IP and a custom port if hostNetwork is enabled).
Enable Metrics Server to communicate with the Kubelet on all nodes. Metrics Server requires access to each node’s address and Kubelet port. These addresses and ports are configured in the Kubelet and exposed as part of the Node object. The addresses are listed in .status.addresses, and the Kubelet port is specified in .status.daemonEndpoints.kubeletEndpoint.port (default port is 10250). Metrics Server picks the first node address whose type matches the priority list given by the --kubelet-preferred-address-types command-line flag.
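The address selection described above can be sketched in a few lines of shell. This is only an illustration of priority-based selection, not Metrics Server's actual code: the function name select_node_address and the TYPE=VALUE input format are made up for this example, and the sample addresses are taken from the minikube log shown later in this post.

```shell
#!/bin/sh
# select_node_address PREFERRED_TYPES ADDRESS...
#   PREFERRED_TYPES: comma-separated address types, highest priority first
#                    (for example "InternalIP,ExternalIP,Hostname")
#   ADDRESS:         hypothetical TYPE=VALUE pairs mirroring .status.addresses
# Prints the value of the first address whose type appears earliest in the
# priority list; returns 1 if nothing matches.
select_node_address() {
  preferred="$1"
  shift
  old_ifs="$IFS"
  IFS=','
  for want in $preferred; do
    for addr in "$@"; do
      if [ "${addr%%=*}" = "$want" ]; then
        IFS="$old_ifs"
        echo "${addr#*=}"
        return 0
      fi
    done
  done
  IFS="$old_ifs"
  return 1
}

# Example (no cluster needed):
#   select_node_address "InternalIP,ExternalIP,Hostname" \
#     "Hostname=minikube" "InternalIP=192.168.49.2"
#
# To see the real addresses of a node (requires kubectl and a cluster):
#   kubectl get node <node-name> -o jsonpath='{.status.addresses}'
```

If the selected address is one that the Kubelet's certificate does not cover, scraping can fail with a TLS error like the one shown in the Validation section.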

Endpoints

All endpoints are GET endpoints, rooted at /apis/metrics.k8s.io/v1beta1/. Other REST methods are not supported.

The list of supported endpoints:

/nodes - all node metrics; type []NodeMetrics
/nodes/{node} - metrics for a specified node; type NodeMetrics
/namespaces/{namespace}/pods - all pod metrics within a namespace (use /pods for all namespaces); type []PodMetrics
/namespaces/{namespace}/pods/{pod} - metrics for a specified pod; type PodMetrics
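As a rough sketch, the endpoints above can be queried directly with kubectl get --raw. The helper below only builds the request path; its name metrics_path is invented for this example, and the base path assumes the metrics.k8s.io/v1beta1 group version, which matches the v1beta1.metrics.k8s.io APIService registered during installation.

```shell
#!/bin/sh
# Assumed API root: the metrics.k8s.io group, version v1beta1.
METRICS_API_ROOT="/apis/metrics.k8s.io/v1beta1"

# metrics_path nodes [NODE]
# metrics_path pods NAMESPACE [POD]
# Prints the Metrics API path for the requested resource.
metrics_path() {
  case "$1" in
    nodes)
      if [ -n "${2:-}" ]; then
        printf '%s/nodes/%s\n' "$METRICS_API_ROOT" "$2"
      else
        printf '%s/nodes\n' "$METRICS_API_ROOT"
      fi
      ;;
    pods)
      if [ -z "${2:-}" ]; then
        echo "usage: metrics_path pods NAMESPACE [POD]" >&2
        return 1
      fi
      if [ -n "${3:-}" ]; then
        printf '%s/namespaces/%s/pods/%s\n' "$METRICS_API_ROOT" "$2" "$3"
      else
        printf '%s/namespaces/%s/pods\n' "$METRICS_API_ROOT" "$2"
      fi
      ;;
    *)
      echo "usage: metrics_path nodes|pods ..." >&2
      return 1
      ;;
  esac
}

# Usage against a live cluster (requires kubectl and a running Metrics Server):
#   kubectl get --raw "$(metrics_path nodes)"
#   kubectl get --raw "$(metrics_path pods kube-system)"
```

The responses are JSON, so piping them through a JSON formatter makes them easier to read.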

Installation

Metrics Server can be installed either directly from YAML manifest or via the official Helm chart.

Directly from Manifest

# Non HA Deployment
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.7.1/components.yaml

# HA Deployment
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.7.1/high-availability.yaml

Output:

serviceaccount/metrics-server created
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrole.rbac.authorization.k8s.io/system:metrics-server created
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created
service/metrics-server created
deployment.apps/metrics-server created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created

Via Official Helm Chart

# add the metrics-server repo
helm repo add metrics-server https://kubernetes-sigs.github.io/metrics-server/

# install the chart
helm upgrade --install metrics-server metrics-server/metrics-server

Validation

$ kubectl get po -n kube-system -l k8s-app=metrics-server
NAME                              READY   STATUS    RESTARTS   AGE
metrics-server-6d94bc8694-v2x58   0/1     Running   0          7m38s

If the pod is not ready (for example, READY shows 0/1 as above), check the pod logs for errors.

$ kubectl logs -f deployment/metrics-server -n kube-system

If the output looks similar to the following:

I0413 05:11:23.275913       1 shared_informer.go:318] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
E0413 05:11:38.174750       1 scraper.go:149] "Failed to scrape node" err="Get \"https://192.168.49.2:10250/metrics/resource\": tls: failed to verify certificate: x509: cannot validate certificate for 192.168.49.2 because it doesn't contain any IP SANs" node="minikube"
I0413 05:11:46.219134       1 server.go:191] "Failed probe" probe="metric-storage-ready" err="no metrics to serve"
E0413 05:11:53.176157       1 scraper.go:149] "Failed to scrape node" err="Get \"https://192.168.49.2:10250/metrics/resource\": tls: failed to verify certificate: x509: cannot validate certificate for 192.168.49.2 because it doesn't contain any IP SANs" node="minikube"

then edit the deployment configuration:

$ kubectl edit deployment/metrics-server -n kube-system

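and add --kubelet-insecure-tls to the metrics-server container’s args. This flag skips verification of the Kubelet’s serving certificate, so use it only on test clusters such as minikube.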
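If you would rather not edit the deployment by hand, the same change can be applied as a JSON patch. This is a sketch under two assumptions: metrics-server is the first container in the pod template, and it already has an args list (both hold for the stock manifest as far as I can tell). The function name insecure_tls_patch is just for this example.

```shell
#!/bin/sh
# insecure_tls_patch prints a JSON Patch that appends --kubelet-insecure-tls
# to the args of container index 0 in the deployment's pod template.
# Assumption: metrics-server is container 0 and already defines args.
insecure_tls_patch() {
  cat <<'EOF'
[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--kubelet-insecure-tls"}]
EOF
}

# Usage against a live cluster (test clusters only):
#   kubectl patch deployment metrics-server -n kube-system \
#     --type=json -p "$(insecure_tls_patch)"
#   kubectl rollout status deployment/metrics-server -n kube-system
```

Patching the deployment triggers a new rollout, so the old pod is replaced by one that starts with the extra flag.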

Then check the pod status again; the pod should become ready once metrics are scraped successfully.

Horizontal Pod Autoscaling (HPA)

Amazon Elastic Kubernetes Service (EKS) supports the Horizontal Pod Autoscaler (HPA) and Kubernetes Metrics Server. This integration simplifies scaling Kubernetes workloads managed by Amazon EKS in response to resource metrics such as CPU and memory.

Containers make it quick to scale an application up or down. Many workloads define scaling in terms of custom metrics such as inbound connection requests or job queue length. The Horizontal Pod Autoscaler (HPA), a Kubernetes component, automates scaling based on the CPU or memory utilization metrics provided by the Kubernetes Metrics Server.

Info: Clusters running PlatformVersion “eks.2” have API Aggregation enabled and, as a result, support the Horizontal Pod Autoscaler and Kubernetes Metrics Server. Note that you must use version 0.3.0 or greater of Kubernetes Metrics Server with Amazon EKS.

Configure HPA

Deploy a PHP application and service

$ kubectl create deployment php-apache --image=registry.k8s.io/hpa-example
deployment.apps/php-apache created

Set CPU requests on the deployment

Important: If you don’t set a CPU request for the container, the CPU utilization metric for the pod is undefined and the HPA can’t scale.

$ kubectl patch deployment php-apache -p='{"spec":{"template":{"spec":{"containers":[{"name":"hpa-example","resources":{"requests":{"cpu":"200m"}}}]}}}}'
deployment.apps/php-apache patched
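The reason the request matters is that the HPA expresses CPU utilization as a percentage of the requested CPU. A small sketch of that arithmetic follows; the function name cpu_utilization_percent is made up for this example, values are in millicores, and the result is truncated to a whole percent.

```shell
#!/bin/sh
# cpu_utilization_percent USAGE_MILLICORES REQUEST_MILLICORES
# Prints usage as a whole-number percentage of the request.
# Returns 1 when no request is set, because utilization is then undefined.
cpu_utilization_percent() {
  usage_m="$1"
  request_m="$2"
  if [ "$request_m" -le 0 ]; then
    echo "undefined: no CPU request set" >&2
    return 1
  fi
  echo $(( usage_m * 100 / request_m ))
}

# With the 200m request from the patch above:
#   cpu_utilization_percent 100 200   # pod using 100m of CPU
#   cpu_utilization_percent 300 200   # pod using 300m of CPU
```

A pod using 100m against a 200m request sits at 50% utilization, while 300m of usage is 150%, well above a 50% target.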

Expose the deployment as a service

$ kubectl create service clusterip php-apache --tcp=80
service/php-apache created

Confirm that no HPA resources exist in the cluster yet:

$ kubectl get hpa -A
No resources found

Create an HPA that targets 50% average CPU utilization and keeps between 1 and 10 replicas

$ kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10
horizontalpodautoscaler.autoscaling/php-apache autoscaled
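If you prefer a declarative setup, roughly the same HPA can be written as a manifest. The sketch below assumes the cluster serves the autoscaling/v2 API (stable since Kubernetes 1.23) and that the deployment lives in the current namespace; the function name hpa_manifest is only for this example.

```shell
#!/bin/sh
# hpa_manifest prints an autoscaling/v2 HorizontalPodAutoscaler that mirrors
# `kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10`.
# Assumption: the cluster serves autoscaling/v2.
hpa_manifest() {
  cat <<'EOF'
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
EOF
}

# Usage against a live cluster:
#   hpa_manifest | kubectl apply -f -
```

Keeping the manifest in version control makes the scaling policy reviewable, which the one-off kubectl autoscale command does not.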

Create a pod to generate load against the php-apache service. The first command opens a shell inside the pod; run the loop in that shell:

$ kubectl run -i --tty load-generator --image=busybox -- /bin/sh
while true; do wget -q -O- http://php-apache; done

In a separate terminal, watch how the HPA scales the deployment based on CPU utilization:

$ kubectl get hpa -w
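While watching, you can estimate the replica count the HPA is heading for with the scaling rule from the Kubernetes documentation: desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue), clamped to the configured minimum and maximum. The sketch below implements only that rule with integer arithmetic; the function name desired_replicas is invented for this example, and the real controller also applies a tolerance and other safeguards that are not modeled here.

```shell
#!/bin/sh
# desired_replicas CURRENT_REPLICAS CURRENT_UTIL_PERCENT TARGET_UTIL_PERCENT MIN MAX
# Prints ceil(current * util / target), clamped to [MIN, MAX].
desired_replicas() {
  current="$1"
  util="$2"
  target="$3"
  min="$4"
  max="$5"
  # Integer ceiling division: (a + b - 1) / b
  desired=$(( (current * util + target - 1) / target ))
  if [ "$desired" -lt "$min" ]; then
    desired="$min"
  fi
  if [ "$desired" -gt "$max" ]; then
    desired="$max"
  fi
  echo "$desired"
}

# Examples for the HPA created above (target 50%, min 1, max 10):
#   desired_replicas 1 250 50 1 10   # one pod at 250% of its request
#   desired_replicas 5 200 50 1 10   # would need 20 pods, capped at max
#   desired_replicas 4 10 50 1 10    # load has dropped, scale back down
```

So a single pod running at 250% of its CPU request pushes the HPA toward 5 replicas, and once the load generator is stopped the count drifts back toward the minimum.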