Kubernetes Metrics-Server & HPA

Manish Sharma
Apr 13, 2024



The Metrics Server is an integral component of Kubernetes’ autoscaling infrastructure, efficiently gathering container resource metrics from Kubelets.

It exposes these metrics through the Metrics API in the Kubernetes apiserver for use by the Horizontal Pod Autoscaler (HPA) and Vertical Pod Autoscaler (VPA). The Metrics API can also be queried with kubectl top, which makes it easier to debug autoscaling configurations.


The Metrics Server is meant only for autoscaling. Do not use it to forward metrics to monitoring solutions or as a source of monitoring metrics. In those cases, collect metrics directly from the Kubelet’s /metrics/resource endpoint.
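As a minimal sketch of reading the Kubelet endpoint mentioned above, the path below goes through the API server's node proxy. The helper name kubelet_resource_metrics_path is hypothetical, and the node name minikube is only an example.

```shell
# Hypothetical helper: API server proxy path to a node's Kubelet
# /metrics/resource endpoint.
kubelet_resource_metrics_path() {
  echo "/api/v1/nodes/$1/proxy/metrics/resource"
}

kubelet_resource_metrics_path minikube

# Usage (needs cluster access and permission on nodes/proxy):
#   kubectl get --raw "$(kubelet_resource_metrics_path minikube)"
```

A monitoring agent would normally scrape this endpoint on its own schedule rather than through kubectl; the command is only a quick way to inspect what the Kubelet reports.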


The Metrics Server presents several advantages:

  • It offers a unified deployment suitable for the majority of clusters.
  • It enables rapid autoscaling by gathering metrics at 15-second intervals.
  • It ensures resource efficiency, consuming merely 1 milli-core of CPU and 2 MB of memory per node within a cluster.
  • It boasts scalable support, accommodating clusters of up to 5,000 nodes.
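To put the per-node figures above in perspective, here is a rough shell sketch that extrapolates them linearly to a whole cluster. The function name estimate_footprint is hypothetical, and the result is an estimate, not a sizing guarantee.

```shell
# Hypothetical helper: linear estimate from the per-node figures quoted above
# (1 millicore of CPU and 2 MB of memory per node).
estimate_footprint() {
  nodes=$1
  cpu_millicores=$((nodes * 1))
  memory_mb=$((nodes * 2))
  echo "${cpu_millicores}m CPU, ${memory_mb}MB memory"
}

estimate_footprint 100
```

For a 100-node cluster this prints "100m CPU, 200MB memory", which can serve as a starting point when setting the Metrics Server's own resource requests.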

Use Cases

Metrics Server proves valuable in the following scenarios:

  • Horizontal Pod Autoscaling based on CPU and memory metrics.
  • Vertical Pod Autoscaling: automatically adjusting or recommending resource allocations for containers.

Avoid employing Metrics Server under the following circumstances:

  • Outside Kubernetes environments.
  • When precise resource usage metrics are essential.
  • For horizontal autoscaling based on resources other than CPU and memory.


Your Kubernetes cluster’s network configuration must allow the following communication:

  • Allow the control plane node to access the Metrics Server’s pod IP and port 10250 (or node IP and a custom port if hostNetwork is enabled).
  • Enable Metrics Server to communicate with the Kubelet on all nodes. Metrics Server requires access to the node’s address and Kubelet port. These addresses and ports are configured in Kubelet and exposed as part of the Node object. The addresses are listed in .status.addresses, and the Kubelet port is specified in .status.daemonEndpoints.kubeletEndpoint.port (default port is 10250). Metrics Server will select the first node address based on the list provided by the kubelet-preferred-address-types command-line flag.
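To see which addresses and Kubelet port each node advertises, the Node fields named above can be read with a jsonpath query. This is a sketch; the wrapper name show_kubelet_endpoints is hypothetical.

```shell
# Hypothetical helper: print each node's name, its addresses from
# .status.addresses, and the Kubelet port from
# .status.daemonEndpoints.kubeletEndpoint.port, separated by tabs.
show_kubelet_endpoints() {
  kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.addresses[*].address}{"\t"}{.status.daemonEndpoints.kubeletEndpoint.port}{"\n"}{end}'
}

# Usage (needs cluster access):
#   show_kubelet_endpoints
```

If a node's listed addresses are not reachable from the Metrics Server pod, scraping that node is likely to fail.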


All endpoints are GET endpoints, rooted at /apis/metrics.k8s.io/v1beta1/ (the API group and version registered by the v1beta1.metrics.k8s.io APIService). Other REST methods are not supported.

The list of supported endpoints:

  • /nodes - all node metrics; type []NodeMetrics
  • /nodes/{node} - metrics for a specified node; type NodeMetrics
  • /namespaces/{namespace}/pods - all pod metrics within namespace with support for all-namespaces; type []PodMetrics
  • /namespaces/{namespace}/pods/{pod} - metrics for a specified pod; type PodMetrics
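The endpoints listed above can be queried directly with kubectl get --raw. The sketch below builds the paths, assuming the API root is the one registered by the v1beta1.metrics.k8s.io APIService; the helper name metrics_path is hypothetical.

```shell
# API root served by Metrics Server (assumption: the group/version registered
# by the v1beta1.metrics.k8s.io APIService).
METRICS_API_ROOT="/apis/metrics.k8s.io/v1beta1"

# Hypothetical helper: build the path for node or pod metrics.
#   metrics_path nodes [node]
#   metrics_path pods <namespace> [pod]
metrics_path() {
  case "$1" in
    nodes) echo "${METRICS_API_ROOT}/nodes${2:+/$2}" ;;
    pods)  echo "${METRICS_API_ROOT}/namespaces/$2/pods${3:+/$3}" ;;
    *)     echo "unknown resource: $1" >&2; return 1 ;;
  esac
}

metrics_path nodes
metrics_path pods kube-system

# Usage (needs cluster access):
#   kubectl get --raw "$(metrics_path nodes)"
```

The response is JSON, so it can be piped to a JSON formatter for easier reading.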


Metrics Server can be installed either directly from a YAML manifest or via the official Helm chart.

Directly from Manifest

# Non HA Deployment
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.7.1/components.yaml

# HA Deployment
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.7.1/high-availability.yaml


serviceaccount/metrics-server created
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrole.rbac.authorization.k8s.io/system:metrics-server created
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created
service/metrics-server created
deployment.apps/metrics-server created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created

Via Official Helm Chart

# add the metrics-server repo
helm repo add metrics-server https://kubernetes-sigs.github.io/metrics-server/

# install the chart
helm upgrade --install metrics-server metrics-server/metrics-server


Verify that the Metrics Server pod is running:

$ kubectl get po -n kube-system -l k8s-app=metrics-server
metrics-server-6d94bc8694-v2x58 0/1 Running 0 7m38s

If the pod is not running or not ready (for example, READY shows 0/1 as above), check the pod’s logs for errors.

$ kubectl logs -f deployment/metrics-server -n kube-system

If the output looks similar to the following:

I0413 05:11:23.275913       1 shared_informer.go:318] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
E0413 05:11:38.174750 1 scraper.go:149] "Failed to scrape node" err="Get \"\": tls: failed to verify certificate: x509: cannot validate certificate for because it doesn't contain any IP SANs" node="minikube"
I0413 05:11:46.219134 1 server.go:191] "Failed probe" probe="metric-storage-ready" err="no metrics to serve"
E0413 05:11:53.176157 1 scraper.go:149] "Failed to scrape node" err="Get \"\": tls: failed to verify certificate: x509: cannot validate certificate for because it doesn't contain any IP SANs" node="minikube"

then edit the deployment configuration:

$ kubectl edit deployment/metrics-server -n kube-system 

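and add --kubelet-insecure-tls to the container’s args. This flag makes Metrics Server skip verification of the Kubelet’s serving certificate, so use it only in test clusters such as minikube.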
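As an alternative to editing interactively, the same change can be sketched as a JSON patch. This assumes the metrics-server container is the first container in the pod template and already has an args list, as in the upstream components.yaml; the wrapper name allow_insecure_kubelet_tls is hypothetical.

```shell
# JSON patch that appends the flag to the first container's args.
# Assumption: the metrics-server container is at index 0 and has an args list.
PATCH='[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--kubelet-insecure-tls"}]'

# Hypothetical wrapper around kubectl patch.
allow_insecure_kubelet_tls() {
  kubectl patch deployment metrics-server -n kube-system --type=json -p="$PATCH"
}

# Usage (needs cluster access; test clusters only):
#   allow_insecure_kubelet_tls
```

Patching the pod template triggers a new rollout, so a fresh pod starts with the added flag.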

Now check the pod status again. The pod should become ready (1/1) once it can scrape the node.

Horizontal Pod Autoscaling (HPA)

Amazon Elastic Kubernetes Service (EKS) supports the Horizontal Pod Autoscaler (HPA) and Kubernetes Metrics Server. This integration simplifies the process of scaling Kubernetes workloads managed by Amazon EKS in response to custom metrics.

Containers make it easy to scale an application up or down quickly. Many workloads define scaling in terms of custom metrics such as inbound connection requests or job queue length. The Horizontal Pod Autoscaler (HPA), a Kubernetes component, automatically scales a workload based on the CPU or memory utilization metrics reported by the Kubernetes Metrics Server.

Info: Clusters running PlatformVersion “eks.2” have API Aggregation enabled and therefore support the Horizontal Pod Autoscaler and Kubernetes Metrics Server. Note that Amazon EKS requires version 0.3.0 or greater of Kubernetes Metrics Server.

Configure HPA

  • Deploy a PHP application and service
$ kubectl create deployment php-apache --image=registry.k8s.io/hpa-example
deployment.apps/php-apache created
  • Set a CPU request on the deployment

Important: The HPA calculates CPU utilization as a percentage of the container’s CPU request. If the request is not set, the CPU utilization metric for the pod is undefined and the HPA can’t scale.

$ kubectl patch deployment php-apache -p='{"spec":{"template":{"spec":{"containers":[{"name":"hpa-example","resources":{"requests":{"cpu":"200m"}}}]}}}}'
deployment.apps/php-apache patched
  • Expose the deployment as a service
$ kubectl create service clusterip php-apache --tcp=80
service/php-apache created
  • Verify that no HPA resources exist in the cluster yet:
$ kubectl get hpa -A
No resources found
  • Create an HPA that targets 50% CPU utilization and keeps between 1 and 10 replicas
$ kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10
horizontalpodautoscaler.autoscaling/php-apache autoscaled
  • Create a pod to generate load against the php-apache deployment, then run the loop inside the pod’s shell
$ kubectl run -i --tty load-generator --image=busybox /bin/sh
while true; do wget -q -O- http://php-apache; done
  • In a separate terminal, watch how the HPA scales the deployment based on CPU utilization:
$ kubectl get hpa -w
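To make the scaling decision in the steps above concrete, the sketch below works through the arithmetic from the Kubernetes HPA documentation: utilization is usage divided by the CPU request, and the desired replica count is ceil(currentReplicas * currentUtilization / targetUtilization). The function names and the 500m usage figure are illustrative assumptions. The sketch ignores the HPA's tolerance band and the min/max bounds.

```shell
# CPU utilization as a percentage of the request (both in millicores).
cpu_utilization_percent() {
  usage_m=$1
  request_m=$2
  echo $((usage_m * 100 / request_m))
}

# desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization)
desired_replicas() {
  current_replicas=$1
  current_util=$2
  target_util=$3
  # Integer ceiling division.
  echo $(((current_replicas * current_util + target_util - 1) / target_util))
}

# Illustrative numbers: one pod using 500m against the 200m request,
# with the 50% target set by --cpu-percent=50.
util=$(cpu_utilization_percent 500 200)
desired_replicas 1 "$util" 50
```

With these numbers utilization is 250%, so the sketch prints 5. In the real controller the result would additionally be clamped to the range set by --min=1 and --max=10.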

