How to monitor your Kubernetes metrics server

Introduction

In this article, we will examine the Kubernetes metrics server and its uses. We will also learn how to set it up and use it to monitor Kubernetes metrics. Finally, we will explore using Hosted Graphite by MetricFire to monitor Kubernetes metrics.

                   

To get started quickly with monitoring Kubernetes clusters, check out our tutorial on running the Telegraf agent as a DaemonSet to forward node and pod metrics to a data source, then using that data to create custom dashboards and alerts. To learn more about monitoring Kubernetes metrics with Hosted Graphite by MetricFire, book a demo with the MetricFire team or sign up for the free trial today.

                       

What is a Kubernetes metrics server?

The Kubernetes metrics server is a cluster add-on that collects resource metrics (CPU and memory usage) from the kubelet on each node. It aggregates these metrics and exposes them through the Kubernetes API server via the Metrics API. The metrics server is intended for autoscaling pipelines, not as a general-purpose monitoring source.

                

The main advantages of Kubernetes metrics server:

  1. Efficient use of resources.
  2. Scalable support for clusters of up to 5,000 nodes.
  3. A single deployment that works on most clusters.
  4. Regular collection of metrics, every 15 seconds by default.

            

What is a Kubernetes metrics server used for?

Let’s look at the cases where you can use the metrics server.

  1. Horizontal autoscaling based on CPU or memory. The Horizontal Pod Autoscaler is implemented as a control loop whose period is set by a kube-controller-manager flag, --horizontal-pod-autoscaler-sync-period (15 seconds by default). During each period, the controller manager queries resource usage for the metrics specified in each autoscaler definition, obtaining them from the Resource Metrics API or the Custom Metrics API. Horizontal autoscaling does not apply to objects that cannot be scaled, such as DaemonSets.
  2. Automatically adjusting or recommending the resources that containers need. The Vertical Pod Autoscaler (VPA) sets resource requests automatically based on usage, so that pods are scheduled onto nodes with the appropriate amount of resources available. It can also maintain the ratio between limits and requests specified in the initial container configuration. Depending on usage over time, it can scale down pods that are over-requesting resources or scale up pods that are under-requesting them.
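
To make the horizontal autoscaling case concrete, here is a minimal sketch of the replica calculation described in the Kubernetes documentation: desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue). The helper name desired_replicas is hypothetical and is not part of kubectl. The sketch also ignores the tolerance band and the minimum and maximum replica bounds that the real controller applies.

```shell
# desired_replicas CURRENT_REPLICAS CURRENT_METRIC TARGET_METRIC
# Prints ceil(CURRENT_REPLICAS * CURRENT_METRIC / TARGET_METRIC).
# Hypothetical helper for illustration only; not part of kubectl.
desired_replicas() {
  LC_ALL=C awk -v replicas="$1" -v current="$2" -v target="$3" 'BEGIN {
    raw = replicas * current / target
    rounded = int(raw)
    if (rounded < raw) rounded++   # round up, matching ceil in the formula
    print rounded
  }'
}

# Example: 4 replicas averaging 80% CPU against a 50% target
desired_replicas 4 80 50   # prints 7
```

With 4 replicas at 80% average CPU and a 50% target, the raw value is 6.4, so the autoscaler would aim for 7 replicas.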

                        

You should use tools other than the metrics server in the following cases:

  1. You need to monitor cluster metrics that are not Kubernetes-specific.
  2. You need an accurate source of resource utilization metrics.
  3. You need horizontal autoscaling based on resources other than CPU or memory.

                  

Kubernetes metrics server requirements

Before using the metrics server, check that your network and cluster meet the following requirements:

  • Metrics server address: the control plane must be able to reach the metrics server. If hostNetwork is enabled, the node's IP address is used; otherwise, the pod's IP address is used.
  • The aggregation layer must be enabled on the Kube API server.
  • The cluster nodes must have Webhook authentication and authorization enabled.
  • If kubelet certificate validation is enabled on the metrics server, the kubelet certificate must be signed by the cluster CA.
  • The container runtime must implement the container metrics RPCs or have cAdvisor support.

                            

                                        

Metrics to watch

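Let’s look at the main groups of Kubernetes metrics worth monitoring. Note that the metrics server itself reports only CPU and memory usage for nodes and pods; the other values described below come from other sources, such as the Kubernetes API objects themselves.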

          

Cluster state metrics

These metrics show the health and availability of Kubernetes objects. They are used to keep track of whether pods are working as expected. They give you high-level information about the cluster and its health and can help identify problems with nodes and pods. Cluster state metrics include node status, desired pods, current pods, available pods, and unavailable pods.

              

Resource metrics

These metrics allow you to understand whether the cluster can handle its workloads and new loads. It is possible to track the use of resources at different levels of the cluster. This group includes the following metrics: memory requests, memory limits, allocatable memory, memory utilization, CPU requests, CPU limits, allocatable CPU, CPU utilization, and disk utilization.

                

Control plane metrics

This group includes metrics that allow you to monitor the operation of the primary services and resources for managing the cluster, such as API servers, controller managers, schedulers, and data stores.

                          

How to deploy a metrics server?

Some clusters include a metrics server deployment by default. To check whether the metrics server is running on your cluster, run the following command:

kubectl get pods --all-namespaces | grep metrics-server

                

If the metrics server is running, you will see information about its running pods in the response. Otherwise, run the following command to install the latest version of the metrics server.

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
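
After applying the manifest, it can help to confirm that the Metrics API is actually registered and serving before querying it. The sketch below assumes the default components.yaml layout, where the APIService is named v1beta1.metrics.k8s.io and the Deployment is metrics-server in the kube-system namespace. The function name metrics_api_available is a hypothetical helper, not a kubectl feature.

```shell
# metrics_api_available: succeeds when the v1beta1.metrics.k8s.io APIService
# reports the condition Available=True.
# Hypothetical helper name; it only wraps a standard kubectl query.
metrics_api_available() {
  api_status=$(kubectl get apiservice v1beta1.metrics.k8s.io \
    -o jsonpath='{.status.conditions[?(@.type=="Available")].status}')
  [ "$api_status" = "True" ]
}

# Usage (requires access to a cluster):
#   kubectl -n kube-system rollout status deployment/metrics-server
#   metrics_api_available && echo "Metrics API is available"
```

If the check fails shortly after installation, the metrics server may simply need more time to collect its first round of metrics.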

          

Querying the Metrics API

After installing the metrics server, you can get metrics for any node or pod by querying the Metrics API with kubectl get --raw. Use the following commands to get metrics for all nodes and pods.

# Get the metrics for all nodes 
kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes


# Get the metrics for all pods 
kubectl get --raw /apis/metrics.k8s.io/v1beta1/pods

             

You can also get metrics for a single node or pod. To do this, specify the node name, or the pod's namespace and name, in the request path, as shown in the following commands.

# Get the metrics for node <node_name>
kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes/<node_name> | jq '.'
# Get the metrics for pod <pod_name> in namespace <namespace>
kubectl get --raw /apis/metrics.k8s.io/v1beta1/namespaces/<namespace>/pods/<pod_name> | jq '.'

           

To list the nodes in the cluster or the pods in a given namespace, run kubectl get nodes or kubectl get pods -n <namespace>, respectively.

The Metrics API returns the result in JSON format. To display the JSON in a human-readable form in the terminal, pipe the output through the jq utility.
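
The usage values in the JSON are Kubernetes quantity strings, typically CPU in nanocores (for example 250000000n) and memory in kibibytes (for example 2097152Ki). The following sketch converts them to millicores and MiB. The helper names cpu_to_millicores and mem_to_mib are hypothetical, and only the most common suffixes are handled.

```shell
# cpu_to_millicores QUANTITY: convert a CPU quantity to millicores.
# Handles the n (nano), u (micro), and m (milli) suffixes, or whole cores.
# Hypothetical helper; other quantity suffixes are not covered.
cpu_to_millicores() {
  case "$1" in
    *n) LC_ALL=C awk -v v="${1%n}" 'BEGIN { printf "%.1f\n", v / 1000000 }' ;;
    *u) LC_ALL=C awk -v v="${1%u}" 'BEGIN { printf "%.1f\n", v / 1000 }' ;;
    *m) LC_ALL=C awk -v v="${1%m}" 'BEGIN { printf "%.1f\n", v }' ;;
    *)  LC_ALL=C awk -v v="$1" 'BEGIN { printf "%.1f\n", v * 1000 }' ;;
  esac
}

# mem_to_mib QUANTITY: convert a memory quantity to MiB.
# Handles the Ki, Mi, and Gi suffixes, or plain bytes.
mem_to_mib() {
  case "$1" in
    *Ki) LC_ALL=C awk -v v="${1%Ki}" 'BEGIN { printf "%.1f\n", v / 1024 }' ;;
    *Mi) LC_ALL=C awk -v v="${1%Mi}" 'BEGIN { printf "%.1f\n", v }' ;;
    *Gi) LC_ALL=C awk -v v="${1%Gi}" 'BEGIN { printf "%.1f\n", v * 1024 }' ;;
    *)   LC_ALL=C awk -v v="$1" 'BEGIN { printf "%.1f\n", v / 1048576 }' ;;
  esac
}

cpu_to_millicores 250000000n   # prints 250.0
mem_to_mib 2097152Ki           # prints 2048.0

# To list raw values per node (requires a cluster and jq):
#   kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes \
#     | jq -r '.items[] | "\(.metadata.name) \(.usage.cpu) \(.usage.memory)"'
```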

            

Use the kubectl top command to get the current CPU and memory usage for all or individual nodes or pods. The following command returns resource usage for all pods in the current namespace.

kubectl top pod
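
A few other kubectl top variations are often useful when looking for the heaviest consumers. The sketch below wraps one of them in a small function. The name top_pods_by_cpu is a hypothetical helper, while the flags it passes (--all-namespaces, --sort-by, --no-headers) are standard kubectl top options.

```shell
# top_pods_by_cpu [N]: print the N pods (default 5) with the highest
# CPU usage across all namespaces. Hypothetical helper name.
top_pods_by_cpu() {
  kubectl top pod --all-namespaces --sort-by=cpu --no-headers | head -n "${1:-5}"
}

# Other variations (require access to a cluster):
#   kubectl top node                          # usage per node
#   kubectl top pod <pod_name> --containers   # usage per container in one pod
#   kubectl top pod --sort-by=memory          # pods ordered by memory usage
```

Calling top_pods_by_cpu 3 would print the three busiest pods by CPU.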

             

You can read more about the kubectl tool and all its commands in the official kubectl reference documentation.

               

Using Kubernetes dashboard for watching metrics

The Kubernetes dashboard is a graphical tool for monitoring and managing a cluster. It covers much of the same functionality as kubectl. The dashboard has a panel that provides a convenient metrics breakdown for each node and pod. In addition, it has charts that let you track how the metrics have changed over a certain period.

       

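To install the Kubernetes dashboard, apply the recommended manifest for the release you want. The following command installs v2.7.0, which deploys into the kubernetes-dashboard namespace; check the project's releases page for newer versions.

kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.7.0/aio/deploy/recommended.yaml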

        

To access the dashboard interface through the browser, run the following command:

kubectl proxy

        

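Next, retrieve an authentication token and enter it on the dashboard login page. The following command assumes that a service account named admin-user already exists in the kubernetes-dashboard namespace and has a token secret: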

kubectl --namespace kubernetes-dashboard describe secret $(kubectl -n kubernetes-dashboard get secret | grep admin-user | awk '{print $1}')

         

After completing the authentication, you can access the dashboard graphical interface, which you can use to monitor metrics and edit Kubernetes objects.

           

Using Hosted Graphite by MetricFire to monitor Kubernetes

A production-level Kubernetes infrastructure can require a few hundred nodes and upwards of a few Mbps of network traffic. Therefore, you must scale out both Graphite and Grafana to handle the increasing load. 

That’s where Hosted Graphite and Hosted Grafana come into the picture. They let you scale for long-term storage and provide redundant data storage without going through the arduous process of setting up Graphite and Grafana yourself. You don’t have to worry about installing, configuring, and maintaining your monitoring system; you simply view the metrics on a web page.

Hosted Graphite and Hosted Grafana through MetricFire allow for the continuous active deployment of new features, as MetricFire’s products all have their foundations in ever-growing open-source projects. Configuring a Snap Daemon to send Kubernetes metrics to your MetricFire account is simple. It only requires configuring your account's API key as the prefix for each metric and the URL endpoint as the server destination. Check out our article Monitoring Kubernetes with Hosted Graphite to learn how to set up monitoring for your Kubernetes infrastructure quickly and easily using our Hosted service.

Other benefits of using MetricFire:

  1. No vendor lock-ins: MetricFire provides continuous, uninterrupted access to your data at any time.
  2. Easy Budgeting: You can choose the pricing plan that suits your needs.
  3. Transparency: MetricFire works transparently on all aspects of its SaaS system monitoring operations. Its internal system metrics are on its public status page.
  4. Robust Support: If you have any difficulties working with MetricFire, MetricFire engineers will always provide a comprehensive answer to any question by phone or video conference.

         

If you want to learn more about monitoring Kubernetes metrics with MetricFire, book a demo with our engineers or sign up for a MetricFire free trial today.

            

Conclusion 

The Kubernetes metrics server is a powerful tool for monitoring the CPU and memory metrics that drive Kubernetes autoscaling. However, for it to work, you need to invest time in configuring several tools and settings. An alternative is to use Hosted Graphite by MetricFire, which makes monitoring all your Kubernetes metrics easy. To get started quickly with monitoring Kubernetes clusters, check out our tutorial on running the Telegraf agent as a DaemonSet to forward node and pod metrics to a data source, then using that data to create custom dashboards and alerts.

                  

Book a demo with the MetricFire team or sign up for the MetricFire free trial to find out more options that MetricFire has to offer.
