Blog

AWS EKS Monitoring Stack: Grafana, Prometheus, AlertManager, Promtail, and Loki
Photo by Saifeddine Rajhi

AWS EKS Monitoring Stack: Grafana, Prometheus, AlertManager, Promtail, and Loki

โ€ข 5mins read
  • AWS EKS
  • Grafana
  • Prometheus
  • AlertManager
  • Promtail
  • Loki
  • Open Source Monitoring

    Content

    Monitoring AWS EKS with performant open source tools

    ๐Ÿ”– Introduction

    Amazon Elastic Kubernetes Service (EKS) is a managed Kubernetes service that makes it easy to run Kubernetes on AWS. However, EKS doesn't come with a built-in monitoring solution. This means that you need to set up your own monitoring stack if you want to track the health and performance of your EKS cluster.

    In this article, we will show you how to create a monitoring stack for AWS EKS using Grafana, Prometheus, AlertManager, Promtail, and Loki. These open source tools provide a comprehensive monitoring solution that can help you track all aspects of your EKS cluster, from metrics to logs.

    We will walk you through the steps of installing and configuring these tools, and we will show you how to create dashboards in Grafana to visualize your data. By the end of this article, you will have a solid understanding of how to monitor your AWS EKS cluster using open source tools.

    Grafana and Prometheus

    ๐ŸŽฏ Goals & Objectives

    If you are looking for a comprehensive and cost-effective monitoring solution for your AWS EKS cluster, then this monitoring stack is a great option. So, roll your sleeves and let's get cracking! ๐Ÿ”ฅ

    ๐Ÿ“‹ Prerequisites

    • Knowledge of Kubernetes.
    • AWS EKS cluster.
    • Helm.

    ๐Ÿ› ๏ธ Kube-Prometheus-Stack: An Effective Way to Monitor AWS EKS Cluster

    Kube-Prometheus-stack, also known as Prometheus Operator, is a popular open-source project providing complete monitoring and alerting solutions for Kubernetes clusters. It combines tools and components to create a monitoring stack for Kubernetes environments.

    We will use the official Helm Chart Kube-Prometheus-stack, with a customized value file.

    Steps to Deploy the Monitoring Stack

    1. Creating Namespace called monitoring

      kubectl create namespace monitoring
    2. Add a new Helm Repository

      helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
      helm repo update
    3. Edit the Values

      These settings indicate that the selectors mentioned (rule, service monitor, pod monitor, and scrape config) will have independent configurations and will not be based on Helm graphic values.

      prometheus:
        enabled: true
        agentMode: false
        annotations: {}
        networkPolicy:
          enabled: false
          flavor: kubernetes
        serviceAccount:
          create: true
          name: ""
          annotations: {}
        serviceMonitor:
          interval: ""
          selfMonitor: true
          sampleLimit: 0
          targetLimit: 0
          labelLimit: 0
          labelNameLengthLimit: 0
          labelValueLengthLimit: 0
          scheme: ""
          tlsConfig: {}
          bearerTokenFile:
          metricRelabelings: []
          relabelings: []
        prometheusSpec:
          disableCompaction: false
          enableRemoteWriteReceiver: false
          replicaExternalLabelName: ""
          replicaExternalLabelNameClear: false
          prometheusExternalLabelName: ""
          prometheusExternalLabelNameClear: false
          externalUrl: ""
          nodeSelector: {}
          secrets: []
          configMaps: []
          query: {}
          ruleNamespaceSelector: {}
          ruleSelectorNilUsesHelmValues: true
          ruleSelector: {}
          serviceMonitorSelectorNilUsesHelmValues: true
          serviceMonitorSelector: {}
          serviceMonitorNamespaceSelector: {}
          podMonitorSelectorNilUsesHelmValues: true
          podMonitorSelector: {}
          podMonitorNamespaceSelector: {}
          probeSelectorNilUsesHelmValues: true
          probeSelector: {}
          probeNamespaceSelector: {}
          scrapeConfigSelectorNilUsesHelmValues: true
          scrapeConfigSelector: {}
          scrapeConfigNamespaceSelector: {}
          retention: 10d
          retentionSize: ""
          tsdb:
            outOfOrderTimeWindow: 0s
          walCompression: true
          paused: false
          replicas: 1
          shards: 1
          logLevel: info
          logFormat: logfmt
          routePrefix: /
          podMetadata: {}
          podAntiAffinity: ""
          podAntiAffinityTopologyKey: kubernetes.io/hostname
          affinity: {}
          remoteRead: []
          additionalRemoteRead: []
          remoteWrite: []
          additionalRemoteWrite: []
          remoteWriteDashboards: false
          resources: {}
          storageSpec: {}
          volumes: []
          volumeMounts: []
          additionalScrapeConfigs: []
          additionalScrapeConfigsSecret: {}
          additionalPrometheusSecretsAnnotations: {}
          additionalAlertManagerConfigs: []
          additionalAlertManagerConfigsSecret: {}
          additionalAlertRelabelConfigs: []
          additionalAlertRelabelConfigsSecret: {}
          arbitraryFSAccessThroughSMs: false
          overrideHonorLabels: false
          overrideHonorTimestamps: false
          ignoreNamespaceSelectors: false
          enforcedNamespaceLabel: ""
          prometheusRulesExcludedFromEnforce: []
          excludedFromEnforcement: []
          queryLogFile: false
          enforcedSampleLimit: false
          enforcedTargetLimit: false
          enforcedLabelLimit: false
          enforcedLabelNameLengthLimit: false
          enforcedLabelValueLengthLimit: false
          allowOverlappingBlocks: false
          minReadySeconds: 0
          hostNetwork: false
          hostAliases: []
          tracingConfig: {}
    4. Run the Helm Deployment Command

      helm upgrade --install -f values.yaml kube-prometheus-stack prometheus-community/kube-prometheus-stack -n monitoring

    After we deploy the Kube-Prometheus stack, we get:

    • Grafana
    • Prometheus
    • AlertManager

    ๐Ÿงช Testing

    To check if everything is working, you can use the following commands:

    Prometheus Dashboard

    kubectl port-forward svc/prometheus-operated 9090:9090 -n monitoring

    You can access the Prometheus dashboard at http://localhost:9090.

    Grafana Dashboard

    kubectl port-forward svc/kube-prometheus-stack-grafana 3000:3000 -n monitoring

    You can access the Grafana dashboard at http://localhost:3000.

    Check Running Pods

    To ensure all pods are running, use the following command:

    kubectl get pods --namespace monitoring

    ๐Ÿ›ก๏ธ Create a ServiceMonitor

    Prometheus Operator uses ServiceMonitors to auto-detect target pods based on label selectors and associate them with Prometheus instances. The ServiceMonitor defines an application that scrapes metrics from Kubernetes.

    Below is an example manifest for a ServiceMonitor:

    apiVersion: monitoring.coreos.com/v1
    kind: ServiceMonitor
    metadata:
        name: prometheus-self
        labels:
            app: kube-prometheus-stack-prometheus
    spec:
        endpoints:
            - interval: 30s
                port: web
        selector:
            matchLabels:
                app: kube-prometheus-stack-prometheus

    The ServiceMonitor defined above will select pods labeled app: kube-prometheus-stack-prometheus using the spec.selector.matchLabels field.

    To apply the ServiceMonitor manifest, run the following command:

    kubectl apply -f servicemonitor.yaml -n monitoring

    ๐Ÿ”— Integration of Promtail and Loki to Grafana

    Promtail is a log agent that ships logs to Loki, a horizontally scalable, highly available, and efficient log store. Loki can be integrated with Grafana to create dashboards that visualize your logs and metrics. This integration provides a powerful way to monitor your applications and infrastructure.

    ๐Ÿ“œ Intro to Loki

    Loki uses Promtail as its primary log collector, but it can also accept log flows from other sources, such as Syslog or other log senders. It offers high uptime and supports data replication across multiple replicas for durability and fault tolerance. Loki is tightly integrated with Grafana, allowing you to visualize log metrics. Think of Loki as Prometheus, but for logs.

    Promtail

    ๐Ÿ“ฅ Add a New Helm Repository

    helm repo add grafana https://grafana.github.io/helm-charts
    helm repo update

    ๐Ÿš€ Run the Helm Deployment Command

    helm upgrade --install loki grafana/loki-distributed -n monitoring

    After deploying Loki, update the Kube-Prometheus-stack values file to add additional data sources. This configuration defines a default additional data source that connects to a Loki log aggregation system. It specifies the URL for accessing Loki and sets the access method to the proxy.

    grafana:
        additionalDataSources:
            - name: Loki
                type: loki
                url: http://loki-loki-distributed-gateway.monitoring.svc.cluster.local
                access: proxy

    Please refer to the Helm values for more details.

    ๐Ÿ“œ Intro to Promtail

    Promtail can log files from different sources, including local files, container output streams (stdout and stderr), and systemd journals. It supports the development, filtering, and relabeling of logs before sending them to Loki.

    clients:
        - url: http://loki-loki-distributed-gateway.monitoring.svc.cluster.local/loki/api/v1/push

    Please refer to the Helm values for more details.

    ๐Ÿš€ Run the Helm Upgrade Command

    helm upgrade --install -f promtail.yaml promtail grafana/promtail -n monitoring

    After deploying Promtail and Loki, two new data sources will be presented to Grafana.

    ๐Ÿ” Access Grafana

    Run the following command to port-forward Grafana:

    kubectl port-forward svc/kube-prometheus-stack-grafana 3000:3000 -n monitoring

    Connect to Grafana by navigating to http://localhost:3000, then go to Menu -> Data Sources to add the new data sources.

    ๐Ÿ“ Summary

    Setting up a monitoring stack for Kubernetes is a complex task, but it is essential for ensuring the health and performance of your applications. The steps outlined in this article provide a good starting point, but they are just the beginning. In most cases, workloads will not be limited to Kubernetes pods and containers. There will also be a need to monitor virtual machines and other infrastructure components. A unified logging mechanism based on the same tools is essential for getting a complete view of your system.


    Until next time, ใคใฅใ ๐ŸŽ‰

    ๐Ÿ’ก Thank you for Reading !! ๐Ÿ™Œ๐Ÿป๐Ÿ˜๐Ÿ“ƒ, see you in the next blog.๐Ÿค˜ Until next time ๐ŸŽ‰

    ๐Ÿš€ Thank you for sticking up till the end. If you have any questions/feedback regarding this blog feel free to connect with me:

    โ™ป๏ธ LinkedIn: https://www.linkedin.com/in/rajhi-saif/

    โ™ป๏ธ X/Twitter: https://x.com/rajhisaifeddine

    The end โœŒ๐Ÿป

    ๐Ÿ”ฐ Keep Learning !! Keep Sharing !! ๐Ÿ”ฐ

    ๐Ÿ“… Stay updated

    Subscribe to our newsletter for more insights on AWS cloud computing and containers.