Grafana Dashboard

자바바라 2025. 9. 16. 21:40

Summary

  1. In Grafana, go to Create → Import and paste the JSON.
  2. Pick a namespace from the namespace variable.
  3. If your Prometheus datasource has a different name, change the datasource field in the JSON.
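
Step 3 can be scripted rather than edited by hand. A minimal sketch, assuming the datasource is stored as a plain name string (as in the JSON in this post); the function name set_datasource and the name "Prometheus-prod" are made up for illustration:

```python
import json

def set_datasource(dashboard: dict, new_name: str, old_name: str = "Prometheus") -> dict:
    """Return a copy of the dashboard with every matching datasource value replaced."""
    def walk(node):
        if isinstance(node, dict):
            return {
                key: new_name if key == "datasource" and value == old_name else walk(value)
                for key, value in node.items()
            }
        if isinstance(node, list):
            return [walk(item) for item in node]
        return node
    return walk(dashboard)

# Trimmed-down stand-in for the dashboard JSON.
sample = json.loads("""
{
  "panels": [{"datasource": "Prometheus", "targets": []}],
  "templating": {"list": [{"name": "namespace", "datasource": "Prometheus"}]}
}
""")

patched = set_datasource(sample, "Prometheus-prod")
print(json.dumps(patched, indent=2))
```

The walk covers both the panels and the template variable, since each carries its own datasource field.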

Grafana 대시보드

{
  "annotations": { "list": [] },
  "editable": true,
  "gnetId": null,
  "graphTooltip": 0,
  "id": null,
  "links": [],
  "panels": [
    {
      "datasource": "Prometheus",
      "fieldConfig": { "defaults": {}, "overrides": [] },
      "gridPos": { "h": 8, "w": 12, "x": 0, "y": 0 },
      "id": 1,
      "title": "Node CPU — usage / request / limit (cores)",
      "type": "timeseries",
      "targets": [
        {
          "expr": "sum by (node) (rate(container_cpu_usage_seconds_total{namespace=~\"$namespace\",container!~\"POD|\",image!=\"\"}[5m]))",
          "legendFormat": "{{node}} - usage",
          "refId": "A"
        },
        {
          "expr": "sum by (node) (kube_pod_container_resource_requests_cpu_cores{namespace=~\"$namespace\"})",
          "legendFormat": "{{node}} - request",
          "refId": "B"
        },
        {
          "expr": "sum by (node) (kube_pod_container_resource_limits_cpu_cores{namespace=~\"$namespace\"})",
          "legendFormat": "{{node}} - limit",
          "refId": "C"
        }
      ],
      "options": { "legend": { "displayMode": "list" } }
    },
    {
      "datasource": "Prometheus",
      "fieldConfig": { "defaults": {}, "overrides": [] },
      "gridPos": { "h": 8, "w": 12, "x": 12, "y": 0 },
      "id": 2,
      "title": "Node Memory — usage / request / limit (bytes)",
      "type": "timeseries",
      "targets": [
        {
          "expr": "sum by (node) (container_memory_working_set_bytes{namespace=~\"$namespace\",container!~\"POD|\"})",
          "legendFormat": "{{node}} - usage",
          "refId": "A"
        },
        {
          "expr": "sum by (node) (kube_pod_container_resource_requests_memory_bytes{namespace=~\"$namespace\"})",
          "legendFormat": "{{node}} - request",
          "refId": "B"
        },
        {
          "expr": "sum by (node) (kube_pod_container_resource_limits_memory_bytes{namespace=~\"$namespace\"})",
          "legendFormat": "{{node}} - limit",
          "refId": "C"
        }
      ],
      "options": { "legend": { "displayMode": "list" } }
    },
    {
      "datasource": "Prometheus",
      "fieldConfig": { "defaults": {}, "overrides": [] },
      "gridPos": { "h": 4, "w": 24, "x": 0, "y": 8 },
      "id": 3,
      "title": "Pod Status (counts) in $namespace",
      "type": "stat",
      "targets": [
        {
          "expr": "sum by (phase) (kube_pod_status_phase{namespace=~\"$namespace\"})",
          "refId": "A"
        }
      ],
      "options": {
        "orientation": "auto",
        "reduceOptions": { "calcs": ["lastNotNull"], "fields": "", "values": false }
      }
    }
  ],
  "schemaVersion": 36,
  "style": "dark",
  "tags": ["k8s","namespace"],
  "templating": {
    "list": [
      {
        "name": "namespace",
        "type": "query",
        "label": "Namespace",
        "query": "label_values(kube_pod_info, namespace)",
        "datasource": "Prometheus",
        "multi": false,
        "includeAll": false,
        "refresh": 1,
        "hide": 0
      }
    ]
  },
  "time": { "from": "now-1h", "to": "now" },
  "timepicker": {},
  "title": "Namespace → Node & Pod Overview",
  "uid": null,
  "version": 1
}
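
Instead of pasting into the UI, the same JSON could be wrapped for Grafana's dashboard HTTP API. A rough sketch that only builds the request body; it assumes the POST /api/dashboards/db endpoint with a "dashboard" and "overwrite" body, and build_import_payload is a made-up helper name. Sending the request (URL, token) is left out on purpose:

```python
import json

def build_import_payload(dashboard: dict, overwrite: bool = False) -> dict:
    """Wrap a dashboard dict in the body shape assumed for POST /api/dashboards/db."""
    body = dict(dashboard)  # shallow copy so the caller's dict keeps its own id
    body["id"] = None       # a null id means "create", not "update by numeric id"
    return {"dashboard": body, "overwrite": overwrite}

dashboard = {"id": 12, "uid": None, "title": "Namespace → Node & Pod Overview", "panels": []}
payload = build_import_payload(dashboard, overwrite=True)
print(json.dumps(payload, ensure_ascii=False, indent=2))
```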

Addendum: per-pod detail table

Add a Table panel in Grafana, create the queries below as separate targets (A, B, C, ...), then merge them with Transformations → Join by field (pod) using an outer join. Merging the query results with a transform is what keeps the table clean. The variable is named namespace in the dashboard above, so the queries use $namespace (variable names are case-sensitive).

  • A (CPU usage per pod, cores):
sum by (pod, namespace) (rate(container_cpu_usage_seconds_total{namespace="$namespace",container!~"POD|",image!=""}[5m]))
  • B (CPU requests per pod, cores):
sum by (pod, namespace) (kube_pod_container_resource_requests_cpu_cores{namespace="$namespace"})
  • C (CPU limits per pod, cores):
sum by (pod, namespace) (kube_pod_container_resource_limits_cpu_cores{namespace="$namespace"})
  • D (Mem usage per pod, bytes):
sum by (pod, namespace) (container_memory_working_set_bytes{namespace="$namespace",container!~"POD|"})
  • E (Mem requests per pod, bytes):
sum by (pod, namespace) (kube_pod_container_resource_requests_memory_bytes{namespace="$namespace"})
  • F (Mem limits per pod, bytes):
sum by (pod, namespace) (kube_pod_container_resource_limits_memory_bytes{namespace="$namespace"})

Transform:

  1. Outer join with A as the left side, joining on pod and namespace, then merge B/C/D/E/F in that order.
  2. Show only the columns you need (pod, cpu_usage, cpu_request, cpu_limit, mem_usage, mem_request, mem_limit).
  3. Set units: CPU → short (read as cores), Memory → bytes.
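
What the outer join does to the query results can be sketched outside Grafana. This is only an illustration of the idea, not Grafana's implementation; the sample values and the outer_join helper are invented:

```python
def outer_join(results: dict, keys=("pod", "namespace")) -> list:
    """Merge per-query samples into one row per key; missing cells become None."""
    rows = {}
    for column, series in results.items():
        for sample in series:
            key = tuple(sample[k] for k in keys)
            rows.setdefault(key, {})[column] = sample["value"]

    table = []
    for key in sorted(rows):
        row = dict(zip(keys, key))
        for column in results:
            row[column] = rows[key].get(column)  # None where a query had no sample
        table.append(row)
    return table

# Made-up results for queries A and B; web-2 has no CPU request set.
results = {
    "cpu_usage": [
        {"pod": "web-1", "namespace": "shop", "value": 0.12},
        {"pod": "web-2", "namespace": "shop", "value": 0.03},
    ],
    "cpu_request": [
        {"pod": "web-1", "namespace": "shop", "value": 0.25},
    ],
}

table = outer_join(results)
for row in table:
    print(row)
```

An inner join would drop web-2 entirely, which is why the outer join matters for pods without requests or limits.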

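  • Some metric names and labels differ depending on cluster setup (for example, the cAdvisor and kube-state-metrics versions). The node label may be instance instead.
  • CPU usage is the summed rate of container_cpu_usage_seconds_total. If you need more precision, switch to irate or adjust the rate window.
  • If node aggregation does not come out as expected, check whether kube_pod_container_resource_* carries a node label, and change sum by (node) to sum by (node,namespace) or similar.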
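
To cope with the naming differences noted above, the request/limit queries could be generated rather than hard-coded. A sketch under one assumption: newer kube-state-metrics (v2) exposes kube_pod_container_resource_requests / _limits with a resource label instead of the _cpu_cores / _memory_bytes names used in this post. resource_query is a made-up helper:

```python
def resource_query(kind: str, resource: str, by=("node",), legacy: bool = True) -> str:
    """Build a PromQL sum for kind 'requests' or 'limits' and resource 'cpu' or 'memory'."""
    selector = 'namespace=~"$namespace"'
    if legacy:
        # Older metric names, as used in the dashboard JSON in this post.
        suffix = {"cpu": "cpu_cores", "memory": "memory_bytes"}[resource]
        metric = f"kube_pod_container_resource_{kind}_{suffix}"
    else:
        # Assumed newer form: one metric per kind, filtered by a resource label.
        metric = f"kube_pod_container_resource_{kind}"
        selector += f',resource="{resource}"'
    return f"sum by ({', '.join(by)}) ({metric}{{{selector}}})"

old_style = resource_query("requests", "cpu")
new_style = resource_query("limits", "memory", by=("node", "namespace"), legacy=False)
print(old_style)
print(new_style)
```

Check which form your Prometheus actually has (for example in the metrics explorer) before switching the flag.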

 

 

 


Select a namespace → list its nodes → show per-node CPU/MEM (limit, request, usage) plus pod status

1. Prometheus metrics

  • Node CPU/Memory
    • kube_node_status_allocatable_cpu_cores
    • kube_node_status_allocatable_memory_bytes
    • kube_pod_container_resource_requests_cpu_cores
    • kube_pod_container_resource_limits_cpu_cores
    • kube_pod_container_resource_requests_memory_bytes
    • kube_pod_container_resource_limits_memory_bytes
    • node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate
    • container_memory_usage_bytes
  • Pod status
    • kube_pod_status_phase{phase="Running"}
    • kube_pod_status_phase{phase="Pending"}
    • kube_pod_status_phase{phase="Failed"}
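
One thing to watch with kube_pod_status_phase: as far as I know, kube-state-metrics emits a series for every pod and every phase, with value 1 for the current phase and 0 for the rest. Counting series therefore gives the same number for every phase, while summing gives the real per-phase count. A small simulation with made-up samples:

```python
from collections import defaultdict

# Hypothetical samples: one series per pod and phase, value 1 only for the current phase.
samples = [
    {"pod": "web-1", "phase": "Running", "value": 1},
    {"pod": "web-1", "phase": "Pending", "value": 0},
    {"pod": "web-1", "phase": "Failed", "value": 0},
    {"pod": "web-2", "phase": "Running", "value": 0},
    {"pod": "web-2", "phase": "Pending", "value": 1},
    {"pod": "web-2", "phase": "Failed", "value": 0},
]

def aggregate(samples, op):
    """Group sample values by phase and reduce each group with op (len ~ count, sum ~ sum)."""
    groups = defaultdict(list)
    for sample in samples:
        groups[sample["phase"]].append(sample["value"])
    return {phase: op(values) for phase, values in groups.items()}

count_by_phase = aggregate(samples, len)  # mimics count by (phase)
sum_by_phase = aggregate(samples, sum)    # mimics sum by (phase)
print("count:", count_by_phase)
print("sum:  ", sum_by_phase)
```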

2. Grafana dashboard JSON example

(If needed: Grafana → Import → paste the JSON)

{
  "title": "Namespace → Node → Pod Resource Overview",
  "timezone": "browser",
  "templating": {
    "list": [
      {
        "type": "query",
        "name": "namespace",
        "label": "Namespace",
        "datasource": "Prometheus",
        "query": "label_values(kube_pod_info, namespace)",
        "refresh": 2
      }
    ]
  },
  "panels": [
    {
      "type": "table",
      "title": "CPU/MEM Requests & Limits per Node",
      "datasource": "Prometheus",
      "targets": [
        {
          "expr": "sum(kube_pod_container_resource_requests_cpu_cores{namespace=\"$namespace\"}) by (node)",
          "legendFormat": "CPU Request"
        },
        {
          "expr": "sum(kube_pod_container_resource_limits_cpu_cores{namespace=\"$namespace\"}) by (node)",
          "legendFormat": "CPU Limit"
        },
        {
          "expr": "sum(kube_pod_container_resource_requests_memory_bytes{namespace=\"$namespace\"}) by (node)",
          "legendFormat": "MEM Request"
        },
        {
          "expr": "sum(kube_pod_container_resource_limits_memory_bytes{namespace=\"$namespace\"}) by (node)",
          "legendFormat": "MEM Limit"
        },
        {
          "expr": "sum(rate(container_cpu_usage_seconds_total{namespace=\"$namespace\", image!=\"\"}[5m])) by (node)",
          "legendFormat": "CPU Usage"
        },
        {
          "expr": "sum(container_memory_usage_bytes{namespace=\"$namespace\", image!=\"\"}) by (node)",
          "legendFormat": "MEM Usage"
        }
      ],
      "options": {
        "showHeader": true
      }
    },
    {
      "type": "bargauge",
      "title": "Pod Status",
      "datasource": "Prometheus",
      "targets": [
        {
          "expr": "sum(kube_pod_status_phase{namespace=\"$namespace\",phase=\"Running\"})",
          "legendFormat": "Running"
        },
        {
          "expr": "sum(kube_pod_status_phase{namespace=\"$namespace\",phase=\"Pending\"})",
          "legendFormat": "Pending"
        },
        {
          "expr": "sum(kube_pod_status_phase{namespace=\"$namespace\",phase=\"Failed\"})",
          "legendFormat": "Failed"
        }
      ],
      "options": {
        "orientation": "horizontal",
        "displayMode": "gradient",
        "reduceOptions": {
          "calcs": ["last"]
        }
      }
    }
  ],
  "schemaVersion": 39,
  "version": 1
}
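
The targets in this example carry no refId, and table panels usually read better with instant, table-formatted queries. A possible pre-import cleanup, assuming the Prometheus target fields are named refId, format, and instant as in Grafana's exported JSON; prepare_targets is a made-up helper and the sample dashboard is a stand-in:

```python
import string

def prepare_targets(dashboard: dict) -> dict:
    """Fill in missing refIds (A, B, C, ...) and table settings; edits the dict in place."""
    for panel in dashboard.get("panels", []):
        for index, target in enumerate(panel.get("targets", [])):
            target.setdefault("refId", string.ascii_uppercase[index])
            if panel.get("type") == "table":
                target.setdefault("format", "table")  # one row per series
                target.setdefault("instant", True)    # latest value only
    return dashboard

dashboard = {
    "panels": [
        {"type": "table", "targets": [{"expr": "up"}, {"expr": "up == 0"}]},
        {"type": "bargauge", "targets": [{"expr": "up", "refId": "X"}]},
    ]
}

prepare_targets(dashboard)
for panel in dashboard["panels"]:
    print(panel["type"], panel["targets"])
```

Existing values are left alone, so a hand-picked refId survives. The letter lookup only covers up to 26 targets per panel.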

 

3. Configuration summary

  1. Template variable (namespace selector) → label_values(kube_pod_info, namespace)
  2. Per-node CPU/MEM panel → compares Requests, Limits, Usage
  3. Pod status panel → Running, Pending, Failed

 

🔹 Panel layout

  1. Namespace selector (template variable)
  2. Node list table – CPU/MEM Requests, Limits, Usage per node
  3. CPU usage graph per node – in cores
  4. Memory usage graph per node – in bytes
  5. Pod status summary (Running / Pending / Failed) – BarGauge
  6. Pod detail table – CPU/MEM usage by pod name

Grafana Dashboard JSON

{
  "title": "Namespace → Node → Pod Resource Overview (full version)",
  "timezone": "browser",
  "schemaVersion": 39,
  "version": 1,
  "templating": {
    "list": [
      {
        "name": "namespace",
        "label": "Namespace",
        "type": "query",
        "datasource": "Prometheus",
        "query": "label_values(kube_pod_info, namespace)",
        "refresh": 2,
        "includeAll": false,
        "multi": false,
        "sort": 1
      }
    ]
  },
  "panels": [
    {
      "type": "table",
      "title": "CPU/MEM Requests, Limits & Usage per Node",
      "datasource": "Prometheus",
      "targets": [
        {
          "expr": "sum(kube_pod_container_resource_requests_cpu_cores{namespace=\"$namespace\"}) by (node)",
          "legendFormat": "CPU Request"
        },
        {
          "expr": "sum(kube_pod_container_resource_limits_cpu_cores{namespace=\"$namespace\"}) by (node)",
          "legendFormat": "CPU Limit"
        },
        {
          "expr": "sum(rate(container_cpu_usage_seconds_total{namespace=\"$namespace\", image!=\"\"}[5m])) by (node)",
          "legendFormat": "CPU Usage"
        },
        {
          "expr": "sum(kube_pod_container_resource_requests_memory_bytes{namespace=\"$namespace\"}) by (node)",
          "legendFormat": "MEM Request"
        },
        {
          "expr": "sum(kube_pod_container_resource_limits_memory_bytes{namespace=\"$namespace\"}) by (node)",
          "legendFormat": "MEM Limit"
        },
        {
          "expr": "sum(container_memory_usage_bytes{namespace=\"$namespace\", image!=\"\"}) by (node)",
          "legendFormat": "MEM Usage"
        }
      ],
      "options": {
        "showHeader": true
      },
      "gridPos": { "x": 0, "y": 0, "w": 24, "h": 8 }
    },
    {
      "type": "timeseries",
      "title": "CPU Usage per Node (cores)",
      "datasource": "Prometheus",
      "targets": [
        {
          "expr": "sum(rate(container_cpu_usage_seconds_total{namespace=\"$namespace\", image!=\"\"}[5m])) by (node)",
          "legendFormat": "{{node}}"
        }
      ],
      "fieldConfig": { "defaults": { "unit": "cores" } },
      "gridPos": { "x": 0, "y": 8, "w": 12, "h": 8 }
    },
    {
      "type": "timeseries",
      "title": "Memory Usage per Node (bytes)",
      "datasource": "Prometheus",
      "targets": [
        {
          "expr": "sum(container_memory_usage_bytes{namespace=\"$namespace\", image!=\"\"}) by (node)",
          "legendFormat": "{{node}}"
        }
      ],
      "fieldConfig": { "defaults": { "unit": "bytes" } },
      "gridPos": { "x": 12, "y": 8, "w": 12, "h": 8 }
    },
    {
      "type": "bargauge",
      "title": "Pod Status Summary",
      "datasource": "Prometheus",
      "targets": [
        {
          "expr": "sum(kube_pod_status_phase{namespace=\"$namespace\",phase=\"Running\"})",
          "legendFormat": "Running"
        },
        {
          "expr": "sum(kube_pod_status_phase{namespace=\"$namespace\",phase=\"Pending\"})",
          "legendFormat": "Pending"
        },
        {
          "expr": "sum(kube_pod_status_phase{namespace=\"$namespace\",phase=\"Failed\"})",
          "legendFormat": "Failed"
        }
      ],
      "options": {
        "orientation": "horizontal",
        "displayMode": "gradient",
        "reduceOptions": { "calcs": ["last"] }
      },
      "gridPos": { "x": 0, "y": 16, "w": 8, "h": 8 }
    },
    {
      "type": "table",
      "title": "CPU/MEM Usage per Pod",
      "datasource": "Prometheus",
      "targets": [
        {
          "expr": "sum(rate(container_cpu_usage_seconds_total{namespace=\"$namespace\", image!=\"\"}[5m])) by (pod)",
          "legendFormat": "CPU Usage"
        },
        {
          "expr": "sum(container_memory_usage_bytes{namespace=\"$namespace\", image!=\"\"}) by (pod)",
          "legendFormat": "MEM Usage"
        }
      ],
      "options": {
        "showHeader": true
      },
      "gridPos": { "x": 8, "y": 16, "w": 16, "h": 8 }
    }
  ]
}
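
The gridPos values above sit on Grafana's 24-column grid. When moving panels around by hand, a quick check for overflow and overlap can help. A sketch with the five gridPos entries copied from the JSON; the short titles and the check_layout helper are made up:

```python
def overlaps(a: dict, b: dict) -> bool:
    """True when two gridPos rectangles share at least one grid cell."""
    return (a["x"] < b["x"] + b["w"] and b["x"] < a["x"] + a["w"]
            and a["y"] < b["y"] + b["h"] and b["y"] < a["y"] + a["h"])

def check_layout(panels: list, columns: int = 24) -> list:
    """Return a list of human-readable layout problems (empty when the layout is fine)."""
    problems = []
    for i, panel in enumerate(panels):
        pos = panel["gridPos"]
        if pos["x"] + pos["w"] > columns:
            problems.append(f"{panel['title']} extends past column {columns}")
        for other in panels[:i]:
            if overlaps(pos, other["gridPos"]):
                problems.append(f"{panel['title']} overlaps {other['title']}")
    return problems

panels = [
    {"title": "node table", "gridPos": {"x": 0, "y": 0, "w": 24, "h": 8}},
    {"title": "cpu graph", "gridPos": {"x": 0, "y": 8, "w": 12, "h": 8}},
    {"title": "memory graph", "gridPos": {"x": 12, "y": 8, "w": 12, "h": 8}},
    {"title": "pod status", "gridPos": {"x": 0, "y": 16, "w": 8, "h": 8}},
    {"title": "pod table", "gridPos": {"x": 8, "y": 16, "w": 16, "h": 8}},
]

problems = check_layout(panels)
print(problems or "layout ok")
```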

 

 

 
