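Summary
- In Grafana, go to Create → Import and paste the JSON below.
- Pick a namespace from the namespace variable at the top of the dashboard.
- If your Prometheus datasource has a different name, change the datasource fields in the JSON before importing.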
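The datasource swap from the last point can also be scripted instead of edited by hand. This is only a sketch: the helper name set_datasource and the datasource name Prometheus-prod are made up for illustration, and it assumes every datasource field is a plain string, as in the JSON in this post (newer Grafana exports may use an object with type and uid instead).

```python
import json


def set_datasource(dashboard, name):
    """Return a copy of the dashboard with every "datasource" value replaced."""
    def walk(node):
        if isinstance(node, dict):
            return {key: (name if key == "datasource" else walk(value))
                    for key, value in node.items()}
        if isinstance(node, list):
            return [walk(item) for item in node]
        return node
    return walk(dashboard)


# Trimmed-down stand-in for the dashboard JSON in this post.
raw = """
{
  "panels": [
    {"datasource": "Prometheus", "title": "Node CPU", "targets": []},
    {"datasource": "Prometheus", "title": "Node Memory", "targets": []}
  ],
  "templating": {"list": [{"name": "namespace", "datasource": "Prometheus"}]}
}
"""

# "Prometheus-prod" is a hypothetical datasource name.
patched = set_datasource(json.loads(raw), "Prometheus-prod")
print(json.dumps(patched, indent=2))
```

Paste the printed JSON into the import dialog in place of the original.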
Grafana dashboard
{
"annotations": { "list": [] },
"editable": true,
"gnetId": null,
"graphTooltip": 0,
"id": null,
"links": [],
"panels": [
{
"datasource": "Prometheus",
"fieldConfig": { "defaults": {}, "overrides": [] },
"gridPos": { "h": 8, "w": 12, "x": 0, "y": 0 },
"id": 1,
"title": "Node CPU — usage / request / limit (cores)",
"type": "timeseries",
"targets": [
{
"expr": "sum by (node) (rate(container_cpu_usage_seconds_total{namespace=~\"$namespace\",container!~\"POD|\",image!=\"\"}[5m]))",
"legendFormat": "{{node}} - usage",
"refId": "A"
},
{
"expr": "sum by (node) (kube_pod_container_resource_requests_cpu_cores{namespace=~\"$namespace\"})",
"legendFormat": "{{node}} - request",
"refId": "B"
},
{
"expr": "sum by (node) (kube_pod_container_resource_limits_cpu_cores{namespace=~\"$namespace\"})",
"legendFormat": "{{node}} - limit",
"refId": "C"
}
],
"options": { "legend": { "displayMode": "list" } }
},
{
"datasource": "Prometheus",
"fieldConfig": { "defaults": {}, "overrides": [] },
"gridPos": { "h": 8, "w": 12, "x": 12, "y": 0 },
"id": 2,
"title": "Node Memory — usage / request / limit (bytes)",
"type": "timeseries",
"targets": [
{
"expr": "sum by (node) (container_memory_working_set_bytes{namespace=~\"$namespace\",container!~\"POD|\"})",
"legendFormat": "{{node}} - usage",
"refId": "A"
},
{
"expr": "sum by (node) (kube_pod_container_resource_requests_memory_bytes{namespace=~\"$namespace\"})",
"legendFormat": "{{node}} - request",
"refId": "B"
},
{
"expr": "sum by (node) (kube_pod_container_resource_limits_memory_bytes{namespace=~\"$namespace\"})",
"legendFormat": "{{node}} - limit",
"refId": "C"
}
],
"options": { "legend": { "displayMode": "list" } }
},
{
"datasource": "Prometheus",
"fieldConfig": { "defaults": {}, "overrides": [] },
"gridPos": { "h": 4, "w": 24, "x": 0, "y": 8 },
"id": 3,
"title": "Pod Status (counts) in $namespace",
"type": "stat",
"targets": [
{
"expr": "sum by (phase) (kube_pod_status_phase{namespace=~\"$namespace\"})",
"refId": "A"
}
],
"options": {
"orientation": "auto",
"reduceOptions": { "calcs": ["lastNotNull"], "fields": "", "values": false }
}
}
],
"schemaVersion": 36,
"style": "dark",
"tags": ["k8s","namespace"],
"templating": {
"list": [
{
"name": "namespace",
"type": "query",
"label": "Namespace",
"query": "label_values(kube_pod_info, namespace)",
"datasource": "Prometheus",
"multi": false,
"includeAll": false,
"refresh": 1,
"hide": 0
}
]
},
"time": { "from": "now-1h", "to": "now" },
"timepicker": {},
"title": "Namespace → Node & Pod Overview",
"uid": null,
"version": 1
}
Extra: per-pod detail table
Add a Table panel in Grafana, create one query per item below (A, B, C, ...), and merge the results under Transformations with Join by field (pod) or an outer join. Each query comes back as its own result, so the table only reads cleanly once they are merged. Setting each query to Instant with Format: Table usually gives one row per pod.
- A (CPU usage per pod, cores):
sum by (pod, namespace) (rate(container_cpu_usage_seconds_total{namespace="$namespace",container!~"POD|",image!=""}[5m]))
- B (CPU requests per pod, cores):
sum by (pod, namespace) (kube_pod_container_resource_requests_cpu_cores{namespace="$namespace"})
- C (CPU limits per pod, cores):
sum by (pod, namespace) (kube_pod_container_resource_limits_cpu_cores{namespace="$namespace"})
- D (Mem usage per pod, bytes):
sum by (pod, namespace) (container_memory_working_set_bytes{namespace="$namespace",container!~"POD|"})
- E (Mem requests per pod, bytes):
sum by (pod, namespace) (kube_pod_container_resource_requests_memory_bytes{namespace="$namespace"})
- F (Mem limits per pod, bytes):
sum by (pod, namespace) (kube_pod_container_resource_limits_memory_bytes{namespace="$namespace"})
Transform:
- Outer join with A as the left side, joining on pod and namespace, then merge B, C, D, E, and F in that order.
- Show only the columns you need (pod, cpu_usage, cpu_request, cpu_limit, mem_usage, mem_request, mem_limit).
- Units: for CPU, use short or a custom "cores" unit; for memory, use bytes.
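To make the join step concrete, here is a small sketch of what an outer join on (pod, namespace) does to the separate query results. It is not how Grafana implements the transform; the function name outer_join and all sample values are invented for illustration.

```python
from collections import defaultdict


def outer_join(results):
    """Merge {column: {(pod, namespace): value}} into one row per (pod, namespace)."""
    rows = defaultdict(dict)
    for column, series in results.items():
        for key, value in series.items():
            rows[key][column] = value

    table = []
    for pod, namespace in sorted(rows):
        row = {"pod": pod, "namespace": namespace}
        for column in results:
            # None marks a pod for which that query returned no series.
            row[column] = rows[(pod, namespace)].get(column)
        table.append(row)
    return table


# Made-up sample values, only to show the shape of the result.
results = {
    "cpu_usage": {("api-0", "demo"): 0.12, ("worker-0", "demo"): 0.40},
    "cpu_request": {("api-0", "demo"): 0.25},
    "mem_usage": {("api-0", "demo"): 150_000_000, ("worker-0", "demo"): 90_000_000},
}

table = outer_join(results)
for row in table:
    print(row)
```

A pod without a request or limit still gets a row, just with an empty cell, which is why an outer join suits this table better than an inner join.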
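Tips
- Some metric names and labels differ with the cluster setup (for example, the cAdvisor and kube-state-metrics versions). The node label may be instance instead.
- CPU usage is the summed rate of container_cpu_usage_seconds_total. If you need more precision, switch between irate and rate or adjust the rate window.
- If the per-node aggregation does not come out as expected, check whether kube_pod_container_resource_* carries a node label, and change sum by (node) to, for example, sum by (node, namespace).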
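One way to cope with version-dependent metric names is to build the expressions from a small helper. The sketch below assumes that kube-state-metrics v2 exposes requests and limits as kube_pod_container_resource_requests / kube_pod_container_resource_limits with a resource label, while older releases use the *_cpu_cores and *_memory_bytes names from this post. Check your own /metrics output before relying on it; the function names are hypothetical.

```python
def resource_selector(kind, resource, namespace_var="$namespace", ksm_v2=True):
    """Build the selector for pod requests or limits.

    kind:     "requests" or "limits"
    resource: "cpu" or "memory"
    ksm_v2:   assumed naming scheme of kube-state-metrics v2 (resource label)
    """
    if ksm_v2:
        return (f'kube_pod_container_resource_{kind}'
                f'{{namespace=~"{namespace_var}",resource="{resource}"}}')
    legacy_suffix = {"cpu": "cpu_cores", "memory": "memory_bytes"}[resource]
    return (f'kube_pod_container_resource_{kind}_{legacy_suffix}'
            f'{{namespace=~"{namespace_var}"}}')


def aggregate(expr, by=("node",)):
    """Wrap an expression in sum by (...), e.g. by node or by node and namespace."""
    return f"sum by ({', '.join(by)}) ({expr})"


new_style = aggregate(resource_selector("requests", "cpu"))
old_style = aggregate(resource_selector("limits", "memory", ksm_v2=False),
                      by=("node", "namespace"))
print(new_style)
print(old_style)
```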
Select a namespace → list its nodes → show per-node CPU/MEM (limit, request, usage) and pod status
1. Prometheus metrics
- Node CPU/Memory
- kube_node_status_allocatable_cpu_cores
- kube_node_status_allocatable_memory_bytes
- kube_pod_container_resource_requests_cpu_cores
- kube_pod_container_resource_limits_cpu_cores
- kube_pod_container_resource_requests_memory_bytes
- kube_pod_container_resource_limits_memory_bytes
- node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate
- container_memory_usage_bytes
- Pod status
- kube_pod_status_phase{phase="Running"}
- kube_pod_status_phase{phase="Pending"}
- kube_pod_status_phase{phase="Failed"}
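A note on aggregating the pod status metric: as far as I know, kube-state-metrics emits kube_pod_status_phase as one series per pod and phase, with value 1 for the pod's current phase and 0 for the others. Under that assumption, count() returns the number of pods for every phase, while sum() returns the pods actually in that phase. The sketch below mimics this with made-up pods; the phase list and the helper names are illustrative.

```python
from collections import Counter

# Phases assumed to be reported for every pod.
PHASES = ("Pending", "Running", "Succeeded", "Failed", "Unknown")


def phase_samples(pods):
    """One (labels, value) sample per pod and phase: 1 for the current phase, else 0."""
    return [({"pod": pod, "phase": phase}, 1 if phase == current else 0)
            for pod, current in pods.items()
            for phase in PHASES]


def count_by_phase(samples):
    """Like count by (phase): counts series and ignores their values."""
    totals = Counter()
    for labels, _value in samples:
        totals[labels["phase"]] += 1
    return dict(totals)


def sum_by_phase(samples):
    """Like sum by (phase): adds up the 0/1 values."""
    totals = Counter()
    for labels, value in samples:
        totals[labels["phase"]] += value
    return dict(totals)


pods = {"api-0": "Running", "api-1": "Running", "job-x": "Pending"}  # made-up pods
samples = phase_samples(pods)
counted = count_by_phase(samples)
summed = sum_by_phase(samples)
print("count:", counted)
print("sum:  ", summed)
```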
2. Grafana Dashboard JSON example
(If needed: Grafana → Import → paste the JSON)
{
"title": "Namespace → Node → Pod resource overview",
"timezone": "browser",
"templating": {
"list": [
{
"type": "query",
"query": "label_values(kube_pod_info, namespace)",
"label": "Namespace",
"name": "namespace",
"datasource": "Prometheus",
"refresh": 2
}
]
},
"panels": [
{
"type": "table",
"title": "Per-node CPU/MEM Requests & Limits",
"datasource": "Prometheus",
"targets": [
{
"expr": "sum(kube_pod_container_resource_requests_cpu_cores{namespace=\"$namespace\"}) by (node)",
"legendFormat": "CPU Request"
},
{
"expr": "sum(kube_pod_container_resource_limits_cpu_cores{namespace=\"$namespace\"}) by (node)",
"legendFormat": "CPU Limit"
},
{
"expr": "sum(kube_pod_container_resource_requests_memory_bytes{namespace=\"$namespace\"}) by (node)",
"legendFormat": "MEM Request"
},
{
"expr": "sum(kube_pod_container_resource_limits_memory_bytes{namespace=\"$namespace\"}) by (node)",
"legendFormat": "MEM Limit"
},
{
"expr": "sum(rate(container_cpu_usage_seconds_total{namespace=\"$namespace\", image!=\"\"}[5m])) by (node)",
"legendFormat": "CPU Usage"
},
{
"expr": "sum(container_memory_usage_bytes{namespace=\"$namespace\", image!=\"\"}) by (node)",
"legendFormat": "MEM Usage"
}
],
"options": {
"showHeader": true
}
},
{
"type": "bargauge",
"title": "Pod status",
"datasource": "Prometheus",
"targets": [
{
"expr": "sum(kube_pod_status_phase{namespace=\"$namespace\",phase=\"Running\"})",
"legendFormat": "Running"
},
{
"expr": "sum(kube_pod_status_phase{namespace=\"$namespace\",phase=\"Pending\"})",
"legendFormat": "Pending"
},
{
"expr": "sum(kube_pod_status_phase{namespace=\"$namespace\",phase=\"Failed\"})",
"legendFormat": "Failed"
}
],
"options": {
"orientation": "horizontal",
"displayMode": "gradient",
"reduceOptions": {
"calcs": ["last"]
}
}
}
],
"schemaVersion": 39,
"version": 1
}
3. Configuration summary
- Template variable (namespace selector) → label_values(kube_pod_info, namespace)
- Per-node CPU/MEM panel → compares requests, limits, and usage
- Pod status panel → Running, Pending, Failed
Panel layout
- Namespace selector (template variable)
- Node list table: per-node CPU/MEM requests, limits, and usage
- Per-node CPU usage graph: in cores
- Per-node memory usage graph: in bytes
- Pod status summary (Running / Pending / Failed): bar gauge
- Pod detail table: CPU/MEM usage by pod name
Grafana Dashboard JSON
{
"title": "Namespace → Node → Pod resource overview (full version)",
"timezone": "browser",
"schemaVersion": 39,
"version": 1,
"templating": {
"list": [
{
"name": "namespace",
"label": "Namespace",
"type": "query",
"datasource": "Prometheus",
"query": "label_values(kube_pod_info, namespace)",
"refresh": 2,
"includeAll": false,
"multi": false,
"sort": 1
}
]
},
"panels": [
{
"type": "table",
"title": "Per-node CPU/MEM Requests & Limits & Usage",
"datasource": "Prometheus",
"targets": [
{
"expr": "sum(kube_pod_container_resource_requests_cpu_cores{namespace=\"$namespace\"}) by (node)",
"legendFormat": "CPU Request"
},
{
"expr": "sum(kube_pod_container_resource_limits_cpu_cores{namespace=\"$namespace\"}) by (node)",
"legendFormat": "CPU Limit"
},
{
"expr": "sum(rate(container_cpu_usage_seconds_total{namespace=\"$namespace\", image!=\"\"}[5m])) by (node)",
"legendFormat": "CPU Usage"
},
{
"expr": "sum(kube_pod_container_resource_requests_memory_bytes{namespace=\"$namespace\"}) by (node)",
"legendFormat": "MEM Request"
},
{
"expr": "sum(kube_pod_container_resource_limits_memory_bytes{namespace=\"$namespace\"}) by (node)",
"legendFormat": "MEM Limit"
},
{
"expr": "sum(container_memory_usage_bytes{namespace=\"$namespace\", image!=\"\"}) by (node)",
"legendFormat": "MEM Usage"
}
],
"options": {
"showHeader": true
},
"gridPos": { "x": 0, "y": 0, "w": 24, "h": 8 }
},
{
"type": "timeseries",
"title": "Per-node CPU Usage (cores)",
"datasource": "Prometheus",
"targets": [
{
"expr": "sum(rate(container_cpu_usage_seconds_total{namespace=\"$namespace\", image!=\"\"}[5m])) by (node)",
"legendFormat": "{{node}}"
}
],
"fieldConfig": { "defaults": { "unit": "cores" } },
"gridPos": { "x": 0, "y": 8, "w": 12, "h": 8 }
},
{
"type": "timeseries",
"title": "Per-node Memory Usage (bytes)",
"datasource": "Prometheus",
"targets": [
{
"expr": "sum(container_memory_usage_bytes{namespace=\"$namespace\", image!=\"\"}) by (node)",
"legendFormat": "{{node}}"
}
],
"fieldConfig": { "defaults": { "unit": "bytes" } },
"gridPos": { "x": 12, "y": 8, "w": 12, "h": 8 }
},
{
"type": "bargauge",
"title": "Pod status summary",
"datasource": "Prometheus",
"targets": [
{
"expr": "sum(kube_pod_status_phase{namespace=\"$namespace\",phase=\"Running\"})",
"legendFormat": "Running"
},
{
"expr": "sum(kube_pod_status_phase{namespace=\"$namespace\",phase=\"Pending\"})",
"legendFormat": "Pending"
},
{
"expr": "sum(kube_pod_status_phase{namespace=\"$namespace\",phase=\"Failed\"})",
"legendFormat": "Failed"
}
],
"options": {
"orientation": "horizontal",
"displayMode": "gradient",
"reduceOptions": { "calcs": ["last"] }
},
"gridPos": { "x": 0, "y": 16, "w": 8, "h": 8 }
},
{
"type": "table",
"title": "Per-pod CPU/MEM Usage",
"datasource": "Prometheus",
"targets": [
{
"expr": "sum(rate(container_cpu_usage_seconds_total{namespace=\"$namespace\", image!=\"\"}[5m])) by (pod)",
"legendFormat": "CPU Usage"
},
{
"expr": "sum(container_memory_usage_bytes{namespace=\"$namespace\", image!=\"\"}) by (pod)",
"legendFormat": "MEM Usage"
}
],
"options": {
"showHeader": true
},
"gridPos": { "x": 8, "y": 16, "w": 16, "h": 8 }
}
]
}
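The gridPos values decide where each panel lands on Grafana's 24-column grid. If you move panels around by editing the JSON, a quick check like the one below can catch panels that overlap or run past the right edge before you import. It is only a sketch: the helper name layout_problems is made up, the gridPos values are copied from the dashboard above, and the titles are shortened.

```python
def layout_problems(panels, columns=24):
    """Return messages for panels that leave the grid or overlap another panel."""
    problems = []
    for panel in panels:
        pos = panel["gridPos"]
        if pos["x"] < 0 or pos["x"] + pos["w"] > columns:
            problems.append(f'{panel["title"]}: outside the {columns}-column grid')

    for i, a in enumerate(panels):
        for b in panels[i + 1:]:
            pa, pb = a["gridPos"], b["gridPos"]
            overlap_x = pa["x"] < pb["x"] + pb["w"] and pb["x"] < pa["x"] + pa["w"]
            overlap_y = pa["y"] < pb["y"] + pb["h"] and pb["y"] < pa["y"] + pa["h"]
            if overlap_x and overlap_y:
                problems.append(f'{a["title"]} overlaps {b["title"]}')
    return problems


panels = [
    {"title": "node table", "gridPos": {"x": 0, "y": 0, "w": 24, "h": 8}},
    {"title": "node cpu", "gridPos": {"x": 0, "y": 8, "w": 12, "h": 8}},
    {"title": "node memory", "gridPos": {"x": 12, "y": 8, "w": 12, "h": 8}},
    {"title": "pod status", "gridPos": {"x": 0, "y": 16, "w": 8, "h": 8}},
    {"title": "pod table", "gridPos": {"x": 8, "y": 16, "w": 16, "h": 8}},
]

problems = layout_problems(panels)
print(problems or "layout looks consistent")
```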
