An example you can run locally with docker-compose up -d, or deploy to Kubernetes with Helm.
Prerequisites
- Docker and Docker Compose (v2) installed
- A Kubernetes cluster (minikube / kind / k3s, etc.)
- kubectl and helm installed
Project structure
llmops-lab/
├─ docker-compose.yml
├─ fastapi/
│ ├─ Dockerfile
│ ├─ app/
│ │ ├─ main.py
│ │ ├─ metrics.py
│ │ └─ requirements.txt
├─ dags/
│ ├─ news_pipeline.py
│ ├─ train_model.py
│ └─ llm_pipeline.py
├─ scripts/
│ └─ finetune.py
├─ prometheus/
│ └─ prometheus.yml
├─ grafana/
│ └─ dashboards/llm_api_dashboard.json
└─ helm-chart/
├─ Chart.yaml
├─ values.yaml
└─ templates/
├─ fastapi-deployment.yaml
├─ fastapi-service.yaml
├─ mlflow-deployment.yaml
├─ mlflow-service.yaml
└─ dags-configmap.yaml
1) docker-compose.yml
version: '3.8'
services:
  postgres:
    image: postgres:14
    environment:
      POSTGRES_USER: mlflow
      POSTGRES_PASSWORD: mlflow
      POSTGRES_DB: mlflow_db
    volumes:
      - pgdata:/var/lib/postgresql/data
  mlflow:
    # the official image is ghcr.io/mlflow/mlflow; it may need psycopg2-binary
    # added for a Postgres backend store
    image: ghcr.io/mlflow/mlflow:v2.7.2
    # the image does not read BACKEND_STORE_URI-style env vars on its own,
    # so start the server explicitly
    command: >
      mlflow server
      --backend-store-uri postgresql://mlflow:mlflow@postgres:5432/mlflow_db
      --default-artifact-root /mlflow/artifacts
      --host 0.0.0.0 --port 5000
    ports:
      - 5000:5000
    volumes:
      - mlflow_artifacts:/mlflow/artifacts
    depends_on:
      - postgres
  airflow:
    image: apache/airflow:2.8.2
    environment:
      # LocalExecutor requires a Postgres/MySQL metadata DB; without one,
      # the default SequentialExecutor is enough for this demo
      AIRFLOW__CORE__EXECUTOR: LocalExecutor
      AIRFLOW__CORE__FERNET_KEY: 'fernet-key'
      AIRFLOW__CORE__LOAD_EXAMPLES: 'false'
    volumes:
      - ./dags:/opt/airflow/dags
      - ./logs:/opt/airflow/logs
    ports:
      - 8080:8080
    # note: this starts only the webserver; also run "airflow scheduler"
    # (or use "airflow standalone") for DAGs to actually execute
    command: >
      bash -c "airflow db upgrade && airflow users create --username admin --password admin --role Admin --email admin@example.com --firstname Admin --lastname User || true && airflow webserver"
    depends_on:
      - mlflow
  fastapi:
    build: ./fastapi
    ports:
      - 8000:8000
    volumes:
      - ./fastapi/app:/app
    depends_on:
      - mlflow
  prometheus:
    image: prom/prometheus:latest
    volumes:
      - ./prometheus/prometheus.yml:/etc/prometheus/prometheus.yml:ro
    command:
      - --config.file=/etc/prometheus/prometheus.yml
    ports:
      - 9090:9090
  grafana:
    image: grafana/grafana:9.5.2
    ports:
      - 3000:3000
    volumes:
      - ./grafana/dashboards:/var/lib/grafana/dashboards
    environment:
      GF_SECURITY_ADMIN_PASSWORD: admin
    depends_on:
      - prometheus
volumes:
  pgdata:
  mlflow_artifacts:
2) FastAPI app (fastapi/app)
Dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY app/requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt
COPY app /app
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
requirements.txt
fastapi
uvicorn[standard]
transformers
torch
prometheus-client
requests
metrics.py
from prometheus_client import Counter, Histogram
import functools
import time

REQUEST_COUNT = Counter("request_count", "Total API requests")
REQUEST_LATENCY = Histogram("request_latency_seconds", "Request latency")

def track_metrics(func):
    @functools.wraps(func)  # preserve the signature so FastAPI still sees the query params
    def wrapper(*args, **kwargs):
        start = time.time()
        resp = func(*args, **kwargs)
        REQUEST_COUNT.inc()
        REQUEST_LATENCY.observe(time.time() - start)
        return resp
    return wrapper
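Why `functools.wraps` matters here: FastAPI inspects the decorated function's signature to bind the `q` query parameter, and a bare `wrapper(*args, **kwargs)` would hide it. A stdlib-only sketch (a plain dict stands in for the Prometheus Counter/Histogram so it runs without prometheus-client):

```python
import functools
import inspect
import time

# Hypothetical in-memory stand-ins for REQUEST_COUNT / REQUEST_LATENCY
metrics = {"count": 0, "latencies": []}

def track_metrics(func):
    @functools.wraps(func)  # copies __name__ and exposes the real signature
    def wrapper(*args, **kwargs):
        start = time.time()
        resp = func(*args, **kwargs)
        metrics["count"] += 1
        metrics["latencies"].append(time.time() - start)
        return resp
    return wrapper

@track_metrics
def predict(q: str):
    return {"input": q}

predict("hello")
# inspect.signature follows __wrapped__, so introspection (and FastAPI)
# still sees the original parameters
print(inspect.signature(predict))  # (q: str)
print(predict.__name__)            # predict
```

Without the `@functools.wraps(func)` line, the last print would show `wrapper` and the signature would collapse to `(*args, **kwargs)`.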
main.py
from fastapi import FastAPI, Response
from transformers import pipeline
from prometheus_client import generate_latest, CONTENT_TYPE_LATEST
from metrics import track_metrics

app = FastAPI()

# Demo only -- don't load large models locally in a real environment
classifier = pipeline("sentiment-analysis")

@app.get("/predict")
@track_metrics
def predict(q: str):
    result = classifier(q)
    return {"input": q, "prediction": result}

@app.get("/metrics")
def metrics():
    # serve the raw Prometheus exposition format, not JSON
    return Response(generate_latest(), media_type=CONTENT_TYPE_LATEST)
3) Airflow DAGs (dags/)
Reuse the existing example DAGs, copied into dags/.
news_pipeline.py
(the fetch -> save_to_db example provided earlier)
train_model.py
(the MLflow-integrated training DAG example provided earlier)
llm_pipeline.py
(the finetune -> deploy DAG example provided earlier)
4) scripts/finetune.py (simple stub)
# scripts/finetune.py
import os
import time

if __name__ == '__main__':
    print('start finetune (stub)')
    time.sleep(2)
    # insert real fine-tuning code here (e.g. an HF Trainer or LoRA run)
    os.makedirs('finetuned_model', exist_ok=True)  # output dir expected by the llm_pipeline DAG
    print('done')
5) Prometheus config (prometheus/prometheus.yml)
global:
  scrape_interval: 15s
scrape_configs:
  - job_name: 'fastapi'
    static_configs:
      - targets: ['fastapi:8000']
    metrics_path: /metrics
  - job_name: 'airflow'
    static_configs:
      - targets: ['airflow:8080']
    metrics_path: /metrics
  - job_name: 'mlflow'
    static_configs:
      - targets: ['mlflow:5000']
    metrics_path: /metrics
Note: the stock Airflow image does not expose a /metrics endpoint; add an exporter (e.g. airflow-prometheus-exporter) if needed. MLflow likewise does not serve Prometheus metrics unless the server is started with its metrics option enabled.
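Once Prometheus is scraping the FastAPI target, queries like the following can drive dashboard panels. These assume the metric names from metrics.py; note that prometheus_client exposes a Counter named request_count as request_count_total:

```promql
# 95th-percentile request latency over the last 5 minutes
histogram_quantile(0.95, rate(request_latency_seconds_bucket[5m]))

# request throughput (requests per second)
rate(request_count_total[5m])
```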
6) Grafana dashboard example (grafana/dashboards/llm_api_dashboard.json)
- Drop in the simple JSON provided earlier. (Dashboard panels: Request Latency, Total Requests)
7) Helm chart (minimal version): helm-chart/
Chart.yaml
apiVersion: v2
name: llmops-lab
description: Minimal Helm chart for fastapi + mlflow + dags
version: 0.1.0
appVersion: "1.0"
values.yaml
replicaCount: 1
image:
  fastapi:
    repository: yourrepo/llm-fastapi
    tag: latest
  mlflow:
    repository: ghcr.io/mlflow/mlflow
    tag: v2.7.2
service:
  fastapi:
    port: 8000
  mlflow:
    port: 5000
postgres:
  enabled: false
  host: postgres
  port: 5432
  user: mlflow
  password: mlflow
  database: mlflow_db
templates/fastapi-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  # this minimal chart ships no _helpers.tpl, so use a literal name
  # rather than include "llmops-lab.fullname"
  name: llmops-fastapi
spec:
  replicas: {{ .Values.replicaCount }}
  selector:
    matchLabels:
      app: fastapi
  template:
    metadata:
      labels:
        app: fastapi
    spec:
      containers:
        - name: fastapi
          image: "{{ .Values.image.fastapi.repository }}:{{ .Values.image.fastapi.tag }}"
          ports:
            - containerPort: {{ .Values.service.fastapi.port }}
          readinessProbe:
            httpGet:
              # note: this runs a real model inference on every probe
              path: /predict?q=health
              port: {{ .Values.service.fastapi.port }}
            initialDelaySeconds: 5
            periodSeconds: 10
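Probing /predict means every readiness check runs the model. A lighter pattern is a dedicated health route; the probe below is a sketch that assumes you add a hypothetical `@app.get("/health")` endpoint returning a trivial payload to main.py:

```yaml
# assumes main.py also defines:
#   @app.get("/health")
#   def health():
#       return {"status": "ok"}
readinessProbe:
  httpGet:
    path: /health
    port: {{ .Values.service.fastapi.port }}
  initialDelaySeconds: 5
  periodSeconds: 10
```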
templates/fastapi-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: llmops-fastapi
spec:
  selector:
    app: fastapi
  ports:
    - protocol: TCP
      port: 80
      targetPort: {{ .Values.service.fastapi.port }}
  type: ClusterIP
templates/mlflow-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: llmops-mlflow
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mlflow
  template:
    metadata:
      labels:
        app: mlflow
    spec:
      containers:
        - name: mlflow
          image: "{{ .Values.image.mlflow.repository }}:{{ .Values.image.mlflow.tag }}"
          env:
            - name: BACKEND_STORE_URI
              value: "postgresql://{{ .Values.postgres.user }}:{{ .Values.postgres.password }}@{{ .Values.postgres.host }}:{{ .Values.postgres.port }}/{{ .Values.postgres.database }}"
          # the image does not act on BACKEND_STORE_URI by itself; start the
          # server explicitly ($(VAR) is expanded by Kubernetes from env)
          command: ["mlflow", "server", "--backend-store-uri", "$(BACKEND_STORE_URI)", "--host", "0.0.0.0", "--port", "5000"]
          ports:
            - containerPort: {{ .Values.service.mlflow.port }}
templates/mlflow-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: llmops-mlflow
spec:
  selector:
    app: mlflow
  ports:
    - protocol: TCP
      port: 5000
      targetPort: {{ .Values.service.mlflow.port }}
templates/dags-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: llmops-dags
data:
  # .Files.Get can only read files inside the chart directory,
  # so copy dags/ into helm-chart/ before packaging
  news_pipeline.py: |
{{ (.Files.Get "dags/news_pipeline.py") | indent 4 }}
  train_model.py: |
{{ (.Files.Get "dags/train_model.py") | indent 4 }}
  llm_pipeline.py: |
{{ (.Files.Get "dags/llm_pipeline.py") | indent 4 }}
This Helm chart is deliberately minimal. Install Prometheus/Grafana separately, e.g. via kube-prometheus-stack or the Grafana community Helm charts.
8) How to run: local (Docker Compose)
- Copy/save the files above
- Run docker-compose up -d --build
- Access:
- FastAPI: http://localhost:8000/predict?q=hello
- MLflow: http://localhost:5000
- Airflow: http://localhost:8080 (admin/admin)
- Prometheus: http://localhost:9090
- Grafana: http://localhost:3000 (admin/admin)
9) How to run: Kubernetes (Helm)
- (Optional) Install Postgres: helm install pg bitnami/postgresql
- Installing prometheus/grafana via Helm is recommended: helm repo add prometheus-community https://prometheus-community.github.io/helm-charts, etc.
- helm upgrade --install llmops ./helm-chart
- Verify the services with kubectl get svc
Wrap-up
- This is a minimal setup for education and practice. Before any production deployment, consider security, authentication, storage, resource limits, and model size / inference strategy (multi-GPU serving, Triton, etc.).
- Natural next steps: a more complete Helm chart, automated Grafana provisioning, Traefik/Ingress, and TLS.
Below is a code-set-based LLMOps curriculum: example Airflow DAG / MLflow / FastAPI / Grafana dashboard code for each stage.
Step-by-step LLMOps hands-on curriculum (code set)
Stage 1 – Data pipeline (Airflow DAG)
Goal: collect news data → preprocess → store in a DB
# dags/news_pipeline.py
from airflow import DAG
from airflow.operators.python import PythonOperator
from datetime import datetime
import requests, sqlite3

def fetch_news():
    url = "https://newsapi.org/v2/top-headlines?country=us&apiKey=demo"
    resp = requests.get(url).json()
    return [a["title"] for a in resp["articles"]]

def save_to_db(**context):
    titles = context['ti'].xcom_pull(task_ids='fetch_news')
    conn = sqlite3.connect("/tmp/news.db")
    c = conn.cursor()
    c.execute("CREATE TABLE IF NOT EXISTS news(title TEXT)")
    for t in titles:
        c.execute("INSERT INTO news VALUES (?)", (t,))
    conn.commit()
    conn.close()

with DAG("news_pipeline", start_date=datetime(2023, 1, 1), schedule="@daily", catchup=False) as dag:
    t1 = PythonOperator(task_id="fetch_news", python_callable=fetch_news)
    # provide_context was removed in Airflow 2.x; the context is always passed
    t2 = PythonOperator(task_id="save_to_db", python_callable=save_to_db)
    t1 >> t2
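The save_to_db step can be exercised outside Airflow. A minimal sketch using an in-memory SQLite database and hypothetical sample titles standing in for the fetch_news XCom payload:

```python
import sqlite3

# Hypothetical titles in place of the XCom pull from fetch_news
titles = ["Rocket launch succeeds", "New LLM released"]

conn = sqlite3.connect(":memory:")  # in-memory DB instead of /tmp/news.db
c = conn.cursor()
c.execute("CREATE TABLE IF NOT EXISTS news(title TEXT)")
for t in titles:
    # parameterized insert, same as in the DAG
    c.execute("INSERT INTO news VALUES (?)", (t,))
conn.commit()

rows = [r[0] for r in c.execute("SELECT title FROM news")]
print(rows)  # ['Rocket launch succeeds', 'New LLM released']
conn.close()
```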
Stage 2 – Model training & tracking (MLflow + Airflow DAG)
Goal: train a text-classification model → log it to MLflow
# dags/train_model.py
from airflow import DAG
from airflow.operators.python import PythonOperator
from datetime import datetime
import mlflow
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def train_model():
    # inside docker-compose, reach the tracking server by service name, not localhost
    mlflow.set_tracking_uri("http://mlflow:5000")
    mlflow.set_experiment("text-classification")
    data = fetch_20newsgroups(subset="train", categories=["sci.space", "rec.sport.baseball"])
    X, y = data.data, data.target
    model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=200))
    with mlflow.start_run():
        model.fit(X, y)
        acc = model.score(X, y)  # training accuracy; use a held-out split for a real metric
        mlflow.log_metric("accuracy", acc)
        mlflow.sklearn.log_model(model, "model")

with DAG("train_model", start_date=datetime(2023, 1, 1), schedule="@daily", catchup=False) as dag:
    PythonOperator(task_id="train", python_callable=train_model)
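The TF-IDF-plus-classifier idea can be illustrated without sklearn or the 20newsgroups download. A toy stdlib-only sketch with made-up documents, using nearest-neighbor cosine similarity in place of LogisticRegression:

```python
import math
from collections import Counter

# Hypothetical two-class corpus mimicking sci.space vs rec.sport.baseball
train_docs = [
    ("the rocket launched into space", 0),
    ("astronauts orbit the moon", 0),
    ("the pitcher threw a baseball", 1),
    ("the batter hit a home run", 1),
]

def tfidf(docs):
    """Return one {term: tf-idf weight} dict per document."""
    tokenized = [d.lower().split() for d in docs]
    n = len(docs)
    df = Counter(t for toks in tokenized for t in set(toks))
    vecs = []
    for toks in tokenized:
        tf = Counter(toks)
        vecs.append({t: (c / len(toks)) * math.log(n / df[t]) for t, c in tf.items()})
    return vecs

def cosine(a, b):
    dot = sum(a.get(t, 0.0) * b.get(t, 0.0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# fit tf-idf on train docs plus the query, then label the query
# by its most similar training document
docs = [d for d, _ in train_docs] + ["rocket orbit in space"]
vecs = tfidf(docs)
query = vecs[-1]
best = max(range(len(train_docs)), key=lambda i: cosine(query, vecs[i]))
pred = train_docs[best][1]
print(pred)  # 0 -- the space-related class
```

The query shares weighted terms only with the space documents ("the" gets a near-zero idf weight), so the nearest neighbor lands in class 0.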
Stage 3 – Model serving (FastAPI + LLM)
Goal: serve a HuggingFace model with FastAPI
# app/main.py
from fastapi import FastAPI
from transformers import pipeline

app = FastAPI()
classifier = pipeline("sentiment-analysis")

@app.get("/predict")
def predict(q: str):
    result = classifier(q)
    return {"input": q, "prediction": result}
Run:
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
Test:
curl "http://localhost:8000/predict?q=I love LLMOps!"
Stage 4 – Model monitoring (Prometheus + Grafana)
Goal: collect API latency and request counts
FastAPI + Prometheus Metrics
# app/metrics.py
from prometheus_client import Counter, Histogram
import functools
import time

REQUEST_COUNT = Counter("request_count", "Total API requests")
REQUEST_LATENCY = Histogram("request_latency_seconds", "Request latency")

def track_metrics(func):
    @functools.wraps(func)  # preserve the signature so FastAPI still sees the query params
    def wrapper(*args, **kwargs):
        start = time.time()
        resp = func(*args, **kwargs)
        REQUEST_COUNT.inc()
        REQUEST_LATENCY.observe(time.time() - start)
        return resp
    return wrapper
# app/main.py (additions)
from fastapi import FastAPI, Response
from prometheus_client import generate_latest, CONTENT_TYPE_LATEST
from metrics import track_metrics

@app.get("/predict")
@track_metrics
def predict(q: str):
    result = classifier(q)
    return {"input": q, "prediction": result}

@app.get("/metrics")
def metrics():
    # serve the raw Prometheus exposition format, not JSON
    return Response(generate_latest(), media_type=CONTENT_TYPE_LATEST)
Grafana dashboard JSON (latency + request count). Note that prometheus_client exposes the counter as request_count_total:
{
  "title": "LLM API Monitoring",
  "panels": [
    {
      "type": "graph",
      "title": "Request Latency",
      "targets": [{"expr": "histogram_quantile(0.95, rate(request_latency_seconds_bucket[5m]))"}]
    },
    {
      "type": "stat",
      "title": "Total Requests",
      "targets": [{"expr": "request_count_total"}]
    }
  ]
}
Stage 5 – Automated LLMOps pipeline (Airflow DAG + MLflow + FastAPI)
Goal: automate data collection → fine-tuning → model registration → serving → monitoring
# dags/llm_pipeline.py
from airflow import DAG
from airflow.operators.python import PythonOperator
from datetime import datetime
import mlflow, subprocess

def finetune_model():
    mlflow.set_tracking_uri("http://mlflow:5000")  # service name inside docker-compose
    mlflow.set_experiment("llm-finetune")
    with mlflow.start_run():
        subprocess.run(["python", "scripts/finetune.py"], check=True)
        mlflow.log_artifact("finetuned_model/")  # directory created by finetune.py

def deploy_model():
    # the service is named "fastapi" in docker-compose.yml
    subprocess.run(["docker-compose", "up", "-d", "fastapi"], check=True)

with DAG("llm_pipeline", start_date=datetime(2023, 1, 1), schedule="@daily", catchup=False) as dag:
    finetune = PythonOperator(task_id="finetune", python_callable=finetune_model)
    deploy = PythonOperator(task_id="deploy", python_callable=deploy_model)
    finetune >> deploy
Final structure (folders)
llmops-lab/
├─ dags/ # Airflow DAGs (news_pipeline, train_model, llm_pipeline)
├─ app/ # FastAPI app (main.py, metrics.py)
├─ scripts/ # model fine-tuning scripts
├─ grafana/ # Grafana dashboard JSON
└─ docker-compose.yml # FastAPI + Prometheus + Grafana
Wire Airflow → MLflow → FastAPI → Grafana together and you have a complete LLMOps practice loop.