2026-06-07 — Docker Compose health-gated startup for reverse-proxied app stacks

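Key summary

Today's single core topic is the health-gated startup pattern for running Docker Compose stacks such as FastAPI/Next.js/Postgres/Redis reliably behind NPM (Nginx Proxy Manager) or another reverse proxy.
Per the official Docker documentation, Compose's depends_on by default only guarantees that the dependency's container has reached the running state; it does not guarantee that the application is actually ready to accept requests.
To use readiness as a gate, define a healthcheck on the dependency and use the long syntax depends_on: { condition: service_healthy } in the dependent service.
In the Compose long syntax, restart: true makes the dependent service restart as well when the dependency is updated or restarted by an explicit Compose operation, which prompts it to re-establish its connections. This is separate from automatic restarts performed by the container runtime.
Phillip's server runs systemd 255, so systemd's exponential backoff based on RestartSteps/RestartMaxDelaySec is also available. However, when a Compose restart policy and a systemd wrapper restart are both applied, it must be clear which of the two owns restarts.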

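Why it matters

A reverse proxy often forwards traffic as soon as the upstream container's process exists. If the app is still running a DB migration, cache warmup, OAuth token refresh, or model/tool registry load, the result is a 502/504 or a failed first request.
In FastAPI/Next.js deployments, shrinking the window in which "the container is up but /healthz fails" can substantially reduce overnight incidents in the NPM proxy, cron webhooks, the Hermes gateway/API server, and internal RAG services.
Restart automation is convenient, but repeatedly restarting the app on top of dependencies that are not ready turns into a log storm, a DB connection storm, and a provider retry storm. The health gate and the backoff have to be designed together.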

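Practical application

In Compose, always put a readiness healthcheck on dependencies such as the DB, Redis, and search engines.
For app containers, use the long syntax instead of a plain depends_on: [db].
Keep FastAPI's /healthz lightweight: with no external API calls, check only process readiness, a DB ping, and the presence of required config. Move expensive checks to /readyz or an operations script.
For Next.js, provide the app's own health endpoint or a lightweight route the reverse proxy can check. If SSR or API routes need the DB, start the app only after the DB readiness gate.
For services behind NPM, make the internal Docker network or the host port mapping explicit, and confirm that the proxy upstream host/port and the container health target refer to the same thing.
When wrapping docker compose up in systemd, consider After=docker.service and Requires=docker.service so that the unit starts only after the Docker daemon is ready at host boot, and check which backoff settings the local systemd version supports.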
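
The lightweight /healthz described above can be sketched in framework-agnostic Python. This is a minimal sketch under stated assumptions, not the project's actual handler: the function name healthz, the db_ping callable, and the DATABASE_URL variable are hypothetical, and wiring it into a FastAPI route is left out.

```python
import os
from typing import Callable, Dict, Iterable, Mapping, Tuple


def healthz(
    db_ping: Callable[[], bool],          # hypothetical: cheap "SELECT 1"-style check
    required_env: Iterable[str],          # names of config values that must be set
    env: Mapping[str, str] = os.environ,  # injectable so the check is easy to test
) -> Tuple[int, Dict[str, object]]:
    """Cheap health check: DB ping plus required-config presence, no external APIs."""
    missing = sorted(name for name in required_env if not env.get(name))
    try:
        db_ok = bool(db_ping())
    except Exception:
        # Any ping error counts as "not ready" instead of crashing the endpoint.
        db_ok = False
    ok = db_ok and not missing
    payload = {
        "status": "ok" if ok else "unavailable",
        "db": db_ok,
        "missing_config": missing,
    }
    return (200 if ok else 503), payload


# Example call with a stubbed ping and an explicit environment mapping.
status, body = healthz(lambda: True, ["DATABASE_URL"], {"DATABASE_URL": "postgres://example"})
print(status, body["status"])
```

A route handler could return the status code and payload unchanged; anything expensive (external APIs, deep diagnostics) stays out of this function and belongs in /readyz or an operations script.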

Implementation/operations patterns

1) Compose dependency readiness gate

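services:
  api:
    build: ./api
    depends_on:
      db:
        condition: service_healthy  # wait until db's healthcheck passes
        restart: true               # restart api when Compose restarts/updates db
      redis:
        condition: service_started  # started only; not a readiness gate (redis has no healthcheck here)
    healthcheck:
      # assumes python is available inside the api image
      test: ["CMD-SHELL", "python -c 'import urllib.request; urllib.request.urlopen(\"http://127.0.0.1:8000/healthz\", timeout=2)'"]
      interval: 30s
      timeout: 5s
      retries: 3
      start_period: 20s

  db:
    image: postgres:18
    healthcheck:
      # $$ escapes Compose interpolation so the container shell expands the variables;
      # POSTGRES_USER and POSTGRES_DB must be set in the db container's environment
      test: ["CMD-SHELL", "pg_isready -U $${POSTGRES_USER} -d $${POSTGRES_DB}"]
      interval: 10s
      timeout: 10s
      retries: 5
      start_period: 30s

  redis:
    image: redis:7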
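
To make the interval, retries, and start_period settings above more concrete, the following is a simplified model of how a container's health status could evolve over a series of probe results. It is a sketch under assumptions, not Docker's implementation: the function name simulate_health is hypothetical, the first probe is assumed to run one interval after start, and timeout handling is omitted (a timed-out probe would simply be passed in as a failure).

```python
from typing import List, Sequence


def simulate_health(
    outcomes: Sequence[bool],  # probe results in order: True = check passed
    interval: float,           # seconds between probes
    start_period: float,       # grace period in which failures are not counted
    retries: int,              # consecutive counted failures needed for "unhealthy"
) -> List[str]:
    """Return the modelled health status after each probe."""
    status = "starting"
    failures = 0
    passed_once = False
    timeline = []
    for k, ok in enumerate(outcomes):
        t = (k + 1) * interval  # assumption: first probe runs one interval after start
        if ok:
            status = "healthy"
            failures = 0
            passed_once = True
        elif t < start_period and not passed_once:
            pass  # failure inside the start period is not counted
        else:
            failures += 1
            if failures >= retries:
                status = "unhealthy"
        timeline.append(status)
    return timeline


# db settings from the Compose file above: interval 10s, start_period 30s, retries 5
print(simulate_health([False, False, True], interval=10, start_period=30, retries=5))
# prints ['starting', 'starting', 'healthy']
```

In this model a dependent service gated on service_healthy would be started only once the timeline reaches "healthy"; while the status is still "starting" it keeps waiting.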

2) The systemd wrapper handles backoff and observability

[Unit]
Description=project compose stack
After=docker.service network-online.target
Requires=docker.service
Wants=network-online.target

[Service]
Type=oneshot
RemainAfterExit=yes
WorkingDirectory=/opt/my-project
ExecStart=/usr/bin/docker compose up -d --remove-orphans
ExecStop=/usr/bin/docker compose down
TimeoutStartSec=180
Restart=on-failure
RestartSec=10s
RestartSteps=4
RestartMaxDelaySec=160s

[Install]
WantedBy=multi-user.target

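RestartSteps/RestartMaxDelaySec are available in systemd 254 and later; the current server runs systemd 255, so they can be used.
Type=oneshot + RemainAfterExit=yes is a pattern that brings the compose stack up in the background and then leaves the unit in the active state. Whether systemd should directly track a long-running foreground docker compose up, or treat up -d as an orchestrator command, is a per-project choice.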
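
The restart delays implied by RestartSec=10s, RestartSteps=4, and RestartMaxDelaySec=160s in the unit above can be worked out with a small calculation. This sketch assumes the delay grows geometrically from RestartSec to RestartMaxDelaySec over the configured number of steps; that interpolation rule and the function name restart_delays are assumptions for illustration, so the exact behavior should be confirmed against systemd.service(5) on the target host.

```python
from typing import List


def restart_delays(restart_sec: float, max_delay_sec: float, steps: int, attempts: int) -> List[float]:
    """Delay in seconds before each of the first `attempts` automatic restarts."""
    delays = []
    for n in range(1, attempts + 1):
        if n == 1 or steps <= 0 or restart_sec <= 0 or restart_sec >= max_delay_sec:
            delay = restart_sec    # no backoff configured, or first restart
        elif n > steps:
            delay = max_delay_sec  # all steps used up: stay at the ceiling
        else:
            # assumed geometric interpolation between the two bounds
            delay = restart_sec * (max_delay_sec / restart_sec) ** ((n - 1) / steps)
        delays.append(round(delay, 3))
    return delays


# Values from the unit file above.
print(restart_delays(10, 160, steps=4, attempts=6))
```

Under this assumption the delay doubles on each restart (10, 20, 40, 80 seconds) and then stays at 160 seconds, which is presumably why these three values were chosen together.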

3) Separate proxy and internal smoke tests after deployment

# internal container health
/usr/bin/docker compose ps
/usr/bin/docker compose exec api python -c 'import urllib.request; print(urllib.request.urlopen("http://127.0.0.1:8000/healthz", timeout=3).status)'

# proxy path health, e.g. a public/internal domain behind NPM
curl -fsS https://app.example.com/healthz

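The internal health check confirms container readiness.
The proxy health check also covers the NPM host, SSL, and upstream configuration.
If only one of the two succeeds, read the cause differently: an internal failure points to the app or its dependencies, while a proxy failure more likely points to NPM, DNS, TLS, or the upstream configuration.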
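
The triage rule above can be written down as a small read-only helper. This is a sketch under assumptions: the names probe and classify and the category labels are hypothetical, and the internal result is assumed to come from a separate check such as the docker compose exec command shown earlier.

```python
import http.client
import urllib.request


def probe(url: str, timeout: float = 3.0) -> bool:
    """Return True if a GET to `url` answers with a 2xx status (read-only)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (OSError, ValueError, http.client.HTTPException):
        # Covers DNS/TLS/connection errors, HTTP 4xx/5xx, timeouts, and malformed URLs.
        return False


def classify(internal_ok: bool, proxy_ok: bool) -> str:
    """Map the two smoke-test results to a likely failure area."""
    if internal_ok and proxy_ok:
        return "healthy"
    if not internal_ok:
        # The app or one of its dependencies is not ready; look there first.
        return "app-or-dependency"
    # The container is healthy but the proxied path fails.
    return "proxy-path"  # NPM host, DNS, TLS, or upstream configuration


print(classify(internal_ok=True, proxy_ok=False))
# prints proxy-path
```

A deployment script could call probe("https://app.example.com/healthz") for the proxy side and pass both results to classify; the helper only reads, so it is safe to run repeatedly.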

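Risks / needs verification

depends_on.restart: true in the Docker documentation restarts the dependent service for explicit Compose operations; it does not cover the case where the container runtime automatically restarts the dependency.
If the healthcheck command relies on a tool that is not in the image (curl, wget, python), the health status fails permanently. Check which tools each image ships, or use a small static health binary or endpoint.
Calling external dependencies such as an external LLM API, web search, or a payment API from /healthz can propagate outages and trigger rate limits. Readiness and deep diagnostics must be kept separate.
Enabling both the systemd wrapper and a Compose restart: policy means more than one component is responsible for restarts. Record the responsibility boundary in the operations docs, for example "container failure is handled by Compose/Docker, stack bring-up failure by systemd".
NPM's own upstream health check behavior may be limited depending on configuration and version, so the NPM setup needs separate verification. Today's study treated the official Docker/systemd behavior as the confirmed basis.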

Next study questions

Across Phillip's main Docker projects, which services have no healthcheck?
How should /healthz and /readyz be separated in the shared FastAPI template?
Can a read-only smoke script automatically classify the case where internal health succeeds but proxy health fails in deployments behind NPM?
How far should docker compose up --wait or health polling be standardized in GitHub Actions/CI?
