[Harness Engineering] Entropy Management ⋆ Blog * JackerLab

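AI-generated code rots over time

Context engineering sets the direction, and architectural constraints enforce the boundaries. But even when both work perfectly, one problem remains unsolved.

Agent A writes a date-formatting utility function. Agent B, unaware that it exists, writes another function that does the same thing. Agent C writes code using a pattern from a month ago. Agent D adds code that calls a deprecated API. Each piece of code is correct on its own. It passes the architecture tests and it passes the linter. Yet across the codebase as a whole, inconsistencies pile up.

Martin Fowler called this phenomenon "code rot," and compared the harness's entropy management to **garbage collection (GC)** in a programming language. Just as GC automatically reclaims memory that is no longer reachable, entropy management automatically detects and repairs the erosion of consistency in a codebase.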

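Why entropy increases

Code rot happens in human development teams too. In an agent environment, though, it happens at a different speed.

Entropy in human teams vs. agent teams

Human team:
  - Code review catches it: "Doesn't this function already exist?"
  - Pair programming corrects it: "We don't use that pattern anymore"
  - Team meetings settle it: "Let's standardize on this approach"
  - Accumulates gradually → resolved by periodic refactoring

Agent team:
  - Agents are unaware of each other's work (each session is independent)
  - Implicit agreement is impossible (conversation history resets every session)
  - No agent sees the whole codebase (context window limits)
  - Accumulates quickly → an automated cleanup mechanism is required

With 10 agents working concurrently, dozens of PRs are merged per day. In OpenAI's experiment, throughput was 3.5 PRs per engineer per day, and it rose further as the team grew. At that pace, humans cannot catch every duplicate and inconsistency.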

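Concrete types of entropy

1. Code duplication
   - Functions with the same purpose exist in several places
   - Similar but subtly different implementations coexist

2. Pattern inconsistency
   - Error handling differs from file to file
   - The same kind of API call is written in a different style each time

3. Dead code
   - Functions that are no longer called after a refactoring
   - Code left behind for features that have been removed

4. Doc-code drift
   - Rules written in AGENTS.md do not match the actual code
   - Comments do not reflect code changes

5. Dependency rot
   - Unused packages remain in package.json
   - Several libraries that do the same job are installed side by side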
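
To make the first type concrete, here is a minimal sketch of "similar but subtly different implementations." Both function names and the idea that they live in separate modules are hypothetical, chosen only to illustrate how two agents can each produce code that looks correct in isolation.

```typescript
// Hypothetical module A: zero-padded ISO-style date, written by one agent.
function formatDate(date: Date): string {
  const y = date.getUTCFullYear();
  const m = String(date.getUTCMonth() + 1).padStart(2, '0');
  const d = String(date.getUTCDate()).padStart(2, '0');
  return `${y}-${m}-${d}`;
}

// Hypothetical module B: same intent, written later without knowledge of A.
// It omits the zero padding, so the output differs for single-digit months and days.
function toDateString(date: Date): string {
  return `${date.getUTCFullYear()}-${date.getUTCMonth() + 1}-${date.getUTCDate()}`;
}

const sample = new Date(Date.UTC(2024, 0, 5));
console.log(formatDate(sample));   // 2024-01-05
console.log(toDateString(sample)); // 2024-1-5
```

Neither function would fail a linter or an architecture test, and the two agree on many inputs (for example 15 November), which is why this kind of divergence tends to surface only once both formats reach the same consumer.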

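Automating duplicate detection

The first step in entropy management is duplicate detection: finding code-level duplication mechanically.

Detecting code clones with jscpd

// .jscpd.json: duplicate code detection settings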

{
  "threshold": 5,
  "reporters": ["json", "console"],
  "ignore": [
    "node_modules",
    "**/*.test.ts",
    "**/*.spec.ts",
    "dist"
  ],
  "minLines": 5,
  "minTokens": 50,
  "output": "reports/jscpd"
}

# Add to the CI pipeline
npx jscpd src/ --config .jscpd.json

# Fail when the duplication rate exceeds the threshold
npx jscpd src/ --threshold 5 --exitCode 1

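But copy-paste detection alone is not enough. Functions that are "almost identical apart from variable names," or that "do the same thing with a different implementation," are hard to catch with jscpd.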
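
One way to catch the "only the variable names differ" case is to compare functions after replacing each identifier with a positional placeholder. The sketch below is a rough illustration of that idea, not how jscpd works: the `fingerprint` function, the keyword list, and the regex tokenizer are all assumptions made for the example, and a real tool would use a proper parser. Note that it also treats property and method names as renameable, which makes it more permissive than a real clone detector.

```typescript
// Reserved words are kept as-is; every other identifier is treated as renameable.
const KEYWORDS = new Set([
  'function', 'return', 'const', 'let', 'var', 'if', 'else', 'for', 'while', 'of', 'in',
  'new', 'true', 'false', 'null', 'undefined', 'string', 'number', 'boolean',
]);

// Identifiers, numbers, simple string literals, or any single non-space character.
const TOKEN = /[A-Za-z_$][\w$]*|\d+(?:\.\d+)?|'[^']*'|"[^"]*"|\S/g;

function fingerprint(source: string): string {
  const names = new Map<string, string>();
  const tokens = source.match(TOKEN) ?? [];
  return tokens
    .map(token => {
      const isIdentifier = /^[A-Za-z_$]/.test(token) && !KEYWORDS.has(token);
      if (!isIdentifier) return token;
      // The first distinct name becomes $1, the second $2, and so on.
      if (!names.has(token)) names.set(token, `$${names.size + 1}`);
      return names.get(token)!;
    })
    .join(' ');
}

const sum = 'function sum(items: number[]) { let total = 0; for (const x of items) total += x; return total; }';
const addAll = 'function addAll(values: number[]) { let acc = 0; for (const v of values) acc += v; return acc; }';
const max = 'function max(items: number[]) { let best = 0; for (const x of items) if (x > best) best = x; return best; }';

console.log(fingerprint(sum) === fingerprint(addAll)); // true
console.log(fingerprint(sum) === fingerprint(max));    // false
```

Functions that are behaviorally equivalent but structurally different still slip through this check; that case generally needs tests or human judgment rather than a syntactic comparison.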

Custom duplicate function scanner

For more precise detection, you can analyze the AST (Abstract Syntax Tree).

// scripts/detect-similar-functions.ts

import * as ts from 'typescript';
import * as fs from 'fs';
import * as path from 'path';

interface FunctionSignature {
  name: string;
  file: string;
  params: string[];
  returnType: string;
  lineCount: number;
}

function extractFunctions(filePath: string): FunctionSignature[] {
  const content = fs.readFileSync(filePath, 'utf-8');
  // setParentNodes = true so that node.parent and node.getStart() work during the walk
  const sourceFile = ts.createSourceFile(filePath, content, ts.ScriptTarget.Latest, true);
  const functions: FunctionSignature[] = [];

  function visit(node: ts.Node) {
    if (ts.isFunctionDeclaration(node) || ts.isArrowFunction(node)) {
      const name = ts.isFunctionDeclaration(node)
        ? node.name?.getText(sourceFile) || 'anonymous'
        : (node.parent as any)?.name?.getText(sourceFile) || 'anonymous';

      const params = node.parameters.map(p => p.type?.getText(sourceFile) || 'any');
      const returnType = node.type?.getText(sourceFile) || 'void';
      const startLine = sourceFile.getLineAndCharacterOfPosition(node.getStart(sourceFile)).line;
      const endLine = sourceFile.getLineAndCharacterOfPosition(node.getEnd()).line;

      functions.push({
        name,
        file: filePath,
        params,
        returnType,
        lineCount: endLine - startLine + 1,
      });
    }
    ts.forEachChild(node, visit);
  }

  visit(sourceFile);
  return functions;
}

function findSimilarFunctions(allFunctions: FunctionSignature[]): Array<[FunctionSignature, FunctionSignature]> {
  const similar: Array<[FunctionSignature, FunctionSignature]> = [];

  for (let i = 0; i < allFunctions.length; i++) {
    for (let j = i + 1; j < allFunctions.length; j++) {
      const a = allFunctions[i];
      const b = allFunctions[j];

      // Skip pairs from the same file
      if (a.file === b.file) continue;

      // Detect functions with similar signatures
      const sameParamCount = a.params.length === b.params.length;
      const sameParamTypes = sameParamCount && a.params.every((p, idx) => p === b.params[idx]);
      const sameReturnType = a.returnType === b.returnType;
      const similarSize = Math.abs(a.lineCount - b.lineCount) <= 5;
      const similarName = levenshteinDistance(a.name, b.name) <= 3;

      // sameParamTypes already implies the parameter counts match
      if (sameParamTypes && sameReturnType && similarSize && similarName) {
        similar.push([a, b]);
      }
    }
  }

  return similar;
}

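// The two helpers below are not shown in the original script; these are minimal assumed versions.

// Edit distance between two names (dynamic-programming Levenshtein, two rows at a time)
function levenshteinDistance(a: string, b: string): number {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Recursively collect .ts files under a directory, skipping declaration files
function getTypeScriptFiles(dir: string): string[] {
  return fs.readdirSync(dir, { withFileTypes: true }).flatMap(entry => {
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) return getTypeScriptFiles(fullPath);
    return entry.name.endsWith('.ts') && !entry.name.endsWith('.d.ts') ? [fullPath] : [];
  });
}

// Run
const allFiles = getTypeScriptFiles('src/');
const allFunctions = allFiles.flatMap(extractFunctions);
const duplicates = findSimilarFunctions(allFunctions);

if (duplicates.length > 0) {
  console.log(`⚠️  Found ${duplicates.length} pair(s) of similar functions:\n`);
  duplicates.forEach(([a, b]) => {
    console.log(`  ${a.name} (${a.file}: ${a.lineCount} lines)`);
    console.log(`  ${b.name} (${b.file}: ${b.lineCount} lines)\n`);
  });
}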

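Dead code detection

As refactorings pile up, so do functions that are never called and modules that are never imported. When an agent modifies existing code, it does not always clean up everything related to it.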

// scripts/detect-dead-code.ts

import { execSync } from 'child_process';

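// ts-prune: detect unused exports
function findUnusedExports(): string[] {
  // Run without --error: with that flag ts-prune exits non-zero when it finds
  // anything, which makes execSync throw before the output can be parsed.
  const result = execSync('npx ts-prune', {
    encoding: 'utf-8',
    cwd: process.cwd(),
  });

  return result
    .split('\n')
    .map(line => line.trim())
    // "(used in module)" marks exports still used inside their own file, so drop those
    .filter(line => line.length > 0 && !line.includes('(used in module)'));
}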

// knip: unified detection of unused files, dependencies, and exports
function runKnip(): void {
  try {
    execSync('npx knip --reporter compact', {
      encoding: 'utf-8',
      stdio: 'inherit',
    });
  } catch (error) {
    // knip exits with a non-zero code when it reports issues
    console.log('⚠️  Unused code was found.');
    process.exit(1);
  }
}

// knip.json: dead code detection settings

{
  "entry": ["src/main.ts", "src/index.ts"],
  "project": ["src/**/*.ts"],
  "ignore": ["**/*.test.ts", "**/*.spec.ts"],
  "ignoreDependencies": ["@types/*"],
  "rules": {
    "files": "error",
    "dependencies": "error",
    "unlisted": "error",
    "exports": "warn",
    "types": "warn"
  }
}
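
The `entry` and `project` fields above hint at the underlying idea: start from the entry files, follow imports, and treat any project file that is never reached as a candidate for removal. The sketch below illustrates that reachability walk over a hand-written import graph. It is not knip's implementation; the `ImportGraph` type, the `findUnreachableFiles` function, and the file names are hypothetical.

```typescript
// Hypothetical import graph: file -> files it imports.
type ImportGraph = Record<string, string[]>;

function findUnreachableFiles(graph: ImportGraph, entries: string[]): string[] {
  const reachable = new Set<string>();
  const stack = [...entries];

  // Depth-first walk from the entry files along import edges.
  while (stack.length > 0) {
    const file = stack.pop()!;
    if (reachable.has(file)) continue;
    reachable.add(file);
    stack.push(...(graph[file] ?? []));
  }

  // Anything in the project that the walk never visited is unreachable.
  return Object.keys(graph).filter(file => !reachable.has(file)).sort();
}

const graph: ImportGraph = {
  'src/main.ts': ['src/orders.ts'],
  'src/orders.ts': ['src/date.ts'],
  'src/date.ts': [],
  'src/legacy-report.ts': ['src/date.ts'],
  'src/legacy-format.ts': [],
};

console.log(findUnreachableFiles(graph, ['src/main.ts']));
// [ 'src/legacy-format.ts', 'src/legacy-report.ts' ]
```

Note that `src/legacy-report.ts` imports a live module but is itself imported by nothing, which is the typical shape of code left behind after a feature is removed.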

Doc-code consistency checks

If AGENTS.md says “the any type is forbidden” but the code actually contains any, the agent receives mixed signals. When documentation and code disagree, the agent ends up ignoring one of them.

// scripts/verify-doc-code-consistency.ts

import * as fs from 'fs';

interface Rule {
  description: string;
  check: () => { passed: boolean; details: string };
}

const rules: Rule[] = [
  {
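    description: 'If AGENTS.md forbids the any type, tsconfig must enable noImplicitAny',
    check: () => {
      const agentsMd = fs.readFileSync('AGENTS.md', 'utf-8').toLowerCase();
      // Assumes tsconfig.json is plain JSON (no comments or trailing commas)
      const tsconfig = JSON.parse(fs.readFileSync('tsconfig.json', 'utf-8'));
      const options = tsconfig.compilerOptions ?? {};

      // Keyword match on English wording; adjust to the phrasing your AGENTS.md uses
      const docForbidsAny = agentsMd.includes('any type') && agentsMd.includes('forbidden');
      // "strict": true turns noImplicitAny on unless it is explicitly set to false
      const configEnforced = options.noImplicitAny === true ||
                             (options.strict === true && options.noImplicitAny !== false);

      return {
        passed: !docForbidsAny || configEnforced,
        details: docForbidsAny && !configEnforced
          ? 'AGENTS.md forbids any, but noImplicitAny is not enabled in tsconfig.json'
          : 'OK',
      };
    },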
  },
  {
    description: 'Project structure in AGENTS.md matches the actual directories',
    check: () => {
      // The directory names are hardcoded here to mirror AGENTS.md rather than parsed from it
      const declaredDirs = ['domain', 'application', 'infrastructure', 'presentation'];
      const missing = declaredDirs.filter(dir => !fs.existsSync(`src/${dir}`));

      return {
        passed: missing.length === 0,
        details: missing.length > 0
          ? `Directories declared in AGENTS.md but missing on disk: ${missing.join(', ')}`
          : 'OK',
      };
    },
  },
  {
    description: 'package.json scripts include an architecture test',
    check: () => {
      const pkg = JSON.parse(fs.readFileSync('package.json', 'utf-8'));
      // 'arch' also matches 'architecture', so a single substring check is enough
      const hasArchTest = Object.keys(pkg.scripts || {}).some(
        key => key.includes('arch')
      );

      return {
        passed: hasArchTest,
        details: hasArchTest ? 'OK' : 'No test:architecture script found',
      };
    },
  },
];

// Run
console.log('📋 Doc-code consistency check\n');
let allPassed = true;

for (const rule of rules) {
  const result = rule.check();
  const icon = result.passed ? '✅' : '❌';
  console.log(`${icon} ${rule.description}`);
  if (!result.passed) {
    console.log(`   → ${result.details}`);
    allPassed = false;
  }
}

if (!allPassed) {
  console.log('\n⚠️  Documentation and code are out of sync. Fix one of them.');
  process.exit(1);
}
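
The directory rule above lists its directories by hand, so the check itself can drift away from AGENTS.md. One possible refinement, sketched here under the assumption that AGENTS.md mentions directories in the form `src/<name>/`, is to extract the declared names from the document text. The `extractDeclaredDirs` function and the sample content are hypothetical.

```typescript
// Collect distinct top-level directory names mentioned as `src/<name>/` in a Markdown document.
function extractDeclaredDirs(markdown: string): string[] {
  const pattern = /\bsrc\/([A-Za-z0-9_-]+)\//g;
  const dirs: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(markdown)) !== null) {
    if (!dirs.includes(match[1])) dirs.push(match[1]);
  }
  return dirs;
}

// Hypothetical AGENTS.md excerpt
const agentsMdSample = [
  '## Project structure',
  '- `src/domain/`: entities and value objects',
  '- `src/application/`: use cases',
  '- `src/domain/events/`: domain events',
].join('\n');

console.log(extractDeclaredDirs(agentsMdSample)); // [ 'domain', 'application' ]
```

The returned list could then replace the hardcoded `declaredDirs` array, so that editing AGENTS.md automatically changes what the check expects on disk.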

Cleanup agent: automated garbage collection

Combining the individual tools above gives you a cleanup agent that runs on a schedule.

# .github/workflows/entropy-cleanup.yml

name: Entropy Cleanup
on:
  schedule:
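    - cron: '0 3 * * 1'  # every Monday at 03:00 (GitHub Actions cron runs in UTC)
  workflow_dispatch:       # can also be triggered manually

# The final step opens an issue, so the default GITHUB_TOKEN needs issues: write
permissions:
  contents: read
  issues: write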

jobs:
  entropy-scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Setup Node
        uses: actions/setup-node@v4
        with:
          node-version: '20'

      - name: Install dependencies
        run: npm ci

      - name: Detect duplicate code
        run: npx jscpd src/ --config .jscpd.json --reporters json
        continue-on-error: true

      - name: Detect dead code
        run: npx knip --reporter compact
        continue-on-error: true

      - name: Scan for similar functions
        run: npx ts-node scripts/detect-similar-functions.ts
        continue-on-error: true

      - name: Verify doc-code consistency
        run: npx ts-node scripts/verify-doc-code-consistency.ts
        continue-on-error: true

      - name: Detect unused dependencies
        run: npx depcheck --ignores="@types/*,typescript"
        continue-on-error: true

      - name: Generate entropy report
        run: |
          echo "# Entropy report $(date +%Y-%m-%d)" > entropy-report.md
          echo "" >> entropy-report.md
          echo "## Duplicate code" >> entropy-report.md
          jq '.statistics.total' reports/jscpd/jscpd-report.json >> entropy-report.md || true
          echo "" >> entropy-report.md
          echo "## Dead code" >> entropy-report.md
          # Redirect stdout first, then stderr, so both streams end up in the report
          npx knip --reporter compact >> entropy-report.md 2>&1 || true
          echo "" >> entropy-report.md
          echo "## Unused dependencies" >> entropy-report.md
          npx depcheck --ignores="@types/*,typescript" >> entropy-report.md 2>&1 || true

      - name: Create an issue from the report
        uses: peter-evans/create-issue-from-file@v5
        with:
          title: '🧹 Weekly entropy report'
          content-filepath: entropy-report.md
          labels: entropy, maintenance

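This workflow runs automatically every Monday, compiles the health of the codebase into a report, and files it as a GitHub issue. People can read the report and clean up by hand, or the cleanup itself can be delegated back to an agent.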
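
If cleanup is delegated to an agent, the report has to become a task description at some point. The sketch below shows one possible shape for that step. The `EntropyFinding` type, the `buildCleanupPrompt` function, and the prompt wording are all assumptions made for illustration; the workflow above produces a Markdown report, not structured findings.

```typescript
// Hypothetical structured form of one line in the entropy report.
interface EntropyFinding {
  kind: 'duplicate' | 'dead-code' | 'unused-dependency';
  location: string;
  note: string;
}

// Turn findings into a numbered task list that an agent can work through.
function buildCleanupPrompt(findings: EntropyFinding[]): string {
  if (findings.length === 0) return '';
  const items = findings.map((f, i) => `${i + 1}. [${f.kind}] ${f.location}: ${f.note}`);
  return [
    'Clean up the following entropy findings.',
    'Merge duplicate functions, remove dead code, and drop unused dependencies.',
    'All existing architecture tests must still pass afterwards.',
    '',
    ...items,
  ].join('\n');
}

console.log(
  buildCleanupPrompt([
    { kind: 'duplicate', location: 'src/utils/date.ts', note: 'same purpose as src/lib/format.ts' },
    { kind: 'dead-code', location: 'src/legacy-report.ts', note: 'no importers' },
  ])
);
```

Returning an empty string when there is nothing to clean up lets the caller skip the agent run entirely instead of sending an empty task.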

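Core principles of entropy management

1. Separate prevention from detection
   - Architectural constraints are prevention (they block rule violations up front)
   - Entropy management is detection (it finds, after the fact, inconsistencies that accumulate even when every rule was followed)
   - You need both. Prevention alone cannot stop duplication and dead code.

2. Automate it
   - Having humans walk the codebase by hand looking for duplicates is unrealistic
   - Integrate the checks into the CI/CD pipeline or run them as a scheduled workflow
   - Generate reports automatically to keep the state visible

3. Set thresholds
   - "0% duplication" is unrealistic and counterproductive
   - Pick limits that fit the team, such as duplication at or below 5% and dead code at or below 3%
   - Alert only when a threshold is exceeded → avoids alert fatigue

4. Cleanup can be delegated to an agent too
   - Turn the entropy report into a prompt for the cleanup task
   - "Read entropy-report.md, consolidate the duplicate functions, and remove the dead code"
   - The result of the cleanup must still pass the same architecture tests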
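
The third principle can be sketched as a small gate that stays silent unless a limit is exceeded. The 5% and 3% limits come from the example above; the `EntropyMetrics` shape and the `findViolations` function are hypothetical, and how the percentages are measured is left to whatever tools the team runs.

```typescript
// Hypothetical metrics collected from the scan tools, in percent.
interface EntropyMetrics {
  duplicationPercent: number;
  deadCodePercent: number;
}

// Example limits from the text: duplication at or below 5%, dead code at or below 3%.
const limits: EntropyMetrics = { duplicationPercent: 5, deadCodePercent: 3 };

// Return one message per metric that is strictly above its limit.
function findViolations(metrics: EntropyMetrics, max: EntropyMetrics): string[] {
  const violations: string[] = [];
  (Object.keys(max) as Array<keyof EntropyMetrics>).forEach(key => {
    if (metrics[key] > max[key]) {
      violations.push(`${key}: ${metrics[key]}% exceeds ${max[key]}%`);
    }
  });
  return violations;
}

console.log(findViolations({ duplicationPercent: 4.2, deadCodePercent: 3.8 }, limits));
// [ 'deadCodePercent: 3.8% exceeds 3%' ]
```

An empty result means no alert is raised, which is what keeps the weekly signal meaningful.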

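Entropy management is not something you set up once and forget; it is a continuous, repeating process. As the codebase grows, entropy increases naturally. Just as a garbage collector keeps running for as long as the program runs, entropy management has to keep running for as long as the project lives.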