Node.js Cluster Module - Scaling Up with Multiple Processes

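Node.js runs JavaScript on a single thread, so by default an application uses only one CPU core. To scale on a multi-core machine, you need to start several processes and spread incoming requests across them.

This article explains how to build a multi-process setup with the node:cluster module. It covers how the primary/worker model works, how to fork processes dynamically based on the number of CPU cores, how to implement graceful shutdown, and how to run the result in production with PM2, all through practical code examples.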

Environment

Item	Version
Node.js	20.x LTS or later
npm	10.x or later
OS	Windows/macOS/Linux

Prerequisites

Basic JavaScript knowledge (functions, objects, async/await)
Familiarity with the core Node.js APIs
An understanding of how an HTTP server works

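What Is the Cluster Module?

The node:cluster module creates a cluster of Node.js processes and runs multiple instances that share the same server port. Worker processes are spawned with child_process.fork() and communicate with the primary process over IPC (inter-process communication).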

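Choosing Between cluster, Worker Threads, and child_process

Feature	cluster	Worker Threads	child_process
Unit of execution	Process	Thread	Process
Memory space	Separate	Can be shared	Separate
Server port sharing	Yes	No	No
Startup cost	High	Low	High
Main use	Scaling an HTTP server	CPU-bound computation	Running external commands
Process isolation	Full	None (same process)	Full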

Decision Flowchart

flowchart TD
    A[Identify the type of work] --> B{Scaling an HTTP server?}
    B -->|Yes| C[cluster]
    B -->|No| D{CPU-bound computation?}
    D -->|Yes| E[Worker Threads]
    D -->|No| F{Running an external command?}
    F -->|Yes| G[child_process]
    F -->|No| H[Use asynchronous APIs]

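The Cluster module is a good fit when you want to:

Handle HTTP requests in parallel across multiple cores
Distribute requests across several processes
Keep serving from the remaining workers when one worker crashes
Deploy with zero downtime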

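How Cluster Works

Load-Balancing Approaches

The Cluster module supports two load-balancing approaches.

1. Round-robin (default)

The primary process listens on the port and hands new connections to the workers in round-robin order. This is the default on every platform except Windows.

2. Delegating to the operating system

The primary process creates the listening socket and sends it to the interested workers, which then accept connections directly. In theory this gives the best performance, but because it depends on the OS scheduler, the load can end up unevenly distributed.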

flowchart TD
    subgraph "Round-robin"
        A[Client] --> B[Primary process]
        B --> C[Worker 1]
        B --> D[Worker 2]
        B --> E[Worker 3]
    end
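
The round-robin idea can be pictured with a small self-contained sketch. This is only a conceptual model, not how Node.js implements SCHED_RR internally, and createRoundRobin and the worker labels are hypothetical names used for illustration.

```javascript
// Conceptual model of round-robin distribution (not Node.js internals).
// createRoundRobin and the worker labels are hypothetical names.
function createRoundRobin(workers) {
  if (workers.length === 0) {
    throw new Error('at least one worker is required');
  }
  let index = 0;
  return function next() {
    const worker = workers[index];
    index = (index + 1) % workers.length; // wrap around to the first worker
    return worker;
  };
}

const nextWorker = createRoundRobin(['worker1', 'worker2', 'worker3']);
const assigned = [];
for (let connection = 0; connection < 5; connection++) {
  assigned.push(nextWorker());
}
console.log(assigned.join(', '));
// prints "worker1, worker2, worker3, worker1, worker2"
```

Each new connection simply goes to the next worker in the list, which is why the load stays even regardless of how the OS would have scheduled the accepts.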

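The scheduling policy can be set through an environment variable or through cluster.schedulingPolicy. It is a global setting, so set it before the first worker is forked or cluster.setupPrimary() is called.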

const cluster = require('node:cluster');

// Explicitly select round-robin scheduling
cluster.schedulingPolicy = cluster.SCHED_RR;

// Delegate scheduling to the operating system
// cluster.schedulingPolicy = cluster.SCHED_NONE;

Using an environment variable:

# Round-robin
NODE_CLUSTER_SCHED_POLICY=rr node app.js

# Delegate to the operating system
NODE_CLUSTER_SCHED_POLICY=none node app.js

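Roles of the Primary and the Workers

flowchart LR
    subgraph Primary["Primary process"]
        P1[Worker management]
        P2[Port listening]
        P3[Connection distribution]
    end

    subgraph Workers["Worker processes"]
        W1[Worker 1<br/>Request handling]
        W2[Worker 2<br/>Request handling]
        W3[Worker 3<br/>Request handling]
    end

    Primary --> Workers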

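Responsibilities of the primary process:

Spawning worker processes (fork)
Monitoring worker health
Load-balancing connections
Handling signals

Responsibilities of a worker process:

Handling the actual requests
Running the business logic
Returning responses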

A Basic Cluster Implementation

A Minimal HTTP Server

This is the basic pattern, with the primary and the worker code in the same file.

// server.js
const cluster = require('node:cluster');
const http = require('node:http');
const os = require('node:os');
const process = require('node:process');

// Number of workers to start: the parallelism available to this process
const numCPUs = os.availableParallelism();

if (cluster.isPrimary) {
  console.log(`Primary ${process.pid} started`);
  console.log(`CPU cores: ${numCPUs}`);

  // Fork one worker per CPU core
  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }

  // Watch for worker exits
  cluster.on('exit', (worker, code, signal) => {
    console.log(`Worker ${worker.process.pid} exited (code: ${code}, signal: ${signal})`);
  });
} else {
  // Each worker starts an HTTP server; the port is shared through the primary
  http.createServer((req, res) => {
    res.writeHead(200);
    res.end(`Handled by worker ${process.pid}\n`);
  }).listen(3000);

  console.log(`Worker ${process.pid} started`);
}

Example output:

$ node server.js
Primary 12345 started
CPU cores: 8
Worker 12346 started
Worker 12347 started
Worker 12348 started
Worker 12349 started
Worker 12350 started
Worker 12351 started
Worker 12352 started
Worker 12353 started
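
Forking one worker per core is a reasonable default, but containers and shared hosts sometimes call for a lower cap. The sketch below shows one way to make the count configurable. The WEB_CONCURRENCY variable name and the resolveWorkerCount helper are assumptions for illustration; neither is part of the cluster API.

```javascript
const os = require('node:os');

// Hypothetical helper: pick the number of workers to fork.
// `requested` would typically come from an environment variable such as
// WEB_CONCURRENCY (an assumed name; Node.js does not read it itself).
function resolveWorkerCount(requested, available = os.availableParallelism()) {
  const parsed = Number.parseInt(requested, 10);
  if (Number.isNaN(parsed) || parsed < 1) {
    return available; // fall back to one worker per available core
  }
  return Math.min(parsed, available); // never exceed the available parallelism
}

console.log(resolveWorkerCount(undefined, 8)); // 8
console.log(resolveWorkerCount('4', 8));       // 4
console.log(resolveWorkerCount('32', 8));      // 8
console.log(resolveWorkerCount('abc', 8));     // 8
```

In the server above, the loop bound could then become resolveWorkerCount(process.env.WEB_CONCURRENCY) instead of numCPUs.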

Key APIs and Properties

const cluster = require('node:cluster');

// Check whether this process is the primary or a worker
console.log(cluster.isPrimary);  // true: primary, false: worker
console.log(cluster.isWorker);   // true: worker, false: primary

// Available only inside a worker
if (cluster.isWorker) {
  console.log(cluster.worker.id);        // Worker ID (an integer starting at 1)
  console.log(cluster.worker.process);   // Inside a worker this is the global process object
}

// Available only inside the primary
if (cluster.isPrimary) {
  // Hash of all active workers (key: worker.id)
  console.log(cluster.workers);

  // Settings object (populated by setupPrimary() or the first fork())
  console.log(cluster.settings);
}

Customizing Worker Settings

cluster.setupPrimary() lets you customize how workers are spawned.

const cluster = require('node:cluster');

if (cluster.isPrimary) {
  // Customize how workers are spawned
  cluster.setupPrimary({
    exec: 'worker.js',           // File to run as the worker
    args: ['--use', 'https'],    // Arguments passed to the worker
    silent: false,               // false: worker inherits the primary's stdio; true: output is piped to the primary
  });

  // Fork with extra environment variables
  cluster.fork({ NODE_ENV: 'production', WORKER_TYPE: 'api' });
  cluster.fork({ NODE_ENV: 'production', WORKER_TYPE: 'worker' });
}

Worker Health Monitoring and Automatic Restart

In production you need a mechanism that restarts a worker automatically when it crashes.

Basic Automatic Restart

const cluster = require('node:cluster');
const http = require('node:http');
const os = require('node:os');
const process = require('node:process');

const numCPUs = os.availableParallelism();

if (cluster.isPrimary) {
  console.log(`Primary ${process.pid} started`);

  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }

  // Handle worker exits
  cluster.on('exit', (worker, code, signal) => {
    // Was this an intentional exit (via disconnect)?
    if (worker.exitedAfterDisconnect === true) {
      console.log(`Worker ${worker.process.pid} exited normally`);
    } else {
      // Unexpected exit: start a replacement
      console.log(`Worker ${worker.process.pid} crashed. Restarting...`);
      cluster.fork();
    }
  });

  // Fired once a forked worker is running
  cluster.on('online', (worker) => {
    console.log(`Worker ${worker.process.pid} is online`);
  });

} else {
  http.createServer((req, res) => {
    // Simulate a crash on purpose (for testing only)
    if (req.url === '/crash') {
      process.exit(1);
    }
    res.writeHead(200);
    res.end(`Handled by worker ${process.pid}\n`);
  }).listen(3000);
}

Implementing a Restart Rate Limit

To prevent crash loops, limit how often workers are restarted.

const cluster = require('node:cluster');
const http = require('node:http');
const os = require('node:os');
const process = require('node:process');

const numCPUs = os.availableParallelism();

// Timestamps of recent restarts, shared across the whole cluster.
// A replacement worker gets a new worker.id, so a history keyed by
// worker ID would never reach the limit.
let restartHistory = [];
const RESTART_LIMIT = 5;        // Maximum restarts per window
const RESTART_WINDOW = 60000;   // Observation window (60 seconds)

function shouldRestart() {
  const now = Date.now();

  // Drop entries that fall outside the window
  restartHistory = restartHistory.filter((time) => now - time < RESTART_WINDOW);

  if (restartHistory.length >= RESTART_LIMIT) {
    console.error('Restart limit reached; not forking a replacement worker');
    return false;
  }

  restartHistory.push(now);
  return true;
}

if (cluster.isPrimary) {
  console.log(`Primary ${process.pid} started`);

  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }

  cluster.on('exit', (worker, code, signal) => {
    if (!worker.exitedAfterDisconnect) {
      console.log(`Worker ${worker.id} (PID: ${worker.process.pid}) exited abnormally`);

      if (shouldRestart()) {
        console.log(`Starting a replacement for worker ${worker.id}...`);
        setTimeout(() => cluster.fork(), 1000); // Restart after 1 second
      }
    }
  });

} else {
  http.createServer((req, res) => {
    res.writeHead(200);
    res.end(`Handled by worker ${process.pid}\n`);
  }).listen(3000);
}
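
The sliding-window check is easy to get subtly wrong, so it helps to exercise it in isolation. The following is a minimal sketch of the same idea with an injectable clock; createRestartLimiter and its options are hypothetical names, and the limit and window values are chosen only to keep the example short.

```javascript
// Sliding-window restart limiter with an injectable clock (hypothetical helper).
function createRestartLimiter({ limit, windowMs, now = Date.now }) {
  let history = [];
  return function shouldRestart() {
    const current = now();
    // Keep only the restarts that happened inside the window
    history = history.filter((time) => current - time < windowMs);
    if (history.length >= limit) {
      return false;
    }
    history.push(current);
    return true;
  };
}

// Simulated clock so the behaviour can be checked without waiting
let fakeTime = 0;
const shouldRestart = createRestartLimiter({
  limit: 2,
  windowMs: 1000,
  now: () => fakeTime,
});

const results = [];
results.push(shouldRestart()); // t=0: first restart, allowed
fakeTime = 100;
results.push(shouldRestart()); // t=100: second restart, allowed
fakeTime = 200;
results.push(shouldRestart()); // t=200: limit of 2 reached, refused
fakeTime = 1500;
results.push(shouldRestart()); // t=1500: earlier entries expired, allowed
console.log(results); // [ true, true, false, true ]
```

Passing the clock in as a parameter keeps the production default (Date.now) while making the window logic testable without timers.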

Inter-Process Communication (IPC)

The primary and the workers can send messages to each other.

Basic Messaging

const cluster = require('node:cluster');
const http = require('node:http');
const os = require('node:os');
const process = require('node:process');

const numCPUs = os.availableParallelism();

if (cluster.isPrimary) {
  let totalRequests = 0;

  // Periodically print the request count
  setInterval(() => {
    console.log(`Total requests: ${totalRequests}`);
  }, 5000);

  // Handle messages sent by workers
  const messageHandler = (msg) => {
    if (msg && msg.cmd === 'notifyRequest') {
      totalRequests++;
    }
  };

  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }

  // Attach the message handler to every worker
  for (const id in cluster.workers) {
    cluster.workers[id].on('message', messageHandler);
  }

  cluster.on('exit', (worker, code, signal) => {
    if (!worker.exitedAfterDisconnect) {
      // Replacement workers need the handler too
      const newWorker = cluster.fork();
      newWorker.on('message', messageHandler);
    }
  });

} else {
  http.createServer((req, res) => {
    res.writeHead(200);
    res.end(`Handled by worker ${process.pid}\n`);

    // Notify the primary about the request
    process.send({ cmd: 'notifyRequest' });
  }).listen(3000);
}
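
As the number of message types grows, a single if chain in the handler becomes hard to follow. One possible arrangement is a small router keyed by the cmd field, sketched below. createMessageRouter is a hypothetical helper, not part of the cluster API, and the sketch runs without forking so the dispatch logic can be checked on its own.

```javascript
// Hypothetical helper: route IPC messages to handlers by their `cmd` field.
function createMessageRouter(handlers) {
  return function route(msg) {
    // Ignore anything that is not a { cmd: '...' } message
    if (msg === null || typeof msg !== 'object' || typeof msg.cmd !== 'string') {
      return false;
    }
    // Only own properties count, so inherited names like "toString" are rejected
    if (!Object.hasOwn(handlers, msg.cmd)) {
      return false;
    }
    handlers[msg.cmd](msg);
    return true;
  };
}

let totalRequests = 0;
const route = createMessageRouter({
  notifyRequest: () => { totalRequests++; },
});

route({ cmd: 'notifyRequest' });
route({ cmd: 'notifyRequest' });
route({ cmd: 'unknown' });   // ignored: no handler registered
route('not an object');      // ignored: wrong shape
console.log(totalRequests);  // 2
```

In the primary, the router would be attached with worker.on('message', route) in place of messageHandler.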

Broadcasting to Workers

const cluster = require('node:cluster');
const http = require('node:http');
const os = require('node:os');
const process = require('node:process');

const numCPUs = os.availableParallelism();

if (cluster.isPrimary) {
  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }

  // Broadcast a message to every connected worker
  const broadcastMessage = (message) => {
    for (const id in cluster.workers) {
      const worker = cluster.workers[id];
      if (worker.isConnected()) {
        worker.send(message);
      }
    }
  };

  // Announce a configuration update after 10 seconds
  setTimeout(() => {
    broadcastMessage({ type: 'config:update', data: { timeout: 5000 } });
  }, 10000);

} else {
  // Receive messages on the worker side
  process.on('message', (msg) => {
    if (msg.type === 'config:update') {
      console.log(`Worker ${process.pid}: configuration updated`, msg.data);
      // Apply the new configuration here
    }
  });

  http.createServer((req, res) => {
    res.writeHead(200);
    res.end(`Handled by worker ${process.pid}\n`);
  }).listen(3000);
}

Implementing Graceful Shutdown

In production it is important to shut down gracefully: let in-flight requests finish before the process exits.

Basic Graceful Shutdown

const cluster = require('node:cluster');
const http = require('node:http');
const os = require('node:os');
const process = require('node:process');

const numCPUs = os.availableParallelism();

if (cluster.isPrimary) {
  console.log(`Primary ${process.pid} started`);

  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }

  let shuttingDown = false;

  // Start a graceful shutdown on SIGTERM/SIGINT
  const shutdown = (signal) => {
    if (shuttingDown) return; // Ignore repeated signals
    shuttingDown = true;
    console.log(`Received ${signal}. Starting graceful shutdown...`);

    // Disconnect all workers
    cluster.disconnect(() => {
      console.log('All workers have disconnected');
      process.exit(0);
    });
  };

  process.on('SIGTERM', () => shutdown('SIGTERM'));
  process.on('SIGINT', () => shutdown('SIGINT'));

  cluster.on('exit', (worker, code, signal) => {
    console.log(`Worker ${worker.process.pid} exited`);
  });

} else {
  const server = http.createServer((req, res) => {
    // Simulate slow work
    setTimeout(() => {
      res.writeHead(200);
      res.end(`Handled by worker ${process.pid}\n`);
    }, 1000);
  });

  server.listen(3000, () => {
    console.log(`Worker ${process.pid} is listening on port 3000`);
  });

  // Close the server when the worker itself receives SIGTERM/SIGINT
  // (for example, Ctrl+C delivers SIGINT to the whole process group)
  const gracefulShutdown = () => {
    console.log(`Worker ${process.pid}: shutting down...`);

    server.close(() => {
      console.log(`Worker ${process.pid}: server closed`);
      process.exit(0);
    });
  };

  process.on('SIGTERM', gracefulShutdown);
  process.on('SIGINT', gracefulShutdown);
}

Graceful Shutdown with a Timeout

Set a timeout in case connections stay open for a long time.

const cluster = require('node:cluster');
const http = require('node:http');
const os = require('node:os');
const process = require('node:process');

const numCPUs = os.availableParallelism();
const SHUTDOWN_TIMEOUT = 30000; // 30 seconds

if (cluster.isPrimary) {
  console.log(`Primary ${process.pid} started`);

  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }

  let shuttingDown = false;

  const shutdown = (signal) => {
    if (shuttingDown) return; // Ignore repeated signals
    shuttingDown = true;
    console.log(`Received ${signal}`);

    // Overall deadline for the whole cluster
    const forceExitTimer = setTimeout(() => {
      console.error('Graceful shutdown timed out. Forcing exit.');
      process.exit(1);
    }, SHUTDOWN_TIMEOUT);

    const workers = Object.values(cluster.workers);
    let remaining = workers.length;
    if (remaining === 0) process.exit(0);

    for (const worker of workers) {
      // Count 'exit' events instead of checking cluster.workers on 'disconnect':
      // a worker is removed from cluster.workers only after it has both
      // disconnected and exited, and the order of the two is not guaranteed.
      worker.once('exit', () => {
        remaining -= 1;
        if (remaining === 0) {
          clearTimeout(forceExitTimer);
          console.log('All workers have exited');
          process.exit(0);
        }
      });

      // Ask the worker to shut down, then close its IPC channel
      worker.send('shutdown');
      worker.disconnect();

      // Per-worker deadline, 5 seconds before the overall one
      setTimeout(() => {
        if (!worker.isDead()) {
          console.log(`Force-killing worker ${worker.process.pid}`);
          // worker.kill() waits for the disconnect first, so signal the process directly
          worker.process.kill('SIGKILL');
        }
      }, SHUTDOWN_TIMEOUT - 5000);
    }
  };

  process.on('SIGTERM', () => shutdown('SIGTERM'));
  process.on('SIGINT', () => shutdown('SIGINT'));

} else {
  // Track responses that are still in flight
  const connections = new Set();

  const server = http.createServer((req, res) => {
    connections.add(res);
    res.on('close', () => connections.delete(res));

    setTimeout(() => {
      res.writeHead(200);
      res.end(`Handled by worker ${process.pid}\n`);
    }, 1000);
  });

  server.listen(3000);

  process.on('message', (msg) => {
    if (msg === 'shutdown') {
      console.log(`Worker ${process.pid}: shutdown started`);
      console.log(`Active connections: ${connections.size}`);

      // Stop accepting new connections; the callback runs once existing ones finish
      server.close(() => {
        console.log(`Worker ${process.pid}: server closed`);
        process.exit(0);
      });
    }
  });
}
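
The same "drain, but not forever" idea can also be applied to a single server. The sketch below is one possible shape, assuming Node.js 18.2 or later for server.closeAllConnections(); closeWithTimeout is a hypothetical helper name, and port 0 is used only so the example can run anywhere.

```javascript
const http = require('node:http');

// Hypothetical helper: resolve with 'closed' when the server drains in time,
// or with 'timeout' after forcing the remaining connections shut.
function closeWithTimeout(server, timeoutMs) {
  return new Promise((resolve) => {
    const timer = setTimeout(() => {
      server.closeAllConnections(); // available since Node.js 18.2
      resolve('timeout');
    }, timeoutMs);

    // Stop accepting new connections; the callback runs once all have ended
    server.close(() => {
      clearTimeout(timer);
      resolve('closed');
    });
  });
}

const server = http.createServer((req, res) => res.end('ok\n'));
const shutdownResult = new Promise((resolve) => {
  server.listen(0, () => {            // port 0: let the OS pick a free port
    closeWithTimeout(server, 1000).then(resolve);
  });
});
shutdownResult.then((outcome) => console.log(`shutdown outcome: ${outcome}`));
```

Inside a worker, the 'shutdown' message handler could await closeWithTimeout(server, ...) and then call process.exit(0), which keeps the deadline logic next to the server it protects.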

Zero-Downtime Restart

Restarting the workers one at a time lets you update the application without downtime.

const cluster = require('node:cluster');
const http = require('node:http');
const os = require('node:os');
const process = require('node:process');

const numCPUs = os.availableParallelism();

if (cluster.isPrimary) {
  console.log(`Primary ${process.pid} started`);

  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }

  let restarting = false;

  // Start a zero-downtime restart on SIGUSR2
  // (a POSIX signal, so this trigger does not work on Windows)
  process.on('SIGUSR2', async () => {
    if (restarting) return; // A rolling restart is already in progress
    restarting = true;
    console.log('Starting zero-downtime restart...');

    const workerIds = Object.keys(cluster.workers);

    for (const id of workerIds) {
      const worker = cluster.workers[id];
      if (!worker) continue;

      console.log(`Replacing worker ${worker.process.pid}...`);

      // Start the replacement worker
      const newWorker = cluster.fork();

      // Wait until the replacement is listening for connections
      await new Promise((resolve) => {
        newWorker.once('listening', resolve);
      });

      // Disconnect the old worker
      worker.disconnect();

      // Wait for the old worker to exit
      await new Promise((resolve) => {
        worker.once('exit', resolve);
      });

      console.log(`Worker ${worker.process.pid} has been replaced`);
    }

    restarting = false;
    console.log('Zero-downtime restart complete');
  });

  cluster.on('exit', (worker, code, signal) => {
    if (!worker.exitedAfterDisconnect) {
      console.log(`Worker ${worker.process.pid} crashed. Restarting...`);
      cluster.fork();
    }
  });

} else {
  http.createServer((req, res) => {
    res.writeHead(200);
    res.end(`Handled by worker ${process.pid}\n`);
  }).listen(3000);

  console.log(`Worker ${process.pid} started`);
}

Usage:

# Start the server
node server.js

# From another terminal, trigger the zero-downtime restart
kill -USR2 <primary PID>

Running in Production with PM2

PM2 is a process manager for Node.js applications. It wraps the Cluster module and makes a multi-process setup much easier to achieve.

Installing PM2 and Basic Operations

# Install globally
npm install -g pm2

# Start the application in cluster mode
pm2 start app.js -i max

# Start with a specific number of instances
pm2 start app.js -i 4

# Check status
pm2 status

# View logs
pm2 logs

# Monitor
pm2 monit

# Stop
pm2 stop app

# Restart
pm2 restart app

# Reload (zero downtime)
pm2 reload app

# Delete
pm2 delete app

PM2 Configuration File (ecosystem.config.js)

In production, manage the setup with a configuration file.

// ecosystem.config.js
module.exports = {
  apps: [{
    name: 'api-server',
    script: './src/app.js',
    instances: 'max',           // One instance per CPU core
    exec_mode: 'cluster',       // Enable cluster mode

    // Environment variables
    env: {
      NODE_ENV: 'development',
      PORT: 3000,
    },
    env_production: {
      NODE_ENV: 'production',
      PORT: 3000,
    },

    // Logging
    log_date_format: 'YYYY-MM-DD HH:mm:ss Z',
    error_file: './logs/error.log',
    out_file: './logs/out.log',
    merge_logs: true,

    // Restart behavior
    max_memory_restart: '1G',   // Restart when memory usage exceeds 1 GB
    restart_delay: 4000,        // Delay between restarts (ms)
    max_restarts: 10,           // Consecutive unstable restarts before the app is marked as errored
    min_uptime: 5000,           // Minimum uptime (ms); exiting sooner counts as an unstable start

    // File watching (development only)
    watch: false,
    ignore_watch: ['node_modules', 'logs'],

    // Graceful shutdown
    kill_timeout: 5000,         // Time (ms) to wait before the final SIGKILL
    listen_timeout: 8000,       // Time (ms) to wait for the app to start listening
    shutdown_with_message: true, // Send a 'shutdown' IPC message instead of SIGINT; the app must handle it via process.on('message')
  }]
};

Starting with the configuration file:

# Development
pm2 start ecosystem.config.js

# Production
pm2 start ecosystem.config.js --env production

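Graceful Shutdown with PM2

PM2 starts a graceful shutdown by sending SIGINT to the process. If the process has not exited by the time kill_timeout elapses, PM2 follows up with SIGKILL.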

// app.js
const http = require('node:http');
const process = require('node:process');

const server = http.createServer((req, res) => {
  res.writeHead(200);
  res.end('OK\n');
});

server.listen(process.env.PORT || 3000, () => {
  console.log(`Server started: PID ${process.pid}`);

  // Tell PM2 the app is ready (used with the wait_ready option)
  if (process.send) {
    process.send('ready');
  }
});

// Shut down gracefully on SIGINT
process.on('SIGINT', () => {
  console.log('SIGINT received: starting graceful shutdown');

  server.close((err) => {
    if (err) {
      console.error('Error while closing the server:', err);
      process.exit(1);
    }
    console.log('Server closed cleanly');
    process.exit(0);
  });
});

Enable wait_ready in ecosystem.config.js:

module.exports = {
  apps: [{
    name: 'api-server',
    script: './src/app.js',
    instances: 'max',
    exec_mode: 'cluster',
    wait_ready: true,           // Wait for the 'ready' message
    listen_timeout: 10000,      // Startup timeout (ms)
    kill_timeout: 5000,         // Shutdown timeout (ms)
  }]
};

Useful PM2 Commands

# Scale the cluster
pm2 scale api-server 4        # Set the instance count to 4
pm2 scale api-server +2       # Add 2 instances
pm2 scale api-server -1       # Remove 1 instance

# Zero-downtime reload
pm2 reload api-server

# Reload every process
pm2 reload all

# Restart one specific instance by its PM2 process id (shown by pm2 status)
pm2 restart 0

# Generate a startup script
pm2 startup

# Save the current process list
pm2 save

# Restore the saved process list
pm2 resurrect

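Designing a Stateless Application

In cluster mode each worker process has its own memory space, so the application has to be designed to be stateless.

Session Management

In-memory sessions are not shared between workers, so use external storage.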

// Bad: in-memory sessions
const sessions = new Map(); // Each worker has its own copy

// Good: use Redis as the session store
const express = require('express');
const session = require('express-session');
// Import form used by connect-redis v7; other major versions export the
// store differently, so check the README of the installed version.
const RedisStore = require('connect-redis').default;
const { createClient } = require('redis');

const app = express();

const redisClient = createClient({ url: 'redis://localhost:6379' });
redisClient.connect().catch(console.error);

app.use(session({
  store: new RedisStore({ client: redisClient }),
  secret: 'your-secret-key', // Placeholder: load the real secret from configuration
  resave: false,
  saveUninitialized: false,
}));

Caching Strategy

// Bad: in-memory cache
const cache = new Map(); // Each worker has its own copy

// Good: use an external cache (Redis)
const { createClient } = require('redis');

const redis = createClient({ url: 'redis://localhost:6379' });
// The client must be connected before any command is issued
redis.connect().catch(console.error);

async function getFromCache(key) {
  const cached = await redis.get(key);
  if (cached !== null) {
    return JSON.parse(cached);
  }
  return null;
}

// ttl is in seconds
async function setToCache(key, value, ttl = 3600) {
  await redis.setEx(key, ttl, JSON.stringify(value));
}
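
The two helpers are usually combined into a cache-aside lookup: read from the cache, fall back to the real data source on a miss, then store the result. The sketch below shows that flow. getOrLoad, createMemoryStore, and loadUser are hypothetical names, and the Map-backed store is only a stand-in for Redis so the example runs without a server; in a clustered deployment the store would be the shared Redis client.

```javascript
// Cache-aside sketch. `store` is any object with async get/set methods.
// The Map-backed store is a stand-in for Redis (an assumption for this demo).
function createMemoryStore() {
  const data = new Map();
  return {
    async get(key) { return data.has(key) ? data.get(key) : null; },
    async set(key, value) { data.set(key, value); },
  };
}

async function getOrLoad(store, key, loader) {
  const cached = await store.get(key);
  if (cached !== null) {
    return JSON.parse(cached);       // cache hit
  }
  const value = await loader();      // cache miss: load from the source
  await store.set(key, JSON.stringify(value));
  return value;
}

let loaderCalls = 0;
const loadUser = async () => {
  loaderCalls++;
  return { id: 1, name: 'alice' };
};

const demo = (async () => {
  const store = createMemoryStore();
  const first = await getOrLoad(store, 'user:1', loadUser);
  const second = await getOrLoad(store, 'user:1', loadUser);
  return { first, second, loaderCalls };
})();
demo.then((result) => console.log(result.loaderCalls)); // 1
```

The loader runs only once because the second lookup is served from the store, which is the behavior every worker gets for free when the store is shared.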

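File Uploads

Workers on the same host can see the same local file system, but files written there are not visible to instances on other hosts or containers and may disappear on redeploy. Use shared storage for uploaded files.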

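const express = require('express');
const multer = require('multer');
const { S3Client, PutObjectCommand } = require('@aws-sdk/client-s3');

const app = express();

// Bad: save to the local file system
// const upload = multer({ dest: 'uploads/' });

// Good: use shared storage such as S3
const s3Client = new S3Client({ region: 'ap-northeast-1' });

// Keep the upload in memory just long enough to forward it to S3
const upload = multer({ storage: multer.memoryStorage() });

app.post('/upload', upload.single('file'), async (req, res) => {
  if (!req.file) {
    return res.status(400).json({ message: 'No file uploaded' });
  }

  try {
    const command = new PutObjectCommand({
      Bucket: 'my-bucket',
      Key: `uploads/${Date.now()}-${req.file.originalname}`,
      Body: req.file.buffer,
    });

    await s3Client.send(command);
    res.json({ message: 'Upload successful' });
  } catch (err) {
    console.error('Upload failed:', err);
    res.status(500).json({ message: 'Upload failed' });
  }
});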

Measuring Performance

Run a benchmark to measure the effect of cluster mode.

Benchmarking with autocannon

# Install autocannon
npm install -g autocannon

# Benchmark the single-process server
node single-server.js &
autocannon -c 100 -d 30 http://localhost:3000
kill %1   # Stop the single-process server so port 3000 is free again

# Benchmark the cluster-mode server
node cluster-server.js &
autocannon -c 100 -d 30 http://localhost:3000

Example Comparison of Benchmark Results

Configuration	Requests/sec	Average latency	Throughput
Single process	12,000	8.2 ms	2.1 MB/s
4 workers	42,000	2.3 ms	7.4 MB/s
8 workers	78,000	1.2 ms	13.8 MB/s
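
To read a table like this, it helps to convert the raw request rates into speedup and per-worker efficiency relative to the single-process baseline. The sketch below does that arithmetic for the example figures above; the scaling helper is a hypothetical name introduced for illustration.

```javascript
// Speedup and scaling efficiency relative to the single-process baseline.
// The inputs are the example requests/sec values from the table above.
function scaling(baselineRps, rps, workers) {
  const speedup = rps / baselineRps;
  return { speedup, efficiency: speedup / workers };
}

const four = scaling(12000, 42000, 4);
const eight = scaling(12000, 78000, 8);

console.log(four.speedup, four.efficiency);   // 3.5 0.875
console.log(eight.speedup, eight.efficiency); // 6.5 0.8125
```

In this example, 4 workers give a 3.5x speedup (87.5% efficiency) and 8 workers give 6.5x (about 81%), so throughput keeps growing while the gain per additional worker shrinks.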

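Summary

The Node.js Cluster module lets an application scale across multiple cores and improves its availability.

Key points when using the Cluster module:

Pick the right tool: use Cluster to scale HTTP servers and Worker Threads for CPU-bound work
Automatic restart: recover automatically when a worker crashes
Graceful shutdown: let in-flight requests finish before exiting
Stateless design: keep sessions and caches in external storage
Use PM2: simplify process management in production with PM2

Understanding the Cluster module is an important skill for running Node.js applications in production. Combined with PM2, it lets you build a highly available, scalable system while keeping operational complexity down.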
