How DataLoader Eliminates N+1

2026.04.04 · 9 min read

#DataLoader #GraphQL #NestJS #N+1 #EventLoop

When you first meet DataLoader, one question comes up right away.

@ResolveField(() => [Comment])
async comments(@Parent() post: Post) {
  return this.commentLoader.load(post.id); // called once per post
}

With 20 posts, load() is called 20 times.
And yet only one DB query runs?

Let's look at how that is possible, and at how to structure loaders across nested resolvers.

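The core idea: one tick of the event loop

DataLoader's trick rests on the JavaScript event loop.

Calling load(id) does not query the DB right away.
Instead, DataLoader queues the key and waits until the currently running synchronous code has finished and the call stack is empty.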

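While the call stack is running:
  load(1) called → queue: [1]
  load(2) called → queue: [1, 2]
  load(3) called → queue: [1, 2, 3]
  ...
  load(20) called → queue: [1, 2, ..., 20]

Call stack empties → process.nextTick callback runs
  → batch function called → SELECT WHERE post_id IN (1, ..., 20)

A "tick" here is not a fixed amount of time such as a few milliseconds.
It is the moment the current call stack empties. Whether there are 20 calls or 1000, every load() made before that moment lands in the same batch.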

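A GraphQL executor resolves the same field for every item of a list synchronously, one item after another,
so the comments resolver calls for all 20 posts happen within the same tick.
That is why DataLoader can batch all 20 into one call.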

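[!info] process.nextTick In Node.js, DataLoader schedules the batch with process.nextTick(), wrapped in a resolved Promise so that it runs after pending Promise jobs.
A process.nextTick callback runs before any setTimeout(fn, 0) timer. Strictly speaking it is not the Promise microtask queue but Node's own nextTick queue, which is drained ahead of it.
Either way, the batch runs as soon as the current synchronous work is done, before the event loop moves on to timers or I/O.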
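
To make the queue-then-flush idea concrete, here is a minimal sketch of a batching loader. It is for illustration only and is not how the dataloader package is written: TinyLoader and its members are names made up for this post, there is no caching, and the flush is scheduled with a plain resolved Promise instead of process.nextTick.

```typescript
// Illustrative sketch only. TinyLoader is a hypothetical name, not part of any library.
type BatchFn<K, V> = (keys: readonly K[]) => Promise<V[]>;

interface Pending<K, V> {
  key: K;
  resolve: (value: V) => void;
  reject: (reason: unknown) => void;
}

class TinyLoader<K, V> {
  private queue: Pending<K, V>[] = [];
  private readonly batchFn: BatchFn<K, V>;

  constructor(batchFn: BatchFn<K, V>) {
    this.batchFn = batchFn;
  }

  load(key: K): Promise<V> {
    return new Promise<V>((resolve, reject) => {
      this.queue.push({ key, resolve, reject });
      // The first key of a batch schedules exactly one flush.
      // The flush runs only after the current synchronous code has finished.
      if (this.queue.length === 1) {
        Promise.resolve().then(() => this.flush());
      }
    });
  }

  private async flush(): Promise<void> {
    const batch = this.queue;
    this.queue = [];
    try {
      const values = await this.batchFn(batch.map((p) => p.key));
      // values[i] is the result for batch[i].key
      batch.forEach((p, i) => p.resolve(values[i] as V));
    } catch (err) {
      batch.forEach((p) => p.reject(err));
    }
  }
}

// Usage: 20 synchronous load() calls end up in a single batch call.
let batchCalls = 0;
const loader = new TinyLoader<number, string>(async (ids) => {
  batchCalls += 1;
  return ids.map((id) => `comments of post ${id}`);
});

const results = Promise.all(
  Array.from({ length: 20 }, (_, i) => loader.load(i + 1)),
);
```

Because every load() call runs before the scheduled flush gets its turn, the batch function sees all 20 keys at once.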

How it flows in code

You pass a batch function when you create a DataLoader.

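import { Injectable, Scope } from '@nestjs/common';
import DataLoader from 'dataloader'; // default import assumes esModuleInterop
// CommentService and Comment are the application's own service and entity

@Injectable({ scope: Scope.REQUEST })
export class CommentLoader {
  private readonly loader: DataLoader<number, Comment[]>;

  constructor(private readonly commentService: CommentService) {
    this.loader = new DataLoader<number, Comment[]>(async (postIds) => {
      // Called once, after the tick ends, with every id collected so far
      const comments = await this.commentService.findByPostIds([...postIds]);
      // SELECT * FROM comment WHERE post_id IN (1, 2, ..., 20)

      // Return results in the same order as the input keys (required)
      return postIds.map((id) => comments.filter((c) => c.postId === id));
    });
  }

  load(postId: number) {
    return this.loader.load(postId); // returns a Promise
  }
}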

@ResolveField(() => [Comment])
async comments(@Parent() post: Post) {
  return this.commentLoader.load(post.id);
}

On a timeline:

t=0ms  post1.comments → load(1) → queue: [1]
t=0ms  post2.comments → load(2) → queue: [1, 2]
t=0ms  post3.comments → load(3) → queue: [1, 2, 3]
       call stack empties
t=1ms  batch function runs → SELECT WHERE post_id IN (1, 2, 3)
t=5ms  result returns → load(1), load(2), load(3) each resolve

Rules for the batch function

The batch function has one rule that must always hold.

The output array must match the input array in both order and length.

// input:  [1,             2,             3            ]
// output: [comments of 1, comments of 2, comments of 3]
//         indexes must line up exactly

return postIds.map(id =>
  comments.filter(c => c.postId === id)
);

If load(1) was the first call in the batch, DataLoader reads its result from index 0 of the output array.
If the order is wrong, callers receive the wrong data.

[!warning] A common mistake Do not return the rows exactly as the DB hands them back, as in return comments.
Unless the result is mapped back onto the input postIds in order, post1 can end up with post3's comments.
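
One way to keep the ordering rule without scanning the rows once per key is to group them in a Map first. The sketch below assumes a CommentRow shape and a helper named alignToKeys, both invented for this example.

```typescript
// Hypothetical row shape and helper name, used only for this illustration.
interface CommentRow {
  id: number;
  postId: number;
  body: string;
}

// Groups rows by postId in one pass, then emits one array per key, in key order.
// Keys with no rows get an empty array, so the output length always equals the input length.
function alignToKeys(
  postIds: readonly number[],
  rows: readonly CommentRow[],
): CommentRow[][] {
  const byPost = new Map<number, CommentRow[]>();
  for (const row of rows) {
    const bucket = byPost.get(row.postId);
    if (bucket) {
      bucket.push(row);
    } else {
      byPost.set(row.postId, [row]);
    }
  }
  return postIds.map((id) => byPost.get(id) ?? []);
}

// The DB may return rows in any order; the keys decide the output order.
const keys = [3, 1, 2];
const rows: CommentRow[] = [
  { id: 10, postId: 1, body: "a" },
  { id: 11, postId: 3, body: "b" },
  { id: 12, postId: 1, body: "c" },
];
const aligned = alignToKeys(keys, rows);
```

Here aligned has three entries: post 3's single comment, post 1's two comments, and an empty array for post 2.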

Caching

Within one request, loading the same id twice returns the cached result the second time.

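loader.load(1)  // queued for the next batch
loader.load(1)  // cache hit: same Promise, no extra key in the batch
loader.load(2)  // queued for the same batch
// Called in the same tick, these three produce one batch call with keys [1, 2]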

Even if several parts of one GraphQL request ask for the same data, the DB is queried for it only once.
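
The cache is easiest to picture as a Map from key to Promise. The sketch below shows only that memoization part, without batching. PromiseCache is a name invented for this post and is not the library's code.

```typescript
// Illustrative sketch only. PromiseCache is a hypothetical name.
class PromiseCache<K, V> {
  private readonly cache = new Map<K, Promise<V>>();
  private readonly fetchOne: (key: K) => Promise<V>;

  constructor(fetchOne: (key: K) => Promise<V>) {
    this.fetchOne = fetchOne;
  }

  load(key: K): Promise<V> {
    let hit = this.cache.get(key);
    if (!hit) {
      // Store the Promise itself, so a second load() that arrives
      // before the first one settles still reuses the same work.
      hit = this.fetchOne(key);
      this.cache.set(key, hit);
    }
    return hit;
  }
}

let fetches = 0;
const userCache = new PromiseCache<number, string>(async (id) => {
  fetches += 1;
  return `user ${id}`;
});

const first = userCache.load(1);
const second = userCache.load(1); // same Promise object as `first`
const third = userCache.load(2);
```

Storing Promises rather than resolved values is what lets two load(1) calls made in the same tick share one lookup.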

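The cache must live for exactly one request. If the loader is registered with NestJS's default singleton scope, cached entries from earlier requests stay around, so another user can receive stale data or data that was loaded for someone else.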

@Injectable({ scope: Scope.REQUEST })  // new instance per request (required)
export class CommentLoader { ... }
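
The effect of a shared cache can be reproduced without NestJS. The sketch below is a simplified stand-in: an in-memory Map plays the DB, and createTitleCache is a hypothetical factory that plays the role of "one loader instance". It shows staleness across requests, the mildest form of the problem.

```typescript
// Simplified stand-in for a DB table: post id → title.
const db = new Map<number, string>([[1, "first title"]]);

// Hypothetical factory: each call creates an independent cache,
// the way Scope.REQUEST creates an independent loader per request.
function createTitleCache(): (id: number) => string | undefined {
  const cache = new Map<number, string | undefined>();
  return (id) => {
    if (!cache.has(id)) {
      cache.set(id, db.get(id));
    }
    return cache.get(id);
  };
}

// Singleton style: one cache shared by every request.
const sharedLoad = createTitleCache();
const seenByRequest1 = sharedLoad(1);

// The row changes between the two requests.
db.set(1, "edited title");

// Request 2 reuses the shared cache and still sees the old value.
const seenByRequest2Shared = sharedLoad(1);

// Request-scoped style: request 2 gets a fresh cache and sees the new value.
const seenByRequest2Fresh = createTitleCache()(1);
```

If the cached value depends on who is asking, the same mechanism hands one user's data to another, which is why the scope matters.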

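Structuring loaders across nested resolvers

N+1 is not limited to list fields.
Once resolvers nest, even a single-record lookup such as a comment's author is repeated for every parent.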

query {
  posts {           # 20 posts
    comments {      # 10 per post → 200 comments
      author {      # once per comment → author resolver runs 200 times
        name
      }
    }
  }
}

Nesting without DataLoader:

posts queries:    1
comments queries: 20   (one per post)
author queries:   200  (one per comment)

Total: 221

Create one DataLoader per entity type, or more precisely per kind of lookup.

// Loads comments by post_id
@Injectable({ scope: Scope.REQUEST })
export class CommentLoader {
  private readonly loader: DataLoader<number, Comment[]>;

  constructor(private readonly commentService: CommentService) {
    this.loader = new DataLoader<number, Comment[]>(async (postIds) => {
      const comments = await this.commentService.findByPostIds([...postIds]);
      return postIds.map((id) => comments.filter((c) => c.postId === id));
    });
  }

  load(postId: number) {
    return this.loader.load(postId);
  }
}

// Loads users by user_id
@Injectable({ scope: Scope.REQUEST })
export class UserLoader {
  private readonly loader: DataLoader<number, User>;

  constructor(private readonly userService: UserService) {
    this.loader = new DataLoader<number, User>(async (userIds) => {
      const users = await this.userService.findByIds([...userIds]);
      // A missing user must still fill its slot, so return an Error for that key
      return userIds.map(
        (id) => users.find((u) => u.id === id) ?? new Error(`User ${id} not found`),
      );
    });
  }

  load(userId: number) {
    return this.loader.load(userId);
  }
}

The Loaders know nothing about each other. They are completely independent.

@ResolveField(() => [Comment])
async comments(@Parent() post: Post) {
  return this.commentLoader.load(post.id); // uses CommentLoader
}

@ResolveField(() => User)
async author(@Parent() comment: Comment) {
  return this.userLoader.load(comment.authorId); // uses UserLoader
}

PostResolver knows only CommentLoader, and CommentResolver knows only UserLoader.
PostResolver does not need to understand the nesting. The GraphQL engine takes care of chaining the resolvers.

Query counts with DataLoader:

posts queries:    1  → SELECT * FROM post
comments queries: 1  → SELECT * FROM comment WHERE post_id IN (1..20)
author queries:   1  → SELECT * FROM user WHERE id IN (distinct author ids)

Total: 3  (one per level)

Even 10 levels deep, that is 10 queries, one per level. The count stays fixed no matter how many rows each level holds.
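
The two totals can be checked with a few lines of arithmetic. The function names below are made up for this post, and the model assumes one loader per nested level with every load at a level landing in the same batch.

```typescript
// parentCounts[i] is the number of parent objects at nested level i.
// For the example query: 20 posts resolve comments, 200 comments resolve author.

// Without batching: the root query, plus one query per parent object at every level.
function naiveQueries(parentCounts: readonly number[]): number {
  return 1 + parentCounts.reduce((sum, n) => sum + n, 0);
}

// With one loader per level: the root query, plus one batched query per level.
function batchedQueries(parentCounts: readonly number[]): number {
  return 1 + parentCounts.length;
}

const parentCounts = [20, 200];
const withoutLoader = naiveQueries(parentCounts); // 1 + 20 + 200
const withLoader = batchedQueries(parentCounts);  // 1 + 2
```

Growing the data changes only the first number. With 100 posts of 50 comments each, the naive count becomes 1 + 100 + 5000 while the batched count stays at 3.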

Dependency structure

PostResolver    → CommentLoader  (loads comments by post_id)
CommentResolver → UserLoader     (loads users by user_id)

CommentLoader  ↔  UserLoader     → unaware of each other, no dependency

A DataLoader is defined by "what it loads, and by which key".
It does not care about the nesting or about which resolver uses it.

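Summary

How DataLoader eliminates N+1:

load(id) does not query the DB right away; it queues the key
When the call stack empties (process.nextTick), the queued ids are passed to the batch function
The batch function handles every id with a single IN query
The results are mapped back in input order and handed to each load() Promise
A repeated request for the same id is served from the cache
Each Loader is built independently, one per entity and lookup key, and injected into the resolver that needs it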

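[!tip] Don't implement DataLoader yourself. Use the dataloader package, which was originally created at Facebook.
In NestJS the key point is to register the loader with Scope.REQUEST. Leave that out and a polluted cache can leak data between users.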