Machine learning

Multi-Head Self-Attention

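Introduced by Vaswani and colleagues in 2017, multi-head self-attention is a mechanism that lets every position in a sequence compute its relationship to every other position in parallel. It is the core of the Transformer architecture and underlies BERT, GPT, and T5.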
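To make the mechanism concrete, here is a minimal NumPy sketch of scaled dot-product self-attention split across several heads, following the general formulation in Vaswani et al. (2017). The function name `multi_head_self_attention`, the weight names `w_q`, `w_k`, `w_v`, `w_o`, and the toy dimensions are assumptions chosen for illustration. The sketch omits masking, biases, dropout, and batching, so it should be read as an outline rather than a reference implementation.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def multi_head_self_attention(x, w_q, w_k, w_v, w_o, num_heads):
    """Illustrative multi-head self-attention for one sequence.

    x: (seq_len, d_model) input; w_q, w_k, w_v, w_o: (d_model, d_model).
    All names are hypothetical and chosen for this sketch.
    """
    seq_len, d_model = x.shape
    assert d_model % num_heads == 0, "d_model must be divisible by num_heads"
    d_head = d_model // num_heads

    def split_heads(t):
        # (seq_len, d_model) -> (num_heads, seq_len, d_head)
        return t.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)

    # Queries, keys, and values are all projections of the same input.
    q = split_heads(x @ w_q)
    k = split_heads(x @ w_k)
    v = split_heads(x @ w_v)

    # Every position scores every other position, independently per head.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # (heads, seq, seq)
    weights = softmax(scores, axis=-1)
    context = weights @ v                                # (heads, seq, d_head)

    # Concatenate the heads and apply the output projection.
    merged = context.transpose(1, 0, 2).reshape(seq_len, d_model)
    return merged @ w_o, weights


rng = np.random.default_rng(0)
seq_len, d_model, num_heads = 5, 8, 2
x = rng.normal(size=(seq_len, d_model))
w_q, w_k, w_v, w_o = (rng.normal(size=(d_model, d_model)) * 0.1 for _ in range(4))

out, weights = multi_head_self_attention(x, w_q, w_k, w_v, w_o, num_heads)
print(out.shape)      # (5, 8)
print(weights.shape)  # (2, 5, 5)
```

Each row of `weights[h]` sums to 1 and describes how strongly one position attends to every position in head `h`. The score matrix for all positions comes from a single matrix product, which is what allows the relationships to be computed in parallel.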


Related methods

BERT Fine-Tuning, GPT Fine-Tuning, LoRA and PEFT, Random Forest, XGBoost, Attention Mechanism, Bidirectional RNN, Retrieval-Augmented Generation (RAG), Sequence-to-Sequence Models

Sources

Vaswani, A. et al. (2017). Attention Is All You Need. NeurIPS.
Devlin, J. et al. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. NAACL.

How to cite this page

ScholarGate. (2026, June 1). Multi-Head Self-Attention (Transformer Core). ScholarGate. https://scholargate.app/ko/deep-learning/self-attention-transformer


Entries that reference this method

Attention Mechanism, Bidirectional RNN, Retrieval-Augmented Generation (RAG), Sequence-to-Sequence Models
