Method comparison
View the selected methods side by side; rows that differ are highlighted (marked with ≠).
| | Self-Supervised Question Answering | Retrieval-Augmented Generation (RAG) |
|---|---|---|
| Field ≠ | Deep learning | Text mining |
| Family ≠ | Machine learning | Process / pipeline |
| Year introduced ≠ | 2019 | 2020 |
| Originator ≠ | Lewis, P.; Alberti, C. et al. (multiple independent groups ~2019) | Lewis, Patrick et al. (Meta AI / Facebook AI Research) |
| Type ≠ | Self-supervised NLP training paradigm | Hybrid retrieval + generation pipeline |
| Original work ≠ | Lewis, P., Denoyer, L., & Riedel, S. (2019). Unsupervised Question Answering by Cloze Translation. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019), pp. 4896–4910. | Lewis, P. et al. (2020). Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. Advances in Neural Information Processing Systems (NeurIPS), 33, pp. 9459–9474. |
| Other names | SSQA, unsupervised question answering, self-supervised QA, zero-label question answering | RAG, retrieval-augmented LLM, grounded generation, Erişim Destekli Metin Üretimi (RAG) |
| Related ≠ | 1 | 7 |
| Summary ≠ | Self-supervised Question Answering (SSQA) is a training paradigm that automatically generates question-answer pairs from unlabeled text (using cloze translation, span masking, or neural question generation) to train QA models without any human-labeled data. It makes it possible to build reading comprehension systems even when annotated datasets are scarce or domain-specific. | Retrieval-Augmented Generation (RAG) is a natural-language-processing pipeline introduced by Lewis et al. in 2020 that strengthens a large language model (LLM) with evidence fetched at inference time from an external knowledge base. Instead of relying solely on what the model memorised during training, RAG first retrieves the most relevant passages from a document index and then hands those passages to the LLM as context, grounding the generated answer in verifiable, up-to-date information. The approach reduces hallucination and allows domain-specific or time-sensitive knowledge to be injected without retraining the model. |
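
The two summaries describe different mechanisms: SSQA manufactures training pairs from raw text, while RAG retrieves passages and passes them to a model as context. The following is a minimal Python sketch of both ideas, not the method from either paper. The function names (`make_cloze_pair`, `retrieve`, `build_prompt`) are hypothetical, the cloze step only masks a given answer span (it does not translate the cloze into a natural question), and a toy bag-of-words cosine ranking stands in for the trained retriever and the LLM call.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphanumeric tokens only."""
    return re.findall(r"[a-z0-9]+", text.lower())


def make_cloze_pair(sentence, answer, mask="[MASK]"):
    """SSQA-style step: mask the answer span to get a (cloze question, answer) pair.

    Returns None when the answer does not occur in the sentence.
    Assumption: the answer span is already known (e.g. from a span selector).
    """
    if answer not in sentence:
        return None
    return sentence.replace(answer, mask, 1), answer


def retrieve(query, passages, k=1):
    """RAG-style step 1: rank passages by bag-of-words cosine similarity.

    Assumption: this toy scorer replaces the learned retriever used in practice.
    """
    q = Counter(tokenize(query))
    q_norm = math.sqrt(sum(v * v for v in q.values()))

    def score(passage):
        d = Counter(tokenize(passage))
        d_norm = math.sqrt(sum(v * v for v in d.values()))
        dot = sum(q[t] * d[t] for t in q)
        return dot / (q_norm * d_norm) if q_norm and d_norm else 0.0

    return sorted(passages, key=score, reverse=True)[:k]


def build_prompt(query, passages):
    """RAG-style step 2: place the retrieved passages in front of the question."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
```

In this sketch, `make_cloze_pair` produces data for training a QA model, whereas `retrieve` and `build_prompt` run at inference time and leave the model itself unchanged, which mirrors the "training paradigm" versus "pipeline" distinction in the Type row.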