
Semi-supervised Reinforcement Learning

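Semi-supervised reinforcement learning (SSRL) combines standard reinforcement learning, in which an agent learns from a sparse reward signal, with semi-supervised techniques that extract structure from unlabeled environment interactions. The goal is to improve sample efficiency and generalization when reward feedback is expensive, delayed, or available for only part of the agent's experience.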
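One way to make the idea concrete is reward pseudo-labeling: fit a reward estimate on the few transitions whose reward was observed, use it to fill in the rest, then run an ordinary RL update over the completed data. The sketch below is a minimal illustration under stated assumptions, not the method of any cited source. The toy chain environment, the 20% label fraction, and every function name (`chain_step`, `collect`, `fit_reward_model`, `pseudo_label`, `q_learning`) are hypothetical, and the termination flag is assumed to be observable even when the reward is not.

```python
import random
from collections import defaultdict

def chain_step(state, action, n_states=5):
    """Hypothetical toy chain: action 1 moves right, action 0 moves left.
    Reaching the last state yields reward 1.0 and ends the episode."""
    nxt = min(state + 1, n_states - 1) if action == 1 else max(state - 1, 0)
    done = nxt == n_states - 1
    return nxt, (1.0 if done else 0.0), done

def collect(n_episodes, label_fraction, rng, n_states=5, max_steps=20):
    """Random-policy rollouts. The reward is recorded for only a fraction
    of transitions; the rest carry None (the 'unlabeled' experience)."""
    data = []
    for _ in range(n_episodes):
        s = 0
        for _ in range(max_steps):
            a = rng.randrange(2)
            s2, r, done = chain_step(s, a, n_states)
            label = r if rng.random() < label_fraction else None
            data.append((s, a, label, s2, done))
            if done:
                break
            s = s2
    return data

def fit_reward_model(data):
    """Mean observed reward per (state, action), from labeled transitions only."""
    sums, counts = defaultdict(float), defaultdict(int)
    for s, a, r, _, _ in data:
        if r is not None:
            sums[(s, a)] += r
            counts[(s, a)] += 1
    return {k: sums[k] / counts[k] for k in counts}

def pseudo_label(data, reward_model):
    """Fill missing rewards with the model's estimate. Transitions whose
    (state, action) pair was never seen with a label are dropped."""
    out = []
    for s, a, r, s2, done in data:
        if r is None:
            if (s, a) not in reward_model:
                continue
            r = reward_model[(s, a)]
        out.append((s, a, r, s2, done))
    return out

def q_learning(data, n_states=5, gamma=0.9, alpha=0.5, sweeps=50):
    """Tabular Q-learning run repeatedly over a fixed batch of transitions."""
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(sweeps):
        for s, a, r, s2, done in data:
            target = r if done else r + gamma * max(q[s2])
            q[s][a] += alpha * (target - q[s][a])
    return q

rng = random.Random(0)
data = collect(n_episodes=200, label_fraction=0.2, rng=rng)
reward_model = fit_reward_model(data)
completed = pseudo_label(data, reward_model)
q = q_learning(completed)
# Greedy action in each non-terminal state of the chain.
greedy = [max((0, 1), key=lambda a: q[s][a]) for s in range(4)]
```

Because the toy reward is a deterministic function of the state-action pair, a per-pair average recovers it exactly from a small labeled subset. In a realistic setting the reward model would be a learned function approximator with its own error, and representation-learning objectives on the unlabeled observations (contrastive losses, for example) are another common way to use the unlabeled data.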


Related methods

Domain-adaptive reinforcement learning, reinforcement learning, self-supervised reinforcement learning, semi-supervised learning, Transformer, transfer learning in reinforcement learning, weakly supervised reinforcement learning

Sources

Zhan, X., Zhu, X., & Shi, H. (2022). DeepThermal: Combustion optimization for thermal power generating units using offline reinforcement learning. Proceedings of the AAAI Conference on Artificial Intelligence, 36(4), 4680–4688.
Laskin, M., Srinivas, A., & Abbeel, P. (2020). CURL: Contrastive Unsupervised Representations for Reinforcement Learning. Proceedings of the 37th International Conference on Machine Learning (ICML), PMLR 119, 5639–5650.

How to cite this page

ScholarGate. (2026, June 3). Semi-supervised Reinforcement Learning (SSRL). ScholarGate. https://scholargate.app/ko/deep-learning/semi-supervised-reinforcement-learning


