Fine-Tuned Reinforcement Learning

Fine-tuned reinforcement learning adapts a pretrained policy or model to a new task or behavioral objective using reinforcement signals, including human feedback, instead of retraining from scratch. Popularized by RLHF (reinforcement learning from human feedback), it is the core technique behind aligning large language models and adapting deep reinforcement learning agents to specialized environments with minimal additional data.
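To make the idea concrete, here is a minimal sketch of fine-tuning a pretrained policy with a reward signal while keeping it close to its starting point. It is an illustration under stated assumptions, not the method of any particular paper: the three-action bandit, the reward values, the `fine_tune` helper, and the hyperparameters `beta`, `lr`, and `steps` are all hypothetical, and the update uses the exact expected policy gradient instead of sampled rollouts so that the result is deterministic. The KL penalty against the frozen pretrained policy mirrors the regularizer commonly used in RLHF-style training.

```python
import math


def softmax(logits):
    """Convert logits to a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fine_tune(pretrained_logits, rewards, beta=0.1, lr=0.5, steps=500):
    """Fine-tune a softmax policy on a toy bandit (hypothetical helper).

    Maximizes  E_pi[reward] - beta * KL(pi || pi_ref)  by gradient ascent,
    where pi_ref is the frozen pretrained policy. The gradient is computed
    exactly over all actions, which is only feasible for tiny action spaces.
    """
    ref = softmax(pretrained_logits)   # frozen reference (pretrained) policy
    theta = list(pretrained_logits)    # start from the pretrained parameters
    for _ in range(steps):
        pi = softmax(theta)
        # Reward shaped by the KL penalty: r(a) - beta * log(pi(a) / ref(a))
        shaped = [r - beta * math.log(p / q)
                  for r, p, q in zip(rewards, pi, ref)]
        # Expected shaped reward acts as the baseline
        baseline = sum(p * s for p, s in zip(pi, shaped))
        # Exact softmax policy gradient: pi(a) * (shaped(a) - baseline)
        theta = [t + lr * p * (s - baseline)
                 for t, p, s in zip(theta, pi, shaped)]
    return ref, softmax(theta)


# Assumed setup: the pretrained policy prefers action 0,
# but the new task rewards only action 2.
pretrained_logits = [2.0, 0.0, 0.0]
rewards = [0.0, 0.0, 1.0]

ref, tuned = fine_tune(pretrained_logits, rewards)
print("pretrained policy:", [round(p, 3) for p in ref])
print("fine-tuned policy:", [round(p, 3) for p in tuned])
```

The policy starts from the pretrained logits instead of a random initialization, so only the behavior that the new reward disagrees with has to change. Raising `beta` keeps the fine-tuned policy closer to the pretrained one; lowering it lets the reward dominate.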

Sources

  1. Ouyang, L., Wu, J., Jiang, X., Almeida, D., Wainwright, C., Mishkin, P., Zhang, C., Agarwal, S., Slama, K., Ray, A., Schulman, J., Hilton, J., Kelton, F., Miller, L., Simens, M., Askell, A., Welinder, P., Christiano, P., Leike, J., & Lowe, R. (2022). Training language models to follow instructions with human feedback. Advances in Neural Information Processing Systems, 35, 27730–27744.
  2. Christiano, P., Leike, J., Brown, T. B., Martic, M., Legg, S., & Amodei, D. (2017). Deep reinforcement learning from human preferences. Advances in Neural Information Processing Systems, 30.

How to cite this page

ScholarGate. (2026, June 3). Fine-Tuned Reinforcement Learning (Policy Adaptation via Fine-Tuning). ScholarGate. https://scholargate.app/sr/deep-learning/fine-tuned-reinforcement-learning

Cited in

ScholarGate. Fine-Tuned Reinforcement Learning (Policy Adaptation via Fine-Tuning). Retrieved 2026-06-15 from https://scholargate.app/sr/deep-learning/fine-tuned-reinforcement-learning · Dataset: https://doi.org/10.5281/zenodo.20539026