Posts tagged 'reinforcement learning'

List: reinforcement learning (2)

코딩하는 임초얀

Paper Review - "Towards Applicable Reinforcement Learning: Improving the Generalization and Sample Efficiency with Policy Ensemble" (IJCAI 2022)

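Abstract Why RL is hard to apply in domains such as stock investing: noisy observations and a continually changing environment. Addressing these requires high sample efficiency and good generalization, respectively. In SL (supervised learning), ensembling improves both accuracy and generalization, which suggests that ensembles can be applied to RL as well. => Enter EPPO, which learns ensemble policies end-to-end! In particular, EPPO 1. organically combines the subpolicies with the ensemble policy and optimizes both simultaneously. 2. uses diversity enhancement regularization in the policy space to [unseen s..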

Studies/Paper Review 2024. 4. 22. 15:13
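
To make the idea of combining subpolicies into one ensemble policy concrete, here is a minimal sketch. It assumes the ensemble policy is the arithmetic mean of the subpolicies' action distributions; the excerpt above does not state the aggregation rule, so treat that choice, the function name `ensemble_policy`, and the sample probabilities as illustrative assumptions rather than the paper's exact method.

```python
from typing import List, Sequence


def ensemble_policy(sub_policy_probs: Sequence[Sequence[float]]) -> List[float]:
    """Average K categorical action distributions into one distribution.

    Assumption: mean aggregation. The excerpt does not confirm how EPPO
    actually combines its subpolicies.
    """
    k = len(sub_policy_probs)
    n_actions = len(sub_policy_probs[0])
    return [sum(p[a] for p in sub_policy_probs) / k for a in range(n_actions)]


# Hypothetical outputs of three subpolicies for one state over three actions.
sub_probs = [
    [0.7, 0.2, 0.1],
    [0.5, 0.3, 0.2],
    [0.6, 0.1, 0.3],
]

ensemble = ensemble_policy(sub_probs)
print(ensemble)
```

Because each row is a valid probability distribution, their mean is one too, so the result can be sampled from like any single policy.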

Bellman Equation

A Bellman equation, named after Richard E. Bellman, is a necessary condition for optimality associated with the mathematical optimization method known as dynamic programming. It writes the "value" of a decision problem at a certain point in time in terms of the payoff from some initial choices and the "value" of the remaining decision problem that results from those initial choices. This breaks a..

Studies/Wiki Korean Translation 2023. 3. 3. 17:16
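
The recursive structure described in the Bellman Equation excerpt (value now = payoff from the first choice + value of the remaining problem) can be sketched with value iteration on a tiny decision problem. The two-state problem, the discount factor `gamma`, and all names below are hypothetical and chosen only for illustration; the excerpt itself does not mention discounting.

```python
from typing import Dict, Tuple

# Hypothetical deterministic problem:
# transitions[state][action] = (immediate payoff, next state)
transitions: Dict[str, Dict[str, Tuple[float, str]]] = {
    "start": {"wait": (0.0, "start"), "go": (1.0, "end")},
    "end": {"stay": (0.0, "end")},
}
gamma = 0.9  # assumed discount factor on the value of the remaining problem


def bellman_backup(values: Dict[str, float], state: str) -> float:
    """Best payoff now plus discounted value of the problem that remains."""
    return max(
        reward + gamma * values[next_state]
        for reward, next_state in transitions[state].values()
    )


def value_iteration(tol: float = 1e-10) -> Dict[str, float]:
    """Apply the Bellman backup to every state until the values stop changing."""
    values = {state: 0.0 for state in transitions}
    while True:
        new_values = {s: bellman_backup(values, s) for s in transitions}
        if max(abs(new_values[s] - values[s]) for s in transitions) < tol:
            return new_values
        values = new_values


values = value_iteration()
print(values)
```

At convergence each state's value equals its own backup, which is exactly the fixed-point condition the Bellman equation expresses for this toy problem.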
