10. Deep Reinforcement Learning 2021.01.04

2021. 1. 4. 10:19

The notes below summarize the Udemy course PyTorch: Deep Learning and Artificial Intelligence.

Reinforcement Learning

Supervised/unsupervised learning

An image classifier is just a function.

Supervised learning is like a static function:

you feed in an image and a result comes out.

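Reinforcement learning: a loop

Reinforcement learning works more like a loop: the agent acts, the environment responds, and the cycle repeats.

We want to achieve a goal.

The agent has to plan for the future.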

agent

Tic-Tac-Toe

environment: the computer program that implements the tic-tac-toe game

game.start()

game.move()

game.get_state()

game.is_over() -> reports the outcome: who won (for example, the computer agent)
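The environment interface listed above can be sketched as a small class. This is only a minimal illustration, not the course's implementation: the method names `start`, `move`, `get_state`, and `is_over` come from the notes, while `legal_moves`, `winner`, `play_episode`, the board encoding, and the rule that X moves first are assumptions made for the example.

```python
import random


class TicTacToe:
    """Hypothetical environment mirroring the method names in the notes."""

    WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
                 (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
                 (0, 4, 8), (2, 4, 6)]              # diagonals

    def start(self):
        self.board = [" "] * 9   # 9 cells stored row by row, " " = empty
        self.player = "X"        # assumption: X always moves first

    def get_state(self):
        return tuple(self.board)

    def legal_moves(self):
        # Helper not in the notes: indices of the empty cells.
        return [i for i, cell in enumerate(self.board) if cell == " "]

    def move(self, cell):
        if self.board[cell] != " ":
            raise ValueError("cell already taken")
        self.board[cell] = self.player
        self.player = "O" if self.player == "X" else "X"

    def winner(self):
        # Helper not in the notes: "X", "O", or None if nobody has three in a row.
        for a, b, c in self.WIN_LINES:
            if self.board[a] != " " and self.board[a] == self.board[b] == self.board[c]:
                return self.board[a]
        return None

    def is_over(self):
        return self.winner() is not None or not self.legal_moves()


def play_episode(rng):
    """One episode: both sides pick random legal moves until the game ends."""
    game = TicTacToe()
    game.start()
    while not game.is_over():
        game.move(rng.choice(game.legal_moves()))
    return game.winner(), game.get_state()


winner, final_state = play_episode(random.Random(0))
print(winner, final_state)
```

The `while not game.is_over()` loop is the agent/environment loop in miniature: read the state, choose an action, apply it, repeat until the episode ends.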

		O
X	O
O		X

episode

agent, environment, episode

states, actions, rewards

state = what is in each board location (X, O, or empty)

action = where to place the next X/O

reward = a number, received at each step of the game
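As a rough sketch of these three definitions, the board drawn earlier can be written as a state tuple, with the actions being the empty cells. The reward scheme here (+1 for a win, -1 for a loss, 0 otherwise) and the names `available_actions`, `reward`, and `LINES` are assumptions for illustration; the notes only say that the reward is a number.

```python
# Board from the notes, flattened row by row; " " marks an empty cell.
state = (" ", " ", "O",
         "X", "O", " ",
         "O", " ", "X")

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals


def available_actions(state):
    """Actions = indices of the empty cells where a mark could be placed."""
    return [i for i, cell in enumerate(state) if cell == " "]


def reward(state, me):
    """Assumed scheme: +1 if `me` has three in a row, -1 if the opponent does, else 0."""
    for a, b, c in LINES:
        if state[a] != " " and state[a] == state[b] == state[c]:
            return 1 if state[a] == me else -1
    return 0


print(available_actions(state))   # [0, 1, 5, 7]
print(reward(state, "O"))         # 1 (O holds the diagonal 2, 4, 6)
print(3 ** 9)                     # 19683, an upper bound on the number of board states
```

Each of the 9 locations takes one of 3 values, which gives the 3^9 upper bound; many of those boards can never occur in a real game.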

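Example: Maze

Breakout: states, actions, rewards

states: what we observe in the environment; a state can also be derived from current and past observations (in Breakout, a single frame does not show which way the ball is moving, so recent frames are combined)

actions: just the different buttons/inputs on the joystick or control pad

spaces: the state space is the set of all possible states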

States, actions, rewards, policies

states: discrete or continuous

policies:

a policy is a function or mapping from states to actions
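To make "function or mapping" concrete, here is a minimal sketch of two ways a policy could be written down: a lookup table that returns one action per state, and a table of action probabilities that is sampled. The state and action labels (a made-up maze) and all probabilities are assumptions, not values from the course.

```python
import random

# Deterministic policy as a mapping: state -> action (hypothetical maze labels).
policy_table = {"start": "right", "corridor": "up", "near_exit": "right"}


def deterministic_policy(state):
    return policy_table[state]


# Stochastic policy: state -> probability of each action (made-up numbers).
stochastic_table = {
    "start":     {"right": 0.8, "up": 0.2},
    "corridor":  {"right": 0.1, "up": 0.9},
    "near_exit": {"right": 1.0},
}


def stochastic_policy(state, rng):
    """Sample an action according to the probabilities for this state."""
    actions = list(stochastic_table[state])
    weights = list(stochastic_table[state].values())
    return rng.choices(actions, weights=weights, k=1)[0]


rng = random.Random(0)
print(deterministic_policy("start"))        # right
print(stochastic_policy("corridor", rng))
```

A table like this only works when the states are discrete and few; with continuous or very large state spaces the mapping is usually represented by a parameterized function such as a neural network.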

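The Markov assumption

1. For example, suppose we predict tomorrow's weather as rainy, sunny, or cloudy.

The Markov assumption is that tomorrow's weather depends only on today's weather, not on any earlier day.

In the same way, today's weather depends only on yesterday's.

2. When predicting the next word: under the Markov assumption, the next word depends only on the current word.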
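Both examples can be sketched with the same first-order Markov model: count how often each item is followed by each other item, then predict from the current item alone. The weather history, the sentence, and the function names below are made up for illustration.

```python
from collections import Counter, defaultdict


def fit_transitions(sequence):
    """Count how often each item is immediately followed by each other item."""
    counts = defaultdict(Counter)
    for current, following in zip(sequence, sequence[1:]):
        counts[current][following] += 1
    return counts


def next_distribution(counts, current):
    """Estimated P(next | current); only the current item is used (Markov assumption)."""
    followers = counts.get(current)
    if not followers:
        return {}
    total = sum(followers.values())
    return {item: n / total for item, n in followers.items()}


def most_likely_next(counts, current):
    dist = next_distribution(counts, current)
    return max(dist, key=dist.get) if dist else None


# 1. Weather: a made-up history of daily observations.
weather = ["sunny", "sunny", "cloudy", "rainy", "rainy",
           "sunny", "sunny", "cloudy", "sunny"]
weather_model = fit_transitions(weather)
print(next_distribution(weather_model, "rainy"))   # {'rainy': 0.5, 'sunny': 0.5}

# 2. Next word: a made-up sentence split into words.
words = "the cat sat on the mat and the cat slept".split()
word_model = fit_transitions(words)
print(most_likely_next(word_model, "the"))         # cat
```

Nothing before the current item enters the prediction, which is exactly what the assumption states; a model that also looked at the day or word before that would no longer be first-order.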

Bellman equation
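The notes only name the equation, so for reference here is one common textbook form, the Bellman equation for the value of a state under a policy. The notation is an assumption, since the notes do not fix any symbols: $\pi(a \mid s)$ is the probability of taking action $a$ in state $s$, $p(s', r \mid s, a)$ is the environment's probability of moving to state $s'$ with reward $r$, and $\gamma$ is a discount factor between 0 and 1.

```latex
V^{\pi}(s) = \sum_{a} \pi(a \mid s) \sum_{s'} \sum_{r} p(s', r \mid s, a) \left[ r + \gamma V^{\pi}(s') \right]
```

It says that the value of a state equals the expected immediate reward plus the discounted value of the next state. The recursion relies on the Markov assumption: the next state and reward depend only on the current state and action.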

prediction vs. control.
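In the usual terminology, prediction means estimating the value of each state under a given policy, and control means finding the best policy. The sketch below contrasts the two on a made-up four-state chain (move left or right, reward 1 for reaching the last state). The environment, the discount factor 0.9, and all function names are assumptions chosen to keep the example small.

```python
GAMMA = 0.9
STATES = [0, 1, 2, 3]
TERMINAL = 3                 # reaching state 3 ends the episode
ACTIONS = ["left", "right"]


def step(state, action):
    """Deterministic toy dynamics: returns (next_state, reward)."""
    nxt = min(state + 1, TERMINAL) if action == "right" else max(state - 1, 0)
    return nxt, (1.0 if nxt == TERMINAL else 0.0)


def evaluate_policy(policy, sweeps=200):
    """Prediction: state values for a fixed policy, policy[s][a] = probability."""
    v = {s: 0.0 for s in STATES}
    for _ in range(sweeps):
        for s in STATES:
            if s == TERMINAL:
                continue
            total = 0.0
            for action, prob in policy[s].items():
                nxt, r = step(s, action)
                total += prob * (r + GAMMA * v[nxt])
            v[s] = total
    return v


def value_iteration(sweeps=200):
    """Control: optimal state values and the greedy policy derived from them."""
    v = {s: 0.0 for s in STATES}

    def q(s, a):
        nxt, r = step(s, a)
        return r + GAMMA * v[nxt]

    for _ in range(sweeps):
        for s in STATES:
            if s != TERMINAL:
                v[s] = max(q(s, a) for a in ACTIONS)
    greedy = {s: max(ACTIONS, key=lambda a: q(s, a))
              for s in STATES if s != TERMINAL}
    return v, greedy


random_policy = {s: {"left": 0.5, "right": 0.5} for s in STATES if s != TERMINAL}
v_random = evaluate_policy(random_policy)
v_best, best_policy = value_iteration()
print(v_random)
print(v_best, best_policy)
```

On this chain the greedy policy moves right in every state, and no state is worth more under the random policy than under the optimal one.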

