
The following are my notes from the Udemy course Pytorch: Deep Learning and Artificial Intelligence.

Reinforcement Learning

supervised/unsupervised learning

 

An image classifier is just a function.

 

Supervised learning is like a static function.

You input an image and a result comes out.

 

 

Reinforcement Learning: a loop

It works like a loop.

We want to achieve a goal.

The agent has to plan for the future.

 

agent

 

Tic-tac-Toe

environment: the computer program that implements the tic-tac-toe game

game.start()

game.move()

game.get_state()

game.is_over() -> tells who won (the computer agent)

    O
X O  
O   X
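
The environment methods listed above could be sketched roughly as follows. This is only a minimal illustration: the class name `TicTacToe`, the `move(position, player)` signature, the `winner()` helper, and the 1 / -1 / 0 board encoding are assumptions of mine, not the course's actual code.

```python
class TicTacToe:
    """Hypothetical minimal environment; method names follow the notes."""

    # All eight winning lines on a board stored as a flat list of 9 cells.
    LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]

    def __init__(self):
        self.start()

    def start(self):
        # Assumed encoding: 0 = empty, 1 = X, -1 = O.
        self.board = [0] * 9

    def move(self, position, player):
        # Place the player's mark (1 or -1) on an empty cell (0..8).
        if self.board[position] != 0:
            raise ValueError("cell already taken")
        self.board[position] = player

    def get_state(self):
        # Return an immutable snapshot of the board.
        return tuple(self.board)

    def winner(self):
        # Return 1 if X has three in a row, -1 if O does, 0 otherwise.
        for a, b, c in self.LINES:
            total = self.board[a] + self.board[b] + self.board[c]
            if total == 3:
                return 1
            if total == -3:
                return -1
        return 0

    def is_over(self):
        # The game ends when someone has won or the board is full.
        return self.winner() != 0 or all(cell != 0 for cell in self.board)


game = TicTacToe()
game.start()
for position in (0, 4, 8):      # X takes the main diagonal
    game.move(position, 1)
print(game.get_state(), game.is_over(), game.winner())
```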

 

episode 

 

agent, environment, episode

states, actions, rewards

state = what is in each location on the board (empty, X, or O)

action = where to place the next X/O

Reward = a number, received at each step of the game
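
To see how agent, environment, episode, state, action, and reward fit together, here is a rough sketch of one episode as a loop. The `Corridor` environment, its `reset`/`step` methods, and the reward of -1 per step are all made up for illustration; they are not from the course.

```python
import random


class Corridor:
    """Hypothetical toy environment: walk from cell 0 to the last cell."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        # Start a new episode and return the initial state.
        self.position = 0
        return self.position

    def step(self, action):
        # action is -1 (left) or +1 (right); the walls clamp the position.
        self.position = min(max(self.position + action, 0), self.length - 1)
        done = self.position == self.length - 1
        reward = 0 if done else -1   # a number received at every step
        return self.position, reward, done


def run_episode(env, choose_action, max_steps=1000):
    """Play one episode and return the sum of rewards."""
    state = env.reset()
    total_reward = 0
    for _ in range(max_steps):
        action = choose_action(state)            # agent picks an action
        state, reward, done = env.step(action)   # environment responds
        total_reward += reward
        if done:
            break
    return total_reward


env = Corridor()
print(run_episode(env, lambda state: 1))                        # always right: -3
print(run_episode(env, lambda state: random.choice([-1, 1])))   # random agent
```

The agent that always moves right reaches the goal in four steps, so its total reward is -3; the random agent usually does worse.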

 

Example : Maze

 

Breakout: states, actions, rewards

states: usually what we observe in the environment, but a state can also be derived from current and past observations

actions: just the different buttons/inputs on the joystick or control pad

spaces:  state spaces

 

 

 

 

states, actions, rewards, policies

states: discrete or continuous

 

policies:

a policy is a function or mapping from states to actions
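
A small sketch of what "policy as a mapping" can look like in code. The state numbers, action names, and probabilities below are invented for illustration only.

```python
import random

# Deterministic policy as a lookup table: state -> action (hypothetical values).
policy_table = {0: "right", 1: "right", 2: "left"}


def deterministic_policy(state):
    # The same state always yields the same action.
    return policy_table[state]


# Stochastic policy: state -> probability of each action (hypothetical values).
policy_probs = {
    0: {"left": 0.1, "right": 0.9},
    1: {"left": 0.5, "right": 0.5},
    2: {"left": 0.8, "right": 0.2},
}


def stochastic_policy(state):
    # Sample one action according to the probabilities for this state.
    probs = policy_probs[state]
    return random.choices(list(probs), weights=list(probs.values()), k=1)[0]


print(deterministic_policy(0))
print(stochastic_policy(1))
```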

 

The Markov assumption

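1. For example, suppose we predict tomorrow's weather as rainy, sunny, or cloudy.

The Markov assumption is that tomorrow's weather depends only on today's weather, not on any earlier day.

In other words, each day's weather is related only to the previous day's weather.

2. When predicting the next word in a sentence, the same assumption says the next word depends only on the current word.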
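
The weather example can be sketched as a small Markov chain. The transition probabilities below are made-up numbers, and the names `TRANSITIONS`, `next_weather`, and `simulate` are my own.

```python
import random

# Hypothetical transition probabilities: TRANSITIONS[today][tomorrow].
TRANSITIONS = {
    "rainy":  {"rainy": 0.5, "sunny": 0.2, "cloudy": 0.3},
    "sunny":  {"rainy": 0.1, "sunny": 0.6, "cloudy": 0.3},
    "cloudy": {"rainy": 0.3, "sunny": 0.3, "cloudy": 0.4},
}


def next_weather(today, rng=random):
    # Tomorrow is sampled using today's weather only (the Markov assumption).
    probs = TRANSITIONS[today]
    return rng.choices(list(probs), weights=list(probs.values()), k=1)[0]


def simulate(start, days, seed=None):
    # Generate a weather sequence; earlier days are never consulted.
    rng = random.Random(seed)
    history = [start]
    for _ in range(days):
        history.append(next_weather(history[-1], rng))
    return history


print(simulate("sunny", days=7, seed=0))
```

Note that `next_weather` receives only `today`, never the full history, which is exactly what the assumption states.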

 

Bellman equation
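
For reference, one common textbook form of the Bellman equation for the state-value function of a policy is written below. The notation is an assumption on my part, since these notes do not define any symbols: $\pi(a \mid s)$ is the policy, $p(s', r \mid s, a)$ the environment dynamics, and $\gamma$ the discount factor.

```latex
V^{\pi}(s) = \sum_{a} \pi(a \mid s) \sum_{s'} \sum_{r} p(s', r \mid s, a)
             \left[ r + \gamma \, V^{\pi}(s') \right]
```

It relies on the Markov assumption: the value of a state is expressed using only the immediate reward and the value of the next state.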

prediction vs. control.

 
