The following are notes I took while watching Pytorch: Deep Learning and Artificial Intelligence on Udemy.
Reinforcement Learning
supervised/unsupervised learning
An image classifier is just a function.
Supervised learning is like a static function:
you input an image and a result comes out.
Reinforcement Learning: loop
It works like a loop: the agent acts, observes what happens, and acts again.
We want to achieve a goal,
so the agent has to plan for the future.
agent
Tic-tac-Toe
environment: the computer program that implements the tic-tac-toe game
game.start()
game.move()
game.get_state()
game.is_over() -> lets the computer agent check whether the game has ended and who won
O |   |
X | O |
O | X |
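The environment interface listed above can be sketched as a small class. This is only a minimal illustration, not the course's implementation: the method names `start`, `move`, `get_state`, and `is_over` come from the notes, while `legal_moves`, `winner`, `play_episode`, and the flat 9-cell board layout are assumptions added here so the example can run.

```python
import random


class TicTacToe:
    """Hypothetical minimal environment; start/move/get_state/is_over follow the notes."""

    WIN_LINES = [
        (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
        (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
        (0, 4, 8), (2, 4, 6),             # diagonals
    ]

    def start(self):
        # Reset to an empty 3x3 board stored as a flat list; X moves first.
        self.board = [" "] * 9
        self.player = "X"

    def get_state(self):
        # The state is an immutable snapshot of all nine cells.
        return tuple(self.board)

    def legal_moves(self):
        # Assumed helper: indices of the cells that are still empty.
        return [i for i, cell in enumerate(self.board) if cell == " "]

    def move(self, position):
        # Place the current player's mark, then hand the turn to the other player.
        if self.board[position] != " ":
            raise ValueError("cell already occupied")
        self.board[position] = self.player
        self.player = "O" if self.player == "X" else "X"

    def winner(self):
        # Assumed helper: return "X" or "O" if a line is complete, else None.
        for a, b, c in self.WIN_LINES:
            if self.board[a] != " " and self.board[a] == self.board[b] == self.board[c]:
                return self.board[a]
        return None

    def is_over(self):
        # The game ends when someone has won or the board is full.
        return self.winner() is not None or not self.legal_moves()


def play_episode(game, rng):
    """Play one episode with random moves for both sides; return the winner or None."""
    game.start()
    while not game.is_over():
        action = rng.choice(game.legal_moves())
        game.move(action)
    return game.winner()


game = TicTacToe()
result = play_episode(game, random.Random(0))
```

One call to `play_episode` corresponds to one episode: the loop runs from `game.start()` until `game.is_over()` returns True. A learning agent would replace the random choice with its own decision.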
episode
agent, environment, episode
states, actions, rewards
state = the contents of each location on the board (X, O, or empty)
action = where to place the next X/O
reward = a number, received at each step of the game
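To make the three terms concrete for tic-tac-toe, here is a rough sketch. The state encoding (a tuple of nine cells) and the reward scheme (+1 for a win, -1 for a loss, 0 otherwise) are assumptions chosen for illustration; the notes only say that the reward is a number received at each step. All function names are hypothetical.

```python
WIN_LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]


def available_actions(state):
    # An action is the index of an empty cell where the next mark can go.
    return [i for i, cell in enumerate(state) if cell == " "]


def apply_action(state, action, mark):
    # Return the new state after placing `mark` at position `action`.
    cells = list(state)
    cells[action] = mark
    return tuple(cells)


def winner(state):
    # Return "X" or "O" if some line holds three identical marks, else None.
    for a, b, c in WIN_LINES:
        if state[a] != " " and state[a] == state[b] == state[c]:
            return state[a]
    return None


def reward(state, mark):
    # Assumed reward scheme: +1 if `mark` has won, -1 if it has lost, 0 otherwise.
    w = winner(state)
    if w is None:
        return 0
    return 1 if w == mark else -1


# State: one entry per board location, read row by row.
state = ("X", "X", " ",
         "O", "O", " ",
         " ", " ", " ")
actions = available_actions(state)
next_state = apply_action(state, 2, "X")
```

Here the reward is 0 at every intermediate step and becomes nonzero only once the game is decided, which is one common way to assign rewards in board games.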
Example: Maze
Breakout: states, actions, rewards
states: what we observe in the environment; a state can also be derived from current and past observations
actions: the different buttons/inputs on the joystick or control pad
spaces: state spaces
states, actions, rewards, policies
states: discrete or continuous
policies:
a policy is a function or mapping from states to actions
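A policy viewed as a mapping can be as simple as a lookup table. The sketch below shows a deterministic table and a stochastic variant that samples an action from per-state probabilities. The maze state names, the actions, and the probabilities are all made up for illustration.

```python
import random

# Deterministic policy as a plain mapping: state -> action.
# The state names are hypothetical positions in a small maze.
policy_table = {
    "start": "right",
    "corridor": "right",
    "junction": "up",
}


def table_policy(state):
    # The same mapping wrapped as a function.
    return policy_table[state]


# Stochastic policy: state -> probability of each action (assumed values).
stochastic_table = {
    "start": {"right": 0.9, "up": 0.1},
    "junction": {"up": 1.0},
}


def sample_action(state, rng):
    # Draw one action according to the probabilities stored for this state.
    probs = stochastic_table[state]
    actions = list(probs)
    weights = [probs[a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]


rng = random.Random(0)
chosen = sample_action("start", rng)
```

A table like this only works when the state space is discrete and small; with continuous states the mapping is usually represented by a parameterized function such as a neural network.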
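The Markov assumption
1. For example, suppose we predict tomorrow's weather as rainy, sunny, or cloudy.
The Markov assumption is that tomorrow's weather depends only on today's weather, and not on any earlier day.
In other words, each day's weather is related only to the previous day's weather.
2. When predicting the next word: the next word depends only on the current word.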
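The weather example can be written as a small Markov chain. The transition probabilities below are invented for illustration and are not from the course; the point is only that the distribution for tomorrow is looked up from today's weather alone, with no reference to earlier days.

```python
import random

STATES = ["sunny", "cloudy", "rainy"]

# Assumed transition probabilities: TRANSITIONS[today][tomorrow].
TRANSITIONS = {
    "sunny":  {"sunny": 0.7, "cloudy": 0.2, "rainy": 0.1},
    "cloudy": {"sunny": 0.3, "cloudy": 0.4, "rainy": 0.3},
    "rainy":  {"sunny": 0.2, "cloudy": 0.3, "rainy": 0.5},
}


def step_distribution(dist):
    """Propagate a distribution over today's weather one day forward."""
    out = {s: 0.0 for s in STATES}
    for today, p_today in dist.items():
        for tomorrow, p in TRANSITIONS[today].items():
            out[tomorrow] += p_today * p
    return out


def sample_sequence(start, days, rng):
    """Sample a weather sequence; each day depends only on the previous day."""
    seq = [start]
    for _ in range(days):
        probs = TRANSITIONS[seq[-1]]
        seq.append(rng.choices(list(probs), weights=list(probs.values()), k=1)[0])
    return seq


# Distribution two days after a sunny day.
two_days = step_distribution(step_distribution({"sunny": 1.0}))
week = sample_sequence("sunny", 7, random.Random(0))
```

The same structure applies to the next-word example: replace the weather states with words and the table with word-to-word transition probabilities.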
Bellman equation
prediction vs. control.
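For reference, one standard way to write the Bellman equation for the value of a state under a policy is given below. The notation is an assumption, since the notes do not define symbols: $V^{\pi}(s)$ is the value of state $s$ under policy $\pi$, $\pi(a \mid s)$ is the probability of choosing action $a$ in state $s$, $p(s', r \mid s, a)$ is the probability of landing in $s'$ with reward $r$, and $\gamma$ is a discount factor between 0 and 1.

```latex
V^{\pi}(s) = \sum_{a} \pi(a \mid s) \sum_{s',\, r} p(s', r \mid s, a)
             \left[ r + \gamma \, V^{\pi}(s') \right]

V^{*}(s) = \max_{a} \sum_{s',\, r} p(s', r \mid s, a)
           \left[ r + \gamma \, V^{*}(s') \right]
```

Roughly, the first form matches the prediction problem (evaluate the value function of a given policy), and the second matches the control problem (find the values of the best possible policy).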