
The notes below summarize what I learned from the Udemy course "Pytorch: Deep Learning and Artificial Intelligence".

44. Sequence Data

Examples of sequence data:

 

• time series (e.g., the airline passengers dataset)

• speech / audio

• text

 

Bag-of-words example: classify an email as spam vs. not spam.

 

What is a sequence? The simplest case is a 1-D series signal, like the kind we could also fit with linear regression.

 

Shape of a sequence: N x T x D

N = number of samples

D = number of features

T = number of time steps in the sequence

 

e.g., GPS data collected from cars:

N: one sample would be one person's single trip to work

D = 2: the GPS records (latitude, longitude) pairs

T: the number of (lat, lng) measurements taken from start to finish of a single trip
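
As a quick illustration of the N x T x D layout, here is a minimal NumPy sketch with made-up numbers (it assumes every trip has been padded or truncated to the same length T):

import numpy as np

N, T, D = 3, 50, 2                   # 3 trips, 50 measurements per trip, (lat, lng) = 2 features
gps_data = np.random.randn(N, T, D)  # fake readings, only the shape matters here

print(gps_data.shape)     # (3, 50, 2)
print(gps_data[0].shape)  # one trip: (50, 2) = T x D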

 

What about variable-length sequences?

 

N x D x T is an alternative layout.

Image data: N x H x W x C

N x C x H x W (PyTorch, Theano)

In PyTorch, N comes first and C (the number of feature maps / channels) comes next.
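
A small sketch of converting between the two image layouts (random data, just to show the shapes):

import torch

x = torch.randn(32, 28, 28, 3)   # N x H x W x C
x = x.permute(0, 3, 1, 2)        # -> N x C x H x W (PyTorch convention)
print(x.shape)                   # torch.Size([32, 3, 28, 28])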

 

 

 

45. Forecasting

RNNs

Linear Regression

 

Multi-step predictions have to be made in a loop, one step at a time:

x = last values of train set
predictions = []

for i in range(length_of_forecast):
  x_next = model.predict(x)
  predictions.append(x_next)
  x = concat(x[1:], x_next)

 

 

Two possible models: a plain linear model, or a small ANN.

model = nn.Linear(1, 1)

model = nn.Sequential(
  nn.Linear(1, 10),
  nn.ReLU(),
  nn.Linear(10, 1)
)
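
Here is a minimal, runnable PyTorch sketch of the same forecasting loop. It assumes the model takes the last T values as a 1 x T tensor; the model here is untrained, so the numbers are meaningless and it only demonstrates the mechanics:

import torch
import torch.nn as nn

T = 10
model = nn.Linear(T, 1)  # or the nn.Sequential ANN above

train_series = torch.sin(0.1 * torch.arange(100, dtype=torch.float32))
x = train_series[-T:].clone()  # last T values of the train set

predictions = []
with torch.no_grad():
  for i in range(20):  # length of the forecast
    x_next = model(x.view(1, -1))             # shape (1, 1)
    predictions.append(x_next.item())
    x = torch.cat((x[1:], x_next.flatten()))  # slide the window forward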

 

 

46. Autoregressive Linear Model for Time Series

 

Create the model:

model = nn.Linear(T, 1)

 

PyTorch example: autoregressive model on time-series data

import torch
import torch.nn as nn
import numpy as np
import matplotlib.pyplot as plt

Create the data:

N = 1000
series = np.sin(0.1 * np.arange(N))
T = 10
X = []
Y = []

for t in range(len(series) - T):
  x = series[t:t+T]
  X.append(x)
  y = series[t+T]
  Y.append(y)

X = np.array(X).reshape(-1, T)
Y = np.array(Y).reshape(-1, 1)
N = len(X)
print("X.shape" , X.shape, "Y.shape" , Y.shape)

build the model

model = nn.Linear(T, 1)

loss and optimizer

Since this is regression, we use mean squared error.

criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr = 0.1)
X_train = torch.from_numpy(X[:-N//2].astype(np.float32))
y_train = torch.from_numpy(Y[:-N//2].astype(np.float32))
X_test = torch.from_numpy(X[-N//2:].astype(np.float32))
y_test = torch.from_numpy(Y[-N//2:].astype(np.float32))

Train the model:

def full_gd(model, criterion, optimizer, X_train , y_train, X_test, y_test, epochs = 200):

  train_losses = np.zeros(epochs)
  test_losses = np.zeros(epochs)

  for epoch in range(epochs):
    # zero the gradients before running the backward pass
    optimizer.zero_grad()

    outputs = model(X_train)
    loss = criterion(outputs, y_train)

    loss.backward()
    optimizer.step()

    train_losses[epoch] = loss.item()

    test_outputs = model(X_test)
    test_loss = criterion(test_outputs, y_test)
    test_losses[epoch] = test_loss.item()

    if (epoch + 1) % 10 == 0:
      print(f'Epoch {epoch+1}/{epochs}, Train loss: {loss.item():.4f}, Test loss: {test_loss.item():.4f}')

  return train_losses, test_losses

train_losses, test_losses = full_gd(model, criterion, optimizer, X_train , y_train, X_test, y_test)

Plot the losses:

plt.plot(train_losses, label='train loss')
plt.plot(test_losses, label='test loss')
plt.legend()
plt.show()

Show the "incorrect" forecast first (one-step predictions that use the true test values as inputs):

validation_target = Y[-N//2:]
validation_predictions = []

i = 0
while len(validation_predictions) < len(validation_target):
  input_ = X_test[i].view(1, -1)
  p = model(input_)[0, 0].item() # tensor -> scalar

  i += 1

  validation_predictions.append(p)

plt.plot(validation_target, label='forecast target')
plt.plot(validation_predictions, label='forecast prediction')
plt.legend()
plt.show()

Now the "correct" multi-step forecast, where each prediction is fed back in as the next input:

validation_target = Y[-N//2:]
validation_predictions = []

last_x = torch.from_numpy(X[-N//2].astype(np.float32))

while len(validation_predictions) < len(validation_target):
  input_ = last_x.view(1, -1)
  p = model(input_)

  validation_predictions.append(p[0, 0].item())

  # drop the oldest value and append the new prediction
  last_x = torch.cat((last_x[1:], p[0]))

plt.plot(validation_target, label='forecast target')
plt.plot(validation_predictions, label='forecast prediction')
plt.legend()
plt.show()

47. Proof that the Linear Model Works

np.sin(0.1 * np.arange(200))

np.arange(200) gives the time steps 0, 1, ..., 199, and 0.1 is the angular frequency.
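
Why a linear model can forecast this series exactly: from the trig identity sin(w(t+1)) = 2·cos(w)·sin(wt) − sin(w(t−1)), each value of the sine wave is an exact linear combination of the previous two, i.e. a noise-free AR(2) process. A quick numerical sanity check (my sketch, not the course code):

import numpy as np

w = 0.1
series = np.sin(w * np.arange(200))

# x[t+1] = 2*cos(w)*x[t] - x[t-1]
pred = 2 * np.cos(w) * series[1:-1] - series[:-2]
print(np.allclose(pred, series[2:]))  # True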

 

 

48. Recurrent Neural Networks

Why are RNNs useful?

 

D = 100

T = 10,000

T x D = 1 million

 

An ANN is completely general:

- it connects every input to every feature in the next hidden layer

 

hidden state

input -> hidden -> output

 

RNN Equation

 

Wxh - input to hidden weight

Whh - hidden to hidden weight

bh - hidden bias

Wo - hidden to output weight

bo - output bias

X - T x D input matrix

 

tanh hidden activation

softmax output activation
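
Putting these together, each time step computes (same notation as the pseudocode below):

h(t)    = tanh( x(t)·Wxh + h(t-1)·Whh + bh )
yhat(t) = softmax( h(t)·Wo + bo )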

 

yhats = []

h_last = h0

for t in range(T):

  h_t = tanh(X[t].dot(Wxh) + h_last.dot(Whh) + bh)

  y_t = softmax(h_t.dot(Wo) + bo)

  yhats.append(y_t)

  h_last = h_t

 

Biological Inspiration

CNNs have "shared weights" to take advantage of structure, resulting in savings; an RNN does the same by sharing its weights across time steps.

 

Calculating our savings (flattened ANN)

T = 100

D = 10

M = 15 (this is still a hyperparameter)

Flattened input vector: T x D = 1000

We will have T hidden states: T x M = 1500

Assume binary classification: K = 1

Input-to-hidden weight: 1000 x 1500 = 1.5 million

Hidden-to-output weight: 1500

Total: ~1.5 million

 

 

Calculating our savings (RNN)

Wxh: D x M = 10 x 15 = 150

Whh: M x M = 15 x 15 = 225

Wo: M x K = 15 x 1 = 15

Total = 150 + 225 + 15 = 390

Savings: 1,501,500 / 390 = 3,850x
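
We can sanity-check these counts in PyTorch (note that nn.RNN and nn.Linear also include bias terms, so the total comes out a bit above 390):

import torch.nn as nn

rnn = nn.RNN(input_size=10, hidden_size=15, batch_first=True)  # D=10, M=15
fc = nn.Linear(15, 1)                                          # M=15, K=1

n_params = sum(p.numel() for p in rnn.parameters()) + sum(p.numel() for p in fc.parameters())
print(n_params)  # 421 = 150 + 225 + 15 weights, plus 31 bias terms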

 

 

 

49. RNN Code Preparation

Steps:

  • Load in the data

    The dataset must be reshaped to N x T x D for the RNN.

  • Build the model (same pattern as the earlier models)

    Inside SimpleRNN.__init__, the core is an nn.RNN layer:

    self.rnn = nn.RNN(
      input_size = num_inputs,
      hidden_size = num_hidden,
      num_layers = num_layers,
      nonlinearity = 'relu',
      batch_first = True
    )

 

An RNN has feedback: the hidden state feeds back into itself at the next time step.

 

Full constructor:

class SimpleRNN(nn.Module):
  def __init__(self, n_inputs, n_hidden, n_rnnlayers, n_outputs):
    super(SimpleRNN, self).__init__()
    self.D = n_inputs
    self.M = n_hidden
    self.K = n_outputs
    self.L = n_rnnlayers

    self.rnn = nn.RNN(
      input_size = self.D,
      hidden_size = self.M,
      num_layers = self.L,
      nonlinearity = 'relu',
      batch_first = True)

    self.fc = nn.Linear(self.M, self.K) # dense output layer

 

   

Forward function:

def forward(self, X):
  h0 = torch.zeros(self.L, X.size(0), self.M).to(device)
  out, _ = self.rnn(X, h0) # hidden state of every time step
  out = self.fc(out[:, -1, :]) # use only the last time step for the prediction
  return out

 

  • train the model

Initialize the model:
model = SimpleRNN(n_inputs = 1, n_hidden = 5, n_rnnlayers =1, n_outputs=1)

model.to(device)

 

loss and optimizer

criterion = nn.MSELoss()

optimizer = torch.optim.Adam(model.parameters(), lr = 0.1)

 

make inputs and targets

 

 

  • evaluate the model
  • make predictions

N x T x D -> N x K (output)

input_ = X_test[i].reshape(1, T,1)

p = model(input_)[0,0].item()

 

 

50. RNN for Time Series Prediction

simple rnn sine

 

import torch
import torch.nn as nn
import numpy as np
import matplotlib.pyplot as plt

 

create data

N = 1000
series = np.sin(0.1 * np.arange(N))

plt.plot(series)
plt.show()

 

create dataset

T = 10
X = []
Y = []

for t in range(len(series) - T):
  x = series[t:t+T]
  X.append(x)
  y = series[t+T]
  Y.append(y)

X = np.array(X).reshape(-1, T, 1)
Y = np.array(Y).reshape(-1, 1)
N = len(X)
print("X.shape" , X.shape, "Y.shape" , Y.shape)

Use CUDA (the GPU) if available:

device = torch.device("cuda:0" if torch.cuda.is_available() else 'cpu')
print(device)

Create the model:

class SimpleRNN(nn.Module):
  def __init__ (self, n_inputs, n_hidden, n_rnnlayers, n_outputs):
    super(SimpleRNN, self).__init__()
    self.D = n_inputs
    self.M = n_hidden
    self.K = n_outputs
    self.L = n_rnnlayers

    self.rnn = nn.RNN(
      input_size = self.D,
      hidden_size = self.M,
      num_layers = self.L,
      nonlinearity = 'relu',
      batch_first = True
    )

    self.fc = nn.Linear(self.M, self.K)

  def forward(self, X):
    
    h0 = torch.zeros(self.L, X.size(0), self.M).to(device)
    out, _ = self.rnn(X, h0)

    out = self.fc(out[:,-1,:]) 
    return out
model = SimpleRNN(n_inputs=1, n_hidden=5, n_rnnlayers=1, n_outputs=1)
model.to(device)

 

If the model is on the GPU, all of the data must be moved to the GPU as well; otherwise you get a CUDA error.

 

An ANN is more flexible than a CNN, but that does not mean it performs better.

criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr = 0.1)
X_train = torch.from_numpy(X[:-N//2].astype(np.float32))
y_train = torch.from_numpy(Y[:-N//2].astype(np.float32))
X_test = torch.from_numpy(X[-N//2:].astype(np.float32))
y_test = torch.from_numpy(Y[-N//2:].astype(np.float32))

 

Move the data to the GPU:

X_train, y_train = X_train.to(device), y_train.to(device)
X_test , y_test = X_test.to(device) , y_test.to(device)
def full_gd(model, criterion, optimizer, X_train , y_train, X_test, y_test, epochs = 200):

  train_losses = np.zeros(epochs)
  test_losses = np.zeros(epochs)

  for epoch in range(epochs):
    # zero the gradients before running the backward pass
    optimizer.zero_grad()

    outputs = model(X_train)
    loss = criterion(outputs, y_train)

    loss.backward()
    optimizer.step()

    train_losses[epoch] = loss.item()

    test_outputs = model(X_test)
    test_loss = criterion(test_outputs, y_test)
    test_losses[epoch] = test_loss.item()

    if (epoch + 1) % 10 == 0:
      print(f'Epoch {epoch+1}/{epochs}, Train loss: {loss.item():.4f}, Test loss: {test_loss.item():.4f}')

  return train_losses, test_losses

train_losses, test_losses = full_gd(model, criterion, optimizer, X_train , y_train, X_test, y_test)
plt.plot(train_losses , label='train_losses')
plt.plot(test_losses , label='test_losses')
plt.legend()
plt.show()
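
To actually forecast with the trained RNN, we can reuse the multi-step loop from the linear model; the only change is that the input now has shape 1 x T x 1. This is a sketch reusing the variables defined above:

validation_target = Y[-N//2:]
validation_predictions = []

# start from the last training window, shape (T, 1)
last_x = torch.from_numpy(X[-N//2].astype(np.float32)).to(device)

while len(validation_predictions) < len(validation_target):
  input_ = last_x.view(1, T, 1)  # N x T x D
  p = model(input_)              # shape (1, 1)

  validation_predictions.append(p[0, 0].item())

  # drop the oldest value and append the new prediction
  last_x = torch.cat((last_x[1:], p[0].view(1, 1)))

plt.plot(validation_target, label='forecast target')
plt.plot(validation_predictions, label='forecast prediction')
plt.legend()
plt.show()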

 

51. Paying Attention to Shapes

N = 1   # number of samples
T = 10  # sequence length
D = 3   # number of input features
M = 5   # number of hidden units
K = 2   # number of outputs
X = np.random.randn(N, T, D)

We don't use the GPU here.

class SimpleRNN(nn.Module):
  def __init__ (self, n_inputs, n_hidden, n_outputs):
    super(SimpleRNN, self).__init__()
    self.D = n_inputs
    self.M = n_hidden
    self.K = n_outputs

    self.rnn = nn.RNN(
      input_size = self.D,
      hidden_size = self.M,
      nonlinearity = 'tanh',
      batch_first = True
    )

    self.fc = nn.Linear(self.M, self.K)

  def forward(self, X):
    
    h0 = torch.zeros(1, X.size(0), self.M)
    out, _ = self.rnn(X, h0)

    out = self.fc(out) 
    return out
model = SimpleRNN(n_inputs=D, n_hidden=M, n_outputs=K)
inputs = torch.from_numpy(X.astype(np.float32))
out = model(inputs)
out # shape N x T x K = (1, 10, 2)
W_xh, W_hh, b_xh, b_hh = model.rnn.parameters()
W_xh = W_xh.data.numpy()
b_xh = b_xh.data.numpy()
W_hh = W_hh.data.numpy()
b_hh = b_hh.data.numpy()

Wo, bo = model.fc.parameters() # output (dense) layer weights
Wo = Wo.data.numpy()
bo = bo.data.numpy()
Wo.shape, bo.shape
h_last = np.zeros(M)
x = X[0]
yhats = np.zeros((T, K))

for t in range(T):
  h = np.tanh(x[t].dot(W_xh.T) + b_xh + h_last.dot(W_hh.T) + b_hh)
  y = h.dot(Wo.T) + bo
  yhats[t] = y

  h_last = h

print(yhats)
yhats_torch = out.detach().numpy()[0] # shape (T, K)
np.allclose(yhats, yhats_torch) # should print True