
The notes below summarize what I learned from the Udemy course "Pytorch: Deep Learning and Artificial Intelligence".

44. Sequence Data

Examples of sequence data:

 

• time series (e.g., the airline passengers dataset)

• speech / audio

• text

 

Bag-of-words example: classify an email as spam vs. not spam.

 

What is a sequence? The simplest case is a 1-D series signal, like the kind we could also fit with linear regression.

 

Shape of a sequence: N x T x D

N = number of samples

D = number of features

T = number of time steps in the sequence

 

e.g., GPS data collected from cars:

N: one sample would be one person's single trip to work

D = 2: the GPS records (latitude, longitude) pairs

T: the number of (lat, lng) measurements taken from start to finish of a single trip
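
As a quick illustration of the N x T x D layout, here is a minimal NumPy sketch with made-up numbers (it assumes every trip has been padded or truncated to the same length T):

import numpy as np

N, T, D = 3, 50, 2                   # 3 trips, 50 measurements per trip, (lat, lng) = 2 features
gps_data = np.random.randn(N, T, D)  # fake readings, only the shape matters here

print(gps_data.shape)     # (3, 50, 2)
print(gps_data[0].shape)  # one trip: (50, 2) = T x D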

 

What about variable-length sequences?

 

N x D x T is an alternative layout.

Image data: N x H x W x C

N x C x H x W (PyTorch, Theano)

In PyTorch, N comes first and C (the number of feature maps / channels) comes next.
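
A small sketch of converting between the two image layouts (random data, just to show the shapes):

import torch

x = torch.randn(32, 28, 28, 3)   # N x H x W x C
x = x.permute(0, 3, 1, 2)        # -> N x C x H x W (PyTorch convention)
print(x.shape)                   # torch.Size([32, 3, 28, 28])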

 

 

 

45. Forecasting

RNNs

Linear Regression

 

Multi-step predictions have to be made in a loop, one step at a time:

x = last values of train set
predictions = []

for i in range(length_of_forecast):
  x_next = model.predict(x)
  predictions.append(x_next)
  x = concat(x[1:], x_next)

 

 

Two possible models: a plain linear model, or a small ANN.

model = nn.Linear(1, 1)

model = nn.Sequential(
  nn.Linear(1, 10),
  nn.ReLU(),
  nn.Linear(10, 1)
)
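
Here is a minimal, runnable PyTorch sketch of the same forecasting loop. It assumes the model takes the last T values as a 1 x T tensor; the model here is untrained, so the numbers are meaningless and it only demonstrates the mechanics:

import torch
import torch.nn as nn

T = 10
model = nn.Linear(T, 1)  # or the nn.Sequential ANN above

train_series = torch.sin(0.1 * torch.arange(100, dtype=torch.float32))
x = train_series[-T:].clone()  # last T values of the train set

predictions = []
with torch.no_grad():
  for i in range(20):  # length of the forecast
    x_next = model(x.view(1, -1))             # shape (1, 1)
    predictions.append(x_next.item())
    x = torch.cat((x[1:], x_next.flatten()))  # slide the window forward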

 

 

46. Autoregressive Linear Model for Time Series

 

Create the model:

model = nn.Linear(T, 1)

 

PyTorch example: autoregressive model on time-series data

import torch
import torch.nn as nn
import numpy as np
import matplotlib.pyplot as plt

Create the data:

N = 1000
series = np.sin(0.1 * np.arange(N))
T = 10
X = []
Y = []

for t in range(len(series) - T):
  x = series[t:t+T]
  X.append(x)
  y = series[t+T]
  Y.append(y)

X = np.array(X).reshape(-1, T)
Y = np.array(Y).reshape(-1, 1)
N = len(X)
print("X.shape" , X.shape, "Y.shape" , Y.shape)

build the model

model = nn.Linear(T, 1)

loss and optimizer

Since this is regression, we use mean squared error.

criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr = 0.1)
X_train = torch.from_numpy(X[:-N//2].astype(np.float32))
y_train = torch.from_numpy(Y[:-N//2].astype(np.float32))
X_test = torch.from_numpy(X[-N//2:].astype(np.float32))
y_test = torch.from_numpy(Y[-N//2:].astype(np.float32))

Train the model:

def full_gd(model, criterion, optimizer, X_train , y_train, X_test, y_test, epochs = 200):

  train_losses = np.zeros(epochs)
  test_losses = np.zeros(epochs)

  for epoch in range(epochs):
    # zero the gradients before running the backward pass
    optimizer.zero_grad()

    outputs = model(X_train)
    loss = criterion(outputs, y_train)

    loss.backward()
    optimizer.step()

    train_losses[epoch] = loss.item()

    test_outputs = model(X_test)
    test_loss = criterion(test_outputs, y_test)
    test_losses[epoch] = test_loss.item()

    if (epoch + 1) % 10 == 0:
      print(f'Epoch {epoch+1}/{epochs}, Train loss: {loss.item():.4f}, Test loss: {test_loss.item():.4f}')

  return train_losses, test_losses

train_losses, test_losses = full_gd(model, criterion, optimizer, X_train , y_train, X_test, y_test)

Plot the losses:

plt.plot(train_losses, label='train loss')
plt.plot(test_losses, label='test loss')
plt.legend()
plt.show()

Show the "incorrect" forecast first (one-step predictions that use the true test values as inputs):

validation_target = Y[-N//2:]
validation_predictions = []

i = 0
while len(validation_predictions) < len(validation_target):
  input_ = X_test[i].view(1, -1)
  p = model(input_)[0, 0].item() # tensor -> scalar

  i += 1

  validation_predictions.append(p)

plt.plot(validation_target, label='forecast target')
plt.plot(validation_predictions, label='forecast prediction')
plt.legend()
plt.show()

Now the "correct" multi-step forecast, where each prediction is fed back in as the next input:

validation_target = Y[-N//2:]
validation_predictions = []

last_x = torch.from_numpy(X[-N//2].astype(np.float32))

while len(validation_predictions) < len(validation_target):
  input_ = last_x.view(1, -1)
  p = model(input_)

  validation_predictions.append(p[0, 0].item())

  # drop the oldest value and append the new prediction
  last_x = torch.cat((last_x[1:], p[0]))

plt.plot(validation_target, label='forecast target')
plt.plot(validation_predictions, label='forecast prediction')
plt.legend()
plt.show()

47. Proof that the Linear Model Works

np.sin(0.1 * np.arange(200))

np.arange(200) gives the time steps 0, 1, ..., 199, and 0.1 is the angular frequency.
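
Why a linear model can forecast this series exactly: from the trig identity sin(w(t+1)) = 2·cos(w)·sin(wt) − sin(w(t−1)), each value of the sine wave is an exact linear combination of the previous two, i.e. a noise-free AR(2) process. A quick numerical sanity check (my sketch, not the course code):

import numpy as np

w = 0.1
series = np.sin(w * np.arange(200))

# x[t+1] = 2*cos(w)*x[t] - x[t-1]
pred = 2 * np.cos(w) * series[1:-1] - series[:-2]
print(np.allclose(pred, series[2:]))  # True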

 

 

48. Recurrent Neural Networks

Why are RNNs useful?

 

D = 100

T = 10,000

T x D = 1 million

 

An ANN is completely general:

- it connects every input to every feature in the next hidden layer

 

hidden state

input -> hidden -> output

 

RNN Equation

 

Wxh - input to hidden weight

Whh - hidden to hidden weight

bh - hidden bias

Wo - hidden to output weight

bo - output bias

X - T x D input matrix

 

tanh hidden activation

softmax output activation
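
Putting these together, each time step computes (same notation as the pseudocode below):

h(t)    = tanh( x(t)·Wxh + h(t-1)·Whh + bh )
yhat(t) = softmax( h(t)·Wo + bo )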

 

yhats = []

h_last = h0

for t in range(T):

  h_t = tanh(X[t].dot(Wxh) + h_last.dot(Whh) + bh)

  y_t = softmax(h_t.dot(Wo) + bo)

  yhats.append(y_t)

  h_last = h_t

 

Biological Inspiration

CNNs have "shared weights" to take advantage of structure, resulting in savings; an RNN does the same by sharing its weights across time steps.

 

Calculating our savings (flattened ANN)

T = 100

D = 10

M = 15 (this is still a hyperparameter)

Flattened input vector: T x D = 1000

We will have T hidden states: T x M = 1500

Assume binary classification: K = 1

Input-to-hidden weight: 1000 x 1500 = 1.5 million

Hidden-to-output weight: 1500

Total: ~1.5 million

 

 

Calculating our savings (RNN)

Wxh: D x M = 10 x 15 = 150

Whh: M x M = 15 x 15 = 225

Wo: M x K = 15 x 1 = 15

Total = 150 + 225 + 15 = 390

Savings: 1,501,500 / 390 = 3,850x
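
We can sanity-check these counts in PyTorch (note that nn.RNN and nn.Linear also include bias terms, so the total comes out a bit above 390):

import torch.nn as nn

rnn = nn.RNN(input_size=10, hidden_size=15, batch_first=True)  # D=10, M=15
fc = nn.Linear(15, 1)                                          # M=15, K=1

n_params = sum(p.numel() for p in rnn.parameters()) + sum(p.numel() for p in fc.parameters())
print(n_params)  # 421 = 150 + 225 + 15 weights, plus 31 bias terms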

 

 

 

49. RNN Code Preparation

Steps:

  • Load in the data

    The dataset must be reshaped to N x T x D for the RNN.

  • Build the model (same pattern as the earlier models)

    Inside SimpleRNN.__init__, the core is an nn.RNN layer:

    self.rnn = nn.RNN(
      input_size = num_inputs,
      hidden_size = num_hidden,
      num_layers = num_layers,
      nonlinearity = 'relu',
      batch_first = True
    )

 

An RNN has feedback: the hidden state feeds back into itself at the next time step.

 

Full constructor:

class SimpleRNN(nn.Module):
  def __init__(self, n_inputs, n_hidden, n_rnnlayers, n_outputs):
    super(SimpleRNN, self).__init__()
    self.D = n_inputs
    self.M = n_hidden
    self.K = n_outputs
    self.L = n_rnnlayers

    self.rnn = nn.RNN(
      input_size = self.D,
      hidden_size = self.M,
      num_layers = self.L,
      nonlinearity = 'relu',
      batch_first = True)

    self.fc = nn.Linear(self.M, self.K) # dense output layer

 

   

Forward function:

def forward(self, X):
  h0 = torch.zeros(self.L, X.size(0), self.M).to(device)
  out, _ = self.rnn(X, h0) # hidden state of every time step
  out = self.fc(out[:, -1, :]) # use only the last time step for the prediction
  return out

 

  • train the model

Initialize the model:
model = SimpleRNN(n_inputs = 1, n_hidden = 5, n_rnnlayers =1, n_outputs=1)

model.to(device)

 

loss and optimizer

criterion = nn.MSELoss()

optimizer = torch.optim.Adam(model.parameters(), lr = 0.1)

 

make inputs and targets

 

 

  • evaluate the model
  • make predictions

N x T x D -> N x K (output)

input_ = X_test[i].reshape(1, T,1)

p = model(input_)[0,0].item()

 

 

50. RNN for Time Series Prediction

simple rnn sine

 

import torch
import torch.nn as nn
import numpy as np
import matplotlib.pyplot as plt

 

create data

N = 1000
series = np.sin(0.1 * np.arange(N))

plt.plot(series)
plt.show()

 

create dataset

T = 10
X = []
Y = []

for t in range(len(series) - T):
  x = series[t:t+T]
  X.append(x)
  y = series[t+T]
  Y.append(y)

X = np.array(X).reshape(-1, T, 1)
Y = np.array(Y).reshape(-1, 1)
N = len(X)
print("X.shape" , X.shape, "Y.shape" , Y.shape)

Use CUDA (the GPU) if available:

device = torch.device("cuda:0" if torch.cuda.is_available() else 'cpu')
print(device)

Create the model:

class SimpleRNN(nn.Module):
  def __init__ (self, n_inputs, n_hidden, n_rnnlayers, n_outputs):
    super(SimpleRNN, self).__init__()
    self.D = n_inputs
    self.M = n_hidden
    self.K = n_outputs
    self.L = n_rnnlayers

    self.rnn = nn.RNN(
      input_size = self.D,
      hidden_size = self.M,
      num_layers = self.L,
      nonlinearity = 'relu',
      batch_first = True
    )

    self.fc = nn.Linear(self.M, self.K)

  def forward(self, X):
    
    h0 = torch.zeros(self.L, X.size(0), self.M).to(device)
    out, _ = self.rnn(X, h0)

    out = self.fc(out[:,-1,:]) 
    return out
model = SimpleRNN(n_inputs=1, n_hidden=5, n_rnnlayers=1, n_outputs=1)
model.to(device)

 

If the model is on the GPU, all of the data must be moved to the GPU as well; otherwise you get a CUDA error.

 

An ANN is more flexible than a CNN, but that does not mean it performs better.

criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr = 0.1)
X_train = torch.from_numpy(X[:-N//2].astype(np.float32))
y_train = torch.from_numpy(Y[:-N//2].astype(np.float32))
X_test = torch.from_numpy(X[-N//2:].astype(np.float32))
y_test = torch.from_numpy(Y[-N//2:].astype(np.float32))

 

Move the data to the GPU:

X_train, y_train = X_train.to(device), y_train.to(device)
X_test , y_test = X_test.to(device) , y_test.to(device)
def full_gd(model, criterion, optimizer, X_train , y_train, X_test, y_test, epochs = 200):

  train_losses = np.zeros(epochs)
  test_losses = np.zeros(epochs)

  for epoch in range(epochs):
    # zero the gradients before running the backward pass
    optimizer.zero_grad()

    outputs = model(X_train)
    loss = criterion(outputs, y_train)

    loss.backward()
    optimizer.step()

    train_losses[epoch] = loss.item()

    test_outputs = model(X_test)
    test_loss = criterion(test_outputs, y_test)
    test_losses[epoch] = test_loss.item()

    if (epoch + 1) % 10 == 0:
      print(f'Epoch {epoch+1}/{epochs}, Train loss: {loss.item():.4f}, Test loss: {test_loss.item():.4f}')

  return train_losses, test_losses

train_losses, test_losses = full_gd(model, criterion, optimizer, X_train , y_train, X_test, y_test)
plt.plot(train_losses , label='train_losses')
plt.plot(test_losses , label='test_losses')
plt.legend()
plt.show()
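
To actually forecast with the trained RNN, we can reuse the multi-step loop from the linear model; the only change is that the input now has shape 1 x T x 1. This is a sketch reusing the variables defined above:

validation_target = Y[-N//2:]
validation_predictions = []

# start from the last training window, shape (T, 1)
last_x = torch.from_numpy(X[-N//2].astype(np.float32)).to(device)

while len(validation_predictions) < len(validation_target):
  input_ = last_x.view(1, T, 1)  # N x T x D
  p = model(input_)              # shape (1, 1)

  validation_predictions.append(p[0, 0].item())

  # drop the oldest value and append the new prediction
  last_x = torch.cat((last_x[1:], p[0].view(1, 1)))

plt.plot(validation_target, label='forecast target')
plt.plot(validation_predictions, label='forecast prediction')
plt.legend()
plt.show()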

 

51. Paying Attention to Shapes

N = 1   # number of samples
T = 10  # sequence length
D = 3   # number of input features
M = 5   # number of hidden units
K = 2   # number of outputs
X = np.random.randn(N, T, D)

We don't use the GPU here.

class SimpleRNN(nn.Module):
  def __init__ (self, n_inputs, n_hidden, n_outputs):
    super(SimpleRNN, self).__init__()
    self.D = n_inputs
    self.M = n_hidden
    self.K = n_outputs

    self.rnn = nn.RNN(
      input_size = self.D,
      hidden_size = self.M,
      nonlinearity = 'tanh',
      batch_first = True
    )

    self.fc = nn.Linear(self.M, self.K)

  def forward(self, X):
    
    h0 = torch.zeros(1, X.size(0), self.M)
    out, _ = self.rnn(X, h0)

    out = self.fc(out) 
    return out
model = SimpleRNN(n_inputs=D, n_hidden=M, n_outputs=K)
inputs = torch.from_numpy(X.astype(np.float32))
out = model(inputs)
out # shape N x T x K = (1, 10, 2)
W_xh, W_hh, b_xh, b_hh = model.rnn.parameters()
W_xh = W_xh.data.numpy()
b_xh = b_xh.data.numpy()
W_hh = W_hh.data.numpy()
b_hh = b_hh.data.numpy()

Wo, bo = model.fc.parameters() # output (dense) layer weights
Wo = Wo.data.numpy()
bo = bo.data.numpy()
Wo.shape, bo.shape
h_last = np.zeros(M)
x = X[0]
yhats = np.zeros((T, K))

for t in range(T):
  h = np.tanh(x[t].dot(W_xh.T) + b_xh + h_last.dot(W_hh.T) + b_hh)
  y = h.dot(Wo.T) + bo
  yhats[t] = y

  h_last = h

print(yhats)
yhats_torch = out.detach().numpy()[0] # shape (T, K)
np.allclose(yhats, yhats_torch) # should print True