import torch
import torch.nn as nn
import torchtext.data as ttd
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from datetime import datetime
data = {
    "label": [0, 1, 1],
    "data": [
        "I like eggs and ham.",
        "Eggs I like!",
        "Ham and eggs or just ham?"
    ]
}
df = pd.DataFrame(data)
df.head()
Recall that for images, a PyTorch CNN expects input of shape N x C x H x W
("feature first"),
whereas in TensorFlow / OpenCV / others it's N x H x W x C
("feature last").
The torchvision data loaders hide this detail.
In NLP, the output of an embedding layer is N x T x D ("feature last"),
but nn.Conv1d() expects N x D x T as input ("feature first").
So we must reshape before and after the convolutions.
Text Classification with CNNs
The output of the embedding layer is always (N, T, D),
but Conv1d expects (N, D, T),
so we call out.permute(0, 2, 1) before the convolutions
and change it back afterwards with another out.permute(0, 2, 1).
CNNs also give good results on this task.
import torch.nn.functional as F  # needed for F.relu below

class CNN(nn.Module):
    def __init__(self, n_vocab, embed_dim, n_outputs):
        super(CNN, self).__init__()
        self.V = n_vocab
        self.D = embed_dim
        self.K = n_outputs

        self.embed = nn.Embedding(self.V, self.D)
        self.conv1 = nn.Conv1d(self.D, 32, 3, padding = 1)
        self.pool1 = nn.MaxPool1d(2)
        self.conv2 = nn.Conv1d(32, 64, 3, padding = 1)
        self.pool2 = nn.MaxPool1d(2)
        self.conv3 = nn.Conv1d(64, 128, 3, padding = 1)
        self.fc = nn.Linear(128, self.K)

    def forward(self, X):
        out = self.embed(X)            # (N, T, D)
        out = out.permute(0, 2, 1)     # (N, D, T) for Conv1d
        out = self.conv1(out)
        out = F.relu(out)
        out = self.pool1(out)
        out = self.conv2(out)
        out = F.relu(out)
        out = self.pool2(out)
        out = self.conv3(out)
        out = F.relu(out)
        out = out.permute(0, 2, 1)     # back to (N, T', 128)
        out, _ = torch.max(out, 1)     # global max pool over time
        out = self.fc(out)
        return out
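A quick shape check for the class above. This is a minimal sketch with made-up sizes (vocabulary of 1000, batch of 8, sequences of 50 tokens), not values from the course:

model = CNN(n_vocab = 1000, embed_dim = 20, n_outputs = 1)
dummy_tokens = torch.randint(0, 1000, (8, 50))   # a fake batch of token indices
print(model(dummy_tokens).shape)                 # torch.Size([8, 1])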
Making predictions with the trained NLP model
single_sentence = 'Our dating service has been asked 2 contast U by someone shy!'
toks = TEXT.preprocess(single_sentence)   # tokenize with the torchtext Field
sent_idx = TEXT.numericalize([toks])      # map tokens to vocabulary indices
model(sent_idx.to(device))                # raw output from the trained model
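To turn the raw output into a spam / not-spam decision, one option is to apply a sigmoid and threshold it. This is a sketch that assumes the model produces a single logit per sentence (binary classification):

with torch.no_grad():
    logit = model(sent_idx.to(device))
    prob = torch.sigmoid(logit)          # probability of the positive (spam) class
    print(prob.item(), (prob > 0.5).item())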
The notes below are a summary of the Udemy course PyTorch: Deep Learning and Artificial Intelligence.
GRU and LSTM
Modern RNN Units
LSTM
A GRU is like a simplified version of an LSTM (fewer parameters, and thus more efficient).
The reason a simple RNN is not enough is the vanishing gradient.
Using ReLU can help with the vanishing gradient, but GRU and LSTM turned out to be more effective solutions.
A simple RNN is described by a single equation: h(t) = tanh(W_xh x(t) + b_xh + W_hh h(t-1) + b_hh).
GRU (Gated Recurrent Unit)
Simple RNNs have no choice but to eventually forget, due to the vanishing gradient,
so they have trouble learning long-term dependencies.
The GRU uses binary classifiers (logistic regression neurons) as its gates:
the hidden state becomes a weighted sum of the previous hidden state and the new value (allowing it to remember the old state),
and the weights of that sum are controlled by "gates", which act like binary classifiers / logistic regressions / neurons
(a hand-written GRU step is sketched below).
The GRU has fewer parameters than the LSTM, and is therefore more efficient.
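A single GRU step written out by hand, to make the "gates are logistic regressions" idea concrete. This is a minimal sketch of one common GRU formulation; the function and weight names below are my own, not from the course:

import numpy as np

def sigmoid(a):
    return 1 / (1 + np.exp(-a))

def gru_step(x, h_prev, Wxr, Whr, br, Wxz, Whz, bz, Wxh, Whh, bh):
    r = sigmoid(x @ Wxr + h_prev @ Whr + br)              # reset gate: a logistic regression
    z = sigmoid(x @ Wxz + h_prev @ Whz + bz)              # update gate: a logistic regression
    h_hat = np.tanh(x @ Wxh + (r * h_prev) @ Whh + bh)    # candidate new hidden state
    return z * h_prev + (1 - z) * h_hat                   # weighted sum of old state and new value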
LSTM (Long Short-Term Memory)
Simple RNN GRU LSTM code
nn.RNN(
input_size = self.D,
hidden_size = self.M,
num_layers = self.L,
nonlinearity = 'relu',
batch_first = True
)
GRU
nn.GRU(
input_size = self.D,
hidden_size = self.M,
num_layers = self.L,
batch_first = True
)
LSTM
nn.LSTM(
input_size = self.D,
hidden_size = self.M,
num_layers = self.L,
batch_first = True
)
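All three recurrent layers can be dropped into the same module; only the constructor call changes (nn.LSTM also returns a cell state, which the underscore below absorbs). A minimal sketch, assuming batch-first N x T x D input; the class name RNNModel and the unit argument are my own:

class RNNModel(nn.Module):
    def __init__(self, n_inputs, n_hidden, n_layers, n_outputs, unit = 'lstm'):
        super(RNNModel, self).__init__()
        self.D, self.M, self.L, self.K = n_inputs, n_hidden, n_layers, n_outputs
        rnn_class = {'rnn': nn.RNN, 'gru': nn.GRU, 'lstm': nn.LSTM}[unit]
        self.rnn = rnn_class(
            input_size = self.D,
            hidden_size = self.M,
            num_layers = self.L,
            batch_first = True
        )
        self.fc = nn.Linear(self.M, self.K)

    def forward(self, X):
        out, _ = self.rnn(X)           # hidden (and cell) states default to zeros
        return self.fc(out[:, -1, :])  # keep only the last time step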
A more challenging Sequence
pytorch nonlinear sequence Linear code
import torch
import torch.nn as nn
import numpy as np
import matplotlib.pyplot as plt
series = np.sin((0.1 * np.arange(400)) ** 2)
Create the data
T = 10
D = 1
X = []
Y = []
for t in range(len(series) - T) :
x = series[t:t+T]
X.append(x)
y = series[t+T]
Y.append(y)
X = np.array(X).reshape(-1, T)
Y = np.array(Y).reshape(-1, 1)
N = len(X)
print(X.shape, " " , Y.shape)
model = nn.Linear(T, 1)
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr = 0.1)
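The notes skip the tensor conversion and training loop here; a minimal sketch of the missing piece (full-batch gradient descent, with the train/test split done by halves; it also defines the X_test used just below):

X_t = torch.from_numpy(X.astype(np.float32))
Y_t = torch.from_numpy(Y.astype(np.float32))
X_train, Y_train = X_t[:-N//2], Y_t[:-N//2]   # first half for training
X_test, Y_test = X_t[-N//2:], Y_t[-N//2:]     # second half for validation

for epoch in range(200):
    optimizer.zero_grad()
    loss = criterion(model(X_train), Y_train)
    loss.backward()
    optimizer.step()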
validation_target = Y[-N//2:]
with torch.no_grad():
validation_predictions = model(X_test).numpy()
The linear model gives terrible results on this nonlinear series.
pytorch nonlinear sequence SimpleRNN code
T = 10
D = 1
X = []
Y = []
for t in range(len(series) - T) :
x = series[t:t+T]
X.append(x)
y = series[t+T]
Y.append(y)
X = np.array(X).reshape(-1, T, 1)  # make it N x T x D (D = 1)
Y = np.array(Y).reshape(-1, 1)
N = len(X)
print(X.shape, " " , Y.shape)
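The rest of this section reuses the SimpleRNN model (its class is written out further down in these notes); the only new point is that the input now keeps the extra feature dimension. A short sketch (the hidden size here is an arbitrary choice of mine):

X_t = torch.from_numpy(X.astype(np.float32))   # (N, T, 1): the RNN expects N x T x D
Y_t = torch.from_numpy(Y.astype(np.float32))   # (N, 1)
model = SimpleRNN(n_inputs = 1, n_hidden = 15, n_rnnlayers = 1, n_outputs = 1)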
The notes below are a summary of the Udemy course PyTorch: Deep Learning and Artificial Intelligence.
44. Sequence Data
Sequence data
time series
airline passengers
speech / audio
text
bag of words example
email -> spam vs. not spam
sequence ?
1-D series signal
linear regression
Shape of a sequence: N x T x D
N = # samples
D = # features
T = # time steps in the sequence
e.g., GPS data from people's cars (a toy array with this shape is sketched below):
N: one sample would be one person's single trip to work
D = 2: the GPS records (latitude, longitude) pairs
T: the number of (lat, lng) measurements taken from start to finish of a single trip
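A minimal sketch of an array with this shape (made-up values, just to make N x T x D concrete):

import numpy as np
N, T, D = 3, 5, 2                  # 3 trips, 5 GPS readings per trip, (lat, lng) per reading
gps = np.random.randn(N, T, D)     # stand-in for real coordinates
print(gps.shape)                   # (3, 5, 2)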
What about variable-length sequences (where T differs per sample)?
Some conventions put the features first instead: N x D x T (this is what PyTorch's Conv1d expects).
image data: N x H x W x C (TensorFlow / OpenCV / others)
N x C x H x W (PyTorch, Theano)
In PyTorch, N comes first and the feature maps (C) come right after.
45. Forecasting
RNNs
Linear Regression
We have to make predictions in a loop, feeding each prediction back in as the next input (pseudocode below; a PyTorch version is sketched after the model definitions).
x = last values of train set
predictions = []
for i in range(length_of_forecast):
x_next = model.predict(x)
predictions.append(x_next)
x = concat(x[1:], x_next)
model = nn.Linear(1,1)
model = nn.Sequential(nn.Linear(1, 10), nn.ReLU(), nn.Linear(10, 1))
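A PyTorch version of the forecast loop above. This is a sketch that assumes a trained autoregressive model such as nn.Linear(T, 1) and the X / N arrays from the data-construction code elsewhere in these notes:

validation_predictions = []
last_x = torch.from_numpy(X[-N//2].astype(np.float32))   # first input window of the test period, shape (T,)

with torch.no_grad():
    while len(validation_predictions) < N//2:
        p = model(last_x.reshape(1, -1))[0, 0].item()     # one-step-ahead prediction
        validation_predictions.append(p)
        # drop the oldest value and append the prediction, so forecasts feed on forecasts
        last_x = torch.cat((last_x[1:], torch.tensor([p], dtype = torch.float32)))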
46. Autoregressive Linear Model for Time Series
Create the model
model = nn.Linear(T, 1)
pytorch RNN series data
import torch
import torch.nn as nn
import numpy as np
import matplotlib.pyplot as plt
Create the data
N = 1000
series = np.sin(0.1 * np.arange(N))
T = 10
X = []
Y = []
for t in range(len(series) - T):
x = series[t:t+T]
X.append(x)
y = series[t+T]
Y.append(y)
X = np.array(X).reshape(-1, T)
Y = np.array(Y).reshape(-1, 1)
N = len(X)
print("X.shape" , X.shape, "Y.shape" , Y.shape)
build the model
model = nn.Linear(T, 1)
loss and optimizer
Since this is regression, we use mean squared error (MSE) loss.
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr = 0.1)
out, _ = self.rnn(X, h0)       # out holds the hidden state of each time step
out = self.fc(out[:, -1, :])   # use only the last time step
return out
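The full class these fragments belong to isn't written out in the notes; a minimal sketch consistent with the constructor call below (the attribute names follow the course's D/M/L/K convention; the rest is my reconstruction):

class SimpleRNN(nn.Module):
    def __init__(self, n_inputs, n_hidden, n_rnnlayers, n_outputs):
        super(SimpleRNN, self).__init__()
        self.D = n_inputs
        self.M = n_hidden
        self.L = n_rnnlayers
        self.K = n_outputs
        self.rnn = nn.RNN(
            input_size = self.D,
            hidden_size = self.M,
            num_layers = self.L,
            nonlinearity = 'relu',
            batch_first = True
        )
        self.fc = nn.Linear(self.M, self.K)

    def forward(self, X):
        h0 = torch.zeros(self.L, X.size(0), self.M).to(X.device)
        out, _ = self.rnn(X, h0)        # hidden state at every time step
        out = self.fc(out[:, -1, :])    # keep only the last time step
        return out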
train the model
Initialize the model: model = SimpleRNN(n_inputs = 1, n_hidden = 5, n_rnnlayers = 1, n_outputs = 1)
model.to(device)
loss and optimizer
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr = 0.1)
make inputs and targets
evaluate the model
make predictions
N x T x D -> N x K (output)
input_ = X_test[i].reshape(1, T, 1)   # one window, shaped (1, T, D)
p = model(input_)[0, 0].item()        # scalar one-step-ahead prediction
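For a true multi-step forecast with the RNN, the same shifting-window idea from the forecasting pseudocode applies, only with the N x T x 1 shape. A sketch, assuming X_test is a float32 tensor of shape (N_test, T, 1):

validation_predictions = []
last_x = X_test[0].clone()                        # first test window, shape (T, 1)
with torch.no_grad():
    while len(validation_predictions) < len(X_test):
        p = model(last_x.reshape(1, T, 1))[0, 0].item()
        validation_predictions.append(p)
        # drop the oldest value, append the new prediction
        last_x = torch.cat((last_x[1:], torch.tensor([[p]], dtype = last_x.dtype)))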
50. RNN for Time Series Prediction
simple rnn sine
import torch
import torch.nn as nn
import numpy as np
import matplotlib.pyplot as plt
create data
N = 1000
series = np.sin(0.1 * np.arange(N))
plt.plot(series)
plt.show()
create dataset
T = 10
X = []
Y = []
for t in range(len(series) - T):
x = series[t:t+T]
X.append(x)
y = series[t+T]
Y.append(y)
X = np.array(X).reshape(-1, T, 1)
Y = np.array(Y).reshape(-1, 1)
N = len(X)
print("X.shape" , X.shape, "Y.shape" , Y.shape)
Use CUDA if available
device = torch.device("cuda:0" if torch.cuda.is_available() else 'cpu')
print(device)
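The notes jump straight to pulling weights out of a trained model here; a sketch of the step that's missing, assuming a trained SimpleRNN-style model whose layers are model.rnn and model.fc (the names W_xh, W_hh, b_xh, b_hh, wo, bo match the manual forward pass below):

W_xh, W_hh, b_xh, b_hh = model.rnn.parameters()   # input-to-hidden / hidden-to-hidden weights and biases
wo, bo = model.fc.parameters()                    # output layer weight and bias

W_xh = W_xh.data.numpy()
W_hh = W_hh.data.numpy()
b_xh = b_xh.data.numpy()
b_hh = b_hh.data.numpy()
M = W_hh.shape[0]   # hidden size
K = wo.shape[0]     # number of outputs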
wo = wo.data.numpy()
bo = bo.data.numpy()
wo.shape, bo.shape

h_last = np.zeros(M)       # initial hidden state
x = X[0]                   # one input window, shape (T, 1)
yhats = np.zeros((T, K))   # manual output at each time step

for t in range(T):
    h = np.tanh(x[t].dot(W_xh.T) + b_xh + h_last.dot(W_hh.T) + b_hh)   # the RNN recurrence
    y = h.dot(wo.T) + bo                                               # the output layer
    yhats[t] = y
    h_last = h

print(yhats)
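As a check, the last row of yhats can be compared with the model's own output for the same window (a sketch; small differences from float precision are expected):

with torch.no_grad():
    model_out = model(torch.from_numpy(x.reshape(1, T, 1).astype(np.float32)))
print(yhats[-1], model_out.numpy())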
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms
import numpy as np
import matplotlib.pyplot as plt
from datetime import datetime