
The following notes summarize the Udemy course PyTorch: Deep Learning and Artificial Intelligence.

Recommender Systems with Deep Learning Theory

Recommenders are among the most widely applicable concepts in ML.

Google

Amazon

Facebook

Netflix

News feeds -> surface articles the user will find exciting.

 

 

Ratings Recommenders

movies

 

How to recommend?

Given a dataset of triples: user, item, rating

fit a model to the data: f(u, m) -> r

what should it do?

#1. if the user u and movie m appeared in the dataset, then the predicted rating should be close to the true rating

#2. the function should predict what user u would rate movie m, even if it didn't appear in the training set

Neural networks(function approximators) do this!

 

Since our model can predict ratings for unseen movies, this is easy:

given a user, get a predicted rating for every movie they have not yet seen

Sort by predicted rating(descending)

recommend movies with the highest predicted rating

 

 

How to build the model?

users and movies are categorical

neural networks do matrix multiplications

we can't multiply a category by a number (e.g. "Star Wars" x 5)

 

NLP

embeddings

mapping

 

 

A neural network for recommenders

 

 

Interesting point

NLP 

algorithms  word2vec and GloVe

They both support analogies like "king - man ≈ queen - woman" (equivalently, king - man + woman ≈ queen)

 

 

For a plain ANN, "all data is the same" - but here the inputs are clearly not an N x D matrix of features

 

All data is not the same

we still have N samples

1st column (users): N-length array of categories

2nd column (movies): N-length array of categories (can contain duplicates)

 

Embedding

NLP: word indexes(N x T) -> word vectors(N x T x D )

recommenders: users / movies(N) -> user/movie vectors(N x D)

 

How is the data stored?

They are already integers !

Embeddings & Concatenation

After embedding, users and movies have shape N x D

combine by concatenating -> N x 2D

Now " all data is the same" once again

 

forward :

something special about this model: it has 2 inputs, not just one!

 

Loading in data

data is just a csv (pandas)

For training, wrap the data in a DataLoader.

 

pytorch Recommender systems with deep learning code

Building the model

# assumes: import torch, torch.nn as nn, torch.nn.functional as F; N, M, D, device defined earlier
class Model(nn.Module):
  def __init__(self, n_users, n_items, embed_dim, n_hidden = 1024):
    super(Model, self).__init__()
    self.N = n_users
    self.M = n_items
    self.D = embed_dim

    self.u_emb = nn.Embedding(self.N, self.D)
    self.m_emb = nn.Embedding(self.M , self.D)
    self.fc1 = nn.Linear(2 * self.D , n_hidden)
    self.fc2 = nn.Linear(n_hidden , 1)

  def forward(self, u, m ):
    u = self.u_emb(u)
    m = self.m_emb(m)

    out = torch.cat((u, m ), 1)

    out = self.fc1(out)
    out = F.relu(out)
    out = self.fc2(out)
    return out
model = Model(N, M ,D )
model.to(device)

criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters())

make dataset

#make dataset
Ntrain = int(0.8 * len(ratings))
train_dataset = torch.utils.data.TensorDataset(
    user_ids_t[:Ntrain],
    movie_ids_t[:Ntrain],
    ratings_t[:Ntrain]
)

test_dataset = torch.utils.data.TensorDataset(
    user_ids_t[Ntrain:],
    movie_ids_t[Ntrain:],
    ratings_t[Ntrain:]
)

batch_size = 512
train_loader = torch.utils.data.DataLoader(dataset = train_dataset, 
                                           batch_size = batch_size,
                                           shuffle = True)

test_loader = torch.utils.data.DataLoader(dataset = test_dataset, 
                                           batch_size = batch_size,
                                           shuffle = False)

 

PyTorch %prun

%prun reports how much time was spent in each function.

Training takes around 5 minutes here, so profiling it is useful.

%prun train_losses, test_losses = batch_gd(model, criterion, optimizer, train_loader, test_loader, 25)

 

 

 

Same prediction task

PyTorch: this time, setting the embedding weights manually (and using SGD)

class Model(nn.Module):
  def __init__(self, n_users, n_items, embed_dim, n_hidden = 1024):
    super(Model, self).__init__()
    self.N = n_users
    self.M = n_items
    self.D = embed_dim

    self.u_emb = nn.Embedding(self.N, self.D)
    self.m_emb = nn.Embedding(self.M , self.D)
    self.fc1 = nn.Linear(2 * self.D , n_hidden)
    self.fc2 = nn.Linear(n_hidden , 1)

    # set the weights manually, since the default N(0,1) initialization leads to poor results
    self.u_emb.weight.data = nn.Parameter(torch.Tensor(np.random.randn(self.N, self.D) * 0.01))
    self.m_emb.weight.data = nn.Parameter(torch.Tensor(np.random.randn(self.M, self.D) * 0.01))

  def forward(self, u, m ):
    u = self.u_emb(u)
    m = self.m_emb(m)

    out = torch.cat((u, m ), 1)

    out = self.fc1(out)
    out = F.relu(out)
    out = self.fc2(out)
    return out

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = Model(N, M ,D )
model.to(device)

criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr = 0.08 , momentum = 0.9)

 

pytorch top 10 recommended

watched_movie_ids = df[df.new_user_id ==1].new_movie_id.values
watched_movie_ids
potential_movie_ids = df[~df.new_movie_id.isin(watched_movie_ids)].new_movie_id.unique()
print(potential_movie_ids.shape)
print(len(set(potential_movie_ids)))
user_id_to_recommend = np.ones_like(potential_movie_ids)
t_user_ids = torch.from_numpy(user_id_to_recommend).long().to(device)
t_movie_ids = torch.from_numpy(potential_movie_ids).long().to(device)

with torch.no_grad():
  predictions = model(t_user_ids, t_movie_ids)
predictions_np = predictions.cpu().numpy().flatten()
sort_idx = np.argsort(-predictions_np)
top_10_movie_ids = potential_movie_ids[sort_idx[:10]]
top_10_scores = predictions_np[sort_idx[:10]]

for movie, score in zip(top_10_movie_ids, top_10_scores):
  print("movie:", movie," score:", score)

The following notes summarize the Udemy course PyTorch: Deep Learning and Artificial Intelligence.

Embeddings

Text is sequence data, but it is not continuous

 

One-Hot Encoding?

a vector of mostly zeros with a single 1

map each word to an index

  a => index 1 => [1,0]

  b => index 2 => [0,1]

 

T words -> one hot encoded vector of size V 

T x V matrix
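A tiny sketch (my own toy example, not course code) of turning a T-word sentence into a T x V one-hot matrix:

import numpy as np

vocab = {'i': 0, 'like': 1, 'eggs': 2}           # V = 3 toy vocabulary
sentence = ['i', 'like', 'eggs']                 # T = 3 words

one_hot = np.zeros((len(sentence), len(vocab)))  # T x V matrix
for t, word in enumerate(sentence):
  one_hot[t, vocab[word]] = 1
print(one_hot)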

 

one-hot encoded data has no useful geometrical structure.

 

instead, map each word to a dense D-dimensional vector (not one-hot encoded)

 

 

 

 

 

RNN with Embedding

 

D = embedding dimension -> hyperparameter

V = vocab size (# of unique words)

 

Embedding matrix is V x D

Each row is a D-size vector for a word

 

 

forward Function

out = self.embed(x)
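A minimal shape check (toy values, my own illustration): word indexes of shape (N, T) go in, word vectors of shape (N, T, D) come out.

import torch
import torch.nn as nn

V, D = 20, 4                      # vocab size and embedding dimension (made up)
embed = nn.Embedding(V, D)        # the V x D embedding matrix
x = torch.randint(0, V, (2, 5))   # N = 2 sequences of T = 5 word indexes
print(embed(x).shape)             # torch.Size([2, 5, 4]) -> N x T x D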

 

 

Text Preprocessing

Text Files

each word -> Integer ->vector

 

Structured Text Files

CSV(comma separated value)

document

 

 

pandas for CSV?

document 

multiple words -> individual words

 

Multiple words to single words(Tokenization)

string.split() => handles a lot of cases

but it can't handle punctuation, e.g. "."

punctuation has to be removed (or tokenized separately)

 

Tokens to Integers

dataset = long sequence of words

current_idx = 0
word2idx = {}
for word in dataset:
  if word not in word2idx:
    word2idx[word] = current_idx
    current_idx += 1

 

current_idx = 0 => in practice, start at 2 instead

indexes 0 and 1 are usually reserved:

1 = padding

0 = unknown => for words we never saw during training (a sketch follows below)
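A sketch of the adjusted mapping (my own illustration of the note above): reserve 0 for unknown words and 1 for padding, so real words start at index 2.

dataset = ['i', 'like', 'eggs', 'and', 'ham', 'i', 'like']   # toy word sequence
word2idx = {'<unk>': 0, '<pad>': 1}
current_idx = 2
for word in dataset:
  if word not in word2idx:
    word2idx[word] = current_idx
    current_idx += 1

# unseen words at prediction time fall back to the unknown index 0
print(word2idx.get('spam', 0))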

 

Constant-Length Sequence

Pad the sequences so they all have the same length.

 

Pre-padding vs Post-padding

- A text classification RNN reads the input from left to right (start -> end), so pre-padding usually works better.

- It is challenging for RNNs to learn long-term dependencies, so the real words should sit closest to the end of the sequence (see the sketch below).
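A small sketch (not course code) of pre-padding variable-length integer sequences to a common length T, using 1 as the padding index as noted above:

def pre_pad(sequences, T, pad_idx=1):
  padded = []
  for seq in sequences:
    seq = seq[-T:]                                   # keep the last T tokens if too long
    padded.append([pad_idx] * (T - len(seq)) + seq)  # pad on the left
  return padded

print(pre_pad([[2, 3], [4, 5, 6, 7, 8]], T=4))
# [[1, 1, 2, 3], [5, 6, 7, 8]]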

 

convert to csv

Tokenization

map each token to a unique integer

 

The task

text classification(many-to-one)

Input: sequence of words, output: a single label (spam or not spam)

 

Field Objects

import torchtext.data as ttd

TEXT = ttd.Field(
  sequential = True,
  batch_first = True,
  lower = True,
  pad_first = True
)

LABEL = ttd.Field(sequential = False, use_vocab = False, is_target = True)

 

TabularDataset Object

 

split train, test data

dataset.split()

 

build vocab

TEXT.build_vocab(train_dataset)

vocab = TEXT.vocab

 

vocab object

stoi (C-style naming): string -> integer

itos (reverse mapping): integer -> string

 

 

stoi and itos

stoi is a dictionary: keys -> values

itos is a list: index -> value

 

pytorch Text Preprocessing

import torch
import torch.nn as nn
import torchtext.data as ttd
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from datetime import datetime
data = {
    "label" : [0,1,1] ,
    "data"  :[
              "I like eggs and ham.",
              "Eggs I like!",
              "Ham and eggs or just ham?"
    ]
}
df = pd.DataFrame(data)
df.head()
df.to_csv("data.csv", index = False)
TEXT = ttd.Field(
    sequential = True,
    batch_first = True,
    lower = True,
    tokenize = 'spacy',
    pad_first = True
)

LABEL = ttd.Field(sequential=False, use_vocab=False, is_target=True)
dataset = ttd.TabularDataset(
    path = 'data.csv',
    format = 'csv',
    skip_header = True,
    fields = [('label', LABEL) ,('data' , TEXT)]
)
ex = dataset.examples[0]
train_dataset , test_dataset = dataset.split(0.66)
TEXT.build_vocab(train_dataset,)
vocab = TEXT.vocab
vocab.stoi
vocab.itos
train_iter, test_iter = ttd.Iterator.splits(
    (train_dataset, test_dataset) , sort_key = lambda x: len(x.data),
    batch_sizes = (2,2)  , device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
)
for inputs, targets in train_iter:
  print("inputs:" , inputs, "shape:", inputs.shape)
  print("targets:" , targets, "shape:" ,targets.shape)
  break
  
for inputs, targets in test_iter:
  print("inputs:" , inputs, "shape:", inputs.shape)
  print("targets:" , targets, "shape:" ,targets.shape)
  break

 

CNNs for Text

sequence

1-D convolution

For images:

2 spatial dimensions + 1 input feature dimensions + 1 output feature dimension = 4

for sequences:

1 time dimension + 1 input feature dimension + 1 output feature dimension = 3

 

 

Warning: feature first

Recall that for images, a PyTorch CNN expects the input to be N x C x H x W

  "feature first"

whereas in TensorFlow / OpenCV / others, it's N x H x W x C

  "feature last"

the torchvision data generators hide this detail

in NLP, the output of the embedding is N x T x D ("feature last")

nn.Conv1d() expects N x D x T as input! ("feature first")

so we must reshape before and after the convolutions

 

 

Text Classification with CNNs

the output of the embedding is always (N, T, D)

conv1d expects (N, D, T)

 

 

=> out.permute(0,2,1)

 

 

change it back

out.permute(0,2,1)

 

A CNN also gives good results here.

# assumes: import torch.nn.functional as F
class CNN(nn.Module):
  def __init__(self, n_vocab, embed_dim, n_outputs):
    super(CNN, self).__init__()
    self.V = n_vocab
    self.D = embed_dim
    self.K = n_outputs

    self.embed = nn.Embedding(self.V, self.D)

    self.conv1 = nn.Conv1d(self.D, 32, 3, padding = 1)
    self.pool1 = nn.MaxPool1d(2)
    self.conv2 = nn.Conv1d(32, 64, 3, padding = 1)
    self.pool2 = nn.MaxPool1d(2)
    self.conv3 = nn.Conv1d(64, 128, 3, padding = 1)

    self.fc = nn.Linear(128, self.K)
  def forward(self, X):
    out = self.embed(X)
    out = out.permute(0,2,1)
    out = self.conv1(out)
    out = F.relu(out)
    out = self.pool1(out)
    out = self.conv2(out)
    out = F.relu(out)
    out = self.pool2(out)
    out = self.conv3(out)
    out = F.relu(out)
    
    out = out.permute(0,2,1)

    out, _ = torch.max(out, 1)   # global max pooling over the time dimension
    out = self.fc(out)
    return out

 

Making predictions with the trained NLP model

single_sentence = 'Our dating service has been asked 2 contast U by someone shy!'
toks= TEXT.preprocess(single_sentence)
sent_idx = TEXT.numericalize([toks])
model(sent_idx.to(device))

 


The following notes summarize the Udemy course PyTorch: Deep Learning and Artificial Intelligence.

GRU and LSTM

 

Modern RNN Units

LSTM

A GRU is like a simplified version of an LSTM (fewer parameters, thus more efficient).

 

A simple RNN is not enough because of the vanishing gradient problem.

Using ReLU helps with vanishing gradients to some extent,

but GRU and LSTM units turned out to be even more effective.

 

A simple RNN has just one update equation.

 

GRU

recurrent Unit

Simple RNNs have no choice but to eventually forget, due to the vanishing gradient.

We use binary classifiers (logistic regression neurons) as our gates.

Simple RNNs have trouble learning long-term dependencies.

The hidden state becomes a weighted sum of the previous hidden state and the new value (allowing you to remember the old state).

These are controlled by "gates", which are like binary classifiers / logistic regression / neurons.

GRU: fewer parameters -> more efficient, often just as performant (equations sketched below)
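As a reference sketch, the standard GRU update (notation mine; the course slides may use different symbols):

$z_t = \sigma(W_{xz} x_t + W_{hz} h_{t-1} + b_z)$  (update gate)

$r_t = \sigma(W_{xr} x_t + W_{hr} h_{t-1} + b_r)$  (reset gate)

$\hat{h}_t = \tanh(W_{xh} x_t + W_{hh}(r_t \odot h_{t-1}) + b_h)$  (candidate "new value")

$h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \hat{h}_t$  (weighted sum of the old state and the new value, controlled by the gate)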

 

 

LSTM (long short-term memory)

 

 

Simple RNN GRU LSTM code

nn.RNN(
  input_size = self.D,
  hidden_size = self.M,
  num_layers = self.L,
  nonlinearity = 'relu',
  batch_first = True
)

GRU

nn.GRU(
  input_size = self.D,
  hidden_size = self.M,
  num_layers = self.L,
  batch_first = True
)

LSTM

nn.LSTM(
  input_size = self.D,
  hidden_size = self.M,
  num_layers = self.L,
  batch_first = True
)

 

A more challenging Sequence

pytorch nonlinear sequence Linear code

import torch
import torch.nn as nn
import numpy as np
import matplotlib.pyplot as plt
series = np.sin((0.1 * np.arange(400)) ** 2)

Create the data

T = 10
D = 1
X = []
Y = []
for t in range(len(series) - T) :
  x = series[t:t+T]
  X.append(x)
  y = series[t+T]
  Y.append(y)

X = np.array(X).reshape(-1, T)
Y = np.array(Y).reshape(-1, 1)
N = len(X)
print(X.shape, "   " , Y.shape)
model = nn.Linear(T, 1)

criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr = 0.1)

train test set split

X_train = torch.from_numpy(X[:-N//2].astype(np.float32))
y_train = torch.from_numpy(Y[:-N//2].astype(np.float32))
X_test = torch.from_numpy(X[-N//2:].astype(np.float32))
y_test = torch.from_numpy(Y[-N//2:].astype(np.float32))

Train the model

Check the loss

 

ONE-STEP forecast

validation_target = Y[-N//2:]
with torch.no_grad():
  validation_predictions = model(X_test).numpy()

The linear model gives terrible results on this nonlinear series.

 

 

pytorch nonlinear sequence SimpleRNN code

T = 10
D = 1
X = []
Y = []
for t in range(len(series) - T) :
  x = series[t:t+T]
  X.append(x)
  y = series[t+T]
  Y.append(y)

X = np.array(X).reshape(-1, T, 1)  # make it N x T x D (here D = 1)
Y = np.array(Y).reshape(-1, 1)
N = len(X)
print(X.shape, "   " , Y.shape)
class RNN(nn.Module):
  def __init__(self, n_inputs, n_hidden, n_rnn_layers, n_outputs):
    super(RNN, self).__init__()
    self.D = n_inputs
    self.M = n_hidden
    self.L = n_rnn_layers
    self.K = n_outputs

    self.rnn = nn.RNN(
      input_size = self.D,
      hidden_size = self.M,
      num_layers = self.L,
      nonlinearity = 'relu',
      batch_first = True
    )

    self.fc = nn.Linear(self.M, self.K)

  def forward(self, X):
    
    h0 = torch.zeros(self.L, X.size(0), self.M).to(device)
    out, _ = self.rnn(X, h0)

    out = self.fc(out[:,-1,:]) 
    return out

 

pytorch nonlinear sequence LSTM code

class RNN(nn.Module):
  def __init__(self, n_inputs, n_hidden, n_rnn_layers, n_outputs):
    super(RNN, self).__init__()
    self.D = n_inputs
    self.M = n_hidden
    self.L = n_rnn_layers
    self.K = n_outputs

    self.rnn = nn.LSTM(
      input_size = self.D,
      hidden_size = self.M,
      num_layers = self.L,
      batch_first = True
    )

    self.fc = nn.Linear(self.M, self.K)

  def forward(self, X):
    
    h0 = torch.zeros(self.L, X.size(0), self.M).to(device)
    c0 = torch.zeros(self.L, X.size(0), self.M).to(device)
    out, _ = self.rnn(X, (h0,c0))

    out = self.fc(out[:,-1,:]) 
    return out

 

RNNs for Image Classification

 

MNIST , Fashion MNIST

H x W (2-d )

 

 

Code preparation

step 1: load in the data

step 2: model

step 3: fit / plot the loss /etc. 

 

regression is harder than classification

for classification, you only need to get the label right

 

lstm

 


The following notes summarize the Udemy course PyTorch: Deep Learning and Artificial Intelligence.

44. Sequence Data

Sequence data

 

time series

airline passengers

speech /Audio

text

 

bag of words example

email -> spam vs. not spam

 

sequence ?

1-D series signal

linear regression 

 

shape of a sequence 

N x T x D

N = #samples

D = #features

T = #time steps in the sequence

 

eg: GPS data from their cars

N: one sample would be one person's single trip to work

D = 2: the GPS records (latitude, longitude) pairs

T: the number of(lat, lng) measurements taken from start to finish of a single trip 

 

variable length sequence ?

 

N x D x T

image data: N x H x W x C

N x C x H x W (PyTorch, Theano)

In PyTorch, N comes first, then the feature maps / channels C

 

 

 

45. Forecasting

RNNs

Linear Regression

 

Multi-step predictions have to be made in a loop:

x = last values of the train set
predictions = []
for i in range(length_of_forecast):
  x_next = model.predict(x)
  predictions.append(x_next)
  x = concat(x[1:], x_next)

 

 

model = nn.Linear(1, 1)

model = nn.Sequential(
  nn.Linear(1, 10),
  nn.ReLU(),
  nn.Linear(10, 1)
)

 

 

46. Autoregressive Linear Model for Time Series

 

Create the model

model = nn.Linear(T, 1)

 

pytorch RNN series data

import torch
import torch.nn as nn
import numpy as np
import matplotlib.pyplot as plt

Create the data

N = 1000
series = np.sin(0.1 * np.arange(N))
T = 10
X = []
Y = []

for t in range(len(series) - T):
  x = series[t:t+T]
  X.append(x)
  y = series[t+T]
  Y.append(y)

X = np.array(X).reshape(-1, T)
Y = np.array(Y).reshape(-1, 1)
N = len(X)
print("X.shape" , X.shape, "Y.shape" , Y.shape)

build the model

model = nn.Linear(T, 1)

loss and optimizer

Since this is regression, we use mean squared error.

criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr = 0.1)
X_train = torch.from_numpy(X[:-N//2].astype(np.float32))
y_train = torch.from_numpy(Y[:-N//2].astype(np.float32))
X_test = torch.from_numpy(X[-N//2:].astype(np.float32))
y_test = torch.from_numpy(Y[-N//2:].astype(np.float32))

Train the model

def full_gd(model, criterion, optimizer, X_train , y_train, X_test, y_test, epochs = 200):

  train_losses = np.zeros(epochs)
  test_losses = np.zeros(epochs)

  for epoch in range(epochs):
    # zero the gradients before the backward pass
    optimizer.zero_grad()

    outputs = model(X_train)
    loss = criterion(outputs, y_train)

    loss.backward()
    optimizer.step()

    train_losses[epoch] = loss.item()

    test_outputs = model(X_test)
    test_loss = criterion(test_outputs, y_test)  # use the test outputs, not the train outputs
    test_losses[epoch] = test_loss.item()

    if (epoch +1) % 10 ==0:
          print(f'Epoch{epoch+1}/{epochs}, Train loss:{loss.item():.4f}, Test loss:{test_loss.item():.4f}')

  return train_losses, test_losses

train_losses, test_losses = full_gd(model, criterion, optimizer, X_train , y_train, X_test, y_test)

Plot the losses

plt.plot(train_losses, label='train loss')
plt.plot(test_losses, label='test loss')
plt.legend()
plt.show()

Show the one-step forecast, then the multi-step forecast (where the predictions go wrong)

validation_target = Y[-N//2:]
validation_predictions = []

i = 0
while len(validation_predictions) < len(validation_target):
  input_ = X_test[i].view(1, -1)
  p = model(input_)[0, 0].item()  # tensor -> scalar

  i += 1

  validation_predictions.append(p)

plt.plot(validation_target, label='forecast target')
plt.plot(validation_predictions, label='forecast prediction')
plt.legend()
plt.show()

validation_target = Y[-N//2:]
validation_predictions = []

last_x = torch.from_numpy(X[-N//2].astype(np.float32))

while len(validation_predictions) < len(validation_target):
  input_ = last_x.view(1, -1)
  p = model(input_)

  validation_predictions.append(p[0, 0].item())

  last_x = torch.cat((last_x[1:], p[0]))  # drop the oldest value, append the prediction

plt.plot(validation_target, label='forecast target')
plt.plot(validation_predictions, label='forecast prediction')
plt.legend()
plt.show()

47. Proof that the Linear Model Works

np.sin(0.1 * np.arange(200))

np.arange(200) 

0.1 is the angular frequency 
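A sketch of why a plain linear model can fit a pure sine wave (standard trig identity, not from the course slides): since $\sin(\omega(t+1)) + \sin(\omega(t-1)) = 2\cos(\omega)\sin(\omega t)$, we have

$x_{t+1} = 2\cos(\omega)\, x_t - x_{t-1}$

so the next value is an exact linear combination of the two previous values. A model like nn.Linear(T, 1) can represent this with weights $(0, \dots, 0, -1, 2\cos(\omega))$ and zero bias, which is why the linear autoregressive model fits np.sin(0.1 * np.arange(N)) perfectly.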

 

 

48. Recurrent Neural Networks

why RNN useful

 

D= 100

T = 10,000

T x D = 1 million inputs if we flatten

An ANN is fully general:

- it connects every input to every feature in the next hidden layer

 

hidden state

input -> hidden -> output

 

RNN Equation

 

Wxh - input to hidden weight

Whh - hidden to hidden weight

bh - hidden bias

Wo - hidden to output weight

bo - output bias

X - TxD input matrix

 

tanh hidden activation

softmax output activation

 

yhat = []
h_last = h0
for t in range(T):
  h_t = tanh(X[t].dot(Wxh) + h_last.dot(Whh) + bh)
  y_t = softmax(h_t.dot(Wo) + bo)
  yhat.append(y_t)

  h_last = h_t

 

Biological Inspiration

CNNs have "shared weights" to take advantage of structure, resulting in savings.

 

Calculating our savings (flattened ANN)

T = 100

D = 10

M = 15 (this is still a hyperparameter)

Flattened input vector: T x D = 1000

we will have T hidden states: T x M = 1500

assume binary classification: K = 1

input-to-hidden weight: 1000 x 1500 = 1.5 million

hidden-to-output weight: 1500

Total: ~1.5 million



Calculating our savings (RNN)

Wxh - D x M = 10 x 15 = 150

Whh - M x M = 15 x 15 = 225

Wo - M x K = 15 x 1 = 15

Total = 150 + 225 + 15 = 390

Savings: 1,501,500 / 390 ≈ 3850x

 

 

 

49. RNN code Preparation

Steps:

  • load in the data

      dataset

      RNN(NxTxD)

 

  • build the model

    same structure as before, but now with an nn.RNN layer

  class SimpleRNN(nn.Module):
    def __init__(self, ...):
      ...
      self.rnn = nn.RNN(
        input_size = num_inputs,
        hidden_size = num_hidden,
        num_layers = num_layers,
        nonlinearity = 'relu',
        batch_first = True
      )

 

An RNN has feedback connections.

 

full constructor

class SimpleRNN(nn.Module):
  def __init__(self, n_inputs, n_hidden, n_rnnlayers, n_outputs):
    super(SimpleRNN, self).__init__()
    self.D = n_inputs
    self.M = n_hidden
    self.K = n_outputs
    self.L = n_rnnlayers

    self.rnn = nn.RNN(
      input_size = self.D,
      hidden_size = self.M,
      num_layers = self.L,
      nonlinearity = 'relu',
      batch_first = True)

    self.fc = nn.Linear(self.M, self.K)  # dense output layer

 

   

forward function

def forward(self, X):
  h0 = torch.zeros(self.L, X.size(0), self.M).to(device)
  out, _ = self.rnn(X, h0)      # hidden state of every time step
  out = self.fc(out[:, -1, :])  # only use the last time step
  return out

 

  • train the model

Initialize the model
model = SimpleRNN(n_inputs = 1, n_hidden = 5, n_rnnlayers =1, n_outputs=1)

model.to(device)

 

loss and optimizer

criterion = nn.MSELoss()

optimizer = torch.optim.Adam(model.parameters(), lr = 0.1)

 

make inputs and targets

 

 

  • evaluate the model
  • make predictions

N x T x D -> N x K (output)

input_ = X_test[i].reshape(1, T,1)

p = model(input_)[0,0].item()

 

 

50. RNN for Time Series Prediction

simple rnn sine

 

import torch

import torch.nn as nn

import numpy as np

import matplotlib.pyplot as plt

 

create data

N = 1000
series = np.sin(0.1 * np.arange(N))

plt.plot(series)
plt.show()

 

create dataset

T = 10
X = []
Y = []

for t in range(len(series) - T):
  x = series[t:t+T]
  X.append(x)
  y = series[t+T]
  Y.append(y)

X = np.array(X).reshape(-1, T, 1)
Y = np.array(Y).reshape(-1, 1)
N = len(X)
print("X.shape" , X.shape, "Y.shape" , Y.shape)

Use CUDA if available

device = torch.device("cuda:0" if torch.cuda.is_available() else 'cpu')
print(device)

Create the model

class SimpleRNN(nn.Module):
  def __init__ (self, n_inputs, n_hidden, n_rnnlayers, n_outputs):
    super(SimpleRNN, self).__init__()
    self.D = n_inputs
    self.M = n_hidden
    self.K = n_outputs
    self.L = n_rnnlayers

    self.rnn = nn.RNN(
      input_size = self.D,
      hidden_size = self.M,
      num_layers = self.L,
      nonlinearity = 'relu',
      batch_first = True
    )

    self.fc = nn.Linear(self.M, self.K)

  def forward(self, X):
    
    h0 = torch.zeros(self.L, X.size(0), self.M).to(device)
    out, _ = self.rnn(X, h0)

    out = self.fc(out[:,-1,:]) 
    return out
model = SimpleRNN(n_inputs=1, n_hidden=5, n_rnnlayers=1, n_outputs=1)
model.to(device)

 

If the model is on the GPU, the data must also be moved to the GPU,

otherwise you get a CUDA error.

An ANN is more flexible than a CNN, but that doesn't mean it performs better.

criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr = 0.1)
X_train = torch.from_numpy(X[:-N//2].astype(np.float32))
y_train = torch.from_numpy(Y[:-N//2].astype(np.float32))
X_test = torch.from_numpy(X[-N//2:].astype(np.float32))
y_test = torch.from_numpy(Y[-N//2:].astype(np.float32))

 

data to gpu

X_train, y_train = X_train.to(device), y_train.to(device)
X_test , y_test = X_test.to(device) , y_test.to(device)
def full_gd(model, criterion, optimizer, X_train , y_train, X_test, y_test, epochs = 200):

  train_losses = np.zeros(epochs)
  test_losses = np.zeros(epochs)

  for epoch in range(epochs):
    # zero the gradients before the backward pass
    optimizer.zero_grad()

    outputs = model(X_train)
    loss = criterion(outputs, y_train)

    loss.backward()
    optimizer.step()

    train_losses[epoch] = loss.item()

    test_outputs = model(X_test)
    test_loss = criterion(test_outputs, y_test)  # use the test outputs, not the train outputs
    test_losses[epoch] = test_loss.item()

    if (epoch +1) % 10 ==0:
          print(f'Epoch{epoch+1}/{epochs}, Train loss:{loss.item():.4f}, Test loss:{test_loss.item():.4f}')

  return train_losses, test_losses

train_losses, test_losses = full_gd(model, criterion, optimizer, X_train , y_train, X_test, y_test)
plt.plot(train_losses , label='train_losses')
plt.plot(test_losses , label='test_losses')
plt.legend()
plt.show()

 

51. Paying Attention to Shapes

N = 1
T = 10
D = 3
M = 5
K = 2
X = np.random.randn(N, T, D)

We don't use the GPU here.

class SimpleRNN(nn.Module):
  def __init__ (self, n_inputs, n_hidden, n_outputs):
    super(SimpleRNN, self).__init__()
    self.D = n_inputs
    self.M = n_hidden
    self.K = n_outputs

    self.rnn = nn.RNN(
      input_size = self.D,
      hidden_size = self.M,
      nonlinearity = 'tanh',
      batch_first = True
    )

    self.fc = nn.Linear(self.M, self.K)

  def forward(self, X):
    
    h0 = torch.zeros(1, X.size(0), self.M)
    out, _ = self.rnn(X, h0)

    out = self.fc(out) 
    return out
model = SimpleRNN(n_inputs=D, n_hidden=M, n_outputs=K)
inputs = torch.from_numpy(X.astype(np.float32))
out = model(inputs)
out
W_xh, W_hh, b_xh, b_hh = model.rnn.parameters()  # weight_ih, weight_hh, bias_ih, bias_hh
W_xh = W_xh.data.numpy()
b_xh = b_xh.data.numpy()
W_hh = W_hh.data.numpy()
b_hh = b_hh.data.numpy()
Wo, bo = model.fc.parameters()  # output layer weight and bias
Wo = Wo.data.numpy()
bo = bo.data.numpy()
Wo.shape, bo.shape
h_last = np.zeros(M)
x = X[0]
yhats = np.zeros((T, K))

for t in range(T):
  h = np.tanh(x[t].dot(W_xh.T) + b_xh + h_last.dot(W_hh.T) + b_hh)
  y = h.dot(Wo.T) + bo
  yhats[t] = y

  h_last = h

print(yhats)
yhats_torch = out.detach().numpy()[0]  # the model's output for the same (single) sample
np.allclose(yhats, yhats_torch)

The following notes summarize the Udemy course PyTorch: Deep Learning and Artificial Intelligence.

 

31. What is Convolution? (part 1)

convolutional neural networks

 

add

multiply 

 

3 object 

input image * filter (kernel) = output image

 

Convolution = Image Modifier

 

 

Image 

10 20 30 40
0 1 0 1
30 0 10 20
0 10 20 30

 

* Filter

1 0
0 2

 

output

1*10+0*20+0*0+1*2 = 12 1*20+0*20+0*1+2*0 = 20  
     
     

exercise: write pseudocode (a sketch follows below)

given: input_image, kernel

output_height = input_height - kernel_height + 1

output_width = input_width - kernel_width + 1
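A minimal NumPy sketch of 'valid'-mode convolution (really cross-correlation, as used in deep learning), following the output-size formulas above; my own illustration, not the course's solution:

import numpy as np

def conv2d_valid(input_image, kernel):
  H, W = input_image.shape
  Kh, Kw = kernel.shape
  out_h, out_w = H - Kh + 1, W - Kw + 1
  output = np.zeros((out_h, out_w))
  for i in range(out_h):
    for j in range(out_w):
      # multiply the window by the kernel elementwise and sum
      output[i, j] = np.sum(input_image[i:i+Kh, j:j+Kw] * kernel)
  return output

img = np.array([[10, 20, 30, 40], [0, 1, 0, 1], [30, 0, 10, 20], [0, 10, 20, 30]])
kern = np.array([[1, 0], [0, 2]])
print(conv2d_valid(img, kern))  # top-left entry is 12, matching the worked example above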

 

mode = 'valid'

the output is smaller than the input

 

padding "same" mdoe

same size as the input

 

full mode

we could extend the filter further and still get non-zero outputs

"Full" padding;

input length = N

Kernel length = K

Output length = N+K-1

 

Input length N, kernel length K

mode    output size    usage
valid   N - K + 1      typical
same    N              typical
full    N + K - 1      atypical

 

32. what is convolution ?(part2)

convolution as "pattern finding"

cosine similarity

 

dot product 

high positive correlation  -> dot product is large and positive

high negative correlation  -> dot product is large and negative

no correlation  -> dot product is zero

 

33. what is convolution?(part3)

optional understanding

1-d convolution

matrix multiplication

2-d convolution

 

matrix multiplication

color?

 

Translational invariance: the same pattern can be detected at different positions.

 

34. convolution on color images

3-d objects : Height x width x color

pooling 

downsample by 2

 

3-D same size in depth dimension

 

input image HxWx3

kernel : KxKx3

output image:(H -K +1) x ( W-K +1)

 

how much do we save?

input image: 32 x 32 x 3

filter: 3 x 5 x 5 x 64
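A rough comparison (my own arithmetic, assuming 'valid' 5x5 filters so the output is 28 x 28 x 64):

convolution weights: 3 x 5 x 5 x 64 = 4,800

fully-connected equivalent: (32 x 32 x 3) x (28 x 28 x 64) = 3,072 x 50,176 ≈ 154 million weights

so the convolutional layer uses roughly 30,000x fewer parameters.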

 

 

35. cnn architecture

CNN has two steps:

pooling

high level, pooling is downsampling

e.g. output a smaller image from a bigger image

input 100x100 => a pool size of 2 would yield 50x50

 

 

 

Max pooling, average pooling 

Max pooling : 

why use pooling?

the conv layer is a pattern finder; pooling keeps only whether (and roughly where) the pattern was found

 

different pool sizes

stride: overlapping windows are possible

 

pooling

conv-pool shrinks

 

losing information

Do we lose information if we shrink the image? Yes!

We lose spatial information : we don't care where the feature was found

 

 

hyperparameters: 

learning rate, hidden layers, hidden units per layer


pooling : stride

 

pixel

 

Dense Neural Network: 1xD layer

 

global max pooling

 

 

36. cnn code preparation(part1)

build the model

C1 x H x W x C2

nn.Conv2d(in_channels =1, out_channels = 32, kernel_size = 3, stride =2)

color images 3-d

 

color is not a spatial dimension

1-D convolution example: time series

3-D convolution example: video (height, width, time)

3-D convolution example: voxels(height, width, depth)
"pixel" = "Picture Element"

"Voxel" = "Volume Element"

 

conv2d -> conv2d -> conv2d -> flatten -> Dense -> Dense

conv2d = Image

Dense = Vector

 

ANN Review

model = nn.Sequential(
  nn.Linear(784, 128),
  nn.ReLU(),
  nn.Linear(128, 10)
)

class ANN(nn.Module):
  def __init__(self):
    super(ANN, self).__init__()
    self.layer1 = nn.Linear(784, 128)
    self.layer2 = nn.ReLU()
    self.layer3 = nn.Linear(128, 10)

  def forward(self, x):
    x = self.layer1(x)
    x = self.layer2(x)
    x = self.layer3(x)
    return x

model = ANN()

 

Linear model

variable = nn.Linear(D,1)

outputs = variable(inputs)

 

 

class CNN(nn.Module):

  def __init__(self):
    super(CNN, self).__init__()
    self.conv = nn.Sequential(
      nn.Conv2d(3, 32, kernel_size=3, stride=2),
      nn.Conv2d(32, 64, kernel_size=3, stride=2),
      nn.Conv2d(64, 128, kernel_size=3, stride=2)
    )
    self.dense = nn.Sequential(
      nn.Linear(?, 1024),
      nn.Linear(1024, K)
    )

  def forward(self, x):
    out = self.conv(x)
    out = out.view(-1, ?)
    out = self.dense(out)
    return out

 

CNN Sequential version

model = nn.Sequential(
  nn.Conv2d(3, 32, kernel_size=3, stride=2),
  nn.Conv2d(32, 64, kernel_size=3, stride=2),
  nn.Conv2d(64, 128, kernel_size=3, stride=2),
  nn.Flatten(),
  nn.Linear(?, 1024),
  nn.Linear(1024, K)
)

 

dropout Regularization:

L1 and L2 Regularization

dense layers

Dropout is usually applied between dense layers (not on convolutions),

and sometimes RNNs

 

 

37. cnn code preparation(part2)

fill in the details

Convolutional Arithmetic

model = nn.Sequential(
  nn.Conv2d(3, 32, kernel_size=3, stride=2),
  nn.Conv2d(32, 64, kernel_size=3, stride=2),
  nn.Conv2d(64, 128, kernel_size=3, stride=2),
  nn.Flatten(),
  nn.Linear(?, 1024),
  nn.Linear(1024, K)
)

 

keras 

32->16->8->4

"padding" argument

 

Important point

A PyTorch quirk

why N x C x H x W and not N x H x W x C?

Conventions = whatever the programmer decided to do

Theano / PyTorch == channels first

OpenCV, TensorFlow, Matplotlib, Pillow == channels last

OpenCV == BGR instead of RGB (try imread and then plot with Matplotlib)

 

38. cnn code preparation(part3)

1.load in the data

Fashion MNIST and CIFAR -10

2. BUILD THE MODEL ->convolutional 

3. TRAIN the model

4. evaluate the model

5. make predictions

 

"all machine learning interfaces are the same"

 

load in the data

 

data augmentation

 

 

 

Pytorch loading in the data

train_dataset = torchvision.datasets.FashionMNIST(

  root = '.',

  train = True,

  transform = transforms.ToTensor(),

  download = True)

 

train_dataset = torchvision.datasets.CIFAR10(

  root = '.',

  train = True,

  transform = transforms.ToTensor(),

  download = True)

 

train_loader = torch.utils.data.DataLoader(dataset = train_dataset, batch_size = batch_size, shuffle = True)

 

all data is the same

all machine learning interfaces are the same

 

Training loop

for i in range(epochs):

  for inputs, targets in data_loader:

    ...

 

evaluating accuracy

for inputs, targets in data_loader:

  ...

 

39. cnn for fashion mnist

import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms
import numpy as np
import matplotlib.pyplot as plt
from datetime import datetime

 

Load the train and test sets

train_dataset = torchvision.datasets.FashionMNIST(
  root = '.',
  train = True, 
  transform = transforms.ToTensor(),
  download = True)
test_dataset = torchvision.datasets.FashionMNIST(
  root = '.',
  train = False, 
  transform = transforms.ToTensor(),
  download = True)

 

Check the number of classes

classes = len(set(train_dataset.targets.numpy()))
print("number of classes:", classes)

class CNN(nn.Module):
  def __init__(self, classes):
    super(CNN, self).__init__()

    self.conv_layers = nn.Sequential(
      nn.Conv2d(in_channels=1, out_channels = 32, kernel_size =3, stride=2),
      nn.ReLU(),
      nn.Conv2d(in_channels=32, out_channels = 64, kernel_size =3, stride=2),
      nn.ReLU(),
      nn.Conv2d(in_channels=64, out_channels = 128, kernel_size =3, stride=2),
      nn.ReLU()
    )
    self.dense_layers = nn.Sequential(
      nn.Dropout(0.2),
      nn.Linear(128*2*2, 512),
      nn.ReLU(),
      nn.Dropout(0.2),
      nn.Linear(512, classes)
    )

  def forward(self, x):
    out = self.conv_layers(x)
    out = out.view(out.size(0), -1)
    out = self.dense_layers(out)
    return out

 

model = CNN(classes)

 

Another way to write it (closer to the TensorFlow/Keras style, using nn.Sequential):

'''
model = nn.Sequential(
  nn.Conv2d(in_channels=1, out_channels = 32, kernel_size =3, stride=2),
  nn.ReLU(),
  nn.Conv2d(in_channels=32, out_channels = 64, kernel_size =3, stride=2),
  nn.ReLU(),
  nn.Conv2d(in_channels=64, out_channels = 128, kernel_size =3, stride=2),
  nn.ReLU(),
  nn.Flatten(),
  nn.Dropout(0.2),
  nn.Linear(128*2*2, 512),
  nn.ReLU(),
  nn.Dropout(0.2),
  nn.Linear(512, classes) 
)
'''

 

model to gpu

device = torch.device("cuda:0" if torch.cuda.is_available() else 'cpu')
print(device)
model.to(device)

 

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters())
batch_size = 128
train_loader = torch.utils.data.DataLoader(dataset = train_dataset, batch_size = batch_size, shuffle = True)
test_loader = torch.utils.data.DataLoader(dataset = test_dataset, batch_size = batch_size, shuffle = False)
def batch_gd(model, criterion, optimizer, train_loader, test_loader, epochs):
  train_losses = np.zeros(epochs)
  test_losses = np.zeros(epochs)

  for epoch in range(epochs):
    t0 = datetime.now()
    train_loss = []

    for inputs, targets in train_loader:
      inputs, targets = inputs.to(device), targets.to(device)
      optimizer.zero_grad()

      outputs = model(inputs)
      loss = criterion(outputs, targets)

      loss.backward()
      optimizer.step()

      train_loss.append(loss.item())

    train_loss = np.mean(train_loss)


    test_loss = []

    for inputs, targets in test_loader:
      inputs, targets = inputs.to(device), targets.to(device)
      outputs = model(inputs)
      loss = criterion(outputs, targets)
      test_loss.append(loss.item())
    test_loss = np.mean(test_loss)

    train_losses[epoch] = train_loss
    test_losses[epoch] = test_loss

    dt = datetime.now() - t0
    print(f'Epoch{epoch+1}/{epochs}, Train loss:{train_loss:.4f}, Test loss:{test_loss:.4f}')

  return train_losses, test_losses
train_losses, test_losses = batch_gd(model, criterion, optimizer, train_loader, test_loader, epochs=15)

Compute the accuracy

n_correct = 0.
n_total = 0.
for inputs, targets in train_loader:
  inputs, targets = inputs.to(device), targets.to(device)
  outputs = model(inputs)
  _, predictions = torch.max(outputs, 1)
  n_correct += (predictions == targets).sum().item()
  n_total+= targets.shape[0]
train_acc = n_correct/ n_total
  
n_correct = 0.
n_total = 0.
for inputs, targets in test_loader:
  inputs, targets = inputs.to(device), targets.to(device)
  outputs = model(inputs)
  _, predictions = torch.max(outputs, 1)
  n_correct += (predictions == targets).sum().item()
  n_total+= targets.shape[0]
test_acc = n_correct/ n_total
print(f"Train acc: {train_acc:.4f} , Test acc:{test_acc:.4f}")
from sklearn.metrics import confusion_matrix
import numpy as np
import itertools
def plot_confusion_matrix(cm, classes, normalize = False, title =' Confusion matrix', cmap = plt.cm.Blues):
  if normalize:
    cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
    print("Normalized confusion matrix")
  else:
    print("confusion_matrix: without Normalized")
  print(cm)
  plt.imshow(cm, interpolation='nearest', cmap=cmap)
  plt.title(title)
  plt.colorbar()
  tick_marks = np.arange(len(classes))
  plt.xticks(tick_marks, classes, rotation = 45)
  plt.yticks(tick_marks, classes)

  fmt ='.2f' if normalize else 'd'
  thresh = cm.max()/2
  for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):
    plt.text(j, i, format(cm[i,j], fmt), horizontalalignment="center" , color="white" if cm[i, j]> thresh else 'black')
  
  plt.tight_layout()
  plt.ylabel('True label')
  plt.xlabel('Predicted label')
  plt.show()
x_test = test_dataset.data.numpy()
y_test = test_dataset.targets.numpy()
p_test = np.array([])
for inputs, targets in test_loader:
  inputs = inputs.to(device)
  
  outputs = model(inputs)
  _, predictions = torch.max(outputs, 1)
  p_test = np.concatenate((p_test, predictions.cpu().numpy()))

cm = confusion_matrix(y_test, p_test)
plot_confusion_matrix(cm, list(range(10)))

 

p_test = p_test.astype(np.uint8)
misclassified_idx = np.where(p_test != y_test)[0]
i = np.random.choice(misclassified_idx)
plt.imshow(x_test[i].reshape(28,28), cmap = 'gray')
plt.title('True label: %s Predicted: %s' % (labels[y_test[i]], labels[p_test[i]]))  # labels = list of Fashion MNIST class names defined in the notebook

The CNN does better than the ANN.

 

40. cnn for cifar-10

import torch
import torch.nn as nn
import torch.nn.functional as F  # needed for F.relu / F.dropout below
import torchvision
import torchvision.transforms as transforms
import numpy as np
import matplotlib.pyplot as plt
from datetime import datetime

 

The CIFAR-10 dataset is a color image dataset.

train_dataset = torchvision.datasets.CIFAR10(
  root = '.',
  train = True, 
  transform = transforms.ToTensor(),
  download = True)

test_dataset = torchvision.datasets.CIFAR10(
  root = '.',
  train = False, 
  transform = transforms.ToTensor(),
  download = True)

CIFAR-10 stores its targets as a regular Python list

classes = len(set(train_dataset.targets))
print("number of classes: " , classes)

dataloader

batch_size = 128
train_loader = torch.utils.data.DataLoader(dataset = train_dataset, batch_size = batch_size, shuffle = True)
test_loader = torch.utils.data.DataLoader(dataset = test_dataset, batch_size = batch_size, shuffle = False)

Define the CNN.

class CNN(nn.Module):
  def __init__(self, classes):
    super(CNN, self).__init__()
    self.conv1 = nn.Conv2d(in_channels=3, out_channels = 32, kernel_size =3, stride=2)
    self.conv2 = nn.Conv2d(in_channels=32, out_channels = 64, kernel_size =3, stride=2)
    self.conv3 = nn.Conv2d(in_channels=64, out_channels = 128, kernel_size =3, stride=2)
    self.fc1 = nn.Linear(128*3*3, 1024)
    self.fc2 = nn.Linear(1024, classes)

  def forward(self, x):
    x = F.relu(self.conv1(x))
    x = F.relu(self.conv2(x))
    x = F.relu(self.conv3(x))
    x = x.view(-1, 128*3*3)
    x = F.dropout(x, p =0.5)
    x = F.relu(self.fc1(x))
    x = F.dropout(x, p =0.2)
    x = self.fc2(x)
    return x
model = CNN(classes)

 

 

41. data augmentation

generators/ iterator

0 .. 10: for i in range(10)

in Python 2, range(10) builds the whole list in memory

in Python 2, use xrange(10) for a lazy iterator

in Python 3, range(10) is already lazy (print(range(10)) just shows "range(0, 10)")



for x in my_image_augmentation_generator():
  print(x)



def my_image_augmentation_generator():
  for x_batch, y_batch in zip(X_train, y_train):
    x_batch = augment(x_batch)
    yield x_batch, y_batch

 

Data augmentation with torchvision

transform = torchvision.transforms.Compose([
  torchvision.transforms.ColorJitter(
    brightness=0.2, contrast=0.2, saturation=0.2, hue=0.2),
  torchvision.transforms.RandomHorizontalFlip(p=0.5),
  torchvision.transforms.RandomRotation(degrees=15),
  torchvision.transforms.ToTensor(),
])

 

train_dataset = torchvision.datasets.CIFAR10(

  root = '.',

  train=True,

  transform = transform,

  download = True

)

 

 

The DataLoader stays the same - proceed exactly as before.

 

 Data Augmentation with your data

train_dataset = torchvision.datasets.yourdataset?(

  root = '.',

  train=True,

  transform = transforms.ToTensor(),

  download = True

)

 

42. Batch Normalization

z = (x - μ) / σ 

 

for epoch in range(epochs):

  for x_batch, y_batch in data_loader:

    w <- w - learning_rate * grad(x_batch, y_batch)

 

Batch norm is usually placed between Dense layers; a small sketch follows below.
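A minimal sketch (not the course's exact model) of where batch norm sits between Dense layers; nn.BatchNorm2d plays the same role after a Conv2d layer:

import torch.nn as nn

model = nn.Sequential(
  nn.Linear(784, 128),
  nn.BatchNorm1d(128),  # normalize each feature over the batch, then learn a scale and shift
  nn.ReLU(),
  nn.Linear(128, 10),
)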

 

 

Batch Norm as regularization

can help with overfitting

43. Improving CIFAR -10 Results

 

pytorch data augmentation

data augmentation

train_transform = torchvision.transforms.Compose([
  transforms.RandomCrop(32, padding = 4),
  torchvision.transforms.RandomHorizontalFlip(p=0.5),
  torchvision.transforms.RandomAffine(0, translate=(0.1, 0.1)),
  transforms.ToTensor(),
])

train_dataset = torchvision.datasets.CIFAR10(
  root = '.',
  train = True, 
  transform = train_transform,
  download = True)

test_dataset = torchvision.datasets.CIFAR10(
  root = '.',
  train = False,
  transform = transforms.ToTensor(),  # no augmentation on the test set
  download = True)

Create the model

now with max pooling and batch norm

class CNN(nn.Module):
  def __init__(self, classes):
    super(CNN, self).__init__()
    self.conv1 = nn.Sequential(
        nn.Conv2d(3, 32, kernel_size= 3, padding =1 ),
        nn.ReLU(),
        nn.BatchNorm2d(32),
        nn.Conv2d(32, 32, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.BatchNorm2d(32),
        nn.MaxPool2d(2),
    )
    self.conv2 = nn.Sequential(
        nn.Conv2d(32, 64, kernel_size= 3, padding =1 ),
        nn.ReLU(),
        nn.BatchNorm2d(64),
        nn.Conv2d(64, 64,kernel_size=3, padding=1),
        nn.ReLU(),
        nn.BatchNorm2d(64),
        nn.MaxPool2d(2),
    )
    self.conv3 = nn.Sequential(
        nn.Conv2d(64, 128, kernel_size= 3, padding =1 ),
        nn.ReLU(),
        nn.BatchNorm2d(128),
        nn.Conv2d(128, 128,kernel_size=3, padding=1),
        nn.ReLU(),
        nn.BatchNorm2d(128),
        nn.MaxPool2d(2),
    )
    self.fc1 = nn.Linear(128 * 4 * 4, 1024)
    self.fc2 = nn.Linear(1024, classes)
  def forward(self, output):
    output = self.conv1(output)
    output = self.conv2(output)
    output = self.conv3(output)
    output = output.view(output.size(0) , -1)
    output = F.dropout(output, p = 0.5)
    output = F.relu(self.fc1(output))
    output = F.dropout(output, p = 0.2)
    output = self.fc2(output)
    return output

VGG follows a similar pattern, on much larger images.

 

from torchsummary import summary
summary(model, (3,32,32))

The following notes summarize the Udemy course PyTorch: Deep Learning and Artificial Intelligence.

 

22. Artificial Neural Networks Section introduction

CNNs

RNNs

 

Artificial Neural Networks: ANNs

neural networks

 

http://alexlenail.me/NN-SVG/index.html

activation functions:

these are what make neural networks nonlinear (and powerful)

Multiclass classification: distinguishing between more than two categories

Data types: images, text, and sound

 

23. Forward Propagation

neural networks -> predictions

E.g.: the input is a face

one neuron looks for the presence of an eye

another neuron looks for the presence of a nose

each neuron looks at a different feature

 

input hidden output

layer

a chain of neurons

uniform structure

 

y =  wx+b

y = ax+b

 

sigmoid

 

Compute each layer's output, then pass it forward to the next layer.

 

regression

dense layer -> dense layer -> dense layer -> linear regression

classification

dense layer -> dense layer -> dense layer -> logistic regression

 

Hierarchies

composing layers lets us solve complicated problems

 

24. The Geometric Picture

geometric picture

feature engineering

linear regression

y hat = ax^2 + b

gradient descent

 

25. Activation Functions

sigmoid: output between 0 and 1

 f(a) = 1 / (1 + exp(-a))

binary classification

 

standardization

 

tanh  -1~1 

 

vanishing gradient problem

the gradient becomes so small that the weights barely change

dead end

default: ReLU

  doesn't have a "vanishing" gradient on the right half...

  the gradient in the left half is already vanished!

 

BRU activation

 higher accuracy

 

softplus

 

biological plausibility

 

26. Multiclass Classification

softmax function

softmax is technically an activation function, but unlike sigmoid/tanh/ReLU it is not used as a hidden activation

pytorch softmax function

nn.Sequential(

  nn.Linear(D,M),

  nn.ReLU(),

  nn.Linear(M, K),

  nn.Softmax()

)

 

CrossEntropyLoss()

nn.CrossEntropyLoss applies (log-)softmax internally, so the model itself just ends with a linear layer:

model = nn.Linear(D, K)

criterion = nn.CrossEntropyLoss()

 

activation function

task                        final activation
Regression                  None / Identity
Binary classification       sigmoid
Multiclass classification   softmax

 

The Model Type Doesn't matter

 

linear regression

dense

 

ann Regression 

dense + Dense

 

binary Logistic Regression

Dense+sigmoid

 

ANN Binary Classification

Dense+Dense+sigmoid

 

Multiclass Logistic Regression

Dense+ Softmax

ANN Multiclass Classification

Dense+Dense+ Softmax

 

same pattern applies to CNNs,RNNs - the type of task corresponds only to the final activation function

 

softmax is more general

multiclass classification

binary classification k = 2
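A quick sketch of the K = 2 case (standard derivation, notation mine): with two logits $a_1, a_2$,

$p(y = 1 \mid x) = \dfrac{e^{a_1}}{e^{a_1} + e^{a_2}} = \dfrac{1}{1 + e^{-(a_1 - a_2)}} = \sigma(a_1 - a_2)$

so softmax with K = 2 is just a sigmoid applied to the difference of the two logits - binary classification is a special case.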

27. How to Represent Images

We need to understand how images are represented when fed into the model as data.

height/width

matrix

column of the image

 

colors?

RGB  red/green/blue

black = 0

white  = 255

 

Images as input to neural networks

0 ... 255

feature vector

 

3dimensions: height,width , color

 

quantization:

color is light, measured by light intensity

it was figured out that 8 bits (1 byte) per channel is enough

2^8 = 256 => values 0 ~ 255

=> how much space does a 500 x 500 color image take?

500 x 500 x 3 x 8 = 6 million bits

JPEG allows us to compress images

 

Hex Colors

each byte( 8 bits)

 

Grayscale images: no color

2-D array (height, width)

black = 0, white = 255

values range only from black to white

plt.imshow()

plt.imshow(..., cmap='gray')

 

Images as input to neural networks

scaling the pixel values to the 0...1 range is more convenient

 

Another exception 

VGG

images are centered around 0, but the range is still 256

 

Images as input to neural networks

N = #samples, D = #features

input X of shape NxD

A single image is HxWxC 

N x HxWxC 

 

Image to Feature Vector

reshape() or view()

NxD  array

28. Code Preparation (ANN)

1. load in the data

       MNIST dataset -> handwritten digits

2. build the model

3. train the model

4. evaluate the model

5. make the predictions

 

pytorch load MNIST

step1. load in the data -> pytorch library

grayscale => 28x28

train_dataset = torchvision.datasets.MNIST(

root = '.',

train = True, 

download = True)

x_train = train_dataset.data

y_train = train_dataset.targets

x_train.shape = N x 28 x 28

y_train.shape = N

N = 60,000

 

 

test_dataset = torchvision.datasets.MNIST(

root = '.',

train = False, 

download = True)

x_test = test_dataset.data

y_test = test_dataset.targets

x_test.shape = Ntest x 28 x 28

y_test.shape = Ntest

Ntest = 10,000

 

 

transforming the data

# reshape the input  -> small range

inputs = inputs.view(-1, 784)

 

step 2. model

model = nn.Sequential(

 nn.Linear(784, 128),

 nn.ReLU(),

 nn.Linear(128, 10)

)

10 -> the number of output classes

 

step 3. train the model

batch gradient Descent

for epoch in range(epochs):

  for x_batch, y_batch in batches(X, Y, batch_size = 128):  # split the data into batches of batch_size and train on each

    train(x_batch, y_batch)

 

Batch Gradient Descent in PyTorch

train_loader = torch.utils.data.DataLoader(dataset = train_dataset, batch_size = batch_size, shuffle = True)

 

for epoch in range(epochs):

  for inputs, targets in train_loader:

    optimizer.zero_grad()

 

random sampling

 

step 4/5

n_correct = 0
n_total = 0
for inputs, targets in train_loader:
  outputs = model(inputs)
  _, predictions = torch.max(outputs, 1)
  n_correct += (predictions == targets).sum().item()
  n_total += targets.shape[0]

acc = n_correct / n_total

 

 

 

29. ANN for Image Classification

import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms
import numpy as np
import matplotlib.pyplot as plt
train_dataset = torchvision.datasets.MNIST(
  root = '.',
  train = True, 
  transform = transforms.ToTensor(),
  download = True)
train_dataset.data

train_dataset.data.max()

train_dataset.data.shape

train_dataset.targets

 

It was already downloaded above, so it won't be downloaded again.

train_dataset = torchvision.datasets.MNIST(
  root = '.',
  train = True, 
  transform = transforms.ToTensor(),
  download = True)

model = nn.Sequential(
 nn.Linear(784, 128),
 nn.ReLU(),
 nn.Linear(128, 10)
)
# no need for final softmax!

Check whether a GPU is available and use it if so.

This matters for speed.

device = torch.device("cuda:0" if torch.cuda.is_available() else 'cpu')
print(device)
model.to(device)

loss and optimizer

test_dataset = torchvision.datasets.MNIST(
  root = '.',
  train = False,
  transform = transforms.ToTensor(),
  download = True)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters())
batch_size = 128
train_loader = torch.utils.data.DataLoader(dataset = train_dataset, batch_size = batch_size, shuffle = True)
test_loader = torch.utils.data.DataLoader(dataset = test_dataset, batch_size = batch_size, shuffle = False)

 

 

tmp_loader = torch.utils.data.DataLoader(dataset = train_dataset, batch_size = 1, shuffle = True)
tmp_loader

for x,y in tmp_loader:
    print(x)
    print(x.shape)
    print(y.shape)
    break

train_dataset.transform(train_dataset.data.numpy()).max()
epochs= 10

train_losses = np.zeros(epochs)
test_losses = np.zeros(epochs)

for epoch in range(epochs):
  train_loss = []
  for inputs, targets in train_loader:
    inputs, targets = inputs.to(device) , targets.to(device)

    inputs = inputs.view(-1, 784)
    optimizer.zero_grad()

    outputs = model(inputs)
    loss = criterion(outputs, targets)

    loss.backward()
    optimizer.step()

    train_loss.append(loss.item())
  
  train_loss = np.mean(train_loss)

  test_loss = []
  for inputs, targets in test_loader:
    inputs, targets = inputs.to(device) , targets.to(device)

    inputs = inputs.view(-1, 784)

    outputs = model(inputs)
    loss = criterion(outputs, targets)

    test_loss.append(loss.item())
  
  test_loss = np.mean(test_loss)

  train_losses[epoch] = train_loss
  test_losses[epoch] = test_loss
  print(f'Epoch {epoch+1} / {epochs} , train loss : {train_loss:.4f} , Test loss: {test_loss:.4f}')

 

plt.plot(train_losses, label ='train loss')
plt.plot(test_losses, label = 'test loss')
plt.legend()
plt.show()

n_correct = 0.
n_total = 0.
for inputs, targets in train_loader:
  inputs, targets = inputs.to(device), targets.to(device)
  inputs = inputs.view(-1, 784)
  outputs = model(inputs)
  _, predictions = torch.max(outputs, 1)
  n_correct += (predictions == targets).sum().item()
  n_total+= targets.shape[0]
train_acc = n_correct/ n_total
  
n_correct = 0.
n_total = 0.
for inputs, targets in test_loader:
  inputs, targets = inputs.to(device), targets.to(device)
  inputs = inputs.view(-1, 784)
  outputs = model(inputs)
  _, predictions = torch.max(outputs, 1)
  n_correct += (predictions == targets).sum().item()
  n_total+= targets.shape[0]
test_acc = n_correct/ n_total
print(f"Train acc: {train_acc:.4f} , Test acc:{test_acc:.4f}")
from sklearn.metrics import confusion_matrix
import numpy as np
import itertools
def plot_confusion_matrix(cm, classes, normalize = False, title =' Confusion matrix', cmap = plt.cm.Blues):
  if normalize:
    cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
    print("Normalized confusion matrix")
  else:
    print("confusion_matrix: without Normalized")
  print(cm)
  plt.imshow(cm, interpolation='nearest', cmap=cmap)
  plt.title(title)
  plt.colorbar()
  tick_marks = np.arange(len(classes))
  plt.xticks(tick_marks, classes, rotation = 45)
  plt.yticks(tick_marks, classes)

  fmt ='.2f' if normalize else 'd'
  thresh = cm.max()/2
  for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):
    plt.text(j, i, format(cm[i,j], fmt), horizontalalignment="center" , color="white" if cm[i, j]> thresh else 'black')
  
  plt.tight_layout()
  plt.ylabel('True label')
  plt.xlabel('Predicted label')
  plt.show()

 

x_test = test_dataset.data.numpy()
y_test = test_dataset.targets.numpy()
p_test = np.array([])
for inputs, targets in test_loader:
  inputs = inputs.to(device)

  inputs = inputs.view(-1, 784)
  
  outputs = model(inputs)
  _, predictions = torch.max(outputs, 1)
  p_test = np.concatenate((p_test, predictions.cpu().numpy()))

cm = confusion_matrix(y_test, p_test)
plot_confusion_matrix(cm, list(range(10)))

 

Show examples where the prediction does not match the true label.

misclassified_idx = np.where(p_test != y_test)[0]
i = np.random.choice(misclassified_idx)
plt.imshow(x_test[i], cmap ='gray')
plt.title("True label: %s Predicted: %s" % (y_test[i], int(p_test[i])))

 

 

30. ANN for Regression

pytorch regression

import torch
import torch.nn as nn
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
N = 1000
X = np.random.random((N,2)) * 6 -3
y = np.cos(2 * X[:,0]) + np.cos(3*X[:,1])
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.scatter(X[:,0] , X[:,1] , y)

In a notebook, there is no need to call plt.show().

 

Create the model

#build the model

model = nn.Sequential(
    nn.Linear(2, 128),
    nn.ReLU(),
    nn.Linear(128, 1)
)
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr = 0.01)
def full_gd(model, criterion, optimizer, X_train, y_train, epochs= 1000):
  train_losses = np.zeros(epochs)

  for epoch in range(epochs):
    optimizer.zero_grad()

    outputs = model(X_train)
    loss = criterion(outputs, y_train)

    loss.backward()
    optimizer.step()

    train_losses[epoch] = loss.item()

    if(epoch+1) % 50 == 0:
      print(f'Epoch {epoch+1}/{epochs}, Train loss:{loss.item():.4f} ')

  return train_losses

X_train = torch.from_numpy(X.astype(np.float32))
y_train = torch.from_numpy(y.astype(np.float32).reshape(-1,1))
train_losses = full_gd(model, criterion, optimizer, X_train, y_train)
plt.plot(train_losses)

fig = plt.figure()
ax = fig.add_subplot(111, projection = "3d")
ax.scatter(X[:,0] , X[:,1] , y)

with torch.no_grad():
  line = np.linspace(-3, 3, 50)
  XX, yy = np.meshgrid(line, line)
  Xgrid = np.vstack((XX.flatten(), yy.flatten())).T
  Xgrid_torch = torch.from_numpy(Xgrid.astype(np.float32))
  yhat = model(Xgrid_torch).numpy().flatten()
  ax.plot_trisurf(Xgrid[:, 0] , Xgrid[:, 1] , yhat, linewidth = 0.2 , antialiased = True)
  plt.show()

 

The plot below covers a larger range of the surface.

fig = plt.figure()
ax = fig.add_subplot(111, projection = "3d")
ax.scatter(X[:,0] , X[:,1] , y)

with torch.no_grad():
  line = np.linspace(-5, 5, 50)
  XX, yy = np.meshgrid(line, line)
  Xgrid = np.vstack((XX.flatten(), yy.flatten())).T
  Xgrid_torch = torch.from_numpy(Xgrid.astype(np.float32))
  yhat = model(Xgrid_torch).numpy().flatten()
  ax.plot_trisurf(Xgrid[:, 0] , Xgrid[:, 1] , yhat, linewidth = 0.2 , antialiased = True)
  plt.show()


The following notes summarize the Udemy course PyTorch: Deep Learning and Artificial Intelligence.

7. What is machine learning?

What is machine learning?

It can look complicated, but

machine learning is nothing but a geometry problem

 

regression: predict a number

fit a line or curve

prediction

y hat = mx + b

make the line/curve close to the data points

 

classification: predict a category/label

target 

separate 

make the line/curve separate data points of different classes

 

supervised learning 

 

8. regression basics

tensorflow 

dataset

x-axis

y-axis

 

scikit-learn approach

load the data  => pd.read_csv()

 

1. model architecture: model = LinearRegression()

2. model.fit(X, y)

3. model.predict(X)

 

loss function

loss = cost == error = objective

 

How PyTorch differs:

no predefined models

no fit or predict



MSE loss function

Mean Squared Error: the difference between predictions and targets

the smaller the error, the better the fit

the larger the error, the worse the fit

perfect fit = zero error

 

linear regression

 

gradient descent

 

9. Regression Code Preparation

Regression code preparation

 

The concepts are the same, but the code is different.

python loop

for i in range(10):

  print(i)

 

java loop

for(int i = 0; i < 10; i++){

  System.out.println(i);

}

we know the concepts, but not the syntax

 

#1. build the model

model = nn.Linear(1,1)

 

#2. train the model

#loss and optimizer

criterion = nn.MSELoss()

optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

 

 

#Train the model(gradient descent loop)

n_epochs = 30

for it in range(n_epochs):

 # zero the parameter gradients

  optimizer.zero_grad()

  # gradient 

  # forward pass

  outputs= model(inputs)

  loss = criterion(outputs, targets)

  # backward and optimize

  loss.backward()

  optimizer.step()

 

(inputs, targets)

(X, y)

PyTorch models don't take NumPy arrays directly - they need tensors.

Array to tensor

X = X.reshape(N,1)

Y = Y.reshape(N,1)

 

# PyTorch is very picky about types

# PyTorch defaults to float32

# NumPy defaults to float64

inputs = torch.from_numpy(X.astype(np.float32))

targets = torch.from_numpy(Y.astype(np.float32))

 

#3. make predictions

#Forward pass

outputs = model(inputs)

<- No, this just gives us a Torch Tensor

(PyTorch works with tensors, not NumPy arrays)

 

predictions = model(inputs).detach().numpy()

<-Detach from graph(more detail later), and convert to Numpy array

 

 

summary:

 

10. Regression Notebook

import torch
import torch.nn as nn
import numpy as np
import matplotlib.pyplot as plt
N = 20
X= np.random.random(N) * 10 -5
y = 0.5 * X - 1 +np.random.randn(N)
plt.scatter(X,y )

linear one input one output

model = nn.Linear(1,1)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
X = X.reshape(N, 1)
y = y.reshape(N, 1)

inputs = torch.from_numpy(X.astype(np.float32))
targets = torch.from_numpy(y.astype(np.float32))
type(inputs)

type(inputs) => torch.Tensor

n_epochs = 30
losses = []
for it in range(n_epochs):

  optimizer.zero_grad()

  outputs= model(inputs)
  loss = criterion(outputs, targets)

  losses.append(loss.item())

  loss.backward()
  optimizer.step()

  print(f'Epoch {it+1}/{n_epochs}, Loss:{loss.item():.4f}')

 

plt.plot(losses)

prediction 

predicted = model(inputs).detach().numpy()
plt.scatter(X, y , label ='Original data')
plt.plot(X, predicted, label ='Fitted line')
plt.legend()
plt.show()

# Error: you can't call .numpy() on a tensor that requires grad
model(inputs).numpy()

with torch.no_grad():
  output = model(inputs).numpy()
output

w = model.weight.data.numpy()
b = model.bias.data.numpy()
print(w, b)

 

 

 

11. Moore's Law

Moore's law also comes up in computer architecture.

Computer power grows exponentially,

doubling each time.

 

computer power

grows exponentially, e.g. 1, 2, 4, 8, ...

 

y = ax + b => a linear function

normalize the data before fitting
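
Why a linear model fits here: exponential growth C = A * r^t becomes a straight line after taking the log, since log C = (log r) * t + log A. A minimal sketch with made-up numbers (the real notebook uses moore.csv) showing the log transform and normalization:

import numpy as np

year = np.arange(1971, 2021)                 # hypothetical years
count = 2250 * 2.0 ** ((year - 1971) / 2.0)  # made-up counts that double every 2 years

log_count = np.log(count)                    # exponential curve -> straight line
x = (year - year.mean()) / year.std()        # normalized input
y = (log_count - log_count.mean()) / log_count.std()  # normalized target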

 

 

12. Moore's Law Notebook

Loading the data:

 

If you get an "Error tokenizing data" error,

add error_bad_lines=False (in newer pandas versions this option has been replaced by on_bad_lines='skip').

 

 

Perhaps because the dataset has changed, an error comes up and the notebook can't proceed as-is.

 

data preprocessing

build the model

loss

train the model

model predictions

 

 

transforming back to original scale

import torch
import torch.nn as nn
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
data = pd.read_csv('moore.csv' , header= None , error_bad_lines=False).values
data
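
A hedged sketch of the remaining steps, continuing from the imports and the data array above (it assumes the first column of moore.csv is the year and the second the transistor count, which may differ in your copy of the file):

X = data[:, 0].reshape(-1, 1).astype(np.float32)          # assumed: year column
Y = np.log(data[:, 1].astype(np.float32)).reshape(-1, 1)  # log of the transistor count

# normalize both so gradient descent converges nicely
mx, sx, my, sy = X.mean(), X.std(), Y.mean(), Y.std()
inputs = torch.from_numpy((X - mx) / sx)
targets = torch.from_numpy((Y - my) / sy)

model = nn.Linear(1, 1)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.7)

for it in range(100):
  optimizer.zero_grad()
  loss = criterion(model(inputs), targets)
  loss.backward()
  optimizer.step()

# transforming back to the original scale:
# the slope of log(count) vs. year is a = w * (sy / sx), so the doubling time is log(2) / a
a = model.weight.item() * sy / sx
print('time to double:', np.log(2) / a, 'years')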

 

 

13. Linear Classification Basics

regression: we want the line to be close to the data points

classification : we want the line to separate data points

Steps (scikit-learn/TensorFlow style vs. PyTorch):
1. load in some data
2. build the model: model = MyLinearClassifier()
3. train the model: model.fit(X, y) -> no built-in equivalent in PyTorch
4. make predictions: model.predict(X_test) -> no built-in equivalent in PyTorch
5. evaluate accuracy: model.score(X, y)

accuracy = #correct / #total

error = #incorrect/ #total

error = 1- accuracy

 

linear classifier:

a = w1*x1 + w2*x2 + b

if a >= 0 -> predict 1

if a < 0 -> predict 0

the sigmoid squashes a into a value between 0 and 1 (a probability)
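
A small sketch of this decision rule with made-up numbers (the weights w1, w2 and bias b here are hypothetical, not learned values):

import numpy as np

def sigmoid(a):
  return 1 / (1 + np.exp(-a))      # squashes any real number into (0, 1)

w1, w2, b = 2.0, -1.0, 0.5         # hypothetical parameters
x1, x2 = 1.0, 3.0                  # one input point

a = w1 * x1 + w2 * x2 + b          # a = -0.5
p = sigmoid(a)                     # about 0.38, the probability of class 1
prediction = 1 if a >= 0 else 0    # same as checking p >= 0.5
print(a, p, prediction)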

 

model architecture

making predictions

training

 

14. Classification Code Preparation

loading in the data

from sklearn.datasets import load_breast_cancer

data = load_breast_cancer()
X,y = data.data, data.target

preprocess the data

split the data

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size =0.3)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

build the model

Recall: the shape of the data is "NxD"

model = nn.Sequential(

  nn.Linear(D,1),

  nn.Sigmoid()

)

the input layer takes all D features of the dataset

 

Train the model

#loss and optimizer

criterion = nn.BCELoss() #binary cross entropy 

optimizer = torch.optim.Adam(model.parameters()) # Adam; its hyperparameters (e.g. learning rate) are left at their defaults
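
For reference, the binary cross entropy that nn.BCELoss computes, for predicted probabilities p_i and labels y_i (0 or 1), is:

BCE = -(1/N) * sum_i [ y_i * log(p_i) + (1 - y_i) * log(1 - p_i) ]

It is smallest when the predicted probability matches the true label.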

 

for it in range(n_epochs):

  optimizer.zero_grad()
  # ...followed by the forward pass, loss = criterion(outputs, targets), loss.backward(), optimizer.step()
  # (the full loop is written out in the Classification Notebook section below)

 

hyperparameter

 

evaluating the model

Recall: in regression, we evaluate with the MSE (the loss itself)

RMSE can also be used

 

GET THE ACCURACY:

with torch.no_grad():

  p_train = model(X_train)

  p_train = np.round(p_train.numpy())

  train_acc = np.mean(y_train.numpy() == p_train)

 

 

15. Classification Notebook

pytorch Classification code

import torch
import torch.nn as nn
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_breast_cancer

data = load_breast_cancer()
type(data)

data.keys()

Check the shape.

data.data.shape

 

There are 569 samples and 30 features.

data.target

data.target_names

data.target.shape

data.feature_names

data split

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(data.data, data.target, test_size =0.3)
N, D = X_train.shape

normalization with StandardScaler

from sklearn.preprocessing import StandardScaler


scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

 

pytorch build model

model = nn.Sequential(
  nn.Linear(D,1),
  nn.Sigmoid()
)

 

loss and optimizer

criterion = nn.BCELoss()
optimizer = torch.optim.Adam(model.parameters())

 

convert the data to tensors

the 1-D target arrays are reshaped into 2-D columns of shape (N, 1)

X_train = torch.from_numpy(X_train.astype(np.float32))
X_test = torch.from_numpy(X_test.astype(np.float32))
y_train = torch.from_numpy(y_train.astype(np.float32).reshape(-1,1))
y_test = torch.from_numpy(y_test.astype(np.float32).reshape(-1,1))
epochs = 1000
train_losses = np.zeros(epochs)
test_losses = np.zeros(epochs)

for epoch in range(epochs):
  optimizer.zero_grad()

  outputs = model(X_train)
  loss = criterion(outputs, y_train )

  outputs_test = model(X_test)
  loss_test = criterion(outputs_test, y_test)

  #save losses
  train_losses[epoch] = loss.item()
  test_losses[epoch] = loss_test.item()

  if (epoch+1) % 50 == 0: 
    print(f'Epoch {epoch+1}/ {epochs} , Train loss: {loss.item():.4f} , Test loss: {loss_test.item():.4f}'  )

 

plt.plot(train_losses , label = 'train loss')
plt.plot(test_losses , label = 'test loss')
plt.legend()
plt.show()

#loss function 

accuracy 

# get Accuracy
with torch.no_grad():
  p_train = model(X_train)
  p_train = np.round(p_train.numpy())
  train_acc = np.mean(y_train.numpy() == p_train)

  p_test = model(X_test)
  p_test = np.round(p_test.numpy())
  test_acc = np.mean(y_test.numpy() == p_test)

print(f"Train acc: {train_acc:.4f} , Test acc:{test_acc:.4f}")

 

loss and accuracy (note: train_acc and test_acc above are single scalars, so the plt.plot calls below only place one point each)

plt.plot(train_acc , label = 'train acc')
plt.plot(test_acc , label = 'test acc')
plt.legend()
plt.show()
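
Since train_acc and test_acc are single numbers, the curves above won't show much. A sketch of recording accuracy every epoch instead, reusing the model, loss, optimizer and tensors defined above (note this runs the training loop again on the already-trained model):

train_accs = np.zeros(epochs)
test_accs = np.zeros(epochs)

for epoch in range(epochs):
  optimizer.zero_grad()
  outputs = model(X_train)
  loss = criterion(outputs, y_train)
  loss.backward()
  optimizer.step()

  with torch.no_grad():
    p_train = np.round(model(X_train).numpy())
    p_test = np.round(model(X_test).numpy())
    train_accs[epoch] = np.mean(y_train.numpy() == p_train)
    test_accs[epoch] = np.mean(y_test.numpy() == p_test)

plt.plot(train_accs, label='train acc')
plt.plot(test_accs, label='test acc')
plt.legend()
plt.show()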

 

16. Saving and loading a model

pytorch saving and loading model

model.state_dict() returns a dictionary of the model's parameters:

model.state_dict()

 

save the model

torch.save(model.state_dict(), 'myFirstModel.pt')
!ls

model1 = nn.Sequential(
  nn.Linear(D,1),
  nn.Sigmoid()
)
model1.load_state_dict(torch.load('myFirstModel.pt'))

# get Accuracy
with torch.no_grad():
  p_train = model1(X_train)
  p_train = np.round(p_train.numpy())
  train_acc = np.mean(y_train.numpy() == p_train)

  p_test = model1(X_test)
  p_test = np.round(p_test.numpy())
  test_acc = np.mean(y_test.numpy() == p_test)

print(f"Train acc: {train_acc:.4f} , Test acc:{test_acc:.4f}")

Downloading the model from Google Colab:

from google.colab import files
files.download('myFirstModel.pt')

The browser downloads the file from Colab to your local machine.

 

17. A short Neuroscience Primer

linear regression   y = ax + b

logistic regression

 

neural network

neuron

senses -> signals 

 

18. How does a model "learn"?

linear regression

line of best fit

 

mse mean squared error

minimizing the cost -> making it as small as possible

gradient -> the slope of the cost with respect to the weights

 

gradient Descent 

at the minimum of the cost, the gradient is zero

training takes multiple epochs (gradient descent steps)

 

gradient Descent 

learning rate -> hyperparameter
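
Written out, one gradient descent step updates every weight w against the slope of the loss:

w <- w - learning_rate * dL/dw

A small learning rate takes small, slow steps; one that is too large can overshoot the minimum.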

 

 

19. Model With logits

Everything is the same except the part where the model is built.

Because the sigmoid computation is folded into the loss function, the model itself no longer needs a Sigmoid layer.

 

model2 = nn.Linear(D,1)

criterion = nn.BCEWithLogitsLoss()  # binary cross entropy that applies the sigmoid internally
optimizer = torch.optim.Adam(model2.parameters())

 

Prediction is also different.

The model now outputs raw logits (arbitrary real numbers), so we threshold at 0 instead of rounding probabilities.

# get Accuracy
with torch.no_grad():
  p_train = model2(X_train)
  p_train = (p_train.numpy() > 0)  # a logit > 0 corresponds to probability > 0.5
  train_acc = np.mean(y_train.numpy() == p_train)

  p_test = model2(X_test)
  p_test = (p_test.numpy() > 0)
  test_acc = np.mean(y_test.numpy() == p_test)

print(f"Train acc: {train_acc:.4f} , Test acc:{test_acc:.4f}")

 

20. Train Sets vs. Validation Sets vs. Test Sets

The dataset is split so we can detect and avoid overfitting.

Finding the optimal hyperparameters uses the validation set.

 

cross-validation

hyperparameter
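
A minimal sketch of the three-way split with scikit-learn, assuming a feature matrix X and labels y (e.g. from load_breast_cancer as above); the 60/20/20 proportions are just an example:

from sklearn.model_selection import train_test_split

# carve out the test set first, then split the rest into train and validation
X_trainval, X_test, y_trainval, y_test = train_test_split(X, y, test_size=0.2)
X_train, X_val, y_train, y_val = train_test_split(X_trainval, y_trainval, test_size=0.25)
# result: 60% train, 20% validation, 20% test of the original data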

 

 

21. Suggestion Box

 

 


The following is a summary of notes from the Udemy course Pytorch: Deep Learning and Artificial Intelligence.

 

01. Introduction

1. Welcome

pytorch : simple and easy

CNNs

RNNs - sequence data

Stock Prediction with RNNs

GANs

Deep Reinforcement Learning

  e.g. AlphaGo

NLP

Object detection

facial recognition

deep fakes

 

2. overview and outline

TensorFlow became simpler after its upgrade, but it is high level:

common things are easy, while specialized things are hard.

With PyTorch there is also no need to use linear algebra and calculus to derive backpropagation by hand,

  and it runs on the GPU.

google colab

Jupyter Notebook but hosted by Google

 

NLP 

Text classification

Embeddings

 

 

Transfer Learning

Their Network + Your Network

combine Your Network with theirs - get a state-of-the-art deep net within seconds

 

3. where to get the code

google colab

git clone <url>

git pull -> get the latest version

Typing the code yourself is part of the learning experience.

Theory lecture -> code lecture -> Possible extensions

 

02. Google Colab

4. Jupyter Notebook, Google Colab

download datafiles

access to GPU and TPU (Tensor Processing Unit)

notebooks are stored in Google Drive (the "cloud")

many libraries are pre-installed

 

Google Drive

If Google Colab does not appear in the "New" menu:

More menu -> Connect more apps -> search for Colab and install it.

 

To use a GPU or TPU, go to Runtime -> Change runtime type and select the accelerator.

 

Add code and text cells with the buttons below the menu bar.

 

The text style changes depending on the level you choose;

pick whether the cell should be a heading or plain text.

 

Adding code:

import numpy and matplotlib:

import numpy as np
import matplotlib.pyplot as plt

Example:

x = np.linspace(0, 10* np.pi, 1000)
y = np.sin(x)
plt.plot(x,y)

Since this is a notebook, you don't need to call plt.show().

 

 

Check the version with __version__:

import tensorflow as tf
tf.__version__

If the runtime gets disconnected, just click Reconnect.

If the notebook sits idle for a long time, a warning message appears.

You can also run out of memory.

 

5. Uploading your own data to google colab

Download the data from a URL.

Use !wget to download the data.
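
For example (with <url> standing in for the dataset's actual address):

!wget <url>   # saves the file into Colab's current working directory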

 

!ls => list the files in the current directory

!head => look at the first few lines and check whether there is a header row

There is no header row, so read the file with header = None.

import pandas as pd
df = pd.read_csv('arrhythmia.data', header = None)

There are many columns, so only the first few are kept and given names.

data = df[[0,1,2,3,4,5]]
data.columns = ['age', 'sex', 'height', 'weight', 'QRS duration' , 'P-R interval']

The dataset's documentation explains what each column means.

!head arrhythmia.data

 

Ending the plotting line with a semicolon suppresses the extra text output.

import matplotlib.pyplot as plt
plt.rcParams['figure.figsize'] = [15,15]
data.hist();

 

 

from pandas.plotting import scatter_matrix
scatter_matrix(data);

 

Part 2: using tf.keras

TensorFlow is also covered because you can't always use only PyTorch; sometimes you have to use TensorFlow as well.

 

tensorflow downgrade

!pip install -q tensorflow==2.0.0-beta1
import tensorflow as tf
print(tf.__version__)

 

If the version does not change even after installing, restart the runtime and run the cell again.

 

tf.keras.utils.get_file('auto-mpg.data', url)

!head '/root/.keras/datasets/auto-mpg.data'

 

Check for a header row and whitespace-separated columns.

import pandas as pd
df = pd.read_csv('/root/.keras/datasets/auto-mpg.data', header = None, delim_whitespace=True)
df.head()

 

upload the file yourself

from google.colab import files
uploaded = files.upload()

 

A file picker lets you choose the file to upload.

The result is a dictionary you can inspect:

uploaded

!ls

 

import pandas as pd
df = pd.read_csv('daily-minimum-temperatures-in-me.csv', error_bad_lines=False)
df.head()
from google.colab import files
uploaded = files.upload()

 

You can import and call a function from the uploaded file:

from fake_util import my_useful_function
my_useful_function()

!pwd

access file from google drive

mount the drive

from google.colab import drive
drive.mount('/content/gdrive')

Click the URL, copy the authorization code, and paste it into the prompt.

 

!ls
!ls gdrive

Check what is in Google Drive:

!ls '/content/gdrive/My Drive'

 

6. where can I learn about numpy, scipy, matplotlib, pandas, and Scikit-learn?

Numpy

matrix arithmetic

tensors (arrays), the same objects the frameworks like TensorFlow work with

1-D tensor (vector)

2-D tensor (matrix)

3-D and 4-D tensors

computation:

matrix multiply = dot/inner product (np.dot)

element-wise multiply (*)
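
A quick illustration of the difference (the arrays are just examples):

import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

print(A.dot(B))  # matrix multiply (same as np.dot(A, B) or A @ B)
print(A * B)     # element-wise multiply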

 

matplotlib

plotting / figures

 

pandas

loading data

 

scipy

no real use of this (so far)

NumPy is low-level: adding, multiplying

SciPy is a more powerful version of NumPy

 

scikit-learn

basic machine learning

 

Machine Learning step

1. Load in the data

2. split into train/test sets

3. build a model

4. fit the model(gradient descent)

5. evaluate the model  -> accuracy

   overfitting and underfitting

6. make predictions <- need to convert between Numpy array and Torch Tensor

 
