07-2. Convolutional Neural Networks (CNN)

2020. 12. 9. 10:20

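7.4 Implementing the Convolution/Pooling Layers

7.4.1 4-Dimensional Arrays

In a CNN, the data that flows between layers is 4-dimensional.

(10,3,28,28) => 10 data items, each with 3 channels, height 28 and width 28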

import numpy as np
x = np.random.rand(10, 1, 28, 28)  # 10 data items, 1 channel, height 28, width 28
x.shape  # (10, 1, 28, 28)
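As a small illustration of this layout (a sketch; the variable names `first_item` and `first_channel` are mine, not from the book), indexing the first axis picks out one data item, and indexing the first two axes picks out a single channel of that item:

```python
import numpy as np

# 10 data items, 1 channel, 28x28 pixels: (N, C, H, W)
x = np.random.rand(10, 1, 28, 28)

first_item = x[0]        # one data item, shape (C, H, W)
first_channel = x[0, 0]  # first channel of that item, shape (H, W)

print(x.shape)
print(first_item.shape)
print(first_channel.shape)
```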

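7.4.2 Unrolling Data with im2col

Implementing the convolution operation literally means nesting several for loops, which performs poorly, so im2col is used instead.

im2col is a function that unrolls (flattens) the input data into a form that is convenient for filtering (the weight computation).

Applying im2col to 3-dimensional input data turns it into a 2-dimensional matrix. (More precisely, it converts 4-dimensional data, which also includes the number of items in the batch, into 2 dimensions.)

Each region of the input that the filter is applied to (a 3-dimensional block) is laid out as a single row.

When these regions overlap, the unrolled result has more elements than the original input.

Drawback: an implementation based on im2col therefore consumes more memory.

However, computers are very good at computing one large matrix in a single batched operation.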
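To get a feel for the memory cost, the element counts can be worked out by hand. The helper below is a hypothetical one written for this note (it is not part of the book's code); it assumes the same output-size formula that im2col uses in this post.

```python
def im2col_element_count(C, H, W, FH, FW, stride=1, pad=0):
    """Number of elements im2col produces for ONE input of shape (C, H, W)."""
    out_h = (H + 2 * pad - FH) // stride + 1
    out_w = (W + 2 * pad - FW) // stride + 1
    rows = out_h * out_w   # one row per filter position
    cols = C * FH * FW     # one column per element of the 3D block
    return rows * cols

original = 3 * 7 * 7                                    # elements in a (3, 7, 7) input
overlapping = im2col_element_count(3, 7, 7, 5, 5)       # 5x5 filter, stride 1
non_overlapping = im2col_element_count(3, 4, 4, 2, 2, stride=2)

print(original, overlapping)       # 147 675
print(3 * 4 * 4, non_overlapping)  # 48 48
```

With a 5x5 filter sliding one pixel at a time over a 7x7 input, the unrolled matrix is several times larger than the input, while non-overlapping windows leave the count unchanged.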

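im2col stands for "image to column". Deep learning frameworks such as Caffe and Chainer provide a function named im2col and use it when implementing their convolution layers.

After unrolling the input data with im2col, unroll the convolution layer's filters (weights) into columns, one column per filter, and compute the product of the two matrices. This is almost the same computation as the Affine layer.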
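The claim that convolution reduces to one matrix product can be checked numerically. The following is a minimal sketch, assuming no bias and no padding in the reference version; `conv_naive` is a hypothetical helper written only for comparison, and `im2col` is a compact copy of the function defined later in this post.

```python
import numpy as np

def im2col(x, FH, FW, stride=1, pad=0):
    N, C, H, W = x.shape
    out_h = (H + 2 * pad - FH) // stride + 1
    out_w = (W + 2 * pad - FW) // stride + 1
    img = np.pad(x, [(0, 0), (0, 0), (pad, pad), (pad, pad)], "constant")
    col = np.zeros((N, C, FH, FW, out_h, out_w))
    for y in range(FH):
        for xx in range(FW):
            col[:, :, y, xx, :, :] = img[:, :, y:y + out_h * stride:stride,
                                         xx:xx + out_w * stride:stride]
    return col.transpose(0, 4, 5, 1, 2, 3).reshape(N * out_h * out_w, -1)

def conv_naive(x, w, stride=1):
    """Reference convolution with explicit loops (no padding, no bias)."""
    N, C, H, W = x.shape
    FN, _, FH, FW = w.shape
    out_h = (H - FH) // stride + 1
    out_w = (W - FW) // stride + 1
    out = np.zeros((N, FN, out_h, out_w))
    for n in range(N):
        for f in range(FN):
            for i in range(out_h):
                for j in range(out_w):
                    top, left = i * stride, j * stride
                    patch = x[n, :, top:top + FH, left:left + FW]
                    out[n, f, i, j] = np.sum(patch * w[f])
    return out

rng = np.random.default_rng(0)
x = rng.random((2, 3, 7, 7))   # (N, C, H, W)
w = rng.random((4, 3, 5, 5))   # (FN, C, FH, FW)

col = im2col(x, 5, 5)          # (2*3*3, 3*5*5) = (18, 75)
col_w = w.reshape(4, -1).T     # (75, 4): one column per filter
out = (col @ col_w).reshape(2, 3, 3, 4).transpose(0, 3, 1, 2)

print(out.shape)
print(np.allclose(out, conv_naive(x, w)))
```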

7.4.3 Implementing the Convolution Layer

input_data has the shape (number of data items, channels, height, width) => 4 dimensions

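import sys, os
sys.path.append(os.pardir)  # add the parent directory to the import path (where the book's common package is expected)
from common.util import im2col
import numpy as np

x1 = np.random.rand(1, 3, 7, 7)  # 1 data item, 3 channels, 7x7
col1 = im2col(x1, 5, 5, stride=1, pad=0)
print(col1.shape)  # (9, 75)

x2 = np.random.rand(10, 3, 7, 7)  # batch of 10 data items
col2 = im2col(x2, 5, 5, stride=1, pad=0)
print(col2.shape)  # (90, 75)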

Running this raised "ModuleNotFoundError: No module named 'common.util'", so I defined the im2col function myself.

def im2col(input_img, FH, FW, stride=1, pad=0):
    N, C, H, W = input_img.shape
    # Output height and width (floor division)
    out_h = 1 + (H + 2 * pad - FH) // stride
    out_w = 1 + (W + 2 * pad - FW) // stride
    # Zero-pad only the height and width axes
    img = np.pad(input_img, [(0, 0), (0, 0), (pad, pad), (pad, pad)], 'constant')
    out_img = np.zeros((N, C, FH, FW, out_h, out_w))
    # For each position (y, x) inside the filter, copy the values seen at every output position
    for y in range(FH):
        y_max = y + out_h * stride
        for x in range(FW):
            x_max = x + out_w * stride
            out_img[:, :, y, x, :, :] = img[:, :, y:y_max:stride, x:x_max:stride]
    out_img = np.transpose(out_img, (0, 4, 5, 1, 2, 3))  # N, OH, OW, C, FH, FW
    out_img = out_img.reshape(N * out_h * out_w, -1)  # rows: N*out_h*out_w, columns: C*FH*FW
    return out_img

The following approach also works.

def im2col(input_img, FH, FW, stride=1, pad=0):
    N, C, H, W = input_img.shape
    out_h = 1 + (H + 2 * pad - FH) // stride
    out_w = 1 + (W + 2 * pad - FW) // stride
    img = np.pad(input_img, [(0, 0), (0, 0), (pad, pad), (pad, pad)], 'constant')
    out_img = np.zeros((N, C, FH, FW, out_h, out_w))
    # For each output position (y, x), copy the whole FH x FW window at once
    for y in range(out_h):
        y_left = y * stride
        for x in range(out_w):
            x_left = x * stride
            out_img[:, :, :, :, y, x] = img[:, :, y_left:y_left + FH, x_left:x_left + FW]
    out_img = np.transpose(out_img, (0, 4, 5, 1, 2, 3))  # N, OH, OW, C, FH, FW
    out_img = out_img.reshape(N * out_h * out_w, -1)
    return out_img
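As a cross-check that is not from the book, the same unrolling can be written with NumPy's `sliding_window_view` (available in NumPy 1.20 and later). This is only a sketch under that assumption; `im2col_loop` is a compact copy of the first version above and `im2col_windows` is a name I made up for the alternative.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # NumPy >= 1.20

def im2col_loop(x, FH, FW, stride=1, pad=0):
    N, C, H, W = x.shape
    out_h = (H + 2 * pad - FH) // stride + 1
    out_w = (W + 2 * pad - FW) // stride + 1
    img = np.pad(x, [(0, 0), (0, 0), (pad, pad), (pad, pad)], "constant")
    col = np.zeros((N, C, FH, FW, out_h, out_w))
    for y in range(FH):
        for xx in range(FW):
            col[:, :, y, xx, :, :] = img[:, :, y:y + out_h * stride:stride,
                                         xx:xx + out_w * stride:stride]
    return col.transpose(0, 4, 5, 1, 2, 3).reshape(N * out_h * out_w, -1)

def im2col_windows(x, FH, FW, stride=1, pad=0):
    N = x.shape[0]
    img = np.pad(x, [(0, 0), (0, 0), (pad, pad), (pad, pad)], "constant")
    # All stride-1 windows over the height and width axes:
    # shape (N, C, H' - FH + 1, W' - FW + 1, FH, FW)
    windows = sliding_window_view(img, (FH, FW), axis=(2, 3))
    # Keep only every stride-th window position
    windows = windows[:, :, ::stride, ::stride]
    out_h, out_w = windows.shape[2], windows.shape[3]
    # Reorder to (N, OH, OW, C, FH, FW), then flatten to 2D
    return windows.transpose(0, 2, 3, 1, 4, 5).reshape(N * out_h * out_w, -1)

x = np.random.rand(2, 3, 7, 7)
same = np.array_equal(im2col_loop(x, 3, 3, stride=2, pad=1),
                      im2col_windows(x, 3, 3, stride=2, pad=1))
print(same)
```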

out_h = (H + 2*pad - filter_h)//stride + 1 => the division is floor division, so for example 3/2 is rounded down to 1
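A few concrete values make the formula easier to trust. `conv_output_size` below is a hypothetical helper name used only for this illustration:

```python
def conv_output_size(input_size, filter_size, stride=1, pad=0):
    # Floor division drops any leftover pixels the filter cannot fully cover
    return (input_size + 2 * pad - filter_size) // stride + 1

print(conv_output_size(7, 5))                    # 3
print(conv_output_size(6, 3, stride=2))          # 3 // 2 + 1 = 2
print(conv_output_size(28, 5, stride=1, pad=2))  # 28
```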

np.zeros((N, C, filter_h, filter_w, out_h, out_w)) => allocates the output array, initialized to zeros

np.transpose(out_img, (0, 4, 5, 1, 2, 3))

=> axis 0 stays in place; the other axes move as 1->3, 2->4, 3->5, 4->1, 5->2

new position: 0 1 2 3 4 5

old axis:     0 4 5 1 2 3
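The axis mapping can be confirmed on a small array. This is just a sketch with sizes I picked arbitrarily:

```python
import numpy as np

N, C, FH, FW, out_h, out_w = 2, 3, 5, 4, 3, 6
a = np.random.rand(N, C, FH, FW, out_h, out_w)

# New axis k takes old axis (0, 4, 5, 1, 2, 3)[k]
b = np.transpose(a, (0, 4, 5, 1, 2, 3))

print(a.shape)  # (2, 3, 5, 4, 3, 6)
print(b.shape)  # (2, 3, 6, 3, 5, 4)

# The same element, addressed in both layouts:
# b[n, oh, ow, c, fh, fw] is a[n, c, fh, fw, oh, ow]
print(b[1, 2, 0, 1, 3, 2] == a[1, 1, 3, 2, 2, 0])
```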

reshape then flattens the result into 2 dimensions

import numpy as np

# Uses the im2col defined above instead of common.util
x1 = np.random.rand(1, 3, 7, 7)
col1 = im2col(x1, 5, 5, stride=1, pad=0)
print(col1.shape)  # (9, 75)

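The convolution layer is implemented as a class (the post calls it Convolution; the code below names it Conv).

python reshape

The second argument to reshape is given as -1, which is a convenience feature of reshape.

When -1 is specified, reshape infers the size of that dimension so that the total number of elements stays the same after the conversion.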

x = np.random.rand(10, 3, 7, 7)
print(x.shape)                  # (10, 3, 7, 7)
print(x.reshape(10, -1).shape)  # (10, 147), since 3 * 7 * 7 = 147

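python transpose

At the end of forward, the output is rearranged with transpose, a function that changes the order of the axes of a multidimensional array.

Because the data is unrolled with im2col, the layer can be implemented almost exactly like the Affine layer of a fully connected network.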

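class Conv:
    '''
    Takes the initial filters (weights), bias, stride and padding.
    The filters have the 4-dimensional shape (FN, C, FH, FW):
    FN - number of filters
    C  - number of channels
    FH - filter height
    FW - filter width
    '''
    def __init__(self, W, b, stride=1, pad=0):
        self.W = W
        self.b = b
        self.stride = stride
        self.pad = pad
        # Intermediate values cached in forward and used in backward
        self.x = None
        self.col = None
        self.col_W = None
        # Gradients of the weights and the bias
        self.dW = None
        self.db = None

    def forward(self, x):
        FN, C, FH, FW = self.W.shape
        N, C, H, W = x.shape
        out_h = 1 + (H + 2 * self.pad - FH) // self.stride
        out_w = 1 + (W + 2 * self.pad - FW) // self.stride

        # Unroll each filter region of the input into one row
        col = im2col(x, FH, FW, self.stride, self.pad)
        # Unroll each filter into one column: (C*FH*FW, FN)
        col_W = self.W.reshape(FN, -1).T
        out = np.dot(col, col_W) + self.b

        # (N*out_h*out_w, FN) -> (N, FN, out_h, out_w)
        out = out.reshape(N, out_h, out_w, -1).transpose(0, 3, 1, 2)

        self.x = x
        self.col = col
        self.col_W = col_W

        return out

    def backward(self, dout):
        # Requires col2im (the inverse of im2col) to be defined or imported
        FN, C, FH, FW = self.W.shape
        dout = dout.transpose(0, 2, 3, 1).reshape(-1, FN)

        self.db = np.sum(dout, axis=0)
        self.dW = np.dot(self.col.T, dout)
        self.dW = self.dW.transpose(1, 0).reshape(FN, C, FH, FW)

        dcol = np.dot(dout, self.col_W.T)
        dx = col2im(dcol, self.x.shape, FH, FW, self.stride, self.pad)

        return dx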
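backward calls col2im, which this post never defines. The sketch below follows the col2im I remember from the book's common/util.py, so treat it as an assumption rather than a verified copy. It reverses im2col and adds up the values wherever windows overlap. A compact im2col is included only so the example runs on its own.

```python
import numpy as np

def im2col(x, FH, FW, stride=1, pad=0):
    N, C, H, W = x.shape
    out_h = (H + 2 * pad - FH) // stride + 1
    out_w = (W + 2 * pad - FW) // stride + 1
    img = np.pad(x, [(0, 0), (0, 0), (pad, pad), (pad, pad)], "constant")
    col = np.zeros((N, C, FH, FW, out_h, out_w))
    for y in range(FH):
        for xx in range(FW):
            col[:, :, y, xx, :, :] = img[:, :, y:y + out_h * stride:stride,
                                         xx:xx + out_w * stride:stride]
    return col.transpose(0, 4, 5, 1, 2, 3).reshape(N * out_h * out_w, -1)

def col2im(col, input_shape, FH, FW, stride=1, pad=0):
    N, C, H, W = input_shape
    out_h = (H + 2 * pad - FH) // stride + 1
    out_w = (W + 2 * pad - FW) // stride + 1
    # Undo the reshape and transpose done at the end of im2col
    col = col.reshape(N, out_h, out_w, C, FH, FW).transpose(0, 3, 4, 5, 1, 2)

    # Padded image, with a little slack so strided slices never run out of range
    img = np.zeros((N, C, H + 2 * pad + stride - 1, W + 2 * pad + stride - 1))
    for y in range(FH):
        y_max = y + stride * out_h
        for xx in range(FW):
            x_max = xx + stride * out_w
            # Accumulate: a pixel covered by several windows receives several values
            img[:, :, y:y_max:stride, xx:x_max:stride] += col[:, :, y, xx, :, :]

    # Strip the padding
    return img[:, :, pad:H + pad, pad:W + pad]

# Non-overlapping windows: the round trip restores the input exactly
x = np.arange(16, dtype=float).reshape(1, 1, 4, 4)
restored = col2im(im2col(x, 2, 2, stride=2), x.shape, 2, 2, stride=2)
print(np.array_equal(restored, x))

# Overlapping windows: each pixel is summed once per window that covers it
ones = np.ones((1, 1, 3, 3))
counts = col2im(im2col(ones, 2, 2), ones.shape, 2, 2)
print(counts[0, 0])
```

The accumulation with += is the important design point: in the backward pass, a pixel that contributed to several outputs has to collect the gradient from every one of them.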

7.4.4 Implementing the Pooling Layer

Unlike the convolution layer, pooling treats each channel independently.

The pooling regions are unrolled separately for each channel.

class Pooling:
    def __init__(self, pool_h, pool_w, stride=1, pad=0):
        self.pool_h = pool_h
        self.pool_w = pool_w
        self.stride = stride
        self.pad = pad
        # Values cached in forward and used in backward
        self.x = None
        self.arg_max = None

    def forward(self, x):
        N, C, H, W = x.shape
        # Include the padding so the size matches what im2col produces
        out_h = 1 + (H + 2 * self.pad - self.pool_h) // self.stride
        out_w = 1 + (W + 2 * self.pad - self.pool_w) // self.stride

        # Unroll the input, then reshape so that each row holds
        # one pooling window of one channel
        col = im2col(x, self.pool_h, self.pool_w, self.stride, self.pad)
        col = col.reshape(-1, self.pool_h * self.pool_w)

        arg_max = np.argmax(col, axis=1)  # position of the maximum in each row
        out = np.max(col, axis=1)  # maximum of each row

        # (N*out_h*out_w*C,) -> (N, C, out_h, out_w)
        out = out.reshape(N, out_h, out_w, C).transpose(0, 3, 1, 2)

        self.x = x
        self.arg_max = arg_max

        return out

    def backward(self, dout):
        # Requires col2im (the inverse of im2col) to be defined or imported
        dout = dout.transpose(0, 2, 3, 1)

        pool_size = self.pool_h * self.pool_w
        dmax = np.zeros((dout.size, pool_size))
        # Route each gradient only to the position that held the maximum
        dmax[np.arange(self.arg_max.size), self.arg_max.flatten()] = dout.flatten()
        dmax = dmax.reshape(dout.shape + (pool_size,))

        dcol = dmax.reshape(dmax.shape[0] * dmax.shape[1] * dmax.shape[2], -1)
        dx = col2im(dcol, self.x.shape, self.pool_h, self.pool_w, self.stride, self.pad)

        return dx
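The indexing line in backward is the least obvious part, so here it is in isolation on hand-written numbers (a sketch; the arrays are made up for illustration). Each row of `col` stands for one flattened 2x2 pooling window.

```python
import numpy as np

# Forward: three pooling windows, each flattened to 4 values
col = np.array([[1., 5., 2., 0.],
                [3., 1., 4., 2.],
                [0., 0., 7., 1.]])
arg_max = np.argmax(col, axis=1)  # index of the maximum in each window
out = np.max(col, axis=1)         # pooled outputs

# Backward: one upstream gradient per window
dout = np.array([10., 20., 30.])
dmax = np.zeros_like(col)
# Row i receives dout[i] at column arg_max[i]; every other entry stays 0
dmax[np.arange(arg_max.size), arg_max] = dout

print(arg_max)  # [1 2 2]
print(dmax)
```

Only the element that was selected as the maximum gets a gradient, which matches the fact that the other elements had no effect on the output.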

1. Unroll the input data.

2. Take the maximum of each row.

3. Reshape the result into the appropriate output shape.
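The three steps above can be traced on a tiny input. This is a minimal sketch assuming 2x2 max pooling with stride 2 on a 2-channel 4x4 input; the compact im2col is repeated only so the block runs on its own.

```python
import numpy as np

def im2col(x, FH, FW, stride=1, pad=0):
    N, C, H, W = x.shape
    out_h = (H + 2 * pad - FH) // stride + 1
    out_w = (W + 2 * pad - FW) // stride + 1
    img = np.pad(x, [(0, 0), (0, 0), (pad, pad), (pad, pad)], "constant")
    col = np.zeros((N, C, FH, FW, out_h, out_w))
    for y in range(FH):
        for xx in range(FW):
            col[:, :, y, xx, :, :] = img[:, :, y:y + out_h * stride:stride,
                                         xx:xx + out_w * stride:stride]
    return col.transpose(0, 4, 5, 1, 2, 3).reshape(N * out_h * out_w, -1)

# Channel 0 holds 0..15, channel 1 holds 16..31
x = np.arange(32, dtype=float).reshape(1, 2, 4, 4)
N, C, H, W = x.shape
pool_h = pool_w = 2
stride = 2
out_h = (H - pool_h) // stride + 1
out_w = (W - pool_w) // stride + 1

# 1. Unroll: (N*out_h*out_w, C*pool_h*pool_w), then one row per window per channel
col = im2col(x, pool_h, pool_w, stride, 0)
col = col.reshape(-1, pool_h * pool_w)

# 2. Maximum of each row
row_max = np.max(col, axis=1)

# 3. Reshape back to (N, C, out_h, out_w)
out = row_max.reshape(N, out_h, out_w, C).transpose(0, 3, 1, 2)

print(col.shape)  # (8, 4)
print(out[0, 0])  # pooled channel 0
print(out[0, 1])  # pooled channel 1
```

Because the second reshape splits each unrolled row by channel, the maximum is never taken across channels, which is the independence mentioned at the start of this section.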

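The np.max function

np.max accepts an axis argument. np.max(x, axis=1) takes the maximum along axis 1 of the input x, which for a 2-dimensional array means one maximum per row.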
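A quick comparison of the axis choices on a small array (values chosen arbitrarily for illustration):

```python
import numpy as np

x = np.array([[1, 5, 2],
              [7, 3, 4]])

row_max = np.max(x, axis=1)  # one maximum per row
col_max = np.max(x, axis=0)  # one maximum per column
all_max = np.max(x)          # maximum over the whole array

print(row_max)  # [5 7]
print(col_max)  # [7 5 4]
print(all_max)  # 7
```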
