Posts tagged 'Cost Function'


Cost Function_20210910

2021. 10. 6. 21:02

Cost function:

Also known as: cost function, error function

It measures how good (or bad) the current line is, so that each iteration can be checked for improvement.

Linear regression: mean squared error (MSE)
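
To make the idea concrete, here is a minimal plain-Python sketch of the mean squared error. The data points and the two candidate lines are made up for illustration and are not from the book cited below.

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    return sum((y - p) ** 2 for y, p in zip(actual, predicted)) / len(actual)

# Hypothetical data: three points that lie exactly on y = 2x
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

good_line = [2.0 * x for x in xs]       # candidate line y = 2x
bad_line = [1.0 * x + 1.0 for x in xs]  # candidate line y = x + 1

mse_good = mean_squared_error(ys, good_line)
mse_bad = mean_squared_error(ys, bad_line)
print("good line:", mse_good)  # a perfect fit has zero cost
print("bad line:", mse_bad)    # a worse line has a larger cost
```

The lower the cost, the better the line, which is what lets each iteration be compared with the previous one.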

Source: 텐서플로 첫걸음


Machine Learning-7

2020. 11. 20. 20:23

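# Random number generation and variable declaration
# Note: this is the TensorFlow 1.x API. Under TensorFlow 2.x, use
# `import tensorflow.compat.v1 as tf` and `tf.disable_v2_behavior()` as in the later examples.
import tensorflow as tf

print(tf.__version__)

# Declare variables initialized with uniformly distributed random values
a = tf.Variable(tf.random_uniform([1])) # one random number between 0 and 1
b = tf.Variable(tf.random_uniform([1], 0, 10)) # one random number between 0 and 10

# Create a session
sess = tf.Session()

# Variables must be initialized before they are used
sess.run(tf.global_variables_initializer())
print('a=', sess.run(a))
print('b=', sess.run(b))

# Normally distributed random numbers
norm = tf.random_normal([2, 3], mean=-1, stddev=4)
print(sess.run(norm))

# Shuffle randomly (along the first dimension) with random_shuffle()
c = tf.constant([[1,2],[3,4],[5,6]])
shuff = tf.random_shuffle(c)
print(sess.run(shuff))

# Uniformly distributed random numbers
unif = tf.random_uniform([2,3], minval=0, maxval=3)
print(sess.run(unif))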

placeholder

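A placeholder is used when only the data type is declared up front and the actual values are supplied through a dictionary at run time. Think of it like format(): the slot is defined first and filled in later.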

import tensorflow as tf

# A placeholder declares only the data type; the values are supplied
# through the feed_dict dictionary at run time.
# Usually an integer or float type is used; other types are rare.
node1 = tf.placeholder(tf.float32) # float32 placeholder (shape left unspecified)
node2 = tf.placeholder(tf.float32) # float32 placeholder (shape left unspecified)
add = node1 + node2
mul = node1 * node2
sess = tf.Session()

# Run the operations
# Functions such as tf.add and tf.multiply can be used instead of the operators.
print(sess.run(add, feed_dict={node1:3, node2:4.0}))
print(sess.run(mul, feed_dict={node1:3, node2:4.0}))

import tensorflow as tf
# Define a placeholder
a = tf.placeholder(tf.int32, [3]) # array of 3 integers
# Define an operation that doubles every value in the array
b = tf.constant(2)
op = a * b
# Start a session
sess = tf.Session()
# Feed values into the placeholder and run
r1 = sess.run(op, feed_dict={ a: [1, 2, 4] })
r2 = sess.run(op, feed_dict={ a: [10, 20, 30] })
print(r1)
print(r2)

A Variable must be initialized with tf.global_variables_initializer(),

whereas a placeholder does not need to be initialized.

import tensorflow as tf
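# Define a placeholder
a = tf.placeholder(tf.int32, [None]) # the array size is given as None
# Define an operation that multiplies every value in the array by 10
# None means the size is not restricted.
b = tf.constant(10)
op = a * b
# Start a session
sess = tf.Session()
# Feed values into the placeholder and run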
r1 = sess.run(op, feed_dict={a: [1,2,3,4,5]})
r2 = sess.run(op, feed_dict={a: [10,30]})
print(r1)
print(r2)

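Rank: the number of dimensions of a tensor.

Shape: the size of the tensor along each dimension (for a matrix, the number of rows and columns).

Type: the data type of the values stored in the tensor.

A tensor is a multidimensional array.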
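
As a rough illustration of rank, shape, and type, the sketch below computes them for ordinary nested Python lists. The helper names `shape_of` and `rank_of` are hypothetical (they are not TensorFlow functions), and the code assumes regular, non-empty nesting.

```python
def shape_of(nested):
    """Return the size along each dimension of a regular, non-empty nested list."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0]  # descend one level
    return tuple(shape)

def rank_of(nested):
    """Rank is simply the number of dimensions."""
    return len(shape_of(nested))

scalar = 7
vector = [1, 2, 3]
matrix = [[1, 2], [3, 4], [5, 6]]

for name, value in [("scalar", scalar), ("vector", vector), ("matrix", matrix)]:
    print(name, "rank:", rank_of(value), "shape:", shape_of(value))

# Type: the data type of the stored values
print("type:", type(matrix[0][0]).__name__)
```

The 3-row, 2-column matrix has rank 2 and shape (3, 2), the same constant `c` that was shuffled in the random-number example.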

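Linear Regression

1. Regression analysis

2. LINEAR REGRESSION

3. HYPOTHESIS: it is called a hypothesis because it is an assumption made in order to find the answer, the lowest point (minimum cost).

H(x) = Wx + b

This is the hypothesis when there is a single independent variable.

W: slope (mathematics), weight (machine learning)

b: intercept, bias

H: dependent variable, hypothesis

4. COST (error) -> minimized with gradient descent

5. Cost Function (error function)

6. GRADIENT DESCENT ALGORITHM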

Linear Regression

Least squares: choose the line that minimizes the error.

b = mean(y) - ( mean(x) * slope a ) = 90.5 - ( 5 * 2.3 ) = 79
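
The numbers in this formula can be reproduced with a short plain-Python sketch. It assumes the study-hours data used in the TensorFlow example later in this post (x = 2, 4, 6, 8 and y = 81, 93, 91, 97) and the standard least-squares slope formula, which the notes above do not spell out.

```python
x_data = [2, 4, 6, 8]      # hours studied
y_data = [81, 93, 91, 97]  # scores

mean_x = sum(x_data) / len(x_data)
mean_y = sum(y_data) / len(y_data)

# Slope: sum of (x - mean_x)(y - mean_y) divided by sum of (x - mean_x)^2
numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_data, y_data))
denominator = sum((x - mean_x) ** 2 for x in x_data)
a = numerator / denominator

# Intercept: mean(y) - mean(x) * a
b = mean_y - mean_x * a

print("mean(x) =", mean_x, "mean(y) =", mean_y)
print("slope a =", round(a, 4), "intercept b =", round(b, 4))
```

This gives a slope of about 2.3 and an intercept of about 79, matching the worked line above.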

Root Mean Squared Error (RMSE)

error = actual value - predicted value
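
As a small worked example, the sketch below computes the errors, the MSE, and the RMSE for the least-squares line y = 2.3x + 79 on the study-hours data used elsewhere in this post. It is plain Python for illustration only, not the TensorFlow code.

```python
import math

x_data = [2, 4, 6, 8]
y_data = [81, 93, 91, 97]
a, b = 2.3, 79.0  # slope and intercept from the least-squares calculation

predictions = [a * x + b for x in x_data]
errors = [y - p for y, p in zip(y_data, predictions)]  # actual - predicted

mse = sum(e ** 2 for e in errors) / len(errors)
rmse = math.sqrt(mse)

print("errors:", [round(e, 2) for e in errors])
print("MSE  =", round(mse, 4))
print("RMSE =", round(rmse, 4))
```

The MSE comes out at about 8.3, the same value the gradient descent run reports as its cost once it has converged.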

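The instantaneous rate of change of the error at a point is called the gradient (slope); it is obtained by differentiating the error.

As the error approaches its minimum, the gradient shrinks as well.

Learning proceeds by reducing the error.

This method is called gradient descent.

The ideal error is 0, but in practice training looks for values that bring it close to 0.

Training continues until the error no longer decreases.

The gradient descent algorithm multiplies the gradient by the learning rate to determine the next point.

Learning rate too large: the values overshoot erratically and fail to converge to the minimum.

Learning rate too small: training takes a very long time and may not reach the minimum.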
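
The effect of the learning rate can be seen on a toy problem. The sketch below is an assumption-laden illustration: it minimizes the made-up cost w² (gradient 2w) rather than any cost from this post, and the three learning rates are chosen only to show the three behaviors.

```python
def run_gradient_descent(learning_rate, steps=50, start=1.0):
    """Minimize cost(w) = w**2 by repeatedly stepping against the gradient 2*w."""
    w = start
    for _ in range(steps):
        gradient = 2.0 * w
        w = w - learning_rate * gradient  # next point = current - rate * gradient
    return w

too_large = run_gradient_descent(1.1)     # every step overshoots and grows
good = run_gradient_descent(0.1)          # moves steadily toward the minimum at 0
too_small = run_gradient_descent(0.0001)  # barely moves in 50 steps

print("too large:", too_large)
print("good:     ", good)
print("too small:", too_small)
```

With the large rate the value grows in magnitude at every step; with the tiny rate it is still close to its starting point after 50 steps.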

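Ultimately, what we want is to minimize the prediction error; in machine learning, the function that measures this error is called the cost function.

Gradient descent finds the w (slope, weight) at which the cost function is smallest.

First-degree (linear) function

Both the weight and the bias must be found.

They are computed so that the error is minimized.

It is called gradient descent because learning moves in the direction in which the slope goes down.

Reducing the error as the value on the y-axis (the cost) goes down is what is called gradient descent.

optimizer.minimize() -> trains so that the cost is minimized

The error depends on the learning rate, the training data, and so on.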
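
To show what an optimizer does when it minimizes the cost, here is a hand-written gradient descent loop in plain Python. It is a sketch, not TensorFlow's implementation: the starting values, learning rate of 0.02, and step count are assumptions chosen so that this small study-hours data set converges.

```python
x_data = [2, 4, 6, 8]      # hours studied
y_data = [81, 93, 91, 97]  # scores
n = len(x_data)

a, b = 0.0, 0.0        # assumed starting slope and intercept
learning_rate = 0.02   # assumed value; much larger rates diverge on this data

def cost(a, b):
    """Mean squared error of the line y = a*x + b."""
    return sum((a * x + b - y) ** 2 for x, y in zip(x_data, y_data)) / n

for step in range(5001):
    residuals = [a * x + b - y for x, y in zip(x_data, y_data)]
    # Partial derivatives of the mean squared error with respect to a and b
    grad_a = 2.0 / n * sum(r * x for r, x in zip(residuals, x_data))
    grad_b = 2.0 / n * sum(residuals)
    # Step against the gradient, scaled by the learning rate
    a -= learning_rate * grad_a
    b -= learning_rate * grad_b
    if step % 1000 == 0:
        print("step:", step, "cost:", round(cost(a, b), 4),
              "a:", round(a, 4), "b:", round(b, 4))

final_cost = cost(a, b)
```

The loop settles near a = 2.3 and b = 79 with a cost of about 8.3, the same answer as the least-squares formula.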

import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

# Hours studied (x_data): 2 4 6 8
# Scores (y_data): 81 93 91 97
x_data = [2, 4, 6, 8] # hours studied
y_data = [81,93,91,97] # scores
# Choose the slope a and the y-intercept b at random.
# The slope starts in the range 0 to 10 and the y-intercept in the range 0 to 100.
# tf.random_uniform([1], minval, maxval): generates one uniformly distributed random number
a = tf.Variable(tf.random_uniform([1], 0, 10, dtype = tf.float64, seed = 0)) # slope
b = tf.Variable(tf.random_uniform([1], 0, 100, dtype = tf.float64, seed = 0)) # intercept
# Set up the linear equation for y (y = ax + b)
y = a * x_data + b
# Compute the cost (mean squared error) in TensorFlow
cost = tf.reduce_mean(tf.square( y - y_data ))

# Learning rate
#learning_rate = 0.1 #step: 2000, cost = nan, slope a = nan, y-intercept b = nan (training failed)
#learning_rate = 0.01 #step: 2000, cost = 8.3000, slope a = 2.2998, y-intercept b = 79.0011
# The learning rate has to be lowered when the error does not decrease.
learning_rate = 0.0001
# If learning_rate is too large the values diverge; if it is too small there are problems as well.
# It does not reach 3 because there is little data
# and because training was not continued any further.

# Find the values that minimize the cost: gradient descent
# An algorithm that searches for the point where the error is smallest
gradient_decent = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
# Using the Adam optimizer handles this somewhat better.
# An error (cost) of about 8.3 remains.
# Training with TensorFlow
sess = tf.Session()
# Initialize variables
sess.run(tf.global_variables_initializer())
# Runs 2001 times (step 0 is included)
for step in range(2001):
    sess.run(gradient_decent)
    if step % 100 == 0: # print the result every 100 steps
        print("step: %d, cost = %.4f, slope a = %.4f, y-intercept b = %.4f"
        % (step, sess.run(cost), sess.run(a)[0], sess.run(b)[0]))

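Depending on the learning rate, training may or may not go well.

The hypothesis, that is, the way the prediction is computed, is a kind of assumption, and it differs from problem to problem.

In TensorFlow the hypothesis changes with the type of problem:
# Regression on the independent variable: y = a * x_data + b
# Binary classification: sigmoid function
# Multiclass classification: softmax

The cost function also differs depending on whether the task is regression or classification, and whether the classification is binary or multiclass.

The goal is to keep reducing the gradient so that the cost approaches 0.

It gets closer and closer to 0.

The printed output shows these values as training proceeds.

learning_rate sets the learning rate; with a suitable value the cost can be minimized.

The learning rate is tuned so that the error becomes as small as possible.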

import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

# input data
x_train = [ 1, 2, 3]
y_train = [10, 20, 30]
# tf.random_normal([1]): generates one normally distributed random number
W = tf.Variable(tf.random_normal([1]), name='weight') # slope, weight
b = tf.Variable(tf.random_normal([1]), name='bias') # intercept, bias
# Our hypothesis XW+b
hypothesis = x_train * W + b
# cost/loss function
# tf.square squares each value
# tf.reduce_mean takes the mean
cost = tf.reduce_mean(tf.square(hypothesis - y_train))
# Launch the graph in a session.
sess = tf.Session()

# Initializes global variables in the graph.
sess.run(tf.global_variables_initializer())
# Minimize
# GradientDescentOptimizer implements gradient descent
# The gradient is the derivative of the cost with respect to the weights
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.001)
# minimize() returns the operation that performs one update step toward a lower cost
train = optimizer.minimize(cost)
# Fit the line
# Training proceeds by reducing the error.
for step in range(8001): # steps 0 to 8000
    sess.run(train)
    if step % 100 == 0:
        print(step, 'cost=',sess.run(cost), 'weight=',sess.run(W), 'bias=',sess.run(b))
# For this data (y = 10x) the best fit is W = 10, b = 0
# Train until the cost stops decreasing.
# Reaching 0 is the ideal.

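(Linear regression implemented with placeholders)

x and y are not fixed values; they are supplied through placeholders.

The learning rate is given when the optimizer is created; the actual training is run through the session.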

import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

# Try to find values for W and b to compute y_data = W * x_data + b
# We know that W should be 1 and b should be 0
W = tf.Variable(tf.random_normal([1]), name='weight') # 기울기
b = tf.Variable(tf.random_normal([1]), name='bias') # 절편
# Now we can use X and Y in place of x_data and y_data
X = tf.placeholder(tf.float32, shape=[None]) # define placeholders
Y = tf.placeholder(tf.float32, shape=[None]) # accepts any number of values
# Our hypothesis XW+b
# Simple regression analysis
hypothesis = X * W + b
# cost/loss function
cost = tf.reduce_mean(tf.square(hypothesis - Y))
# Minimize
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.001) # changed from 0.01 to 0.001
train = optimizer.minimize(cost)

# Launch the graph in a session.
sess = tf.Session()
# Initializes global variables in the graph.
sess.run(tf.global_variables_initializer()) # initialize all variables (needed for W and b)
#Fit the line
for step in range(2001): # steps 0 to 2000
    cost_val, W_val, b_val, _ = sess.run([cost, W, b, train],feed_dict={X: [1, 2, 3], Y: [1, 2, 3]})
    if step % 20 == 0:
        # print(step, cost_val, W_val, b_val)
        print(step, 'cost=',cost_val, 'weight=',W_val, 'bias=',b_val)
# Learns best fit W:[ 1.], b:[ 0]

Multi-Variable Linear Regression

y = a1 x1 + a2 x2 + b

import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

# Data values for x1, x2, and y
x1 = [2, 4, 6, 8] # hours studied
x2 = [0, 4, 2, 3] # number of tutoring sessions
y_data = [81,93,91,97] # scores
# Choose the slopes a1, a2 and the y-intercept b at random.
# The slopes start in the range 0 to 10 and the y-intercept in the range 0 to 100.
# seed=0 makes the same random numbers come out on every run.
a1 = tf.Variable(tf.random_uniform([1], 0, 10, dtype=tf.float64, seed=0))
a2 = tf.Variable(tf.random_uniform([1], 0, 10, dtype=tf.float64, seed=0))
b = tf.Variable(tf.random_uniform([1], 0, 100, dtype=tf.float64, seed=0))
# The new equation
y = a1 * x1 + a2 * x2 + b

# RMSE (cost function) in TensorFlow
# y_data: the actual data
# tf.reduce_mean: mean
# tf.square: square
# tf.sqrt: square root
rmse = tf.sqrt(tf.reduce_mean(tf.square( y - y_data )))
# Learning rate
learning_rate = 0.1
# Find the values that minimize the RMSE (cost) with gradient descent
gradient_decent = tf.train.GradientDescentOptimizer(learning_rate).minimize(rmse)
# Training happens here
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer()) # initialize all variables
    for step in range(2001): # steps 0 to 2000
        sess.run(gradient_decent)
        if step % 100 == 0:
            print("Epoch: %d, RMSE = %.4f, slope a1 = %.4f, slope a2 = %.4f, y-intercept b = %.4f"
            %(step, sess.run(rmse), sess.run(a1)[0], sess.run(a2)[0], sess.run(b)[0]))

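Taking the square root (RMSE instead of MSE) does not change which parameters give the minimum, so it is optional; it only makes the reported cost value smaller (whenever the MSE is greater than 1).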
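
A quick way to check this is to scan a few candidate slopes and see that MSE and RMSE pick the same one. The sketch below is plain Python on the single-variable study-hours data, with the intercept fixed at 79 and a hypothetical grid of slopes from 1.0 to 3.5; these choices are assumptions made only to keep the example small.

```python
import math

x_data = [2, 4, 6, 8]
y_data = [81, 93, 91, 97]
b = 79.0  # intercept held fixed for this illustration

def mse(a):
    """Mean squared error of the line y = a*x + b."""
    return sum((a * x + b - y) ** 2 for x, y in zip(x_data, y_data)) / len(x_data)

def rmse(a):
    """Root mean squared error: the square root of the MSE."""
    return math.sqrt(mse(a))

candidates = [i / 10 for i in range(10, 36)]  # slopes 1.0, 1.1, ..., 3.5

best_by_mse = min(candidates, key=mse)
best_by_rmse = min(candidates, key=rmse)

print("best slope by MSE: ", best_by_mse, "cost:", round(mse(best_by_mse), 4))
print("best slope by RMSE:", best_by_rmse, "cost:", round(rmse(best_by_rmse), 4))
```

Both criteria choose the same slope because the square root is an increasing function; only the size of the reported cost differs.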

Regression analysis

import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

# Training data
x1_data = [73., 93., 89., 96., 73.] # quiz1
x2_data = [80., 88., 91., 98., 66.] # quiz2
x3_data = [75., 93., 90., 100., 70.] # midterm
y_data = [152., 185., 180., 196., 142.] # final
# placeholders for a tensor that will be always fed.
x1 = tf.placeholder(tf.float32) # define placeholders
x2 = tf.placeholder(tf.float32)
x3 = tf.placeholder(tf.float32)
Y = tf.placeholder(tf.float32)
w1 = tf.Variable(tf.random_normal([1]), name='weight1')
w2 = tf.Variable(tf.random_normal([1]), name='weight2')
w3 = tf.Variable(tf.random_normal([1]), name='weight3')
b = tf.Variable(tf.random_normal([1]), name='bias')
hypothesis = x1 * w1 + x2 * w2 + x3 * w3 + b

# cost/loss function
cost = tf.reduce_mean(tf.square(hypothesis - Y))
# Minimize. Need a very small learning rate for this data set
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.00003)
train = optimizer.minimize(cost)
# Launch the graph in a session.
sess = tf.Session() # create the session
# Initializes global variables in the graph.
sess.run(tf.global_variables_initializer()) # initialize all variables
for step in range(10001): # steps 0 to 10000
    cost_val, hy_val, _= sess.run([cost, hypothesis, train],
    feed_dict={x1: x1_data, x2: x2_data, x3: x3_data, Y: y_data}) # feed_dict is a dictionary
    if step % 100 == 0:
        print('step=', step, "Cost: ", cost_val, "Prediction:", hy_val)

There is no fixed value for the learning rate.

If the cost is too large, the learning rate has to be adjusted.

 Logistic Regression

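Logistic Regression is one of the representative classification algorithms.

Examples of applying Logistic Regression:

Spam Detection: Spam(1) or Ham(0) -> binary classification

Facebook feed: show(1) or hide(0)

Pass or fail by hours studied: pass(1) or fail(0) -> binary classification

Binary classification vs. multiclass classification

If the value matches it is 1, otherwise 0.

Hypothesis -> sigmoid; the cost function is also slightly different.

In binary classification, passing the value through the sigmoid function gives a value between 0 and 1;

if it is greater than 0.5 the result is true, otherwise false.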
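
The thresholding rule can be sketched in a few lines of plain Python. The helper name `predict` and the sample inputs are hypothetical; only the sigmoid formula and the 0.5 cutoff come from the notes above.

```python
import math

def sigmoid(z):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict(z, threshold=0.5):
    """True if the sigmoid output is greater than the threshold, otherwise False."""
    return sigmoid(z) > threshold

for z in [-4.0, -1.0, 0.0, 1.0, 4.0]:
    print("z =", z, "sigmoid =", round(sigmoid(z), 4), "prediction =", predict(z))
```

Note that an input of exactly 0 gives 0.5, which is not greater than the threshold, so it is classified as false under this rule.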

import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

# Training data
x_data = [[1, 2], # [hours studied, number of tutoring sessions]
          [2, 3],
          [3, 1],
          [4, 3],
          [5, 3],
          [6, 2]]
y_data = [[0], # 1: pass, 0: fail
          [0],
          [0],
          [1],
          [1],
          [1]]
# placeholders for a tensor that will be always fed.
X = tf.placeholder(tf.float32, shape=[None, 2])
Y = tf.placeholder(tf.float32, shape=[None, 1])

# Generate normally distributed random values
# with a 2x1 structure (2 rows, 1 column)
W = tf.Variable(tf.random_normal([2, 1]), name='weight')
b = tf.Variable(tf.random_normal([1]), name='bias')

# Hypothesis (model) using sigmoid: tf.div(1., 1. + tf.exp(-(tf.matmul(X, W) + b)))
hypothesis = tf.sigmoid(tf.matmul(X, W) + b)
# sigmoid(X*W + b)
# The result always lies between 0 and 1.

# cost/loss function
cost = -tf.reduce_mean(Y * tf.log(hypothesis) + (1 - Y) * tf.log(1 - hypothesis))
train = tf.train.GradientDescentOptimizer(learning_rate=0.001).minimize(cost)
# Accuracy computation
# True if hypothesis>0.5 else False
# tf.cast() converts True to 1 and False to 0
predicted = tf.cast(hypothesis > 0.5, dtype=tf.float32) # predicted = 1 or 0
# Greater than 0.5 means pass, otherwise fail
# Y is the actual data, y_data
accuracy = tf.reduce_mean(tf.cast(tf.equal(predicted, Y), dtype=tf.float32)) # mean

# Launch graph
with tf.Session() as sess:
    # Initialize TensorFlow variables
    sess.run(tf.global_variables_initializer())
    for step in range(10001):  # steps 0 to 10000
        cost_val, _ = sess.run([cost, train], feed_dict={X: x_data, Y: y_data})
        if step % 200 == 0:  # print every 200 steps
            print('step =', step, 'cost =', cost_val)
        # Training ends here.

    # Accuracy report (the final result)
    h, c, a = sess.run([hypothesis, predicted, accuracy], feed_dict={X: x_data, Y: y_data})
    print('\nHypothesis: ', h, '\nPredicted (Y): ', c, '\nAccuracy: ', a)
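
The cost used in the logistic regression code above is the binary cross-entropy. As a sketch of what that expression computes, here is the same formula in plain Python, with hypothetical labels and predicted probabilities rather than the output of the model above.

```python
import math

def binary_cross_entropy(labels, probs):
    """-mean( y*log(p) + (1-y)*log(1-p) ), the same form as the TensorFlow cost."""
    total = 0.0
    for y, p in zip(labels, probs):
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(labels)

def accuracy(labels, probs, threshold=0.5):
    """Fraction of samples whose thresholded prediction equals the label."""
    predicted = [1 if p > threshold else 0 for p in probs]
    return sum(1 for y, q in zip(labels, predicted) if y == q) / len(labels)

labels = [0, 0, 1, 1]
mostly_right = [0.1, 0.2, 0.8, 0.9]  # hypothetical confident, correct predictions
mostly_wrong = [0.9, 0.8, 0.2, 0.1]  # hypothetical confident, wrong predictions

loss_right = binary_cross_entropy(labels, mostly_right)
loss_wrong = binary_cross_entropy(labels, mostly_wrong)
print("loss when right:", round(loss_right, 4), "accuracy:", accuracy(labels, mostly_right))
print("loss when wrong:", round(loss_wrong, 4), "accuracy:", accuracy(labels, mostly_wrong))
```

Confident wrong answers are penalized heavily, which is why this cost suits a sigmoid output better than the squared error does.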

Multiple logistic regression

Used when there are two or more independent variables

import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()
import numpy as np
# Set a seed so that every run prints the same result
seed = 0
np.random.seed(seed)
tf.set_random_seed(seed)
# Training data
x_data = np.array([[2, 3],[4, 3],[6, 4],[8, 6],[10, 7],[12, 8],[14, 9]]) # 7 rows, 2 columns
y_data = np.array([0, 0, 0, 1, 1, 1,1]).reshape(7, 1) # 7 rows, 1 column
# Define placeholders
X = tf.placeholder(tf.float64, shape=[None, 2]) # accepts any number of rows
Y = tf.placeholder(tf.float64, shape=[None, 1])

# Choose the slopes a and the bias b at random.
a = tf.Variable(tf.random_uniform([2,1], dtype=tf.float64)) # random values, 2 rows and 1 column
b = tf.Variable(tf.random_uniform([1], dtype=tf.float64)) # one random value
# Set up the sigmoid equation for y
y = tf.sigmoid(tf.matmul(X, a) + b)
# Function that computes the error (loss)
loss = -tf.reduce_mean(Y * tf.log(y) + (1 - Y) * tf.log(1 - y))
# Learning rate
learning_rate=0.1
# Find the values that minimize the error (cost) with gradient descent
gradient_decent = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)
predicted = tf.cast(y > 0.5, dtype=tf.float64) # tf.cast() converts True to 1 and False to 0
accuracy = tf.reduce_mean(tf.cast(tf.equal(predicted, Y), dtype=tf.float64))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(3001): # steps 0 to 3000
        a_, b_, loss_, _ = sess.run([a, b, loss, gradient_decent], feed_dict={X: x_data, Y: y_data})
        if (i + 1) % 300 == 0:
            print("step=%d, a1=%.4f, a2=%.4f, b=%.4f, loss=%.4f" % (i + 1, a_[0][0], a_[1][0], b_[0], loss_))

    # Hours studied, number of private tutoring sessions, probability of passing
    new_x = np.array([7, 6]).reshape(1, 2) # [7, 6]: hours studied and number of tutoring sessions
    new_y = sess.run(y, feed_dict={X: new_x})
    print("Hours studied: %d, private tutoring sessions: %d" % (new_x[0, 0], new_x[0, 1]))
    print("Probability of passing: %6.2f %%" % (new_y[0, 0]*100))

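If the learning rate is too large, training diverges instead of converging.

Converging means that the error settles at its minimum.

If the value is too small, training takes so long that it does not reach the minimum.

The learning rate is therefore set by tuning.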


Probabilities over multiple classes -> softmax function

import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

# Set a seed so that the result is the same on every run
tf.set_random_seed(777)
x_data = [[1, 2, 1, 1], # each sample has 4 measurements (features)
          [2, 1, 3, 2],
          [3, 1, 3, 4],
          [4, 1, 5, 5],
          [1, 7, 5, 5],
          [1, 2, 5, 6],
          [1, 6, 6, 6],
          [1, 7, 7, 7]]
y_data = [[0, 0, 1], # each sample is classified into one of 3 classes
          [0, 0, 1],
          [0, 0, 1],
          [0, 1, 0],
          [0, 1, 0],
          [0, 1, 0],
          [1, 0, 0],
          [1, 0, 0]]
# Given certain features, the sample is classified as a certain class.

X = tf.placeholder("float", [None, 4]) # 4 columns
Y = tf.placeholder("float", [None, 3]) # 3 columns
nb_classes = 3
# softmax input size: 4, output size: nb_classes=3
W = tf.Variable(tf.random_normal([4, nb_classes]), name='weight')
b = tf.Variable(tf.random_normal([nb_classes]), name='bias')
# tf.nn.softmax computes softmax activations
# softmax = exp(logits) / reduce_sum(exp(logits), dim)
hypothesis = tf.nn.softmax(tf.matmul(X, W) + b)

# Cross entropy cost/loss (error function)
cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), axis=1))

# Train with gradient descent so that the cost is minimized.
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)
# Launch graph
with tf.Session() as sess: # with the with-statement the session does not have to be closed manually
    sess.run(tf.global_variables_initializer()) # assign initial values
    for step in range(2001): # steps 0 to 2000
        sess.run(optimizer, feed_dict={X: x_data, Y: y_data})
        if step % 200 == 0:
            print(step, sess.run(cost, feed_dict={X: x_data, Y: y_data}))
    print('--------------')

# Train with gradient descent so that the cost is minimized.
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)
# Launch graph
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(2001): # steps 0 to 2000
        sess.run(optimizer, feed_dict={X: x_data, Y: y_data})
        if step % 200 == 0:
            print(step, sess.run(cost, feed_dict={X: x_data, Y: y_data}))
    print('--------------')
    # Testing: pick the most probable class
    a = sess.run(hypothesis, feed_dict={X: [[1, 11, 7, 9]]})
    print(a, sess.run(tf.argmax(a, 1)))
    print('--------------')

    b = sess.run(hypothesis, feed_dict={X: [[1, 3, 4, 3]]})
    print(b, sess.run(tf.argmax(b, 1)))  # argmax returns the index of the largest probability (the predicted class)
    print('--------------')
    c = sess.run(hypothesis, feed_dict={X: [[1, 1, 0, 1]]})
    print(c, sess.run(tf.argmax(c, 1)))
    print('--------------')
    # Several samples can be passed at once.
    all_probs = sess.run(hypothesis, feed_dict={X: [[1, 11, 7, 9], [1, 3, 4, 3], [1, 1, 0, 1]]})
    print(all_probs, sess.run(tf.argmax(all_probs, 1)))
# softmax expresses the result as probabilities; the class with the highest probability is chosen.
# The samples are classified into 3 classes.
# argmax picks the single class with the highest probability.
# There are 4 features but 3 classes.

Animal Classification

The last column holds a number from 0 to 6.

import numpy as np
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()


tf.set_random_seed(777) # for reproducibility
# Predicting animal type based on various features
xy = np.loadtxt('data-zoo.csv', delimiter=',', dtype=np.float32)
x_data = xy[:, 0:-1] # all rows, columns 1 to 16 (animal features)
y_data = xy[:, [-1]] # all rows, column 17 (animal type: 0 to 6)
print(x_data.shape, y_data.shape)
nb_classes = 7 # 0 ~ 6
X = tf.placeholder(tf.float32, [None, 16])
Y = tf.placeholder(tf.int32, [None, 1]) # 0 ~ 6
Y_one_hot = tf.one_hot(Y, nb_classes) # one hot
print("one_hot", Y_one_hot)
Y_one_hot = tf.reshape(Y_one_hot, [-1, nb_classes])
print("reshape", Y_one_hot)

W = tf.Variable(tf.random_normal([16, nb_classes]), name='weight')
b = tf.Variable(tf.random_normal([nb_classes]), name='bias')
# tf.nn.softmax computes softmax activations
# softmax = exp(logits) / reduce_sum(exp(logits), dim)
logits = tf.matmul(X, W) + b
hypothesis = tf.nn.softmax(logits)
# Cross entropy cost/loss
cost_i = tf.nn.softmax_cross_entropy_with_logits(logits=logits,labels=Y_one_hot)
cost = tf.reduce_mean(cost_i)
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)
prediction = tf.argmax(hypothesis, 1) # index of the largest value in hypothesis
correct_prediction = tf.equal(prediction, tf.argmax(Y_one_hot, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32)) # cast converts to float

# Launch graph
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(2000):
        sess.run(optimizer, feed_dict={X: x_data, Y: y_data})
        if step % 100 == 0:
            loss, acc = sess.run([cost, accuracy], feed_dict={X: x_data, Y: y_data})
            print("Step: {:5}\tLoss: {:.3f}\tAcc: {:.2%}".format(step, loss, acc))
    # Let's see if we can predict
    pred = sess.run(prediction, feed_dict={X: x_data})
    # y_data: (N,1) = flatten => (N, ) matches pred.shape
    for p, y in zip(pred, y_data.flatten()):
        print("[{}] Prediction: {} True Y: {}".format(p == int(y), p, int(y)))
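
The zoo example above relies on two conversions: a class number is turned into a one-hot vector, and argmax turns a vector back into a class number. The plain-Python sketch below illustrates both; `one_hot` and `argmax` here are hypothetical helpers that mimic the idea, not the TensorFlow functions themselves.

```python
def one_hot(label, num_classes):
    """Vector of zeros with a single 1.0 at the position of the class label."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

def argmax(values):
    """Index of the largest value (the first one if there are ties)."""
    return max(range(len(values)), key=lambda i: values[i])

nb_classes = 7  # animal types 0 to 6

encoded = one_hot(2, nb_classes)
print(encoded)          # class 2 as a one-hot vector
print(argmax(encoded))  # back to the class number

# argmax applied to hypothetical class probabilities picks the predicted class
probabilities = [0.05, 0.10, 0.60, 0.05, 0.10, 0.05, 0.05]
print(argmax(probabilities))
```

So one-hot encoding goes from a label to a vector, while argmax goes the other way, from a vector of probabilities to a label.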

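The output is handled in the form of probabilities.

Probability values are returned.

The softmax result gives one probability per class; with 3 classes, 3 probabilities come out.

The probabilities sum to 1.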
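
A small plain-Python sketch makes the last point visible. The three input scores are hypothetical, and subtracting the maximum before exponentiating is a common numerical-stability habit added here, not something taken from the code above.

```python
import math

def softmax(logits):
    """exp(logit) / sum(exp(logits)), computed in a numerically stable way."""
    highest = max(logits)
    exps = [math.exp(v - highest) for v in logits]  # shifting does not change the result
    total = sum(exps)
    return [e / total for e in exps]

scores = [2.0, 1.0, 0.1]  # hypothetical raw scores for 3 classes
probs = softmax(scores)

print("probabilities:", [round(p, 4) for p in probs])
print("sum:", round(sum(probs), 10))
print("most likely class:", probs.index(max(probs)))
```

The largest score gets the largest probability, and the three values always add up to 1.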

With ReLU, a value is passed on to the next layer only when it is above a certain level (0).

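Error (loss) functions

Mean squared family

mean_squared_error: mean squared error, used for general regression

Cross-entropy family

Binary classification: binary_crossentropy

Multiclass classification: categorical_crossentropy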
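
For the multiclass case, the sketch below shows in plain Python what a categorical cross-entropy computes. It is an illustration of the formula (the same form as the softmax cost earlier in this post), not the library implementation behind the name `categorical_crossentropy`, and the labels and probabilities are made up.

```python
import math

def categorical_cross_entropy(one_hot_labels, probs):
    """Mean over samples of -sum(y * log(p)) across the classes."""
    total = 0.0
    for label_row, prob_row in zip(one_hot_labels, probs):
        total += -sum(y * math.log(p) for y, p in zip(label_row, prob_row))
    return total / len(one_hot_labels)

labels = [[0, 0, 1],
          [0, 1, 0]]
good_probs = [[0.1, 0.2, 0.7],   # hypothetical softmax outputs, mostly right
              [0.2, 0.6, 0.2]]
poor_probs = [[0.7, 0.2, 0.1],   # hypothetical softmax outputs, mostly wrong
              [0.6, 0.2, 0.2]]

loss_good = categorical_cross_entropy(labels, good_probs)
loss_poor = categorical_cross_entropy(labels, poor_probs)
print("good predictions:", round(loss_good, 4))
print("poor predictions:", round(loss_poor, 4))
```

Only the probability assigned to the true class contributes to the loss, since every other label entry is 0.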

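Optimizer

sgd: stochastic gradient descent

adam: a later variant derived from it, with very good learning performance.

relu: the most widely used.

If x < 0 the output is 0; if x > 0 the output is x itself.

Differentiating over and over causes the vanishing gradient problem. The ReLU function is what was introduced to address it.

The output has a value only when x is greater than 0. In neural network theory as well, a signal is passed on only when particular data comes in.

Learning works by following derivatives, and as the gradient gets smaller and smaller it is eventually lost; the ReLU function is what addresses this.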
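
The vanishing gradient can be illustrated with a deliberately simplified calculation in plain Python. The sketch assumes a chain of 10 layers that all see the same positive input and ignores the weights entirely, so it shows only the contribution of the activation function's derivative.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_grad(z):
    """Derivative of the sigmoid; it never exceeds 0.25."""
    s = sigmoid(z)
    return s * (1.0 - s)

def relu(z):
    """0 for negative inputs, the input itself otherwise."""
    return max(0.0, z)

def relu_grad(z):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if z > 0 else 0.0

layers = 10
z = 1.0  # hypothetical input seen by every layer

# Backpropagation multiplies one derivative per layer
sigmoid_chain = sigmoid_grad(z) ** layers
relu_chain = relu_grad(z) ** layers

print("relu(-3.0) =", relu(-3.0), "relu(2.5) =", relu(2.5))
print("sigmoid gradient through", layers, "layers:", sigmoid_chain)
print("relu gradient through", layers, "layers:   ", relu_chain)
```

With the sigmoid the product shrinks toward 0 as layers are added, while with ReLU it stays at 1 for positive inputs, which is the reason it is preferred in deeper networks.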
