1. Reminder

Logistic Regression은 Binary Classification이다. 0 또는 1로 분류함!
Sigmoid 함수를 이용해 0과 1에 근사하게 한다.

2. Import

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

torch.manual_seed(1) # 결과 재현을 위한 시드 설정

3. Training Data

x_data(독립변수) : 6 * 2
y_data(종속변수) : 6 * 1

x_data = [[1, 2], [2, 3], [3, 1], [4, 3], [5, 3], [6, 2]]
y_data = [[0], [0], [0], [1], [1], [1]]

x_train = torch.FloatTensor(x_data) # FloatTensor화!
y_train = torch.FloatTensor(y_data)

print(x_train.shape)
print(y_train.shape)

4. Computing the Hypothesis

print('e^1 equals: ', torch.exp(torch.FloatTensor([1]))) #torch.exp를 통해 exponential을 이용 가능!

W = torch.zeros((2, 1), requires_grad = True) # W 초기화, 학습 가능 설정
b = torch.zeros(1, requires_grad = True) # b 초기화, 학습 가능 설정

hypothesis = 1 / (1 + torch.exp(-(x_train.matmul(W) + b))) # 모델 정의. matmul을 이용해 x 데이터만큼 weight 곱함.
# hypothesis = torch.sigmoid(x_train.matmul(W) + b) # 위 처럼 수식을 직접 입력하는 것 대신에, Pytorch에서 제공하는 Sigmoid 함수로 쉽게 모델을 정의할 수도 있다!

print(hypothesis)
print(hypothesis.shape)

5. Computing the Cost Function

losses = -(y_train * torch.log(hypothesis) + (1 - y_train) * torch.log(1 - hypothesis))
print(losses) # 전체 샘플에 대한 loss

cost = losses.mean() # loss 평균
print(cost)
# F.binary_cross_entropy(hypothesis, y_train) # 위 처럼 수식을 직접 입력하는 것 대신에, PyTorch에서 제공하는 cross entropy 함수로 쉽게 모델을 정의할 수도 있다!

6. Whole Training Procedure

W = torch.zeros((2, 1), requires_grad = True) # W 초기화 및 학습 가능 설정
b = torch.zeros(1, requires_grad = True) # b 초기화 및 학습 가능 설정

optimizer = optim.SGD((W, b), lr = 1) # SGD로 Gradient descent. W, b를 학습시킨다고 선언, lr을 설정함.

nb_ epochs = 1000 # 1000회 돌림!

for epoch in range(nb_epochs + 1):

 hypothesis = torch.sigmoid(x_train.matmul(W) + b) # 모델
 cost = F.binary_cross_entropy(hypothesis, y_train) # 코스트
 
 # Gradient descent
 optimizer.zero_grad() # 기존 Grad 초기화 (이걸 하지 않으면 기존 grad에 더하게 됨)
 cost.backward() # cost를 minimize하는 방향으로 backpropagate를 하게 된다.
 optimizer.step() # Update!
 
 if epoch % 100 == 0:
  print('Epoch {:%d}/{} Cost: {:.6f"'.format(epoch, nb_epochs, cost.item()))

7. Evaluation

내가 만든 모델의 정확도를 평가해 보자.
Training set으로 훈련 시켰으니, Test set으로 평가를 해보자.
예측값과 정답값들을 비교하여, 평균을 내서 정확도를 구할 수 있다. (수작업 방법)

hypothesis = torch.sigmoid(x_test.matmul(W) + b) # 모델에 test 셋을 넣어서 hypothesis를 예측을 한다.
print(hypothesis[:5]) # 모델을 통한 예측값 5개!

prediction = hypothesis >= torch.FloatTensor([0.5]) # 0.5보다 크면 1로, 0.5보다 작으면 0으로 만들어서 예측값을 binary하게 만든다.
print(prediction[:5]) # binary 한 예측값 5개!

correct_prediction = prediction.float() == y_train # 예측값과 정답값이 같은지를 판단해 correct prediction에 넣는다.
print(correct_prediction[:5]) # correct prediction 확인!

8. Higher Implementation

우리는 앞에서 Binary classifier의 hypothesis 모델을 직접 만들어서 이용했다.
하지만 PyTorch의 클래스를 상속받아서 쉽게 모델을 정의할 수 있다.

class BinaryClassifier(nn.Module): # nn.Module 클래스 상속받아서 BinaryClassifier 만듦.
 def __init__(self):
  super().__init__()
  self.linear = nn.Linear(8, 1) # 8개의 element 받아서 0인지 1인지 예측하는 모델
  self.sigmoid = nn.Sigmoid()
  
 def forward(self, x):
  return self.sigmoid(self.linear(x)) # 모델은 self.linear을 한 후 self.sigmoid를 하는 모델임을 정의
  
model = BinaryClassifier() # 모델!

9. Higher Implementation with Class full code

optimizer = optim.SGD(model.parameters(), lr = 1) # optimizer에 model의 parameter를 넣어 훈련 가능하게 함.

nb_epochs = 100
for epoch in range(nb_epochs + 1):
 hypothesis = model(x_train) # 모델 정의
 
 cost = F.binary_cross_entropy(hypothesis, y_train) # hypothesis 예측값과 y_train 비교
 
 optimizer.zero_grad()
 cost.backward()
 optimizer.step()
 
 if epoch % 10 == 0:
  prediction = hypothesis >= torch.FloatTensor([0.5])
  correct_prediction = prediction.float() == y_train
  accuracy = correct_prediction.sum().item() / len(correct_prediction)
  print('Epoch {:4d} / {} Cost : {:.6f} Accuracy {:2.2f}'.format(epoch, nb_epochs, cost.item(), accuracy * 100,))

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

5. Logistic Regression.md

5. Logistic Regression.md

1. Reminder

2. Import

3. Training Data

4. Computing the Hypothesis

5. Computing the Cost Function

6. Whole Training Procedure

7. Evaluation

8. Higher Implementation

9. Higher Implementation with Class full code

Files

5. Logistic Regression.md

Latest commit

History

5. Logistic Regression.md

File metadata and controls

1. Reminder

2. Import

3. Training Data

4. Computing the Hypothesis

5. Computing the Cost Function

6. Whole Training Procedure

7. Evaluation

8. Higher Implementation

9. Higher Implementation with Class full code