1. Weight initialization

모델에 곱해지는 웨이트의 초기값을 잘 설정해야 한다!
웨이트를 초기화 하고 학습한 경우 학습이 더 잘 되고, 성능이 뛰어났다.

2. Weight 초기화 방법

상수로 초기화하기 (0 제외! BP 할 때 weight가 0이면 모든 gradient가 0으로 바뀜)
Restricted Boltzmann Machine(RBM)
Deep Belief Network(DBM)

3. Restricted Boltzmann Machine(RBM)

Restricted : 한 레이어 안의 노드들끼리는 연결이 없다. 다만 A 레이어의 모든 노드와 B 레이어의 모든 노드가 Fully-connecte하게 연결 되어 있다.
RBM은 입력 X를 출력 Y로 진행하고(encoding), Y를 통해 X를 복원할 수 있다.(decoding)

4. DBN과 Weight 초기화

요즘에는 사용되지 않는 방법이다! 구식 버전

(1) Pre-training 과정 : Weight 초기화 과정

(1) RBM for x : x 레이어와 hidden1 레이어에 RBM을 통해 입력 X를 출력 Y로 진행하는 encoding, Y를 통해 X를 복원하는 decoding을 학습한다.
(2) RBM for h^1 : hidden2 레이어를 더 쌓고, x 레이어와 hidden1 레이어의 weight를 고정시켜 놓는다. hidden1 레이어와 hidden2 레이어 사이에 RBM 학습을 한다.
(3) RBM for h^2 : hidden3 레이어를 더 쌓고, hidden1 레이어와 hidden2 레이어의 weight를 고정시켜 놓는다. hidden2 레이어와 hidden3 레이어 사이에 RBM 학습을 한다.
(4) 위 과정으로 출력 레이어까지 n개의 레이어 쌓음!

(2) Fine-tuning 과정 : 학습 과정

인풋 x를 통해 hidden layer들을 거쳐서 아웃풋 y를 구하고, loss를 구한 뒤 BP로 weight를 업데이트를 한다.

5. Xavier / He initialization

요즘에 사용되는 방법이다! 신식 버전
예전에는 랜덤하게 Weight를 초기화 했지만, 이 방법들은 레이어의 특성을 반영하여 Weight를 초기화 하는 방법이다.

(1) Xavier Normal Initialization

Xavier Normal Initialization 수식을 이용해서 초기화!
https://www.youtube.com/watch?v=CJB0g_i7pYk&list=PLQ28Nx3M4JrhkqBVIXg-i5_CVVoS1UzAv&index=16&t=0s

(2) Xavier Uniform Initialization

Xavier Uniform Initialization 수식을 이용해서 초기화!
https://www.youtube.com/watch?v=CJB0g_i7pYk&list=PLQ28Nx3M4JrhkqBVIXg-i5_CVVoS1UzAv&index=16&t=0s

(3) He Normal Initialization

He Normal Initialization 수식을 이용해서 초기화!
https://www.youtube.com/watch?v=CJB0g_i7pYk&list=PLQ28Nx3M4JrhkqBVIXg-i5_CVVoS1UzAv&index=16&t=0s

(4) He Uniform Initialization

He Uniform Initialization 수식을 이용해서 초기화!
https://www.youtube.com/watch?v=CJB0g_i7pYk&list=PLQ28Nx3M4JrhkqBVIXg-i5_CVVoS1UzAv&index=16&t=0s

6. Xavier의 Package Code 살펴보기 (잘 몰라도 됨)

PyTorch 공식 사이트에 있는 Xavier의 Package의 Code를 살펴보자.

def xavier_uniform_(tensor, gain=1):
 fan_in, fan_out = _calculate_fan_in_and_out(tensor) # in, out
 
 # 수식
 std = gain * math.sqrt(2.0 / (fan_in + fan_out))
 a = math.sqrt(3.0) * std
 with torch.no_grad():
   return tensor_uniform_(-a, a)ㅡ

7. MNIST - xavier 초기화 실습해보기

# 기존에 배운 것
linear1 = torch.nn.Linear(784, 256, bias = True)
linear2 = torch.nn.Linear(256, 256, bias = True)
linear3 = torch.nn.Linear(256, 10, bias = True)
relu = torch.nn.ReLU()

# 기존에서 바뀐 부분!
torch.nn.init.xavier_uniform_(linear1.weight) #linear1의 weight를 xavier uniform으로 초기화!
torch.nn.init.xavier_uniform_(linear2.weight)
torch.nn.init.xavier_uniform_(linear3.weight)

# ... 생략

# 결과 : Accuracy 98%! Weight 초기화만 normal에서 xavier로 바꿨는데 성능 상승!!

8. MNIST - xavier 초기화 딥하고 넓은 레이어에 실습해보기

# 5 MLP
linear1 = torch.nn.Linear(784, 512, bias = True)
linear2 = torch.nn.Linear(512, 512, bias = True)
linear3 = torch.nn.Linear(512, 512, bias = True)
linear4 = torch.nn.Linear(512, 512, bias = True)
linear5 = torch.nn.Linear(512, 10, bias = True)
relu = torch.nn.ReLU()

# xavier uniform 초기화!
torch.nn.init.xavier_uniform_(linear1.weight)
torch.nn.init.xavier_uniform_(linear2.weight)
torch.nn.init.xavier_uniform_(linear3.weight)
torch.nn.init.xavier_uniform_(linear4.weight)
torch.nn.init.xavier_uniform_(linear5.weight)

# ... 생략

# 결과 : Accuracy 98.1%! 이전보다 조금 더 향상!

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

9-2. Weight initialization.md

9-2. Weight initialization.md

1. Weight initialization

2. Weight 초기화 방법

3. Restricted Boltzmann Machine(RBM)

4. DBN과 Weight 초기화

(1) Pre-training 과정 : Weight 초기화 과정

(2) Fine-tuning 과정 : 학습 과정

5. Xavier / He initialization

(1) Xavier Normal Initialization

(2) Xavier Uniform Initialization

(3) He Normal Initialization

(4) He Uniform Initialization

6. Xavier의 Package Code 살펴보기 (잘 몰라도 됨)

7. MNIST - xavier 초기화 실습해보기

8. MNIST - xavier 초기화 딥하고 넓은 레이어에 실습해보기

Files

9-2. Weight initialization.md

Latest commit

History

9-2. Weight initialization.md

File metadata and controls

1. Weight initialization

2. Weight 초기화 방법

3. Restricted Boltzmann Machine(RBM)

4. DBN과 Weight 초기화

(1) Pre-training 과정 : Weight 초기화 과정

(2) Fine-tuning 과정 : 학습 과정

5. Xavier / He initialization

(1) Xavier Normal Initialization

(2) Xavier Uniform Initialization

(3) He Normal Initialization

(4) He Uniform Initialization

6. Xavier의 Package Code 살펴보기 (잘 몰라도 됨)

7. MNIST - xavier 초기화 실습해보기

8. MNIST - xavier 초기화 딥하고 넓은 레이어에 실습해보기