[Dimensionality Reduction] SVD (Singular Value Decomposition)

데이터로그😎

지지킴 2023. 9. 5. 15:56

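Singular value decomposition!
- Unlike eigendecomposition, which requires a square matrix, SVD can be applied to any m x n matrix. It is used in image processing, natural language processing, recommender systems, and more.
A = UΣV^T
- A: an m x n matrix
- Singular vectors:
  - U: an m x m orthogonal matrix whose columns are the left singular vectors (its size matches the number of rows of A)
  - V: an n x n orthogonal matrix whose columns are the right singular vectors (its size matches the number of columns of A)
  - Σ: an m x n rectangular diagonal matrix (same shape as A) whose diagonal entries, the singular values, are real numbers greater than or equal to 0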
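The shapes listed above can be checked on a small non-square matrix. This is only a minimal sketch: the 5 x 3 matrix `A` and the seed are made up for illustration and do not come from the post. Note that `np.linalg.svd` returns the singular values as a 1-D array, so the m x n matrix Σ is assembled by hand here.

```python
import numpy as np

# Hypothetical 5 x 3 matrix (an assumption for illustration, not from the post).
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))

# full_matrices=True returns the full m x m U and the n x n V^T.
U, s, VT = np.linalg.svd(A, full_matrices=True)

# Build the m x n rectangular diagonal Sigma from the 1-D array of singular values.
Sigma = np.zeros(A.shape)
Sigma[:len(s), :len(s)] = np.diag(s)

print(U.shape, Sigma.shape, VT.shape)        # shapes of U, Sigma, V^T
print(np.allclose(U @ U.T, np.eye(5)))       # is U orthogonal?
print(np.allclose(VT @ VT.T, np.eye(3)))     # is V orthogonal?
print(np.allclose(A, U @ Sigma @ VT))        # does U Sigma V^T rebuild A?
```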

SVD(Singular Value Decomposition)

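Drop the part of Σ outside the diagonal and the singular values that are 0.
Also drop the columns of U and V that correspond to the removed part of Σ, which reduces the dimensions.
To import, either of the following works:
from scipy.linalg import svd
from numpy.linalg import svd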
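The reduction described above can be sketched roughly as follows. The rank-2 matrix `B`, the seed, and the `1e-10` relative threshold for "zero" are all assumptions chosen for illustration. In floating point a singular value is rarely exactly 0, so a small tolerance is used instead.

```python
import numpy as np

# Hypothetical 4 x 3 matrix of rank 2 (product of a 4 x 2 and a 2 x 3 matrix).
rng = np.random.default_rng(1)
B = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 3))

# full_matrices=False already drops the part of Sigma outside the diagonal:
# U is 4 x 3, s has length 3, VT is 3 x 3.
U, s, VT = np.linalg.svd(B, full_matrices=False)

# Treat singular values below an assumed relative threshold as zero.
tol = 1e-10 * s.max()
r = int(np.sum(s > tol))

# Keep only the columns of U and rows of V^T that pair with nonzero singular values.
U_r, s_r, VT_r = U[:, :r], s[:r], VT[:r, :]
B_rebuilt = U_r @ np.diag(s_r) @ VT_r

print(r, U_r.shape, VT_r.shape)
print(np.allclose(B, B_rebuilt))
```

Because only zero singular values were discarded, the smaller factors still reproduce `B` up to rounding error.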

import numpy as np

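# Random 4 x 4 matrix (randn draws from the standard normal, which matches the negative values printed below)
np.random.seed(121)
a = np.random.randn(4, 4)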

>>>array([[-0.21203317, -0.28492917, -0.57389821, -0.44031017],
       [-0.33011056,  1.18369457,  1.61537293,  0.36706247],
       [-0.01411931,  0.6296418 ,  1.70964074, -1.32698736],
       [ 0.40187312, -0.19142667,  1.40382596, -1.96876855]])

Decomposing the matrix with SVD

from numpy.linalg import svd
#from scipy.linalg import svd

U, Sigma, VT = svd(a)

print("U:\n{}".format(U))
print()
print("Sigma:\n{}".format(Sigma))
print()
print("VT:\n{}".format(VT))


>>
U:
[[-0.07913928 -0.31822729  0.86653217  0.37628494]
 [ 0.38294523  0.78655287  0.12002097  0.46934262]
 [ 0.65640669  0.02243181  0.35668848 -0.66437971]
 [ 0.64515128 -0.52873697 -0.32785711  0.44353889]]

Sigma:
[3.4229581  2.02287339 0.46263157 0.07935069]

VT:
[[ 0.04100747  0.22367823  0.78643002 -0.57429458]
 [-0.20019867  0.56209596  0.37041464  0.71187191]
 [-0.77847455  0.3945136  -0.33259252 -0.3573774 ]
 [-0.5934781  -0.69164673  0.36565426  0.18895901]]

Convert Sigma (the singular values, returned as a 1-D array) into a diagonal matrix

Sigma_mat = np.diag(Sigma)

>>>
array([[3.4229581 , 0.        , 0.        , 0.        ],
       [0.        , 2.02287339, 0.        , 0.        ],
       [0.        , 0.        , 0.46263157, 0.        ],
       [0.        , 0.        , 0.        , 0.07935069]])

The original matrix equals the recombined matrix

# Original
Original matrix a:
[[-0.21203317 -0.28492917 -0.57389821 -0.44031017]
 [-0.33011056  1.18369457  1.61537293  0.36706247]
 [-0.01411931  0.6296418   1.70964074 -1.32698736]
 [ 0.40187312 -0.19142667  1.40382596 -1.96876855]]


# Recombine
a_ = U @ Sigma_mat @ VT

>>>
Recombined matrix a_: 
[[-0.21203317 -0.28492917 -0.57389821 -0.44031017]
 [-0.33011056  1.18369457  1.61537293  0.36706247]
 [-0.01411931  0.6296418   1.70964074 -1.32698736]
 [ 0.40187312 -0.19142667  1.40382596 -1.96876855]]
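Rather than comparing the two printed matrices by eye, the equality can be checked numerically. The sketch below assumes the matrix was generated with `np.random.randn` under seed 121, as the printed values suggest, and uses `np.allclose` because the product only matches up to floating-point rounding.

```python
import numpy as np

np.random.seed(121)
a = np.random.randn(4, 4)  # assumed generator; the printed matrix contains negative values

U, Sigma, VT = np.linalg.svd(a)

# Recombine the three factors; np.diag turns the 1-D Sigma into a 4 x 4 diagonal matrix.
a_ = U @ np.diag(Sigma) @ VT

print(np.allclose(a, a_))        # equal up to rounding error?
print(np.max(np.abs(a - a_)))    # largest element-wise difference
```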

Truncated SVD

Keep only the top few (largest) singular values in Σ.
To import:
- from scipy.sparse.linalg import svds

import numpy as np
from scipy.sparse.linalg import svds
from scipy.linalg import svd # same as np.linalg.svd

np.random.seed(121)
matrix = np.random.random((6,6))


>>>
[[0.11133083 0.21076757 0.23296249 0.15194456 0.83017814 0.40791941]
 [0.5557906  0.74552394 0.24849976 0.9686594  0.95268418 0.48984885]
 [0.01829731 0.85760612 0.40493829 0.62247394 0.29537149 0.92958852]
 [0.4056155  0.56730065 0.24575605 0.22573721 0.03827786 0.58098021]
 [0.82925331 0.77326256 0.94693849 0.73632338 0.67328275 0.74517176]
 [0.51161442 0.46920965 0.6439515  0.82081228 0.14548493 0.01806415]]

num_components = 4
U, Sigma, VT = svds(matrix, k=num_components)
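To see what the truncated factors look like, the sketch below compares them with a full SVD of the same matrix. It is an illustration under a few assumptions: the variable names with the `_tr` and `_full` suffixes are introduced here, and the singular values from `svds` are sorted explicitly because their order should not be relied on.

```python
import numpy as np
from scipy.linalg import svd
from scipy.sparse.linalg import svds

np.random.seed(121)
matrix = np.random.random((6, 6))

num_components = 4
U_tr, Sigma_tr, VT_tr = svds(matrix, k=num_components)

# Sort the components so the largest singular value comes first.
order = np.argsort(Sigma_tr)[::-1]
U_tr, Sigma_tr, VT_tr = U_tr[:, order], Sigma_tr[order], VT_tr[order, :]
print(U_tr.shape, Sigma_tr.shape, VT_tr.shape)

# The truncated singular values should match the largest ones from a full SVD.
_, Sigma_full, _ = svd(matrix)
print(np.allclose(Sigma_tr, Sigma_full[:num_components]))

# The recombined matrix is a rank-4 approximation, not an exact copy.
# Its Frobenius error equals the root sum of squares of the discarded singular values.
matrix_tr = U_tr @ np.diag(Sigma_tr) @ VT_tr
err = np.linalg.norm(matrix - matrix_tr)
print(err, np.sqrt(np.sum(Sigma_full[num_components:] ** 2)))
```

Unlike the full decomposition, recombining the truncated factors only approximates the original matrix; the fewer components kept, the larger the error.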

Truncated SVD in scikit-learn

from sklearn.datasets import load_iris
from sklearn.decomposition import TruncatedSVD

iris = load_iris()

# Keep the top 2 components
tsvd = TruncatedSVD(n_components=2)
tsvd.fit(iris.data)

iris_tsvd = tsvd.transform(iris.data)
iris_tsvd.shape

>>> (150, 2)  --> reduced from 4 dimensions to 2

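import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Put the two components in a DataFrame for plotting
iris_tsvd_df = pd.DataFrame(data=iris_tsvd, columns=['component_1', 'component_2'])

iris_tsvd_df['target'] = iris.target

# Scatter plot of the two components, colored by class
sns.scatterplot(
        x='component_1',
        y='component_2',
        hue='target',
        palette='muted',
        data=iris_tsvd_df)

plt.show()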
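The projection behind this kind of transform can be sketched with NumPy alone. This is not scikit-learn's implementation: the random 150 x 4 matrix `X` is a made-up stand-in for `iris.data`, and an exact SVD is used here, whereas `TruncatedSVD` uses a randomized solver by default, so its output may differ in sign or by small numerical amounts. Note that the data is not centered, which is how truncated SVD differs from PCA.

```python
import numpy as np

# Hypothetical stand-in for iris.data: 150 samples, 4 features.
rng = np.random.default_rng(0)
X = rng.random((150, 4))

k = 2  # number of components to keep
U, s, VT = np.linalg.svd(X, full_matrices=False)

# Project the data onto the top-k right singular vectors.
X_proj = X @ VT[:k].T

# The same coordinates come from scaling the top-k left singular vectors.
X_alt = U[:, :k] * s[:k]

print(X_proj.shape)
print(np.allclose(X_proj, X_alt))
```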
