伊莉討論區 (Eyny Forum)

Title: Python 3.6 - LDA model, feature value distribution

Author: 凱斯先生 Time: 2018-1-28 12:15 PM

Figure 1
[attach]122020918[/attach]
Figure 2
[attach]122020935[/attach]
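import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Read the whole corpus file into one string (raw string keeps the backslash literal)
with open(r'C:\conte-out.txt', encoding='utf8') as f3:
    ks = f3.read()

# Load the stop words as text rather than bytes so they can match the
# vectorizer's str tokens (assumes the file is UTF-8, like the corpus)
stpwrdpath = r'C:\stopwords.txt'
with open(stpwrdpath, encoding='utf8') as stpwrd_dic:
    stpwrdlst = stpwrd_dic.read().splitlines()

# Note: the entire file is passed to the vectorizer as a single document
corpus = [ks]
cntVector = CountVectorizer(stop_words=stpwrdlst)
cntTf = cntVector.fit_transform(corpus)
print(cntTf)

# Print full arrays without truncation
np.set_printoptions(threshold=np.inf, precision=8)
# n_topics was renamed to n_components in scikit-learn 0.19
lda = LatentDirichletAllocation(n_components=120,
                                learning_offset=50.,
                                random_state=0)
docres = lda.fit_transform(cntTf)
# Append the topic-word matrix to the output file and close it afterwards
with open('c:/testone.txt', 'a', encoding='utf8') as out:
    print(lda.components_, file=out)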

----------------------------------------------------
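I'm not asking for a handout; I just don't fully understand this. With n_topics=120, once I set the number of topics to some value, shouldn't the output be labelled T1, T2, T3, and so on?
Why does it come out looking like Figure 1 instead?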

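Also, about print(lda.components_): even after reading discussions on overseas sites, I still don't know what it does. Some people say it is the topic-word distribution (Figure 2).
The output contained 291481 values, and I can't tell which ones are topics and which are words...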
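
For reference, a minimal sketch of how lda.components_ can be read. The toy corpus and the helper top_words are made up for illustration and are not the data from this post. In scikit-learn, components_ is an array of shape (n_components, n_features): each row is one topic and each column is one word of the CountVectorizer vocabulary, so the number of values printed is the number of topics times the vocabulary size. The rows carry no names of their own; labels such as T1, T2 have to be added when printing.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Toy corpus, made up for illustration (not the data from the post)
docs = [
    "apple banana apple fruit",
    "banana fruit juice apple",
    "stock market price trade",
    "market trade stock bank",
]

vectorizer = CountVectorizer()
counts = vectorizer.fit_transform(docs)

lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topic = lda.fit_transform(counts)

# One row per topic, one column per vocabulary word
print(lda.components_.shape)   # (n_components, vocabulary size)
# One row per document, one column per topic
print(doc_topic.shape)         # (number of documents, n_components)

# vocabulary_ maps word -> column index; sort the words by that index
vocab_index = vectorizer.vocabulary_
vocabulary = sorted(vocab_index, key=vocab_index.get)

def top_words(topic_row, n=3):
    """Return the n words with the largest weights in one topic row (hypothetical helper)."""
    best = np.argsort(topic_row)[::-1][:n]
    return [vocabulary[j] for j in best]

# Label the rows T1, T2, ... by hand
for t, row in enumerate(lda.components_):
    print("T%d:" % (t + 1), top_words(row))
```

The values in each row are unnormalized weights; dividing a row by its sum gives that topic's distribution over words.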
------------------------------------------------------------------------------------------------------------
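My dataset has only 294 records in total (one record per day), that is, 294 days of data.
Could anyone please help answer these questions? Thank you very much.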
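
A possible sketch relating to the 294 daily records: the code above passes corpus = [ks], which is a single document, so the model sees one record rather than 294. If each day's text sits on its own line of the file (an assumption, since the file layout is not shown in the post), the string can be split into one document per day. The short string below is a hypothetical stand-in for the file contents.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Stand-in for the file contents; assumes one day's text per line (hypothetical format)
ks = "rain wind cloud\nsun heat dry\nrain cloud storm\n"

# One document per non-empty line instead of one document for the whole file
corpus = [line for line in ks.splitlines() if line.strip()]

cntTf = CountVectorizer().fit_transform(corpus)
lda = LatentDirichletAllocation(n_components=2, random_state=0)
docres = lda.fit_transform(cntTf)

print(len(corpus))    # number of documents (days)
print(docres.shape)   # (number of documents, n_components)
```

With this layout, docres holds one row of topic proportions per day, while lda.components_ still holds one row per topic.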
