Sklearn How To Get The 10 Words From Each Topic
I want to get the 10 most frequent words from each topic. After applying TfidfTransformer, I get a result whose type is scipy.sparse.csr.csr_matrix, but I don't know how to get the hi
Solution 1:
You can use the TfidfVectorizer to expose the get_feature_names method (get_feature_names_out in scikit-learn 1.0 and later), which maps column indices back to words. The transformer never sees the raw text, so it cannot return the vocabulary, but the docs clearly state that the Vectorizer is equivalent to CountVectorizer followed by the transformer. If you don't want to use this, then I think you're going to be stuck building a lookup before you vectorize.
TfidfVectorizer in the docs: https://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.TfidfVectorizer.html
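A minimal sketch of that equivalence, assuming a small made-up corpus (the `docs` list and the `feature_names` helper are hypothetical, not from the question). It fits the one-step vectorizer and the two-step CountVectorizer plus TfidfTransformer pipeline with default settings and checks that both give the same weights and the same vocabulary.

```python
import numpy as np
from sklearn.feature_extraction.text import (
    CountVectorizer,
    TfidfTransformer,
    TfidfVectorizer,
)

# Hypothetical toy corpus, for illustration only.
docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "dogs and cats make good pets",
]

# One step: learn the vocabulary and compute tf-idf weights together.
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(docs)

# Two steps: raw term counts first, then tf-idf weighting on those counts.
counter = CountVectorizer()
X_two_step = TfidfTransformer().fit_transform(counter.fit_transform(docs))


def feature_names(vec):
    """Return the vocabulary in column order (hypothetical helper)."""
    # get_feature_names_out exists in scikit-learn >= 1.0;
    # older releases only provide get_feature_names.
    if hasattr(vec, "get_feature_names_out"):
        return list(vec.get_feature_names_out())
    return list(vec.get_feature_names())


terms = feature_names(vectorizer)
same_weights = np.allclose(X.toarray(), X_two_step.toarray())
```

If you already have a fitted CountVectorizer in front of your TfidfTransformer, calling the same method on the CountVectorizer gives the lookup without refitting anything.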
Edit: to sort and slice the output of fit_transform from the TfidfVectorizer, normal sparse matrix operations should work.
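One way the sort-and-slice step might look, as a sketch rather than the only approach. It assumes each row of the matrix stands for one unit you care about (one document here; if your topics are groups of documents you would need to combine their rows first). The `docs` list and the `top_terms` helper are hypothetical names introduced for illustration.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical toy corpus, for illustration only.
docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "dogs and cats make good pets",
]

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(docs).tocsr()  # ensure CSR layout for row access

# Column index -> word lookup; fall back for scikit-learn releases before 1.0.
if hasattr(vectorizer, "get_feature_names_out"):
    terms = np.asarray(vectorizer.get_feature_names_out())
else:
    terms = np.asarray(vectorizer.get_feature_names())


def top_terms(matrix, row, terms, n=10):
    """Return up to n (word, weight) pairs for one row, highest weight first."""
    # In CSR format, indptr marks where each row's stored entries begin and end.
    start, end = matrix.indptr[row], matrix.indptr[row + 1]
    cols = matrix.indices[start:end]   # column indices of the stored entries
    vals = matrix.data[start:end]      # their tf-idf weights
    order = np.argsort(vals)[::-1][:n]  # positions of the n largest weights
    return [(str(terms[cols[j]]), float(vals[j])) for j in order]


# n=3 keeps the toy output short; use n=10 for the top 10 words.
top_per_doc = [top_terms(X, i, terms, n=3) for i in range(X.shape[0])]
```

Working on the stored entries of each row avoids converting the whole matrix to a dense array, which matters once the vocabulary is large.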