Word clouds with wordcloud

Two approaches:

1. Use pyecharts, a library for generating ECharts charts.

Tutorial: see the pyecharts tutorial.
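
As a rough illustration of the first approach, the sketch below assumes pyecharts 1.x or later, where the `WordCloud` chart lives in `pyecharts.charts` and takes `(word, count)` pairs. The helper names `to_data_pairs` and `render_wordcloud_html`, the size range, and the output filename are made up for this example.

```python
from collections import Counter


def to_data_pairs(frequencies, top_n=100):
    """Return the top_n (word, count) pairs, most frequent first."""
    return Counter(frequencies).most_common(top_n)


def render_wordcloud_html(frequencies, out_path="wordcloud.html"):
    """Render an interactive word cloud to an HTML file with pyecharts."""
    # Imported here so the helper above works without pyecharts installed.
    from pyecharts.charts import WordCloud

    chart = WordCloud()
    chart.add(series_name="", data_pair=to_data_pairs(frequencies),
              word_size_range=[12, 60])
    return chart.render(out_path)
```

Calling `render_wordcloud_html({"alice": 3, "rabbit": 5})` should write an HTML page that can be opened in a browser.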

2. Use the Python wordcloud library.

Installation:

python -m pip install wordcloud

Code:

"""
Masked wordcloud
================
Using a mask you can generate wordclouds in arbitrary shapes.
"""

import os
from os import path

import matplotlib.pyplot as plt
import numpy as np
from PIL import Image

from wordcloud import WordCloud, STOPWORDS, ImageColorGenerator

# get data directory (using getcwd() is needed to support running example in generated IPython notebook)
d = path.dirname(__file__) if "__file__" in locals() else os.getcwd()

# Read the whole text (assumes alice.txt is UTF-8; change the encoding if it is not).
with open(path.join(d, 'alice.txt'), encoding='utf-8') as f:
    text = f.read()

# read the mask image
# taken from
# http://www.stencilry.org/stencils/movies/alice%20in%20wonderland/255fk.jpg
alice_mask = np.array(Image.open(path.join(d, "alice_mask.png")))

# Words to exclude from the cloud
stopwords = set(STOPWORDS)
stopwords.add("said")

# Font path.
# simhei.ttf (SimHei) is used here because the default font does not support Chinese;
# fonts can be found in C:\Windows\Fonts
fontpath = path.join(d, 'simhei.ttf')
wc = WordCloud(background_color="white", max_words=2000, mask=alice_mask,
               stopwords=stopwords, contour_width=3, contour_color='steelblue',
               font_path=fontpath)

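# generate word cloud
# generate() tokenizes the text, counts word frequencies, then draws the cloud,
# but it does not handle Chinese well
wc.generate(text)

# Build the cloud from a {word: frequency} dict instead; recommended for Chinese
# text that you have segmented yourself
#wc.generate_from_frequencies(frequencies)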

# Create a color function from the image
#image_colors = ImageColorGenerator(alice_mask)

# store to file
wc.to_file(path.join(d, "alice.png"))

# Option 1: show the word cloud as generated
plt.imshow(wc, interpolation='bilinear')
# Option 2: recolor the word cloud from the image, then show it
#plt.imshow(wc.recolor(color_func=image_colors), interpolation="bilinear")
plt.axis("off")
plt.figure()
plt.imshow(alice_mask, cmap=plt.cm.gray, interpolation='bilinear')
plt.axis("off")
plt.show()
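
The commented-out `generate_from_frequencies` call in the listing expects a mapping from word to count. A minimal sketch of building one is below. It assumes the text has already been split into tokens by a separate Chinese segmenter (the listing does not name one), and `count_frequencies`, `render_from_tokens`, and their parameters are illustrative names, not part of wordcloud.

```python
from collections import Counter


def count_frequencies(tokens, stopwords=()):
    """Count tokens, skipping stopwords and whitespace-only tokens."""
    stop = set(stopwords)
    cleaned = (t.strip() for t in tokens)
    return dict(Counter(t for t in cleaned if t and t not in stop))


def render_from_tokens(tokens, font_path, out_path, stopwords=()):
    """Draw a word cloud from pre-segmented tokens and save it as an image."""
    # Imported here so count_frequencies can be used without wordcloud installed.
    from wordcloud import WordCloud

    wc = WordCloud(background_color="white", font_path=font_path)
    wc.generate_from_frequencies(count_frequencies(tokens, stopwords))
    wc.to_file(out_path)
    return wc
```

Counting outside of wordcloud keeps segmentation under your control, which is the reason this route suits Chinese text better than `generate(text)`.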

Files in the same directory:

alice_mask.png
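
If alice_mask.png is not at hand, a stand-in mask can be generated with NumPy. wordcloud treats pure white (255) pixels of the mask as masked out and draws words everywhere else. The sketch below builds a circular mask; `circular_mask` and its default sizes are assumptions chosen for illustration.

```python
import numpy as np


def circular_mask(size=300, radius=130):
    """Return a size x size uint8 mask: 0 inside the circle, 255 (masked out) outside."""
    center = size // 2
    y, x = np.ogrid[:size, :size]
    outside = (x - center) ** 2 + (y - center) ** 2 > radius ** 2
    return (255 * outside).astype(np.uint8)
```

The result can be passed in place of the image array, for example `WordCloud(mask=circular_mask())`.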
