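I'm not sure what tools everyone uses for word clouds. I used to use a word cloud program written in C++ (mask_word_cloud); its source code is on GitHub, it works quite well, and it can save to various vector graphics formats.
Today I went looking for Python word cloud packages and found there aren't many. There is a Python library called word_cloud that can draw word clouds, but it can't save them as vector graphics. It only saves PNG, which is clearly not what I'm after. After a few hours of tinkering, I finally got vector export working. Here's an example: a photo of a parrot serves as both the color source and the mask, and the text is the Wikipedia entry of a certain president, copied as-is and rendered.
Looks pretty good, right?
The code is below.
The plotWordCloud function in it is a rather magic function that took me two hours to figure out. If you'd like to learn it, give this post a triple (like, coin, favorite) and leave a comment, and I can send it to you by private message.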
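Since plotWordCloud itself isn't listed here, below is a rough sketch of one way such a function could work. It is not the author's implementation. The idea is to take the layout computed by wordcloud (the `layout_` attribute, whose entries look like `((word, frequency), font_size, (row, col), orientation, color)`) and redraw every word as a matplotlib text object, so that `plt.savefig` can write PDF or SVG. The name `plot_layout_as_text` and its parameters are made up for illustration, and the placement is only approximate because PIL and matplotlib measure text boxes slightly differently.

```python
import matplotlib.pyplot as plt
from matplotlib.font_manager import FontProperties
from PIL import ImageColor


def plot_layout_as_text(layout, width, height, bkcolor="w", font_path=None, dpi=100):
    """Redraw word-cloud layout entries as matplotlib Text objects (hypothetical helper).

    Each entry is assumed to follow the structure of WordCloud.layout_:
    ((word, frequency), font_size_in_pixels, (row, col), orientation, color_string).
    """
    # One figure pixel per layout pixel, with the axes filling the whole figure.
    fig = plt.figure(figsize=(width / dpi, height / dpi), dpi=dpi, facecolor=bkcolor)
    ax = fig.add_axes([0, 0, 1, 1])
    ax.set_facecolor(bkcolor)
    ax.set_xlim(0, width)
    ax.set_ylim(height, 0)  # image coordinates: row 0 is at the top

    for (word, _freq), font_size, (row, col), orientation, color in layout:
        # wordcloud colors are strings such as "rgb(255, 0, 0)" or "hsl(120, 80%, 50%)";
        # convert them to 0..1 floats that matplotlib understands.
        rgb = tuple(c / 255 for c in ImageColor.getrgb(color)[:3])
        # Font sizes in the layout are pixels; matplotlib expects points.
        font = FontProperties(fname=font_path, size=font_size * 72 / dpi)
        ax.text(
            col, row, word,
            color=rgb,
            fontproperties=font,
            rotation=0 if orientation is None else 90,  # non-None means a vertical word
            ha="left", va="top",  # (row, col) is the top-left corner of the word
        )
    return ax
```

With the script below, a call along the lines of `plot_layout_as_text(wc.layout_, parrot_mask.shape[1], parrot_mask.shape[0], bkcolor='k', font_path=wc.font_path)` would be the rough equivalent of `plotWordCloud(wc, bkcolor='k')`. The mask's shape is used because the mask, when given, determines the canvas size.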
import os
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
from scipy.ndimage import gaussian_gradient_magnitude
from wordcloud import WordCloud, ImageColorGenerator
# get data directory (using getcwd() is needed to support running example in generated IPython notebook)
d = os.path.dirname(__file__) if "__file__" in locals() else os.getcwd()
# load wikipedia text on rainbow
with open(os.path.join(d, 'Data/wordcloud/wiki_rainbow.txt'), encoding="utf-8") as f:
    text = f.read()
# load image. This has been modified in gimp to be brighter and have more saturation.
parrot_color = np.array(Image.open(os.path.join(d, "Data/wordcloud/parrot-by-jose-mari-gimenez2.jpg")))
# subsample by factor of 3. Very lossy but for a wordcloud we don't really care.
parrot_color = parrot_color[::3, ::3]
# create mask: white (255) is "masked out"
parrot_mask = parrot_color.copy()
parrot_mask[parrot_mask.sum(axis=2) == 0] = 255
# some finesse: we enforce boundaries between colors so they get less washed out.
# For that we do some edge detection in the image
edges = np.mean([gaussian_gradient_magnitude(parrot_color[:, :, i] / 255., 2) for i in range(3)], axis=0)
parrot_mask[edges > .08] = 255
# create wordcloud. A bit sluggish, you can subsample more strongly for quicker rendering
# relative_scaling=0 means the frequencies in the data are reflected less
# accurately but it makes a better picture
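# NOTE: fontpath is not defined anywhere in this snippet. Set it to the path of a
# TrueType/OpenType font file; None (an assumption here) falls back to wordcloud's default font.
fontpath = None
wc = WordCloud(max_words=2000, mask=parrot_mask, max_font_size=40, random_state=42, relative_scaling=0, font_path=fontpath)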
# generate word cloud
wc.generate(text)
# create coloring from image
image_colors = ImageColorGenerator(parrot_color)
wc.recolor(color_func=image_colors)
# 3. Draw the word cloud (plotWordCloud is the author's own helper and is not listed in this post)
ax = plotWordCloud(wc, bkcolor='k')
ax.axis('off')
# plt.subplots_adjust(left=0,right=1,top=1,bottom=0)
fname_fig='../../figures/Chapter3/Lecture3_8_example2.pdf'
plt.savefig(fname_fig,facecolor='k')
plt.show()
wc.to_file(fname_fig.replace('.pdf','.png'))