Customizing word clouds in Python by setting WordCloud parameters

 Updated: 2023-11-05 09:53:23   Author: 微小冷  
This article explains how to customize a word cloud in Python by setting WordCloud parameters, and walks through the example code in detail.

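Adding integer parameters

All of our settings live in wcDct, so customizing the word cloud with more parameters only requires adding entries to wcDct, such as the integer parameters below.

Many WordCloud parameters are integers, and these map naturally onto a Spinbox. (scale is listed here too because a Spinbox suits it, although it also accepts fractional values.)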

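Parameter         Description            Suitable range   Step
width             word cloud width       100-2000         10
height            word cloud height      100-2000         10
min_font_size     minimum font size      1-50             1
max_font_size     maximum font size      10-1000          10
font_step         font size step         1-20             1
max_words         maximum word count     10-500           10
min_word_length   minimum word length    0-10             1
scale             image scaling          (default is 1)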
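
Because a Spinbox also accepts typed input, a value can end up outside the range listed in the table. The following is a minimal sketch of clamping a value into a range and snapping it to the step; `snap_to_range` is a hypothetical helper introduced for illustration, not part of tkinter or wordcloud.

```python
def snap_to_range(value, low, high, step):
    """Clamp value into [low, high] and snap it onto the step grid starting at low."""
    value = max(low, min(high, value))
    steps = round((value - low) / step)
    return min(high, low + steps * step)

# width: range 100-2000, step 10
print(snap_to_range(2500, 100, 2000, 10))
print(snap_to_range(123, 100, 2000, 10))
```

A check like this could run on the value read from the widget before it is handed to WordCloud.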

These are the entries to add to wcDct.

wcDct = {
    "最小文字" : {
        "ctrl": ttk.Spinbox, 
        "paras" : {"width":10, "from_":1, "to":50, "increment":1},
        "default":4,
        "call" : "min_font_size"},
    "最大文字" : {
        "ctrl": ttk.Spinbox, 
        "paras" : {"width":10, "from_":10, "to":1e3, "increment":10},
        "default":400,
        "call" : "max_font_size"},
    "字体步长" : {
        "ctrl": ttk.Spinbox, 
        "paras" : {"width":10, "from_":1, "to":20, "increment":1},
        "default":10,
        "call" : "font_step"},
    "最短词长" : {
        "ctrl": ttk.Spinbox, 
        "paras" : {"width":10, "from_":0, "to":10, "increment":1},
        "default":1,
        "call" : "min_word_length"},
    "最多词数" : {
        "ctrl": ttk.Spinbox, 
        "paras" : {"width":10, "from_":10, "to":500, "increment":10},
        "default":200,
        "call" : "max_words"},
    "图像缩放" : {
        "ctrl": ttk.Spinbox, 
        "paras" : {"width":10, "from_":0.5, "to":5, "increment":0.1},
        "default":1,
        "call" : "scale"},
}
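
Each entry pairs a UI label with the WordCloud keyword stored under "call". As a rough sketch of how such a dictionary can be reduced to keyword arguments, the snippet below uses a cut-down stand-in for wcDct, with the widget classes replaced by plain strings so that it runs without a display. `demo_dct` and `default_kwargs` are hypothetical names introduced here.

```python
# Cut-down stand-in for wcDct: widget classes are replaced by strings.
demo_dct = {
    "最小文字": {"ctrl": "Spinbox", "default": 4, "call": "min_font_size"},
    "最多词数": {"ctrl": "Spinbox", "default": 200, "call": "max_words"},
    "仅界面": {"ctrl": "DialogButton"},   # hypothetical entry without "call"
}

def default_kwargs(dct):
    """Map each entry's "call" name to its default, skipping entries that lack either."""
    return {v["call"]: v["default"]
            for v in dct.values()
            if "call" in v and "default" in v}

print(default_kwargs(demo_dct))
```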

Boolean parameters

Next come some Boolean parameters, which suit a Checkbutton.

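Parameter           Description                      WordCloud default   Widget
repeat              whether to repeat words          False               Checkbutton
include_numbers     whether to include numbers       False               Checkbutton
normalize_plurals   whether to strip a trailing s    True                Checkbutton
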
wcDct = {
    "单词重复" : {
        "ctrl": ttk.Checkbutton, 
        "paras" : {"width":10},
        "default": False,
        "call" : "repeat"},
    "包含数字" : {
        "ctrl": ttk.Checkbutton, 
        "paras" : {"width":10},
        "default": False,
        "call" : "include_numbers"},
    "去词尾s" : {
        "ctrl": ttk.Checkbutton, 
        "paras" : {"width":10},
        "default": False,   # GUI starts unchecked; WordCloud's own default is True
        "call" : "normalize_plurals"},
}
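
When a Checkbutton is bound to a tk.BooleanVar, get() returns a Python bool; bound to a StringVar it returns the on/off value as a string ("1" or "0" with the default on/off values). The sketch below normalizes both forms before they are passed on as Boolean keyword arguments. `as_bool` is a hypothetical helper, not a tkinter function.

```python
def as_bool(value):
    """Normalize a Checkbutton value (bool, or a string such as "1"/"0") to a bool."""
    if isinstance(value, bool):
        return value
    return str(value).strip().lower() in {"1", "true", "yes", "on"}

print(as_bool(True), as_bool("1"), as_bool("0"))
```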

Background color

Finally, the background color is chosen through a dialog.

Parameter          Description        Dialog type     Note
background_color   background color   color dialog    default "black"

wcDct = {
    "背景颜色" : {
        "ctrl": DialogButton,
        "paras" : {"height":5, "widthL":22, "widthR":8, "logtype":"颜色"},
        "call" : "background_color",
        "default" : "black"},
}
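
tkinter's askcolor returns a pair of an RGB tuple and a hex string, or (None, None) when the dialog is cancelled. The sketch below picks the hex string and falls back to a default on cancel; `pick_color` is a hypothetical helper, and the tuples passed to it stand in for real dialog results.

```python
def pick_color(result, fallback="black"):
    """Return the hex string from an askcolor-style result, or fallback if cancelled."""
    rgb, hex_code = result
    return hex_code if hex_code else fallback

print(pick_color(((255, 0, 0), "#ff0000")))   # a chosen color
print(pick_color((None, None)))               # a cancelled dialog
```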

After these changes the interface looks as follows.

[Figure: the updated interface]

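Word cloud generation logic

With the widgets in place, the word cloud generation logic also has to change so that it reads the values of these parameters. Using the parameters shown in the interface above produces the following word cloud.

[Figure: the generated word cloud]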
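
Spinbox.get() returns a string, so the values need converting before they reach WordCloud. The following is a minimal sketch of that conversion, assuming the parameter names used in this article. `typed_kwargs`, `INT_KEYS` and `FLOAT_KEYS` are hypothetical names, and the `raw` dictionary stands in for values read from the widgets.

```python
INT_KEYS = {"width", "height", "min_font_size", "max_font_size",
            "font_step", "max_words", "min_word_length"}
FLOAT_KEYS = {"scale"}

def typed_kwargs(raw):
    """Convert raw widget values to the types expected for each parameter."""
    out = {}
    for key, value in raw.items():
        if key in INT_KEYS:
            out[key] = int(float(value))   # tolerates "410.0" as well as "410"
        elif key in FLOAT_KEYS:
            out[key] = float(value)
        else:
            out[key] = value               # booleans, paths and colors pass through
    return out

raw = {"width": "800", "max_font_size": "410.0", "scale": "1.5", "repeat": True}
print(typed_kwargs(raw))
```

The resulting dictionary can then be unpacked into the constructor as WordCloud(**kwargs).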

Source code

The complete source code is as follows.

import tkinter as tk 
import tkinter.ttk as ttk
from tkinter.filedialog import (askopenfilename,
    askopenfilenames, askdirectory, asksaveasfilename)
from tkinter.colorchooser import askcolor

from threading import Thread

import numpy as np
import re
import csv
import jieba
from wordcloud import WordCloud

import os

class DialogButton(ttk.Frame):
    def __init__(self, master, 
        height, widthL, widthR, logtype, label=None, text=None, 
        frmDct={}, btnDct={}, enyDct={}, logDct={}):
        w = widthL + widthR
        super().__init__(master, 
            height=height, width = w, **frmDct)
        self.pack(fill=tk.X)

        self.text = tk.StringVar() if not text else text
        ttk.Entry(self, width=widthL, textvariable=self.text, 
            **enyDct).pack(side=tk.LEFT, fill = tk.X, padx=5)
        
        ttk.Button(self, width=widthR, 
            text=self.setLabel(logtype, label),
            command = self.Click, **btnDct).pack(side=tk.RIGHT)
        self.logtype = logtype
        self.logDct = logDct

    def setLabel(self, key, label=None):
        if label:
            return label
        labelDct = {
            "文件"   : "选择文件",
            "文件夹" : "选择路径",
            "多文件" : "选择多个文件",
            "保存" : "存储路径",
            "颜色"   : "选择颜色",
        }
        return labelDct[key]

    def Click(self):
        typeDct = {
            "文件"  : askopenfilename,
            "文件夹": askdirectory,
            "多文件": askopenfilenames,
            "保存"  : asksaveasfilename,
            "颜色"  : askcolor,
        }
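        text = typeDct[self.logtype](**self.logDct)
        if self.logtype == "颜色":
            text = text[1]      # askcolor returns (rgb, hex); keep the hex string
        if text:                # keep the old value if the dialog was cancelled
            self.text.set(text)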

    def get(self):
        return self.text.get()

    def set(self, txt):
        self.text.set(txt)

wcDct = {
    "词云宽度" : {
        "ctrl": ttk.Spinbox, 
        "paras" : {"width":10, "from_":100, "to":2000, "increment":10},
        "default":800,
        "call" : "width"},
    "词云高度" : {
        "ctrl": ttk.Spinbox, 
        "paras" : {"width":10, "from_":100, "to":2000, "increment":10},
        "default":450,
        "call" : "height"},
    "图像缩放" : {
        "ctrl": ttk.Spinbox, 
        "paras" : {"width":10, "from_":0.5, "to":10, "increment":0.1},
        "default":1,
        "call" : "scale"},
    "最小文字" : {
        "ctrl": ttk.Spinbox, 
        "paras" : {"width":10, "from_":1, "to":50, "increment":1},
        "default":4,
        "call" : "min_font_size"},
    "最大文字" : {
        "ctrl": ttk.Spinbox, 
        "paras" : {"width":10, "from_":10, "to":1e3, "increment":10},
        "default":400,
        "call" : "max_font_size"},
    "字体步长" : {
        "ctrl": ttk.Spinbox, 
        "paras" : {"width":10, "from_":1, "to":20, "increment":1},
        "default":10,
        "call" : "font_step"},
    "最短词长" : {
        "ctrl": ttk.Spinbox, 
        "paras" : {"width":10, "from_":0, "to":10, "increment":1},
        "default":1,
        "call" : "min_word_length"},
    "最多词数" : {
        "ctrl": ttk.Spinbox, 
        "paras" : {"width":10, "from_":10, "to":500, "increment":10},
        "default":200,
        "call" : "max_words"},
    "字体路径" : {"ctrl": DialogButton,
                 "paras" : {"height":5, "widthL":22, "widthR":8, "logtype":"文件"},
                 "call" : "font_path",
                 "default" : r"C:\Windows\Fonts\simhei.ttf"},
    "输入路径" : {"ctrl": DialogButton,
                "paras" : {"height":5, "widthL":22, "widthR":8, "logtype":"文件"},},
    "输出路径" : {"ctrl": DialogButton,
                "paras" : {"height":5, "widthL":22, "widthR":8, "logtype":"保存"},},
    "背景颜色" : {
        "ctrl": DialogButton,
        "paras" : {"height":5, "widthL":22, "widthR":8, "logtype":"颜色"},
        "call" : "background_color",
        "default" : "black"},
    "单词重复" : {
        "ctrl": ttk.Checkbutton, 
        "paras" : {"width":10},
        "default": False,
        "call" : "repeat"},
    "包含数字" : {
        "ctrl": ttk.Checkbutton, 
        "paras" : {"width":10},
        "default": False,
        "call" : "include_numbers"},
    "去词尾s" : {
        "ctrl": ttk.Checkbutton, 
        "paras" : {"width":10},
        "default": False,   # GUI starts unchecked; WordCloud's own default is True
        "call" : "normalize_plurals"},
}


class DrawWords(ttk.Frame):
    def __init__(self, master, **options):
        super().__init__(master, **options)
        self.pack()
        self.words = None

        self.initWidgets()

    
    def initWidgets(self):
        frm = ttk.Frame(self)
        frm.pack(side=tk.LEFT, fill=tk.Y)
        self.initPara(frm)

        frm = ttk.LabelFrame(self, text="分词结果")
        frm.pack(fill=tk.BOTH, expand=True)
        self.txtSplit = tk.Text(frm)
        self.txtSplit.pack(side=tk.LEFT, fill=tk.BOTH, padx=5, pady=5, expand=True)
        self.addScroll(frm, self.txtSplit)
    
    def addScroll(self, frm, txt):
        scroll = ttk.Scrollbar(frm)
        scroll.pack(side=tk.RIGHT,fill=tk.Y)
        txt.config(yscrollcommand=scroll.set)
        scroll.config(command=txt.yview)

    def setOneCheck(self, frm, key):
        v = wcDct[key]      # widget settings
        n = v["call"]       # WordCloud keyword name
        self.vars[n] = tk.BooleanVar()
        self.vars[n].set(v["default"])
        self.checks[n] = v["ctrl"](frm, text=key,
            variable=self.vars[n], **v["paras"])
        self.checks[n].pack(side=tk.LEFT)
        

    def setOneSpinBox(self, frm, key):
        ttk.Label(frm, width=8, text=key).pack(side=tk.LEFT)
        v = wcDct[key]      # widget settings
        n = v["call"]       # WordCloud keyword name
        self.spins[n] = v["ctrl"](frm, **v["paras"])
        self.spins[n].set(v["default"])
        self.spins[n].pack(side=tk.LEFT)
    
    def setOneDiaButton(self, frmPara, key):
        frm = ttk.Frame(frmPara)
        frm.pack(side=tk.TOP, fill=tk.X)
        ttk.Label(frm, width=8, text=key).pack(side=tk.LEFT)
        v = wcDct[key]
        n = v["call"] if 'call' in v else key
        self.paths[n] = v["ctrl"](frm, **v['paras'])
        self.paths[n].pack(side=tk.LEFT)
        if 'default' in v:
            self.paths[n].set(v['default'])

    # Currently unused; same layout as setOneDiaButton.
    def setOneColButton(self, frmPara, key):
        frm = ttk.Frame(frmPara)
        frm.pack(side=tk.TOP, fill=tk.X)
        ttk.Label(frm, width=8, text=key).pack(side=tk.LEFT)
        v = wcDct[key]
        n = v["call"] if 'call' in v else key
        self.paths[n] = v["ctrl"](frm, **v['paras'])
        self.paths[n].pack(side=tk.LEFT)
        if 'default' in v:
            self.paths[n].set(v['default'])


    def initPara(self, frmPara):
        self.spins = {}
        self.checks = {}
        self.vars = {}
        keys = ["词云宽度", "词云高度", "最小文字", "最大文字", 
            "字体步长", "图像缩放", "最短词长", "最多词数"]
        for i,key in enumerate(keys):
            if i%2==0:
                frm = ttk.Frame(frmPara)
                frm.pack(side=tk.TOP, fill=tk.X, pady=2)
            self.setOneSpinBox(frm, key)
        
        keys = ["单词重复", "包含数字", "去词尾s"]
        for i,key in enumerate(keys):
            if i%4==0:
                frm = ttk.Frame(frmPara)
                frm.pack(side=tk.TOP, fill=tk.X, pady=2)
            self.setOneCheck(frm, key)

        self.paths = {}
        for key in ["背景颜色", "输入路径", "输出路径", "字体路径"]:
            self.setOneDiaButton(frmPara, key)
                
        frm = ttk.Frame(frmPara)
        frm.pack(side=tk.TOP, fill=tk.X)
        ttk.Button(frm, text="分词预览", 
            command=self.splitWords).pack(side=tk.LEFT)
        ttk.Button(frm, text="分词保存", 
            command=self.saveWords).pack(side=tk.LEFT)
        ttk.Button(frm, text="输出词云", 
            command=self.genWordCloud).pack(side=tk.LEFT)

    def splitWords(self):
        p = self.paths["输入路径"].get()
        with open(p, encoding='utf8') as f:
            text = f.read()
        words = jieba.lcut(text)
        self.words = [w for w in words if len(w)>1] # keep words longer than one character
        self.setSplit("\n".join(self.words))

    def saveWords(self):
        path = asksaveasfilename()
        if not path:    # dialog cancelled
            return
        with open(path, 'w', encoding='utf8') as f:
            f.write(self.txtSplit.get(1.0, 'end'))

    def genWordCloud(self):
        dct = {}
        keys = ['width', 'height', 'font_path', 'scale',
            'min_font_size', 'max_font_size', 'font_step', 
            'max_words', 'min_word_length', 'background_color',
            'repeat', 'include_numbers', 'normalize_plurals']
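        for key in keys:
            if key in self.spins:
                raw = float(self.spins[key].get())
                # scale may be fractional; the other Spinbox values are integers
                dct[key] = raw if key == 'scale' else int(raw)
            if key in self.paths:
                dct[key] = self.paths[key].get()
            if key in self.checks:
                dct[key] = self.vars[key].get()   # BooleanVar.get() already returns a bool
        print(dct)
        try:
            cloud = WordCloud(**dct)
        except Exception as e:
            print(e)
            return      # no WordCloud object to work with
        txt = self.txtSplit.get(1.0, "end")
        txt = " ".join(txt.split("\n"))
        cloud.generate(txt)
        p = self.paths["输出路径"].get()
        if p.endswith('.svg'):
            # to_file saves a raster image, so SVG output goes through to_svg()
            # (assumes a wordcloud version that provides to_svg)
            with open(p, 'w', encoding='utf8') as f:
                f.write(cloud.to_svg())
        else:
            if not p.endswith('.png'):
                p = p+".png"
            cloud.to_file(p)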
    
    def setSplit(self, txt):
        self.txtSplit.delete(1.0, "end")
        self.txtSplit.insert("end", txt)
        self.txtSplit.see("end")

if __name__ == "__main__":
    root = tk.Tk()
    DrawWords(root).pack(side=tk.TOP, fill=tk.BOTH)
    root.mainloop()
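
Outside the GUI, the text preparation done by splitWords and genWordCloud comes down to filtering the segmented words and joining them with spaces. The sketch below shows this on a hand-written token list that stands in for the output of jieba.lcut; `prepare_text` is a hypothetical helper.

```python
def prepare_text(words):
    """Keep words longer than one character and join them with spaces."""
    return " ".join(w for w in words if len(w) > 1)

tokens = ["词云", "的", "参数", "很", "多"]   # stand-in for jieba.lcut(text)
print(prepare_text(tokens))
```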
