Python word-frequency counter with a GUI (tkinter)

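This article presents a simple word-frequency program written in Python. It uses a basic regular expression and Python's GUI module tkinter. The program lets the user pick any file, counts how often each word occurs in it, and shows the result in a table. Finally, it connects to a database and writes the results into it.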

1. A simple word-frequency count

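The file is read by name, and remove_punctuation() is called to strip out stray characters so that only English letters remain. The resulting string is split into words and turned into a dictionary with each word as the key and its count as the value; a single pass over the words builds the frequency table. The result is also written to a file so that it can be checked.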

    wordDict = {}
    with open(filenamevar) as file:
        content = file.read()
    # replace every character that is not in a-zA-Z with a space
    content = remove_punctuation(content)
    # convert the string to lower case
    content = content.lower()
    wordList = sorted(content.split())
    for word in wordList:
        if word not in wordDict:
            wordDict[word] = 1
        else:
            wordDict[word] = wordDict[word] + 1
    # write the counts to a file for verification
    with open('out.txt', 'w') as file2:
        for x, y in wordDict.items():
            file2.write(x + ' ' + str(y) + '\n')
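
The counting loop above can also be written with collections.Counter from the standard library. The following is only a minimal sketch, not the article's code: the helper name count_words and the sample sentence are made up for illustration, and the cleaning step is inlined so that the block runs on its own.

```python
import re
from collections import Counter


def count_words(text):
    """Return a Counter mapping each lower-cased English word to its frequency."""
    # Same cleaning rule as the article: anything outside a-zA-Z becomes a space.
    letters_only = re.sub(r"[^a-zA-Z]", " ", text)
    return Counter(letters_only.lower().split())


# Illustrative input; the article reads the text from a file instead.
counts = count_words("The cat saw the other cat. The end!")
for word, n in counts.most_common(2):
    print(word, n)
```

A Counter behaves like a dictionary, so it could be passed to the file-writing loop unchanged, and most_common() returns the words ordered by frequency, which the plain dictionary does not do.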

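Below is the remove_punctuation() function. It uses a simple regular expression to replace every character other than an English letter with a space and returns the resulting string: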

    import re

    def remove_punctuation(line):
        # match any single character that is not an English letter
        rule = re.compile(r"[^a-zA-Z]")
        line = rule.sub(' ', line)
        return line
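
To make the effect of this pattern concrete, here is a small sketch that runs it on a sample sentence and compares it with an alternative that extracts the words directly. The names LETTERS and tokenize and the sample sentence are assumptions introduced for illustration; they do not appear in the article.

```python
import re

# Hypothetical helper pattern: one or more consecutive English letters.
LETTERS = re.compile(r"[a-zA-Z]+")


def remove_punctuation(line):
    """Replace every non-letter character with a space (the article's rule)."""
    return re.sub(r"[^a-zA-Z]", " ", line)


def tokenize(line):
    """Alternative: pull out the runs of letters directly, skipping the split step."""
    return LETTERS.findall(line)


sample = "Hello, world! It's 2024."
print(remove_punctuation(sample).split())
print(tokenize(sample))
```

Both calls produce the same word list. Two side effects of the rule are worth knowing: the apostrophe splits "It's" into "It" and "s", and digits such as "2024" disappear entirely.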

2. Using tkinter

A simple client is built with tkinter:

    import tkinter as tk
    from tkinter.ttk import Treeview

    # getFile, wordResult and pyDatebase are the callbacks from the full source
    root = tk.Tk()
    entryvar = tk.StringVar()  # file path
    root.geometry('600x400+400+200')
    root.title("Word Frequency")

    # define the frames
    frame = tk.Frame(root)
    frame2 = tk.Frame(root)
    frame2.place(x=100, y=30, width=300, height=175)
    Entryvar = tk.Entry(frame2, textvariable=entryvar, width=20)
    Entryvar.pack(side=tk.LEFT)
    # pack() returns None, so create each button first and pack it separately
    button = tk.Button(frame2, text="Open file", command=getFile)
    button.pack(side=tk.LEFT)
    button2 = tk.Button(frame2, text="Count", command=wordResult)
    button2.pack(side=tk.LEFT)
    button3 = tk.Button(frame2, text="Save to database", command=pyDatebase)
    button3.pack(side=tk.LEFT)

    # table
    frame.place(x=400, y=30, width=170, height=330)
    scrollBar = tk.Scrollbar(frame)
    scrollBar.pack(side=tk.RIGHT, fill=tk.Y)
    tree = Treeview(frame, columns=('c1', 'c2'), show="headings", yscrollcommand=scrollBar.set)
    tree.column('c1', width=90, anchor='center')
    tree.column('c2', width=70, anchor='center')
    tree.heading('c1', text='Word')
    tree.heading('c2', text='Count')
    tree.pack(side=tk.LEFT, fill=tk.Y)
    scrollBar.config(command=tree.yview)

    root.mainloop()

3. Connecting to the database

The pymysql client library is used to connect to the database and insert the data.

    def pyDatebase():
        connect = pymysql.Connect(host='localhost', port=3306, user='root',
                                  passwd='123456', db='python', charset='utf8')
        # get a cursor
        cursor = connect.cursor()
        # delete the existing rows
        sql_delete = "delete from test;"
        cursor.execute(sql_delete)
        # insert the data; the driver fills in the %s placeholders and escapes the values
        sql_insert = "INSERT INTO test (word, number) VALUES (%s, %s)"
        for x, y in wordDict.items():
            cursor.execute(sql_insert, (x, y))
        connect.commit()
        cursor.close()
        connect.close()
        print('Inserted %d rows' % len(wordDict))
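
The delete-then-insert pattern can be tried without a MySQL server. The sketch below swaps in the standard-library sqlite3 module with an in-memory database, so it is an illustration of the idea rather than the article's pymysql code; the function name save_counts and the sample data are assumptions. Note that sqlite3 uses ? as its placeholder, whereas pymysql uses %s.

```python
import sqlite3


def save_counts(conn, word_counts):
    """Replace the contents of table `test` with the given word counts."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute("DELETE FROM test")
        # Values are passed separately from the SQL text, so the driver quotes them.
        conn.executemany(
            "INSERT INTO test (word, number) VALUES (?, ?)",
            word_counts.items(),
        )
    return len(word_counts)


# In-memory database with the same table layout the article assumes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (word TEXT, number INTEGER)")
inserted = save_counts(conn, {"the": 3, "o'clock": 1})
print(f"inserted {inserted} rows")
```

Passing the values as parameters instead of formatting them into the SQL string means that a value containing a quote character, such as "o'clock" here, is stored correctly instead of breaking the statement.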

The final test result is as follows:

Reflections:

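At first, when using tkinter, I called askopenfilename() directly to get the name of the file chosen in the dialog, intending to use that name for all the later steps. Repeated testing showed, however, that a function invoked through a Button's command option does not hand its return value back inside the mainloop() loop: tkinter calls the function and discards whatever it returns. In other words, the filename cannot be obtained that way within the GUI loop, so no further processing was possible. After searching for a solution, I found that the value has to be stored somewhere that outlives the callback, for example in a tkinter StringVar that is assigned dynamically. Only then can the filename be retrieved and the subsequent operations carried out.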
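
The point about command callbacks can be sketched as follows. This is a hedged illustration, not the article's code: make_picker and main are hypothetical names, and the callback is built from two plain functions so that the storing logic can be exercised without opening a window. Call main() to launch the small demo window.

```python
def make_picker(ask, set_path):
    """Build a Button callback: ask() picks a path, set_path() stores it."""
    def on_click():
        path = ask()
        if path:  # an empty result means the dialog was cancelled
            set_path(path)
        return path  # tkinter ignores whatever a command callback returns
    return on_click


def main():
    # Imported here so the helper above can be used without a display.
    import tkinter as tk
    from tkinter import filedialog

    root = tk.Tk()
    path_var = tk.StringVar()  # holds the chosen path after the callback ends
    tk.Entry(root, textvariable=path_var, width=40).pack(side=tk.LEFT)
    tk.Button(
        root,
        text="Open file",
        command=make_picker(filedialog.askopenfilename, path_var.set),
    ).pack(side=tk.LEFT)
    root.mainloop()
```

Any later callback can then read the path with path_var.get(), which is what wordResult() does through the Entry widget in the article's program.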

This shows that how well you know a module you are calling largely determines how much rework you will face while using it.

The full source code is attached below:

    import re
    import tkinter as tk
    import pymysql.cursors
    import tkinter.filedialog
    from tkinter.ttk import Treeview

    wordDict = {}


    def pyDatebase():
        connect = pymysql.Connect(host='localhost', port=3306, user='root',
                                  passwd='123456', db='python', charset='utf8')
        # get a cursor
        cursor = connect.cursor()
        # delete the existing rows
        sql_delete = "delete from test;"
        cursor.execute(sql_delete)
        # insert the data; the driver fills in the %s placeholders and escapes the values
        sql_insert = "INSERT INTO test (word, number) VALUES (%s, %s)"
        for x, y in wordDict.items():
            cursor.execute(sql_insert, (x, y))
        connect.commit()
        cursor.close()
        connect.close()
        print('Inserted %d rows' % len(wordDict))


    def remove_punctuation(line):
        # match any single character that is not an English letter
        rule = re.compile(r"[^a-zA-Z]")
        line = rule.sub(' ', line)
        return line


    def wordResult():
        filenamevar = Entryvar.get()
        with open(filenamevar) as file:
            content = file.read()
        # replace every character that is not in a-zA-Z with a space
        content = remove_punctuation(content)
        # convert the string to lower case
        content = content.lower()
        wordList = sorted(content.split())
        # start from a clean state so repeated clicks do not double the counts
        wordDict.clear()
        tree.delete(*tree.get_children())
        for word in wordList:
            if word not in wordDict:
                wordDict[word] = 1
            else:
                wordDict[word] = wordDict[word] + 1
        # write the counts to a file for verification
        with open('out.txt', 'w') as file2:
            for x, y in wordDict.items():
                file2.write(x + ' ' + str(y) + '\n')
        # show the counts in the table
        for x, y in wordDict.items():
            tree.insert('', 'end', values=(x, y))


    # get the path of the file
    def getFile():
        global filename
        filename = tk.filedialog.askopenfilename()
        entryvar.set(filename)


    root = tk.Tk()
    entryvar = tk.StringVar()  # file path
    root.geometry('600x400+400+200')
    root.title("Word Frequency")

    # define the frames
    frame = tk.Frame(root)
    frame2 = tk.Frame(root)
    frame2.place(x=100, y=30, width=300, height=175)
    Entryvar = tk.Entry(frame2, textvariable=entryvar, width=20)
    Entryvar.pack(side=tk.LEFT)
    # pack() returns None, so create each button first and pack it separately
    button = tk.Button(frame2, text="Open file", command=getFile)
    button.pack(side=tk.LEFT)
    button2 = tk.Button(frame2, text="Count", command=wordResult)
    button2.pack(side=tk.LEFT)
    button3 = tk.Button(frame2, text="Save to database", command=pyDatebase)
    button3.pack(side=tk.LEFT)

    # table
    frame.place(x=400, y=30, width=170, height=330)
    scrollBar = tk.Scrollbar(frame)
    scrollBar.pack(side=tk.RIGHT, fill=tk.Y)
    tree = Treeview(frame, columns=('c1', 'c2'), show="headings", yscrollcommand=scrollBar.set)
    tree.column('c1', width=90, anchor='center')
    tree.column('c2', width=70, anchor='center')
    tree.heading('c1', text='Word')
    tree.heading('c2', text='Count')
    tree.pack(side=tk.LEFT, fill=tk.Y)
    scrollBar.config(command=tree.yview)

    root.mainloop()

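Copyright notice: this article, "Python word-frequency counter with a GUI (tkinter)", was contributed by a site user, and the views expressed are the author's own. To repost it, contact the author and cite the source: http://www.roclinux.cn/b/1686561228a10479.html. This site only provides information storage; it does not own the content and assumes no related legal liability. If you find content on this site suspected of plagiarism, infringement, or violation of laws or regulations, it will be removed promptly once verified.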