admin 管理员组

文章数量: 1087678

#001 Python整理MYSQL的XML数据字典

一、业务背景

MYSQL DB导出的数据字典是XML的,我们需要整理成公司的EXCEL标准,提交给数据中心留档

二、我们用Python来实现XML转表格

2.1、先来看看MYSQL导出的数据字典的XML样式:

<?xml version="1.0" encoding="UTF-8" ?>
<project name="ERP" id="Project_315" database="MySql" ><schema name="ERP360" ><table name="CHANNEL_CATEGORY" generator_rows="100" ><comment><![CDATA[商品分类表]]></comment><column name="ID" type="varchar" length="255" decimal="0" jt="12" mandatory="y" /><column name="NAME" type="varchar" length="255" decimal="0" jt="12" ><comment><![CDATA[分类名称]]></comment></column><column name="LEVEL" type="varchar" length="255" decimal="0" jt="12" ><comment><![CDATA[分类级别]]></comment></column><column name="SOURCE" type="varchar" length="255" decimal="0" jt="12" ><comment><![CDATA[来源]]></comment></column></table></schema><layout name="Default Layout" id="Layout_248b" show_column_type="y" show_relation="columns" ><entity schema="ERP" name="CHANNEL_CATEGORY" color="C1D8EE" x="64" y="48" /></layout>
</project>

2.2、具体的思路和步骤

1.先读取XML文件
2.根据XML的内容,用DOM的遍历办法把相关的表、字段和字段的属性读取到ROW
3.把ROW导入到DF
4.DF导出CSV或者Excel

最后来看看代码

# -*- coding: utf-8 -*-
"""@Author  : Nick@Time    : 2023/8/29@Comment : 
"""
from xml.dom.minidom import parse
import pandas as pddef readXML():xtree = parse("./p360-mysql.xml")# 文档根元素xroot = xtree.documentElementprint(xroot.nodeName)df_cols = ["table","table_cn","id", "type", "length", "decimal", "jt","mandatory","column_cn"]rows = []# 所有Tabletables = xroot.getElementsByTagName("table")for table in tables:if table.hasAttribute("name"):t_id = table.getAttribute("name")t_id_cn = table.getElementsByTagName("comment")[0].firstChild.data# 根据元素名查找columns = table.getElementsByTagName("column")# 遍历for column in columns:id = column.getAttribute('name')type = column.getAttribute('type')length = column.getAttribute('length')decimal = column.getAttribute('decimal')jt = column.getAttribute('jt')mandatory = column.getAttribute('mandatory')comments = column.getElementsByTagName("comment")column_cn = ""for comment in comments:column_cn = comment.firstChild.datarows.append({"table":t_id,"table_cn":t_id_cn,"id": id, "type": type, "length": length, "decimal": decimal, "jt": jt, "mandatory": mandatory, "column_cn": column_cn})out_df = pd.DataFrame(rows, columns = df_cols)out_df.to_csv('p360.csv')if __name__ == '__main__':readXML()

2.3 导出后的效果

1693379527693.jpg

注意:导出的csv是utf-8, Excel打开会中文乱码,可以修改文件的编码为ANSII才可以

本文标签: 001 Python整理MYSQL的XML数据字典