Unicode Nearly Plain-Text Encoding of Mathematics
Version 3
Murray Sargent III
Publisher Text Services, Microsoft Corporation
10-Mar-10
Unicode Technical Note 28
1. Introduction
2. Encoding Simple Math Expressions
    2.1 Fractions
    2.2 Subscripts and Superscripts
    2.3 Use of the Blank (Space) Character
3. Encoding Other Math Expressions
    3.1 Delimiters
    3.2 Literal Operators
    3.3 Prescripts and Above/Below Scripts
    3.4 n-ary Operators
    3.5 Mathematical Functions
    3.6 Square Roots and Radicals
    3.7 Enclosures
    3.8 Stretchy Characters
    3.9 Matrices
    3.10 Accent Operators
    3.11 Differential, Exponential, and Imaginary Symbols
    3.12 Unicode Subscripts and Superscripts
    3.13 Concatenation Operators
    3.14 Comma, Period, and Colon
    3.15 Ordinary Text Inside Math Zones
    3.16 Space Characters
    3.17 Phantoms and Smashes
    3.18 Arbitrary Groupings
    3.19 Equation Arrays
    3.20 Math Zones
    3.21 Equation Numbers
    3.22 Linear Format Characters and Operands
    3.23 Equation Breaking and Alignment
    3.24 Size Overrides
4. Input Methods
    4.1 Character Translations
    4.2 Math Keyboards
    4.3 Hexadecimal Input
    4.4 Pull-Down Menus, Toolbars, Context Menus
    4.5 Macros
    4.6 Linear Format Math Autocorrect List
    4.7 Handwritten Input
5. Recognizing Mathematical Expressions
6. Using the Linear Format in Programming Languages
    6.1 Advantages of Linear Format in Programs
    6.2 Comparison of Programming Notations
    6.3 Export to TeX
7. Conclusions
Acknowledgements
Appendix A. Linear Format Grammar
Appendix B. Character Keywords and Properties
Version Differences
References
1. Introduction
Getting computers to understand human languages is important in increasing
the utility of computers. Natural-language translation, speech recognition and
generation, and programming are typical ways in which such machine comprehension
plays a role. The better this comprehension, the more useful the computer, and
hence considerable effort has been devoted to these areas since the
early 1960s. Ironically, one truly international human language that tends to be
neglected in this connection is mathematics itself.
With a few conventions, Unicode¹ can encode many mathematical expressions
in readable nearly plain text. Technically this format is a “lightly marked up format”;
hence the use of “nearly”. The format is linear, but it can be displayed in built-up
presentation form. To distinguish the two kinds of formats in this paper, we refer to
the nearly plain-text format as the linear format and to the built-up presentation
format as the built-up format. This linear format can be used with heuristics based
on the Unicode math properties to recognize mathematical expressions without the
aid of explicit math-on/off commands. The recognition is facilitated by Unicode’s
strong support for mathematical symbols.² Alternatively, the linear format can be
used in “math zones” explicitly controlled by the user either with on-off characters
as used in TeX or with a character format attribute in a rich-text environment. Use of
math zones is desirable, since the recognition heuristics are not infallible.
The linear format is more compact and easier to read than [La]TeX³,⁴ or
MathML.⁵ However, unlike those formats, it doesn’t attempt to include all
typographical embellishments. Instead we feel it’s useful to handle some embellishments in
the higher-level layer that handles rich-text properties like text and background
colors, font size, footnotes, comments, hyperlinks, etc. In principle one can extend the
notation to include the properties of the higher-level layer, but at the cost of
reduced readability. Hence embedded in a rich-text environment, the linear format
can faithfully represent rich mathematical text, whereas embedded in a plain-text
environment it lacks most rich-text properties and some mathematical
typographical properties. The linear format is primarily concerned with presentation, but it
has some semantic features that might seem to be only content oriented, e.g.,
n-aryands and function-apply arguments (see Secs. 3.4 and 3.5). These have been
included to aid in displaying built-up functions with proper typography, but they also
help to interoperate with math-oriented programs.
Most mathematical expressions can be represented unambiguously in the linear
format, from which they can be exported to [La]TeX, MathML, C++, and symbolic
manipulation programs. The linear format borrows notation from TeX for
mathematical objects that don’t lend themselves well to a mathematical linear notation,
e.g., for matrices.
A variety of syntax choices can be used for a linear format. The choices made in
this paper favor a number of criteria: efficient input of mathematical formulae,
sufficient generality to support high-quality mathematical typography, the ability to
round trip elegant mathematical text at least in a rich-text environment, and a
format that resembles a real mathematical notation. Obviously compromises between
these goals had to be made.
The linear format is useful for 1) inputting mathematical expressions,⁶ 2)
displaying mathematics by text engines that cannot display a built-up format, and 3)
computer programs. For more general storage and interchange of math expressions
between math-aware programs, MathML and other higher-level languages are
preferred.
Section 2 motivates and illustrates the linear format for math using the fraction,
subscripts, and superscripts along with a discussion of how the ASCII space U+0020
is used to build up one construct at a time. Section 3 summarizes the usage of the
other constructs along with their relative precedences, which are used to simplify
the notation. Section 4 discusses input methods. Section 5 gives ways to recognize
mathematical expressions embedded in ordinary text. Section 6 explains how
Unicode plain text can be helpful in programming languages. Section 7 gives
conclusions. The appendices present a simplified linear-format grammar and a partial list
of operators.
2. Encoding Simple Math Expressions
Given Unicode’s strong support for mathematics² relative to ASCII, how much
better can a plain-text encoding of mathematical expressions look using Unicode?
The most well-known ASCII encoding of such expressions is that of TeX, so we use it
for comparison. MathML is more verbose than TeX and some of the comparisons
apply to it as well. Notwithstanding TeX’s phenomenal success in the science and
engineering communities, a casual glance at its representations of mathematical
expressions reveals that they do not look very much like the expressions they represent.
It’s not easy to make algebraic calculations by hand directly using TeX’s notation.
With Unicode, one can represent mathematical expressions more readably, and the
resulting nearly plain text can often be used with few or no modifications for such
calculations. This capability is considerably enhanced by using the linear format in a
system that can also display and edit the mathematics in built-up form.
The present section introduces the linear format with fractions, subscripts, and
superscripts. It concludes with a subsection on how the ASCII space character
U+0020 is used to build up one construct at a time. This is a key idea that makes the
linear format ideal for inputting mathematical formulae. In general where syntax
and semantic choices were made, input convenience was given high priority.
2.1 Fractions
One way to specify a fraction linearly is LaTeX’s \frac{numerator}{denominator}.
The { } are not printed when the fraction is built up. These simple rules immediately
give a “plain text” that is unambiguous, but looks quite different from the
corresponding mathematical notation, thereby making it harder to read.
Instead we define a simple operand to consist of all consecutive letters and
decimal digits, i.e., a span of alphanumeric characters, those belonging to the Lx and
Nd General Categories (see The Unicode Standard 5.0,¹ Table 4-2. General Category).
As such, a simple numerator or denominator is terminated by most
nonalphanumeric characters, including, for example, arithmetic operators, the blank (U+0020), and
Unicode characters in the ranges U+2200..U+23FF, U+2500..U+27FF, and
U+2900..U+2AFF. The fraction operator is given by the usual solidus / (U+002F). So the
simple built-up fraction