Unicode Nearly Plain-Text Encoding of Mathematics

Version 3

Murray Sargent III

Publisher Text Services, Microsoft Corporation

10-Mar-10

Unicode Technical Note 28

1. Introduction
2. Encoding Simple Math Expressions
   2.1 Fractions
   2.2 Subscripts and Superscripts
   2.3 Use of the Blank (Space) Character
3. Encoding Other Math Expressions
   3.1 Delimiters
   3.2 Literal Operators
   3.3 Prescripts and Above/Below Scripts
   3.4 n-ary Operators
   3.5 Mathematical Functions
   3.6 Square Roots and Radicals
   3.7 Enclosures
   3.8 Stretchy Characters
   3.9 Matrices
   3.10 Accent Operators
   3.11 Differential, Exponential, and Imaginary Symbols
   3.12 Unicode Subscripts and Superscripts
   3.13 Concatenation Operators
   3.14 Comma, Period, and Colon
   3.15 Ordinary Text Inside Math Zones
   3.16 Space Characters
   3.17 Phantoms and Smashes
   3.18 Arbitrary Groupings
   3.19 Equation Arrays
   3.20 Math Zones
   3.21 Equation Numbers
   3.22 Linear Format Characters and Operands
   3.23 Equation Breaking and Alignment
   3.24 Size Overrides
4. Input Methods
   4.1 Character Translations
   4.2 Math Keyboards
   4.3 Hexadecimal Input
   4.4 Pull-Down Menus, Toolbars, Context Menus
   4.5 Macros
   4.6 Linear Format Math Autocorrect List
   4.7 Handwritten Input
5. Recognizing Mathematical Expressions
6. Using the Linear Format in Programming Languages
   6.1 Advantages of Linear Format in Programs
   6.2 Comparison of Programming Notations
   6.3 Export to TeX
7. Conclusions
Acknowledgements
Appendix A. Linear Format Grammar
Appendix B. Character Keywords and Properties
Version Differences
References

1. Introduction

Getting computers to understand human languages is important in increasing the utility of computers. Natural-language translation, speech recognition and generation, and programming are typical ways in which such machine comprehension plays a role. The better this comprehension, the more useful the computer, and hence considerable effort has been devoted to these areas since the early 1960s. Ironically, one truly international human language that tends to be neglected in this connection is mathematics itself.

With a few conventions, Unicode¹ can encode many mathematical expressions in readable nearly plain text. Technically this format is a “lightly marked up format”; hence the use of “nearly”. The format is linear, but it can be displayed in built-up presentation form. To distinguish the two kinds of formats in this paper, we refer to the nearly plain-text format as the linear format and to the built-up presentation format as the built-up format. This linear format can be used with heuristics based on the Unicode math properties to recognize mathematical expressions without the aid of explicit math-on/off commands. The recognition is facilitated by Unicode’s strong support for mathematical symbols.² Alternatively, the linear format can be used in “math zones” explicitly controlled by the user either with on-off characters as used in TeX or with a character format attribute in a rich-text environment. Use of math zones is desirable, since the recognition heuristics are not infallible.

The linear format is more compact and easier to read than [La]TeX³,⁴ or MathML.⁵ However, unlike those formats, it doesn’t attempt to include all typographical embellishments. Instead we feel it’s useful to handle some embellishments in the higher-level layer that handles rich-text properties like text and background colors, font size, footnotes, comments, hyperlinks, etc. In principle one can extend the notation to include the properties of the higher-level layer, but at the cost of reduced readability. Hence, embedded in a rich-text environment, the linear format can faithfully represent rich mathematical text, whereas embedded in a plain-text environment it lacks most rich-text properties and some mathematical typographical properties. The linear format is primarily concerned with presentation, but it has some semantic features that might seem to be only content oriented, e.g., n-aryands and function-apply arguments (see Secs. 3.4 and 3.5). These have been included to aid in displaying built-up functions with proper typography, but they also help to interoperate with math-oriented programs.

Most mathematical expressions can be represented unambiguously in the linear format, from which they can be exported to [La]TeX, MathML, C++, and symbolic manipulation programs. The linear format borrows notation from TeX for mathematical objects that don’t lend themselves well to a mathematical linear notation, e.g., for matrices.

A variety of syntax choices can be used for a linear format. The choices made in this paper favor a number of criteria: efficient input of mathematical formulae, sufficient generality to support high-quality mathematical typography, the ability to round trip elegant mathematical text at least in a rich-text environment, and a format that resembles a real mathematical notation. Obviously compromises between these goals had to be made.

The linear format is useful for 1) inputting mathematical expressions,⁶ 2) displaying mathematics in text engines that cannot display a built-up format, and 3) use in computer programs. For more general storage and interchange of math expressions between math-aware programs, MathML and other higher-level languages are preferred.

Section 2 motivates and illustrates the linear format for math using the fraction, subscripts, and superscripts along with a discussion of how the ASCII space U+0020 is used to build up one construct at a time. Section 3 summarizes the usage of the other constructs along with their relative precedences, which are used to simplify the notation. Section 4 discusses input methods. Section 5 gives ways to recognize mathematical expressions embedded in ordinary text. Section 6 explains how Unicode plain text can be helpful in programming languages. Section 7 gives conclusions. The appendices present a simplified linear-format grammar and a partial list of operators.

2. Encoding Simple Math Expressions

Given Unicode’s strong support for mathematics² relative to ASCII, how much better can a plain-text encoding of mathematical expressions look using Unicode? The best-known ASCII encoding of such expressions is that of TeX, so we use it for comparison. MathML is more verbose than TeX, and some of the comparisons apply to it as well. Notwithstanding TeX’s phenomenal success in the science and engineering communities, a casual glance at its representations of mathematical expressions reveals that they do not look very much like the expressions they represent. It’s not easy to make algebraic calculations by hand directly using TeX’s notation. With Unicode, one can represent mathematical expressions more readably, and the resulting nearly plain text can often be used with few or no modifications for such calculations. This capability is considerably enhanced by using the linear format in a system that can also display and edit the mathematics in built-up form.

The present section introduces the linear format with fractions, subscripts, and superscripts. It concludes with a subsection on how the ASCII space character U+0020 is used to build up one construct at a time. This is a key idea that makes the linear format ideal for inputting mathematical formulae. In general where syntax and semantic choices were made, input convenience was given high priority.

2.1 Fractions

One way to specify a fraction linearly is LaTeX’s \frac{numerator}{denominator}. The { } are not printed when the fraction is built up. These simple rules immediately give a “plain text” that is unambiguous, but looks quite different from the corresponding mathematical notation, thereby making it harder to read.
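As a small illustration of that markup (the operands abc and d are chosen here only as an example and are not taken from the text above), the LaTeX source for a fraction might read:

```latex
% Example operands only: numerator "abc", denominator "d".
\[
  \frac{abc}{d}
\]
```

The backslash keyword and both pairs of braces are markup; none of them appears in the built-up fraction.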

Instead we define a simple operand to consist of all consecutive letters and decimal digits, i.e., a span of alphanumeric characters, those belonging to the Lx and Nd General Categories (see The Unicode Standard 5.0,¹ Table 4-2. General Category). As such, a simple numerator or denominator is terminated by most nonalphanumeric characters, including, for example, arithmetic operators, the blank (U+0020), and Unicode characters in the ranges U+2200..U+23FF, U+2500..U+27FF, and U+2900..U+2AFF. The fraction operator is given by the usual solidus / (U+002F). So the simple built-up fraction
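The simple-operand rule in this paragraph can be sketched in a few lines of Python. This is only a rough illustration, not the note's implementation: the function names are hypothetical, "Lx" is read here as any General Category beginning with L, and the many other linear-format rules (subscripts, parentheses, the role of the space) are ignored.

```python
import unicodedata


def is_operand_char(ch):
    """True for letters (General Category L*) and decimal digits (Nd)."""
    category = unicodedata.category(ch)
    return category.startswith("L") or category == "Nd"


def split_simple_operands(text):
    """Return the maximal runs of operand characters in text.

    Any other character (operators, the blank, symbols such as U+2211)
    terminates the current run, as the paragraph above describes.
    """
    operands, current = [], []
    for ch in text:
        if is_operand_char(ch):
            current.append(ch)
        elif current:
            operands.append("".join(current))
            current = []
    if current:
        operands.append("".join(current))
    return operands


def simple_fraction(text):
    """Split 'numerator/denominator' at the solidus (U+002F).

    Returns a (numerator, denominator) tuple only when both sides are
    simple operands; otherwise returns None. Hypothetical helper.
    """
    numerator, solidus, denominator = text.partition("/")
    if solidus and numerator and denominator and all(
        is_operand_char(ch) for ch in numerator + denominator
    ):
        return numerator, denominator
    return None


print(split_simple_operands("abc/d"))        # ['abc', 'd']
print(split_simple_operands("a+b2 \u2211x"))  # ['a', 'b2', 'x']
print(simple_fraction("abc/d"))              # ('abc', 'd')
print(simple_fraction("a+b/c"))              # None
```

The last call returns None because the plus sign ends the operand a, so a+b is not a simple numerator under this rule.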
