admin 管理员组

文章数量: 1184232

注:本文为 “Beautiful maths simplification | avoiding trigonometry” 相关译文合辑。
图片清晰度受引文原图所限。
机翻未校,略作重排,如有内容异常,请看原文。


Beautiful maths simplification: quaternion from two vectors

优美的数学简化:由两个向量推导四元数

In this article I would like to explain the thought process behind the derivation of a widely used formula: finding a quaternion representing the rotation between two 3D vectors. Nothing really new, but hopefully a few ideas could be reused at other times.
在本文中,我想阐释一个常用公式推导背后的思路:求解表示两个 3D 向量间旋转的四元数。其中并无太多全新内容,但希望部分思路能在其他场景下复用。

Naive method

朴素方法

A rotation is best visualised using a rotation axis and an angle. Except in degenerate cases, the rotation axis can be obtained by computing the cross product of the two original vectors:
旋转最直观的表示方式是借助旋转轴和旋转角。除退化情况外,旋转轴可通过计算两个原始向量的叉积得到:

Then the angle can be obtained using the properties of the cross product and/or the dot product:
随后,可利用叉积和/或点积的性质求得旋转角:

u . v = ∣ u ∣ . ∣ v ∣ . cos ⁡ θ ∣ ∣ u × v ∣ ∣ = ∣ u ∣ . ∣ v ∣ . ∣ sin ⁡ θ ∣ \begin{align} \boldsymbol{u} . \boldsymbol{v} &= |\boldsymbol{u}| . |\boldsymbol{v}| . \cos\theta \\ ||\boldsymbol{u} \times \boldsymbol{v}|| &= |\boldsymbol{u}| . |\boldsymbol{v}| . |\sin\theta| \end{align} u.v∣∣u×v∣∣=u∣.∣v∣.cosθ=u∣.∣v∣.∣sinθ

Since θ is always between 0 and π, we only care about the dot product. This gives us some obvious code to create the quaternion (omitting corner cases such as θ = 0 for clarity):
由于 θ 始终介于 0 和 π 之间,我们只需关注点积即可。据此可写出创建四元数的直观代码(为清晰起见,省略 θ = 0 等边界情况):

quat quat::fromtwovectors(vec3 u, vec3 v)
{
    float cos_theta = dot(normalize(u), normalize(v));
    float angle = acos(cos_theta);
    vec3 w = normalize(cross(u, v));
    return quat::fromaxisangle(angle, w);
}

This is naive but it works. Googling for “quaternion from two vectors” shows this forum post and this SO question where this method or a variation thereof is suggested.
这种方法虽朴素但切实有效。在谷歌中搜索“由两个向量推导四元数”,会找到这篇论坛帖子和这个 Stack Overflow 问题,其中都提到了该方法或其变体。

Looking under the hood

深入底层原理

Let’s have a look at what happens when building the quaternion from an axis and an angle:
我们来探究一下通过旋转轴和旋转角构建四元数的底层逻辑:


This means the code for quat::fromaxisangle would look somewhat like this:
这意味着 quat::fromaxisangle 函数的代码大致如下:

quat quat::fromaxisangle(float angle, vec3 axis)
{
    float half_sin = sin(0.5f * angle);
    float half_cos = cos(0.5f * angle);
    return quat(half_cos,
                half_sin * axis.x,
                half_sin * axis.y,
                half_sin * axis.z);
}

Avoiding trigonometry

规避三角函数

If you read Iñigo Quilez’s recent article about avoiding trigonometry you’ll have probably frowned at the fact that we computed θ from cos(θ), then computed sin(θ/2) and cos(θ/2).
如果你读过伊尼戈·奎莱斯(Iñigo Quilez)近期关于规避三角函数的文章,想必会对我们先从 cos(θ) 求出 θ,再计算 sin(θ/2) 和 cos(θ/2) 这种做法感到不妥。

Indeed, it happens that there is a much simpler way to do it; the half-angle formulas from precalculus tell us the following:
事实上,存在一种简便得多的方法;预微积分中的半角公式给出了如下结论:
[ sin ⁡ θ 2 = 1 − cos ⁡ θ 2 cos ⁡ θ 2 = 1 + cos ⁡ θ 2 [\begin{align} \sin\frac{\theta}{2} &= \sqrt{\frac{1 - \cos\theta}{2}} \\ \cos\frac{\theta}{2} &= \sqrt{\frac{1 + \cos\theta}{2}} \end{align} [sin2θcos2θ=21cosθ =21+cosθ

This allows us to simplify our quaternion creation code:
据此,我们可以简化四元数的创建代码:

quat quat::fromtwovectors(vec3 u, vec3 v)
{
    float cos_theta = dot(normalize(u), normalize(v));
    float half_cos = sqrt(0.5f * (1.f + cos_theta));
    float half_sin = sqrt(0.5f * (1.f - cos_theta));
    vec3 w = normalize(cross(u, v));
    return quat(half_cos,
                half_sin * w.x,
                half_sin * w.y,
                half_sin * w.z);
}

This is pretty nice. By using well known trigonometry formulas, we got rid of all trigonometry function calls!
这样的优化效果显著。借助经典的三角公式,我们彻底摒弃了所有三角函数调用!

Avoiding square roots

规避平方根运算

It happens that we can do slightly better. Note that we normalize three vectors: u, v and cross(u, v). That’s three square roots. The thing is, we already know the norm of w through this formula:
我们还能进一步优化。注意到我们对三个向量做了归一化:uvcross(u, v),这涉及三次平方根运算。但实际上,我们可通过如下公式直接得到 w 的模长:

∣ ∣ u × v ∣ ∣ = ∣ u ∣ . ∣ v ∣ . ∣ sin ⁡ θ ∣ ||\boldsymbol{u} \times \boldsymbol{v}|| = |\boldsymbol{u}| . |\boldsymbol{v}| . |\sin\theta| ∣∣u×v∣∣=u∣.∣v∣.∣sinθ

And we know sin(θ) from precalculus again:
同时,预微积分知识还告诉我们:

sin ⁡ θ = 2 sin ⁡ θ 2 cos ⁡ θ 2 \sin\theta = 2 \sin\frac{\theta}{2} \cos\frac{\theta}{2} sinθ=2sin2θcos2θ

Also, using the fact that sqrt(a)sqrt(b) = sqrt(ab) lets us perform one less square root.
此外,利用 sqrt(a) * sqrt(b) = sqrt(ab) 这一性质,我们可减少一次平方根运算。

We can therefore come up with the following performance improvement:
由此,我们可得到如下性能优化后的代码:

quat quat::fromtwovectors(vec3 u, vec3 v)
{
    float norm_u_norm_v = sqrt(sqlength(u) * sqlength(v));
    float cos_theta = dot(u, v) / norm_u_norm_v;
    float half_cos = sqrt(0.5f * (1.f + cos_theta));
    float half_sin = sqrt(0.5f * (1.f - cos_theta));
    vec3 w = cross(u, v) / (norm_u_norm_v * 2.f * half_sin * half_cos);
    return quat(half_cos,
                half_sin * w.x,
                half_sin * w.y,
                half_sin * w.z);
}

Oh wait! We divide by sin(θ/2) to compute w, then we multiply by sin(θ/2) again. This means we don’t even need that variable, and we can simplify even further:
哦等等!我们在计算 w 时除以了 sin(θ/2),随后又乘以 sin(θ/2)。这意味着该变量完全可以省略,代码可进一步简化:

quat quat::fromtwovectors(vec3 u, vec3 v)
{
    float norm_u_norm_v = sqrt(sqlength(u) * sqlength(v));
    float cos_theta = dot(u, v) / norm_u_norm_v;
    float half_cos = sqrt(0.5f * (1.f + cos_theta));
    vec3 w = cross(u, v) / (norm_u_norm_v * 2.f * half_cos);
    return quat(half_cos, w.x, w.y, w.z);
}

This is more or less the code used by the Ogre3D engine in OgreVector3.h, except they perform an additional normalisation step on the final result. This is mathematically useless, but due to numerical stability issues, it is probably safe to do so nonetheless.
这与 Ogre3D 引擎在 OgreVector3.h 中使用的代码大致相同,区别仅在于后者对最终结果额外执行了一次归一化步骤。从数学角度看这毫无必要,但考虑到数值稳定性问题,这么做或许更为稳妥。

This final normalisation step is actually an opportunity to simplify the code even further.
而这最后的归一化步骤,恰恰为代码的进一步简化提供了契机。

Improving on Ogre3D

优化 Ogre3D 实现方案

We are down to two square roots and four divisions, plus quite a few mul/adds. Depending on the platform that we are running on, it is possible to simplify even further and improve performance. For instance, on many SIMD architectures, normalising a quaternion can be very fast.
此时代码仅包含两次平方根运算、四次除法运算,以及若干次乘加运算。根据运行平台的不同,我们还能进一步简化代码并提升性能。例如,在许多 SIMD 架构下,四元数的归一化操作速度极快。

This is the code we get if we multiply every component of the quaternion by 2.f * half_cos and let normalize() do the rest of the job:
若将四元数的每个分量都乘以 2.f * half_cos,并交由 normalize() 完成后续处理,可得到如下代码:

quat quat::fromtwovectors(vec3 u, vec3 v)
{
    float norm_u_norm_v = sqrt(sqlength(u) * sqlength(v));
    float cos_theta = dot(u, v) / norm_u_norm_v;
    float half_cos = sqrt(0.5f * (1.f + cos_theta));
    vec3 w = cross(u, v) / norm_u_norm_v;
    return normalize(quat(2.f * half_cos * half_cos, w.x, w.y, w.z));
}

Now half_cos only appears in its squared form, and since it comes from a square root, we can simply omit that square root:
此时 half_cos 仅以平方形式出现,而它本身源于平方根运算,因此我们可直接省去该平方根运算:

quat quat::fromtwovectors(vec3 u, vec3 v)
{
    float norm_u_norm_v = sqrt(sqlength(u) * sqlength(v));
    float cos_theta = dot(u, v) / norm_u_norm_v;
    vec3 w = cross(u, v) / norm_u_norm_v;
    return normalize(quat(1.f + cos_theta, w.x, w.y, w.z));
}

And using the same reasoning we can multiply every quaternion component by norm_u_norm_v:
基于相同的思路,我们还可将四元数的所有分量乘以 norm_u_norm_v

quat quat::fromtwovectors(vec3 u, vec3 v)
{
    float norm_u_norm_v = sqrt(sqlength(u) * sqlength(v));
    vec3 w = cross(u, v);
    quat q = quat(norm_u_norm_v + dot(u, v), w.x, w.y, w.z);
    return normalize(q);
}

We are still doing two square roots and four divisions, some of which are hidden in normalize(), but the code is considerably shorter now.
此时代码仍包含两次平方根运算和四次除法运算(部分运算隐藏在 normalize() 中),但代码长度已大幅缩短。

Final form

最终形式

If u and v can be enforced to be unit vectors, norm_u_norm_v can be omitted and simply replaced with 1.0f:
若能确保 uv 为单位向量,则可省略 norm_u_norm_v,直接替换为 1.0f

quat quat::fromtwovectors(vec3 u, vec3 v)
{
    vec3 w = cross(u, v);
    quat q = quat(1.f + dot(u, v), w.x, w.y, w.z);
    return normalize(q);
}

Isn’t it beautiful, considering the sin(), cos() and acos() ridden mess we started with?
对比最初充斥着 sin()cos()acos() 的繁琐代码,这样的实现难道不优雅吗?

This algorithm can be found all over the Internet, but I do not know who first came up with it. Also, a lot of 3D engines (both publicly available and slightly more private) could benefit from it.
该算法在互联网上随处可见,但我并不清楚其首创者。此外,许多 3D 引擎(无论是开源的还是半闭源的)都能从中获益。

  • Posted: 2013-09-18 18:13 (Updated: 2013-09-18 19:10)
    Author: sam
    Categories: maths optim

avoiding trigonometry - 2013

避免三角函数运算 - 2013

Intro

引言

I think we should use less trigonometry in computer graphics. As my understanding of projections, reflections, and vector operations improved over time, I experienced a growing unease every time I saw trigonometry at the core of 3D algorithms inside rendering engines. These days I’m at a point where if I see some asin, acos, atan, sin, cos or tan in the middle of a 3D algorithm, I’ll assume the code was written by an inexperienced programmer and that it needs review.
我认为在计算机图形学领域应减少三角函数的使用。随着我对投影、反射和向量运算的理解逐步加深,每当看到渲染引擎中 3D 算法的核心部分包含三角函数时,内心的不安感便与日俱增。如今,只要在 3D 算法中看到 asin、acos、atan、sin、cos 或 tan 这类函数,我就会认定这段代码出自经验欠缺的程序员之手,且亟需审查优化。

Now, don’t get me wrong. Trigonometry is convenient and necessary for data input and for feeding the larger algorithm. What’s wrong is when angles and trigonometry suddenly emerge deep in the internals of a 3D engine or algorithm out of nowhere. In other words, where the inputs and the outputs to an algorithm of function are vectors and geometry, and suddenly in the middle of the implementation trigonometry appears. Because, most of the time (if not all of the time) such a thing is unnecessarily complicated, error prone and overall unnecessary. It also gets in the way of elegance and truth, if that matters to you. Let me explain.
但请不要误解我的意思。三角函数在数据输入以及为复杂算法提供输入时,确实既便捷又不可或缺。问题的症结在于,当角度和三角函数毫无缘由地出现在 3D 引擎或算法的深层内部逻辑中时——也就是说,当一个函数/算法的输入和输出均为向量与几何图形,但其实现过程中却突兀地出现三角函数时,就会产生问题。因为绝大多数情况下(甚至可以说所有情况下),这种做法会让代码变得无端复杂、易出错,且完全没有必要。如果你在意代码的简洁性与本质性,这种做法还会破坏代码的优雅性,掩盖问题的核心逻辑。我来具体说明一下。

I’ve already discussed in other places before how the dot and cross products encode all the information you need to deal with orientation related operations. These two products respectively capture all that you’ve learnt about cosine and sine of angles. Actually, cosine and sine are really dot and cross products in disguise, which has been taken out of a geometric context and ripped of physicality. For example, they operate on “angles” which are a rather abstract thing compared to the actual “vectors” or “lines” that the two products operate on. Anyways, my claim for now is that when working with objects and vectors, going to angular and trigonometry land is an error. But let me make this claim more concrete with one typical example:
我此前在其他场合已经阐述过,点积和叉积如何编码了处理方向相关操作所需的全部信息。这两种乘积分别对应了我们所学的角度余弦(cosine)和正弦(sine)的所有特性。事实上,余弦和正弦本质上就是经过“伪装”的点积和叉积——它们被剥离了几何语境,失去了物理意义。比如,三角函数作用于“角度”这种相对抽象的概念,而点积和叉积则作用于“向量”或“直线”这些具体的几何对象。总之,我的核心观点是:在处理物体和向量相关问题时,转而使用角度和三角函数是一种错误的做法。接下来,我将通过一个典型示例来具体佐证这一观点:

The suboptimal (wrong?) way to orientate an object

物体定向的非最优(甚至错误?)方式

Say you have this function, which computes a matrix that rotates vectors around a given axis v by an amount of a radians. Many 3D engines, renderers or math libraries will have one such routine. The function will look something like this:
假设你有这样一个函数:它计算一个旋转矩阵,该矩阵可将向量绕给定轴 v 旋转 a 弧度。许多 3D 引擎、渲染器或数学库中都会包含这类函数,其代码大致如下:

mat3x3 rotationAxisAngle( const vec3 & v, float a ) {    
    const float si = sinf( a );    
    const float co = cosf( a );    
    const float ic = 1.0f - co;     
    return mat3x3( v.x*v.x*ic + co,       v.y*v.x*ic - si*v.z,    v.z*v.x*ic + si*v.y,                  
                   v.x*v.y*ic + si*v.z,   v.y*v.y*ic + co,        v.z*v.y*ic - si*v.x,                  
                   v.x*v.z*ic - si*v.y,   v.y*v.z*ic + si*v.x,    v.z*v.z*ic + co ); 
}

So now imagine you are somewhere in the internals of your project, writing some code that needs to set the orientation of an object in a given direction. For example, you are aligning a spaceship to an animation path, by making sure the spaceship’s z axis aligns with the path’s tangent or direction vector d. So, you decide you want to use the rotationAxisAngle() function for that. Cool.
现在假设你正在项目的底层逻辑中编写代码,需要将某个物体定向至指定方向。比如,你要让一艘宇宙飞船与动画路径对齐,确保飞船的 z 轴与路径的切向量(或方向向量)d 重合。于是,你决定使用 rotationAxisAngle() 函数来实现这一功能,看似没问题。

Then, you measure the angle between the spaceship’s z axis and the desired orientation vector d so you know by how much you’ll need to rotate it. Since you are a graphics programmer, you know you can do this by taking the dot product of the two vectors and then extracting the angle from it with an acos() call. Great. Also, because you know that acos() can return nonsense if its argument lays outside of the -1…1 range, you decide to clamp the dot product to -1…1 before calling acos(), just in case. “Because floating point”. Fine. Although at this stage one kitten has already been murdered. But since you don’t know that yet, you proceed to writing the rest of the code.
接下来,你需要计算飞船 z 轴与目标方向向量 d 之间的夹角,以确定旋转角度。作为图形程序员,你知道可以先计算两个向量的点积,再通过 acos() 函数从点积反推出角度,这一步看似合理。此外,你还清楚若 acos() 的参数超出 -1 到 1 的范围,函数会返回无意义的结果,因此为了保险起见,你决定在调用 acos() 前将点积值限制在 -1.0f 到 1.0f 之间——毕竟“浮点数运算总会有误差”。这一步看似也没问题,但其实此时代码已埋下隐患(注:原文“one kitten has already been murdered”为比喻,指代码出现隐性问题)。只是你尚未察觉,仍继续编写后续代码。

So, next you compute the rotation axis, which you know is the cross product of your z and d vectors, for all points in the spaceship need to rotate in planes parallel to that spanned by these two vectors. Then you decide to normalize the rotation axis vector, just because you often do so just to make sure things will work, regardless of whether you actually needed to normalize the vector or not. And so, in the end, your code looks like this:
随后,你计算旋转轴——你知道旋转轴是 zd 两个向量的叉积,因为飞船上所有点都需要在与这两个向量张成的平面平行的平面内旋转。接着,你决定对旋转轴向量做归一化处理,仅仅是因为你习惯性地这么做以确保代码能正常运行,而不管该向量是否真的需要归一化。最终,你的代码写成了这样:

const vec3   axi = normalize( cross( z, d ) );    
const float  ang = acosf( clamp( dot( z, d ), -1.0f, 1.0f ) );    
const mat3x3 rot = rotationAxisAngle( axi, ang );

This does work. But I claim it’s wrong. To see why, let’s expand the code of rotationAxisAngle() in place and see the whole picture of what’s mathematically going on:
这段代码确实能运行,但我认为它是错误的。为说明原因,我们将 rotationAxisAngle() 函数的代码展开,看看背后的数学逻辑全貌:

const vec3   axi = normalize( cross( z, d ) );    
const float  ang = acosf( clamp( dot( z, d ), -1.0f, 1.0f ) );    
const float  co = cosf( ang );    
const float  si = sinf( ang );    
const float  ic = 1.0f - co;    
const mat3x3 rot = mat3x3( axi.x*axi.x*ic + co,         axi.y*axi.x*ic - si*axi.z,    axi.z*axi.x*ic + si*axi.y,                               
                           axi.x*axi.y*ic + si*axi.z,   axi.y*axi.y*ic + co,         axi.z*axi.y*ic - si*axi.x,                               
                           axi.x*axi.z*ic - si*axi.y,   axi.y*axi.z*ic + si*axi.x,   axi.z*axi.z*ic + co );

As you have already probably noticed, we are performing an rather imprecise and expensive acos() call just to undo it immediately after by computing a cos() on its return value. cos(acos(x)) is just x after all. This begs the question: why not skip the acos()/cos() chain altogether?
你或许已经注意到:我们先调用了精度低且计算成本高的 acos() 函数,紧接着又对其返回值调用 cos() 函数——而 cos(acos(x)) 的结果本质上就是 x。这就引出一个问题:为何不直接跳过 acos()/cos() 这一串无意义的操作?

The key insight here is that this is not just a mere optimization (skipping cos, acos and clamp calls), but that maybe what we are revealing here is a deeper mathematical structure or relationship. Indeed, while sometimes simplifications and symmetries in mathematical expressions are certainly just accidental and meaningless, more often than not they actually are the consequence of deep truths. So let’s keep exploring what’s going on in our code, see if we can learn anything here.
核心要点在于:这不仅仅是简单的优化(跳过 cos、acos 和 clamp 调用),更重要的是,这一过程揭示了更深层次的数学结构或关系。诚然,数学表达式中的某些简化和对称性可能只是偶然且无意义的,但更多时候,它们恰恰是底层数学规律的体现。接下来,我们继续分析代码中的逻辑,看看能从中发现什么。

So, a first roadblock seems to be that despite we could in principle simplify the cos(acos(x)) chain, it seems we need to compute the angle or rotation anyways for our sin() function that comes right after the cosine. So we might not be able to simplify things much after all. However this is where we start using the dot and cross products.
表面上看,第一个障碍是:尽管我们理论上可简化 cos(acos(x)) 这一操作,但后续的 sin() 函数似乎仍需用到旋转角度,因此我们似乎无法大幅简化代码。不过,这正是点积和叉积能发挥作用的地方。

If you are familiar with the cross product, you might know that it somehow encodes sines, just like dot products encode cosines. If you never realized this, and that’s fine, that’s why we are here today. A hint is that the length of a cross product is the length of the two vectors involved times the sine of the angle they form (while the dot product is the lengths of the vectors times the cosine of the angle they form).
如果你熟悉叉积,可能知道它在某种程度上编码了正弦值,就像点积编码了余弦值一样。若你此前从未意识到这一点也无妨——这正是我们今天要讲解的核心内容。关键提示:叉积的模长等于两个向量的模长乘以它们夹角的 正弦值(而点积等于两个向量的模长乘以它们夹角的 余弦值)。

So this is the point - wherever there’s a dot product, chances are there’s a cross product lying around, sometimes a bit hidden perhaps, that completes the picture of what’s geometrically happening (an alignment and rotation in our case). And so, generally, for any parallel thing you can measure (dot products, cosines) there’s a perpendicular component that you can measure too (cross product, sines). So, this is all a bit abstract perhaps, so let’s go ahead and check this out:
这就是核心逻辑——只要有点积出现的地方,大概率能找到叉积(有时可能隐藏较深),二者共同完整描述了几何层面的变化(本文示例中为对齐与旋转)。通常而言,对于任何可度量的平行分量(点积、余弦值),都存在对应的垂直分量可被度量(叉积、正弦值)。这些概念或许稍显抽象,接下来我们通过实际代码验证:

The proper way to do it

正确的实现方式

We can extract the sine of the angle between z and d by just looking at the length of their cross product. Indeed, z and d are normalized so the length of their cross product is the sine of the angle they form times one and times one again! Which means we can and should rewrite our operation this way:
我们可直接通过 zd 叉积的模长得到它们夹角的正弦值。因为 zd 均为归一化向量,所以其叉积的模长等于夹角的正弦值乘以 1 再乘以 1(即正弦值本身)!这意味着我们可以(也应该)这样重写操作:

mat3x3 rotationAlign( const vec3 & d, const vec3 & z ) {    
    const vec3  ax = cross( z, d );    
    const float co = dot( z, d );    
    const float si = length( ax );    
    const vec3   v = ax/si;    
    const float ic = 1.0f - co;     
    
    return mat3x3( v.x*v.x*ic + co,       v.y*v.x*ic - si*v.z,    v.z*v.x*ic + si*v.y,                  
                   v.x*v.y*ic + si*v.z,   v.y*v.y*ic + co,        v.z*v.y*ic - si*v.x,                  
                   v.x*v.z*ic - si*v.y,   v.y*v.z*ic + si*v.x,    v.z*v.z*ic + co ); 
}

Good, no normalize(), no acos(), no clamp() anymore. We are definitely getting somewhere now. Let’s keep going and also get rid of the length/square root operation by noticing that si²=1-co² (since sine and cosine lay on the unit circle) and by propagating the 1/si term into the matrix and getting some cancellations. What we get is:
很好,代码中已不再有 normalize()acos()clamp() 函数。我们的优化显然取得了进展。接下来,我们继续优化:利用 si² = 1 - co²(因正弦和余弦满足单位圆公式),并将 1/si 项代入矩阵中约分,从而去掉长度/平方根运算。最终得到:

mat3x3 rotationAlign( const vec3 & d, const vec3 & z ) {    
    const vec3  v = cross( z, d );    
    const float c = dot( z, d );    
    const float k = (1.0f-c)/(1.0f-c*c);     
    
    return mat3x3( v.x*v.x*k + c,     v.y*v.x*k - v.z,    v.z*v.x*k + v.y,                  
                   v.x*v.y*k + v.z,   v.y*v.y*k + c,      v.z*v.y*k - v.x,                  
                   v.x*v.z*k - v.y,   v.y*v.z*k + v.x,    v.z*v.z*k + c    ); 
}

Lastly, we can simplify k to k = 1/(1+c) which is more elegant and also moves the two singularities in k and whole function (d and z are parallel) into a single singularity (when d and z are the same vector, in which case there’s no clear rotation that is best anyways (thanks Zoltan Vrana for this tip). So, the final function looks like this:
最后,我们可将 k 简化为 k = 1/(1 + c),此举既更简洁优雅,又能将 k 和整个函数中的两个奇点(即 dz 平行的情况)合并为一个奇点(仅当 dz 为同一向量时——此时本就不存在“最优”旋转方式,感谢 Zoltan Vrana 提供该优化思路)。最终函数如下:

mat3x3 rotationAlign( const vec3 & d, const vec3 & z ) {    
    const vec3  v = cross( z, d );    
    const float c = dot( z, d );    
    const float k = 1.0f/(1.0f+c);     
    
    return mat3x3( v.x*v.x*k + c,     v.y*v.x*k - v.z,    v.z*v.x*k + v.y,                  
                   v.x*v.y*k + v.z,   v.y*v.y*k + c,      v.z*v.y*k - v.x,                  
                   v.x*v.z*k - v.y,   v.y*v.z*k + v.x,    v.z*v.z*k + c    ); 
}

Conclusion

结论

So look at that beauty - there aren’t any trigonometric or inverse trigonometric functions, nor clamp or normalizations or even square roots. This is great from a stability and performance point of view. And also, we have solved a problem relating vectors using only vectors, which just feels right.
看看这段优雅的代码——无任何三角函数或反三角函数,也无 clamp、归一化甚至平方根运算。从稳定性和性能角度而言,这无疑是极佳的。此外,我们完全通过向量运算解决了向量相关问题,这在逻辑上也更为自洽。

Now, I’m not going to claim that in this particular example of today we got an intuitive interpretation of the final matrix. Neither we had it in the beginning, so nothing was lost there. Indeed, all I can think of doing to the matrix above is decompose it as
我并非声称在本文这个具体示例中,我们能直观解释最终的矩阵形式——毕竟最初的矩阵也无直观解释,因此我们并未损失任何信息。事实上,我能对上述矩阵做的唯一拆解是:

M = v ⋅ v T 1 + c + ( c − v . z v . y v . z c − v . x − v . y v . x c ) M = \frac{v \cdot v^T}{1+c} + \begin{pmatrix} c & -v.z & v.y \\ v.z & c & -v.x \\ -v.y & v.x & c \end{pmatrix} M=1+cvvT+ cv.zv.yv.zcv.xv.yv.xc

The point is, in almost all situations you can perform similar trigonometric simplifications and slowly untangle and uncover a simpler vector expression that describes the same problem. Also, oftentimes you can actually rethink the whole problem in terms of vectors from the beginning. For that you’ll need to develop some intuition about dot products, cross products and projections. But once you have it you’ll find that many of the trigonometric identities you find in literature are just a statement about some particular vector configurations, so you’ll never have to memorize or look up the identities again. But I digress.
核心要点是:在几乎所有场景下,你都能进行类似的三角函数简化,逐步梳理并找到能描述同一问题的更简洁的向量表达式。此外,很多时候你甚至能从一开始就用向量视角重新思考整个问题。要做到这一点,你需培养对点积、叉积和投影的直觉。一旦建立这种直觉,你会发现文献中的诸多三角恒等式,本质上只是特定向量构型的表述——从此无需死记硬背或查阅这些恒等式。不过这已偏离主题。

I’ll just conclude saying that part of the problem with overusing trigonometry in 3D rendering codebases comes from poorly designed third party APIs, like that of rotationAxisAngle(v,a) or WebXR stereo projection information, which work on angles and force their users to use angles too, infecting our codebases. Sometimes the cost is a performance or stability hit; some other times there’s no performance cost but either way I think we should strive to doing more vectors operations and less trigonometry!
最后我想说:3D 渲染代码中过度使用三角函数的部分原因,源于设计不佳的第三方 API(如 rotationAxisAngle(v,a) 或 WebXR 立体投影相关接口)——这些 API 基于角度设计,迫使开发者也使用角度,进而“污染”整个代码库。有时这会导致性能下降或稳定性问题;有时虽无性能损耗,但无论如何,我认为我们都应尽量多使用向量运算,少用三角函数!


a sin/cos trick - 2010

一个正弦/余弦的实用技巧 - 2010

You probably remember this identity from school:
你或许还记得上学时学过的这个恒等式:

sin ⁡ ( α + β ) = sin ⁡ α ⋅ cos ⁡ β + cos ⁡ α ⋅ sin ⁡ β cos ⁡ ( α + β ) = cos ⁡ α ⋅ cos ⁡ β − sin ⁡ α ⋅ sin ⁡ β \sin(\alpha+\beta)=\sin\alpha \cdot \cos\beta + \cos\alpha \cdot \sin\beta\\\cos(\alpha+\beta)=\cos\alpha \cdot \cos\beta - \sin\alpha \cdot \sin\beta sin(α+β)=sinαcosβ+cosαsinβcos(α+β)=cosαcosβsinαsinβ

When you are a kid and are first introduced to it your first feeling is the pain of having to memorize it. That’s too bad, because in fact you don’t need to memorize anything as this formula arises naturally when you draw the rotation of a triangle, on a piece of paper. That’s actually what I do when I need this formula down but can’t remember it. I draw it. You probably will be able to do so too by the time we are half way down this tutorial. But for now, and to keep the things fun and delay the eureka moment for a bit, let’s first think when would we need this formula in the first place.
小时候初次接触这个公式时,你的第一反应大概率是头疼——因为要背下来。但其实没必要,因为只要在纸上画出三角形的旋转过程,这个公式就会自然推导出来。我自己忘记这个公式时,就是这么做的。读到本教程一半时,你应该也能做到。不过现在,为了保持趣味性,先暂缓揭晓这个“顿悟时刻”,我们先思考一个问题:我们究竟在什么场景下会用到这个公式?

Intro

引言

The sin and cos trigonometric functions are probably the functions that most frequently appear in computer graphics, for they are the simplest (but not only) way to describe any circular motion and shape in a parametric way. Among their uses, we have the generation of circles or 3D revolution objects, the computation of Fourier Transforms, the synthesis of procedural waves, the creation of noise, the production of software sound synthesizer, etc etc. In all these cases, sin, cos or both are repeatedly called inside a loop, similar to this:
sincos 三角函数或许是计算机图形学中最常出现的函数,因为它们是用参数化方式描述圆周运动和圆形轮廓最简单(但非唯一)的方法。其应用场景包括:生成圆形或 3D 旋转体、计算傅里叶变换、合成程序化波形、生成噪声、制作软件声音合成器等等。在这些场景中,sincos(或两者)会在循环中被反复调用,示例如下:

for( int n=0; n<num; n++ ) {    
    const float t = 2.0f*PI*(float)n/(float)num;    
    const float s = sinf(t);    
    const float c = cosf(t);     
    
    // do something with s and c    
    ... 
}

So let me rewrite this loop in an incremental fashion now, so that it’s easier to see that given an iteration n of the loop which corresponds to a phase t, the next iteration n+1 is going to compute the sin and cos of t+f. In other words, after computing sin(t) and cos(t), the next iteration needs to compute sin(t+f) and cos(t+f):
现在我将这个循环改写成增量形式,便于理解:循环第 n 次迭代对应相位 t,第 n+1 次迭代则需计算 t+f 的正弦和余弦值。也就是说,计算完 sin(t)cos(t) 后,下一次迭代要计算 sin(t+f)cos(t+f)

const float f = 2.0f*PI/(float)num; 
const float t = 0.0f; 
for( int n=0; n < num; n++ ) {    
    const float s = sinf(t);    
    const float c = cosf(t);     
    
    // do something with s and c    
    ...    
    t += f; 
}

So we can try to think if there’s a way to compute sin(t+f) and cos(t+f) from sin(t) and cos(t), because if there is one and it doesn’t involve expensive functions like square roots or trigonometrics, then we’d be able to take the expensive evaluations of sin(t) and cos(t) out of the loop!
由此我们可以思考:是否能通过 sin(t)cos(t) 计算出 sin(t+f)cos(t+f)?如果可行,且该方法不涉及平方根、三角函数等高开销函数,那么我们就能将耗时的 sin(t)cos(t) 计算移出循环!

Well, let’s look no further than the formula at the top of the article, the one we memorized as kids. According to it:
答案就在本文开头那个小时候背过的公式里:

sin ⁡ ( t + f ) = sin ⁡ ( f ) ⋅ cos ⁡ ( t ) + cos ⁡ ( f ) ⋅ sin ⁡ ( t ) cos ⁡ ( t + f ) = cos ⁡ ( f ) ⋅ cos ⁡ ( t ) − sin ⁡ ( f ) ⋅ sin ⁡ ( t ) \sin(t+f) = \sin(f) \cdot \cos(t) + \cos(f) \cdot \sin(t)\\\cos(t+f) = \cos(f) \cdot \cos(t) - \sin(f) \cdot \sin(t) sin(t+f)=sin(f)cos(t)+cos(f)sin(t)cos(t+f)=cos(f)cos(t)sin(f)sin(t)

or in more coder-friendly manner:
或者用更贴近编程的写法:

n e w _ s i n = sin ⁡ ( f ) ⋅ o l d _ c o s + cos ⁡ ( f ) ⋅ o l d _ s i n n e w _ c o s = cos ⁡ ( f ) ⋅ o l d _ c o s − sin ⁡ ( f ) ⋅ o l d _ s i n new\_sin = \sin(f) \cdot old\_cos + \cos(f) \cdot old\_sin\\new\_cos = \cos(f) \cdot old\_cos - \sin(f) \cdot old\_sin new_sin=sin(f)old_cos+cos(f)old_sinnew_cos=cos(f)old_cossin(f)old_sin

Now, because f is a constant, so are cos(f) and sin(f). Let’s call them a and b so that:
由于 f 是常量,因此 cos(f)sin(f) 也是常量。我们将其分别记为 ab,则:

n e w _ s i n = b ⋅ o l d _ c o s + a ⋅ o l d _ s i n n e w _ c o s = a ⋅ o l d _ c o s − b ⋅ o l d _ s i n new\_sin = b \cdot old\_cos + a \cdot old\_sin\\new\_cos = a \cdot old\_cos - b \cdot old\_sin new_sin=bold_cos+aold_sinnew_cos=aold_cosbold_sin

This expression can be directly used in our code, resulting in a loop that hasn’t any expensive trigonometric function at all, just like we hoped for. See:
该表达式可直接应用到代码中,最终得到的循环完全不含高开销的三角函数,正如我们期望的那样:

const float f = 2.0f*PI/(float)num; 
const float a = cosf(f); 
const float b = sinf(f); 
float s = 0.0f; 
float c = 1.0f; 
for( int n=0; n<num; n++ ) {    
    // do something with s and c    
    ...     
    
    const float ns = b*c + a*s;    
    const float nc = a*c - b*s;    
    c = nc;    
    s = ns; 
}

The interpretation

原理阐释

So far we have blindly used the math identity we learnt at school without really talking about what we’ve been actually doing. So let’s first rewrite the inner loop this way:
到目前为止,我们只是机械套用了上学时学的数学恒等式,却未解释其背后的原理。首先,我们将循环内的核心逻辑改写为:

s n + 1 = s n ⋅ a + c n ⋅ b c n + 1 = c n ⋅ a − s n ⋅ b s_{n+1} = s_n \cdot a + c_n \cdot b\\c_{n+1} = c_n \cdot a - s_n \cdot b sn+1=sna+cnbcn+1=cnasnb

Some of you might have already noticed that this is nothing but the traditional way to compute 2D rotations. If you didn’t recognize it, perhaps rewriting it in matrix form helps:
部分读者或许已发现,这正是计算 2D 旋转的经典方式。若你未识别出来,写成矩阵形式会更清晰:

( s n + 1 c n + 1 ) = ( a b − b a ) ⋅ ( s n c n ) \begin{pmatrix} s_{n+1} \\ c_{n+1} \end{pmatrix} = \begin{pmatrix} a & b \\ -b & a \end{pmatrix} \cdot \begin{pmatrix} s_n \\ c_n \end{pmatrix} (sn+1cn+1)=(abba)(sncn)

Indeed, sin(t) and cos(t) can be grouped as a vector of length one; let’s call it x:
事实上,sin(t)cos(t) 可组合成一个单位长度的向量,我们记为 x

x = ( sin ⁡ t , cos ⁡ t ) T \boldsymbol{x} = (\sin t, \cos t)^T x=(sint,cost)T

Then, the vectorial form of the expression above looks like
那么,上述表达式的向量形式为:

x n + 1 = R ⋅ x n \boldsymbol{x}_{n+1} = \boldsymbol{R} \cdot \boldsymbol{x}_n xn+1=Rxn

with R being the {a, b, -b, a} 2x2 matrix. So, we see that our loop is performing a little 2D rotation in each iteration, such that x rotates in a circle during the execution of the loop. Every iteration x rotates by an angle of 2π/num radians.
其中 R 是由 {a, b, -b, a} 构成的 2x2 矩阵。由此可见,循环的每一次迭代都在执行一次微小的 2D 旋转,使得向量 x 在循环执行过程中沿圆形轨迹旋转,每次迭代旋转的角度为 2π/num 弧度。

So, basically
因此,本质上

sin ⁡ ( α + β ) = sin ⁡ α ⋅ cos ⁡ β + cos ⁡ α ⋅ sin ⁡ β cos ⁡ ( α + β ) = cos ⁡ α ⋅ cos ⁡ β − sin ⁡ α ⋅ sin ⁡ β \sin(\alpha+\beta)=\sin\alpha \cdot \cos\beta + \cos\alpha \cdot \sin\beta\\\cos(\alpha+\beta)=\cos\alpha \cdot \cos\beta - \sin\alpha \cdot \sin\beta sin(α+β)=sinαcosβ+cosαsinβcos(α+β)=cosαcosβsinαsinβ

as we learnt at school, must be a 2D rotation formula of a point x = (cos(α), sin(α))T in the circle by an angle of β radians. To check that, we first construct one of the two axes of the rotation, ie, u = ( cos(β), sin(β) )T. The first component of the rotation is going to be the projection of x into u. Because both x and u have a length of one, so the projection is just their dot product. Therefore:
我们上学时学的这组公式,其实是单位圆上的点 x = (cosα, sinα)ᵀ 绕原点旋转 β 弧度的 2D 旋转公式。为验证这一点,我们先构造旋转的其中一个轴,即 u = (cosβ, sinβ)ᵀ。旋转后的第一个分量是 xu 上的投影——由于 xu 均为单位向量,投影结果即为二者的点积:

s = ⟨ x , u ⟩ = sin ⁡ α ⋅ cos ⁡ β + cos ⁡ α ⋅ sin ⁡ β s = \langle \boldsymbol{x}, \boldsymbol{u} \rangle = \sin\alpha \cdot \cos\beta + \cos\alpha \cdot \sin\beta s=x,u=sinαcosβ+cosαsinβ

And of course, the second component is its antiprojection, which we can do by projecting in a perpendicular axis, v. We can construct this vector by reversing the coordinates of u and negating one of them:
而第二个分量则是其“反投影”,可通过将 x 投影到与 u 垂直的轴 v 上得到。v 可通过反转 u 的坐标并取反其中一个分量得到:

c = ⟨ x , v ⟩ = cos ⁡ α ⋅ cos ⁡ β − sin ⁡ α ⋅ sin ⁡ β c = \langle \boldsymbol{x}, \boldsymbol{v} \rangle = \cos\alpha \cdot \cos\beta - \sin\alpha \cdot \sin\beta c=x,v=cosαcosβsinαsinβ

Some notes

补充说明

So, now we have a way to generate circles, waves and many other undulating shapes and functions without using sin and cosine function, just by doing a couple of multiplications and additions instead per loop iteration. In theory we should be able to perform this tiny rotations per iteration for as many iterations as we wised, since
至此,我们得到了一种无需调用 sin 和 cos 函数即可生成圆形、波形及其他周期形状/函数的方法——仅需在每次循环迭代中执行几次乘法和加法运算。理论上,我们可按需求执行任意次数的这种微小旋转,因为

R T ⋅ R = I \boldsymbol{R}^T \cdot \boldsymbol{R} = \boldsymbol{I} RTR=I

what means that R is neither a contracting nor expanding transformation indeed (it’s a rotation after all!). Or in other words, we expect x to move in a perfect circle without every spiraling in or out. However due to numerical precision issues, x will eventually move in a spiral and fall into the origin. I never had this issue in my applications, but I’m guessing that for very big values of num, ie, little rotation angles, one can get some problems.
这意味着 R 既非收缩变换也非扩张变换(毕竟它是旋转矩阵!)。换句话说,我们期望向量 x 沿完美的圆形轨迹运动,不会向内或向外螺旋偏移。但受数值精度限制,x 最终仍会逐渐螺旋式趋向原点。我在自己的项目中从未遇到过这个问题,但推测当 num 取值极大(即每次旋转角度极小时),可能会出现相关问题。

An example

示例

In Kindercrasher, a realtime 4 kilobytes demo from 2008, a group of spheres do pulsate to the music. For that I computed the Fourier Transform of the sound in windows of 4096 samples (every frame though, so windows overlapped). Fast algorithms exist to do this in realtime, like the FFT. However, given that my code had to fit in no more than a few hundred bytes, I decided to do it in a different way. Instead of implementing the FFT I implemented the Discrete Fourier Transform directly from its very basic definition. You can check it on the Wikipedia.
在 2008 年发布的 4KB 实时演示程序 Kindercrasher 中,一组球体随音乐节奏脉动。为实现该效果,我对每帧的 4096 个采样点的音频窗口进行傅里叶变换(窗口存在重叠)。虽然有快速傅里叶变换(FFT)这类实时高效的算法,但由于代码量需控制在几百字节以内,我选择了另一种方式:不实现 FFT,而是直接根据离散傅里叶变换(DFT)的基础定义来实现。你可在维基百科上查阅该定义。

My function takes a stereo 16 bit sound buffer, x, and returns the first 128 frequencies in the sound spectrum of the sound in y. Note how the inner loop, that one iterating 4096 times, has no *sin* or *cos* calls, as the naive implementation would.
我的函数接收一个立体声 16 位音频缓冲区 x,返回音频频谱中前 128 个频率分量并存储在 y 中。请注意,内层循环(迭代 4096 次)完全没有像朴素实现那样调用 sincos 函数。

#include <math.h> 
void iqDFT12( float *y, const short *x ) {    
    for( int i=0; i<128; i++ )    {        const float wi = (float)i*(2.0f*3.1415927f/4096.0f);        const float sii = sinf( wi );        const float coi = cosf( wi );          float co = 1.0f;        float si = 0.0f;        float acco = 0.0f;        float acsi = 0.0f;        for( int j=0; j<4096; j++ )        {            const float f = (float)(x[2*j+0]+x[2*j+1]);            const float oco = co;            acco += co*f; co = co*coi -  si*sii;            acsi += si*f; si = si*coi + oco*sii;        }        y[i] = sqrtf(acco*acco+acsi*acsi)*(1.0f/32767.0f);    } 
}

snap45 - 2025

45 度向量对齐(snap45) - 2025

Intro

引言

This is another example of avoiding trigonometry that came up the other day. I recommend reading Pat I and Part II of this miniseries where I propose that using trigonometric functions (sin, cos, tan) or their inverses (asin, acos, atan) in any geometry code or rendering system is always a code smell.
这是近期遇到的又一个摒弃三角函数的案例。我建议先阅读本系列的 第一部分 和 第二部分,其中我提出:在任何几何代码或渲染系统中使用三角函数(sin、cos、tan)或其反函数(asin、acos、atan),都应视为代码的“坏味道”。

This time, we’ll look at a challenge that got sent my way when I claimed one can always avoid the rather expensive atan()/atan2() function. The problem is that of snapping a 2D unit vector to the closest 45 degree angle (N, NE, E, SE, S, SW, W, NW). The claim was that you do need to commute the angle of such a vector, round it to the closest 45 degree angle, then reconstruct the vector. Something like this perhaps:
此次的场景是:我声称总能避开开销较高的 atan()/atan2() 函数,随后有人提出了一个挑战问题——将一个 2D 单位向量对齐到最近的 45 度方向(北、东北、东、东南、南、西南、西、西北)。对方认为,必须先计算向量的角度,将其四舍五入到最近的 45 度,再重构向量。代码大概如下:

vec2 snap45( in vec2 v ) {    
    const float r = 6.283185/8.0; // 45 degrees    
    float a = atan(v.y,v.x);      // vector to angle    
    float s = r*round(a/r);       // snap angle    
    return vec2(cos(s),sin(s));   // angle to vector 
}

This seems reasonable, and it does work - on the right side you can see how as the vector v in white moves, the yellow vector snaps to the closest 45 degrees direction. The triangular sectors are the basins of attraction of each direction. All good. But my claim was that surely you wouldn’t need trigonometry to achieve the same behavior, and that if this code happened to be in a perf critical path, you’d want to do better.
这段代码看似合理且能正常运行——右侧可看到,当白色向量 v 移动时,黄色向量会对齐到最近的 45 度方向,三角形区域是每个方向的“吸引域”。一切看似没问题,但我认为,实现相同效果完全无需三角函数;且若这段代码位于性能关键路径,还能进一步优化。

Basic proposal

基础优化方案

My immediate proposal was the most obvious one: all quadrants of the plane require the same treatment, so first use symmetry by taking the absolute value of the vector. Then realize that within the basic quadrant, there are only three possible outputs: horizontal (0 degree), diagonal (45 degree) and vertical (90 degree). The three cases can be easily determined by whether the x component of the vector is below 45-45/2 degrees or above 45+45/2 degrees. So the code I answered with was:
我的初步方案十分直观:平面的所有象限处理逻辑一致,因此先利用对称性取向量的绝对值;接着注意到,在基础象限内,仅存在三种输出可能:水平(0 度)、对角线(45 度)、垂直(90 度)。可通过向量的 x 分量判断属于哪种情况——若 x 分量小于 45-45/2 度对应的取值,或大于 45+45/2 度对应的取值,即可区分。我给出的代码如下:

vec2 snap45( in vec2 v )  // 1.8x faster {    
    vec2 s = sign(v);    
    float x = abs(v.x);    
    return x>0.923880?vec2(s.x,0.0):           
           x>0.382683?s*sqrt(0.5):                     
                      vec2(0.0,s.y); 
}

Here 0.923880 is cos(π/8), that is, the x component of a vector at 45-45/2 degrees, and 0.382683 is cos(3PI/8), that is, the x component of a vector at 45+45/2 degrees.
其中 0.923880 是 cos(π/8),即 45-45/2 度方向向量的 x 分量;0.382683 是 cos(3π/8),即 45+45/2 度方向向量的 x 分量。

I wouldn’t consider this logic and corresponding code an “optimized” one, just a common sense approach to not doing the stupid thing to begin with. Yet, it already performs 1.8 times faster than the naive atan() based approach.
我并不认为这个逻辑和代码是“极致优化”的结果,只是一种“避免做蠢事”的常规思路。即便如此,其性能仍比基于 atan() 的朴素实现快 1.8 倍。

A better approach

更优方案

Once I responded with this code I considered the challenge over, and so I archived the little shader code in Shadertoy for others to reuse. But as it often happens with the Shadertoy community, it didn’t take long for somebody to take my common sense approach and make it look bad. The user Spalmer, helped by a correction provided by user FordPerfect, came up with this equivalent code:
我给出上述代码后,本以为这个挑战已解决,便将这段着色器代码存档到 Shadertoy 供他人复用。但正如 Shadertoy 社区常出现的情况,很快有人优化了我的方案,让我的代码相形见绌。用户 Spalmer 在 FordPerfect 的修正建议下,写出了等效代码:

vec2 snap45( in vec2 v )  // 2.1x faster {    
    v = round(v*1.30656296); // sqrt(1.0+sqrt(0.5)) = sin(π/8)/2    
    return normalize(v);  
}

This code uses a different approach, but before we explored, this code runs 2.1 times faster than the original, ie, slightly faster than mine. The way it works is that instead of thinking of snapping in angular space, you can think of a 2D grid covering the plane, such that we’ll snap the vector to this grid. The value 1.30656296 is chosen such that the grid intersects exactly the 45-45/2 and 45+45/2 points; that means it is ½sin(π/8) = √(1+√½). If this seems a bit abstract at first, a drawing will convince you this is what you need.
这段代码采用了不同思路,性能比原始的 atan() 实现快 2.1 倍(略快于我的方案)。其核心逻辑是:不再在角度空间中做对齐,而是将平面覆盖一个 2D 网格,把向量对齐到该网格上。取值 1.30656296 是为了让网格恰好与 45-45/2 和 45+45/2 度的方向点相交——该值等于 ½sin(π/8) = √(1+√0.5)。若初看觉得抽象,画个图就能理解其合理性。

While this code was more than twice faster than the original atan() based one, I wasn’t satisfied because it needs a dot product and a inverse square root for the normalize() call. So I made my own contribution to it and further optimized it to this:
尽管这段代码的性能已是原始 atan() 实现的两倍多,但我仍不满意——因为 normalize() 调用涉及点积和平方根倒数运算。于是我在此基础上进一步优化,得到如下代码:

vec2 snap45( in vec2 v )  // 2.8x faster {    
    v = round(v*1.30656296); // sqrt(1.0+sqrt(0.5)) = sin(π/8)/2    
    return v*(abs(v.x)+abs(v.y)>1.5?sqrt(0.5):1.0); 
}

This now runs 2.8 times faster than the original atan() based implementation, not bad! You can see the source code for all these variants of the code here: https://www.shadertoy/view/M3ycWd.
这段代码的性能比原始的 atan() 实现快 2.8 倍,效果不错!你可在此 shadertoy 查看所有版本的代码。

Conclusion

结论

So, as I claimed more than a decade ago, you can solve almost all geometry or rendering problems without trigonometry, in a more intuitive and efficient way (and generalizable, if you care about that)
正如我十多年前所主张的,几乎所有几何或渲染问题都能在不使用三角函数的情况下解决,且解法更直观、高效(若你在意通用性,这类解法也具备可推广性)。

But a more important takeaway is perhaps that if you move past the “general solution” and decide to specialize, you always discover lots of structure in the problem that can be exploited to simplify code (more readable) and optimize. In this example, the uncovered structure is the mirroring symmetry.
但更重要的启示或许是:若你跳出“通用解法”的思维定式,针对具体场景做定制化优化,总能发现问题中可利用的内在结构,从而简化代码(提升可读性)并优化性能。本示例中,我们挖掘出的核心结构就是镜像对称性。

Programmers often forget this, or avoid it on purpose sometimes because of laziness, sometimes because they love to generalize and abstract (they are trained to always do so after all). But this is in many cases a mistake - specializing makes code simpler (and often more readable) and faster, and should be practiced in conjunction. Knowing when to abstract and when not to is an art. But try to always thing when it makes sense to do so. This is by the way true in almost all contexts I can think of - solving Newtonian F=ma is more efficient than solving relativistic motion, and easier to grasp. The Unreal Engine 5 can’t beat an custom engine designed for one particular game, which will also have simpelr codebase to navigate. And "lea eax, [eax4+eax]" outperforms “imul eax,5”, although in that case the optimization has ofbuscaed the intent, which can be solved with a comment or a properly named function around it.
程序员往往忽略这一点,或刻意回避——有时是因为懒惰,有时是因为偏爱泛化和抽象(毕竟程序员被训练成这样)。但在很多情况下,这是错误的——定制化优化能让代码更简洁(通常也更易读)、运行更快,且应与抽象思维结合使用。懂得何时该抽象、何时该定制化,是一门艺术。但请务必思考这种权衡的合理性——几乎在所有场景下都是如此:求解牛顿力学的 F=ma 方程比求解相对论运动方程更高效、更易理解;虚幻引擎 5 无法击败为某款特定游戏定制的引擎(后者的代码库也更易维护);指令“lea eax, [eax
4+eax]”的性能优于“imul eax,5”(尽管这种优化会掩盖代码意图,但可通过注释或命名规范的函数来解决)。


avoiding trigonometry

规避三角函数

Intro

引言

I still think we should use less trigonometry in computer graphics. A good understanding of projections, reflections, and vector operations (as in the true meaning of the dot and the cross/external product) usually comes with a growing feeling of uneasy with the use of trigonometry. More preciselly, while I think that trigonometry is good for inputing data to your algorithm (for the notion of angles is an untuitive to measure of orientation), I feel there’s something wrong when I see trigonometry involved in the depths of some 3D rendering algorithm. In fact, I think a kitten is sacrificed somewhere every time there’s trigonometry invoved down there. And I’m not that concern with speed or precission really, but with conceptual elegance I guess… Well, let me explain.
我始终认为,计算机图形学领域应当减少三角函数的使用。当你深入理解了投影、反射以及向量运算(即点积与叉积/外积的本质内涵)后,往往会对三角函数的使用愈发感到不适。更确切地说,我认为三角函数适用于向算法输入数据(因为角度是描述方向的直观度量方式),但当看到某些 3D 渲染算法的核心逻辑中充斥着三角函数时,便会觉得其中存在问题。甚至我会打趣地认为:每当底层代码里出现三角函数时,就有一只小猫“惨遭牺牲”。我真正在意的并非运算速度或精度,而是概念层面的简洁优雅……接下来,我来具体阐释这一点。

I have already discussed in other places before how the dot and cross products encode all the information you need to deal with orientation operations, for these two “orthogonal” operations measure cosines and sines of angles respectivelly. They are equivalent to those in so many ways that it feels as if one could simply stick to the products and get rid of angles and trigonometry altogether. In practice, you can indeed do so much by staying in the simple vector-land of Euclides, without trigonometry, that it makes you wander - are we doing somwthing wrong? Probably yes. However, unfortuntally, even experienced professionals tend to abuse of trigonometrics, and so do things way too complicated, cumbersome, and far from elegant. And perhaps indeed, “wrong”.
我此前在其他场合也曾探讨过:点积和叉积这两种“正交”运算分别对应着角度的余弦值和正弦值,二者已涵盖处理方向相关操作所需的全部信息。从诸多维度来看,这两种运算与三角函数等效,仿佛只需依靠它们就能彻底摒弃角度和三角函数。实际应用中,仅依托欧几里得体系下简单的向量运算(无需三角函数),就能完成大量工作,这不禁让人反思——我们是不是在做无用功?答案很可能是肯定的。然而遗憾的是,即便是经验丰富的专业人士,也常常滥用三角函数,导致实现方式变得极度复杂、繁琐且毫无优雅可言,甚至可以说确实是“错误”的。

So, without making the article any more abstract, lets consider one such use case of dot and cross products as a replacement for trigonometry, and see what I’m taking about.
因此,为避免文章过于抽象,我们不妨以点积和叉积替代三角函数的一个实际场景为例,来具体说明我的观点。

The wrong way to do orientate a space/object

空间/物体定向的错误实现方式

Say you have this function, which computes a matrix that rotates vectors around a normalized axis vector v, by an angle a. Any 3D engine or realtime rendering math library will have one such rutine, which has been most probably blindly copied from another engine, wikipedia or an opengl tutorial… (yes, at this point you should confess it, and depending of your mood, perhaps feel bad about it). The function will look something like this:
假设有这样一个函数:它用于计算一个旋转矩阵,可将向量绕归一化的轴向量 v 旋转角度 a。所有 3D 引擎或实时渲染数学库中都会包含这类函数,而其实现几乎都是盲目照搬自其他引擎、维基百科或 OpenGL 教程……(没错,此刻你应当承认这一点,或许还会因此感到些许惭愧)。该函数的代码大致如下:

mat3x3 rotationAxisAngle( const vec3 & v, float a )
{
    const float si = sinf( a );
    const float co = cosf( a );
    const float ic = 1.0f - co;

    return mat3x3( v.x*v.x*ic + co,       v.y*v.x*ic - si*v.z,    v.z*v.x*ic + si*v.y,
                   v.x*v.y*ic + si*v.z,   v.y*v.y*ic + co,        v.z*v.y*ic - si*v.x,
                   v.x*v.z*ic - si*v.y,   v.y*v.z*ic + si*v.x,    v.z*v.z*ic + co );
}

Imagine you are now somewhere in the internals of your demo or game, perhaps writing some animation module, when you find yourself in need to orientate/rotate an object in a give direction. You want to rotate it such that one of its axis, say the z axis, aligns to a given direction vector d (say, the tangent of an animation path). You decide of course to build the matrix that will encode the transformation by using rotationAxisAngle(). So, you first measure the angle between your object’s z axis a?nd the desired orientation vector. Since you are a graphics programmer, you know you can do this by taking the dot product and then extracting the angle with an acos() call… Also, because you know that sometimes acosf() can return weird values if your dot products returns anything outside of the -1…1 range, you decide to clamp its argument accordingly (at this point you might even dare blame the machine’s precission for not making the length of your normalized vectors exactly 1). At this stage one kitten has already been murdered. But since you don’t know about it, you proceed writing your code. So you next compute the rotation axis, which again you know it’s the cross product of your z vector with the desired direction d, for all points in your object will rotate in planes parallel to the one defined by those two vectors. Then you decide to normalize the vector, just in case… (kitten has been resurected, and murdered again). In the end, your code looks like this:
试想你正在编写演示程序或游戏的底层代码(比如动画模块),需要将某个物体朝指定方向定向/旋转。你希望旋转该物体,使其某一坐标轴(例如 z 轴)与给定的方向向量 d(例如动画路径的切向量)对齐。你自然会决定调用 rotationAxisAngle() 函数来构建表示该变换的矩阵。于是,你首先计算物体 z 轴与目标方向向量之间的夹角。作为图形程序员,你知道可通过点积结合 acos() 函数来求解这个角度……同时,你也清楚若点积结果超出 -1 到 1 的范围,acosf() 可能会返回异常值,因此你会对输入参数做钳位处理(此时你甚至可能抱怨机器精度问题,导致归一化后的向量长度并非严格等于 1)。至此,已有一只小猫“惨遭牺牲”。但你对此毫无察觉,继续编写代码:接下来计算旋转轴——你知道这是 z 向量与目标方向 d 的叉积,因为物体上所有点都将在与这两个向量所确定平面平行的平面内旋转。随后,你为了“保险起见”,又对该向量做了归一化……(那只小猫刚“复活”,又再次“遭殃”)。最终,你的代码如下:

    const vec3   axi = normalize( cross( z, d ) );
    const float  ang = acosf( clamp( dot( z, d ), -1.0f, 1.0f ) );
    const mat3x3 rot = rotationAxisAngle( axi, ang );

To see why this works but it’s still wrong in many ways, lets expand the code of rotationAxisAngle() in place and see what’s really going on:
为说明该代码为何能运行但在诸多方面仍存在问题,我们将 rotationAxisAngle() 的代码内联展开,看看实际执行的逻辑:

    const vec3   axi = normalize( cross( z, d ) );
    const float  ang = acosf( clamp( dot( z, d ), -1.0f, 1.0f ) );
    const float  co = cosf( ang );
    const float  si = sinf( ang );
    const float  ic = 1.0f - co;
    const mat3x3 rot = mat3x3( axi.x*axi.x*ic + co,         axi.y*axi.x*ic - si*axiz,    axi.z*axi.x*ic + si*axi.y,
                               axi.x*axi.y*ic + si*axi.z,   axi.y*axi.y*ic + co,         axi.z*axi.y*ic - si*axi.x,
                               axi.x*axi.z*ic - si*axi.y,   axi.y*axi.z*ic + si*axi.x,   axi.z*axi.z*ic + co );

As you have already probably noticed, we are performing an rather imprecise and expensive acos() call, just to undoit immediatelly after by computing a cos() on its return value. So first quesion comes: why not skip the acos/cos chain altogether and save some CPU sycles. Secondly, and more importantly, isn’t this observation telling us that we are doing something conceptually wrong and far too complicated, and that there is some sort of simple mathematical principle coming to us, manifesting itself through this expresion simplification?
你或许已经注意到:我们先调用精度欠佳且计算开销大的 acos() 函数,紧接着又对其返回值调用 cos() 函数,相当于“做了无用功”。那么第一个问题来了:为何不直接省去 acos/cos 这一操作链,节省 CPU 运算周期呢?其次,更重要的是,这一现象是否说明我们的思路在概念层面出现了偏差,把问题搞得过于复杂?而这种表达式可简化的特性,是否恰恰暗示着存在某种简洁的数学原理等待我们发掘?

You might argue that the simplification cannot be done really for one needs the angle anyway in order to feed the sin() function coming right after the cosine. However, this is not true. If you are familiar with the cross product, you might now that the same way dot products encode cosines, cross products encode sines… Most graphics programmers seem to have an intuition for what the dot product does, but a few haven’t developed one for the cross product (and only use it to compute normals and rotation axes). Basically, the mathematical principle that is helping us getting rid of the acos/cos pair also tells us that wherever there’s a dot product there’s probably a cross product around that completes the missing piece of information (the orthogonal piece, the sine).
你可能会反驳:这种简化不可行,因为紧随 cos() 之后的 sin() 函数仍需要用到角度值。但事实并非如此。如果你熟悉叉积的性质,就会知道:正如点积对应余弦值,叉积也对应着正弦值……大多数图形程序员对点积的作用有直观认知,但对叉积的理解却十分有限(仅将其用于计算法向量和旋转轴)。本质上,能让我们摒弃 acos/cos 操作对的数学原理也揭示了一个规律:只要有点积出现的地方,大概率能找到对应的叉积,补全缺失的正交信息(即正弦值)。

The proper way to do it

正确的实现方式

So, yeah, we can extract the sine of the angle between z and d by just looking at the length of their cross product… - remember that z and d are normalized! Which means we can (should!!) rewrite the whole operation this way:
没错,我们只需计算 z 和 d 叉积的长度,就能得到二者夹角的正弦值……别忘了 z 和 d 都是归一化向量!这意味着我们(也理应!)可以将整个操作重写为:

    const vec3   axi = cross( z, d );
    const float  si  = length( axi );
    const float  co  = dot( z, d );
    const mat3x3 rot = rotationAxisCosSin( axi/si, co, si );

and make sure our new rotation matrix building function rotationAxisCosSin() doesn’t compute sines and cosines, but takes them as arguments:
同时确保我们新的旋转矩阵构建函数 rotationAxisCosSin() 不再计算正弦和余弦值,而是直接接收这两个值作为参数:

mat3x3 rotationAxisCosSin( const vec3 & v, const float co, const float si )
{
    const float ic = 1.0f - co;

    return mat3x3( v.x*v.x*ic + co,       v.y*v.x*ic - si*v.z,    v.z*v.x*ic + si*v.y,
                   v.x*v.y*ic + si*v.z,   v.y*v.y*ic + co,        v.z*v.y*ic - si*v.x,
                   v.x*v.z*ic - si*v.y,   v.y*v.z*ic + si*v.x,    v.z*v.z*ic + co );
}

There’s one more thing we can do, and that’s to get rid of the normalization/square root, by encapsulating the whole alignment logic in a new function and propagating the 1/si to the matrix:
我们还能再进一步优化:将整个定向逻辑封装到一个新函数中,并把 1/si 的运算融入矩阵计算过程,从而彻底摒弃归一化/平方根运算:

mat3x3 rotationAlign( const vec3 & d, const vec3 & z )
{
    const vec3  v = cross( z, d );
    const float c = dot( z, d );
    const float k = (1.0f - c)/(1.0-c*c);

    return mat3x3( v.x*v.x*k + c,     v.y*v.x*k - v.z,    v.z*v.x*k + v.y,
                   v.x*v.y*k + v.z,   v.y*v.y*k + c,      v.z*v.y*k - v.x,
                   v.x*v.z*K - v.y,   v.y*v.z*k + v.x,    v.z*v.z*k + c    );
}

Lastly Zoltan Vrana noted that k can be simplified to k = 1/(1+c), which not only is mathematically more elegant, but also moves the singularity in k (and so the whole function) to the case the vectors d and z are parallel (rather than perpendicular), in which case there’s no clear rotation that is best anyways. So, the final function looks like this:
最后,佐尔坦·弗拉纳(Zoltan Vrana)指出,k 可简化为 k = 1/(1+c)。这一简化不仅在数学形式上更优雅,还将 k(乃至整个函数)的奇点转移到向量 d 和 z 平行的场景(而非垂直场景)——而在平行场景下,本就不存在唯一的“最优”旋转方式。因此,最终的函数形式如下:

mat3x3 rotationAlign( const vec3 & d, const vec3 & z )
{
    const vec3  v = cross( z, d );
    const float c = dot( z, d );
    const float k = 1.0f/(1.0f+c);

    return mat3x3( v.x*v.x*k + c,     v.y*v.x*k - v.z,    v.z*v.x*k + v.y,
                   v.x*v.y*k + v.z,   v.y*v.y*k + c,      v.z*v.y*k - v.x,
                   v.x*v.z*K - v.y,   v.y*v.z*k + v.x,    v.z*v.z*k + c    );
}

Not only we have saved 3 trigonometric functions and avoided an ugly clamp (and a normalization!), but in fact we have conceptually simplified our 3D maths. No trascendental functions, only vectors invoved here. Vectors construct matrices that transforms vectors. And ths is important because, remember, the less trigonometry in your 3D engine, not only the faster and more robust it is, but above all, the more mathematicall elegant (rcoorect!) it is.
我们不仅省去了 3 个三角函数调用,还避免了不美观的钳位操作(以及一次归一化!),更重要的是,从概念层面简化了 3D 数学运算逻辑。代码中不再出现超越函数,仅涉及向量运算——由向量构建出用于变换向量的矩阵。这一点至关重要,因为请记住:3D 引擎中的三角函数使用得越少,不仅运算速度更快、鲁棒性更强,更重要的是,其数学实现也更优雅(也更“正确”!)。


via :

  • Blog: Beautiful maths simplification: quaternion from two vectors – Lol Engine
    https://lolengine/blog/2013/09/18/beautiful-maths-quaternion-from-vectors

  • Quilez - fractals, computer graphics, mathematics, demoscene and more
    https://www.iquilezles/www/articles/noacos/noacos.htm

本文标签: Optimization Mathematical trigonometry avoiding