Rotations and their representations

# Julia in Practice: Building Scattering.jl from Scratch (4)

In this post we implement submodules `rotation.jl`

of `Scattering.jl`

to rotate a vector in the reference coordinate system to the internal coordinate system of the scatterer. Three representations of a rotaion operation are discussed and implemented. The conversion among and math operations on these representations are also implemented.

## Table of Contents

## Rotations

Rotations are another type of isometric transformation which preserve distances and angles. As stated in the previous post, we will adopt the convention of *alias* transformation. Therefore, a rotation operation is applied to the coordinates. It will rotate the reference coordinates to the internal coordinates of a scatterer. It thus natural to choose a rotation to represent the orientation of a scatterer, as shown in Figure 1. In the following, we will briefly review three popular represenations of a rotation. For a more comprehensive introduction, especially in the context of crystallography, please consult these references^{1}^{2}. A series of Wiki pages about rotations in a general sense are also good sources to learn:

- Rotation formalisms in three dimensions
- Rotation matrix
- Axis–angle representation
- Euler’s rotation theorem
- Rodrigues’ rotation formula
- Euler angles
- Gimbal lock

### Rotation matrices

The transformation of a position vector $\vr = [x\;y\;z]^T = [x_1\;x_2\;x_3]^T$ in the reference coordinate system $xyz$ to the internal coordinate system of a scatterer $x’y’z’$ can be written as a multiplication of the vector by a rotation matrix $\mR$. The resulted position vector $\vr’=[x’\;y’\;z’]^T$ with its coordinates expressed in the $x’y’z’$ system is then given by

\[\vr' = \mR\vr\]To determine the rotation matrix which has 9 components in 3D space, we have to identify at least 9 equations. It is most convenient to examine the transformation of three basis vectors of the $x’y’z’$ system, $\vu=[u_x\;u_y\;u_z]^T, \vv=[v_x\;v_y\;v_z]^T, \vw=[w_x\;w_y\;w_z]^T$ with their coordinates expressed in the $xyz$ system. When the coordinates of these basis vectors are expressed in the $x’y’z’$ system, they are just three unit vectors $[1\;0\;0]^T, [0\;1\;0]^T, [0\;0\;1]^T$. Thus we have

\[\begin{equation} \begin{split} \mR\vu &= [1\;0\;0]^T \\ \mR\vv &= [0\;1\;0]^T \\ \mR\vw &= [0\;0\;1]^T \end{split} \end{equation}\]It is possible to rewrite above equation in a more compact form as

\[\begin{equation} \begin{split} \mR[\vu\;\vv\;\vw] &= \begin{bmatrix} 1& 0& 0 \\ 0& 1& 0 \\ 0& 0& 1 \end{bmatrix} \\ &= \mI \end{split} \end{equation}\]The term in the right hand side of the above equation is just an identity matrix $\mI$. In convention, we will write the matrix $[\vu\;\vv\;\vw]$ as $\mP$. Obviously, $\mP$ is an orthonormal matrix because its three column vectors are just three basis vectors of a coordinate system which are unit vectors and orthogonal to each other. As long as we know all the components of the three basis vectors of the internal coordinate system in the reference coordinate system, we can easily construct $mP$ and the rotation matrix can be obtained by inverse the above equation:

\[\mR = \mP^{-1}\]This equation can be futher simplified using a property of an orthornormal matrix:

\[\begin{equation} \begin{split} \mP^T\mP &= \begin{bmatrix} \vu^T \\ \vv^T \\ \vw^T \end{bmatrix} [\vu\;\vv\;\vw] \\ &= \mI \end{split} \end{equation}\]which means

\[\mP^{-1} = \mP^T\]Therefore, the rotation matrix is just the transposition of $\mP$:

\[\begin{equation} \mR = \mP^T = \begin{bmatrix} u_x& u_y& u_z \\ v_x& v_y& v_z \\ w_x& w_y& w_z \end{bmatrix} \end{equation}\]It is now clear that $\mP$ is also a rotation matrix which transforms a vector in the internal coordinate system to the reference coordinate system. All rotation matrices should have following properties:

- $\mR^{-1} = \mR^T$
- $\det(\mR) = \pm 1$ since $1=\det(I)=\det(\mR^T\mR)=\det(\mR^T)\det(\mR)=\det(\mR)^2$

For the coordinate system transformation described above, the determinat should be the volume of the parallelopiped which is $+1$. such $\mR$ is called a **proper rotation**. Given a matrix, to determine whether it is a proper rotation matrix, we should first compute its determinant to see if it is $+1$. If so, we then compute its inverse, and check if it is identical to its transposition. If both crierions are met, then the matrix represents a proper rotation in the Cartesian coordinate system. However, *it is not necessary true for other non-Cartesian coordinate systems*.

### Euler axis angle

The rotation matrix is convenient for numerical computations, however it is not an intuitive way to represent a rotation. The most popular way to describe a rotation is by Euler axis angle or polar angles: there exists a rotation axis given by a unit vector $\vomega$, about which the object rotates by angle $\theta$ according to the *Euler’s rotation theorem*.

#### Rotation axis

By definition, a vector $\vr$ parallel to the rotation axis is invariant under rotation:

\[\mR\vr = \vr\]or

\[\begin{equation} (\mR - \mI)\vr = 0 \end{equation}\label{eq:rotaxis0}\]Obviously, the vector $\vr$ is an eigenvector of $\mR$ corresponding to the eigenvalue $\lambda=+1$. To obtain $\vr$, we can simply diagonalize $\mR$ and find the eigenvector corresponding to the eigenvalue of $+1$. For non-symmetric $\mR$, however, there is a simpler way to determine $\vr$. We can show that

\[\begin{equation} \begin{split} (\mR - \mR^T)\vr &= [(\mI - \mR^T) + (\mR - \mI)]\vr \\ &= (\mI - \mR^T)\vr + (\mR - \mI)\vr \\ &= (\mR^T\mR - \mR^T)\vr + (\mR - \mI)\vr \\ &= \mR^T(\mR - \mI)\vr + (\mR - \mI)\vr \\ &= \mR^T 0 + 0 \\ &= 0 \end{split} \end{equation}\label{eq:skewsym}\]From the second to the third line, we have invoked the property of a rotation matrix $\mR^T\mR = \mI$, while from the fourth to the fifth line eq.\eqref{eq:rotaxis0} is applied. It is also easy to show that $\mR - \mR^T$ is a skew-symmetric matrix because

\[(\mR - \mR^T)^T = \mR^T - \mR = -(\mR - \mR^T)\]A skew-symmetric matrix can be used to represent cross products as matrix multiplications. Consider vectors $\va = [a_1\;a_2\;a_3]^T$ and $\vb = [b_1\;b_2\;b_3]^T$. Then, defining a skew-symmetric matrix with its components from $\va$

\[\begin{equation} [\va]_{\times} = \begin{bmatrix} 0& -a_3& a_2 \\ a_3& 0& -a_1 \\ -a_2& a_1& 0 \end{bmatrix} \end{equation}\label{eq:crossproduct}\]the cross product can be written as

\[\va \times \vb = [\va]_{\times}\vb.\]Since the cross product of any vector with itself is equal to 0, eq.\eqref{eq:skewsym} implies that $\mR - \mR^T$ is actually the cross product matrix of vector $\vr$,

\[\mR - \mR^T = [\vr]_{\times}.\]By writing out $\mR - \mR^T$ explicitly as

\[\begin{equation} \mR - \mR^T = \begin{bmatrix} 0& u_y-v_x& u_z-w_x \\ v_x-u_y& 0& v_z-w_y \\ w_x-u_z& w_y-v_z& 0 \end{bmatrix} \end{equation}\label{eq:RRT}\]Comparing eq.\eqref{eq:RRT} with eq.\eqref{eq:crossproduct}, we immediately arrives at

\[\begin{equation} \vr = \begin{bmatrix} r_x \\ r_y \\ r_z \end{bmatrix} = \begin{bmatrix} w_y-v_z \\ u_z-w_x \\ v_x-u_y \end{bmatrix} = \begin{bmatrix} R_{32}-R_{23} \\ R_{13}-R_{31} \\ R_{21}-R_{12} \end{bmatrix} \end{equation}\label{eq:vomega}\]The last term is useful when we denote the rotation matrix as $\mR = [R_{ij}]$ where $i={1,2,3},j={1,2,3}$ are row and column indices, respectively. Finally, the unit vector parallel to the rotation axis is

\[\begin{equation} \vomega = \frac{\vr}{\norm{\vr}} \end{equation}\label{eq:vomega-norm}\]Note that $\vomega$ computed by eq.\eqref{eq:vomega} becomes a zero vector when $\mR$ is symmetric. Therefore, this approach is *inapplicable* to symmetric matrices.

#### Rotation angle

Once the rotation axis is known, the angle of rotation $\theta$ is the angle between two vectors $\va$ and $\mR\va$, where $\va$ can be any vector that is *perpendicular to* $\vomega$.

However, a more straightforward way to find $\theta$ is to compute the trace of the rotation matrix and invoke the relation^{2}

It follows that the angle of rotation can be computed via

\[\begin{equation} \theta = \arccos\left[ \frac{\Tr(\mR)-1}{2} \right] \end{equation}\label{eq:theta}\]where we should restrict the range of $\theta$ to $[0, \pi]$.

#### Conversion to the rotation matrix

Given by an Euler axis angle representation with known $\vomega=[\omega_x\;\omega_y\;\omega_z]^T$ and $\theta$, we can construct the corresponding rotation matrix directly:

\[\begin{equation} \mR = \begin{bmatrix} \cos\theta + \omega_x^2(1-\cos\theta)& \omega_x\omega_y(1-\cos\theta)-\omega_z\sin\theta& \omega_x\omega_z(1-\cos\theta)+\omega_y\sin\theta \\ \omega_y\omega_x(1-\cos\theta)+\omega_z\sin\theta& \cos\theta + \omega_y^2(1-\cos\theta)& \omega_y\omega_z(1-\cos\theta)-\omega_x\sin\theta \\ \omega_z\omega_x(1-\cos\theta)-\omega_y\sin\theta& \omega_z\omega_y(1-\cos\theta)+\omega_x\sin\theta& \cos\theta + \omega_z^2(1-\cos\theta) \end{bmatrix} \end{equation}\label{eq:aa2rm}\]See ref^{2} for how to derive above equation. Alternatively, we can utilize the cross product matrix of $\vomega$ by comparing to eq.\eqref{eq:crossproduct}:

The rotation matrix is then obtained as

\[\begin{equation} \mR = \mI + \mK\sin\theta + (1-\cos\theta)\mK^2 \end{equation}\label{eq:K2rm}\]The above equation can be derived in a Lie-algebraic way or a geometric way using the Rodrigues’ rotation formula.

### Euler angles

Another popular description of a rotation is using Euler angles. It is demonstrated by Leonhard Euler that it is sufficient to rotate a reference coordinate system by **three angles** around three axes of a coordinate system to reach any target frame. The rotation around an axis of a coordinate system is called an elemental/basic rotation. Albeit its popularity, the Euler angles representation of a rotation causes many confusions and it has a serious artifact known as the Gimbal lock. There exist twelve possible sequences of rotation axes, leading to twelve conventions. Different fields usually choose a particular convention. In this project, we will choose the Z1Y2Z3 convention according to the Wiki page Euler angles.

#### Convert Euler angles to the rotation matrix

Euler angles are denoted as $\alpha, \beta, \gamma$. The rotation matrix conforming to the Z1Y2Z3 convention has the form

\[\begin{equation} \mR = \begin{bmatrix} \cos\alpha\cos\beta\cos\gamma-\sin\alpha\sin\gamma& -\cos\gamma\sin\alpha-\cos\alpha\cos\beta\sin\gamma& \cos\alpha\sin\beta \\ \cos\alpha\sin\gamma+\cos\beta\cos\gamma\sin\alpha& \cos\alpha\cos\gamma-\cos\beta\sin\alpha\sin\gamma& \sin\alpha\sin\beta \\ -\cos\gamma\sin\beta& \sin\beta\sin\gamma& \cos\beta \end{bmatrix} \end{equation}\label{eq:euler2mat}\]#### Convert the rotation matrix to Euler angles

From eq.\eqref{eq:euler2mat}, we have

\[\begin{split} \frac{R_{23}}{R_{13}} &= \tan\alpha \\ R_{33} &= \cos\beta \\ \frac{R_{32}}{-R_{31}} &= \tan\gamma \end{split}\]Then Euler angles can be obtained acoordingly

\[\begin{equation} \begin{split} \alpha &= \arctan(R_{23}, R_{13}) \\ \beta &= \arccos(R_{33}) \\ \gamma &= \arctan(R_{32}, -R_{31}) \end{split} \end{equation}\label{eq:mat2euler}\]where $\arctan(a, b)$ is a specialized version of the `arctan`

function which takes into account the quadrant that the point $(b, a)$ is in. Note that eq.\eqref{eq:mat2euler} is valid for $\theta$ in the range $(0, \pi]$. When $\beta=0$ or $R_{33}=1$, $\alpha$ and $\gamma$ has infinite many solutions which satisfy

or

\[\alpha + \gamma = \arctan(R_{21}, R_{11})\]In this project, **we fix $\gamma=0$ when $\beta=0$.**

## Implementation

It is now straightforward to implement a submodule `rotation.jl`

of `Scattering.jl`

which realize rotations and their associated operations.

### Rotation types

First, we define an abstract type to describe any rotation operators:

1

abstract type AbstractRotation end

Inheriting from it, we define a concrete type for rotation matrices:

1
2
3

struct RotationMatrix{T<:Real} <: AbstractRotation
R::RotMatrix{T} # 3x3 matrix
end

To ensure the user supplied matrix is actually a rotation matrix, we add an internal constructor to it

1
2
3
4
5
6
7

function RotationMatrix(R::RotMatrix{T}) where {T<:Real}
# Verify that P is a valid rotation matrix
detR = det(R)
@assert(detR ≈ one(detR), "Not a valid rotation matrix: det=$detR.")
@assert(transpose(R) ≈ inv(R), "Not a valid rotation matrix: RT != R^-1.")
new{T}(R)
end

It is convenient to define a outer constructer which takes three arguments. They are unit basis vectors of the internal coordinate system of a scatterer $\vu, \vv, \vw$ whose components are expressed in the reference coordinate system.

1

RotationMatrix(u::Vector3D{T}, v::Vector3D{T}, w::Vector3D{T}) where {T <: Real} = RotationMatrix(transpose(RotMatrix(hcat(u,v,w))))

Note that we first construct $\mP$ using `hcat`

function. Then we transpose it to obtain $\mR$ using `transpose`

function.

Similarly, we define a concrete type for Euler axis angle:

1
2
3
4
5

struct EulerAxisAngle{T<:Real} <: AbstractRotation
ω::RVector{T} # rotation axis
θ::T # in radian unit
K::SMatrix{3,3,T} # the cross product matrix such that Kv = ω×v for any vector v.
end

Note that the cross product matrix $\mK$ is also stored. We also need an internal constructor which ensures the rotation axis vector is a unit vector

1
2
3
4
5

function EulerAxisAngle(ω::RVector{T}, θ::T) where {T<:Real}
ω /= norm(ω) # make sure ω is normalized
K = hcat([0, ω[3], -ω[2]], [-ω[3], 0, ω[1]], [ω[2], -ω[1], 0])
new{T}(ω, θ, K)
end

Finally, we define a concrete type for Euler angles:

1
2
3
4
5

struct EulerAngle{T<:Real} <: AbstractRotation
α::T # in radian unit
β::T # in radian unit
γ::T # in radian unit
end

And an internal constructor is also necessary to ensure all three angles are in the same type:

1
2
3

function EulerAngle(α::T=0, β::T=0, γ::T=0) where {T<:Real}
new{T}(α, β, γ)
end

### Conversions

Conversions are implemented as outer constructors for each type. Converting Euler axis angle to a rotation matrix:

1
2
3
4
5
6

function RotationMatrix(axisangle::EulerAxisAngle)
θ = axisangle.θ
K = axisangle.K
R = one(K) + sin(θ)*K + (1-cos(θ))*K*K
RotationMatrix(R)
end

where eq.\eqref{eq:K2rm} is used. Converting Euler angles to a rotation matrix:

1
2
3
4
5
6
7
8
9
10
11
12
13

function RotationMatrix(euler::EulerAngle)
c1 = cos(euler.α)
c2 = cos(euler.β)
c3 = cos(euler.γ)
s1 = sin(euler.α)
s2 = sin(euler.β)
s3 = sin(euler.γ)
v1 = [c1*c2*c3 - s1*s3 -c3*s1 - c1*c2*s3 c1*s2]
v2 = [c1*s3 + c2*c3*s1 c1*c3 - c2*s1*s3 s1*s2]
v3 = [-c3*s2 s2*s3 c2]
R = RotMatrix(vcat(v1, v2, v3))
RotationMatrix(R)
end

where eq.\eqref{eq:euler2mat} is used. Converting a rotation matrix to Euler axis angle:

1
2
3
4
5
6
7
8
9
10
11
12
13
14

function EulerAxisAngle(rotmatrix::RotationMatrix)
R = rotmatrix.R
θ = acos((tr(R)-1)/2)
v = [R[3,2]-R[2,3], R[1,3]-R[3,1], R[2,1]-R[1,2]]
if θ ≈ zero(θ)
return EulerAxisAngle([1.0, 0.0, 0.0], 0.0)
end
if v ≈ zero(v)
vals, vecs = eigen(R)
idx = findfirst(x -> x ≈ one(x), vals)
ω = real(vecs[:, idx])
end
EulerAxisAngle(ω, θ)
end

where eq.\eqref{eq:vomega} and eq.\eqref{eq:theta} are used. Converting a rotation matrix to Euler angles:

1
2
3
4
5
6
7
8
9
10
11
12
13

function EulerAngle(rotmatrix::RotationMatrix)
R = rotmatrix.R
if R[3,3] ≈ one(R[3,3])
α = atan(R[2,1],R[1,1])
β = zero(α)
γ = zero(α)
else
α = atan(R[2,3], R[1,3])
β = acos(R[3,3])
γ = atan(R[3,2], -R[3,1])
end
EulerAngle(α, β, γ)
end

where eq.\eqref{eq:mat2euler} is used.

The conversions between Euler angles and Euler axis angle are implemented by converting them to rotation matrices first:

1
2

EulerAxisAngle(euler::EulerAngle) = RotationMatrix(euler) |> EulerAxisAngle
EulerAngle(axisangle::EulerAxisAngle) = RotationMatrix(axisangle) |> EulerAngle

Strictly follow our convention developed in this post, the conversions among these three representations are lostless. Thus it is reasonable to define a set of conversion rules following the Julia guidelines:

1
2

convert(::Type{T}, x::AbstractRotation) where {T<:AbstractRotation} = T(x)
convert(::Type{T}, x::T) where {T<:AbstractRotation} = x

The second line handles the situation when the object `x`

is already of the target type.

### Promotion

The rotation matrix is most convenient for numerical compuations. Therefore, its rank is considered higher than the other two representations. It means that we will convert any type of a rotation to the rotation matrix during computation. In Julia, it is realized by promotion rules. A promotion rule promote two or more types to a single common type. It will be used implicitly by Julia for methods `promote_type`

and `promote`

.

We define the following promotion rule:

1

promote_rule(::Type{T}, ::Type{S}) where {T<:AbstractRotation, S<:AbstractRotation} = RotationMatrix

which promotes two rotations of **different** types to the `RotationMatrix`

type. However, when two rotations are of same type, Julia implicitly does nothing and this behavior can not be overridden at present. For example, if two rotations are of type `EulerAngle`

, the return type will be `EulerAngle`

instead of `RotationMatrix`

. It is important to bare this in mind when implement `==`

and `*`

later.

It is also convenient to implement a `promote`

method which takes a single argument and promotes all rotations to the `RotationMatrix`

type. Note that one-argument `promote`

is not supported by Julia `Base`

.

1
2

promote(rot::RotationMatrix) = rot
promote(rot::AbstractRotation) = RotationMatrix(rot)

### Inversion

The inversion of a rotation is computed via inverting the rotation matrix, then converting it back to the original type:

1
2

inv(rot::RotationMatrix) = collect(transpose(rot.R)) |> RotationMatrix
inv(rot::T) where {T<:AbstractRotation} = inv(promote(rot)) |> T.name.wrapper

In the above implementation, we utilized the relation $\mR^{-1} = \mR^T$.

### Comparison

It is meaningful to check whether two rotations are identical. So we have extended the `==`

operator defined in Julia `Base`

:

1
2
3
4

==(rot1::RotationMatrix, rot2::RotationMatrix) = rot1.R ≈ rot2.R
==(rot1::EulerAngle, rot2::EulerAngle) = RotationMatrix(rot1) == RotationMatrix(rot2)
==(rot1::EulerAxisAngle, rot2::EulerAxisAngle) = RotationMatrix(rot1) == RotationMatrix(rot2)
==(rot1::AbstractRotation, rot2::AbstractRotation) = ==(promote(rot1,rot2)...)

Note how poromtion rules defined previously has been used.

### Multiplication of two rotation operations

Since any rotation operation can be expressed as a matrix, it means that we can multiply them together to obtain another rotation. Mulitiplication of two rotations of any type is implemented by first promoting them to the `RotationMatrix`

type:

1
2
3
4

*(rot1::RotationMatrix, rot2::RotationMatrix) = RotationMatrix(rot1.R * rot2.R)
*(rot1::EulerAngle, rot2::EulerAngle) = RotationMatrix(rot1) * RotationMatrix(rot2)
*(rot1::EulerAxisAngle, rot2::EulerAxisAngle) = RotationMatrix(rot1) * RotationMatrix(rot2)
*(rot1::AbstractRotation, rot2::AbstractRotation) = *(promote(rot1,rot2)...)

Note that the result of a multiplication is always of type `RotationMatrix`

. As mentioned earlier, promotion of two rotations of an identical type returns this type other than `RotationMatrix`

. Therefore, it is required to explicitly add the first three lines.

### Power of a rotation operation

Raising a rotation to an integral power $n$ means multiplying a rotation for $n$ times. A naive implementation is

1
2
3
4
5
6
7
8

function pow1(rot::AbstractRotation, n::Integer)
rot = promote(rot)
result = rot
for i in 1:(n-1)
result *= rot
end
result
end

However, the complexity is only $O(N)$. The complexity can be optimized to $O(\log(N))$:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15

function ^(rot::AbstractRotation, n::Integer)
rot = promote(rot)
if n == zero(n)
return one(rot)
elseif n == one(n)
return rot
else
if isodd(n)
return rot * rot^(n-1)
else
rothalf = rot^(n÷2)
return rothalf * rothalf
end
end
end

where we have utilized a *unit rotation* which is simply a do-nothing rotation. It is implemented as follows:

1
2

one(rot::RotationMatrix{T}) where {T<:Real} = one(rot.R) |> RotationMatrix
one(rot::T) where {T<:AbstractRotation} = one(promote(rot)) |> T.name.wrapper

Using `BechmarkTools`

, we can compare these two methods as

1
2
3
4
5
6
7

julia> using BenchmarkTools
julia> using Scattering
julia> # construct euler ...
julia> @btime pow1($euler, 1000)
142.561 μs (3004 allocations: 172.31 KiB)
julia> @btime $euler^1000
2.327 μs (49 allocations: 3.02 KiB)

We can see that the optimized version is nearly $70\times$ faster than the naive version when $n=1000$.

### Transformation of a position vector

Transformation of a position vector with components expressed in the reference coordinate system to the internal coordinate system of a scatterer is implemented by applying a rotation operation on the vector:

1

*(rot::AbstractRotation, v::RVector{T}) where {T <: Real} = promote(rot).R * v

As can be seen, the actual computation is carried out by a matrix-vector product.

## Usage

Due to the length of this blog post, the usage as well as the test of `rotation.jl`

is presented in a separate blog post.

## Acknowledgements

This work is partially supported by the General Program of the National Natural Science Foundation of China (No. 21873021).