Real linear maps
A map $f: V \to V$ from some vector space $V$ to itself is said to be a real linear map if it satisfies $f(\alpha x + \beta y) = \alpha f(x) + \beta f(y)$ for all $x, y \in V$ and all $\alpha, \beta \in \mathbb{R}$. When $V$ is itself a real vector space, this is just the natural concept of a linear map. However, this definition can be used even if $x$ and $y$ are naturally represented using complex numbers and arithmetic and also admit complex linear combinations, i.e. if $V$ is a complex vector space.
Such real linear maps arise whenever f(x)
involves calling conj(x)
, and are for example obtained in the context of Jacobians (pullbacks) of complex valued functions that are not holomorphic.
To deal with real linear maps, one should reinterpret $V$ as a real vector space, by restricting the possible linear combinations to those with real scalar coefficients, and by using the real part of the inner product. When the vectors are explictly represented as some AbstractVector{Complex{T}}
, this could be obtained by explicitly splitting them in their real and imaginary parts and stacking those into AbstractVector{T}
objects with twice the original length.
However, KrylovKit.jl admits a different approach, where the original representation of vectors is kept, and the inner product is simply replaced by its real part. KrylovKit.jl offers specific methods for solving linear systems and eigenvalue systems in this way. For linear problems, this is implemented using reallinsolve
:
KrylovKit.reallinsolve
— Functionreallinsolve(f, b, x₀, algorithm, [a₀::Real = 0, a₁::Real = 1]; alg_rrule=algorithm)
Compute a solution x
to the linear system a₀ * x + a₁ * f(x) = b
, using a starting guess x₀
, where f
represents a real linear map. Return the approximate solution x
and a ConvergenceInfo
structure.
A function f
is said to implement a real linear map if it satisfies f(add(x,y)) = add(f(x), f(y)
and f(scale(x, α)) = scale(f(x), α)
for vectors x
and y
and scalars α::Real
. Note that this is possible even when the vectors are represented using complex arithmetic. For example, the map f=x-> x + conj(x)
represents a real linear map that is not (complex) linear, as it does not satisfy f(scale(x, α)) = scale(f(x), α)
for complex scalars α
. Note that complex linear maps are always real linear maps and thus can be used in this context, though in that case linsolve
and reallinsolve
target the same solution. However, they still compute that solution using different arithmetic, and in that case linsolve
might be more efficient.
To interpret the vectors x
and y
as elements from a real vector space, the standard inner product defined on them will be replaced with real(inner(x,y))
. This has no effect if the vectors x
and y
were represented using real arithmetic to begin with, and allows to seemlessly use complex vectors as well.
Arguments:
The linear map can be an AbstractMatrix
(dense or sparse) or a general function or callable object. The real numbers a₀
and a₁
are optional arguments; they are applied implicitly, i.e. they do not contribute the computation time of applying the linear map or to the number of operations on vectors of type x
and b
.
Return values:
The return value is always of the form x, info = linsolve(...)
with
x
: the approximate solution to the problem, similar type as the right hand sideb
but possibly with a differentscalartype
info
: an object of type [ConvergenceInfo
], which has the following fieldsinfo.converged::Int
: takes value 0 or 1 depending on whether the solution was converged up to the requested toleranceinfo.residual
: residualb - f(x)
of the approximate solutionx
info.normres::Real
: norm of the residual, i.e.norm(info.residual)
info.numops::Int
: total number of times that the linear map was applied, i.e. the number of times thatf
was called, or a vector was multiplied withA
info.numiter::Int
: number of times the Krylov subspace was restarted (see below)
No warning is printed if no converged solution was found, so always check if info.converged == 1
.
Algorithms
The final (expert) method, without default values and keyword arguments, is the one that is finally called, and can also be used directly. Here, one specifies the algorithm explicitly. Currently, only CG
, GMRES
and BiCGStab
are implemented, where CG
is chosen if isposdef == true
and GMRES
is chosen otherwise. Note that in standard GMRES
terminology, our parameter krylovdim
is referred to as the restart parameter, and our maxiter
parameter counts the number of outer iterations, i.e. restart cycles. In CG
, the Krylov subspace is only implicit because short recurrence relations are being used, and therefore no restarts are required. Therefore, we pass krylovdim*maxiter
as the maximal number of CG iterations that can be used by the CG
algorithm.
In the case of eigenvalue systems, a similar method realeigsolve
is available. In this context, only real eigenvalues are meaningful, as the corresponding eigenvectors should be built from real linear combinations of the vectors that span the (real) Krylov subspace. This approach can also be applied to linear maps on vectors that were naturally real to begin with, if it is guaranteed that the targetted eigenvalues are real. In that case, also the associated eigenvectors will be returned using only real arithmic. This is contrast with eigsolve
, which will always turn to complex arithmetic if the linear map is real but not symmetric. An error will be thrown if complex eigenvalues are encountered within the targetted set.
KrylovKit.realeigsolve
— Function# expert version:
realeigsolve(f, x₀, howmany, which, algorithm; alg_rrule=algorithm)
Compute the first howmany
eigenvalues (according to the order specified by which
) from the real linear map encoded in the matrix A
or by the function f
, if it can be guaranteed that these eigenvalues (and thus their associated eigenvectors) are real. An error will be thrown if there are complex eigenvalues within the first howmany
eigenvalues.
Return eigenvalues, eigenvectors and a ConvergenceInfo
structure.
A function f
is said to implement a real linear map if it satisfies f(add(x,y)) = add(f(x), f(y)
and f(scale(x, α)) = scale(f(x), α)
for vectors x
and y
and scalars α::Real
. Note that this is possible even when the vectors are represented using complex arithmetic. For example, the map f=x-> x + conj(x)
represents a real linear map that is not (complex) linear, as it does not satisfy f(scale(x, α)) = scale(f(x), α)
for complex scalars α
. Note that complex linear maps are always real linear maps and thus can be used in this context, if looking specifically for real eigenvalues that they may have.
To interpret the vectors x
and y
as elements from a real vector space, the standard inner product defined on them will be replaced with real(inner(x,y))
. This has no effect if the vectors x
and y
were represented using real arithmetic to begin with, and allows to seemlessly use complex vectors as well.
Arguments:
The linear map can be an AbstractMatrix
(dense or sparse) or a general function or callable object. A starting vector x₀
needs to be provided. Note that x₀
does not need to be of type AbstractVector
; any type that behaves as a vector and supports the required interface (see KrylovKit docs) is accepted.
The argument howmany
specifies how many eigenvalues should be computed; which
specifies which eigenvalues should be targeted. Valid specifications of which
for real problems are given by
:LM
: eigenvalues of largest magnitude:LR
: eigenvalues with largest (most positive) real part:SR
: eigenvalues with smallest (most negative) real partEigSorter(f; rev = false)
: eigenvaluesλ
that appear first (or last ifrev == true
) when sorted byf(λ)
Krylov methods work well for extremal eigenvalues, i.e. eigenvalues on the periphery of the spectrum of the linear map. All of the valid Symbol
s for which
have this property, but could also be specified using EigSorter
, e.g. :LM
is equivalent to Eigsorter(abs; rev = true)
. Note that smallest magnitude sorting is obtained using e.g. EigSorter(abs; rev = false)
, but since no (shift-and)-invert is used, this will only be successful if you somehow know that eigenvalues close to zero are also close to the periphery of the spectrum.
From a theoretical point of view, Krylov methods can at most find a single eigenvector associated with a targetted eigenvalue, even if the latter is degenerate. In the case of a degenerate eigenvalue, the specific eigenvector that is returned is determined by the starting vector x₀
. For large problems, this turns out to be less of an issue in practice, as often a second linearly independent eigenvector is generated out of the numerical noise resulting from the orthogonalisation steps in the Lanczos or Arnoldi iteration. Nonetheless, it is important to take this into account and to try not to depend on this potentially fragile behaviour, especially for smaller problems.
The algorithm
argument currently only supports an instance of Arnoldi
, which is where the parameters of the Krylov method (such as Krylov dimension and maximum number of iterations) can be specified. Since realeigsolve
is less commonly used as eigsolve
, it only supports this expert mode call syntax and no convenient keyword interface is currently available.
The keyword argument alg_rrule
can be used to specify an algorithm to be used for computing the pullback
of realeigsolve
in the context of reverse-mode automatic differentation.
Return values:
The return value is always of the form vals, vecs, info = eigsolve(...)
with
vals
: aVector
containing the eigenvalues, of length at leasthowmany
, but could be longer if more eigenvalues were converged at the same cost. Eigenvalues will be real, anArgumentError
will be thrown if the firsthowmany
eigenvalues ordered according towhich
of the linear map are not all real.vecs
: aVector
of corresponding eigenvectors, of the same length asvals
. Note that eigenvectors are not returned as a matrix, as the linear map could act on any custom Julia type with vector like behavior, i.e. the elements of the listvecs
are objects that are typically similar to the starting guessx₀
. For a real problem with real eigenvalues, also the eigenvectors will be real and no complex arithmetic is used anywhere.info
: an object of type [ConvergenceInfo
], which has the following fieldsinfo.converged::Int
: indicates how many eigenvalues and eigenvectors were actually converged to the specified tolerancetol
(see below under keyword arguments)info.residual::Vector
: a list of the same length asvals
containing the residualsinfo.residual[i] = f(vecs[i]) - vals[i] * vecs[i]
info.normres::Vector{<:Real}
: list of the same length asvals
containing the norm of the residualinfo.normres[i] = norm(info.residual[i])
info.numops::Int
: number of times the linear map was applied, i.e. number of timesf
was called, or a vector was multiplied withA
info.numiter::Int
: number of times the Krylov subspace was restarted (see below)
No warning is printed if not all requested eigenvalues were converged, so always check if info.converged >= howmany
.
Note that both reallinsolve
and realeigsolve
currently only exist with the "expert" mode interface, where the user has to manually specify the underlying Krylov algorithm and its parameters, i.e. GMRES
or BiCGStab
for reallinsolve
and Arnoldi
for realeigsolve
.