Real linear maps

A map $f: V \to V$ from some vector space $V$ to itself is said to be a real linear map if it satisfies $f(\alpha x + \beta y) = \alpha f(x) + \beta f(y)$ for all $x, y \in V$ and all $\alpha, \beta \in \mathbb{R}$. When $V$ is itself a real vector space, this is just the natural concept of a linear map. However, this definition can be used even if $x$ and $y$ are naturally represented using complex numbers and arithmetic and also admit complex linear combinations, i.e. if $V$ is a complex vector space.

Such real linear maps arise whenever f(x) involves calling conj(x), and are for example obtained in the context of Jacobians (pullbacks) of complex valued functions that are not holomorphic.

To deal with real linear maps, one should reinterpret $V$ as a real vector space, by restricting the possible linear combinations to those with real scalar coefficients, and by using the real part of the inner product. When the vectors are explictly represented as some AbstractVector{Complex{T}}, this could be obtained by explicitly splitting them in their real and imaginary parts and stacking those into AbstractVector{T} objects with twice the original length.

However, KrylovKit.jl admits a different approach, where the original representation of vectors is kept, and the inner product is simply replaced by its real part. KrylovKit.jl offers specific methods for solving linear systems and eigenvalue systems in this way. For linear problems and least square problems, this is implemented using reallinsolve and reallssolve:

KrylovKit.reallinsolve — Function

reallinsolve(f, b, x₀, algorithm, [a₀::Real = 0, a₁::Real = 1]; alg_rrule=algorithm)

Compute a solution x to the linear system a₀ * x + a₁ * f(x) = b, using a starting guess x₀, where f represents a real linear map. Return the approximate solution x and a ConvergenceInfo structure.

Note about real linear maps

A function f is said to implement a real linear map if it satisfies f(add(x,y)) = add(f(x), f(y) and f(scale(x, α)) = scale(f(x), α) for vectors x and y and scalars α::Real. Note that this is possible even when the vectors are represented using complex arithmetic. For example, the map f=x-> x + conj(x) represents a real linear map that is not (complex) linear, as it does not satisfy f(scale(x, α)) = scale(f(x), α) for complex scalars α. Note that complex linear maps are always real linear maps and thus can be used in this context, though in that case linsolve and reallinsolve target the same solution. However, they still compute that solution using different arithmetic, and in that case linsolve might be more efficient.

To interpret the vectors x and y as elements from a real vector space, the standard inner product defined on them will be replaced with real(inner(x,y)). This has no effect if the vectors x and y were represented using real arithmetic to begin with, and allows to seemlessly use complex vectors as well.

Arguments:

The linear map can be an AbstractMatrix (dense or sparse) or a general function or callable object. The real numbers a₀ and a₁ are optional arguments; they are applied implicitly, i.e. they do not contribute the computation time of applying the linear map or to the number of operations on vectors of type x and b.

Return values:

The return value is always of the form x, info = linsolve(...) with

x: the approximate solution to the problem, similar type as the right hand side b but possibly with a different scalartype
info: an object of type [ConvergenceInfo], which has the following fields
- info.converged::Int: takes value 0 or 1 depending on whether the solution was converged up to the requested tolerance
- info.residual: residual b - f(x) of the approximate solution x
- info.normres::Real: norm of the residual, i.e. norm(info.residual)
- info.numops::Int: total number of times that the linear map was applied, i.e. the number of times that f was called, or a vector was multiplied with A
- info.numiter::Int: number of times the Krylov subspace was restarted (see below)

Check for convergence

No warning is printed if no converged solution was found, so always check if info.converged == 1.

Algorithms

The final (expert) method, without default values and keyword arguments, is the one that is finally called, and can also be used directly. Here, one specifies the algorithm explicitly. Currently, only CG, GMRES and BiCGStab are implemented, where CG is chosen if isposdef == true and GMRES is chosen otherwise. Note that in standard GMRES terminology, our parameter krylovdim is referred to as the restart parameter, and our maxiter parameter counts the number of outer iterations, i.e. restart cycles. In CG, the Krylov subspace is only implicit because short recurrence relations are being used, and therefore no restarts are required. Therefore, we pass krylovdim*maxiter as the maximal number of CG iterations that can be used by the CG algorithm.

source

KrylovKit.reallssolve — Function

reallssolve(f, b, algorithm, [λ::Real = 0])

Compute a least squares solution x to the problem f(x) ≈ b where f encodes a real linear map, i.e. a solution x that minimizes norm(b - f(x)). Return the approximate solution x and a ConvergenceInfo structure.

Note about real linear maps

A function f is said to implement a real linear map if it satisfies f(add(x,y)) = add(f(x), f(y) and f(scale(x, α)) = scale(f(x), α) for vectors x and y and scalars α::Real. Note that this is possible even when the vectors are represented using complex arithmetic. For example, the map f=x-> x + conj(x) represents a real linear map that is not (complex) linear, as it does not satisfy f(scale(x, α)) = scale(f(x), α) for complex scalars α. Note that complex linear maps are always real linear maps and thus can be used in this context, though in that case lssolve and reallssolve target the same solution. However, they still compute that solution using different arithmetic, and in that case lssolve might be more efficient.

Arguments:

The real linear map will typically be a function or callable object, as a matrix can only represent a complex linear map and can thus simply be used in combination with lssolve. Since both the action of the map and its adjoint are required in order to solve the least squares problem, f can either be a tuple of two callable objects (each accepting a single argument), representing the linear map and its adjoint respectively, or, f can be a single callable object that accepts two input arguments, where the second argument is a flag of type Val{true} or Val{false} that indicates whether the adjoint or the normal action of the real linear map needs to be computed. The latter form still combines well with the do block syntax of Julia, as in

x, info = reallssolve(b; kwargs...) do x, flag
    if flag === Val(true)
        # y = compute action of adjoint map on x
    else
        # y = compute action of linear map on x
    end
    return y
end

If the real linear map A or f has a nontrivial nullspace, so different minimisers exist, the solution being returned is such that norm(x) is minimal. Alternatively, the problem can be providing a nonzero value for the optional argument λ, representing a scalar so that the minimisation problem norm(b - A * x)^2 + λ * norm(x)^2 is solved instead.

Return values:

The return value is always of the form x, info = reallssolve(...) with

x: the least squares solution to the problem, as defined above
info: an object of type [ConvergenceInfo], which has the following fields
- info.converged::Int: takes value 0 or 1 depending on whether the solution was converged up to the requested tolerance
- info.residual: residual b - A*x of the approximate solution x
- info.normres::Real: norm of the residual of the normal equations, i.e. the quantity norm(A'*(b - A*x) - λ^2 * x) that needs to be smaller than the requested tolerance tol in order to have a converged solution
- info.numops::Int: total number of times that the linear map was applied, i.e. the number of times that f was called, or a vector was multiplied with A or A'
- info.numiter::Int: total number of iterations of the algorithm

Algorithms

source

In the case of eigenvalue systems, a similar method realeigsolve is available. In this context, only real eigenvalues are meaningful, as the corresponding eigenvectors should be built from real linear combinations of the vectors that span the (real) Krylov subspace. This approach can also be applied to linear maps on vectors that were naturally real to begin with, if it is guaranteed that the targetted eigenvalues are real. In that case, also the associated eigenvectors will be returned using only real arithmic. This is contrast with eigsolve, which will always turn to complex arithmetic if the linear map is real but not symmetric. An error will be thrown if complex eigenvalues are encountered within the targetted set.

KrylovKit.realeigsolve — Function

# expert version:
realeigsolve(f, x₀, howmany, which, algorithm; alg_rrule=algorithm)

Compute the first howmany eigenvalues (according to the order specified by which) from the real linear map encoded in the matrix A or by the function f, if it can be guaranteed that these eigenvalues (and thus their associated eigenvectors) are real. An error will be thrown if there are complex eigenvalues within the first howmany eigenvalues.

Return eigenvalues, eigenvectors and a ConvergenceInfo structure.

Note about real linear maps

A function f is said to implement a real linear map if it satisfies f(add(x,y)) = add(f(x), f(y) and f(scale(x, α)) = scale(f(x), α) for vectors x and y and scalars α::Real. Note that this is possible even when the vectors are represented using complex arithmetic. For example, the map f=x-> x + conj(x) represents a real linear map that is not (complex) linear, as it does not satisfy f(scale(x, α)) = scale(f(x), α) for complex scalars α. Note that complex linear maps are always real linear maps and thus can be used in this context, if looking specifically for real eigenvalues that they may have.

Arguments:

The linear map can be an AbstractMatrix (dense or sparse) or a general function or callable object. A starting vector x₀ needs to be provided. Note that x₀ does not need to be of type AbstractVector; any type that behaves as a vector and supports the required interface (see KrylovKit docs) is accepted.

The argument howmany specifies how many eigenvalues should be computed; which specifies which eigenvalues should be targeted. Valid specifications of which for real problems are given by

:LM: eigenvalues of largest magnitude
:LR: eigenvalues with largest (most positive) real part
:SR: eigenvalues with smallest (most negative) real part
EigSorter(f; rev = false): eigenvalues λ that appear first (or last if rev == true) when sorted by f(λ)

Note about selecting `which` eigenvalues

Krylov methods work well for extremal eigenvalues, i.e. eigenvalues on the periphery of the spectrum of the linear map. All of the valid Symbols for which have this property, but could also be specified using EigSorter, e.g. :LM is equivalent to Eigsorter(abs; rev = true). Note that smallest magnitude sorting is obtained using e.g. EigSorter(abs; rev = false), but since no (shift-and)-invert is used, this will only be successful if you somehow know that eigenvalues close to zero are also close to the periphery of the spectrum.

Repeated eigenvalues

From a theoretical point of view, Krylov methods can at most find a single eigenvector associated with a targetted eigenvalue, even if the latter is repeated, i.e. the eigenvalue has multiple linearly independent eigenvectors or thus an eigenspace with dimension (geometric multiplicity) larger than one. In the case of such a repeated eigenvalue, the specific eigenvector that is returned is determined by the starting vector x₀. For large problems, this turns out to be less of an issue in practice, as often a second linearly independent eigenvector is generated out of the numerical noise resulting from the orthogonalisation steps in the Lanczos or Arnoldi iteration. Nonetheless, it is important to take this into account and to try not to depend on this potentially fragile behaviour, especially for smaller problems.

The algorithm argument currently only supports an instance of Arnoldi, which is where the parameters of the Krylov method (such as Krylov dimension and maximum number of iterations) can be specified. Since realeigsolve is less commonly used as eigsolve, it only supports this expert mode call syntax and no convenient keyword interface is currently available.

The keyword argument alg_rrule can be used to specify an algorithm to be used for computing the pullback of realeigsolve in the context of reverse-mode automatic differentation.

Return values:

The return value is always of the form vals, vecs, info = eigsolve(...) with

vals: a Vector containing the eigenvalues, of length at least howmany, but could be longer if more eigenvalues were converged at the same cost. Eigenvalues will be real, an ArgumentError will be thrown if the first howmany eigenvalues ordered according to which of the linear map are not all real.
vecs: a Vector of corresponding eigenvectors, of the same length as vals. Note that eigenvectors are not returned as a matrix, as the linear map could act on any custom Julia type with vector like behavior, i.e. the elements of the list vecs are objects that are typically similar to the starting guess x₀. For a real problem with real eigenvalues, also the eigenvectors will be real and no complex arithmetic is used anywhere.
info: an object of type [ConvergenceInfo], which has the following fields
- info.converged::Int: indicates how many eigenvalues and eigenvectors were actually converged to the specified tolerance tol (see below under keyword arguments)
- info.residual::Vector: a list of the same length as vals containing the residuals info.residual[i] = f(vecs[i]) - vals[i] * vecs[i]
- info.normres::Vector{<:Real}: list of the same length as vals containing the norm of the residual info.normres[i] = norm(info.residual[i])
- info.numops::Int: number of times the linear map was applied, i.e. number of times f was called, or a vector was multiplied with A
- info.numiter::Int: number of times the Krylov subspace was restarted (see below)

Check for convergence

No warning is printed if not all requested eigenvalues were converged, so always check if info.converged >= howmany.

source

Note that reallinsolve, reallssolve and realeigsolve currently only exist with the "expert" mode interface, where the user has to manually specify the underlying Krylov algorithm and its parameters, i.e. GMRES or BiCGStab for reallinsolve, LSMR for reallssolve and Arnoldi for realeigsolve.