Linear problems
Linear systems
Linear systems are of the form A*x=b
where A
should be a linear map that has the same type of output as input, i.e. the solution x
should be of the same type as the right hand side b
. They can be solved using the function linsolve
:
KrylovKit.linsolve
— Functionlinsolve(A::AbstractMatrix, b::AbstractVector, [x₀, a₀::Number = 0, a₁::Number = 1]; kwargs...)
linsolve(f, b, [x₀, a₀::Number = 0, a₁::Number = 1]; kwargs...)
# expert version:
linsolve(f, b, x₀, algorithm, [a₀::Number = 0, a₁::Number = 1]; alg_rrule=algorithm)
Compute a solution x
to the linear system (a₀ + a₁ * A)*x = b
or a₀ * x + a₁ * f(x) = b
, possibly using a starting guess x₀
. Return the approximate solution x
and a ConvergenceInfo
structure.
Arguments:
The linear map can be an AbstractMatrix
(dense or sparse) or a general function or callable object. If no initial guess is specified, it is chosen as (zero(a₀)*zero(a₁))*b
which should generate an object similar to b
but initialized with zeros. The numbers a₀
and a₁
are optional arguments; they are applied implicitly, i.e. they do not contribute the computation time of applying the linear map or to the number of operations on vectors of type x
and b
.
Return values:
The return value is always of the form x, info = linsolve(...)
with
x
: the approximate solution to the problem, similar type as the right hand sideb
but possibly with a differentscalartype
info
: an object of type [ConvergenceInfo
], which has the following fieldsinfo.converged::Int
: takes value 0 or 1 depending on whether the solution was converged up to the requested toleranceinfo.residual
: residualb - f(x)
of the approximate solutionx
info.normres::Real
: norm of the residual, i.e.norm(info.residual)
info.numops::Int
: total number of times that the linear map was applied, i.e. the number of times thatf
was called, or a vector was multiplied withA
info.numiter::Int
: number of times the Krylov subspace was restarted (see below)
No warning is printed if no converged solution was found, so always check if info.converged == 1
.
Keyword arguments:
Keyword arguments are given by:
verbosity::Int = 0
: verbosity level, i.e.- 0 (suppress all messages)
- 1 (only warnings)
- 2 (information at the beginning and end)
- 3 (progress info after every iteration)
atol::Real
: the requested accuracy, i.e. absolute tolerance, on the norm of the residual.rtol::Real
: the requested accuracy on the norm of the residual, relative to the norm of the right hand sideb
.tol::Real
: the requested accuracy on the norm of the residual that is actually used by the algorithm; it defaults tomax(atol, rtol*norm(b))
. So either useatol
andrtol
or directly usetol
(in which case the value ofatol
andrtol
will be ignored).krylovdim::Integer
: the maximum dimension of the Krylov subspace that will be constructed.maxiter::Integer
: the number of times the Krylov subspace can be rebuilt; see below for further details on the algorithms.orth::Orthogonalizer
: the orthogonalization method to be used, seeOrthogonalizer
issymmetric::Bool
: if the linear map is symmetric, only meaningful ifT<:Real
ishermitian::Bool
: if the linear map is hermitianisposdef::Bool
: if the linear map is positive definite
The default values are given by atol = KrylovDefaults.tol[]
, rtol = KrylovDefaults.tol[]
, tol = max(atol, rtol*norm(b))
, krylovdim = KrylovDefaults.krylovdim[]
, maxiter = KrylovDefaults.maxiter[]
, orth = KrylovDefaults.orth
; see KrylovDefaults
for details.
The default value for the last three parameters depends on the method. If an AbstractMatrix
is used, issymmetric
, ishermitian
and isposdef
are checked for that matrix, ortherwise the default values are issymmetric = false
, ishermitian = T <: Real && issymmetric
and isposdef = false
.
The final keyword argument alg_rrule
is relevant only when linsolve
is used in a setting where reverse-mode automatic differentation will be used. A custom ChainRulesCore.rrule
is defined for linsolve
, which can be evaluated using different algorithms that can be specified via alg_rrule
. As the pullback of linsolve
involves solving a linear system with the (Hermitian) adjoint of the linear map, the default value is to use the same algorithm. This keyword argument should only be used when this default choice is failing or not performing efficiently. Check the documentation for more information on the possible values for alg_rrule
and their implications on the algorithm being used.
Algorithms
The final (expert) method, without default values and keyword arguments, is the one that is finally called, and can also be used directly. Here, one specifies the algorithm explicitly. Currently, only CG
, GMRES
, BiCGStab
and LSMR
are implemented, where CG
is chosen if isposdef == true
and GMRES
is chosen otherwise. Note that in standard GMRES
terminology, our parameter krylovdim
is referred to as the restart parameter, and our maxiter
parameter counts the number of outer iterations, i.e. restart cycles. In CG
, the Krylov subspace is only implicit because short recurrence relations are being used, and therefore no restarts are required. Therefore, we pass krylovdim*maxiter
as the maximal number of CG iterations that can be used by the CG
algorithm.
Automatic differentation
The linsolve
routine can be used in conjunction with reverse-mode automatic differentiation, using AD engines that are compatible with the ChainRules ecosystem. The adjoint problem of a linear problem is again a linear problem, that requires the adjoint[1] of the linear map. If the linear map is an AbstractMatrix
instance, its adjoint
will be used in the rrule
. If the linear map is implemented as a function f
, then the AD engine itself is used to compute the corresponding adjoint via ChainRulesCore.rrule_via_ad(config, f, x)
. The specific base point x
at which this adjoint is computed should not affect the result if f
properly represents a linear map. Furthermore, the linsolve
output is only affected by the linear map argument and the right hand side argument b
(from a theoretical perspective, the starting vector and algorithm parameters should have no effect), so that these two arguments are where the adjoint variables need to be propagated to and have a nonzero effect.
The adjoint linear problem (also referred to as cotangent problem) is by default solved using the same algorithms as the primal problem. However, the rrule
can be customized to use a different Krylov algorithm, by specifying the alg_rrule
keyword argument. Its value can take any of the values as the algorithm
argument in linsolve
.
with its (conjugate) transpose, at least with respect to the standard Euclidean inner product.
- 1For a linear map, the adjoint or pullback required in the reverse-order chain rule coincides