Eigenvalue problems
Eigenvalues and eigenvectors
Finding a selection of eigenvalues and corresponding (right) eigenvectors of a linear map can be accomplished with the eigsolve routine:
KrylovKit.eigsolve — Function

```julia
eigsolve(A::AbstractMatrix, [x₀, howmany = 1, which = :LM, T = eltype(A)]; kwargs...)
eigsolve(f, n::Int, [howmany = 1, which = :LM, T = Float64]; kwargs...)
eigsolve(f, x₀, [howmany = 1, which = :LM]; kwargs...)
# expert version:
eigsolve(f, x₀, howmany, which, algorithm; alg_rrule=...)
```

Compute at least howmany eigenvalues from the linear map encoded in the matrix A or by the function f. Return eigenvalues, eigenvectors and a ConvergenceInfo structure.
Arguments:
The linear map can be an AbstractMatrix (dense or sparse) or a general function or callable object. If an AbstractMatrix is used, a starting vector x₀ does not need to be provided; it is then chosen as rand(T, size(A, 1)). If the linear map is encoded more generally as a callable function or method, the best approach is to provide an explicit starting guess x₀. Note that x₀ does not need to be of type AbstractVector; any type that behaves as a vector and supports the required methods (see KrylovKit docs) is accepted. If an integer n is specified instead of x₀, it is assumed that x₀ is a regular vector and it is initialized to rand(T, n), where the default value of T is Float64, unless specified differently.
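For illustration, a minimal sketch of the first three call forms (the matrix and sizes are invented for the example):

```julia
using KrylovKit

A = rand(100, 100); A = (A + A') / 2   # a random Hermitian test matrix

# matrix form: no starting vector needed, one is generated automatically
vals, vecs, info = eigsolve(A, 2, :LR)

# function form with an explicit starting guess
x₀ = rand(100)
vals, vecs, info = eigsolve(x -> A * x, x₀, 2, :LR)

# function form with only the problem dimension; x₀ = rand(Float64, 100) is used
vals, vecs, info = eigsolve(x -> A * x, 100, 2, :LR)
```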
The next arguments are optional, but should typically be specified. howmany specifies how many eigenvalues should be computed; which specifies which eigenvalues should be targeted. Valid specifications of which are given by:
- :LM: eigenvalues of largest magnitude
- :LR: eigenvalues with largest (most positive) real part
- :SR: eigenvalues with smallest (most negative) real part
- :LI: eigenvalues with largest (most positive) imaginary part, only if T <: Complex
- :SI: eigenvalues with smallest (most negative) imaginary part, only if T <: Complex
- EigSorter(f; rev = false): eigenvalues λ that appear first (or last if rev == true) when sorted by f(λ)
Krylov methods work well for extremal eigenvalues, i.e. eigenvalues on the periphery of the spectrum of the linear map. All of the valid Symbols for which have this property, but they could also be specified using EigSorter; e.g. :LM is equivalent to EigSorter(abs; rev = true). Note that smallest magnitude sorting is obtained using e.g. EigSorter(abs; rev = false), but since no shift-and-invert is used, this will only be successful if you somehow know that eigenvalues close to zero are also close to the periphery of the spectrum.
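For instance, a sketch of targeting the smallest-magnitude eigenvalue this way (a small invented symmetric matrix; this converges well only because its smallest-magnitude eigenvalue is still reasonably peripheral):

```julia
using KrylovKit

A = [2.0 1.0 0.0; 1.0 3.0 1.0; 0.0 1.0 4.0]   # small symmetric example
x₀ = rand(3)

# smallest magnitude, without shift-and-invert
vals, vecs, info = eigsolve(A, x₀, 1, EigSorter(abs; rev = false))
```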
From a theoretical point of view, Krylov methods can at most find a single eigenvector associated with a targeted eigenvalue, even if the latter is degenerate. In the case of a degenerate eigenvalue, the specific eigenvector that is returned is determined by the starting vector x₀. For large problems, this turns out to be less of an issue in practice, as often a second linearly independent eigenvector is generated out of the numerical noise resulting from the orthogonalisation steps in the Lanczos or Arnoldi iteration. Nonetheless, it is important to take this into account and to try not to depend on this potentially fragile behaviour, especially for smaller problems.
The argument T acts as a hint in which Number type the computation should be performed, but is not restrictive. If the linear map automatically produces complex values, complex arithmetic will be used even though T<:Real was specified. However, if the linear map and initial guess are real, approximate eigenvalues will be searched for using a partial Schur factorization, which implies that complex conjugate eigenvalues come in pairs and cannot be split. It is then illegal to choose which in a way that would treat λ and conj(λ) differently, i.e. :LI and :SI are invalid, as well as any EigSorter that would lead to by(λ) != by(conj(λ)).
Return values:
The return value is always of the form vals, vecs, info = eigsolve(...) with:
- vals: a Vector containing the eigenvalues, of length at least howmany, but could be longer if more eigenvalues were converged at the same cost. Eigenvalues will be real if Lanczos was used and complex if Arnoldi was used (see below).
- vecs: a Vector of corresponding eigenvectors, of the same length as vals. Note that eigenvectors are not returned as a matrix, as the linear map could act on any custom Julia type with vector-like behaviour, i.e. the elements of the list vecs are objects that are typically similar to the starting guess x₀, up to a possibly different eltype. In particular for a general matrix (i.e. with Arnoldi) the eigenvectors are generally complex and are therefore always returned in a complex number format. When the linear map is a simple AbstractMatrix, vecs will be Vector{Vector{<:Number}}.
- info: an object of type ConvergenceInfo, which has the following fields:
  - info.converged::Int: indicates how many eigenvalues and eigenvectors were actually converged to the specified tolerance tol (see below under keyword arguments)
  - info.residual::Vector: a list of the same length as vals containing the residuals info.residual[i] = f(vecs[i]) - vals[i] * vecs[i]
  - info.normres::Vector{<:Real}: a list of the same length as vals containing the norm of the residual info.normres[i] = norm(info.residual[i])
  - info.numops::Int: number of times the linear map was applied, i.e. number of times f was called, or a vector was multiplied with A
  - info.numiter::Int: number of times the Krylov subspace was restarted (see below)
No warning is printed if not all requested eigenvalues were converged, so always check if info.converged >= howmany.
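A minimal sketch of such a check (the map and sizes are invented):

```julia
using KrylovKit

A = rand(50, 50)
x₀ = rand(50)

vals, vecs, info = eigsolve(A, x₀, 3, :LM)
if info.converged < 3
    @warn "only $(info.converged) of 3 requested eigenvalues converged" info.normres
end
```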
Keyword arguments:
Keyword arguments and their default values are given by:

- verbosity::Int = 0: verbosity level, i.e. 0 (no messages), 1 (single message at the end), 2 (information after every iteration), 3 (information per Krylov step)
- tol::Real: the requested accuracy (corresponding to the 2-norm of the residual for Schur vectors, not the eigenvectors). If you work in e.g. single precision (Float32), you should definitely change the default value.
- krylovdim::Integer: the maximum dimension of the Krylov subspace that will be constructed. Note that the dimension of the vector space is not known or checked, e.g. x₀ does not necessarily need to support the Base.length function. If you know the actual problem dimension is smaller than the default value, it is useful to reduce the value of krylovdim, though in principle this should be detected.
- maxiter::Integer: the number of times the Krylov subspace can be rebuilt; see below for further details on the algorithms.
- orth::Orthogonalizer: the orthogonalization method to be used, see Orthogonalizer
- issymmetric::Bool: if the linear map is symmetric, only meaningful if T<:Real
- ishermitian::Bool: if the linear map is hermitian
- eager::Bool = false: if true, eagerly compute the eigenvalue or Schur decomposition after every expansion of the Krylov subspace to test for convergence, otherwise wait until the Krylov subspace has dimension krylovdim. This can result in a faster return, for example if the initial guess is very good, but also has some overhead, as many more dense Schur factorizations need to be computed.
The default values are given by tol = KrylovDefaults.tol, krylovdim = KrylovDefaults.krylovdim, maxiter = KrylovDefaults.maxiter and orth = KrylovDefaults.orth; see KrylovDefaults for details.
The default value for the last two parameters depends on the method. If an AbstractMatrix is used, issymmetric and ishermitian are checked for that matrix; otherwise the default values are issymmetric = false and ishermitian = T <: Real && issymmetric. When values for these keyword arguments are provided, no checks will be performed even in the matrix case.
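As an illustration, a sketch of a call that overrides some of these defaults (the tolerance and subspace dimension are arbitrary choices for the example):

```julia
using KrylovKit

n = 200
A = rand(n, n); A = (A + A') / 2
x₀ = rand(n)

# tighter tolerance, larger subspace; declaring ishermitian skips the check
vals, vecs, info = eigsolve(x -> A * x, x₀, 2, :SR;
                            tol = 1e-10, krylovdim = 50, ishermitian = true)
```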
The final keyword argument alg_rrule is relevant only when eigsolve is used in a setting where reverse-mode automatic differentiation will be used. A custom ChainRulesCore.rrule is defined for eigsolve, which can be evaluated using different algorithms that can be specified via alg_rrule. A suitable default is chosen, so this keyword argument should only be used when this default choice is failing or not performing efficiently. Check the documentation for more information on the possible values for alg_rrule and their implications on the algorithm being used.
Algorithm
The final (expert) method, without default values and keyword arguments, is the one that is finally called, and can also be used directly. Here, one specifies the algorithm explicitly as either Lanczos, for real symmetric or complex hermitian problems, or Arnoldi, for general problems. Note that these names refer to the process for building the Krylov subspace, but the actual algorithm is an implementation of the Krylov-Schur algorithm, which can dynamically shrink and grow the Krylov subspace, i.e. the restarts are so-called thick restarts where a part of the current Krylov subspace is kept.
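A minimal sketch of the expert call for a Hermitian map (all parameter values are illustrative):

```julia
using KrylovKit

A = rand(100, 100); A = (A + A') / 2
x₀ = rand(100)

alg = Lanczos(; krylovdim = 30, maxiter = 100, tol = 1e-12)
vals, vecs, info = eigsolve(x -> A * x, x₀, 3, :SR, alg)
```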
In the case of a general problem, where the Arnoldi method is used, convergence of an eigenvalue is not based on the norm of the residual norm(f(vecs[i]) - vals[i]*vecs[i]) for the eigenvector, but rather on the norm of the residual for the corresponding Schur vectors.
See also schursolve if you want to use the partial Schur decomposition directly, or if you are not interested in computing the eigenvectors and want to work in real arithmetic all the way through (if the linear map and starting guess are real). If you know that all requested eigenvalues of a real problem will be real, and thus also their associated eigenvectors, you can also use realeigsolve.
Which eigenvalues are targeted can be specified using one of the symbols :LM, :LR, :SR, :LI and :SI for largest magnitude, largest and smallest real part, and largest and smallest imaginary part respectively. Alternatively, one can just specify a general sorting operation using EigSorter.
KrylovKit.EigSorter — Type

```julia
EigSorter(by; rev = false)
```

A simple struct to be used in combination with eigsolve or schursolve to indicate which eigenvalues need to be targeted, namely those that appear first when sorted by by, and possibly in reverse order if rev == true.
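For example, a sketch of a custom sorter targeting the eigenvalues whose imaginary part is largest in magnitude; note that by(λ) == by(conj(λ)) here, so it remains valid for real problems:

```julia
using KrylovKit

A = rand(100, 100)          # a general (nonsymmetric) map
x₀ = rand(100)

sorter = EigSorter(λ -> abs(imag(λ)); rev = true)
vals, vecs, info = eigsolve(A, x₀, 2, sorter)
```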
For a general matrix, eigenvalues and eigenvectors will always be returned with complex values for reasons of type stability. However, if the linear map and initial guess are real, most of the computation is actually performed using real arithmetic, as in fact the first step is to compute an approximate partial Schur factorization. If one is not interested in the eigenvectors, one can also just compute this partial Schur factorization using schursolve, for which only an 'expert' method call is available:
KrylovKit.schursolve — Function

```julia
# expert version:
schursolve(f, x₀, howmany, which, algorithm)
```

Compute a partial Schur decomposition containing howmany eigenvalues from the linear map encoded in the matrix or function A. Return the reduced Schur matrix, the basis of Schur vectors, the extracted eigenvalues and a ConvergenceInfo structure.
See also eigsolve to obtain the eigenvectors instead. For real symmetric or complex hermitian problems, the (partial) Schur decomposition is identical to the (partial) eigenvalue decomposition, and eigsolve should always be used.
Arguments:
The linear map can be an AbstractMatrix (dense or sparse) or a general function or callable object that acts on vector-like objects similar to x₀, which is the starting guess from which a Krylov subspace will be built. howmany specifies how many Schur vectors should be converged before the algorithm terminates; which specifies which eigenvalues should be targeted. Valid specifications of which are:
- :LM: eigenvalues of largest magnitude
- :LR: eigenvalues with largest (most positive) real part
- :SR: eigenvalues with smallest (most negative) real part
- :LI: eigenvalues with largest (most positive) imaginary part, only if T <: Complex
- :SI: eigenvalues with smallest (most negative) imaginary part, only if T <: Complex
- EigSorter(f; rev = false): eigenvalues λ that appear first (or last if rev == true) when sorted by f(λ)
Krylov methods work well for extremal eigenvalues, i.e. eigenvalues on the periphery of the spectrum of the linear map. All of the valid Symbols for which have this property, but they could also be specified using EigSorter; e.g. :LM is equivalent to EigSorter(abs; rev = true). Note that smallest magnitude sorting is obtained using e.g. EigSorter(abs; rev = false), but since no shift-and-invert is used, this will only be successful if you somehow know that eigenvalues close to zero are also close to the periphery of the spectrum.
From a theoretical point of view, Krylov methods can at most find a single eigenvector associated with a targeted eigenvalue, even if the latter is degenerate. In the case of a degenerate eigenvalue, the specific eigenvector that is returned is determined by the starting vector x₀. For large problems, this turns out to be less of an issue in practice, as often a second linearly independent eigenvector is generated out of the numerical noise resulting from the orthogonalisation steps in the Lanczos or Arnoldi iteration. Nonetheless, it is important to take this into account and to try not to depend on this potentially fragile behaviour, especially for smaller problems.
The algorithm argument currently only supports an instance of Arnoldi, which is where the parameters of the Krylov method (such as Krylov dimension and maximum number of iterations) can be specified. Since schursolve is less commonly used than eigsolve, it only supports this expert mode call syntax and no convenient keyword interface is currently available.
Return values:
The return value is always of the form T, vecs, vals, info = schursolve(...) with:

- T: a Matrix containing the partial Schur decomposition of the linear map, i.e. its elements are given by T[i,j] = dot(vecs[i], f(vecs[j])). It is of Schur form, i.e. upper triangular in case of complex arithmetic, and block upper triangular (with at most 2x2 blocks) in case of real arithmetic.
- vecs: a Vector of corresponding Schur vectors, of the same length as vals. Note that Schur vectors are not returned as a matrix, as the linear map could act on any custom Julia type with vector-like behaviour, i.e. the elements of the list vecs are objects that are typically similar to the starting guess x₀, up to a possibly different eltype. When the linear map is a simple AbstractMatrix, vecs will be Vector{Vector{<:Number}}. Schur vectors are by definition orthogonal, i.e. dot(vecs[i], vecs[j]) = I[i,j]. Note that Schur vectors are real if the problem (i.e. the linear map and the initial guess) is real.
- vals: a Vector of eigenvalues, i.e. the diagonal elements of T in case of complex arithmetic, or extracted from the diagonal blocks in case of real arithmetic. Note that vals will always be complex, independent of the underlying arithmetic.
- info: an object of type ConvergenceInfo, which has the following fields:
  - info.converged::Int: indicates how many eigenvalues and Schur vectors were actually converged to the specified tolerance (see below under keyword arguments)
  - info.residual::Vector: a list of the same length as vals containing the actual residuals info.residual[i] = f(vecs[i]) - sum(vecs[j] * T[j, i] for j in 1:i+1), where T[i+1,i] is definitely zero in case of complex arithmetic and possibly zero in case of real arithmetic
  - info.normres::Vector{<:Real}: a list of the same length as vals containing the norm of the residual for every Schur vector, i.e. info.normres[i] = norm(info.residual[i])
  - info.numops::Int: number of times the linear map was applied, i.e. number of times f was called, or a vector was multiplied with A
  - info.numiter::Int: number of times the Krylov subspace was restarted (see below)
No warning is printed if not all requested eigenvalues were converged, so always check if info.converged >= howmany.
Algorithm
The actual algorithm is an implementation of the Krylov-Schur algorithm, where the Arnoldi algorithm is used to generate the Krylov subspace. During the algorithm, the Krylov subspace is dynamically grown and shrunk, i.e. the restarts are so-called thick restarts where a part of the current Krylov subspace is kept.
Note that, for symmetric or hermitian linear maps, the eigenvalue and Schur factorizations are equivalent, and one should only use eigsolve. There is no schursolve using the Lanczos algorithm.
Another example of a possible use case of schursolve is if the linear map is known to have a unique eigenvalue of, e.g., largest magnitude. Then, if the linear map is real valued, that largest-magnitude eigenvalue and its corresponding eigenvector are also real valued. eigsolve will automatically return complex valued eigenvectors for reasons of type stability. However, as the first Schur vector will coincide with the first eigenvector, one can instead use

```julia
T, vecs, vals, info = schursolve(A, x₀, 1, :LM, Arnoldi(...))
```

and use vecs[1] as the real valued eigenvector (after checking info.converged) corresponding to the largest magnitude eigenvalue of A.
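A more concrete sketch of this pattern (the matrix is invented; its positive entries guarantee a dominant real eigenvalue):

```julia
using KrylovKit

A = rand(100, 100) .+ 0.5     # dominant eigenvalue is real and positive
x₀ = rand(100)

T, vecs, vals, info = schursolve(A, x₀, 1, :LM, Arnoldi(; krylovdim = 30))
if info.converged >= 1
    v = vecs[1]               # real-valued dominant eigenvector
    λ = real(vals[1])         # vals is complex-typed, but this value is real
end
```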
More generally, if you want to compute several eigenvalues of a real linear map and you know that all of them are real, so that also the associated eigenvectors will be real, then you can use the realeigsolve method, which is also restricted to the 'expert' method call and which will error if any of the requested eigenvalues turn out to be complex:
KrylovKit.realeigsolve — Function

```julia
# expert version:
realeigsolve(f, x₀, howmany, which, algorithm; alg_rrule=algorithm)
```

Compute the first howmany eigenvalues (according to the order specified by which) from the real linear map encoded in the matrix A or by the function f, with the guarantee that these eigenvalues (and thus their associated eigenvectors) are real. Return eigenvalues, eigenvectors and a ConvergenceInfo structure.
Arguments:
The linear map can be an AbstractMatrix (dense or sparse) or a general function or callable object. A starting vector x₀ needs to be provided. Note that x₀ does not need to be of type AbstractVector; any type that behaves as a vector and supports the required interface (see KrylovKit docs) is accepted.
The argument howmany specifies how many eigenvalues should be computed; which specifies which eigenvalues should be targeted. Valid specifications of which for real problems are given by:
- :LM: eigenvalues of largest magnitude
- :LR: eigenvalues with largest (most positive) real part
- :SR: eigenvalues with smallest (most negative) real part
- EigSorter(f; rev = false): eigenvalues λ that appear first (or last if rev == true) when sorted by f(λ)
Krylov methods work well for extremal eigenvalues, i.e. eigenvalues on the periphery of the spectrum of the linear map. All of the valid Symbols for which have this property, but they could also be specified using EigSorter; e.g. :LM is equivalent to EigSorter(abs; rev = true). Note that smallest magnitude sorting is obtained using e.g. EigSorter(abs; rev = false), but since no shift-and-invert is used, this will only be successful if you somehow know that eigenvalues close to zero are also close to the periphery of the spectrum.
From a theoretical point of view, Krylov methods can at most find a single eigenvector associated with a targeted eigenvalue, even if the latter is degenerate. In the case of a degenerate eigenvalue, the specific eigenvector that is returned is determined by the starting vector x₀. For large problems, this turns out to be less of an issue in practice, as often a second linearly independent eigenvector is generated out of the numerical noise resulting from the orthogonalisation steps in the Lanczos or Arnoldi iteration. Nonetheless, it is important to take this into account and to try not to depend on this potentially fragile behaviour, especially for smaller problems.
The algorithm argument currently only supports an instance of Arnoldi, which is where the parameters of the Krylov method (such as Krylov dimension and maximum number of iterations) can be specified. Since realeigsolve is less commonly used than eigsolve, it only supports this expert mode call syntax and no convenient keyword interface is currently available.
The keyword argument alg_rrule can be used to specify an algorithm to be used for computing the pullback of realeigsolve in the context of reverse-mode automatic differentiation.
Return values:
The return value is always of the form vals, vecs, info = realeigsolve(...) with:
- vals: a Vector containing the eigenvalues, of length at least howmany, but could be longer if more eigenvalues were converged at the same cost. The eigenvalues will be real; an ArgumentError will be thrown if the first howmany eigenvalues of the linear map, ordered according to which, are not all real.
- vecs: a Vector of corresponding eigenvectors, of the same length as vals. Note that eigenvectors are not returned as a matrix, as the linear map could act on any custom Julia type with vector-like behaviour, i.e. the elements of the list vecs are objects that are typically similar to the starting guess x₀. For a real problem with real eigenvalues, also the eigenvectors will be real and no complex arithmetic is used anywhere.
- info: an object of type ConvergenceInfo, which has the following fields:
  - info.converged::Int: indicates how many eigenvalues and eigenvectors were actually converged to the specified tolerance tol (see below under keyword arguments)
  - info.residual::Vector: a list of the same length as vals containing the residuals info.residual[i] = f(vecs[i]) - vals[i] * vecs[i]
  - info.normres::Vector{<:Real}: a list of the same length as vals containing the norm of the residual info.normres[i] = norm(info.residual[i])
  - info.numops::Int: number of times the linear map was applied, i.e. number of times f was called, or a vector was multiplied with A
  - info.numiter::Int: number of times the Krylov subspace was restarted (see below)
No warning is printed if not all requested eigenvalues were converged, so always check if info.converged >= howmany.
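A minimal sketch of such a call (again a made-up real matrix whose dominant eigenvalue is known to be real):

```julia
using KrylovKit

A = rand(80, 80) .+ 0.5   # positive entries: dominant eigenvalue is real
x₀ = rand(80)

vals, vecs, info = realeigsolve(x -> A * x, x₀, 1, :LM, Arnoldi(; krylovdim = 25))
# vals and vecs are real; an ArgumentError would be thrown if the targeted
# eigenvalue had turned out to be complex
```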
Automatic differentiation
The eigsolve (and realeigsolve) routine can be used in conjunction with reverse-mode automatic differentiation, using AD engines that are compatible with the ChainRules ecosystem. The adjoint problem of an eigenvalue problem is a linear problem, although it can also be formulated as an eigenvalue problem. Details about this approach will be published in a forthcoming manuscript.
In either case, the adjoint problem requires the adjoint[1] of the linear map. If the linear map is an AbstractMatrix instance, its adjoint will be used in the rrule. If the linear map is implemented as a function f, then the AD engine itself is used to compute the corresponding adjoint via ChainRulesCore.rrule_via_ad(config, f, x). The specific base point x at which this adjoint is computed should not affect the result if f properly represents a linear map. Furthermore, the linear map is the only argument that affects the eigsolve output (from a theoretical perspective, the starting vector and algorithm parameters should have no effect), so that this is where the adjoint variables need to be propagated to and have a nonzero effect.
The adjoint problem (also referred to as cotangent problem) can thus be solved as a linear problem or as an eigenvalue problem. Note that this eigenvalue problem is never symmetric or Hermitian, even if the primal problem is. The different implementations of the rrule can be selected using the alg_rrule keyword argument. If a linear solver such as GMRES or BiCGStab is specified, the adjoint problem requires solving a number of linear problems equal to the number of requested eigenvalues and eigenvectors. If an eigenvalue solver is specified, for which Arnoldi is essentially the only option, then the adjoint problem is solved as a single (but larger) eigenvalue problem.
Note that the phase of an eigenvector is not uniquely determined. Hence, a well-defined cost function constructed from eigenvectors should depend on these in such a way that its value is not affected by changing the phase of those eigenvectors, i.e. the cost function should be 'gauge invariant'. If this is not the case, the cost function is said to be 'gauge dependent', and this can be detected in the resulting adjoint variables for those eigenvectors. The KrylovKit rrule for eigsolve will print a warning if it detects from the incoming adjoint variables that the cost function is gauge dependent. This warning can be suppressed by passing alg_rrule an algorithm with verbosity = -1.
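As an illustration, a hedged sketch of differentiating through eigsolve with Zygote (assuming Zygote is available as the AD engine; the cost function below depends only on an eigenvalue and is therefore gauge invariant):

```julia
using KrylovKit, Zygote

x₀ = rand(40)

# scalar, gauge-invariant cost: the largest-real-part eigenvalue of a symmetric map
function cost(A)
    vals, vecs, info = eigsolve(A, x₀, 1, :LR; ishermitian = true)
    return vals[1]
end

A = rand(40, 40); A = (A + A') / 2
g = Zygote.gradient(cost, A)[1]   # evaluated via the custom rrule for eigsolve
```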
Generalized eigenvalue problems
Generalized eigenvalues λ and corresponding vectors x of the generalized eigenvalue problem $A x = λ B x$ can be obtained using the method geneigsolve. Currently, there is only one algorithm, which does not require inverses of A or B, but is restricted to symmetric or hermitian generalized eigenvalue problems where the matrix or linear map B is positive definite. Note that this is not reflected in the default values for the keyword arguments issymmetric, ishermitian and isposdef, so these should be set explicitly in order to comply with this restriction. If A and B are actual instances of AbstractMatrix, the default values for the keyword arguments will try to check these properties explicitly.
KrylovKit.geneigsolve — Function

```julia
geneigsolve((A::AbstractMatrix, B::AbstractMatrix), [howmany = 1, which = :LM,
            T = promote_type(eltype(A), eltype(B))]; kwargs...)
geneigsolve(f, n::Int, [howmany = 1, which = :LM, T = Float64]; kwargs...)
geneigsolve(f, x₀, [howmany = 1, which = :LM]; kwargs...)
# expert version:
geneigsolve(f, x₀, howmany, which, algorithm)
```
Compute at least howmany generalized eigenvalues $λ$ and generalized eigenvectors $x$ of the form $(A - λB)x = 0$, where A and B are either instances of AbstractMatrix, or some function that implements the matrix-vector product. In case functions are used, one could either specify the action of A and B via a tuple of two functions (or a function and an AbstractMatrix), or one could use a single function that takes a single argument x and returns two results, corresponding to A*x and B*x. Return the computed eigenvalues, eigenvectors and a ConvergenceInfo structure.
Arguments:
The first argument is either a tuple of two linear maps, so a function or an AbstractMatrix for either of them, representing the action of A and B. Alternatively, a single function can be used that takes a single argument x and returns the equivalent of (A*x, B*x) as result. This latter form is compatible with the do block syntax of Julia. If an AbstractMatrix is used for either A or B, a starting vector x₀ does not need to be provided; it is then chosen as rand(T, size(A,1)) if A is an AbstractMatrix (or similarly if only B is an AbstractMatrix). Here T = promote_type(eltype(A), eltype(B)) if both A and B are instances of AbstractMatrix, or just the eltype of whichever is an AbstractMatrix. If both A and B are encoded more generally as a callable function or method, the best approach is to provide an explicit starting guess x₀. Note that x₀ does not need to be of type AbstractVector; any type that behaves as a vector and supports the required methods (see KrylovKit docs) is accepted. If an integer n is specified instead of x₀, it is assumed that x₀ is a regular vector and it is initialized to rand(T,n), where the default value of T is Float64, unless specified differently.
The next arguments are optional, but should typically be specified. howmany specifies how many eigenvalues should be computed; which specifies which eigenvalues should be targeted. Valid specifications of which are given by:
- :LM: eigenvalues of largest magnitude
- :LR: eigenvalues with largest (most positive) real part
- :SR: eigenvalues with smallest (most negative) real part
- :LI: eigenvalues with largest (most positive) imaginary part, only if T <: Complex
- :SI: eigenvalues with smallest (most negative) imaginary part, only if T <: Complex
- EigSorter(f; rev = false): eigenvalues λ that appear first (or last if rev == true) when sorted by f(λ)
Krylov methods work well for extremal eigenvalues, i.e. eigenvalues on the periphery of the spectrum of the linear map. Even with ClosestTo, no shift-and-invert is performed. This is useful if, e.g., you know the spectrum to be within the unit circle in the complex plane, and want to target the eigenvalues closest to the value λ = 1.
The argument T acts as a hint in which Number type the computation should be performed, but is not restrictive. If the linear map automatically produces complex values, complex arithmetic will be used even though T<:Real was specified.
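A minimal sketch of the two main call forms, for an invented symmetric pencil with positive definite B:

```julia
using KrylovKit, LinearAlgebra

n = 100
A = rand(n, n); A = (A + A') / 2    # symmetric A
B = rand(n, n); B = B * B' + I      # symmetric positive definite B

# tuple form; a starting vector is generated automatically
vals, vecs, info = geneigsolve((A, B), 1, :SR; issymmetric = true, isposdef = true)

# single-function form returning (A*x, B*x), compatible with do-block syntax
x₀ = rand(n)
vals, vecs, info = geneigsolve(x₀, 1, :SR; issymmetric = true, isposdef = true) do x
    (A * x, B * x)
end
```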
Return values:
The return value is always of the form vals, vecs, info = geneigsolve(...) with:
- vals: a Vector containing the eigenvalues, of length at least howmany, but could be longer if more eigenvalues were converged at the same cost.
- vecs: a Vector of corresponding eigenvectors, of the same length as vals. Note that eigenvectors are not returned as a matrix, as the linear map could act on any custom Julia type with vector-like behaviour, i.e. the elements of the list vecs are objects that are typically similar to the starting guess x₀, up to a possibly different eltype. When the linear map is a simple AbstractMatrix, vecs will be Vector{Vector{<:Number}}.
- info: an object of type ConvergenceInfo, which has the following fields:
  - info.converged::Int: indicates how many eigenvalues and eigenvectors were actually converged to the specified tolerance tol (see below under keyword arguments)
  - info.residual::Vector: a list of the same length as vals containing the residuals info.residual[i] = f(vecs[i]) - vals[i] * vecs[i]
  - info.normres::Vector{<:Real}: a list of the same length as vals containing the norm of the residual info.normres[i] = norm(info.residual[i])
  - info.numops::Int: number of times the linear map was applied, i.e. number of times f was called, or a vector was multiplied with A
  - info.numiter::Int: number of times the Krylov subspace was restarted (see below)
No warning is printed if not all requested eigenvalues were converged, so always check if info.converged >= howmany.
Keyword arguments:
Keyword arguments and their default values are given by:

- verbosity::Int = 0: verbosity level, i.e. 0 (no messages), 1 (single message at the end), 2 (information after every iteration), 3 (information per Krylov step)
- tol::Real: the requested accuracy, relative to the 2-norm of the corresponding eigenvectors, i.e. convergence is achieved if norm((A - λB)x) < tol * norm(x). Because eigenvectors are now normalised such that dot(x, B*x) = 1, norm(x) is not automatically one. If you work in e.g. single precision (Float32), you should definitely change the default value.
- krylovdim::Integer: the maximum dimension of the Krylov subspace that will be constructed. Note that the dimension of the vector space is not known or checked, e.g. x₀ does not necessarily need to support the Base.length function. If you know the actual problem dimension is smaller than the default value, it is useful to reduce the value of krylovdim, though in principle this should be detected.
- maxiter::Integer: the number of times the Krylov subspace can be rebuilt; see below for further details on the algorithms.
- orth::Orthogonalizer: the orthogonalization method to be used, see Orthogonalizer
- issymmetric::Bool: if both linear maps A and B are symmetric, only meaningful if T<:Real
- ishermitian::Bool: if both linear maps A and B are hermitian
- isposdef::Bool: if the linear map B is positive definite
The default values are given by tol = KrylovDefaults.tol, krylovdim = KrylovDefaults.krylovdim, maxiter = KrylovDefaults.maxiter and orth = KrylovDefaults.orth; see KrylovDefaults for details.
The default value for the last three parameters depends on the method. If an AbstractMatrix is used, issymmetric, ishermitian and isposdef are checked for that matrix; otherwise the default values are issymmetric = false and ishermitian = T <: Real && issymmetric. When values are provided, no checks will be performed even in the matrix case.
Algorithm
The last method, without default values and keyword arguments, is the one that is finally called, and can also be used directly. Here the algorithm is specified, though currently only GolubYe is available. The Golub-Ye algorithm is an algorithm for solving hermitian (symmetric) generalized eigenvalue problems A x = λ B x with positive definite B, without the need for inverting B. It builds a Krylov subspace of size krylovdim starting from an estimate x by acting with (A - ρ(x) B), where ρ(x) = dot(x, A*x) / dot(x, B*x), and employing the Lanczos algorithm. This process is repeated at most maxiter times. In every iteration k > 1, the subspace will also be expanded to size krylovdim + 1 by adding $x_k - x_{k-1}$, which is known as the LOPCG correction and was suggested by Money and Ye. With krylovdim = 2, this algorithm becomes equivalent to LOPCG.
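A hedged sketch of this expert call (the pencil is invented and all parameter values are illustrative):

```julia
using KrylovKit, LinearAlgebra

n = 100
A = rand(n, n); A = (A + A') / 2
B = rand(n, n); B = B * B' + I
x₀ = rand(n)

alg = GolubYe(; krylovdim = 20, maxiter = 100, tol = 1e-10)
vals, vecs, info = geneigsolve(x -> (A * x, B * x), x₀, 1, :SR, alg)
```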
While the only algorithm so far is restricted to symmetric/hermitian generalized eigenvalue problems with positive definite B, this is not reflected in the default values for the keyword arguments issymmetric or ishermitian and isposdef. Make sure to set these to true where applicable, and understand the implications of using this algorithm otherwise.
Currently, there is no rrule and thus no automatic differentiation support for geneigsolve.
[1] For a linear map, the adjoint or pullback required in the reverse-order chain rule coincides with its (conjugate) transpose, at least with respect to the standard Euclidean inner product.