Index notation with macros
The main export and main functionality of TensorOperations.jl is the @tensor
macro
TensorOperations.@tensor — Macro

@tensor(tensor_expr; kwargs...)
@tensor [kw_expr...] tensor_expr
Specify one or more tensor operations using Einstein's index notation. Indices can be chosen to be arbitrary Julia variable names, or integers. When contracting several tensors together, this will be evaluated as pairwise contractions in left to right order, unless the so-called NCON style is used (positive integers for contracted indices and negative indices for open indices).
Additional keyword arguments may be passed to control the behavior of the parser:
- order: a list of contraction indices of the form order=(...,), specifying the order in which they will be contracted.
- opt: contraction order optimization, similar to @tensoropt. Can be either a boolean or an OptExpr.
- contractcheck: boolean flag to enable a runtime check for contractibility of indices, with clearer error messages.
- costcheck: can be either :warn or :cache, and adds runtime checks that compare the compile-time contraction order to the optimal order computed for the actual runtime tensor costs. If costcheck == :warn, warnings are printed for every sub-optimal contraction that is encountered. If costcheck == :cache, only the most costly run of a particular sub-optimal contraction will be cached in TensorOperations.costcache. In both cases, a suggestion for the order keyword argument is computed to switch to the optimal contraction order.
- backend: inserts an implementation backend as a final argument in the different tensor operation calls in the generated code.
- allocator: inserts an allocation strategy as a final argument in the tensor allocation calls in the generated code.
The functionality and configurability of @tensor and some of its relatives are explained in detail on this page.
The @tensor
macro
The preferred way to specify (a sequence of) tensor operations is by using the @tensor macro, which accepts an index notation format, a.k.a. Einstein notation (and in particular, Einstein's summation convention).
This can most easily be explained using a simple example:
using TensorOperations
α = randn()
A = randn(5, 5, 5, 5, 5, 5)
B = randn(5, 5, 5)
C = randn(5, 5, 5)
D = zeros(5, 5, 5)
@tensor begin
D[a, b, c] = A[a, e, f, c, f, g] * B[g, b, e] + α * C[c, a, b]
E[a, b, c] := A[a, e, f, c, f, g] * B[g, b, e] + α * C[c, a, b]
end
The first important observation is the use of two different assignment operators within the body of the @tensor call. The regular assignment operator = stores the result of the tensor expression on the right hand side in the existing tensor D, whereas the 'definition' operator := creates a new tensor E with the correct properties to hold the result. Nonetheless, the contents of D and E will be equal.
Following Einstein's summation convention, those contents are computed in a number of steps involving the three primitive tensor operations. In this particular example, the first step traces/contracts the 3rd and 5th index of array A, the result of which is stored in a temporary array which thus needs to be created. This temporary is then contracted with array B, by contracting its 2nd index with the last index of B and its last index with the first index of B. The result is stored in D in the first line, or in a newly allocated array which ends up being E in the second line. Note that the index order of D and E is such that the first index corresponds to the first index of A, the second index corresponds to the second index of B, and the third index corresponds to the fourth index of A. Finally, the array C (scaled with α) is added to this result in place, which requires a further index permutation.
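To make this step-by-step description concrete, the result of the example above can be checked against a naive loop implementation of the same index expression (a quick sanity check, redefining the arrays for self-containedness; this is not part of the package):

```julia
using TensorOperations

α = randn()
A = randn(5, 5, 5, 5, 5, 5)
B = randn(5, 5, 5)
C = randn(5, 5, 5)

@tensor D[a, b, c] := A[a, e, f, c, f, g] * B[g, b, e] + α * C[c, a, b]

# naive reference: explicit sums over the closed indices e, f, g
Dref = zeros(5, 5, 5)
for a in 1:5, b in 1:5, c in 1:5
    s = 0.0
    for e in 1:5, f in 1:5, g in 1:5
        s += A[a, e, f, c, f, g] * B[g, b, e]
    end
    Dref[a, b, c] = s + α * C[c, a, b]
end

D ≈ Dref  # true
```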
The index pattern is analyzed at compile time and expanded to a set of calls to the basic tensor operations, i.e. tensoradd!, tensortrace! and tensorcontract!. Temporaries are created where necessary, as these building blocks operate pairwise on the input tensors. The generated code can easily be inspected:
using TensorOperations
@macroexpand @tensor E[a, b, c] := A[a, e, f, c, f, g] * B[g, b, e] + α * C[c, a, b]
quote
var"##T_##E_A#231#232" = TensorOperations.scalartype(A)
var"##E_A#231" = TensorOperations.tensoralloc_add(var"##T_##E_A#231#232", A, ((1, 4), (2, 6)), false, Val{true}())
var"##E_A#231" = TensorOperations.tensortrace!(var"##E_A#231", A, ((1, 4), (2, 6)), ((3,), (5,)), false, VectorInterface.One(), VectorInterface.Zero())
var"##T_E#233" = TensorOperations.promote_add(TensorOperations.promote_contract(TensorOperations.scalartype(A), TensorOperations.scalartype(B)), Base.promote_op(*, TensorOperations.scalartype(C), TensorOperations.scalartype(α)))
E = TensorOperations.tensoralloc_contract(var"##T_E#233", var"##E_A#231", ((1, 2), (3, 4)), false, B, ((3, 1), (2,)), false, ((1, 3, 2), ()), Val{false}())
E = TensorOperations.tensorcontract!(E, var"##E_A#231", ((1, 2), (3, 4)), false, B, ((3, 1), (2,)), false, ((1, 3, 2), ()), VectorInterface.One(), VectorInterface.Zero())
TensorOperations.tensorfree!(var"##E_A#231")
E
E = TensorOperations.tensoradd!(E, C, ((2, 3, 1), ()), false, α, VectorInterface.One())
end
The different functions into which this tensor expression is decomposed are discussed in more detail in the Implementation section of this manual.
In this example, the tensor indices were labeled with arbitrary letters; longer names could also have been used. In fact, any proper Julia variable name constitutes a valid label. Note though that these labels are never interpreted as existing Julia variables. Within the @tensor macro they are converted into symbols and then used as dummy names, whose only role is to distinguish the different indices. Their specific value bears no meaning. They also do not appear in the generated code as illustrated above. This implies, in particular, that the specific tensor operations defined by the code inside the @tensor environment are completely specified at compile time. Various remarks regarding the index notation are in order.
TensorOperations.jl only supports the strict Einstein summation convention. This implies that there are two types of indices. Either an index label appears once in every term of the right hand side, and it also appears on the left hand side; we refer to the corresponding indices as open or free. Alternatively, an index label appears exactly twice within a given term on the right hand side; the corresponding indices are referred to as closed or contracted, i.e. the pair of indices takes equal values and is summed over its (equal) range. This is known as a contraction, either an outer contraction (between two indices of two different tensors) or an inner contraction (a.k.a. trace, between two indices of a single tensor). More liberal use of the index notation, such as simultaneous summation over three or more indices, or an open index appearing simultaneously in different tensor factors, is not supported by TensorOperations.jl.
Aside from valid Julia identifiers, index labels can also be specified using literal integer constants, or using a combination of integers and symbols. Furthermore, it is also allowed to use primes (i.e. Julia's adjoint operator) to denote different indices, including multiple subsequent primes. The following expression thus computes the same result as the example above:

@tensor D[å'', ß, clap'] = A[å'', 1, -3, clap', -3, 2] * B[2, ß, 1] + α * C[clap', å'', ß]
If only integers are used for specifying index labels, this can be used to control the pairwise contraction order via the well-known NCON convention, where open indices in the left hand side are labelled by negative integers -1, -2, -3, …, whereas contraction indices are labelled with positive integers 1, 2, … Since the index order of the left hand side is in that case clear from the right hand side expression, the left hand side can be indexed with [:], which is automatically replaced with all negative integers appearing in the right hand side, in decreasing order. The value of the labels for the contraction indices determines the pairwise contraction order. If multiple tensors need to be contracted, a first temporary will be created consisting of the contraction of the pair of tensors that share contraction index 1, then the pair of tensors that share contraction index 2 (if not contracted away in the first pair) will be contracted, and so forth. The next subsection explains contraction order in more detail and gives some useful examples, as the example above only includes a single pair of tensors to be contracted.

Index labels always appear in square brackets
[ ... ], but the labels can be separated either by commas, as in D[a, b, c] (yielding a :ref expression), or by spaces, as in D[a b c] (yielding a :typed_hcat expression).

There is also the option to separate the indices into two groups using a semicolon. This can be useful for tensor types which have two distinct sets of indices, but has no effect when using Julia AbstractArray objects. While in principle both spaces and commas can be used within the two groups, e.g. as in D[a, b; c] or D[a b; c], there are some restrictions because of accepted Julia syntax. Both groups of indices should use the same convention. If there is only a single index in the first group, the second group should use spaces to constitute a valid expression. Finally, having no indices in the first group is only possible by writing an empty tuple; the second group can then use spaces, or also contain the indices as a tuple, i.e. both D[(); a b c] and D[(); (a, b, c)] are valid. Writing the two groups of indices within tuples (which use a comma as natural separator), with both tuples separated by a semicolon, is always valid syntax, irrespective of the number of indices in each group.

Index expressions
[...] are only interpreted as index notation at the highest level. For example, if you want to multiply two matrices which are stored in a list, you can write

@tensor result[i,j] := list[1][i,k] * list[2][k,j]

However, if both are stored as the slices of a 3-way array, you cannot write

@tensor result[i,j] := list[i,k,1] * list[k,j,2]
Rather, you should use
@tensor result[i,j] := list[:,:,1][i,k] * list[:,:,2][k,j]
or, if you want to avoid additional allocations
@tensor result[i,j] := view(list,:,:,1)[i,k] * view(list,:,:,2)[k,j]
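As a small illustration of the NCON-style labelling described above, here is a matrix product written both with letter labels and with integer labels plus the [:] left hand side (a minimal sketch):

```julia
using TensorOperations

A = randn(3, 4)
B = randn(4, 5)

# letter labels: k appears twice and is summed over
@tensor C1[i, j] := A[i, k] * B[k, j]

# NCON style: open indices are -1 and -2; the contracted index is 1
@tensor C2[:] := A[-1, 1] * B[1, -2]

C1 ≈ C2 ≈ A * B  # true
```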
Note, finally, that the @tensor
specifier can be put in front of a single tensor expression, or in front of a begin ... end
block to group and evaluate different expressions at once. Within an @tensor begin ... end
block, the @notensor
macro can be used to annotate indexing expressions that need to be interpreted literally.
TensorOperations.@notensor — Macro

@notensor(block)
Marks a block which should be ignored within an @tensor
environment. Has no effect outside of @tensor
.
As an illustration, note that the previous code example about matrix multiplication with matrices stored in a 3-way array can now also be written as
@tensor begin
@notensor A = list[:,:,1]
@notensor B = list[:,:,2]
result[i,j] = A[i,k] * B[k,j]
end
Contraction order specification and optimisation
A contraction of several (more than two) tensors, as in
@tensor D[a, d, j, i, g] := A[a, b, c, d, e] * B[b, e, f, g] * C[c, f, i, j]
is known as a tensor network and is generically evaluated as a sequence of pairwise contractions. In the example above, this contraction is evaluated using Julia's default left to right order, i.e. as (A[a, b, c, d, e] * B[b, e, f, g]) * C[c, f, i, j]
. There are however different strategies to modify this order.
Explicit parentheses can be used to group subnetworks within a tensor network; these will be evaluated first. Parentheses around subexpressions are always respected by the @tensor macro.

As explained in the previous subsection, if one respects the NCON convention of specifying indices, i.e. positive integers for the contracted indices and negative integers for the open indices, the different factors will be reordered so that the pairwise tensor contractions contract over indices with smaller integer labels first. For example,
@tensor D[:] := A[-1, 3, 1, -2, 2] * B[3, 2, 4, -5] * C[1, 4, -4, -3]
will be evaluated as
(A[-1, 3, 1, -2, 2] * C[1, 4, -4, -3]) * B[3, 2, 4, -5]. Furthermore, in that case the indices of the output tensor (D in this case) do not need to be specified (using [:] instead), and will be chosen as (-1, -2, -3, -4, -5). Note that if two tensors are contracted, all contraction indices among them will be contracted, even if there are additional contraction indices whose labels are higher positive numbers. For example,

@tensor D[:] := A[-1, 3, 2, -2, 1] * B[3, 1, 4, -5] * C[2, 4, -4, -3]

amounts to the original left to right order, because A and B share the first contraction index 1. When A and B are contracted, the contraction with label 3 will also be performed, even though the contraction index with label 2 has not yet been 'processed'.

A specific contraction order can be manually specified by supplying an order keyword argument to the @tensor macro. The value is a tuple of the contraction indices in the order that they should be dealt with, e.g. the default order could be changed to first contract A with C using

@tensor order=(c, b, e, f) begin
    D[a, d, j, i, g] := A[a, b, c, d, e] * B[b, e, f, g] * C[c, f, i, j]
end
Here, the same comment as in the NCON style applies: once two tensors are contracted because they share an index label which is next in the order list, all other indices with shared labels among them will be contracted as well, irrespective of their order.
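For the network above, one can verify that a manually specified order yields the same result as the default left-to-right evaluation (a quick check with small random tensors):

```julia
using TensorOperations

A = randn(3, 3, 3, 3, 3)
B = randn(3, 3, 3, 3)
C = randn(3, 3, 3, 3)

# default left-to-right order: (A * B) * C
@tensor D1[a, d, j, i, g] := A[a, b, c, d, e] * B[b, e, f, g] * C[c, f, i, j]

# contract A with C first (they share index c), then contract with B
@tensor order = (c, b, e, f) begin
    D2[a, d, j, i, g] := A[a, b, c, d, e] * B[b, e, f, g] * C[c, f, i, j]
end

D1 ≈ D2  # true
```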
In the case of more complex tensor networks, the optimal contraction order cannot always easily be guessed or determined by inspection. It is then useful to be able to optimize the contraction order automatically, given a model for the complexity of contracting the different tensors in a particular order. This functionality is provided, with a cost function that models the computational complexity by counting the number of scalar multiplications. This minimisation problem is solved using the algorithm that was described in Physical Review E 90, 033315 (2014). For a given tensor network contraction, this algorithm is run once at compile time, while lowering the tensor expression, and the outcome is hard-coded in the expression resulting from the macro expansion. While the computational complexity of this optimisation algorithm itself scales exponentially in the number of tensors involved in the network, it should still be acceptably fast (milliseconds up to a few seconds at most) for tensor network contractions with up to around 30 tensors. Information about the optimization process can be obtained during compilation by using the alternative macro @tensoropt_verbose.
As the cost is determined at compile time, the cost model does not use actual tensor properties (e.g. size(A, i) in the case of arrays). Instead, the cost or extent associated with every index can be specified in various ways: using integers, floating point numbers, or some arbitrary univariate polynomial of an abstract variable, e.g. χ. In the latter case, the optimization assumes the asymptotic limit of large χ.
TensorOperations.@tensoropt — Macro

@tensoropt(optex, block)
@tensoropt(block)
Specify one or more tensor operations using Einstein's index notation. Indices can be chosen to be arbitrary Julia variable names, or integers. When contracting several tensors together, the macro will determine (at compile time) the optimal contraction order depending on the cost associated to the individual indices. If no optex is provided, all indices are assumed to have an abstract scaling χ which is optimized in the asymptotic limit of large χ.
The cost can be specified in the following ways:
# cost χ for all indices (a, b, c, d, e, f, g)
@tensoropt D[a, b, c, d] := A[a, e, c, f] * B[g, d, e] * C[g, f, b]
# asymptotic cost χ for indices (a, b, c, e), other indices (d, f, g, h) have cost 1
@tensoropt (a, b, c, e) C[a, b, c, d] := A[a, e, c, f, h] * B[f, g, e, b] * C[g, d, h]
# cost 1 for indices (a, b, c, e); other indices (d, f, g, h) have asymptotic cost χ
@tensoropt !(a, b, c, e) C[a, b, c, d] := A[a, e, c, f, h] * B[f, g, e, b] * C[g, d, h]
# cost as specified for listed indices, unlisted indices have cost 1 (any symbol for χ can be used)
@tensoropt (a => χ, b => χ^2, c => 2 * χ, e => 5) begin
C[a, b, c, d] := A[a, e, c, f, h] * B[f, g, e, b] * C[g, d, h]
end
Note that @tensoropt will optimize any tensor contraction sequence it encounters in the (block of) expressions. It will however not break apart expressions that have been explicitly grouped with parentheses, i.e. in

@tensoropt C[a, b, c, d] := A[a, e, c, f, h] * (B[f, g, e, b] * C[g, d, h])

it will always contract B and C first. For a single tensor contraction sequence, the optimal contraction order and associated (asymptotic) cost can be obtained using @optimalcontractiontree.
TensorOperations.@tensoropt_verbose — Macro

@tensoropt_verbose(optex, block)
@tensoropt_verbose(block)

Similar to @tensoropt, but prints detailed information about the optimization process to the standard output.
As a final remark, the optimization can also be accessed directly from @tensor by specifying the additional keyword argument opt=true, which will use the default cost model, or opt=optex to further specify the costs.
# cost χ for all indices (a, b, c, d, e, f, g)
@tensor opt=true D[a, b, c, d] := A[a, e, c, f] * B[g, d, e] * C[g, f, b]
# cost χ for indices (a, b, c, e), other indices (d, f, g) have cost 1
@tensor opt=(a, b, c, e) D[a, b, c, d] := A[a, e, c, f] * B[g, d, e] * C[g, f, b]
# cost 1 for indices (a, b, c, e), other indices (d, f, g) have cost χ
@tensor opt=!(a, b, c, e) D[a, b, c, d] := A[a, e, c, f] * B[g, d, e] * C[g, f, b]
# cost as specified for listed indices, unlisted indices have cost 1 (any symbol for χ can be used)
@tensor opt=(a => χ, b => χ^2, c => 2 * χ, e => 5) begin
D[a, b, c, d] := A[a, e, c, f] * B[g, d, e] * C[g, f, b]
end
Dynamical tensor network contractions with ncon
and @ncon
Tensor network practitioners are probably more familiar with the network contractor function ncon to perform a tensor network contraction, as e.g. described in NCON. In particular, a graphical application, TensorTrace, was recently introduced to facilitate the generation of such ncon calls. TensorOperations.jl provides compatibility with this interface by also exposing an ncon function with the same basic syntax

ncon(list_of_tensor_objects, list_of_index_lists)

e.g. the example above is equivalent to

@tensor D[:] := A[-1, 3, 1, -2, 2] * B[3, 2, 4, -5] * C[1, 4, -4, -3]
D ≈ ncon((A, B, C), ([-1, 3, 1, -2, 2], [3, 2, 4, -5], [1, 4, -4, -3]))

where the lists of tensor objects and of index lists can be given as a vector or a tuple. The ncon function necessarily needs to analyze the contraction pattern at runtime, but this can be an advantage in cases where the contraction is determined by runtime information and thus not known at compile time. A downside, besides the fact that this can result in some overhead (though this is typically negligible for anything but very small tensor contractions), is that ncon is type-unstable, i.e. its return type cannot be inferred by the Julia compiler.
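Because ncon analyzes the contraction at runtime, the index lists themselves can be ordinary runtime data. A minimal sketch:

```julia
using TensorOperations

A = randn(3, 3)
B = randn(3, 3)

# the network specification is just data and could equally be computed at runtime
indexlist = [[-1, 1], [1, -2]]
C = ncon([A, B], indexlist)

C ≈ A * B  # true
```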
The full call syntax of the ncon method exposed by TensorOperations.jl is

ncon(tensorlist, indexlist, [conjlist]; order=..., output=...)

where the first two arguments are those described above. Let us first discuss the keyword arguments. The keyword argument order can be used to change the contraction order, i.e. by specifying which contraction indices need to be processed first, rather than the strictly increasing order [1, 2, ...], as discussed in the previous subsection. The keyword argument output can be used to specify the order of the output indices, when it is different from the default [-1, -2, ...].
The optional positional argument conjlist is a list of Bool values that indicate whether the corresponding tensor needs to be conjugated in the contraction. So while

ncon([A, conj(B), C], [[-1, 3, 1, -2, 2], [3, 2, 4, -5], [1, 4, -4, -3]]) ≈
ncon([A, B, C], [[-1, 3, 1, -2, 2], [3, 2, 4, -5], [1, 4, -4, -3]], [false, true, false])

the latter has the advantage that conjugating B is not an extra step (which would create an additional temporary requiring allocations), but is performed at the same time as the contraction.
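The equivalence above can be checked directly with a smaller network (using complex-valued tensors, where conjugation actually matters):

```julia
using TensorOperations

A = randn(ComplexF64, 3, 3)
B = randn(ComplexF64, 3, 3)

# conjugate B explicitly (allocates a conjugated copy first) ...
C1 = ncon([A, conj(B)], [[-1, 1], [-2, 1]])

# ... or flag it in conjlist, so conjugation happens during the contraction
C2 = ncon([A, B], [[-1, 1], [-2, 1]], [false, true])

C1 ≈ C2  # true
```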
As an alternative solution to the optional positional arguments, there is also an @ncon
macro. It is just a simple wrapper over an ncon
call and thus does not analyze the indices at compile time, so that they can be fully dynamical. However, it will transform
@ncon([A, conj(B), C], indexlist; order=..., output=...)
into
ncon(Any[A, B, C], indexlist, [false, true, false]; order=..., output=...)
so as to get the advantages of just-in-time conjugation (pun intended) using the familiar looking ncon
syntax.
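In the same spirit, the macro version can be compared against the explicit conjlist call (a minimal sketch):

```julia
using TensorOperations

A = randn(ComplexF64, 3, 3)
B = randn(ComplexF64, 3, 3)

# @ncon spots conj(B) at the call site and turns it into a conjugation flag
C1 = @ncon([A, conj(B)], [[-1, 1], [-2, 1]])
C2 = ncon([A, B], [[-1, 1], [-2, 1]], [false, true])

C1 ≈ C2  # true
```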
TensorOperations.ncon — Function

ncon(tensorlist, indexlist, [conjlist, sym]; order = ..., output = ..., backend = ..., allocator = ...)

Contract the tensors in tensorlist (of type Vector or Tuple) according to the network specified by indexlist. Here, indexlist is a list (i.e. a Vector or Tuple) with the same length as tensorlist, whose entries are themselves lists (preferably Vector{Int}) where every integer entry provides a label for the corresponding index/dimension of the corresponding tensor in tensorlist. Positive integers are used to label indices that need to be contracted, and thus appear in two different entries within indexlist, whereas negative integers are used to label indices of the output tensor, and should appear only once.
An optional argument is another list of the same length, conjlist, whose entries are of type Bool and indicate whether the corresponding tensor object should be conjugated (true) or not (false). The default is false for all entries.
By default, contractions are performed in the order such that the indices being contracted over are labelled by increasing integers, i.e. first the contraction corresponding to label 1 is performed. The output tensor has an index order corresponding to decreasing (negative, so increasing in absolute value) index labels. The keyword arguments order and output allow to change these defaults.

See also the macro version @ncon.
TensorOperations.@ncon — Macro

@ncon(tensorlist, indexlist; order = ..., output = ...)

Contract the tensors in tensorlist (of type Vector or Tuple) according to the network specified by indexlist. Here, indexlist is a list (i.e. a Vector or Tuple) with the same length as tensorlist, whose entries are themselves lists (preferably Vector{Int}) where every integer entry provides a label for the corresponding index/dimension of the corresponding tensor in tensorlist. Positive integers are used to label indices that need to be contracted, and thus appear in two different entries within indexlist, whereas negative integers are used to label indices of the output tensor, and should appear only once.

By default, contractions are performed in the order such that the indices being contracted over are labelled by increasing integers, i.e. first the contraction corresponding to label 1 is performed. The output tensor has an index order corresponding to decreasing (negative, so increasing in absolute value) index labels. The keyword arguments order and output allow to change these defaults.
The advantage of the macro @ncon
over the function call ncon
is that, if tensorlist
is not just some variable but an actual list (as a tuple with parentheses or a vector with square brackets) at the call site, the @ncon
macro will scan for conjugation calls, e.g. conj(A)
, and replace this with just A
but build a matching list of conjugation flags to be specified to ncon
. This makes it more convenient to specify tensor conjugation, without paying the cost of actively performing the conjugation beforehand.
See also the function ncon
.
Index compatibility and checks
Errors in the index pattern are caught at compile time, and will spawn an error of type

TensorOperations.IndexError — Type

struct IndexError{<:AbstractString} <: Exception

Exception type for reporting errors in the index specification.
Once the expression is compiled, there can still be errors that lead to runtime errors, the type of which depends on the specific type of tensor (e.g. DimensionMismatch for AbstractArray objects). Indeed, indices with the same label, either open indices on the two sides of the equation, or contracted indices, need to be compatible. For AbstractArray objects, this means they must have the same size. Other tensor types might have more complicated structure associated with their indices, which requires matching between those. The function checkcontractible is part of the interface that can be used to control when tensors can be contracted with each other along specific indices.
For example, for AbstractArray
objects, the following helper methods are defined that capture the recurring checks across different implementations of the tensor operations:
TensorOperations.argcheck_index2tuple — Function

argcheck_index2tuple(C::AbstractArray, pC::Index2Tuple)

Check that C has numind(pC) indices and that pC constitutes a valid permutation.
TensorOperations.argcheck_tensoradd — Function

argcheck_tensoradd(C::AbstractArray, A::AbstractArray, pA::Index2Tuple)

Check that C and A have numind(pA) indices and that pA constitutes a valid permutation.
TensorOperations.argcheck_tensortrace — Function

argcheck_tensortrace(C::AbstractArray, A::AbstractArray, p::Index2Tuple, q::Index2Tuple)

Check that the partial trace of A over indices q and with permutation of the remaining indices p is compatible with output C.
TensorOperations.argcheck_tensorcontract — Function

argcheck_tensorcontract(C::AbstractArray, A::AbstractArray, pA::Index2Tuple, B::AbstractArray, pB::Index2Tuple, pAB::Index2Tuple)

Check that C, A and pA, and B and pB and pAB have compatible indices and number of dimensions.
TensorOperations.dimcheck_tensoradd — Function

dimcheck_tensoradd(C::AbstractArray, A::AbstractArray, pA::Index2Tuple)

Check that C and A have compatible sizes for the addition specified by pA.
TensorOperations.dimcheck_tensortrace — Function

dimcheck_tensortrace(C::AbstractArray, A::AbstractArray, p::Index2Tuple, q::Index2Tuple)

Check that C and A have compatible sizes for the trace and addition specified by p and q.
TensorOperations.dimcheck_tensorcontract — Function

dimcheck_tensorcontract(C::AbstractArray, A::AbstractArray, pA::Index2Tuple, B::AbstractArray, pB::Index2Tuple, pAB::Index2Tuple)

Check that C, A and B have compatible sizes for the contraction specified by pA, pB and pAB.
Whenever an incompatible index structure is detected at runtime, the resulting error will originate deep within the implementation, at which point the error message provides little information as to which specific tensors and which indices produced the mismatch. When debugging, it might be useful to add the keyword argument contractcheck = true to the @tensor macro. Explicit checks using checkcontractible are then enabled and run before any tensor operation is performed. When a mismatch is detected, these checks still have access to the label information and spawn a more informative error message.
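For example, with deliberately mismatched dimensions, contractcheck = true produces an error before any work is done (a sketch; the exact message depends on the TensorOperations version):

```julia
using TensorOperations

A = randn(3, 4)
B = randn(5, 3)  # first dimension should have size 4 to be contractible with A along k

try
    @tensor contractcheck = true C[i, j] := A[i, k] * B[k, j]
catch err
    # with contractcheck = true, the error still refers to the index labels
    println(sprint(showerror, err))
end
```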
A different type of check is enabled by the costcheck keyword argument, which can be given the values :warn or :cache. With either value, additional checks are inserted that compare the contraction order of any tensor contraction of three or more factors against the optimal order based on the current tensor sizes. More generally, the function tensorcost is part of the interface and associates a cost value with every index of a tensor, which is then used in the cost model. With costcheck = :warn, a warning will be spawned for every tensor network where the actual contraction order (even when optimized using abstract costs) does not match the ideal contraction order given the current tensorcost values. With costcheck = :cache, the tensor networks with non-optimal contraction order are stored in a global package variable TensorOperations.costcache. However, when a tensor network is evaluated several times with different tensor sizes or tensor costs, only the evaluation giving rise to the largest total contraction cost for that network will appear in the cache (provided the actual contraction order deviates from the optimal order in that largest case).
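As an illustration of the cost model at work, consider a matrix chain where the default left-to-right order is suboptimal for the actual sizes (a sketch; costcheck = :warn should then print a suggestion for the order keyword):

```julia
using TensorOperations

A = randn(100, 2)
B = randn(2, 100)
C = randn(100, 2)

# left-to-right evaluates (A*B)*C: ~40,000 scalar multiplications,
# whereas A*(B*C) needs only ~800; :warn reports this at runtime
@tensor costcheck = :warn D[i, l] := A[i, j] * B[j, k] * C[k, l]

size(D) == (100, 2)  # true
```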
Backends, multithreading and GPUs
Every index expression will be evaluated as a sequence of elementary tensor operations, i.e. permuted additions, partial traces and contractions, which are implemented for strided arrays as discussed in Package features. In particular, optimised implementations rely on Strided.jl, and we refer to this package for a full specification of which arrays are supported. As a rule of thumb, this primarily includes Arrays from Julia Base, as well as views thereof if sliced with a combination of Integers and Ranges. Special types such as Adjoint and Transpose from Base are also supported. For permuted addition and partial traces, native Julia implementations are used which can benefit from multithreading if JULIA_NUM_THREADS > 1.
The binary contraction is performed by first permuting the two input tensors into a form such that the contraction becomes equivalent to a single matrix multiplication on the whole data, followed by a final permutation to bring the indices of the output tensor into the desired order. This approach allows the use of the highly efficient matrix multiplication kernel (gemm) from BLAS, which is multithreaded by default. There is also a native contraction implementation that is used e.g. for arrays with an eltype that is not <:LinearAlgebra.BlasFloat. It performs the contraction directly, without the additional permutations, but still in a cache-friendly and multithreaded way (again relying on JULIA_NUM_THREADS > 1). This implementation can also be used for BlasFloat types (but will typically be slower), and the use of BLAS can be controlled by explicitly switching the backend between StridedBLAS and StridedNative using the backend keyword to @tensor. Similarly, different allocation strategies, when available, can be selected using the allocator keyword of @tensor.
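Switching between the two strided backends explicitly looks as follows (a sketch, assuming the backend types are accessed via the TensorOperations module as shown):

```julia
using TensorOperations

A = randn(8, 8)
B = randn(8, 8)

# same contraction, dispatched to the BLAS-based and the native kernel
@tensor backend = TensorOperations.StridedBLAS() C1[i, j] := A[i, k] * B[k, j]
@tensor backend = TensorOperations.StridedNative() C2[i, j] := A[i, k] * B[k, j]

C1 ≈ C2 ≈ A * B  # true
```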
Since TensorOperations v5, there are also fallback implementations for non-strided arrays that only use functionality from Base
, as well as LinearAlgebra.mul!
. The manual section Backends and Allocators provides a more elaborate discussion.
The primitive tensor operations are also implemented for CuArray objects of the CUDA.jl library. This implementation is essentially a simple wrapper around NVIDIA's cuTENSOR library, and is only loaded when the cuTENSOR.jl package is loaded. The @tensor macro will then automatically work for operations between GPU arrays.
Mixed operations between host arrays (e.g. Array) and device arrays (e.g. CuArray) will fail. However, if one wants to harness the computing power of the GPU to perform all tensor operations, there is a dedicated macro @cutensor. This will transfer all host arrays to the GPU before performing the requested operations. If the output is an existing host array, the result will be copied back. If a new result array is created (i.e. using :=), it will remain on the GPU device and it is up to the user to transfer it back. Arrays are transferred to the GPU just before they are first used, and in a complicated tensor expression this may have the benefit that transfer of the later arrays overlaps with computation of earlier operations.
TensorOperations.@cutensor — Macro

@cutensor tensor_expr
Use the GPU to perform all tensor operations, through the use of the cuTENSOR library. This will transfer all arrays to the GPU before performing the requested operations. If the output is an existing host array, the result will be transferred back. If a new array is created (i.e. using :=
), it will remain on the GPU device and it is up to the user to transfer it back. This macro is equivalent to @tensor backend=cuTENSORBackend() allocator=CUDAAllocator() tensor_expr
.
This macro requires the cuTENSOR library to be installed and loaded. This can be achieved by running using cuTENSOR
or import cuTENSOR
before using the macro.