+1 vote
edited

Hi,

I am running DMRG with the Julia version of ITensor on a cluster with multi-core CPUs. The official documentation explains how to start Julia with -t N, which is clear to me (https://itensor.github.io/ITensors.jl/stable/Multithreading.html). But there is also some discussion about Strided.jl. Is there any advice on exactly how to use multithreading with Julia ITensor?

Naively, I would expect the following to work (though I don't know whether it is the best approach):
(a) Without any physical symmetry (U(1), Zn, etc.), start Julia with julia -t N and leave the serial code unchanged.

(b) With an Abelian symmetry, start Julia with julia -t N and add the following to the serial code:

+1 vote
selected

Yes, as Miles said, please refer to those docs and let us know if it is still unclear.

For the sake of completeness, here is what we recommend:

(a) For non-QN-conserving code, just start Julia serially (julia, or equivalently julia -t 1); no further settings are needed. By default, Julia uses multithreaded BLAS/LAPACK, which is not controlled by the number of Julia threads. Instead, the number of BLAS/LAPACK threads can be set at runtime with BLAS.set_num_threads (https://docs.julialang.org/en/v1/stdlib/LinearAlgebra/#LinearAlgebra.BLAS.set_num_threads); by default it is set to the number of threads your system has available (Sys.CPU_THREADS, https://docs.julialang.org/en/v1/base/constants/#Base.Sys.CPU_THREADS ).
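To make (a) concrete, here is a minimal sketch of how you might inspect and change the BLAS/LAPACK thread count from inside a running session. It uses only the LinearAlgebra standard library; the cap of 4 threads and the variable names ncpu and nblas are illustrative assumptions, not a recommendation. BLAS.get_num_threads requires Julia 1.6 or later.

```julia
using LinearAlgebra

# Logical CPU threads that Julia detects on this machine.
ncpu = Sys.CPU_THREADS

# Number of threads BLAS/LAPACK is currently using.
println("BLAS threads at startup: ", BLAS.get_num_threads())

# Illustrative choice (an assumption): use at most 4 BLAS threads,
# or fewer if the machine has fewer logical threads.
nblas = min(4, ncpu)
BLAS.set_num_threads(nblas)
println("BLAS threads now: ", BLAS.get_num_threads())
```

On a shared cluster node, Sys.CPU_THREADS may report more threads than your job was allocated, so it can be worth setting the BLAS thread count explicitly to match your allocation.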

(b) For QN-conserving code, start Julia with julia -t N (N > 1 threads) and use:

using LinearAlgebra
BLAS.set_num_threads(1)
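As a sanity check for (b), a short sketch like the following could be placed at the top of a script to confirm that the session really has several Julia threads and a single BLAS thread. The variable name njulia and the warning text are illustrative assumptions. The ITensors-specific call mentioned in the trailing comment is left as a comment so the snippet runs without ITensors installed; please check it against the multithreading docs for your installed version.

```julia
using LinearAlgebra

# The number of Julia threads is fixed at startup by `julia -t N`.
njulia = Threads.nthreads()
if njulia == 1
    @warn "Julia is running with one thread; restart with `julia -t N` to use block-sparse multithreading."
end

# Turn off BLAS multithreading so it does not compete with the Julia threads.
BLAS.set_num_threads(1)

println("Julia threads: ", njulia, ", BLAS threads: ", BLAS.get_num_threads())

# The ITensors multithreading documentation also describes ITensors-specific
# switches for block-sparse threading and Strided.jl, for example
# `ITensors.enable_threaded_blocksparse()`. Consult that page for the calls
# that apply to your version.
```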


The caveat is that in certain cases, such as Z2 symmetry, where there are relatively few blocks, BLAS/LAPACK multithreading may work better than block-sparse multithreading if the blocks are large enough. You should therefore benchmark approaches (a) and (b) on your own problem and see which one is faster.
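One way to set up such a benchmark is sketched below. The helper time_with_blas_threads is hypothetical (it is not part of ITensors or Julia), and the dense matrix product is only a stand-in workload; in practice you would replace it with a few DMRG sweeps on your own Hamiltonian.

```julia
using LinearAlgebra

# Hypothetical helper: run `work()` with `n` BLAS threads and return the
# elapsed time in seconds, restoring the previous BLAS setting afterwards.
function time_with_blas_threads(work, n::Integer)
    old = BLAS.get_num_threads()
    BLAS.set_num_threads(n)
    try
        work()                  # warm-up run so compilation is not timed
        return @elapsed work()
    finally
        BLAS.set_num_threads(old)
    end
end

# Stand-in workload (an assumption): a single dense matrix product.
A = rand(500, 500)
B = rand(500, 500)
workload() = A * B

for n in (1, min(4, Sys.CPU_THREADS))
    t = time_with_blas_threads(workload, n)
    println("BLAS threads = ", n, ": ", round(t; digits=4), " s")
end
```

Because the number of Julia threads is fixed when Julia starts, comparing (a) and (b) fairly means launching Julia twice (for example once with julia -t 1 and once with julia -t N) and timing the same workload in each session.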

commented by (550 points)
Thanks for the detailed explanation. Maybe I didn't express my question clearly, but your answer makes it very clear to me. Now I fully understand how to use Julia ITensor on a cluster.
+1 vote

Thanks for your questions – does this section of the docs answer them?