+1 vote
asked by (170 points)

Hi,

I tried to recompile my code (Trotter time-evolution with swap gates for bosonic sites) with version 3.1.7 (previously it was working with v3.1.6) and I get an error in some runs at certain points during the SVD.

The error message is "what(): Error condition in LAPACK SVD". I assume this is due to the change of the default SVD method in v3.1.7 to gesdd, which, as the documentation says, "has been observed to crash depending on the LAPACK implementation used."

So my question is whether you have any intuition about which LAPACK implementations I could use to avoid this error, or whether you know of any workarounds for this kind of crash. Otherwise, I guess I should change my code to keep using the "ITensor" method for the SVD before updating to v3.1.7.

Thanks!

commented by (480 points)
I'm not very knowledgeable about this stuff, so this is just a stab in the dark based on some gfortran/LAPACK compatibility issues that I had in the past. If you are not using gfortran on Linux, it would help to give details of your installation: the operating system (Windows/Linux/etc.), the compiler, and the platform parts of the options.mk file for your build.

Assuming you are using gfortran and some default LAPACK on Linux:

I can't say anything about the specific issue you're facing, but if you're using a gfortran compiler, the cause may be an incompatibility between LAPACK and modern versions of gfortran. As I understand it, these issues stem from the C-Fortran interfacing, which wasn't standardized when LAPACK was written but was introduced later. The problem I faced was with some eigenvalue solvers; I'm not sure whether it extends to the SVD. I *think* the Reference LAPACK people are working on this, but it hadn't been fully fixed the last time I checked.

Using OpenBLAS and the other options supplied for the PLATFORM=openblas in the comments of the options.mk file for my LAPACK solver has worked well for my Linux machine (running Fedora which comes with a modern gfortran compiler and LAPACK & OpenBLAS installed by default). I did have to adjust the path to the OpenBLAS, but that's reasonably quick to fix.

Another workaround is to use a different gfortran compiler, but when I did that in the past, it was a headache to get everything to build and run without errors. Switching over to OpenBLAS and the -DHAVE_LAPACK_CONFIG_H etc. in the make file was a much faster fix for the problem I had before.

I'm not sure it'll circumvent your issue, but if you can back up a copy of ITensor in your favorite compressed file format and already have those libraries installed, it was relatively fast for me to rebuild and test.

 - Jared

P.S. Even OpenBLAS appears to have had a problem with gesdd at some point, but it was fixed by the start of 2021 at the latest: https://github.com/xianyi/OpenBLAS/issues/3044
So OpenBLAS should be good to go for this particular routine.
commented by (170 points)
Thanks for your answer!

I'm using ITensor on a cluster, where it is compiled in a Debian 10 container using the GCC compiler.

The following flags related to LAPACK are used in options.mk:
BLAS_LAPACK_LIBFLAGS=-lmkl_intel_lp64 -lmkl_intel_thread -lmkl_rt -lmkl_core -liomp5 -lpthread
BLAS_LAPACK_INCLUDEFLAGS=-I/usr/include/mkl

So the underlying implementation used is Intel MKL. This uses the version shipped with Debian 10, i.e. this one: (https://packages.debian.org/buster/libmkl-dev), which corresponds to Intel MKL version 2019.2.187.

The SVD crash (with gesdd in v3.1.7) seems rather "random" to me: with the same code, it may or may not occur at some point during the time evolution, depending on the initial parameters. This makes it hard to produce a minimal example that exhibits the bug.

In this sense I was asking if anyone has an intuition regarding the compiler and LAPACK implementation that one can use for gesdd to work properly. We also have Intel OneAPI and OpenBLAS installed on the cluster I'm using, so I should be able to try them.
commented by (70k points)
Hi Catalin,
Speaking just for myself, I'm not aware of which LAPACK implementations do or don't have this bug. I've seen it in multiple LAPACK implementations, so it might be an issue that affects most of them.

Note that:
- there is the "ITensor" "SVDMethod" option which does not suffer from this kind of crash but is slower (however this could be fine if SVD is not your main bottleneck anyway)
- there is also the "gesvd" option for "SVDMethod" which might not crash if you are finding "gesdd" does

Info about these options here:
http://itensor.org/docs.cgi?vers=cppv3&page=classes/decomp

Miles
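
One way to combine the options above is to try the fast method first and fall back to a slower one only when it fails. The following is a minimal sketch, not part of ITensor: tryMethodsInOrder is a hypothetical helper, and it assumes the LAPACK failure surfaces as an exception derived from std::exception (which the "what():" in the error message suggests). The ITensor call in the comment assumes the "SVDMethod" argument described on the docs page linked above.

```cpp
#include <functional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not part of ITensor): call `run` with each method
// name in order and return the result of the first call that does not throw.
// If `used` is non-null, it receives the name of the method that succeeded.
template <typename Result>
Result tryMethodsInOrder(std::vector<std::string> const& methods,
                         std::function<Result(std::string const&)> const& run,
                         std::string* used = nullptr)
{
    std::string errors;
    for(auto const& method : methods)
    {
        try
        {
            Result r = run(method);
            if(used) *used = method;
            return r;
        }
        catch(std::exception const& e)
        {
            // Remember why this method failed, then try the next one.
            errors += "[" + method + ": " + e.what() + "] ";
        }
    }
    throw std::runtime_error("all methods failed: " + errors);
}

// Sketch of use with ITensor (an assumption, adapt to your own code):
//
//   using USV = std::tuple<ITensor,ITensor,ITensor>;
//   auto [U,S,V] = tryMethodsInOrder<USV>(
//       {"gesdd", "gesvd", "ITensor"},
//       [&](std::string const& m) { return svd(T, {i,j}, {"SVDMethod=", m}); });
```

With this ordering, the slower "ITensor" method is only paid for on the rare steps where both LAPACK routines fail. If the failure turns out to abort the program instead of throwing, the fallback cannot help, and setting "SVDMethod" to "gesvd" or "ITensor" for all SVD calls is the safer choice.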

1 Answer

0 votes
answered by (70k points)

(See discussion above.)
