Issue with size of the vector in large bond dimension

Question

asked Oct 11, 2020 by sharmaprakash (160 points)

Hi,
I am doing a DMRG calculation on the nearest-neighbor Fermi-Hubbard model on a triangular lattice with periodic boundary conditions along the y-direction. For smaller lattice sizes I have no issues. I tried the same calculation for a 6x6 cluster on two different computers, with 128 GB and 256 GB of RAM, and got a different error on each when the calculation reached a bond dimension of exactly 12000 (particularly at or near half filling). At lower filling (fewer electrons in the system) I again have no issues.
Error on the 128 GB computer:
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
I think this is because of insufficient RAM.

Error on the 256 GB computer:
terminate called after throwing an instance of 'std::length_error'
what(): cannot create std::vector larger than max_size()
Here I think it is not a memory issue.

Surprisingly, both errors occur at the same bond dimension (i.e. 12000). I found a paper, https://journals.aps.org/prb/abstract/10.1103/PhysRevB.96.205130, in which the authors did ground-state DMRG calculations on clusters of 36 sites or more and went up to a bond dimension of 20000. So I hope there is a solution to my problem. Any help or suggestions, please?

Best,
Prakash

commented Oct 12, 2020 by MattFishman (14.1k points)
edited Oct 12, 2020 by MattFishman

I think those errors are both related to memory limits, though the second one appears to be a limit on the maximum size of a `std::vector` (http://www.cplusplus.com/reference/vector/vector/max_size/ ). You should check what `max_size()` is on your system to see how large a tensor can be allocated, since ITensors use a `std::vector` for storage.
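A minimal sketch of that check, assuming the tensor elements are stored as `double` in a single `std::vector<double>` (the helper names `max_double_elements` and `fits_in_vector` are made up for illustration and are not part of ITensor):

```cpp
#include <cstddef>
#include <vector>

// Largest number of elements a std::vector<double> can hold in principle
// on this platform and standard library (independent of available RAM).
std::size_t max_double_elements() {
    return std::vector<double>().max_size();
}

// True if a dense tensor with `num_elements` entries stays within that limit.
// This says nothing about whether the allocation itself would succeed.
bool fits_in_vector(std::size_t num_elements) {
    return num_elements <= max_double_elements();
}
```

Printing `max_double_elements()` from a small program built with the same compiler as ITensor shows the limit for your setup. On typical 64-bit platforms the value is far beyond any physical RAM, so exceeding it may indicate that the requested size itself is wrong rather than that the tensor is genuinely too large.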

Glancing through that paper, it is possible they are using $SU(2)$ symmetry, though they don't seem to give many details about their calculation. ITensor only makes use of abelian symmetries, so for the Hubbard model it can conserve the $U(1)$ spin projection symmetry and the $U(1)$ particle number. Conserving $SU(2)$ would lead to more sparse tensors, and therefore could allow one to reach much higher bond dimensions without reaching memory and time limits.

Alternatively, they may just be using a computer with a huge amount of RAM, or some other tricks like distributed memory. If you are interested in reproducing their results, could you reach out to the authors for more details on how they performed their calculation? I would be curious how they reached such large bond dimensions (those are larger bond dimensions than I am used to seeing for DMRG calculations, without extra "tricks" being used like the ones I mentioned above).
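To get a feel for the memory scale, here is a rough back-of-the-envelope sketch. It assumes a dense two-site tensor of doubles with dimensions m x d x d x m, where m is the bond dimension and d = 4 is the Hubbard site dimension. The function names are hypothetical, and the result is only a dense upper bound for that one tensor: conserved quantum numbers make the actual tensors block-sparse and smaller, while environment tensors, the MPO, and eigensolver workspaces add to the total.

```cpp
#include <cstddef>

// Bytes for one dense m x d x d x m tensor of doubles
// (bond dimension m, local site dimension d).
double dense_two_site_bytes(std::size_t m, std::size_t d) {
    return static_cast<double>(m) * static_cast<double>(m)
         * static_cast<double>(d) * static_cast<double>(d)
         * static_cast<double>(sizeof(double));
}

// Convert bytes to GiB (2^30 bytes).
double to_gib(double bytes) {
    return bytes / (1024.0 * 1024.0 * 1024.0);
}
```

For m = 12000 and d = 4 this gives 2.304e9 elements, or about 17 GiB for a single dense tensor, which illustrates why several such objects held at once can approach the RAM of a 128 GB machine.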

commented Oct 14, 2020 by sharmaprakash (160 points)

This problem was resolved by my friend Kyungmin Lee. It is a minor bug in the ITensor library. He has sent you a pull request.

commented Oct 14, 2020 by SergiJulia (140 points)

Hi Sharma, Matt,

I am very curious about this, since I encountered similar problems even at lower bond dimensions (~3000), but with only 64 GB of RAM. Are you using the "WriteDim" functionality for the DMRG? I thought that by using it I would not run into memory issues, as I have 1 TB of disk space, but I still got the error ['std::bad_alloc' what(): std::bad_alloc] at a bond dimension that depends on the system size. I don't know if you also encountered this problem. Alternatively, if you don't use the "WriteDim" option, are you saying that after fixing this bug one can push the bond dimension significantly further?

Thanks a lot!

commented Oct 15, 2020 by sharmaprakash (160 points)

Yes, in my case this issue happened both with and without "WriteDim"; the only difference is that the error occurs slightly earlier if I do not use the "WriteDim" option. After correcting this bug in the ITensor library I am able to go to larger bond dimensions without any trouble. In your case, if it is not really a memory issue (remember you can always track the memory in use while the job is running), this correction should solve your problem.
To the best of my knowledge, large disk space alone doesn't help you reach a large bond dimension if you have little RAM. The "WriteDim" option lets you go a little further (in my case about 2000 more in bond dimension). Thank you.

commented Oct 15, 2020 by SergiJulia (140 points)

I believe I still have some RAM available when the error pops up, as it is quite a discontinuous thing: I see ~70% memory usage at a given bond dimension, and when I increase it a bit it crashes. So I am looking forward to your friend's pull request being accepted, thank you very much!

commented Oct 17, 2020 by miles (70.1k points)

Hi Prakash,
Which issue or PR is the one filed by Kyungmin Lee that fixes this? Is it this one?
https://github.com/ITensor/ITensor/pull/369

If so, I'm surprised that this PR would fix an error related to memory consumption or to the exception which you got. But perhaps it does and I'm just not understanding why. Or perhaps you were referring to a different issue or PR by Kyungmin?

Thank you,
Miles

commented Oct 17, 2020 by miles (70.1k points)

Ah ok I didn't look closely enough at Kyungmin's changes. Looking at them now, I see that it does change how the resizing is done for the std::vector's which are used as temporary workspaces for the low-level LAPACK calls in ITensor. So it could possibly relate to the issue you had, and good to know that it does.
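As a generic illustration (not a reproduction of the change in PR #369), the sketch below shows how resizing a workspace vector can fail in two distinct ways. If the requested size comes from an integer that overflowed or went negative before being converted to `std::size_t`, the request exceeds `max_size()` and `std::length_error` is thrown no matter how much RAM is free. The names `ResizeResult` and `try_resize` are hypothetical.

```cpp
#include <cstddef>
#include <new>
#include <stdexcept>
#include <vector>

enum class ResizeResult { Ok, LengthError, BadAlloc };

// Attempt to resize a workspace vector and classify the outcome:
// LengthError -> n exceeds vector::max_size() (the request itself is invalid)
// BadAlloc    -> n is valid but the allocator could not provide the memory
ResizeResult try_resize(std::vector<double>& work, std::size_t n) {
    try {
        work.resize(n);
        return ResizeResult::Ok;
    } catch (std::length_error const&) {
        return ResizeResult::LengthError;
    } catch (std::bad_alloc const&) {
        return ResizeResult::BadAlloc;
    }
}
```

For example, `try_resize(work, static_cast<std::size_t>(-1))` returns `ResizeResult::LengthError`, since a value of -1 converted to `std::size_t` is larger than any vector can hold.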

Miles

1 Answer

miles · Answer 1 · 2020-10-17T16:32:15+0000

Please see the discussion above. This issue might now be fixed in the latest v3 branch of ITensor, following the recent PR #369 (https://github.com/ITensor/ITensor/pull/369).

However, also note that ITensor DMRG is not guaranteed to work at a given, predetermined large bond dimension, as the reachable bond dimension depends on many details of the code, the system being studied, and the user's software and hardware configuration (size of Hamiltonian MPO, degree of block-sparsity, behavior of memory allocator on user's individual platform and operating system, version of BLAS used, etc.). But we are of course happy to accept fixes and PRs which allow ITensor DMRG to reach larger bond dimensions.

Finally, one trick you might want to try to reach larger bond dimensions is to make separate MPOs for different kinds of terms in your Hamiltonian. For example, you might make one MPO which just has the terms that are local to individual 2d columns, and another MPO that has just the inter-column terms. This approach is not guaranteed to help with memory usage, and often has a bigger impact on time than on memory, but it could still help, for example by allowing the code to allocate smaller temporary memory vectors one at a time rather than simultaneously, thereby delaying running out of memory.

Best,
Miles
