+1 vote
asked by (180 points)
edited by

This is a question about the Julia ITensors package (https://github.com/ITensor/ITensors.jl), which I am using to run DMRG on various Hamiltonians to find the ground state and low-lying excited states.

When you create a Hamiltonian on which to use DMRG, you need to create an AutoMPO object, add terms to it, and then run the command "H = MPO(ampo,sites)". However, I have observed confusing discrepancies in runtime and memory usage across different Hamiltonians.

For example, when I run the following code to create the Hamiltonian for the J1-J2 model (https://en.wikipedia.org/wiki/J1_J2_model) on N = 10^4 (resulting in 60,000 MPO terms), the code takes only 1 minute to run on my desktop.

using ITensors

J1 = 1
J2 = 0.5
N = 10^4
sites = siteinds("S=1/2",N)
ampo = AutoMPO()
for j=1:N-1
  add!(ampo,J1,"Sx",j,"Sx",j+1)
  add!(ampo,J1,"Sy",j,"Sy",j+1)
  add!(ampo,J1,"Sz",j,"Sz",j+1)
end
for j=1:N-2
  add!(ampo,J2,"Sx",j,"Sx",j+2)
  add!(ampo,J2,"Sy",j,"Sy",j+2)
  add!(ampo,J2,"Sz",j,"Sz",j+2)
end
add!(ampo,J1,"Sx",N,"Sx",1)
add!(ampo,J1,"Sy",N,"Sy",1)
add!(ampo,J1,"Sz",N,"Sz",1)
add!(ampo,J2,"Sx",N-1,"Sx",1)
add!(ampo,J2,"Sy",N-1,"Sy",1)
add!(ampo,J2,"Sz",N-1,"Sz",1)
add!(ampo,J2,"Sx",N,"Sx",2)
add!(ampo,J2,"Sy",N,"Sy",2)
add!(ampo,J2,"Sz",N,"Sz",2)
H = MPO(ampo,sites)

However, when I run the following code to create the Hamiltonian for the lattice formulation of the Schwinger model (equation 2.6 on page 4 of https://arxiv.org/abs/1305.3765v2) on N = 60 (resulting in 1,948 MPO terms), the code takes 2 hours and 20 minutes to run on my desktop. Again, this is just to create the Hamiltonian, not even to run DMRG.

using ITensors

m_over_g = 0 # m, g are Schwinger model parameters
x = 25 # x = 1/(g^2*a^2), a = lattice spacing
μ = 2 * m_over_g * sqrt(x) # μ = 2m/(g^2*a) = 2*(m/g)*sqrt(x)
N = 60
sites = siteinds("S=1/2",N)
ampo_H = AutoMPO()
trace = N*μ/2 + N^2/8
add!(ampo_H,2*2 * trace,"Sz*Sz",1) # cheat designed to add trace term to Hamiltonian
for j = 1:N
  if j != N
    add!(ampo_H,x,"S+",j,"S-",j+1) # S+ and S- are the same as σ+ and σ-
    add!(ampo_H,x,"S-",j,"S+",j+1)
    # S operators are 1/2 of Pauli matrices, so I have put factors of 2 to compensate
    add!(ampo_H,2 * trunc((N-j+1)/2)/2,"Sz",j)
    for i=1:j-1
      add!(ampo_H,2*2 * (N-j)/2,"Sz",i,"Sz",j)
    end
  end
  add!(ampo_H,2 * μ*(-1)^(j-1)/2,"Sz",j)
end
H = MPO(ampo_H, sites)

It is worth mentioning that this used to take a lot less time. Up until a week ago, this would take perhaps 5-10 minutes, but ever since I downloaded one of the more recent versions of the package last Thursday, it has been running much slower.

In addition, if I change N = 60 to N = 100 in the above code (resulting in 5248 terms), the code eventually throws an OutOfMemoryError. This also seems weird to me, since there are still far fewer MPO terms, and each individual term is not any more complicated than the terms in the J1-J2 model Hamiltonian.

The obvious difference between these two is that the Schwinger model Hamiltonian is intensely non-local, having an interaction between almost every pair of spins. However, I thought this only significantly affected the speed of DMRG, not the memory usage or speed of the creation of the Hamiltonian itself. Any insights into what factors govern the speed and memory usage of the creation of the Hamiltonian would be greatly appreciated.

1 Answer

0 votes
answered by (41.2k points)

Hi, thanks for this question. Also congrats on asking what I think is the first Julia-related question on this forum!

Your question is a good one, about what sort of inputs work well with AutoMPO or not. Without going far into the details of the algorithm, the main steps of it are:
(1) collect and sort terms (meaning sort each term in site order)
(2a) create an uncompressed, sparse version of the MPO
(2b) create a matrix at each bond which contains the coefficients weighting each operator string to the left and right of that bond (note that these matrices are highly redundant, and are not MPO matrices)
(3) SVD all of the matrices in (2b), and truncate singular values which are zero. Use the truncated U and V^\dagger matrices to compress the "big", uncompressed sparse MPO from (2a)
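As a rough illustration of what step (3) does (this is just a sketch of the idea in plain Julia, not the actual AutoMPO internals; the matrix C here is made up), the coefficient matrices typically have much lower rank than their dimensions suggest, and an SVD exposes that:

    using LinearAlgebra

    # Hypothetical coefficient matrix for one bond: rows label operator
    # strings on the left of the bond, columns label strings on the right.
    # Constructed to have rank 8 even though it is 50x60.
    C = randn(50, 8) * randn(8, 60)

    F = svd(C)
    # drop singular values that are numerically zero
    keep = findall(s -> s > 1e-12 * F.S[1], F.S)
    U, S, Vt = F.U[:, keep], F.S[keep], F.Vt[keep, :]

    # The truncated U and Vt are what compress the sparse MPO on that bond
    @assert length(keep) == 8
    @assert U * Diagonal(S) * Vt ≈ C

The compressed bond dimension is the kept rank (8 here) rather than the number of distinct operator strings (50 or 60), which is where the savings come from.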

So I believe the bottleneck you are hitting comes from either (2a) or (3). I'd have to do profiling to see. There are two hypothetical improvements we could do:
(2a+) don't make the full uncompressed MPO all at once, even in a sparse form
(3+) don't make uncompressed matrices with all of the coefficients
What I mean here loosely is that there's a different way of coding the AutoMPO algorithm which builds the (2b) matrices on some bond j already in the compressed basis of bond j-1.

Anyway, you can see that it's technical, and right now I can't guarantee that I know how to do the above improvements in a general way and that they will work.

Unfortunately, this means your input is currently outside the design scope of AutoMPO. But I appreciate you filing an issue, because if there was some change in recent versions that caused the slowdown, we'll see if we can find it and revert it to at least alleviate your issue.

But I think your best bet is actually this: for highly unusual and non-local Hamiltonians, it's really best to invest in hand-crafting an MPO. A few years ago I was working on quantum chemistry DMRG calculations, which also involve highly non-local terms, and had to make custom MPOs quite often. So it's necessary for such challenging cases until we can make AutoMPO an even more general solution. (Though I'm sure people will still continue to find cases it can't quite handle!)

It's great that you are using ITensor to study interesting systems outside of the usual condensed matter setting. Again, we'll keep your issue open and try to at least alleviate the memory usage and ultimately later on devise a better-scaling algorithm, though it's sort of an open area of algorithm research for us still.

Best,
Miles

commented by (180 points)
Thank you so much! Sorry for the late reply--I thought I had notifications on for this question, so I expected to get an email if anyone replied.

I'm going to continue to look into the earlier versions of the Julia code to see whether one of the older versions indeed worked better for this Hamiltonian. I had never thought about the idea of hand-crafting an MPO, so thank you for that suggestion! I guess I have a lot to think about.
commented by (41.2k points)
Hi Sujay,
I took a bit more time to think about your specific Hamiltonian and I’m fairly convinced about two things:

(1) this is a tough case for AutoMPO, the way it currently works, so I’m not surprised it eats up a lot of time and memory
(2) the good news is: I’m fairly sure your Hamiltonian has a very compact MPO representation
As far as I can see from a brief reading of the code, the only term which is troublesome and non-local is the one where the site values are named “i” and “j” and where i < j but can otherwise take any value between 1 and j-1.

For a term like that, there ought to be a simple way to construct an MPO, by using the “finite state automaton” picture of how MPOs work. (If you don’t know this picture, I could send you some references.) In your case, the automaton would have the following internal states (this is just for making that one type of term, not the others):

we haven’t placed any operators yet
we have placed the “i” operator now, but not the “j” one yet
we have now placed “j”, and are done
For the off-diagonal elements in the MPO that do the transitions, the ones to take the most care with are the elements that do the transition from state 2 to state 3, since their coefficients go like (N-j). But since the MPO has “knowledge” of what site it’s on (in the sense that each MPO tensor can be different for a finite system), you can just set the coefficient to be proportional to (N-j) by hand.
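To make this concrete, here is a schematic check of that automaton for the long-range term alone. This is not ITensor code, just plain-Julia block matrices of 2x2 operators contracted by hand for a tiny N; the coefficient 2(N-j) comes from the 2*2*(N-j)/2 factor in the question's code:

    using LinearAlgebra

    Sz = [0.5 0.0; 0.0 -0.5]
    Id = [1.0 0.0; 0.0 1.0]
    Z  = zeros(2, 2)

    N = 4
    c(j) = 2.0 * (N - j)   # coefficient of the "closing" Sz at site j

    # Automaton states: (1) nothing placed, (2) "i" operator placed, (3) done.
    # W[left_state, right_state] is the operator applied at this site.
    function Wj(j)
        W = fill(Z, 3, 3)
        W[1,1] = Id; W[1,2] = Sz            # wait, or start the string
        W[2,2] = Id; W[2,3] = c(j) * Sz     # carry the string, or close it
        W[3,3] = Id                          # done
        return W
    end

    # Block-matrix product where scalar multiplication is replaced by kron
    bprod(A, B) = [sum(kron(A[i,l], B[l,j]) for l in 1:size(A,2))
                   for i in 1:size(A,1), j in 1:size(B,2)]

    M = reduce(bprod, [Wj(j) for j in 1:N])
    H_mpo = M[1, 3]   # enter in state 1 on the left, exit in state 3 on the right

    # Direct construction for comparison: sum over i < j of c(j) * Sz_i * Sz_j
    site_op(ops) = reduce(kron, [get(ops, n, Id) for n in 1:N])
    H_direct = sum(c(j) * site_op(Dict(i => Sz, j => Sz)) for j in 2:N for i in 1:j-1)

    @assert H_mpo ≈ H_direct   # the 3-state automaton reproduces every (i,j) term

So the bond dimension needed for this entire non-local sum is just 3 (plus whatever the local terms need), versus the very large dimensions the uncompressed AutoMPO intermediates go through.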

So I’m pretty convinced you could make a nice, custom MPO for your case if you research it a bit. You could also still use AutoMPO to make an MPO for the rest of your local terms, and use ITensor’s feature which lets you pass multiple MPOs to the dmrg function and which “lazily” treats them as summed together in an efficient way (without actually ever summing the MPOs together).
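Schematically, that combination might look like the following. (Hedged: the exact dmrg and sweeps API varies between ITensors.jl versions, and make_long_range_mpo is a hypothetical placeholder for the hand-crafted MPO you would write yourself.)

    using ITensors

    N = 60
    sites = siteinds("S=1/2", N)

    # Local terms still go through AutoMPO, e.g. the staggered mass term
    ampo = AutoMPO()
    for j in 1:N
        add!(ampo, (-1)^(j-1), "Sz", j)
    end
    H_local = MPO(ampo, sites)

    # Hand-crafted MPO for the long-range Sz-Sz piece (to be written by
    # hand using the automaton picture; hypothetical helper name):
    # H_long = make_long_range_mpo(sites)

    psi0 = randomMPS(sites, 10)
    sweeps = Sweeps(5)
    maxdim!(sweeps, 10, 20, 100, 200)
    cutoff!(sweeps, 1e-10)

    # Passing a vector of MPOs makes dmrg treat them as an implicit sum:
    # energy, psi = dmrg([H_local, H_long], psi0, sweeps)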

Let me know if you have any questions -

Miles
commented by (41.2k points)
For others reading this discussion, Sujay and I had some follow-up discussions about constructing MPOs manually, and I wanted to re-post some of the resources I recommended here too:
Regarding resources for designing MPOs manually, the most detailed discussion of the "automaton" picture of MPS and MPOs is, I believe, in this article: https://arxiv.org/abs/0708.1221

Another nice article that briefly discusses some examples of MPOs and how they work is this one:
https://arxiv.org/abs/0804.2509

Finally, I have some talk slides that go over such constructions very briefly, but the visual aids may help you:
http://tensornetwork.org/reviews_resources.html
(See the "Tensor Networks and Applications" talk slides by me which are linked there.)
Specific slide deck discussing MPOs: https://itensor.org/miles/BrazilLectures/TNAndApplications03.pdf