# Sweeping algorithm utilizing ITensor capabilities as best as possible.

+1 vote

Hi, sorry if this question is a bit naive. I want to implement a sweeping type algorithm, and I want to utilize as much of ITensor's capabilities to enhance efficiency. What I need to do is calculate the matrix elements

(psidag| S_i^z |in)

for all i in the lattice. The calculation is at finite T, so I have an ancilla that I don't need matrix elements on. What I currently do is create all sets of tensors to the left and right by

L[1] = ( psidag(P[1])*in(P[1]) ) * ( psidag(A[1]) * in(A[1]) );
R[N] = psidag(A[N])*in(A[N]);
for(auto i : range1(2,N))
{
auto I = N-i+1;
L[i] = L[i-1] * ( ( psidag(P[i])*in(P[i]) ) * ( psidag(A[i]) *in(A[i]) ) );
R[I] = R[I+1] * ( ( psidag(P[I+1])*in(P[I+1]) ) * ( psidag(A[I]) *in(A[I]) ) );
}


Where P[i] is the ith physical site, and A[i] is the ith ancilla site, they are consecutive (p[1] = 1, A[1]=2, P[2] = 3, A[2] =4, etc.). Then, the ith matrix element is

auto res = L[i-1] * psidag(P[i]) * noPrime(in(P[i]) * op(sites,"Sz",P[i])) * R[i];


Now, my question is if there is a better way to do this with ITensor? I have noticed it's quite inefficient with increased bond dimensions. My initial thought was that this is better than just using 'inner', as I'm saving the majority of the tensor contraction from one value of i to the next. Space complexity is not a major factor for me, but time is.

+1 vote

Hi Nick,
I think this is a good question - no problem. First of all, I wouldn't say there's much about ITensor specifically you can use to make this algorithm faster. By this I mean that the main things you can improve are mostly to do with the algorithm itself, and you would get the same characteristics (scaling, speed) with any reasonably efficient tensor library. So I'm just answering that part of your question first. Of course we do hope and believe that ITensor's interface makes such algorithms nice to write :^)

Now in terms of what you could do to make this faster, let me list a few ideas. Please note that I may have missed something in your code here or there, and I didn't run it or do a detailed sketch of the steps, so please discard or do not take too seriously any comments which you think don't apply or make sense:

1. instead of building the L[i] and R[i] arrays across the whole system, do you not just need R[2] for the first step, then L[1] and R[3] for the second step, and so on? (Maybe I'm off-by-one on those numberings there but hopefully you see what I'm suggesting.) So potentially you could save a big factor here by only building the L's and R's needed at each step, and updating them as you do one sweep across rather than building them all up front.

2. the other thing I'd note is more vague, but I'm concerned by your statement that the code gets a lot slower as you increase the bond dimension. This kind of task should be rather fast for MPS of dimensions approaching 1000 at least, where I'd still expect it to take only 10's of minutes at most. So if it's much slower than that, then probably it's not scaling correctly. So then I'd recommend you print out each intermediate tensor and inspect its indices. Work out the scaling of each step and make sure it scales as chi^3 where chi is the typical bond dimension of the two MPS involved (just here assuming they have the same bond dimension for simplicity).

Best regards,
Miles

commented by (620 points)
Hi Miles,

As for point 1, I don't quite see how I could do this, except to save space, but not time. Since, say, I am sweeping to the left, then R[i] would include the contraction of psidag and in over all sites to the right. But then once I increase i, I'd need to somehow undo the contraction from the previous step. However, I could calculate L[i] as I need it, and then just get L[i+1] from L[i] as I sweep.

But it turns out the issue was what you suggested in part 2. I thought I got it correctly with the parentheses, but when I wrote the tensor contractions on separate lines, it sped up dramatically, so I'm guessing I was constructing large tensors in the procedure somewhere, and not contracting efficiently.

Best,
Nick
commented by (44k points)
Hi Nick,
Glad you found the efficiency issue, that's great.

As for my first point above, it might have just not been so clear the way I wrote it. What I meant is exactly what you said about " However, I could calculate L[i] as I need it, and then just get L[i+1] from L[i] as I sweep." That's all I meant. So I agree you do have to build most or all of the R[i]'s ahead of time, but it could be more efficient to build the L[i]'s one at a time as you go. As a memory bonus, you can even use the same array for the L[i]'s and R[i]'s, just overwriting the old R[i]'s after you're done using them. But that's a more advanced move and only affects memory usage, not speed.

Best,
Miles
commented by (620 points)
Oh, nice, I like it. thank you for your help!

Nick