# Improving applyExp()

0 votes
asked

Hi All,
could someone please tell me what kind of algorithm is used for the actual computation of the exponential? If I want to implement a new function for the actual calculation of the exponential where should I start from? I am asking this because I am facing some convergence issues for which I have to increase the number of iterations in the Krylov procedure to a very large number. Also what does the function expMatrix() returns just a vector or the full matrix exponential?
Maybe I could start from the expokit.f fortran code which have used in the past.

Any suggestion is very appreciated.

Thanks,
Raffaele

## 1 Answer

0 votes
answered by (14.1k points)

Hi Raffaele,

Thanks for reporting an issue. Could you elaborate more on your application? What matrix are you trying to exponentiate and what are you ultimately trying to calculate?

Right now, applyExp uses a pretty standard Krylov-based method for calculating the exponential of an operator applied to a vector (i.e. it builds the Krylov basis and then the Krylov matrix is exponentiated, you can see the implementation here: https://github.com/ITensor/ITensor/blob/7abb7b2760d0608713f846f6bc85f6b6b8c7e509/itensor/iterativesolvers.h#L970).

The current implementation does not have restarts, so it may be limited in the timestep you can input (i.e. if you are calculating exp(-t*H)*v, for large t it may have convergence issues). We would be happy to add in restarts if there is a compelling use case, but we were waiting for a concrete example where that is necessary. For example, in the main use case for us (TDVP), the time step can be taken smaller by doing more sweeps of the MPS, and we have found that the current implementation is sufficient.

Also right now, consider that you can split up your timestep and do multiple calls of applyExp, for example you can try:

auto expHv = applyExp(H,v,-t/2);
expHv = applyExp(H,expHv,-t/2);

instead of:

auto expHv = applyExp(H,v,-t);

if your timestep t is too large for convergence. For a fixed ErrGoal you of course accumulate the error of both calls of applyExp.

Cheers,
Matt

commented by (310 points)
Hi Matt,

thanks for the reply, I am actually using TDVP both 1 site and 2 site implementations and my Hamiltonian is a chain of coupled bosons without any conserved quantum number.
At the moment I increased the number of iterations to 200 and I can reach and ErrorGoal of 1e-10 at each sweep. Of course I could decrease the time step but the point was to have an algorithm capable to use longer time steps and fewer iterations.
On GitHub there is a C++ implementation of the subroutine zgexpmv() which is a very good routine from EXPOKIT that I have used in the past (in its Fortran77 form). It might be worth trying if it can improve the situations. Actually this function rescales the time step in case of poor convergence but it is an automatic feature.
I was trying to modify applyExp() to use this function but I am quite slow in coding.

As far as I can see the main function responsible for the exponentiation at the moment is expPade(), is that so?

Raffaele
commented by (310 points)
I Just wanted to add that in my boson chain the convergence issue becomes more severe when the boson occupation number become large. I am using an occupation number of 35 for each site and in this case the procedure takes more 200 iterations to converge.
commented by (70.1k points)
Hi Raffaele, Have you tried it for the case of using a much smaller max occupation number for your boson sites? Does it work when that number is smaller? Even if you ultimately plan to take the occupation number larger, or need to to obtain accurate results, it would be good to know what happens for max occupation of say 3-6 range to find out what is causing the issue you are seeing.
commented by (310 points)
Keeping the boson occupation number to a much smaller value does indeed decrease the number of iterations required for convergence, but I have not yet done some serious test.
I have a log file (about 5 MB) for a system with 16 bosons sites with occupation number 24 that clearly shows the convergence issues. It takes 100 iterations to lower the error from from 1e-5 to 1-6. Below a few lines from the log file.

---
Sweep=1, HS=2, Bond=15/16
Iteration: 1, Error: 0.638119
Iteration: 2, Error: 0.287186
Iteration: 3, Error: 0.130251
Iteration: 4, Error: 0.071088
Iteration: 5, Error: 0.049083
Iteration: 6, Error: 0.0423913
Iteration: 7, Error: 0.0452068
Iteration: 8, Error: 0.0538683
Iteration: 9, Error: 0.0589756
Iteration: 10, Error: 0.0676309
Iteration: 11, Error: 0.0621057
Iteration: 12, Error: 0.0512839
Iteration: 13, Error: 0.0422955
Iteration: 14, Error: 0.0355001
Iteration: 15, Error: 0.0316838
Iteration: 16, Error: 0.0273959
Iteration: 17, Error: 0.0237547
Iteration: 18, Error: 0.022184
Iteration: 19, Error: 0.0202687
Iteration: 20, Error: 0.0187334
Iteration: 21, Error: 0.0185156
Iteration: 22, Error: 0.0171161
Iteration: 23, Error: 0.015366
Iteration: 24, Error: 0.0151294
Iteration: 25, Error: 0.0138789
Iteration: 26, Error: 0.0128832
Iteration: 27, Error: 0.0115974
Iteration: 28, Error: 0.00991369
Iteration: 29, Error: 0.00899013
Iteration: 30, Error: 0.00786299
Iteration: 31, Error: 0.00725331
Iteration: 32, Error: 0.00701716
Iteration: 33, Error: 0.00628013
Iteration: 34, Error: 0.00590094
Iteration: 35, Error: 0.0052219
Iteration: 36, Error: 0.00482777
Iteration: 37, Error: 0.00446068
Iteration: 38, Error: 0.0039519
Iteration: 39, Error: 0.00379703
Iteration: 40, Error: 0.00347708
Iteration: 41, Error: 0.00329166
Iteration: 42, Error: 0.00301498
Iteration: 43, Error: 0.00277684
Iteration: 44, Error: 0.00255457
Iteration: 45, Error: 0.00233283
Iteration: 46, Error: 0.00207878
Iteration: 47, Error: 0.00192199
Iteration: 48, Error: 0.00169962
Iteration: 49, Error: 0.00161412
Iteration: 50, Error: 0.00152527
Iteration: 51, Error: 0.00139379
Iteration: 52, Error: 0.00130557
Iteration: 53, Error: 0.00124559
Iteration: 54, Error: 0.00119817
Iteration: 55, Error: 0.00104333
Iteration: 56, Error: 0.000955515
Iteration: 57, Error: 0.000859409
Iteration: 58, Error: 0.000774391
Iteration: 59, Error: 0.000753224
Iteration: 60, Error: 0.000715607
Iteration: 61, Error: 0.000663465
Iteration: 62, Error: 0.000581495
Iteration: 63, Error: 0.000548726
Iteration: 64, Error: 0.000509549
Iteration: 65, Error: 0.000480556
Iteration: 66, Error: 0.000476432
Iteration: 67, Error: 0.000426619
Iteration: 68, Error: 0.000412285
Iteration: 69, Error: 0.000372208
Iteration: 70, Error: 0.000334442
Iteration: 71, Error: 0.000312243
Iteration: 72, Error: 0.000284354
Iteration: 73, Error: 0.000273403
Iteration: 74, Error: 0.000240112
Iteration: 75, Error: 0.00023832
Iteration: 76, Error: 0.00023475
Iteration: 77, Error: 0.000235279
Iteration: 78, Error: 0.000210067
Iteration: 79, Error: 0.000207372
Iteration: 80, Error: 0.000181347
Iteration: 81, Error: 0.000170289
Iteration: 82, Error: 0.000148436
Iteration: 83, Error: 0.000145014
Iteration: 84, Error: 0.000150516
Iteration: 85, Error: 0.000138971
Iteration: 86, Error: 0.00013238
Iteration: 87, Error: 0.000124886
Iteration: 88, Error: 0.000114711
Iteration: 89, Error: 0.000113345
Iteration: 90, Error: 0.00011269
Iteration: 91, Error: 0.000109958
Iteration: 92, Error: 9.76829e-05
Iteration: 93, Error: 8.65131e-05
Iteration: 94, Error: 8.43918e-05
Iteration: 95, Error: 8.40385e-05
Iteration: 96, Error: 8.23291e-05
Iteration: 97, Error: 7.99522e-05
Iteration: 98, Error: 7.42379e-05
Iteration: 99, Error: 6.70692e-05
Iteration: 100, Error: 6.20688e-05
Iteration: 101, Error: 6.16791e-05
Iteration: 102, Error: 6.0274e-05
Iteration: 103, Error: 5.99569e-05
Iteration: 104, Error: 5.8609e-05
Iteration: 105, Error: 5.7213e-05
Iteration: 106, Error: 4.73714e-05
Iteration: 107, Error: 4.73477e-05
Iteration: 108, Error: 4.57683e-05
Iteration: 109, Error: 4.27954e-05
Iteration: 110, Error: 4.33137e-05
Iteration: 111, Error: 4.52173e-05
Iteration: 112, Error: 4.24098e-05
Iteration: 113, Error: 3.57563e-05
Iteration: 114, Error: 3.95354e-05
Iteration: 115, Error: 3.7504e-05
Iteration: 116, Error: 3.39426e-05
Iteration: 117, Error: 3.32528e-05
Iteration: 118, Error: 3.24908e-05
Iteration: 119, Error: 3.40345e-05
Iteration: 120, Error: 3.50016e-05
Iteration: 121, Error: 3.08532e-05
Iteration: 122, Error: 2.88498e-05
Iteration: 123, Error: 3.12128e-05
Iteration: 124, Error: 3.10717e-05
Iteration: 125, Error: 3.06152e-05
Iteration: 126, Error: 2.72405e-05
Iteration: 127, Error: 2.66073e-05
Iteration: 128, Error: 2.61065e-05
Iteration: 129, Error: 2.9085e-05
Iteration: 130, Error: 2.59396e-05
Iteration: 131, Error: 2.48724e-05
Iteration: 132, Error: 2.35326e-05
Iteration: 133, Error: 2.42675e-05
Iteration: 134, Error: 2.40615e-05
Iteration: 135, Error: 2.31259e-05
Iteration: 136, Error: 2.14988e-05
Iteration: 137, Error: 2.08507e-05
Iteration: 138, Error: 2.17281e-05
Iteration: 139, Error: 2.03236e-05
Iteration: 140, Error: 1.86188e-05
Iteration: 141, Error: 2.00066e-05
Iteration: 142, Error: 1.96718e-05
Iteration: 143, Error: 1.74585e-05
Iteration: 144, Error: 1.90512e-05
Iteration: 145, Error: 2.0851e-05
Iteration: 146, Error: 2.0051e-05
Iteration: 147, Error: 1.76715e-05
Iteration: 148, Error: 1.87925e-05
Iteration: 149, Error: 1.87155e-05
Iteration: 150, Error: 1.69574e-05
Iteration: 151, Error: 1.66414e-05
Iteration: 152, Error: 1.79171e-05
Iteration: 153, Error: 1.89355e-05
Iteration: 154, Error: 1.7597e-05
Iteration: 155, Error: 1.65225e-05
Iteration: 156, Error: 1.63052e-05
Iteration: 157, Error: 1.73545e-05
Iteration: 158, Error: 1.60184e-05
Iteration: 159, Error: 1.51521e-05
Iteration: 160, Error: 1.73576e-05
Iteration: 161, Error: 1.6333e-05
Iteration: 162, Error: 1.38076e-05
Iteration: 163, Error: 1.47765e-05
Iteration: 164, Error: 1.33878e-05
Iteration: 165, Error: 1.42886e-05
Iteration: 166, Error: 1.62694e-05
Iteration: 167, Error: 1.41469e-05
Iteration: 168, Error: 1.42415e-05
Iteration: 169, Error: 1.52288e-05
Iteration: 170, Error: 1.21468e-05
Iteration: 171, Error: 1.13911e-05
Iteration: 172, Error: 1.32988e-05
Iteration: 173, Error: 1.35119e-05
Iteration: 174, Error: 1.2796e-05
Iteration: 175, Error: 1.29884e-05
Iteration: 176, Error: 1.30855e-05
Iteration: 177, Error: 1.31864e-05
Iteration: 178, Error: 1.14494e-05
Iteration: 179, Error: 1.19393e-05
Iteration: 180, Error: 1.33965e-05
Iteration: 181, Error: 1.23386e-05
Iteration: 182, Error: 1.11296e-05
Iteration: 183, Error: 1.11898e-05
Iteration: 184, Error: 1.05088e-05
Iteration: 185, Error: 1.12205e-05
Iteration: 186, Error: 1.0402e-05
Iteration: 187, Error: 1.03346e-05
Iteration: 188, Error: 8.68706e-06
Iteration: 189, Error: 1.02391e-05
Iteration: 190, Error: 1.10672e-05
Iteration: 191, Error: 1.04691e-05
Iteration: 192, Error: 8.89482e-06
Iteration: 193, Error: 9.40873e-06
Iteration: 194, Error: 8.57953e-06
Iteration: 195, Error: 9.7164e-06
Iteration: 196, Error: 1.00126e-05
Iteration: 197, Error: 9.38924e-06
Iteration: 198, Error: 1.0108e-05
Iteration: 199, Error: 8.33536e-06
Iteration: 200, Error: 8.88882e-06
Iteration: 201, Error: 8.21131e-06
Iteration: 202, Error: 8.15833e-06
Iteration: 203, Error: 9.35119e-06
Iteration: 204, Error: 9.5519e-06
Iteration: 205, Error: 7.9187e-06
Iteration: 206, Error: 8.08857e-06
Iteration: 207, Error: 7.64561e-06
Iteration: 208, Error: 6.85986e-06
Iteration: 209, Error: 8.53052e-06
Iteration: 210, Error: 7.86043e-06
Iteration: 211, Error: 7.33537e-06
Iteration: 212, Error: 6.39248e-06
Iteration: 213, Error: 6.38474e-06
Iteration: 214, Error: 7.22072e-06
Iteration: 215, Error: 7.07952e-06
Iteration: 216, Error: 6.80428e-06
Iteration: 217, Error: 7.034e-06
Iteration: 218, Error: 5.75385e-06
Iteration: 219, Error: 6.39325e-06
Iteration: 220, Error: 6.33908e-06
Iteration: 221, Error: 7.01694e-06
Iteration: 222, Error: 6.37219e-06
Iteration: 223, Error: 6.16033e-06
Iteration: 224, Error: 6.26016e-06
Iteration: 225, Error: 5.63829e-06
Iteration: 226, Error: 6.48061e-06
Iteration: 227, Error: 6.11072e-06
Iteration: 228, Error: 5.99218e-06
Iteration: 229, Error: 5.69966e-06
Iteration: 230, Error: 5.3489e-06
Iteration: 231, Error: 5.31105e-06
Iteration: 232, Error: 5.24717e-06
Iteration: 233, Error: 5.53838e-06
Iteration: 234, Error: 6.0289e-06
Iteration: 235, Error: 5.83203e-06
Iteration: 236, Error: 5.86855e-06
Iteration: 237, Error: 4.92979e-06
Iteration: 238, Error: 4.71234e-06
Iteration: 239, Error: 3.87016e-06
Iteration: 240, Error: 3.89903e-06
Iteration: 241, Error: 4.06568e-06
Iteration: 242, Error: 4.6644e-06
Iteration: 243, Error: 5.3939e-06
Iteration: 244, Error: 5.91335e-06
Iteration: 245, Error: 6.57035e-06
Iteration: 246, Error: 6.82027e-06
Iteration: 247, Error: 6.39711e-06
Iteration: 248, Error: 5.6575e-06
Iteration: 249, Error: 5.34638e-06
warning: applyExp not converged in 250 steps
In applyExp, number of iterations: 249
In applyExp, number of matrix-vector multiplies: 250
---
commented by (70.1k points)
Thanks. While we look into this, would it be possible for you to use a lower maxmimum boson number? What was the reason for choosing 35?
commented by (310 points)
Hi, just for testing I lowered the occupation number to 6 and up to now I don't see this convergence issue. The worst case takes 12 iterations to converge. The results below are obtained with bond dimension 70 and the 1TDVP integrator.

--
Sweep=84, HS=2, Bond=15/16
Iteration: 1, Error: 0.00907976
Iteration: 2, Error: 0.0239859
Iteration: 3, Error: 0.00451044
Iteration: 4, Error: 0.0018541
Iteration: 5, Error: 0.00163281
Iteration: 6, Error: 7.48603e-05
Iteration: 7, Error: 6.14993e-05
Iteration: 8, Error: 1.41896e-05
Iteration: 9, Error: 7.75392e-07
Iteration: 10, Error: 1.77492e-07
Iteration: 11, Error: 6.10702e-08
Iteration: 12, Error: 2.87171e-09
In applyExp, number of iterations: 12
In applyExp, number of matrix-vector multiplies: 13
---
The model is comprised of a chain of low frequency bosons at finite temperature and the population dynamics converges only using quite large occupation numbers. I know the implementation is correct since I tested the ITensor code with a short chain of six coupled bosons against an exact evolution (using a Chebyshev propagator) and the MPS and exact results are in very good agreement.

I have implemented other models in ITensor and used the TDVP integrator and I have not seen this problem, so it certainly is very system dependent.

Thanks,
Raffaele
commented by (14.1k points)
Hi Raffaele,

Thanks for providing more details.

What time step are you using in the above case that is slow to converge? Does the convergence improve if you lower the time step per sweep?

-Matt
commented by (310 points)
At the moment I am using a timestep of 1 fs (which is much shorter than any characteristic timescale of the system.) Of course reducing the time step improves the convergence.
In the example below I have reduced the timestep to 0.2 fs, I am comparing the same Hamiltonian with the same initial state at the same time step as in my first example (which doesn't converge in 250 iterations).
I think this is a quite common Lanczos issue. I mean, without restart and/or reorthogonalization all Arnoldi type procedures have convergence issues.

--
Sweep=1, HS=2, Bond=15/16
Iteration: 1, Error: 6.25882e-05
Iteration: 2, Error: 0.00242279
Iteration: 3, Error: 0.000913222
Iteration: 4, Error: 0.000451137
Iteration: 5, Error: 0.000258266
Iteration: 6, Error: 0.000236436
Iteration: 7, Error: 0.000185916
Iteration: 8, Error: 0.000103495
Iteration: 9, Error: 8.44437e-05
Iteration: 10, Error: 7.25048e-05
Iteration: 11, Error: 4.61025e-05
Iteration: 12, Error: 2.79697e-05
Iteration: 13, Error: 2.32301e-05
Iteration: 14, Error: 1.63954e-05
Iteration: 15, Error: 1.14627e-05
Iteration: 16, Error: 9.18254e-06
Iteration: 17, Error: 7.2487e-06
Iteration: 18, Error: 5.29218e-06
Iteration: 19, Error: 3.56943e-06
Iteration: 20, Error: 2.92893e-06
Iteration: 21, Error: 2.2794e-06
Iteration: 22, Error: 1.57727e-06
Iteration: 23, Error: 1.18774e-06
Iteration: 24, Error: 9.90965e-07
Iteration: 25, Error: 6.09846e-07
Iteration: 26, Error: 4.2143e-07
Iteration: 27, Error: 2.46482e-07
Iteration: 28, Error: 1.26405e-07
Iteration: 29, Error: 7.77123e-08
Iteration: 30, Error: 3.76132e-08
Iteration: 31, Error: 1.99405e-08
Iteration: 32, Error: 8.8653e-09
--

Raffaele
commented by (14.1k points)
Ok that is useful. If you don't mind, could you see how large of a timestep you are able to do with that example before it has trouble converging?

I guess the case you saw above that didn't converge involved too large of a Krylov space, so there were issues with orthogonalization of the Krylov vectors. Currently we don't have reorthogonalization or restarts in the implementation (just to keep the initial implementation simpler, and see how well it could do for reasonable problems. Also, we had a more sophisticated implementation with restarts but it had a performance issue so we rewrote it to ensure that the simplest version was working properly).

It seems like the case you saw above is a reasonable use case, so we will look into adding back restarts. I can think of two things you could do in the meantime (if the case you saw above is blocking work you need to do):

1. Decrease the timestep per sweep of TDVP to the point where you don't seem to have convergence problems in applyExp anymore.
2. You can make a crude version of restarts by running applyExp multiple times within a step of TDVP. For example if you want to go to a timestep t in one step, you could do:

auto expHv = applyExp(H,v,-t/2);
expHv = applyExp(H,expHv,-t/2);

assuming t/2 is a timestep the current implementation is capable of reaching.

I realize those options may not be ideal, we will try to add in restarts as soon as possible (I am not sure if I can make promises about when that will happen, since the holidays are coming up and things will be quite busy around here early next year).
commented by (310 points)
Hi, thanks for your reply.
It would be great to have restart capabilities in the Lanczos procedure, meanwhile I will run the calculations with a shorter timestep.
commented by (260 points)
Hi Lello,

Have you solved the problem? or modified the applyExp() function with your zgexpv() function? Because I have met a similar question recently that my MPO is non-Hermite, So I need a generalized function to do this.

Best,
Meng
commented by (310 points)
Hi,
I solved the problem by reducing the time step but I didn't modify the applyExp() function.
It shouldn't be too difficult though.
(BTW, I also work on the use of TT to solve HEOM.)
commented by (260 points)
Hi Lello,

That's cool to hear that you apply TT solve HEOM. I have some experience in developing HEOM approach that may provide some help if you need it. Actually, I am currently trying to apply ITensor to rewrite HEOM.

Best,
Meng
commented by (260 points)
I just realize that you are  Raffaele Borrelli! I always focus on your works related to HEOM/tensor network since 2019 and very enjoy your recently tAMEn applications!