Practical routes to convergence

+1 vote

I've long struggled with the question: what is the best sequence of input parameters (energy cutoff, maximum number of singular values kept, noise, number of sweeps per parameter set) to reach the most precise wavefunction as rapidly as possible?

For the practical purpose of getting my calculations running, I opted for a compromise that combines noise at the start of each set of sweeps with a doubling of the maximum number of singular values from one set of sweeps to the next.

In other words, I provide up to 7 input files in succession, and each input file has a different maximum number of singular values (M).

So in my most recent set of calculations, I used roughly the following:
Inputfile 1:
M = 50 states kept
Noise = 1E-4, decreasing to 0 over about 7 sweeps
Energy cutoff = 1E-2

Inputfile 2:
M = 100 states kept
Noise: same as inputfile 1
Energy cutoff = 1E-4

Inputfile 3:
M = 200
Energy cutoff = 1E-7

Inputfile 4:
M = 400
Energy cutoff = 1E-8

Inputfile 5:
M = 800
Energy cutoff = 1E-9

and so on. The later input files (larger M, smaller cutoff) are usually too time-consuming to run, so I don't use them often.
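For concreteness, the staged schedule described above (M doubling from one input file to the next while the cutoff shrinks) can be sketched as follows. The function name is illustrative, not part of any actual input-file format:

```python
# Hypothetical sketch of the staged schedule above: M doubles at each
# stage while the energy cutoff decreases. Values are taken from the
# input files listed in the post; the function name is made up.

def doubling_schedule(m_start=50, cutoffs=(1e-2, 1e-4, 1e-7, 1e-8, 1e-9)):
    """Return (M, cutoff) pairs, with M doubling at each stage."""
    return [(m_start * 2**i, c) for i, c in enumerate(cutoffs)]

for m, cutoff in doubling_schedule():
    print(f"M = {m:4d}   cutoff = {cutoff:.0e}")
```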

I have since realized that this "practical attempt" to get good convergence might be inefficient for the following reason: the decrease in energy from one input file to the next is greater than the supposed energy cutoff of the previous input file. So I am now decreasing the energy cutoffs to better match the true precision afforded by a given M.

Jin recommended letting M follow a simpler sequence such as 20, 20, 40, 40, ..., M_previous + 20, with two sweeps at each M. For me this seems impractical, since I want to reach high precision (M up to and beyond 800, to push energy errors below 1E-6 if possible).

Is there any realistic way of optimizing this energy cutoff/states kept/number of sweeps/noise sequence besides an automated (or manual) trial-and-error approach? I am studying a difficult system and need the most efficient and precise wavefunctions I can afford. I have limited computer time that is running out rapidly.

Thanks!

Jon

commented Aug 17 by (16,920 points)
Hi Jon, one comment is that you don't have to use separate input files. There is an option to provide a "sweeping table" to a DMRG calculation. Check out the dmrg_table.cc sample code in the sample/ folder and the accompanying file inputfile_dmrg_table.
commented Aug 18 by (330 points)
Hi Miles, thanks for the response, and that's exactly what I have been using. I just have a series of different input tables that I can select from, with increasing precision for each file.

+1 vote
answered Sep 9 by (16,920 points)

Hi Jon,
So there's no perfect or universal answer to this question; it depends on your goals and constraints (e.g. short calculation time, ability to extrapolate in truncation error, high precision, limited resources) and very much on the specific system you are studying.

However, for the easiest and most natural case for DMRG, that of a gapped, finite-size system with open boundaries, the most common and recommended approach is to use a small to very small cutoff (ranging from 1E-5 down to 1E-12) throughout, and to raise M exponentially quickly from a rather tiny value like 10 up to the final large value you need for high accuracy. So a common thing I do is to just set the cutoff to 1E-10, say, and then raise M according to a sequence like 10, 20, 40, 80, 160, 320, 640.
The main point of using this sequence is that DMRG typically converges exponentially quickly in the number of sweeps for gapped systems with local Hamiltonians.
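As one concrete illustration, a sweeping table implementing this recipe might look something like the following. This is modeled loosely on the inputfile_dmrg_table sample mentioned above; the column names and exact syntax should be checked against that file:

```
sweeps
    {
    maxm  minm  cutoff  niter  noise
    10    10    1E-10   2      1E-8
    20    10    1E-10   2      1E-8
    40    10    1E-10   2      1E-8
    80    10    1E-10   2      1E-9
    160   10    1E-10   2      1E-10
    320   10    1E-10   2      0
    640   10    1E-10   2      0
    }
```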

Another case is when you want to extrapolate in truncation error, or more generally use information from intermediate sweeps. Then you want to make sure the MPS is fairly well converged at each M value you visit. So a common approach here is to do two sweeps at each M, i.e.
10, 10, 20, 20, 40, 40, 80, 80, 160, 160, ..., and to use the value of the energy, say, coming from the second sweep at each M value.
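The extrapolation itself can be sketched as a simple linear fit of energy versus truncation error, extrapolated to zero truncation error. This is a minimal illustration, not ITensor code, and the data points below are made-up placeholders standing in for second-sweep values:

```python
# Hypothetical sketch: linear extrapolation of the energy to zero
# truncation error. In practice the (truncation_error, energy) pairs
# would come from the *second* sweep at each M value.

def extrapolate_to_zero(points):
    """Least-squares fit E(eps) = a*eps + b; return b = E(eps -> 0)."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return b

# Placeholder data (NOT real DMRG output), e.g. from M = 40, 80, 160:
data = [(1e-5, -10.02), (4e-6, -10.05), (1e-6, -10.065)]
print(extrapolate_to_zero(data))
```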

For very tough to converge systems there are additional things you can do. For the quasi-continuum systems we've studied a lot over the last few years, we often do many sweeps (like 100 sweeps, say) at rather small M values < 100 and then only raise M to large values after the main properties of the wavefunction like the density of electrons look reasonable. The last handful of sweeps at larger M build in the proper electron correlations needed to get very accurate properties.

Of course turning on the noise term is very important for tough to converge systems, especially when you are conserving quantum numbers.

So those are some partial answers but really the choices depend a lot on the specifics of your system.

My main piece of advice would be to measure not just the energy but many other local properties of your system (magnetization at each site, density at each site, etc.) and plot these quantities after each sweep. This way you can start to build up an intuition for your system and get some idea of whether your sweeping parameters make sense. For example, if your wavefunction is very far from the expected physics and changing very slowly sweep-to-sweep, and the Hamiltonian has many small energy scales, then try doing a huge number of low-accuracy DMRG sweeps.

Plotting and experimenting is very important. I don't think there is a one-size-fits-all or black-box solution when it comes to best practices for DMRG.

Hope that helps -

Miles

commented Sep 11 by (330 points)
Thanks Miles, and I've unfortunately been learning these lessons the hard way, through trial and error, which has been taking a lot of time and resources.

A few questions I have:

1. What do you mean by "ability to extrapolate in truncation error"? Shouldn't this always be possible -- simply plot an observable as a function of truncation error, then extrapolate to 0?

2. When turning on the noise term, does this impact the number of sweeps needed for convergence at each M? Or should it improve convergence for even a single sweep? Is it possible for noise to worsen convergence?

Thanks

Jon
commented 3 days ago by (16,920 points)
Hi Jon,
1. by ability to extrapolate, the idea here is that after maxm changes, it often takes two sweeps for the truncation error reported by DMRG to be accurate. At first the truncation error reported is usually wronge because the algorithm hasn't figured out the best representation of the state in the newly enlarged basis. So a recommended practice is to do two sweeps at each "m" to converge the state at that m and get an accurate trunc err to use for extrapolating.

2. Yes, the noise term can hurt every aspect of convergence: the number of sweeps needed, the accuracy of the state, etc. But a little bit of noise can help a ton with convergence while not hurting accuracy much. So for tough calculations it's best to use a large noise at the beginning, then quickly drop the noise to a rather small value (you have to experiment), and maybe even to zero in the last sweep or two, as long as you don't observe the energy starting to go up or other problems.
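A noise schedule of this shape (large at the start, decaying quickly, switched off for the last couple of sweeps) might be sketched as follows. The decay rate, floor, and values here are illustrative placeholders to be tuned by experiment:

```python
# Hypothetical sketch of a noise schedule: start large, decay quickly
# toward a small floor, then switch noise off for the final sweeps.
# All specific values are placeholders, not recommendations.

def noise_schedule(n_sweeps, start=1e-4, floor=1e-10, n_zero_tail=2):
    """Geometric decay from `start` toward `floor`, zero at the end."""
    decaying = n_sweeps - n_zero_tail
    sched = [max(start * 10**(-2 * i), floor) for i in range(decaying)]
    sched.extend([0.0] * n_zero_tail)  # noise fully off for last sweeps
    return sched

print(noise_schedule(8))
```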

As with everything, you can and should try out various hypotheses about different convergence strategies and learn what works best for a particular system. Each system is different so it takes experience and a willingness to experiment, which I think you are already doing.