Is there some built-in generalization of KrylovKit.eigsolve that optimizes over multiple input vectors?

Question

Is there some built-in generalization of KrylovKit.eigsolve that optimizes over multiple input vectors?

asked Oct 9, 2020 by sujaykazi (1.1k points)
edited Oct 9, 2020 by sujaykazi

I know that the KrylovKit.eigsolve Julia function can find a vector @@x@@ to maximize or minimize @@x^\dagger Ax@@ for a fixed matrix @@M@@. We can think of this in index notation as @@M_{\alpha\beta}x^\alpha\overline{x}^\beta@@, where the bar denotes complex conjugation.

What I am wondering is: does some common package in Julia have a built-in generalization of this method that optimizes over multiple input vectors? For instance, is there a built-in function that finds vectors @@x@@ and @@y@@ to maximize or minimize @@M_{\alpha\beta\gamma\delta}x^\alpha\overline{x}^\beta y^\gamma\overline{y}^\delta@@?

In case you're wondering why I'm asking, basically I am wondering whether it is possible to do a single optimization step over two non-adjacent sites in a way that keeps them unliked, because if you just multiply the two tensors together and then optimize that, then it will introduce a link between those sites. Basically I want to just do this step once in a while to see whether it is possible to fix some problems with my trial Schwinger model (as I have already discussed in some other questions on this website).

commented Oct 9, 2020 by MattFishman (14.1k points)
edited Oct 12, 2020 by MattFishman

Someone can correct me if I am wrong, but I think that is outside the scope of standard Krylov method packages. I think the "standard" practice for this in tensor network algorithms would be to fix @@y@@ and solve for @@x@@, then fix @@x@@ and solve for @@y@@, and repeat until convergence. Does that not work?

commented Oct 12, 2020 by MattFishman (14.1k points)
edited Oct 12, 2020 by MattFishman

Also, as a slightly more sophisticated approach, you could do the following:
1. Solve the full fixed point equation @@M_{\alpha\beta\alpha'\beta'} z_{\alpha\beta} \propto z_{\alpha'\beta'}@@ (i.e. using @@KrylovKit.eigsolve@@).
2. SVD the results @@z_{\alpha\beta}@@, and only keep the dominant singular value/singular vectors. This would result in the factorization @@z_{\alpha\beta} \approx x_\alpha y_\beta@@.
3. Use @@x_\alpha@@ and @@y_\beta@@ from step 2. as starting points in solving for @@M_{\alpha\beta\alpha'\beta'} x_\alpha y_\beta \propto x_{\alpha'} y_{\beta'}@@ by first fixing @@y_\beta@@ and solving for @@x_\alpha@@ (i.e. using @@KrylovKit.eigsolve@@), then fixing @@x_\alpha@@ and solving for @@y_\beta@@, and repeating until convergence.

commented Oct 12, 2020 by sujaykazi (1.1k points)

Hi Matt, sorry for the late reply. I realize now that I should have mentioned in my first question that I would like the optimization over the two vectors to be "truly simultaneous" because I want it to be able to optimize assuming a quantum number constraint too, which would prevent it from optimizing each vector in isolation. But yes, in a situation without QN constraints or similar things, your first suggestion is probably the best one.

I actually have tried using your second suggestion (or something very similar to it), and I think it works for certain applications, but I remember that it didn't really work for mine, mainly because, for the same reason as above (QN constraints), you can only really perform steps 1 and 2, and step 3 will fail to do anything.

But in any case, thank you so much! I figured that, if there were an easier way to do what I am trying to do, someone on this forum would likely know. I think I will have to keep looking.

commented Oct 12, 2020 by MattFishman (14.1k points)

What kind of QN constraint is there in your system? If you know what the QN flux should be beforehand, you can initialize the vectors @@x@@ and @@y@@ to have the correct flux (for example, if you have 10 particles in your system, you could initalize @@x@@ to have 10 particles and @@y@@ to have zero particles).

Also, I don't think the suggestion in my comment would have any issues when QNs are involved. For example, you could initialize @@z@@ to have a flux of 10 particles in step 1., then when you SVD, depending on where you put the orthogonality center you can specify that either @@x@@ or @@y@@ gets 10 particles in step 2. Then, the QN constraint that you decide on in step 3. should continue to be satisfied based on the one you choose in step 2. Possibly I am misunderstanding something, but what do you mean when you say "step 3 will fail to do anything"? It should still do some sort of extra optimization of @@x@@ and @@y@@ (unless they are already exactly the fixed points/minima of the optimization).

commented Oct 12, 2020 by MattFishman (14.1k points)

I realized a subtlety of this entire procedure is that you need to keep track of the orthogonality of your MPS correctly, and ensure it is being accounted for properly. Perhaps you are already aware of this, but it may be a bit nontrivial.

For example in step 1. above, you would need to have set up the orthogonality center of your MPS to be either the site of tensor @@x@@ or the site of tensor @@y@@ (I believe you have been working on optimizing the first and last MPS tensors simultaneously, correct?). Say that you fix your orthogonality center of your MPS to be the location of tensor @@x@@ (perhaps the first site). For noncontiguous MPS sites, you would have to pick one or the other. That would mean in general the eigenvector equation for @@z@@ will be a generalized eigenvector equation, since you will have to account for a nontrivial environment of tensor @@y@@ (the last site of the MPS). The orthogonality of the MPS will also have to be accounted for in the rank-1 factorization in step 2., so that the center ends up back on tensor @@x@@ and tensor @@y@@ is properly orthogonalized.

Then, in step 3. above, there would be a normal eigenvector equation for tensor @@x@@, and a generalized eigenvector equation for tensor @@y@@. Alternatively, there are ways to optimize tensor @@y@@ that constrain it to be an orthogonal matrix (these techniques are used in other tensor network algorithms, like MERA optimization). I could explain these techniques in more detail.

Sorry if this is confusing or vague, but I think the problem you are trying to solve is fairly subtle. It would help to have more details about where the eigenvector equation is coming from in your particular problem, and what constraints there are on the tensors @@x@@ and @@y@@.

commented Oct 18, 2020 by miles (70.2k points)

Yes, to echo Matt I'm not sure this problem as posed is a kind that can be solved by iterative, linear-algebra methods in the way you are envisioning. I think it requires an approach called "alternating least squares" which is like what Matt was saying about holding some of the vectors fixed temporarily while optimizing some of the other ones. (Also please correct me if I'm wrong about that but that's what I think at a glance. ) Alternating least squares can work well in practice but does require a good initial guess and can get stuck in local optima. DMRG is actually a kind of alternating least squares algorithm for MPS and its actually called that in the math literature ("ALS for tensor trains").

There might be packages in Julia for solving problems like this but I'm not sure. Here is one that I found:
https://github.com/kul-forbes/StructuredOptimization.jl

Is there some built-in generalization of KrylovKit.eigsolve that optimizes over multiple input vectors?

Please log in or register to add a comment.

Please log in or register to answer this question.

Categories