Good question. I think the std::bad_alloc is most likely because your code is running out of memory. This can happen when the contractions are not doing what you think they are, so that indices you expected to be summed over survive and the number of indices on the result grows very large. I have been meaning to put in a line that checks whether the number of indices grows beyond, say, 10 and then throws an error unless disabled by the user.
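To make the index-growth point concrete, here is a minimal sketch in plain C++. It is a toy model, not ITensor code: a "tensor" is just a list of index labels, and the names contractLabels and checkedContract, along with the maxRank parameter, are made up for illustration. It shows how a mismatched site label leaves extra indices on the result, and what a guard like the one described above might look like.

```cpp
#include <algorithm>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Toy stand-in for a tensor: only the index labels are tracked, no data.
using Labels = std::vector<std::string>;

// Labels of the result of contracting a with b: a label present in both
// is summed over and disappears; every other label survives.
Labels contractLabels(const Labels& a, const Labels& b) {
    Labels out;
    for (const auto& l : a)
        if (std::find(b.begin(), b.end(), l) == b.end()) out.push_back(l);
    for (const auto& l : b)
        if (std::find(a.begin(), a.end(), l) == a.end()) out.push_back(l);
    return out;
}

// Hypothetical guard (an assumption, not an ITensor feature): throw if
// the result carries more than maxRank indices.
Labels checkedContract(const Labels& a, const Labels& b,
                       std::size_t maxRank = 10) {
    Labels out = contractLabels(a, b);
    if (out.size() > maxRank)
        throw std::length_error("contraction produced " +
                                std::to_string(out.size()) + " indices");
    return out;
}
```

For example, contracting {"l1", "s1", "l2"} with {"s1", "s1'", "h1", "h2"} leaves 5 labels because "s1" is shared, while contracting {"l1", "sN", "l2"} with the same second list leaves all 7. Each surviving index multiplies the storage needed, which is how a chain of such contractions can exhaust memory.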
So I think what's happening is that the MPS tensors are being swapped ok, but the actual site indices are not being switched along with them. This could cause mismatches with, say, a Hamiltonian made from the original site set, which would have the site indices in the original order. So I'd suggest you add the following step: use "delta" tensors (http://itensor.org/docs.cgi?page=classes/diag_itensor) to change all of the site/physical indices after you swap the order of the tensors. For example, on the first site do:
phi.Aref(1) *= delta(dag(sites(N)),sites(1));
which (efficiently) replaces the Nth site index with the first site index. Here I'm assuming "sites" is the variable holding your site set (SpinHalf, Hubbard, SpinOne, etc. are examples of site sets).
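The same replacement has to be done on every site, not just the first. Here is a minimal sketch of that bookkeeping in plain C++, again as a toy model rather than ITensor code: tensors are lists of index labels, site labels are written "s1", "s2", and so on, and replaceLabel, relabelSites and the origin vector are hypothetical names. In ITensor itself the corresponding step would presumably be the delta line above applied at each position, with sites(N) replaced by the site index that tensor actually carries.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a tensor: only the index labels are tracked, no data.
using Labels = std::vector<std::string>;

// Mimics the effect of multiplying by a delta tensor:
// the label `from` is replaced by the label `to`.
void replaceLabel(Labels& t, const std::string& from, const std::string& to) {
    std::replace(t.begin(), t.end(), from, to);
}

// origin[j] is the original (1-based) site of the tensor that now sits
// at position j+1. After this loop every tensor carries the site label
// of the position it occupies.
void relabelSites(std::vector<Labels>& mps, const std::vector<int>& origin) {
    for (std::size_t j = 0; j < mps.size(); ++j)
        replaceLabel(mps[j],
                     "s" + std::to_string(origin[j]),
                     "s" + std::to_string(j + 1));
}
```

With three sites in reversed order, origin would be {3, 2, 1}: the tensor at position 1 has "s3" replaced by "s1", the middle one is unchanged, and the tensor at position 3 has "s1" replaced by "s3".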
Please let me know if that fixes your issue.