(Abhiroop and I discussed via email, but posting part of my answer here as well.)
The inner workings of AutoMPO are quite complex and would take many days for me to describe fully in all detail. At some point I may create a long article about how it works, but also its implementation sometimes changes. The most important thing is that it obeys a "contract" about its output given certain input, otherwise it should be viewed as a black box for most purposes. If you end up reading through and studying the code itself, I could answer specific questions you have about that.
That being said, here is a link to an article that discusses some of the ideas that are in the current version of AutoMPO (by Anna Keselman) which can now handle multi-site and long-range interactions: https://arxiv.org/abs/1605.02611
The basic idea of how AutoMPO works is that there is an analysis done of the pairs of operators making up the Hamiltonian (say for the case where all terms are pairwise only). Then each MPO tensor is constructed such that an operator can be put on site "i" and a particular setting of the MPO bond "remembers" this. Then later when one encounters the other operator of an operator pair at site j, it is put into the position of the matrix at site j whose row corresponds to the other operator being on site i. This is how the "Exact=",true backend of AutoMPO works. The newer backend, which can handle more than two-site terms, uses the ideas of the article by Chan, Keselman, et al. linked above.
Best regards,
Miles