It is absolutely true that an MPS does not efficiently represent the entanglement structure of a 2D system. To represent a 2D state with an area law using an MPS, the bond dimension of the MPS will scale exponentially with the width of the system (if we are thinking about the 2D system as a finite strip or cylinder). However, MPS are often used instead of PEPS because MPS methods are very reliable and well developed, and can get good results for small 2D systems (cylinders with widths of 10-20). So to answer your first question, every 2D system of large enough size must be represented in some other way besides an MPS. Here is a paper discussing a comparison of the scaling of MPS vs PEPS in more detail:

https://arxiv.org/abs/1705.03222
For your use case, where you are performing time evolution, time evolution with PEPS is not a very developed area (the only work I know of is this very recent one:

https://arxiv.org/abs/1811.05497 ), so the most straightforward way is probably to try out MPS methods first.

You say that you would like to time evolve the string operator. May I ask why you are trying to do this? Will you eventually apply the (time evolved) string operator to the state? If so, it sounds like maybe you could time evolve the state, apply the original string operator, and then time evolve the resulting state after the string operator is applied (i.e. use the Schrodinger picture of evolution instead of the Heisenberg picture). This should be much easier to accomplish in terms of MPS.

For your last question, I have to admit that I don't quite understand your notation. What do you mean by "break it apart by 4 to 2+2 to 1+1+1+1"? Anyway, to figure out the details of how to do time evolution with four-site gates of the toric code, it would depend on what evolution method you were interested in using (swap gates, long range MPOs, TDVP, etc.). I believe that in ITensor, using long range MPOs would be the easiest thing to try out first, so if you have any questions about that please let us know.

-Matt