You make an important point: "inefficient" code only really matters if it accounts for a significant portion of the running time. It's still good to keep an eye on sections of code for possible inefficiencies, though. So to answer your question: yes, delta tensor contraction is handled in a special way. In fact, it launches one of three different algorithms depending on the case. For a 2-index delta tensor, the code just replaces one index with another without touching any tensor data. The other two cases are distinguished by whether all of the indices of the delta tensor are contracted or only some of them; for each of these, the code calls special sparse-tensor routines.
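
To make the three cases concrete, here is a rough NumPy sketch of the dispatch. It is not the library's actual code: the function name `contract_delta` and the representation of a tensor as an array plus a list of index labels are assumptions made for illustration, and dense `einsum` calls stand in for the real sparse-tensor routines.

```python
import string
import numpy as np

def contract_delta(data, labels, delta_labels):
    """Contract a dense tensor with a delta (unit-diagonal) tensor.

    Hypothetical helper for illustration only.
    data:         np.ndarray holding the tensor elements
    labels:       index names, one per axis of data
    delta_labels: index names of the delta tensor (all of equal dimension)
    Returns (new_data, new_labels).
    """
    shared = [l for l in delta_labels if l in labels]
    free = [l for l in delta_labels if l not in labels]
    if not shared:
        raise ValueError("delta shares no index with the tensor")

    # Case 1: 2-index delta sharing one index. Only the label changes;
    # the very same array object is returned, so no data is touched.
    if len(delta_labels) == 2 and len(shared) == 1:
        new_labels = [free[0] if l == shared[0] else l for l in labels]
        return data, new_labels

    # Give every shared axis the same einsum letter "Z" so that einsum
    # only visits the diagonal elements along those axes.
    rest = [l for l in labels if l not in shared]
    letters = dict(zip(rest, string.ascii_lowercase))
    spec_in = "".join("Z" if l in shared else letters[l] for l in labels)
    spec_rest = "".join(letters[l] for l in rest)

    # Case 2: every delta index is contracted. Sum over the diagonal.
    if not free:
        return np.einsum(f"{spec_in}->{spec_rest}", data), rest

    # Case 3: only some delta indices are contracted. Keep the diagonal
    # and place it on the diagonal of the delta's uncontracted indices.
    diag = np.einsum(f"{spec_in}->Z{spec_rest}", data)
    dim = diag.shape[0]
    out = np.zeros((dim,) * len(free) + diag.shape[1:], dtype=data.dtype)
    idx = np.arange(dim)
    out[(idx,) * len(free)] = diag
    return out, free + rest

# Case 1 in action: delta_{jk} T_{ij} just renames j to k.
T = np.arange(6.0).reshape(2, 3)
same, new_labels = contract_delta(T, ["i", "j"], ["j", "k"])
assert same is T and new_labels == ["i", "k"]
```

The point of the first branch is that the result shares storage with the input, which is why the 2-index case costs essentially nothing regardless of tensor size.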