Hi Huike,
These are good questions, but it will be hard for me to answer them completely here. I am working on the next part of the ITensor Book (http://itensor.org/docs.cgi?page=book), which will include a section that goes into detail about IQTensors and their structure. There is already some material on IQTensors there which might help you a lot.
About iqdmrg, it is not a separate algorithm from DMRG. If you look at the sample code iqdmrg.cc (in the sample/ folder under the ITensor software) then you'll see at the end it actually calls the same dmrg(...) function that the dmrg.cc code does. It's literally the same algorithm code, just with template parameters that switch the tensor type from ITensor to IQTensor.
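To make the template point concrete, here is a minimal standalone sketch (not the actual ITensor source). The names DenseTensor, BlockSparseTensor, ToyMPS and toyDmrg are hypothetical stand-ins I made up for illustration; the only point is that a single function template serves both tensor types.

```cpp
#include <string>

// Hypothetical stand-ins for the two tensor types (not ITensor classes).
struct DenseTensor       { static std::string kind() { return "dense"; } };
struct BlockSparseTensor { static std::string kind() { return "block-sparse"; } };

// Hypothetical MPS type, parameterized on the tensor type it stores.
template <class Tensor>
struct ToyMPS {
    int nsites;
};

// One algorithm body for both cases: the template parameter is the only
// thing that changes between the dense and the block-sparse call.
template <class Tensor>
std::string toyDmrg(const ToyMPS<Tensor>& psi) {
    return "swept " + std::to_string(psi.nsites) + " sites with " +
           Tensor::kind() + " tensors";
}
```

Calling toyDmrg on a ToyMPS<DenseTensor> and on a ToyMPS<BlockSparseTensor> runs the same code path, which is the sense in which iqdmrg is not a separate algorithm.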
So all of the steps are the same as in regular DMRG with dense tensors, except the tensors are block-sparse due to the (Abelian) quantum number conservation. (Recall that the main steps of DMRG are applying a few steps of an iterative solver, such as Davidson or Lanczos, then performing an SVD of the two-site wavefunction tensor to restore the MPS form.)
One important thing about how IQTensor DMRG works is that whatever the total quantum number of the initial MPS (actually an IQMPS) is, the DMRG algorithm is guaranteed not to change it. So it is important to make sure the total quantum number of the initial state is well defined. You'll see that in iqdmrg.cc the code uses an InitState helper object to initialize the state to be a product state with a well-defined total Sz.
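As a rough illustration of why a product state is a safe starting point, the sketch below computes the total Sz of a spin-1/2 product state in plain C++. It does not use ITensor; Spin, neelState and totalTwoSz are hypothetical names, and the alternating up/down pattern is just an assumed example of an initial state.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical local state label for a spin-1/2 site.
enum class Spin { Up, Dn };

// Assumed example initial state: Up on even sites, Dn on odd sites
// (sites counted from 0).
std::vector<Spin> neelState(std::size_t nsites) {
    std::vector<Spin> state(nsites);
    for (std::size_t i = 0; i < nsites; ++i) {
        state[i] = (i % 2 == 0) ? Spin::Up : Spin::Dn;
    }
    return state;
}

// Total 2*Sz of a product state: each Up site adds +1, each Dn site adds -1.
// Working with 2*Sz keeps everything an integer.
int totalTwoSz(const std::vector<Spin>& state) {
    int twoSz = 0;
    for (Spin s : state) {
        twoSz += (s == Spin::Up) ? 1 : -1;
    }
    return twoSz;
}
```

Because every site of a product state has a definite Sz, the sum is a single definite number, and that is the sector the calculation stays in. With this pattern an even number of sites gives total Sz = 0, and an odd number gives 2*Sz = +1.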
Finally, if you are asking about the details of how IQTensors actually work internally to conserve quantum numbers, it would be too long to explain in full detail here (but the ITensor Book will eventually explain most of this). Basically the indices of an IQTensor, of type IQIndex, are segmented into "sectors" which are labeled by Index and QN object pairs. Once the total "flux" or quantum number of an IQTensor is set (e.g. by setting one of the tensor components to a non-zero value) then only blocks with that total flux are allowed to be non-zero. All other blocks are assumed to be zero and are not even allocated in memory. Operations on IQTensors such as contracting them or computing SVDs of them preserve this block structure and exploit it for extra efficiency.
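Here is a toy sketch of the block rule described above, for a tensor with two indices. It is a simplified model and not how ITensor implements it: Sector, SectoredIndex, Block, allowedBlocks and storedElements are hypothetical names, the quantum number is reduced to a single integer (think 2*Sz), and I am assuming each index carries a direction of +1 or -1 that sets the sign with which its quantum numbers enter the flux.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one sector of an index: a quantum number
// (a single integer here) and the dimension of that sector.
struct Sector {
    int qn;
    std::size_t dim;
};

// Hypothetical sectored index. dir is +1 or -1 (an assumed sign convention).
struct SectoredIndex {
    std::vector<Sector> sectors;
    int dir;
};

// One block that is allowed to be non-zero, identified by its sector pair.
struct Block {
    std::size_t row_sector;
    std::size_t col_sector;
    std::size_t size;  // number of elements stored for this block
};

// List the blocks of a two-index tensor whose signed quantum numbers add up
// to the requested flux. Every other block is treated as identically zero.
std::vector<Block> allowedBlocks(const SectoredIndex& row,
                                 const SectoredIndex& col,
                                 int flux) {
    assert(row.dir == 1 || row.dir == -1);
    assert(col.dir == 1 || col.dir == -1);
    std::vector<Block> blocks;
    for (std::size_t i = 0; i < row.sectors.size(); ++i) {
        for (std::size_t j = 0; j < col.sectors.size(); ++j) {
            const int total = row.dir * row.sectors[i].qn +
                              col.dir * col.sectors[j].qn;
            if (total == flux) {
                blocks.push_back({i, j, row.sectors[i].dim * col.sectors[j].dim});
            }
        }
    }
    return blocks;
}

// Total number of elements that actually need to be stored.
std::size_t storedElements(const std::vector<Block>& blocks) {
    std::size_t n = 0;
    for (const Block& b : blocks) {
        n += b.size;
    }
    return n;
}
```

For example, take both indices to have a qn = +1 sector of dimension 2 and a qn = -1 sector of dimension 3, with directions +1 and -1. At flux 0 only the two diagonal blocks survive, so 4 + 9 = 13 elements are stored instead of the 25 a dense 5 x 5 tensor would need.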
Best regards,
Miles