I found that very off-putting, too. However, I assumed it was just a hacky video.
It's really not that hard to pull off. Basically, follow Gemini's advice:
I mean, all you've got to do is an extra tile shifting+DMA transfer to VRAM each time an overflow happens.
I have always done it exactly this way. The only problem for now is that text sometimes gets rendered onto a background whose pattern differs from the text background, so you see extra "blank" space where the routine will print the next character.
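Here is a rough sketch of the shift-and-overflow idea, in Python rather than console assembly so the logic is easy to follow. Everything in it is an assumption for illustration: the 1bpp 8x8 tile layout, the names `VWFRenderer` and `put_glyph`, and the `dma_queue` list that merely stands in for real DMA transfers to VRAM. It is not anybody's actual routine.

```python
TILE_W = 8  # assumed: 8x8 tiles, 1 bit per pixel, one byte per tile row


class VWFRenderer:
    """Hypothetical variable-width font renderer (illustration only)."""

    def __init__(self):
        self.tile = [0] * 8    # tile currently being filled
        self.x = 0             # pixel offset inside the current tile (0..7)
        self.tile_index = 0    # VRAM tile slot of the current tile
        self.dma_queue = []    # stand-in for DMA transfers: (slot, row bytes)

    def put_glyph(self, rows, width):
        """Draw a glyph given as 8 left-aligned row bytes, `width` pixels wide."""
        assert len(rows) == 8 and 1 <= width <= TILE_W
        spill = [0] * 8
        for y, bits in enumerate(rows):
            # Shift the row into a 16-bit window: the high byte lands in the
            # current tile, the low byte is whatever overflows into the next.
            shifted = (bits << 8) >> self.x
            self.tile[y] |= (shifted >> 8) & 0xFF
            spill[y] = shifted & 0xFF
        # The tile being filled always has to be sent to VRAM.
        self.dma_queue.append((self.tile_index, list(self.tile)))
        self.x += width
        if self.x >= TILE_W:
            # Overflow: continue in the next tile, starting from the spill.
            self.x -= TILE_W
            self.tile_index += 1
            self.tile = spill
            if self.x > 0:
                # The glyph really crossed the boundary: one extra transfer.
                self.dma_queue.append((self.tile_index, list(self.tile)))


renderer = VWFRenderer()
renderer.put_glyph([0x80] * 8, 2)  # narrow glyph, e.g. an "i" (2 px wide)
renderer.put_glyph([0xFE] * 8, 7)  # wide glyph, e.g. an "M" (7 px wide)
```

Either way, each glyph is drawn in a single call, so a narrow "i" and a wide "M" both appear in one step. The extra transfer only happens when a glyph actually crosses a tile boundary; a glyph that ends exactly on the boundary just advances to the next tile.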
why should an "i" take the same amount of time to appear as an "M"?
Because they are both letters, equally. That is how you read, and it is the least distracting. If every letter first shows up only partially rendered, you automatically feel that something is wrong.
cYa,
Tauwasser