The SNES has two VRAM chips, each 32K in size. VRAM is interleaved between the two chips, so each 2-byte word is split: the low byte lives on one chip and the high byte on the other. As a result, everything is accessed by WORD address rather than BYTE address, and it is probably easier to visualize it that way.
So in this response, I'm going to use WORD addresses for all VRAM accesses unless I specifically say I'm doing otherwise.
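To make the word/byte distinction concrete, here is a small Python sketch (not SNES code, and the function names are made up for illustration) that converts a word address into the byte addresses it covers:

```python
VRAM_WORDS = 0x8000  # 2 chips x 32K bytes = 64K bytes = 32K words


def word_to_byte_addr(word_addr):
    """Byte address of the low byte of the given VRAM word."""
    return (word_addr % VRAM_WORDS) * 2


def bytes_of_word(word_addr):
    """(low byte address, high byte address) for one VRAM word."""
    low = word_to_byte_addr(word_addr)
    return low, low + 1


print(hex(word_to_byte_addr(0x0400)))  # 0x800
print(hex(word_to_byte_addr(0x1000)))  # 0x2000
```

So whenever a document quotes a byte address such as $800, halve it to get the word address ($400) used here.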
Storing 04 to $2107 means that the tile map starts at $400 x 2 = $800 in VRAM. So which register should I write to in order to make the tiles appear from $800? For example, I want tile #02 to appear at (0,0) and tile #03 to appear at (0,1) in the upper left corner.
VRAM is a blank slate. Some of it will contain patterns and some of it will contain tilemaps. The SNES doesn't care what data you put where as long as you tell it where to find everything.
04->2107 is telling the SNES you want it to look at word address $0400 (byte address $0800) for the tilemap.
This means, to draw to the tilemap, you would write #$0400 to register $2116/7 to set the address, followed by the words you want to draw to $2118/9.
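As a rough sketch of that arithmetic (Python rather than 65816, with illustrative function names, and assuming a 32x32 tilemap whose entries are stored row by row, one word per tile):

```python
def tilemap_base(bgsc_value):
    """Word address of the tilemap for a value written to $2107.

    Bits 2-7 select the base in steps of $400 words; bits 0-1 are the
    map size and are ignored here. VRAM is 32K words, so the result wraps.
    """
    return ((bgsc_value >> 2) * 0x400) & 0x7FFF


def entry_addr(bgsc_value, col, row):
    """Word address of the tilemap entry at (col, row) in a 32x32 map."""
    return tilemap_base(bgsc_value) + row * 32 + col


print(hex(tilemap_base(0x04)))      # 0x400
print(hex(entry_addr(0x04, 0, 0)))  # 0x400
print(hex(entry_addr(0x04, 0, 1)))  # 0x420
```

Here (0,1) is read as column 0, row 1. If it is meant as the second tile of the top row instead, the entry is simply the next word, $0401.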
Storing 01 to $210B means that the tile set (graphic data) starts at $1000 x 2 = $2000 in VRAM. I'm clear on this point, but not on $2116 as described below.
Correct: 01->210B means BG1's patterns will be read from word address $1000 (byte address $2000).
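The same calculation for $210B, as a minimal Python sketch (the function name is hypothetical). It assumes the usual layout where the low nibble is BG1's pattern base in steps of $1000 words:

```python
def bg1_pattern_base(nba_value):
    """Word address of BG1's tile patterns for a value written to $210B.

    The low nibble selects BG1's base in steps of $1000 words
    (the high nibble does the same for BG2). VRAM is 32K words,
    so the result wraps.
    """
    return ((nba_value & 0x0F) * 0x1000) & 0x7FFF


word = bg1_pattern_base(0x01)
print(hex(word), hex(word * 2))  # 0x1000 0x2000
```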
According to some documents, $2116 is described as "This register is used to set the initial address for a VRAM upload or download". So what exactly does it do?
Main system memory can be accessed directly via banks $7E/7F. VRAM, however, is tied to the PPU and not the CPU, so it cannot be accessed directly. Instead, you set the 'address' with 2116, and then write the data with 2118.
lda #$0400 ; assumes 16-bit accumulator (rep #$20)
sta $2116 ; set PPU Addr to word address 0400
lda #$FFEE
sta $2118 ; write the word 'FFEE' to VRAM
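If it helps, the address/data port pair can be pictured with a toy Python model. This is only a sketch: the class and method names are invented, and it assumes $2115 is set so that the address advances by one word after each full word write (real hardware has other increment modes).

```python
class VramPort:
    """Toy model of $2116/7 (address) and $2118/9 (data)."""

    def __init__(self):
        self.vram = [0] * 0x8000  # 32K words
        self.addr = 0

    def set_addr(self, word_addr):
        # Like a 16-bit store to $2116
        self.addr = word_addr & 0x7FFF

    def write_word(self, value):
        # Like a 16-bit store to $2118; the address then auto-increments
        self.vram[self.addr] = value & 0xFFFF
        self.addr = (self.addr + 1) & 0x7FFF


port = VramPort()
port.set_addr(0x0400)
port.write_word(0x0002)  # tile #02 -> first tilemap entry
port.write_word(0x0003)  # tile #03 -> next entry, no new set_addr needed
print(hex(port.addr))    # 0x402
```

The point is that $2116 only picks where the next write lands. It has no connection to $2107 or $210B, which only tell the PPU where to read from when it renders.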
Although the tile data appears at $2000 in VRAM, $2118 will actually write the data starting from $800.
So my understanding is that, whatever the value of $210B is, the actual graphic data will end up at whatever address $2116 specifies. Is my understanding right?
Maybe? You're phrasing this kind of weird.
2116 tells 2118 where to write to.
210B tells the PPU where to look for patterns.
This writes the tiles (graphic data) to VRAM. Since I use a 32x32 screen, there are 2048 bytes to write. But I checked VRAM and there's only #$400. I wonder why. My graphic data is also 2048 bytes.
That should be writing $800 bytes ($400 words). But maybe you aren't force-blanking, so you could be running out of VBlank time? For big chunks of drawing, you really should force blank by writing: