Well first there's the VRAM (video memory).
Each graphic is composed of smaller square puzzle pieces called tiles (8x8 pixels each).
There are the bigger graphics, the backgrounds, and the smaller ones that move around the screen, the sprites.
Tile data for Link and the HUD is loaded at all times.
Environment tiles, and NPC tiles in the case of towns, are loaded once* when entering a new area. (*: stuff like animated water may differ)
Enemy tiles and weapon tiles aren't all loaded at once, just the necessary ones at that instant.
Fonts, menu tiles, and cutscene graphics are not loaded at all unless absolutely necessary.
In LA/FTFTBT the whole font is loaded, but in OOA/OOS only the characters needed for the currently displayed sentence are loaded.
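To make the per-sentence idea concrete, here is a rough Python sketch. It is not the games' actual code: the function names, the slot numbering, and the choice to skip spaces are all assumptions made up for illustration.

```python
def glyphs_to_load(sentence):
    """Return the distinct non-space characters of a sentence, in first-seen order."""
    seen = []
    for ch in sentence:
        if ch != " " and ch not in seen:
            seen.append(ch)
    return seen


def assign_tile_slots(sentence, first_slot=0):
    """Map each needed character to a tile slot (hypothetical numbering)."""
    return {ch: first_slot + i for i, ch in enumerate(glyphs_to_load(sentence))}


slots = assign_tile_slots("IT'S A SECRET")
print(len(slots), "glyph tiles loaded instead of a whole font")
```

A repeated letter costs nothing extra here, which is the whole point: the tile cost depends on the distinct characters on screen, not on the size of the font.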
If they didn't do this, the graphics simply wouldn't fit in the limited VRAM they had (8 KB on the GB, doubled to 16 KB on the GBC).
So they only load what's necessary.
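For a sense of how tight that is, here's some back-of-the-envelope arithmetic in Python. It assumes the standard Game Boy layout: 8x8 tiles at 2 bits per pixel, 8 KB of VRAM per bank, and 2 KB of each bank taken up by the two 32x32 tile maps. The function name is just made up for the example.

```python
TILE_BYTES = 8 * 8 * 2 // 8    # 8x8 pixels at 2 bits per pixel = 16 bytes
VRAM_BANK_BYTES = 8 * 1024     # one VRAM bank (the GB has 1, the GBC has 2)
TILE_MAP_BYTES = 2 * 32 * 32   # two 32x32 maps, 1 byte per entry


def tile_capacity(banks):
    """How many tiles fit in the part of VRAM left over for tile data."""
    return banks * (VRAM_BANK_BYTES - TILE_MAP_BYTES) // TILE_BYTES


print("GB :", tile_capacity(1), "tiles")
print("GBC:", tile_capacity(2), "tiles")
```

A few hundred tiles have to cover the player, the HUD, the environment, enemies, and text all at once, so loading everything up front was never an option.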
Even better, some games keep lists of enemies and use them to decide which enemy graphics to load: each zone is tied to a specific "enemy set".
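An "enemy set" lookup could look something like this. It's a sketch only: the zone names, set names, and enemy names are placeholders, not data from any real game.

```python
# Hypothetical tables: which enemies belong to each set, and which set each zone uses.
ENEMY_SETS = {
    "set_field": ["octorok", "moblin"],
    "set_cave": ["keese", "moblin"],
}
ZONE_ENEMY_SET = {
    "field": "set_field",
    "cave": "set_cave",
}


def enemy_tiles_to_load(zone, already_loaded):
    """Return the enemies whose tiles still need loading for this zone."""
    wanted = ENEMY_SETS[ZONE_ENEMY_SET[zone]]
    return [enemy for enemy in wanted if enemy not in already_loaded]


# Walking from the field into the cave: moblin tiles are already in VRAM.
print(enemy_tiles_to_load("cave", already_loaded=["octorok", "moblin"]))
```

With a table like this, the game never has to work out enemy graphics room by room; it just looks up the zone's set and loads whatever is missing.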