It's a long post, but these are some complicated questions.
Video Output:
For basic operation, video out is defined at the software level. The VI (Video Interface) registers control the output mode, timing, and video buffer locations. Very few companies implemented anything outside the library-defined output modes (there are ~64 in total, and you can find them in the online documentation nowadays, so they aren't worth listing), but some were larger; ~1200 pixels is around the widest I've seen. The hardware is very generic and can do some surprising output types, like greyscale. Since all this is software-defined, modes can be freely changed during gameplay, even switching between NTSC/PAL/MPAL/FPAL.
The greatest limitation on video modes is the size and number of video buffers. These are DMA'd from RDRAM, so the larger and more numerous the buffers, the less memory you have for other purposes. Depending on the output type you may require a minimum number; 16-bit color requires at least two, but 32-bit color can use as few as one. Effectively these buffers are nothing but an array of color values, and they can double as an image for, say, blur effects or instant replays.
Output resolution does have some hardware-implied cap due to burst sizes plus display time, but you'll have to ask somebody like Marshallh for the specifics. He's currently building an add-on HDMI adapter and would be your best bet for a question like that.
Cart Domain Sizes:
The maximum cart size before you have to remap hardware addresses is double what was used commercially--128MB. Addressable media, such as the HD/SD cards used in backup devices, don't have such a limit. Their "cart size" limitation is defined by the on-board memory dedicated to holding an image. That, obviously, matches the normal cart limit, but it isn't like you can't pull more data off the thing at a software level and stick it someplace. Basically, bank-swapping on the N64.
Other cart domains exist as well. The 64DD IPL falls in a given range, as does SRAM/FLASH. FLASH is an interesting case, since there is a blob memory controller on-board that handles access. Technically, it could be similiarly-large as cartridges supposing you set up access protocol to address within this.
Framerate:
This is the most complicated and game-specific subject there is. Things that influence fps:
- Actual game code.
- RSP microcode in use, if any.
- Audio library in use, if any.
- Amount of images and geometry being processed.
- Degree of processing required (features being used).
The most important factor is the game itself. Less code running less of the time means fewer cycles spent on code, so you can update more often. Quality of code also matters heavily. You don't want to spend all your time in error handling or long, stupid loops. By nature, then, certain genres of games have advantages over others.
There are two ways graphics are generated. Directly writing to the screen buffer is one, and yes, this is used outside of demos. Typical N64 games were 3D though, and even the ones that weren't found it simpler to draw textured triangles. With maybe 1-2 exceptions, geometry wasn't written low-level, for the sake of storage size and ease of manipulation. Instead, games used some ultra-fast processing via the RSP.
The RSP is a programmable signal processor that processes data before it is handled by the rest of the RCP--the display and audio output side. It can queue two tasks at once, and besides AV work it can also be used as a general computing tool. It has its own memory as well, and a large part of the texture limit comes from this little guy here. (I'll mention the texture limit thing later though, since that doesn't work the way most think.) Being a signal processor, it has no error checking and, as a result, is lightning fast.
Since the RSP is programmable, setting a task for it starts by sending it code to use for processing data. We use the term "microcode" for this, sometimes shortened to "ucode". Commercial titles usually use several microcodes each and switch between them. The boot sequence technically uses microcode as well.
From a practical perspective, all audio and video tasks are routed through the RSP. Your typical bottleneck will be here, and it's directly related to the complexity of the tasks sent, the complexity of the code being used, and the time required for completion. As time went on, the official libraries were altered to break tasks up into smaller parts, especially within the audio libraries. Some of the earliest titles would "block" until all sound in a soundbank was processed before allowing another task and sending its data to the AI. This wasn't a problem later on.
Video tasks are the killer. There are more video microcodes than you'd think, and many specialize for certain tasks by simplifying horrifying operations. The earliest versions had the best visual output--compare the metal surfaces in Mario or GoldenEye to later titles--but the cost was a large amount of processing. It's like the math used for firing a projectile. At its simplest it's 4th-grade algebra; at its worst you're computing directional correction for the unevenness of the surface you're on, wind speed, tumble of the projectile, etc. Later titles generally sacrificed quality for speed, switching in HQ ucode when it mattered more. Likewise, if they were never going to use certain features, they wouldn't bother to include them.
The two worst parts of video tasks are depth comparison and texture processing. This isn't an N64-exclusive problem either. First up is depth comparison. The Z-buffer is a nightmare, tied to the current combiner settings for all objects that are above, below, or intersecting the current piece of geometry being rendered. Besides simply rendering something, the hardware also has to compute how visible it might be and how to treat any overlap. That takes a lot of calculations, all costly. The more objects you need to do this with in a given scene, the more time it takes. As a result, the first step in many cases is omitting anything that can't be completely rendered, or omitting things over a certain distance away, or too close, etc. It's slow and painful.
Next worst is texture processing. The "standard texture types" people go on about are not how textures are used in-game. They are processed by the RSP into specific formats, then interlaced. Ever see indexed intensity-alpha as a type? It really only exists post-RSP. Some companies (Rare, Factor 5, and a few smaller names) would pre-interlace images to be read as-is (c16, for instance) to avoid that overhead. They would also provide mipmap images for specific distances. Images that are mirrored in one or more directions are also copied in-memory.
Besides formatting, later microcodes would not load entire images as they were. Instead, they would cut strips or blocks out of a larger image and use only that chunk at a time. That doesn't usually add much overhead though since it's basically DMA.
From there, textures are applied as each piece of geometry is drawn. The combiner mode tells how this application occurs. Textures are applied in the selected method and can be modified by the triangle's vertex color (precalculated lighting, in this era) or vertex normal. Typically this isn't too painful except when transparency, lighting, and depth get involved. At further distances, mipmaps are used instead of the base texture to ease this a little. I think there can be something like 7 or 9 levels of mipmaps; I don't recall off-hand.
The bottleneck occurs here though, when rendering textured geometry factoring in lighting, transparency, and depth. There is a good way and a fast way, but no good fast way. Each ucode handles this differently, focusing on a balance of ideal features while sacrificing others. Even overclocking isn't a cure-all. Simply put, you cannot get any game to exceed the rate it can process this portion of the ucode without replacing the ucode and sacrificing something along the line.
Hi-Res Texture Mods on Console:
I mentioned this in the bit above, but the texture limitation has to do with the amount of a given texture loaded at a time, not the physical size of a texture. You can (and commercial titles did) programmatically subdivide images into parts which are drawn separately. So, the real limitation has to do with the way you generate your model and texture it--or more how the generated display list does this for you.
So, does that mean you can use a typical hi-res texture on console via hackery? It's more complicated than that ;*) Should you use the typical hi-res texture on console?
The best any given image can be rendered is pixel-perfect: 1 texture pixel to 1 displayed pixel. For these purposes we'll assume a texture applied to a surface at 100% scale--no perspective shifts or super/sub sizing. The typical mapping values allow a certain degree of fudge room, so you can have a depth of roughly 16 or so relative to the mapped coordinates under most rules. Beyond that, further dpi will be excluded or merged. If that occurs, or you map a larger image to the same space, the hardware will either drop the additional depth or refer to a mipmap. So, you lose any gain.
This works with HLE emulators though, since they apply a texture at your computer's dpi level, not that of the system video output. The maximum displayable depth will be higher. However, from within the system there is a display depth bottleneck--your video output.
That doesn't mean you can't make improvements. Clarity is the most important, and you can increase it by using multiple textures over the same surface. You're still working within the same constraints, but you're working closer to that displayable system dpi limit. Functionally, a 2x or 4x size jump is the most that is practical.
Also, algorithms for scaling images have advanced some in the meantime. We have the capacity to create better small-scale images from a larger source than was available at the time.
As a real-world example, the font used in AKI fighting titles is a 1-bit font at a large size; 32x32 or 48x48 (I don't remember off-hand). They do rely on hardware post-processing, but this is mitigated by certain aspects of the font writer that are somewhat complicated. As a result, the displayed font is much clearer than in other games, which rely on a smaller ia4 or ia8 source, and it scales much better. Ironically, it is rendered worse than the others by HLE graphics plugins, since they read the source image and skip most post-processing.