The game is further along in these steps than MMX1-3 were. The font IS loaded directly from ROM starting at C00000 (literally right at the start of the game). It's a 16x16 font that gets pulled from ROM then stored directly into VRAM. I've found the base area where it gets stored in VRAM, how it loads the graphics (I believe), and all that fun stuff. There's even a fun thing where, after a certain byte is hit, it jumps 180 for a new line and such.
But now that the basis of this is already there (it's not fixed width per se), what would be the next step toward making this into a VWF?
When you say the font is loaded directly from the ROM, you mean that when each character is printed it is loaded from ROM and copied directly to the proper tile in VRAM? (Instead of loading the font to VRAM and printing chars by altering the tilemap.) I'll assume this is the case. And btw, if every character is the same width, it is fixed-width. What I was talking about before was a fixed-width font, but it was about changing the routine so that it'd be easier to modify into a VWF, not changing from __ to a fixed-width. If you're starting with something similar to what I described, good, that should be easier to work with. Although, it being a 16x16 font will add slightly more complexity.
Here are the next few steps I would do:
1a) Relocate the section of code that copies data to VRAM. Modify the original code so that it calls this routine to copy the data. (Note: for a 16x16 font, this routine should draw half of the character - an 8 px wide strip. So you will need to call the routine once for the first half and once for the second half. You'll see why this is in a bit.)
1b) Instead of copying directly from the ROM, copy the data from the ROM to some unused RAM area. Then use a call to the routine from 1a to copy from RAM to VRAM. After this step, you'll have code that copies the left half of the character from ROM->RAM->VRAM, then does the same with the right half.
2) Modify the code so that it draws each half twice. For this, you'll need to modify the part of the code that increments the destination tile number. This is why we relocated the code in 1a - it's easy to add another call. I made a diagram to show why you do this:

3) The first time the character (or half of character) is drawn, shift each row of pixels right by X. (On this step I usually leave X as a hardcoded value, like 3 or 4, so it's obvious if it works.) The second time each half is drawn, shift left by 8-X. Shifting should be done on the copy of the data in RAM (from step 1b), so it can be done outside of vblank. Only the copy to VRAM must be done in vblank.
(Note: I'm not sure exactly how bitshifting works on SNES, but you'll need to pay attention to the bit depth of the graphics. If it were 1bpp, an 8 pixel row would be one byte, and you could just shift right once per pixel you wanted to shift. But if it were 2bpp, you'd have two bytes per row. SNES tiles are planar, so each of those bytes is one bitplane covering all 8 pixels: shift both bytes by the same amount, and keep the bits that fall off the end of each byte, since they are the overflow that goes into the same bitplane of the next tile. I can post more details on the shifting if you have trouble.)
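To make the shifting in step 3 concrete, here is a rough sketch in Python rather than 65816 assembly. It assumes a row is held as a list of bitplane bytes (one byte per plane, bit 7 being the leftmost pixel), and the name `shift_row` is made up for illustration, not taken from the game's code.

```python
def shift_row(planes, x):
    """Shift one 8-pixel row right by x pixels (0..7).

    planes: one byte per bitplane for this row (assumed planar layout,
            bit 7 = leftmost pixel).
    Returns (stay, spill):
      stay  - the bits that remain in the current tile (row >> x)
      spill - the bits pushed out, positioned for the next tile
              (row << (8 - x)), i.e. the "second draw" of each half.
    """
    if not 0 <= x <= 7:
        raise ValueError("x must be between 0 and 7")
    stay = [(p >> x) & 0xFF for p in planes]
    spill = [(p << (8 - x)) & 0xFF for p in planes]
    return stay, spill
```

With X = 0 the spill comes out as all zero bits, which matches the idea that an unshifted character can simply be drawn normally.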
At this point you should have something similar to the diagram, but you'll only see the second half of the character, because we still have it overwriting the previously drawn data.
4) Now, we'll make it combine the two halves. The first time the character is drawn, notice how it is already shifted over. In reality, there would be another half of the previous character there which we don't want to overwrite. So instead of just copying the character, you should load the data that is already written to the tile (or just keep this in RAM) and OR in the new, shifted-over data. A single OR is all you need if the BG color is represented by zero bits only. If it's something else, you'll need to do some masking, which I can post about more if needed.
The second time each half is drawn, you'll want to increment the tile number, since these halves represent the "overflow" - the pixels that have spilled over into the next tile. Again, OR in the data.
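As a rough illustration of step 4, the sketch below ORs a shifted 8 px strip into a tile buffer kept in RAM. It is Python standing in for the real routine, uses 1bpp rows to keep it short, and assumes the background is all zero bits; `draw_strip` and the buffer layout are hypothetical.

```python
def draw_strip(tiles, tile_index, strip, x):
    """OR an 8-pixel-wide strip into a tile buffer at pixel offset x (0..7).

    tiles: list of tiles, each a list of row bytes (1bpp for brevity,
           bit 7 = leftmost pixel). Assumed to live in RAM, not VRAM.
    strip: row bytes of the half-character being drawn.
    The first OR is the "first draw" (shifted right by x); the second OR
    is the overflow that lands in the following tile.
    """
    for row, bits in enumerate(strip):
        tiles[tile_index][row] |= (bits >> x) & 0xFF
        tiles[tile_index + 1][row] |= (bits << (8 - x)) & 0xFF
```

Because the data is ORed in rather than copied, whatever the previous character left in the shared tile survives. The finished buffer would then be copied to VRAM during vblank.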
At this point you should have the font printing normally, except everything is shifted over by X pixels (4, if that's the value you hardcoded).
5) Now, replace the hardcoded X value from before with a calculated value (I would just devote some RAM byte for this use). You'll want to initialize the value to 00 every time a new string is going to be printed, or when there is a line break. Then after each character is printed, you'll want to calculate the number of pixels left over in the last tile it touched, since that determines the shift for the next character. For example, if the first character was 11 wide, you'd have 5 left over. Then if the next char was also 11 wide, the first 5 px of it would be drawn in those 5, the remaining 6 pixels in the next tile, and you'd have 2 left over. [So I think the formula for this is: (new leftover) = ((previous leftover) - (char width)) mod 8, taking the non-negative result, and the shift X for the next character is (8 - leftover) mod 8. Equivalently: (new X) = ((previous X) + (char width)) mod 8.]
To get the width of the character, make a "width table" somewhere in the ROM, and just use the value of the character being drawn to look up the width.
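Here is a small sketch of the bookkeeping in step 5, again in Python for readability. It tracks the pixel offset X within the current tile (the leftover is just the remainder of that tile), and `place_chars` and the width table contents are made up for the example, not real values from the game.

```python
def place_chars(codes, width_table):
    """Work out where each character starts.

    codes: character values to print.
    width_table: pixel width per character value (hypothetical data).
    Returns (placements, leftover), where placements holds a
    (tile, x) pair per character: the tile it starts in and the shift
    X to use for it. leftover is the free pixels in the last tile used.
    """
    tile, x = 0, 0  # reset like this for each new string or line break
    placements = []
    for code in codes:
        placements.append((tile, x))
        total = x + width_table[code]
        tile += total // 8  # whole tiles consumed
        x = total % 8       # offset inside the next tile to draw into
    leftover = (8 - x) % 8
    return placements, leftover
```

Running it on two 11 px wide characters reproduces the example above: the first leaves 5 pixels free, and after the second there are 2 left over.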
6) Now you should have a legitimate VWF, but there may be some bugs you need to handle. For example, if a character doesn't need to be shifted at all (i.e. X is zero) you can just draw it normally. There might be some other edge cases you forgot to handle - for example, line breaks, or starting a new page of dialog.
---
Hopefully that helped, I can be more detailed if any point didn't make sense.