Just thought I'd post an update, even though it's kind of a necropost.
I've been too busy with work to spend much time on this (I basically haven't touched it in a few months), but I have made progress since my previous post. I've written a simple program that scans the files in the disc image that contain the text, finds the pointer tables, calculates all the text positions and other data, and extracts the text along with its related memory locations, addresses, and pointers. It has saved me perhaps hundreds of hours of copying, pasting, and typing. Luckily, the text wasn't compressed or anything tricky. Some of the pointers are two bytes and others are three bytes, which threw me for a loop at first, but it's all working well enough now.
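As a rough illustration of the pointer-table step, here is a minimal Python sketch. It is not the actual program: the function names are made up for this example, and it assumes little-endian pointers, a known table offset and entry count, and strings terminated by a single zero byte. The real game data may differ on any of those points.

```python
def read_pointers(data: bytes, table_offset: int, count: int,
                  width: int, base: int = 0) -> list[int]:
    """Read `count` pointers of `width` bytes (2 or 3) from a table.

    Assumption: pointers are little-endian and relative to `base`.
    """
    pointers = []
    for i in range(count):
        start = table_offset + i * width
        raw = data[start:start + width]
        if len(raw) != width:
            raise ValueError("pointer table runs past the end of the data")
        pointers.append(int.from_bytes(raw, "little") + base)
    return pointers


def extract_strings(data: bytes, pointers: list[int],
                    terminator: bytes = b"\x00") -> list[tuple[int, bytes]]:
    """Return (offset, raw_bytes) for each pointer.

    Assumption: each string ends at the first terminator byte.
    The bytes are left undecoded because the text encoding is game-specific.
    """
    results = []
    for ptr in pointers:
        end = data.find(terminator, ptr)
        if end == -1:  # no terminator found: take everything to the end
            end = len(data)
        results.append((ptr, data[ptr:end]))
    return results
```

The same two functions cover both pointer sizes: pass `width=2` for the two-byte tables and `width=3` for the three-byte ones.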
The program processes all of it and stores it in a normalized relational database. Once I get the collaborative translation system online, I'll probably post more details about it, and I'll no doubt be looking for people to help out with the translation.
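To give an idea of what "normalized" could mean here, the sketch below uses Python's built-in sqlite3 module. The table and column names are hypothetical, chosen only for this example, and SQLite itself is an assumption; the actual schema and database engine may look quite different.

```python
import sqlite3

# Hypothetical schema: one row per file, per pointer table, and per text entry.
SCHEMA = """
CREATE TABLE source_file (
    id   INTEGER PRIMARY KEY,
    path TEXT NOT NULL UNIQUE
);
CREATE TABLE pointer_table (
    id            INTEGER PRIMARY KEY,
    file_id       INTEGER NOT NULL REFERENCES source_file(id),
    table_offset  INTEGER NOT NULL,
    pointer_width INTEGER NOT NULL CHECK (pointer_width IN (2, 3))
);
CREATE TABLE text_entry (
    id          INTEGER PRIMARY KEY,
    table_id    INTEGER NOT NULL REFERENCES pointer_table(id),
    entry_index INTEGER NOT NULL,
    text_offset INTEGER NOT NULL,
    raw_bytes   BLOB NOT NULL,
    UNIQUE (table_id, entry_index)
);
"""


def create_database(path: str = ":memory:") -> sqlite3.Connection:
    """Open a database and create the (hypothetical) schema."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def store_entries(conn, file_path, table_offset, pointer_width, entries):
    """Store one pointer table and its (offset, raw_bytes) entries."""
    conn.execute("INSERT OR IGNORE INTO source_file (path) VALUES (?)",
                 (file_path,))
    file_id = conn.execute("SELECT id FROM source_file WHERE path = ?",
                           (file_path,)).fetchone()[0]
    table_id = conn.execute(
        "INSERT INTO pointer_table (file_id, table_offset, pointer_width) "
        "VALUES (?, ?, ?)",
        (file_id, table_offset, pointer_width),
    ).lastrowid
    conn.executemany(
        "INSERT INTO text_entry (table_id, entry_index, text_offset, raw_bytes) "
        "VALUES (?, ?, ?, ?)",
        [(table_id, i, off, raw) for i, (off, raw) in enumerate(entries)],
    )
    conn.commit()
    return table_id
```

Keeping the raw bytes and the original offset for every entry means a translation can later be traced back to the exact pointer that has to be updated.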
As I said before, my goal is to create a collaborative translation system. It will allow team members to tag text snippets, propose and vote on translations, upload screenshots, etc. The idea is to create something structured enough that many people can contribute, while keeping it all maintainable and making it easy for me to eventually get the translations put back into the disc image in some automated way.
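The tag, propose, and vote workflow could be modeled along these lines. This is only a sketch of one possible design, not the planned system: the class names are invented for the example, and the rules (one vote per person per snippet, ties going to the earliest proposal) are assumptions.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Proposal:
    author: str
    text: str
    voters: set[str] = field(default_factory=set)


@dataclass
class Snippet:
    snippet_id: int
    source_text: str
    tags: set[str] = field(default_factory=set)
    proposals: list[Proposal] = field(default_factory=list)

    def propose(self, author: str, text: str) -> Proposal:
        """Add a candidate translation for this snippet."""
        proposal = Proposal(author, text)
        self.proposals.append(proposal)
        return proposal

    def vote(self, voter: str, proposal: Proposal) -> None:
        """Record a vote; assumed rule: one vote per voter per snippet."""
        for other in self.proposals:
            other.voters.discard(voter)
        proposal.voters.add(voter)

    def leading(self) -> Optional[Proposal]:
        """Return the proposal with the most votes (earliest wins ties)."""
        if not self.proposals:
            return None
        return max(self.proposals, key=lambda p: len(p.voters))
```

An automated reinsertion step could then take the `leading()` proposal for each snippet as the text to write back into the disc image.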