So I started up a translation project for the original Medarot GB game (because it's a prequel that we never got here in the States), and after asking Danke for some help I was led here!
By searching around and finding random tutorials, I managed to replace all the tiles with English ones and make an English table. Then my problem was that I didn't know the story or the dialogue anyway, so having it in gibberish was even worse than having it in Japanese (which I also don't speak).
So my new plan was to:
1. Create a Japanese character table
2. Use said table + a Python script I wrote to extract any text in the ROM, along with the address and size of each string (lots of gibberish, but I got all the dialogue text too, at least judging by the intro and some picture matching)
3. Get it translated (somehow, one of my friends said he'd help)
4. Use the English table I made to shove all the new text back in the game according to the addresses and sizes
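For what it's worth, step 4 could be sketched in Python roughly like this. The table values, the padding byte, and the `write_text` helper are all made up for illustration (my real English table has different values), and it assumes each string can simply be overwritten in place and padded out to its original size.

```python
# Hypothetical English table: character -> byte value.
# Placeholder values for illustration only, not the real table.
ENGLISH_TABLE = {" ": 0x00, "H": 0x01, "e": 0x02, "l": 0x03, "o": 0x04}
PAD_BYTE = 0x00  # assumption: unused space is padded with the space character


def encode(text, table):
    """Convert text to bytes using a character -> byte table."""
    return bytes(table[ch] for ch in text)


def write_text(rom, address, size, text, table=ENGLISH_TABLE):
    """Overwrite `size` bytes at `address` with the encoded text, padded to fit."""
    data = encode(text, table)
    if len(data) > size:
        raise ValueError(
            f"text needs {len(data)} bytes but only {size} are available at {address:#x}"
        )
    padding = bytes([PAD_BYTE]) * (size - len(data))
    rom[address:address + size] = data + padding


# Tiny fake "ROM" so the sketch runs on its own.
rom = bytearray(16)
write_text(rom, 4, 8, "Hello")
print(rom.hex(" "))
```

The `ValueError` is the important part: it is exactly the "not enough space" problem, just reported per string instead of discovered in-game.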
I haven't gotten it fully translated yet, but I've already noticed huge issues with the amount of space I have for text. My understanding is that all dialogue is referenced from somewhere through a pointer, so I found the start of the intro text and looked for references to it. I did the whole address / 0x4000 -> bank number thing, then searched for <Bank Number><Last 2 bytes in little-endian>, making sure to add 0x80 to or subtract 0x40 from the last byte so it lands in the banked address range. No luck so far with that.
Then I decided that maybe I'd be able to use BGB to set up access breakpoints on the location I wanted, but no luck there either (does BGB's debugger just show me dynamic memory? I'm assuming so).
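The address arithmetic above can be written out as a small Python sketch. The function names are my own, and the bank-byte-then-pointer layout is only my guess at how the game might store references. The part I am sure of is that Game Boy ROM banks are 0x4000 bytes, with bank 0 mapped at 0x0000-0x3FFF and the switchable bank at 0x4000-0x7FFF.

```python
BANK_SIZE = 0x4000  # Game Boy ROM banks are 16 KiB each


def rom_offset_to_pointer(offset):
    """Convert a ROM file offset into (bank number, CPU address)."""
    bank = offset // BANK_SIZE
    address = offset % BANK_SIZE
    if bank > 0:
        # Switchable banks are mapped at 0x4000-0x7FFF.
        address += BANK_SIZE
    return bank, address


def pointer_bytes(offset):
    """Bank byte followed by the 2-byte address in little-endian.

    Assumption: this 3-byte layout is a guess, not a confirmed format.
    """
    bank, address = rom_offset_to_pointer(offset)
    return bytes([bank]) + address.to_bytes(2, "little")


bank, address = rom_offset_to_pointer(0x5A849)
print(hex(bank), hex(address))        # 0x16 0x6849
print(pointer_bytes(0x5A849).hex())   # 164968
```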
For reference, the start of the introductory text I searched for was 0x5A849 (dec 370761), and the first text is:
76 4F 00 C0 E2 CC C7 D0 00 4B 23 C9 4C
こんにちは (NAME)
4B 23 C9 is (NAME) and 00 is space. 76 4F seems to indicate the start of dialogue, and 4C seems to indicate the end of the current chatbox (as in, hit A to continue).
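As a sanity check, here is a minimal Python sketch that decodes that sample. The table only contains the six entries that can be inferred from this one string (assuming one byte per kana, in order), and the (NAME) sequence is my guess from above. Anything it doesn't recognize is printed as a hex value in brackets.

```python
# Partial table inferred from the sample string only; the real table is much larger.
TABLE = {0x00: " ", 0xC0: "こ", 0xE2: "ん", 0xCC: "に", 0xC7: "ち", 0xD0: "は"}
NAME_CODE = bytes([0x4B, 0x23, 0xC9])  # guessed control sequence for (NAME)


def decode(data):
    """Decode bytes to text, showing unknown bytes as [XX]."""
    out = []
    i = 0
    while i < len(data):
        if data[i:i + 3] == NAME_CODE:
            out.append("(NAME)")
            i += 3
        elif data[i] in TABLE:
            out.append(TABLE[data[i]])
            i += 1
        else:
            out.append(f"[{data[i]:02X}]")
            i += 1
    return "".join(out)


SAMPLE = bytes.fromhex("76 4F 00 C0 E2 CC C7 D0 00 4B 23 C9 4C")
print(decode(SAMPLE))  # [76][4F] こんにちは (NAME)[4C]
```

Leaving control codes visible as `[XX]` makes it easier to compare strings side by side and spot which bytes recur at the start or end.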
The byte sequence I searched for was 164968 (bank 16, then 6849 in little-endian), plus the addresses a couple of bytes before and after it (164B68, 164868, etc.).
So, I was hoping maybe someone could give me some pointers (heh) so I can possibly figure out where to look, maybe even the structure. If there's a clear structure to dialogue text, then it might be possible for me to write a translator-helper program with a pretty GUI.
Edit: the 76 is probably just a coincidence. I did some other comparisons and found a couple of strings that didn't have the 76 4F (just the 4F), so ignore that! Also, 4F probably represents the end of dialogue rather than the start. That'd make more sense, given what I've found.
Edit2:
Huh, so I guess posting here magically solved my issue. I did a little more searching and realized that at least the intro text pointer is a 2-byte pointer (I searched starting from the C0). So now I guess my question is: is there any way to turn this into a 3-byte pointer so I can point it at the huge amount of spare space at the end of the ROM?
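For anyone following along, the 2-byte search can be sketched in Python roughly like this. `find_pointer_refs` is my own helper name, and restricting the search to the same bank as the text is an assumption (a 2-byte pointer carries no bank number, so the reference could also live in bank 0). The demo uses a fake all-zero ROM with one pointer planted in it, not real game data.

```python
BANK_SIZE = 0x4000  # Game Boy ROM banks are 16 KiB each


def find_pointer_refs(rom, target_offset, same_bank_only=True):
    """Return file offsets where a 2-byte little-endian pointer to target_offset appears."""
    bank = target_offset // BANK_SIZE
    address = target_offset % BANK_SIZE + (BANK_SIZE if bank else 0)
    needle = address.to_bytes(2, "little")

    if same_bank_only:
        start, end = bank * BANK_SIZE, (bank + 1) * BANK_SIZE
    else:
        start, end = 0, len(rom)

    hits = []
    pos = rom.find(needle, start, end)
    while pos != -1:
        hits.append(pos)
        pos = rom.find(needle, pos + 1, end)
    return hits


# Fake ROM: zeros, with one pointer to 0x5A84C (the C0 byte) planted in bank 0x16.
fake_rom = bytearray(0x5C000)
fake_rom[0x58010:0x58012] = bytes([0x4C, 0x68])
print([hex(h) for h in find_pointer_refs(fake_rom, 0x5A84C)])
```

On a real ROM a 2-byte pattern will produce false positives, so each hit still needs checking by changing it and seeing whether the text in-game moves.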