Gimmick Land (GBC) English Translation

Started by TomatoAdventure_Fan, May 15, 2022, 07:47:04 PM

TomatoAdventure_Fan

As you know, Tomato Adventure has gotten an unofficial English translation. I was interested in getting Gimmick Land, the GBC prototype for this game, translated into English as well.

The game is practically the same as Tomato Adventure, and we already have an English translation for the final game we can borrow from. So far, I've been doing some work altering menu graphics, and I've located the graphics for the Japanese text. However, this is where I've hit a roadblock. I'm fairly new to romhacking, and was stuck trying to alter the hexadecimal code. Does anyone know how to locate the script within the hex editor? All I see is a bunch of nonsense symbols. I know you have to make a table, but I'm unsure of how I would go about doing that.

Jorpho

Quote from: TomatoAdventure_Fan on May 15, 2022, 07:47:04 PM
I know you have to make a table, but I'm unsure of how I would go about doing that.

I suggest you start with:
The Newbie Package of REQUIRED Material

ROMHacking.net FAQ: You ask, we answer!
ROMHacking.net Getting Started Section: Newbies Go HERE!
ROMHacking.net Documents Section!
How to ask questions the smart way.
On the Essence of ROM Hacking
Talk with experienced people in our IRC chat and ask specific questions there.

Next, I suggest you Google around to see if anyone else has already gotten started on this, as I can't imagine you're the first person to have such an interest.
This signature is an illusion and is a trap devisut by Satan. Go ahead dauntlessly! Make rapid progres!

Bunkai

Curiosity leads to knowledge,
be curious.

TomatoAdventure_Fan

Quote from: Bunkai on May 16, 2022, 04:55:37 AM
Maybe this can be useful :)

https://www.romhacking.net/forum/index.php?topic=34726

Best of luck with your project

Thank you for your well wishes! I read up on this and the other post and I think that I've made a bit of progress. However, I'm still confused about some things.

First of all, I am using Tile Layer Pro, Windhex32, and BGB for reference. I looked around the provided materials for the English translation for Tomato Adventure, and found a table that seems to match up with the characters I found by using Tile Layer Pro. I converted this into a table using Windhex32, and then applied it. At first it looked like gibberish, but then I selected "view text data as Japanese" and I was able to see the characters.

Here's the confusing part. I tried looking up some things in the Windhex32 from the Japanese script that was ripped from Tomato Adventure, and I didn't always find a match. When I was able to find a match, there were some Japanese symbols in the line, but it never matched up, and there were random English letters and other symbols dispersed throughout it.

Did I do something wrong when I made the table? Is the script not exactly the same for Gimmick Land and Tomato Adventure? Am I simply not searching right? Any help is appreciated. I think that I'll be able to make some good progress once I finally figure out how to start translating. Thank you for the help.

One final thing, here's the Github link that I'm using as a reference for tables and the script.
https://github.com/unknownbrackets/tomatotrans

Jorpho

Quote from: TomatoAdventure_Fan on May 16, 2022, 02:53:18 PM
I tried looking up some things in the Windhex32 from the Japanese script that was ripped from Tomato Adventure, and I didn't always find a match.
To be clear, were you able to see any recognizable Japanese words at all? Because any random table file is inevitably going to show some random Japanese characters in random data.

Quote:
When I was able to find a match, there were some Japanese symbols in the line, but it never matched up, and there were random English letters and other symbols dispersed throughout it.
Would you not expect this?  Any game is inevitably going to have "control codes" to do things like pause the text or apply special effects or show a character portrait and so on. You will probably see much the same if you try to view the unmodified Tomato Adventure ROM with the appropriate table file. There is no easy way to figure out what these do except through careful analysis and trial-and-error.

Or the text might have some kind of compression applied, as compressed text will sometimes contain parts that are still semi-readable.

Quote:
Is the script not exactly the same for Gimmick Land and Tomato Adventure?
This is a baffling question. Are you expecting that someone has already dumped the script for Gimmick Land and compared it line-by-line to the Tomato Adventure script?

[Unknown]

The script probably has at least minor differences, since it presumably went through another round of QA.  A comparison of the text could be interesting to people as it might point to decisions made during development.

Anyway, assuming the text is the same as Tomato Adventure, there will indeed be control codes.  These are documented here:
https://github.com/unknownbrackets/tomatotrans/blob/master/notes/SCRIPT.md

For example, the sequence 0xFF 0x05 is quite common, and indicates a hard line break.  If you're using a table without the control codes, you'd see this as " E" or something.

Many of these tools (I haven't used one in possibly 20 years... when did Thingy32 come out?) support multibyte control codes in the table.  You just need to expand the table to include them.  The extra control codes you need, if they match Tomato Adventure, would start with either 0xFE or 0xFF.

FE codes are kanji and symbols.  Table here:
https://github.com/unknownbrackets/tomatotrans/blob/master/ta_kanji.txt

FF codes are logical control codes, which change text formatting or behavior.  They're documented as I linked above:
https://github.com/unknownbrackets/tomatotrans/blob/master/notes/SCRIPT.md

Some of these take "arguments."  For example: [PAUSE XX] / [FF 11 XX] is a pattern that starts with a byte 0xFF, then a byte 0x11, and then any other byte.  The third byte indicates how long the pause should be for.
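As a rough illustration of what a table with multibyte control codes does, here is a minimal Python sketch. Only FF 05 (line break) and FF 11 XX (pause) come from this thread; the single-byte character values are hypothetical placeholders, and treating every FE code as exactly two bytes and every unknown FF code as having no arguments are assumptions to verify against SCRIPT.md.

```python
# Hypothetical single-byte entries: the real values come from the table file,
# not from this sketch.
CHAR_TABLE = {0x01: "あ", 0x02: "い", 0x03: "う"}

# Control codes named in this thread: (display name, number of argument bytes).
FF_CODES = {
    0x05: ("BR", 0),     # FF 05: hard line break
    0x11: ("PAUSE", 1),  # FF 11 XX: pause, XX is the duration
}


def decode(data: bytes) -> str:
    """Turn raw script bytes into text with bracketed control codes."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b == 0xFF and i + 1 < len(data):
            code = data[i + 1]
            # Assumption: FF codes missing from FF_CODES take no arguments.
            name, argc = FF_CODES.get(code, (f"FF {code:02X}", 0))
            args = data[i + 2 : i + 2 + argc]
            out.append("[" + " ".join([name] + [f"{a:02X}" for a in args]) + "]")
            i += 2 + argc
        elif b == 0xFE and i + 1 < len(data):
            # Assumption: FE plus one byte selects a kanji or symbol.
            out.append(f"[FE {data[i + 1]:02X}]")
            i += 2
        else:
            # Unmapped bytes are shown as hex so they stand out.
            out.append(CHAR_TABLE.get(b, f"[{b:02X}]"))
            i += 1
    return "".join(out)


sample = bytes([0x01, 0x02, 0xFF, 0x05, 0x03, 0xFF, 0x11, 0x1E])
print(decode(sample))  # あい[BR]う[PAUSE 1E]
```

Without the FF entries, the same bytes would decode as stray characters in the middle of the line, which is the " E" effect described above.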

You could search this file for all the possible values if you're using a simple table based extraction:
https://github.com/unknownbrackets/tomatotrans/blob/master/ta_script_jpn.txt

And then see if that helps things line up better.

It is possible that some of the control codes, text, etc. differ.  That said, I anticipate that the dialog text will be the easier part of this - so it's good you're starting with that.

-[Unknown]

TomatoAdventure_Fan

Quote from: [Unknown] on May 16, 2022, 11:20:31 PM
It is possible that some of the control codes, text, etc. differ.  That said, I anticipate that the dialog text will be the easier part of this - so it's good you're starting with that.

I've made some good progress so far with learning how to do all of this stuff. From what I've seen so far everything (commands, script placement, etc.) seems to match up with the Tomato Adventure Github stuff. I've been messing around with WindHex, and I was able to translate several parts of the opening cutscene.

I have a few questions though. Whenever I try and use the Kana Search, it almost always fails to bring up a string (like 80% of the time), even though I know that exact line of dialogue is within the script somewhere. I would like to start by translating things area by area and follow how the game progresses. However, the script seems to be all over the place and doesn't seem to follow a specific order. I can't use the addresses from the Tomato Adventure script since they do not match up in terms of location. Any suggestions here?

Also, what would you recommend for a strategy here? Should I try and translate and play-test area by area, or should I beat the game first and make save states along the way at every location, and use those to jump around?

Thanks to everyone for their help so far.

[Unknown]

I'm not sure about the search.  I mainly used MadEdit for testing direct hex edits, and never had issues with its search (but I was always searching directly for hex, and had a quick script I could run to output the hex to search for.)  I'm not sure why your search wouldn't work, unless it's down to punctuation, codes, spaces, etc.
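A script like that could look roughly like the following Python sketch. It is not the script actually used for Tomato Adventure; the table entries are hypothetical placeholders and the ROM bytes are a stand-in for the real file.

```python
# Hypothetical table entries; the real ones come from the table file.
CHAR_TO_BYTE = {"あ": 0x01, "い": 0x02, "う": 0x03}


def encode(text: str) -> bytes:
    """Encode text with the table; raises KeyError on an unmapped character."""
    return bytes(CHAR_TO_BYTE[ch] for ch in text)


def find_all(rom: bytes, needle: bytes) -> list[int]:
    """Return every file offset where needle occurs."""
    hits = []
    pos = rom.find(needle)
    while pos != -1:
        hits.append(pos)
        pos = rom.find(needle, pos + 1)
    return hits


needle = encode("あい")
print(needle.hex(" "))  # hex string to paste into a hex editor's search

# Stand-in for the ROM contents. A real run would read the file instead,
# e.g. rom = open("rom.gbc", "rb").read()  (the file name is a placeholder).
rom = bytes([0x00, 0x01, 0x02, 0xFF, 0x05, 0x01, 0x02])
print(find_all(rom, needle))  # [1, 5]
```

Searching for a short fragment that sits between two control codes tends to be more reliable than searching for a whole line, since a line break or pause code in the middle of the line would otherwise break the match.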

There's some method to the madness as far as order, at least in Tomato Adventure.  I think they're somewhat grouped by map.  I think it really depends on the order in which their internal script files were built/compiled.

There's a scripting engine and the scripting engine has pointers to the actual text.  The script is referenced by map data, and there's a table (at least in the GBA image) of argument byte lengths per bytecode in the script.  Some of these script commands control animation (i.e. changing which direction a character faces), jump to another bytecode location, etc. - but others display text from a specific pointer.

The order of the text is pretty aligned to the order of the scripts and the references within them.  There's clearly some removed text, that never displays in the game, and isn't even referenced (I assume it was commented out, but the compiler wasn't quite smart enough to skip emitting text for it.)
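To make the idea of a bytecode stream with per-command argument lengths concrete, here is a small Python sketch. Every opcode, name, and length in it is invented for illustration; the real engine has its own table, and the 4-byte little-endian text pointer is an assumption based on how GBA ROM addresses (which start at 0x08000000) are usually stored.

```python
# Hypothetical opcodes and argument lengths, invented for illustration.
# The real engine stores its own table of argument byte lengths.
ARG_LENGTHS = {0x00: 0, 0x01: 1, 0x02: 2, 0x03: 4}
NAMES = {0x00: "END", 0x01: "FACE", 0x02: "JUMP", 0x03: "TEXT"}


def disassemble(code: bytes) -> list[tuple[int, str, bytes]]:
    """Split bytecode into (offset, name, argument bytes) entries."""
    ops = []
    i = 0
    while i < len(code):
        op = code[i]
        if op not in ARG_LENGTHS:
            raise ValueError(f"unknown opcode {op:#04x} at offset {i}")
        argc = ARG_LENGTHS[op]
        ops.append((i, NAMES[op], code[i + 1 : i + 1 + argc]))
        i += 1 + argc
    return ops


def text_pointers(code: bytes) -> list[int]:
    """Collect the little-endian pointer argument of every TEXT command."""
    return [
        int.from_bytes(args, "little")
        for _, name, args in disassemble(code)
        if name == "TEXT"
    ]


# FACE 02, TEXT -> 0x08641234, END
script = bytes([0x01, 0x02, 0x03, 0x34, 0x12, 0x64, 0x08, 0x00])
print([hex(p) for p in text_pointers(script)])  # ['0x8641234']
```

Walking the scripts this way gives the list of text pointers in script order, which is what makes it possible to spot text that nothing references.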

The way I tested this was:
1. Export all the text.
2. The translator translated all the text (while playing through the game in Japanese.)
3. Insert all the text (including adjusting pointers.)
4. Export a spreadsheet of all the Japanese and English text.
5. Play the game, reviewing the text and creating about 36 save states along the way.
6. Mark off strings as I validate their text wrapping and meaning is correct and follows the flow of conversation.
7. Dump a rough disassembly of the scripting language for strings I never encountered, and review the scripting logic to figure out how to encounter the text (this is how I learned of different dialog when you have 99 binkies.)
8. Mark off unreferenced text only after I didn't see it and can't find it referenced anywhere in the game code.
9. Repeat until exactly zero strings are left to mark off, using save states if necessary to hunt for them.

I accounted for every single string this way.

-[Unknown]

TomatoAdventure_Fan

Quote from: [Unknown] on May 18, 2022, 09:56:04 PM
The way I tested this was:
1. Export all the text.
2. The translator translated all the text (while playing through the game in Japanese.)
3. Insert all the text (including adjusting pointers.)
4. Export a spreadsheet of all the Japanese and English text.
5. Play the game, reviewing the text and creating about 36 save states along the way.
6. Mark off strings as I validate their text wrapping and meaning is correct and follows the flow of conversation.
7. Dump a rough disassembly of the scripting language for strings I never encountered, and review the scripting logic to figure out how to encounter the text (this is how I learned of different dialog when you have 99 binkies.)
8. Mark off unreferenced text only after I didn't see it and can't find it referenced anywhere in the game code.
9. Repeat until exactly zero strings are left to mark off, using save states if necessary to hunt for them.

I think this is roughly how I am going to try and go about doing things. I managed to figure out where the script started, so I am planning to work down from there. Through my playtesting so far, everything seems to work as long as you don't mess with the [END] command for each dialogue box. I am also quickly learning about the length limitations and how that affects the translation process.

For anyone that's interested in this project and stumbles across this thread, I just want to state that I don't expect this to be the absolute best romhack in the world, especially since it's my first project.

Will there be occasional weird spacing issues and line breaks within the dialogue? Yes. Will this keep every word from the original translation? No. Will I be able to effectively convey every message in the limited space that I have to work with? No.

However, I do intend to work towards a "finished" project in the sense that the script will be fully translated to the best of my ability, the dialogue will be comprehensible (even if there are abbreviations or conventional grammar rules broken), and of course the game will be playable from start to finish.

[Unknown]

Yes, if you've got a fixed width font and you're not updating pointers there are several tough things.

Even with Tomato Adventure, I had to get creative with a few strings to make it fit in the dialog box well, and that was after adding a variable-width font and solving character limits.

If you find yourself very immersed and want to try something harder, it should be possible to change the pointers.  This would allow you to make the strings as many characters long as you want.  The address of the start of each string is probably stored somewhere and changeable.

I'm not as familiar with the GBC, but I'm sure it uses banks.  Wherever you found the string, take the offset into the file and divide it by 16384.  For example, let's say the offset was 38378.  The remainder after dividing by 16384 is 5610, which is the offset inside the bank.  The quotient would be 2, which is the "bank number".  A simpler way to think of this is as a big pile of thousands of papers, each numbered.  The way the game accesses them starts with the "thousands place" of the document page number (the bank) to find which stack of papers they're in, and then the number within that thousands place.  Except with the GB, it's 16384 not 1000.
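The same arithmetic as a short Python sketch (the function name is just for illustration):

```python
BANK_SIZE = 16384  # 0x4000 bytes per ROM bank


def split_offset(file_offset: int) -> tuple[int, int]:
    """Return (bank number, offset inside that bank) for a file offset."""
    return divmod(file_offset, BANK_SIZE)


bank, offset_in_bank = split_offset(38378)
print(bank, offset_in_bank)  # 2 5610
```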

Assuming it has almost the same text as Tomato Adventure, it would require at least 8 banks (it must be missing some of the text, because I heard it lacks Gimica and the way the Gimica code is written makes me think it was written separately.)

When a GBC game accesses data, it encodes the offset into the bank as a number.  Sometimes, it will also have a bank number.  Tomato Adventure didn't have this problem, so this is where guessing really begins.  Most probably, the offset for a string would be stored as three bytes, using this formula:

(offset inside bank) modulo 256
(offset inside bank) divided by 256
(bank number)

Earlier I gave the example of offset 38378, which is 5610 into bank 2.  That would have the values:

234
21
2

Based on the formula above, that would mean the byte sequence 0xEA 0x15 0x02.  It might not have the bank number like that though, which would make things tougher.  Another thing to beware of is that you may find this sequence in multiple places, but that doesn't mean it's for sure the part that says where that string starts.  It could just be a coincidence and part of an image or actual game logic.
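Here is a Python sketch that builds the byte patterns worth searching for. The first pattern is the formula above. The other two are an extra guess: on the Game Boy, ROM banks other than bank 0 are mapped at addresses 0x4000-0x7FFF, so a stored pointer may be 0x4000 higher than the offset inside the bank. Which layout Gimmick Land actually uses (if any of these) is unknown, and the function name is made up.

```python
BANK_SIZE = 16384  # 0x4000 bytes per ROM bank


def pointer_candidates(file_offset: int) -> dict[str, bytes]:
    """Return byte patterns that might point at file_offset."""
    bank, offset = divmod(file_offset, BANK_SIZE)
    # Banks other than 0 are mapped at 0x4000-0x7FFF on the Game Boy.
    address = offset if bank == 0 else 0x4000 + offset
    return {
        "offset + bank": bytes([offset % 256, offset // 256, bank]),
        "address + bank": bytes([address % 256, address // 256, bank]),
        "address only": bytes([address % 256, address // 256]),
    }


candidates = pointer_candidates(38378)
for label, pattern in candidates.items():
    print(label, pattern.hex(" "))
```

For offset 38378 this prints `ea 15 02`, `ea 55 02`, and `ea 55`. The two-byte pattern will produce many more coincidental hits than the three-byte ones, so any match still needs to be confirmed by changing it and seeing what happens in the game.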

If you do find it, though, you could replace this value with another address and put the string there.  You'd have to shuffle things around, but it might allow you to make some important strings longer.

I'd guess that some banks will have blank space at the end.  So if you look at the file in chunks of 16384 bytes, you may find some extra space you can stuff longer text in.
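A quick way to look for that space is to count the run of identical filler bytes at the end of each 16384-byte chunk. This Python sketch assumes the filler is 0x00 or 0xFF, which is common but not guaranteed, and it uses a fake two-bank ROM in place of the real file.

```python
BANK_SIZE = 16384  # 0x4000 bytes per ROM bank


def free_tail(rom: bytes, fill_bytes=(0x00, 0xFF)) -> list[tuple[int, int]]:
    """Return (bank number, trailing fill-byte count) for each bank."""
    result = []
    for start in range(0, len(rom), BANK_SIZE):
        bank = rom[start : start + BANK_SIZE]
        count = 0
        if bank and bank[-1] in fill_bytes:
            fill = bank[-1]
            # Length of the run of the same fill byte at the end of the bank.
            count = len(bank) - len(bank.rstrip(bytes([fill])))
        result.append((start // BANK_SIZE, count))
    return result


# Fake ROM: bank 0 is full, bank 1 ends with 100 bytes of 0xFF.
rom = bytes([1]) * BANK_SIZE + bytes([1]) * (BANK_SIZE - 100) + bytes([0xFF]) * 100
print(free_tail(rom))  # [(0, 0), (1, 100)]
```

A long run of filler is only a hint. It could still be real data, such as blank tiles, so it is worth testing in an emulator after writing anything there.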

Anyway, that's just if you feel adventurous.  It may seem daunting, and don't feel bad only getting abbreviated text in - that's still a big achievement, especially if you've never done any of this before.

-[Unknown]

TomatoAdventure_Fan

I have a few more questions to share in this post.

Quote from: [Unknown] on May 21, 2022, 01:27:35 AM
Anyway, that's just if you feel adventurous.  It may seem daunting, and don't feel bad only getting abbreviated text in - that's still a big achievement, especially if you've never done any of this before.

I decided to just work with the space that I already have. I've found that there are still ways to make it work.

However, I have a few more questions; please forgive me if they are beginner questions. Did you ever have to use more than one table file when working on Tomato Adventure? Right now, I am trying to focus on things outside the main script, and I have had to do some "digging" to find where the code is. I want to make sure that I am not passing by dialogue that looks like "gibberish" because I don't have the right table loaded.

Also, when you worked on Tomato Adventure, did you find that all the dialogue, including the main script and menu text, was grouped together? For example, after a big chunk of the main script finished, was there a chunk of menu text before returning to the script?

I'm really looking forward to this year's Tomatoversary ;D

[Unknown]

Tomato Adventure did have two character sets (tables), but they weren't very different.  Just the last few entries were different in terms of symbols, and then the supported control codes were different for each one.  The dialog used one, and then the menus used the other.  Actually, the final credits used a third character set though it was messily mapped to one of the main ones.  Then Gimica text had its own control codes.

I'll note that in the translation, I invented some control codes and made more places support common ones.  For example, enemy names support the control code for the main character's name - which is used in one particular enemy's name.  I'd argue it was a bug that the Japanese release failed to use your entered name.

It would surprise me if any of the dialog (outside menus, tutorials, battle, etc.) uses a different character set.

Menu text is not all grouped together.  Types of things are - i.e. item names, enemy names, etc.  And much of the menu text is concentrated in certain regions.  But there are a few things that are off on their own, and a few embedded in the binary itself (like the poisoned text, iirc.)  This could be pretty different in Gimmick Land, though.

Something like a linker (a tool used in programming to tie different code together into a single program) was likely used to compile the data in the final image, and it might've been a GBA-specific tool (because the GBA doesn't have "banks" and data organization is simpler.)  Whatever was used to organize the GBC data might've followed different ordering rules.

The technique I used was to identify all the text I could find, and then I wrote a script that could insert dummy English text for that.  I replaced all the non-spaces / control codes in this text with AAAAA.  That way, I could easily see what text I had not identified yet.  After that, I made it so the dummy replacement added an identifier in so instead of just "AAA A AAAAA AAA" it would be "X12 3 AAAAA AAA" and I could tell that string was X123.  I didn't end up needing to use that very much, but it helped me tell between different copies of identical text.
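The dummy-text idea can be sketched in Python like this. It assumes the exported text shows control codes as bracketed tags such as [BR], which is an assumption about the export format, and the function name is made up.

```python
import re

# Control codes are assumed to appear as bracketed tags such as [BR] in the
# exported text; anything else is matched one character at a time.
TOKEN = re.compile(r"\[[^\]]*\]|.", re.DOTALL)


def dummy_text(text: str, ident: str = "") -> str:
    """Replace every character except spaces, newlines, and control codes.

    Characters of ident are used first, then "A" for the rest.
    """
    pending = list(ident)
    out = []
    for token in TOKEN.findall(text):
        is_code = token.startswith("[") and len(token) > 1
        if token in (" ", "\n") or is_code:
            out.append(token)  # keep layout and control codes untouched
        elif pending:
            out.append(pending.pop(0))
        else:
            out.append("A")
    return "".join(out)


print(dummy_text("The 1 quick fox"))          # AAA A AAAAA AAA
print(dummy_text("The 1 quick fox", "X123"))  # X12 3 AAAAA AAA
print(dummy_text("Hi[BR]you [PAUSE 1E]ok"))   # AA[BR]AAA [PAUSE 1E]AA
```

Because the replacement keeps every string at its original length, it can be written back in place without touching any pointers.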

-[Unknown]

TomatoAdventure_Fan

Quote from: [Unknown] on June 16, 2022, 06:06:51 PM
Tomato Adventure did have two character sets (tables), but they weren't very different. Just the last few entries were different in terms of symbols, and then the supported control codes were different for each one.

I'm not really seeing the difference between the two tables on the Github page. One just looks like it is an expanded version of the other one.

When I try to do a search for menu lines there are some components that I just can't find. I've also found that the menus are a lot harder to work with given the lack of control codes, and even more restrictive length limits.

I am getting to a point where things are getting way beyond my current abilities and understanding. I will try my best to search around for more things to translate, but I can't guarantee I will get everything. In this case, would I still be able to submit my translation and make a news article about it? For example, say the main script was fully completed but certain menu lines were still in Japanese. I want people to be able to enjoy this game, and hopefully even continue the work that I have done. It may not be a perfect or 100% finished translation, but I feel that it is certainly going to be a good start for this game.

[Unknown]

Quote from: TomatoAdventure_Fan on June 17, 2022, 11:13:35 PM
I'm not really seeing the difference between the two tables on the Github page. One just looks like it is an expanded version of the other one.

Right, that doesn't really show the control codes and kanji handling.  That's where they differ, really.

Quote from: TomatoAdventure_Fan on June 17, 2022, 11:13:35 PM
When I try to do a search for menu lines there are some components that I just can't find. I've also found that the menus are a lot harder to work with given the lack of control codes, and even more restrictive length limits.

Menu text was where I did the most work.  Dialog required figuring out the whole scripting engine, sure, but menus were where I really made a lot of code changes to redirect things and give myself more space.  I also had to make sure I'd updated EVERYWHERE that looked at a specific piece of menu text when doing this, so it meant figuring out a lot of the game's code.  Doing menus with even tighter space and pixel limitations indeed sounds rough.

It could be that the menus are done in a different way in Gimmick Land, since they clearly had more space.  But generally, there were tables of things (like monster data) which included the monster name in a certain pattern.  In programming, this is called an "array" (list) of "structs" (sets of data in a pattern.)  For example, maybe 8 bytes are dedicated to the name, then 2 bytes to its HP, and another 1 byte for its ATK, and lastly 1 more byte for a reference to its sprite.  In that theoretical example, the names of each monster would be 12 bytes apart.  The actual monster structure (struct) had more fields in Tomato Adventure, that's just illustrative.
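That theoretical 12-byte layout can be read with Python's struct module as below. The field sizes are the illustrative ones from the paragraph above, not the real monster struct, and the little-endian HP and the sample records are assumptions.

```python
import struct

# Illustrative layout only, not the real monster struct:
# 8-byte name, 2-byte HP (little-endian), 1-byte ATK, 1-byte sprite reference.
MONSTER = struct.Struct("<8sHBB")


def read_monsters(data: bytes, start: int, count: int):
    """Yield (file offset, raw name bytes, hp, atk, sprite) per record."""
    for index in range(count):
        offset = start + index * MONSTER.size
        name, hp, atk, sprite = MONSTER.unpack_from(data, offset)
        yield offset, name, hp, atk, sprite


# Two made-up records; names are raw table bytes padded with zeros.
table = MONSTER.pack(b"\x01\x02\x03", 30, 5, 1) + MONSTER.pack(b"\x03\x02", 120, 9, 2)
records = list(read_monsters(table, 0, 2))

print(MONSTER.size)               # 12
print([r[0] for r in records])    # [0, 12]
```

In practice this means that once one menu name is found, the next one is likely a fixed number of bytes later, and any replacement name has to fit in the same fixed-size field.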

Quote from: TomatoAdventure_Fan on June 17, 2022, 11:13:35 PM
I am getting to a point where things are getting way beyond my current abilities and understanding. I will try my best to search around for more things to translate, but I can't guarantee I will get everything. In this case, would I still be able to submit my translation and make a news article about it? For example, say the main script was fully completed but certain menu lines were still in Japanese. I want people to be able to enjoy this game, and hopefully even continue the work that I have done. It may not be a perfect or 100% finished translation, but I feel that it is certainly going to be a good start for this game.

You can still create an "Unfinished" entry here, although I'm not sure what rules there are for that specifically.  I don't know if news articles are allowed for unfinished entries or not.  That said, if you've put work into it and believe others would enjoy it, I don't see why an unfinished entry wouldn't be okay.

-[Unknown]

TomatoAdventure_Fan

Lately I have been exploring more of this site's submission information, and I believe that I still should be good to submit my work once it is finally ready. It will probably fall under the "fully playable" or "improvement" category. A significant amount of work has been done on this project, and it will certainly be the best (and currently the only) option for an English Gimmick Land experience.

I've been doing some more work on the project lately and I've been doing things to the best of my ability. Even things like graphics alteration have proven to be more complicated than I once thought. For example, the main title screen graphic has a lot of repeating tiles that are flipped. I'm not sure I'll be able to do that, but I can certainly add an English indication on the main menu.



Here's an image I used to help me figure out the locations of tiles on the game over screen.



Here's the end result. As you can see, a tile repeated itself on the top row, not sure how I could fix it, but I am still pretty happy with it.

Bonesy

the scratchpad is not for uploading random images to, use imgur for that

KingMike

Nobody but Staff members can see Scratchpad files.
"My watch says 30 chickens" Google, 2018

hobblinharry

This project sounds really cool and I am looking forward to seeing more information about it  :)
I was wondering if someone was ever going to work on a translation for this

TomatoAdventure_Fan

Quote from: hobblinharry on July 26, 2022, 07:39:37 PM
This project sounds really cool and I am looking forward to seeing more information about it  :)
I was wondering if someone was ever going to work on a translation for this

Well then, you're in luck! It's waiting in the Submission Queue right now. I hinted that I would try and have it out by the Tomatoversary (today), but it looks like there are about 10 or 11 projects in front of it. If it doesn't come out today, it should be out relatively soon as people continue to work on approving the submissions. It's currently out of my control, but hopefully it shouldn't be too much longer considering batches of hacks are approved every few days.

Please keep in mind that I didn't really know anything about romhacking before I made this project, so don't expect it to be the most "polished" experience in the world (weird line breaks and abbreviations to save space, for example). There were a few things that I couldn't figure out how to get, such as things on the START menu, but if you have a basic understanding of how RPGs work, or have played Tomato Adventure before, you should have no problem beating the game. That being said, a large majority is translated (such as 99% of the main script), and all the parts that are absolutely necessary to understand how to beat the main game are translated.

Like you, I was also wondering if anyone was ever going to work on this. I had heard people wish for it, but I had never seen any work materialize from it. While I am happy with my project, I hope this "gets the ball rolling" for more work and improvements to be made to this translation. I hope you'll try it out once it officially releases. I'll be making a news post once the project is uploaded.

Since it's the Tomatoversary I want to have something to show, so I've included a few screenshots below.


hobblinharry

Bravo! I am looking forward to the release! Thanks for putting in the effort for this  :thumbsup: