Romhacking.net

Romhacking => Newcomer's Board => Topic started by: SquiddyGoat on August 09, 2020, 04:48:27 am

Title: Need help with getting POINTER_RELATIVE script dump.
Post by: SquiddyGoat on August 09, 2020, 04:48:27 am
Background: I'm working with a Japanese-literate friend to translate Legendz: Yomigaeru Shiren no Shima into English using the English letters that are already present in the ROM.

In Cartographer, I'm only getting 0 KB POINTER_RELATIVE dumps that are completely empty; only a ($00)-filled RAW dump appears. Below is a copy of my command file:

#GAME NAME:      Legendz: Island of Ordeal (GBA)

#BLOCK NAME:      Dialogue Block (RAW)
#TYPE:         NORMAL
#METHOD:      RAW
#SCRIPT START:      $160BF0
#SCRIPT STOP:      $189700
#TABLE:         legendztable.tbl
#COMMENTS:      Yes      
#END BLOCK            

#BLOCK NAME:      Dialogue Block (POINTER_RELATIVE)
#TYPE:         NORMAL
#METHOD:      POINTER_RELATIVE
#POINTER ENDIAN:   LITTLE
#POINTER TABLE START:   $08B0927F
#POINTER TABLE STOP:   $08F0FF7F
#POINTER SIZE:      $04
#POINTER SPACE:      $00
#ATLAS PTRS:      Yes
#BASE POINTER:      $080C8669
#TABLE:         legendztable.tbl
#COMMENTS:      Yes
#END BLOCK

I based the SCRIPT START/STOP areas on where the legible Japanese words seem to start and end. I have no clue if I need to be more exact.

There is a large section of 00's and FF's at the bottom of the ROM, which I currently believe is the pointer section (Correct me if I'm wrong.). This is what the POINTER TABLE START/STOP areas are based on.

One thing to note is that the legible game script is far above the huge pointer table, which lies at the bottom of the ROM. I don't understand anything about the base pointer, either, so that could be acting up too. Any idea what could be going wrong?
Title: Re: Need help with getting POINTER_RELATIVE script dump.
Post by: FAST6191 on August 09, 2020, 03:13:36 pm
Might want to note it is for the GBA.

Also, have you got a means to bypass the hardware add-ons this franchise uses? For others playing along: these games are basically Spyro Skylanders or those Disney statue things, but circa 2004.

"There is a large section of 00's and FF's at the bottom of the ROM, which I currently believe is the pointer section (Correct me if I'm wrong.)."
You are probably wrong. Most ROM dumpers for the GBA just dump up to the nearest power-of-2 mark (4, 8, 16 or 32 megabytes), and any gap between where the program stops and that mark is padded with 00 or FF just to make up the space (this is what ROM trimming sorts).

Start and stop of legible text is not necessarily the right range -- many games will have formatting and other things before the text, or placeholders ("it costs ? a night to stay here" sort of thing, where ? is one thing, or maybe a character name if you can customise it). Not to mention the pointer section in those tools is asking for where the game keeps the pointers, not where the pointers will be pointing to.

Anyway this is the GBA so you probably want to play to that.

The whole GBA ROM for 99.99% of games* is visible all at once in memory.
http://problemkaputt.de/gbatek.htm#gbamemorymap
In the vast majority of cases this means the game starts at 08000000 in memory and finishes at 09FFFFFF, however as most games are 16 megabytes or less that will be between 08000000 and 08FFFFFF. There are areas later in the memory that the console will wait a while to fetch data from (presumably done so as not to interrupt other goings on in the game for data it does not care about right away, though there are other reasons) but most things won't use it. Likewise you are free and clear as a hacker to use the 09?????? region (a full 16 megabytes) without restriction -- only some older flash carts will be troubled by this and most of those won't mind too much either.
This is so much so that I will usually encourage people to search for 08 and see the results. If you end up with a bunch of them mostly equally spaced in a row, that is likely some pointers. It could be anything in the ROM (graphics, music, levels, the code itself... all of these also use pointers) but usually something worth looking at (even if only to eliminate or add to your notes on a game -- you might only be interested in text but others might be interested in other things). If the rest of the pointer decodes as an area in or around the text, which you apparently already know the location of, then it is probably worth having a deeper look (the remaining 6 hex digits of the pointer are the location in the ROM if you opened it in a hex editor and pressed goto, albeit with the bytes flipped because of endianness).
Find the pointers and then you can dump them, though I don't normally use atlas and cartographer for this sort of thing (they tend to not play as nicely with the GBA as I might like). Personally I translate the thing and then between values there is usually some indicator (or you might add one and do some maths after the fact). When it is all translated you then dump it again with your new values, look at the indicator stuff and generate accordingly.
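The "search for 08 and look for evenly spaced hits" idea can also be automated. The following is only a rough Python sketch, assuming the pointers are 4-byte little-endian values stored at 4-byte-aligned offsets and that the ROM is mapped at 08000000; the function name and the minimum run length are made up for illustration, and the demo runs on synthetic data rather than a real ROM.

```python
import struct

ROM_BASE = 0x08000000  # GBA address at which the cartridge ROM is mapped


def find_pointer_runs(rom: bytes, min_run: int = 4):
    """Return (file_offset, count) for runs of consecutive aligned words
    whose value falls inside the ROM's address range."""
    runs = []
    run_start = None
    count = 0
    # Assumption: pointer tables sit on 4-byte boundaries.
    for off in range(0, len(rom) - 3, 4):
        (value,) = struct.unpack_from("<I", rom, off)
        if ROM_BASE <= value < ROM_BASE + len(rom):
            if run_start is None:
                run_start, count = off, 0
            count += 1
        else:
            if run_start is not None and count >= min_run:
                runs.append((run_start, count))
            run_start = None
    if run_start is not None and count >= min_run:
        runs.append((run_start, count))
    return runs


# Synthetic stand-in for a ROM: five pointers stored from offset 0x20.
demo = bytearray(0x100)
for i in range(5):
    struct.pack_into("<I", demo, 0x20 + 4 * i, ROM_BASE + 0x40 + 8 * i)

demo_runs = find_pointer_runs(bytes(demo))
for start, count in demo_runs:
    print(f"{count} candidate pointers starting at file offset {start:#x}")
```

A run found this way is only a candidate: it may point at graphics, music or code rather than text, so the targets still need checking by hand.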

*for those following along that want the exception then flash carts, technically pogoshell and in commercial games https://mgba.io/2015/10/20/dumping-the-undumped/

Anyway time to find at least a few pointers manually and then you can look at persuading those tools to do something, making your own (a spreadsheet might well do) or maybe going to play with kruptar and co.
Title: Re: Need help with getting POINTER_RELATIVE script dump.
Post by: SquiddyGoat on August 09, 2020, 04:38:22 pm
Following your advice on finding the real pointers, I came across a block of hex values with "08" going down in a straight line. Here's a cheaply-made example:

(hex)08(hex)F1(hex)08(hex)
(hex)08(hex)F1(hex)08(hex)
(hex)08(hex)F2(hex)08(hex)
(hex)08(hex)F2(hex)08(hex)
(hex)08(hex)F3(hex)08(hex)
(hex)08(hex)F3(hex)08(hex)
(hex)08(hex)F4(hex)08(hex)
(hex)08(hex)F4(hex)08(hex)

Thing is, "08" pointer tables of all sizes can be found all over the place, especially in the beginning of the ROM. How will I know which one deals with the game script?

One thing I'm confused about is how I'm supposed to use these pointers once dumped. If using Atlas and Cartographer isn't a good idea, could they still help me extract and re-insert text easily (My current goal.) through other means? You suggested Kruptar, which I had installed a while ago, but I'm pretty new so I couldn't figure out how to load the dumped script and my table file.

Also, in the ROM the Japanese text is spaced out with periods between letters. (Using the few English words as an example, ...I.N.F.O.は... shows up at one point along with .R.P.G...) with the periods corresponding to values like 00, 000000, 00FDFF and 00FEFF. I'm not entirely sure why this happens, but it makes the raw dumps have a million ($00)'s and ($FF)'s in between each letter.

One more thing, I do own the necessary adapter and other accessories needed to test it on real hardware.
Title: Re: Need help with getting POINTER_RELATIVE script dump.
Post by: FAST6191 on August 11, 2020, 08:32:44 am
The adapter bit was more for those without such toys -- flash carts and emulators sometimes include a real time clock and occasionally a rumble pack, fewer have simulation/replication/emulation of the Legendz hardware addons.

Spaces between characters as it were. Fairly standard -- Japanese has thousands of characters, even in the more condensed versions. Most European languages do not and can fit everything in the 256 possibilities that 8 bits gives you. You can make a parser handle combined 8 and 16 bit characters, but it is computationally costly, so most game devs sacrificed the space instead.

How do you know what one is the script? If they are of the plain 08 and say 6 characters format (which may include another 08 as it is valid to have them in the rest) then the remaining hex will likely be the location of the script within the ROM or in and around it after you handle any flipping issues (see big endian vs little endian -- the GBA, like the PC, is little endian, so for complicated historical and legacy reasons multi-byte numbers are stored lowest byte first and do not read in order in a hex editor).
Some later stuff (like the DS), and the GBA is not immune to such quirks, might have calculated, offset, sector based or relative pointers (relative in this case meaning if you are at location 60 and the value is 30 then you add the two together to get 90 in the final ROM, or possibly memory layout, which is another reason the attempt above failed), to say nothing of the more complicated memory maps on older systems (the GBA having the whole game visible at once is something of a rarity; stuff like the NES and SNES sees you cycling banks in and out).
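To make the flipping concrete, here is a small Python sketch of the conversion in both directions, assuming a plain 4-byte little-endian pointer with the ROM mapped at 08000000. The function names are invented for the example; 180F1E is the offset quoted earlier in the thread for the first line of dialogue.

```python
import struct

ROM_BASE = 0x08000000  # GBA address at which the cartridge ROM is mapped


def offset_to_pointer_bytes(file_offset: int) -> bytes:
    """File offset -> the 4 bytes as they would appear in a hex editor."""
    return struct.pack("<I", ROM_BASE + file_offset)


def pointer_bytes_to_offset(raw: bytes) -> int:
    """4 bytes copied from a hex editor -> file offset they point at."""
    (address,) = struct.unpack("<I", raw)
    return address - ROM_BASE


def resolve_relative(location: int, value: int) -> int:
    """Relative pointer: the stored value is added to where it was read."""
    return location + value


print(offset_to_pointer_bytes(0x180F1E).hex())                # 1e0f1808
print(hex(pointer_bytes_to_offset(bytes.fromhex("1e0f1808"))))  # 0x180f1e
print(resolve_relative(60, 30))                               # 90
```

Note that the whole 4-byte value is reversed, which is why the 08 ends up as the last byte of each pointer in the file.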

You could probably coax atlas and cartographer into doing something, however the workarounds required for it are almost enough that you might as well make your own script or abuse a spreadsheet (dump the pointers into a spreadsheet when you find them, find some indicator of value if there is an end of line/start of line value in the text itself, do your changes, search for the new location of the end of line/start of line stuff and change pointers accordingly. This is doable in a hex editor with a decent search function). You don't necessarily need pointers for a dump either.
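The "make your own script" route for that workflow might look something like the Python sketch below. It assumes the text is stored as 16-bit units, that each string ends with an FF FF unit, and that the edited text will be written back as one contiguous block at a known file offset; all three are assumptions for illustration, as are the function names and the 200000 offset.

```python
import struct

ROM_BASE = 0x08000000  # GBA address at which the cartridge ROM is mapped


def string_starts(blob: bytes, terminator: bytes = b"\xff\xff", unit: int = 2):
    """Offsets inside blob where each terminated string begins."""
    starts = [0]
    # Step in whole units so a terminator is never matched across two characters.
    for pos in range(0, len(blob) - len(terminator) + 1, unit):
        if blob[pos:pos + len(terminator)] == terminator:
            following = pos + len(terminator)
            if following < len(blob):
                starts.append(following)
    return starts


def rebuild_pointers(blob: bytes, blob_offset: int, terminator: bytes = b"\xff\xff"):
    """New little-endian pointer bytes for every string in the edited blob."""
    return [struct.pack("<I", ROM_BASE + blob_offset + s)
            for s in string_starts(blob, terminator)]


# Two short strings, "AB" and "C", each followed by the assumed terminator.
edited = b"A\x00B\x00\xff\xff" + b"C\x00\xff\xff"
new_pointers = rebuild_pointers(edited, 0x200000)
print([p.hex() for p in new_pointers])  # ['00002008', '06002008']
```

The resulting pointer bytes would then be written over the old ones in the same order, which only works if the order of the strings has not changed.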


In this case it is probably redundant or only likely to confuse things further but I will note there is a Korean version of this game ( Legendz - Buhwarhaneun Siryeonyi Seom ) and sometimes you can use such things as a kind of Rosetta stone to compare and contrast things -- most game translations only change the bare minimum so anything that changed is likely to do with that.
Title: Re: Need help with getting POINTER_RELATIVE script dump.
Post by: SquiddyGoat on August 11, 2020, 01:23:17 pm
By the way, thanks for taking the time to offer all this great advice. I'll continue on a bit more before coming up with any more questions.

About what you said first, though, flashcarts won't recognize the dedicated adapter at all? I was planning to have a flashcart running the game plugged into a GBA with the adapter plugged into the back. Not being able to do that would... complicate things a lot, such as leaving me with no way to test the game at all.

In that case, would this be of any use for getting it onto a flashcart?
https://shonumi.github.io/articles/art15.html
https://github.com/shonumi/gbe-plus/tree/master/src/data/bin/soul_dollz
Title: Re: Need help with getting POINTER_RELATIVE script dump.
Post by: FAST6191 on August 11, 2020, 03:12:25 pm
I was thinking more for people playing it after the fact without any access to it (not like it litters second hand shops like so many Skylanders readers). I don't know the specifics of this one, though generally if it goes to the cartridge itself then probably not but if it goes to the link port then that is a different matter and most flash carts will probably work -- any issues coming from timing changes that trouble some link games. Mostly only looked into what went for a possible fix (it is one of the very few things that cause troubles for flash carts and emulator authors on the GBA, indeed this, its sequel and Plaston Gate tending to be the big three that are not basic sensors, with the latter having a form of bypass) but as it seems to be active electronics doing active things that means emulating/creating a whole lot of extra stuff which is seldom fun (even doing it for a basic anti piracy fix is one of those things that might take years -- see the time between release for Banjo-Kazooie and its patch).
As far as those links. Looks like I have some reading to do, thanks for sharing those. Missed them entirely in my rounds of the internet.
Title: Re: Need help with getting POINTER_RELATIVE script dump.
Post by: SquiddyGoat on August 13, 2020, 03:50:26 pm
Ever since my last message, I've been working on a basic menu translation as practice (Not something that's fit for release.). So far, I've changed the title screen options, some text on the naming screens, the date/time setting screen and the "command" window in the pause menu. I have also changed the text of four basic moves, the menu description of said moves and a little bit of text in the "TALISPD" menu. There's a lot more that I could be changing, but doing this all by manually replacing the hex value for "あ" with the hex value for "A" is pretty tiring. Also, I really wish I understood more about lengthening pointers since I had to replace "Reborn" with "CALL", "Save" with "SAV", "Items" with "INV", etc.

https://i.imgur.com/qnEJIS7.png
https://i.imgur.com/TNpoOhM.png
https://i.imgur.com/UItAktp.png
https://i.imgur.com/5UOTNQB.png


I'm currently pursuing more efficient ways to edit the text faster, many of which I am just beginning to understand. Cartographer is apparently bad with GBA games (I keep getting /pause and 0KB dumped after a thousand edits of my legendz_commands.txt file.) I'm still experimenting with Kruptar, but I don't know how to add pointers. I'll probably try the spreadsheet thing later, but I have no idea where to start with that. One of the simplest tools appears to be this: http://www.romhacking.net/utilities/880/. I followed all the instructions in the readme, but this one part really confuses me:

"table_write.ini” on the other hands is a binary write so you can include the Hex Byte “00” in your table file.  Now I don’t mean just going and writing “00=00” in your “table_write.ini”.  Open a hex editor and write “00=Hex00.”". There isn't enough space to write "00=Hex00" in a hex editor at "00=", so outside of typing it directly into the .ini I have no idea how to do this.

I tried just leaving the file alone and I got this after an hour of waiting:

ERROR in
action number 1
of Draw Event
for object obj_control:

Error opening file for writing.

Well, there's that. Now for something else:

"You could probably coax atlas and cartographer into doing something, however the workarounds required for it are almost enough that you might as well make your own script or abuse a spreadsheet (dump the pointers into a spreadsheet when you find them, find some indicator of value if there is an end of line/start of line value in the text itself, do you changes, search for the new location of the end of line/start of line stuff and change pointers accordingly."

I've located and marked where several menu options and the entire movelist are in the ROM. Before each line of menu/move text is a "00FFFF". At one point, I had accidentally changed an "00FFFF" to a "0000FF" while editing text for the 1st move "CLW" manually and text for the 2nd move "2XCLAW" overflowed into the info box (Which I later fixed.).

Now that I've thought of it, I wonder if this is the "end of line/start of line" value that you described? There's also a "000000", "00FDFF" or "00FEFF" in-between each line of the script dialogue.

Also, I'm pretty sure that the pointers stored from 15CF50 to 15FB50 are the script pointers, purely because they seem to be the longest list of pointers and they sit very close to the game script (Where the first line spoken by a character in the game starts, 180F1E.). While I was using Cartographer, I had 38C005 as my "base pointer" because 50CF15(15CF50 in little endian.) - 38C005 = 180F1E.

A lot of things seem to ask me for this "base pointer" or something that's to be "added/subtracted to the offset to get to the script", so I wonder if 38C005 is what I'm looking for.
Title: Re: Need help with getting POINTER_RELATIVE script dump.
Post by: abw on August 27, 2020, 09:23:50 pm
#POINTER TABLE START:   $08B0927F
#POINTER TABLE STOP:   $08F0FF7F
This is definitely a problem - the entire ROM is only $800000 bytes long, so telling Cartographer that the pointer table starts at $08B0927F means that it would have to read beyond the end of the ROM file, which isn't going to work. I don't have your table file to test it out myself, but what happens if you try something like this instead?
Code:
#BLOCK NAME: Dialogue Block (POINTER_RELATIVE)
#TYPE: NORMAL
#METHOD: POINTER_RELATIVE
#POINTER ENDIAN: LITTLE
#POINTER TABLE START: $15CF5C
#POINTER TABLE STOP: $15FB60
#POINTER SIZE: $03
#POINTER SPACE: $01
#ATLAS PTRS: Yes
#BASE POINTER: $0
#TABLE: legendztable.tbl
#COMMENTS: Yes
#END BLOCK
You are going to need to figure out what determines the end of each string; probably that'll be some byte sequence like $00FFFF that you can include in your table file (e.g. "/00FFFF=[end]"), but there's a chance it could be something more complicated instead.
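The range problem with the original #POINTER TABLE START/STOP values can be checked mechanically before running a dump. A minimal Python sketch, assuming the command file wants plain file offsets and that any value at or above 08000000 was meant as a GBA memory address; the helper name is invented, and 800000 is the ROM size stated above.

```python
ROM_BASE = 0x08000000  # GBA address at which the cartridge ROM is mapped


def to_file_offset(value: int, rom_size: int) -> int:
    """Turn a GBA address or file offset into a file offset, or raise
    ValueError if it cannot lie inside a ROM of the given size."""
    offset = value - ROM_BASE if value >= ROM_BASE else value
    if not 0 <= offset < rom_size:
        raise ValueError(f"${value:X} is outside a ROM of ${rom_size:X} bytes")
    return offset


ROM_SIZE = 0x800000

print(hex(to_file_offset(0x15CF5C, ROM_SIZE)))  # fine: 0x15cf5c

try:
    to_file_offset(0x08B0927F, ROM_SIZE)
except ValueError as err:
    print("rejected:", err)
```

Even with the 08 stripped off, B0927F is past the end of an 800000-byte file, which matches the empty dumps described at the start of the thread.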

"table_write.ini” on the other hands is a binary write so you can include the Hex Byte “00” in your table file.  Now I don’t mean just going and writing “00=00” in your “table_write.ini”.  Open a hex editor and write “00=Hex00.”". There isn't enough space to write "00=Hex00" in a hex editor at "00=", so outside of typing it directly into the .ini I have no idea how to do this.
I'm not familiar with that program, but as a guess, probably those instructions mean "00=" in ASCII and "00" in hexadecimal, so in hex your line would be 30303D00 ($30 = "0", $3D = "=", $00 = a zero byte).
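If a hex editor is awkward for this, the same four bytes can be produced with a couple of lines of Python. This is only a sketch of the guess above: whether the tool really wants the ASCII characters 00= followed directly by a single zero byte, and whether it expects a line ending afterwards, is not confirmed by its readme.

```python
# ASCII "00=" followed by one raw zero byte (assumed format, see above).
entry = b"00=" + b"\x00"

print(entry.hex().upper())  # 30303D00

# To append it to the real file, something like this should do:
# with open("table_write.ini", "ab") as ini:
#     ini.write(entry)
```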
Title: Re: Need help with getting POINTER_RELATIVE script dump.
Post by: SquiddyGoat on August 29, 2020, 01:14:20 am
Ah, so that's what was crashing Cartographer. After messing around with some of those VBA tools, I've only recently begun to understand the different sections that make up a GBA ROM, so I get what you're saying.

As much as I'd like to try out your solution, I recently found out that the pointer table for the game script does not start at 0015CF5C (It is a table of pointers, but for something completely different.). In fact, there doesn't seem to be a "table" at all, just a huge list of scattered pointers that begins at 000251CC and goes down for miles. They're still in order (I think?), but with uneven strings of data in-between each one. In this case, would Cartographer still be of any use? Does it scan for the "08" pointers or am I out of luck?

As for the "end of each string", it's probably a good thing that I know a ton of them. 00FFFF terminates the line in places like menus, 00FDFF brings the text to the second line, 00FEFF ends the line until the next line is brought up by the player, etc. I have them all marked down in my table file, which I recently updated with all 66 of the Kanji symbols present in the game in order.
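One way to read those values is as 16-bit little-endian units: the leading 00 would then be the high byte of the previous character, and FF FF, FD FF and FE FF would be the control codes FFFF, FFFD and FFFE. That reading is an assumption, not something confirmed in the thread. Under it, a string decoder could be sketched in Python as follows; the four-entry table is a placeholder, and the real character values belong in legendztable.tbl.

```python
import struct

# Control codes as described above, read as 16-bit little-endian units.
CONTROL = {0xFFFF: "[end]", 0xFFFD: "[line]", 0xFFFE: "[wait]"}
STOP_CODES = {0xFFFF, 0xFFFE}  # assumption: a string ends at either of these

# Placeholder table for illustration only.
TABLE = {0x0049: "I", 0x004E: "N", 0x0046: "F", 0x004F: "O"}


def decode(data: bytes, start: int = 0) -> str:
    """Decode one string starting at 'start', stopping at a stop code."""
    out = []
    for pos in range(start, len(data) - 1, 2):
        (code,) = struct.unpack_from("<H", data, pos)
        if code in CONTROL:
            out.append(CONTROL[code])
            if code in STOP_CODES:
                break
        else:
            # Unknown values are kept visible as ($XXXX) rather than dropped.
            out.append(TABLE.get(code, f"(${code:04X})"))
    return "".join(out)


sample = bytes.fromhex("4900 4E00 FDFF 4600 4F00 FEFF".replace(" ", ""))
decoded = decode(sample)
print(decoded)  # IN[line]FO[wait]
```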

Also, thanks for the advice for using Rewriter as well, I'll have to try that out and see how it goes. I'd really like to find a faster way than what I'm doing, even though it gets the job done:

https://i.imgur.com/i9bsjW6.png
https://i.imgur.com/zQlSZ4e.png
Title: Re: Need help with getting POINTER_RELATIVE script dump.
Post by: abw on August 29, 2020, 09:40:14 pm
Hmm, if the pointers really are scattered around with no useful pattern, then dumping their strings will be annoying. You could still do it in Cartographer by setting up a block for each pointer; probably the only thing that would differ between blocks would be the #POINTER TABLE START and #POINTER TABLE STOP lines, so if you have a list of pointers you could generate all the blocks programmatically pretty easily (search-and-replace in any decent text editor would be enough for that).

That said, it would probably be worth spending a little bit of time to verify that a) the things you're looking at really are pointers and b) they're pointers to the strings you want to dump instead of pointers to something else. One way of doing that would be to find a string in-game that you can get to display easily, find its pointer, change its pointer to have the same value as the pointer for some other string, and then display the string in-game again; if the game now shows the other string, then you know you've found one of the pointers you want.
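The swap test described here can be done in a hex editor, but for repeated experiments a short script on a copy of the ROM may be less error-prone. A Python sketch, assuming 4-byte little-endian pointers based at 08000000; the helper names and the offsets in the demo are made up, and the demo works on synthetic bytes rather than the game.

```python
import struct

ROM_BASE = 0x08000000  # GBA address at which the cartridge ROM is mapped


def read_pointer(rom: bytes, ptr_offset: int) -> int:
    """File offset that the pointer stored at ptr_offset points to."""
    (address,) = struct.unpack_from("<I", rom, ptr_offset)
    return address - ROM_BASE


def repoint(rom: bytearray, ptr_offset: int, target_offset: int) -> None:
    """Overwrite the pointer stored at ptr_offset so it points at target_offset."""
    struct.pack_into("<I", rom, ptr_offset, ROM_BASE + target_offset)


# Synthetic stand-in: two pointers at 0x10 and 0x14 pointing at 0x40 and 0x60.
work = bytearray(0x80)
repoint(work, 0x10, 0x40)
repoint(work, 0x14, 0x60)

# The test: make the first pointer show the second string instead.
repoint(work, 0x10, read_pointer(work, 0x14))
print(hex(read_pointer(work, 0x10)))  # 0x60
```

With a real ROM, the bytearray would come from reading the file and the result would be written to a new file, leaving the original untouched.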

If you're comfortable working with a debugger, a more thorough approach would be to trace the code for displaying the string back to its pointer, which would probably take more work but would probably give you a great deal of information about the locations of all the script pointers, so it might end up being faster in the long run.
Title: Re: Need help with getting POINTER_RELATIVE script dump.
Post by: SquiddyGoat on August 30, 2020, 03:33:55 am
I'm 100% positive that they're the script pointers, as I was able to replace the first line in the game with another character's line by changing what came before the "08". In fact, I was able to change multiple pointers to point to a huge blank area where I could write anything of any length and have it show up in-game. For such an advanced GBA game, changing text by hand is time-consuming but really easy and risk-free.

Right now, I'm working on creating a list of every single script pointer in the game using one of my own methods. Every pointer is pointing to the first letter that comes after FEFF, so I am simply using Notepad++ to search and replace FEFFXX (FEFF00, FEFF01, etc.) with FEFFD7, with D7 being a letter that would never show up in a conversation. Then, I will insert the binary data into the game and search for D7 within the parameters of the script and just copy down all the addresses. Then, I have a simple macro set up in Notepad++ that will convert every address into a little-endian pointer, giving me a set of script pointers that I can copy and paste into anything.

This is all I need for the multiple-block method you described, right? Do I input the pointers themselves (XXXXXX08), or do I need to input the addresses at which the pointer is located (000251CC) into Cartographer? The second one sounds painful, so I hope it's the first.

Also, how would I generate the blocks using the pointers I'm working on? What would the blocks roughly look like and how would I easily use a search and replace function to create them?
Title: Re: Need help with getting POINTER_RELATIVE script dump.
Post by: abw on August 30, 2020, 03:06:35 pm
Do I input the pointers themselves (XXXXXX08), or do I need to input the addresses at which the pointer is located (000251CC) into Cartographer? The second one sounds painful, so I hope It's the first.

Also, how would I generate the blocks using the pointers I'm working on? What would the blocks roughly look like and how would I easily use a search and replace function to create them?
You're going to need the addresses of the pointers themselves, and not just for extracting the original strings, but also for inserting your new strings. Cartographer's output should include the appropriate pointer write commands for Atlas, so the good news is that once you've got the Cartographer command file set up, the resulting Atlas file should be very close to what you'll need for inserting.

Assuming you have access to some Unix-like command line, you can get a list of all ROM addresses following a 0xFEFF byte sequence with e.g.
Code:
xxd -c1 "rom.gba" | grep -Pzo "(?<=.{8}: fe  .\n.{8}: ff  .\n).{8}" > addrs
Those are the values you'll want to use for #POINTER TABLE START, and those + 4 are the values you'll want to use for #POINTER TABLE STOP. You can take those addresses and generate the Cartographer blocks in many ways; the following works for me:
Code:
perl -ne 'print sprintf("#BLOCK NAME:\t\tDialogue Block (POINTER_RELATIVE)\n#TYPE:\t\t\tNORMAL\n#METHOD:\t\tPOINTER_RELATIVE\n#POINTER ENDIAN:\tLITTLE\n#POINTER TABLE START:\t\$%s#POINTER TABLE STOP:\t\$%08x\n#POINTER SIZE:\t\t\$04\n#POINTER SPACE:\t\t\$00\n#ATLAS PTRS:\t\tYes\n#BASE POINTER:\t\t-\$8000000\n#TABLE:\t\t\tlegendztable.tbl\n#COMMENTS:\t\tYes\n#END BLOCK\n\n", $_, oct("0x".$_)+4);' < addrs
Or, putting it all together:
Code:
xxd -c1 "rom.gba" | grep -Pzo "(?<=.{8}: fe  .\n.{8}: ff  .\n).{8}" | perl -ne 'print sprintf("#BLOCK NAME:\t\tDialogue Block (POINTER_RELATIVE)\n#TYPE:\t\t\tNORMAL\n#METHOD:\t\tPOINTER_RELATIVE\n#POINTER ENDIAN:\tLITTLE\n#POINTER TABLE START:\t\$%s#POINTER TABLE STOP:\t\$%08x\n#POINTER SIZE:\t\t\$04\n#POINTER SPACE:\t\t\$00\n#ATLAS PTRS:\t\tYes\n#BASE POINTER:\t\t-\$8000000\n#TABLE:\t\t\tlegendztable.tbl\n#COMMENTS:\t\tYes\n#END BLOCK\n\n", $_, oct("0x".$_)+4);'
gets me output like:
Code:
#BLOCK NAME: Dialogue Block (POINTER_RELATIVE)
#TYPE: NORMAL
#METHOD: POINTER_RELATIVE
#POINTER ENDIAN: LITTLE
#POINTER TABLE START: $00000126
#POINTER TABLE STOP: $0000012a
#POINTER SIZE: $04
#POINTER SPACE: $00
#ATLAS PTRS: Yes
#BASE POINTER: -$8000000
#TABLE: legendztable.tbl
#COMMENTS: Yes
#END BLOCK

#BLOCK NAME: Dialogue Block (POINTER_RELATIVE)
#TYPE: NORMAL
#METHOD: POINTER_RELATIVE
#POINTER ENDIAN: LITTLE
#POINTER TABLE START: $00000e8c
#POINTER TABLE STOP: $00000e90
#POINTER SIZE: $04
#POINTER SPACE: $00
#ATLAS PTRS: Yes
#BASE POINTER: -$8000000
#TABLE: legendztable.tbl
#COMMENTS: Yes
#END BLOCK

#BLOCK NAME: Dialogue Block (POINTER_RELATIVE)
#TYPE: NORMAL
#METHOD: POINTER_RELATIVE
#POINTER ENDIAN: LITTLE
#POINTER TABLE START: $0000163c
#POINTER TABLE STOP: $00001640
#POINTER SIZE: $04
#POINTER SPACE: $00
#ATLAS PTRS: Yes
#BASE POINTER: -$8000000
#TABLE: legendztable.tbl
#COMMENTS: Yes
#END BLOCK

etc., etc.
Yes, it's huge and clunky, but it ought to work (I haven't tested it, so possibly it might need some tweaking) and hopefully you'll only have to do it one time.
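One caveat worth checking before running a full dump: the address that follows an FE FF sequence is where a string begins, whereas #POINTER TABLE START needs the address at which the pointer to that string is stored. Assuming, as described earlier in the thread, that each script pointer is a 4-byte little-endian value equal to 08000000 plus the file offset of the first byte after FE FF, the pointer locations can be found by searching the ROM for those values. The Python sketch below does that and fills in the same block template; the function names are invented and the demo uses a tiny synthetic ROM.

```python
import re
import struct

ROM_BASE = 0x08000000  # GBA address at which the cartridge ROM is mapped

BLOCK = (
    "#BLOCK NAME:\t\tDialogue Block (POINTER_RELATIVE)\n"
    "#TYPE:\t\t\tNORMAL\n"
    "#METHOD:\t\tPOINTER_RELATIVE\n"
    "#POINTER ENDIAN:\tLITTLE\n"
    "#POINTER TABLE START:\t${start:08x}\n"
    "#POINTER TABLE STOP:\t${stop:08x}\n"
    "#POINTER SIZE:\t\t$04\n"
    "#POINTER SPACE:\t\t$00\n"
    "#ATLAS PTRS:\t\tYes\n"
    "#BASE POINTER:\t\t-$8000000\n"
    "#TABLE:\t\t\tlegendztable.tbl\n"
    "#COMMENTS:\t\tYes\n"
    "#END BLOCK\n\n"
)


def addresses_after(rom: bytes, marker: bytes = b"\xfe\xff"):
    """File offsets of the byte following each occurrence of marker."""
    return [m.end() for m in re.finditer(re.escape(marker), rom)
            if m.end() < len(rom)]


def pointer_locations(rom: bytes, target_offset: int):
    """File offsets at which a pointer to target_offset is stored."""
    needle = struct.pack("<I", ROM_BASE + target_offset)
    return [m.start() for m in re.finditer(re.escape(needle), rom)]


def cartographer_block(ptr_offset: int) -> str:
    return BLOCK.format(start=ptr_offset, stop=ptr_offset + 4)


# Synthetic stand-in: FE FF at 0x20, text from 0x22, its pointer stored at 0x08.
rom = bytearray(0x40)
rom[0x20:0x22] = b"\xfe\xff"
struct.pack_into("<I", rom, 0x08, ROM_BASE + 0x22)
rom = bytes(rom)

starts = addresses_after(rom)
blocks = [cartographer_block(p) for s in starts for p in pointer_locations(rom, s)]
print("".join(blocks), end="")
```

Strings with no pointer found, or with several, would need a closer look by hand, since a matching byte pattern is not guaranteed to be a real pointer.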
Title: Re: Need help with getting POINTER_RELATIVE script dump.
Post by: SquiddyGoat on August 30, 2020, 08:17:09 pm
This is some really great help! If I can get this to work, then I'm sure my current problems will be either solved or close-to-solved.

One thing, though. I'm not exactly sure what a "Unix-like command line" is, so I googled it and found Windows Powershell. Is this the correct thing I'm supposed to be using? I've read that "xxd" doesn't exist in Powershell, so I'm wondering if I have the correct setup to process the commands you wrote.

In addition, this is my first time working with Unix commands. I see you've left placeholders in your code so I can fill in the blanks, but I just need some clarification on what I am supposed to be putting in.

Below is how my mind sees where things should go, in "[]":

xxd -c1 "[The ROM.gba, but are the quotations kept?]" | grep -Pzo "([Something goes here?]<=.{8}: fe  .\n.{8}: ff  .\n).{8}" > [Pointer Address]

perl -ne 'print sprintf("#BLOCK NAME:\t\tDialogue Block (POINTER_RELATIVE)\n#TYPE:\t\t\tNORMAL\n#METHOD:\t\tPOINTER_RELATIVE\n#POINTER ENDIAN:\tLITTLE\n#POINTER TABLE START:\t\$%s#POINTER TABLE STOP:\t\$%08x\n#POINTER SIZE:\t\t\$04\n#POINTER SPACE:\t\t\$00\n#ATLAS PTRS:\t\tYes\n#BASE POINTER:\t\t-\$8000000\n#TABLE:\t\t\tlegendztable.tbl\n#COMMENTS:\t\tYes\n#END BLOCK\n\n", $[Pointer Address], oct("0x".$[Pointer Address])+4);' < addrs

I hope I'm not asking for too much, but I need to know the above so I can start messing around with it on my own.

Also, here's what I tried using something that has grep but not xxd. It didn't work:
https://i.imgur.com/w45SomR.png
Title: Re: Need help with getting POINTER_RELATIVE script dump.
Post by: abw on August 31, 2020, 09:19:18 am
From what I hear, Windows PowerShell is a huge improvement over CMD and doesn't completely suck anymore, but in many ways it still hasn't caught up to where Unix was 40 years ago :P. If you want to use these exact commands, you can install something like Cygwin or MinGW to get something pretty close to Linux on Windows, or you can try finding equivalent PowerShell commands, or if you know any programming language, you can write your own code to fill in the missing pieces. To help you out, I'll explain a bit more about what the commands I used do.

xxd -c1 "rom.gba" | grep -Pzo "(?<=.{8}: fe  .\n.{8}: ff  .\n).{8}" | perl -ne 'print sprintf("#BLOCK NAME:\t\tDialogue Block (POINTER_RELATIVE)\n#TYPE:\t\t\tNORMAL\n#METHOD:\t\tPOINTER_RELATIVE\n#POINTER ENDIAN:\tLITTLE\n#POINTER TABLE START:\t\$%s#POINTER TABLE STOP:\t\$%08x\n#POINTER SIZE:\t\t\$04\n#POINTER SPACE:\t\t\$00\n#ATLAS PTRS:\t\tYes\n#BASE POINTER:\t\t-\$8000000\n#TABLE:\t\t\tlegendztable.tbl\n#COMMENTS:\t\tYes\n#END BLOCK\n\n", $_, oct("0x".$_)+4);'

xxd -c1 "rom.gba"
This opens the file "rom.gba" (which you should indeed replace with the actual name of your ROM file) and prints out all the bytes of that file in the format <file address in hexadecimal>: <byte in hexadecimal>  <corresponding ASCII character, with non-printable characters replaced by ".">, e.g.
Code:
00000000: 2e  .
00000001: 00  .
00000002: 00  .
00000003: ea  .
00000004: 24  $
00000005: ff  .
If you want to, you can redirect that output to a file named foo.txt with xxd -c1 "rom.gba" > foo.txt, but here we use that output as the input to the next command (a process referred to as piping) by sticking a | between the commands.

grep -Pzo "(?<=.{8}: fe  .\n.{8}: ff  .\n).{8}"
This searches through the input and outputs all the parts that match the regular expression "(?<=.{8}: fe  .\n.{8}: ff  .\n).{8}" (which you should not replace as long as you're looking for $FEFF, but you can change the fe and ff if you want to look for different bytes). -o means to output only the parts that match (the default is to output the entire line containing a match). By default, newline bytes ($0A) are line separators, but -z changes that to use null ($00) bytes as line separators instead; since our input doesn't contain any null bytes, that effectively turns the input into one giant line (which we need to do in order to match the newlines which would otherwise have broken the input up into separate sections). -P means to use Perl-compatible regular expressions (you'll see that abbreviated as PCRE sometimes) which we're using for their extra power vs. POSIX basic regular expressions, in particular the positive lookbehind assertion (?<=.{8}: fe  .\n.{8}: ff  .\n), which looks through xxd's output format for a $FE byte on one line followed by a $FF byte on the next line; the parts of the input that get matched in the assertion do not get included in the match output from the full regular expression, so only the .{8} is output, which looks like
Code:
00000126
00000e8c
0000163c
00001680
0000168a
As before, you can redirect that output to a file if you want or pipe it to the next command.

perl -ne 'print sprintf("#BLOCK NAME:\t\tDialogue Block (POINTER_RELATIVE)\n#TYPE:\t\t\tNORMAL\n#METHOD:\t\tPOINTER_RELATIVE\n#POINTER ENDIAN:\tLITTLE\n#POINTER TABLE START:\t\$%s#POINTER TABLE STOP:\t\$%08x\n#POINTER SIZE:\t\t\$04\n#POINTER SPACE:\t\t\$00\n#ATLAS PTRS:\t\tYes\n#BASE POINTER:\t\t-\$8000000\n#TABLE:\t\t\tlegendztable.tbl\n#COMMENTS:\t\tYes\n#END BLOCK\n\n", $_, oct("0x".$_)+4);'
This takes each line of the input and uses the line and the line + 4 to fill in the parts I've highlighted with <> in the template below:
Code:
#BLOCK NAME: Dialogue Block (POINTER_RELATIVE)
#TYPE: NORMAL
#METHOD: POINTER_RELATIVE
#POINTER ENDIAN: LITTLE
#POINTER TABLE START: $<%s>
#POINTER TABLE STOP: $<%08x>
#POINTER SIZE: $04
#POINTER SPACE: $00
#ATLAS PTRS: Yes
#BASE POINTER: -$8000000
#TABLE: legendztable.tbl
#COMMENTS: Yes
#END BLOCK
where %s is a copy of the input line and %08x is the input line + 4 formatted as an 8-digit hexadecimal number with leading zeroes (just to make it look consistent with the 8-digit numbers from %s), giving you output like I posted previously.
Title: Re: Need help with getting POINTER_RELATIVE script dump.
Post by: SquiddyGoat on August 31, 2020, 03:46:43 pm
Once again, the help is much appreciated. I chose to go with MinGW and after setting everything up, I can now use commands like xxd to do stuff. I was able to get a full hexdump of the game using xxd -c1 Legendz.gba. Also, I figured out that "> addrs" tells the program to write its output to a file, and can be used like "> Legendzblocks3.txt".

However, I seemingly ran into a wall with the -Pzo command as shown here:
https://i.imgur.com/PvBWgAN.png

I'm not exactly sure why this is happening, but it's telling me that the -P part of -Pzo "isn't compiled into this" and won't output anything. Earlier, I fixed an error with the "perl" command by actually installing the thing called Perl, but the solution to this error is a mystery.
Title: Re: Need help with getting POINTER_RELATIVE script dump.
Post by: abw on August 31, 2020, 06:45:30 pm
However, I seemingly ran into a wall with the -Pzo command as shown here:
Well, that sounds like a weird thing for MinGW to have disabled :(. On the bright side, since you've got perl, you can achieve the same output as grep using perl instead by replacing
Code: [Select]
grep -Pzo "(?<=.{8}: fe  .\n.{8}: ff  .\n).{8}"
with
Code: [Select]
perl -e 'local $/; my $str = <>; my @m = $str =~ /(?<=.{8}: fe  .\n.{8}: ff  .\n).{8}/g; print join("\n", @m)."\n";'
or combine both this and the next perl calls into one if you don't care about examining the intermediate results:
Code: [Select]
perl -e 'local $/; my $str = <>; my @m = $str =~ /(?<=.{8}: fe  .\n.{8}: ff  .\n).{8}/g; print join("\n", map {sprintf("#BLOCK NAME:\t\tDialogue Block (POINTER_RELATIVE)\n#TYPE:\t\t\tNORMAL\n#METHOD:\t\tPOINTER_RELATIVE\n#POINTER ENDIAN:\tLITTLE\n#POINTER TABLE START:\t\$%s\n#POINTER TABLE STOP:\t\$%08x\n#POINTER SIZE:\t\t\$04\n#POINTER SPACE:\t\t\$00\n#ATLAS PTRS:\t\tYes\n#BASE POINTER:\t\t-\$8000000\n#TABLE:\t\t\tlegendztable.tbl\n#COMMENTS:\t\tYes\n#END BLOCK\n\n", $_, oct("0x".$_)+4)} @m)."\n";'
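If neither grep -P nor perl is available, the same lookbehind match can be sketched in Python, whose `re` module accepts this fixed-width lookbehind. This assumes the input is the text produced by `xxd -c1` (one "address: byte  char" line per byte); the function name `addresses_after_feff` is made up for this example.

```python
import re

# Same pattern as the grep/perl versions: capture the 8-character address on
# the line that directly follows an "fe" line and an "ff" line.
PATTERN = re.compile(r"(?<=.{8}: fe  .\n.{8}: ff  .\n).{8}")


def addresses_after_feff(dump_text: str) -> list[str]:
    """Return the addresses (as hex strings) of bytes that follow an FE FF pair."""
    return PATTERN.findall(dump_text)


if __name__ == "__main__":
    import sys
    # Usage sketch: xxd -c1 Legendz.gba | python this_script.py > addrs
    for addr in addresses_after_feff(sys.stdin.read()):
        print(addr)
```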
Title: Re: Need help with getting POINTER_RELATIVE script dump.
Post by: SquiddyGoat on September 01, 2020, 12:51:32 am
I'm almost there, for sure. Just one small problem occurred once I ran the perl -e code.

Below is exactly how I'm putting it into MinGW32:

Code: [Select]
xxd -c1 "Legendz.gba" | perl -e 'local $/; my $str = <>; my @m = $str =~ /(?<=.{8}: fe  .\n.{8}: ff  .\n).{8}/g; print join("\n", map {sprintf("#BLOCK NAME:\t\tDialogue Block (POINTER_RELATIVE)\n#TYPE:\t\t\tNORMAL\n#METHOD:\t\tPOINTER_RELATIVE\n#POINTER ENDIAN:\tLITTLE\n#POINTER TABLE START:\t\$%s\n#POINTER TABLE STOP:\t\$%08x\n#POINTER SIZE:\t\t\$04\n#POINTER SPACE:\t\t\$00\n#ATLAS PTRS:\t\tYes\n#BASE POINTER:\t\t-\$8000000\n#TABLE:\t\t\tlegendztable.tbl\n#COMMENTS:\t\tYes\n#END BLOCK\n\n", $_, oct("0x".$_)+4)} @m)."\n";' > Textfile.txt
Now that the problematic -P is out of the way, the instructions are accepted and the program begins processing it. However, the only thing that comes out is an "out of memory during "large" request" error. Is this because my 8GB of RAM isn't enough, or did I leave something blank in the code?
https://i.imgur.com/7MAT1X2.png

Code: [Select]
xxd -c1 -l 0x184B50 "Legendz.gba" | perl -e 'local $/; my $str = <>; my @m = $str =~ /(?<=.{8}: fe  .\n.{8}: ff  .\n).{8}/g; print join("\n", map {sprintf("#BLOCK NAME:\t\tDialogue Block (POINTER_RELATIVE)\n#TYPE:\t\t\tNORMAL\n#METHOD:\t\tPOINTER_RELATIVE\n#POINTER ENDIAN:\tLITTLE\n#POINTER TABLE START:\t\$%s\n#POINTER TABLE STOP:\t\$%08x\n#POINTER SIZE:\t\t\$04\n#POINTER SPACE:\t\t\$00\n#ATLAS PTRS:\t\tYes\n#BASE POINTER:\t\t-\$8000000\n#TABLE:\t\t\tlegendztable.tbl\n#COMMENTS:\t\tYes\n#END BLOCK\n\n", $_, oct("0x".$_)+4)} @m)."\n";' > Textfile.txt
I found an option, -l, that supposedly lets me stop the program's search at a certain address, which I thought would solve my "out of memory" issue by giving the computer less work. Well, while the "out of memory" error did not occur, a file was output with nothing in it.

I tried making multiple changes to the code to get it to work, but I think I've hit another standstill...
Title: Re: Need help with getting POINTER_RELATIVE script dump.
Post by: abw on September 01, 2020, 06:34:31 pm
That error message makes it look like barely over 128 MB of memory was allocated, so your 8 GB should be able to handle that easily. It kind of looks like something is messed up with your MinGW install :(.

Buuut, it sort of doesn't matter anyway since I've been an idiot here. Getting a list of possible string start addresses is nice, but for extraction and insertion it's the addresses of the pointers to those strings that we need, not the addresses of the strings.

So I put together a little script that scans through its input looking for 0xFEFF byte sequences (possible string end tokens), keeps track of the addresses of the following byte (possible string starts), then scans through its input again looking for possible pointers to those strings and outputs a Cartographer block for every match it finds. It's based on a lot of assumptions that might not hold true, probably generates a lot of false positives (especially if 0xFEFF is used for things other than strings), and whenever you have a group of strings you'll have to find the pointer for the first string yourself (since we're detecting pointers based on the string end token, but the first string in a group probably won't be preceded by 0xFEFF), so it's far from perfect, but it does find that $251CC pointer and many more matches in that general area, so it looks like at least one thing is going right.

Here (https://drive.google.com/file/d/1PcMzydUPgwVyJpNuRfZX8CDj7wqlRKa-/view?usp=sharing) is a link to the script in case you want to try running it yourself or want to modify it, and here (https://drive.google.com/file/d/1u8EItXqRrIpuxM2T8mdY_kZXe2T8TK2R/view?usp=sharing) is a link to the generated blocks. Hopefully that will help a bit more!
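For readers who can't open the links, the two-pass idea described above can be sketched roughly as follows in Python. This is not the linked script, just an illustration under the same assumptions: strings end with FE FF, pointers are 4-byte little-endian values, and a pointer value is the ROM offset plus $08000000. The function names are made up for this example.

```python
import struct

END_TOKEN = b"\xfe\xff"    # assumed string end token
GBA_ROM_BASE = 0x08000000  # assumed difference between pointer values and ROM offsets


def find_string_starts(rom: bytes, scan_start: int = 0, scan_end=None) -> set:
    """Pass 1: collect the offset of every byte that directly follows FE FF."""
    if scan_end is None:
        scan_end = len(rom)
    starts = set()
    pos = rom.find(END_TOKEN, scan_start, scan_end)
    while pos != -1:
        nxt = pos + len(END_TOKEN)
        if nxt < len(rom):
            starts.add(nxt)
        pos = rom.find(END_TOKEN, pos + 1, scan_end)
    return starts


def find_pointers(rom: bytes, string_starts: set) -> list:
    """Pass 2: find every offset holding a 32-bit value that points at a string start.

    Every byte offset is tried because the pointers seen in this thread
    (e.g. $25152) are not 4-byte aligned.
    """
    hits = []
    for off in range(len(rom) - 3):
        (value,) = struct.unpack_from("<I", rom, off)
        target = value - GBA_ROM_BASE
        if target in string_starts:
            hits.append((off, target))
    return hits


if __name__ == "__main__":
    import sys
    with open(sys.argv[1], "rb") as f:
        data = f.read()
    for ptr_off, str_off in find_pointers(data, find_string_starts(data)):
        print(f"pointer @ ${ptr_off:X} -> string @ ${str_off:X}")
```

Like the real script, this will report false positives wherever FE FF appears outside of text, and it cannot find the pointer to the first string in a group.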
Title: Re: Need help with getting POINTER_RELATIVE script dump.
Post by: SquiddyGoat on September 02, 2020, 12:23:10 am
This is all incredibly helpful! Perfect or not, it's better than the nothing that I had before.

Now Cartographer is actually doing something! It's extracting data from the game based on the pointers, except they are all huge and can't be opened without crashing the text editor, even when I just copied over the 251CC block for testing purposes. If Cartographer is taking the pointer and tracing it back to where that pointer points to, then it should only print until the FEFF signals a text break because the next pointer points to what comes after that FEFF. I wonder why it's dumping so many bytes? I'll check my table to see if anything's wrong there.

Also, the #BASE POINTER is dumping something, but is -$8000000 correct? I've read that the base pointer is supposed to be what's added/subtracted to the pointer to get the string address, although I'm not saying you're wrong.

Still, all of this is progress!

By the way, have you been looking through the game too? Have a copy of my table file and a map of where all the interesting stuff is. It might help you get a good idea of how the text is arranged and other things:

Address Map
(This is what I use, so everything else besides the script is listed here too.)
https://drive.google.com/file/d/1A62HEVUD3nr8yUqEW-gAcPXpnnDS91M3/view

Table
(The name is a half-lie. It has hira/kana but no Kanji.)
https://drive.google.com/file/d/1-zLpT1EOuD1dj5wimN5Q8dmgAAi5JkDU/view

As a whole, the game's entire story script exists as one huge block, starting at 1619DC and ending at 184B50. For whatever reason, the "Main" script contains a majority of the first few cutscenes and other unrelated dialogue while the NPC script contains both NPC dialogue and data for various minor cutscenes.

0 = NPC Script
1 = "Main" Script

0000000000
0000000000
0000000000
0000000000
0000000000
0000000000
0000000000
0000000000
0000000000
0000000000
0000000000
0000000000
1111111111
1111111111
1111111111
1111111111
1111111111
1111111111

Also, I guess I'll try to re-install MinGW to run and experiment with your script. I'd bet there was a setting that I looked over or something.
Title: Re: Need help with getting POINTER_RELATIVE script dump.
Post by: abw on September 02, 2020, 08:04:11 pm
I'm not able to access either of those links - you'll have to tell Google to make them available to other people. Based on your description, though, my first guess would be that your table file doesn't indicate that FEFF is an end token and Cartographer winds up dumping everything from the string's start address through to the end of the ROM file. If that's the case, you'll just need to put a "/" in front of the FEFF, e.g. "/FEFF=[end]" instead of just "FEFF=[end]".
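To illustrate why a missing end token produces huge dumps, here is a small Python sketch of the idea. It is not how Cartographer is implemented, and `extract_string` is a made-up name. Without a known end token there is nothing to stop at, so everything up to the end of the data comes out.

```python
def extract_string(rom: bytes, start: int, end_token=None) -> bytes:
    """Return the bytes of one string starting at `start`.

    With no end token defined, nothing tells us where to stop, so the rest
    of the ROM is returned. With an end token, the string stops right after
    the first occurrence of that token.
    Note: this is a plain byte search; a real dumper walks the table entries,
    so it would not match an FE FF that straddles two characters.
    """
    if end_token is None:
        return rom[start:]
    pos = rom.find(end_token, start)
    if pos == -1:
        return rom[start:]
    return rom[start:pos + len(end_token)]


if __name__ == "__main__":
    rom = b"\x41\x00\xfe\xff\x42\x00\xfe\xff\x99"
    print(len(extract_string(rom, 0)))               # whole rest of the data
    print(len(extract_string(rom, 0, b"\xfe\xff")))  # just the first string
```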

Also, the #BASE POINTER is dumping something, but is -$8000000 correct? I've read that the base pointer is supposed to be what's added/subtracted to the pointer to get the string address, although I'm not saying you're wrong.
As far as I know, yes, but keep in mind that I don't actually know what the strings are supposed to look like, so I can't really say for sure. If you have a pointer whose RAM value is $08180F1C and where it points to comes from ROM 0x180F1C, then the difference is -$8000000. That's one of the things that might need to be adjusted; e.g., if the pointer's destination actually comes from ROM 0x1180F1C, then the difference would be -$7000000 instead and that's the value we would want to use for #BASE POINTER.
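The arithmetic can be checked with a few lines of Python. This sketch assumes #BASE POINTER is the value that gets added to the pointer value to reach the ROM offset, which matches the description quoted above; the function names are made up for this example.

```python
def rom_offset(pointer_value: int, base_pointer: int = -0x8000000) -> int:
    """ROM offset a pointer refers to, assuming offset = pointer + base pointer."""
    return pointer_value + base_pointer


def base_pointer_for(pointer_value: int, rom_off: int) -> int:
    """Base pointer needed so that pointer_value maps to rom_off."""
    return rom_off - pointer_value


if __name__ == "__main__":
    print(hex(rom_offset(0x08180F1C)))                   # where $08180F1C lands in the ROM
    print(hex(base_pointer_for(0x08180F1C, 0x180F1C)))   # base pointer for the usual GBA mapping
    print(hex(base_pointer_for(0x08180F1C, 0x1180F1C)))  # base pointer for the alternative case
```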

By the way, have you been looking through the game too?
Not really, no, I'm mostly just playing along for fun, but more information is definitely helpful ;).
Title: Re: Need help with getting POINTER_RELATIVE script dump.
Post by: SquiddyGoat on September 02, 2020, 08:36:01 pm
Ah, silly me. I updated the permissions, so even if you don't have the game you can still look at my list of handy addresses.

I completely forgot about having to put /FEFF=[end] in my Cartographer table, so I'll see how that changes things. Also, here's a cropped screencap of how the decoded text strings look: https://i.imgur.com/wX1v52x.png

UPDATE:

So, it turns out that adding that command fixed the dumping issue! Now, I have several files containing just one string of clear text that I can merge into one file using a simple cmd command. To test the insertion process, I edited the first string and told Atlas to begin inserting. A sample from the file I inserted looks like this, with "_" representing $00:

Code: [Select]
//BLOCK #320 NAME: Block #359

//POINTER #0 @ $25152 - STRING #0 @ $17FAD8

#W32($25152)
//し_ん_ぱ_い_な_ん_で_し_ょ_?___だ_い_じ_ょ_う_ぶ_[DWN]あ_の_コ_は___し_っ_か_り_し_て_る_も_の_[end]//GAME NAME:

//BLOCK #321 NAME: Block #360

//POINTER #0 @ $2515A - STRING #0 @ $17FB1A

#W32($2515A)
//そ_う_だ_な_…___[DWN]ボ_ク_た_ち_の___こ_ど_も_だ_か_ら_な_!_[end]//GAME NAME:

//BLOCK #322 NAME: Block #361

//POINTER #0 @ $251CC - STRING #0 @ $180F1C

#W32($251CC)
//T_h_e__o_c_e_a_n__i_s____[DWN]i_n_c_r_e_d_i_b_l_e_!_____[end]//GAME NAME:

However, when the minor edit was inserted and I booted up the game, all the lines of dialogue were replaced with glitched blocks.

When looking at it in a hex editor, the dialogue strings were completely unchanged but the $251CC pointer and the pointers below it were replaced with "00000000". I couldn't find the inserted text anywhere, either.

I'm also trying to see if there's a way for legendz_block_maker to only scan a certain area of interest from beginning to end for each single byte that comes after FEFF, instead of the entire game.
Title: Re: Need help with getting POINTER_RELATIVE script dump.
Post by: abw on September 03, 2020, 09:57:34 pm
Ah, silly me. I updated the permissions, so even if you don't have the game you can still look at my list of handy addresses.
Yup, those links work for me now - thanks!

As a general note, unless you're very sure you fully understand how the game's text encoding works, you'll want to be careful with table entries like "00=" that discard parts of the ROM, especially if you plan on re-inserting data; each game's code can do basically whatever it wants to do, and if it decides to do something wacky like "0001=[switch to super awesome happy music!!!]", then that "00=" entry would mean you'd only see "01=あ" instead. I'm not saying this game in particular does that, just that it's a possibility you should be aware of, and it might be safer to have entries like "00=[why is there a 00 here?]" and "0100=あ" instead to draw attention to any unexpected 00s you might come across so you can investigate them further.
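Here is a small Python sketch of that pitfall. The decoder is a simplified longest-match table lookup, not Cartographer's actual code, and the byte values are the hypothetical ones from the paragraph above rather than real entries from this game's table.

```python
def decode(data: bytes, table: dict) -> str:
    """Decode bytes with a table, preferring the longest matching entry."""
    max_len = max(len(key) for key in table)
    out = []
    pos = 0
    while pos < len(data):
        for n in range(max_len, 0, -1):
            chunk = data[pos:pos + n]
            if chunk in table:
                out.append(table[chunk])
                pos += len(chunk)
                break
        else:
            # No entry matched: show the raw byte so it can't go unnoticed.
            out.append("[$%02X]" % data[pos])
            pos += 1
    return "".join(out)


# "00=" silently discards every 00 byte, wherever it appears.
lossy_table = {b"\x00": "", b"\x01": "あ"}

# "0100=あ" treats 00 as part of the character; a stray 00 gets flagged.
safer_table = {b"\x00": "[why is there a 00 here?]", b"\x01\x00": "あ"}

if __name__ == "__main__":
    data = b"\x01\x00\x00\x01\x00"  # あ, an unexpected 00, then あ
    print(decode(data, lossy_table))
    print(decode(data, safer_table))
```

With the lossy table the unexpected 00 vanishes from the dump, and re-inserting that dump would write different bytes than the ROM originally had.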

UPDATE: [...]
Unfortunately, Cartographer's output doesn't include everything you'll need in order to properly re-insert the extracted text. As you've seen, what you've got right now is a script that just writes a bunch of 0 pointers to the various addresses. The reason it doesn't insert any text is because all of the text is commented out (that's what the // at the start of a line does), and the reason the pointers are all 0 is because Atlas sort of defaults to inserting at the start of the ROM file. You'll need to uncomment the text to make Atlas pay attention to it, tell Atlas about your table file so it knows how to convert the text to binary, tell Atlas where you want to start inserting the text (telling it where to stop inserting is also a good idea to prevent accidentally overwriting other data), and tell it how to calculate the pointer values. Try something like this, replacing the $17FB19 with the maximum address you're comfortable overwriting:
Code: [Select]
// Define required TABLE variables and load the corresponding tables
#VAR(scriptTbl, TABLE)
#ADDTBL("legendztable.tbl", scriptTbl)
#ACTIVETBL(scriptTbl) // Activate this block's starting TABLE

#JMP($17FAD8, $17FB19) // Jump to insertion point
#HDR($-8000000) // Difference between ROM and RAM addresses for pointer value calculations

//BLOCK #320 NAME: Block #359

//POINTER #0 @ $25152 - STRING #0 @ $17FAD8

#W32($25152)
し_ん_ぱ_い_な_ん_で_し_ょ_?___だ_い_じ_ょ_う_ぶ_[DWN]あ_の_コ_は___し_っ_か_り_し_て_る_も_の_[end]

//BLOCK #321 NAME: Block #360

//POINTER #0 @ $2515A - STRING #0 @ $17FB1A

#W32($2515A)
そ_う_だ_な_…___[DWN]ボ_ク_た_ち_の___こ_ど_も_だ_か_ら_な_!_[end]

// etc., etc. for the rest of the strings
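With several hundred blocks, uncommenting by hand gets tedious, so here is a rough Python sketch that does just that step. It assumes the merged dump looks exactly like the sample posted earlier: text lines start with //, while //BLOCK, //POINTER and //GAME NAME lines are real comments, and a stray //GAME NAME: may be glued to the end of a text line. The function name `uncomment_text` is made up, and the header lines (#VAR, #ADDTBL, #ACTIVETBL, #JMP, #HDR) still have to be added by hand.

```python
KEEP_COMMENTED = ("//BLOCK", "//POINTER", "//GAME NAME")
STRAY_SUFFIX = "//GAME NAME:"


def uncomment_text(dump: str) -> str:
    """Strip the leading // from text lines of a merged Cartographer dump."""
    out = []
    for line in dump.splitlines():
        # Remove a "//GAME NAME:" that was glued onto the end of a text line
        # when the individual dump files were merged.
        if line.endswith(STRAY_SUFFIX) and not line.startswith("//GAME NAME"):
            line = line[:-len(STRAY_SUFFIX)]
        # Uncomment everything except the informational comment lines.
        if line.startswith("//") and not line.startswith(KEEP_COMMENTED):
            line = line[2:]
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    import sys
    # Usage sketch: python this_script.py < merged_dump.txt > atlas_body.txt
    # Depending on the console, the input encoding may need to be set explicitly.
    sys.stdout.write(uncomment_text(sys.stdin.read()))
```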

I'm also trying to see if there's a way for legendz_block_maker to only scan a certain area of interest from beginning to end for each single byte that comes after FEFF, instead of the entire game.
Yup, that sounds like a perfectly reasonable thing to want to do after you've weeded out a bunch of false positives :). As a quick hack, changing the first
Code: [Select]
while (my $byte = <STDIN>) {
line to say
Code: [Select]
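my $startAddr = 0x1619DC; # first address to scan for string end tokens
my $stopAddr = 0x184B50; # last address to scan for string end tokens

# seek() only works if STDIN is redirected from a file ("< rom.gba"), not a pipe
seek(STDIN, $startAddr, 0);
while (my $byte = <STDIN>) {
last if (tell(STDIN) > $stopAddr); # stop once we've read past the range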
instead ought to do the trick; just replace 0x1619DC and 0x184B50 with the endpoints of the address range you want to scan for string end tokens. Leave the second while() loop alone unless you also want to restrict the address range used for searching for pointers to the strings that were found in the first loop.
Title: Re: Need help with getting POINTER_RELATIVE script dump.
Post by: SquiddyGoat on September 03, 2020, 11:15:42 pm
Thanks for the tips and other life-saving information! I'll update my tables later.

The game actually does do something weird with entries like "0001". The game's Kanji begins at F2 but at FF the bytes roll back to "00". So, it seems like the game begins adding "01" to each byte following "00" to tell the game to print a Kanji symbol instead of the usual letters.

This isn't a big deal for the main dialogue, which only uses Hiragana and Katakana. It's only used for the game's in-battle messages, and there are so few of those that I'll just manually re-point any of the short ones to the huge spot of free real estate somewhere in 00400000-something.

Also, I have just one question before I get started with legendz_block_maker.pl, which has to do with getting it to perform its only purpose. I created a brand-new folder for the script and placed both the .pl file and the game into it. Then, I did all that was needed to be done to access it (change directory, etc.) and I typed in the command to run the script.

When I run legendz_block_maker.pl through this method, the line skips to the next line and nothing happens. I'm not sure if this is actually some kind of prompt to load or otherwise access the game for scanning, but I'm not sure what to do after this part:

(PC name) /c/(directory of the script)
$ perl legendz_block_maker.pl
_

(The image-hosting site I use is currently down, so I can't include a picture of the actual lines.)
Title: Re: Need help with getting POINTER_RELATIVE script dump.
Post by: abw on September 05, 2020, 03:01:29 pm
Thanks for the tips and other life-saving information! I'll update my tables later.
You're welcome!

When I run legendz_block_maker.pl through this method, the line skips to the next line and nothing happens. I'm not sure if this is actually some kind of prompt to load or otherwise access the game for scanning, but I'm not sure what to do after this part:

(PC name) /c/(directory of the script)
$ perl legendz_block_maker.pl
_
By default the script checks for input on standard input (STDIN), so if you call it without providing any input, it won't do anything useful. I've actually updated it (same link as before) to add support for specifying a filename and the start and end addresses for the string and pointer scans and to spit out a little bit of information about the process while it runs.
Code: [Select]
> perl legendz_block_maker.pl --help
scans the input for 0xFEFF byte sequences (indicating the end of a string) and attempts to find pointers to the bytes following those byte sequences (start of the next string)

usage: legendz_block_maker.pl [options]
where options include:
-sss, --string-scan-start=<address> start address for scanning for string end tokens
-sse, --string-scan-end=<address> end address for scanning for string end tokens
-pss, --pointer-scan-start=<address> start address for scanning for pointers to strings
-pse, --pointer-scan-end=<address> end address for scanning for pointers to strings
-fn, --filename filename to scan; to scan STDIN, either set this option to "-" or leave this option unset
-h, --help display this help message and exit
so you can call it like "perl legendz_block_maker.pl < rom.gba > blocks.txt" to get a full scan or e.g. "perl legendz_block_maker.pl -sss=0x1619DC -sse=0x184B50 -pss=0x12345 -pse=0x54321 < rom.gba > blocks.txt" to scan for strings between 0x1619DC and 0x184B50 and pointers to those strings between 0x12345 and 0x54321.