
Topic: Need help with getting POINTER_RELATIVE script dump.

SquiddyGoat

  • Jr. Member
  • **
  • Posts: 15
Need help with getting POINTER_RELATIVE script dump.
« on: August 09, 2020, 04:48:27 am »
Background: I'm working with a Japanese-literate friend to translate Legendz: Yomigaeru Shiren no Shima into English using the English letters that are already present in the ROM.

In Cartographer, my POINTER_RELATIVE dumps come out as completely empty 0 KB files; only the RAW dump produces anything, and it is filled with ($00). Below is a copy of my command file:

#GAME NAME:      Legendz: Island of Ordeal (GBA)

#BLOCK NAME:      Dialogue Block (RAW)
#TYPE:         NORMAL
#METHOD:      RAW
#SCRIPT START:      $160BF0
#SCRIPT STOP:      $189700
#TABLE:         legendztable.tbl
#COMMENTS:      Yes      
#END BLOCK            

#BLOCK NAME:      Dialogue Block (POINTER_RELATIVE)
#TYPE:         NORMAL
#METHOD:      POINTER_RELATIVE
#POINTER ENDIAN:   LITTLE
#POINTER TABLE START:   $08B0927F
#POINTER TABLE STOP:   $08F0FF7F
#POINTER SIZE:      $04
#POINTER SPACE:      $00
#ATLAS PTRS:      Yes
#BASE POINTER:      $080C8669
#TABLE:         legendztable.tbl
#COMMENTS:      Yes
#END BLOCK

I based the SCRIPT START/STOP areas on where the legible Japanese words seem to start and end. I have no clue if I need to be more exact.

There is a large section of 00's and FF's at the bottom of the ROM, which I currently believe is the pointer section (Correct me if I'm wrong.). This is what the POINTER TABLE START/STOP areas are based on.

One thing to note is that the legible game script is far above the huge pointer table, which lies at the bottom of the ROM. I don't understand anything about the base pointer, either, so that could be acting up too. Any idea what could be going wrong?

FAST6191

  • Hero Member
  • *****
  • Posts: 2896
Re: Need help with getting POINTER_RELATIVE script dump.
« Reply #1 on: August 09, 2020, 03:13:36 pm »
Might want to note it is for the GBA.

Also, have you got a means to bypass the hardware add-ons this franchise uses? For others playing along: these games are basically Spyro Skylanders or those Disney statue things, but circa 2004.

"There is a large section of 00's and FF's at the bottom of the ROM, which I currently believe is the pointer section (Correct me if I'm wrong.)."
You are probably wrong. Most ROM dumpers for the GBA just dump up to the next power-of-2 size (4, 8, 16 or 32 megabytes), and any gap between where the program stops and that size is padded with 00 or FF just to make up the space (this is what ROM trimming removes).
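One rough way to check whether the tail of a dump is only padding is to measure how many trailing bytes are 00 or FF. The following is a minimal sketch; the function name and the sample bytes are made up for illustration, and it assumes the padding is nothing but those two values:

```python
def trailing_padding(data: bytes):
    """Return (offset where the padding starts, length of the padding).

    Padding here means a trailing run made up only of 0x00 and 0xFF
    bytes. That is an assumption about how the dumper filled the file,
    not something guaranteed for every ROM.
    """
    stripped = data.rstrip(b"\x00\xff")
    return len(stripped), len(data) - len(stripped)


# Tiny synthetic example: 6 bytes of "program" followed by 10 bytes of FF.
sample = bytes([0x2E, 0x00, 0x00, 0xEA, 0x24, 0x01]) + b"\xff" * 10
start, length = trailing_padding(sample)
print(hex(start), length)
```

If the reported padding covers most of the suspected "pointer section", that region is probably just filler.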

The start and stop of the legible text are not necessarily the start and stop of the script data -- many games will have formatting and other things before the text, or placeholders ("it costs ? a night to stay here" sort of thing, where ? is filled in later, or maybe a character name if you can customise it). Not to mention that the pointer settings in those tools are asking for where the game keeps the pointers, not where the pointers point to.

Anyway this is the GBA so you probably want to play to that.

For 99.99% of games* the whole GBA ROM is visible in memory all at once.
http://problemkaputt.de/gbatek.htm#gbamemorymap
In the vast majority of cases this means the game starts at 08000000 in memory and can run up to 09FFFFFF; however, as most games are 16 megabytes or less, it will usually sit between 08000000 and 08FFFFFF. There are areas later in the memory map that the console will wait a while to fetch data from (presumably so as not to interrupt other goings on in the game for data it does not care about right away, though there are other reasons), but most things won't use them. Likewise, you are free and clear as a hacker to use the 09?????? region (a full 16 megabytes) without restriction -- only some older flash carts will be troubled by this, and most of those won't mind too much either.
This is so much so that I will usually encourage people to search for 08 and see the results. If you end up with a bunch of them, mostly equally spaced in a row, those are likely pointers. They could point to anything in the ROM (graphics, music, levels, the code itself... all use pointers too) but it is usually something worth looking at, even if only to eliminate it or add to your notes on the game -- you might only be interested in text, but others might be interested in other things. The remaining 6 hex digits next to the 08 are the location in the ROM, as if you opened it in a hex editor and pressed goto, albeit stored in reverse byte order because of endianness. If the rest of the pointer decodes to an area in or around the text, which you apparently already know the location of, then it is probably worth a deeper look.
Find the pointers and then you can dump them, though I don't normally use Atlas and Cartographer for this sort of thing (they tend not to play as nicely with the GBA as I might like). Personally I translate the thing, and between strings there is usually some indicator (or you might add one and do some maths after the fact). When it is all translated, you dump it again with your new text, look at where the indicators now sit and generate the pointers accordingly.
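The "search for 08" idea can be automated. The sketch below reads every 4-byte-aligned little-endian value, subtracts the 08000000 base, and keeps the ones that land inside a range of interest. The function name, the alignment step and the synthetic data are assumptions for illustration; a real game may store pointers unaligned or in another form entirely.

```python
import struct

ROM_BASE = 0x08000000  # where the GBA maps the start of the cartridge ROM


def find_pointer_candidates(rom: bytes, lo: int, hi: int, step: int = 4):
    """Yield (file_offset, target_offset) for each 4-byte little-endian
    value that, after subtracting ROM_BASE, falls inside [lo, hi).

    step=4 assumes pointers sit on 4-byte boundaries; use step=1 to
    check every byte position instead.
    """
    for off in range(0, len(rom) - 3, step):
        (value,) = struct.unpack_from("<I", rom, off)
        target = value - ROM_BASE
        if lo <= target < hi:
            yield off, target


# Synthetic 32-byte "ROM": one pointer to offset 0x10, stored at offset 4.
rom = bytearray(32)
rom[4:8] = struct.pack("<I", ROM_BASE + 0x10)
hits = list(find_pointer_candidates(bytes(rom), lo=0x10, hi=0x20))
print([(hex(o), hex(t)) for o, t in hits])
```

For a real ROM, `lo` and `hi` would be the rough start and end of the text area, read from the file with `open(path, "rb").read()`.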

*For those following along who want the exceptions: flash carts, technically PogoShell, and among commercial games see https://mgba.io/2015/10/20/dumping-the-undumped/

Anyway time to find at least a few pointers manually and then you can look at persuading those tools to do something, making your own (a spreadsheet might well do) or maybe going to play with kruptar and co.

SquiddyGoat

Re: Need help with getting POINTER_RELATIVE script dump.
« Reply #2 on: August 09, 2020, 04:38:22 pm »
Following your advice on finding the real pointers, I came across a block of hex values with "08" going down in a straight line. Here's a cheaply-made example:

(hex)08(hex)F1(hex)08(hex)
(hex)08(hex)F1(hex)08(hex)
(hex)08(hex)F2(hex)08(hex)
(hex)08(hex)F2(hex)08(hex)
(hex)08(hex)F3(hex)08(hex)
(hex)08(hex)F3(hex)08(hex)
(hex)08(hex)F4(hex)08(hex)
(hex)08(hex)F4(hex)08(hex)

Thing is, "08" pointer tables of all sizes can be found all over the place, especially in the beginning of the ROM. How will I know which one deals with the game script?

One thing I'm confused about is how I'm supposed to use these pointers once dumped. If using Atlas and Cartographer isn't a good idea, could they still help me extract and re-insert text easily (My current goal.) through other means? You suggested Kruptar, which I had installed a while ago, but I'm pretty new so I couldn't figure out how to load the dumped script and my table file.

Also, in the ROM the Japanese text is spaced out with periods between letters. (Using the few English words as an example, ...I.N.F.O.は... shows up at one point along with .R.P.G...) with the periods corresponding to values like 00, 000000, 00FDFF and 00FEFF. I'm not entirely sure why this happens, but it makes the raw dumps have a million ($00)'s and ($FF)'s in between each letter.

One more thing, I do own the necessary adapter and other accessories needed to test it on real hardware.
« Last Edit: August 09, 2020, 11:48:29 pm by SquiddyGoat »

FAST6191

Re: Need help with getting POINTER_RELATIVE script dump.
« Reply #3 on: August 11, 2020, 08:32:44 am »
The adapter bit was more for those without such toys -- flash carts and emulators sometimes include a real time clock and occasionally a rumble pack, fewer have simulation/replication/emulation of the Legendz hardware addons.

The gaps between characters, as it were. Fairly standard -- Japanese has thousands of characters, even in the more condensed sets, whereas most European languages fit within the 256 possibilities that 8 bits give you. You can make a game parse mixed 8 and 16 bit characters, but it is computationally costly, so most game devs sacrificed the space instead rather than deal with that.
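If every character really is stored as a 16-bit value, the "periods" in the hex editor are just the zero high bytes of low-numbered characters. A minimal sketch of decoding such data, assuming little-endian 16-bit codes; the table entries below are invented for the example and are not taken from the game:

```python
import struct

# Hypothetical table: these code values are made up for illustration.
# A real table file supplies the game's actual values.
TABLE = {0x0049: "I", 0x004E: "N", 0x0046: "F", 0x004F: "O"}


def decode_16bit(data: bytes, table: dict) -> str:
    """Decode data as a run of 16-bit little-endian character codes.

    Codes missing from the table are shown as <$XXXX> so that nothing
    is silently dropped. A trailing odd byte is ignored.
    """
    even = data[: len(data) // 2 * 2]
    out = []
    for (code,) in struct.iter_unpack("<H", even):
        out.append(table.get(code, f"<${code:04X}>"))
    return "".join(out)


raw = bytes.fromhex("49004e0046004f00ffff")
print(decode_16bit(raw, TABLE))
```

Read this way, a byte sequence like 00 FF FF is the high byte of one character followed by a single 16-bit code, which may explain the ($00) and ($FF) clutter in a byte-by-byte raw dump.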

How do you know which one is the script? If they are of the plain 08 plus 6 hex digits format (which may include another 08, as it is valid to have one in the rest) then the remaining hex will likely be the location of the script within the ROM, or somewhere in and around it, after you handle any flipping issues (see big endian vs little endian -- the GBA, like x86 PCs, is little endian, so for historical and legacy reasons multi-byte numbers are stored lowest byte first and do not read in order in a hex editor).
Some later systems (like the DS), and the GBA is not immune to such quirks, might have calculated, offset, sector-based or relative pointers (relative in this case meaning that if you are at location 60 and the value is 30, then you add the two together to get 90 in the final ROM, or possibly in the memory layout, which is another reason the attempt above may have failed), to say nothing of the more complicated memory maps on older systems (the GBA having the whole game visible at once is something of a rarity; the NES and SNES see you cycling banks in and out).

You could probably coax Atlas and Cartographer into doing something; however, the workarounds required are almost enough that you might as well make your own script or abuse a spreadsheet: dump the pointers into a spreadsheet when you find them, find some indicator value (an end of line/start of line value in the text itself, if there is one), do your changes, search for the new locations of the end of line/start of line values and change the pointers accordingly. This is doable in a hex editor with a decent search function. You don't necessarily need pointers for a dump either.
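The spreadsheet arithmetic amounts to laying the edited strings out back to back and recording where each one now starts. A minimal sketch of that calculation, assuming plain 4-byte GBA pointers; the function name, the sample strings and the FF FF end marker are placeholders for illustration:

```python
import struct

ROM_BASE = 0x08000000  # GBA cartridge ROM base address


def repoint(strings, block_start):
    """Lay the byte strings out back to back from block_start.

    Returns (blob, pointers): the joined bytes to write at block_start
    and, for each string, the 4-byte little-endian pointer to its new
    location.
    """
    blob = bytearray()
    pointers = []
    for s in strings:
        pointers.append(struct.pack("<I", ROM_BASE + block_start + len(blob)))
        blob += s
    return bytes(blob), pointers


# Made-up strings; each ends with FF FF standing in for an end marker.
strings = [b"\x01\x00\xff\xff", b"\x02\x00\x03\x00\xff\xff"]
blob, ptrs = repoint(strings, block_start=0x160BF0)
print([p.hex() for p in ptrs])
```

Each entry in `ptrs` would then be written over the matching original pointer, which is what a spreadsheet column of "start offset + 08000000" would also give you.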


In this case it is probably redundant or only likely to confuse things further but I will note there is a Korean version of this game ( Legendz - Buhwarhaneun Siryeonyi Seom ) and sometimes you can use such things as a kind of Rosetta stone to compare and contrast things -- most game translations only change the bare minimum so anything that changed is likely to do with that.

SquiddyGoat

Re: Need help with getting POINTER_RELATIVE script dump.
« Reply #4 on: August 11, 2020, 01:23:17 pm »
By the way, thanks for taking the time to offer all this great advice. I'll continue on a bit more before coming up with any more questions.

About what you said first, though, flashcarts won't recognize the dedicated adapter at all? I was planning to have a flashcart running the game plugged into a GBA with the adapter plugged into the back. Not being able to do that would... complicate things a lot, such as leaving me with no way to test the game at all.

In that case, would this be of any use for getting it onto a flashcart?
https://shonumi.github.io/articles/art15.html
https://github.com/shonumi/gbe-plus/tree/master/src/data/bin/soul_dollz

FAST6191

Re: Need help with getting POINTER_RELATIVE script dump.
« Reply #5 on: August 11, 2020, 03:12:25 pm »
I was thinking more of people playing it after the fact without any access to the hardware (it is not like it litters second hand shops like so many Skylanders readers). I don't know the specifics of this one, though generally if the add-on talks to the cartridge itself then probably not, but if it goes through the link port then that is a different matter and most flash carts will probably work -- any issues would come from the timing changes that trouble some link games. I mostly only looked into what would be needed for a possible fix (it is one of the very few things that cause trouble for flash cart and emulator authors on the GBA; indeed this, its sequel and Plaston Gate tend to be the big three that are not basic sensors, with the latter having a form of bypass), but as it seems to be active electronics doing active things, that means emulating/creating a whole lot of extra stuff, which is seldom fun (even doing it for a basic anti-piracy fix is one of those things that might take years -- see the time between the release of Banjo Kazooie and its patch).
As for those links: looks like I have some reading to do, thanks for sharing them. I missed them entirely in my rounds of the internet.

SquiddyGoat

Re: Need help with getting POINTER_RELATIVE script dump.
« Reply #6 on: August 13, 2020, 03:50:26 pm »
Ever since my last message, I've been working on a basic menu translation as practice (Not something that's fit for release.). So far, I've changed the title screen options, some text on the naming screens, the date/time setting screen and the "command" window in the pause menu. I have also changed the text of four basic moves, the menu description of said moves and a little bit of text in the "TALISPD" menu. There's a lot more that I could be changing, but doing this all by manually replacing the hex value for "あ" with the hex value for "A" is pretty tiring. Also, I really wish I understood more about lengthening pointers since I had to replace "Reborn" with "CALL", "Save" with "SAV", "Items" with "INV", etc.

https://i.imgur.com/qnEJIS7.png
https://i.imgur.com/TNpoOhM.png
https://i.imgur.com/UItAktp.png
https://i.imgur.com/5UOTNQB.png


I'm currently pursuing more efficient ways to edit the text faster, many of which I am just beginning to understand. Cartographer is apparently bad with GBA games (I keep getting /pause and 0KB dumped after a thousand edits of my legendz_commands.txt file.) I'm still experimenting with Kruptar, but I don't know how to add pointers. I'll probably try the spreadsheet thing later, but I have no idea where to start with that. One of the simplest tools appears to be this: http://www.romhacking.net/utilities/880/. I followed all the instructions in the readme, but this one part really confuses me:

"table_write.ini” on the other hands is a binary write so you can include the Hex Byte “00” in your table file.  Now I don’t mean just going and writing “00=00” in your “table_write.ini”.  Open a hex editor and write “00=Hex00.”". There isn't enough space to write "00=Hex00" in a hex editor at "00=", so outside of typing it directly into the .ini I have no idea how to do this.

I tried just leaving the file alone and I got this after an hour of waiting:

ERROR in
action number 1
of Draw Event
for object obj_control:

Error opening file for writing.

Well, there's that. Now for something else:

"You could probably coax atlas and cartographer into doing something, however the workarounds required for it are almost enough that you might as well make your own script or abuse a spreadsheet (dump the pointers into a spreadsheet when you find them, find some indicator of value if there is an end of line/start of line value in the text itself, do you changes, search for the new location of the end of line/start of line stuff and change pointers accordingly."

I've located and marked where several menu options and the entire movelist are in the ROM. Before each line of menu/move text is a "00FFFF". At one point, I had accidentally changed an "00FFFF" to a "0000FF" while editing text for the 1st move "CLW" manually and text for the 2nd move "2XCLAW" overflowed into the info box (Which I later fixed.).

Now that I've thought of it, I wonder if this is the "end of line/start of line" value that you described? There's also a "000000", "00FDFF" or "00FEFF" in-between each line of the script dialogue.

Also, I'm pretty sure that the pointers stored from 15CF50 to 15FB50 are the script pointers, purely because they seem to be the longest list of pointers and they sit very close to the game script (Where the first line spoken by a character in the game starts, 180F1E.). While I was using Cartographer, I had 38C005 as my "base pointer" because 50CF15(15CF50 in little endian.) - 38C005 = 180F1E.

A lot of things seem to ask me for this "base pointer" or something that's to be "added/subtracted to the offset to get to the script", so I wonder if 38C005 is what I'm looking for.
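For plain GBA pointers, the value that matters is the 4 bytes stored in the ROM, read in little-endian order, rather than the byte-swapped address of the place where they are stored. Subtracting 08000000 from that value gives the file offset, which is why a base of -$8000000 is a common choice. A small sketch of the conversion in both directions, using the first-line address mentioned above as sample data (whether the game actually stores that exact pointer is an assumption):

```python
ROM_BASE = 0x08000000  # GBA cartridge ROM base address


def pointer_bytes_to_offset(raw4: bytes) -> int:
    """Turn the 4 bytes found in the ROM into a file offset."""
    value = int.from_bytes(raw4, "little")
    return value - ROM_BASE


def offset_to_pointer_bytes(offset: int) -> bytes:
    """Turn a file offset into the 4 bytes the ROM would store."""
    return (offset + ROM_BASE).to_bytes(4, "little")


# A pointer to file offset 0x180F1E would appear as 1E 0F 18 08.
raw = bytes.fromhex("1e0f1808")
print(hex(pointer_bytes_to_offset(raw)))
print(offset_to_pointer_bytes(0x180F1E).hex())
```

Searching the ROM for the byte sequence printed on the last line is one way to find where a pointer to a known string lives.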
« Last Edit: August 13, 2020, 11:00:46 pm by SquiddyGoat »

abw

  • Hero Member
  • *****
  • Posts: 525
Re: Need help with getting POINTER_RELATIVE script dump.
« Reply #7 on: August 27, 2020, 09:23:50 pm »
#POINTER TABLE START:   $08B0927F
#POINTER TABLE STOP:   $08F0FF7F
This is definitely a problem - the entire ROM is only $800000 bytes long, so telling Cartographer that the pointer table starts at $08B0927F means that it would have to read beyond the end of the ROM file, which isn't going to work. I don't have your table file to test it out myself, but what happens if you try something like this instead?
Code:
#BLOCK NAME: Dialogue Block (POINTER_RELATIVE)
#TYPE: NORMAL
#METHOD: POINTER_RELATIVE
#POINTER ENDIAN: LITTLE
#POINTER TABLE START: $15CF5C
#POINTER TABLE STOP: $15FB60
#POINTER SIZE: $03
#POINTER SPACE: $01
#ATLAS PTRS: Yes
#BASE POINTER: $0
#TABLE: legendztable.tbl
#COMMENTS: Yes
#END BLOCK
You are going to need to figure out what determines the end of each string; probably that'll be some byte sequence like $00FFFF that you can include in your table file (e.g. "/00FFFF=[end]"), but there's a chance it could be something more complicated instead.

"table_write.ini” on the other hands is a binary write so you can include the Hex Byte “00” in your table file.  Now I don’t mean just going and writing “00=00” in your “table_write.ini”.  Open a hex editor and write “00=Hex00.”". There isn't enough space to write "00=Hex00" in a hex editor at "00=", so outside of typing it directly into the .ini I have no idea how to do this.
I'm not familiar with that program, but as a guess, those instructions probably mean "00=" in ASCII followed by "00" in hexadecimal, so in hex your line would be 30303D00 ($30 = "0", $3D = "=", $00 = a zero byte).
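If typing that into a hex editor is awkward, the same four bytes can be produced by a short script. This is only a sketch of the guess above: the output file name is a stand-in, and whether the tool expects a line ending after the zero byte is unknown.

```python
import os
import tempfile

# "00=" as ASCII text followed by a single zero byte.
entry = "00=".encode("ascii") + bytes([0x00])
print(entry.hex())

# Write in binary mode so the zero byte is stored as-is. The file name
# is a placeholder, not the tool's real table_write.ini.
path = os.path.join(tempfile.gettempdir(), "table_write_example.ini")
with open(path, "wb") as f:
    f.write(entry)
```

Opening the resulting file in a hex editor should show 30 30 3D 00.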

SquiddyGoat

Re: Need help with getting POINTER_RELATIVE script dump.
« Reply #8 on: August 29, 2020, 01:14:20 am »
Ah, so that's what was crashing Cartographer. After messing around with some of those VBA tools, I've only recently begun to understand the different sections that make up a GBA ROM, so I get what you're saying.

As much as I'd like to try out your solution, I recently found out that the pointer table for the game script does not start at 0015CF5C (It is a table of pointers, but for something completely different.). In fact, there doesn't seem to be a "table" at all, just a huge list of scattered pointers that begins at 000251CC and goes down for miles. They're still in order (I think?), but with uneven strings of data in-between each one. In this case, would Cartographer still be of any use? Does it scan for the "08" pointers or am I out of luck?

As for the "end of each string", it's probably a good thing that I know a ton of them. 00FFFF terminates the line in places like menus, 00FDFF brings the text to the second line, 00FEFF ends the line until the next line is brought up by the player, etc. I have them all marked down in my table file, which I recently updated with all 66 of the Kanji symbols present in the game in order.
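Those markers can drive a simple string reader. The sketch below assumes the text is a run of 16-bit little-endian codes, so that FD FF, FE FF and FF FF are the codes $FFFD, $FFFE and $FFFF and the leading 00 belongs to the previous character. The names and the sample bytes are invented for illustration.

```python
import struct

# Control codes as described above, read as 16-bit little-endian values.
# Treating them as 16-bit units is an assumption of this sketch.
CONTROL = {0xFFFD: "[line]", 0xFFFE: "[wait]", 0xFFFF: "[end]"}


def read_string(data: bytes, start: int, stop_codes=(0xFFFF,)):
    """Read 16-bit codes from start until one of stop_codes is seen.

    Returns (tokens, offset just past the last code read). Ordinary
    characters are shown as $XXXX; a table lookup would go there.
    """
    tokens = []
    pos = start
    while pos + 2 <= len(data):
        (code,) = struct.unpack_from("<H", data, pos)
        pos += 2
        tokens.append(CONTROL.get(code, f"${code:04X}"))
        if code in stop_codes:
            break
    return tokens, pos


data = bytes.fromhex("4100fdff4200feff4300ffff")
tokens, end = read_string(data, 0)
print(tokens, end)
```

For dialogue that is addressed per text box, passing `stop_codes=(0xFFFE, 0xFFFF)` would stop at each wait-for-player marker instead.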

Also, thanks for the advice for using Rewriter as well, I'll have to try that out and see how it goes. I'd really like to find a faster way than what I'm doing, even though it gets the job done:

https://i.imgur.com/i9bsjW6.png
https://i.imgur.com/zQlSZ4e.png

abw

Re: Need help with getting POINTER_RELATIVE script dump.
« Reply #9 on: August 29, 2020, 09:40:14 pm »
Hmm, if the pointers really are scattered around with no useful pattern, then dumping their strings will be annoying. You could still do it in Cartographer by setting up a block for each pointer; probably the only thing that would differ between blocks would be the #POINTER TABLE START and #POINTER TABLE STOP lines, so if you have a list of pointers you could generate all the blocks programmatically pretty easily (search-and-replace in any decent text editor would be enough for that).

That said, it would probably be worth spending a little bit of time to verify that a) the things you're looking at really are pointers and b) they're pointers to the strings you want to dump instead of pointers to something else. One way of doing that would be to find a string in-game that you can get to display easily, find its pointer, change its pointer to have the same value as the pointer for some other string, and then display the string in-game again; if the game now shows the other string, then you know you've found one of the pointers you want.
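The pointer swap test is a four-byte copy, which can be done in a hex editor or scripted. A minimal sketch, assuming 4-byte pointers; the function name and the sample bytes are made up, and it should only ever be run on a copy of the ROM:

```python
def redirect_pointer(rom: bytearray, ptr_a: int, ptr_b: int) -> bytes:
    """Overwrite the 4-byte pointer stored at ptr_a with a copy of the
    pointer stored at ptr_b.

    Returns the original 4 bytes from ptr_a so they can be restored.
    """
    original = bytes(rom[ptr_a:ptr_a + 4])
    rom[ptr_a:ptr_a + 4] = rom[ptr_b:ptr_b + 4]
    return original


# Synthetic data: two pointers stored at offsets 0 and 4.
rom = bytearray.fromhex("1e0f1808" "40101808")
saved = redirect_pointer(rom, 0, 4)
print(rom.hex(), saved.hex())
```

If the game then shows the second string where the first used to appear, the value at `ptr_a` is one of the pointers you want; writing `saved` back undoes the change.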

If you're comfortable working with a debugger, a more thorough approach would be to trace the code for displaying the string back to its pointer, which would probably take more work but would probably give you a great deal of information about the locations of all the script pointers, so it might end up being faster in the long run.

SquiddyGoat

Re: Need help with getting POINTER_RELATIVE script dump.
« Reply #10 on: August 30, 2020, 03:33:55 am »
I'm 100% positive that they're the script pointers, as I was able to replace the first line in the game with another character's line by changing what came before the "08". In fact, I was able to change multiple pointers to point to a huge blank area where I could write anything of any length and have it show up in-game. For such an advanced GBA game, changing text by hand is time-consuming but really easy and risk-free.

Right now, I'm working on creating a list of every single script pointer in the game using one of my own methods. Every pointer is pointing to the first letter that comes after FEFF, so I am simply using Notepad++ to search and replace FEFFXX (FEFF00, FEFF01, etc.) with FEFFD7, with D7 being a letter that would never show up in a conversation. Then, I will insert the binary data into the game and search for D7 within the parameters of the script and just copy down all the addresses. Then, I have a simple macro set up in Notepad++ that will convert every address into a little-endian pointer, giving me a set of script pointers that I can copy and paste into anything.

This is all I need for the multiple-block method you described, right? Do I input the pointers themselves (XXXXXX08), or do I need to input the addresses at which the pointers are located (000251CC) into Cartographer? The second one sounds painful, so I hope it's the first.

Also, how would I generate the blocks using the pointers I'm working on? What would the blocks roughly look like and how would I easily use a search and replace function to create them?

abw

Re: Need help with getting POINTER_RELATIVE script dump.
« Reply #11 on: August 30, 2020, 03:06:35 pm »
"Do I input the pointers themselves (XXXXXX08), or do I need to input the addresses at which the pointer is located (000251CC) into Cartographer? The second one sounds painful, so I hope It's the first."

"Also, how would I generate the blocks using the pointers I'm working on? What would the blocks roughly look like and how would I easily use a search and replace function to create them?"
You're going to need the addresses of the pointers themselves, and not just for extracting the original strings, but also for inserting your new strings. Cartographer's output should include the appropriate pointer write commands for Atlas, so the good news is that once you've got the Cartographer command file set up, the resulting Atlas file should be very close to what you'll need for inserting.

Assuming you have access to some Unix-like command line, you can get a list of all ROM addresses following a 0xFEFF byte sequence with e.g.
Code:
xxd -c1 "rom.gba" | grep -Pzo "(?<=.{8}: fe  .\n.{8}: ff  .\n).{8}" > addrs
Those are the values you'll want to use for #POINTER TABLE START, and those + 4 are the values you'll want to use for #POINTER TABLE STOP. You can take those addresses and generate the Cartographer blocks in many ways; the following works for me:
Code:
perl -ne 'print sprintf("#BLOCK NAME:\t\tDialogue Block (POINTER_RELATIVE)\n#TYPE:\t\t\tNORMAL\n#METHOD:\t\tPOINTER_RELATIVE\n#POINTER ENDIAN:\tLITTLE\n#POINTER TABLE START:\t\$%s#POINTER TABLE STOP:\t\$%08x\n#POINTER SIZE:\t\t\$04\n#POINTER SPACE:\t\t\$00\n#ATLAS PTRS:\t\tYes\n#BASE POINTER:\t\t-\$8000000\n#TABLE:\t\t\tlegendztable.tbl\n#COMMENTS:\t\tYes\n#END BLOCK\n\n", $_, oct("0x".$_)+4);' < addrs
Or, putting it all together:
Code:
xxd -c1 "rom.gba" | grep -Pzo "(?<=.{8}: fe  .\n.{8}: ff  .\n).{8}" | perl -ne 'print sprintf("#BLOCK NAME:\t\tDialogue Block (POINTER_RELATIVE)\n#TYPE:\t\t\tNORMAL\n#METHOD:\t\tPOINTER_RELATIVE\n#POINTER ENDIAN:\tLITTLE\n#POINTER TABLE START:\t\$%s#POINTER TABLE STOP:\t\$%08x\n#POINTER SIZE:\t\t\$04\n#POINTER SPACE:\t\t\$00\n#ATLAS PTRS:\t\tYes\n#BASE POINTER:\t\t-\$8000000\n#TABLE:\t\t\tlegendztable.tbl\n#COMMENTS:\t\tYes\n#END BLOCK\n\n", $_, oct("0x".$_)+4);'
gets me output like:
Code:
#BLOCK NAME: Dialogue Block (POINTER_RELATIVE)
#TYPE: NORMAL
#METHOD: POINTER_RELATIVE
#POINTER ENDIAN: LITTLE
#POINTER TABLE START: $00000126
#POINTER TABLE STOP: $0000012a
#POINTER SIZE: $04
#POINTER SPACE: $00
#ATLAS PTRS: Yes
#BASE POINTER: -$8000000
#TABLE: legendztable.tbl
#COMMENTS: Yes
#END BLOCK

#BLOCK NAME: Dialogue Block (POINTER_RELATIVE)
#TYPE: NORMAL
#METHOD: POINTER_RELATIVE
#POINTER ENDIAN: LITTLE
#POINTER TABLE START: $00000e8c
#POINTER TABLE STOP: $00000e90
#POINTER SIZE: $04
#POINTER SPACE: $00
#ATLAS PTRS: Yes
#BASE POINTER: -$8000000
#TABLE: legendztable.tbl
#COMMENTS: Yes
#END BLOCK

#BLOCK NAME: Dialogue Block (POINTER_RELATIVE)
#TYPE: NORMAL
#METHOD: POINTER_RELATIVE
#POINTER ENDIAN: LITTLE
#POINTER TABLE START: $0000163c
#POINTER TABLE STOP: $00001640
#POINTER SIZE: $04
#POINTER SPACE: $00
#ATLAS PTRS: Yes
#BASE POINTER: -$8000000
#TABLE: legendztable.tbl
#COMMENTS: Yes
#END BLOCK

etc., etc.
Yes, it's huge and clunky, but it ought to work (I haven't tested it, so possibly it might need some tweaking) and hopefully you'll only have to do it one time.
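One thing worth checking before running this on the full ROM: an address that follows an $FEFF sequence in the script is where a string starts, which is the value a pointer holds, while #POINTER TABLE START needs the place where the pointer itself is stored. If the two differ, one extra lookup is needed. The sketch below takes a list of string start addresses (in the same 8-digit hex form as the addrs file), finds where a 4-byte little-endian pointer to each is stored, and fills in the same block template. The function names, the 4-byte alignment and the synthetic data are assumptions for illustration, and it is untested against the real game.

```python
import struct

ROM_BASE = 0x08000000  # GBA cartridge ROM base address

BLOCK = """#BLOCK NAME:\t\tDialogue Block (POINTER_RELATIVE)
#TYPE:\t\t\tNORMAL
#METHOD:\t\tPOINTER_RELATIVE
#POINTER ENDIAN:\tLITTLE
#POINTER TABLE START:\t${start:08x}
#POINTER TABLE STOP:\t${stop:08x}
#POINTER SIZE:\t\t$04
#POINTER SPACE:\t\t$00
#ATLAS PTRS:\t\tYes
#BASE POINTER:\t\t-$8000000
#TABLE:\t\t\tlegendztable.tbl
#COMMENTS:\t\tYes
#END BLOCK
"""


def pointer_locations(rom: bytes, targets):
    """Return every 4-byte-aligned offset whose contents equal
    ROM_BASE + target (little-endian) for some target in targets."""
    wanted = {struct.pack("<I", ROM_BASE + t) for t in targets}
    return [off for off in range(0, len(rom) - 3, 4)
            if rom[off:off + 4] in wanted]


def make_blocks(locations):
    """Fill in one Cartographer block per pointer location."""
    return "\n".join(BLOCK.format(start=loc, stop=loc + 4)
                     for loc in locations)


# Synthetic ROM: a pointer to offset 0x10 is stored at offset 8.
rom = bytearray(0x20)
rom[8:12] = struct.pack("<I", ROM_BASE + 0x10)
addrs_lines = ["00000010"]               # same format as the addrs file
targets = [int(line, 16) for line in addrs_lines]
locations = pointer_locations(bytes(rom), targets)
blocks = make_blocks(locations)
print(blocks)
```

A string with no entry in `locations` is either not pointed to directly or is addressed in some other way, which is useful to know before building the command file.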

SquiddyGoat

Re: Need help with getting POINTER_RELATIVE script dump.
« Reply #12 on: August 30, 2020, 08:17:09 pm »
This is some really great help! If I can get this to work, then I'm sure my current problems will be either solved or close-to-solved.

One thing, though. I'm not exactly sure what a "Unix-like command line" is, so I googled it and found Windows Powershell. Is this the correct thing I'm supposed to be using? I've read that "xxd" doesn't exist in Powershell, so I'm wondering if I have the correct setup to process the commands you wrote.

In addition, this is my first time working with Unix commands. I see you've left placeholders in your code so I can fill in the blanks, but I just need some clarification on what I am supposed to be putting in.

Below is how my mind sees where things should go, in "[]":

xxd -c1 "[The ROM.gba, but are the quotations kept?]" | grep -Pzo "([Something goes here?]<=.{8}: fe  .\n.{8}: ff  .\n).{8}" > [Pointer Address]

perl -ne 'print sprintf("#BLOCK NAME:\t\tDialogue Block (POINTER_RELATIVE)\n#TYPE:\t\t\tNORMAL\n#METHOD:\t\tPOINTER_RELATIVE\n#POINTER ENDIAN:\tLITTLE\n#POINTER TABLE START:\t\$%s#POINTER TABLE STOP:\t\$%08x\n#POINTER SIZE:\t\t\$04\n#POINTER SPACE:\t\t\$00\n#ATLAS PTRS:\t\tYes\n#BASE POINTER:\t\t-\$8000000\n#TABLE:\t\t\tlegendztable.tbl\n#COMMENTS:\t\tYes\n#END BLOCK\n\n", $[Pointer Address], oct("0x".$[Pointer Address])+4);' < addrs

I hope I'm not asking for too much, but I need to know the above so I can start messing around with it on my own.

Also, here's what I tried using something that has grep but not xxd. It didn't work:
https://i.imgur.com/w45SomR.png
« Last Edit: August 30, 2020, 11:30:54 pm by SquiddyGoat »

abw

Re: Need help with getting POINTER_RELATIVE script dump.
« Reply #13 on: August 31, 2020, 09:19:18 am »
From what I hear, Windows PowerShell is a huge improvement over CMD and doesn't completely suck anymore, but in many ways it still hasn't caught up to where Unix was 40 years ago :P. If you want to use these exact commands, you can install something like Cygwin or MinGW to get something pretty close to Linux on Windows, or you can try finding equivalent PowerShell commands, or if you know any programming language, you can write your own code to fill in the missing pieces. To help you out, I'll explain a bit more about what the commands I used do.

xxd -c1 "rom.gba" | grep -Pzo "(?<=.{8}: fe  .\n.{8}: ff  .\n).{8}" | perl -ne 'print sprintf("#BLOCK NAME:\t\tDialogue Block (POINTER_RELATIVE)\n#TYPE:\t\t\tNORMAL\n#METHOD:\t\tPOINTER_RELATIVE\n#POINTER ENDIAN:\tLITTLE\n#POINTER TABLE START:\t\$%s#POINTER TABLE STOP:\t\$%08x\n#POINTER SIZE:\t\t\$04\n#POINTER SPACE:\t\t\$00\n#ATLAS PTRS:\t\tYes\n#BASE POINTER:\t\t-\$8000000\n#TABLE:\t\t\tlegendztable.tbl\n#COMMENTS:\t\tYes\n#END BLOCK\n\n", $_, oct("0x".$_)+4);'

xxd -c1 "rom.gba"
This opens the file "rom.gba" (which you should indeed replace with the actual name of your ROM file) and prints out all the bytes of that file in the format <file address in hexadecimal>: <byte in hexadecimal>  <corresponding ASCII character, with non-printable characters replaced by ".">, e.g.
Code:
00000000: 2e  .
00000001: 00  .
00000002: 00  .
00000003: ea  .
00000004: 24  $
00000005: ff  .
If you want to, you can redirect that output to a file named foo.txt with xxd -c1 "rom.gba" > foo.txt, but here we use that output as the input to the next command (a process referred to as piping) by sticking a | between the commands.

grep -Pzo "(?<=.{8}: fe  .\n.{8}: ff  .\n).{8}"
This searches through the input and outputs all the parts that match the regular expression "(?<=.{8}: fe  .\n.{8}: ff  .\n).{8}" (which you should not replace as long as you're looking for $FEFF, but you can change the fe and ff if you want to look for different bytes). -o means to output only the parts that match (the default is to output the entire line containing a match). By default, newline bytes ($0A) are line separators, but -z changes that to use null ($00) bytes as line separators instead; since our input doesn't contain any null bytes, that effectively turns the input into one giant line (which we need to do in order to match the newlines which would otherwise have broken the input up into separate sections). -P means to use Perl-compatible regular expressions (you'll see that abbreviated as PCRE sometimes) which we're using for their extra power vs. POSIX basic regular expressions, in particular the positive lookbehind assertion (?<=.{8}: fe  .\n.{8}: ff  .\n), which looks through xxd's output format for a $FE byte on one line followed by a $FF byte on the next line; the parts of the input that get matched in the assertion do not get included in the match output from the full regular expression, so only the .{8} is output, which looks like
Code:
00000126
00000e8c
0000163c
00001680
0000168a
As before, you can redirect that output to a file if you want or pipe it to the next command.
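For anyone without xxd or a grep that supports -P, the same list can be produced by searching the raw bytes directly. A minimal sketch; the function name and the sample bytes are made up for illustration:

```python
def addresses_after(rom: bytes, marker: bytes = b"\xfe\xff"):
    """Return the file offset of the byte that follows each occurrence
    of marker, i.e. the same addresses the xxd | grep step prints.

    A marker at the very end of the data has no following byte and is
    skipped.
    """
    out = []
    pos = rom.find(marker)
    while pos != -1:
        nxt = pos + len(marker)
        if nxt < len(rom):
            out.append(nxt)
        pos = rom.find(marker, pos + 1)
    return out


# Synthetic data with FE FF at offsets 1 and 6.
rom = bytes.fromhex("00feff410000feff4200")
for addr in addresses_after(rom):
    print(f"{addr:08x}")
```

Redirecting the printed lines to a file gives something in the same format as addrs, and passing a different `marker` searches for other byte sequences.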

perl -ne 'print sprintf("#BLOCK NAME:\t\tDialogue Block (POINTER_RELATIVE)\n#TYPE:\t\t\tNORMAL\n#METHOD:\t\tPOINTER_RELATIVE\n#POINTER ENDIAN:\tLITTLE\n#POINTER TABLE START:\t\$%s#POINTER TABLE STOP:\t\$%08x\n#POINTER SIZE:\t\t\$04\n#POINTER SPACE:\t\t\$00\n#ATLAS PTRS:\t\tYes\n#BASE POINTER:\t\t-\$8000000\n#TABLE:\t\t\tlegendztable.tbl\n#COMMENTS:\t\tYes\n#END BLOCK\n\n", $_, oct("0x".$_)+4);'
This takes each line of the input and uses the line and the line + 4 to fill in the parts I've highlighted with <> in the template below:
Code:
#BLOCK NAME: Dialogue Block (POINTER_RELATIVE)
#TYPE: NORMAL
#METHOD: POINTER_RELATIVE
#POINTER ENDIAN: LITTLE
#POINTER TABLE START: $<%s>
#POINTER TABLE STOP: $<%08x>
#POINTER SIZE: $04
#POINTER SPACE: $00
#ATLAS PTRS: Yes
#BASE POINTER: -$8000000
#TABLE: legendztable.tbl
#COMMENTS: Yes
#END BLOCK
where %s is a copy of the input line and %08x is the input line + 4 formatted as an 8-digit hexadecimal number with leading zeroes (just to make it look consistent with the 8-digit numbers from %s), giving you output like I posted previously.
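For anyone who finds the Perl one-liner hard to read, here is the same template-filling step as a small Python sketch. The names BLOCK_TEMPLATE and cartographer_block are invented for this example; the block text itself is copied from the template above, and the stop address is simply start + 4 as in the Perl version.

```python
# Template copied from the Cartographer block above; {start} and {stop}
# stand in for the <%s> and <%08x> placeholders.
BLOCK_TEMPLATE = (
    "#BLOCK NAME:\t\tDialogue Block (POINTER_RELATIVE)\n"
    "#TYPE:\t\t\tNORMAL\n"
    "#METHOD:\t\tPOINTER_RELATIVE\n"
    "#POINTER ENDIAN:\tLITTLE\n"
    "#POINTER TABLE START:\t${start}\n"
    "#POINTER TABLE STOP:\t${stop:08x}\n"
    "#POINTER SIZE:\t\t$04\n"
    "#POINTER SPACE:\t\t$00\n"
    "#ATLAS PTRS:\t\tYes\n"
    "#BASE POINTER:\t\t-$8000000\n"
    "#TABLE:\t\t\tlegendztable.tbl\n"
    "#COMMENTS:\t\tYes\n"
    "#END BLOCK\n\n"
)


def cartographer_block(address_hex: str) -> str:
    """Fill in the template for one hex address line (e.g. '00000126')."""
    address_hex = address_hex.strip()      # drop the trailing newline, if any
    start = int(address_hex, 16)           # parse the line as hexadecimal
    return BLOCK_TEMPLATE.format(start=address_hex, stop=start + 4)
```

Calling cartographer_block on each line of the address list and concatenating the results gives the same kind of output as the perl -ne command.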

SquiddyGoat

  • Jr. Member
  • **
  • Posts: 15
    • View Profile
Re: Need help with getting POINTER_RELATIVE script dump.
« Reply #14 on: August 31, 2020, 03:46:43 pm »
Once again, the help is much appreciated. I chose to go with MinGW and after setting everything up, I can now use commands like xxd to do stuff. I was able to get a full hexdump of the game using xxd -c1 Legendz.gba. Also, I figured out that "> addrs" tells the program to write its output to a file, so I can use something like "> Legendzblocks3.txt" instead.

However, I seemingly ran into a wall with the -Pzo command as shown here:
https://i.imgur.com/PvBWgAN.png

I'm not exactly sure why this is happening, but it's telling me that the -P part of -Pzo "isn't compiled into this" and won't output anything. Earlier, I fixed an error with the "perl" command by actually installing Perl, but the solution to this error is a mystery.
« Last Edit: August 31, 2020, 04:35:42 pm by SquiddyGoat »

abw

  • Hero Member
  • *****
  • Posts: 525
    • View Profile
Re: Need help with getting POINTER_RELATIVE script dump.
« Reply #15 on: August 31, 2020, 06:45:30 pm »
However, I seemingly ran into a wall with the -Pzo command as shown here:
Well, that sounds like a weird thing for MinGW to have disabled :(. On the bright side, since you've got perl, you can achieve the same output as grep using perl instead by replacing
Code: [Select]
grep -Pzo "(?<=.{8}: fe  .\n.{8}: ff  .\n).{8}"
with
Code: [Select]
perl -e 'local $/; my $str = <>; my @m = $str =~ /(?<=.{8}: fe  .\n.{8}: ff  .\n).{8}/g; print join("\n", @m)."\n";'
or combine both this and the next perl calls into one if you don't care about examining the intermediate results:
Code: [Select]
perl -e 'local $/; my $str = <>; my @m = $str =~ /(?<=.{8}: fe  .\n.{8}: ff  .\n).{8}/g; print join("\n", map {sprintf("#BLOCK NAME:\t\tDialogue Block (POINTER_RELATIVE)\n#TYPE:\t\t\tNORMAL\n#METHOD:\t\tPOINTER_RELATIVE\n#POINTER ENDIAN:\tLITTLE\n#POINTER TABLE START:\t\$%s\n#POINTER TABLE STOP:\t\$%08x\n#POINTER SIZE:\t\t\$04\n#POINTER SPACE:\t\t\$00\n#ATLAS PTRS:\t\tYes\n#BASE POINTER:\t\t-\$8000000\n#TABLE:\t\t\tlegendztable.tbl\n#COMMENTS:\t\tYes\n#END BLOCK\n\n", $_, oct("0x".$_)+4)} @m)."\n";'

SquiddyGoat

  • Jr. Member
  • **
  • Posts: 15
    • View Profile
Re: Need help with getting POINTER_RELATIVE script dump.
« Reply #16 on: September 01, 2020, 12:51:32 am »
I'm almost there, for sure. Just one small problem occurred once I ran the perl -e code.

Below is exactly how I'm putting it into MinGW32:

Code: [Select]
xxd -c1 "Legendz.gba" | perl -e 'local $/; my $str = <>; my @m = $str =~ /(?<=.{8}: fe  .\n.{8}: ff  .\n).{8}/g; print join("\n", map {sprintf("#BLOCK NAME:\t\tDialogue Block (POINTER_RELATIVE)\n#TYPE:\t\t\tNORMAL\n#METHOD:\t\tPOINTER_RELATIVE\n#POINTER ENDIAN:\tLITTLE\n#POINTER TABLE START:\t\$%s\n#POINTER TABLE STOP:\t\$%08x\n#POINTER SIZE:\t\t\$04\n#POINTER SPACE:\t\t\$00\n#ATLAS PTRS:\t\tYes\n#BASE POINTER:\t\t-\$8000000\n#TABLE:\t\t\tlegendztable.tbl\n#COMMENTS:\t\tYes\n#END BLOCK\n\n", $_, oct("0x".$_)+4)} @m)."\n";' > Textfile.txt
Now that the problematic -P is out of the way, the instructions are accepted and the program begins processing it. However, the only thing that comes out is an "out of memory during "large" request" error. Is this because my 8GB of RAM isn't enough, or did I leave something blank in the code?
https://i.imgur.com/7MAT1X2.png

Code: [Select]
xxd -c1 -l 0x184B50 "Legendz.gba" | perl -e 'local $/; my $str = <>; my @m = $str =~ /(?<=.{8}: fe  .\n.{8}: ff  .\n).{8}/g; print join("\n", map {sprintf("#BLOCK NAME:\t\tDialogue Block (POINTER_RELATIVE)\n#TYPE:\t\t\tNORMAL\n#METHOD:\t\tPOINTER_RELATIVE\n#POINTER ENDIAN:\tLITTLE\n#POINTER TABLE START:\t\$%s\n#POINTER TABLE STOP:\t\$%08x\n#POINTER SIZE:\t\t\$04\n#POINTER SPACE:\t\t\$00\n#ATLAS PTRS:\t\tYes\n#BASE POINTER:\t\t-\$8000000\n#TABLE:\t\t\tlegendztable.tbl\n#COMMENTS:\t\tYes\n#END BLOCK\n\n", $_, oct("0x".$_)+4)} @m)."\n";' > Textfile.txt
I found an option, -l, that supposedly lets me stop the program's search at a certain address, which I thought would solve my "out of memory" issue by giving the computer less work. Well, while the "out of memory" error did not occur, the file that was output had nothing in it.

I tried making multiple changes to the code to get it to work, but I think I've hit another standstill...

abw

  • Hero Member
  • *****
  • Posts: 525
    • View Profile
Re: Need help with getting POINTER_RELATIVE script dump.
« Reply #17 on: September 01, 2020, 06:34:31 pm »
That error message makes it look like barely over 128 MB of memory was allocated, so your 8 GB should be able to handle that easily. It kind of looks like something is messed up with your MinGW install :(.

Buuut, it sort of doesn't matter anyway since I've been an idiot here. Getting a list of possible string start addresses is nice, but for extraction and insertion it's the addresses of the pointers to those strings that we need, not the addresses of the strings.

So I put together a little script that scans through its input looking for 0xFEFF byte sequences (possible string end tokens), keeps track of the addresses of the following byte (possible string starts), then scans through its input again looking for possible pointers to those strings and outputs a Cartographer block for every match it finds. It's based on a lot of assumptions that might not hold true, probably generates a lot of false positives (especially if 0xFEFF is used for things other than strings), and whenever you have a group of strings you'll have to find the pointer for the first string yourself (since we're detecting pointers based on the string end token, but the first string in a group probably won't be preceded by 0xFEFF), so it's far from perfect, but it does find that $251CC pointer and many more matches in that general area, so it looks like at least one thing is going right.
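To make the idea concrete, here is a rough Python sketch of that two-pass approach. It is not the linked script, just an illustration under the same assumptions: 0xFEFF ends a string, pointers are 4-byte little-endian values, and subtracting $8000000 from a pointer gives a ROM offset. It also assumes pointers sit on 4-byte boundaries, which is a guess on my part (the alignment parameter lets you change that), and the names ROM_BASE, find_string_starts and find_pointers are made up for the example.

```python
import struct

ROM_BASE = 0x08000000   # assumed: pointer value minus this gives the ROM offset
END_TOKEN = b"\xfe\xff"  # assumed string end token


def find_string_starts(rom: bytes) -> set[int]:
    """Pass 1: collect the offset right after every end token."""
    starts = set()
    pos = rom.find(END_TOKEN)
    while pos != -1:
        following = pos + len(END_TOKEN)
        if following < len(rom):
            starts.add(following)
        pos = rom.find(END_TOKEN, pos + 1)
    return starts


def find_pointers(rom: bytes, starts: set[int], alignment: int = 4) -> list[tuple[int, int]]:
    """Pass 2: find 4-byte little-endian values that point at a possible string start.

    Returns (pointer_offset, string_offset) pairs. Expect false positives.
    """
    hits = []
    for offset in range(0, len(rom) - 3, alignment):
        (value,) = struct.unpack_from("<I", rom, offset)
        target = value - ROM_BASE
        if target in starts:
            hits.append((offset, target))
    return hits
```

Each pointer_offset in the result would become the #POINTER TABLE START of one Cartographer block, with the stop address 4 bytes later.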

Here is a link to the script in case you want to try running it yourself or want to modify it, and here is a link to the generated blocks. Hopefully that will help a bit more!

SquiddyGoat

  • Jr. Member
  • **
  • Posts: 15
    • View Profile
Re: Need help with getting POINTER_RELATIVE script dump.
« Reply #18 on: September 02, 2020, 12:23:10 am »
This is all incredibly helpful! Perfect or not, it's better than the nothing that I had before.

Now Cartographer is actually doing something! It's extracting data from the game based on the pointers, except the dumps are all huge and can't be opened without crashing the text editor, even when I just copied over the 251CC block for testing purposes. If Cartographer is taking the pointer and tracing it back to where that pointer points to, then it should only print until the FEFF signals a text break because the next pointer points to what comes after that FEFF. I wonder why it's dumping so many bytes? I'll check my table to see if anything's wrong there.

Also, the #BASE POINTER is dumping something, but is -$8000000 correct? I've read that the base pointer is supposed to be what's added/subtracted to the pointer to get the string address, although I'm not saying you're wrong.

Still, all of this is progress!

By the way, have you been looking through the game too? Have a copy of my table file and a map of where all the interesting stuff is. It might help you get a good idea of how the text is arranged and other things:

Address Map
(This is what I use, so everything else besides the script is listed here too.)
https://drive.google.com/file/d/1A62HEVUD3nr8yUqEW-gAcPXpnnDS91M3/view

Table
(The name is a half-lie. It has hira/kana but no Kanji.)
https://drive.google.com/file/d/1-zLpT1EOuD1dj5wimN5Q8dmgAAi5JkDU/view

As a whole, the game's entire story script exists as one huge block, starting at 1619DC and ending at 184B50. For whatever reason, the "Main" script contains a majority of the first few cutscenes and other unrelated dialogue while the NPC script contains both NPC dialogue and data for various minor cutscenes.

0 = NPC Script
1 = "Main" Script

0000000000
0000000000
0000000000
0000000000
0000000000
0000000000
0000000000
0000000000
0000000000
0000000000
0000000000
0000000000
1111111111
1111111111
1111111111
1111111111
1111111111
1111111111

Also, I guess I'll try to re-install MinGW to run and experiment with your script. I'd bet there was a setting that I looked over or something.

abw

  • Hero Member
  • *****
  • Posts: 525
    • View Profile
Re: Need help with getting POINTER_RELATIVE script dump.
« Reply #19 on: September 02, 2020, 08:04:11 pm »
I'm not able to access either of those links - you'll have to tell Google to make them available to other people. Based on your description, though, my first guess would be that your table file doesn't indicate that FEFF is an end token and Cartographer winds up dumping everything from the string's start address through to the end of the ROM file. If that's the case, you'll just need to put a "/" in front of the FEFF, e.g. "/FEFF=[end]" instead of just "FEFF=[end]".

Also, the #BASE POINTER is dumping something, but is -$8000000 correct? I've read that the base pointer is supposed to be what's added/subtracted to the pointer to get the string address, although I'm not saying you're wrong.
As far as I know, yes, but keep in mind that I don't actually know what the strings are supposed to look like, so I can't really say for sure. If you have a pointer whose value is $08180F1C and the data it points to comes from ROM 0x180F1C, then the difference is -$8000000. That's one of the things that might need to be adjusted; e.g., if the pointer's destination actually comes from ROM 0x1180F1C, then the difference would be -$7000000 instead and that's the value we would want to use for #BASE POINTER.
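The arithmetic for those two cases can be checked with a few lines of Python. The helper names base_pointer and format_base are just for illustration; the only assumption is that the base is whatever you add to the pointer value to land on the ROM offset.

```python
def base_pointer(pointer_value: int, rom_offset: int) -> int:
    """Value to add to the pointer so that it equals the ROM offset."""
    return rom_offset - pointer_value


def format_base(delta: int) -> str:
    """Write the difference the way the #BASE POINTER line does, e.g. -$8000000."""
    sign = "-" if delta < 0 else ""
    return f"{sign}${abs(delta):X}"


# Pointer $08180F1C whose string sits at ROM 0x180F1C:
print(format_base(base_pointer(0x08180F1C, 0x180F1C)))    # -$8000000
# Same pointer, but the string actually sits at ROM 0x1180F1C:
print(format_base(base_pointer(0x08180F1C, 0x1180F1C)))   # -$7000000
```

If the dumped text looks wrong, checking one known string this way is a quick test of whether the base needs adjusting.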

By the way, have you been looking through the game too?
Not really, no, I'm mostly just playing along for fun, but more information is definitely helpful ;).