Romhacking.net

Romhacking => ROM Hacking Discussion => Topic started by: RedComet on December 29, 2008, 02:50:34 pm

Title: Cartographer
Post by: RedComet on December 29, 2008, 02:50:34 pm
http://www.rpgclassics.com/subsites/twit/progs/Cartographer_PR3.zip

I've included a sample of how the dump-by-pointer and raw dump (like romjuice) methods work, using FF1 as an example. I still need to write a few more examples, but I need to find games that I'm not actively working on to demonstrate the features. Everything should be complete, but there may still be bugs, which is the whole point of posting this. Use it and report any bugs you find.
Title: Re: Cartographer
Post by: Pennywise on December 29, 2008, 03:06:08 pm
There's this NES game called Jajamaru Gekkimaden or something that doesn't use control codes and the only way to find where a string begins is by looking at the pointer table. At least that's how I remember things. Could be a possible example.
Title: Re: Cartographer
Post by: Vehek on December 30, 2008, 07:55:15 pm
I don't understand how to use this. Do I need to set up my table a certain way?
 "#0: Line 1: First character of the line is not a recognized table character"

Edit: Is this because my table's in UTF-8 encoding?
Title: Re: Cartographer
Post by: Tauwasser on December 30, 2008, 08:29:53 pm
Seriously, if this does not support unicode, nobody here wants it anyway. Can't be that hard to make it unicode compliant...

cYa,

Tauwasser
Title: Re: Cartographer
Post by: RedComet on December 31, 2008, 11:40:05 am
Quote
I don't understand how to use this. Do I need to set up my table a certain way?
 "#0: Line 1: First character of the line is not a recognized table character"

Edit: Is this because my table's in UTF-8 encoding?

Could you send me your table file so I can check it out?
Title: Re: Cartographer
Post by: Vehek on December 31, 2008, 12:47:28 pm
Is e-mail okay?

I didn't have any problems when I tried using a table in ANSI format.
Title: Re: Cartographer
Post by: RedComet on December 31, 2008, 01:15:22 pm
Yeah, email's fine. rpgcredcomet@gmail.com
Title: Re: Cartographer
Post by: C_CliFF on January 01, 2009, 10:19:22 am
I just tested the program with an old project I had, FF5. I don't know if the program has a bug, but when I try to extract a block that uses 3-byte pointers the program crashes. These are my commands (I tried to explain as well as I could):

Code: [Select]

#GAME NAME: Final Fantasy 5 (SNES)

#BLOCK NAME: Dialogue Block (RAW) // this raw block extracts fine
#TYPE: NORMAL
#METHOD: RAW
#SCRIPT START: $21020D
#SCRIPT STOP: $21FE16
#TABLE: ff5_raw.tbl
#COMMENTS: Yes
#END BLOCK


#BLOCK NAME: Dialogue Block (POINTER_RELATIVE)
#TYPE: NORMAL
#METHOD: POINTER_RELATIVE
#POINTER ENDIAN: LITTLE
#POINTER TABLE START: $2015F0
#POINTER TABLE STOP: $202F3F
#POINTER SIZE: $04 // see POINTER SIZE comment below
#POINTER SPACE: $00
#ATLAS PTRS: Yes
#BASE POINTER: $210200 // see BASE POINTER comment below
#TABLE: ff5_ptr.tbl
#COMMENTS: Yes
#END BLOCK

// POINTER SIZE: This is where the program crashes. The game uses 3-byte pointers
// so I changed it to 04 like the readme said. I tried just for testing
// to change the value to $02 and then it extracted fine. But
// like I said, since the game uses 3-byte pointers, the pointers
// are then wrong. Besides, the script doesn't look like it should.

// BASE POINTER: I'm not sure exactly how this one works. To get the right address
// to the pointer table you subtract 200 and then add C00000 to get the
// right value to calculate the pointers, so I don't really know if
// I did it right. I tried removing the POINTER_RELATIVE method, changing
// it to POINTER and then removing the BASE POINTER command, but the program
// still crashes.

-C_CliFF
Title: Re: Cartographer
Post by: RedComet on January 04, 2009, 07:14:33 pm
The base pointer is the value that you either add to or subtract from the pointer in the table to get the string address. Take the string address and subtract the pointer value from it. The difference is the value that you must add to the pointers to get the string address; therefore it's the base pointer.
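
The rule above can be written down as a small sketch. This is not Cartographer's own code; `base_pointer` and `resolve` are hypothetical helper names, and the numbers in the usage note are made up for illustration.

```cpp
#include <cstdint>

// Hypothetical helper (not from Cartographer): derive the signed base
// pointer from one known pair of string address and pointer value.
std::int64_t base_pointer(std::uint32_t known_string_address,
                          std::uint32_t known_pointer_value) {
    return static_cast<std::int64_t>(known_string_address) -
           static_cast<std::int64_t>(known_pointer_value);
}

// Apply that base pointer to any other pointer read from the table.
// A negative base means the value is effectively subtracted.
std::uint32_t resolve(std::uint32_t pointer_value, std::int64_t base) {
    return static_cast<std::uint32_t>(
        static_cast<std::int64_t>(pointer_value) + base);
}
```

For instance, if a string is known to sit at 0x1000 and its table entry reads 0x8000, `base_pointer(0x1000, 0x8000)` gives -0x7000, and `resolve(0x8010, -0x7000)` gives 0x1010.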

As for the table accepting unicode -- it should, but I'm using Klarth's table library, so if that doesn't then Cartographer doesn't either. I'll test that out.

I'll address those other issues this week hopefully. Short on time.

Title: Re: Cartographer
Post by: Vehek on January 04, 2009, 07:23:10 pm
I haven't tested it yet, but my problem might have to do with UTF-8's "byte-order mark".

Edit-Tested it by cutting the byte-order mark out with a hex-editor, and the table worked.
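
A table loader could do the same thing in code instead of with a hex editor. The sketch below is only an illustration of that idea: `strip_utf8_bom` is a hypothetical helper, not a function from Cartographer or Klarth's table library, and it assumes the first line of the table has already been read into a `std::string`.

```cpp
#include <string>

// Hypothetical helper: drop a leading UTF-8 byte-order mark (EF BB BF)
// from the first line of a table file, if one is present.
std::string strip_utf8_bom(const std::string& line) {
    if (line.size() >= 3 &&
        static_cast<unsigned char>(line[0]) == 0xEF &&
        static_cast<unsigned char>(line[1]) == 0xBB &&
        static_cast<unsigned char>(line[2]) == 0xBF) {
        return line.substr(3);  // keep everything after the 3 BOM bytes
    }
    return line;  // no BOM: return the line unchanged
}
```

Calling it on the first line only is enough, since a byte-order mark can only appear at the very start of the file.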
Title: Re: Cartographer
Post by: Tauwasser on January 05, 2009, 02:29:05 am
Seriously, there gotta be a standard library that can use all file types, be it Unicode, ± BOM, ANSI... It should encourage Unicode of course. And yes, Unicode can be UTF-7, UTF-8, UTF-16, UTF-32 ± BOM or UCS-2, UCS-4 ± BOM! Seriously, I don't get what's so hard about making a unicode compliant program. What is this? 1998?? It's 10 years after Unicode was invented guys, wake up and use it properly!
If there is no ready-to-use c library/header/whatever for this, then it's a shame. I doubt it tho, Microsoft can do it, so can you.


cYa,

Tauwasser
Title: Re: Cartographer
Post by: Nightcrawler on January 05, 2009, 08:43:21 am
Quit being so harsh, especially since you are seemingly ignorant about supporting Unicode formats in C/C++ (assuming this utility was written as such). It's not as trivial as you make it out to be. Suggesting Unicode support is strongly needed is one thing, being an ass about it is another. Offer some assistance if you're going to be so demanding.

This is another example of why I have never, and probably never will, release a public utility to this community. These kinds of reactions make me ill.  :(
Title: Re: Cartographer
Post by: byuu on January 05, 2009, 11:39:38 am
Quote
// POINTER SIZE: This is where the program crashes. The game uses 3-byte pointers
//        so I changed it to 04 like the readme said. I tried just for testing
//       to change the value to $02 and then it extracted fine.

... am I the only one who wondered why you weren't supposed to use a value of $03 to represent a 3-byte pointer? :/

Quote
And yes, Unicode can be UTF-7, UTF-8, UTF-16, UTF-32 ± BOM or UCS-2, UCS-4 ± BOM! Seriously, I don't get what's so hard about making a unicode compliant program.

Windows internally uses UTF-16 only. Linux uses UTF-8 only. You can feed it whatever format you want, but you have to manually convert it.

Easy to get cross-platform support by writing a MultiByteToWideChar(CP_UTF8...) wrapper for 'doze. Losing O(1) character lookup sucks royally, though. UTF-16 can't do that 100% cleanly thanks to surrogate pairs. Sure you can ignore those, but yuck. So you can use the even less common UTF-32, fun. I just make sure my UTF-8 tables use characters that encode as 3-bytes/per so I can be cheap about indexing.
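
To make the lost O(1) lookup concrete, here is a rough sketch of what finding the n-th character in a UTF-8 string involves. It is an illustration only: both function names are hypothetical, and it checks lead bytes without validating continuation bytes.

```cpp
#include <cstddef>
#include <string>

// Length in bytes of a UTF-8 sequence, judged from its lead byte.
// Returns 0 for a continuation byte or an invalid lead byte.
int utf8_sequence_length(unsigned char lead) {
    if (lead < 0x80) return 1;            // 0xxxxxxx: ASCII
    if ((lead & 0xE0) == 0xC0) return 2;  // 110xxxxx
    if ((lead & 0xF0) == 0xE0) return 3;  // 1110xxxx (e.g. kana, most kanji)
    if ((lead & 0xF8) == 0xF0) return 4;  // 11110xxx
    return 0;
}

// Byte offset of the index-th character. This has to walk the string
// from the start, which is the O(n) cost being complained about.
std::size_t utf8_offset(const std::string& s, std::size_t index) {
    std::size_t pos = 0;
    for (std::size_t i = 0; i < index; ++i) {
        if (pos >= s.size()) return std::string::npos;
        int len = utf8_sequence_length(static_cast<unsigned char>(s[pos]));
        if (len == 0) return std::string::npos;
        pos += static_cast<std::size_t>(len);
    }
    return pos <= s.size() ? pos : std::string::npos;
}
```

If every character is known to take exactly 3 bytes, the walk collapses to `index * 3`, which is the shortcut described above.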

The ROM hacking scene has always been about ten years behind the common encoding format, anyway. When I started, it was all about EUC-JIS encoding. Be happy most are at least using Shift JIS now.
Title: Re: Cartographer
Post by: Tauwasser on January 05, 2009, 01:12:08 pm
So, seriously, I have no idea of c/c++, but I'm freaking sick of seeing programs not work for many people because it only supports latin-1 or whatever.
I'm just saying, that there is a way, Microsoft does it - so can everybody else with a little effort. There just has to be a standard library in c/c++ by now that features text file access that does not read ANSI only or crashes on BOM, valid unicode etc.

And yes, by now I think the translation community - especially the translation community - ought to care about that stuff. Cartographer might be nice, however, I'm running a Japanese OS, so it probably won't work for me with German umlauts etc. That's a problem! And there is a solution. And it has been around for 10 years. I just can't believe there is no library in c/c++ or whatever this is written in that supports this!

Quote from: byuu
Windows internally uses UTF-16 only. Linux uses UTF-8 only.

I'm not talking about filenames. The .Net framework is pretty darn good at reading all of those formats (though I didn't test UTF-32, but I read it somewhere... I tested UTF-8, UTF-16 and UCS-2 tho). You don't need to convert any text file for it to be able to read it with StreamReader!

And it's not only the romhacking community. IrfanView doesn't even support UTF-16 filenames! HexWorkshop didn't, but the newest release does! However, it seems so slow compared to other apps (and yes, Microsoft apps!) that work seamlessly and without ANSI and have been working so for years!

cYa,

Tauwasser
Title: Re: Cartographer
Post by: Nightcrawler on January 05, 2009, 01:27:48 pm
Few points:

1. To take advantage of Unicode (UTF-16) on Windows using straight Win32 C, you need to do a bunch of little things including using all 'W' versions of the library functions and variables such as LPWSTR for strings. And don't forget the 'L' in front of a declared string in your code. One time I had a large Win32 program made. A few years later, the company wanted Unicode support added. Converting after the fact was nearly unfeasible. I spent some time with it, but there were so many things to change and so many small errors cropped up, I told them it wasn't worth their time and they should instead include Unicode in the next generation application where it can be added properly into the design rather than kludged to hell after the fact. Converting between character sets is also somewhat painful. Apparently it's not intuitive enough because I've clean forgot off hand how to do it. Unicode in Win32 can be a real pain, especially if you didn't plan for it from the beginning!

2. If you're just using standard C or C++, same story really. You need to use the 'w' versions of the common functions and string/char variables. You still have some issues converting, but I think you can use something like setlocale(), iconv() or something for the output you desire. It's been so long since I did anything in plain vanilla C/C++ that I don't remember, nor did I really do much there involving Unicode.

3. Any .NET language makes character encoding handling insanely easier on Windows. Unicode is always used internally, but you have the freedom to input and output most anything easily. You want S-JIS for input and UTF-8 for output? Or vice versa? Not a problem. For people like myself scarred from former character encoding headache nightmares, they all went away with .NET. :)

It's not surprising that many English made ROM hacking tools don't support Unicode well.


So.... RedComet, what did you program the utility in?
Title: Re: Cartographer
Post by: byuu on January 05, 2009, 02:04:15 pm
Quote
I'm not talking about filenames.

Neither was I, I was talking about displaying the text inside a user interface. But filenames are a huge problem, too.

Quote
To take advantage of Unicode (UTF-16) on Windows using straight Win32 C ... Converting after the fact was nearly unfeasible.

I'll share some of the fun I had porting my apps to Unicode.

1) need to #define UNICODE (cleaner, safer than adding Ws everywhere), which instantly breaks a few hundred to a few thousand Win32 API calls. You'll cry when GCC spits out 4,873 compilation errors.
2) best to write a generic wrapper to turn UTF-8 into UTF-16 on-the-fly:
Code: [Select]
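#include <windows.h>

//wraps a UTF-8 string in a temporary UTF-16 buffer for the Win32 *W functions
class utf16 { public:
  operator wchar_t*() { return buffer; }
  operator const wchar_t*() const { return buffer; }
  utf16(const char *s = "") {
    if(!s) s = "";
    //with a length of -1, the first call returns the required size in wchar_t units, terminator included
    unsigned length = MultiByteToWideChar(CP_UTF8, 0, s, -1, 0, 0);
    buffer = new wchar_t[length + 1]();  //standard value-initialization, so the buffer starts zeroed
    MultiByteToWideChar(CP_UTF8, 0, s, -1, buffer, length);
  }
  ~utf16() { delete[] buffer; }
private: wchar_t *buffer;
};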
... and vice versa.
3) all of your filename passing fails. Have to convert them to UTF-16 first. This also breaks all your libc file access functions: fopen needs to become _wfopen, mkdir needs to become _wmkdir, etc. This also breaks all your third-party libraries: have fun patching zlib, libjma, etc.
4) int main(int argc, char *argv[]) fails. The non-ANSI parts become question marks, so even converting them to UTF-16 won't let you open the files. Need some serious black magic to get that back to valid UTF-8:
Code: [Select]
int __stdcall WinMain(HINSTANCE, HINSTANCE, LPSTR, int) {
  //argv[] is in 7-bit ANSI format; Unicode characters are converted to '?'s.
  //this needs to be converted to UTF-8, eg for realpath(argv[0]) to work.
  int argc;
  wchar_t **wargv = CommandLineToArgvW(GetCommandLineW(), &argc);
  char **argv = new char*[argc];
  for(unsigned i = 0; i < argc; i++) {
    argv[i] = new char[_MAX_PATH];
    strcpy(argv[i], utf8(wargv[i]));
  }
5) all of these changes are Win32-specific, so you have to encapsulate all of them in #ifdef _WIN32, so that it continues to work on pure UTF-8 systems like Linux / BSD / OS X. And if you want unified GUI text for all platforms, then 100% of your Win32 API calls need to wrap UTF-8 -> UTF-16.

The best part ... all of this could be avoided, and current apps could transparently gain Unicode support, if Windows would just accept a UTF-8 codepage with the *A functions.

The bad news, you pretty much have to do this stuff. If someone has a Windows username that isn't pure ANSI, and your app saves data inside their profile (as it should, apps are supposed to store data in the App Data folder), it will completely fail to save the data without Unicode support. This really pisses off non-English speakers, and for good reason. I had someone on 2ch asking why I hated Japanese people because I couldn't load Japanese-named ROMs >_<

The really bad news, most big-name commercial apps can't handle this, either! Winamp, Firefox 2 ... 95% of my applications failed to work at all when I used a non-English profile username.
Title: Re: Cartographer
Post by: RedComet on January 05, 2009, 04:26:41 pm
Quote
So.... RedComet, what did you program the utility in?

C++.
Title: Re: Cartographer
Post by: Gemini on January 05, 2009, 05:23:19 pm
There's already a procedure to convert UTF8 to Windows' Unicode:
Code: [Select]
#include "winldap.h"
#pragma comment(lib, "wldap32.lib")

int UnicodeToUtf8(CString string, char* &dest)
{
int strlen=LdapUnicodeToUTF8(string,string.GetLength(),dest,0);
dest=(char*)new BYTE[strlen];
LdapUnicodeToUTF8(string,string.GetLength(),dest,strlen);
return(strlen);
}

int Utf8ToUnicode(TCHAR* &dest, char* string)
{
int len=LdapUTF8ToUnicode(string,(int)strlen(string),dest,0);
dest=(TCHAR*)new TCHAR[len];
LdapUTF8ToUnicode(string,(int)strlen(string),dest,len);
return(len);
}
I've been using these for almost 3 years, with no problems at all. Ok, it's Windows specific because of wldap, but I for one sure don't care. :p You can also replace CString if anything similar is not available. LPCTSTR+wcslen should work fine for the task. So:
Code: [Select]
int UnicodeToUtf8(LPCTSTR string, char* &dest)
{
int unilen=wcslen(string);
int strlen=LdapUnicodeToUTF8(string,unilen,dest,0);
dest=(char*)new BYTE[strlen];
LdapUnicodeToUTF8(string,unilen,dest,strlen);
return(strlen);
}
Title: Re: Cartographer
Post by: C_CliFF on January 05, 2009, 06:39:18 pm
Quote
... am I the only one who wondered why you weren't supposed to use a value of $03 to represent a 3-byte pointer? :/

You're right. For some odd reason I accidentally read 24 bit instead of 32... :) It doesn't keep the program from crashing though.

-C_CliFF
Title: Re: Cartographer
Post by: Gil Galad on January 06, 2009, 02:22:03 am
I talked to RedComet earlier today about Cartographer, so I have an example to show you guys based on the game Cadillac, which is a playing-card puzzle game. That is also what I am calling the project.

The main point of this post is to address some documentation and explain some things in order to dump text a bit easier.

BASE POINTER

I tried a pointer table dump without success, and then discovered that in order to dump the text of this game I needed to subtract instead of add in the #BASE POINTER command. You need to subtract when the ROM address of the text is lower than the pointer value.

Cadillac is a mapper 3 Famicom game. For those that don't know, mapper 3 here means a game with a single 32KB PRG (program ROM) bank. In the ROM file, that data occupies 10h - 800Fh.

The text data is at 186Fh - 1F39h. To get the address the game actually uses, add $8000 and subtract the $10-byte header: 186Fh + 8000h - 10h = $985F. So $985F is the address, and with the bytes flipped around the pointer is stored as 5F98. You add $8000 because that's the address that the bank starts at, and it adds up right if you use the SetOff pointer calculation.

Next is the pointer table location. I will show you the commands in the file.

#POINTER TABLE START:   $1F3A
#POINTER TABLE STOP:   $1FB1

The first two bytes of the pointer table are 5F98. Those two bytes are the correct pointer for the first line of text in this block.


Based on the way that I have my command files set up, here is how Cartographer normally works. In #BASE POINTER, you give a modifier that is either added to or subtracted from the pointer to find the ROM address. The readme file only says that you can add, but you can also subtract.

You know that 8000 is the SetOff calculation, so based on the way that Cartographer works, if you add $8000 to the pointer, the program is going to crash or not function as intended. $8000 + $985F = 1185F, that's way out of bounds of the PRG bank and the NES address range.

Here is the way around it. Instead, subtract $8000 from $985F, which equals 185Fh, near the intended location. Now, here is where it gets a bit weird. You also have to account for the header: subtract the header size ($10) from the $8000 modifier, which gives $7FF0. So your new BASE POINTER modifier is -$7FF0, and $985F - $7FF0 equals 186Fh, the correct ROM offset.
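
The Cadillac arithmetic can be checked with a few lines of code. This is a sketch using the numbers from this post; the function and constant names are made up for illustration and are not Cartographer's.

```cpp
#include <cstdint>

// Values taken from the post: a $10-byte header in the ROM file and a
// PRG bank that the game sees starting at $8000.
const std::uint32_t kHeaderSize = 0x10;
const std::uint32_t kBankStart = 0x8000;

// Read a 2-byte little-endian pointer (bytes 5F 98 become $985F).
std::uint16_t read_le16(const unsigned char* p) {
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

// Pointer value to ROM file offset: the same as applying a
// BASE POINTER of -$7FF0, since $8000 - $10 = $7FF0.
std::uint32_t pointer_to_rom_offset(std::uint16_t pointer) {
    return pointer - (kBankStart - kHeaderSize);
}
```

Feeding it the first two bytes of the pointer table, 5F 98, gives $985F and then ROM offset 186Fh, matching the start of the text block.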


Table Files

Here are a couple tips for table files. Make sure that you remove all bookmarks from the table files as well as anything that is not supported by Cartographer.

Make sure the line and end break codes are at the bottom of the file. There are two types that I have used, one is for raw and the other for relative pointers.  You can check out the differences in the material that I am going to provide.

The end line codes in your table file should be something like this.

FE=[LINE]\r
FF=[END]\n\r

You can use \r and \n as you wish.

Now, for the RELATIVE POINTER table:

FE=[LINE]\r
/FF=[END]\n\r.

Dumps

For RAW dumps, make sure your table files are correct, or the dump will either fail or come out garbled. I also removed the #END BLOCK command from the raw dump file so that I could dump the text.


In closing, some of these things were already documented. However, these things I talked about are based on my experience and how I solved some of the crashing issues. And the lower ROM address compared to the pointer also needed to be discussed, I believe.

Here are the files for you guys to look at. Some of these files are unedited and directly from Cartographer.

HERE (http://gilgalad.arc-nova.org/junk/projects/cadillac/)


Title: Re: Cartographer
Post by: C_CliFF on January 06, 2009, 07:22:13 am
Thank you! It works now. I just tried with FF5, which uses 3-byte pointers.

The text is stored at 21020D. To get the pointer table value, you subtract 200 and add C00000: 21020D - 200 + C00000 = E1000D. E1000D is the first pointer value, for 21020D.

What I did was subtract 200 from C00000: C00000 - 200 = BFFE00. To make sure I got the right address for where the text is stored, I subtracted BFFE00 from the pointer table value: E1000D - BFFE00 = 21020D. That worked fine, so these are my commands, if you're going to handle games that use standard 3-byte pointers:

Code: [Select]

#GAME NAME: Final Fantasy 5 (SNES)

#BLOCK NAME: Dialogue Block (RAW) // this raw block extracts fine
#TYPE: NORMAL
#METHOD: RAW
#SCRIPT START: $21020D
#SCRIPT STOP: $21FE16
#TABLE: ff5_raw.tbl
#COMMENTS: Yes
#END BLOCK


#BLOCK NAME: Dialogue Block (POINTER_RELATIVE)
#TYPE: NORMAL
#METHOD: POINTER_RELATIVE
#POINTER ENDIAN: LITTLE
#POINTER TABLE START: $2015F0
#POINTER TABLE STOP: $20205F
#POINTER SIZE: $03
#POINTER SPACE: $00
#ATLAS PTRS: Yes
#BASE POINTER: -$BFFE00     // C00000 - 200 = BFFE00. E1000D - BFFE00 = 21020D
#TABLE: ff5_ptr.tbl
#COMMENTS: Yes
#END BLOCK


I wasn't aware that you could subtract as well as add, so now it works.
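
The same calculation for the FF5 case, as a sketch with the numbers from this post. The names are hypothetical, and the 200 and C00000 constants are simply the ones used above.

```cpp
#include <cstdint>

// Read a 3-byte little-endian pointer (bytes 0D 00 E1 become E1000D).
std::uint32_t read_le24(const unsigned char* p) {
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16);
}

// Modifier from the post: C00000 - 200 = BFFE00, used as a negative
// BASE POINTER.
const std::uint32_t kModifier = 0xC00000 - 0x200;

// Pointer value to file offset of the string.
std::uint32_t pointer_to_file_offset(std::uint32_t pointer) {
    return pointer - kModifier;
}
```

With #POINTER SIZE set to $03, the first entry E1000D resolves to 21020D, the start of the dialogue block.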

Thank you, Gil Galad, for clearing that up!

-C_CliFF
Title: Re: Cartographer
Post by: Rappa on February 13, 2009, 10:02:47 am
I'm translating Fire Emblem 3 and using Script Insertor/Extractor. It's easy to use, but it does not support pointer recalculation, so I must do that manually. That's why I tried Cartographer by RedComet to dump the text in FE3.
My table includes entries like this:

Code: [Select]
0000=[END]\n\r
0001=[LINE]\r
01=あ
02=い
03=う
....
F5=父
F6=行
F7=戦
....

120E=身
120F=愛
1210=姉
....

13AE=特
13AF=殊
13B0=離
....

148E=泣
148F=欲
1490=河
1491=音

As you can see, FE3 uses two kinds of kanji: a 1-byte kind (ends at FF) and a 2-byte kind (begins with 00 and 12,13,14 array. I mean in the Rom it would be 0012XX or 0013XX...)
When I dumped the text, I had two problems:


1. The 1byte Kanji and the Kanas (1byte) are confused. Say, we have 12=kana1, 4F=kana2, 124F=Kanji1. Instead of displayed 124F as kana1 kana2, it displayed Kanji1. How to remedy this?

2. I started dumping at the beginning of the sentence, including control codes, and the result looks like this:

Code: [Select]
//<$00>ダ[END]

//ヂさグ<$00>パあ<$00>ジいきマルス王子[LINE]
//タリス城からシ-ダ様が[LINE]
//来られました<$00>ヅ<$00>ズう<$00>パ[END]

//ジ<$00>えどうしたんだ!シ-ダ[LINE]
//城で何か あったのか<$00>ヅ<$00>パあ<$00>ジあきマルス様 会え相かった<$00>ヅ[LINE]
//ガルダの海賊が[LINE]
//<$00>突<$00>然<$00>ち おそっ録たの<$00>ヅ[LINE]
//お城も<$00>占ん<$00>ちされて[LINE]
//大ぜいの人が殺されたわ[LINE]
//おねがい!お父様を<$00>助<$00>ちけて<$00>ヅ<$00>パ[END]



I wonder where I went wrong?

Here I include the table file for you to download, in case you want to check it.

http://www.mediafire.com/?juhumkzmkzy

Control codes explain

0000 = end the text (the pointer reads until this one)
0001 = line break
008800 = begin the talk
0089XX80 = change music to melody XX
00920X0084YYZZ = open a dialogue box, where X is 0 (top) or 1 (bottom), YY is the character's portrait ID, and ZZ is 04 (top left), 05 (top right), 06 (bottom left) or 07 (bottom right)
008A = pause until a key is pressed, then clear the previous text
00920X0002 = switch to the other character, with X=0 for the top character, X=1 for the bottom character
00850X = close the dialogue box, X=2 for top, X=3 for bottom

....

The text I dumped begins at $00071A2E

Could anyone check this for me?
Thank you in advance.

This is a result from Script Extractor/Insertor. Sometimes it works incorrectly!

(http://img266.imageshack.us/img266/4329/khoeoi9.jpg)
(http://img7.imageshack.us/img7/7495/khoe2bh1.jpg) (http://imageshack.us)
(http://img7.imageshack.us/img7/khoe2bh1.jpg/1/w512.png) (http://g.imageshack.us/img7/khoe2bh1.jpg/1/)
Title: Re: Cartographer
Post by: Tauwasser on February 13, 2009, 12:26:38 pm
Quote
(begins with 00 and 12,13,14 array. I mean in the Rom it would be 0012XX or 0013XX...)

But in your table file, you only wrote 12XX or 13XX. If the game has a 00 in front of each of those characters, then you'll need those, too. Also, if your game has it like 0012XX12XX12XX0013XX0012XX12XX0013XX13XX or something, then it's probably really hard to get Cartographer to know what you mean.

Also, please do not post ANSI text, it was converted to gibberish, so noone really gets your point...

cYa,

Tauwasser
Title: Re: Cartographer
Post by: Kajitani-Eizan on February 13, 2009, 02:56:32 pm
Quote
1. The 1byte Kanji and the Kanas (1byte) are confused. Say, we have 12=kana1, 4F=kana2, 124F=Kanji1. Instead of displayed 124F as kana1 kana2, it displayed Kanji1. How to remedy this?

now, correct me if i'm wrong here, but isn't that the expected behavior?
Title: Re: Cartographer
Post by: Rappa on February 14, 2009, 01:31:04 am
Quote
Also, please do not post ANSI text, it was converted to gibberish, so noone really gets your point...

cYa,

Oh sorry. I wrote this in Unicode and opened the file in Notepad on another PC, and it turned out like this. I'll correct the gibberish soon.

Title: Re: Cartographer
Post by: KaioShin on February 14, 2009, 03:51:27 am
Quote
1. The 1byte Kanji and the Kanas (1byte) are confused. Say, we have 12=kana1, 4F=kana2, 124F=Kanji1. Instead of displayed 124F as kana1 kana2, it displayed Kanji1. How to remedy this?

Quote
now, correct me if i'm wrong here, but isn't that the expected behavior?

Yeah... how would the game be able to distinguish between this? This doesn't make much sense.
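
For anyone wondering why 124F comes out as Kanji1: one common way for a table-driven dumper to behave is to try the longest table entry first at each position. The sketch below shows that approach. It is an assumption about how such tools usually work, not code taken from Cartographer or its table library, and `decode_longest_match` is a made-up name.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Greedy decoder: at each position, try the longest key that fits,
// then shorter ones. Keys are raw byte strings (e.g. "\x12\x4F").
std::string decode_longest_match(const std::vector<unsigned char>& data,
                                 const std::map<std::string, std::string>& table,
                                 std::size_t max_key_bytes) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    std::size_t pos = 0;
    while (pos < data.size()) {
        bool matched = false;
        for (std::size_t len = max_key_bytes; len >= 1 && !matched; --len) {
            if (pos + len > data.size()) continue;  // key would run past the data
            std::string key(data.begin() + pos, data.begin() + pos + len);
            std::map<std::string, std::string>::const_iterator it = table.find(key);
            if (it != table.end()) {
                out += it->second;
                pos += len;
                matched = true;
            }
        }
        if (!matched) {
            // No entry at all: emit the byte as <$XX>, as in the dump above.
            out += "<$";
            out += hex[data[pos] >> 4];
            out += hex[data[pos] & 0x0F];
            out += ">";
            ++pos;
        }
    }
    return out;
}
```

With entries for 12, 4F and 124F all present, the bytes 12 4F always match the 2-byte entry first, so the table alone cannot tell the two readings apart.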
Title: Re: Cartographer
Post by: Tauwasser on February 14, 2009, 08:40:10 am
Quote
and 2bytes kind (begins with 00 and 12,13,14 array. I mean in the Rom it would be 0012XX or 0013XX...)

As he said above, IMO it probably needs a 00 in front of it. However, the 00 code could just be a "clear kanji set" or something and 12 would be "use kanji set 12 from now on". So it might have lots of variations or something. Like, kana0012XXYYZZ0013XXZZYY where XX, YY, ZZ are always kanji codes in the 12/13 set and not kana...

We need more info for this one...

cYa,

Tauwasser
Title: Re: Cartographer
Post by: Rappa on February 15, 2009, 10:50:06 am
I corrected the ANSI.

I think 00 is a kind of trigger byte. It signals that the bytes following it are kanji. You can see this in FE3 and FE4.
But I'm OK with it now because I already have the script. I just wonder what I would have done if I didn't have it, so this problem should still be solved.

Anyway, I'm having some trouble with Atlas. It seems that Atlas hates Unicode tables!
Title: Re: Cartographer
Post by: hanhnn on September 24, 2009, 10:27:50 pm
why i can't find this tool in the Utilities section ?
Title: Re: Cartographer
Post by: Nightcrawler on September 25, 2009, 09:12:16 am
We've told RedComet about this at least 3 or 4 times. I don't know why he doesn't add it to the database, especially after he recommends people to use it and is a staff member on this site. It defies logic. I don't believe he's ever given any reason for withholding.

Care to comment RedComet? I'm calling you out!
Title: Re: Cartographer
Post by: RedComet on September 25, 2009, 09:34:30 am
The reasoning behind my logic was that I was going to update it and fix the bugs that people had reported before submitting it. Since that looks like it's gonna take forever, I'll go ahead and submit it later on tonight.
Title: Re: Cartographer
Post by: Nightcrawler on September 25, 2009, 11:22:48 am
The last version is from Dec. 2008 isn't it? And for three quarters of a year, you've been telling people to use it... Yes, I think logically you might want to add it already. How many years will you wait while recommending it in the meantime? :P
Title: Re: Cartographer
Post by: Klarth on September 26, 2009, 01:58:06 am
Atlas was closed source/distribution for about a year or so for the basic features before I redesigned it in its current form.  Probably should've stayed closed for another week or two so I would've made decent docs for it.  :p
Title: Re: Cartographer
Post by: Nightcrawler on September 28, 2009, 09:13:46 am
Quote
Atlas was closed source/distribution for about a year or so for the basic features before I redesigned it in its current form.  Probably should've stayed closed for another week or two so I would've made decent docs for it.  :p

Yes, I can certainly see your point there with Atlas documentation! It's never too late to update!
Title: Re: Cartographer
Post by: Klarth on September 28, 2009, 08:18:01 pm
Quote
Yes, I can certainly see your point there with Atlas documentation! It's never too late to update!

Too busy with work, additional side projects at work (planning a 3 million dollar heart catheterization facility being the biggest), reading books (chemical warfare at the moment), and occasional writing of "the doc to end all docs" which hopefully I can release early next year...no time for Atlas documentation, though the aforementioned doc will have some on Atlas.
Title: Re: Cartographer
Post by: hanhnn on November 08, 2009, 11:11:01 am
i don't know uploading a tool that was already public is so difficult
Title: Re: Cartographer
Post by: RedComet on November 09, 2009, 09:31:58 am
Quote
i don't know uploading a tool that was already public is so difficult

I don't know why using correct grammar is so difficult, but I digress.

http://www.romhacking.net/utils/647/
Title: Re: Cartographer
Post by: misterj on November 09, 2009, 09:51:13 am
It's good that mods troll their own forums.
(It's not)
Title: Re: Cartographer
Post by: Lilinda on November 09, 2009, 09:59:36 am
He's not a mod. >.>
Title: Re: Cartographer
Post by: Tauwasser on November 09, 2009, 12:59:23 pm
Quote
He's not a mod. >.>

And that should be his user title! So finally that gets out of everybody's head. Even people in moderation thought he was until it was pointed out :O!

cYa,

Tauwasser
Title: Re: Cartographer
Post by: BRPXQZME on November 09, 2009, 02:32:59 pm
He will always be the Accidental Destroyer of Threads.
Title: Re: Cartographer
Post by: RedComet on November 10, 2009, 12:15:10 am
Quote
He will always be the Accidental Destroyer of Threads.

I do it better when I'm drunk and have mod powers. :P
Title: Re: Cartographer
Post by: Lilinda on November 10, 2009, 12:16:32 am
There's still a report to moderator button...
Title: Re: Cartographer
Post by: Nightcrawler on November 20, 2009, 03:42:50 pm
I decided to play around with Cartographer since I don't have a robust generic dumper utility. First thing I tried to make it do is fixed length item list dumping. Apparently, it doesn't seem to want to dump more than the first item.

Commands:

#GAME NAME:      Tenshi No Uta (SNES)

#BLOCK NAME:   Magic Names (RAW)
#TYPE:         FIXED_STRING
#STRING LENGTH:   6
#STRING END:      No
#METHOD:      RAW
#SCRIPT START:   $d6f00
#SCRIPT STOP:   $d7640
#TABLE:         tenshi8x16j.tbl
#COMMENTS:      Yes      //start first line with //
#END BLOCK            //remainder of comment placement

It dumps the first 6 bytes and that's it. What am I doing wrong? Or can't Cartographer do something like this?
Title: Re: Cartographer
Post by: Dragonsbrethren on November 20, 2009, 05:15:52 pm
I think you need to do both FIXED_STRING and FIXED_LINE, at least that's what I did:

Code: [Select]
#BLOCK NAME: Status effect names
#TYPE: FIXED_STRING && FIXED_LINE
#STRING LENGTH: 10
#STRING END: No
#LINE LENGTH: 10
#LINE END: Yes
#METHOD: RAW
#SCRIPT START: $2AFE1
#SCRIPT STOP: $2B121
#TABLE: ff6_snes_menu_a.tbl
#COMMENTS: No
#END BLOCK

I can't remember why I did it this way anymore, but there must've been a reason. You also need to have LINE END set to Yes or you'll get an error about it.
Title: Re: Cartographer
Post by: Nightcrawler on November 23, 2009, 11:36:53 am
Ok. Two developments.

1. Cartographer apparently requires commands to be in a specific order or it complains.
2. It now dumps the whole desired range, but doesn't output any line breaks or line controls or anything. I just get a raw dump of the range. My understanding from the documentation is that it should stick line breaks in there or at least print (LINE). I've even tried the #LINE CTRL command. No difference.

Any ideas?

Code: [Select]
#GAME NAME: Tenshi No Uta (SNES)

#BLOCK NAME: Magic Names (RAW)
#TYPE: FIXED_STRING && FIXED_LINE
#STRING LENGTH: 6
#STRING END: No
#LINE LENGTH: 6
#LINE END: Yes
#LINE CTRL: 41
#METHOD: RAW
#SCRIPT START: $d6f00
#SCRIPT STOP: $d7640
#TABLE: tenshi8x16j.tbl
#COMMENTS: Yes //start first line with //
#END BLOCK //remainder of comment placement
//is handled by control codes
Title: Re: Cartographer
Post by: RedComet on November 23, 2009, 05:40:53 pm
How is the data stored? Something like this:

A name
B name
C name
(Each of these is a separate 6 character long string.)

And in ROM it looks like this:

A nameB nameC name

In other words, there's no end of string control code, right?
Title: Re: Cartographer
Post by: Nightcrawler on November 24, 2009, 08:22:06 am
Right. Man oh man, if we can't sort this out soon, I could have coded my own generic dumper by now! :P

In game:
AAAAAA
BBBBBB
CCCCCC
DDDDDD
EEEEEE
FFFFFF

In ROM:
AAAAAABBBBBBCCCCCCDDDDDDEEEEEEFFFFFF

Current Cartographer Output:
AAAAAABBBBBBCCCCCCDDDDDDEEEEEEFFFFFF

Desired Cartographer Output:
AAAAAA
BBBBBB
CCCCCC
DDDDDD
EEEEEE
FFFFFF
Title: Re: Cartographer
Post by: RedComet on November 24, 2009, 09:29:12 am
Code: [Select]
#GAME NAME: Tenshi No Uta (SNES)

#BLOCK NAME: Magic Names (RAW)
#TYPE: FIXED_STRING
#STRING LENGTH: 6
#STRING END: Yes
#END CTRL: [END]                //you can change this to whatever
#METHOD: RAW
#SCRIPT START: $d6f00
#SCRIPT STOP: $d7640
#TABLE: tenshi8x16j.tbl
#COMMENTS: Yes //start first line with //
#END BLOCK //remainder of comment placement
//is handled by control codes

Try that. The #END CTRL specifies a little string of text that you want output at the end of each string. This isn't an actual control code in the game. Rather it's just something I added that would end up in the script to make it easier to determine where individual strings end.

With the above your example should output like this:

AAAAAA[END]

BBBBBB[END]

CCCCCC[END]

DDDDDD[END]

EEEEEE[END]

FFFFFF[END]


The FIXED_LINE thing is really for Genesis games that don't specify line breaks, instead leaving it up to the tilemap to insert them. Bare Knuckle 3 did this. It would stick all of the tiles in VRAM right after one another and the tilemap was changed separately.

Here's an example:

Code: [Select]
#GAME NAME: Tenshi No Uta (SNES)

#BLOCK NAME: Magic Names (RAW)
#TYPE: FIXED_STRING && FIXED_LINE
#STRING LENGTH: 12
#STRING END: Yes
#END CTRL: [END]                //you can change this to whatever
#LINE LENGTH:      6
#LINE END:         Yes
#LINE CTRL:         [LINE]                  //this can be whatever you want too
#METHOD: RAW
#SCRIPT START: $d6f00
#SCRIPT STOP: $d7640
#TABLE: tenshi8x16j.tbl
#COMMENTS: Yes //start first line with //
#END BLOCK //remainder of comment placement
//is handled by control codes

In ROM:
AAAAAABBBBBBCCCCCCDDDDDDEEEEEEFFFFFF

Output:

AAAAAA[LINE]
BBBBBB[END]

CCCCCC[LINE]
DDDDDD[END]

EEEEEE[LINE]
FFFFFF[END]

Hope that clears things up.
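
For illustration, here is a minimal Python sketch of the behaviour described above: cut a raw range into fixed-length strings, optionally break each string into fixed-length lines, and append marker text. It is not Cartographer's code. The function name `dump_fixed` and the ASCII decoding (standing in for a real .tbl lookup) are assumptions made for the example.

```python
def dump_fixed(data, string_length, line_length=None,
               end_ctrl="[END]", line_ctrl="[LINE]", decode=None):
    """Split raw bytes into fixed-length strings, optionally broken into lines."""
    if decode is None:
        # Stand-in for a table-file lookup; assumes one ASCII byte per character.
        decode = lambda chunk: chunk.decode("ascii")
    strings = []
    for start in range(0, len(data), string_length):
        chunk = data[start:start + string_length]
        if line_length:
            lines = [decode(chunk[i:i + line_length])
                     for i in range(0, len(chunk), line_length)]
            # The line marker goes after every line except the last one.
            text = (line_ctrl + "\n").join(lines)
        else:
            text = decode(chunk)
        strings.append(text + end_ctrl)
    # A blank line separates strings, as in the sample output above.
    return "\n\n".join(strings)


rom = b"AAAAAABBBBBBCCCCCCDDDDDDEEEEEEFFFFFF"
fixed_string_only = dump_fixed(rom, string_length=6)
string_and_line = dump_fixed(rom, string_length=12, line_length=6)
print(fixed_string_only)
print(string_and_line)
```

With `string_length=6` every name gets its own `[END]`; with `string_length=12` and `line_length=6` two names share one string and are separated by `[LINE]`.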
Title: Re: Cartographer
Post by: Nightcrawler on November 24, 2009, 04:01:53 pm
Now I'm about right back to where I started. I tried your code. It only outputs 6 characters or 1 item again. It prints [END], however it doesn't bother dumping past the first item. Your program is as lazy as you are Red!  :laugh:

Output:
AAAAAA[END]

Is this a bug in your program?
Title: Re: Cartographer
Post by: RedComet on November 24, 2009, 07:03:40 pm
Yeah, that must be a bug. I dunno what the hell the deal is.  :huh:
Title: Re: Cartographer
Post by: C_CliFF on April 03, 2011, 01:46:58 pm
This was the only related topic I could find for a problem I stumbled upon.

Is it possible for Cartographer to dump text that uses a sandwich format?

This is an 11-year-old project I had, again FF5.

This is how it looks:

(http://img826.imageshack.us/img826/1103/ff5text.png)

And this is where the pointers are:

(http://img847.imageshack.us/img847/9246/ff5pointers.png)

This is a 200h rom.

As you can see, this one doesn't have an end byte to it. These are also not fixed strings.

And these are my settings:

Code: [Select]
// 270200 - 270862

#BLOCK NAME: Location Names (POINTER_RELATIVE)
#TYPE: NORMAL
#METHOD: POINTER_RELATIVE
#POINTER ENDIAN: LITTLE
#POINTER TABLE START: $107200
#POINTER TABLE STOP: $107344
#POINTER SIZE: $02
#POINTER SPACE: $00
#ATLAS PTRS: Yes
#BASE POINTER: -$FFD8FE00 // 0000 - 270200 = FFD8FE00, 0000 - FFD8FE00 = 270200
#TABLE: ff5_ptr.tbl
#COMMENTS: Yes
#END BLOCK

The output from this was a mess, with the text in one big chunk, since the text block doesn't specify an end byte.

So, is it possible for Cartographer to dump text that doesn't have an end byte value?

-C_CliFF
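
The arithmetic in the #BASE POINTER comment above can be checked with a short Python sketch. It assumes 32-bit wraparound and that a string's file offset is the base plus the 16-bit pointer value; whether Cartographer evaluates the setting exactly this way is an assumption, and the helper names `neg32` and `file_offset` are made up for the example.

```python
MASK32 = 0xFFFFFFFF


def neg32(value):
    """Two's-complement negation within 32 bits (0000 - value)."""
    return (-value) & MASK32


def file_offset(pointer, base):
    """File offset of a string, assuming offset = base + pointer (hypothetical)."""
    return (base + pointer) & MASK32


base = neg32(0xFFD8FE00)
print(hex(neg32(0x270200)))           # 0000 - 270200
print(hex(base))                      # 0000 - FFD8FE00
print(hex(file_offset(0x0000, base))) # pointer 0000 lands on the block start
```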
Title: Re: Cartographer
Post by: RedComet on April 03, 2011, 06:28:15 pm
No. I didn't realize such a scheme existed, so I didn't add it.
Title: Re: Cartographer
Post by: Gideon Zhi on April 03, 2011, 07:00:36 pm
Only game I've seen that does this is Romancing SaGa 2. It had a few unused bytes in its data set though, so when initially dumping it I hijacked one of those and wrote a small program to rip the text to a new binary file and insert the dummy byte after each string as a placeholder end-of-string command. Course, this mucked up the data offsets for the pointers, but then, I wasn't using cartographer for it :)
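
A rough Python sketch of the workaround described above, assuming the string boundaries are already known as a sorted list of offsets (one more offset than there are strings) and that the chosen dummy byte never occurs in the text. The function name and sample data are invented for illustration; this is not the actual program used for RS2.

```python
def add_terminators(data, offsets, dummy):
    """Copy each string bounded by consecutive offsets and append a dummy end byte."""
    out = bytearray()
    for start, stop in zip(offsets, offsets[1:]):
        out += data[start:stop]
        out.append(dummy)  # placeholder end-of-string command
    return bytes(out)


text = b"TownCaveCastle"       # three strings stored back to back
offsets = [0, 4, 8, 14]        # n strings, n + 1 boundaries
ripped = add_terminators(text, offsets, 0xFF)
print(ripped)
```

Every string after the first moves by one byte per inserted terminator, which is why the original pointers no longer match the ripped file.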
Title: Re: Cartographer
Post by: RedComet on April 03, 2011, 08:10:09 pm
Only game I've seen that does this is Romancing SaGa 2. It had a few unused bytes in its data set though, so when initially dumping it I hijacked one of those and wrote a small program to rip the text to a new binary file and insert the dummy byte after each string as a placeholder end-of-string command. Course, this mucked up the data offsets for the pointers, but then, I wasn't using cartographer for it :)

I am HURT. :P
Title: Re: Cartographer
Post by: Gideon Zhi on April 03, 2011, 08:45:47 pm
Hey, Cartographer wasn't even a glimmer in your left testicle when RS2's script got dumped ;)
Title: Re: Cartographer
Post by: Nightcrawler on April 04, 2011, 10:48:00 am
This is a method of storage I have omitted from my WIP utility (http://transcorp.parodius.com/forum/YaBB.pl?num=1273690996) as well.  String Types  are C-Style (end terminated), Pascal (string length in the beginning), and Fixed. This is a bastardization. I do support a pointer table to fixed length strings no problem. However, we have variable length here with no length or end indication. Length can only be inferred from pointer math. Perhaps a 'Fixed Length by Next Pointer' option extension to Fixed Length string types. Something to that effect.

However, one big issue is what about the last string? How do you know how long the last string is? How does the game know in this case? This gets a little ugly and may turn into something very game specific.
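
As a reference point, the three string types named above can be sketched in Python as follows. The function names and the default `$00` terminator are assumptions for the example, not part of any tool discussed here. Each reader returns the string bytes and the position just past the string.

```python
def read_c_string(data, pos, terminator=0x00):
    """C-style: read up to (not including) the end byte."""
    end = data.index(terminator, pos)
    return data[pos:end], end + 1


def read_pascal_string(data, pos):
    """Pascal: a one-byte length prefix, then that many bytes."""
    length = data[pos]
    start = pos + 1
    return data[start:start + length], start + length


def read_fixed_string(data, pos, length):
    """Fixed: the length is a constant known in advance."""
    return data[pos:pos + length], pos + length


print(read_c_string(b"AB\x00CD", 0))
print(read_pascal_string(b"\x02ABCD", 0))
print(read_fixed_string(b"ABCD", 0, 2))
```

The format under discussion fits none of these directly: nothing in the string data itself says where a string stops.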
Title: Re: Cartographer
Post by: Gideon Zhi on April 04, 2011, 02:23:20 pm
However, one big issue is what about the last string? How do you know how long the last string is? How does the game know in this case? This gets a little ugly and may turn into something very game specific.

At least in RS2's case, if there are n strings there are n+1 pointers.
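
A minimal Python sketch of dumping with n+1 pointers, where string i runs from pointer i up to pointer i+1. It assumes little-endian 16-bit pointers relative to a known base; the function name and the tiny synthetic ROM are made up for illustration.

```python
import struct


def dump_by_pointer_math(rom, table_start, count, base):
    """Read count + 1 pointers; the extra one only marks the end of the last string."""
    ptrs = struct.unpack_from("<%dH" % (count + 1), rom, table_start)
    return [rom[base + a:base + b] for a, b in zip(ptrs, ptrs[1:])]


# Synthetic ROM: a 4-entry pointer table followed by 3 strings with no end bytes.
table = struct.pack("<4H", 0, 4, 8, 14)
rom = table + b"TownCaveCastle"
strings = dump_by_pointer_math(rom, table_start=0, count=3, base=len(table))
print(strings)
```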
Title: Re: Cartographer
Post by: C_CliFF on April 04, 2011, 03:14:05 pm
It's the same for the last string. Doesn't end with an end byte. The script ends when it hits an end byte value, which in this case passes beyond this script block, the monster names and spells. The monster and spell names are fixed strings, though.

The last string is supposed to end at $270862 but ends at $272962. The whole block, with the available space included, is between $270200 and $270AFF, so the extraction gets quite far before it hits the end.

This is the only block in the game that has this kind of format, so fortunately, it's not a real biggie. But maybe this applies to other games besides this and RS2.

-C_CliFF
Title: Re: Cartographer
Post by: RedComet on April 04, 2011, 06:30:31 pm
So the last string ends with an end control code or it doesn't end with an end control code? Your post is confusing.
Title: Re: Cartographer
Post by: C_CliFF on April 04, 2011, 06:52:37 pm
So the last string ends with an end control code or it doesn't end with an end control code? Your post is confusing.

Sorry, it doesn't end with an end control code.

-C_CliFF
Title: Re: Cartographer
Post by: Nightcrawler on April 05, 2011, 09:05:40 am
It's the same for the last string. Doesn't end with an end byte. The script ends when it hits an end byte value, which in this case passes beyond this script block, the monster names and spells. The monster and spell names are fixed strings, though.

The last string is supposed to end at $270862 but ends at $272962. The whole block, with the available space included, is between $270200 and $270AFF, so the extraction gets quite far before it hits the end.

This is the only block in the game that has this kind of format, so fortunately, it's not a real biggie. But maybe this applies to other games besides this and RS2.

-C_CliFF

I don't understand what you're saying. When the game hits the last pointer to a string, how does it know when to stop? Is it like Gideon described where there is an extra pointer (that doesn't go to any text) at the end of the pointers? That makes sense to me. It seems for you, it just keeps reading past the end of the useful string until it hits some end indication at the end of your entire data area? If that's the case, what stops it from printing all that on the screen when that string is called until it gets there?
Title: Re: Cartographer
Post by: C_CliFF on April 05, 2011, 01:25:44 pm
Your post is confusing.

I don't understand what you're saying.
I re-read my post a few times and I see that it was confusing.

It seems for you, it just keeps reading past the end of the useful string until it hits some end indication at the end of your entire data area? If that's the case, what stops it from printing all that on the screen when that string is called until it gets there?

This is what happens. It stops because I have an end byte value ($00) in my table file.

The text reads on and on until it gets to the end byte control code I have in my table file. I've uploaded a file so you can see what the output looks like. I shortened it because the text repeats until the pointer end offset I specified in the command settings above.

http://www.mediafire.com/?15yhrhfwo7i2w87

Sorry for the confusion.  My english is not good today.

EDIT: Oh, and the last pointer points to Elder Tree. The rest belongs to other script blocks.

-C_CliFF


Title: Re: Cartographer
Post by: Nightcrawler on April 05, 2011, 03:18:23 pm
OK. I see what the issue is. You're explaining again why you are having trouble dumping the text with Cartographer. I already understood that part. When you use Cartographer it doesn't know where any of the strings end and overruns the entire area until it hits a $00 for each string. The result is you get a big mess of a block for each string.

What I want to know is how does the GAME know when a string ends? Most likely it uses the n+1 pointer to figure it out. That's not too big a deal. I could do that. The million dollar question is what does it do on the last string, where conceivably there would be no more pointers. Gideon's game, RS2, simply has an extra pointer at the end that serves no purpose other than to calculate the end of the last string. I wanted to know if your game did the same or used some other game specific method to know. That's the only way I can really help you out.

It seems you probably don't know what the game does. Are you experienced enough to find out? To dump this properly, you need to figure that out or just overdump on the last string and trim each block manually. You'd still need a dumper that can do fixed length string lengths based on pointer calculations (which it does not appear any public ones do).
Title: Re: Cartographer
Post by: C_CliFF on April 05, 2011, 04:58:39 pm
OK. I see what the issue is. You're explaining again why you are having trouble dumping the text with Cartographer. I already understood that part. When you use Cartographer it doesn't know where any of the strings end and overruns the entire area until it hits a $00 for each string. The result is you get a big mess of a block for each string.

What I want to know is how does the GAME know when a string ends? Most likely it uses the n+1 pointer to figure it out. That's not too big a deal. I could do that. The million dollar question is what does it do on the last string, where conceivably there would be no more pointers. Gideon's game, RS2, simply has an extra pointer at the end that serves no purpose other than to calculate the end of the last string. I wanted to know if your game did the same or used some other game specific method to know. That's the only way I can really help you out.

It seems you probably don't know what the game does. Are you experienced enough to find out? To dump this properly, you need to figure that out or just overdump on the last string and trim each block manually. You'd still need a dumper that can do fixed length string lengths based on pointers (which it does not appear any public ones do).

The thing is that I already know why. The question I was asking in the first place was whether Cartographer is able to dump text that uses this kind of format. My purpose wasn't to take this much further than that. I know you're working on TextAngel, and when Gid told me his game used this kind of format, I thought it would be a good idea to explain how FF5 handles this text block, since it might be of some use for your tool (which I'm not doing well at, at all), unless it's game specific.

This text is for the location names, and it's the only block that has this kind of format. The rest of the game's text uses either standard text handling or fixed strings for items, magics, monster names and such. This is how it looks at the end of the text block:

(http://img197.imageshack.us/img197/3313/ff5text2.png)

The rest of the text that comes after where my cursor is, is probably some old translated text that hasn't been deleted. But this is where the last pointer points to, and this is how it looks in the pointer table:

(http://img97.imageshack.us/img97/2871/ff5pointers2.png)

I hope this explains it more clearly, and sorry for any misunderstanding about why I bumped this thread.

-C_CliFF



 
Title: Re: Cartographer
Post by: Gideon Zhi on April 05, 2011, 10:14:19 pm
Maybe it's just that it's 10PM and I've come from a two-hour calculus review, but those pointers don't appear to match up to your data in any meaningful way, even if they're relative based on some offset. If we assume that "Oldest Tree" starts at relative address 0x662, then "Underground River" relatively starts at 0x66D, but the pointer is 0x673. If we instead assume that "Oldest Tree" starts at relative address 0x658, then "Underground River" relatively starts at address 0x663, but again, its pointer is 0x662.
Title: Re: Cartographer
Post by: RedComet on April 05, 2011, 10:30:14 pm
You'd still need a dumper that can do fixed length string lengths based on pointers (which it does not appear any public ones do).

I'm pretty sure Cartographer does that. :P At least, I seem to remember coding it in there for Bare Knuckle 3. :huh:
Title: Re: Cartographer
Post by: Nightcrawler on April 06, 2011, 02:30:07 pm
Red:
I meant to say "can do fixed length string lengths based on pointer math", such as the case here. By the way, are you interested in possibly updating Cartographer in the event Klarth issues an update to TableLib that would use the new table file standard (http://transcorp.parodius.com/forum/YaBB.pl?num=1273691610/0) we were working on? He mentioned he would update Atlas as well when the time comes.

C_CliFF:
It looks like this is the same case as Gideon's. Correct me if I'm mistaken, but pointer 0662 points to 'Underground River' which is the last string, correct? If so, there is one more pointer 0673 which points to where your cursor is, the end of the block. If so, we have the answer here. There's just one extra pointer that is simply used to calculate the end of the block and does not point to a string.

I will see about implementing this method in my WIP utility. I can't think of the game, but I know I have seen this elsewhere in the past. I think I still classify this as 'fixed length' string type. They are surely not C-style/end terminated or pascal. I don't know of any other string type to classify them as. So I will just treat it as an extra checkbox option extension for fixed length strings to determine the length by relative pointer math instead of the defined fixed length constant.
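
The two pointer values quoted above are consistent with that reading, as a quick Python check shows. It assumes one byte per character in this block, which depends on the game's table and is not confirmed in the thread.

```python
last_string_ptr = 0x0662   # pointer to 'Underground River'
extra_ptr = 0x0673         # extra pointer marking the end of the block

length = extra_ptr - last_string_ptr
name = "Underground River"

# Assumption: each character occupies exactly one byte in this block.
print(length, len(name))
```

Both values come out to 17, so the extra pointer lands exactly one byte past the final character of the last string.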
Title: Re: Cartographer
Post by: C_CliFF on April 06, 2011, 04:39:52 pm
C_CliFF:
It looks like this is the same case as Gideon's. Correct me if I'm mistaken, but pointer 0662 points to 'Underground River' which is the last string, correct? If so, there is one more pointer 0673 which points to where your cursor is, the end of the block. If so, we have the answer here. There's just one extra pointer that is simply used to calculate the end of the block and does not point to a string.

That's absolutely correct.

-C_CliFF
Title: Re: Cartographer
Post by: RedComet on April 06, 2011, 05:31:33 pm
Red:
I meant to say "can do fixed length string lengths based on pointer math", such as the case here. By the way, are you interested in possibly updating Cartographer in the event Klarth issues an update to TableLib that would use the new table file standard (http://transcorp.parodius.com/forum/YaBB.pl?num=1273691610/0) we were working on? He mentioned he would update Atlas as well when the time comes.

Yeah, I've been thinking about revisiting Cartographer for a while now and fixing a few things and adding a few others that people have brought up to me in the past year or so. Probably would be beneficial for both of us if the community spoke up and said what exactly they want out of a text dumper. I'm sure it would only improve TextAngel and Cartographer if we were able to cover all the bases. Who knows,  someone else may have a weird format like this that might come up later on.
Title: Re: Cartographer
Post by: Pennywise on April 06, 2011, 07:52:36 pm
I have a sort of weird format I've come across. I think it was somewhat similar to Cliff's.

Basically there are at least two large blocks of text where the text has a bunch of non-script data embedded into the text. I can dump by pointers easily and get all the data, but there's not a standard control code to signify an end to a string. So I end up with a lot of overlap because all the pointers are normal and follow logical increments in that the offsets don't jump around.
Title: Re: Cartographer
Post by: abw on April 06, 2011, 08:33:00 pm
Probably would be beneficial for both of us if the community spoke up and said what exactly they want out of a text dumper.

I'll take that as an invitation to bring up an earlier discussion (http://www.romhacking.net/forum/index.php/topic,8945.0.html) :P

I'm not sure whether this will be useful or not, but after thinking about it for a little bit, achieving the desired text-based effects from a pointer-based implementation is probably not as difficult as I had assumed:

Code: [Select]
sorted_ptrs = pointer_table.sort(by_target_address)
cur_text_pos = sorted_ptrs[0].target_address - 1

for (i = 0 to sorted_ptrs.length) {
    if (sorted_ptrs[i].target_address <= cur_text_pos) {
        if (sorted_ptrs[i].target_address == sorted_ptrs[i - 1].target_address) {
            // handle duplicate pointer
        } else {
            // handle partially overlapping text
        }
    } else {
        if (sorted_ptrs[i].target_address > cur_text_pos + 1) {
            // handle pointerless text between sorted_ptrs[i - 1] and sorted_ptrs[i]
        }
        // dump text as normal
    }
    // byte_length includes printable text as well as any applicable control codes
    // (e.g. end token for C-style strings, length token for Pascal strings, screen position tokens, etc.)
    cur_text_pos = sorted_ptrs[i].target_address + sorted_ptrs[i].byte_length
}

The basic goal is to keep track of which bytes in the dump range have been dumped as well as how many times they've been dumped (0, 1, > 1). If we sort the pointer table by the addresses the pointers point to (their target addresses), figuring out the start and (to a lesser extent) the end points of the dump range becomes trivial, and from there it's just a linear scan through the dump range, moving from pointer to pointer in ascending target address order, with some backtracking if pointer ranges overlap.
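
The linear scan described above could look roughly like this in runnable Python. It is a sketch under stated assumptions: each pointer's byte length is already known, and the names `classify` and `covered_to` are invented for the example. Unlike the pseudocode, it keeps the furthest end seen so far (`max`) so that a short string nested inside a longer one does not move the scan position backwards.

```python
def classify(pointers, lengths):
    """Walk pointers in target-address order and label what each one covers.

    pointers: list of target addresses (duplicates allowed)
    lengths:  dict mapping target address -> byte length of its string
    Returns a list of (kind, start, stop) tuples.
    """
    events = []
    ordered = sorted(pointers)
    covered_to = ordered[0]   # first byte that has not been dumped yet
    prev = None
    for addr in ordered:
        stop = addr + lengths[addr]
        if addr == prev:
            events.append(("duplicate", addr, stop))
        elif addr < covered_to:
            events.append(("overlap", addr, stop))
        else:
            if addr > covered_to:
                # Pointerless bytes between the previous string and this one.
                events.append(("gap", covered_to, addr))
            events.append(("text", addr, stop))
        covered_to = max(covered_to, stop)
        prev = addr
    return events


events = classify([0, 5, 5, 3, 12], {0: 5, 3: 2, 5: 4, 12: 2})
for event in events:
    print(event)
```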
Title: Re: Cartographer
Post by: Nightcrawler on April 07, 2011, 10:24:11 am
RedComet:
OK. I will keep you in the loop when I get back around to talking to Klarth again.

Pennywise:
It sounds like the same thing. Whether you have data or not in the mix is immaterial. You want to dump by having the program figure out the 'string' (even if it has data) length by pointer math. Although you may have a different case. How does your game know when it has reached the end of a 'string'?

abw:
I will handle duplicate strings. I provide for an option to both order strings by pointer location and combine duplicates if desired. I believe I can also handle the case where you have a pointer to a string contained within another string. It is a logical extension after strings are already ordered by pointer to check if the n+1 string is within the dumped range for the current string n. If so, combine output. Theoretically, you'd probably need to check n+2, n+3 or go down the line until it doesn't overlap to be most flexible. This extension does add a good deal of logic and post processing. I see the usefulness, however I don't know if I will implement this or not. What I can say though is I will ensure my internal data structure would be able to support such functionality if implemented. These kinds of things are quite possible when using an internal data structure with attributes for your string/pointer combinations. It allows for post-processing of this kind of thing before making any kind of output.
Title: Re: Cartographer
Post by: Pennywise on April 07, 2011, 05:49:21 pm
Under normal circumstances, just a simple check for FF. However, the text I'm talking about is a little complicated and it doesn't appear to use an end of string control code. So far the only thing I've been able to determine is that the non-text data starts when a value in the script is in the C0-E0 range; then it will read some more non-text data and, without stopping, switch pointers. As an example we have $C2 $98 $01 $EB $03

As I mentioned before any value between C0-E0 seems to trigger the non-script data and $03 is how many pointers to skip to get to the desired pointer in the table. I haven't yet messed with other strings to see if after $C2, the end is 4 values away.

I did some more messing around and there's another control code that seems to correspond to presenting choices for you to pick from. The number of choices available is specified in the non-script data and so the more choices available, the longer the data.
Title: Re: Cartographer
Post by: abw on April 07, 2011, 09:47:43 pm
A second post-processing pass is definitely the way I would go in a situation like this. For output a forward-facing loop probably would be better than my backward-facing loop. If you don't mind outputting in sorted pointer order (or if each pointer can figure out its next sorted sibling), something like this might work:

Code: [Select]
for (i = 0 to sorted_ptrs.length) {
    // if there is a next pointer and the current pointer finishes *after* the next pointer starts
    if (sorted_ptrs[i + 1] && sorted_ptrs[i].target_address + sorted_ptrs[i].data.length > sorted_ptrs[i + 1].target_address) {
        // handle overlapping text
        output sorted_ptrs[i].target_address // e.g. #W16($280F4)
        output sorted_ptrs[i].data up to sorted_ptrs[i + 1].target_address // nop for duplicate pointers
        // if the next pointer ends *before* the current pointer ends, make sure any trailing text from the current pointer will eventually get printed
        sorted_ptrs[i + 1].data += sorted_ptrs[i].data from (sorted_ptrs[i + 1].target_address + sorted_ptrs[i + 1].data.length to sorted_ptrs[i].target_address + sorted_ptrs[i].data.length)
        // rest of text will be output when dealing with next pointer
    } else {
        // dump text as normal
        output sorted_ptrs[i].target_address
        output sorted_ptrs[i].data
        // if there is a next pointer and there's a gap between where the current pointer finishes and the next pointer starts
        if (sorted_ptrs[i + 1] && sorted_ptrs[i].target_address + sorted_ptrs[i].data.length < sorted_ptrs[i + 1].target_address + 1) {
            output data from (sorted_ptrs[i].target_address + sorted_ptrs[i].data.length to sorted_ptrs[i + 1].target_address)
        }
    }
}
Title: Re: Cartographer
Post by: Nightcrawler on April 08, 2011, 08:40:21 am
Under normal circumstances, just a simple check for FF. However, the text I'm talking about is a little complicated and it doesn't appear to use an end of string control code. So far the only thing I've been able to determine is that the non-text data starts when a value in the script is in the C0-E0 range; then it will read some more non-text data and, without stopping, switch pointers. As an example we have $C2 $98 $01 $EB $03

As I mentioned before any value between C0-E0 seems to trigger the non-script data and $03 is how many pointers to skip to get to the desired pointer in the table. I haven't yet messed with other strings to see if after $C2, the end is 4 values away.

I did some more messing around and there's another control code that seems to correspond to presenting choices for you to pick from. The number of choices available is specified in the non-script data and so the more choices available, the longer the data.

It seems as though there is insufficient data to dump this properly even if you were to make a custom dumper. You need to figure out what those game commands are doing more specifically. You know the end result is a pointer switch, but you don't know exactly how that occurs. So, how would you be able to dump based only on what you know so far? It helps to examine the game code in these cases.

Whatever you discover is going to be pretty game specific. It would probably still need a custom dumper. The way I would envision something like this working in a generic tool is figuring out more about those data commands and sticking them in the table as linked entries. You just need to know how many bytes, if any, are associated with each and not necessarily what they all do. We just need to be able to parse properly and not falsely hit parameter bytes. Then define the few that change the pointer or end the string to the dumper. So basically, it will dump until it hits a defined linked entry. I think you need to know most of this to dump the script anyway using any tool unless you do it the way you have with lots of overlap and garbage.
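
To make the linked-entry idea concrete, here is a hedged Python sketch: a table maps each command byte to the number of parameter bytes that follow it, and a separate set names the commands that end the string. The parameter count of 4 for `$C2` is only a guess taken from the example bytes quoted above, and every name here (`dump_until_end`, `params`, `end_codes`, `printable`) is hypothetical.

```python
def dump_until_end(data, pos, params, end_codes, printable):
    """Walk a script, consuming parameter bytes, until an ending command is hit."""
    out = []
    while pos < len(data):
        byte = data[pos]
        pos += 1
        if byte in params:
            count = params[byte]
            args = data[pos:pos + count]
            pos += count
            # Parameters are printed as hex so they are never mistaken for text.
            out.append("<$%02X%s>" % (byte, "".join(" $%02X" % a for a in args)))
            if byte in end_codes:
                break
        else:
            out.append(printable.get(byte, "<$%02X>" % byte))
    return "".join(out), pos


script = bytes([0x41, 0x42, 0xC2, 0x98, 0x01, 0xEB, 0x03, 0x43])
text, next_pos = dump_until_end(
    script, 0,
    params={0xC2: 4},        # guessed: $C2 takes four parameter bytes
    end_codes={0xC2},        # guessed: $C2 also ends the string
    printable={0x41: "A", 0x42: "B", 0x43: "C"},
)
print(text, next_pos)
```

The dumper only needs the byte counts to stay in sync; what each command actually does can remain unknown.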

These types of scripts with embedded game commands require the most knowledge about the text system to be able to dump and insert. You typically need to know enough to be able to extract only the text, or parse or skip the commands. Insertion can be even trickier without knowing more about the commands. In most cases for insertion I end up adding a new relocation control code in the original blocks where the original text was so I don't have to alter any of the game commands.