
Author Topic: Cartographer  (Read 32569 times)

RedComet

  • Hero Member
  • *****
  • Posts: 3163
    • View Profile
    • Twilight Translations
Cartographer
« on: December 29, 2008, 02:50:34 pm »
http://www.rpgclassics.com/subsites/twit/progs/Cartographer_PR3.zip

I've included a sample of how the dump-by-pointer and raw dump (like romjuice) methods work, using FF1 as an example. I still need to write a few more examples, but I need to find games that I'm not actively working on to demonstrate the features. Everything should be complete, but there may still be bugs, which is the whole reason I'm posting this. Use it and report any bugs you find.
Twilight Translations - More than just Dragonball Z. :P

Pennywise

  • Hero Member
  • *****
  • Posts: 2232
  • I'm curious
    • View Profile
    • Yojimbo's Translations
Re: Cartographer
« Reply #1 on: December 29, 2008, 03:06:08 pm »
There's this NES game called Jajamaru Gekkimaden or something that doesn't use control codes and the only way to find where a string begins is by looking at the pointer table. At least that's how I remember things. Could be a possible example.

Vehek

  • Full Member
  • ***
  • Posts: 174
    • View Profile
Re: Cartographer
« Reply #2 on: December 30, 2008, 07:55:15 pm »
I don't understand how to use this. Do I need to set up my table a certain way?
 "#0: Line 1: First character of the line is not a recognized table character"

Edit: Is this because my table's in UTF-8 encoding?
« Last Edit: December 30, 2008, 08:17:08 pm by Vehek »

Tauwasser

  • Hero Member
  • *****
  • Posts: 1392
  • Fantabulous!!
    • View Profile
    • My blog
Re: Cartographer
« Reply #3 on: December 30, 2008, 08:29:53 pm »
Seriously, if this does not support unicode, nobody here wants it anyway. Can't be that hard to make it unicode compliant...

cYa,

Tauwasser

RedComet

Re: Cartographer
« Reply #4 on: December 31, 2008, 11:40:05 am »
Quote from: Vehek
I don't understand how to use this. Do I need to set up my table a certain way?
 "#0: Line 1: First character of the line is not a recognized table character"

Edit: Is this because my table's in UTF-8 encoding?

Could you send me your table file so I can check it out?

Vehek

Re: Cartographer
« Reply #5 on: December 31, 2008, 12:47:28 pm »
Is e-mail okay?

I didn't have any problems when I tried using a table in ANSI format.
« Last Edit: December 31, 2008, 01:08:57 pm by Vehek »

RedComet

Re: Cartographer
« Reply #6 on: December 31, 2008, 01:15:22 pm »
Yeah, email's fine. rpgcredcomet@gmail.com

C_CliFF

  • Jr. Member
  • **
  • Posts: 63
    • View Profile
    • General CoolNES Translations
Re: Cartographer
« Reply #7 on: January 01, 2009, 10:19:22 am »
I just tested the program with an old project of mine, FF5. I don't know if the program has a bug, but when I try to extract a block that uses 3-byte pointers the program crashes. These are my commands (I tried to explain them as well as I could):

Code: [Select]

#GAME NAME: Final Fantasy 5 (SNES)

#BLOCK NAME: Dialogue Block (RAW) // this raw block extracts fine
#TYPE: NORMAL
#METHOD: RAW
#SCRIPT START: $21020D
#SCRIPT STOP: $21FE16
#TABLE: ff5_raw.tbl
#COMMENTS: Yes
#END BLOCK


#BLOCK NAME: Dialogue Block (POINTER_RELATIVE)
#TYPE: NORMAL
#METHOD: POINTER_RELATIVE
#POINTER ENDIAN: LITTLE
#POINTER TABLE START: $2015F0
#POINTER TABLE STOP: $202F3F
#POINTER SIZE: $04 // see POINTER SIZE comment below
#POINTER SPACE: $00
#ATLAS PTRS: Yes
#BASE POINTER: $210200 // see BASE POINTER comment below
#TABLE: ff5_ptr.tbl
#COMMENTS: Yes
#END BLOCK

// POINTER SIZE: This is where the program crashes. The game uses 3-byte pointers
// so I changed it to 04 like the readme said. I tried just for testing
// to change the value to $02 and then it extracted fine. But
// like I said, since the game uses 3-byte pointers, the pointers
// are wrong. Besides, the script doesn't look like it should.

// BASE POINTER: I'm not sure exactly how this one works. To get the right address
// for the pointer table you subtract 200 and then add C00000 to get the
// right value to calculate the pointers, so I don't really know if
// I did it right. I tried removing the POINTER_RELATIVE method and changing
// it to POINTER, and then removed the BASE POINTER command, but the program
// still crashes.

-C_CliFF
« Last Edit: January 03, 2009, 08:34:45 am by C_CliFF »
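
A minimal sketch, not Cartographer's actual code, of how a 3-byte little-endian pointer could be read and turned into a file offset. The helper names `readPointerLE` and `snesToFileOffset` are made up for illustration, and the mapping assumes the rule from the comments above (file offset minus 200h plus C00000h gives the game's address) holds for the whole block.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Read a little-endian pointer of `size` bytes (1 to 4) starting at `pos`.
std::uint32_t readPointerLE(const std::vector<std::uint8_t>& rom,
                            std::size_t pos, std::size_t size) {
    assert(size >= 1 && size <= 4 && pos + size <= rom.size());
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < size; ++i) {
        // Byte i is the i-th least significant byte of the pointer.
        value |= static_cast<std::uint32_t>(rom[pos + i]) << (8 * i);
    }
    return value;
}

// Hypothetical helper: map an address in the $C00000+ range to an offset in a
// ROM file that starts with a 0x200-byte header (the reverse of
// "subtract 200, then add C00000").
std::uint32_t snesToFileOffset(std::uint32_t snesAddress) {
    assert(snesAddress >= 0xC00000);
    return snesAddress - 0xC00000 + 0x200;
}
```

With the illustrative bytes 0D 00 E1, a pointer size of 3 yields $E1000D, which maps to file offset $21020D. Reading only 2 bytes drops the bank byte and yields $000D, which is one way a dump can "work" with the wrong size and still produce a wrong script.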

RedComet

Re: Cartographer
« Reply #8 on: January 04, 2009, 07:14:33 pm »
The base pointer is the value that you either add to or subtract from the pointer in the table to get the string address. Take the string address and subtract the pointer value from it. The difference is the value that you must add to the pointers to get the string address, so that difference is the base pointer.
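
As a rough sketch of that arithmetic (the function names are hypothetical and this is not how Cartographer is implemented internally), the base pointer can be derived from one known string/pointer pair and then applied to the rest of the table. A signed type is assumed so that a "subtract" base comes out negative:

```cpp
#include <cassert>
#include <cstdint>

// Base pointer = string address - pointer value. The result may be negative,
// so a signed 64-bit type is used.
std::int64_t deriveBasePointer(std::uint32_t stringAddress,
                               std::uint32_t pointerValue) {
    return static_cast<std::int64_t>(stringAddress) -
           static_cast<std::int64_t>(pointerValue);
}

// Add the base pointer to a pointer value to recover the string address.
std::uint32_t applyBasePointer(std::uint32_t pointerValue,
                               std::int64_t basePointer) {
    const std::int64_t address =
        static_cast<std::int64_t>(pointerValue) + basePointer;
    assert(address >= 0 && address <= 0xFFFFFFFFLL);  // must stay addressable
    return static_cast<std::uint32_t>(address);
}
```

With made-up numbers, a string at file offset $10133 whose table entry is $0123 gives a base pointer of $10010. A string at $100 with a pointer of $300 gives -$200, which is the case where you subtract.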

As for the table accepting unicode -- it should, but I'm using Klarth's table library, so if that doesn't then Cartographer doesn't either. I'll test that out.

I'll address those other issues this week hopefully. Short on time.


Vehek

Re: Cartographer
« Reply #9 on: January 04, 2009, 07:23:10 pm »
I haven't tested it yet, but my problem might have to do with UTF-8's "byte-order mark".

Edit-Tested it by cutting the byte-order mark out with a hex-editor, and the table worked.
« Last Edit: January 04, 2009, 09:18:25 pm by Vehek »
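
For reference, the UTF-8 byte-order mark is the three bytes EF BB BF at the very start of the file. A minimal sketch of skipping it in code instead of with a hex editor, assuming the table file has already been read into a `std::string` (the function name is made up, and this is not taken from Cartographer or its table library):

```cpp
#include <cassert>
#include <string>

// Return the contents with a leading UTF-8 byte-order mark (EF BB BF)
// removed, if one is present. Anything else is returned unchanged.
std::string stripUtf8Bom(const std::string& contents) {
    static const std::string bom = "\xEF\xBB\xBF";
    if (contents.compare(0, bom.size(), bom) == 0) {
        return contents.substr(bom.size());
    }
    return contents;
}
```

Only a mark at offset 0 is removed, so table entries that happen to contain those bytes later in the file are left alone.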

Tauwasser

Re: Cartographer
« Reply #10 on: January 05, 2009, 02:29:05 am »
Seriously, there gotta be a standard library that can use all file types, be it Unicode, ± BOM, ANSI... It should encourage Unicode of course. And yes, Unicode can be UTF-7, UTF-8, UTF-16, UTF-32 ± BOM or UCS-2, UCS-4 ± BOM! Seriously, I don't get what's so hard about making a unicode compliant program. What is this? 1998?? It's 10 years after Unicode was invented guys, wake up and use it properly!
If there is no ready-to-use c library/header/whatever for this, then it's a shame. I doubt it tho, Microsoft can do it, so can you.


cYa,

Tauwasser

Nightcrawler

  • Hero Member
  • *****
  • Posts: 5753
    • View Profile
    • Nightcrawler's Translation Corporation
Re: Cartographer
« Reply #11 on: January 05, 2009, 08:43:21 am »
Quit being so harsh, especially since you are seemingly ignorant about supporting Unicode formats in C/C++ (assuming this utility was written in one of them). It's not as trivial as you make it out to be. Suggesting Unicode support is strongly needed is one thing, being an ass about it is another. Offer some assistance if you're going to be so demanding.

This is another example of why I have never, and probably never will, release a public utility to this community. These kinds of reactions make me ill.  :(
TransCorp - Over 20 years of community dedication.
Dual Orb 2, Wozz, Emerald Dragon, Tenshi No Uta, Glory of Heracles IV SFC/SNES Translations

byuu

  • Hero Member
  • *****
  • Posts: 888
    • View Profile
    • A farewell message
Re: Cartographer
« Reply #12 on: January 05, 2009, 11:39:38 am »
Quote
// POINTER SIZE: This is where the program crashes. The game uses 3-byte pointers
//        so I changed it to 04 like the readme said. I tried just for testing
//       to change the value to $02 and then it extracted fine.

... am I the only one who wondered why you weren't supposed to use a value of $03 to represent a 3-byte pointer? :/

Quote
And yes, Unicode can be UTF-7, UTF-8, UTF-16, UTF-32 ± BOM or UCS-2, UCS-4 ± BOM! Seriously, I don't get what's so hard about making a unicode compliant program.

Windows internally uses UTF-16 only. Linux uses UTF-8 only. You can feed it whatever format you want, but you have to manually convert it.

Easy to get cross-platform support by writing a MultiByteToWideChar(CP_UTF8...) wrapper for 'doze. Losing O(1) character lookup sucks royally, though. UTF-16 can't do that 100% cleanly thanks to surrogate pairs. Sure you can ignore those, but yuck. So you can use the even less common UTF-32, fun. I just make sure my UTF-8 tables use characters that encode as 3-bytes/per so I can be cheap about indexing.
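
To illustrate the indexing point, here is a small sketch (hypothetical function names, and it assumes well-formed UTF-8) of why finding the n-th character in UTF-8 takes a scan, and why text made only of 3-byte characters can be indexed with a plain multiplication:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Length in bytes of a UTF-8 sequence, judged from its lead byte alone.
// Returns 0 for a continuation byte or an invalid lead byte. Overlong
// encodings are not rejected in this sketch.
std::size_t utf8SequenceLength(unsigned char lead) {
    if (lead < 0x80) return 1;            // 0xxxxxxx: ASCII
    if ((lead & 0xE0) == 0xC0) return 2;  // 110xxxxx
    if ((lead & 0xF0) == 0xE0) return 3;  // 1110xxxx: kana and most kanji
    if ((lead & 0xF8) == 0xF0) return 4;  // 11110xxx
    return 0;
}

// Byte offset of the n-th character. This is O(n): every earlier character
// has to be measured before the offset is known.
std::size_t utf8OffsetOfChar(const std::string& text, std::size_t n) {
    std::size_t offset = 0;
    while (n > 0 && offset < text.size()) {
        const std::size_t len =
            utf8SequenceLength(static_cast<unsigned char>(text[offset]));
        assert(len != 0);  // well-formed input is assumed
        offset += len;
        --n;
    }
    return offset;
}
```

If every character in a table encodes as exactly 3 bytes, the offset of character n is simply `3 * n`, which is the shortcut described above.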

The ROM hacking scene has always been about ten years behind the common encoding format, anyway. When I started, it was all about EUC-JIS encoding. Be happy most are at least using Shift JIS now.

Tauwasser

Re: Cartographer
« Reply #13 on: January 05, 2009, 01:12:08 pm »
So, seriously, I have no idea of c/c++, but I'm freaking sick of seeing programs not work for many people because it only supports latin-1 or whatever.
I'm just saying, that there is a way, Microsoft does it - so can everybody else with a little effort. There just has to be a standard library in c/c++ by now that features text file access that does not read ANSI only or crashes on BOM, valid unicode etc.

And yes, by now I think the translation community - especially the translation community - ought to care about that stuff. Cartographer might be nice, however, I'm running a Japanese OS, so it probably won't work for me with German umlauts etc. That's a problem! And there is a solution. And it has been around for 10 years. I just can't believe there is no library in c/c++ or whatever this is written in that supports this!

Quote from: byuu
Windows internally uses UTF-16 only. Linux uses UTF-8 only.

I'm not talking about filenames. The .NET framework is pretty darn good at reading all of those formats (though I didn't test UTF-32, but I read it somewhere... I tested UTF-8, UTF-16 and UCS-2 tho). You don't need to convert any text file for it to be able to read it with StreamReader!

And it's not only the romhacking community. IrfanView doesn't even support UTF-16 filenames! HexWorkshop didn't, but the newest release does! However, it seems so slow compared to other apps (and yes, Microsoft apps!) that work seamlessly and without ANSI and have been working so for years!

cYa,

Tauwasser

Nightcrawler

Re: Cartographer
« Reply #14 on: January 05, 2009, 01:27:48 pm »
Few points:

1. To take advantage of Unicode (UTF-16) on Windows using straight Win32 C, you need to do a bunch of little things including using all 'W' versions of the library functions and variables such as LPWSTR for strings. And don't forget the 'L' in front of a declared string in your code. One time I had a large Win32 program made. A few years later, the company wanted Unicode support added. Converting after the fact was nearly unfeasible. I spent some time with it, but there were so many things to change and so many small errors cropped up, I told them it wasn't worth their time and they should instead include Unicode in the next generation application where it can be added properly into the design rather than kludged to hell after the fact. Converting between character sets is also somewhat painful. Apparently it's not intuitive enough because I've clean forgot off hand how to do it. Unicode in Win32 can be a real pain, especially if you didn't plan for it from the beginning!

2. If you're just using standard C or C++, same story really. You need to use the 'w' versions of the common functions and string/char variables. You still have some issues converting, but I think you can use something like setlocale(), iconv() or something for the output you desire. It's been so long since I did anything in plain vanilla C/C++, I don't remember, nor did I really do much there involving Unicode.

3. Any .NET language makes character encoding handling insanely easier on Windows. Unicode is always used internally, but you have the freedom to input and output most anything easily. You want S-JIS for input and UTF-8 for output? Or vice versa? Not a problem. For people like myself scarred from former character encoding headache nightmares, they all went away with .NET. :)

It's not surprising that many English made ROM hacking tools don't support Unicode well.


So.... RedComet, what did you program the utility in?

byuu

Re: Cartographer
« Reply #15 on: January 05, 2009, 02:04:15 pm »
Quote
I'm not talking about filenames.

Neither was I, I was talking about displaying the text inside a user interface. But filenames are a huge problem, too.

Quote
To take advantage of Unicode (UTF-16) on Windows using straight Win32 C ... Converting after the fact was nearly unfeasible.

I'll share some of the fun I had porting my apps to Unicode.

1) need to #define UNICODE (cleaner, safer than adding Ws everywhere), which instantly breaks a few hundred to a few thousand Win32 API calls. You'll cry when GCC spits out 4,873 compilation errors.
2) best to write a generic wrapper to turn UTF-8 into UTF-16 on-the-fly:
Code: [Select]
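#include <windows.h>

//converts a UTF-8 string to a terminated UTF-16 buffer for the W functions
class utf16 { public:
  operator wchar_t*() { return buffer; }
  operator const wchar_t*() const { return buffer; }
  utf16(const char *s = "") {
    if(!s) s = "";
    //with a source length of -1, the returned length includes the terminator
    int length = MultiByteToWideChar(CP_UTF8, 0, s, -1, 0, 0);
    buffer = new wchar_t[length + 1]();  //zero-filled, so always terminated
    MultiByteToWideChar(CP_UTF8, 0, s, -1, buffer, length);
  }
  ~utf16() { delete[] buffer; }
private: wchar_t *buffer;
};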
... and vice versa.
3) all of your filename passing fails. Have to convert them to UTF-16 first. This also breaks all your libc file access functions: fopen needs to become _wfopen, mkdir needs to become _wmkdir, etc. This also breaks all your third-party libraries: have fun patching zlib, libjma, etc.
4) int main(int argc, char *argv[]) fails. The non-ANSI parts become question marks, so even converting them to UTF-16 won't let you open the files. Need some serious black magic to get that back to valid UTF-8:
Code: [Select]
int __stdcall WinMain(HINSTANCE, HINSTANCE, LPSTR, int) {
  //argv[] is in 7-bit ANSI format; Unicode characters are converted to '?'s.
  //this needs to be converted to UTF-8, eg for realpath(argv[0]) to work.
  int argc;
  wchar_t **wargv = CommandLineToArgvW(GetCommandLineW(), &argc);
  char **argv = new char*[argc];
  for(unsigned i = 0; i < argc; i++) {
    argv[i] = new char[_MAX_PATH];
    strcpy(argv[i], utf8(wargv[i]));
  }
5) all of these changes are Win32-specific, so you have to encapsulate all of them in #ifdef _WIN32, so that it continues to work on pure UTF-8 systems like Linux / BSD / OS X. And if you want unified GUI text for all platforms, then 100% of your Win32 API calls need to wrap UTF-8 -> UTF-16.

The best part ... all of this could be avoided, and current apps could transparently gain Unicode support, if Windows would just accept a UTF-8 codepage with the *A functions.

The bad news, you pretty much have to do this stuff. If someone has a Windows username that isn't pure ANSI, and your app saves data inside their profile (as it should, apps are supposed to store data in the App Data folder), it will completely fail to save the data without Unicode support. This really pisses off non-English speakers, and for good reason. I had someone on 2ch asking why I hated Japanese people because I couldn't load Japanese-named ROMs >_<

The really bad news, most big-name commercial apps can't handle this, either! Winamp, Firefox 2 ... 95% of my applications failed to work at all when I used a non-English profile username.
« Last Edit: January 05, 2009, 02:09:27 pm by byuu »

RedComet

Re: Cartographer
« Reply #16 on: January 05, 2009, 04:26:41 pm »
Quote from: Nightcrawler
So.... RedComet, what did you program the utility in?

C++.

Gemini

  • Hero Member
  • *****
  • Posts: 2007
  • 時を越えよう、そして彼女の元に戻ろう
    • View Profile
    • Apple of Eden
Re: Cartographer
« Reply #17 on: January 05, 2009, 05:23:19 pm »
There's already a procedure to convert UTF8 to Windows' Unicode:
Code: [Select]
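#include "winldap.h"
#pragma comment(lib, "wldap32.lib")

// Both functions assume a UNICODE build (CString and TCHAR hold UTF-16).
// The caller owns the returned buffer and must delete[] it.
int UnicodeToUtf8(CString string, char* &dest)
{
// a destination size of 0 returns the required length
int len=LdapUnicodeToUTF8(string,string.GetLength(),dest,0);
dest=new char[len+1];
LdapUnicodeToUTF8(string,string.GetLength(),dest,len);
dest[len]='\0'; // no terminator is written when the source length is explicit
return(len);
}

int Utf8ToUnicode(TCHAR* &dest, char* string)
{
int srclen=(int)strlen(string);
int len=LdapUTF8ToUnicode(string,srclen,dest,0);
dest=new TCHAR[len+1];
LdapUTF8ToUnicode(string,srclen,dest,len);
dest[len]=0; // terminate the UTF-16 string as well
return(len);
}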
I've been using these for almost 3 years, with no problems at all. Ok, it's Windows specific because of wldap, but I for one sure don't care. :p You can also replace CString if anything similar is not available. LPCTSTR+wcslen should work fine for the task. So:
Code: [Select]
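int UnicodeToUtf8(LPCTSTR string, char* &dest)
{
int unilen=(int)wcslen(string); // assumes a UNICODE build, where LPCTSTR is LPCWSTR
int len=LdapUnicodeToUTF8(string,unilen,dest,0); // size 0 returns the required length
dest=new char[len+1];
LdapUnicodeToUTF8(string,unilen,dest,len);
dest[len]='\0'; // no terminator is written when the source length is explicit
return(len);
}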
« Last Edit: January 05, 2009, 05:30:26 pm by Gemini »
I am the lord, you all know my name, now. I got it all: cash, money, and fame.

C_CliFF

Re: Cartographer
« Reply #18 on: January 05, 2009, 06:39:18 pm »
Quote
... am I the only one who wondered why you weren't supposed to use a value of $03 to represent a 3-byte pointer? :/

You're right. For some odd reason I accidentally read 24 bit instead of 32... :) It doesn't keep the program from crashing though.

-C_CliFF

Gil Galad

  • Full Member
  • ***
  • Posts: 186
    • View Profile
    • Homepage of Gil Galad
Re: Cartographer
« Reply #19 on: January 06, 2009, 02:22:03 am »
I talked to RedComet earlier today about Cartographer. So I have an example to show you guys based on the game Cadillac, which is a playing card puzzle type game. I am also calling the project.

The main point of this post is to address some documentation and explain some things in order to dump text a bit easier.

BASE POINTER

I tried a pointer table dump without success, and then discovered that to dump the text of this game I needed to subtract instead of add in the #BASE POINTER command. You need to subtract when the ROM address of the text is lower than the pointer value.

Cadillac is a Mapper 3 Famicom game. For those who don't know, a Mapper 3 game like this one has a single 32KB PRG (Program ROM) bank. The range of the data in the ROM file would be 10h - 800Fh.

The text data is at 186Fh - 1F39h in the ROM file. To get the real address of the data, you add $8000 and subtract the 10h header. So if you start at 186Fh, add $8000 and then subtract $10, the result is $985F. So $985F is the address, and 5F98 is the pointer as stored in the table, with the bytes flipped. You add $8000 because that's the address the bank starts at, and it adds up right if you use the SetOff pointer calculation.

Next is the pointer table location. I will show you the commands in the file.

#POINTER TABLE START:   $1F3A
#POINTER TABLE STOP:   $1FB1

The first two bytes of the pointer table are 5F98. Those two bytes are the correct pointer for the first line of text in this block.


Based on the way that I have my command files set up, here is how Cartographer normally works. In #BASE POINTER, you give the modifier that is either added to or subtracted from the pointer to find the ROM address. The readme file only says that you can add, but you can also subtract.

You know that $8000 is the SetOff calculation, so based on the way that Cartographer works, if you add $8000 to the pointer, the program is going to crash or not function as intended. $8000 + $985F = $1185F, which is way out of bounds of the PRG bank and the NES address range.

Here is the way around it. Instead, subtract $8000 from $985F, which equals 185Fh, near the intended location. Now, here is where it gets a bit weird. You also have to account for the header size: take the 10h header off the $8000 modifier, which gives $7FF0. So your new BASE POINTER modifier is -$7FF0, and $985F - $7FF0 equals 186Fh, the correct ROM offset.
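
The same arithmetic as a short sketch, assuming a bank mapped at $8000 and a 16-byte header at the start of the ROM file, as described above. The constant and function names are made up for illustration and do not come from Cartographer:

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kBankStart = 0x8000;  // address where the PRG bank starts
constexpr std::uint32_t kHeaderSize = 0x10;   // 16-byte header in the ROM file

// Combine two pointer-table bytes stored low byte first (5F 98 -> $985F).
std::uint16_t pointerFromBytes(std::uint8_t low, std::uint8_t high) {
    return static_cast<std::uint16_t>(low | (high << 8));
}

// pointer - $8000 + $10, which is the same as pointer - $7FF0.
std::uint32_t pointerToFileOffset(std::uint16_t pointer) {
    assert(pointer >= kBankStart);  // anything lower is not inside the bank
    return pointer - kBankStart + kHeaderSize;
}
```

Feeding in the first table entry, bytes 5F 98 become $985F, and `pointerToFileOffset` returns 186Fh, matching the start of the text block.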


Table Files

Here are a couple tips for table files. Make sure that you remove all bookmarks from the table files as well as anything that is not supported by Cartographer.

Make sure the line and end break codes are at the bottom of the file. There are two types that I have used, one is for raw and the other for relative pointers.  You can check out the differences in the material that I am going to provide.

The end line codes in your table file should be something like this.

FE=[LINE]\r
FF=[END]\n\r

You can use the \r and \n as you wish.

Now, for the RELATIVE POINTER table

FE=[LINE]\r
/FF=[END]\n\r

Dumps

For the dumps that are RAW, I suggest that you have your table files correct or the dumps will not occur or be messed up. I also removed the #END BLOCK command in the raw dump file so that I could dump the text.


In closing, some of these things were already documented. However, these things I talked about are based on my experience and how I solved some of the crashing issues. And the lower ROM address compared to the pointer also needed to be discussed, I believe.

Here are the files for you guys to look at. Some of these files are unedited and directly from Cartographer.

HERE


Homepage of Gil Galad || New Forum

“I don’t know half of you half as well as I should like; and I like less than half of you half as well as you deserve. ”