Topic: Looking for specific hex editor

baka_neko

  • Jr. Member
  • **
  • Posts: 13
    • View Profile
Looking for specific hex editor
« on: June 16, 2011, 05:38:24 pm »
Not sure if this would best be put here or in 'Coding/Programming', but I'll make the safe bet of here.

I've been on the lookout for a hex editor with a few specific features, but have had no luck. I've tried nearly every one in the utilities section and none fit the criteria. Mainly, I'm looking for one with multi-byte table support, Unicode/SJIS support, and the ability to insert rather than overwrite (so adding a 5-byte string would add 5 bytes to the total size of the file).

So far I've had pretty terrible luck: the editors either support insertion but not multi-byte tables, or the size of the table freaks the program out and crashes it (the custom table file I'm using has over 3000 mappings). Currently I'm using CrystalTile2 since it seems to handle the multi-byte table the best, but the lack of insertion makes my current task a real pain in the ass.

Ryusui

  • Hero Member
  • *****
  • Posts: 4989
  • It's the greatest day.
    • View Profile
    • Tumblr
« Reply #1 on: June 16, 2011, 05:55:24 pm »
What do you need insertion for?
In the event of a firestorm, the salad bar will remain open.

baka_neko

« Reply #2 on: June 16, 2011, 06:10:34 pm »
For my current translation project, the script files are fairly flexible: they can change size as long as you update the control codes that are sandwiched between each block of text. Unfortunately the game doesn't support ASCII. Luckily this isn't too much of a problem thanks to those flexible script files, so I can just use Unicode and end up with larger files.

Because the text is sandwiched between control codes, though, the only way for me to insert text is by creating a new file with a very large size (way more than I'd need), copying and pasting each section of control codes, and adding the new text (then modifying the previous codes). Once this is done for every text block, I calculate the final size of the file and create another new file at that size to copy and paste into.

As you can imagine, this is an incredibly roundabout way to approach the problem and can sometimes take up to an hour per file. For some games that might be okay, but not for this one, as it has over 4000 script files XD
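
To make the insert-versus-overwrite distinction concrete, here is a minimal Python sketch. The function names and the sample bytes are made up for illustration and are not taken from any of the editors mentioned in this thread.

```python
def overwrite(data: bytes, offset: int, new: bytes) -> bytes:
    """Replace len(new) bytes starting at offset; later bytes do not move."""
    return data[:offset] + new + data[offset + len(new):]


def insert(data: bytes, offset: int, new: bytes) -> bytes:
    """Splice new bytes in at offset; everything after it shifts right."""
    return data[:offset] + new + data[offset:]


# Made-up sample: a 4-byte control code, 4 bytes of "text", a 2-byte code.
original = bytes.fromhex("01003800AAAABBBB0F00")
text = "AB".encode("utf-16-le")  # 2 characters -> 4 bytes

print(len(original))                      # size before editing
print(len(overwrite(original, 4, text)))  # unchanged size
print(len(insert(original, 4, text)))     # grown by len(text)
```

With insertion, any offsets stored in the surrounding control codes still have to be updated by hand or by a tool, which is the part that makes a plain hex editor awkward here.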

Ryusui

« Reply #3 on: June 16, 2011, 06:25:33 pm »
So the file size doesn't matter as long as the format is correct, right?

Three words for you, friend: Custom Script Editor. It's what I'd do in your circumstances. Just cobble together something in Visual Basic and fire away. You wouldn't need to muck around in a hex editor. Heck, with a little effort, you could make something that could read a TXT file written in Notepad and turn it into a working script file.

baka_neko

« Reply #4 on: June 16, 2011, 06:29:56 pm »
Right, I've been thinking it might come to that. The only custom script editors I've made were for games with much simpler script files. This one will be a bit harder because of how dynamic the format is: I'll nearly have to write a script parser rather than a basic script editor. I assumed that's what I'd end up doing, but I thought it wouldn't hurt to ask anyways XD

Ryusui

« Reply #5 on: June 16, 2011, 06:31:54 pm »
Look at it this way: it may require a bit more up-front time investment, but it'll save you a lot of effort in the long run.

*furiously scribbles a note to himself: get to work on Sylvanian Families script editor, you lazy hypocrite!*

baka_neko

« Reply #6 on: June 16, 2011, 06:36:26 pm »
Well, thanks for the input anyways :P I guess it's time to hunker down and fire up the C# IDE (and start brewing some coffee... it's gonna be a long night :laugh:)

Klarth

  • Sr. Member
  • ****
  • Posts: 494
    • View Profile
« Reply #7 on: June 16, 2011, 07:00:20 pm »
Sounds like you expect too much from a hex editor.  You may need to write a custom script dumper, but there are few scripts that Atlas can't insert.  But if you're an experienced programmer, a custom program usually works out a bit better.  If you need to look at how a table library works, you can look at my TableLib in C++.  Sorry that I never released the C#/DLL wrapper version.

We could advise you better on your options if we knew precisely what problems make the available tools insufficient.  And to reiterate Ryusui's point, in this case, an intelligent approach rather than brute force will save you dozens of hours in the long run.  In a game with massive amounts of content, it's best to create two automated systems: one to take the game apart and one to insert text/apply graphic patches/rebuild the game.

Ryusui

« Reply #8 on: June 16, 2011, 07:30:19 pm »
From the sounds of things, the script is in individual files. This has the benefit that, as mentioned, size isn't really an issue as long as the formatting is correct; however, it means that unless Atlas can expand a file if it hits the end, then that approach will still be limited by the current size of the files.

Also: is this a DS game, by any chance? If it uses a standard font file and libraries, you might be able to implement ASCII support simply by copying an NFTR that has ASCII characters over the existing one.

baka_neko

« Reply #9 on: June 16, 2011, 07:33:36 pm »
Yeah, I used to do most of my translating with just a hex editor since that was how I learned it. Guess it's pretty easy to get stuck in your ways XD

I probably will need to write something that dumps the scripts as well, I suppose. I was able to work romjuice in such a way that I could at least get the scripts to my translators, but I suppose that doesn't work all that well when it comes to automating insertion. I'll definitely be checking out your library as well. I'm not a great programmer by any means, but I know my way around a few different languages so writing the dumper/inserter in C++ shouldn't be that much trouble.

Oh and Ryusui, it is a DS game... but an odd one in how it works. The font is stored in the standard .nftr files, but the game seems to interpret it in different ways. It uses the same font file, but switches between two different mappings. For the majority of the text it uses a custom remapped version of SJIS (hence the custom table file I had to make), but in certain parts within a script it will switch to standard SJIS encoding and then right back to the custom remap. Why it does this, I have no clue. When it actually uses regular SJIS, ASCII works fine, but it won't work with the remapped encoding.

Sorry if that's kind of confusing... it's a pretty confusing system that took a lot of work to finally figure out and that's the best I could explain it.
« Last Edit: June 16, 2011, 07:40:24 pm by baka_neko »

Klarth

« Reply #10 on: June 16, 2011, 08:44:55 pm »
Quote from: Ryusui
"it means that unless Atlas can expand a file if it hits the end, then that approach will still be limited by the current size of the files."
Atlas should happily chug away past the file size unless you constrain it with #JMP($StartLocation, $EndFileLocation).  However, there's no way to insert new bytes in between existing bytes.  It's overwrite only, but I've been known to add features or do a custom build if there's a nuance holding back a script insertion...and I'm made aware of it.

Quote from: baka_neko
"I was able to work romjuice in such a way that I could at least get the scripts to my translators, but I suppose that doesn't work all that well when it comes to automating insertion."
So when it comes to dumping, there are two things you have to do.  A) Decode the game's encoding to a standard, readable format and B) isolate the pointers so they can be used for reinsertion.  If you haven't done both, then at some point you may be in a very big pickle.  Best case scenario is that you have pointer tables, which you'll have to find (and then automate repointering using Atlas or a custom tool).  Worst case scenario is that pointers are scattered through a scripting system...and that adds a lot of work to match up strings with pointer locations.  If your script is already largely translated, it makes things more complex.

Just wondering, did you dump each of those 4000 files manually?  If so, then ouch.  :(

baka_neko

« Reply #11 on: June 16, 2011, 09:17:15 pm »
Yeah, unfortunately it's sorta the "worst case scenario", but sorta not. As far as I can tell, the script files are read in a linear fashion, but there are pointers within there that need to be modified. Right before each block of text is a pointer to the next block of control codes such as

Code: [Select]
{01}{00}{38}{00}でんげんを いれたナ!でんげんを いれてしまったナ!!{0F}{00}{02}{00}{01}{00}{01}{00}{01}{00}{1E}{00}
Where {01}{00}{38}{00} is pointing to the {0F}, which is 0x38 bytes away, and {01}{00}{1E}{00} is pointing to the next block after the text. The reason I'll have to write my dumper as a parser is what you see in the 2nd block. {02}{00} is followed by 4 bytes of arguments instead of the usual 2, as in the {01}{00}{xx}{xx} code. So that's sorta where my problem comes in... the number of different control codes is pretty vast, and for some I honestly have no idea what purpose they serve. This makes writing a parser-style dumper problematic (and is why I was trying to take the "easy" road of ignoring everything but the pointer and just copying it).
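
The "easy road" of touching only the pointer can be sketched in Python as below. This is a rough sketch under one big assumption: that the 16-bit little-endian value after {01}{00} counts bytes from the start of its own 2-byte field to the first byte after the text. The sample line looks consistent with that reading if every character, spaces included, takes two bytes, but that is a guess from a single example. The name `replace_text` is hypothetical.

```python
import struct

TEXT_OP = b"\x01\x00"  # the {01}{00} code from the sample above


def replace_text(block: bytes, new_text: bytes) -> bytes:
    """Swap the text after a {01}{00}{xx}{xx} code and recompute the offset.

    Assumption: offset == 2 + len(text), i.e. it is counted from the start
    of the offset field itself. Adjust if the game counts differently.
    """
    if block[:2] != TEXT_OP:
        raise ValueError("block does not start with the {01}{00} code")
    (old_offset,) = struct.unpack_from("<H", block, 2)
    rest = block[2 + old_offset:]       # control codes that follow the old text
    new_offset = 2 + len(new_text)
    return TEXT_OP + struct.pack("<H", new_offset) + new_text + rest


# Made-up block: 4 bytes of text followed by a {0F}{00} code.
block = bytes.fromhex("01000600") + b"ABCD" + bytes.fromhex("0F00")
print(replace_text(block, b"ABCDEFGH").hex(" "))
```

Everything after the text is copied untouched, so this only works when nothing else in the file stores an offset that crosses the edited text.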

Oh and for dumping the 4000+ files, I created a .bat script to iterate through the script folders and dump each file XD I'm not that crazy ::)

Klarth

« Reply #12 on: June 16, 2011, 10:05:57 pm »
Quote from: baka_neko
"Where {01}{00}{38}{00} is pointing to the {0F}, which is 0x38 bytes away, and {01}{00}{1E}{00} is pointing to the next block after the text. The reason I'll have to write my dumper as a parser is what you see in the 2nd block. {02}{00} is followed by 4 bytes of arguments instead of the usual 2, as in the {01}{00}{xx}{xx} code. So that's sorta where my problem comes in... the number of different control codes is pretty vast, and for some I honestly have no idea what purpose they serve. This makes writing a parser-style dumper problematic (and is why I was trying to take the 'easy' road of ignoring everything but the pointer and just copying it).

"Oh and for dumping the 4000+ files, I created a .bat script to iterate through the script folders and dump each file XD I'm not that crazy ::)"
Yeah, you'll certainly need a custom dumper in a situation like that to sort the pointers out.  Atlas can insert scripts like this, but you'll have a mess of embedded pointers (which your dumper can write out for you).  It's a fairly difficult task to write a program to "fix" translated files if the translators have gotten that far in these cases.  If you need help with TableLib/Atlas, just PM me.

As for the control codes...you need two bits of information.  One is the number of bytes following each control code...and second is identification of pointers.  Past that, you might just want to determine a few of the more useful ones (portraits, text colors, and text choices...the latter usually has a pointer associated for each option...which you'll need).
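
Those two bits of information map naturally onto a small lookup table. The Python sketch below is only an illustration: the entries for 0x0001 and 0x0002 follow baka_neko's description, the zero-argument entry for 0x000F is merely inferred from the sample line, and the rule that text follows a pointer code for (offset - 2) bytes is an assumption. The names `OPCODES` and `tokenize` are hypothetical.

```python
import struct

# opcode -> (number of argument bytes, argument is a pointer?)
OPCODES = {
    0x0001: (2, True),   # {01}{00}{xx}{xx}: offset past the following text
    0x0002: (4, False),  # {02}{00} plus 4 bytes of arguments
    0x000F: (0, False),  # inferred from the sample, not confirmed
}


def tokenize(data: bytes):
    """Split a script into ('op', opcode, args) and ('text', raw) tuples."""
    tokens = []
    pos = 0
    while pos < len(data):
        (opcode,) = struct.unpack_from("<H", data, pos)
        if opcode not in OPCODES:
            # Stop loudly so the unknown code can be researched and added.
            raise ValueError(f"unknown control code {opcode:#06x} at {pos:#x}")
        arg_len, is_pointer = OPCODES[opcode]
        args = data[pos + 2 : pos + 2 + arg_len]
        tokens.append(("op", opcode, args))
        pos += 2 + arg_len
        if is_pointer:
            # Assumption: the offset is counted from the start of its own field.
            (offset,) = struct.unpack("<H", args)
            text_len = offset - arg_len
            tokens.append(("text", data[pos : pos + text_len]))
            pos += text_len
    return tokens
```

Failing on the first unknown code is a deliberate choice for a sketch like this: each crash points at one more entry to add to the table, which is roughly how the control codes would get documented anyway.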

Nightcrawler

  • Hero Member
  • *****
  • Posts: 5787
    • View Profile
    • Nightcrawler's Translation Corporation
« Reply #13 on: June 17, 2011, 09:47:51 am »
I've been working on a way to handle something like this with TextAngel. I've come up with a few ideas based on how I've done this in the past with my projects.

1. Define a set of rules or patterns to identify and extract the strings. Could use something simple like {01}{00}{xx}{xx} with wildcards, or even something like a regex statement. In the past I've written some dumpers where I just defined a few rules like this and it extracted all the strings from the data. I didn't need to know all that much about the surrounding data. It would be possible to hit false positives if your rule sequences were too simple such as every time you see a $F0 or $F1. A little regex in defining your rule might help here if the strings ended with end tokens or had fixed values. I think some rule based pattern matching can extract the strings for many cases. But, what about the pointers or reinsertion you say? Read on.

2. I don't need to know anything about the pointers as I end up relocating the strings. This typically requires a minor game code modification to handle a new control code for the repointing, but that's fairly trivial compared to writing a full-fledged parser for the data block, especially if it has, say, a full 256 or more possible control values (all with varying parameter lengths). The inserter just leaves a control code and new pointer in the original string's location. Reinserting the whole block, by contrast, could be very difficult: the data often has embedded jumps that skip around your text to other data you don't know about, and those get mangled.

I've found this method to be the least time consuming way to handle this type of situation. It has been adequate for the games I have worked on with this method of mixed storage. Occasionally, you may hit a false positive, but that can be fixed manually. The alternative of learning enough to parse the block in its entirety can be much more time consuming.
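
A rule like {01}{00}{xx}{xx} can be tried out with a byte-level regex. The Python sketch below is not how TextAngel works; it is just one way to express the idea from point 1, assuming the two wildcard bytes are a little-endian offset counted from their own field (a guess based on baka_neko's sample). The sanity check is what weeds out the obvious false positives.

```python
import re

# {01}{00} followed by two wildcard bytes; DOTALL lets "." match any byte.
RULE = re.compile(rb"\x01\x00(..)", re.DOTALL)


def extract_strings(data: bytes):
    """Yield (text_offset, raw_text) for each plausible match of the rule."""
    pos = 0
    while True:
        m = RULE.search(data, pos)
        if m is None:
            return
        offset = int.from_bytes(m.group(1), "little")
        start = m.end()
        end = start + offset - 2  # assumption: offset counts from its own field
        if offset < 2 or end > len(data):
            pos = m.start() + 1   # implausible length: treat as false positive
            continue
        yield start, data[start:end]
        pos = end                 # skip the text so it is not scanned for rules


# Made-up data: the {02}{00} arguments contain {01}{00} and must be skipped.
sample = (bytes.fromhex("01000600") + b"ABCD" + bytes.fromhex("0F00")
          + bytes.fromhex("020001000100") + bytes.fromhex("01000400") + b"EF")
for where, raw in extract_strings(sample):
    print(hex(where), raw)
```

Nothing outside the matched strings has to be understood, which is the appeal; the cost is that a rule this simple can still misfire on real data and needs a manual check.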


Atlas Embedded Pointers:
Speaking of embedded pointers in Atlas, I don't think I understand exactly how they work. I raised a question on the intended operation in this post with an example. I always meant to direct you to it to explain what was supposed to happen.

http://transcorp.parodius.com/forum/YaBB.pl?num=1273691610/35#35
TransCorp - Over 20 years of community dedication.
Dual Orb 2, Wozz, Emerald Dragon, Tenshi No Uta, Glory of Heracles IV SFC/SNES Translations

Auryn

  • Hero Member
  • *****
  • Posts: 650
    • View Profile
« Reply #14 on: June 17, 2011, 12:09:30 pm »
@baka_neko: what game is it?? It looks similar to the Ace Attorney series, but from what I remember it's not.

baka_neko

« Reply #15 on: June 17, 2011, 01:30:49 pm »
Thanks again Klarth! My C++ skills are a bit rusty, but I'm making some progress. I've actually just been working with your sample dumper/inserter and modifying that some. Thought I might as well not reinvent the wheel and just use what's there for the most part. It's just going to be for personal use, but I'll make sure to give credit where credit is due :D

@Auryn: The game is Game Center CX2. The script system might be a bit similar(I have never looked at ace attorney's, so I'm not sure), but I'm pretty sure they're different just guessing from the style of the two games.

Ryusui

« Reply #16 on: June 17, 2011, 02:00:11 pm »
Oh, you're with this project? Cool. :3

Klarth

« Reply #17 on: June 17, 2011, 07:01:15 pm »
Quote from: Nightcrawler
"Speaking of embedded pointers in Atlas, I don't think I understand exactly how they work. I raised a question on the intended operation in this post with an example. I always meant to direct you to it to explain what was supposed to happen."
Here's a quick summary of embedded pointers and why they're necessary.  (Though you may come up with a better way to solve it)

EMBSET - "I want to put a pointer at the current text location, but I don't know the address where I want it to point to.  Reserve some space here in the meantime."
EMBWRITE - "I want a pointer to point to the current text location, but I don't know the address to put the pointer at."

These issues creep up when you have pointers embedded within text blocks.  Text and pointer locations are shifted around to where you can't determine a hard address for either until insertion.  So the idea was to link commands with an ID until both were known.  So it solves insertion problems regarding yes/no control statement types as well as trees.
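
The linking idea can be shown with a toy two-pass assembler in Python. This is not Atlas's code or its script syntax: the item kinds merely borrow the command names, and the 16-bit little-endian absolute addresses are an assumption made to keep the example short.

```python
import struct


def assemble(items):
    """Lay out the items, then patch each reserved slot with its target.

    items: ("text", bytes) | ("embset", id) | ("embwrite", id)
    """
    out = bytearray()
    slots = {}    # id -> offset of the reserved 2-byte pointer
    targets = {}  # id -> address that pointer should end up holding
    for kind, value in items:
        if kind == "text":
            out += value
        elif kind == "embset":
            slots[value] = len(out)
            out += b"\x00\x00"         # placeholder until the target is known
        elif kind == "embwrite":
            targets[value] = len(out)  # "the pointer with this id points here"
        else:
            raise ValueError(f"unknown item kind: {kind}")
    # Second pass: every id now has both a slot and a target.
    for ident, slot in slots.items():
        struct.pack_into("<H", out, slot, targets[ident])
    return bytes(out)


# A yes/no style branch: two pointers up front, two answers later.
script = [
    ("embset", 0), ("text", b"Yes?"), ("embset", 1),
    ("embwrite", 0), ("text", b"AA"),
    ("embwrite", 1), ("text", b"BB"),
]
print(assemble(script).hex(" "))
```

Because the ids are resolved only after layout, the translated text can grow or shrink freely; a missing or duplicated id is exactly the kind of script damage that would then surface as a `KeyError` or a wrong jump.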

Granted, trees will look like a mess in notepad.  It also scares me that an editor/translator/hacker(..!) will accidentally do something that damages the command integrity of the script and cause catastrophic, hard-to-debug failure.

baka_neko, that's what the source code is out there for: to use as is, or at least to cut down on the amount of time it takes you to roll your own library.

Gideon Zhi

  • Discord Staff
  • Hero Member
  • *****
  • Posts: 3532
    • View Profile
    • Aeon Genesis
« Reply #18 on: June 17, 2011, 07:17:10 pm »
Quote from: Klarth
"These issues creep up when you have pointers embedded within text blocks.  Text and pointer locations are shifted around to where you can't determine a hard address for either until insertion.  So the idea was to link commands with an ID until both were known.  So it solves insertion problems regarding yes/no control statement types as well as trees.

"Granted, trees will look like a mess in notepad.  It also scares me that an editor/translator/hacker(..!) will accidentally do something that damages the command integrity of the script and cause catastrophic, hard-to-debug failure."

An unfortunate necessity and reality for some games. Stuff like this is why I'm surprised Mystic Ark and Gun Hazard are so stable. The embedded pointer structure in Atlas was designed with these two games in mind, and they both have thousands of the things.

baka_neko

« Reply #19 on: June 18, 2011, 12:20:35 am »
Quote from: Ryusui
"Oh, you're with this project? Cool. :3"

Yup :3 I'm the leader on that project. I've got it to a point where it works well for dumping/inserting on some of the files, so that's a huge plus. I'm still having to do them by hand, but it's much faster to do what I need to now. I spent a few hours this morning running through a single script in the game and got a decent number of the unknown control codes figured out and documented, so the dumped script is much cleaner now (love the linked entry function!).

Unfortunately, that really only works for one folder which is all just one person talking. The big problem is in the 2nd folder, which is dialogue between two characters. What I hadn't noticed before is that there's a pointer table at the beginning of each file. Since the script was linear, the first six bytes were just

Code: [Select]
{00}{00}{00}{00}{14}{00}
which I would later figure out meant there was only 1 pointer (at 0x14, which is the beginning of the dialogue). Here's where the fun part comes in for me. The other folder with the two-person dialogue will have pointers to different sections of the code, with the number of pointers determined by the first 2 bytes, followed by that many 4-byte pointers directly after. Now I don't think this affects the dumping/inserting TOO much, but the pointers will have to be modified once the script is inserted to match up with the new positions. I'm trying to figure out if there's a way for the dumper to toss tags at those positions, which will help me know where each different block is. Hopefully this can be done in a way where the inserter will ignore said tags.

There is one part I'm not sure how to handle though, which is the questions. In the two-person dialogue there will occasionally be questions; when one is answered, the game uses the pointers at the top to determine which block of dialogue it sends you to. The problem is that the encoding switches here. The choices you're given are encoded in SJIS instead of the remapped SJIS used in the rest of the dialogue.

There is a control code for when these question blocks appear, but I'm unsure if there's any good way to handle this. The only thing I could think of would be to have two table files loaded and for that control code to switch table files in the dumper/inserter. Is there any reasonable way to handle this?
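
For what it's worth, the two-table idea is straightforward to prototype. The Python sketch below is only an illustration of switching decoders on a control code: the two code values, the tag names, and the tiny custom table are all made up, and it assumes every character inside a question block is a 2-byte SJIS character (single-byte ASCII would need extra handling).

```python
QUESTION_START = 0x0030  # hypothetical code that opens a question block
QUESTION_END = 0x0031    # hypothetical code that closes it

# Tiny stand-in for the real 3000-entry custom table (values are made up).
CUSTOM_TABLE = {0x0100: "あ", 0x0101: "い"}


def decode(data: bytes) -> str:
    """Decode 2-byte units, switching tables when a question block starts."""
    out = []
    use_sjis = False
    for pos in range(0, len(data) - 1, 2):
        unit = data[pos:pos + 2]
        code = int.from_bytes(unit, "little")
        if code == QUESTION_START:
            use_sjis = True
            out.append("<question>")
        elif code == QUESTION_END:
            use_sjis = False
            out.append("</question>")
        elif use_sjis:
            out.append(unit.decode("shift_jis"))  # standard encoding
        else:
            # Unknown units are dumped as {XX}{XX} so nothing is lost.
            fallback = f"{{{unit[0]:02X}}}{{{unit[1]:02X}}}"
            out.append(CUSTOM_TABLE.get(code, fallback))
    return "".join(out)


sample = (bytes.fromhex("0001") + bytes.fromhex("3000")
          + "はい".encode("shift_jis")
          + bytes.fromhex("3100") + bytes.fromhex("0101"))
print(decode(sample))
```

An inserter would do the same thing in reverse, switching encoders whenever it meets the tags in the dumped text.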