Topic: Trying to Understand Script Extractor Workflow

Thutmose

Trying to Understand Script Extractor Workflow
« on: February 17, 2011, 10:27:17 am »
I have been working on a PSX translation project for a few months now (Japanese to English, for a fairly niche piece of game software).  I've translated a few hundred text chunks so far, and have recalculated some pointers so that translated text can be moved to other areas of the files where there is more room.

But my workflow isn't as efficient as I'd like it to be.

What I would like to do is figure out a way of extracting the script, and I'm wondering if a script extraction tool like Cartographer might work for me.

My main questions are:  When making the command file(s) for Cartographer (or other relevant extraction tool), do I need to write in all the locations of all the pointer tables and/or script blocks?  Are there automated tools that can find the table locations? And, optionally, extract the text as well?

This particular software has several dozen pointer tables and text blocks spread out across two files - and, if I can help it, I'd rather not have to search through megabytes of data in a hex editor looking for pointer tables/text blocks to manually write the command file(s).

Maybe there is a simpler way of extracting the text?  Or is this just how it's done?

Thanks in advance.

Ryusui

Re: Trying to Understand Script Extractor Workflow
« Reply #1 on: February 17, 2011, 02:57:26 pm »
This is pretty much How It's Done, though if your current method is to modify all the strings and pointers in a hex editor, believe me, tag-teaming this project with Cartographer and Atlas will save you a lot of headaches.
In the event of a firestorm, the salad bar will remain open.

Thutmose

Re: Trying to Understand Script Extractor Workflow
« Reply #2 on: February 17, 2011, 03:31:31 pm »
Thanks.

I wanted to confirm that this was the case before I spent too much time working on my back-up plan.

My back-up plan then, which is now my goal, is to program my own script extractor that will automatically find the pointer tables, calculate the locations of the text in the data files, and then extract it (along with all the various addresses for each piece of text) to something like a CSV file.
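
As a rough illustration of the CSV step only (not code from this project), here is a minimal Python sketch. The column names `pointer_offset`, `text_offset` and `text` and the helper `rows_to_csv` are hypothetical; a real export would carry whatever addresses the game's format requires.

```python
import csv
import io

# Hypothetical column layout: where the pointer lives, where its text lives,
# and the extracted text itself.
FIELDS = ["pointer_offset", "text_offset", "text"]

def rows_to_csv(rows):
    """Serialize a list of dicts (one per extracted string) to CSV text."""
    buf = io.StringIO(newline="")
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

if __name__ == "__main__":
    sample = [{"pointer_offset": 0, "text_offset": 6, "text": "abc"}]
    print(rows_to_csv(sample))
```

Keeping the offsets next to each string is what lets a later insertion step find its way back to the right pointer.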

From there I plan to import it into an online system I've been working on that will allow a team to collaboratively translate the text.  I've already got the database back end done, and enough of the front-end stuff for me to enter data manually.  I have been manually entering the text and my translations into the system by copying/pasting - which is, needless to say, quite time consuming.  But since it will automatically calculate pointer values and other memory addresses related to a piece of text based on its location in the data file, it is still a big time saver for me even when entering data manually.  If I can get that script extractor working, then it would be fantastic.
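
To make the "calculate pointer values from the text's location" idea concrete, here is a small Python sketch. It assumes little-endian pointers that store the text offset minus some fixed base; the function name `encode_pointer` and the `base` parameter are hypothetical, and the actual pointer format of this game is not stated in the thread.

```python
def encode_pointer(text_offset, width, base=0):
    """Return the pointer bytes for text located at text_offset.

    Assumes (for illustration) a little-endian pointer of `width` bytes
    holding text_offset - base.
    """
    value = text_offset - base
    if value < 0 or value >= 1 << (8 * width):
        raise ValueError("offset does not fit in a %d-byte pointer" % width)
    return value.to_bytes(width, "little")

if __name__ == "__main__":
    # A string moved to file offset 0x1234, referenced by a 2-byte pointer.
    print(encode_pointer(0x1234, 2).hex())
```

The range check matters when text is relocated: a 2-byte pointer cannot reach past 0xFFFF from its base, so such a move has to be rejected or given a wider pointer.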

Once the translations are done, the system would then export a file to be read by a custom script insertion program and update the data files.

Given the relatively uniform structure of the pointer tables, I don't anticipate that it should be too difficult to programmatically find the pointer tables.  Once I find those, it should be easy enough to calculate the location of the corresponding text.
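
One common heuristic for this kind of search, sketched in Python below, is to look for runs of values that keep increasing and all point inside the file. Everything here is an assumption for illustration: 16-bit little-endian entries, 2-byte alignment, strictly ascending targets, and the hypothetical names `find_pointer_tables`, `min_entries` and `base`. Real tables may need looser or stricter rules, and false positives are likely on real data.

```python
import struct

def find_pointer_tables(data, min_entries=4, base=0):
    """Return (offset, entry_count) for each run of ascending 16-bit
    little-endian values whose targets (value + base) fall inside data."""
    tables = []
    n = len(data)
    i = 0
    while i + 2 <= n:
        prev = None
        j = i
        # Extend the run while values keep ascending and stay in bounds.
        while j + 2 <= n:
            (val,) = struct.unpack_from("<H", data, j)
            if val + base >= n or (prev is not None and val <= prev):
                break
            prev = val
            j += 2
        count = (j - i) // 2
        if count >= min_entries:
            tables.append((i, count))
            i = j  # continue scanning after the table
        else:
            i += 2
    return tables

if __name__ == "__main__":
    filler = b"\xff\xff" * 4
    table = struct.pack("<4H", 16, 20, 25, 30)
    text = b"abc\x00" + b"defg\x00" + b"hijk\x00" + b"lm\x00"
    print(find_pointer_tables(filler + table + text))
```

Raising `min_entries` trades missed short tables for fewer false hits, so candidates are still worth spot-checking in a hex editor.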

I don't have many problems with text needing to be moved to a new location; the bigger problem is getting the translated text to fit on screen (the mono-spaced English font is twice as large as it needs to be), or dialog boxes that need to be enlarged.  But that's another topic.

Thanks again for your help.

Ryusui

Re: Trying to Understand Script Extractor Workflow
« Reply #3 on: February 17, 2011, 03:49:46 pm »
Yeah, that's actually How I Usually Do It. ^_^; Cartographer is awesome - it's what I'm using presently for the BoF1GBA retrans - but for projects like Death Note: The Kira Game, Wan Wan Meitantei and Solatorobo, I usually find myself coding my own insertion/extraction tools for one reason or another. If the pointers are an ungodly mess (Death Note) or the script is compressed (WWM/Solatorobo), it's pretty much your only option.

Thutmose

Re: Trying to Understand Script Extractor Workflow
« Reply #4 on: May 20, 2011, 01:14:54 pm »
Just thought that I would post an update, kind of a necropost though.

I've been too busy with work to spend a lot of time on this (basically haven't touched it in a few months), but I have made progress since my previous post.  I've written a simple program that scans the files in the disc image which contain the text, finds the pointer tables, calculates all the text positions and other data, and extracts the text (along with its related memory locations/addresses/pointers).  It has saved me perhaps hundreds of hours of copying, pasting, and typing.  And, luckily, the text wasn't compressed or anything tricky.  Some of the pointers are two bytes and others are three bytes, which threw me for a loop at first.  But it's all working well enough now.
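
For readers wondering how mixed pointer widths can be handled, here is a minimal Python sketch. It is not the program described above. It assumes little-endian pointers, null-terminated strings and Shift-JIS text, none of which is confirmed for this game, and the names `read_pointer`, `extract_strings` and `base` are hypothetical.

```python
def read_pointer(data, offset, width):
    """Read a little-endian pointer of `width` bytes (2 or 3 here)."""
    return int.from_bytes(data[offset:offset + width], "little")

def extract_strings(data, table_offset, count, width, base=0,
                    encoding="shift_jis"):
    """Follow each pointer in a table and collect its null-terminated text."""
    rows = []
    for index in range(count):
        ptr_offset = table_offset + index * width
        text_offset = read_pointer(data, ptr_offset, width) + base
        end = data.find(b"\x00", text_offset)
        if end == -1:
            end = len(data)  # unterminated string: read to end of data
        raw = data[text_offset:end]
        rows.append({
            "pointer_offset": ptr_offset,
            "text_offset": text_offset,
            "length": len(raw),
            "text": raw.decode(encoding, errors="replace"),
        })
    return rows

if __name__ == "__main__":
    # Two 3-byte pointers (to offsets 6 and 10) followed by their strings.
    blob = (6).to_bytes(3, "little") + (10).to_bytes(3, "little") + b"abc\x00de\x00"
    for row in extract_strings(blob, 0, 2, 3):
        print(row)
```

Passing the width as a parameter means the same routine serves both the two-byte and three-byte tables; only the table description changes.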

It processes it all and stuffs it into a normalized relational database.  Once I get the collaborative translation system online, I'll probably post more details about it - and I'll no doubt be looking for people to help out with the translation.

Basically, my goal is still the collaborative translation system I described before.  It will allow team members to tag text snippets, propose and vote on translations, upload screenshots, etc.  The idea is to create something structured enough that many people can contribute, while keeping it all maintainable and making it easy for me to ultimately get the translations into the disc image in some automated way.