
Author Topic: Reinserting a translated script - space problems  (Read 7131 times)

tagengo

Reinserting a translated script - space problems
« on: August 09, 2012, 09:18:40 am »
Hey all, looking for some help and asking for as many opinions as possible (again for research), if people don't mind?  :thumbsup:

When re-inserting a script into the ROM, it seems like a common problem is one of space.

Some solutions are technically orientated (adapting the space to the translation), some are translation orientated (adapting the translation to the space).

  • How common is either of these approaches?
  • Which type of solution gets used more often?
  • How often are they used in combination?
  • Is one approach only used when another approach fails?
  • What kinds of factors influence how the space problems are solved?
  • How (if possible) do people avoid these kinds of problems arising?
  • Do you think there are any trends in the way people solve these problems? If so, what are they?


If there are any other comments people wish to add on the area that is important, I'd love to hear them too.

Many thanks in advance, everyone :-)
<<NOTE: I'm conducting research on video game translations - I might use some answers (anonymously) in my research, so please only answer my posts if this is OK. More information can be found at https://docs.google.com/a/sheffield.ac.uk/spreadsheet/viewform?formkey=dDRIbGx2aE9hZnhQM2Ff

Zoinkity

« Reply #1 on: August 09, 2012, 05:02:29 pm »
Note: although I will write 'text', the same applies to any image, code, or other resource that needs to be replaced. 

Generalizing an answer is somewhat difficult since so much is game-specific.  In general, though, virtually every translation project will try to fit in as much text as is needed, which means compressing data and/or extending the ROM.  Trimming the whole script to fit within the available space would only really be considered if extension or further compression isn't possible.  That said, shortening lines for printability (i.e. what will fit within a given window) is somewhat common.

The biggest factor is how the game was coded and your hardware target, though the ability of the coders inserting the script can be a factor.  If they are incapable of the ASM work needed to redirect resources, or a full codec can't be written for a certain format, that will certainly affect which route they take. 

There are really only two ways to replace resources.  Either you tack new data at the end of the existing ROM and direct pointers/code to access it, or you physically replace the original data and shift other data as needed. 
Obviously the first will extend the ROM by leaps and bounds, so it is fine for the more common situation of relatively general hardware concerns and no real limit on space. 
The second will usually be more involved to program if you need to shift other resources around, at the extreme effectively rebuilding the ROM.  In the same breath, it can be remarkably easier, since no redirection is involved if you stay within the original file sizes.  If your game happens to use a very specific hardware arrangement and can't be extended, or you want to avoid adding an extra layer of compression, or you just want to avoid having to redirect pointers, you may have to go this route.  In general, you'll see more text trimming with this arrangement to meet size constraints.  Again, it depends on the case: Japanese text stored as two-byte codes takes two bytes per character, so the same space holds twice as many ASCII characters.
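To make the first route concrete, here is a minimal sketch of appending a longer string to the end of an image and repointing one table entry at it. The layout is entirely hypothetical: a table of 32-bit little-endian file offsets and 0x00-terminated strings. Real games use whatever pointer width, base address and terminator their engine was written with, so treat `append_and_repoint` and `read_string` as illustration only.

```python
import struct

def append_and_repoint(rom: bytes, table_off: int, index: int, new_text: bytes) -> bytes:
    """Append new_text to the end of the image and repoint table entry `index` at it."""
    new_off = len(rom)                       # appended data starts at the old end of file
    out = bytearray(rom) + new_text
    # Assumed layout: each table entry is a 32-bit little-endian file offset.
    struct.pack_into("<I", out, table_off + 4 * index, new_off)
    return bytes(out)

def read_string(rom: bytes, table_off: int, index: int) -> bytes:
    """Follow table entry `index` and read up to the assumed 0x00 terminator."""
    (off,) = struct.unpack_from("<I", rom, table_off + 4 * index)
    end = rom.index(b"\x00", off)
    return rom[off:end]

# Toy image: a two-entry pointer table followed by two terminated strings.
rom = struct.pack("<II", 8, 11) + b"Hi\x00" + b"Yo\x00"
patched = append_and_repoint(rom, 0, 1, b"A much longer line\x00")
print(read_string(patched, 0, 0), read_string(patched, 0, 1), len(patched))
```

The old bytes for entry 1 are simply left behind as dead space, which is why this route grows the file so quickly.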

In the end though, it usually isn't hardware concerns that will drive you to do something a certain way, but more what you're shooting for within the translation. 

As I mentioned, though, this really is decided on a game-by-game basis.  For instance, if the game happens to use a computing codepage for its Japanese where every character is 2+ bytes long, the English equivalent might fit fine, especially if they stay away from kanji.  In other words, due to a technical aspect of the game you can avoid a technical solution.  However, that would only really apply to a subset of games from gen 5 consoles onwards.
« Last Edit: August 09, 2012, 08:59:16 pm by Zoinkity »

Klarth

« Reply #2 on: August 09, 2012, 06:44:14 pm »
How common is either of these approaches?
The approach depends on how much the current script overshoots the game's original script size.  Trimming the script by editing may be an option if it is less than 5%.  If you know assembly language, you can implement a simple DTE or substring algorithm into the game which can reduce the script size by 20-35%.  Some games contain free space outside of the normal text area and the script pointers are able to access that free space.  In that case, you don't need any advanced knowledge to implement it.
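For readers who haven't met DTE (dual tile encoding), the encoder side can be sketched roughly like this. It assumes a one-byte-per-character script and a handful of byte values (0x80 to 0x87 here) that the game's font does not use; both are assumptions for the example. The in-game half, the assembly that expands a pair code back into two characters at print time, is not shown.

```python
from collections import Counter

def build_dte_table(script: bytes, free_codes: list[int]) -> dict[bytes, int]:
    """Assign each free byte value to one of the most frequent two-byte pairs."""
    pairs = Counter(script[i:i + 2] for i in range(len(script) - 1))
    top = pairs.most_common(len(free_codes))
    return {pair: code for (pair, _count), code in zip(top, free_codes)}

def dte_encode(script: bytes, table: dict[bytes, int]) -> bytes:
    """Greedy left-to-right pass: emit a pair code where one applies, else the raw byte."""
    out = bytearray()
    i = 0
    while i < len(script):
        pair = script[i:i + 2]
        if len(pair) == 2 and pair in table:
            out.append(table[pair])
            i += 2
        else:
            out.append(script[i])
            i += 1
    return bytes(out)

def dte_decode(data: bytes, table: dict[bytes, int]) -> bytes:
    """What the patched text routine would do: expand pair codes, pass others through."""
    reverse = {code: pair for pair, code in table.items()}
    out = bytearray()
    for b in data:
        out += reverse.get(b, bytes([b]))
    return bytes(out)

script = b"the thief then thanked the other thief"
table = build_dte_table(script, list(range(0x80, 0x88)))   # assumed-unused byte values
packed = dte_encode(script, table)
print(len(script), "->", len(packed))
```

How much you save depends on the script and on how many free codes the font leaves you, so the figures quoted above are the thing to trust, not this toy sentence.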

Which type of solution gets used more often?
If the game's quality or popularity is high, then more work is put in and a technical solution is provided.

How often are they used in combination?
Several technical solutions are often used in tandem, but combinations of technical and editing solutions are rare.  Technical solutions can sometimes reduce an English script to smaller size than the original, uncompressed Japanese script.

Is one approach only used when another approach fails?
I assume if the translated script is barely larger than the original (less than 2%?), then it almost always gets trimmed via editing.  Technical solutions take much more time to implement.

What kinds of factors influence how the space problems are solved?
The technical difficulty of each viable option must be considered.

How (if possible) do people avoid these kinds of problems arising?
The problem is intrinsic to each game.  The only workaround that is nearly 100% successful is a technical solution.  If the individual doesn't have the technical skill to achieve this, then the project is either abandoned or they seek outside help with varying success.

Do you think there are any trends in the way people solve these problems? If so, what are they?
The trend is moving towards using technical solutions almost exclusively as techniques are better documented and better utilities are available.  The learning curve is less than it used to be.

FAST6191

« Reply #3 on: August 10, 2012, 08:23:42 am »
As others said, it is game by game, although newer systems are a bit more amenable to this sort of thing. I want to note that the problems are fourfold and might warrant different approaches and fixes:
1) The game only reads a given length of text and attempts to display it- this can be
i)  Hardcoded (fixed length is often used in RPG menus)
ii) The game has a simple file/data section containing lengths/locations somewhere aka pointers
iii) Still technically pointers, but the game binary actually has assembly-level code to handle locations, or the text is in the binary itself

2) The game lacks the space in memory to hold the text- this is not so bad for old systems where things can usually be read directly from memory to the graphics but consoles like the DS will have to put data from the cart into memory leading to potential issues if the buffer size for text is fixed at so much. Of course the older systems will be a lot more limited in terms of space in the original game image leading to all sorts of potential workarounds. Fixes include
i) Converting the text from 16 bit (or worse) to 8 bit. Japanese has a few thousand characters in common use (more than 2^8, meaning 2^16 is the next viable length of a value for most systems), whereas Roman character set languages need fewer than 100 characters, and still fewer than 256 (2^8) if you include a couple of characters with accents and Greek.

3) In the case of text as graphics (usually but not exclusive to puzzle games, low text games and anything that would be considered "word art") sometimes repeated tiles can get in the way... easier to show it graphically
*****
*?*?*
*?*?*
*****
*= identical "blank" tiles
?= something

The question marks can quite easily house a couple of Kanji or something and say all it needs to say but English comes along and although well within space limits as far as screen real estate goes you are now going to have to figure out how the game maps the tiles or face having a massive line down the middle of your text.

4) Related to 3) even if the game has memory to hold it all then you might run into trouble fitting it all on the screen. In 3) you might also be facing very sharp limits on how much video RAM you can use and given images are usually layered with the text on top it can be a real problem. Aesthetics comes into play here as well but that is a slightly different discussion.
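To put numbers on point 2 i), here is a small sketch comparing byte counts in a fixed-size slot. It uses Shift-JIS as a stand-in for a two-byte game encoding; plenty of older games use custom tables instead, so the codec, the 16-byte slot and the sample strings are all assumptions for the example.

```python
def fits(text: str, encoding: str, budget: int) -> bool:
    """True if the encoded text fits in a slot of `budget` bytes."""
    return len(text.encode(encoding)) <= budget

SLOT = 16                      # hypothetical fixed slot size in bytes
jp = "ようこそ、旅人よ"          # 8 characters, two bytes each in Shift-JIS
en = "Welcome, friend!"        # 16 characters, one byte each in ASCII

print(len(jp), len(jp.encode("shift_jis")))   # characters vs bytes for the Japanese
print(len(en), len(en.encode("ascii")))       # characters vs bytes for the English
print(fits(jp, "shift_jis", SLOT), fits(en, "ascii", SLOT))
```

The same slot holds twice as many characters once the text is one byte each, but English usually needs more than twice as many characters to say the same thing, which is why this fix alone rarely settles the matter.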


    How common is either of these approaches?
Assuming it is not ridiculous (an essay for an options menu, for instance), rejigging the game to accommodate the translation is considered the better approach and might not take that much effort if you know what you are doing. There is however a difference between "a slightly different sentence length and the game uses pointers/length values to hold how long each is which can accommodate the increase" and "I need to find some more space in the memory at runtime and/or recode the text display method". Likewise there is a difference between changing "Welcome weary travellers from a distant land" to "Welcome travellers" for an innkeeper seen once in a game and refactoring (probably want a better term) an entire translation.

    Which type of solution gets used more often?
For my money, when working with text, after learning what a table/encoding is, learning what pointers* are is mandatory before being able to call yourself a text hacker. To that end the usual method is to hopefully have a rough idea of what the limits will be and try to work to those if it is realistic, but once a translation appears:
Try to redo pointers to fit the text in.
If necessary try fiddling with some text for line length/aesthetic purposes (if a single word causes a next text box to appear...)
If you can not redo the pointers and solve screen real estate issues see if you can trim the translation a bit
If you have trimmed and are facing having to use slang, then come either assembly level hacks or workarounds like dual/multiple tile encoding (my usual example is that ll fits on one tile and as a bonus looks pretty good, but you can also do things like analyse a script and encode the common pairings so a single value displays them).
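The first step in the list above, redoing pointers, often amounts to laying the edited strings out back to back and recomputing every offset. A rough sketch, assuming a block that starts with a table of 16-bit little-endian offsets relative to the block start, 0x00 terminators and 0xFF padding (all hypothetical; check what the game actually does):

```python
import struct

def rebuild_block(strings: list[bytes], block_size: int) -> bytes:
    """Lay strings out after a 16-bit pointer table; fail if the block overflows."""
    table_len = 2 * len(strings)
    pointers, body = [], bytearray()
    for s in strings:
        pointers.append(table_len + len(body))    # offset relative to block start
        body += s + b"\x00"                        # assumed terminator
    block = struct.pack(f"<{len(strings)}H", *pointers) + bytes(body)
    if len(block) > block_size:
        raise ValueError(f"script is {len(block) - block_size} bytes over budget")
    return block.ljust(block_size, b"\xff")        # pad the unused tail

block = rebuild_block([b"Welcome travellers", b"Rest?"], 32)
print(struct.unpack_from("<2H", block))
```

When the `ValueError` fires you are at the later steps of the list: trim, or reach for assembly and encoding tricks.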

Certainly butchering your translation to fit in with the original script length is not a technique that anyone who might aspire to call themselves a good hacker will be caught using if they can help it. Magic spell names in RPG type games are usually the worst culprits here (I am probably supposed to point at Phantasy Star but I see App. sur start/[insert button] instead of "appuyez sur start"/"appuyez sur le bouton" in several French games). Minor exception: it is sometimes OK to lose a quirk of a character (a meaningless sign off/catchphrase or something) but we are heading into translation methods and we do not need the exact vs meaning vs whatever debate again right now.

*That said, people are usually far happier to take some time out to help someone figure out even a basic pointer system, whereas asking for a text encoding (assuming it is not a highly custom/unusual one with evidence of it being such) is something of a faux pas in most ROM hacking communities that I have seen.

    How often are they used in combination?
See reply to previous question really. I would say it is rare for a script to be translated and not have to undergo a bit of fiddling for display/space purposes if the engine was not extensively vetted beforehand (and given once you have the encoding and can extract it then give it to a translator) and especially so for something like Japanese to a European language. Sometimes it can also be for gameplay purposes; if you have to press a button five times at the start/end of every random battle for the pre/post battle speech that is nothing more than flavour text then trimming that to a single time (or none) probably makes the game better.

    Is one approach only used when another approach fails?
Again the other questions pretty much cover this one but, perhaps because most hackers are technical people, it is usually the technical/"make the game work for you" approach at first, though neither is or should be renounced entirely.

    What kinds of factors influence how the space problems are solved?
If it is just pointers then do those, if it is going to involve extensive assembly hacking then tweak the script unless you have/are an assembly hacker with the time to do it, if it requires a modest assembly hack then play it by ear, if a simple graphical/font tweak can solve it then do that.... but time available and skill of the persons involved is probably the short answer.

    How (if possible) do people avoid these kinds of problems arising?
Knowing the limits of the text display methods is a good start (characters per line, length of section, ease of adding in a next "screen" of text, how much space you have when it comes to fitting it back into the ROM and/or the memory it gets read to and so on) and either telling your translators not to go over this or coding the translation program built for the game to not exceed these limits.
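A translation tool that enforces such limits can be very small. The sketch below assumes a fixed-width font and a window of 16 characters by 2 lines (made-up numbers) and reports how many text boxes a line of dialogue would need, so a translator can see when one extra word spills into a new box.

```python
import textwrap

def paginate(text: str, chars_per_line: int, lines_per_box: int) -> list[list[str]]:
    """Word-wrap to the window width, then split the lines into text boxes."""
    lines = textwrap.wrap(text, width=chars_per_line)
    return [lines[i:i + lines_per_box] for i in range(0, len(lines), lines_per_box)]

long_line = "Welcome weary travellers from a distant land"
short_line = "Welcome travellers"

for candidate in (long_line, short_line):
    boxes = paginate(candidate, chars_per_line=16, lines_per_box=2)
    print(len(boxes), boxes)
```

With a variable-width font the same idea holds, but the width check has to sum glyph widths in pixels rather than count characters.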

    Do you think there are any trends in the way people solve these problems? If so, what are they?
Trends..... if I am going to have to recode a bunch of assembly to say mansion and hope something does not break (pointer maths based on an earlier location or something like calculated pointers), or just visit a thesaurus, or in this case say home, and not have to worry, you can bet the latter will be done.
8 bit systems and some instances of 16 bit systems: every byte counts, and moreover the game will probably be coded to reflect this, whereas later systems can afford easier going coding techniques and/or are more likely to have more resources. Most GBA games are 16 megabytes or less, but every GBA game can address a full 32 megabytes and it is usually trivial to add the extra space, whereas on something like the NES or SNES adding an extra 16 megs is either impossible, going to require a special emulator/hardware, or one of the hardest hacks you can do on the console. DS ROM images can store up to 512 megs, I believe, although you will probably run into file/format or memory issues long before then, and this is achieved by just pressing rebuild with the files in the appropriate directories. Indeed, developers quite often do not bother to clean up before building the final ROM, and this leads to all sorts of niceties (build tools/scripts, extra levels and even source code on occasion).
To this end I guess as space effectively becomes unlimited and screen resolutions increase (although these are still far from unlimited) some of the more in depth techniques are not necessarily dying off but becoming less necessary for someone to completely understand to be able to effect a near perfect translation.
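For the GBA case mentioned above, "adding it" at the file level can be as simple as padding the image out to a larger size. A minimal sketch, assuming 0xFF as the fill byte (a common convention for erased flash, but an assumption here) and a hypothetical helper name:

```python
GBA_MAX_ROM = 32 * 1024 * 1024      # the 32 megabyte ceiling mentioned above

def expand_rom(rom: bytes, target_size: int, fill: int = 0xFF) -> bytes:
    """Pad the image with a fill byte up to target_size; never shrink it."""
    if target_size < len(rom):
        raise ValueError("target size is smaller than the existing image")
    return rom + bytes([fill]) * (target_size - len(rom))

# Toy example with a tiny image; a real call would pass GBA_MAX_ROM or less.
expanded = expand_rom(b"\x01\x02", 8)
print(expanded.hex())
```

The padding only creates room. The game still has to be told to look there, which is where the pointer work discussed earlier in the thread comes back in.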

Zoinkity

« Reply #4 on: August 10, 2012, 11:44:36 am »
One nasty situation of fixed-length text is with saved names in save files.  The files themselves are usually very small, specifically formatted, and can only be one of several supported sizes for any given console. 
Unless you want to be stuck with a name no longer than 'Bob', you'll need to invest a lot of work into rearranging what data is stored, possibly limiting the number of saves if this isn't possible, and consider the save's name length as the final name limitation.  In this case, it may justify having to alter character name lengths (and therefore names themselves) unless you can use a clever workaround.
One such workaround would be to permit a custom name of only one length and a special code that would load the longer 'official' name if you decided not to change it.
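That workaround could look something like the sketch below. Everything in it is hypothetical: a 4-byte name field, 0xFF as a reserved marker byte assumed not to be a printable character in the game's font, and "Alexander" standing in for whatever the longer official name is.

```python
from typing import Optional

NAME_LEN = 4                  # hypothetical fixed length of the save's name field
DEFAULT_MARKER = b"\xff"      # hypothetical reserved byte, assumed unused by the font
OFFICIAL_NAME = "Alexander"   # hypothetical default name, longer than the field

def pack_name(name: Optional[str]) -> bytes:
    """Store a custom name, or the marker if the player kept the default."""
    if name is None:
        return DEFAULT_MARKER.ljust(NAME_LEN, b"\x00")
    raw = name.encode("ascii")
    if len(raw) > NAME_LEN:
        raise ValueError(f"custom names are limited to {NAME_LEN} characters")
    return raw.ljust(NAME_LEN, b"\x00")

def unpack_name(field: bytes) -> str:
    """Expand the marker to the official name; otherwise strip the padding."""
    if field[:1] == DEFAULT_MARKER:
        return OFFICIAL_NAME
    return field.rstrip(b"\x00").decode("ascii")

print(unpack_name(pack_name(None)), unpack_name(pack_name("Bob")))
```

The save file never grows, but every routine that prints the name now has to go through the equivalent of `unpack_name` instead of reading the field directly.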

FAST6191

« Reply #5 on: August 10, 2012, 12:40:06 pm »
3 letter names for saves..... well if arcades are not so much of a thing as they once were I guess the arcade can come to the people.

Zoinkity

« Reply #6 on: August 10, 2012, 07:52:06 pm »
I suppose four letters is more common.  For some insane reason I was thinking "Four letters?  What about the NULL?" when obviously this is a fixed-length string (aegh). 
You have to admit there are a lot of four-letter-only name cases, and that is very limiting in English.

Ryusui

« Reply #7 on: August 10, 2012, 07:59:33 pm »
One such workaround would be to permit a custom name of only one length and a special code that would load the longer 'official' name if you decided not to change it.

To be honest, this is exactly what I was planning to do with Bistro Recipe/Fighting Foodons.

LostTemplar

« Reply #8 on: August 11, 2012, 04:00:16 am »
Yes, this is one example of how much everything is game-dependent. The game I'm working on right now saves the protagonist's name with two bytes per character and allows up to eight characters. I changed it to one byte per character and voilà, 16 characters for English names without any further ado and any messing with the SRAM. You usually don't get any luckier than that.

Gideon Zhi

« Reply #9 on: August 11, 2012, 08:44:41 am »
Yeah, but then you have to rejigger all the interfaces to accommodate that sort of expansion. (See shop screen in the link.) That one was especially messy - you had combat interfaces that only allowed five characters, then you had characters with names like "Great Leader" or "Starfish Ron." What we did there was rig up a few of the interface bits to display a truncated name, which either ended up as part of the name (i.e. "Ron") or as painted text (see Daichi and Hinako in the screens) and rewired the rest to allow for longer names. We also changed the default male main character name from Masato to Masao as a result of the constraints.

Zoinkity

« Reply #10 on: August 12, 2012, 02:38:25 pm »
Adding +6 string length for Animal Forest names broke every single string copier, string lookup, fixed-length printer, string storage, colorizer, text bubble sizer, and message generator in the whole game.  For whatever silly reason they happen to duplicate the routines within stand-alone archives instead of calling the identical, generalized functions, so at least that step was made easier by altering one routine and redirecting everything else to call it. 

The really fun part was that the name text bubbles had to be offset more from the sides of the screen than usual.  Under a fixed-width, 6 char setup they were set so they would never be drawn off-screen regardless of scale using a simple offset and multiplier.  Drawing off-screen is very bad for the RDP and will crash real hardware.  An entirely different offsetting scheme was introduced to account for variable-width generated bubbles and minimum spacing from screen borders, computing the drawn width of the text, accounting for the bubble at scale, aligning away from the window edges, and leaving a small space for security.
Oddly, doing that buggered up the Snowman's mail generator (it inserts his name) because for whatever reason they used that particular text printer to copy the name into the mail.  Presumably it was to handle wordwrapping.  The best part is that the crash occurs when you talk to him, when the text bubble is first filled with his name, making it look like the name bubble printer was at fault--which it technically was, just from a different root cause.
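The offsetting scheme described above (measure the drawn text, account for the bubble at scale, keep it away from the screen edges) can be sketched roughly as follows. This is not the game's routine: the 320 pixel screen width, the margin, the padding and the glyph widths are all made-up values, and `bubble_position` is a hypothetical name.

```python
def bubble_position(anchor_x, text, glyph_widths, scale=1.0,
                    screen_w=320, margin=8, padding=6, default_w=6):
    """Return (left edge, width) of a name bubble kept inside the screen borders."""
    # Drawn width of the text with a variable-width font.
    text_w = sum(glyph_widths.get(ch, default_w) for ch in text)
    # Bubble width at the current scale, including padding on both sides.
    bubble_w = (text_w + 2 * padding) * scale
    # Centre on the anchor, then clamp so nothing is drawn off-screen.
    x = anchor_x - bubble_w / 2
    x = max(margin, min(x, screen_w - margin - bubble_w))
    return x, bubble_w

for anchor in (5, 160, 310):
    print(anchor, bubble_position(anchor, "Snowman", {}))
```

If a bubble is wider than the usable screen this clamp still overflows on the right, so a real routine would also need to cap the name length or the scale.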

Hourai was especially fun, certainly worth the effort.

Klarth

« Reply #11 on: August 12, 2012, 04:07:26 pm »
For whatever silly reason they happen to duplicate the routines within stand-alone archives instead of calling the identical, generalized functions, so at least that step was made easier by altering one routine and redirecting everything else to call it. 
This is most likely due to a compiler optimization setting that inlines all function calls to reduce function call overhead.  In most cases, it's not very significant savings, but in some cases it is.

Zoinkity

« Reply #12 on: August 13, 2012, 04:16:13 pm »
I can certainly understand why they would do that, but due to the nature of the game, the original functions are in a system bank that is always loaded (along with their dependencies) so doing this only means they're eating RAM loading a duplicate of the same thing. I might add it doesn't bother to do the same thing with the function dependencies.  Deleting extraneous code like this is a great way to make room for added content though.

What you said though would have worked great for functions dynamically loaded either by TLB or found only within dedicated files.  The Pokémon Stadium titles and pretty much anything made by Rare would be excellent examples of that done right.  Stadium titles go one step further and duplicate any resource used--especially images.  There are so many pokéballs and cursors you could choke a horse.