
Topic: (SOLVED) Wanderbar - text grab function for Jungle Wars 2

aqualung

(SOLVED) Wanderbar - text grab function for Jungle Wars 2
« on: November 02, 2020, 12:58:24 pm »
UPDATE 2: (I put the previous ones in a spoiler so as not to clutter the post too much)

Well, I don't know which planets have aligned, but I tried a crazy idea that popped into my mind for making that text grab function in Lua on Wanderbar, and it worked. ALMOST worked. The Lua code works as intended and gets the char codes of the dialog being printed on the screen, but it doesn't recognize some characters (specifically, the kana that have maru or tenten). So, instead of "desu" it ends up printing "tesu", or "haha" instead of "papa", and so on. The table file I created is correct, I've checked it. It seems that the game does something strange when it has to print a kana that has tenten or maru. I've been looking at it for some days, but I think at this point I've really hit a wall. I've attached the Lua code in case someone is curious about how I finally managed to do it. To see how the in-game dialog printing works, I've used the No$sns emulator's built-in debugger.

Link to a video showing the code in action, on Wanderbar: https://www.youtube.com/watch?v=Ig-iWjG5foo&feature=youtu.be

And this is the main.lua file that Mato originally created for me and that I've expanded to add the text grab routine: https://www.dropbox.com/s/c50sv23d9opo8z8/main.lua?dl=0

If anyone can give it a go and find out why the kana with maru and tenten are not printed correctly, I'd appreciate it.





Spoiler:
UPDATE: I've made a video to give a more visual example:

https://www.youtube.com/watch?v=htAxR-9J5LE

I've created two watchpoints in snes9x: one that follows the address 000592 (7E0592 in snes9x), and another one for an address I found that changes its value to 01 every time the dialog box opens (000596; perhaps some kind of flag for when there's dialog printing?).

snes9x prints the watched values in decimal, but if you convert them to hex you can see that they match the codes in the JW table file (for instance, 216 = 0xD8, which is the code for the arrow pointing down, and 159 = 0x9F, which is the code for white space).
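
The decimal-to-hex check described above can be sketched in a couple of lines of Lua. The helper name `toTableCode` is made up for illustration; only the two sample values come from the post.

```lua
-- Convert a watched value from decimal (as snes9x displays it)
-- to the two-digit hex code used in a table file.
function toTableCode(value)
    return string.format("%02X", value)
end

print(toTableCode(216)) -- D8 (arrow pointing down)
print(toTableCode(159)) -- 9F (white space)
```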

So, my request still stands. If someone could provide me with a Lua script, or any other way to store in an array every dialog character written to 000592, I could get the game's text and translate it as I play (it'd be easier to translate, as I'd have the context of what's happening at the same time the dialogs are printed).

I've had to use my cellphone's potato-cam again (snes9x's built-in video recording function didn't record the watchpoint info), but I hope the image can be seen clearly enough.

Original post: ----------------------------------------

I need a helping hand with something I'd like to do with Wanderbar, if it's possible. I'm close to finishing my dialog-only translation of the SNES RPG "Idea no Hi", and the next game on my list is another SNES title, "Jungle Wars 2".

For Idea no Hi I used an OCR program called Capture2Text, but it doesn't work well with the font JW2 uses (the OCR returns mostly gibberish and random symbols).

So, instead of randomly testing lots of OCR programs until I find one that works (if I find one), or copying the text by hand as I tried to do before, I remembered a Wanderbar Lua plugin that Mato wrote for me some time ago, which captures and prints the line ID of every dialog text whenever it appears in the game. I've looked into it a little, and this is what I know about how the text in JW2 is printed:

Every time a dialog window opens, the char code for every character is written to memory, one after the other, at the address 000592 (every char is only 1 byte long and every code gets written to that same address, so every char overwrites the previous one until the text ends).

Would it be very difficult to create a routine that reads every char code that gets written to that address and stores it in an array? I already made a table file for that game, so writing a routine that converts that array of char codes to their equivalent characters and then prints them in Wanderbar's browser window should be more or less trivial, and I should be able to do that myself. What I need is for someone to write the routine that fires every time a dialog box opens and then captures in an array, one by one, every char code as it is written to the memory address 000592 (that part is beyond my knowledge).
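
A minimal sketch of the capture-then-convert idea, written so it runs in plain Lua without an emulator. Everything here is an assumption for illustration: `sample_tbl` is a made-up two-entry fragment (only the codes 9f and d8 are mentioned in this thread, and the "↓" glyph is a stand-in), and `captureCharCode` would have to be called from whatever hook fires when the game writes to 000592.

```lua
-- Hypothetical fragment of a table file, keyed by lowercase hex code.
sample_tbl = { ["9f"] = " ", ["d8"] = "↓" }

capturedCodes = {}

-- Call this once for every byte the game writes to 000592.
function captureCharCode(code)
    capturedCodes[#capturedCodes + 1] = code
end

-- Convert the captured codes to text; codes missing from the table
-- are shown as [xx] so they are easy to spot.
function decodeCaptured(codes, tbl)
    local out = {}
    for i, code in ipairs(codes) do
        out[i] = tbl[string.format("%x", code)] or string.format("[%02x]", code)
    end
    return table.concat(out)
end

-- Simulated writes standing in for the emulator hook.
for _, code in ipairs({0x9f, 0xd8, 0x01}) do
    captureCharCode(code)
end
print(decodeCaptured(capturedCodes, sample_tbl))
```

Showing unknown codes as [xx] instead of dropping them makes gaps in the table file visible while playing.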

That routine would help me a lot, as I could ctrl+C every line of text while I play the game and paste it into a translator instead of having to copy the text manually, which would take me ages.

PS: I know there are programs that can extract all the text from the ROM at once (Atlas, ABCDE, Windhex32, etc.), but then I'd have all the text of the game without any order, and I need to see what is happening in the game to have proper context (especially because it's a kana-only game, so it'll be a little more difficult to translate than Idea no Hi).

I hope I've managed to explain myself. Thanks in advance to anyone who can offer any help.
« Last Edit: November 25, 2020, 12:31:26 pm by aqualung »

mono21400

Re: (almost solved!!) Wanderbar - text grab function for Jungle Wars 2
« Reply #1 on: November 24, 2020, 01:40:04 pm »
Okay I gave it a shot

Success! (I'm using FFVI's html file as you didn't provide one, but it works regardless)

You can do it two ways. With Unicode combining characters (easiest, but some programs dislike combining characters):
Spoiler:
Code: [Select]
function PC_86F7()
    local charCode = getMemory(0x592, 1)
    -- The game grabs another character ("。" or "`") from 0x594 to use for the maru/tenten, just like the Unicode combining characters
    local maruten = getMemory(0x594, 1)
    local character = charCodeToCharacter(string.format("%x", charCode))

    if maruten == 0x93 then -- tenten!
        character = character .. utf8.char(0x3099)
    elseif maruten == 0x94 then -- maru!
        character = character .. utf8.char(0x309A)
    end
    dialogCharCodeString = dialogCharCodeString .. character
    --writeHTML("devinfo", string.format("%x", charCode))
    writeHTML("scriptarea", dialogCharCodeString)
end

Or with a lookup table to avoid using combining characters (requires an extra table, but it's the most compatible):
Spoiler:
Code: [Select]
--First, create this Lookuptable below the original jw_tbl table
jw_tbl_maruten={}
jw_tbl_maruten["0"]="が"
jw_tbl_maruten["1"]="ぎ"
jw_tbl_maruten["2"]="ぐ"
jw_tbl_maruten["3"]="げ"
jw_tbl_maruten["4"]="ご"
jw_tbl_maruten["5"]="ざ"
jw_tbl_maruten["6"]="じ"
jw_tbl_maruten["7"]="ず"
jw_tbl_maruten["8"]="ぜ"
jw_tbl_maruten["9"]="ぞ"
jw_tbl_maruten["a"]="だ"
jw_tbl_maruten["b"]="ぢ"
--jw_tbl_maruten["c"]="(Invalid)"
jw_tbl_maruten["d"]="づ"
jw_tbl_maruten["e"]="で"
jw_tbl_maruten["f"]="ど"
--skipped n block
jw_tbl_maruten["15"]="ば"
jw_tbl_maruten["16"]="び"
jw_tbl_maruten["17"]="ぶ"
jw_tbl_maruten["18"]="べ"
jw_tbl_maruten["19"]="ぼ"
jw_tbl_maruten["1a"]="ぱ"
jw_tbl_maruten["1b"]="ぴ"
jw_tbl_maruten["1c"]="ぷ"
jw_tbl_maruten["1d"]="ぺ"
jw_tbl_maruten["1e"]="ぽ"
--katakana
jw_tbl_maruten["37"]="ガ"
jw_tbl_maruten["38"]="ギ"
jw_tbl_maruten["39"]="グ"
jw_tbl_maruten["3a"]="ゲ"
jw_tbl_maruten["3b"]="ゴ"
jw_tbl_maruten["3c"]="ザ"
jw_tbl_maruten["3d"]="ジ"
jw_tbl_maruten["3e"]="ズ"
jw_tbl_maruten["3f"]="ゼ"
jw_tbl_maruten["40"]="ゾ"
jw_tbl_maruten["41"]="ダ"
jw_tbl_maruten["42"]="ヂ"
--jw_tbl_maruten["43"]="(Invalid)"
jw_tbl_maruten["44"]="ヅ"
jw_tbl_maruten["45"]="デ"
jw_tbl_maruten["46"]="ド"
--skipped n block
jw_tbl_maruten["4c"]="バ"
jw_tbl_maruten["4d"]="ビ"
jw_tbl_maruten["4e"]="ブ"
jw_tbl_maruten["4f"]="ベ"
jw_tbl_maruten["50"]="ボ"
jw_tbl_maruten["51"]="パ"
jw_tbl_maruten["52"]="ピ"
jw_tbl_maruten["53"]="プ"
jw_tbl_maruten["54"]="ペ"
jw_tbl_maruten["55"]="ポ"


-- Then, edit these functions
function charCodeToCharacter(c, m)
    -- The second argument is the "mode": if it's 1, the character has a maru or tenten diacritic
    if m == 1 then
        return jw_tbl_maruten[c]
    end
    return jw_tbl[c]
end


function PC_86F7()
    local charCode = getMemory(0x592, 1)
    local maruten = getMemory(0x594, 1)
    local character

    -- If "maruten" equals 0x93 the character has tenten; if it equals 0x94 it has maru
    -- (the game treats them much like Unicode combining characters).
    -- A lookup table is used for compatibility with programs that dislike combining characters.
    if maruten == 0x93 then
        character = charCodeToCharacter(string.format("%x", charCode - 0x2e), 1)
    elseif maruten == 0x94 then
        character = charCodeToCharacter(string.format("%x", charCode - 0x29), 1)
    else
        character = charCodeToCharacter(string.format("%x", charCode), 0)
    end
    dialogCharCodeString = dialogCharCodeString .. character
    writeHTML("devinfo", string.format("%x", charCode))
    writeHTML("scriptarea", dialogCharCodeString)
end
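
To make the two offsets easier to follow, here is a small worked example of the key arithmetic on its own. `marutenKey` is a hypothetical helper, not part of the script, and the base code 0x43 is inferred from the offsets and the lookup table (it is the one value that lands on both "15" and "1a"), not read from the game's table file.

```lua
-- Offsets and flag values taken from the lookup-table version of PC_86F7.
TENTEN_OFFSET = 0x2e
MARU_OFFSET = 0x29

-- Hypothetical helper: returns the jw_tbl_maruten key for a base char code,
-- or nil when the flag byte marks neither tenten nor maru.
function marutenKey(charCode, maruten)
    if maruten == 0x93 then
        return string.format("%x", charCode - TENTEN_OFFSET)
    elseif maruten == 0x94 then
        return string.format("%x", charCode - MARU_OFFSET)
    end
    return nil
end

print(marutenKey(0x43, 0x93)) -- 15, the key that holds "ば"
print(marutenKey(0x43, 0x94)) -- 1a, the key that holds "ぱ"
```

The maru offset is 5 smaller than the tenten offset, which is why the five maru entries sit directly after the five tenten entries of the same row in the table.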

aqualung

Re: (almost solved!!) Wanderbar - text grab function for Jungle Wars 2
« Reply #2 on: November 25, 2020, 12:45:18 pm »
Great! It's just what I was looking for, thanks a lot, Mono21400.

Sorry for not providing the html file. Frankly, I didn't have much hope of somebody replying, that's why I only provided the Lua script in the first place.

I looked into it for a few hours, but I couldn't figure out that the flag for the maru and tenten characters was just two bytes to the right. I didn't understand why, having all the accented characters already in the table file, they didn't just use those. Of the two approaches you've given, I think I'll choose the second one, as it seems to be more compatible, as you already said.

I can continue translating this game at last. When I started almost a year ago, I tried manually copying the game dialogues, but it was too much work. Now it's only a matter of copy-pasting into DeepL, fine-tuning the results if needed, and pasting them into a text file.

Thanks a lot again, you've been a lifesaver! Would you mind if I cite you in the Lua code comments? I always like giving credit to any person who helps me.

mono21400

Re: (SOLVED) Wanderbar - text grab function for Jungle Wars 2
« Reply #3 on: November 26, 2020, 02:32:28 pm »
Quote from: aqualung
Great! It's just what I was looking for, thanks a lot, Mono21400.
Sorry for not providing the html file. Frankly, I didn't have much hope of somebody replying, that's why I only provided the Lua script in the first place.

You are welcome! And don't be too sorry about the html file, at least I could use FFVI's so it wasn't too bad.


Quote from: aqualung
I looked into it for a few hours, but I couldn't figure out that the flag for the maru and tenten characters was just two bytes to the right. I didn't understand why, having all the accented characters already in the table file, they didn't just use those. Of the two approaches you've given, I think I'll choose the second one, as it seems to be more compatible, as you already said.

It's because the table file does tell the game which letter is which, but the place where you are reading the individual characters is the point where the game is just about to draw them.
The system they have is set up to not waste space: they avoid having to store the "repeated" characters by having one nifty function that creates the new characters by overlapping the dot to create the maru and two commas to create the tenten. Pretty ingenious, don't you think?
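
The same base-plus-mark idea can be mirrored on the Unicode side. This is a minimal sketch, not what the game or the script above does: it assumes Lua 5.3+ (for the `utf8` library) and relies on the Unicode ordering in which, for the k/s/t/h rows, the tenten form is the code point right after the plain kana and the maru form (h row only) is two after. `composeKana` is a hypothetical helper.

```lua
-- Flag values as read from 0x594 in the code earlier in the thread.
TENTEN = 0x93
MARU = 0x94

-- Hypothetical helper: turns a plain kana plus a flag into the
-- precomposed kana, without a lookup table or combining characters.
function composeKana(base, maruten)
    local cp = utf8.codepoint(base)
    if maruten == TENTEN then
        return utf8.char(cp + 1)
    elseif maruten == MARU then
        return utf8.char(cp + 2)
    end
    return base
end

print(composeKana("て", TENTEN)) -- で
print(composeKana("は", MARU))   -- ぱ
```

It does not check that the base kana can actually take the mark, so it is only safe if the game never sets the flag on other characters.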


Quote from: aqualung
I can continue translating this game at last. When I started almost a year ago, I tried manually copying the game dialogues, but it was too much work. Now it's only a matter of copy-pasting into DeepL, fine-tuning the results if needed, and pasting them into a text file.
Thanks a lot again, you've been a lifesaver! Would you mind if I cite you in the Lua code comments? I always like giving credit to any person who helps me.

About crediting, sure, go right ahead! And much luck in your translating adventure!