Author Topic: Something wrong with searching Cavespeak?  (Read 764 times)

Reefytheslim

Something wrong with searching Cavespeak?
« on: May 08, 2020, 01:46:32 pm »
Hi guys,
Sorry if my question seems noobish, I'm still learning. I'm trying to translate a Japanese SNES game called Mini Yonku Let's & Go!! - Power WGP 2. I've read many tutorials and practiced on different games, and everything was successful.
I'm using the Cavespeak relative search approach: I replace the Japanese font with an English font (A-Z, all caps), then use WindHex to search for the Cavespeak and build the table accordingly.
So for this game I replaced the Japanese font with the English capital alphabet (which is already present in the ROM) and got the Cavespeak result. But when I try to relative search for it using WindHex, I get either no results or incorrect ones, and I can't even search for more than 3-4 characters. I used an all-caps English font and no numbers.
What could be wrong?
Here is the font in Tile Molester and the Cavespeak:



This goes for many Cavespeak sentences in the game, same result.
https://imgur.com/BhnrPZf
(high-resolution image)


Is there something missing? I really need some guidance here :-[
« Last Edit: May 08, 2020, 01:54:10 pm by Reefytheslim »

FAST6191

Re: Something wrong with searching Cavespeak?
« Reply #1 on: May 09, 2020, 01:43:18 pm »
Does your relative search do 16-bit encoding searches? Or can you possibly insert a wildcard between the values to mimic it? I don't tend to use WindHex for relative search, so I don't know its limitations and general workflow here. Monkey-Moore is my chosen toy for this sort of thing: https://www.romhacking.net/utilities/513/

Anyway you can't fit all the Japanese characters into 8 bit encoding (at least not without cycling it out which is tedious and quite limiting) so most things opt to use 16 bit encodings which can make life harder for a basic relative search.
Alternatively, the encoding might not be relative. There is no technical reason for the encoding to be sequential (there can be gaps between encoding values, which can happen for all sorts of reasons*) or even to follow the font order; it is just common for it to.

*ASCII, for instance, has upper and lower case separated by 20 hex, so you can add that to go between them ( http://www.asciitable.com/ ), and the digits are all 3 followed by the number you want (30 hex through 39 hex). I don't know offhand what Japanese encoding tricks people might have used way back when (and my Japanese knowledge is barely enough to hazard a guess at the possibilities), but there are all sorts of options that would frustrate such an approach.
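To make the ASCII footnote concrete, here is a small Python sketch of the two layout tricks mentioned (the 20 hex case gap and the 30 hex digit base). The helper names are made up for illustration, and nothing here says anything about how this particular game encodes its text.

```python
# ASCII layout tricks: case differs by 0x20, digits are 0x30 + value.
case_gap = ord("a") - ord("A")  # 0x61 - 0x41


def to_lower_ascii(ch: str) -> str:
    """Add 0x20 to an upper-case ASCII letter to get the lower-case one."""
    return chr(ord(ch) + 0x20) if "A" <= ch <= "Z" else ch


def digit_code(n: int) -> int:
    """ASCII code of a single decimal digit: high nibble 3, low nibble n."""
    return 0x30 + n


# 'Z' (0x5A) and 'a' (0x61) are not adjacent: punctuation sits in between.
# A font sheet that draws A-Z immediately followed by a-z would not show
# this jump, which is the kind of gap that trips up a relative search.
gap_z_to_a = ord("a") - ord("Z")
```

If a game's encoding has gaps like the one between 'Z' and 'a', a search built from the font order alone will miss any string that crosses the gap.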

You may then have come upon a game for which relative search is not a useful tool; in many cases it is quite useful but not in all of them.

Reefytheslim

Re: Something wrong with searching Cavespeak?
« Reply #2 on: May 09, 2020, 02:33:06 pm »
Thanks for the reply :)
Well, this complicates things lol. I guess I need to do more reading.
What do you suggest in this case?
By the way, I tried Monkey-Moore and Translhextion (both can use a wildcard) and skipped a value after each character, but still no results. I think I need to search for more text from different locations in the game, though, since I have only tried it once.

Any thoughts on how to approach this? Anything would be helpful.

KingMike

Re: Something wrong with searching Cavespeak?
« Reply #3 on: May 10, 2020, 01:51:17 am »
Standardized Japanese encodings are extremely rare in a 16-bit game. I've encountered ONE SNES game using Shift-JIS.
Commonly games will use 8-bit values for kanji, with a set of values reserved as a lead byte for kanji.

If using a wildcard character doesn't work (that is, with a relative searcher that supports wildcards, using say "U*F*U*" or "*U*F*U" where the searcher understands * as a wildcard), there is the other thing that was common in SNES RPGs: compression. That is an advanced topic, because it requires knowing how to read and decipher ASM code to figure out the compression format.
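As a rough illustration of the wildcard idea, here is a minimal Python sketch of a relative search that treats `*` as "any byte". It is not how WindHex, Monkey-Moore, or Translhextion are implemented; the function name and the sample bytes are made up, and it assumes the letters are stored in alphabetical order with a constant shift.

```python
def relative_search(data: bytes, pattern: str, wildcard: str = "*") -> list[int]:
    """Return offsets where `data` matches `pattern` up to a constant shift.

    Every non-wildcard character must sit at the same (unknown) distance
    from its byte value; wildcard positions match any byte.
    """
    # Positions of the pattern that actually constrain the match.
    fixed = [(i, ord(c)) for i, c in enumerate(pattern) if c != wildcard]
    if len(fixed) < 2:
        raise ValueError("need at least two non-wildcard characters")
    first_pos, first_val = fixed[0]
    hits = []
    for start in range(len(data) - len(pattern) + 1):
        # Shift implied by the first fixed character; the rest must agree.
        shift = data[start + first_pos] - first_val
        if all(data[start + i] - v == shift for i, v in fixed):
            hits.append(start)
    return hits


# Hypothetical data: 'A' stored as 0x10, each character followed by a 0x00 byte.
sample = bytes([0xFF, 0x24, 0x00, 0x15, 0x00, 0x24, 0x00, 0xFF])
plain_hits = relative_search(sample, "UFU")        # 8-bit search finds nothing
trailing_hits = relative_search(sample, "U*F*U*")  # wildcard after each letter
leading_hits = relative_search(sample, "*U*F*U")   # wildcard before each letter
```

Trying both "U*F*U*" and "*U*F*U" covers either byte order of a two-byte character. If neither finds anything across several different strings, the text is probably not stored as a simple shifted alphabet.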

Vehek

Re: Something wrong with searching Cavespeak?
« Reply #4 on: May 10, 2020, 02:19:00 am »
I've encountered ONE SNES game using Shift-JIS.
I've encountered at least two, not including Satellaview releases: Appleseed and Power Lode Runner. This also doesn't include games that contain most of the kanji in JIS order, but don't use the encoding directly.
« Last Edit: May 10, 2020, 02:29:51 am by Vehek »

FAST6191

Re: Something wrong with searching Cavespeak?
« Reply #5 on: May 10, 2020, 12:00:49 pm »
I was not thinking of standard encodings so much as tricks a dev might have pulled in the encoding that would frustrate a simple relative search. For example, they might break kanji down by moji and leave a gap between values, or encode dakuten and handakuten characters as, say, 1 up from their plain equivalent, or 30+ up, and still not be in the same order as the font. ASCII has a few such tricks that are easy to explain, and I hoped they would get the concept across more easily than plain words. My knowledge of Japanese and of patterns commonly used by Japanese programmers is rather minimal (as in, I had to look up dakuten and handakuten because I had forgotten the terms for the kana with diacritic marks), so I don't really have any Japanese examples, and I'd guess it would not matter much to the OP anyway.

Equally, is 16-bit encoding that uncommon on the SNES? The SNES is by no means my thing, but the few games I messed around with had some 16-bit stuff for the menus and in low-volume text games (I did a few puzzle games at one point). Those were completely custom encodings that I ended up brute forcing, decoding by changing values and seeing what the result was, but that is neither here nor there. I would be surprised to see it on the NES, even more so if it was common, but bump up to the PS1 and its contemporaries and successors, and at least in the Japanese editions of games (or games that have some amount of kanji; kana-only stuff for kids is a different matter) it is common as dirt. The only time I would even note it is if it is one of the games that stuffs everything in memory and 16-bit English sees you run out of memory.
Again, though, this is not my area, so if it is uncommon on the SNES then I happily defer to those who play more in that world; my limited SNES examples may have led me astray.

Reefytheslim

Re: Something wrong with searching Cavespeak?
« Reply #6 on: May 10, 2020, 04:10:35 pm »
Thanks for all your replies, guys.

I got some results by finding English text in one of the dialogues (the word START sitting between Japanese characters). I searched for it, got a successful result, and built a table accordingly. The table is correct, as I checked it against several examples.
But here's what I noticed. This is a mini 4WD racing game, so text appears in two places: 1) during a race, where instructions are given, and this is where I got a working table from; and 2) the actual dialogues between game characters. When I apply the table to the first kind of text, I get correct results, but when I try it on the second (dialogues) I get nothing. Relative search and value search both work fine on the first, but on the second I get the same problem: nothing!
Could it be that the game uses compression on the dialogues but not on the instructions and other text? Shouldn't compression apply to all of the game's text? This is frustrating.
Here are screenshots:
https://imgur.com/P0EWD1p
https://imgur.com/9plQyYA
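
The table-building step described above (finding a known word such as START and working out the rest of the alphabet from it) can be sketched in Python roughly as follows. This assumes one byte per character and letters stored in alphabetical order, which may well not hold for every text block in this game. The function names and sample bytes are hypothetical, and the `HH=C` output follows the usual hex-equals-character layout of table files.

```python
def table_from_anchor(rom: bytes, offset: int, word: str) -> dict[int, str]:
    """Derive an A-Z table from a known upper-case word found at `offset`.

    Assumes one byte per character and letters in alphabetical order.
    """
    found = rom[offset:offset + len(word)]
    if len(found) != len(word):
        raise ValueError("word runs past the end of the data")
    # Every letter must sit at the same distance from its ASCII value.
    shifts = {b - ord(c) for b, c in zip(found, word)}
    if len(shifts) != 1:
        raise ValueError("bytes at offset are not a constant shift of the word")
    shift = shifts.pop()
    return {ord(c) + shift: c for c in "ABCDEFGHIJKLMNOPQRSTUVWXYZ"}


def to_tbl_lines(table: dict[int, str]) -> list[str]:
    """Format the table as 'HH=C' lines, sorted by byte value."""
    return [f"{code:02X}={ch}" for code, ch in sorted(table.items())]


# Hypothetical bytes: START stored with 'A' = 0x20, padded with 0x00.
sample_rom = bytes([0x00, 0x32, 0x33, 0x20, 0x31, 0x33, 0x00])
sample_table = table_from_anchor(sample_rom, 1, "START")
sample_lines = to_tbl_lines(sample_table)
```

A table built this way only describes text stored in the same encoding as the anchor word. If the dialogue decodes to garbage with it, that text is stored differently (another encoding, or compressed), which the table alone cannot tell apart.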