This is a subject as broad or broader than any other in ROM hacking and is basically half of ROM hacking (the other half being how do I interpret data I have found).
First, the PS2, much like most other consoles that use a CD/DVD for game storage, uses a filesystem and probably does not have the disc mapped to memory. This changes things a bit and makes some things easier and some things harder. For the record the PS2 is a mix between standard ISO 9660 (open with any ISO editor) and raw DMA (you will have to fish it out with a hex editor, give or take a few area specific tools) depending upon the game, but more on that as things go on.
The king of the hill is a method called tracing, it is flawless but as it necessitates messing around with assembly it has a higher barrier to entry than most other methods though not as much as you might think (indeed I reckon it could be done without knowing a single assembly instruction, it is better if you do but not necessary for a lot of it). For this you need a proper hacker emulator (not quite sure what the PS2 stuff is up to these days) and you start by finding the data in memory (viewing it in hex, cheats, tile viewers or plain seeing it in the game and working back from there giving you the first "ah there it is in memory"). When discussing the GBA I usually link http://www.romhacking.net/documents/361/
and it is much the same for every console: see it in memory, reset the game and set a breakpoint for anything that writes to that memory area. The emulator will stop things when that happens and, assuming it is the data you want, you follow it backwards. Typically it is only one step (normal cases) or two (when there is compression/manipulation to deal with).
This is where the "standard iso 9660 and raw DMA" stuff comes back in as on the GBA (and the SNES and the NES and everything else a lot of the RHDN userbase plays with more commonly) the ROM is in memory and handled accordingly. The NES has things like mappers to contend with but it is still there, the playstation (and other CD/DVD based consoles or indeed more modern consoles like the DS) will not have this and instead have a function (both in the game and more generally in terms of hardware) to read from it; you get to decode this function read and/or hardware call to the game storage section.
Next comes everything else. Where tracing was flawless everything else with the possible exception of corruption if you work at it long enough has huge pitfalls.
Filesystem stuff.
Back to the filesystem stuff: though it makes tracing subtly different and perhaps more difficult, it has three huge perks.
1) Filenames/Extensions -- developers are not always that hell bent on keeping J Random Hacker from getting at the files in a game, and they have to get stuff done, so the file names and the extensions of the files can give an awful lot away.
2) Directory structure -- much like 1) really.
3) File sizes -- the 200 byte file is probably not going to be your 5 minute heavy CGI cutscene. It might be a cutscene if such a thing is rendered in the game engine though.
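If you would rather not eyeball all of that in a file manager, the three perks above can be roughed out in a few lines of Python. This is only a minimal sketch: it assumes you have already extracted the disc to a folder with whatever ISO tool you like, and the function name list_files is made up for the example.

```python
import os

def list_files(root):
    """Return (size, extension, relative path) for every file under root, largest first."""
    entries = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            ext = os.path.splitext(name)[1].lower()  # extension, lower-cased for grouping
            rel = os.path.relpath(full, root)        # keeps the directory structure visible
            entries.append((os.path.getsize(full), ext, rel))
    entries.sort(reverse=True)
    return entries
```

Calling list_files on the extracted folder and printing the result puts the big audio/video files at the top and the small oddities at the bottom, which is usually enough to decide what to look at first.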
You can also use all three in the "remove files from the search" list -- the contents of the audio and movie directories will probably not interest you if you are after ripping the text, which is a bonus as audio and video take up an awful lot of space. Now the developer could have chosen to stick the files in an archive file for some reason; it is what a good chunk of DS hacking is concerned with when it comes down to it. There is also the issue of magic stamps: for various reasons developers like to indicate file types by putting known values in parts of the header, parts of the file or parts of the footer. http://astrogrep.sourceforge.net/ is great here.
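If you want to hunt for magic stamps inside one big archive file rather than across loose files, something like the following does the job. It is a sketch, not a finished tool: find_magic and the STAMPS table are names I made up, and the two stamps listed are just well known examples, so swap in whatever your game turns out to use.

```python
def find_magic(data, stamps):
    """Return sorted (offset, name) pairs for every occurrence of each stamp in data."""
    hits = []
    for name, magic in stamps.items():
        pos = data.find(magic)
        while pos != -1:
            hits.append((pos, name))
            pos = data.find(magic, pos + 1)
    return sorted(hits)

# Example stamps only; add the ones you spot in your own game.
STAMPS = {
    "RIFF (wav and friends)": b"RIFF",
    "PNG image": b"\x89PNG\r\n\x1a\n",
}
```

Read the archive in with open(path, "rb").read() and pass it as data. Short stamps will throw up false positives in a large file, so treat the hits as places to look rather than proof.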
Other methods
Hex searching.
I typically use this for, and teach this to, people that are looking for a palette for an image -- rip it from memory, do a search of the ROM for it or some fragment of it and you might be surprised how often it works. Things like compression, dynamic data (a number might change after it has been read from RAM) and manipulated data (the way something is stored and the way something ends up in memory need not be the same) will trouble this. That said I have seen people find monster stats this way; typically it comes with a lot of work in the analysis stage (more on that a bit later) but it can happen.
Relative searching.
Half decoding method, half search method. In most languages that are not logographic/ideographic the letters have a defined order (even in Japanese though the kana have a few accepted orders). More generally this means A=1 B=2 C=3 and so on or something like it for the text encoding.
You pick a string you have seen in the game and the tool looks for byte patterns in the ROM that match its letter order, which works if the game uses a relative encoding. http://www.romhacking.net/utilities/513/ is my tool of choice for this though Crystaltile2 also works for me.
There are tricks to picking a good string (try not to have capitals and lower case mixed, try not to have variables in the searched string, try not to have sections of different fonts/text effects in the searched string, try not to have a new screen, new paragraph or even new line start in the searched string, long is good but too long is possible (usually by virtue of things in the previous point popping up), short works but that often makes for a mountain to dig through) and I have been known to live dangerously and do a search for something like " the " as the with spaces either side is quite common in English.
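For the curious, the idea behind a relative search fits in a few lines. This is a bare sketch of the technique and not how any of the tools mentioned actually work inside: it assumes one byte per character and an encoding that keeps the letters in alphabetical order, and relative_search is just a name I picked.

```python
def relative_search(data, text):
    """Return offsets where data matches the letter-to-letter differences of text.

    Assumes one byte per character and consecutive values for consecutive letters.
    """
    if len(text) < 2:
        raise ValueError("need at least two characters to compare differences")
    # The differences between neighbouring letters are the same whatever value "a" has.
    want = [ord(b) - ord(a) for a, b in zip(text, text[1:])]
    hits = []
    for start in range(len(data) - len(text) + 1):
        window = data[start:start + len(text)]
        got = [b - a for a, b in zip(window, window[1:])]
        if got == want:
            hits.append(start)
    return hits
```

Once you have a hit, the byte at that offset minus the ordinary value of the first letter of your string tells you how far the game's table is shifted, and the rest of the alphabet follows from there. It is slow on big files written this plainly, and the string picking advice above applies just as much here.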
Text decoding methods are many and varied too. You can do things like frequency analysis (space is the most common character in almost all texts, spaces are rarely more than 12 characters apart, Scrabble letter scores are picked for a reason), linguistic analysis (every word has a vowel or y in it in English), combination analysis (if you have the space character you can fill in the blanks)... and you eventually arrive at my much enjoyed method, "corruption".
Corruption.
You have an emulator that runs the game from your hard drive, or if you are feeling time happy and money happy then you could burn discs every time if you like.
This affords you the option to change something in the game (do note the filesystem stuff from earlier) and run it. If something is broken and you changed something.... the game could crash, the game might not load what you want for a while and other things like that but eventually you zero in on things.
Going back to the text decoding for a moment I like to change text to repeating patterns (if you stick a long run of a single character in the text and it displays in the game then you know what that character is, what the ones beside it are and can work out the rest. Do it properly and add a pattern like 1112 1213 1313 ..... and you see a pattern in the game of characters getting one more in a run and one higher and you are laughing).
You can do this with any halfway decent hex editor by inverting things (which has the added bonus of being able to turn it back easily), filling sections, pasting in new data.....
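If you would rather script it than click around a hex editor, both tricks are short. What follows is a sketch under my own assumptions: the function names are made up, and the growing pattern is my reading of "one more in a run and one higher" (one of the first value, two of the next, three of the next). Work on a copy of the file, naturally.

```python
def invert_range(data, start, length):
    """Flip every bit in data[start:start+length]; running it twice restores the original."""
    out = bytearray(data)
    for i in range(start, min(start + length, len(out))):
        out[i] ^= 0xFF
    return bytes(out)

def fill_pattern(data, start, length, first_value=0x01):
    """Overwrite a range with runs that grow by one: 1 x first, 2 x next, 3 x next..."""
    out = bytearray(data)
    end = min(start + length, len(out))
    value, run, i = first_value, 1, start
    while i < end:
        for _ in range(run):
            if i >= end:
                break
            out[i] = value & 0xFF
            i += 1
        value += 1
        run += 1
    return bytes(out)
```

Neither function changes the size of the data, which matters as most games will not take kindly to files growing or shrinking underneath them.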
Some consider this a bit crude... and it is but it is potent and though the traditional "corrupt half the ROM and go from there" approach is probably to be avoided a slightly more subtle approach can see it do things for you (the text decoding by forcing the game -- why spend 20 minutes messing around with assembly code when you can just force the game to decode it and note the result).
The two main issues are that game devs probably did not expect random changes in their data, so games can get a bit odd when you do this, and that in some instances the corruption can trigger antipiracy protection (again game devs did not expect random changes -- it is called a ROM or CD/DVD-ROM for a good reason); any change might trigger this depending upon the game.
Analysis
I already mentioned magic stamps but the ability to detect certain types of files and methods (a process known as fingerprinting) is much more varied than that. On the GBA the cart is typically read from 08?????? so if you see lots of 08 values with 6 random values in between them (better yet if they get larger every time) you probably have a pointer section. Pointers to what I have no idea but pointers... well they point to things that are typically not random data and they will probably also have given you the lengths and starts of the sections at hand (they point to the start of things and the next one will come at the end or shortly after the previous one).
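The GBA pointer fingerprint described above is easy enough to automate. Here is a rough sketch which assumes the pointers are 4 bytes each, little-endian (so the 08 shows up as the last byte of each group in a hex editor) and aligned to 4 byte boundaries; the function name and the min_run default are mine.

```python
import struct

def find_pointer_runs(data, min_run=4, top_byte=0x08):
    """Find runs of 4-byte little-endian values whose top byte is top_byte
    and which never decrease. Returns (offset, [pointer values]) per run."""
    runs = []
    current_start, current = None, []
    for off in range(0, len(data) - 3, 4):
        (value,) = struct.unpack_from("<I", data, off)
        ok = (value >> 24) == top_byte
        if ok and (not current or value >= current[-1]):
            if not current:
                current_start = off
            current.append(value)
        else:
            if len(current) >= min_run:
                runs.append((current_start, current))
            # A valid pointer that merely broke the ascending order starts a new run.
            current_start, current = (off, [value]) if ok else (None, [])
    if len(current) >= min_run:
        runs.append((current_start, current))
    return runs
```

Subtract 0x08000000 from each value to get the location in the ROM file, and the gaps between neighbouring pointers give you a first guess at the section lengths. Not every pointer table is sorted or aligned, so loosen the checks if nothing turns up.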
Many around here will seemingly also stare at a hex editor and decode files. This is not exactly what happens, as they will probably be shifting the line length (if each entry gives over so many bytes to the name, so many to the location and so many to the size it might not be obvious if you just see the world 10h bytes at a time; shifting it can have each entry appear on one line and things get very easy from there) and looking for common things (lengths of files to be referenced, magic stamps further down in archive files and things that point to just before them). Beyond that, pasting a few lines into a spreadsheet and manipulating numbers can make for some interesting things. The obvious one is if your archive format does not have file lengths and you need those to extract things: take the start of the next file and subtract the start of the current one and you have your length value or something close to it (for various reasons game devs will pad files out so the next one starts at a nice number).
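The spreadsheet trick for missing lengths is simple enough to write down. A minimal sketch, assuming you already have the start offsets in ascending order and know the total size of the archive (lengths_from_offsets is a made up name):

```python
def lengths_from_offsets(offsets, archive_size):
    """Given file start offsets in ascending order, return (start, length) pairs.

    Each length is the gap to the next start (or to the end of the archive
    for the last file), so it includes any padding the developers added.
    """
    pairs = []
    for i, start in enumerate(offsets):
        end = offsets[i + 1] if i + 1 < len(offsets) else archive_size
        pairs.append((start, end - start))
    return pairs
```

For example lengths_from_offsets([0x20, 0x800, 0x1000], 0x1800) gives lengths of 0x7E0, 0x800 and 0x800. Because of the padding these are upper bounds rather than exact sizes, which is normally fine for extraction.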
Back to the stats thing: you have the game in front of you, and if the game is nice enough to provide you with a bestiary then that is probably also the order the game stores the enemy stats in (if not it probably is still logical -- as the dev you are effectively making a database after all and a bad database schema is to be avoided). If the game bestiary tells you attack, def, hp... then convert those numbers back to hex and go looking for them (this is also the reason why stats are often limited to the powers of two or half that in the cases of signed values).
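Converting the bestiary numbers and searching for them can be scripted as well. Big caveat: how wide each stat is and which byte order it uses varies by game, so the 16-bit little-endian default below is purely a guess on my part, as is the assumption that the stats sit back to back in the order you list them. find_stat_block is a name invented for this sketch.

```python
import struct

def find_stat_block(data, stats, fmt="<H"):
    """Search data for the given stats stored back to back.

    fmt is a struct format for one value; "<H" (16-bit little-endian) is a
    guess, try "B" for single bytes or ">H" for big-endian if nothing turns up.
    """
    needle = b"".join(struct.pack(fmt, s) for s in stats)
    hits = []
    pos = data.find(needle)
    while pos != -1:
        hits.append(pos)
        pos = data.find(needle, pos + 1)
    return hits
```

Use a monster with distinctive numbers; three stats of 10, 10 and 10 will match all over the place. If the hit for one monster and the hit for the next one in the bestiary are a fixed distance apart you have probably found the table and the size of each entry.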
Analysis also extends to looking at the thing in a tile editor, noting the sections that are 2D graphics and ignoring them, or looking for graphical patterns (humans are pretty good at seeing patterns in colours and less so at seeing them in raw hex).
Anyhow 3am is rapidly approaching so I will tie it off here. Naturally you can combine methods (I encourage it) and this was but the briefest of overviews so if you can see other approaches that come from others here or are logical variations on the theme then congratulations you are probably destined to be a ROM hacker.