Yeah, as others said (and I probably should have made clearer), every system is different, and that does mean system and not just CPU: different memory mappings, interrupts, BIOS and similar calls, coprocessors and more all exist. To this end, here are the extra paragraphs I threatened.
You have to handle the memory yourself for the most part, which is typically what gets every new assembly programmer (every big language outside assembly attempts to do away with having the programmer manage memory at increasingly higher levels, and for good reason). Unless you are doing a very big hack you will probably be working directly with numeric addresses in most cases; proper assemblers and disassemblers let you define a shorthand, more human friendly name for an address and more or less declare it as a variable. The differences between systems, even with that taken into account, can be quite large as well- compare say a modern X86/X64 processor to a modern ARM and things are very different again (although in some ways this plays into the two big CPU philosophies* of RISC and CISC, and owes more to the legacy junk x86 has to drag around with it than to straight up philosophy). Having to learn all this is the price you pay for gaining the ability to control anything. Assembly, more than most things, is a great example of how every problem can be solved (within the limitations of the system) and how everything means something until you can prove otherwise, which often takes new ROM hackers and computer scientists a bit of time to grasp.
*There are also architectures like modified Harvard and Von Neumann, but let us not get into that right now as very little in the console world uses anything resembling Harvard architecture.
Learn the instructions little by little... I certainly would if I were to tackle X86. Even a simple processor might have complex instructions, but there is a core group of a handful of instructions and concepts, and the vast majority of programs rarely use anything else. To this end, here is what I would say to learn (all are important, though I probably could have spent a little while longer on the ordering):
Basic debugging tools and how they work- a debugging emulator will typically have breakpoint commands, and break on write is the most valuable one. Here you say "if this portion of memory is written to, stop emulation and tell me what caused it". This often leads immediately to the instruction or part of the ROM you want to look at. If compression is involved you will probably get the portion of memory the data was copied to so as to be decompressed, but that just means you run it again on that location and probably get where you want. Coincidentally you have just bypassed the compression, and maybe figured out how it works if it is custom, making you very valuable to your team.
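To make the break on write idea concrete, here is a minimal Python sketch. It is not how any particular emulator does it; the class name WatchedMemory, the method names and the addresses are all made up for illustration, and a real debugger would halt emulation where this one just logs the hit.

```python
class WatchedMemory:
    """Toy memory that notes which 'instruction' wrote to a watched address."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.watch = set()   # addresses that have a write breakpoint set
        self.hits = []       # (pc, address, value) for every triggered break

    def break_on_write(self, address):
        self.watch.add(address)

    def write8(self, pc, address, value):
        # pc stands in for the address of the instruction doing the write
        if address in self.watch:
            self.hits.append((pc, address, value & 0xFF))
        self.data[address] = value & 0xFF


mem = WatchedMemory(0x100)
mem.break_on_write(0x40)                              # e.g. where a lives counter sits
mem.write8(pc=0x08000120, address=0x10, value=7)      # not watched, nothing logged
mem.write8(pc=0x08000344, address=0x40, value=0x63)   # watched, logged
print([(hex(pc), hex(addr), val) for pc, addr, val in mem.hits])
```

The logged pc is the thing you care about: it is where you go and look in the disassembly.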
The bootup sequence and if it is different (say for a modern console that has a menu) the code loading sequence for your console of choice.
What a register is and how many of them there are in the CPU you care about. Broadly speaking you will have general purpose ones and ones with a specific task (usually a pointer to the next instruction, a return address and maybe a flag register holding things like sign, carry, overflow and other such things).
Learn the memory layout including where any IO is (controllers will tend to
Learn what DMA (direct memory access) is. You can pipe everything through the CPU, but that would be unbearably slow in many cases, not to mention taking up valuable CPU time, so most devices outside the simplest ones will be able to trigger a memory transfer that does not go through the CPU.
Learn what interrupts are and which are available to you (you can check to see if something has happened every cycle, but it is better if the game knows to interrupt whatever it is doing and focus on the next thing when something happens).
Learn what the stack is. In short it is a step up from the registers in size (or a step down in speed) which can hold values that are not ready to be written back to their proper place in memory yet but do not need to be sitting taking up a register. On most systems it is just a region of ordinary memory tracked by a stack pointer register.
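A rough Python sketch of the stack idea, assuming a stack that grows downward in memory (common, but not universal). The Stack class and the register names are made up for illustration rather than taken from any real system.

```python
class Stack:
    def __init__(self, memory_size=64):
        self.memory = [0] * memory_size
        self.sp = memory_size        # stack pointer, starts just past the end

    def push(self, value):
        self.sp -= 1                 # assumption: stack grows toward lower addresses
        self.memory[self.sp] = value

    def pop(self):
        value = self.memory[self.sp]
        self.sp += 1
        return value


registers = {"r0": 0x1234, "r1": 0}
stack = Stack()

stack.push(registers["r0"])      # park the value, freeing r0
registers["r0"] = 99             # use r0 for something else
scratch = registers["r0"] * 2
registers["r0"] = stack.pop()    # get the original value back

print(hex(registers["r0"]), scratch)
```

Every push needs a matching pop; if the stack pointer does not end up where it started, whatever gets popped next will be the wrong value.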
Learn what an operand is and what limitations operands have. The ARM processor can read and write memory with dedicated load and store instructions but can not use a memory location as an operand of its other instructions, unlike the X86 which can and frequently does. Generally speaking there are three classes of operand: other registers (take the contents of R2 and add the contents of R1 to them), immediates (add 45h to the contents of R1, where you might then have to declare a destination as well) and memory locations, which we already talked about.
Learn how an instruction and its operands are written in your assembler of choice- typically this is something like instruction, destination, source and immediate value, but that order can be anything as far as the assembler goes and it often does change between assemblers.
Appreciate any restrictions on the processor and the operating modes- X86 notably has stacked registers: the 16 bit register is a subset of the 32 bit register, meaning that if you overwrite a 32 bit register the equivalent 16 bit one also gets overwritten.
ARM in the case of the GBA and DS has THUMB mode, which uses 16 bit instructions at the cost of having only some registers available to a lot of the instructions in it.
If it was not already something you knew, learn what big and little endian mean.
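For the X86 register overlap mentioned above, a small Python sketch of one 32 bit register with a 16 bit view onto its low half (think EAX and AX). The class Reg32 and its attribute names are made up; it only models the overlap, nothing else about the processor.

```python
class Reg32:
    """One 32 bit register with a 16 bit window onto its low half."""

    def __init__(self, value=0):
        self.e = value & 0xFFFFFFFF      # the full 32 bit register

    @property
    def x(self):
        return self.e & 0xFFFF           # the low 16 bits, same storage

    @x.setter
    def x(self, value):
        # writing the 16 bit part leaves the upper 16 bits alone
        self.e = (self.e & 0xFFFF0000) | (value & 0xFFFF)


eax = Reg32(0x11223344)
low_before = eax.x          # 0x3344

eax.e = 0xAABBCCDD          # overwrite the 32 bit register...
low_after = eax.x           # ...and the 16 bit view changed with it

eax.x = 0x0001              # overwrite only the 16 bit part
print(hex(low_before), hex(low_after), hex(eax.e))
```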
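Endianness is easiest to see with a quick example. This Python snippet stores the same 32 bit value both ways and reads some bytes back as you might see them in a hex editor; the value itself is arbitrary.

```python
value = 0x12345678

little = value.to_bytes(4, "little")   # least significant byte first
big = value.to_bytes(4, "big")         # most significant byte first

print(little.hex())   # 78563412
print(big.hex())      # 12345678

# Reading bytes from a hex editor back into a number on a little endian system
raw = bytes.fromhex("78563412")
decoded = int.from_bytes(raw, "little")
print(hex(decoded))   # 0x12345678
```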
On instructions, you can learn all of them but as mentioned a subset of key ones or their classes is better.
push and pop - push puts a register's value on the stack, freeing the register to do something else, and pop returns the value afterwards.
Mov - in most cases this copies a value, either an immediate in the instruction or the contents of a register, to another register (note that it is a copy as opposed to a move, which would imply the original location is reset to 0, and most mov instructions do not do that).
NOP - an instruction that does nothing. If you have to overwrite an instruction or two (say a branch for an anti piracy check), NOP the thing and it will be as if it never happened (granted it can get a bit more complex than that). Because the NOP takes up the same space, you will not have to redo the rest of the ROM on account of it now being 32 bits or something out of line from the original.
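A minimal Python sketch of NOPing something out in place. The helper name nop_out and the three word "ROM" are made up for illustration. The default NOP here is the ARM encoding of mov r0, r0 (0xE1A00000, stored little endian), which is a common do-nothing idiom in ARM mode; other processors and THUMB mode need different bytes.

```python
ARM_NOP = bytes.fromhex("0000A0E1")   # mov r0, r0 as it sits in a little endian ROM


def nop_out(rom, offset, length, nop=ARM_NOP):
    """Return a copy of rom with length bytes at offset replaced by NOPs."""
    if length % len(nop) != 0:
        raise ValueError("length must be a whole number of NOP instructions")
    patched = bytearray(rom)
    patched[offset:offset + length] = nop * (length // len(nop))
    return bytes(patched)


# Three placeholder 32 bit words standing in for instructions
rom = bytes.fromhex("0100A0E3" "FEFFFF1A" "0200A0E3")
patched = nop_out(rom, offset=4, length=4)   # knock out the middle one

print(rom.hex())
print(patched.hex())
```

The ROM is the same length afterwards and everything after the patch is still at its original offset, which is the whole point.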
Classes of instruction
The ones already mentioned probably fall into the CPU/memory management class but they are vital.
Add. Does what it says but there might be extra ones to account for signed values or to add a chain of them or to add a register to a register along with an immediate value.
Subtract. See add above but replace with subtract.
Boolean logic and bitwise operations. The power of boolean logic is undeniable and as such most processors have abilities here (although you may be limited in the NOR, XNOR and NAND department, but that is easily worked around) and most will also have bitwise operations (shifts, rotates and maybe a flip).
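As a sketch of working around missing NAND, NOR and XNOR, here they are in Python built from the AND, OR, XOR and NOT that nearly everything has, plus a rotate. I am assuming 32 bit registers, hence the mask; the function names are my own.

```python
MASK = 0xFFFFFFFF   # assumption: 32 bit registers


def nand(a, b):
    return ~(a & b) & MASK


def nor(a, b):
    return ~(a | b) & MASK


def xnor(a, b):
    return ~(a ^ b) & MASK


def rotate_right(value, count, bits=32):
    """Bits that fall off the right end come back in on the left."""
    mask = (1 << bits) - 1
    count %= bits
    value &= mask
    return ((value >> count) | (value << (bits - count))) & mask


print(hex(nand(0xF0F0F0F0, 0xFF00FF00)))   # 0xfff0fff
print(hex(rotate_right(0x00000001, 1)))    # 0x80000000
print(hex(0x1 << 4))                       # 0x10, a plain shift left
```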
Multiply. More or less the same as the other maths but naturally you are more likely to exceed the register size with a multiplication so learn how it handles that and how to handle it.
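To show what exceeding the register size looks like, a small Python sketch assuming 32 bit registers. Some processors throw away the upper half of the result, others offer a long multiply that puts the result in two registers; the function mul32_wide is my own name for the latter idea, not a real instruction.

```python
MASK = 0xFFFFFFFF   # assumption: 32 bit registers


def mul32_truncated(a, b):
    """What you get if only the low 32 bits of the product are kept."""
    return ((a & MASK) * (b & MASK)) & MASK


def mul32_wide(a, b):
    """Full 64 bit product split into (high, low) 32 bit halves."""
    product = (a & MASK) * (b & MASK)
    return product >> 32, product & MASK


print(hex(mul32_truncated(0xFFFFFFFF, 2)))              # 0xfffffffe, top bit lost
print([hex(part) for part in mul32_wide(0xFFFFFFFF, 2)])  # ['0x1', '0xfffffffe']
```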
Divide - very few consoles will have this in the main CPU (hell, the GBA and DS do not) but they will usually have a method to do it by: a coprocessor, a BIOS call or inbuilt log tables (remember 4/5=? can be rewritten as log(4) - log(5) = log(?)). This comes right back around to the every system is different thing.
Branch and compare - you can run a program from start to finish, but all but the simplest programs will become horrific if you try that, so assembly and processors allow you to jump to something and return later. Also in here is the compare and branch class of instructions, which are the processor level manifestation of the IF ELSE "loop" and other loops.
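Two ways of dividing without a divide instruction, sketched in Python. The first is the log trick from above (using the maths library where real hardware would use a lookup table). The second is shift and subtract long division, which is one common software approach; I am not claiming it is what any given BIOS does. Both function names are my own.

```python
import math


def divide_via_logs(a, b):
    """a / b rewritten as exp(log(a) - log(b)); positive inputs only."""
    return math.exp(math.log(a) - math.log(b))


def divide_shift_subtract(dividend, divisor, bits=32):
    """Unsigned long division, one bit at a time. Returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("division by zero")
    quotient = 0
    remainder = 0
    for i in range(bits - 1, -1, -1):
        # bring down the next bit of the dividend
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        quotient <<= 1
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1
    return quotient, remainder


print(round(divide_via_logs(4, 5), 6))    # 0.8
print(divide_shift_subtract(100, 7))      # (14, 2)
```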
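Here is how a loop falls out of compare and branch, as a toy interpreter in Python. The mnemonics (mov, add, sub, cmp, bne) are a made up mini instruction set loosely in the style of real ones, and the single zero flag is a simplification of a real flag register.

```python
def run(program):
    regs = {"r0": 0, "r1": 0}
    zero_flag = False
    pc = 0                                  # index of the next instruction
    while pc < len(program):
        op, *args = program[pc]
        pc += 1
        if op == "mov":                     # mov reg, immediate
            regs[args[0]] = args[1]
        elif op == "add":                   # add reg, reg
            regs[args[0]] += regs[args[1]]
        elif op == "sub":                   # sub reg, immediate
            regs[args[0]] -= args[1]
        elif op == "cmp":                   # cmp reg, immediate: sets the flag only
            zero_flag = regs[args[0]] == args[1]
        elif op == "bne":                   # branch if the last compare was not equal
            if not zero_flag:
                pc = args[0]
    return regs


program = [
    ("mov", "r0", 0),      # 0: total = 0
    ("mov", "r1", 5),      # 1: counter = 5
    ("add", "r0", "r1"),   # 2: total += counter
    ("sub", "r1", 1),      # 3: counter -= 1
    ("cmp", "r1", 0),      # 4: is counter 0 yet?
    ("bne", 2),            # 5: if not, go back to 2
]
result = run(program)
print(result)   # {'r0': 15, 'r1': 0}
```

Note the compare does nothing on its own other than set a flag; it is the branch that reads the flag and decides where execution goes next.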
Memory load and memory store. Fairly vital.
Documents for your chosen system and the processor(s) it uses can usually be found in the docs section of this site (it being the mission of the site to collect such information and all), along with any debugging emulators. You can usually find another document from the processor manufacturer as well (ARM, Motorola, Intel and to a slightly lesser extent IBM pretty much have the processor markets sewn up, especially as far as consoles go, and are very open with their stuff).