Romhacking.net

Romhacking => Programming => Topic started by: jonk on May 17, 2016, 12:44:23 pm

Title: 65816: Direct Page vs Absolute Operand
Post by: jonk on May 17, 2016, 12:44:23 pm
Although I'm familiar with assembly generally and, decades ago, wrote 6502 code for the Apple II, I am a bit conflicted upon reading the WDC manual and some disassembly examples as they relate to direct page and absolute operands. Take the following example as input to an assembler:

Code: [Select]
LABEL EQU $50
      LDA LABEL

Would you interpret that as a direct page operand? Or an absolute operand? Would it make any difference to you if I wrote it this way:

Code: [Select]
LABEL EQU $0050
      LDA LABEL

(Or "LABEL = $50", if you prefer.) I'm pretty sure I know what the responses will be regarding this following example:

Code: [Select]
LABEL EQU $1A50
      LDA LABEL

But I'm wondering if the assembly writer has any control here. Is it simply the case that the assembler examines the upper byte(s) of the symbol to see if it should be taken one way, or another? It seems to me that I should be able to force the interpretation to go the way I want it to go, regardless of the assemble-time or link-time value of the symbol. But I'm curious what experiences people have, with various assembly tools, on this narrow point.

A similar question might be posed regarding the 'absolute long' mode available with the 65816, as well. But I'll hold short of that, for now. Answering the above question may be sufficient.

(I apologize for being a little lazy, today. But I'd rather not go out and install a variety of different assembler tools, learn how to use them, write code to test the above idea, and find out that way... when someone here may be able to quickly tell me what their experience says already about this. I did do a bit of googling this already, without finding anything specific enough to be sure about it.)
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: Revenant on May 17, 2016, 12:55:58 pm
Most of the time a two-digit literal address (like $50) would be interpreted as a direct page address rather than a normal absolute one. However, most assemblers have a way to specify a specific address size (e.g. "lda.w $50" would read the absolute address $0050, and likewise "lda.l" for a long address). The syntax varies from one assembler to another.
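
For example, the same operand assembled three ways (the exact suffix syntax and the defaulting rules vary from assembler to assembler, so treat this as a sketch):

Code: [Select]
lda $50      ; usually taken as direct page:        A5 50
lda.w $50    ; forced absolute (16-bit operand):    AD 50 00
lda.l $50    ; forced long (24-bit operand):        AF 50 00 00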
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: Disch on May 17, 2016, 12:56:31 pm
Most assemblers allow the programmer to explicitly determine the size of the instruction.  I'm not sure how WDC does it.

But unless the programmer explicitly specifies, it's up to the assembler to determine the size.  And many assemblers are pretty lousy in this area and don't give the programmer the control they should to take advantage of direct page effectively. Which is partly why I wrote my own a while back (http://www.romhacking.net/forum/index.php/topic,20086.msg283058.html#msg283058) that I never really released.
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: jonk on May 17, 2016, 02:14:30 pm
Most of the time a two-digit literal address (like $50) would be interpreted as a direct page address rather than a normal absolute one.
That would be a problem when using symbols which may be defined elsewhere and might not be fully determined until link-time. The assembler wouldn't even know the size of the instruction at assembly time!

However, most assemblers have a way to specify a specific address size (e.g. "lda.w $50" would read the absolute address $0050, and likewise "lda.l" for a long address). The syntax varies from one assembler to another.
Hmm. Okay. So, some assemblers (not necessarily all) have a way of overriding an otherwise possibly ambiguous interpretation? How exactly then can they compute the appropriate DP-relative value, given ONLY a label and NO INFORMATION at all about the current DP value? Just curious.



May 17, 2016, 02:26:15 pm - (Auto Merged - Double Posts are not allowed before 7 days.)
Most assemblers allow the programmer to explicitly determine the size of the instruction.  I'm not sure how WDC does it.
But it's not exactly the 'size', though of course that is implied. It's really the semantic meaning of the operand.

For example, let's assume we are talking about a symbolic assembler and the symbol under discussion is a 24-bit address. This may be accessible via DP, or not. The assembler would need to be told the current value of the DP (as known by the code writer, of course) in order to figure this out. In fact, it would need to know that in order to figure out the remaining offset value. So an assembler would need a pseudo-op to tell it the current DP value. Forcing the assembly writer to keep track of the relative offsets seems pretty stupid to me, in fact.

The assembler might, knowing DP and knowing the symbolic label value, find that the label cannot be reached through the DP value, but that, perhaps, it can be reached via bank 0. In that case, an absolute address would be perfectly fine. Not being able to reach the label either by DP or by bank 0 should, of course, produce an error.

I'm a little bit bothered by what I'm seeing in the disassemblers. But I'm even more bothered by what I see as an apparent complete lack of a mechanism by which to keep the assembler up to date on the important register values (DP, for example.) It seems to be impossible for an assembler to accept a valid label (all of which technically possess a 24-bit address) and then compute the DP-relative offset for it so that a valid LDA via DP can be encoded and applied.
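
To make the idea concrete, here is the sort of thing I have in mind (the ASSUME pseudo-op is purely hypothetical here, and the addresses are just examples):

Code: [Select]
        ASSUME  DP = $1A00      ; hypothetical pseudo-op: tell the assembler what D holds
COUNTER EQU     $1A42           ; a bank-0 location that happens to fall inside $1A00-$1AFF

        LDA     COUNTER         ; assembler computes $1A42 - $1A00 = $42 and emits A5 42
        LDA     $1C00           ; outside the current DP window, so fall back to absolute: AD 00 1C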

But unless the programmer explicitly specifies, it's up to the assembler to determine the size.  And many assemblers are pretty lousy in this area and don't give the programmer the control they should to take advantage of direct page effectively. Which is partly why I wrote my own a while back (http://www.romhacking.net/forum/index.php/topic,20086.msg283058.html#msg283058) that I never really released.
I'm not even certain, though, that an assembler can even COMPUTE the correct DP-relative offset, given a label. The programmer, obviously, can "hand-calculate" the offset. But that is STUPID to require in all cases. It may be common practice. But it is still stupid. The assembler should be able to tell, at assembly time, if the label is reachable by the DP and, if so, what the appropriate offset is, relative to the DP, so that it can encode the instruction correctly. The programmer shouldn't have to go to some piece of paper, or have to remember.

I'm thinking about a fully relocating assembler here.
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: Disch on May 17, 2016, 02:49:39 pm
But it's not exactly the 'size', though of course that is implied. It's really the semantic meaning of the operand.

To-may-to/to-mah-to.  Direct Page / Absolute / Long all have the same functionality; they just specify a different number of bits of the target address... with the remaining bits filled in by registers.
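
Roughly, for an LDA (with D and DB being the direct page and data bank registers):

Code: [Select]
; direct page:  1-byte operand, effective address = $00:(D + operand)
; absolute:     2-byte operand, effective address = DB:operand
; long:         3-byte operand, full 24-bit address taken from the instruction itself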

Quote
Forcing the assembly writer to keep track of the relative offsets seems pretty stupid to me, in fact.

I agree.  That's why I put direct page directives in my assembler (Schasm).  It is sorely lacking in others.

Quote
Not being able to reach the label either by DP or by bank 0 should, of course, produce an error.

Agreed.  You actually are summing up all the reasons I wrote my own assembler.  =P  It was primarily to address all the same things you're talking about.  None of the existing assemblers seemed tailored for developers focusing on symbols rather than raw addresses.

Quote
I'm not even certain, though, that an assembler can even COMPUTE the correct DP-relative offset, given a label.

It's easy if the label precedes the reference (so the address is known at the time of the reference), but gets surprisingly complicated if the label is AFTER the reference (since the address depends on, among other things, the size of the very instruction you're trying to assemble).  I think I settled on defaulting to absolute for unresolved labels and erroring if it's discovered they needed long addressing once the label is resolved.


You're actually making me want to pick up development of Schasm again.  I'm very interested in having a good cross assembler for the 65xx series that supports symbolic debugging and isn't extraordinarily clunky like many others are.  I stopped development because I was unhappy with how I approached macros and didn't want to redo it... but if you're interested, maybe we could hash out ideas for a feature set for an assembler?
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: jonk on May 17, 2016, 03:03:18 pm
You're actually making me want to pick up development of Schasm again.  I'm very interested in having a good cross assembler for the 65xx series that supports symbolic debugging and isn't extraordinarily clunky like many others are.  I stopped development because I was unhappy with how I approached macros and didn't want to redo it... but if you're interested, maybe we could hash out ideas for a feature set for an assembler?
Yes. Count me in, if you'd like any help.

I can pick up any of the parts you don't want to deal with, with your existing tool. For example, I'm quite adept at expression parsing, symbolic algebraic manipulation, and application of rules towards optimization/simplification. That can be a separable part. But you may already have all you need there, too. No idea.

Or we can actually just do a ground-up design, if you'd like. Personally, I'd like an assembler/linker system that provides fully abstracted "segments" which support code, data, and even can overlap or overlay segments on top of each other (in effect, a segment 'union'.) At the link phase, I'd like to be able to place code as I see fit, without regard to where the assembler "thought" it belonged. (In short, I'd like to be able to assemble code "overlays" which can be banked into the address space through mapping hardware.) Macro processing is a bit of a question right now for me. I'd like an invoked macro to be able to switch from the current code segment to a previous data segment, drop down some data bytes with a macro-generated label, and then pop back up into whatever the current code segment happens to be for adding code that references this newly generated data label. There are dozens, if not hundreds, of other small details that can make a huge difference in coding practice. I'm still on the fence regarding link-time instruction sizing -- pc-relative branching is an example, where one isn't forced to make a decision about BRA vs BRL while writing code, but instead allows the assembler to determine which one will work. I'm okay not having the linker do that. But I'd like to explore it (and a few other related options) just to make sure it's the right decision to not do this at link time.
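
(For the branch case, the choice being deferred is just between these two encodings:)

Code: [Select]
        BRA near_label   ; 2 bytes, signed 8-bit offset: reaches roughly +/-127 bytes
        BRL far_label    ; 3 bytes, signed 16-bit offset: reaches anywhere in the current bank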

On the "clunky" point you make, that's a matter we'd need to discuss. I'd very much like to press towards something elegant and as simple as possible to understand and use well. I want people who don't know how to use assemblers well to be able to pick it up and run with it. But I'd also like to have features targeted for those who have grown past the early learning stages and need support for truly large projects, too. I don't know exactly what that may mean, yet. I'm still just wrapping my mind around all this.

(Something very, very strange has occurred to me, too: should an assembler macro processor, let's say, be able to read data bytes directly from a ROM being "patched" in order to 'modify' bit-mapped table values? This comes up in patching DQ3, for example, where I provide an example of extending the 'gold' display to also show the field/town X and Y location. You know you want to increase the height by two and to move it upward on the screen by two, but you don't care where it is currently located or what its current size is -- you just want to examine, extract, and then modify those bit fields and you don't want to "hard code" knowledge of the screen locations or the display box size in your assembler source code. It's really odd. I admit it. But it crosses my crazy mind, anyway.)

Regardless, sure... count me in if you'll have me.
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: Disch on May 17, 2016, 03:49:49 pm
Yes. Count me in, if you'd like any help.

Most of what I'd want help with is ideas for practical features.  It's probably easier for me to crank out the code myself than it would be to figure out how to split it across two developers (at least for something of this complexity).

Quote
Or we can actually just do a ground-up design, if you'd like.

I'd prefer this.  I can borrow pieces from the old assembler and work them into a new one -- but it would be best to start with a fresh design that has all the features we'd want.

Quote
Personally, I'd like an assembler/linker system that provides fully abstracted "segments" which support code, data, and even can overlap or overlay segments on top of each other (in effect, a segment 'union'.) At the link phase, I'd like to be able to place code as I see fit, without regard to where the assembler "thought" it belonged. (In short, I'd like to be able to assemble code "overlays" which can be banked into the address space through mapping hardware.)

I have a few concerns with this approach:

-  It makes the assembler much more difficult to use.  ca65 is a great example of this -- that is a fantastic assembler, but you can't just write a file and assemble it.  You have to juggle config files, set up segments, and actually doing a full build is a multi step process.  It's much more complicated than it needs to be, and it turns a lot of people away from using it.

-  It complicates "code injection" style assembling.  xkas is popular because you can write a small file, and run it to assemble and inject that code into an existing ROM.  Practical, simple, and easy.  Introducing a linker phase almost makes the assumption that the linker has full control over the output file -- which is a bad assumption to make.

-  I'm not sure the benefits are worth it.  Having relocatable segments is a very specific feature for something that could very easily be done in the assembler by way of directives.  Give them an ORG directive to indicate the PC, give them an OFFSET directive to indicate a file offset.  If they want to move stuff around, it's as simple as tweaking a constant in their code.  No need for linker bloat.


Linkers make sense for HLLs but I think they're overkill for this kind of assembler.

Quote
Macro processing is a bit of a question right now for me.

I'll have to reread docs of how existing assemblers handle macros.  I originally took an overly restrictive approach where each macro had to take a fixed number of arguments, and the arguments had to be complete symbols.

Quote
I'd like an invoked macro to be able to switch from the current code segment to a previous data segment, drop down some data bytes with a macro-generated label, and then pop back up into whatever the current code segment happens to be for adding code that references this newly generated data label.

Pushing & popping assembler settings was something I definitely added in my first go of Schasm.  So yeah I see the value here.  Really macros are just text substitution with some scoping rules.  Any features you want them to have apart from that could probably be accomplished through other directives.




My previous assembler is here:   https://www.dropbox.com/s/wsivt454ucd884q/schasm_alpha002_2015_07_16.zip?dl=0

Documentation is included.  If you want to review the docs and give feedback as to improvements that would be a big help.  The only real big changes I can think of would be:

- Redo macros completely.  Look into macros accepting a variable number of arguments
- Add a directive to repeat a block of code a given number of times
- Support for structs
- Support for fixed-point  (need good syntax for this)
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: jonk on May 17, 2016, 04:10:34 pm
Yes. Count me in, if you'd like any help.
Most of what I'd want help with is ideas for practical features.  It's probably easier for me to crank out the code myself than it would be to figure out how to split it across two developers (at least for something of this complexity).
Sounds fine. I can certainly kibitz with the best of them.  ;)

Or we can actually just do a ground-up design, if you'd like.
I'd prefer this.  I can borrow pieces from the old assembler and work them into a new one -- but it would be best to start with a fresh design that has all the features we'd want.
Your call. I'm just going to be kibitzing. :)

Personally, I'd like an assembler/linker system that provides fully abstracted "segments" which support code, data, and even can overlap or overlay segments on top of each other (in effect, a segment 'union'.) At the link phase, I'd like to be able to place code as I see fit, without regard to where the assembler "thought" it belonged. (In short, I'd like to be able to assemble code "overlays" which can be banked into the address space through mapping hardware.)
I have a few concerns with this approach:

-  It makes the assembler much more difficult to use.  ca65 is a great example of this -- that is a fantastic assembler, but you can't just write a file and assemble it.  You have to juggle config files, set up segments, and actually doing a full build is a multi step process.  It's much more complicated than it needs to be, and it turns a lot of people away from using it.

-  It complicates "code injection" style assembling.  xkas is popular because you can write a small file, and run it to assemble and inject that code into an existing ROM.  Practical, simple, and easy.  Introducing a linker phase almost makes the assumption that the linker has full control over the output file -- which is a bad assumption to make.

-  I'm not sure the benefits are worth it.  Having relocatable segments is a very specific feature for something that could very easily be done in the assembler by way of directives.  Give them an ORG directive to indicate the PC, give them an OFFSET directive to indicate a file offset.  If they want to move stuff around, it's as simple as tweaking a constant in their code.  No need for linker bloat.

Linkers make sense for HLLs but I think they're overkill for this kind of assembler.
There is a sneaky way to handle this. Each source file can be treated as a "segment." The coder won't even need to know it is happening, if they don't care to know. I have an example of this to show, if you care to see it. But I'm also perfectly willing to see what you already have (you mention some things I should read) and see how that applies to what I care about. If it doesn't help me, it doesn't. If it does, then fine. I move forward and don't look back.


Macro processing is a bit of a question right now for me.
I'll have to reread docs of how existing assemblers handle macros.  I originally took an overly restrictive approach where each macro had to take a fixed number of arguments, and the arguments had to be complete symbols.
Okay. Look that over. I can easily provide some "tough cases" for you to consider supporting. Just to push you a bit. :)


I'd like an invoked macro to be able to switch from the current code segment to a previous data segment, drop down some data bytes with a macro-generated label, and then pop back up into whatever the current code segment happens to be for adding code that references this newly generated data label.
Pushing & popping assembler settings was something I definitely added in my first go of Schasm.  So yeah I see the value here.  Really macros are just text substitution with some scoping rules.  Any features you want them to have apart from that could probably be accomplished through other directives.
The directives/pseudo ops should be able to be pushed/popped, where it makes sense to do so. You might, for example, push the current state of the accumulator size or index size. Macros may need this. But since you aren't thinking towards segments perhaps this isn't anything you'd see a reason to care about.


My previous assembler is here:   https://www.dropbox.com/s/wsivt454ucd884q/schasm_alpha002_2015_07_16.zip?dl=0

Documentation is included.  If you want to review the docs and give feedback as to improvements that would be a big help.  The only real big changes I can think of would be:

- Redo macros completely.  Look into macros accepting a variable number of arguments
- Add a directive to repeat a block of code a given number of times
- Support for structs
- Support for fixed-point  (need good syntax for this)
Thanks. I'll see about helping push your envelopes. ;) I definitely WANT something for structs! So put a special mark on that one!


As a final note, just a comment on expression analysis. Your parser should be able to recognize that (LABEL1 - LABEL2) isn't an address, but instead is a constant representing the span. Adding a span to an address label is another valid address label. Adding two memory address labels is "bad." At least, if not otherwise compensated later by a subtraction. Etc. Semantic context during expression analysis is helpful. I usually keep track of such semantic details during expression analysis and simplification.
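
For instance (data directive syntax aside):

Code: [Select]
table_start:
        db  $01, $02, $03, $04
table_end:

TABLE_LEN EQU table_end - table_start     ; address - address = a plain span (here, 4)
        LDA table_start + TABLE_LEN - 1   ; address + span = another valid address
;       LDA table_start + table_end       ; address + address = meaningless; worth flagging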



May 17, 2016, 04:30:31 pm - (Auto Merged - Double Posts are not allowed before 7 days.)
I'd love to see how you handle those for operand expressions. For example, may I do this?

Code: [Select]
      ASSUME    DP = MyTable
      LDA          MyTable[5 * SIZEOF MyTableEntryStruct].EntryMember

And have it compute a DP-relative expression, (5 * SIZEOF MyTableEntryStruct + OFFSET MyTableEntryStruct.EntryMember), as the operand value for an 0xA5 LDA instruction?

Structs are cool!

Hmm. Bit-fields, too! That means we'd need a MASK operator to automatically generate the mask bits for the bit fields. I'm loving this, already!
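
Something like this, maybe (completely made-up syntax, just to sketch the idea):

Code: [Select]
DISPBOX STRUCT              ; hypothetical bit-field syntax
ypos    BITS 5              ; bits 0-4
height  BITS 3              ; bits 5-7
        ENDSTRUCT

        LDA boxdef
        AND #MASK DISPBOX.height    ; MASK would expand to %11100000 for this field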



May 17, 2016, 05:02:53 pm - (Auto Merged - Double Posts are not allowed before 7 days.)
I don't have a dropbox account and really don't want to set one up, either. You open to just sending me the file, directly?
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: Disch on May 17, 2016, 05:23:45 pm
I had to look up what 'kibitzing' was.  Hah.

---------
Regarding a linker phase:

I can see the value in separating the build process into multiple steps for large projects, so you don't have to assemble the entire file each time.  An interface for that is worth working in -- and such an interface would imply intermediate object files.

The question then becomes... does the linker decide what address/offset to put code at?  Or does the assembler?

I shudder at ca65's approach.  It has an ORG directive, but it's effectively useless because the linker determines the ultimate address.  And I also want something like ORG to be available because it's familiar and simple.  So I would lean towards everything being determined by the assembler.  Object files would consist of the final assembled output, as well as a list of exported symbols, and a list of symbols that need to be imported and "plugged into" the assembled code.

Furthermore, I would not want the two-step process to be the default behavior for the assembler.  I would imagine most use cases are going to be one-shot (even one-file) assemblies.

If we don't want to go with that idea, how would you want the linker to determine the final address/offsets?


---------
Regarding structs:

Syntax for structs is weird, and I'm not sure how I'd want to tackle it.  A lot of interfacing with structs requires the coder to manually compute an index (typically by left-shifting the ID several times):

Code: [Select]
; assuming the struct is 2 bytes wide
;
; data stored interleaved as:  foo/bar/foo/bar/foo/bar

lda desired_index
asl A
tax

lda mystruct.foo, X
sta mystruct.bar, X

Personally I hate structs and never use them in assembly, since they require that additional shifting which also means larger indexes, more page crosses, and often padding to space the struct to a convenient size which would otherwise be completely unnecessary.

I've never found a situation where I would prefer a struct to storing each field in their own array.  Something like this:

Code: [Select]
; no struct, data stored in own array, such as:   foo/foo/foo/bar/bar/bar

ldx desired_index
lda foo,X
sta bar,X


This is why I didn't support them initially.  But apparently structs are a popular feature?  Barf.

I'm lost on this one.  Since this is not something I would use, I have no idea what kind of syntax I would want for it.  How would you want the syntax for structs to look?




----------
For bit fields:

???   huh???



EDIT:

You shouldn't need a dropbox account to download the file.  Just follow the link and click the download button.

Unless dropbox changed?

But yeah I don't mind sending it by email or something.  Send me a PM with your email address and I can shoot it over.




EDIT 2:

Also I forgot to mention this:

My assembler didn't make a distinction between labels and other numeric constants.  As far as it's concerned they're all just numbers, so adding two labels together would be meaningless but would be completely legal.

Adding what is effectively type safety to numeric values seems like overkill.  What benefit would it have other than guarding the very rare mistake of adding two labels?
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: jonk on May 17, 2016, 06:12:08 pm
I had to look up what 'kibitzing' was.  Hah.
Hehe. I love it when someone else is willing to do all the work and just lets me sit in the back seat and kibitz about their driving habits. ;) Frees me up to be truly annoying.

Regarding a linker phase:

I can see the value in separating the build process into multiple steps for large projects, so you don't have to assemble the entire file each time.  An interface for that is worth working in -- and such an interface would imply intermediate object files.
A few weeks ago, I would have agreed with this. But then I saw the Merlin32 assembler code. The darned thing loads up all of the assembly source files -- and I mean ALL OF THEM -- and linked-list chains up the lines in each of them. It then linked-list chains up groups of them into segments. (You can include multiple ASM files in a grouped segment.) It then linked-list chains up the groups into the total project. It goes through each of the individual lines, assigning addresses and tossing out opcodes and the like. Each line is its own tiny "code byte stream." At the total project level, he has another chained list of code and data patches, that the assembly-phase added when it couldn't resolve things then, so that the link phase knows what to "patch in" later on. Once all that is done, each of the individual lines has a fully patched in and decoded "byte stream." (The instructions are never longer than 4 bytes there. But the data can be longer, of course.) The whole thing doesn't care about the physical memory system, or how any of that maps to a ROM, at this phase. It's just a whole lot of individual lines, each with their own ORG address plus a tiny strip of attached data to it. It's only at this point that his code decides how to do output. Note that you can ORG to some place, generate a little bit of code, ORG to somewhere else, generate a little more, etc. So far as these structures care, it doesn't matter if you have 100 separate lines all with the same ORG and different data. It just doesn't matter because none of it understands anything about the memory address space or the ROM. It's just "ORG + BYTES" everywhere.

I had a very easy time jumping into that thing and making it patch a ROM file directly and to support any kind of mapping hardware.

I'm not suggesting you consider using that thing. I'm just pointing out that it didn't need to support separate compilation or object file formats. It's really easy from a user point of view because they don't have to know about separate compilation or object files. So far as they know, they have a project with some ASM files and somehow it all just works right. No binary object files are generated. They just need to have a basic project file listing the sources. (Kind of a make file, I suppose.) The rest just happens.

The question then becomes... does the linker decide what address/offset to put code at?  Or does the assembler?
Yeah. That's the question. I can offer only some modest thoughts.

Since symbols can be external, expressions involving them can only be resolved at link-time. If you are going to support externals, then I think you are stuck with that fact. This would mean to me that while the assembler can do some of the constant folding semantics of an expression, so long as the expression carries any external reference in it, you have to defer its final computation until link-time. That implies retaining a reduced expression tree. (The expression may involve two or more externals, so what choice do you have then?)

Also, if you are going to support letting the assembler "know about" the DP value, it's possible that the DP expression (the one used in the 'assume' directive the assembler uses to keep track of DP) itself contains externals! So, again, the assembler can't possibly know at assembly-time how to use the assumed-known DP in the context of an LDA instruction that may also reference that external, for example, where the two need to be 'differenced' later to compute the LDA value. So once again, that has to be deferred until link-time. All the assembler can do is to correctly construct the expression tree to be resolved at link-time.
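
Something along these lines, say (EXTERN and ASSUME being hypothetical directives):

Code: [Select]
        EXTERN  PlayerState          ; defined in another module; value known only at link-time
        ASSUME  DP = PlayerState     ; the DP expression itself contains the external

        LDA     PlayerState+2        ; all the assembler can do is record the expression
                                     ; (PlayerState+2) - DP, and let the linker evaluate it,
                                     ; check that it fits in one byte, and patch the operand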

Note that if you do this right and well, a user won't even know all the trouble you are going to, here. All they see is that this somehow always "just works right." And they shouldn't have to care, either. It should just work right.

I shudder at ca65's approach.  It has an ORG directive, but it's effectively useless because the linker determines the ultimate address.  And I also want something like ORG to be available because it's familiar and simple.  So I would lean towards everything being determined by the assembler.  Object files would consist of the final assembled output, as well as a list of exported symbols, and a list of symbols that need to be imported and "plugged into" the assembled code.
At some point, I'd like to expose you to that bit of "hacking" I did to the Merlin32 tool to make ASMPATCH. Not because I want you to use it, but because I want you to see how easy it is to specify a patch to a ROM so that you can see if there is anything useful to learn from such examples. I don't know if there is, actually. I'd just like to hear your opinion after seeing an example or two. It might shake out an idea from you.

Furthermore, I would not want the two-step process to be the default behavior for the assembler.  I would imagine most use cases are going to be one-shot (even one-file) assemblies.
Yes, I'd like to see something very, very easy to use in the common case of just one assembly source file with patches in it. That's the example I give on the ASMPATCH web link, in fact. Very simple to do. Just works.

If we don't want to go with that idea, how would you want the linker to determine the final address/offsets?
Well, let's exchange more thoughts before I answer here. If it is deferred to link-time, I think it would be "hacked" if there is no expression tree to process at link-time. The Merlin32 tool I started out modifying is really dumb, this way. It ONLY knows about one external and one constant offset to it. And it is really kludgy, as a result. I don't like that. So that would seem to spell out a reduced expression tree presented to the linker, I think. But that's only if you defer things until then. If not, none of it matters.

Regarding structs:

Syntax for structs is weird, and I'm not sure how I'd want to tackle it.  A lot of interfacing with structs requires the coder to manually compute an index (typically by left-shifting the ID several times):

Code: [Select]
; assuming the struct is 2 bytes wide
;
; data stored interleaved as:  foo/bar/foo/bar/foo/bar

lda desired_index
asl A
tax

lda mystruct.foo, X
sta mystruct.bar, X

Personally I hate structs and never use them in assembly, since they require that additional shifting which also means larger indexes, more page crosses, and often padding to space the struct to a convenient size which would otherwise be completely unnecessary.

I've never found a situation where I would prefer a struct to storing each field in their own array.  Something like this:

Code: [Select]
; no struct, data stored in own array, such as:   foo/foo/foo/bar/bar/bar

ldx desired_index
lda foo,X
sta bar,X


This is why I didn't support them initially.  But apparently structs are a popular feature?  Barf.

I'm lost on this one.  Since this is not something I would use, I have no idea what kind of syntax I would want for it.  How would you want the syntax for structs to look?


----------
For bit fields:

???   huh???
Let me think on this. I don't have a ready answer for you. But there are good examples to be found and examined elsewhere. Have you used Microsoft's MASM/ML assembler? It supports structs, masking of bit fields, and so on. It has a syntax for it. Might be worth a look.

EDIT:

You shouldn't need a dropbox account to download the file.  Just follow the link and click the download button.

Unless dropbox changed?
Well, it hassled me about setting up an account and I didn't see, right off, a way to avoid it. I'll go try, again, just to be sure.

But yeah I don't mind sending it by email or something.  Send me a PM with your email address and I can shoot it over.
If you go to this link, Patching SNES ROMs Directly from Assembly (http://www.infinitefactors.org/jonk/patch.html), my address is at the bottom of the page. If I can't get dropbox to work for me, you can send it there.

EDIT 2:

Also I forgot to mention this:

My assembler didn't make a distinction between labels and other numeric constants.  As far as it's concerned they're all just numbers, so adding two labels together would be meaningless but would be completely legal.

Adding what is effectively type safety to numeric values seems like overkill.  What benefit would it have other than guarding the very rare mistake of adding two labels?
Hmm. It's useful in understanding what is being asked during assembly. I gave an example already using that complex LDA to a struct object. There is semantic context there. But there are so many other points that need resolving first (the link-time stuff looms large, which includes a lot of lingering questions still) that I really feel this can wait until I understand your direction better. It would be just "made up," right now, without much context and probably just fall on deaf ears. When I better understand your direction, I may be able to put something interesting into that context, then.

May 17, 2016, 06:16:44 pm - (Auto Merged - Double Posts are not allowed before 7 days.)
About dropbox. GOT IT!  I see what I did wrong. Looks good and I was able to pull it down. Thanks!
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: Disch on May 17, 2016, 07:03:20 pm
Quote
But there are so many other points that need resolving first (the link-time stuff looms large, which includes a lot of lingering questions still) that I really feel this can wait until I understand your direction better.

Let's start here then.  The first question we should ask is:

- What do we want a separate linker to accomplish?  What extra functionality would it provide?



To me, the reason for having a separate linker in an assembler is the same as the reason for having one in HLLs like C/C++.  That is, project scalability.  A large project with multiple files need only re-assemble the files that were touched and not the entire project.  Relying on the linker being able to plug together all the intermediate object files.

That's more or less what ca65 uses them for (since it's actually a C compiler as well as an assembler).... but from your description of Merlin's approach, it does not seem to accomplish that at all, as it strings together all the source files before they're parsed.  So for the life of me I can't imagine what value it has.

You've mentioned something about being able to drop "segments" into different places -- but I'm having a hard time grasping what that actually means and how it would be useful.  What is a practical example?
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: jonk on May 17, 2016, 08:12:39 pm
Let's start here then.  The first question we should ask is:

- What do we want a separate linker to accomplish?  What extra functionality would it provide?

To me, the reason for having a separate linker in an assembler is the same as the reason for having one in HLLs like C/C++.  That is, project scalability.  A large project with multiple files need only re-assemble the files that were touched and not the entire project.  Relying on the linker being able to plug together all the intermediate object files.
Well, there is that. Which reminds me of a side-bar:
But separate compilation isn't the only goal. You get modules, separation of implementation details from usage, organization, and a host of things all amounting to "good code practice." It's not just a matter of saving compilation time. In fact, I don't even care about the compilation time. Our computers are so way-too-fast these days, anyway. Merlin32 is really dumb, in that regard. It loads everything into memory and has to resolve all of the symbols every time it starts up. And I don't even care. It's just ... crazy fast ... anyway. What matters is the benefits from organizing, separating, documenting individually in bite-sized chunks, and in modularizing my code. That's largely why I care about it.

That's more or less what ca65 uses them for (since it's actually a C compiler as well as an assembler).... but from your description of Merlin's approach, it does not seem to accomplish that at all, as it strings together all the source files before they're parsed.  So for the life of me I can't imagine what value it has.
Well, it reads the files one at a time and parses them. So each line is already broken into "label text," "opcode text," and "operand text." Plus an operand byte, if appropriate, a data array representing the few bytes it may need, an ORG address, and... that's about it. Well, a link-next pointer, of course. hehe.

But it was trivial to adapt this to poking ROMs, including those involving mapping hardware. It provides the usual benefits of modularization and good coding practices. But it has its own set of warts, too.

Think of Merlin32 more as an "interpreter" that loads up all the modules at once and resolves things in-place. It has all the features of separate compilation, without the complexities of creating object files no one really cares about, anyway. It doesn't have the gain in speed you are suggesting. But then, I think you may be overlooking the modularization benefits here.

Anyway, it's kind of like that, before it starts thinking about writing to some file, anyway.

You've mentioned something about being able to drop "segments" into different places -- but I'm having a hard time grasping what that actually means and how it would be useful.  What is a practical example?
Lots of examples spring to mind. But I'll avoid Harvard architectures and some oddball von Neumann situations and just focus on the 65C816 and the SNES.

Suppose you have a cartridge ROM image that uses mapping hardware (I have the SD2SNES, for example.) Suppose in this case we want to support a ROM that is 32Mbyte in size. SNES bus addresses from $C00000 to $FFFFFF will be always mapped to the first 4MByte of the 32Mbyte ROM. SNES bus addresses from $400000 to $5FFFFF will be used as an overlay area, though, allowing us to map in any 2MByte ROM segment (aligned on 2Mbyte ROM address boundaries) into that SNES address space, under our software control. Let's say we have 10 of these overlays, using up an additional 20Mbyte of our 32Mbyte ROM. (But only one of them mapped into the SNES addresses at a time.) Our cartridge also has some RAM, which is mapped into the SNES address space starting at $600000 and continuing to $7DFFFF. We have initialized data for some of that RAM that must be placed into the ROM where it is non-volatile and can be copied out into the RAM before starting the program. The variables must appear to be located in the SNES address space, while actually being mapped into the 32MByte ROM's address space for safe keeping. We also have another (up to the remaining 8Mbyte) bit of texture maps and other things also located initially in "hidden" ROM address space, but which we may map into the SNES address space temporarily.

How are you going to tell the linker about all this? I need to create 10 different segments of code, all co-located at the same SNES address space, but positioned differently into the ROM address space.

What I'm doing right now with Merlin32 handles all this without a blip. In fact, it's trivial to specify and easy to read, too. And I can move things around without touching a single line of assembly code.

But Merlin32 has a really bad expression syntax, really bad expression handling, its macros leave a lot to be desired, it doesn't support structures or bit fields, and... well... it is bad enough that I'm just this side of writing my own tool. But it got me by while working on DQ3 with my son. So it's doing its job, for now.

-------------------------

When I looked at the other tools they were hard to use, as you alluded to earlier. It takes an expert to use them. Merlin32 is really easy. But its syntax is funky and it pretends at separate compilation while being nearly completely unable to produce a decent object file of its own. (In fact, it can't. All it can do is produce a single record type in a specific OMF 2.1 file type, which is close to useless.) It has a lot of problems. So I just junked the output side of it, removing its ability to generate that stupid OMF 2.1 file and removing its ability to write out binary streams (all useless, by the way, if for some reason there is ANY discontinuity in the segment.) I then tapped into all that fabulous data it keeps for each line and just walked through all of those myself, patching the ROM directly from that data plus the ROM specification information that I added to the linker-file parsing step. The ROM specification is simple to write and use and provides all I need for now. Except, of course, that the assembler generally is pretty lousy.
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: elmer on May 17, 2016, 08:34:23 pm
But I'm wondering if the assembly writer has any control here. Is it simply the case that the assembler examines the upper byte(s) of the symbol to see if it should be taken one way, or another? It seems to me that I should be able to force the interpretation to go the way I want it to go, regardless of the assemble-time or link-time value of the symbol. But I'm curious what experiences people have, with various assembly tools, on this narrow point.

This was all standardized years ago back in the 6502 era. WDC basically took the same approach and extended it for the long addressing on the 65816.

For an immediate value ...

"<" selects the low byte of a 16-bit value.
">" selects the high byte of a 16-bit value.
"^" selects the 3rd-byte or 3rd-and-4th byte of a 32-bit value.

For an addressing mode ...

"<" forces zero-page addressing (i.e. the low byte).
">" forces absolute addressing on the 6502, or long addressing on the 65816.
"|" forces absolute addressing on the 65186.

It's up to the programmer to put things in zero-page or not and then force the correct addressing mode if it is ambiguous. The assembler will always use absolute addressing if the size/location of the label is undetermined (or perhaps long addressing on the 65816).

So "lda <myvar,y" forces zero-page addressing, and "lda myvar,y" will use absolute addressing unless the assembler already knows that the variable is located in zero-page memory.

Now ... if "modern" assemblers don't follow the standards ... then that's their problem.

SNASM658 (the expensive professional kit that I used when developing for the SNES) certainly did.

CA65 certainly supports "<", but I'm not sure about the others, because I haven't needed them yet, and so haven't looked.


That would be a problem when using symbols which may be defined elsewhere and might not be fully determined until link-time. The assembler wouldn't even know the size of the instruction at assembly time!
Hmm. Okay. So, some assemblers (not necessarily all) have a way of overriding an otherwise possibly ambiguous interpretation? How exactly then can they compute the appropriate DP-relative value, given ONLY a label and NO INFORMATION at all about the current DP value? Just curious.

They can't ... it's up to the programmer to tell the assembler what addressing mode to use based upon the programmer's knowledge of their own code and where the different linked-segments are going to reside.

It's just a basic (and simple) part of software-development on the 6502/65816 architecture.

It's also why things like the DP don't change very often ... it would be a stupid thing to do and dramatically over-complicate the software development for little benefit.


I'm thinking about a fully relocating assembler here.

IMHO, you're overthinking things here trying to apply a 32-bit programming paradigm to a system with a segmented and specialized memory layout where that paradigm doesn't fit.

Back-in-the-day we just didn't do that.

The banks/segments/org system that CA65 uses is very, very similar to the professional SNASM658 system that people used to build these games in the first place ... including using a linker.


But then I saw the Merlin32 assembler code. The darned thing loads up all of the assembly source files -- and I mean ALL OF THEM -- and linked-list chains up the lines in each of them. It then linked-list chains up groups of them into segments. (You can include multiple ASM files in a grouped segment.) It then linked-list chains up the groups into the total project. ...

Errr ... that's just plain stupid. Except for small projects.

But it's also the way that an old in-house development system that I used worked for developing 8-bit and ST/Amiga games ... until we threw the monster away and bought SNASM.


I shudder at ca65's approach.  It has an ORG directive, but it's effectively useless because the linker determines the ultimate address.  And I also want something like ORG to be available because it's familiar and simple.  So I would lean towards everything being determined by the assembler.  Object files would consist of the final assembled output, as well as a list of exported symbols, and a list of symbols that need to be imported and "plugged into" the assembled code.

Furthermore, I would not want the two-step process to be the default behavior for the assembler.  I would imagine most use cases are going to be one-shot (even one-file) assemblies.

You seem to be thinking of hacking projects.

Professional developers used lots of files, and a linker.

Here's the makefile for just the simple frontend (not the main game code) on a SNES project that got cancelled.

As such, it's a simple example from the early days of a project ...

Code: [Select]
[SnMake]

d:vectors.obj; vectors.658 shellram.658 equates.658
c:\snasm\snasm658.exe /w /z /l /b64 $! vectors.658,d:vectors.obj,,,x.tmp

d:startup.obj; startup.658 shellint.658 shellram.658 equates.658
c:\snasm\snasm658.exe /w /z /l /b64 $! startup.658,d:startup.obj,,,x.tmp

d:shellprg.obj; shellprg.658 shellram.658 equates.658
c:\snasm\snasm658.exe /w /z /l /b64 $! shellprg.658,d:shellprg.obj,,,x.tmp

d:shelldat.obj; shelldat.658 shellram.658 equates.658
c:\snasm\snasm658.exe /w /z /l /b64 $! shelldat.658,d:shelldat.obj,,,x.tmp

t7:; d:vectors.obj d:startup.obj d:shellprg.obj d:shelldat.obj
c:\snasm\snlink.exe /c /b $! @hockey.lnk,t7:,hockey.sym,hockey.map,x.tmp
!ifdef(debugstr)
c:\snasm\snbug658.exe hockey.sym
!endif

[Debug]
c:\snasm\snbug658.exe hockey.sym

[Eval]
c:\snasm\snbug658.exe /v$$$ hockey.sym
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: Disch on May 17, 2016, 09:09:57 pm
Unless I'm mistaken.... Harvard architectures are n/a here as 65xx series are strictly von Neumann, are they not?  I don't really have plans/interest in expanding this assembler to other families.

That aside, I'm with you on most of the linker functionality.  But some of it directly conflicts with other practices... particularly this bit at the end of your post:

Quote
How are you going to tell the linker about all this? I need to create 10 different segments of code, all co-located at the same SNES address space, but positioned differently into the ROM address space.

Isn't that exactly what the ORG directive is for?  Specifying the desired SNES address space?  I'm loath to remove ORG entirely due to its simplicity, effectiveness, and familiarity.  But I also don't want duplicate functionality in the linker -- otherwise you end up with the mess ca65 has on its hands (org technically exists but is virtually useless).

My approach to your SD2SNES problem would be to have separate ORG and OFFSET directives, the former specifying the SNES address as the origin for the code, and the latter indicating the file offset.  IMO this is easier to understand and doesn't require additional config files or complex cmdline args that tell the linker where it has to place that stuff -- because it's already specified in the code.

You can even map out regions of RAM by ORG'ing to $7E0000 (or wherever) and nulling the OFFSET so that nothing is actually output, but symbols are still defined.  I did this with the #var directive in my old assembler:

Code: [Select]
    #Var            name, size                      Declare a variable at given pc
   
Var is used to define variables/registers using the PC rather than by doing symbol assignments
specifically.

'name' is the name of the symbol to define, and 'size' is the size in bytes of the variable
it represents.


        #org $7E0000    ;start of SNES RAM
        #var foo, 2
        #var bar, 2
        #var baz, 2
       
The above code has the same effect as:

        foo = $7E0000
        bar = $7E0002
        baz = $7E0004
        #org $7E0006
       
The advantage to using #var over direct assignments is that it makes it easier to move variables around
and/or insert a var in the middle.


The elegance of ORG in this situation is much preferable to me personally.


Is there a situation where an ORG/OFFSET combination would not suffice?  Or would be unmaintainable?



Quote
But the point here is that the label should NOT need to be an external symbol tied to an address. I should be able to have it be a simple expression -- a pre-defined constant, for example.

Having it set as an address is something that never even crossed my mind.

I have practically ZERO type checking in my assembler.  Symbols either [ultimately] resolve to a number or a string.  Strings are used exclusively with assembler directives (incbin, etc) and cannot be used in other parts of code.  Labels resolve to a number, constants to a number, etc.

Context matters prior to evaluation so you can properly evaluate the symbol -- but once it's evaluated its type is no longer relevant as far as I'm concerned.  Hence why "Label1 + Label2" is legal, despite being nonsensical.  Both labels have to be fully evaluated before the + can be evaluated, and at that point they're reduced to numbers.  From a parser perspective this is just waaaaaay easier to do.


----------------------------------------

@ Elmer:

You seem to be thinking of hacking projects.

I'm thinking of both.  I want a 2-step assembly process with a linker -- I just don't want it to be the default behavior.  Hacking projects are much more relevant these days than from-scratch homebrews.
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: jonk on May 17, 2016, 10:02:24 pm
Okay. Let's zero in a bit. I'm not sure I understood your example code as it may apply in the following case.

Let's say I'm trying to patch a binary ROM file. So suppose I want two code segments, call them A and B, that are both assembled for SNES address space at $400000. Each of these code segments is large: $200000 bytes in length (2 MByte). Neither of them is in memory at the same time, as the mapper will only be used to map one of them at a time into the same SNES $400000 to $5FFFFF address space. My pre-existing game ROM file is 32MByte in size (clearly larger than the SNES address space) and segment A is to be patched into an existing ROM file at file offset address $01A00000 to $01BFFFFF and segment B is patched into the ROM file at file offset address $01E00000 to $01FFFFFF. (Let's say that is how the two segments I'm modifying are organized.) How might I achieve that? (I can do that already with the existing asmpatch program.) Keep in mind that my hypothetical mapper is a hypothetical FPGA and I can hypothetically set up registers to map the ROM pretty much anywhere I want, whenever I want to. And that the total SNES game cartridge's non-volatile code and data well exceeds the total address space of the SNES in any of its configurations. Which, of course, is why I'm using the mapper (MSU1 or whatever.)

Hacking projects are much more relevant these days than from-scratch homebrews.
I agree with you. Assume we are talking about an existing game using an existing mapper with an existing ROM file that is big and is being "updated" by the assembler/linker's generated output.
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: Disch on May 17, 2016, 10:17:45 pm
Neither of them is in memory at the same time, as the mapper will only be used to map one of them at a time into the same SNES $400000 to $5FFFFF address space. My pre-existing game ROM file is 32MByte in size (clearly larger than the SNES address space) and segment A is to be patched into an existing ROM file at file offset address $01A00000 to $01BFFFFF and segment B is patched into the ROM file at file offset address $01E00000 to $01FFFFFF.

Code: [Select]
#org $400000
#offset $1A00000

  ; ... code for one bank here


#org $400000
#offset $1E00000

  ; ... code for another bank here
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: jonk on May 17, 2016, 10:19:38 pm
Code: [Select]
#org $400000
#offset $1A00000

  ; ... code for one bank here
#org $400000
#offset $1E00000

  ; ... code for another bank here

Cool. Got it. That works for me. Somehow I'd missed the #offset stuff. I saw you use the word, but hadn't seen it in a context. Now I do. Thanks!



May 17, 2016, 10:24:56 pm - (Auto Merged - Double Posts are not allowed before 7 days.)
They can't ... it's up to the programmer to tell the assembler what addressing mode to use based upon the programmer's knowledge of their own code and where the different linked-segments are going to reside.
It's trivial to let the programmer keep the assembler up to date on this. It's also done all the time on the x86, and segment registers aren't changed all that often there, either. The idea helps to avoid programmer counting errors, which they shouldn't be dealing with anyway.

We'll just have to disagree here. I think an assembler should support this. Period. The fact they didn't in the olden days is no excuse. (The assembler can default to "ASSUME DP = NOTHING" for those who don't want to know or care about it.)
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: Disch on May 17, 2016, 11:02:26 pm
I agree that the assembler should be smart about instruction size.  Having to manually select direct page/etc for each instruction is unnecessary.  The question now is how can the assembler (or more specifically, the linker) determine the appropriate size?  This was a simpler task when I had a one-part assembler with no linker.  Now that I have to worry about symbol exporting/importing it gets more complicated.


Most constants and variables are going to be able to be evaluated easily enough, but label names get tricky, since you need to know the size of all instructions between the label and the previous ORG to properly evaluate it -- and the size of those instructions may depend on evaluation of that label!

Code: [Select]
lda foo, X
nop
nop
rts

foo:
 #byte 0, 1, 2, 3

Can't know the size of that lda until I evaluate foo, and can't evaluate foo until I know the size of that lda.
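
To put numbers on the chicken-and-egg (contrived, but assume D = $0000 and the block happens to sit just below a page boundary):
Code: [Select]
#org $00FB
  lda foo, X   ; DP form (B5 xx, 2 bytes)     => foo = $0100 -> NOT on the direct
               ;   page, so the DP guess contradicts itself
               ; abs form (BD xx xx, 3 bytes) => foo = $0101 -> absolute is correct,
               ;   so only the 3-byte encoding is self-consistent here
  nop
  nop
  rts

foo:
 #byte 0, 1, 2, 3

; Move the #org down two bytes (to $00F9) and BOTH guesses become self-consistent
; (foo = $00FE with the DP form, $00FF with the absolute form) -- then the
; assembler has to pick one, ideally the smaller.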


My first thought is to track a running min and max size so labels can immediately be given a possible range that they fall within.  From there, you might be able to "rule out" DP/Absolute mode based on the given range.  It wouldn't be foolproof, but I'd imagine it would work for most use cases -- unless you are intentionally trying to trip up the assembler.

But that introduces a set of problems with macros and conditional assembly.  Barf.


I'll have to think about it more and post tomorrow.  If you have ideas on how to approach it, I'm all ears.
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: elmer on May 17, 2016, 11:39:12 pm
It's trivial to let the programmer keep the assembler up to date on this. It's also done all the time on the x86, and segment registers aren't changed all that often there, either. The idea helps to avoid programmer counting errors, which they shouldn't be dealing with anyway.

We'll just have to disagree here. I think an assembler should support this. Period. The fact they didn't in the olden days is no excuse. (The assembler can default to "ASSUME DP = NOTHING" for those who don't want to know or care about it.)

I'm not disagreeing with you. As I said in my post ... the way that the assembler worked is/was to use the appropriate instruction if it knows the size/location of the label when it is assembling it.

And IIRC, "yes", SNASM and other old assemblers had an ASSUME directive to tell it where the DP currently was.

As you say ... that concept came about with the 8086, which was old technology by the time that WDC created the 65816.

I just looked at some old source code, and the "export" directive (a declaration) allowed you to specify attributes for a label ... such as "far", that would tell the assembler how to treat a specific label.

Where you'll really expand upon old practice and do something new, is if you implement LTCG so that you don't need to decide the size of an instruction until link time, rather than at assembly time the way that the old assemblers did it (because of limited CPU power, memory, disk space).
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: Disch on May 17, 2016, 11:42:53 pm
Where you'll really expand upon old practice and do something new, is if you implement LTCG so that you don't need to decide the size of an instruction until link time, rather than at assembly time the way that the old assemblers did it (because of limited CPU power, memory, disk space).

This is the plan.



EDIT:

The more I think of cases that need to be addressed, the more I remember why I abandoned the idea of a separate linker in my original assembler.  Most of the trouble is coming from conditional assembly.

Consider this basic macro definition to push AXY to the stack:
Code: [Select]
#macro pushregs
  #if in_65816_mode
    pha
    phx
    phy
  #else
    pha
    txa
    pha
    tya
    pha
  #endif
#endmacro

Since phx/phy are not available on 6502, you need to do the txa/pha combo to push X.  And if in 6502 mode, 'phx' would be an unrecognized symbol and would generate an error at compile time.  BUT you would not want this macro to produce an error, because that phx is only parsed if in 65816 mode.

For this to work properly, the '#if' condition needs to be able to be evaluated immediately on first pass, so the inappropriate path can be completely ignored.  Which for the above example is trivial, but for other examples it might not be:

Code: [Select]
TimedLoop:
    lda foo, X
    sta bar
    nop
    inx
    bne TimedLoop

#if (current_pc & $FF00) != (TimedLoop & $FF00)
  #error "TimedLoop branch crossed page boundary.  Timing will be incorrect"
#endif

Here, you can't know the current PC until 'foo' and 'bar' are resolved, and the size of those lda/sta instructions are determined.  Which means the #if condition cannot be resolved at compile time.... and certainly not on first pass.



Any ideas how this problem can be addressed?
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: elmer on May 19, 2016, 06:31:16 pm
Code: [Select]
TimedLoop:
    lda foo, X
    sta bar
    nop
    inx
    bne TimedLoop

#if (current_pc & $FF00) != (TimedLoop & $FF00)
  #error "TimedLoop branch crossed page boundary.  Timing will be incorrect"
#endif

Here, you can't know the current PC until 'foo' and 'bar' are resolved, and the size of those lda/sta instructions are determined.  Which means the #if condition cannot be resolved at compile time.... and certainly not on first pass.

Any ideas how this problem can be addressed?

LTCG and multiple-passes to defer the problem until later could solve a lot of the problems ... but you'd still get the rare pathological cases that would break things and require an error message and an abort.

The question is ... what is the practical end-user complaint that you are trying to solve?

The "classic" method of 16-bit addressing if the label is undetermined (24-bit addressing on the 65816), and 8-bit addressing if the label is already known to comply, is the practical solution that will never generate incorrect code (just sometimes sub-optimal).

Then you leave it up to the programmer to override the "safe" default with "<", ">" or "|", or with an attribute on the label declaration.

That's one way that the problem can be addressed, both quickly and easily.
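
In other words, something along these lines (the exact prefix characters vary from assembler to assembler):
Code: [Select]
        lda   <foo      ; force 8-bit direct page addressing
        lda   |foo      ; force 16-bit absolute addressing
        lda   >foo      ; force 24-bit long addressing
        lda   foo       ; take the "safe" default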

This isn't a linker issue ... unless you really want to attempt LTCG.

The linker is traditionally there for different reasons, just as jonk said. Partially for speed (on old systems), but also for code-separation and for keeping a large multi-developer project "sane".
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: jonk on May 19, 2016, 07:08:08 pm
In the following, I'll write mostly "as if" I'm recommending the idea of allowing some, very modest really, link-time ability to make reasonable choices for certain opcodes. (These could be offered to the linker in an abstract way, I suppose, by the assembler writer who "knows" what the options are and who can provide a list of options to consider at link-time, rather than have to build in specific 65816 opcode knowledge into the linker. How this is achieved seems a "design question" more than a problem, to me.) The fact that I'm speaking in that tone here should, in no way, be taken to say that I am pushing for link-time selections. I'm open, either way.

The question is ... what is the practical end-user complaint that you are trying to solve?
Most of the problems quickly resolve out, without some kind of "pitching back and forth." There are always "edge cases." (No pun intended, but then I don't know who would know the pun unless they understood the use of "code edge.") But those can be finessed either by examining all of the logical alternatives and selecting those that don't "oscillate" (usually that isn't too hard to do) or by just making a choice and generating an error, if that's all you are left with. Some of this can (and probably should) be done by the assembler during assemble-time, as there is no need to sweep problems over to the linker when the assembler already knows enough to resolve them. The linker would only need to deal with those where the assembler doesn't have enough perspective.

The "classic" method of 16-bit addressing if the label is undetermined (24-bit addressing on the 65816), and 8-bit addressing if the label is already known to comply, is the practical solution that will never generate incorrect code (just sometimes sub-optimal).
Yeah. That would be easy, I suppose. The assembler could, if it can't resolve the situation given the current assembly-time information, pick the worst-case option. (Slowest, biggest, most generalized.)

But honestly, I don't think this is a hard problem for a paired assembler/linker and I'm not worried that it will become one. So some issues could be deferred out until link-time. It's not that terrible, at all.

Then you leave it up to the programmer to override the "safe" default with "<", ">" or "|", or with an attribute on the label declaration.

That's one way that the problem can be addressed, both quickly and easily.

This isn't a linker issue ... unless you really want to attempt LTCG.
There is that. You can always leave it up to the programmer and just generate errors as appropriate. Nothing new there and it's not hard on a programmer.

The linker is traditionally there for a different reasons, just as jonk said. Partially for speed (on old systems), but also for code-separation and for keeping a large multi-developer project "sane".
Yup. And I don't have any specific axe to grind here. If nothing fancy is done, I'm no worse off than before. If something is added that is nice, I'll take it.

There are times when I'd like the assembler to automatically recognize when a conditional branch can't reach its target and is able to replace it with a BRL and reverse-conditional branch around the BRL, for example. There are other times when I'd rather it wasn't that smart, too. Something perhaps I'd like to be able to turn ON or OFF? And if there is support in the linker when the assembler can't resolve things (a different question) then of course that adds more to consider when designing the list of options that an assembler presents to the linker (if you have the assembler do that and don't build in too much processor specific knowledge into the linker.)
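
To be concrete about the transformation I mean (the local label spelling is whatever the assembler ends up using):
Code: [Select]
        ; what I write:
        beq   FarTarget       ; error if FarTarget is beyond the 8-bit branch range

        ; what I'd like emitted automatically when it doesn't reach:
        bne   skip            ; reversed condition hops over the long branch
        brl   FarTarget       ; 16-bit relative branch reaches anywhere in the bank
skip: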

Which brings up something else....

Should a macro facility have necessary and sufficient feature support, that it might be fully possible to write macros to achieve the above instruction replacement strategy? There is no reason, in principle, that it couldn't be achieved. Macro expansions could have conditional tests which are deferred until the second pass, for example. But I need to think about this. And, well, since I'm just kibitzing and not really doing any work....  perhaps the implementer would just prefer to shoot the messenger here and get on with it.  ;)

Title: Re: 65816: Direct Page vs Absolute Operand
Post by: Disch on May 19, 2016, 08:43:10 pm
@jonk:  You are using a whole lot of "quotes" in your "post".   XD

Anyway...  Defaulting to 16-bit sizes is what I did in my original assembler.  In addition to that, to solve the #if problem, I had a requirement that evaluations passed to directives had to be immediately resolvable or else you get an error.  This also skirted around other problems that arose, like setting DirectPage to a label that was not defined yet.  But I'm worried that might be too restrictive.  I want this to "just work" without the programmer having to manually select the size of every instruction and/or worry about whether or not the assembler is choosing the optimal sizes.

Quote
There are times when I'd like the assembler to automatically recognize when a conditional branch can't reach its target and is able to replace it with a BRL and reverse-conditional branch around the BRL, for example.
[snip]
Should a macro facility have necessary and sufficient feature support, that it might be fully possible to write macros to achieve the above instruction replacement strategy?

I was going to say... I'd argue that shouldn't be an assembler feature and SHOULD be done with macros.

On a side note:  BRL is the dumbest and most useless instruction on the 65816.  It's basically just JMP, but takes an extra cycle.  I guess maybe it's useful for relocatable or self-modifying code?  Whatever.



Anyway I have an idea for how this could be accomplished, but it basically amounts to the linker doing a lot of the same work the assembler would have to do.  Like, to the point where they might as well be the same executable just with different commandline args.


Here's a question:

- Is it unreasonable to expect every source file to have an ORG before any binary output?

I would think this would be a safe assumption, but I can see someone creating library-like files that don't care where they're put.  But would those be assembled directly or would they be #included into another source file?
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: jonk on May 20, 2016, 12:32:39 am
Defaulting to 16-bit sizes is what I did in my original assembler.
;) Always ask yourself, "Am I doing this because it is the easy path? Or because it is the courageous path?"

I value the courageous choices in life!

In addition to that, to solve the #if problem, I had a requirement that evaluations passed to directives had to be immediately resolvable or else you get an error.
Hmm. Same question to ask yourself.

This also skirted around other problems that arose, like setting DirectPage to a label that was not defined yet. But I'm worried that might be too restrictive.
I think you are touching on one of the serious questions to be asking yourself. If an "ASSUME DP" directive (whatever you name it) is set to an external symbolic, you have no choice but to allow the linker to locate that symbol (or expression containing external symbols needing resolution) before you can resolve the quantity, exactly. And without the exact quantity, you cannot resolve the DP references in the instructions (like LDA) that may refer to a symbolic, which by definition is to be taken as relative to DP. (Or, if you automatically choose between two different LDA's, then to find out if you need to do so.) Of course, you didn't even address yourself to an external here, but to a "label that was not defined yet," which might be one that is in the same source file but found later (so you would need to wait until pass 2.) My example couldn't even be resolved in pass 2, but only at link time. Which is yet another question to ask.

I want this to "just work" without the programmer having to manually select the size of every instruction and/or worry about whether or not the assembler is choosing the optimal sizes.
This whole area is something that can easily fall into arguments over "matters of style." Better experienced programmers can argue this question into any corner you want and make it stick pretty well.

Should the assembler just do what you say and let the programmer make all the decisions, emitting errors to help guide the hand of the programmer?

Should the assembler do a fair job of "counting bytes" and "figuring out offsets" for the programmer, freeing the programmer from having to worry about such details?

But in regards to the original question I was asking, with regard to informing the assembler about the DP setting, things may get interesting.

Suppose you support structs. Suppose the assembly programmer has a subroutine they want to write that assumes that the DP points somewhere useful before being called. But it doesn't know (or care) exactly where. Instead, as it turns out, this subroutine does something interesting and fun with a palette structure. However, there are a dozen different palettes in use. This subroutine doesn't care which one is in use. It only cares that the DP register points to the palette structure you want it to examine before you call it. (This could be a palette structure or it could be an NPC structure or it could be a saved-game structure. It doesn't matter. Make up something you consider worthy of the example here.) Now, the assembler needs to be told the type of the structure that DP points at. But the assembler doesn't need to know the exact address contained in DP. Just that whatever the DP is pointing it happens to be something of this type. Now, DP-relative LDA instructions should be able to be generated by the assembler just fine without any need for fix-ups during link-time because the assembler knows all it needs to know in order to correctly generate the DP-relative LDA instruction.

So while the assembler may need to know what kind of thing the DP points at, it doesn't actually need to know the absolute value of the DP.  On the other hand, one might actually want to tell the assembler about the absolute address of the DP and not tell it about the type of the data items that reside there, at all.
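
A rough sketch of the kind of thing I mean (the directive and struct syntax here are invented; it's the semantics I'm after):
Code: [Select]
Palette        STRUCT
palColor0      dw   1 DUP(?)
palColor1      dw   1 DUP(?)
palFlags       db   1 DUP(?)
Palette        ENDS

               #assume DP:STRUCT Palette   ; caller promises D points at *some* Palette
DimPalette     lda  palColor1              ; assembler emits the DP form with the
                                           ;   field's offset (here, $02) as the operand
               ; ...do something with it...
               rts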

Should one be able to over-ride all this in the instruction operand itself? Should I be able to say to the assembler,

Code: [Select]
      LDA    ((struct X *) DP)->field1

And then have it figure out that field1 is at offset $12 relative to DP?

I don't know. You tell me?

I really don't like to waste my time sitting down with a piece of paper, working out DP-relative offsets. Worse, this in effect hard-codes these deltas. If I later decide to move the DP base somewhere else or if I decide to modify a structure there and add some more fields.... then I'm running around having either to modify a lot of instructions that I should never have had to bother with or else I have to go find my long list of EQU/= symbols and go hack that thing into shape so that the offsets are correctly stated, again. This is seriously bad. I really think the assembler needs to have some information about where DP is established and that the programmer should allow the assembler to figure out the offsets. The assembler is really good at bookkeeping details like that. The programmer isn't so good.

Let the assembler do what it is good at doing.

But once you open that door and walk through it, when do you stop?  Frankly, I see the need for a very good, high quality expression/operand analyzer.

I was going to say... I'd argue that shouldn't be an assembler feature and SHOULD be done with macros.
I anticipated that.

On a side note:  BRL is the dumbest and most useless instruction on the 65816.  It's basically just JMP, but takes an extra cycle.  I guess maybe it's useful for relocatable or self-modifying code?  Whatever.
Yeah. Mostly for PIC. You might want that if you are transferring code into RAM for execution. No, I don't know why. In the MSP430 from TI, you may need to do that if you are modifying your flash because the RAM still works fine when the flash is being written. So there, you'd need something like that. For the 65816 in the SNES? I don't know. Maybe I'll think about making something up that sounds really important. ;)

Anyway I have an idea for how this could be accomplished, but it basically amounts to the linker doing a lot of the same work the assembler would have to do.  Like, to the point where they might as well be the same executable just with different commandline args.
That would be bad. You don't want to bury knowledge in two different places. This is why I let slip the idea of having the assembler pass along a list of options for the linker to consider, done up in such a way that the linker doesn't actually need to know what it is doing. Just a thought for now.

Here's a question:

- Is it unreasonable to expect every source file to have an ORG before any binary output?
Of course it is. Relocatable code doesn't use ORG. In the Merlin32 system I modified a few weeks back, the ORG can occur either in the linker file OR in the source code OR in both. But if it is in the linker file and not in the assembly source code, then the source code is moved to the location indicated in the linker file. If the source includes an ORG (or more than one) then the linker file ORG merely sets a barrier so that all ORGs in the source file must be at or after that address. But it otherwise doesn't restrict the use.

I would think this would be a safe assumption, but I can see someone creating library-like files that don't care where they're put.  But would those be assembled directly or would they be #included into another source file?
I don't think it is safe. Here's why.

When I'm hacking ROM code, one of the first things I do is get rid of certain subroutines, replacing them with others I place in some 0xFF region I believe isn't used for anything. Doing so leaves "holes" in the code. I mark these holes for later use by other, shorter routines I might later write.

Suppose a 4Mb ExHiROM. Suppose I know that everything at the tail end, from $F40000 to $FFFFFF, is safe to use for new code. Suppose I tear out subroutines OLDX, OLDY, and OLDZ, located at $C40000 to $C400E7, $C41320 to $C41410, and $C72010 to $C72251, respectively. My new replacement routines for these three functions will be located somewhere in the $F40000 to $FFFFFF region, but I really don't care exactly where. It doesn't matter. But I do have to keep the address of OLDX at $C40000, OLDY at $C41320, and OLDZ at $C72010, because the rest of the ROM expects them to stay there and I don't intend wasting tons of time tracking down all of the calls to these functions. So I insert a small snippet (perhaps just a JML) of code at the beginning. This leaves me with three useful holes in the code that I may use for something else (moved data table, additional data table, other subroutines I write that are short, etc.) When I write my patches, I want to specify the start of the patch areas (in this case, there are four patch areas: OLDX, OLDY, OLDZ, and NEWCODE) to the linker. But the assembler shouldn't care, at all. So far as the assembler knows, I have four named code segments that are, each of them, fully relocatable. The assembler should not need to know anything about their location in memory. (Aside from the rule that the linker won't locate a named code segment so that it sprawls across a bank boundary.) Only the linker knows that. So my oldx_seg has a JML followed by some small subroutine or two; my oldy_seg has another JML plus some additional other small subroutines; my oldz_seg has yet another JML followed by still more personal subroutines; and newcode_seg has the replacement code for OLDX, OLDY, and OLDZ plus a bunch more library routines, tables, data, and other stuff I couldn't fit into the earlier, tiny holes.

Why should the assembler care about an ORG here?
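
Roughly what I'm picturing, with the segment syntax invented purely for the sake of the example:
Code: [Select]
; oldx.asm -- fully relocatable; the linker is told to place oldx_seg at $C40000
        SEGMENT oldx_seg
OLDX    jml NewX             ; stub at the old entry point
        ; ...a short routine or two that happen to fit in the rest of this hole...

; newcode.asm -- the linker drops newcode_seg anywhere inside $F40000-$FFFFFF
        SEGMENT newcode_seg
NewX    ; ...replacement implementation of the old OLDX...
        ; ...plus the library routines, tables, and data that didn't fit elsewhere...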
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: Disch on May 20, 2016, 01:56:00 am
;) Always ask yourself, "Am I doing this because it is the easy path? Or because it is the courageous path?"

Well if you note, I said that's the way I did it in my OLD assembler ;P

Since this is a rewrite with a completely different approach, I'm hoping to improve upon all that.

Quote
So while the assembler may need to know what kind of thing the DP points at, it doesn't actually need to know the absolute value of the DP.  On the other hand, one might actually want to tell the assembler about the absolute address of the DP and not tell it about the type of the data items that reside there, at all.

Should one be able to over-ride all this in the instruction operand itself?

This is a pretty easy scenario, IMO.

- Assembler keeps track of a DirectPage (and DataBank) value, which the coder can set any time with a #DP or similar directive
- EVERY operand to any instruction is ultimately going to evaluate to a number
- If the given number is accessible via the given DirectPage, use direct page mode.
-   Otherwise, if it's accessible with the given DataBank, use absolute mode
-   Otherwise, use long mode


Basic case:

Code: [Select]
lda #$0100
tcd
#dp $0100

lda $0105    ; A5 05
lda $0205    ; AD 05 02     (if assembler's DB is $00)
lda $0205    ; AF 05 02 00  (if assembler's DB is not $00)


Want DP Relative code?  Set Assembler's DP to zero, but don't tcd a new value:
Code: [Select]
; doesn't matter what #dp is set to at this point
#push_dp    ; save it
#dp 0       ; Give a fake direct page of 0

lda $05     ; A5 05   --  DP relative
#pop_dp     ; restore previous dp


Quote
That would be bad. You don't want to bury knowledge in two different places. This is why I let slip the idea of having the assembler pass along a list of options for the linker to consider, done up in such a way that the linker doesn't actually need to know what it is doing. Just a thought for now.

Both the assembler and linker are going to have to do symbol resolution.  And really, symbol resolution is the bulk of the work... and is the only difficult part of writing an assembler.  Everything else is trivial if you know what the symbols are.

Linker has to do it (pretty much by definition of what the linker's job is) to resolve external symbols.... but you also want the assembler to do it for simple expressions so that object files are not unnecessarily huge and complicated.  Basically anything that CAN be resolved by the assembler should be, and anything that can't be should get passed to the linker.  I think you even said something like this earlier.

But this means both assembler and linker are doing the same thing.  And I don't want to put duplicate code in two different executables.  So I'm thinking this should just be one monolithic program, where you can assemble to object files with one option -- link object files together with another option -- or do both with a different option (which is what I want the default behavior to be anyway).

Quote
Of course it is [a bad idea to require ORG]. Relocatable code doesn't use ORG.
[big example snip]
Why should the assembler care about an ORG here?

So I'm just going to say it.  I hate the concept of segments.  They're overly complicated... and I have yet to see a good use for them.

They're the absolute biggest complaint I have with ca65.  I've seen SEVERAL people turn away from assemblers that use them because it's too difficult a concept for them to grasp, and they don't provide functionality that isn't already accomplished with directives.

Worse, all of this could be done much more simply with directives like ORG.

Your example is completely trivial to do with ORG, and honestly this is how I would imagine most people would approach this problem:

Code: [Select]
#org $C40000
JML newX
#pad $C400E7, $FF

#org $C41320
JML newY
#pad $C41410, $FF

#org $C72010
JML newZ
#pad $C72251, $FF


#org $F40000
newX:
    ...
   
newY:
    ...
   
newZ:
    ...


The higher level concept of having code segments is fine.  But ALL of it can be accomplished in code through use of directives.  I don't want to build that functionality directly into the linker.

ORG is just so simple and elegant.  Everybody understands it.  Segments are the exact opposite.  I really dislike them.
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: jonk on May 20, 2016, 06:43:39 am
I agree about wanting to keep things so people can start out easy and don't have to learn complex, new ideas before they can use a tool. I also don't like it when people have to struggle a lot just to climb over some very high wall. I may disagree a little about not supporting useful ideas that people can grow into, as skills grow. It would be nice to come up with a way to allow that path for growth, if you can conceive of a way to do that. But I think there is enough to worry about, too. So let's drill in on the rest and drop this. I'm good with that.

That said, the Merlin32 system I've been modifying a bit also doesn't carry any concept of code segments, yet it does support relocation. A user can do exactly as you'd have it -- just ORG. That works, and a user doesn't need to learn anything at all about segments to write good code. But if they want to, they can avoid using ORG and use REL (relocatable code) instead. All they have to do is remove the ORG in their source code (which overrides things, but doesn't produce an error if they keep it) and place the ORG in their linker file, which lists the asm file with an ASM statement.

So any programmer can start out with complete ignorance about segments and then, when they feel able, add a "special file" that says "ORG" and "ASM" to position their REL asm code later. It's no more than creating a very short, very easy to understand second file, digging into their asm source code to change an ORG to REL, and then re-assembling the result, letting the linker file control the location instead of the assembly source file. They can add several ASM statements for any given ORG, so they can link two or more source files together starting at any ORG they want. Or they can completely ignore all this, never use a linker file, and just salt their code with ORG all over the place. It doesn't care. It's just as easy for someone to go code stuff without any knowledge of segments, but it allows relocation to be added as an afterthought, when someone is ready for the idea, with barely more than a line or two of change.
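
The "special file" amounts to something like this -- I'm sketching from memory here, so treat the exact Merlin32 spelling of the directives as approximate; the idea is just an ORG plus a list of ASM statements:
Code: [Select]
; link file: place the relocatable sources starting at $F40000
        ORG   $F40000
        ASM   newcode.s
        ASM   morecode.s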
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: Disch on May 20, 2016, 10:04:28 am
Yet, if they want to, they can avoid using ORG and use REL (relocatable code), instead. All they have to do is remove the ORG in their source code (which overrides things, but doesn't produce an error if they keep it) and place the ORG in their linker file that lists the asm file with an ASM statement.

Here's my question - and why I'm so hesitant on doing this.

What value does this have?  What functionality does this provide?  How does this make things any easier for the developer?

All of these things can be done in 2 lines of code by way of ORG and/or usage of macros.  Introducing segments into the linker just creates a whole new syntax the user has to learn to write the config files, and adds a bunch of redundant functionality which then creates problems when a conflict arises.  And it complicates the linker.

Literally every example I can imagine where segments would be useful could be just as easily (or more easily) solved using macros and assembler directives.



EDIT:

Code: [Select]
; segments.asm

#macro segment_newcode
  #offset $xxyyzz
  #org $F40000, $FFFFFF, $FF
#endmacro

Code: [Select]
; newcode.asm

!segment_newcode

  ; ... code here
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: jonk on May 20, 2016, 01:27:30 pm
Like I said, there are more important issues and I don't want to squander your motivational energy here. It's just not worth it if it causes you to hold short on even a single other feature.

I'm going to be really happy with structures. So, for example, drawing from the old CP/M and DOS systems:
Code: [Select]
PSP             STRUCT

pspInt20        dw        1 DUP(?)
pspNextPara     dw        1 DUP(?)
                db        1 DUP(?)
pspDispatcher   db        5 DUP(?)
pspTermVector   dd        1 DUP(?)
pspCtrlCVector  dd        1 DUP(?)
pspCritVector   dd        1 DUP(?)
                dw       11 DUP(?)
pspEnvironment  dw        1 DUP(?)
                dw       23 DUP(?)
pspFCB_1        db       16 DUP(?)
pspFCB_2        db       16 DUP(?)
                dd        1 DUP(?)
pspCmdTailCnt   db        1 DUP(?)
pspCmdTailTxt   db      127 DUP(?)

PSP             ENDS

                .
                .
                .

                #org    $C40000

                .
                .
                .

                #assume DP:STRUCT PSP               
MyFunction      LDA     pspNextPara
                JML     ExternalFunction
                .
                .
                .

I used to place that into a SEGMENT AT in order to achieve a "structure." But if you support structures, then I get what I want there and that's one less reason for segments.

So I really don't want to mess with your head over this. A linker is the right place to be slinging strings of bytes around, wholesale, not the assembler which hard-codes all this beforehand. I'd like to be able to define "segments" that I may at some later time choose to overlay on top in various ways, for example. I don't want to have to go modify my source code for that purpose. I shouldn't have to. On the other hand, if I'm careful about the design I can use your macro examples constructively and I don't want to spend too much time worrying over this.

Note that I do need a way to specify data storage that does NOT produce initialized data in any form, at all. It should ONLY advance the counter and it should NOT update the ROM file. I used the DUP(?) above to illustrate that point. In Merlin32, it's DS with a byte count operand, so that DW 1 DUP(?) in Merlin32 is written as DS 2. The ROM is NOT updated here. It's just "skipped over."

I'd like a feature where I can read the ROM contents, apply them in a calculation, and then re-apply the newly calculated bytes into the ROM as a patch. I don't know if you care about that. But I'd rather ask for that than to worry about code and data segments.
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: Disch on May 20, 2016, 02:00:40 pm
Like I said, there are more important issues and I don't want to squander your motivational energy here.

Well I might be being overly aggressive -- I get like that unintentionally sometimes.  Sorry if that's how I come across -- I don't mean to be confrontational or anything  :thumbsup:

Really, if you can provide a realistic example of something segments can be used for that makes them easier to use than the other tools available, I'd love to see it, and I'm willing to consider it.  I just literally have never seen such an example in my entire life and so I can't fathom what good they are.  Every example I've seen made segments look more complicated and less useful than the alternatives.

Quote
I don't want to have to go modify my source code for that purpose. I shouldn't have to.

You have to modify something.  The only difference between my approach and yours is the extension of the file being modified.

- You're modifying a makefile or linker config file.
- I'm modifying a *.asm file.

If it's really a problem, just change the *.asm extension to *.cfg and imagine the syntax for my config files is strikingly similar to the syntax for macros  ;P



EDIT:

AHA!!

Okay I just thought of something that segments are good for!  ORG forces you to start at the given address every time you use it, whereas segments don't necessarily do that.  Multiple different blocks of code can have the same segment and the linker will fit them all into the given area without overlapping.  Whereas with my macro approach if you have two ORGs, each will try to overwrite the other!


Okay -- that's worth supporting.  I'll work it in.
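
Something like this, in other words (syntax invented on the spot):
Code: [Select]
; file1.asm
#segment newcode        ; the linker packs this block...
RoutineA:
  ; ...
  rtl

; file2.asm
#segment newcode        ; ...and this one into the same region, back to back,
RoutineB:               ;    without either clobbering the other
  ; ...
  rtl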

So bringing back to my original question... but modified....

- is it reasonable to expect every source file to have an ORG or SEGMENT directive before any output is generated?


Quote
Note that I do need a way to specify data storage that does NOT produce initialized data in any form, at all. It should ONLY advance the counter and it should NOT update the ROM file.

I did that in my old assembler with the '#var' directive.  In fact I think I already posted the snippet from the documentation on that here:
http://www.romhacking.net/forum/index.php/topic,21927.msg306540.html#msg306540


Quote
I'm going to be really happy with structures.

Yeah I'll have to think of reasonable syntax for this.  =)

Quote
I'd like a feature where I can read the ROM contents, apply them in a calculation, and then re-apply the newly calculated bytes into the ROM as a patch. I don't know if you care about that. But I'd rather ask for that than to worry about code and data segments.

Doesn't sound like a job for an assembler.  Sounds more like a job for a python script.
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: jonk on May 20, 2016, 02:16:02 pm
I'd like a feature where I can read the ROM contents, apply them in a calculation, and then re-apply the newly calculated bytes into the ROM as a patch. I don't know if you care about that. But I'd rather ask for that than to worry about code and data segments.
Doesn't sound like a job for an assembler.  Sounds more like a job for a python script.
Hehe. Okay. Just pushing. There are cases where I want to "update in-place" a pre-existing data structure, using a macro to do it. And, since you seemed to be so hot on the idea of making it really easy for people to just "use" the tool, I figured I'd mention it. Writing a Perl or Python script to do this, would put it almost certainly out of the reach of the more casual/inexperienced user. That would just be over the top for them to do. Worse, I can't even provide my source code as an example to teach others about it, because the learning curve and the required documentation writing would be WAY TOO MUCH for me to consider attempting, trying to teach someone all that. So it would both eliminate the use by casual users and it would eliminate my writing a web page on doing it. But for me, personally? I'm fine. So I'll just go away on that point. No problem.



May 20, 2016, 02:31:10 pm - (Auto Merged - Double Posts are not allowed before 7 days.)
I'm going to be really happy with structures. So, for example, drawing from the old CP/M and DOS systems:
Yeah I'll have to think of reasonable syntax for this.  =)
I can't wait! Note that structures involve not only the definition, but perhaps also being able to declare a reference to one in the #assume.

In case it isn't already clear, I think the #assume for DP should accept nothing (or equivalent) as valid. When I write subroutines which aren't supposed to make any assumptions, I do NOT want them accidentally picking up the #assume from some code elsewhere that set it earlier, from the assembler's perspective. I want to be able to specifically and clearly override any prior setting and to explicitly say "you know nothing about the DP, for purposes of the following statements, so I want you to flag warnings/errors when you think you are being asked to make assumptions here."

Personally, I also think I should be able to over-ride, on the asm statement itself (such as an LDA) how I want the DP interpreted. There are times where this makes sense to do.

I haven't heard, yet, a clear way to over-ride the operand type in cases where I don't want the assembler doing some kind of "hmm, what is the best way to code this? should I use a DP-relative, or absolute, absolute long, or...?" Are you still thinking about having the assembler make some intelligent decisions for the programmer? (I'd like that.) If so, is there a way to over-ride that behavior and be explicit if I want that?

I hadn't brought it up yet, but there is also the DBR to consider. I think the assembler needs to know where that is set, too. I don't think the PBR is needed, though. So I kind of imagine the case where the assembler supports #assume for DBR and DP, and not for PBR. And... well, the X and Y, too. May as well do it. Especially the X and Y, in fact. If you need examples of why, I can lay a few out for you. But I suspect you already know of a few good cases, yourself.

(But as a clue regarding DBR, X, and Y, imagine the case where DBR:X or DBR:Y references a structure and where you want the 16-bit instruction operand available to indicate the offset into the structure. And before you start saying, "What in the heck is DBR:X?" please recall that DBR can be considered part of either the instruction's operand or part of the X or Y value. The final addition within the processor is a 24-bit sum of a 24-bit value and a 16-bit value (assuming 16-bit index registers here.) If I tell the assembler that my DBR:X references this structure correctly, then the assembler should be perfectly able to calculate the instruction operand value from that knowledge.)
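
Concretely, something like this -- once again the #assume spelling and the struct names are made up:
Code: [Select]
      ; X holds the 16-bit base (within the current data bank) of some NPC record
      #assume DBR:X = STRUCT NPC
      lda   npcHitPoints, X      ; assembler emits lda abs,X with the operand set
                                 ;   to the offset of npcHitPoints within NPC
      sta   npcIsVisible, X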



May 20, 2016, 02:35:36 pm - (Auto Merged - Double Posts are not allowed before 7 days.)
- is it reasonable to expect every source file to have an ORG or SEGMENT directive before any output is generated?
Well, I suppose you could construct, by default, a default segment with a default address. But I don't really care. Either way you go, I'm fine with it. You need information and I don't think it is unreasonable to ask that it be provided to you, somehow. Making assumptions is just another way of pretty much ensuring someone ignorantly screws things up and has no clue why and no way to figure things out because there are no errors, no warnings, ... just nothing.
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: Disch on May 20, 2016, 02:50:55 pm
In case it isn't already clear, I think the #assume for DP should accept nothing (or equivalent) as valid.

Yeah that makes sense.  So when DP is 'nothing' the assembler just won't use direct page mode at all?

Quote
Personally, I also think I should be able to over-ride, on the asm statement itself (such as an LDA) how I want the DP interpreted. There are times where this makes sense to do.
[snip]
I haven't heard, yet, a clear way to over-ride the operand type

Agreed.  You'll always be able to override the assembler's decision and choose your own addressing mode explicitly.  I just don't want you to HAVE to do that.

Code: [Select]
lda.b   $xx     ; always direct page, no matter what
lda.w   $xxxx   ; always absolute, no matter what
lda.l   $xxxxxx ; always long, no matter what
lda     $xx     ; assembler chooses best

Quote
I hadn't brought it up yet, but there is also the DBR to consider.

Yeah DB is the same idea as DP and will be handled the exact same way.

PBR you don't need, as that is determined by the PC/org.

And A/X/Y sizes are absolutely necessary as they impact the size of immediate mode instructions.
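
For example:
Code: [Select]
rep #$20        ; 16-bit accumulator
lda #$1234      ; A9 34 12  -- the immediate operand is two bytes
sep #$20        ; 8-bit accumulator
lda #$12        ; A9 12     -- the immediate operand is one byte

So the assembler has to track (or be told) the current M/X flag state to emit the right number of operand bytes.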
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: jonk on May 20, 2016, 02:52:17 pm
We've been cross-editing here, so go back and re-read some of my modifications. I've expanded things a bit.



May 20, 2016, 03:06:47 pm - (Auto Merged - Double Posts are not allowed before 7 days.)
Code: [Select]
lda.b   $xx     ; always direct page, no matter what
lda.w   $xxxx   ; always absolute, no matter what
lda.l   $xxxxxx ; always long, no matter what
lda     $xx     ; assembler chooses best
So, what happens here?
Code: [Select]
    lda.b   ExtSym1[5].Member2
    lda.w   ExtSym1[5].Member2
    lda.l   ExtSym1[5].Member2
    lda     ExtSym1[5].Member2

In this case, ExtSym1 happens to be a table of structures. You know the size of each, so you can easily compute from the [5] where the base of the 6th structure is at, and from that how much extra to add to get at .Member2. (Or, you might want to force the code writer to say [5*SIZEOF MYSTRUCT] instead. Up to you?)

In the first case, the .b forces a DP-relative encoding. I get that. So in this case, you'd have to go look at the #assume for the DP?

In the second case, the .w forces an absolute encoding. I get that, too. So in this case, you would need to know the DBR to be able to compute the appropriate 16-bit operand.

In the third case, the .l forces an absolute long encoding. In this case, no #assumes are required at all.

So, what about the last case? Do you go through all your options here? Fastest? Smallest? (Is there a case where the two are different? I'm not sure.)

Note that in all cases the symbol is external.  ;D
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: Disch on May 20, 2016, 04:01:14 pm
In this case, ExtSym1 happens to be a table of structures. You know the size of each, so you can easily compute from the [5] where the base of the 6th structure is at, and from that how much extra to add to get at .Member2. (Or, you might want to force the code writer to say [5*SIZEOF MYSTRUCT] instead. Up to you?)

Side note:  I wonder... is [] notation going to be useful for structs?  The given index would have to be a constant and couldn't be like the X index or anything.  So how often would you access an individual struct element outside of a loop?

Anyway to answer this specific question:

EVERYTHING gets resolved to a number.  After all symbols are defined and resolved, 'ExtSym1[5].Member2' will ultimately be reduced to a numerical value.  Only then will instruction size be determined.  (Unless it's overridden by .b/.w/.l suffixes)

Quote
In the second case, the .w forces an absolute encoding. I get that, too. So in this case, you would need to know the DBR to be able to compute the appropriate 16-bit operand.

DB would be ignored completely if the .w suffix is provided.  If I'm considering DB at all, there's no difference between the user giving the suffix and the assembler deciding it on its own.

I suppose I could check the given value to make sure it has the same bank as DB and give a warning if they don't match -- but I don't even think I should do that.  I see these suffixes like I see casts in C -- you're effectively telling the assembler "I know what I'm doing, just shut up and do it the way I tell you to".

Quote
In the first case, the .b forces a DP-relative encoding. I get that. So in this case, you'd have to go look at the #assume for the DP?

This is a more interesting question since (unlike DB), DP can affect lower bits.

Example:

Code: [Select]
#dp $0080

lda   $0080   ; A5 00
lda.b $0085   ; A5 05  ??? or A5 85  ???

Honestly I don't know what the "correct" solution is here.  I could make a case for either one.  What do you think?

Quote
So, what about the last case? Do you go through all your options here? Fastest? Smallest? (Is there a case where the two are different? I'm not sure.)

It ultimately boils down to "lda <some number>"

- Is <some number> on the current direct page?  If yes, use Direct Page mode.
- Otherwise, is <some number> on the current data bank?  If yes, use absolute mode.
- Otherwise, use long mode.
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: jonk on May 20, 2016, 04:23:53 pm
In this case, ExtSym1 happens to be a table of structures. You know the size of each, so you can easily compute from the [5] where the base of the 6th structure is at, and from that how much extra to add to get at .Member2. (Or, you might want to force the code writer to say [5*SIZEOF MYSTRUCT] instead. Up to you?)
Side note:  I wonder... is [] notation going to be useful for structs?  The given index would have to be a constant and couldn't be like the X index or anything.  So how often would you access an individual struct element outside of a loop?
Very often, given what I've seen already. For example, there is a list of identical structures in a large table in DQ3 (more than one of these, too.) However, they are directly referenced in the code, pointing straight at #6 for example in order to describe the gold status display structure. Each one is the exact same structure. But they don't index into the table with any kind of search. They know, by index ID value, exactly which one they want. They are simply collected together for convenience, I suppose. As far as how they use them, they just use them like this:
Code: [Select]
#define GOLDDISPLAY TABLE[5]
      .
      .
      .
      LDA    GOLDDISPLAY.Member
So I'd like to be able to do that conveniently.

Anyway to answer this specific question:

EVERYTHING gets resolved to a number.  After all symbols are defined and resolved, 'ExtSym1[5].Member2' will ultimately be reduced to a numerical value.  Only then will instruction size be determined.  (Unless it's overridden by .b/.w/.l suffixes)
Yeah. I get that. If it gets resolved in the linker, so be it. For that, I'd suggest at least considering the idea of letting the assembler spell out the exact list of cases in order to hide that knowledge into the assembler and to avoid forcing the linker to know too much of the same stuff. The "fix up" record can include a list of options, where appropriate. This is kind of like how REST works with http. But that's another story.

I suppose I could check the given value to make sure it has the same bank as DB and give a warning if they don't match -- but I don't even think I should do that.  I see these suffixes like I see casts in C -- you're effectively telling the assembler "I know what I'm doing, just shut up and do it the way I tell you to".
Yes, agreed. (I was previously thinking about x86 segments, which overlap, and this was dumb of me. I need to go back to the WDC documentation and make sure there aren't any odd-ball corner cases, though. Worth a look-see.)

This is a more interesting question since (unlike DB), DP can affect lower bits.

Example:

Code: [Select]
#dp $0080

lda   $0080   ; A5 00
lda.b $0085   ; A5 05  ??? or A5 85  ???

Honestly I don't know what the "correct" solution is here.  I could make a case for either one.  What do you think?

It ultimately boils down to "lda <some number>"

- Is <some number> on the current direct page?  If yes, use Direct Page mode.
- Otherwise, is <some number> on the current data bank?  If yes, use absolute mode.
- Otherwise, use long mode.
If the DP is "in view" with the assembler (isn't 'nothing') then the DP value should be used. I see your point about the question of the over-ride, though. If you are forcing the assembler to "be stupid," does that mean that the assembler should revert to 'nothing' for DP?

I think it would be better that you always use DP, if it is in view. Period. Regardless of the instruction over-ride, itself. In the case that DP is 'nothing' then I'd use the actual value given by the programmer. So:

Code: [Select]
#dp $0080
lda   $0080   ; A5 00
lda.b $0085   ; A5 05
#dp nothing
lda.b $0085   ; A5 85
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: Disch on May 20, 2016, 04:54:03 pm
Very often, given what I've seen already. [snip] So I'd like to be able to do that conveniently.

Fair enough  =)

Quote
I don't see how that works. Without knowing what DBR (I'm using WDC notation here) is, you can't compute the 16-bit absolute address relative to it. And it is ALWAYS relative to the DBR. The DBR is pre-pended to the 16-bit absolute address.

DBR sets bits 16-23
Absolute mode records bits 0-15

If the user is forcing absolute mode with the .w suffix, then DBR literally doesn't matter because bits 16-23 are not going to be included in the assembled output anyway.

Ex:

Code: [Select]
#dbr $01

lda   $010000  ; AD 00 00
lda.w $010000  ; AD 00 00
lda.w $020000  ; AD 00 00  (bank is wrong, sure, but .w is forcing absolute)
lda.w $xx0000  ; AD 00 00  (no value for 'xx' can change that.  Bank byte doesn't matter here)


Quote
It's not just a check, so far as I can see. You NEED the DBR value to correctly compute the 16-bit absolute address relative to the DBR.

Nope.  You just mask out the high bits.  'AND $FFFF'

Quote
I think it would be better that you always use DP, if it is in view. Period.

I can go for that.  Maybe a 4th suffix can be added later if "extremely stupid" mode is really desired.

But really this seems like a very unlikely edge case and probably isn't worth worrying about.
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: jonk on May 20, 2016, 05:04:00 pm
Yeah. I made a mistake about the DBR. Updated my response, too late I see. I agree with you on that topic. So put that to bed. I'm going to walk through EVERY SINGLE one of the addressing modes now and make sure I didn't miss something important.
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: elmer on May 21, 2016, 12:46:19 pm
I really don't like to waste my time sitting down with a piece of paper, working out DP-relative offsets. Worse, this in effect hard-codes these deltas. If I later decide to move the DP base somewhere else or if I decide to modify a structure there and add some more fields.... then I'm running around having either to modify a lot of instructions that I should never have had to bother with or else I have to go find my long list of EQU/= symbols and go hack that thing into shape so that the offsets are correctly stated, again. This is seriously bad.

I'm going to be really happy with structures. So, for example, drawing from the old CP/M and DOS systems:
Code: [Select]
PSP             STRUCT

pspInt20        dw        1 DUP(?)
pspNextPara     dw        1 DUP(?)
...
pspCmdTailCnt   db        1 DUP(?)
pspCmdTailTxt   db      127 DUP(?)

PSP             ENDS

...
Note that I do need a way to specify data storage that does NOT produce initialized data in any form, at all. It should ONLY advance the counter and it should NOT update the ROM file. I used the DUP(?) above to illustrate that point. In Merlin32, it's DS with a byte count operand, so that DW 1 DUP(?) in Merlin32 is written as DS 2. The ROM is NOT updated here. It's just "skipped over."

Structure definition was traditionally handled with the RS directives (RSSET, RB, RW, RD, RS). That stops the programmer from having to count byte offsets.

Code: [Select]
               rsset 0
pspInt20       rw    1
pspNextPara    rw    1
...
pspCmdTailCnt  rb    1
pspCmdTailTxt  rb    127
PSP_SIZE       rb    0

Then, to define an array you can do ...

Code: [Select]
               rsset 0
PSP_0          rb    PSP_SIZE
PSP_1          rb    PSP_SIZE
PSP_2          rb    PSP_SIZE
PSP_3          rb    PSP_SIZE
PSP_4          rb    PSP_SIZE
PSP_5          rb    PSP_SIZE

At that point, your ExtSym1[5].Member2 example becomes "lda (PSP_5 + pspCmdTailCnt)".

Or if you want to set the DP to PSP_5, then you can do a simple "lda <pspCmdTailCnt", and it's clear and simple.


Quote
Code: [Select]
lda.b   $xx     ; always direct page, no matter what
lda.w   $xxxx   ; always absolute, no matter what
lda.l   $xxxxxx ; always long, no matter what
lda     $xx     ; assembler chooses best

Code: [Select]
    lda.b   ExtSym1[5].Member2
    lda.w   ExtSym1[5].Member2
    lda.l   ExtSym1[5].Member2
    lda     ExtSym1[5].Member2

I am having trouble seeing how what you're proposing here is an "advance" upon existing practice.

The "lda.b/w/l" is using a suffix that's normally used in assembly language on every other processor to control the size of the load itself, and not the addressing mode.

It really doesn't seem like good design practice to override common usage and use it to indicate an addressing mode, especially when there is already an accepted syntax for doing what you need.

And if you want "C" like structure access notation ... the immediate question that I ask is, can I do a "lda ExtSym1[IndexVar].Member2"?

If so, then you're actually generating new code, and you're actually writing a compiler.

If not, then the syntax is pretty pointless, and IMHO no advance on the RS directives in practical use.

If you do want to make 6502 programming easier and more high-level ... then I suggest that you investigate PLASMA or ATALAN (see David Wheeler's page for a discussion on advanced 6502 languages http://www.dwheeler.com/6502/).


There are times when I'd like the assembler to automatically recognize when a conditional branch can't reach its target and is able to replace it with a BRL and reverse-conditional branch around the BRL, for example.

Yes, this can be nice, and the GNU binutils suite supports it on a number of architectures.

I've never missed this in any practical use of assembly language programming, because I normally want high-performance code, and so I want to know when something is out-of-range because I may choose to re-arrange the code a bit, or to just use a macro to do the long version.

But having the option of the assembler/linker doing it would be nice for more general run-of-the-mill code.

Title: Re: 65816: Direct Page vs Absolute Operand
Post by: Disch on May 21, 2016, 01:25:10 pm
I am having trouble seeing how what your proposing here is an "advance" upon existing practice.

Existing 65xx series assemblers typically do not automatically take advantage of direct page access and require the user to explicitly state the addressing mode they want.

I want to do that.  But I also want the user to be able to explicitly override the assembler's judgement.

Quote
The "lda.b/w/l" is using a suffix that's normally used in assembly language on every other processor to control the size of the load itself, and not the addressing mode.

It really doesn't seem like good design practice to override common usage and use it to indicate an addressing mode, especially when there is already an accepted syntax for doing what you need.

I wasn't aware the suffix notation conflicted with other architectures.  I recall seeing it on other 65xx assemblers (xkas maybe?  It's been a while)

What is the accepted syntax in this instance and what assemblers use it?  I saw the "lda <pspCmdTailCnt" in your post, but if that is intended to be a direct-page specification, that conflicts with the usage I'm familiar with (which I've seen in most if not all 65xx assemblers) that treats < as a low-byte cast.

That is...
"<foo" and "foo & 0xFF" are equivalent

Quote
And if you want "C" like structure access notation ... the immediate question that I ask is, can I do a "lda ExtSyms1[IndexVar].Member2"?
[snip]
If not, then the syntax is pretty pointless, and IMHO no advance on the RS directives in practical use.

No, you would not be able to do that.  Which is why I originally raised the question as to how useful that syntax would actually be.  I'm still not entirely sold on it.

Quote
If you do want to make 6502 programming easier and more high-level [snip]

I don't want to change 65xx into an HLL.  The goal here is just to have an assembler that does intelligent symbol resolution without the user having to concern themselves with raw addresses.  Believe it or not, this is surprisingly lacking for 65xx assemblers.
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: jonk on May 21, 2016, 01:47:37 pm
Existing 65xx series assemblers typically do not automatically take advantage of direct page access and require the user to explicitly state the addressing mode they want.

I want to do that.  But I also want the user to be able to explicitly override the assembler's judgement.
Agreed.

I wasn't aware the suffix notation conflicted with other architectures.  I recall seeing it on other 65xx assemblers (xkas maybe?  It's been a while)

What is the accepted syntax in this instance and what assemblers use it?  I saw the "lda <pspCmdTailCnt" in your post, but if that is intended to be a direct-page specification, that conflicts with the usage I'm familiar with (which I've seen in most if not all 65xx assemblers) that treats < as a low-byte cast.
There are several conflicting standards, so far as I've been able to see. I think this is largely because WDC didn't originally own the IP (it was originally MOS Tech, I believe) and, when they did, it was entirely as an OEM supplier. So folks trying to address the need for an Apple II assembler (later an Apple IIgs, etc) had to come up with their own ideas, consistent with what they could find in terms of pre-existing docs (which I'm sure varied a lot from developer to developer back then.) The Merlin assembler, targeting the Apple II series, actually has an "LDA:" instruction -- yes, a ':' exists as part of the opcode!

On the web, today, there is a SINGLE free book available that discusses the syntax. If you haven't already bothered (and I have no reason to imagine you don't have it already), then you can find it here (full res and big) (http://www.westerndesigncenter.com/wdc/documentation/Programmingthe65816_ProgManual.pdf), here (http://www.cs.bu.edu/~jappavoo/Resources/210/wdc_65816_manual.pdf), and here (http://wiki.nesdev.com/w/images/7/76/Programmanual.pdf). The first one is from WDC, itself. The second and third are identical but come from different sites. So there are two versions of the book. The point here is that this book is the only complete manual I know of that is available to anyone at all without charge. That alone makes it important. So you might consider this fact when considering some starting point for ideas. Just a thought.

No, you would not be able to do that.  Which is why I originally raised the question as to how useful that syntax would actually be.  I'm still not entirely sold on it.
Neither am I. I'm only illustrating semantics I'd like, not syntax for it. So long as the semantics are accessible, I plan to leave to you any question of syntax to reach it.

I don't want to change 65xx into an HLL.  The goal here is just to have an assembler that does intelligent symbol resolution without the user having to concern themselves with raw addresses.  Believe it or not, this is surprisingly lacking for 65xx assemblers.
Cripes, I don't want an HLL, either. Just a good assembler/linker toolset.

Worse, almost any kind of HLL imposes semantic restrictions. And I don't want any of that, at all. I want full access to the entire semantic range available to an assembler programmer. Not some limited subset that is invariably available to HLL programmers. (For example, C imposes restrictions on parameter passing and return value methods. And further restricts functions to single entry points. Not interested.)

However, if and when you feel like taking a jaunt through an interesting assembler that does a great deal while at the same time allowing the basics, you could look at Randy Hyde's "The Art of Assembly" book and assembler tool. It's got some pretty fancy HLL features adapted to assembly, including support for thunking. (Which makes a procedure call require both an address AND an activation frame pointer, not just an address -- very useful for things like 'iterators'.) But I'm in no way suggesting you even think about that -- especially not in the case of the 65816 -- it's just for a rainy day.
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: elmer on May 21, 2016, 02:46:47 pm
Existing 65xx series assemblers typically do not automatically take advantage of direct page access and require the user to explicitly state the addressing mode they want.

What I've been trying to point out, is that good assemblers that followed the manufacturer's recommendations do actually take advantage of this stuff.

The problem was solved a long time ago.

It may have resurfaced because the hacking community are using quickly-knocked-together tools that ignore the published specifications ... but that doesn't mean that you have to re-invent the wheel, or come up with some new syntax to solve problems that have already been solved.

I suggest that you read the 65816 specs here ... http://archive.6502.org/datasheets/wdc_w65c816s_aug_4_2008.pdf

See pages 37-40 for the assembly language standards.

Having the assembler automatically take advantage of direct page access is a part of the standard, and was definitely used in the SNASM assembler that I used at the time.

BTW ... despite the 2008 date, these are the same basic specs that I photocopied from the Apple IIGS Hardware Reference manual when I started SNES development in 1991.


Quote
What is the accepted syntax in this instance and what assemblers use it?  I saw the "lda <pspCmdTailCnt" in your post, but if that is intended to be a direct-page specification, that conflicts with the usage I'm familiar with (which I've seen in most if not all 65xx assemblers) that treats < as a low-byte cast.

That is...
"<foo" and "foo & 0xFF" are equivalent

This is in the specifications. The datasheet given above shows the syntax. AFAIK, CA65 supports this syntax.

The only difference that I remember in practice was that a lot of assemblers supported using square brackets instead of parentheses for indirection, i.e. ...

Code: [Select]
  lda [$01],y

instead of

Code: [Select]
  lda ($01),y

This was seen as a good idea by most programmers because it separated the syntax of expression-evaluation from indirection.

CA65 now supports the square-bracket syntax as an option ... otherwise it uses the traditional manufacturer syntax.

It does use a different syntax for structure definitions and references than the "RS" method that I mentioned previously, probably to simplify its use with the CC65 compiler.


Quote
The goal here is just to have an assembler that does intelligent symbol resolution without the user having to concern themselves with raw addresses.  Believe it or not, this is surprisingly lacking for 65xx assemblers.

To the best of my knowledge, it's only lacking from the assemblers that you're familiar with.

From what I can see, CA65 is the current standards-bearer in 65xx assemblers.

I understand that you really don't like its linker syntax ... and I agree that it is a bit brutal at first look, but it provides the flexibility to create pretty-much any output ROM layout.

Providing template linker files for the common SNES ROM layouts seems like it would basically solve that complaint.

If you find that it's missing features that you feel that you need, then may I suggest that it might be more profitable to the 65xx programming community as a whole to attempt to extend CA65 before throwing all the toys out of the pram and starting from scratch.
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: jonk on May 21, 2016, 03:15:54 pm
I suggest that you read the 65816 specs here ... http://archive.6502.org/datasheets/wdc_w65c816s_aug_4_2008.pdf
That was a manual I hadn't yet found. Thanks for that.

This [ed: use of square brackets for indirection] was seen as a good idea by most programmers because it separated the syntax of expression-evaluation from indirection.
That actually sounds like a good call to me, too.

It [ed: CA65] does use a different syntax for structure definitions and references than the "RS" method that I mentioned previously, probably to simplify its use with the CC65 compiler.
I'm admittedly ignorant of CA65/CC65, but the fact that CA65 is and will be designed to work well with CC65 suggests that its goals and product direction may not entirely line up with ours. I believe Disch wants something that a neophyte can start learning to use without having to climb a high learning curve before even getting started. However, I can say that when I looked it over with the idea of simply downloading it and giving it to my son for his current DQ3 SNES work, I knew immediately that I'd have way too much trouble trying to bring him up to the point where he'd use it. In the end, I chose wisely. I found a very simple assembler with source code that was trivial to modify and hacked together a complete, easy to use tool for him. He bit at it, right away, and has only come back to me twice about it (once for the weird LDA: opcode and once for the LDAL opcode.) Other than that, he has found it very easy to apply. He is autistic and has difficulty engaging and navigating through "ambiguity." He is a VERY GOOD test of whether or not someone new to assemblers can just "get started" with a tool. He'd never used an assembler before this and he has even more requirements than many about a tool needing to be very simple to use in getting started.

(It's difficult to fully explain why I say this, but think of it this way: my son can sit down with me and do rather complex perturbation theory in orbital mechanics with ease. But I wouldn't trust him to take a bus across town, as he has little understanding of neuro-typical social norms and expectations. He could teach a university classroom. He could be a university student. Because the roles are well-defined and clear to him. But he would be in anguish if left to his own devices at a "party." So I'm pretty picky about the ease of use of the assembly tool and the learning curves involved, as I was trying hard and unsuccessfully to get him to use an assembler. He was just avoiding it and would not go there, instead hand-coding and using Lua as a patcher because he already knew Lua. This tool I modified for him broke through the barriers. So I not only completely agree with Disch about his goals for the assembler, but I personally feel them as important. And having looked over CC65/CA65 before, with an eye out for my son, I looked elsewhere almost right away.)

If you find that it's missing features that you feel that you need, then may I suggest that it might be more profitable to the 65xx programming community as a whole to attempt to extend CA65 before throwing all the toys out of the pram and starting from scratch.
If it is based on GNU, then there is quite a high wall for initial entry before one can get to the point of making such code modifications both well and appropriately. But again, that impression only comes from trying to work through gcc (not gas) some years back. I may be wrong on this point, too.
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: Disch on May 21, 2016, 03:43:44 pm
I suggest that you read the 65816 specs here ... http://archive.6502.org/datasheets/wdc_w65c816s_aug_4_2008.pdf

Thank you for this.  =)

Quote
Having the assembler automatically take advantage of direct page access is a part of the standard, and was definitely used in the SNASM assembler that I used at the time.
[snip]
From what I can see, CA65 is the current standards-bearer in 65xx assemblers.

CA65 lets you set a zero page segment, and automatically trims 1-byte addresses to use zero/direct page mode, but apart from that does not support automatic direct page detection as far as I can tell.  I can't find any directive that lets you tell it where DP is supposed to be.  Or even DBR for that matter.  And there's no possible way it could do this properly without such a directive.

Ref page:  http://www.cc65.org/doc/ca65-11.html#ss11.1
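For reference, the existing behavior I'm talking about looks roughly like this in ca65 source (a quick sketch from memory, so double-check it against the docs):
Code: [Select]
        .zeropage            ; ca65 shortcut for a zero-page sized segment
temp:   .res 2               ; 'temp' gets zero page address size

        .code
        lda temp             ; assembled with the short direct page form (2 bytes)
        lda $2118            ; assembled with the absolute form (3 bytes)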

This is a rather contrived example to keep the concept I'm trying to illustrate simple.... but I've found myself wanting to do things like this quite often.  If you can show me how this can be done on ca65, I would love to see it:

Code: [Select]
bigarray =  $0800       ; assume these are defined externally
snesreg =   $2118

;-------------
sep #$20                ; set data bank & tell assembler where it is
lda #^snesreg
pha
plb                     ; (plb, not pld: pld would set the direct page register, not the data bank)
#databank ^snesreg

;-------------
rep #$30
lda #bigarray           ; set direct page and tell assembler where it is
tcd
#directpage  bigarray

;-------------
ldx #0
:   lda bigarray        ; <- I want direct page without having to specify
    sta snesreg         ; <- I want absolute without having to specify
    inx
    inx
    bne :-

I've never seen any assembler capable of doing this.  If I'm wrong, please show me one.


Quote
I understand that you really don't like its linker syntax ... and I agree that it is a bit brutal at first look, but it provides the flexibility to create pretty-much any output ROM layout.

Can you do hot patching with it?  As in, if I want to modify an existing file and not create one from scratch.  Basically what makes xkas so useful.

I'm not dissing on ca65 just because I don't like the linker syntax.  It's a very good assembler and I've used it in the past (I even use it in my FF1 disassembly), but it falls short in more than one way.

Quote
If you find that it's missing features that you feel that you need, then may I suggest that it might be more profitable to the 65xx programming community as a whole to attempt to extend CA65 before throwing all the toys out of the pram and starting from scratch.

Where and how symbols are resolved, and where and how instruction sizes are determined, make all the difference here.  ca65 does the latter in the assembler, but the former in the linker.

Therefore by the fundamental way it's designed, it can never work the way I want it to because instruction sizes are determined before symbols are fully resolved.  For me to "fix" this in ca65, it'd be a core design change.  I'd basically have to rip the entire thing apart and rebuild it.

It's easier to just start from scratch.

And way more fun.
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: elmer on May 21, 2016, 04:18:07 pm
Ignorant of CA65/CC65, the fact that CA65 is and will be designed with the idea of working well with CC65 exposes the possibility of somewhat less than mutual goals in product directions/goals.

It's actually an easier syntax than the old "RS" method, and I suspect that you'd like it.

You basically just define a structure in a similar way to your PSP example, and then refer to structure members with a "::" syntax.

http://cc65.github.io/doc/ca65.html#structs
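Roughly like this, going by the linked docs (my sketch, so check the details):
Code: [Select]
        .struct Point        ; member names become offsets: xcoord = 0, ycoord = 2
            xcoord  .word
            ycoord  .word
        .endstruct

        lda somepoint + Point::ycoord   ; somepoint is whatever label you defined elsewhere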


Quote
If it is based on GNU then there is quite a high wall for initial entry; getting to the point of being effective in making such code modifications both well and appropriately. But again, that only comes from trying to go through gcc (not gas) some years back. I may be wrong on this point, too.

Nope, you're right IMHO ... the GNU codebase is a bit ... yuk! I've updated/added V810 processor support to GCC/binutils, and it wasn't a particularly pleasant experience.

CC65/CA65/LK65 have nothing to do with the GNU codebase.

CC65 is a descendant of Small C, and everything else is custom code that seems to have been put together quite nicely (IMHO).

It certainly didn't take me long to add the square-bracket support to the codebase.


Quote
I believe Disch wants something that a neophyte can start learning to use without having to climb a high learning curve before even getting started. However, I can say that when I looked it over with the idea of simply downloading it and giving it to my son for his current DQ3 SNES work, I knew immediately that I'd have way too much trouble trying to bring him up to the point where he'd use it.

I totally understand that you want something easy-to-use for ROM hacking, and that you've got special considerations in trying to produce something that your son will use.

CA65/LK65 does not support loading/overwriting a ROM image, and that certainly makes it unsuitable for your current needs.

Now, whether you guys choose to try to extend CA65/LK65 to make them easier-to-use, or whether you choose to modify something else, or start from scratch ... the one thing that I'd heartily recommend is that you follow existing standards as much as possible, and not just invent some whole new syntax that's going to make any code written in your dialect a total pain to reuse or to move to other standards-compliant assemblers.


May 21, 2016, 04:41:22 pm - (Auto Merged - Double Posts are not allowed before 7 days.)

This is a rather contrived example to keep the concept I'm trying to illustrate simple.... but I've found myself wanting to do things like this quite often.  If you can show me how this can be done on ca65, I would love to see it:

Code: [Select]
bigarray =  $0800       ; assume these are defined externally
snesreg =   $2118

;-------------
sep #$20                ; set data bank & tell assembler where it is
lda #^snesreg
pha
plb                     ; (plb, not pld: pld would set the direct page register, not the data bank)
#databank ^snesreg

;-------------
rep #$30
lda #bigarray           ; set direct page and tell assembler where it is
tcd
#directpage  bigarray

;-------------
ldx #0
:   lda bigarray        ; <- I want direct page without having to specify
    sta snesreg         ; <- I want absolute without having to specify
    inx
    inx
    bne :-

I've never seen any assembler capable of doing this.  If I'm wrong, please show me one.

Thank you for the example.

IIRC SNASM supported something similar ... but I don't have the software/documentation any more in order to verify that, so I can't prove it.

I see what you want ... and it's reasonable, and you're right, CA65 doesn't support it AFAIK.

It wouldn't be hard to add directives to CA65 to do that ... but given CA65/LK65's architecture, the labels would still need to be resolvable at assembly time, and not link time, in just the same way that the zero-page references work.


Quote
Therefore by the fundamental way it's designed, it can never work the way I want it to because instruction sizes are determined before symbols are fully resolved.  For me to "fix" this in ca65, it'd be a core design change.

You're right ... if your design-requirement is that the instruction-size can change during linking, and that's a must-have feature, then CA65 isn't going to do what you want, and you're going to have to write something new.

The cost/benefit/fun calculation is something that only you guys, as the prospective authors, can decide.
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: jonk on May 21, 2016, 04:48:46 pm
Nope, you're right IMHO ... the GNU codebase is a bit ... yuk! I've updated/added V810 processor support to GCC/binutils, and it wasn't a particularly pleasant experience.
Hehe. I'm quite experienced (I've done compilers, interpreters, assemblers, and linkers more than once in my life) and in this GNU case found it requiring more time than I had considered acceptable. So I backed off of the idea.

CC65/CA65/LK65 have nothing to do with the GNU codebase.

CC65 is a descendant of Small C, and everything else is custom code that seems to have been put together quite nicely (IMHO).
Ah. That goes back a ways!! Cripes.

I might have preferred lcc, perhaps, since there is a very nice book out on the topic, it's based on newer roots, there is some active support and additional compiler-compiler tools around that work with it, and it's perhaps even a little more complete.

But thanks for the clue. That helps!

I totally understand that you want something easy-to-use for ROM hacking, and that you've got special considerations in trying to produce something that your son will use.

CA65/LK65 does not support loading/overwriting a ROM image, and that certainly makes it unsuitable for your current needs.

Now, whether you guys choose to try to extend CA65/LK65 to make them easier-to-use, or whether you choose to modify something else, or start from scratch ... the one thing that I'd heartily recommend is that you follow existing standards as much as possible, and not just invent some whole new syntax that's going to make any code that's written in your dialect be a total pain to reuse or to move to other standards-compliant assemblers.
Well, Disch doesn't want or need my help. So I'm in kibitzing mode! Which I absolutely love! (It's a heck of a lot easier to be a gadfly.  ;)) But yes, where adequate and widely used existing standards exist, I'd say it's poor form not to at least know why you aren't following them, if you choose not to. I'll bet Disch follows up.

Title: Re: 65816: Direct Page vs Absolute Operand
Post by: elmer on May 22, 2016, 03:06:55 pm
Ah. That goes back a ways!! Cripes.

I may have rathered lcc, perhaps, since there is a very nice book out on the topic, it's based on newer roots, there is some active support and additional compiler-compiler tools around that work with it, and perhaps even a little more complete.

From what I can see, there aren't many C compilers for the 6502 available, and the ones that I've found are mostly Small-C descendants with (at best) ANSI C syntax added on. As you already know, C as-a-language really doesn't match well with the original 6502 architecture, and AFAIK CC65 and other 6502 C compilers produce pretty poor code.

But some homebrew folks prefer to avoid all-assembly and write very-heavily-massaged C code, with just some optimized assembly functions for the most time-critical parts.

I don't quite understand that, myself, because by the time that you've learned all the compiler-specific rules to optimize the C output, and have totally changed the way that you write your C code, then IMHO you might as well have just written the whole thing in assembly-language using a good library of macros.

****************

As has been pointed out ... CC65 doesn't generate any 65816-specific code AFAIK, it's just outputting source for the original 6502 code.

The advances in the 65816 make it much better suited to C, and WDC will sell (give?) you an ANSI C compiler for the 65816 (and for the 65C02 too!).




Title: Re: 65816: Direct Page vs Absolute Operand
Post by: jonk on May 22, 2016, 03:34:01 pm
From what I can see, there aren't many C compilers for the 6502 available, and the ones that I've found are mostly Small-C descendants with (at best) ANSI C syntax added on. As you already know, C as-a-language really doesn't match well with the original 6502 architecture, and AFAIK CC65 and other 6502 C compilers produce pretty poor code.
It's not a good fit, in the sense of having instructions that make writing an effective C compiler 'easier.' But of course, there are a lot of processors that aren't particularly good fits to C (in that sense), such as the Microchip PIC parts which are very 'bare-metal' kinds of processors with all the ugliness exposed into plain view and a tiny hardware stack for return addresses to boot. (Let alone the 8051 core.) Doesn't change the fact that most people writing embedded code for such processors still use C (mixed with some assembly.)

I taught CS courses at the largest 4yr university in Oregon, where C (at that time) was required for freshman and sophomore years. I taught computer architecture, assembly, operating systems, and concurrent programming classes back then. I can assure you that the students generally HATED the assembly class and wanted to get right back into C/C++ as soon as possible and to NEVER look back again on assembly coding. (Except for the 5 or 6 students in a class of 75 that were from the EE department -- those folks had no problems with assembly and wanted it.) So it is no mystery to me about using C, even on the PIC parts.

In response to lower costs in manufacturing, smaller feature sizes, and improving yields, the manufacturers are of course adapting to the reality of the now very large, lower tiers of programming skills that are cheaply and widely available to companies needing programmers. They are providing vast seas of flash memory on their processors and rapidly moving towards 32-bit cores and sophisticated memory management for general purpose operating systems and away from 8-bit cores (it's a little odd that 16-bit was largely skipped over, barring the MSP430, but there are some historical reasons there.)


But some homebrew folks prefer to avoid all-assembly and write very-heavily-massaged C code, with just some optimized assembly functions for the most time-critical parts.
That's not uncommon for embedded work. Companies almost insist on it, fearing "delays" threatened (and touted) by their employees who really aren't skilled at assembly coding, don't want to learn it, and remember their experiences in school with fear and trepidation. In any case, company management is probably wiser in choosing C/C++ w/asm because they can more readily find programmers able to use C/C++ and therefore pay less for them, besides. (Which means they can fire obnoxious asm programmers, too!)


I don't quite understand that, myself, because by the time that you've learned all the compiler-specific rules to optimize the C output, and have totally changed the way that you write your C code, then IMHO you might as well have just written the whole thing in assembly-language using a good library of macros.
That depends on a lot of factors. I tend to agree with the thrust of your writing, because I'm experienced with assembly (40+ years of it) and pretty much can write it as fast as I can think in any other language. But on the other hand, with the growth in availability of 32 bit cores -- ARM based and MIPS based -- and the need for general purpose operating systems (so that 'embedded' programmers don't really need to know anything special at all about embedded work and can just be dropped in from Windows or Linux development) and large codec and video and graphics and other libraries they can tap into without having to pay anything for them or lay hands on them.... they pretty much are forced into C/C++ by the sheer scale of the projects that are now possible with these new systems. So my opinion is a little nuanced here. I'd go different ways depending on the goals.


As has been pointed out ... CC65 doesn't generate any 65816-specific code AFAIK, it's just outputing source for the original 6502 code.
Yeah, someone here mentioned that news to me, earlier, and I retained it. Maybe you?


The advances in the 65816 make it much better suited to C, and WDC will sell (give?) you an ANSI C compiler for the 65816 (and for the 65C02 too!).
I have the WDC toolset loaded into my machine. It doesn't work well. Funny, in that it is dated from 2013, I think, which would seem to mean it is "current," sort of. Their C compiler, called WDC816CC.EXE, pops up an error message telling me "E0002 -- Could not get a license" and then goes on to tell me I don't have a product or group license for it. The assembler, WDC816AS.EXE, seems to work just fine. Their IDE is simply broken. You can't use it, at all. Keeps wanting a "version" to be given to it, but provides no way to do so that I'm aware of. I think this has to do with character set versions, looking around a bit, but I'm not sure. Regardless, I can't get the IDE working much at all. It actually crashes.
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: AWJ on May 22, 2016, 03:58:22 pm
The only difference that I remember in practice was that a lot of assemblers supported using square brackets instead of parentheses for indirection, i.e. ...

Code: [Select]
  lda [$01],y

instead of

Code: [Select]
  lda ($01),y

This was seen as a good idea by most programmers because it separated the syntax of expression-evaluation from indirection.

On the 65816, lda ($01),y and lda [$01],y are two different addressing modes. The first is indirect (dereferences a 16-bit pointer into the current DB), the second is indirect long (dereferences a 24-bit pointer).
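To make that concrete (the values here are invented):
Code: [Select]
; say direct page locations $01/$02/$03 hold $34, $12, $7E and Y = 5
  lda ($01),y    ; 16-bit pointer: reads from DBR:$1234 + 5
  lda [$01],y    ; 24-bit pointer: reads from $7E1234 + 5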
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: jonk on May 22, 2016, 04:40:22 pm
On the 65816, lda ($01),y and lda [$01],y are two different addressing modes. The first is indirect (dereferences a 16-bit pointer into the current DB), the second is indirect long (dereferences a 24-bit pointer).
There are all kinds of odd-ball addressing modes provided on the 65816 and the syntax used is kind of 'strained' to fit it.

I don't know how an assembler is supposed to handle these three, if external symbolics are used instead of a constant value. Worse, I'm not sure how an assembly programmer is supposed to be able to force an absolute indexed with X addressing mode if the absolute address is to be $0010. Or even accurately read source code using such symbols, given various assemblers and not knowing what some specific assembler does in each case. But here they are:
Code: [Select]
     LDA $10,X          ; direct page indexed with X
     LDA $1010,X        ; absolute indexed with X
     LDA $101010,X      ; absolute long indexed with X

The following example is inconsistent:
Code: [Select]
     LDA ($10,X)        ; direct page indexed indirect with X
     JMP ($10,X)        ; absolute indexed indirect with X
All the above shows is that there are two different meanings for the same operand syntax, depending upon the opcode. Ugly.

But these do make some sense:
Code: [Select]
     LDA ($10),Y        ; direct page indirect indexed with Y
     LDA [$10],Y        ; direct page indirect long indexed with Y
     LDA ($10)          ; direct page indirect
     LDA [$10]          ; direct page indirect long

Still, I think I'd be open to a clean re-design of the operand syntax for the 65816. And I certainly understand the desire to allow () in expressions without further confusing an assembler tool.
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: elmer on May 22, 2016, 05:26:33 pm
That depends on a lot of factors. I tend to agree with the thrust in your writing, because I'm experienced with assembly (40+ years of it) and pretty much can write it as fast as I can think in any other language. But on the other hand, with the growth in availability of 32 bit cores -- ARM based and MIPS based -- and the need for general purpose operating systems (so that 'embedded' programmers don't really need to know anything special at all about embedded work and can just be dropped in from Windows or Linux development) and large codex and video and graphics and other libraries they can tap into without having to pay anything for them or lay hands on them.... they pretty much are forced into C/C++ by the sheer scale of the projects that are now possible with these new systems. So my opinion is a little nuanced here. I'd go different ways depending on the goals.

Oh, I have no disagreement at all. C/C++ is faster to write in, and I have no complaint at all with people using it on architectures that provide a compatible environment for it (i.e. some 8-bit, and every 16-bit-or-higher architecture that I'm familiar with).

The 6502/65C02 just isn't one of those IMHO, unless you really, really, really tailor/limit your compiler even more than I've seen (so far) in practice.

David Wheeler's page on 6502 language implementation gathers together some interesting ideas on optimizing a 6502 compiler ...
http://www.dwheeler.com/6502/a-lang.txt

When you're already having to severely mangle your C code to make it semi-efficient on the platform, I don't see much downside in putting a few more restrictions on it in order to generate even-better code.

In particular, I love the idea of limiting C's stack to a split lo-byte/hi-byte fixed area of memory (2 256-byte absolute locations), and then automatically turning "large" local variable definitions into automatic heap allocations.


On the 65816, lda ($01),y and lda [$01],y are two different addressing modes. The first is indirect (dereferences a 16-bit pointer into the current DB), the second is indirect long (dereferences a 24-bit pointer).

Yep, my patch to add square-brackets to CA65 specifically documents not to use it in 65816 mode, for exactly that reason (because it hides the indirect long addressing mode from the assembler).

Using the square-brackets was only common on 6502 assemblers.
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: jonk on May 22, 2016, 05:37:42 pm
I suggest that you read the 65816 specs here ... http://archive.6502.org/datasheets/wdc_w65c816s_aug_4_2008.pdf
See pages 37-40 for the assembly language standards.
On page 17, section 3.5, it says "Words, arrays, records, or any data structures may span 64 KByte bank boundaries with no compromise in code efficiency."

But then, for example, it says in section 3.5.3, "With Absolute Indexed with X (a,x) addressing the second and third bytes of the instruction are added to the X Index Register to form the low order 16 bits of the effective address. The Data Bank Register contains the high order 8 bits of the effective address." Parsing those words carefully, I'd imagine that a data structure could NOT "span 64 KByte bank boundaries with no compromise in code efficiency" using this mode. Instead, it appears that only a 16-bit ALU add is used and that any carry is tossed away, not added to the temporary DBR value to form up a 24-bit address. However, also in section 3.5.3, they show the following:

Code: [Select]
[  DBR  ][ addrh ][ addrl ]
   +                  X Reg
---------------------------
          effective address
Which suggests to me the possibility of a carry out of addrh affecting the bank value of the effective address.

So, is section 3.5 insane? Or is it section 3.5.3?




May 22, 2016, 05:52:26 pm - (Auto Merged - Double Posts are not allowed before 7 days.)
That depends on a lot of factors. I tend to agree with the thrust in your writing, because I'm experienced with assembly (40+ years of it) and pretty much can write it as fast as I can think in any other language. But on the other hand, with the growth in availability of 32 bit cores -- ARM based and MIPS based -- and the need for general purpose operating systems (so that 'embedded' programmers don't really need to know anything special at all about embedded work and can just be dropped in from Windows or Linux development) and large codex and video and graphics and other libraries they can tap into without having to pay anything for them or lay hands on them.... they pretty much are forced into C/C++ by the sheer scale of the projects that are now possible with these new systems. So my opinion is a little nuanced here. I'd go different ways depending on the goals.
Oh, I have no disagreement at all. C/C++ is faster to write in, and I have no complaint at all with people using it on architectures that provide a compatible environment for it (i.e. some 8-bit, and every 16-bit-or-higher architecture that I'm familiar with).

The 6502/65C02 just isn't one of those IMHO, unless you really, really, really tailor/limit your compiler even more than I've seen (so far) in practice.

David Wheeler's page on 6502 language implementation gathers together some interesting ideas on optimizing a 6502 compiler ...
http://www.dwheeler.com/6502/a-lang.txt

When you're already having to severely mangle your C code to make it semi-efficient on the platform, I don't see much downside in putting a few more restrictions on it in order to generate even-better code.

In particular, I love the idea of limiting C's stack to a split lo-byte/hi-byte fixed area of memory (2 256-byte absolute locations), and then automatically turning "large" local variable definitions into automatic heap allocations.
David's page is really focused on the 6502. I've got my head into the 65816 right now. But I can see why it may be useful, if still more headaches ahead, to keep both contexts in mind. (Well, there is also the somewhat incompatible 65C02, as well. Why not really get a nice headache and include all three fully and completely?)

I'm not entirely sure about your thoughts of using heap. I initially read your words (before seeing the 'automatic heap' part) as meaning that one might do call stack analysis with the idea of placing local variables in static memory (in the sense used when discussing C variable lifetimes) found on bank 0. In this sense, they'd just be static. Not heap allocated. The C compiler would have to trace out the call tree, of course. But that is already done with 8051 C compilers where there is only 128 bytes or 256 bytes within the CPU itself and the technology is understood pretty well. (Recursion, of course, is an issue.) But I suppose you mean that there would be a single heap allocation used to place the static region and that after that occurs, it would be treated as static. Is that about right? Or did I miss something important?

I note that David's page includes a short comment about CC65's -static-locals option. Is it different than what you propose?


On the 65816, lda ($01),y and lda [$01],y are two different addressing modes. The first is indirect (dereferences a 16-bit pointer into the current DB), the second is indirect long (dereferences a 24-bit pointer).
Yep, my patch to add square-brackets to CA65 specifically documents not to use it in 65816 mode, for exactly that reason (because it hides the indirect long addressing mode from the assembler).

Using the square-brackets was only common on 6502 assemblers.
Does CA65 follow the WDC manual on these?
Code: [Select]
     LDA ($10,X)        ; direct page indexed indirect with X
     JMP ($10,X)        ; absolute indexed indirect with X
That really looks inconsistent to me and just begs for a syntax adjustment that clarifies what's going on in the two cases.
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: elmer on May 22, 2016, 11:16:45 pm
There are all kinds of odd-ball addressing modes provided on the 65816 and the syntax used is kind of 'strained' to fit it.

I don't know how an assembler is supposed to handle these three, if external symbolics are used instead of a constant value. Worse, I'm not sure how an assembly programmer is supposed to be able to force an absolute indexed with X addressing mode if the absolute address is to be $0010. Or even accurately read source code using such symbols, given various assemblers and not knowing what some specific assembler does in each case. But here they are:
Code: [Select]
     LDA $10,X          ; direct page indexed with X
     LDA $1010,X        ; absolute indexed with X
     LDA $101010,X      ; absolute long indexed with X

Well, according to my reading of the standards, "LDA unknownvar,X" is going to assemble to a 2-byte address.

If the programmer wants something else, they have to use an override.

To force an absolute indexed addressing mode when the assembler already knows that the high-byte is zero, you're also going to need an override ...

LDA |$0010,X

So, there are clear and simple rules ... they just don't always look nice, nor are they always obvious.
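If I'm remembering the table right, the full set of override prefixes looks like this ("sym" standing in for whatever symbol or expression you have):
Code: [Select]
     lda <sym,x     ; force direct page (1-byte operand)
     lda |sym,x     ; force absolute (2-byte operand); some assemblers use ! instead
     lda >sym,x     ; force absolute long (3-byte operand)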


On page 17, section 3.5, it says "Words, arrays, records, or any data structures may span 64 KByte bank boundaries with no compromise in code efficiency."

But then, for example, it says in section 3.5.3, "With Absolute Indexed with X (a,x) addressing the second and third bytes of the instruction are added to the X Index Register to form the low order 16 bits of the effective address. The Data Bank Register contains the high order 8 bits of the effective address." Parsing those words carefully, I'd imagine that a data structure could NOT "span 64 KByte bank boundaries with no compromise in code efficiency" using this mode. Instead, it appears that only a 16-bit ALU add is used and that any carry is tossed away, not added to the temporary DBR value to form up a 24-bit address. However, also in section 3.5.3, they show the following:

Code: [Select]
[  DBR  ][ addrh ][ addrl ]
   +                  X Reg
---------------------------
          effective address
Which suggests to me the possibility of a carry out of addrh affecting the bank value of the effective address.

So, is section 3.5 insane? Or is it section 3.5.3?

My old data sheet matches your section 3.5.3, and not the marketing-bulls*t in section 3.5.

The illustration is basically the same, but perhaps a tiny bit less ambiguous ...

Code: [Select]
|  DBR  | addrh | addrl |
       +|       | X Reg |
-------------------------
        effective address

So, I'm calling BS on section 3.5.


Quote
David's page is really focused on the 6502. I've got my head into the 65816 right now. But I can see why it may be useful, if still more headaches ahead, to keep both contexts in mind. (Well, there is also the somewhat incompatible 65C02, as well. Why not really get a nice headache and include all three fully and completely?)

Yeah, we're coming at this from different perspectives.

I really don't care about the extra stuff in the 65816. There are already a couple of 65816 C compilers, like the WDC one ... and I believe that that one is supposed to be descended from whatever compiler was used  (by those few people that used it) on the SNES.

I'm much more interested in the 6502-variants, particularly in the HuC6280 that's in the PC Engine, which is basically a 65C02 with bank mapping and a couple of extra instructions.


Quote
I'm not entirely sure about your thoughts of using heap. I initially read your words (before seeing the 'automatic heap' part) as meaning that one might do call stack analysis with the idea of placing local variables in static memory (in the sense used when discussing C variable lifetimes) found on bank 0. In this sense, they'd just be static. Not heap allocated. The C compiler would have to trace out the call tree, of course.

David Wheeler refers to the same idea, and IMHO, that would be the gold-standard to aim for in a 6502 C compiler for performance.

I don't even know if it would be possible to retrofit that into one of the existing 6502 C compilers, or if you'd just be better-off starting from scratch.

Either way ... it's way above my interest-level and capability to work on myself.


Quote
But I suppose you mean that there would be a single heap allocation used to place the static region and that after that occurs, it would be treated as static. Is that about right? Or did I miss something important?

Nope, I'm talking about a quick-and-easy hack to an existing 6502 C compiler in an attempt to improve 80% of the parameter/local stack handling, at the expense of using dynamic heap-allocation for what would normally be stack-based local variables if they are deemed "too big".

The idea being to keep the regular C parameter/local stack addressable as "absaddr,X" instead of "[cstack],Y". It would make access to the C stack a lot quicker, but limited to 256 entries (byte, word or long).

That's "nasty" for a general-purpose compiler, but IMHO, would be acceptible for writing games on a console.


Quote
I note that David's page includes a short comment about CC65's -static-locals option. Is it different than what you propose?

Yes, that option basically just turns every local variable into a "static" variable with no attempt to optimize the usage with a call-tree trace in the way that you were thinking about above.

My cheap-and-nasty proposal is another one of the ideas on David's page, but it keeps the stack concept and tries to improve its performance.

The ideas aren't necessarily exclusive ... I can imagine scenarios when you might want to use both techniques.


Quote
Does CA65 follow the WDC manual on these?
Code: [Select]
     LDA ($10,X)        ; direct page indexed indirect with X
     JMP ($10,X)        ; absolute indexed indirect with X
That really looks inconsistent to me and just begs for a syntax adjustment that clarifies what's going on in the two cases.

Looking at the "ea65.c" source file, it certainly seems to be following the WDC manual.

"Yes", it's inconsistant, and the JMP is nasty anyway because it's a 16-bit absolute index into bank 0 instead of the DBR.

Unless you actually know and understand the weird kinks in the 65816 instruction set, I doubt that you're going to be able to understand the darned code anyway, even with a "[]" or a "{}" or whatever you choose to put in there.

I don't have much love for WDC or for the 65816.

IMHO, the 6809 was a much better example of how to extend/expand an existing architecture (the 6800) than the 65816 was.

Heck, I'd even prefer Intel's expansion of the 8080 into the 8086!
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: jonk on May 23, 2016, 01:48:49 am
There are all kinds of odd-ball addressing modes provided on the 65816 and the syntax used is kind of 'strained' to fit it.

I don't know how an assembler is supposed to handle these three, if external symbolics are used instead of a constant value. Worse, I'm not sure how an assembly programmer is supposed to be able to force an absolute indexed with X addressing mode if the absolute address is to be $0010. Or even accurately read source code using such symbols, given various assemblers and not knowing what some specific assembler does in each case. But here they are:
Code: [Select]
     LDA $10,X          ; direct page indexed with X
     LDA $1010,X        ; absolute indexed with X
     LDA $101010,X      ; absolute long indexed with X
Well, according to my reading of the standards, "LDA unknownvar,X" is going to assemble to a 2-byte address.
So if it is an external symbolic, it's going to be treated as a 2-byte absolute indexed with X mode and nothing else?

Arbitrary.

I'd like to see something more like one of these for the direct page mode:
Code: [Select]
     LDA DP:$10,X       ; direct page indexed with X
     LDA DP[$10],X      ; direct page indexed with X
     LDA [DP+$10],X     ; direct page indexed with X
Just something to point out that DP is involved, rather than having absolutely nothing there.

These two I can understand living with, allowing the linker to adjust them as needed and if needed:
Code: [Select]
     LDA $1010,X        ; absolute indexed with X
     LDA $101010,X      ; absolute long indexed with X
They really are similar enough that it doesn't grate on my nerves. But the direct page mode looking so similar, too? That seems almost criminal to me.



If the programmer wants something else, they have to use an override.

To force an absolute indexed addressing mode when the assembler already knows that the high-byte is zero, you're also going to need an override ...

LDA |$0010,X

So, there are clear and simple rules ... they just don't always look nice, nor are they always obvious.
Okay. I've seen a number of what I consider to be ill-considered parsing short-cuts made in these assemblers. It looks as though they ran their eyes over the keyboard and looked for some as yet unused character they could bend to some novel purpose of the moment. And that's about all the thinking it got, looks like to me. And different people's eyes went to different places on their keyboards, too. Or the same places, but their brains said different things to them for the same cheesy short-cut character. Oh, well.

By comparison, MASM/ML is a dream of craftsmanship.


My old data sheet matches your section 3.5.3, and not the marketing-bulls*t in section 3.5.

The illustration is basically the same, but perhaps a tiny bit less ambiguous ...

Code: [Select]
|  DBR  | addrh | addrl |
       +|       | X Reg |
-------------------------
        effective address

So, I'm calling BS on section 3.5.
Assuming Xreg is set to 16-bit, I gather you also don't believe they are doing a 16-bit + 16-bit ALU add together with a ripple carry into a temporary copy of the DBR. But instead, just doing a 16-bit + 16-bit ALU add, tossing away the carry, and passing on the actual DBR in the uppermost 8-bit lane of the EA. One less clock and a fair bit less logic, that way. However, if I now go look at wdc_65816_manual.pdf (about 470 pages long) and look at page 289, they most definitely say "The Data Bank Register is concatenated with the 16-bit Operand: the 24 bit result is added to X (16 bits if 65802/65816 native mode, x = 0; else 8)." And they show a nice little diagram showing what appears to me to allow beyond-bank addressing. Doesn't seem to make any bones about it -- it should be able to reach past the DBR bank.

Hmm. You need to let me know where to go find some nice simulator source code so I can go see what is done. Do you know of a good place to read, where the code is nicely organized and readable? If not, I suppose I should just go find something myself and load it down. My son is using bsnes-plus, so I suppose that's what I should grab up if you don't have a better way to go, off-hand. Thanks if you do!


Yeah, we're coming at this from different perspectives.

I really don't care about the extra stuff in the 65816. There are already a couple of 65816 C compilers, like the WDC one ... and I believe that that one is supposed to be descended from whatever compiler was used  (by those few people that used it) on the SNES.

I'm much more interested in the 6502-variants, particularly in the HuC6280 that's in the PC Engine, which is basically a 65C02 with bank mapping and a couple of extra instructions.
Assuming by "more interested in" you mean C compilers:

Well, so far as I can tell I can't actually use the WDC C compiler. Wants a license, it says. Not happening. Besides, doesn't it handle the 6502 and 65C02, already? So that if you are pointing there for "yet another C compiler already there for the 65816" aren't you also pointing there for "yet another C compiler already there for the 6502?"

Is there any more need for a 6502 C compiler than for a 65816 C compiler? Just curious, not offering. I'm not sure I am anywhere near wanting to consider writing a good C compiler project for the 6502 series now. A good one would be more than pedestrian work.

Of course, maybe you were talking about a different interest I didn't catch.


Nope, I'm talking about a quick-and-easy hack to an existing 6502 C compiler in an attempt to improve 80% of the parameter/local stack handling, at the expense of using dynamic heap-allocation for what would normally be stack-based local variables if they are deemed "too big".

The idea being to keep the regular C parameter/local stack addressable as "absaddr,X" instead of "[cstack],Y". It would make access to the C stack a lot quicker, but limited to 256 entries (byte, word or long).

That's "nasty" for a general-purpose compiler, but IMHO, would be acceptible for writing games on a console.
Ah, so you are talking about a C compiler. But just a "quick and easy" hack to one.

I'm just not at all interested in C for the 6502. It's such a small device and I'm perfectly happy with assembler for something like that. And I think you are, too, from some of your earlier comments. Ah, here it is:
Quote from: elmer
I don't quite understand that, myself, because by the time that you've learned all the compiler-specific rules to optimize the C output, and have totally changed the way that you write your C code, then IMHO you might as well have just written the whole thing in assembly-language using a good library of macros.
That one. So you can't really be all that interested in C compilers, then.

Or am I missing something?


Yes, that option basically just turns every local variable into a "static" variable with no attempt to optimize the usage with a call-tree trace in the way that you were thinking about above.

My cheap-and-nasty proposal is another one of the ideas on David's page, but it keeps the stack concept and tries to improve its performance.

The ideas aren't necessarily exclusive ... I can imagine scenarios when you might want to use both techniques.
Agreed. I think I understand better now.


I don't have much love for WDC or for the 65816.
Hehe. Okay. WDC is a different story. But processors are 'just processors.' They don't come ugly or pretty to me. They just do stuff and I adjust the way I think to fit. I can dislike assembler syntax. But that's not about the architecture. I like the PIC10/PIC12/PIC14/PIC16/PIC18, for example, because I can trivially lay out RTL for it almost in my sleep. (Not counting the peripherals, I admit.) All the CPU bits are exposed. Some people hate that. I don't love it. But I understand it, respect it, and don't hate it. Same with most things. I really liked the PDP-11, though. That was a marvelous set of well-considered compromises to use a 16-bit wide instruction space. (Ignoring some of the weird memory mapped hardware bits tacked on.) The 36 bit PDP-10? ASCII was fun there? The 60-bit Cyber Star? Now that was different fun -- to get the CPU working I'd have to adjust the distance from the CPU to memory. hehe. Anyway, they are all good. I don't care that much, so long as good folks worked hard and produced a reasonably complete design.


IMHO, the 6809 was a much better example of how to extend/expand an existing architecture (the 6800) than the 65816 was.

Heck, I'd even prefer Intel's expansion of the 8080 into the 8086!
Intel's 8088 (still have some here) was a really nice innovation for some of us. Working on the 8080 (which was mostly obnoxious because of its need for phased clocks and three darned power rails) and later on the 8085 (much nicer, hardware wise, finally) using 7400 series registers to add banking to the memory system?? Now that was a pain. Switch banks and you had better have code sitting there in the right place in the new bank, because the CPU had no idea what was going on. The 8088 provided a large number of fine-grained overlapping segments, which was MUCH nicer by comparison. HORRIBLE if you wanted to write C compilers and assembler tools. But really nice by comparison with what served beforehand.

The PDP-11 really remains my favorite, though. I've fallen in love with the DEC VAX, the Mot 88k and the TI TMS9900 in their days. I've worked on VLIW. I worked hard and learned to love the MIPS R2000 in its day and I really like the MIPS M4k core used in the PIC32, too. And, of course, worked at Intel around the time of the BX chipset and the Pentium II/Pentium Pro. (I do like the design of going from segmented memory via GDT/LDT/IDT to the paging system and from there to the physical memory pins. And the front side bus transaction design is nice, too. All that is more ornery than pretty, though.)

So the PDP-11 is where my heart stays. The JSR alone was a marvelous idea and no one else does it like that, even to this day. Lots of other good ideas in there, with tight constraints to boot, and all of it done with a sense of the elegant I don't usually see in a CISC CPU design. (I do like the all-out RISC approaches that MIPS took and that DEC did with their Alpha. But the Alpha exception process was an absolute nightmare. They took MIPS' RISC approach and pushed it to the absolute limits and beyond reason. I did like their decision to not support byte lane changes, though. That RISC decision was right. But that exception system? Wow! Insanity for those writing the handlers.)

EDIT: Cripes. I'm too old. I just read back over the list above and worried I should cut the list down. Then I realized that I had been cutting it down and that it was actually a severely curtailed digest. So what's the point cutting more out? There are so many more I've worked on not mentioned above. They go back to mercury delay line memories. Who remembers those? Or the later 1k to 8k drum memories, where a few k-byte would be the size of a washing machine? Anybody living still remember hand-wiring core memory? Oh, well. I better go get my cane and look for some whipper-snapper I can club over the head when they tell me that their terabyte disks and 32-gig RAM systems aren't big enough for them.
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: AWJ on May 23, 2016, 11:25:57 am
Indexed addressing definitely can straddle banks on the 65816. The exception is indexed indirect jmp (jmp (jumptable,x)), where the pointer is fetched from the program bank rather than the data bank, and apparently it wraps within that bank (assuming bsnes is correct, and I believe its 65816 simulation has been extensively tested, down to edge cases like BCD arithmetic with illegal BCD values)
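In other words, for something like this (labels made up):
Code: [Select]
     ldx #$04              ; select the third 16-bit entry
     jmp (jumptable,x)     ; pointer fetched from PBR:(jumptable+X), wrapping within the program bank
jumptable:
     .word handler0        ; handlers defined elsewhere
     .word handler1
     .word handler2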
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: jonk on May 23, 2016, 12:13:25 pm
Indexed addressing definitely can straddle banks on the 65816. The exception is indexed indirect jmp (jmp (jumptable,x)), where the pointer is fetched from the program bank rather than the data bank, and apparently it wraps within that bank (assuming bsnes is correct, and I believe its 65816 simulation has been extensively tested, down to edge cases like BCD arithmetic with illegal BCD values)
Thanks AWJ. It does seem as though this is strictly for the data memory, not code memory. It's just that the documents don't have anywhere near the clarity that I see in the Intel documents or in Microchip documents. Or, for that matter, in pretty much anyone else's documents. Instead, I find all manner of mistakes in the documents from WDC, from simple visual mistakes and absolutely stupid and obvious "copying" mistakes to more subtle ones. The sheer number of errors is stunning, though. And uncorrected in all these years. Oh, well.

Thanks again!
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: elmer on May 23, 2016, 02:19:30 pm
They really are similar enough that it doesn't grate on my nerves. But the direct page mode looking so similar, too? That seems almost criminal to me.

I wouldn't disagree; I like my code readable, and I try to avoid too many deep-knowledge-required tricks (at least, those without detailed comments).


Quote
Assuming Xreg is set to 16-bit, I gather you also don't believe they are doing a 16-bit + 16-bit ALU add together with a ripple carry into a temporary copy of the DBR. But instead, just doing a 16-bit + 16-bit ALU add, tossing away the carry, and passing on the actual DBR in the uppermost 8-bit lane of the EA. One less clock and a fair bit less logic, that way.

Yep, that's what I was thinking ... and as AWJ pointed out, I was wrong.  :-[

I just took a look at the bsnes source that's embedded in Mednafen, and it's definitely doing a 24-bit add.
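For what it's worth, here's a minimal C sketch of the effective-address behaviour being discussed (my own illustration, not code lifted from bsnes, and the function names are made up): absolute indexed data accesses do the full 24-bit add, so the index can carry into the bank byte, while jmp (addr,x) adds the index with a 16-bit wrap and fetches its pointer from the program bank.

Code: [Select]
#include <stdint.h>
#include <stdio.h>

/* Sketch only: how the two cases differ, per the discussion above. */

/* lda $FFFE,x style data access: DBR:addr + X treated as one 24-bit value */
static uint32_t ea_abs_indexed(uint8_t dbr, uint16_t addr, uint16_t x)
{
    return (((uint32_t)dbr << 16) + addr + x) & 0xFFFFFF;
}

/* jmp ($FFFE,x): the pointer address wraps within the program bank (PBR) */
static uint32_t ea_jmp_indexed_indirect(uint8_t pbr, uint16_t addr, uint16_t x)
{
    uint16_t ptr = (uint16_t)(addr + x);              /* 16-bit wrap */
    return ((uint32_t)pbr << 16) | ptr;
}

int main(void)
{
    printf("%06X\n", (unsigned)ea_abs_indexed(0x7E, 0xFFFE, 0x0010));          /* 7F000E */
    printf("%06X\n", (unsigned)ea_jmp_indexed_indirect(0x80, 0xFFFE, 0x0010)); /* 80000E */
    return 0;
}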


Quote
Hmm. You need to let me know where to go find some nice simulator source code so I can go see what is done. Do you know of a good place to read, where the code is nicely organized and readable? If not, I suppose I should just go find something myself and load it down. My son is using bsnes-plus, so I suppose that's what I should grab up if you don't have a better way to go, off-hand. Thanks if you do!

Sorry, I don't know of any simulator except for the one on the WDC site that's built into the WDC Tools.

The bsnes source in Mednafen was easy to read ... but I've never looked at bsnes-plus, so I've no idea if it is any different.


Quote
Assuming by "more interested in" you mean C compilers:

Well, so far as I can tell I can't actually use the WDC C compiler. Wants a license, it says. Not happening. Besides, doesn't it handle the 6502 and 65C02, already? So that if you are pointing there for "yet another C compiler already there for the 65816" aren't you also pointing there for "yet another C compiler already there for the 6502?"

Yes, I was curious to see if the WDC compiler produced any better 6502 code than the open source compilers like CC65 and HuC.

I just downloaded the WDC C compiler, and I'm getting the same license error as you.

Perhaps the license is built into the TIDE editor/environment???

Whether it is or not ... I think that we'd both have the same complaint ... I'd want to choose an open source compiler if I was going to work on any changes/upgrades.

And I'm "more interested in the 6502 variants" both in C terms, and in general terms.


Quote
Is there any more need for a 6502 C compiler than for a 65816 C compiler? Just curious, not offering. I'm not sure I am anywhere near wanting to consider writing a good C compiler project for the 6502 series now. A good one would be more than pedestrian work.

Not really a huge need. AFAIK they're both really only of interest to modern homebrew programmers, and I lean towards the opinion that folks should probably be writing in assembly on those architectures in the same way that the games were originally developed.

IIRC, C wasn't in standard use until the 5th-generation machines (PlayStation/Saturn/3DO).

OTOH ... I am contemplating doing an arcade port to the PC Engine, and having a C compiler that didn't suck could certainly speed up the development of the parts of that port that didn't need to be time-critical.

As such, I'm drawn to the CC65/CA65 toolchain because it makes it so easy to switch between languages in the same project, and it's been put together very well (IMHO).


Quote
Of course, maybe you were talking about a different interest I didn't catch.

Ah, so you are talking about a C compiler. But just a "quick and easy" hack to one.

I'm just not at all interested in C for the 6502. It's such a small device and I'm perfectly happy with assembler for something like that. And I think you are, too, from some of your earlier comments. Ah, here it is: That one. So you can't really be all that interested in C compilers, then.

Or am I missing something?

Nope, you're basically right. I don't have the "passion" for creating a C compiler from scratch ... but I wouldn't object to putting in some work to implement a few simple improvements to one, if they'd help both me, and other developers, on the platforms that I care about.

For me, that's the 6502, and not the 65816.

Hahaha ... I really don't find a 7MHz 65C02-variant with 2.5M RAM (that's bytes, not bits) and a 650MB CD (for streaming code/data/audio) to be a particularly "small" device.  ;)


Quote
Hehe. Okay. WDC is a different story. But processors are 'just processors.' They don't come ugly or pretty to me. They just do stuff and I adjust the way I think to fit.

Ahhh ... I'm a little more critical and less forgiving.

I don't have anywhere near the breadth of experience that you have. It looks like you've been active at just the right time, and in just the right field, to have seen pretty-much the entire history of microprocessor development.

Of the dozen-or-so architectures that I have seen, some stand out as really well thought out, and some don't.

The expansion of the 8080 into the 8088 was a really nice piece of work. Yes, the limitations became apparent, and caused problems later ... but they seemed really well thought out for the situation that existed at the time.

I just took a look at the wikipedia page on the PDP-11, and yes, that JSR is a really nice idea!

The early RISC architectures are definitely interesting, and seemed like a huge change coming from CISC platforms like the 68000.

For some reason, I could never love the MIPS architecture, and Hitachi's SuperH (SH-2) is IMHO an abomination, but I recently found the NEC V810 RISC architecture (in the VirtualBoy and the PC-FX), and that seems really well thought out, to me.


Quote
So the PDP-11 is where my heart stays. The JSR alone was a marvelous idea and no one else does it like that, even to this day. Lots of other good ideas in there, with tight constraints to boot, and all of it done with a sense of the elegant I don't usually see in a CISC CPU design.

Hahaha, yep, I just took a look at the NatSemi 32000 series again with the recent release of an upgraded FPGA implementation (the M32632).

It's definitely an interesting classic-CISC processor design, but I don't think that anyone would ever use the word "elegant" in describing it.
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: jonk on May 23, 2016, 04:18:02 pm
I just took a look at the bsnes source that's embedded in Mednafen, and it's definitely doing a 24-bit add.
Yeah. This morning I also looked at the bsnes-plus source code, too. It's all in the ../cpu/core directory. Nice.

Sorry, I don't know of any simulator except for the one on the WDC site that's built into the WDC Tools.
The only emulators I recognize as an emulator are built on hardware. An emulator emulates hardware. Period. Everything else is a simulator to me. Microchip makes a simulator that runs in their MPLAB IDE. It simulates the cpu, peripherals, memory, etc. It's still a simulator, not an emulator. I have a Microchip emulator -- it has a nice little pod I can plug into a cpu socket. That's an emulator. When I worked at Intel, we had a huge box of FPGAs and out of that mess there was a nice cable and ... yup ... a pod that could fit into the CPU socket of a motherboard. And guess what? It ran the motherboard from the RTL (VHDL or Verilog plus floorplanning) from that huge FPGA box. That's also an emulator.

If it doesn't emulate a piece of hardware I can drop into other hardware, it's just a sim to me.

Someday I might change. But 40 years is a hard habit to break.

I just downloaded the WDC C compiler, and I'm getting the same license error as you.

Perhaps the license is built into the TIDE editor/environment???

Whether it is or not ... I think that we'd both have the same complaint ... I'd want to choose an open source compiler if I was going to work on any changes/upgrades.
Yup. So I'm writing off the WDC C compiler. I'm sure it would be a nightmare trying to get ahold of them, anyway. I'm sure they are at minimum staff, just barely enough to manage their IP and count the dollars coming in. I don't think they are proactive anymore, if they ever were. I think they just wait around, instead, and pocket the IP bucks.

And I'm "more interested in the 6502 variants" both in C terms, and in general terms.
The only experience I have, commercially, with 6502 variants is with the Seiko message watch some years ago. But this makes my point here: I don't know of anyone using the 6502 in a general-purpose, end-user-quantity situation. Is anyone doing homebrew 6502 boards anymore? I might have a few CPUs here, still. In a box somewhere. But I will probably never build anything with them. Does anyone? Are they still sold?

Seems to me it is more of a "rice cooker" style, one million unit order size kind of thing. That fits my Seiko experience, where they needed some very specific features added. You either hire your own ASIC designer and WDC provides you with the basics to work with, licensing your rights; or else you hire WDC and let them contract that out for you.

But I thought the 6502 was otherwise kind of dead to the hobby world. Just folks who can buy an old Apple II, an SNES, or a NES, or something like that.

Not much to hang a hat on, if considering C compiler efforts.

AFAIK they're both really only of interest to modern homebrew programmers, and I lean towards the opinion that folks should probably be writing in assembly on those architectures in the same way that the games were originally developed.
Agreed, of course. Except that I'm really curious if there are ANY homebrewers. Who wire-wraps anymore? It's not hard to get a board built, but you have to do layout and order a panel's worth. Could do it 'dead bug' I suppose. But is anyone doing 6502 homebrew?

IIRC, C wasn't in standard use until the 5th-generation machines (PlayStation/Saturn/3DO).
Your memory would be better than mine on this. I don't have a reason to dispute your comment, though. And it is consistent with what I think I know.

OTOH ... I am contemplating doing an arcade port to the PC Engine, and having a C compiler that didn't suck could certainly speed up the development of the parts of that port that didn't need to be time-critical.

As such, I'm drawn to the CC65/CA65 toolchain because it makes it so easy to switch between languages in the same project, and it's been put together very well (IMHO).
Now that's intriguing. What is the PC Engine? I'd like to see a description of it. Thanks...

I don't have the "passion" for creating a C compiler from scratch ... but I wouldn't object to putting in some work to implement a few simple improvements to one, if they'd help both me, and other developers, on the platforms that I care about.
I sometimes wonder which is easier, modifying someone else's mess or writing my own. Data structure design is so crucial to helping simplify the resulting code and make it more robust to future change. Given that I have a small bit of experience with parsing and compilers, it's often easier for me to craft one than to wade through bad initial design decisions and later, ugly, horrible grafting work to hack in functionality they should have considered before starting out.

But I get your point, too. You want to choose the least-time path, whatever that looks like. It's just that different people will see that least-time path differently, I suppose.

Hahaha ... I really don't find a 7MHz 65C02-variant with 2.5M RAM (that's bytes, not bits) and a 650MB CD (for streaming code/data/audio) to be a particularly "small" device.  ;)
Well, 2.5MB RAM and a CD means the system isn't "small." But the CPU is still small. It's a 1975 device and is probably some 3-4 thousand equivalent transistors (in CMOS it's all inverters and transmission gates, really, but who's counting?) If you made that in a current-tech Intel FAB it would probably (not counting pad outs and external drivers) work out to 1 micron by 1 micron in size! Smaller than a lot of bacteria. Too small to see by eye, even on an absolutely clean and polished silicon wafer.

It's so close to zero, you couldn't detect the difference.

7MHz? Cripes. That thing could be running at GHz if someone actually used a modern FAB on it. At 7MHz, it's probably being built in someone's clay oven, next to some pottery they are also making. The masks are probably hand-painted on the surface, etching done in a wash basin, and a polishing step with a scotch-brite pad. ;) (I've built small demo FABs in my garage using a nickel-plated chamber and water cooling, by the way.)

Ahhh ... I'm a little more critical and less forgiving.

Of the dozen-or-so architectures that I have seen, some stand out as really well thought out, and some don't.

The expansion of the 8080 into the 8088 was a really nice piece of work. Yes, the limitations became apparent, and caused problems later ... but they seemed really well thought out for the situation that existed at the time.
I like any decent design that I learn from. Most designs have thousands of constraints imposed on them and I'm often just impressed when I see how well the engineers navigated through them. It's often pretty remarkable. In most cases, I learn something new, too.

I just took a look at the wikipedia page on the PDP-11, and yes, that JSR is a really nice idea!
It's so useful -- especially for co-routines. Nothing like it in anything today. Too bad that state of the art wasn't remembered and/or retained in at least some newer designs.

The early RISC architectures are definitely interesting, and seemed like a huge change coming from CISC platforms like the 68000.

For some reason, I could never love MIPS architecture,
I remember flying in to MIPS and seeing Dr. Hennessy there. He had a huge mural behind glass of the 68020 processor's die. And he'd start there, describing how much of that die was "wasted" by sequencing logic and control store. About 70%, as I recall. The rest, he'd say, was functional units and registers. But the 70% did nothing itself to add processing power. It was there just to make the instruction set "nice."

Motorola and Intel would be hide-bound before they'd sell any of their fancy FAB capacity to a competitor. (Their FABs were the most advanced and the most expensive.) MIPS could only buy "hand-me-down" FAB access, which meant roughly 150k transistor equivs when Intel and Motorola were fielding 4 million+ dies. So MIPS had to do, with 150k, what Intel and Mot were doing with millions. And MIPS DID! It was amazing to see.

MIPS stripped everything down. They had to, of course. They went straight to high clock rates, which meant high-quality caching of memory, separation of instruction and data caches, the shortest possible combinatorial logic chains, and as much pipelining as possible.

They didn't want to take a hit for a branch, either. Normally, the memory system is feeding the IR (instruction register) in a separate pipe. It doesn't know anything about branches. It just loads, loads, loads, etc. When a branch takes place, the IR is already loaded with the instruction after the branch and is already decoded. So now what? Toss it away and force a pipeline stall to wait while the target is re-loaded? No way. So MIPS said, nope. We execute that instruction regardless. You don't like that? Stick a NOP there. Hard luck. Besides, adding logic to handle the stall would insert something into the critical path and lengthen the clock period. Sorry. Not happening.

Same thing with register interlocks. Register reads heading to the ALU are done in parallel with register writes. If you write a register in a prior instruction, that write will actually still be in the pipeline when the next instruction reads the register. Normal folks add 'interlocks' and use these to stall the system to allow the write to occur (if they pipeline at all). Not MIPS. You get the old value, not the newly written one. Don't like that? Insert a NOP or find something else that is useful to do. The interlock adds logic, lengthens the combinatorial logic chain, and therefore may lengthen the cycle time.
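If that branch behavior sounds strange, here's a toy model of it in C (just a sketch to illustrate the idea, with an invented instruction set; it has nothing to do with real R2000 internals): the instruction sitting right after the branch has already been fetched, so it runs no matter what, and the redirect only takes effect one slot later.

Code: [Select]
#include <stdio.h>

/* Toy model of a branch delay slot.  The "program" and opcodes are invented. */
enum { OP_NOP, OP_ADDI, OP_BEQZ, OP_HALT };
typedef struct { int op, rd, rs, imm, target; } Insn;

int main(void)
{
    Insn prog[] = {
        { OP_ADDI, 1, 1, 5, 0 },   /* 0: r1 += 5                          */
        { OP_BEQZ, 0, 2, 0, 5 },   /* 1: if (r2 == 0) goto 5              */
        { OP_ADDI, 3, 3, 1, 0 },   /* 2: delay slot: executes anyway!     */
        { OP_ADDI, 4, 4, 9, 0 },   /* 3: skipped when the branch is taken */
        { OP_NOP,  0, 0, 0, 0 },   /* 4:                                  */
        { OP_HALT, 0, 0, 0, 0 },   /* 5:                                  */
    };
    int r[8] = { 0 };
    int pc = 0, next_pc = 1;       /* next_pc is the already-fetched slot */

    for (;;) {
        Insn i = prog[pc];
        int after = next_pc + 1;                 /* default: straight-line fetch  */
        if (i.op == OP_HALT) break;
        if (i.op == OP_ADDI) r[i.rd] = r[i.rs] + i.imm;
        if (i.op == OP_BEQZ && r[i.rs] == 0)
            after = i.target;                    /* redirect lands AFTER the slot */
        pc = next_pc;                            /* the fetched instruction runs  */
        next_pc = after;
    }
    printf("r1=%d r3=%d r4=%d\n", r[1], r[3], r[4]);  /* r1=5 r3=1 r4=0 */
    return 0;
}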

Lots of decisions like that. They were working with poor-man's FABs and had to compete with what they could access. And they did such a good job of it that it forced Intel to build RISC into their x86 family. (I was there, saw it happen. Intel was scared!) You have to respect what they achieved and how they achieved it.

and Hitachi's SuperH (SH-2) is IMHO and abomination, but I recently found the NEC V810 RISC architecture (in the VirtualBoy and the PC-FX), and that seems really well thought out, to me.
The Hitachi H8 was pretty interesting. I did do some work designing boards with it and programming it. They had a nice idea, too, of making it "EPROM" compatible so that you could drop the chips into a standard EPROM programmer. And the instruction set was "pretty."

Hahaha, yep, I just took a look at the NatSemi 32000 series again with the recent release of an upgraded FPGA implementation (the M32632).

It's definitely an interesting classic-CISC processor design, but I don't think that anyone would ever use the word "elegant" in describing it.
I'd forgotten the 32k. Now that was an odd lot of stuff. The timing controller... TCU... wow. I remember spending time studying that one.
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: STARWIN on May 23, 2016, 06:11:46 pm
The only emulators I recognize as an emulator are built on hardware.

I think the interpretation that has managed to establish itself makes more sense than yours. Hardware is just software with errors. If someone gives you a black box which emulates something in a certain configuration, by your definition you can't be certain whether it is an emulator or a simulator.

Regarding MIPS, for some tiny hacks it can be convenient to have a NOP around, and with fixed size instructions too. I wish there was a NOP after each instruction!
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: jonk on May 23, 2016, 06:24:27 pm
I think the interpretation that has managed to establish itself makes more sense than yours. Hardware is just software with errors. If someone gives you a black box which emulates something in a certain configuration, by your definition you can't be certain whether it is an emulator or a simulator.
I have 40 years of shared terminology usage from long experience with other designers and it's a habit I'm not breaking. It would require 'convoluted' thinking on my part and that leads to making internal mental mistakes. So I'm not debating this. I'm just sharing my meaning with others, in case it helps them understand my wording better. Elmer misunderstood what I was saying, so I explained myself further. I've no interest debating or arguing or trying to change your mind or anyone else's about your use of terms. I'm just not changing mine, either. Not yet, anyway.

Regarding MIPS, for some tiny hacks it can be convenient to have a NOP around, and with fixed size instructions too. I wish there was a NOP after each instruction!
The assembler did some nice "back-filling" work with their re-organizer. So it "helped" you out, if you wanted the help. I've seen NOPs used profusely. But I never bothered. It was a very simple processor to understand. By comparison, I did more than a decade of programming on the ADSP-21xx DSP processor family. You could do a read, an ALU op, and a write all in the exact same, one-cycle (50ns or so) instruction. I had a 1024-point complex-input, complex-output FFT that ran in well under 3ms on that beast. And it didn't support FP in the chip, by the way, so the FP was software with some barrel shifter hardware support. If you work on a processor like the ADSP-21xx for much time, you get really, really good at handling parallel pipelines and back-filling code all over the place to maximize speed and minimize code and data size. MIPS R2000 assembly was dead easy by comparison.
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: elmer on May 23, 2016, 06:31:05 pm
If it doesn't emulate a piece of hardware I can drop into other hardware, it's just a sim to me.

Hahaha, yep, we come from different job domains. I've only seen a couple of classic hardware-emulators for the CPU used in development over the years.

In my job we usually had much-simpler hardware-emulators for the system's ROM.


Quote
But this makes my point here: I don't know of anyone using the 6502 in a general-purpose, end-user-quantity situation. Is anyone doing homebrew 6502 boards anymore? I might have a few CPUs here, still. In a box somewhere. But I will probably never build anything with them. Does anyone? Are they still sold?

Ah ... our differences are showing again.

I've seen very little homebrew hardware over the years since the end of the 70s and the Altair and other card-based CP/M machines.

The last one that I saw was some Brazilian guy that was creating an MSX-clone machine from original Z80 parts (plus some modern logic).

"Homebrew" in the games and game-hacking world is just software-homebrew.

That's people writing complete new games/utilities for old games consoles/computers, rather than hacking an existing ROM to create a translation or to modify it.

And "yes", there are definitely quite a few folks doing that for various old machines that they love.

This just isn't really the forum where those kind of folks hang out.

From what I've seen, you're more likely to see them where the game players for their specific beloved-console hang out.

If you want to see people still using the 6502, you can look at places like nesdev.com or atariage.com, or, in my case, pcenginefx.com


Quote
Now that's intriguing. What is the PC Engine? I'd like to see a description of it.

It's a game console that came out in 1987, the 1st of the 4th-generation machines. It was the first game console with a CD-ROM drive available for it.

Over here in America it was called the Turbo Grafx ... and it failed pretty badly for a variety of reasons.

But in Japan, it knocked the NES off the top spot for a time, until the SNES came out, and it outsold the Sega MegaDrive/Genesis.

Not much respect for it here in America by the average gamer, but well-loved by the people that know it and play the games.

Ground-breaking at the time for having excellent RPG games with CD soundtracks and voice acting years before anyone else.


Quote
I sometimes wonder which is easier, modifying someone else's mess or writing my own. Data structure design is so crucial to helping simplify the resulting code and make it more robust to future change. Given that I have a small bit of experience with parsing and compilers, it's often easier for me to craft one than to wade through bad initial design decisions and later, ugly, horrible grafting work to hack in functionality they should have considered before starting out.

But I get your point, too. You want to choose the least-time path, whatever that looks like. It's just that different people will see that least-time path differently, I suppose.

Yep, that's exactly it. I don't have the background to quickly knock-up my own ANSI-C compatible compiler, so the quick alternative is to either improve someone else's, or just live with a good macro-assembler (which wouldn't bother me at all).


Quote
Motorola and Intel would be hide-bound before they'd sell any of their fancy FAB capacity to a competitor. (Their FABs were the most advanced and the most expensive.) MIPS could only buy "hand-me-down" FAB access, which meant roughly 150k transistor equivs when Intel and Motorola were fielding 4 million+ dies. So MIPS had to do, with 150k, what Intel and Mot were doing with millions. And MIPS DID! It was amazing to see.

Thanks for the background, that really does help put it all into perspective.

I guess that it must have been amazing to see from a front-row position.

But as an assembly-language programmer, all those nop-or-else delay rules were a royal PITA. Sure, the compiler could handle them easily, but compilers were pretty lousy in those days.

And if we're going to go with the perspective of the-little-engine-that-could, then I'd probably bring up the ARM processor as the product of a small company that managed to design a really powerful and pleasant RISC-like CPU.

BTW ... going back to the PDP-11's JSR instruction. Isn't that basically the CISC version of having the return address put into a link-register?

The NEC V850 (still in production by Renesas) lets you use a register to hold the return address with its JAL instruction.
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: jonk on May 23, 2016, 07:24:02 pm
I've seen very little homebrew hardware over the years since the end of the 70s and the Altair and other card-based CP/M machines.
That's too bad. I still do a lot of it. I wire-wrap, dead-bug, modify something I can buy cheap, or I go buy a panel and solder one up. Once the skills are there, you use them I suppose.

"Homebrew" in the games and game-hacking world is just software-homebrew.

That's people writing complete new games/utilities for old games consoles/computers, rather than hacking an existing ROM to create a translation or to modify it.

And "yes", there are definitely quite a few folks doing that for various old machines that they love.
Okay. That's a small stretch to wrap my mind around. But I can do it. I just need to figure out what someone (you) means when the word is written out, I suppose. Context. I'll struggle and remember that on this site 'homebrew' never means hardware and always means writing complete games instead of modifying old ones.

If you want to see people still using the 6502, you can look at places like nesdev.com or atariage.com, or, in my case, pcenginefx.com
I'll give those a look.

It's a game console that came out in 1987, the 1st of the 4th-generation machines. It was the first game console with a CD-ROM drive available for it.

Over here in America it was called the Turbo Grafx ... and it failed pretty badly for a variety of reasons.

But in Japan, it knocked the NES off the top spot for a time, until the SNES came out, and it outsold the Sega MegaDrive/Genesis.

Not much respect for it here in America by the average gamer, but well-loved by the people that know it and play the games.

Ground-breaking at the time for having excellent RPG games with CD soundtracks and voice acting years before anyone else.
Okay. So I may someday think about it when I find a circumstance providing motivation.

Yep, that's exactly it. I don't have the background to quickly knock-up my own ANSI-C compatible compiler, so the quick alternative is to either improve someone else's, or just live with a good macro-assembler (which wouldn't bother me at all).
They aren't all that hard. Most of the work goes into all the target-specific stuff and the bells and whistles everyone demands (like colored keywords in their editor, and mostly useless cr*p like that.) Some goes into serious, useful optimizations, though. The parsing stuff and, if you don't care about optimization at all, the code generation stuff is pretty darned easy. Boilerplate.

Thanks for the background, that really does help put it all into perspective.

I guess that it must have been amazing to see from a front-row position.
Intel was VERY smart. (Is, I suppose, though I think NVIDIA is giving them some heartburn in some ways today.) Their processors became their largest profit center in 1985. Around that time, MIPS was showing up on the scene. The 80386 used the 1960s Multics address translation scheme, almost verbatim; it had been very well designed long before Intel picked it up. But it was slow on the 80386.

Somewhat prior to that (step 1 in this story), folks had learned how to finesse the original IBM BIOS and to provide (except for the ROM BASIC) near-complete functionality (99.9% compatible) of the IBM with the Kaypro 286i. (The very first machine to get it almost entirely right.) The competition was at first simply about clock rates, as Intel slowly increased the speed of their 80286 devices upwards from around 6MHz towards 10, 12, and 16MHz. But there was a huge barrier -- the bus was synch'd to the CPU, so as the CPU rate went up, the bus rate also had to go up with it. But that "broke" the boards plugged into the system, as most of them couldn't go any faster than about 8.5MHz or so. (I know, I screwed around with that for a while.) So manufacturers needed to decouple the bus rate from the CPU rate.

Step 2 in the story is about figuring out how to do that. It took a boat-load of 7400-series chips, and the next spate of computer motherboards was a veritable sea of socketed 7400 parts. It was something to behold. But it worked. That was about the time that Chips&Tech formed, to help solve this problem with ASICs. C&T made chipsets which provided all of the decoupling required without all those individual logic chips. This greatly simplified design for motherboard manufacturers and there was another spurt of growth.

Step 3 was the period when C&T really grew and made buckets of money that Intel wanted back. That will get us to step 5. But also in step 3 was the moment when Intel 'discovered' that selling ICs to customers at K-mart was a LOT more profitable than selling them in thousands to some ornery engineers. Intel had gotten manufacturers to include co-processor chip sockets for the 80286 and 80386. But it was only around the time when the 80386 was out that they really started to sell through at places like K-mart. And the profits were like a drug -- Intel couldn't get enough of that.

Now step 4. Intel was loving the "sell ICs to stupid end-users for stupid money" mode and really came up with some serious scams with the 80486 family. They were still griping about C&T, too. But right now they were dialed into the idea of selling ICs, one at a time, to end-users who couldn't open their wallets fast enough. So they came up with the 80486SX and 80486DX. The 80486 yield wasn't perfect. Some of them didn't have decent FP. So they worked out how to turn that part off if it didn't 'yield', and to sell bad chips by the bucket. They encouraged motherboard makers to add an 80487SX socket to their motherboards, offered to sell 80486SX chips "on the cheap" to further encourage them, and really tried to make it harder for motherboard makers to consider a full-up 80486DX board. At the beginning of this cycle, the 80486DX boards were all you could get. But by the end, they were nearly unobtainium because Intel was so good at this. The end user, of course, would eventually "figure out" that they had been sold an 80486SX, that they really wanted the floating point added in, and were willing to go buy a chip. The funny thing? The 80487SX was just a rebonded 80486DX chip. In fact, all of the 80486 line was the SAME darned die. The 80486DX is a fully tested die. The 80486SX is a bad-FP-test die toggled to inhibit its FP unit. The 80487SX was just a rewired, fully tested 80486DX that lifted the 80486SX off the bus and just took over. Basically, Intel got people to buy dog poop to start and then sold them the real deal over the K-mart counter, one at a time, later. They bypassed those pesky engineers, too.
Meanwhile, Intel was busy with step 5, solving another problem: C&T. With the 80486, they started themselves into the C&T chipset business. It wouldn't happen all at once, of course, but gradually during step 6, they pretty much took over the chipset market. Step 6 was about getting rid of all those thousands of mom-and-pop motherboard manufacturers. So many of them meant lots of retail competition. That meant low manufacturing costs. Which meant low chip prices to Intel. Intel considered that a bad thing. So they needed to greatly reduce the motherboard manufacturers. The idea came with the PCI bus -- sold as a "green" bus idea (it's reflected wave instead of incident wave, and that actually is lower power.) Its real intent was to elevate the cost of equipment needed to make motherboards and add-in cards, though. An oscilloscope for a regular ISA bus would set you back a couple of grand. But for the PCI bus? You'd need to start at $100k each. And work up from there. It is a very expensive bus to design for. And that worked. It literally killed the mom-and-pop businesses. And reduced the manufacturer count dramatically, leaving only very well funded businesses in that market. This allowed chip prices to rise, etc. Another problem solved. (They still will tell you it was all about being green -- and that is the part-truth that makes the lie so much better.) Then there were more steps (graphics, etc.) But that gives a thumbnail. Intel uncovered each challenge to profit and solved it. They made some mistakes along the way (one of them cost $250 million in a single quarter buying back their own memory.) But they generally did smart things to muscle themselves into existing profitable 'claims' after others worked hard to prove out those claims and demonstrate value there. Intel would then march in and suck it up with a strategy in mind.

But as an assembly-language programmer, all those nop-or-else delay rules were a royal PITA. Sure, the compiler could handle them easily, but compilers were pretty lousy in those days.

And if we're going to go with the perspective of the-little-engine-that-could, then I'd probably bring up the ARM processor as the product of a small company that managed to design a really powerful and pleasant RISC-like CPU.
ARM is a fantastic success story!! It never was able to perform like some of the best RISC cases (power, speed, etc.) But it performed 'well-enough.' And their strategies were excellent. (Hmm. Reminds me now of SPARC, which also competed for a while.) Today, there is little like them. If you want multiple source CPUs, it is either x86 or else ARM. Everything else is single-source. Or close to it. The compilers for the x86 and ARM are as good as they get, too. No place better to go. I love ARM and have lots of development tools here for them, as well. (In circuit JTAG stuff, for example.)

BTW ... going back to the PDP-11's JSR instruction. Isn't that basically the CISC version of having the return address put into a link-register?

The NEC V850 (still in production by Renesas) lets you use a register to hold the return address with its JAL instruction.
Hmm. I admit I stepped by the V850. Probably the only processor I haven't touched! hehe. I'll have to go look.

JSR on the PDP-11, together with its very orthogonal addressing modes, provides more than just one or two completely separate concepts for subroutines, tasks, co-routines, and threading. I'll have to look at the V850 case to see if it can touch all of them. Might be. I wouldn't know.
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: oziphantom on June 13, 2016, 04:03:33 am
Sorry I've been away from Console land for a while, could have saved you some pain.
The problem is you are looking at Console tools and not Computer tools. SNES and NES land is rather barren compared to C64 land.

64Tass is what you want.

It has nestable structs (they stack), so you can do
Code: [Select]
sScoreValues .struct
digit65 .byte ?
digit43 .byte ?
digit21 .byte ?
.ends

and then later you have
Score .dstruct sScoreValues
HiScore .dstruct sScoreValues
or even
.block Scores
  Current .dstruct sScoreValues
  HiScore .dstruct sScoreValues
.endb

which you can access in code as

LDA Scores.Current.digit21

You might notice the ?s above. This tells the assembler that I want some space but it doesn't need to actually put it into the output. If I put a 0 there then it would make sure the output covered that area and it was initialised to 0.

It has sections
Code: [Select]
*       = $02
        .dsection zp   ;declare zeropage section
        .cerror * > $30, "Too many zeropage variables"

and then later if I want to add a variable for some code I can do

.section zp
PlayerHasCape .byte ?
.send zp

awardPlayerCape
   inc zp.PlayerHasCape
   rts

This way I can declare the variables used by some code, even in a different file, and have them put into the zero page/direct page without needing to worry about exactly where they are.

This becomes more useful when combined with .proc, which declares a procedure; if the assembler doesn't find any label references to it, then it will not be assembled. Using sections means its data needs are also not assembled.

You can do org and logical (which stacks), so if you want to put code at $8000 that is loaded into $7F2000 to be run in RAM, you can do this
Code: [Select]
* = $8000
.logical $2000
lda #1
jmp Ahead
lda #3
Ahead

.here
and Ahead will be $2008

It lets you do
Code: [Select]
.as, .xs   ; A and/or X/Y are 8-bit
.al, .xl   ; A and/or X/Y are 16-bit
.autsiz    ; track the rep/seps
.mansiz    ; ignore them and take your values
It has data bank and program bank control.
You can also set them to ?, which disables the corresponding addressing modes, forcing absolute addressing at all times.
You can also use the ,b and ,d addressing modes to force a size:
lda $8005,d will give you lda $05 regardless of whether the current bank is set to $80

There is a --shadow-check option that tells the assembler to warn if one label shadows another.

You can use @b, @w, or @l to force 8-, 16-, or 24-bit addressing, so
lda @w$0000
forces AD 00 00.
You can also use the ~ form, so lda $~0000

you can use .page .epage to error if code crosses a page and hence changes timings
you can use .align if you need it for speedcode so
Code: [Select]
.align $ff <- forces next instruction to be the first byte in a page

It has pseudo opcodes:
GEQ, GNE, GCC, GCS, GPL, GMI, GVC, GVS, GLT and GGE,
plus GRA for CPUs supporting BRA, which is expanded to BRL (if available) or JMP.
Each of these assembles to a Bxx instruction if the target is in range; otherwise the assembler inverts the condition and inserts a jump, so
GEQ Label with Label out of range becomes
BNE *+3
JMP Label

It has others too, like blt for bcc, lsr/lsl forms, etc.

The latest bleeding-edge version has optimisation support, in that the assembler will detect and warn about unnecessary opcodes, or places where a branch can be used to save a byte, etc.
Full details can be found here http://tass64.sourceforge.net/ (http://tass64.sourceforge.net/#commandline-options)
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: jonk on June 13, 2016, 01:36:00 pm
64Tass is what you want.
Glad to see your response. I'll take a serious look at the tool. Sounds very interesting! (I do note the .dpage and .databank pseudo-ops, as well.) I will definitely look at it.

I'm trying to consider a syntax to support a universal binary-file semantic regarding patching ROM files directly. I don't think I can reasonably do that for ROM files which are segmented in complex ways and support a variety of internal data structures. But I think I can do that for ROM files which are essentially binary streams, with the only imposed aspect being the association between file addresses and assumed memory addresses (without a requirement that memory addresses are unique, but instead that named assembled segments can be directed to binary file areas regardless of their assembled address locations.) Since the source is there, it could be modified. However, since it also outputs several format types, including Motorola S-records, it may be enough to simply add an external tool there.

EDIT: 2:45PM PT: Just noticed a thread question on tass64 about the use of LDA #-5, for example. The author suggests that the assembler gives an error for this syntax. Is that true?
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: oziphantom on June 14, 2016, 12:49:16 pm
Yeah, the #-5 is a pain, but it is due to it doing type checking. But you can make a custom function to handle it, like
neg .function a
       (256-a)
.endf

then do lda #neg(5)

I've done something like the patch recently. Use the nonlinear format (see the command-line opts), which gives you
XX XX YY YY AA BB CC ... XX XX YY ....
where XX XX is a 16-bit length, YY YY is a 16-bit start address, and AA BB CC ... are the bytes.
If you are in 65816 mode you get
XX XX YY YY YY AA BB CC
so you have a 24-bit address. Your code will need to be able to translate the mapper format of choice, so you need an address-to-ROM-file-offset converter and you're done.
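Roughly, the applying side ends up looking something like this (a sketch only, following the record layout above; addr_to_offset() is a stand-in for whatever mapper translation your ROM needs, and you'd want to double-check the exact end-of-file convention against the 64tass docs):

Code: [Select]
#include <stdint.h>
#include <stdio.h>

/* Placeholder: LoROM/HiROM/whatever address-to-file-offset mapping goes here. */
static long addr_to_offset(uint32_t addr)
{
    return (long)addr;
}

/* Apply records of: 16-bit length, then a 16-bit (or 24-bit in 65816 mode)
   little-endian start address, then that many bytes. */
int apply_nonlinear(FILE *patch, uint8_t *rom, long rom_size, int addr_bytes)
{
    for (;;) {
        int lo = fgetc(patch), hi = fgetc(patch);
        if (lo == EOF || hi == EOF) return 0;        /* no more records */
        unsigned len = (unsigned)lo | ((unsigned)hi << 8);

        uint32_t addr = 0;
        for (int i = 0; i < addr_bytes; i++) {       /* 2 bytes, or 3 for 65816 */
            int b = fgetc(patch);
            if (b == EOF) return -1;
            addr |= (uint32_t)b << (8 * i);
        }

        long off = addr_to_offset(addr);
        for (unsigned i = 0; i < len; i++) {         /* copy the record's bytes */
            int b = fgetc(patch);
            if (b == EOF || off < 0 || off + (long)i >= rom_size) return -1;
            rom[off + i] = (uint8_t)b;
        }
    }
}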
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: jonk on June 14, 2016, 01:05:37 pm
Yeah, the #-5 is a pain, but it is due to it doing type checking. But you can make a custom function to handle it, like
neg .function a
       (256-a)
.endf

then do lda #neg(5)
That will really be a bit of a typing pain if supplying comma separated negative values to a DB or something like that. Also, I'd have to already know that it will be negative. If I'm writing an expression, I may not know that right away, nor should I necessarily always have to.

Seems like something to fix. There is no good reason I can think of why the type checking can't also do 'promotions' from a signed value to its corresponding unsigned equivalent using the usual modulo definition found in C. (In C, there is always an equivalent unsigned value in the same size format for any signed value of the same size; but there is NOT always a conversion from an unsigned value to a signed value.)
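Just to spell out the modulo conversion I mean, in C terms (a trivial sketch):

Code: [Select]
#include <stdint.h>
#include <stdio.h>

/* Every signed byte value has a unique unsigned equivalent of the same width,
   so LDA #-5 could simply assemble as LDA #$FB. */
int main(void)
{
    int8_t  s = -5;
    uint8_t u = (uint8_t)s;                    /* (-5) mod 256 = 251 = 0xFB */
    printf("%d -> 0x%02X (%u)\n", s, u, u);    /* prints: -5 -> 0xFB (251)  */
    return 0;
}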

I've done something like the patch recently. Use the nonlinear format (see the command-line opts), which gives you
XX XX YY YY AA BB CC ... XX XX YY ....
where XX XX is a 16-bit length, YY YY is a 16-bit start address, and AA BB CC ... are the bytes.
If you are in 65816 mode you get
XX XX YY YY YY AA BB CC
so you have a 24-bit address. Your code will need to be able to translate the mapper format of choice, so you need an address-to-ROM-file-offset converter and you're done.
This I don't think I fully apprehend. I think you are talking about my ROM patching comment here and bringing in the different mapping formats for the binary file output. But I don't really understand the details you mention, probably because I'm still mostly ignorant about tass64. I'm sure it will clear up for me as I read more, when I get a moment of time.

Thanks, again.
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: oziphantom on June 15, 2016, 01:11:27 am
Oddly, it is only the # case, in that it says an immediate must be an unsigned char. .byte -5,-4,-3,-2,-1,0,1,2,3,4,5,6 is perfectly fine. Yes, I keep telling Soci (the maker) it's daft.

Ok given this code
Code: [Select]
;patch the loader checksum
* = $86b3
.byte $17

;patch the loader
* = $9db3
JMP $0200

;disable the PLAY button detection
* = $9da5
nop
nop
You would normally get a file like
Code: [Select]
B3 86 17 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00
.snip.
00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00
4C 00 02 00 00 00 00 00 00 00 00 00 00
etc

Doing a non linear file gets you a file with
Code: [Select]
01 00 B3 86 17 03 00 B3 9D 4C 00 02 02 00 A5 9D EA EA
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: jonk on June 15, 2016, 01:34:47 am
Oddly, it is only the # case, in that it says an immediate must be an unsigned char. .byte -5,-4,-3,-2,-1,0,1,2,3,4,5,6 is perfectly fine. Yes, I keep telling Soci (the maker) it's daft.
Yeah. Well, it's daft. It's going to make me read code and fix it and, if necessary, start a whole new web page with a new product just to be annoying about it. And he/she won't like the way I fix it, either! I'll make sure it is very ugly, but workable. So they'd better do it before I do!

Ok given this code
Code: [Select]
;patch the loader checksum
* = $86b3
.byte $17

;patch the loader
* = $9db3
JMP $0200

;disable the PLAY button detection
* = $9da5
nop
nop
You would normally get a file like
Code: [Select]
B3 86 17 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00
.snip.
00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00
4C 00 02 00 00 00 00 00 00 00 00 00 00
etc

Doing a non linear file gets you a file with
Code: [Select]
01 00 B3 86 17 03 00 B3 9D 4C 00 02 02 00 A5 9D EA EA
Thanks. I had already understood that before you wrote about it. I just hadn't clued in on your use of the phrase "non linear" as meaning what I already understood as something else. That was my confusion, and you've cleared it up. No problem.

I can use such files (or Mot S-records) with my own patcher program to update the ROM file, too. Or write a different one to convert one of several outputs of tass64 into a more standard rom update format used here on this site. Or modify tass64 to generate those same standard formats. Or... well, lots of possibilities, I guess. It's all good.
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: oziphantom on July 04, 2016, 02:44:08 am
Soci, the author of 64tass, has noted my #256-5 hack and it has been officially fixed as of r1200, so
LDA #-5 works,
and it will also fail on
LDA #+140
since signed is -128/+127.
Using .char will handle the "byte" case:
.char 1,-5,3
Title: Re: 65816: Direct Page vs Absolute Operand
Post by: jonk on July 04, 2016, 02:46:05 am
Soci, the author of 64tass, has noted my #256-5 hack and it has been officially fixed as of r1200, so
LDA #-5 works,
and it will also fail on
LDA #+140
since signed is -128/+127.
Using .char will handle the "byte" case:
.char 1,-5,3
Thanks for the note. I'll look at r1200.