Topic: 65816: Direct Page vs Absolute Operand

elmer

Re: 65816: Direct Page vs Absolute Operand
« Reply #20 on: May 19, 2016, 06:31:16 pm »
Code: [Select]
TimedLoop:
    lda foo, X
    sta bar
    nop
    inx
    bne TimedLoop

#if (current_pc & $FF00) != (TimedLoop & $FF00)
  #error "TimedLoop branch crossed page boundary.  Timing will be incorrect"
#endif

Here, you can't know the current PC until 'foo' and 'bar' are resolved and the sizes of those lda/sta instructions are determined.  That means the #if condition cannot be resolved at compile time, and certainly not on the first pass.

Any ideas how this problem can be addressed?

LTCG and multiple-passes to defer the problem until later could solve a lot of the problems ... but you'd still get the rare pathological cases that would break things and require an error message and an abort.
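
A minimal Python sketch of the multiple-pass idea applied to the TimedLoop example. Everything here is an assumption made for illustration: the item format, the function names, a direct page fixed at $0000, and only two instruction sizes (2 bytes for a direct-page operand, 3 for absolute; a real 65816 assembler also has the 4-byte long form). Each pass recomputes label addresses, shrinks any reference that now fits, and stops when nothing changes. Only then is the page check from the #if evaluated.

```python
def resolve_layout(items, origin, known):
    """Return (symbols, end_pc) once instruction sizes stop changing.

    items is a list of ("label", name), ("fixed", nbytes) or ("ref", symbol).
    A "ref" stands for an instruction such as lda/sta whose size depends on
    where its symbol lives: 2 bytes if the symbol is in $00-$FF, else 3.
    known maps symbols defined elsewhere to their addresses.
    """
    # Start pessimistic: every reference uses the wide form.
    wide = {i for i, (kind, _) in enumerate(items) if kind == "ref"}
    while True:
        symbols = dict(known)
        pc = origin
        for i, (kind, arg) in enumerate(items):
            if kind == "label":
                symbols[arg] = pc
            elif kind == "fixed":
                pc += arg
            else:
                pc += 3 if i in wide else 2
        # Shrinking only ever lowers addresses, so this loop terminates.
        shrink = {i for i in wide if symbols[items[i][1]] <= 0xFF}
        if not shrink:
            return symbols, pc
        wide -= shrink


def crosses_page(a, b):
    """True if two addresses lie in different 256-byte pages."""
    return (a & 0xFF00) != (b & 0xFF00)


TIMED_LOOP = [
    ("label", "TimedLoop"),
    ("ref", "foo"),   # lda foo, X
    ("ref", "bar"),   # sta bar
    ("fixed", 1),     # nop
    ("fixed", 1),     # inx
    ("fixed", 2),     # bne TimedLoop
]

symbols, end_pc = resolve_layout(TIMED_LOOP, 0x80F7, {"foo": 0x10, "bar": 0x20})
print(hex(end_pc), crosses_page(end_pc, symbols["TimedLoop"]))
```

With the origin chosen here, the answer to the #if flips depending on the sizes: the pessimistic first pass ends past the page boundary, while the settled layout does not. That is why the condition cannot be trusted before the sizes are final.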

The question is ... what is the practical end-user complaint that you are trying to solve?

The "classic" method of 16-bit addressing if the label is undetermined (24-bit addressing on the 65816), and 8-bit addressing if the label is already known to comply, is the practical solution that will never generate incorrect code (just sometimes sub-optimal).

Then you leave it up to the programmer to override the "safe" default with "<", ">" or "|", or with an attribute on the label declaration.

That's one way that the problem can be addressed, both quickly and easily.
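
A rough Python sketch of that "classic" rule, for illustration only. The function name and the symbol table are hypothetical, the direct page is assumed to sit at $0000, and the prefix meanings ("<" direct page, "|" absolute, ">" long) follow the common WDC convention, which is an assumption here rather than something stated in this thread.

```python
# Operand size in bytes forced by each override prefix (assumed meanings).
FORCED_SIZE = {"<": 1, "|": 2, ">": 3}


def operand_size(operand, symbols, default_unknown=3):
    """Pick an operand size in one pass, never guessing too small.

    symbols maps labels that are already defined to their addresses.
    default_unknown is the safe width for a forward reference
    (3 bytes on the 65816, 2 on a plain 6502).
    """
    if operand[0] in FORCED_SIZE:      # explicit programmer override wins
        return FORCED_SIZE[operand[0]]
    if operand not in symbols:         # forward reference: take the safe size
        return default_unknown
    addr = symbols[operand]
    if addr <= 0xFF:
        return 1
    if addr <= 0xFFFF:
        return 2
    return 3


known = {"zp_counter": 0x0012, "table": 0x8000}
print(operand_size("zp_counter", known))   # already known to fit in 8 bits
print(operand_size("later_label", known))  # undetermined, so the safe default
print(operand_size("<later_label", known)) # programmer overrides the default
```

The code generated this way is always correct but sometimes larger than necessary, which matches the trade-off described above.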

This isn't a linker issue ... unless you really want to attempt LTCG.

The linker is traditionally there for different reasons, just as jonk said. Partially for speed (on old systems), but also for code-separation and for keeping a large multi-developer project "sane".

jonk

« Reply #21 on: May 19, 2016, 07:08:08 pm »
In the following, I'll write mostly "as if" I'm recommending the idea of allowing some, very modest really, link-time ability to make reasonable choices for certain opcodes. (These could be offered to the linker in an abstract way, I suppose, by the assembler writer who "knows" what the options are and who can provide a list of options to consider at link-time, rather than have to build in specific 65816 opcode knowledge into the linker. How this is achieved seems a "design question" more than a problem, to me.) The fact that I'm speaking in that tone here should, in no way, be taken to say that I am pushing for link-time selections. I'm open, either way.

Quote
The question is ... what is the practical end-user complaint that you are trying to solve?

Most of the problems quickly resolve out, without some kind of "pitching back and forth." There are always "edge cases." (No pun intended, but then I don't know who would know the pun unless they understood the use of "code edge.") But those can be finessed either by examining all of the logical alternatives and selecting those that don't "oscillate" (usually that isn't too hard to do) or by just making a choice and generating an error, if that's all you are left with. Some of this can (and probably should) be done by the assembler during assemble-time, as there is no need to sweep problems over to the linker when the assembler already knows enough to resolve them. The linker would only need to deal with those where the assembler doesn't have enough perspective.

Quote
The "classic" method of 16-bit addressing if the label is undetermined (24-bit addressing on the 65816), and 8-bit addressing if the label is already known to comply, is the practical solution that will never generate incorrect code (just sometimes sub-optimal).

Yeah. That would be easy, I suppose. The assembler could, if it can't resolve the situation given the current assembly-time information, pick the worst-case option. (Slowest, biggest, most generalized.)

But honestly, I don't think this is a hard problem for a paired assembler/linker and I'm not worried that it will become one. So some issues could be deferred out until link-time. It's not that terrible, at all.

Quote
Then you leave it up to the programmer to override the "safe" default with "<", ">" or "|", or with an attribute on the label declaration.

That's one way that the problem can be addressed, both quickly and easily.

This isn't a linker issue ... unless you really want to attempt LTCG.

There is that. You can always leave it up to the programmer and just generate errors as appropriate. Nothing new there and it's not hard on a programmer.

Quote
The linker is traditionally there for different reasons, just as jonk said. Partially for speed (on old systems), but also for code-separation and for keeping a large multi-developer project "sane".

Yup. And I don't have any specific axe to grind here. If nothing fancy is done, I'm no worse off than before. If something is added that is nice, I'll take it.

There are times when I'd like the assembler to automatically recognize when a conditional branch can't reach its target and is able to replace it with a BRL and reverse-conditional branch around the BRL, for example. There are other times when I'd rather it wasn't that smart, too. Something perhaps I'd like to be able to turn ON or OFF? And if there is support in the linker when the assembler can't resolve things (a different question) then of course that adds more to consider when designing the list of options that an assembler presents to the linker (if you have the assembler do that and don't build in too much processor specific knowledge into the linker.)
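
A small Python sketch of that branch replacement, with an ON/OFF switch. The function name and the `auto_long` flag are hypothetical, and `pc` and `target` are assumed to be 16-bit offsets within the same bank. The opcode values are the standard 65816 encodings, where a conditional branch and its opposite differ only in bit 5.

```python
# Standard 65816 conditional branch opcodes.
BRANCH_OPCODES = {
    "BPL": 0x10, "BMI": 0x30, "BVC": 0x50, "BVS": 0x70,
    "BCC": 0x90, "BCS": 0xB0, "BNE": 0xD0, "BEQ": 0xF0,
}
BRL = 0x82  # branch long, 16-bit relative displacement


def encode_branch(mnemonic, pc, target, auto_long=True):
    """Encode a conditional branch located at pc that jumps to target.

    If the target is out of 8-bit range and auto_long is set, emit the
    reversed condition hopping over a BRL; otherwise raise an error.
    """
    opcode = BRANCH_OPCODES[mnemonic]
    disp = target - (pc + 2)           # relative to the next instruction
    if -128 <= disp <= 127:
        return bytes([opcode, disp & 0xFF])
    if not auto_long:
        raise ValueError(f"{mnemonic} at ${pc:04X} cannot reach ${target:04X}")
    inverted = opcode ^ 0x20           # BNE <-> BEQ, BCC <-> BCS, and so on
    long_disp = (target - (pc + 5)) & 0xFFFF   # BRL ends at pc + 5
    return bytes([inverted, 0x03, BRL, long_disp & 0xFF, long_disp >> 8])


print(encode_branch("BNE", 0x8000, 0x8010).hex(" "))
print(encode_branch("BNE", 0x8000, 0x8200).hex(" "))
```

The expanded form costs three extra bytes and changes the timing, which is one reason to keep the feature switchable.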

Which brings up something else....

Should a macro facility have necessary and sufficient feature support, that it might be fully possible to write macros to achieve the above instruction replacement strategy? There is no reason, in principle, that it couldn't be achieved. Macro expansions could have conditional tests which are deferred until the second pass, for example. But I need to think about this. And, well, since I'm just kibitzing and not really doing any work....  perhaps the implementer would just prefer to shoot the messenger here and get on with it.  ;)

An equal right to an opinion isn't a right to an equal opinion. -- 1995, me
Saying religion is the source of morality is like saying a squirrel is the source of acorns.  -- 2002, me

Disch

« Reply #22 on: May 19, 2016, 08:43:10 pm »
@jonk:  You are using a whole lot of "quotes" in your "post".   XD

Anyway...  Defaulting to 16-bit sizes is what I did in my original assembler.  In addition to that, to solve the #if problem, I had a requirement that evaluations passed to directives had to be immediately resolvable or else you get an error.  This also skirted around other problems that arose, like setting DirectPage to a label that was not defined yet.  But I'm worried that might be too restrictive.  I want this to "just work" without the programmer having to manually select the size of every instruction and/or worry about whether or not the assembler is choosing the optimal sizes.

Quote
There are times when I'd like the assembler to automatically recognize when a conditional branch can't reach its target and is able to replace it with a BRL and reverse-conditional branch around the BRL, for example.
[snip]
Should a macro facility have necessary and sufficient feature support, that it might be fully possible to write macros to achieve the above instruction replacement strategy?

I was going to say... I'd argue that shouldn't be an assembler feature and SHOULD be done with macros.

On a side note:  BRL is the dumbest and most useless instruction on the 65816.  It's basically just JMP, but takes an extra cycle.  I guess maybe it's useful for relocatable or self-modifying code?  Whatever.



Anyway I have an idea for how this could be accomplished, but it basically amounts to the linker doing a lot of the same work the assembler would have to do.  Like, to the point where they might as well be the same executable just with different commandline args.


Here's a question:

- Is it unreasonable to expect every source file to have an ORG before any binary output?

I would think this would be a safe assumption, but I can see someone creating library-like files that don't care where they're put.  But would those be assembled directly or would they be #included into another source file?

jonk

« Reply #23 on: May 20, 2016, 12:32:39 am »
Quote
Defaulting to 16-bit sizes is what I did in my original assembler.

;) Always ask yourself, "Am I doing this because it is the easy path? Or because it is the courageous path?"

I value the courageous choices in life!

Quote
In addition to that, to solve the #if problem, I had a requirement that evaluations passed to directives had to be immediately resolvable or else you get an error.

Hmm. Same question to ask yourself.

Quote
This also skirted around other problems that arose, like setting DirectPage to a label that was not defined yet. But I'm worried that might be too restrictive.

I think you are touching on one of the serious questions to be asking yourself. If an "ASSUME DP" directive (whatever you name it) is set to an external symbolic, you have no choice but to allow the linker to locate that symbol (or expression containing external symbols needing resolution) before you can resolve the quantity, exactly. And without the exact quantity, you cannot resolve the DP references in the instructions (like LDA) that may refer to a symbolic, which by definition is to be taken as relative to DP. (Or, if you automatically choose between two different LDA's, then to find out if you need to do so.) Of course, you didn't even address yourself to an external here, but to a "label that was not defined yet," which might be one that is in the same source file but found later (so you would need to wait until pass 2.) My example couldn't even be resolved in pass 2, but only at link time. Which is yet another question to ask.

Quote
I want this to "just work" without the programmer having to manually select the size of every instruction and/or worry about whether or not the assembler is choosing the optimal sizes.

This whole area is something that can easily fall into arguments over "matters of style." Better experienced programmers can argue this question into any corner you want and make it stick pretty well.

Should the assembler just do what you say and let the programmer make all the decisions, emitting errors to help guide the hand of the programmer?

Should the assembler do a fair job of "counting bytes" and "figuring out offsets" for the programmer, freeing the programmer from having to worry about such details?

But in regards to the original question I was asking, with regard to informing the assembler about the DP setting, things may get interesting.

Suppose you support structs. Suppose the assembly programmer has a subroutine they want to write that assumes that the DP points somewhere useful before being called. But it doesn't know (or care) exactly where. Instead, as it turns out, this subroutine does something interesting and fun with a palette structure. However, there are a dozen different palettes in use. This subroutine doesn't care which one is in use. It only cares that the DP register points to the palette structure you want it to examine before you call it. (This could be a palette structure or it could be an NPC structure or it could be a saved-game structure. It doesn't matter. Make up something you consider worthy of the example here.) Now, the assembler needs to be told the type of the structure that DP points at. But the assembler doesn't need to know the exact address contained in DP. Just that whatever the DP is pointing at happens to be something of this type. Now, DP-relative LDA instructions should be able to be generated by the assembler just fine without any need for fix-ups during link-time because the assembler knows all it needs to know in order to correctly generate the DP-relative LDA instruction.

So while the assembler may need to know what kind of thing the DP points at, it doesn't actually need to know the absolute value of the DP.  On the other hand, one might actually want to tell the assembler about the absolute address of the DP and not tell it about the type of the data items that proceed there, at all.

Should one be able to over-ride all this in the instruction operand itself? Should I be able to say to the assembler,

Code: [Select]
      LDA    ((struct X *) DP)->field1

And then have it figure out that field1 is at offset $12 relative to DP?

I don't know. You tell me?

I really don't like to waste my time sitting down with a piece of paper, working out DP-relative offsets. Worse, this in effect hard-codes these deltas. If I later decide to move the DP base somewhere else or if I decide to modify a structure there and add some more fields.... then I'm running around having either to modify a lot of instructions that I should never have had to bother with or else I have to go find my long list of EQU/= symbols and go hack that thing into shape so that the offsets are correctly stated, again. This is seriously bad. I really think the assembler needs to have some information about where DP is established and that the programmer should allow the assembler to figure out the offsets. The assembler is really good at bookkeeping details like that. The programmer isn't so good.

Let the assembler do what it is good at doing.

But once you open that door and walk through it, when do you stop?  Frankly, I see the need for a very good, high quality expression/operand analyzer.

Quote
I was going to say... I'd argue that shouldn't be an assembler feature and SHOULD be done with macros.

I anticipated that.

Quote
On a side note:  BRL is the dumbest and most useless instruction on the 65816.  It's basically just JMP, but takes an extra cycle.  I guess maybe it's useful for relocatable or self-modifying code?  Whatever.

Yeah. Mostly for PIC. You might want that if you are transferring code into RAM for execution. No, I don't know why. In the MSP430 from TI, you may need to do that if you are modifying your flash because the RAM still works fine when the flash is being written. So there, you'd need something like that. For the 65816 in the SNES? I don't know. Maybe I'll think about making something up that sounds really important. ;)

Quote
Anyway I have an idea for how this could be accomplished, but it basically amounts to the linker doing a lot of the same work the assembler would have to do.  Like, to the point where they might as well be the same executable just with different commandline args.

That would be bad. You don't want to bury knowledge in two different places. This is why I let slip the idea of having the assembler pass along a list of options for the linker to consider, done up in such a way that the linker doesn't actually need to know what it is doing. Just a thought for now.

Quote
Here's a question:

- Is it unreasonable to expect every source file to have an ORG before any binary output?

Of course it is. Relocatable code doesn't use ORG. In the Merlin32 system I modified a few weeks back, the ORG can occur either in the linker file OR in the source code OR in both. But if it is in the linker file and not in the assembly source code, then the source code is moved to the location indicated in the linker file. If the source includes an ORG (or more than one) then the linker file ORG merely sets a barrier so that all ORGs in the source file must be at or after that address. But it otherwise doesn't restrict the use.

Quote
I would think this would be a safe assumption, but I can see someone creating library-like files that don't care where they're put.  But would those be assembled directly or would they be #included into another source file?

I don't think it is safe. Here's why.

When I'm hacking ROM code, one of the first things I do is get rid of certain subroutines, replacing them with others I place in some 0xFF region I believe isn't used for anything. Doing so leaves "holes" in the code. I mark these holes for later use by other, shorter routines I might later write.

Suppose a 4Mb ExHiROM. Suppose I know that everything at the tail end, from $F40000 to $FFFFFF, is safe to use for new code. Suppose I tear out subroutines OLDX, OLDY, and OLDZ, located at $C40000 to $C400E7, $C41320 to $C41410, and $C72010 to $C72251, respectively. My new replacement routines for these three functions will be located somewhere in the $F40000 to $FFFFFF region, but I really don't care exactly where. It doesn't matter. But I do have to keep the address of OLDX at $C40000, OLDY at $C41320, and OLDZ at $C72010, because the rest of the ROM expects them to stay there and I don't intend wasting tons of time tracking down all of the calls to these functions. So I insert a small snippet (perhaps just a JML) of code at the beginning. This leaves me with three useful holes in the code that I may use for something else (moved data table, additional data table, other subroutines I write that are short, etc.) When I write my patches, I want to specify the start of the patch areas (in this case, there are four patch areas: OLDX, OLDY, OLDZ, and NEWCODE) to the linker. But the assembler shouldn't care, at all. So far as the assembler knows, I have four named code segments that are, each of them, fully relocatable. The assembler should not need to know anything about their location in memory. (Aside from the rule that the linker won't locate a named code segment so that it sprawls across a bank boundary.) Only the linker knows that. So my oldx_seg has a JML followed by some small subroutine or two; my oldy_seg has another JML plus some additional other small subroutines; my oldz_seg has yet another JML followed by still more personal subroutines; and newcode_seg has the replacement code for OLDX, OLDY, and OLDZ plus a bunch more library routines, tables, data, and other stuff I couldn't fit into the earlier, tiny holes.

Why should the assembler care about an ORG here?

Disch

« Reply #24 on: May 20, 2016, 01:56:00 am »
Quote
;) Always ask yourself, "Am I doing this because it is the easy path? Or because it is the courageous path?"

Well if you note, I said that's the way I did it in my OLD assembler ;P

Since this is a rewrite with a completely different approach, I'm hoping to improve upon all that.

Quote
So while the assembler may need to know what kind of thing the DP points at, it doesn't actually need to know the absolute value of the DP.  On the other hand, one might actually want to tell the assembler about the absolute address of the DP and not tell it about the type of the data items that proceed there, at all.

Should one be able to over-ride all this in the instruction operand itself?

This is a pretty easy scenario, IMO.

- Assembler keeps track of a DirectPage (and DataBank) value, which the coder can set any time with a #DP or similar directive
- EVERY operand to any instruction is ultimately going to evaluate to a number
- If the given number is accessible via the given DirectPage, use direct page mode.
-   Otherwise, if it's accessible with the given DataBank, use absolute mode
-   Otherwise, use long mode


Basic case:

Code: [Select]
lda #$0100
tcd
#dp $0100

lda $0105    ; A5 05
lda $0205    ; AD 05 02     (if assembler's DB is $00)
lda $0205    ; AF 05 02 00  (if assembler's DB is not $00)
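
The selection rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name is made up, only LDA is handled, addresses are plain 24-bit integers, and direct-page wrap-around at the top of bank 0 is ignored. The three opcodes are the standard 65816 encodings for LDA direct page, absolute, and long.

```python
LDA_DP, LDA_ABS, LDA_LONG = 0xA5, 0xAD, 0xAF


def encode_lda(addr, dp, db):
    """Encode LDA addr using the smallest mode that can reach it.

    addr is a 24-bit address, dp the assembler's assumed DirectPage value
    (16 bits, always in bank 0), db the assumed DataBank value (8 bits).
    """
    offset = addr - dp
    if 0 <= offset <= 0xFF:                 # reachable through direct page
        return bytes([LDA_DP, offset])
    lo, hi, bank = addr & 0xFF, (addr >> 8) & 0xFF, addr >> 16
    if bank == db:                          # reachable through the data bank
        return bytes([LDA_ABS, lo, hi])
    return bytes([LDA_LONG, lo, hi, bank])  # otherwise spell out all 24 bits


print(encode_lda(0x000105, dp=0x0100, db=0x00).hex(" "))
print(encode_lda(0x000205, dp=0x0100, db=0x00).hex(" "))
print(encode_lda(0x000205, dp=0x0100, db=0x7E).hex(" "))
```

The three calls reproduce the three cases in the listing above, in the same order.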


Want DP Relative code?  Set Assembler's DP to zero, but don't tcd a new value:
Code: [Select]
; doesn't matter what #dp is set to at this point
#push_dp    ; save it
#dp 0       ; Give a fake direct page of 0

lda $05     ; A5 05   --  DP relative
#pop_dp     ; restore previous dp


Quote
That would be bad. You don't want to bury knowledge in two different places. This is why I let slip the idea of having the assembler pass along a list of options for the linker to consider, done up in such a way that the linker doesn't actually need to know what it is doing. Just a thought for now.

Both the assembler and linker are going to have to do symbol resolution.  And really, symbol resolution is the bulk of the work... and is the only difficult part of writing an assembler.  Everything else is trivial if you know what the symbols are.

Linker has to do it (pretty much by definition of what the linker's job is) to resolve external symbols.... but you also want the assembler to do it for simple expressions so that object files are not unnecessarily huge and complicated.  Basically anything that CAN be resolved by the assembler should be, and anything that can't be should get passed to the linker.  I think you even said something like this earlier.

But this means both assembler and linker are doing the same thing.  And I don't want to put duplicate code in two different executables.  So I'm thinking this should just be one monolithic program, where you can assemble to object files with one option, link object files together with another option, or do both with a different option (which is what I want the default behavior to be anyway).

Quote
Of course it is [a bad idea to require ORG]. Relocatable code doesn't use ORG.
[big example snip]
Why should the assembler care about an ORG here?

So I'm just going to say it.  I hate the concept of segments.  They're overly complicated... and I have yet to see a good use for them.

They're the absolute biggest complaint I have with ca65.  I've seen SEVERAL people turn away from assemblers that use them because it's too difficult a concept for them to grasp, and they don't provide functionality that isn't already accomplished with directives.

Worse, all of this could be done much more simply with directives like ORG.

Your example is completely trivial to do with ORG, and honestly this is how I would imagine most people would approach this problem:

Code: [Select]
#org $C40000
JML newX
#pad $C400E7, $FF

#org $C41320
JML newY
#pad $C41410, $FF

#org $C72010
JML newZ
#pad $C72251


#org $F40000
newX:
    ...
   
newY:
    ...
   
newZ:
    ...


The higher level concept of having code segments is fine.  But ALL of it can be accomplished in code through use of directives.  I don't want to build that functionality directly into the linker.

ORG is just so simple and elegant.  Everybody understands it.  Segments are the exact opposite.  I really dislike them.
« Last Edit: May 20, 2016, 04:00:19 am by Disch »

jonk

« Reply #25 on: May 20, 2016, 06:43:39 am »
I agree about wanting to keep things so people can start out easy and don't have to learn complex, new ideas before they can use a tool. I also don't like it when people have to struggle a lot just to climb over some very high wall. I may disagree a little about not supporting useful ideas that people can grow into, as skills grow. It would be nice to come up with a way to allow that path for growth, if you can conceive of a way to do that. But I think there is enough to worry about, too. So let's drill in on the rest and drop this. I'm good with that.

That said, the Merlin32 system I've been modifying a bit also doesn't carry any concept about code segments. Yet, it does support relocation. A user can do exactly as you'd have it -- just ORG. That works and a user doesn't need to learn anything at all about segments to write good code. Yet, if they want to, they can avoid using ORG and use REL (relocatable code), instead. All they have to do is remove the ORG in their source code (which overrides things, but doesn't produce an error if they keep it) and place the ORG in their linker file that lists the asm file with an ASM statement. So, any programmer can start out with complete ignorance about segments and then, when they feel able, add a "special file" that says "ORG" and "ASM" to position their REL asm code, later. It's no more than creating a very short, very easy to understand second file; digging into their asm source code and changing an ORG to REL, and then re-assembling the result, allowing the linker file to control the location instead of the assembly source file. They can add several ASM for any given ORG, so that they can link together two or more ASMs to start at any ORG they want. Or they can completely ignore all this, never use a linker file, and just salt their code with ORG all over the place. It doesn't care. This is just as easy for someone to just go code stuff without any knowledge of segments. But it allows relocation added as a later after-thought, when someone is ready for the idea, with barely more than a line or two of change.
« Last Edit: May 20, 2016, 07:40:41 am by jonk »

Disch

« Reply #26 on: May 20, 2016, 10:04:28 am »
Quote
Yet, if they want to, they can avoid using ORG and use REL (relocatable code), instead. All they have to do is remove the ORG in their source code (which overrides things, but doesn't produce an error if they keep it) and place the ORG in their linker file that lists the asm file with an ASM statement.

Here's my question - and why I'm so hesitant on doing this.

What value does this have?  What functionality does this provide?  How does this make things any easier for the developer?

All of these things can be done in 2 lines of code by way of ORG and/or usage of macros.  Introducing segments into the linker just creates a whole new syntax the user has to learn to write the config files, and adds a bunch of redundant functionality which then creates problems when a conflict arises.  And it complicates the linker.

Literally every example I can imagine where segments would be useful could be just as easily (or more easily) solved using macros and assembler directives.



EDIT:

Code: [Select]
; segments.asm

#macro segment_newcode
  #offset $xxyyzz
  #org $F40000, $FFFFFF, $FF
#endmacro

Code: [Select]
; newcode.asm

!segment_newcode

  ; ... code here
« Last Edit: May 20, 2016, 10:12:31 am by Disch »

jonk

« Reply #27 on: May 20, 2016, 01:27:30 pm »
Like I said, there are more important issues and I don't want to squander your motivational energy here. It's just not worth it if it causes you to hold short on even a single other feature.

I'm going to be really happy with structures. So, for example, drawing from the old CP/M and DOS systems:
Code: [Select]
PSP             STRUCT

pspInt20        dw        1 DUP(?)
pspNextPara     dw        1 DUP(?)
                db        1 DUP(?)
pspDispatcher   db        5 DUP(?)
pspTermVector   dd        1 DUP(?)
pspCtrlCVector  dd        1 DUP(?)
pspCritVector   dd        1 DUP(?)
                dw       11 DUP(?)
pspEnvironment  dw        1 DUP(?)
                dw       23 DUP(?)
pspFCB_1        db       16 DUP(?)
pspFCB_2        db       16 DUP(?)
                dd        1 DUP(?)
pspCmdTailCnt   db        1 DUP(?)
pspCmdTailTxt   db      127 DUP(?)

PSP             ENDS

                .
                .
                .

                #org    $C40000

                .
                .
                .

                #assume DP:STRUCT PSP               
MyFunction      LDA     pspNextPara
                JML     ExternalFunction
                .
                .
                .

I used to place that into a SEGMENT AT in order to achieve a "structure." But if you support structures, then I get what I want there and that's one less reason for segments.
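
To show the bookkeeping the assembler would be doing, here is a short Python sketch that derives the field offsets from the PSP layout above. The table format and function name are invented for the example, and it assumes db/dw/dd mean 1, 2 and 4 bytes with no padding between fields.

```python
UNIT = {"db": 1, "dw": 2, "dd": 4}   # bytes per element, assumed unpadded

# (field name or None for an unnamed filler, element kind, element count)
PSP = [
    ("pspInt20", "dw", 1),
    ("pspNextPara", "dw", 1),
    (None, "db", 1),
    ("pspDispatcher", "db", 5),
    ("pspTermVector", "dd", 1),
    ("pspCtrlCVector", "dd", 1),
    ("pspCritVector", "dd", 1),
    (None, "dw", 11),
    ("pspEnvironment", "dw", 1),
    (None, "dw", 23),
    ("pspFCB_1", "db", 16),
    ("pspFCB_2", "db", 16),
    (None, "dd", 1),
    ("pspCmdTailCnt", "db", 1),
    ("pspCmdTailTxt", "db", 127),
]


def field_offsets(fields):
    """Return ({name: offset}, total_size) for a struct description."""
    offsets, pos = {}, 0
    for name, kind, count in fields:
        if name is not None:
            offsets[name] = pos
        pos += UNIT[kind] * count
    return offsets, pos


offsets, size = field_offsets(PSP)
# With DP assumed to point at a PSP, "LDA pspNextPara" is just LDA dp
# (opcode $A5) with the field offset as its one-byte operand.
print(bytes([0xA5, offsets["pspNextPara"]]).hex(" "), size)
```

Adding or moving a field only changes the table; every instruction that names a field picks up the new offset on the next assembly.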

So I really don't want to mess with your head over this. A linker is the right place to be slinging strings of bytes around, wholesale, not the assembler which hard-codes all this beforehand. I'd like to be able to define "segments" that I may at some later time choose to overlay on top in various ways, for example. I don't want to have to go modify my source code for that purpose. I shouldn't have to. On the other hand, if I'm careful about the design I can use your macro examples constructively and I don't want to spend too much time worrying over this.

Note that I do need a way to specify data storage that does NOT produce initialized data in any form, at all. It should ONLY advance the counter and it should NOT update the ROM file. I used the DUP(?) above to illustrate that point. In Merlin32, it's DS with a byte count operand, so that DW 1 DUP(?) in Merlin32 is written as DS 2. The ROM is NOT updated here. It's just "skipped over."
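
A tiny Python sketch of the difference between emitting bytes and merely reserving space. The class and method names are hypothetical, and for simplicity the position is a plain file offset rather than a mapped SNES address.

```python
class RomWriter:
    """Tracks an output position inside an existing ROM image."""

    def __init__(self, rom, offset=0):
        self.rom = bytearray(rom)
        self.offset = offset

    def emit(self, data):
        """Write initialized bytes into the ROM and advance the position."""
        end = self.offset + len(data)
        self.rom[self.offset:end] = data
        self.offset = end

    def reserve(self, count):
        """Advance the position only; existing ROM bytes stay untouched."""
        self.offset += count


writer = RomWriter(bytes([0xEA] * 8), offset=2)
writer.emit(b"\x01\x02")   # initialized data: patches the ROM
writer.reserve(2)          # like DS 2 or dw 1 DUP(?): skipped over
writer.emit(b"\x03")
print(writer.rom.hex(" "))
```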

I'd like a feature where I can read the ROM contents, apply them in a calculation, and then re-apply the newly calculated bytes into the ROM as a patch. I don't know if you care about that. But I'd rather ask for that than to worry about code and data segments.

Disch

« Reply #28 on: May 20, 2016, 02:00:40 pm »
Quote
Like I said, there are more important issues and I don't want to squander your motivational energy here.

Well I might be being overly aggressive -- I get like that unintentionally sometimes.  Sorry if that's how I come across -- I don't mean to be confrontational or anything  :thumbsup:

Really, if you can provide a realistic example of something segments can be used for that makes them easier to use than the other tools available, I'd love to see it, and I'm willing to consider it.  I just literally have never seen such an example in my entire life and so I can't fathom what good they are.  Every example I've seen made segments look more complicated and less useful than the alternatives.

Quote
I don't want to have to go modify my source code for that purpose. I shouldn't have to.

You have to modify something.  The only difference between my approach and yours is the extension of the file being modified.

- You're modifying a makefile or linker config file.
- I'm modifying a *.asm file.

If it's really a problem, just change the *.asm extension to *.cfg and imagine the syntax for my config files is strikingly similar to the syntax for macros  ;P



EDIT:

AHA!!

Okay I just thought of something that segments are good for!  ORG forces you to start at the given address every time you use it, whereas segments don't necessarily do that.  Multiple different blocks of code can have the same segment and the linker will fit them all into the given area without overlapping.  Whereas with my macro approach if you have two ORGs, each will try to overwrite the other!
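
A minimal Python sketch of that placement job, purely illustrative. The function name and block list are made up, blocks are placed first-come first-served, and the rule that a block must not straddle a 64K bank boundary is borrowed from jonk's earlier description rather than from any existing linker.

```python
def place_blocks(region_start, region_end, blocks):
    """Place (name, size) blocks back to back inside one region.

    region_end is inclusive. A block that would straddle a 64K bank
    boundary is bumped to the start of the next bank. Sizes are assumed
    to be between 1 and 0x10000 bytes.
    """
    placements = {}
    cursor = region_start
    for name, size in blocks:
        last = cursor + size - 1
        if (cursor >> 16) != (last >> 16):   # would cross into the next bank
            cursor = (last >> 16) << 16      # restart at that bank's base
            last = cursor + size - 1
        if last > region_end:
            raise ValueError(f"{name} does not fit in the region")
        placements[name] = cursor
        cursor = last + 1
    return placements


layout = place_blocks(0xF40000, 0xFFFFFF,
                      [("newX", 0x8000), ("newY", 0x9000), ("newZ", 0x20)])
for name, addr in layout.items():
    print(name, hex(addr))
```

Two source files that both ask for the same region simply add entries to the block list; neither has to know where the other ended, which is exactly what two ORGs to the same address cannot express.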


Okay -- that's worth supporting.  I'll work it in.

So bringing back to my original question... but modified....

- is it reasonable to expect every source file to have an ORG or SEGMENT directive before any output is generated?


Quote
Note that I do need a way to specify data storage that does NOT produce initialized data in any form, at all. It should ONLY advance the counter and it should NOT update the ROM file.

I did that in my old assembler with the '#var' directive.  In fact I think I already posted the snippet from the documentation on that here:
http://www.romhacking.net/forum/index.php/topic,21927.msg306540.html#msg306540


Quote
I'm going to be really happy with structures.

Yeah I'll have to think of reasonable syntax for this.  =)

Quote
I'd like a feature where I can read the ROM contents, apply them in a calculation, and then re-apply the newly calculated bytes into the ROM as a patch. I don't know if you care about that. But I'd rather ask for that than to worry about code and data segments.

Doesn't sound like a job for an assembler.  Sounds more like a job for a python script.
« Last Edit: May 20, 2016, 02:07:05 pm by Disch »

jonk

  • Sr. Member
  • ****
  • Posts: 273
    • View Profile
Re: 65816: Direct Page vs Absolute Operand
« Reply #29 on: May 20, 2016, 02:16:02 pm »
I'd like a feature where I can read the ROM contents, apply them in a calculation, and then re-apply the newly calculated bytes into the ROM as a patch. I don't know if you care about that. But I'd rather ask for that than to worry about code and data segments.
Doesn't sound like a job for an assembler.  Sounds more like a job for a python script.
Hehe. Okay. Just pushing. There are cases where I want to "update in-place" a pre-existing data structure, using a macro to do it. And, since you seemed to be so hot on the idea of making it really easy for people to just "use" the tool, I figured I'd mention it. Writing a Perl or Python script to do this, would put it almost certainly out of the reach of the more casual/inexperienced user. That would just be over the top for them to do. Worse, I can't even provide my source code as an example to teach others about it, because the learning curve and the required documentation writing would be WAY TOO MUCH for me to consider attempting, trying to teach someone all that. So it would both eliminate the use by casual users and it would eliminate my writing a web page on doing it. But for me, personally? I'm fine. So I'll just go away on that point. No problem.



May 20, 2016, 02:31:10 pm - (Auto Merged - Double Posts are not allowed before 7 days.)
I'm going to be really happy with structures. So, for example, drawing from the old CP/M and DOS systems:
Yeah I'll have to think of reasonable syntax for this.  =)
I can't wait! Note that structures involve not only the definition, but perhaps also being able to declare a reference to one in the #assume.

In case it isn't already clear, I think the #assume for DP should accept nothing (or equivalent) as valid. When I write subroutines which aren't supposed to make any assumptions, I do NOT want them accidentally picking up the #assume from some code elsewhere that set it earlier, from the assembler's perspective. I want to be able to specifically and clearly override any prior setting and to explicitly say "you know nothing about the DP, for purposes of the following statements, so I want you to flag warnings/errors when you think you are being asked to make assumptions here."

Personally, I also think I should be able to over-ride, on the asm statement itself (such as an LDA) how I want the DP interpreted. There are times where this makes sense to do.

I haven't heard, yet, a clear way to over-ride the operand type in cases where I don't want the assembler doing some kind of "hmm, what is the best way to code this? should I use a DP-relative, or absolute, absolute long, or...?" Are you still thinking about having the assembler make some intelligent decisions for the programmer? (I'd like that.) If so, is there a way to over-ride that behavior and be explicit if I want that?

I hadn't brought it up yet, but there is also the DBR ahead. I think the assembler needs to know where that is also set. I don't think the PBR is needed, though. So I kind of imagine the case where the assembler supports #assume for DBR and DP, and not for PBR. And... well, the X and Y, too. May as well do it. Especially the X and Y, in fact. If you need examples of why, I can lay a few out for you. But I suspect you already know of a few good cases, yourself.

(But as a clue regarding DBR, X, and Y, imagine the case where DBR:X or DBR:Y references a structure and where you want the 16-bit instruction operand available to indicate the offset into the structure here. And before you start saying, "What in the heck is DBR:X?" please recall that DBR can be considered part of either the instruction's operand or part of the X or Y value. The final addition within the processor produces a 24-bit result: the sum of a 24-bit value and a 16-bit value (assuming 16-bit index registers here). If I tell the assembler that my DBR:X references this structure correctly, then the assembler should be perfectly able to calculate the instruction operand value from that knowledge.)
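
To put numbers on that, here is a small Python sketch of the arithmetic as described in the paragraph above. It assumes 16-bit index registers; the function names, the structure address and the member offset are all invented for the example.

```python
def effective_address(dbr, operand, index):
    """24-bit address for absolute indexed addressing: (DBR:operand) + index."""
    base = ((dbr & 0xFF) << 16) | (operand & 0xFFFF)
    return (base + (index & 0xFFFF)) & 0xFFFFFF

def operand_for_member(member_offset):
    """If DBR:X already points at the structure, the operand is just the member offset."""
    return member_offset & 0xFFFF

# Hypothetical structure at $7E:2000, with the wanted member at offset 6.
dbr, x = 0x7E, 0x2000
addr = effective_address(dbr, operand_for_member(6), x)
print(hex(addr))
```

The point is that once the assembler has been told what DBR:X refers to, the operand falls out of the structure definition alone.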



May 20, 2016, 02:35:36 pm - (Auto Merged - Double Posts are not allowed before 7 days.)
- is it reasonable to expect every source file to have an ORG or SEGMENT directive before any output is generated?
Well, I suppose you could construct, by default, a default segment with a default address. But I don't really care. Either way you go, I'm fine with it. You need information and I don't think it is unreasonable to ask that it be provided to you, somehow. Making assumptions is just another way of pretty much ensuring someone ignorantly screws things up and has no clue why and no way to figure things out because there are no errors, no warnings, ... just nothing.
« Last Edit: May 20, 2016, 02:51:01 pm by jonk »
An equal right to an opinion isn't a right to an equal opinion. -- 1995, me
Saying religion is the source of morality is like saying a squirrel is the source of acorns.  -- 2002, me

Disch

  • Hero Member
  • *****
  • Posts: 2814
  • NES Junkie
    • View Profile
Re: 65816: Direct Page vs Absolute Operand
« Reply #30 on: May 20, 2016, 02:50:55 pm »
In case it isn't already clear, I think the #assume for DP should accept nothing (or equivalent) as valid.

Yeah that makes sense.  So when DP is 'nothing' the assembler just won't use direct page mode at all?

Quote
Personally, I also think I should be able to over-ride, on the asm statement itself (such as an LDA) how I want the DP interpreted. There are times where this makes sense to do.
[snip]
I haven't heard, yet, a clear way to over-ride the operand type

Agreed.  You'll always be able to override the assembler's decision and choose your own addressing mode explicitly.  I just don't want you to HAVE to do that.

Code: [Select]
lda.b   $xx     ; always direct page, no matter what
lda.w   $xxxx   ; always absolute, no matter what
lda.l   $xxxxxx ; always long, no matter what
lda     $xx     ; assembler chooses best

Quote
I hadn't brought it up yet, but there is also the DBR ahead.

Yeah DB is the same idea as DP and will be handled the exact same way.

PBR you don't need, as that is determined by the PC/org.

And A/X/Y sizes are absolutely necessary as they impact the size of immediate mode instructions.
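
A quick Python sketch of why the register sizes matter for immediates. The `reg_is_16bit` flag stands in for whatever state an #assume-style directive would track; it and the function name are assumptions for illustration only.

```python
# Immediate-mode opcodes for the three registers.
IMMEDIATE_OPCODES = {"lda": 0xA9, "ldx": 0xA2, "ldy": 0xA0}

def encode_immediate(mnemonic, value, reg_is_16bit):
    """Encode e.g. 'lda #value'; the operand is 1 or 2 bytes depending on register width."""
    opcode = IMMEDIATE_OPCODES[mnemonic]
    if reg_is_16bit:
        return bytes([opcode, value & 0xFF, (value >> 8) & 0xFF])
    if value > 0xFF:
        raise ValueError("value does not fit an 8-bit register")
    return bytes([opcode, value])

print(encode_immediate("lda", 0x12, reg_is_16bit=False).hex())   # 2-byte instruction
print(encode_immediate("lda", 0x1234, reg_is_16bit=True).hex())  # 3-byte instruction
```

If the assembler guesses the width wrong, every instruction after that point is decoded one byte off.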

jonk

  • Sr. Member
  • ****
  • Posts: 273
    • View Profile
Re: 65816: Direct Page vs Absolute Operand
« Reply #31 on: May 20, 2016, 02:52:17 pm »
We've been cross-editing here, so go back and re-read some of my modifications. I've expanded things a bit.



May 20, 2016, 03:06:47 pm - (Auto Merged - Double Posts are not allowed before 7 days.)
Code: [Select]
lda.b   $xx     ; always direct page, no matter what
lda.w   $xxxx   ; always absolute, no matter what
lda.l   $xxxxxx ; always long, no matter what
lda     $xx     ; assembler chooses best
So, what happens here?
Code: [Select]
    lda.b   ExtSym1[5].Member2
    lda.w   ExtSym1[5].Member2
    lda.l   ExtSym1[5].Member2
    lda     ExtSym1[5].Member2

In this case, ExtSym1 happens to be a table of structures. You know the size of each, so you can easily compute from the [5] where the base of the 6th structure is at, and from that how much extra to add to get at .Member2. (Or, you might want to force the code writer to say [5*SIZEOF MYSTRUCT] instead. Up to you?)

In the first case, the .b forces a DP-relative encoding. I get that. So in this case, you'd have to go look at the #assume for the DP?

In the second case, the .w forces an absolute encoding. I get that, too. So in this case, you would need to know the DBR to be able to compute the appropriate 16-bit operand.

In the third case, the .l forces an absolute long encoding. In this case, no #assumes are required at all.

So, what about the last case? Do you go through all your options here? Fastest? Smallest? (Is there a case where the two are different? I'm not sure.)

Note that in all cases the symbol is external.  ;D
« Last Edit: May 20, 2016, 03:29:42 pm by jonk »

Disch

  • Hero Member
  • *****
  • Posts: 2814
  • NES Junkie
    • View Profile
Re: 65816: Direct Page vs Absolute Operand
« Reply #32 on: May 20, 2016, 04:01:14 pm »
In this case, ExtSym1 happens to be a table of structures. You know the size of each, so you can easily compute from the [5] where the base of the 6th structure is at, and from that how much extra to add to get at .Member2. (Or, you might want to force the code writer to say [5*SIZEOF MYSTRUCT] instead. Up to you?)

Side note:  I wonder... is [] notation going to be useful for structs?  The given index would have to be a constant and couldn't be like the X index or anything.  So how often would you access an individual struct element outside of a loop?

Anyway to answer this specific question:

EVERYTHING gets resolved to a number.  After all symbols are defined and resolved, 'ExtSym1[5].Member2' will ultimately be reduced to a numerical value.  Only then will instruction size be determined.  (Unless it's overridden by .b/.w/.l suffixes)

Quote
In the second case, the .w forces an absolute encoding. I get that, too. So in this case, you would need to know the DBR to be able to compute the appropriate 16-bit operand.

DB would be ignored completely if the .w suffix is provided.  If I'm considering DB at all, there's no difference between the user giving the suffix and the assembler deciding it on its own.

I suppose I could check the given value to make sure it has the same bank as DB and give a warning if they don't match -- but I don't even think I should do that.  I see these suffixes like I see casts in C -- you're effectively telling the assembler "I know what I'm doing, just shut up and do it the way I tell you to".

Quote
In the first case, the .b forces a DP-relative encoding. I get that. So in this case, you'd have to go look at the #assume for the DP?

This is a more interesting question since (unlike DB), DP can affect lower bits.

Example:

Code: [Select]
#dp $0080

lda   $0080   ; A5 00
lda.b $0085   ; A5 05  ??? or A5 85  ???

Honestly I don't know what the "correct" solution is here.  I could make a case for either one.  What do you think?

Quote
So, what about the last case? Do you go through all your options here? Fastest? Smallest? (Is there a case where the two are different? I'm not sure.)

It ultimately boils down to "lda <some number>"

- Is <some number> on the current direct page?  If yes, use Direct Page mode.
- Otherwise, is <some number> on the current data bank?  If yes, use absolute mode.
- Otherwise, use long mode.
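
The three checks above could be sketched roughly like this in Python. This is an illustration under assumptions, not the actual assembler: `choose_mode` is a made-up name, `None` stands in for a DP or DB of 'nothing', and the direct page is assumed to sit in bank 0 and span 256 bytes from its base.

```python
def choose_mode(address, dp, dbr):
    """Pick the shortest addressing mode for a 24-bit address.

    dp  -- assumed direct page base (16-bit), or None for 'nothing'
    dbr -- assumed data bank (8-bit), or None if unknown
    """
    bank, offset = address >> 16, address & 0xFFFF
    # Direct page: bank 0, within 256 bytes of the base (wrapping inside the bank).
    if dp is not None and bank == 0 and ((offset - dp) & 0xFFFF) <= 0xFF:
        return "direct"
    if dbr is not None and bank == dbr:
        return "absolute"
    return "long"

print(choose_mode(0x000080, dp=0x0080, dbr=0x01))  # on the direct page
print(choose_mode(0x011234, dp=0x0080, dbr=0x01))  # in the current data bank
print(choose_mode(0x021234, dp=0x0080, dbr=0x01))  # anywhere else
```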

jonk

  • Sr. Member
  • ****
  • Posts: 273
    • View Profile
Re: 65816: Direct Page vs Absolute Operand
« Reply #33 on: May 20, 2016, 04:23:53 pm »
In this case, ExtSym1 happens to be a table of structures. You know the size of each, so you can easily compute from the [5] where the base of the 6th structure is at, and from that how much extra to add to get at .Member2. (Or, you might want to force the code writer to say [5*SIZEOF MYSTRUCT] instead. Up to you?)
Side note:  I wonder... is [] notation going to be useful for structs?  The given index would have to be a constant and couldn't be like the X index or anything.  So how often would you access an individual struct element outside of a loop?
Very often, given what I've seen already. For example, there is a list of identical structures in a large table in DQ3 (more than one of these, too.) However, they are directly referenced in the code, pointing straight at #6 for example in order to describe the gold status display structure. Each one is the exact same structure. But they don't index into the table with any kind of search. They know, by index ID value, exactly which one they want. They are simply collected together for convenience, I suppose. As far as how they use them, they just use them like this:
Code: [Select]
#define GOLDDISPLAY TABLE[5]
      .
      .
      .
      LDA    GOLDDISPLAY.Member
So I'd like to be able to do that conveniently.

Anyway to answer this specific question:

EVERYTHING gets resolved to a number.  After all symbols are defined and resolved, 'ExtSym1[5].Member2' will ultimately be reduced to a numerical value.  Only then will instruction size be determined.  (Unless it's overridden by .b/.w/.l suffixes)
Yeah. I get that. If it gets resolved in the linker, so be it. For that, I'd suggest at least considering the idea of letting the assembler spell out the exact list of cases in order to hide that knowledge into the assembler and to avoid forcing the linker to know too much of the same stuff. The "fix up" record can include a list of options, where appropriate. This is kind of like how REST works with http. But that's another story.

I suppose I could check the given value to make sure it has the same bank as DB and give a warning if they don't match -- but I don't even think I should do that.  I see these suffixes like I see casts in C -- you're effectively telling the assembler "I know what I'm doing, just shut up and do it the way I tell you to".
Yes, agreed. (I was previously thinking about x86 segments, which overlap, and this was dumb of me. I need to go back to the WDC documentation and make sure there aren't any odd-ball corner cases, though. Worth a look-see.)

This is a more interesting question since (unlike DB), DP can affect lower bits.

Example:

Code: [Select]
#dp $0080

lda   $0080   ; A5 00
lda.b $0085   ; A5 05  ??? or A5 85  ???

Honestly I don't know what the "correct" solution is here.  I could make a case for either one.  What do you think?

It ultimately boils down to "lda <some number>"

- Is <some number> on the current direct page?  If yes, use Direct Page mode.
- Otherwise, is <some number> on the current data bank?  If yes, use absolute mode.
- Otherwise, use long mode.
If the DP is "in view" with the assembler (isn't 'nothing') then the DP value should be used. I see your point about the question of the over-ride, though. If you are forcing the assembler to "be stupid," does that mean that the assembler should revert to 'nothing' for DP?

I think it would be better that you always use DP, if it is in view. Period. Regardless of the instruction over-ride, itself. In the case that DP is 'nothing' then I'd use the actual value given by the programmer. So:

Code: [Select]
#dp $0080
lda   $0080   ; A5 00
lda.b $0085   ; A5 05
#dp nothing
lda.b $0085   ; A5 85
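
Spelled out as a Python sketch (names invented for the example, `None` standing in for '#dp nothing'), the rule in the listing above is:

```python
def encode_lda_direct(address, dp):
    """Encode 'lda.b address' (opcode A5) under the rule proposed above.

    dp is the assumed direct page base, or None for '#dp nothing'.
    """
    if dp is None:
        operand = address & 0xFF           # take the programmer's value as given
    else:
        operand = (address - dp) & 0xFF    # offset relative to the direct page base
    return bytes([0xA5, operand])

print(encode_lda_direct(0x0085, dp=0x0080).hex())  # DP in view
print(encode_lda_direct(0x0085, dp=None).hex())    # DP is 'nothing'
```

This sketch silently wraps when the address is outside the page; whether that deserves a warning is a separate question.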
« Last Edit: May 20, 2016, 04:46:01 pm by jonk »

Disch

  • Hero Member
  • *****
  • Posts: 2814
  • NES Junkie
    • View Profile
Re: 65816: Direct Page vs Absolute Operand
« Reply #34 on: May 20, 2016, 04:54:03 pm »
Very often, given what I've seen already. [snip] So I'd like to be able to do that conveniently.

Fair enough  =)

Quote
I don't see how that works. Without knowing what DBR (I'm using WDC notation here) is, you can't compute the 16-bit absolute address relative to it. And it is ALWAYS relative to the DBR. The DBR is pre-pended to the 16-bit absolute address.

DBR sets bits 16-23
Absolute mode records bits 0-15

If the user is forcing absolute mode with the .w suffix, then DBR literally doesn't matter because bits 16-23 are not going to be included in the assembled output anyway.

Ex:

Code: [Select]
#dbr $01

lda   $010000  ; AD 00 00
lda.w $010000  ; AD 00 00
lda.w $020000  ; AD 00 00  (bank is wrong, sure, but .w is forcing absolute)
lda.w $xx0000  ; AD 00 00  (no value for 'xx' can change that.  Bank byte doesn't matter here)


Quote
It's not just a check, so far as I can see. You NEED the DBR value to correctly compute the 16-bit absolute address relative to the DBR.

Nope.  You just mask out the high bits.  'AND $FFFF'
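
As a minimal Python sketch of that masking (the function name is made up; AD is the absolute-mode LDA opcode used in the listing above):

```python
def encode_lda_absolute(address):
    """Encode 'lda.w address' (opcode AD): keep bits 0-15, drop the bank byte."""
    operand = address & 0xFFFF
    return bytes([0xAD, operand & 0xFF, operand >> 8])

# The bank byte never reaches the output, whatever it is.
for addr in (0x010000, 0x020000, 0x7E1234):
    print(f"${addr:06X} -> {encode_lda_absolute(addr).hex(' ').upper()}")
```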

Quote
I think it would be better that you always use DP, if it is in view. Period.

I can go for that.  Maybe a 4th suffix can be added later if "extremely stupid" mode is really desired.

But really this seems like a very unlikely edge case and probably isn't worth worrying about.

jonk

  • Sr. Member
  • ****
  • Posts: 273
    • View Profile
Re: 65816: Direct Page vs Absolute Operand
« Reply #35 on: May 20, 2016, 05:04:00 pm »
Yeah. I made a mistake about the DBR. Updated my response, too late I see. I agree with you on that topic. So put that to bed. I'm going to walk through EVERY SINGLE one of the addressing modes now and make sure I didn't miss something important.

elmer

  • Full Member
  • ***
  • Posts: 122
    • View Profile
Re: 65816: Direct Page vs Absolute Operand
« Reply #36 on: May 21, 2016, 12:46:19 pm »
I really don't like to waste my time sitting down with a piece of paper, working out DP-relative offsets. Worse, this in effect hard-codes these deltas. If I later decide to move the DP base somewhere else or if I decide to modify a structure there and add some more fields.... then I'm running around having either to modify a lot of instructions that I should never have had to bother with or else I have to go find my long list of EQU/= symbols and go hack that thing into shape so that the offsets are correctly stated, again. This is seriously bad.

I'm going to be really happy with structures. So, for example, drawing from the old CP/M and DOS systems:
Code: [Select]
PSP             STRUCT

pspInt20        dw        1 DUP(?)
pspNextPara     dw        1 DUP(?)
...
pspCmdTailCnt   db        1 DUP(?)
pspCmdTailTxt   db      127 DUP(?)

PSP             ENDS

...
Note that I do need a way to specify data storage that does NOT produce initialized data in any form, at all. It should ONLY advance the counter and it should NOT update the ROM file. I used the DUP(?) above to illustrate that point. In Merlin32, it's DS with a byte count operand, so that DW 1 DUP(?) in Merlin32 is written as DS 2. The ROM is NOT updated here. It's just "skipped over."

Structure definition was traditionally handled with the RS directives (RSSET, RB, RW, RD, RS). That stops the programmer from having to count byte offsets.

Code: [Select]
               rsset 0
pspInt20       rw    1
pspNextPara    rw    1
...
pspCmdTailCnt  rb    1
pspCmdTailTxt  rb    127
PSP_SIZE       rb    0

Then, to define an array you can do ...

Code: [Select]
               rsset 0
PSP_0          rb    PSP_SIZE
PSP_1          rb    PSP_SIZE
PSP_2          rb    PSP_SIZE
PSP_3          rb    PSP_SIZE
PSP_4          rb    PSP_SIZE
PSP_5          rb    PSP_SIZE

At that point, your ExtSym1[5].Member2 example becomes "lda (PSP_5 + pspCmdTailCnt)".

Or if you want to set the DP to PSP_5, then you can do a simple "lda <pspCmdTailCnt", and it's clear and simple.
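
For anyone who hasn't used the RS directives, this Python sketch models the bookkeeping they do. It is a toy model, not any particular assembler's implementation; the class name is invented, and the fields hidden behind "..." in the listing are simply left out, so the resulting offsets are illustrative only.

```python
class RsCounter:
    """Minimal model of RSSET/RB/RW: hand out offsets and advance a counter."""

    def __init__(self):
        self.counter = 0
        self.symbols = {}

    def rsset(self, value):
        self.counter = value

    def reserve(self, name, count, unit):
        self.symbols[name] = self.counter   # label gets the current offset
        self.counter += count * unit        # nothing is written to the ROM

    def rb(self, name, count):
        self.reserve(name, count, 1)

    def rw(self, name, count):
        self.reserve(name, count, 2)

rs = RsCounter()
rs.rsset(0)
rs.rw("pspInt20", 1)
rs.rw("pspNextPara", 1)
rs.rb("pspCmdTailCnt", 1)
rs.rb("pspCmdTailTxt", 127)
rs.rb("PSP_SIZE", 0)

size = rs.symbols["PSP_SIZE"]
psp_5 = 5 * size   # the value the PSP_0..PSP_5 array assigns to PSP_5
print(psp_5 + rs.symbols["pspCmdTailCnt"])
```

Adding a field only means adding one line; every later offset and PSP_SIZE move on their own.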


Quote
Code: [Select]
lda.b   $xx     ; always direct page, no matter what
lda.w   $xxxx   ; always absolte, no matter what
lda.l   $xxxxxx ; always long, no matter what
lda     $xx     ; assembler chooses best

Code: [Select]
    lda.b   ExtSym1[5].Member2
    lda.w   ExtSym1[5].Member2
    lda.l   ExtSym1[5].Member2
    lda     ExtSym1[5].Member2

I am having trouble seeing how what you're proposing here is an "advance" upon existing practice.

The "lda.b/w/l" is using a suffix that's normally used in assembly language on every other processor to control the size of the load itself, and not the addressing mode.

It really doesn't seem like good design practice to override common usage and use it to indicate an addressing mode, especially when there is already an accepted syntax for doing what you need.

And if you want "C" like structure access notation ... the immediate question that I ask is, can I do a "lda ExtSym1[IndexVar].Member2"?

If so, then you're actually generating new code, and you're actually writing a compiler.

If not, then the syntax is pretty pointless, and IMHO no advance on the RS directives in practical use.

If you do want to make 6502 programming easier and more high-level ... then I suggest that you investigate PLASMA or ATALAN (see David Wheeler's page for a discussion on advanced 6502 languages http://www.dwheeler.com/6502/).


There are times when I'd like the assembler to automatically recognize when a conditional branch can't reach its target and replace it with a BRL and a reverse-conditional branch around the BRL, for example.

Yes, this can be nice, and the GNU binutils suite supports it on a number of architectures.

I've never missed this in any practical use of assembly language programming, because I normally want high-performance code, and so I want to know when something is out-of-range because I may choose to re-arrange the code a bit, or to just use a macro to do the long version.

But having the option of the assembler/linker doing it would be nice for more general run-of-the-mill code.
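
For reference, the rewrite being discussed looks roughly like this as a Python sketch. The function name is made up, and it only handles one branch in isolation; a real assembler or linker would have to repeat the process, since widening one branch can push others out of range.

```python
BRL = 0x82  # branch always long, 16-bit relative displacement

def encode_branch(opcode, pc, target):
    """Encode a conditional branch at address pc, widening it if out of range.

    opcode is a conditional branch opcode such as 0xD0 (BNE) or 0xF0 (BEQ).
    """
    disp = target - (pc + 2)              # relative to the next instruction
    if -128 <= disp <= 127:
        return bytes([opcode, disp & 0xFF])
    inverse = opcode ^ 0x20               # flipping bit 5 reverses the condition
    long_disp = (target - (pc + 5)) & 0xFFFF   # BRL ends 5 bytes after pc
    # Inverse branch skips the 3-byte BRL; the BRL does the real jump.
    return bytes([inverse, 0x03, BRL, long_disp & 0xFF, long_disp >> 8])

print(encode_branch(0xD0, 0x8000, 0x8010).hex(" "))  # short form fits
print(encode_branch(0xD0, 0x8000, 0x9000).hex(" "))  # widened to BEQ +3 / BRL
```

The widened form costs three extra bytes and extra cycles on the taken path, which is why you might still want to be told about it.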


Disch

  • Hero Member
  • *****
  • Posts: 2814
  • NES Junkie
    • View Profile
Re: 65816: Direct Page vs Absolute Operand
« Reply #37 on: May 21, 2016, 01:25:10 pm »
I am having trouble seeing how what you're proposing here is an "advance" upon existing practice.

Existing 65xx series assemblers typically do not automatically take advantage of direct page access and require the user to explicitly state the addressing mode they want.

I want my assembler to do that automatically.  But I also want the user to be able to explicitly override the assembler's judgement.

Quote
The "lda.b/w/l" is using a suffix that's normally used in assembly language on every other processor to control the size of the load itself, and not the addressing mode.

It really doesn't seem like good design practice to override common usage and use it to indicate an addressing mode, especially when there is already an accepted syntax for doing what you need.

I wasn't aware the suffix notation conflicted with other architectures.  I recall seeing it on other 65xx assemblers (xkas maybe?  It's been a while)

What is the accepted syntax in this instance and what assemblers use it?  I saw the "lda <pspCmdTailCnt" in your post, but if that is intended to be a direct-page specification, that conflicts with the usage I'm familiar with (which I've seen in most if not all 65xx assemblers) that treats < as a low-byte cast.

That is...
"<foo" and "foo & 0xFF" are equivalent

Quote
And if you want "C" like structure access notation ... the immediate question that I ask is, can I do a "lda ExtSym1[IndexVar].Member2"?
[snip]
If not, then the syntax is pretty pointless, and IMHO no advance on the RS directives in practical use.

No, you would not be able to do that.  Which is why I originally raised the question as to how useful that syntax would actually be.  I'm still not entirely sold on it.

Quote
If you do want to make 6502 programming easier and more high-level [snip]

I don't want to change 65xx into an HLL.  The goal here is just to have an assembler that does intelligent symbol resolution without the user having to concern themselves with raw addresses.  Believe it or not, this is surprisingly lacking for 65xx assemblers.

jonk

  • Sr. Member
  • ****
  • Posts: 273
    • View Profile
Re: 65816: Direct Page vs Absolute Operand
« Reply #38 on: May 21, 2016, 01:47:37 pm »
Existing 65xx series assemblers typically do not automatically take advantage of direct page access and require the user to explicitly state the addressing mode they want.

I want to do that.  But I also want the user to be able to explicitly override the assembler's judgement.
Agreed.

I wasn't aware the suffix notation conflicted with other architectures.  I recall seeing it on other 65xx assemblers (xkas maybe?  It's been a while)

What is the accepted syntax in this instance and what assemblers use it?  I saw the "lda <pspCmdTailCnt" in your post, but if that is intended to be a direct-page specification, that conflicts with the usage I'm familiar with (which I've seen in most if not all 65xx assemblers) that treats < as a low-byte cast.
There are several conflicting standards, so far as I've been able to see. I think this is largely because WDC didn't originally own the IP (it was originally MOS Tech, I believe) and, when they did, it was entirely as an OEM supplier. So folks trying to address the need for an Apple II assembler (later an Apple IIgs, etc) had to come up with their own ideas, consistent with what they could find in terms of pre-existing docs (which I'm sure varied a lot from developer to developer back then.) The Merlin assembler, targeting the Apple II series, actually has an "LDA:" instruction -- yes, a ':' exists as part of the opcode!

On the web, today, there is a SINGLE book available that discusses syntax that is free. If you haven't already bothered (and I have no reason to imagine you don't have it already), then you can find it here (full res and big), here, and here. The first one is from WDC, itself. The second and third are identical but come from different sites. So there are two versions of the book. The point here is that this book is the only complete manual I know of that is available to anyone at all without charge. That alone makes it important. So you might consider this fact when considering some starting point for ideas. Just a thought.

No, you would not be able to do that.  Which is why I originally raised the question as to how useful that syntax would actually be.  I'm still not entirely sold on it.
Neither am I. I'm only illustrating semantics I'd like, not syntax for it. So long as the semantics are accessible, I plan to leave to you any question of syntax to reach it.

I don't want to change 65xx into an HLL.  The goal here is just to have an assembler that does intelligent symbol resolution without the user having to concern themselves with raw addresses.  Believe it or not, this is surprisingly lacking for 65xx assemblers.
Cripes, I don't want an HLL, either. Just a good assembler/linker toolset.

Worse, almost any kind of HLL imposes semantic restrictions. And I don't want any of that, at all. I want full access to the entire semantic range available to an assembler programmer. Not some limited subset that is invariably available to HLL programmers. (For example, C imposes restrictions on parameter passing and return value methods. And further restricts functions to single entry points. Not interested.)

However, if and when you feel like taking a jaunt through an interesting assembler that does a great deal while at the same time allowing the basics, you could look at Randy Hyde's "The Art of Assembly" book and assembler tool. It's got some pretty fancy HLL features adapted to assembly, including support for thunking. (Which makes a procedure call require both an address AND an activation frame pointer, not just an address -- very useful for things like 'iterators'.) But I'm in no way suggesting you even think about that -- especially not in the case of the 65816 -- it's just for a rainy day.

elmer

  • Full Member
  • ***
  • Posts: 122
    • View Profile
Re: 65816: Direct Page vs Absolute Operand
« Reply #39 on: May 21, 2016, 02:46:47 pm »
Existing 65xx series assemblers typically do not automatically take advantage of direct page access and require the user to explicitly state the addressing mode they want.

What I've been trying to point out, is that good assemblers that followed the manufacturer's recommendations do actually take advantage of this stuff.

The problem was solved a long time ago.

It may have resurfaced because the hacking community are using quickly-knocked-together tools that ignore the published specifications ... but that doesn't mean that you have to re-invent the wheel, or come up with some new syntax to solve problems that have already been solved.

I suggest that you read the 65816 specs here ... http://archive.6502.org/datasheets/wdc_w65c816s_aug_4_2008.pdf

See pages 37-40 for the assembly language standards.

Having the assembler automatically take advantage of direct page access is a part of the standard, and was definitely used in the SNASM assembler that I used at the time.

BTW ... despite the 2008 date, these are the same basic specs that I photocopied from the Apple IIGS Hardware Reference manual when I started SNES development in 1991.


Quote
What is the accepted syntax in this instance and what assemblers use it?  I saw the "lda <pspCmdTailCnt" in your post, but if that is intended to be a direct-page specification, that conflicts with the usage I'm familiar with (which I've seen in most if not all 65xx assemblers) that treats < as a low-byte cast.

That is...
"<foo" and "foo & 0xFF" are equivalent

This is in the specifications. The datasheet given above shows the syntax. AFAIK, CA65 supports this syntax.

The only difference that I remember in practice was that a lot of assemblers supported using square brackets instead of parentheses for indirection, i.e. ...

Code: [Select]
  lda [$01],y

instead of

Code: [Select]
  lda ($01),y

This was seen as a good idea by most programmers because it separated the syntax of expression-evaluation from indirection.

CA65 now supports the square-bracket syntax as an option ... otherwise it uses the traditional manufacturer syntax.

It does use a different syntax for structure definitions and references than the "RS" method that I mentioned previously, probably to simplify its use with the CC65 compiler.


Quote
The goal here is just to have an assembler that does intelligent symbol resolution without the user having to concern themselves with raw addresses.  Believe it or not, this is surprisingly lacking for 65xx assemblers.

To the best of my knowledge, it's only lacking from the assemblers that you're familiar with.

From what I can see, CA65 is the current standards-bearer in 65xx assemblers.

I understand that you really don't like its linker syntax ... and I agree that it is a bit brutal at first look, but it provides the flexibility to create pretty-much any output ROM layout.

Providing template linker files for the common SNES ROM layouts seems like it would basically solve that complaint.

If you find that it's missing features that you feel that you need, then may I suggest that it might be more profitable to the 65xx programming community as a whole to attempt to extend CA65 before throwing all the toys out of the pram and starting from scratch.