I've moved from working on paper to actually writing code. Table switching did create a few headaches, but I'm using recursion as my stack now, so it's all good (well, except for post-processing of overlapping strings with bytes interpreted by different tables, which is still a mess, but hopefully no game was insane enough to do that; same goes for pointers into the middle of multibyte tokens). I can locate, translate, and output text now, and it is fun. However, there are a couple of things in the table standard that I'd like to get a sanity check on.
Raw Hex Inserts
Given the following table:
00=<$01>
01==
02=this is a '<$01>' string
03=this is a '
04=' string
05=<
06=>
07=$
08=\
10=0
11=1
the hex sequence "00 01 02" will be dumped as the text "<$01>=this is a '<$01>' string" but must be inserted as "01 01 03 01 04" as per 2.2.1. This seems wrong, but the problem could be resolved by replacing the entries for 00 and 02 with e.g.
00=<$$01>
02=this is a '<$$01>' string
so perhaps text sequences containing <$[0-9A-Fa-f][0-9A-Fa-f]> should be forbidden. Also, it might be more appropriate to include the section on inserting raw hex in 2.2.4 instead of in 2.2.1. Also also, it might be worth mentioning that hex characters are equally valid in upper or lower case (e.g. "AB" == "ab" == "Ab" == "aB").
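To make the round-trip problem concrete, here is a minimal Python sketch. It is not anyone's actual utility: the function names `dump` and `insert` are made up, and it assumes my reading of 2.2.1, namely that raw hex patterns are pulled out of the script before any table matching happens, with the remaining text matched longest-entry-first.

```python
import re

# The table from the example above, as hex byte -> text.
TABLE = {
    0x00: "<$01>", 0x01: "=", 0x02: "this is a '<$01>' string",
    0x03: "this is a '", 0x04: "' string", 0x05: "<", 0x06: ">",
    0x07: "$", 0x08: "\\", 0x10: "0", 0x11: "1",
}

# Raw hex insert pattern; int(..., 16) accepts upper and lower case alike.
RAW_HEX = re.compile(r"<\$([0-9A-Fa-f]{2})>")


def dump(data, table=TABLE):
    """Translate bytes to text, one single-byte token at a time."""
    return "".join(table.get(b, f"<${b:02X}>") for b in data)


def match_table(text, table):
    """Translate plain text to bytes, preferring the longest table entry."""
    by_text = {t: b for b, t in table.items()}
    longest_first = sorted(by_text, key=len, reverse=True)
    out = bytearray()
    pos = 0
    while pos < len(text):
        for candidate in longest_first:
            if text.startswith(candidate, pos):
                out.append(by_text[candidate])
                pos += len(candidate)
                break
        else:
            raise ValueError(f"no table entry matches at offset {pos}")
    return bytes(out)


def insert(text, table=TABLE):
    """Translate text to bytes. ASSUMPTION: raw hex takes precedence
    everywhere, so the script is split on <$XX> before table matching."""
    out = bytearray()
    # With one capture group, split() alternates: text, hex, text, hex, ...
    for i, part in enumerate(RAW_HEX.split(text)):
        if i % 2 == 1:
            out.append(int(part, 16))
        else:
            out += match_table(part, table)
    return bytes(out)


dumped = dump(bytes([0x00, 0x01, 0x02]))
print(dumped)
print(insert(dumped).hex(" "))
```

With the original table the dump does not survive re-insertion. Swapping entries 00 and 02 for the `<$$01>` variants makes the round trip clean in this sketch, because `<$$01>` no longer looks like a raw hex insert.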
Control Codes in End Tokens Requiring Hex Representation
Given the following table:
00=text
01=more text
/FF=<end>\n\n//
the hex sequence "00 FF" will be dumped as the text "text<end>

//". When inserting, control codes are ignored for end tokens requiring hex representation, so any of "text<end>//", "text<end>
//", "text<end>

//", etc. will be mapped to "00 FF", but "text<end>" will be mapped to "00" since "<end>" is ignored as per 2.2.2.
Assuming Atlas-compatible output, the hex sequence "00 FF 01 FF" would be dumped as "#W16($XXXX)
text<end>

//
#W16($XXXX)
more text<end>

//", which is probably not what was intended (maybe you could try to interpolate the pointer output into the end token's newlines, but that sounds like an extremely bad idea). Output commenting should probably be controlled at the utility level rather than the table level.
When inserting, should control codes be ignored for all tokens, or just end tokens requiring hex representation?
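For illustration, here is a minimal Python sketch of the simpler of the two readings, where newlines are ignored for every token and not just for end tokens. The names `build_lookup` and `insert` are hypothetical, and the sketch raises an error on unmatched text rather than modelling the "ignored as per 2.2.2" behaviour for a bare "<end>".

```python
# Table text -> hex bytes. "\\n" is the literal two-character control
# code as it would appear in the table file.
TABLE = {
    "text": b"\x00",
    "more text": b"\x01",
    "<end>\\n\\n//": b"\xff",
}


def build_lookup(table):
    """Drop the newline control codes from table text before matching."""
    return {text.replace("\\n", ""): seq for text, seq in table.items()}


def insert(script, table=TABLE):
    """Translate script text to bytes, ignoring line breaks entirely.

    ASSUMPTION: control codes are ignored for all tokens; the standard
    may intend this only for end tokens requiring hex representation.
    """
    lookup = build_lookup(table)
    longest_first = sorted(lookup, key=len, reverse=True)
    flat = script.replace("\r", "").replace("\n", "")
    out = bytearray()
    pos = 0
    while pos < len(flat):
        for candidate in longest_first:
            if flat.startswith(candidate, pos):
                out += lookup[candidate]
                pos += len(candidate)
                break
        else:
            raise ValueError(f"no table entry matches at offset {pos}")
    return bytes(out)


for script in ("text<end>//", "text<end>\n//", "text<end>\n\n//"):
    print(insert(script).hex(" "))
```

Under this reading all three spellings of the end token collapse to the same bytes, which is convenient but also means the table's "\n\n" carries no information on the way back in.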
Uniqueness of End Token Names
Note: End Tokens, regardless of type, must be uniquely named.
The standard does not define what constitutes a "name". Given that duplicate hex sequences are forbidden by 2.2.5, I assume "name" refers to the text sequence. Presumably an error should be generated when encountering a duplicate text sequence and at least one of the tokens involved is an end token... maybe? Is this dependent on the type of token? How about dumping vs. inserting? Given the following table:
/00=<end>,2
/01=<end>,2\n
$02=<end>,2
!03=<end>,2
04=<end>,2
05=<end>,2\n
/<end>,2
/<end>,2\n
What errors (if any) should be generated when dumping? When inserting?
While on the topic of uniqueness, it might be worth including a note in the standard that restricts the definition of a "unique" entry to the current logical table. Otherwise an error (duplicate hex sequence) should be generated by the following table:
@table1
00=foo
@table2
00=bar
Conversely, TableIDString in 2.6 should be considered unique across all tables (including across multiple table files) provided to the utility.
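A minimal Python sketch of the scoping I have in mind, assuming the "@TableIDString" line format used above and treating "/", "$" and "!" as entry-type prefixes. The function name `check_uniqueness` is made up; hex sequences are compared case-insensitively and only within the current logical table, while table IDs are checked across everything passed in.

```python
def check_uniqueness(lines):
    """Return (duplicate_hex, duplicate_ids) for a list of table lines.

    duplicate_hex holds (table_id, hex_sequence) pairs that appear more
    than once within one logical table; duplicate_ids holds table IDs
    that are declared more than once overall.
    """
    current = None          # entries before any @ line: unnamed table
    seen_hex = set()
    seen_ids = set()
    duplicate_hex = []
    duplicate_ids = []
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue
        if line.startswith("@"):
            current = line[1:]
            if current in seen_ids:
                duplicate_ids.append(current)
            seen_ids.add(current)
            continue
        hex_part, sep, _text = line.partition("=")
        if not sep:
            continue        # e.g. end tokens with no hex representation
        key = (current, hex_part.lstrip("/$!").upper())
        if key in seen_hex:
            duplicate_hex.append(key)
        seen_hex.add(key)
    return duplicate_hex, duplicate_ids


print(check_uniqueness(["@table1", "00=foo", "@table2", "00=bar"]))
print(check_uniqueness(["@table1", "00=foo", "00=bar"]))
```

To cover multiple table files, the lines of every file would be fed through the same call (or the two `seen` sets shared between calls) so that a repeated TableIDString is still caught.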
Linked Entries
Attempting to insert a linked entry which lacks its corresponding <$XX> in the insert script should generate an error under 4.2, right?
Under the "wishing for the moon" category:
It would be nice if we could define linked entries in such a way that we could obtain output with both pre- and post-fix, like "A(Color:<$A4><$34>)a". Theoretically, there's no reason a table file has to be restricted to dealing with in-game script. I can imagine, for instance, somebody writing a table like:
...
A9=LDA #,1
AA=TAX
AC=LDY $,2
...
and wanting "B1" to map to "LDA ($<$XX>),Y". You could do that with e.g. "B1=LDA ($,1,),Y", but since we lack a general escape character, you couldn't determine which commas were field delimiters and which were data. It might also be nice to be able to specify how raw bytes were output in case you don't want the <$XX> format.
Table Switching
NumberOfTableMatches in 2.6 refers to tokens (each of variable length) rather than bytes, yes? How many tokens do the bytes of a linked entry count as?
Given the following table:
@table1
00=1
!01=table2,2
@table2
00=A
01=B
0100=C
$02=<Color>,2
I think the process for translating "01 02 25 25 01 00 00" starting with table1 would be this:
table1 --> table2 --> <Color><$25><$25> --> C --> fallback to table1 --> 1
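Here is a minimal recursive Python sketch of that process. The data layout and the names `TABLES`, `longest_match` and `dump` are made up for illustration. It bakes in three assumptions: a linked entry plus its parameter bytes counts as one match, the switch token itself is not counted, and a nested table that cannot match the next bytes hands control back to its caller.

```python
TABLES = {
    "table1": {
        b"\x00": ("text", "1"),
        b"\x01": ("switch", "table2", 2),
    },
    "table2": {
        b"\x00": ("text", "A"),
        b"\x01": ("text", "B"),
        b"\x01\x00": ("text", "C"),
        b"\x02": ("linked", "<Color>", 2),
    },
}


def longest_match(table, data, pos):
    """Return the longest hex key in table matching data at pos, or None."""
    for length in range(max(map(len, table)), 0, -1):
        key = data[pos:pos + length]
        if len(key) == length and key in table:
            return key
    return None


def dump(data, pos, table_id, max_matches=None):
    """Translate data from pos using table_id; return (text, new_pos).

    max_matches=None means "no limit" (the starting table).
    """
    table = TABLES[table_id]
    out = []
    matches = 0
    while pos < len(data) and (max_matches is None or matches < max_matches):
        key = longest_match(table, data, pos)
        if key is None:
            if max_matches is not None:
                break                      # fall back to the calling table
            out.append(f"<${data[pos]:02X}>")  # starting table: raw hex
            pos += 1
            continue
        pos += len(key)
        entry = table[key]
        if entry[0] == "switch":
            # Recursion is the table stack; the switch is not counted.
            text, pos = dump(data, pos, entry[1], entry[2])
            out.append(text)
            continue
        if entry[0] == "linked":
            params = data[pos:pos + entry[2]]
            out.append(entry[1] + "".join(f"<${b:02X}>" for b in params))
            pos += len(params)
        else:
            out.append(entry[1])
        matches += 1                       # linked entry counts once
    return "".join(out), pos


print(dump(bytes.fromhex("01022525010000"), 0, "table1"))
```

If the parameter bytes of a linked entry were counted as matches too, the nested call would stop after "<Color><$25><$25>" had used up its quota, and "01 00 00" would be translated by table1 instead, so the answer to the counting question changes the output.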
Various errata
2.2.1: "sequenes" should read "sequences"
2.2.1: "hex byte insertion takes precedent" should read "hex byte insertion takes precedence"
2.3: "Control Codes" is a somewhat ambiguous term, since it can refer to the only defined table entry format control code ("\n") or to game-specific control codes as referenced in 2.5
3.1: "in and automated fashion" should read "in an automated fashion"
3.4, Example 1: should "E060" read "E030"?
4.2: "paramter" should read "parameter"