"Pretty printer" for BASIC #11

mattgodbolt · 2020-11-19T14:07:53Z

Expand the whole thing from the (sometimes) compressed mass of P."moo";:?A=1:MO.2 into separate lines. This would need to handle line numbering cleverly.
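
A rough sketch of the splitting step, assuming the program arrives as a list of unnumbered source lines. The names `split_statements` and `expand` are hypothetical and not part of owlet. It only splits on colons outside double-quoted strings, and deliberately ignores REM lines and IF...THEN lines (where the statements after THEN on the same line are conditional), both of which a real pretty printer would have to special-case. It also hands out fresh line numbers without remapping any GOTO or GOSUB targets.

```python
from typing import List, Tuple


def split_statements(line: str) -> List[str]:
    """Split one BASIC line on colons that sit outside double-quoted strings."""
    parts: List[str] = []
    current: List[str] = []
    in_string = False
    for ch in line:
        if ch == '"':
            # A doubled quote inside a string toggles twice, so it stays "in string".
            in_string = not in_string
        if ch == ":" and not in_string:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    # Drop empty statements such as the gap in "A=1::B=2".
    return [p for p in parts if p.strip()]


def expand(program: List[str], start: int = 10, step: int = 10) -> List[Tuple[int, str]]:
    """Give every statement its own line, numbered from `start` in steps of `step`."""
    numbered: List[Tuple[int, str]] = []
    number = start
    for line in program:
        for statement in split_statements(line):
            numbered.append((number, statement))
            number += step
    return numbered
```

With the example above, `expand(['P."moo";:?A=1:MO.2'])` gives lines 10, 20 and 30 holding `P."moo";`, `?A=1` and `MO.2`.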

mattgodbolt · 2020-12-02T23:26:41Z

See also the thread here: https://twitter.com/bbc_micro/status/1334247124142321668

Things to note:

A <variable> followed by a <TOKEN> with no space between them might lead to a broken expansion, so we might need to introduce a space. Concrete example (with <> around tokens for clarity): <IF>A=Z<THEN><ENDPROC>, which if expanded becomes interpreted as <IF>A=ZTHENENDPROC.
There's a limit of 256 characters per line. If we tokenise manually, instead of relying on BASIC's tokeniser, we might be able to get past that. I'm not sure, sometimes I swear it's handling more than 256 chars but...
We need to be very careful about things in REM lines etc.
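
A minimal sketch of the space-insertion rule from the first point, assuming the line has already been split into pieces tagged as either a keyword token or literal text. The `("tok", ...)` / `("txt", ...)` representation and the name `expand_tokens` are made up for illustration. The rule is conservative: it adds a space whenever literal text ending in a letter, digit or underscore is followed by a keyword, even in cases where BASIC's tokeniser might have coped without one. It does not address REM lines.

```python
from typing import Iterable, List, Optional, Tuple

Piece = Tuple[str, str]  # ("tok", keyword) or ("txt", literal text); hypothetical format


def is_name_char(ch: str) -> bool:
    """True for ASCII characters that could continue a variable name."""
    return ch.isascii() and (ch.isalnum() or ch == "_")


def expand_tokens(pieces: Iterable[Piece]) -> str:
    """Join pieces, inserting a space where a keyword would merge into a name."""
    out: List[str] = []
    prev_kind: Optional[str] = None
    prev_text = ""
    for kind, text in pieces:
        needs_space = (
            kind == "tok"
            and prev_kind == "txt"
            and prev_text != ""
            and is_name_char(prev_text[-1])
        )
        if needs_space:
            out.append(" ")
        out.append(text)
        prev_kind, prev_text = kind, text
    return "".join(out)
```

For the example above, `expand_tokens([("tok", "IF"), ("txt", "A=Z"), ("tok", "THEN"), ("tok", "ENDPROC")])` returns `IFA=Z THENENDPROC`, keeping THEN separate from the variable Z.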

ojwb · 2020-12-31T03:40:04Z

> sometimes I swear it's handling more than 256 chars but...

It seems in practice you can have up to at least 259 characters before tokenisation:

https://bbcmic.ro/#%7B%22v%22%3A1%2C%22program%22%3A%22PRINTASNSINACSCOSPI%3APRINTASNSINACSCOSPI%3APRINTASNSINACSCOSPI%3APRINTASNSINACSCOSPI%3APRINTASNSINACSCOSPI%3APRINTASNSINACSCOSPI%3APRINTASNSINACSCOSPI%3APRINTASNSINACSCOSPI%3APRINTASNSINACSCOSPI%3APRINTASNSINACSCOSPI%3APRINTASNSINACSCOSPI%3APRINTASNSINACSCOSPI%3APRINTASNSINACSCOSPI%22%7D

If I add more characters to that input owlet seems to get unhappy, but looking at http://8bs.com/basic/basic4-8db2.htm I don't really see why it wouldn't handle a longer input provided the tokenised output fits...

But even if only the tokenised output needs to fit in the line length limit, the need to add spaces in some places when expanding tokens means that this would still sometimes be an issue.

ojwb · 2020-12-31T23:43:20Z

I've worked it out - in the BASIC ROM's tokeniser code there's a loop to copy the tail of the line down after substituting a token, indexed on Y. If the untokenised tail is more than 255 bytes then Y wraps. So the limit isn't the size of the input, but the size of the input after the first keyword.
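
To illustrate the wrap, here is a small model of a copy loop driven by an 8-bit index. It is not the ROM's code: the function name `copy_tail_8bit`, the flat `bytearray` memory and the carriage-return terminator (0x0D) are assumptions made for the sketch. It reports whether the terminator was reached before the index wrapped back to zero.

```python
CR = 0x0D  # assumed end-of-line terminator for this model


def copy_tail_8bit(memory: bytearray, src: int, dst: int) -> bool:
    """Copy bytes from src down to dst, up to and including CR, using an 8-bit index.

    Returns True if CR was copied, or False if the index wrapped first
    (a real loop would carry on from offset 0 rather than stop).
    """
    y = 0
    while True:
        byte = memory[src + y]
        memory[dst + y] = byte
        if byte == CR:
            return True
        y = (y + 1) & 0xFF  # 8-bit register: 255 + 1 becomes 0
        if y == 0:
            return False
```

In this model a tail of 255 bytes followed by CR copies cleanly, because the terminator sits at offset 255. A tail of 256 bytes puts the terminator at offset 256, which an 8-bit index can never reach.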

I've opened mattgodbolt/jsbeeb#314 which intercepts PC reaching this loop and copies the tail in JS instead - with that we can tokenise any line which fits after tokenising.

8bitkick added the enhancement New feature or request label Nov 19, 2020
