BASIC Compilation Approaches!
Saturday, 21st September 2019
I put together a to-do list for Tiny BASIC, and for the past few weeks I've been ticking off various refactoring tasks that I wanted to get out of the way before I start on code generation. The main two are now done: proper handling of end-of-line characters, and individual tokens for keywords.
Until now, the tokeniser would recognise words and symbols, package them up as WORD or SYMBOL tokens, and pass them to the parser. The parser then had to work out which word or symbol each one was. Both of my compiler design books recommend having the tokeniser do the work of identifying keywords, passing the parser more finely categorised tokens like LET, IF and so on. This is now done and, unlike the end-of-line handling, it worked flawlessly and didn't hold me up for a couple of weeks with an elusive bug. It went so smoothly that I initially suspected I hadn't actually recompiled the source.
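To illustrate the idea of the tokeniser identifying keywords itself, here is a minimal sketch in C. It is not the actual Tiny BASIC code: the names TokenClass, Keyword, keywords and classify_word are made up for the example, the keyword list is only a sample, and the case-insensitive matching is an assumption on my part.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token classes; the real tokeniser may name these differently. */
typedef enum {
    TOKEN_WORD,   /* any word that is not a keyword, e.g. a variable name */
    TOKEN_LET,
    TOKEN_IF,
    TOKEN_THEN,
    TOKEN_GOTO,
    TOKEN_PRINT
} TokenClass;

/* One entry in the keyword table: the spelling and the token it maps to. */
typedef struct {
    const char *text;
    TokenClass token_class;
} Keyword;

/* Sample keyword table; a real one would list every keyword in the language. */
static const Keyword keywords[] = {
    { "LET",   TOKEN_LET },
    { "IF",    TOKEN_IF },
    { "THEN",  TOKEN_THEN },
    { "GOTO",  TOKEN_GOTO },
    { "PRINT", TOKEN_PRINT }
};

/* Compare two strings, ignoring case (assumed: keywords are case-insensitive). */
static int equal_ignoring_case(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (toupper((unsigned char)*a) != toupper((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    /* Equal only if both strings ended at the same point. */
    return *a == '\0' && *b == '\0';
}

/* Return the specific keyword token for a word, or TOKEN_WORD if it is not a keyword. */
TokenClass classify_word(const char *word)
{
    size_t i;
    for (i = 0; i < sizeof keywords / sizeof keywords[0]; i++) {
        if (equal_ignoring_case(word, keywords[i].text))
            return keywords[i].token_class;
    }
    return TOKEN_WORD;
}
```

With something like this in the tokeniser, the parser can switch directly on TOKEN_LET, TOKEN_IF and so on, instead of comparing strings itself. A linear search is fine for a keyword list this short; a sorted table with a binary search would be the obvious next step if the list grew.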
Now there's one small task to get out of the way before moving on to code generation: toughening up the Makefile and possibly writing Makefiles for Windows and DOS. Then it'll be on to C code generation. I've also started looking into LLVM, and it might be practical to make the initial Tiny BASIC release a full compiler rather than just a C translator.