Wednesday, November 15, 2017

Disassembling the Disassembler

Writing the disassembler turned out to be even simpler than I expected. I had expected the work to be a bit on the time-consuming side since, no matter which route I took, I would need to deal with 56 different instructions, many of them supporting several address modes. There are various approaches that can be taken for disassembling instructions. For processor architectures such as the SPARC, there are very specific bit patterns that make up the instructions. A look over the instructions suggests this is probably true of the 6502 as well, but with 56 valid instructions and only 256 possible values, a simple table approach seemed to be the way to go.

The table approach sets up all the information as a table. By having a function pointer or lambda function in the table, it can also be set up to do the interpretation as well. This isn't really that inefficient either, as it is a simple table lookup which then calls a function that does the interpretation work. The bit-pattern approach would be a lot messier, and with so few possible outcomes the table is not overly cumbersome to create. A more complex processor would be a different story, but for this project I will go with the table. Here is the format of the table, with a sketch of a table entry following the list:

OP Code
The number assigned to this operation. While not technically needed here, it is a good idea to include so you can make sure the table is complete, and it will be needed if an assembler is desired in the future.
Op String
The mnemonic, the three-letter word used to describe the instruction.
Size
How many bytes (1 to 3) the instruction uses.
Address Mode
How memory is addressed.
Cycles
The base number of cycles for the instruction. Things such as crossing page boundaries or whether a branch is taken will add to this value.
Command
The code that handles the interpretation of this instruction.
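
To make the format concrete, here is a sketch of what a table entry could look like in Kotlin. The names and the no-argument command lambda are illustration rather than final code, and AddressMode is the enumeration shown a little further down:

data class Instruction(
    val opCode: Int,                 // the byte value, 0x00 through 0xFF
    val opString: String,            // the three letter mnemonic, such as "LDA"
    val size: Int,                   // how many bytes (1 to 3) the instruction uses
    val addressMode: AddressMode,    // how memory is addressed
    val cycles: Int,                 // base cycle count before any penalties
    val command: () -> Unit          // interpretation code for the future interpreter
)

The disassembler only needs the first five fields; the command entry exists so the same table can later drive the interpreter.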

Disassembling then becomes a simple matter of looking up the instruction and then, based on the address mode, printing out the value or address that it works with. I came up with 14 address modes, as follows:

enum class AddressMode {
    ABSOLUTE, ABSOLUTE_X, ABSOLUTE_Y, ACCUMULATOR, FUTURE_EXPANSION,
    IMMEDIATE, IMPLIED, INDIRECT, INDIRECT_X, INDIRECT_Y,
    RELATIVE, ZERO_PAGE, ZERO_PAGE_X, ZERO_PAGE_Y
}

The meanings of the individual values in the enumeration are outlined in the following table. This will become important when the interpreter portion of our emulator starts getting implemented.
ABSOLUTE
Specifies the address that will be accessed directly.
ABSOLUTE_X
The address specified with an offset of the value in the X register.
ABSOLUTE_Y
The address specified with an offset of the value in the Y register.
ACCUMULATOR
The value in the Accumulator is used as the operand.
FUTURE_EXPANSION
Unknown address mode, as the instruction is not official. For the instructions that I end up having to implement, this will be changed as necessary.
IMMEDIATE
The value to be used is the next byte.
IMPLIED
The instruction itself indicates which register(s) it uses, so no operand bytes are needed.
INDIRECT
Use the address stored at the address this points to. So if this was JMP (1234), then the values at 1234 and 1235 would form the address to jump to.
INDIRECT_X
The next byte is a zero page address, to which the X register is added. The byte at that location and the one following it are then used to form the address of the operand.
INDIRECT_Y
The next byte is a zero page address. The byte at that address is the low byte and the following zero page byte is the high byte of the address. The value in the Y register is then added to this address.
RELATIVE
An offset to jump to (relative to the next instruction) if the branch is taken.
ZERO_PAGE
Use a zero page address (0 to 255 so only one byte is needed).
ZERO_PAGE_X
Zero page address with the value of the X register added to it.
ZERO_PAGE_Y
Zero page address with the value of the Y register added to it.

Calculating the addresses is easy, but for people used to big-endian architectures it may seem strange. For addresses, the first byte is the low-order byte followed by the high-order byte. This means that the address is first + 256 * second, so the bytes 34 12 (in hex) form the address 1234 (hex). For branching (relative), the address is the start of the next instruction plus the signed value passed (-128 to 127).
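
As a rough sketch of how the disassembler can turn the bytes after the OP code into text, here is one way the operand formatting could look; the function name, memory representation, and output format are my own placeholders:

// builds the operand text for the instruction at the given address
// addresses are little endian: first byte low, second byte high
fun formatOperand(memory: IntArray, address: Int, mode: AddressMode): String {
    val first = memory[(address + 1) and 0xFFFF]    // byte after the OP code
    val second = memory[(address + 2) and 0xFFFF]   // only used by 3 byte instructions
    val addr = first + 256 * second                 // first + 256 * second
    return when (mode) {
        AddressMode.IMMEDIATE -> "#\$%02X".format(first)
        AddressMode.ZERO_PAGE -> "\$%02X".format(first)
        AddressMode.ZERO_PAGE_X -> "\$%02X,X".format(first)
        AddressMode.ZERO_PAGE_Y -> "\$%02X,Y".format(first)
        AddressMode.ABSOLUTE -> "\$%04X".format(addr)
        AddressMode.ABSOLUTE_X -> "\$%04X,X".format(addr)
        AddressMode.ABSOLUTE_Y -> "\$%04X,Y".format(addr)
        AddressMode.INDIRECT -> "(\$%04X)".format(addr)
        AddressMode.INDIRECT_X -> "(\$%02X,X)".format(first)
        AddressMode.INDIRECT_Y -> "(\$%02X),Y".format(first)
        // relative: start of the next instruction plus the signed offset
        AddressMode.RELATIVE -> "\$%04X".format((address + 2 + first.toByte()) and 0xFFFF)
        else -> ""   // IMPLIED, ACCUMULATOR, and FUTURE_EXPANSION print no operand
    }
}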

Next week will be a look at my assembler decision with some hindsight about the process, as I have nearly finished the assembler.

Wednesday, November 8, 2017

Test Driven Disassembly

When I first started programming, the procedure was simple. You would write the program and then you would test the program. The testing was generally manual, simply making sure that the program did what you wanted. This is fine when working on small or personal projects, but when projects get larger it is not a good way of doing things. Changing things can break existing code, but if there is no test for that code the breakage can go unnoticed for a long time, and when discovered it requires a lot of effort to find and fix.

The idea of automated testing helps solve this problem by making the testing process easy: one just needs to run the tests after making changes to see if anything is broken. This does require that the tests exist, which can be a problem, as writing tests after the code has been completed makes them a chore that can be skipped if one is behind schedule. It also has the problem that such tests tend to cover only what is already known to work.

Test driven development moves the testing to the top of the development loop. This has the advantage that the tests are written before the code, so all code has tests. You first make sure that the tests fail, then write the code and get it to pass the tests. You also have the advantage of thinking about exactly how you are going to test things, which may uncover issues before you have even started writing the code. A comparison of the three methods is shown in the flowcharts below.



As with pretty much every approach to programming, dogmatism can take over and the advantages of test driven development can quickly be replaced by useless burdens. If you find yourself having to write tests for basic getters and setters, then you have fallen into the dogmatism rabbit hole. I have been taking a middle ground with my work, coming up with tests before writing code. As some things are simply too difficult to write automated tests for - especially non-deterministic programs such as many games - manual testing is an option as long as you have a clear test plan. Automated testing is my preference, though, as automated tests are always run, so problems are detected earlier.

For my disassembler, the test is simply being able to disassemble known code into the proper instructions. My original plan for the assembly code was to write some test assembly language that would cover all the instructions with the various address modes for the instructions. The code wouldn’t have to perform anything useful, just cover the broad range of assembly instructions. This got me thinking about non-standard instructions.

The 6502 has operation codes (OP codes) reserved for future use that still do things when executed. As this functionality is unofficial, using such instructions is not wise since their presence is not guaranteed, but some programmers would use these instructions if it would save memory or cycles. As I do want my emulator to work with at least some real cartridges, which may use unofficial instructions, I need my disassembler to be able to detect these instructions and alert me to their use so I can figure out what each instruction does and implement it in the future.

This means that all 256 possible OP codes need to be tested. As I want to be able to disassemble from any arbitrary point in memory, the test could simply be generated procedurally. My test memory was filled with the numbers from 0 to 255, so by setting the disassembly address in sequence I would get all the instructions, with the subsequent bytes being the address or value to be used. The fact that the instructions are of different lengths is not a big deal as the disassembly address is being controlled manually. The list of instructions is something that I already have, so creating the expected-result list to compare against was very simple.
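
Here is a minimal sketch of that test, with the expected list and the disassemble function passed in as parameters since their exact forms depend on the final code:

// memory repeats the values 0 through 255, so address N starts with OP code N
fun testDisassembler(expected: List<String>, disassemble: (IntArray, Int) -> String) {
    val memory = IntArray(65536) { it and 0xFF }
    for (opCode in 0..255) {
        val result = disassemble(memory, opCode)
        check(result == expected[opCode]) {
            "op code $opCode: expected '${expected[opCode]}' but got '$result'"
        }
    }
}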

When running the test, if there is an error it is still possible that my expected list is incorrect, but that is obvious enough to determine. Once the disassembler is in a state where it can disassemble the complete list, it is probably working, so as far as tests are concerned this is a good way of testing. Once I wrote my disassembler I did have to fix the list, but I also found some real issues, so overall the test did its job. Next week I will go into the disassembler, which was surprisingly easy to write.

Wednesday, November 1, 2017

Halloween Scratch Postmortem

I have made huge progress on the emulator, finishing the preliminary disassembly and starting work on an assembler, so the next few months' worth of blog posts will be catching up with where I am at with the project. The assembler was not part of my original plans, but after getting the disassembler working it didn't seem like it would be hard. It turned out to be a bit more complex than I expected, but still worth doing. As soon as I get the assembler to a stable state (hopefully by next Wednesday's post) I will post the code that I have written. I haven't decided where the code will be hosted yet. Since I am so far ahead, I will spend a bit of time on my Christmas trilogy, so I may be posting three games over the next couple of months. But this week is the postmortem of my Halloween game.

Halloween Scratch was not originally going to be the Halloween port for this year. However, while porting the vector assets of most of my Flash games into Create.js-compatible JavaScript, it became apparent that some games benefit from being developed in Adobe Animate CC (hereafter referred to as Animate) while others only benefit from having their vector graphics converted into Create.js code. Animate is not the cheapest of software, so with my license running out this month I decided that I would not be renewing it for a while as I don't really need it. Halloween Scratch is very animation oriented and is a simple game, so finishing it in Animate while I still had the software made sense.


Halloween Scratch is a lottery ticket where instead of winning a prize, you win a monster.  There were other lottery ticket games at the time that were just click to reveal and I wanted to demonstrate to a potential client that you could have a much closer feel to scratching a real lottery ticket.

What Went Right

The animations worked pretty much flawlessly, so very little work had to be done there, other than on the alien, whose transport effect used color filters. Color filters in Animate are costly, so they are cached, which means you need to either force the image to be re-cached or come up with some other way of doing it. Simply creating images in the colored states (create a copy of the image, then apply the color effect) was all that was needed. If your game revolves more around animation effects, then using Animate is helpful. Most of my games are more code oriented, so I am not sure it is worth it.

What Went Wrong

I had some strange issues with children in a container. I am not sure if this behavior stemmed from Animate, from Create.js, or from JavaScript itself. What was happening was that I used the target of the event listener to hide the dots that make up the scratch cover. There was a reset cover method that should have reset all the children to visible, but even though it thought it was doing so, nothing was happening on the screen, so already-scratched areas remained invisible. I am not sure why this was happening, but I was able to get the display list to properly reflect the visibility of a dot by accessing the dot through the dots array instead of through the event target. It should not matter which reference to the object has its visibility changed, yet in this case it did. I suspect this is one of the “this” pointer related issues that I always run across in JavaScript.

Mixed Blessings

I think the original game did an excellent job of representing the feel of a lotto ticket. Unfortunately, this was originally an ActionScript 1 game, so the scratch code had to be pretty much re-written from scratch. I had hoped that roll-over events would be detectable on tablets, allowing tablet users to scratch the ticket. This was not the case; the browser window was scrolled around instead. To solve this I added click support, so touching a tile reveals a block of the image. Not the best solution, but it does allow tablet users to play the game. An interesting side effect is that a tile can only be clicked if it has not been removed yet, so computer users are pretty much forced to scratch the ticket.

Overall, Animate is a good tool for porting animations and vector artwork to JavaScript, but once that is done the coding work is easier to do in other tools, making Animate more of a utility than a day-to-day tool. It is still a really good tool for creating animations, so it would not surprise me if I end up renting it again in the future, but for my porting work I am pretty much finished with it. Create.js is a workable solution for porting Flash games to HTML5, but ultimately you are working with JavaScript, which is not the greatest of languages to work with.

Wednesday, October 25, 2017

Thanks for the Memories

When I program, I try to follow this pattern: get it working, get it working correctly, then, if necessary, get it running fast. Premature optimization is one of the bigger problems that programmers face. Optimized code is often hard to read and is an ideal spot for bugs to lurk. Making premature optimization an even worse habit, far too often you end up spending time optimizing the wrong code (not the thing that is actually causing the program to run slow) or optimizing code that you are going to replace later. This is why optimizing after you have finished something makes the most sense.

After writing my C++ memory system for my emulator project, I realized that I really didn’t like the code. Several professors that I have had would call this code smell. The thing is, I really didn’t know why I didn’t like the code, just that it didn’t feel right. The subconscious is really good at determining when something isn’t right but feeds this information to the conscious mind in the form of vague feelings. I have learned that it is best to try and listen to these feelings and work out what your subconscious is trying to tell you.

My initial thought on the problem was that the code was not going to be efficient. With an emulator this could be a big concern, as poor performance on memory operations would reduce the overall performance of the emulator. This led me to thinking about other ways I could handle the memory, and I realized that I was prematurely optimizing the problem. This, however, may be a case where that is a good thing. The memory subsystem will be used by everything in the emulator, so making sure the interface is locked down is important.

The big issue with the 2600 memory management is that I need to track reads and writes. If I only had to track writing, then memory could be a fast global array with only writes needing to be handled through a function. This got me researching the various 2600 bank switching schemes to verify whether any need to handle switching on a read. The most common bank switching scheme does the switch on a read such as an LDA instruction, so that approach will not work. As tools for refactoring code have improved immensely, making such drastic changes to the code later may not be that big of a deal, so I decided to leave things alone and port the existing code to Kotlin.

While re-writing the code in Kotlin, I realized that I may be over-complicating things. In C++, the cartridge loader class would be passed to the A2600 class (the machine), which would then call the cartridge loader's install code, which would tell the A2600 which memory manager to use. The A2600 - specifically the TIA emulator and the 6502 emulator - would access memory by calling the MMU class, and if the access resulted in a bank switch then the MMU would call the cartridge loader to adjust the banks. By having the memory accesses go through the cartridge and having the MMU built into the cartridge (it could still be a separate class, but I don't think that is necessary at this point), things are much easier, as this picture shows. This change should make any future optimization easier, relieving me of most of my concerns.
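
Here is a rough Kotlin sketch of that arrangement. The names are illustrative rather than my actual code, but the shape is the point: every memory access goes through the cartridge, so a scheme that switches banks on a read has everything it needs.

// sketch: the cartridge itself acts as the memory manager
abstract class Cartridge(protected val rom: ByteArray) {
    abstract fun readByte(address: Int): Int      // reads may trigger a bank switch
    abstract fun writeByte(address: Int, value: Int)
}

class A2600(private val cartridge: Cartridge) {
    // the TIA and 6502 emulation read and write memory through the cartridge
    fun read(address: Int): Int = cartridge.readByte(address and 0x1FFF)      // 13 address lines
    fun write(address: Int, value: Int) = cartridge.writeByte(address and 0x1FFF, value)
}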



While I am now starting my disassembler - or at least writing the test for the disassembler - next week will be a postmortem of my Halloween game, which will be released this weekend. It is a port of a really old “game” that I did, and even though it is pretty low in the polling results, it is a cute game and would be easier to port now than later (more on why next week).

Wednesday, October 18, 2017

The Kotlin Decision

Just like there are a number of JavaScript replacement languages, Kotlin is a Java replacement language which produces JVM bytecode that is compatible with the Java language. What got Kotlin on the roadmap was Google announcing full support for the language on the Android platform. The Android course I took in university was Java based, which is okay. I am one of the few(?) programmers out there who don't mind Java but do wish it was less verbose, compiled to JavaScript (or preferably asm.js), and could be compiled to native code. These are the things that Kotlin does, along with providing better type safety. This sounds to me like an ideal language, and with it becoming increasingly popular among Android programmers it may take off, meaning future work potential.

What I do when I want to learn a new language is skim through some of the books on the language to get an idea of its basic syntax and then find a small but real project to attempt in that language. Having a real project tells you a lot more about a language than books will and lets you know if it is a language you can tolerate using or one of those languages you will only use if you are being paid to, such as COBOL. I have in the past had the problem of having way too many projects going on at the same time. I still have a bad habit of this but am going to try to keep my personal projects down to two, which at the moment are my re-write of Flash Game Development and the emulator/Coffee Quest 2600 project that I am developing on this blog. This means that if I want to learn Kotlin, I either have to wait until the book is finished or switch the language that I am using to develop my emulator.

As there are only a hundred lines of code written for the project, now would be the ideal time to switch languages. It would also let me develop code for the web (JavaScript) while also working on the JVM and on Android devices. The problem is that part of the reason I decided to go with the emulator project was to get my C++ skills back up to a useful level. There is a third option: develop the emulator in both C++ and Kotlin and see how well the languages compare for such a task. As C++ is a system-level language it should win hands down, but if native Kotlin is comparable in performance then that may speak to the future direction of my personal projects.

So, tonight I am setting up my Kotlin development environment and porting my existing code over to Kotlin. I will then start working on the disassembler portion of the project in Kotlin. I have a really interesting way of doing the disassembly that will also be very helpful when I get around to writing the 6502 interpreter. Once I have finished the disassembler portion I will port the code to C++ and then make an assessment as to whether I want to continue developing in two languages or pick the language I wish to stick with.

So my decision is not to make a decision. This is probably a good decision as it will give me exposure to the Kotlin language so even if I ultimately drop it for C++ I will know whether it is an appropriate language for projects such as Coffee Quest. My Coffee Quest port, a future project, was going to be a port of the Java code into a language that could be compiled into JavaScript so it can run on the web as well as stand alone. I had been thinking of porting to C++ then using emscripten to generate asm.js code but if Kotlin works out then the Kotlin route may be the easier approach. Worst case I waste a bit of time prototyping my emulator in a different language but as this is a hobby project anyway that is not much of a loss.

Wednesday, October 11, 2017

Friday the 13th


I do not like the code that I wrote and am also considering switching this project to Kotlin as my test project for that language, so I am going to hold off on discussing it this week and instead make an announcement. On Friday the 13th I will be updating my Spelchan.com site to have a less-ugly look. I will also be posting the first ported Blazing Games title, which will be 132 spikes.

Why that day? After all, doesn't a horror movie franchise claim that this is an unlucky day? Well, porting games is drudge work and I consider it a horror, so what better day than that. I am using Adobe Animate CC for the porting but am thinking that it really is not worth the price, so I will probably switch to writing direct Create.js code once I finish the games I am porting for the upcoming revision of my Flash Game Development book.

The first few games that I have ported went smoothly, being easier to port than I feared yet nowhere near as easy as I had hoped. Animate CC converted the vector graphics and tween animations to Create.js code for me, but none of the ActionScript was ported over, requiring me to re-write the code myself in JavaScript. This is not overly hard, as most of the Flash API has equivalent Create.js calls, but remembering to put “this” in front of all class-scoped variables was a bit of a pain and often was the cause of any errors in the game. The speed of the games isn't that great, but I am waiting for the WebGL version of Create.js to be officially released before I start playing with that.

Some readers may have noticed that I said I have finished porting several games, not just 132 spikes. My plan is to post one port a month for sure, with additional games posted when the occasion is appropriate (so yes, there will be a Halloween game). On Wednesdays where I have not made significant progress on my emulator, or at least don't have new emulator topics to discuss, I will write a short progress report and apologize for my slow development time by posting another game that I ported.

My current porting plans are to finish the 10 games from my Flash book and then look at the poll results on BlazingGames.com to see which bigger series (right now One of those Weeks or Coffee Quest) is more desired, then go with one of those larger projects, doing smaller holiday games as appropriate. This plan is not set in stone so could change based on factors outside of my control.

Next week I will either be explaining my switch to Kotlin decision or reviewing why I think that my memory emulation code sucks. I do want to play around with emscripten but Kotlin does look like a real interesting language and may actually be a good language for tablet and web development, with work being done on a native compiler to boot. Tough decision ahead for me. See you next week.

Wednesday, October 4, 2017

Emulating Memory

The 2600 emulator is going to need memory. As the 2600 is based on the 6502 processor, we know that at most it has 16 bits worth of address space - 65,536 bytes - which is very little by modern standards. The obvious solution then is the following code:

unsigned char memory[65536];

Sadly, it is not going to be that easy, for several reasons. First, the 2600 used a cheaper version of the 6502 that only had 13 address lines, not 16, so only 8192 bytes were addressable. Making this even worse is the fact that only 12 of those lines were for the cartridge, so cartridges were limited to 4096 bytes. Some of you are probably thinking, “Wait a second, Bill, weren't there 8K, 12K, and 16K cartridges for the 2600?” Yes, there were. This leads to the real problem with the above approach.

Because of the memory restrictions of the cartridges, developers who needed more memory for increasingly complex games had to come up with ways around the limit. The solution was several different bank-switching styles of cartridge. The idea here is that a chip on the cartridge acts as a go-between, giving the console different memory based on certain actions by the program. The memory management unit could detect things like a read or write of a certain address and use this to determine which bank of memory should be in place. This means that we are going to need to know when memory is read or written so we can handle such situations.
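
To make that concrete, here is a sketch of one common scheme, the 8K “F8” style, as I understand it: touching either of two hotspot addresses - even just reading them - selects which 4K bank is visible. Treat the exact hotspot values here as something I still need to verify when I implement the scheme:

// sketch of F8 style bank switching: an 8K ROM split into two 4K banks
class F8Rom(private val rom: ByteArray) {
    private var bank = 0   // which 4K bank is currently visible

    fun access(address: Int): Int {
        when (address) {
            0x1FF8 -> bank = 0   // any access here selects the first bank
            0x1FF9 -> bank = 1   // any access here selects the second bank
        }
        return rom[bank * 4096 + (address and 0x0FFF)].toInt() and 0xFF
    }
}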

The next issue is the fact that cartridges were ROM. You cannot write to ROM.

As is common with computers, some memory is mapped to hardware devices, so you communicate with this hardware by reading and especially writing to memory within a certain address range. This is how you set up the TIA chip for displaying things on the screen. There are a few other IO operations that also work this way (though I believe they are handled by different chips).

So emulating memory is not going to be that easy. My initial plan is to break memory up into a cartridge loader and an MMU class that can be overridden to support several types of bank-switching schemes. The cartridge loader would be responsible for determining which MMU to use and then giving the MMU the ROM data. There could be different cartridge loaders as well, and I plan on having two. The first would be a very basic file-based loader that would be used during development and would simply load the ROM file from the hard drive. Emscripten, the tool that I am using to compile C++ into asm.js, does let you load files but does so using a virtual file system. This is a bit of a pain, so the second cartridge loader would be more web-friendly, designed so that I don't need to deal with virtual file systems to change cartridges on a web page.
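
Here is a minimal sketch of the loader idea. Picking the memory manager from the ROM file size is my assumption about how the loader would guess the scheme, and all of the names are placeholders:

interface MMU {
    fun read(address: Int): Int
    fun write(address: Int, value: Int)
}

class PlainMMU(private val rom: ByteArray) : MMU {
    override fun read(address: Int) = rom[address and (rom.size - 1)].toInt() and 0xFF
    override fun write(address: Int, value: Int) { /* plain ROM ignores writes */ }
}

// the loader guesses the bank switching scheme from the ROM size
fun selectMMU(romData: ByteArray): MMU = when (romData.size) {
    2048, 4096 -> PlainMMU(romData)   // no bank switching needed
    else -> throw IllegalArgumentException("bank switching scheme not implemented yet")
}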


This project is being developed relatively live. This means that I am writing this as I work on the project. I am hoping I can get a few articles ahead of what I am posting but do want to keep what I post on this blog accurate to what I am going through as I develop this project so that readers can learn from both my successes and my failures. Tonight I am going to start coding and hopefully next post we will finally start seeing some of the code.