So, a little more on CocoNES’s 2A03 core. But first, a general idea of where I’m going with this. I’ll be running two threads: one is the event dispatch thread with the main event loop, and the virtual machine itself will run on a secondary thread. The VM loop is really fairly simple; the first thing that happens in the loop is that the -pulse message is sent to each timed component (at the moment, the CPU and PPU). The CPU only receives the -pulse message every third cycle, as the CPU runs at 1/3 the speed of the PPU (on an NTSC NES); this keeps the two components in step at cycle granularity. The other part of getting cycle-exact accuracy is accounting for the fact that each CPU instruction actually takes multiple cycles to execute, and making sure the correct data, address, and control lines are set for each cycle. I spent a lot of time trying to figure out how to do this, and came up with a pretty decent solution, I think.
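To make the loop concrete, here is a minimal sketch in plain C of the 3:1 pulse ratio described above. Everything in it (vm_t, cpu_pulse, ppu_pulse, vm_run) is a made-up stand-in for illustration; the real components are Objective-C objects that respond to -pulse, and this is not the CocoNES source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the VM state; the counters exist only so
   the pulse ratio can be observed. */
typedef struct {
    uint64_t tick;        /* VM loop iterations, counted at PPU rate */
    uint64_t cpu_pulses;  /* pulses delivered to the CPU */
    uint64_t ppu_pulses;  /* pulses delivered to the PPU */
} vm_t;

/* Placeholder pulse handlers: a real core would execute one cycle here. */
void cpu_pulse(vm_t *vm) { vm->cpu_pulses++; }
void ppu_pulse(vm_t *vm) { vm->ppu_pulses++; }

/* Run the VM loop for `ticks` iterations. The PPU is pulsed on every
   iteration, the CPU only on every third one. */
void vm_run(vm_t *vm, uint64_t ticks)
{
    assert(vm != NULL);
    for (uint64_t i = 0; i < ticks; i++) {
        ppu_pulse(vm);
        if (vm->tick % 3 == 0)
            cpu_pulse(vm);
        vm->tick++;
    }
}
```

Starting from a zeroed vm_t, nine iterations of vm_run deliver nine PPU pulses and three CPU pulses. Whether the CPU fires on the first, second, or third tick of each group is a choice I made for the sketch, not something the design above dictates.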
The actual CPU has a timing control unit that keeps track of where the instruction is in its processing. I used the same idea in my emulated version; the CPU holds its state between pulses, and each time a pulse is received, it executes the next cycle of the current instruction and then waits again. I was also able to handle the slight overlap between instructions (the next instruction is fetched as the current one finishes executing) quite easily once I realized that the CPU still only has one data latch and one address bus, so it can only grab one piece of information at a time. So, while the last cycle of the current instruction is executing, I’m getting the opcode for the next instruction; then I reset the cycle counter to 1 instead of 0, signifying that the opcode is already stored in the instruction register and ready for execution.
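The cycle counter trick is easier to see in code, so here is a toy sketch of it in plain C. It assumes a CPU whose every instruction behaves like a NOP, with one decode cycle followed by a final cycle that overlaps with the next opcode fetch. The names (toy_cpu_t, toy_fetch, toy_cpu_pulse) and the flat memory array are invented for the example and are not taken from the real core.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const uint8_t *mem;   /* hypothetical flat program memory */
    size_t   mem_size;
    uint16_t pc;          /* program counter */
    uint8_t  ir;          /* instruction register */
    uint8_t  cycle;       /* position inside the current instruction */
    unsigned retired;     /* instructions completed (demo bookkeeping) */
} toy_cpu_t;

/* Read the next opcode. There is only one data path, so at most one
   fetch happens per pulse. */
uint8_t toy_fetch(toy_cpu_t *cpu)
{
    assert(cpu->pc < cpu->mem_size);
    return cpu->mem[cpu->pc++];
}

/* Execute exactly one cycle, then return with all state preserved. */
void toy_cpu_pulse(toy_cpu_t *cpu)
{
    switch (cpu->cycle) {
    case 0:               /* cold start only: nothing is overlapping yet */
        cpu->ir = toy_fetch(cpu);
        cpu->cycle = 1;
        break;
    case 1:               /* decode; a real core would drive the bus here */
        cpu->cycle = 2;
        break;
    default:              /* final cycle: finish up and prefetch */
        cpu->retired++;
        cpu->ir = toy_fetch(cpu);  /* next opcode lands in the IR */
        cpu->cycle = 1;            /* so skip cycle 0 next time */
        break;
    }
}
```

With a program of $EA bytes, five pulses retire two instructions and leave the counter at 1 with the third opcode already sitting in the instruction register. Cycle 0 is only ever visited once, on the very first pulse.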
In order to cut down on overhead, I came up with what should be a lightweight “addressing” solution. Basically, there are two components to each opcode: the addressing mode and the operation itself. So what I ended up doing was making two arrays, one for the addressing modes and one for the operations. The opcode itself is just a single byte, so I can use its value as the index into each array; for example, if the opcode is $EA (which happens to be the No Operation code), I can look up addr_modes[0xEA] and operations[0xEA].
But, you may ask, what ARE those arrays of? Well, I figured out early on that the Objective-C messaging overhead would probably be too much for this task, so I went searching for an alternative. I had originally thought to just write it in straight-up C with Core Foundation, but in the end I came up with a better solution. Objective-C is a strict superset of C: any C code works in an Objective-C environment, and Objective-C itself is built on standard C components. That told me that somewhere in there, there must be pointers to all of these methods. Sure enough, the Objective-C runtime defines the IMP type (implementation pointer), which points to a method’s implementation and can be called as a normal function once you supply the two normally hidden arguments (self and _cmd). So I was able to typedef a couple of custom types, and each array is an array of pointers to these functions. So, when the CPU receives a pulse, it runs addr_modes[opcode](), which starts the process of retrieving data, and then calls operations[opcode]() when it’s time to do the dirty work.
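Here is a rough plain-C sketch of the two tables. It is only an analogy for the IMP approach: the explicit self parameter plays the part of the hidden self argument, there is no selector argument, and the typedefs, handler names, and cpu_t struct are all assumptions made up for the example. It also collapses the multi-cycle sequencing into a single call to keep things short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct cpu cpu_t;

/* Custom pointer types, one per table (hypothetical names). */
typedef void (*addr_mode_fn)(cpu_t *self);
typedef void (*operation_fn)(cpu_t *self);

struct cpu {
    uint8_t  opcode;     /* contents of the instruction register */
    unsigned nops_run;   /* demo bookkeeping only */
};

/* Implied addressing: there is no operand to fetch. */
void addr_implied(cpu_t *self) { (void)self; }

/* NOP: changes nothing except our demo counter. */
void op_nop(cpu_t *self) { self->nops_run++; }

/* One slot per possible opcode byte; unfilled slots stay NULL. */
addr_mode_fn addr_modes[256];
operation_fn operations[256];

/* Only $EA is populated in this sketch. */
void cpu_init_tables(void)
{
    addr_modes[0xEA] = addr_implied;
    operations[0xEA] = op_nop;
}

/* Look up both handlers by opcode and call them directly.
   Returns 0 on success, -1 if the opcode has no table entry. */
int cpu_dispatch(cpu_t *self)
{
    assert(self != NULL);
    addr_mode_fn mode = addr_modes[self->opcode];
    operation_fn op   = operations[self->opcode];
    if (mode == NULL || op == NULL)
        return -1;
    mode(self);  /* resolve the operand */
    op(self);    /* do the dirty work */
    return 0;
}
```

After cpu_init_tables(), dispatching a cpu_t whose opcode is 0xEA runs both handlers through plain function pointers, with no message send involved; any other opcode returns -1 because its slots are still empty.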
As soon as I have a chance to test the code more and clean it up a bit (as well as comment it some more) I’ll post a link to the source, since it can probably explain what’s going on far better than I can.