PC-Engine CPU Timing Test Results
by Chris Covell (chris_covell *at* yahoo.ca) - http://www.chrismcovell.com

Download Here: CPU_Test.zip

A Quick Explanation

This is a ROM to test CPU cycle accuracy in emulators & FPGA hardware targeted at the PC-Engine / TurboGrafx-16 game system.  It is a visual test, so really it's comparing the CPU timing against correct emulation of the VDC & VCE video chips.

In the test, each scanline "starts" with a change in the background colour, and then a single CPU instruction is looped over many times until the CRT beam wraps around and reaches the same spot one line below.  I've put all 2-cycle opcodes to be tested together on one screen; all 3-cycle opcodes on another, and so on.  Thus, correct timing for (almost) every instruction can be quickly checked as the start- and end-points should line up vertically, minus the odd 1-cycle correction here and there.
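
As a rough illustration of the looping described above, the sketch below works out how many times an N-cycle opcode fits into one scanline and how many cycles are left over to pad. It assumes a scanline lasts 455 CPU cycles (1365 master-clock ticks with the CPU at one third of the master clock); that figure is not stated in this write-up, and the function name `loop_plan` is made up for the example. It also ignores the cost of the colour write that starts each line, so the ROM's actual repeat counts will differ.

```python
# Assumption (not from this write-up): one scanline lasts 455 CPU cycles,
# i.e. 1365 master-clock ticks with the CPU running at master / 3.
CYCLES_PER_SCANLINE = 455


def loop_plan(cycles_per_instruction, line_cycles=CYCLES_PER_SCANLINE):
    """Return (repeats, leftover) when filling one scanline with one opcode.

    `leftover` is the number of cycles that would still need padding so the
    next colour change lands on the same horizontal position.
    """
    return divmod(line_cycles, cycles_per_instruction)


if __name__ == "__main__":
    for n in range(2, 8):
        repeats, leftover = loop_plan(n)
        print(f"{n}-cycle opcode: {repeats} repeats, {leftover} cycle(s) to pad")
```

Under that assumption, only the 5- and 7-cycle opcodes divide the line evenly; the others leave a remainder that has to be made up with a small correction.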

If the raster split carries down the screen at an angle, then the cycle timing is incorrectly emulated.  If the angle heads down and to the left, then somehow the CPU is being emulated too fast.  If the angle heads down and to the right, then CPU emulation is too slow.
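
The direction rule above can be written out as a small sketch. The names `raster_drift` and `diagnose` are hypothetical, and the model is deliberately simple: it assumes the only error is a wrong cycle count for the opcode being looped.

```python
def raster_drift(expected_cycles, emulated_cycles, repeats):
    """Cycles by which the split position moves on each successive scanline.

    Negative means the loop finishes early, so the next colour change lands
    further left; positive means it finishes late and lands further right.
    """
    return (emulated_cycles - expected_cycles) * repeats


def diagnose(drift):
    """Map a per-line drift onto the visual symptom described in the text."""
    if drift < 0:
        return "split leans down and to the left: CPU emulated too fast"
    if drift > 0:
        return "split leans down and to the right: CPU emulated too slow"
    return "split is vertical: timing matches"


if __name__ == "__main__":
    # A 4-cycle opcode repeated 113 times per line but emulated as 3 cycles.
    print(diagnose(raster_drift(4, 3, 113)))
```

Because the error is multiplied by the number of repeats per line, even a one-cycle mistake on a single opcode gives a steep, easily visible slant.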

These cycle-timed CPU opcodes can be run either from ROM or RAM.  When they are running in RAM, each instruction can be highlighted in the raster display if you press up or down.  This can help to single out which instructions are running too fast or slow.  If a single instruction's scanline clearly doesn't reach back on itself, then CPU emulation is too fast.  If the same coloured scanline wildly overlaps with itself from above, then CPU emulation is too slow.

An example: here I'm running the test ROM on a PC-Engine flash card.  This particular one (PCE 8M Flash from GamingEnterprisesInc.com) has no menu at all; unlike with the Everdrive or SSDS3, the PCE jumps straight into my test ROM.  A menu like the Everdrive's sets up the VDC, VCE, and RAM to its own configuration, which affects some of the results, as you can see later.
PC-Engine hardware
Mednafen emulator
I've highlighted in red where, horizontally, the raster split should start for every test.  My program has a very short VBlank interrupt service routine, and when it exits, it cycle-counts down the screen to the split position seen above. In comparison, an emulator like Mednafen has very accurate CPU emulation, but the position where the emulated PCE starts VBlank seems a bit delayed.

X-Cycle Tests

The last screen of my ROM tests the block transfer instructions of the PCE CPU a few times with differing data sizes.  The first set of tests takes 1 scanline each to complete, but then they are set to take up 2, 4, and then 8 scanlines.
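
To get a feel for the data sizes involved, here is a hedged sketch that estimates the longest block transfer fitting in 1, 2, 4, and 8 scanlines. It rests on two assumptions that are not stated in this write-up: that a scanline is 455 CPU cycles long, and that a block transfer between ordinary memory costs a fixed 17 cycles plus 6 cycles per byte. The lengths it prints are illustrative only and are not the values used in the test ROM.

```python
# All three constants are assumptions for illustration, not taken from the ROM.
CYCLES_PER_SCANLINE = 455   # assumed CPU cycles per scanline
BLOCK_OVERHEAD = 17         # assumed fixed cost of a block transfer
CYCLES_PER_BYTE = 6         # assumed cost per byte moved


def block_transfer_cycles(length):
    """Estimated CPU cycles for a block transfer of `length` bytes."""
    return BLOCK_OVERHEAD + CYCLES_PER_BYTE * length


def longest_transfer(scanlines):
    """Return (length, leftover) for the longest transfer within `scanlines`."""
    budget = scanlines * CYCLES_PER_SCANLINE
    length = (budget - BLOCK_OVERHEAD) // CYCLES_PER_BYTE
    return length, budget - block_transfer_cycles(length)


if __name__ == "__main__":
    for lines in (1, 2, 4, 8):
        length, leftover = longest_transfer(lines)
        print(f"{lines} scanline(s): {length} bytes, {leftover} cycle(s) to pad")
```

VRAM transfers are not covered by this estimate, since their cost depends on how CPU and VDC accesses are interleaved.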

Finally, block transfers that write & read VRAM are set up.  Depending on when the PCE's VDC has "awakened" on system power-up, VRAM reads and writes might take longer or shorter times compared to my test ROM's screen captures.  This is due to how the CPU's and the VDC's VRAM accesses are interleaved in a configurable system of bus sharing.

However, I have set up the timing in my ROM so that the expected behaviour is the same almost every time the PCE is powered up cold.  Either this is just a fluke and I have been lucky, or this is the mysterious way in which the PCE operates, needing more in-depth investigation.

PC-Engine hardware
MiSTer FPGA system
PC-Engine hardware through SSDS3
Mednafen emulator
As can be seen above, even when running on the same PCE hardware, the presence of a setup menu in ROM emulation devices like the Everdrive or TerraOnion Super SD System 3 can throw off VRAM access timing.  The standard RAM block transfer instructions align correctly, but VRAM block transfers are actually sped up.  While the Mednafen emulator (above) seems to run as though VRAM accesses always take the minimal time, the MiSTer FPGA PCE core (top) runs all block transfer instructions too fast.
All Timing Tests in RAM
We can see that Mednafen's CPU timing matches the PCE hardware, but its VSync interrupt seems to trigger too late.  The MiSTer, on the other hand, executes code in RAM at too high a speed in most of the tests, except for a few instructions that run at erratic speeds.  Oh, and the PCE display dimensions are wrong on the MiSTer, cutting off the right-hand side of the screen by a few BG tiles.
PC-Engine hardware
Mednafen emulator
MiSTer FPGA system
All Timing Tests in ROM
One of the bigger problems we can see here is that the PCE core in the MiSTer executes instructions from ROM at too slow a speed with the DDR3 setting.  This causes incompatibility and crashing in games and demos.  Running the ROM from SDRAM brings the timing more in line with the MiSTer's performance from RAM (above).
PC-Engine hardware
Mednafen emulator


To Chris' Homepage: http://www.chrismcovell.com


Okay, finally a "secret" feature for those who want to experiment further with the horizontal offset: Hold RUN and press button I or II to move the first raster split position right or left.  I've noticed that in the X-Cycle test, the far right side of the screen has fewer VDC write "slots" available if the raster split happens there, and that the 2nd loop of these block tests lags behind the 1st one.... Interesting.  I wonder why this happens.