Synthmark816
This test is based on Synthmark64 but focuses more on real-life operations and less on raw memory performance. It also uses the 16-bit mode of the 65816 to highlight both the pros and cons of the larger accumulator and index registers. When run on a 65816, it shows two columns: one with the CPU in 8-bit mode and one with the CPU in 16-bit mode. Since switching between 8-bit and 16-bit mode takes time, the 8-bit operations are quite a bit slower in the 16-bit column, where the CPU needs to be switched to 8-bit mode for each and every operation.
By default the 8-bit mode tests keep the 65816 in 8-bit mode all the time, but there is an option to make the tests switch to 16-bit mode when possible, which is not always a good idea in terms of performance.
Changes compared to Synthmark64:
- New option for 65816: “use 16bit mode when possible”
- Measure how many CIA cycles an empty benchmark consumes in order to get a somewhat more realistic rating
- Replaced all benchmarks; the new ones are add, or, shift, multiply and divide, each in 8-bit, 16-bit and 32-bit variants
- Multiply is implemented as:
Bits | 6502/65C02 | 65816 8-bit | 65816 16-bit |
---|---|---|---|
8-bit | table of squares, 2 KB table | table of squares, 2 KB table | lookup table, 128 KB table |
16-bit | table of squares, 2 KB table | table of squares, 2 KB table | table of squares, 1 MB table |
32-bit | table of squares, 2 KB table | table of squares, 2 KB table | table of squares, 1 MB table |
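
For readers unfamiliar with the "table of squares" technique, here is a minimal sketch in Python rather than 65xx assembly. It relies on the identity a*b = floor((a+b)²/4) - floor((a-b)²/4). The names `SQ4`, `mul8` and `mul16` are made up for this illustration, the table layout does not match the 2 KB table used by the benchmark, and composing the 16-bit product from four 8-bit partial products is an assumption about one possible approach, not a description of the actual benchmark code.

```python
# Quarter-square table: SQ4[n] = floor(n*n / 4) for n = 0..510,
# which covers the largest possible sum of two 8-bit operands.
SQ4 = [(n * n) // 4 for n in range(511)]


def mul8(a, b):
    """Multiply two unsigned 8-bit values with two table lookups.

    (a + b) and (a - b) always have the same parity, so the rounding
    in the two table entries cancels and the result is exact.
    """
    assert 0 <= a <= 255 and 0 <= b <= 255
    return SQ4[a + b] - SQ4[abs(a - b)]


def mul16(a, b):
    """Multiply two unsigned 16-bit values from four 8-bit partial products.

    Hypothetical composition for illustration only.
    """
    assert 0 <= a <= 0xFFFF and 0 <= b <= 0xFFFF
    a_lo, a_hi = a & 0xFF, a >> 8
    b_lo, b_hi = b & 0xFF, b >> 8
    return (mul8(a_lo, b_lo)
            + ((mul8(a_lo, b_hi) + mul8(a_hi, b_lo)) << 8)
            + (mul8(a_hi, b_hi) << 16))
```

The attraction on an 8-bit CPU is that the multiply turns into an add, a subtract and two indexed loads, at the cost of the memory used by the table.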
- Divide is implemented as:
Bits | 6502/65C02 | 65816 8-bit | 65816 16-bit |
---|---|---|---|
8-bit | reciprocal multiply, 2.5 KB + 2 KB table | reciprocal multiply, 2.5 KB + 2 KB table | lookup table, 128 KB table |
16-bit | radix-2, “one bit at a time” | radix-2, “one bit at a time” | radix-2, “one bit at a time” |
32-bit | radix-2, “one bit at a time” | radix-2, “one bit at a time” | radix-2, “one bit at a time” |
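
The radix-2, "one bit at a time" divide in the table can be sketched as a classic restoring shift-and-subtract loop. This is a Python illustration of the general technique, not the benchmark's assembly code; the function name `divmod_radix2` and the `bits` parameter are made up for the example.

```python
def divmod_radix2(dividend, divisor, bits=16):
    """Unsigned restoring division producing one quotient bit per iteration.

    Both operands are assumed to fit in `bits` bits.
    """
    if divisor == 0:
        raise ZeroDivisionError("division by zero")
    mask = (1 << bits) - 1
    quotient = dividend & mask   # dividend bits leave at the top,
    remainder = 0                # quotient bits enter at the bottom
    for _ in range(bits):
        # Shift the next dividend bit into the partial remainder.
        remainder = (remainder << 1) | (quotient >> (bits - 1))
        quotient = (quotient << 1) & mask
        # If the divisor fits, subtract it and record a 1 bit.
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1
    return quotient, remainder
```

The loop always runs `bits` times, so the cost grows linearly with operand width regardless of the values involved.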
Note: the 16-bit divide for 65816 16-bit mode could be implemented as a reciprocal multiply, which would make it faster. It might also be possible to implement larger divides with reciprocal multiply by using larger tables, computing an approximate quotient, and then fixing any quotient error after computing the remainder. I haven't investigated whether that would be faster than the simple radix-2 divides, though.
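
To make the "approximate, then fix" idea concrete, here is a rough Python sketch of a 16-bit reciprocal-multiply divide with a correction step. Everything in it is an assumption for illustration: the names `RECIP` and `divmod_recip`, the choice of floor(2^16 / d) as the table entry, and the table size. It says nothing about whether this would beat the radix-2 divide on real hardware.

```python
BITS = 16

# Hypothetical reciprocal table: RECIP[d] = floor(2**16 / d) for d = 1..65535.
# Index 0 is a placeholder, since division by zero is rejected below.
RECIP = [0] + [(1 << BITS) // d for d in range(1, 1 << BITS)]


def divmod_recip(n, d):
    """Unsigned 16-bit divide via reciprocal multiply plus correction."""
    if d == 0:
        raise ZeroDivisionError("division by zero")
    assert 0 <= n < (1 << BITS) and 0 < d < (1 << BITS)
    # Approximate quotient: multiply by the reciprocal, keep the high part.
    q = (n * RECIP[d]) >> BITS
    # Compute the remainder for that estimate and fix the quotient.
    # The table entries are rounded down, so the estimate is never
    # too high and the fix-up only needs to count upward.
    r = n - q * d
    while r >= d:
        q += 1
        r -= d
    return q, r
```

Because the table entry is rounded down, the estimate in this sketch can only be too low, so the correction is a compare and a conditional subtract on the remainder rather than a full second divide.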