So, as I wrote in the Releases/PocketSNES based on Snes9x 1.43-dev thread, I have been trying to port the Git version of Snes9x onto PocketSNES.
It does fix issues with some games, but only by virtue of being more accurate; there's no single "fix" that I could backport for certain games, for example the audio troubles in Secret of Mana when it is selected as the second ROM, or in ActRaiser and Earthworm Jim. So more games run, and more games run well. But because it's much more accurate, it also runs all games slower.
For example:
I also ran the emulator under a profiler while getting through the first level of Yoshi's Island with automatic frameskip (to get the pictures above). This is what I got:
6.7471 PocketSNES S9xAPUExecute()
6.2105 PocketSNES _ZL18S9xMainLoop_65C816v.8190
6.0511 PocketSNES S9xSuperFXExec()
4.6273 PocketSNES S9xCheckInterrupts()
4.5370 PocketSNES SNES::SPC_DSP::run(int)
4.4307 PocketSNES SNES::SPC_DSP::voice_V3c(SNES::SPC_DSP::voice_t*)
4.2236 PocketSNES _ZN4SNES3SMP4tickEv.local.3.constprop.491
4.1864 PocketSNES _ZL20DrawTile16_Normal1x1jjjj.28359
3.7773 PocketSNES S9xDoHEventProcessing()
2.6510 PocketSNES SNES::SPC_DSP::voice_V4(SNES::SPC_DSP::voice_t*)
2.1516 PocketSNES _ZL27DrawBackdrop16Add_Normal1x1jjj.28182
1.7851 PocketSNES _ZL24DrawBackdrop16_Normal1x1jjj.28177
1.6629 PocketSNES _Z15S9xDeinitUpdateii.part.1.32095
1.4610 PocketSNES SNES::SPC_DSP::voice_V8_V5_V2(SNES::SPC_DSP::voice_t*)
1.4132 PocketSNES _Z10S9xGetWordj9s9xwrap_t.constprop.463
1.3654 PocketSNES S9xSetPPU(unsigned char, unsigned short)
1.3122 PocketSNES _ZL6Op2CM0v.9323
1.3122 libuClibc-0.9.33.2.so /lib/libuClibc-0.9.33.2.so
1.2060 PocketSNES _ZL14DrawBackgroundihh.15565
1.1794 PocketSNES S9xGetPPU(unsigned short)
1.1157 PocketSNES S9xGetByte(unsigned int)
1.0413 PocketSNES _ZN4SNES7SPC_DSP3runEi.constprop.496
0.9988 PocketSNES S9xDoDMA(unsigned char)
0.9828 PocketSNES _ZL14addCyclesInDMAh.11571.1955
0.9616 PocketSNES _ZL20REGISTER_2118_linearh.11587
0.8872 PocketSNES _ZL6OpADM1v.9509
0.8660 PocketSNES _ZL20REGISTER_2119_linearh.11592
0.8181 PocketSNES _ZL12RenderScreenh.15597.1577
0.7597 PocketSNES _ZL6Op30E0v.9032
0.6747 PocketSNES _ZL8SetupOBJv.15595
0.6269 PocketSNES _ZL12fx_plot_4bitv.14013.1350
0.5738 PocketSNES _ZL6OpD0E0v.9014
0.5100 PocketSNES _ZN4SNES3SMP4tickEj.local.0.constprop.476
0.4781 PocketSNES _Z10S9xSetWordtj9s9xwrap_t15s9xwriteorder_t.constprop.364
0.3772 PocketSNES SNES::SPC_DSP::voice_V7_V4_V1(SNES::SPC_DSP::voice_t*)
And then I ran it with frameskip 0 to see if the rendering was a bottleneck:
9.7537 PocketSNES _ZL20DrawTile16_Normal1x1jjjj.28359
4.8813 PocketSNES S9xAPUExecute()
4.8365 PocketSNES _ZL27DrawBackdrop16Add_Normal1x1jjj.28182
4.6121 PocketSNES _ZL14DrawBackgroundihh.15565
4.5942 PocketSNES S9xSuperFXExec()
4.2084 PocketSNES _ZL18S9xMainLoop_65C816v.8190
3.5129 PocketSNES SNES::SPC_DSP::run(int)
3.4412 PocketSNES _Z15S9xDeinitUpdateii.part.1.32095
3.2527 PocketSNES S9xCheckInterrupts()
3.1271 PocketSNES SNES::SPC_DSP::voice_V3c(SNES::SPC_DSP::voice_t*)
2.9790 PocketSNES _ZN4SNES3SMP4tickEv.local.3.constprop.491
2.7592 PocketSNES _ZL24DrawBackdrop16_Normal1x1jjj.28177
2.7188 PocketSNES S9xDoHEventProcessing()
2.1849 PocketSNES _ZL12RenderScreenh.15597.1577
1.7408 libuClibc-0.9.33.2.so /lib/libuClibc-0.9.33.2.so
1.7183 PocketSNES SNES::SPC_DSP::voice_V4(SNES::SPC_DSP::voice_t*)
1.1844 PocketSNES S9xSetPPU(unsigned char, unsigned short)
1.1844 PocketSNES _Z10S9xGetWordj9s9xwrap_t.constprop.463
1.0588 PocketSNES _ZL6Op2CM0v.9323
0.9422 PocketSNES SNES::SPC_DSP::voice_V8_V5_V2(SNES::SPC_DSP::voice_t*)
0.8749 PocketSNES S9xGetPPU(unsigned short)
0.8300 PocketSNES S9xGetByte(unsigned int)
0.6864 PocketSNES _ZN4SNES7SPC_DSP3runEi.constprop.496
0.6326 PocketSNES _ZL20REGISTER_2119_linearh.11592
0.6281 PocketSNES _ZL27DrawClippedTile16_Normal1x1jjjjjj.28333
0.6102 PocketSNES S9xUpdateScreen()
0.6102 PocketSNES _ZL20REGISTER_2118_linearh.11587
0.5967 PocketSNES S9xDoDMA(unsigned char)
0.5922 PocketSNES _ZL14addCyclesInDMAh.11571.1955
0.5294 PocketSNES S9xSelectTileRenderers(int, unsigned char, unsigned char)
0.5025 PocketSNES _ZL8SetupOBJv.15595
0.4576 PocketSNES _ZL6OpADM1v.9509
0.4487 PocketSNES _ZL6Op30E0v.9032
0.4352 PocketSNES _ZL23DrawTile16Add_Normal1x1jjjj.28353
0.4217 PocketSNES _ZN4SNES3SMP4tickEj.local.0.constprop.476
0.3903 PocketSNES _ZL6OpD0E0v.9014
0.3544 PocketSNES SNES::SPC_DSP::voice_V7_V4_V1(SNES::SPC_DSP::voice_t*)
0.3544 PocketSNES _ZL12fx_plot_4bitv.14013.1350
I know that some things that were quick in Snes9x 1.43 became more accuracy-focused in Snes9x 1.5x, such as tile rendering and audio chip emulation and synthesis.
On auto frameskip:
Audio chip emulation = 25.98% of total GCW CPU
On frameskip 0:
Tile and screen rendering = 26.34% of total GCW CPU
Audio chip emulation = 18.27% of total GCW CPU
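For anyone who wants to check or redo these totals, here is a minimal sketch of how the audio figure for the auto frameskip run can be added up from the profiler lines. Which symbols count as "audio" is my assumption (anything containing S9xAPUExecute, SPC_DSP or SMP), and the function names are made up for illustration; only the audio lines and a few others from the first profile are pasted in.

```python
# Sketch: sum the profiler percentages of audio-related symbols.
# Assumption: a symbol is "audio" if it contains one of these markers.
AUDIO_MARKERS = ("S9xAPUExecute", "SPC_DSP", "SMP")

def parse_profile(text):
    """Yield (percent, symbol) pairs from 'percent image symbol' lines."""
    for line in text.strip().splitlines():
        # Split at most twice so symbols with spaces stay in one piece.
        percent, _image, symbol = line.split(None, 2)
        yield float(percent), symbol

def share(text, markers):
    """Total percentage of samples whose symbol contains any marker."""
    return sum(p for p, sym in parse_profile(text)
               if any(m in sym for m in markers))

# Excerpt of the auto frameskip profile: all audio lines, plus two
# non-audio lines to show that they are left out of the total.
AUTO_FRAMESKIP = """\
6.7471 PocketSNES S9xAPUExecute()
6.2105 PocketSNES _ZL18S9xMainLoop_65C816v.8190
6.0511 PocketSNES S9xSuperFXExec()
4.5370 PocketSNES SNES::SPC_DSP::run(int)
4.4307 PocketSNES SNES::SPC_DSP::voice_V3c(SNES::SPC_DSP::voice_t*)
4.2236 PocketSNES _ZN4SNES3SMP4tickEv.local.3.constprop.491
2.6510 PocketSNES SNES::SPC_DSP::voice_V4(SNES::SPC_DSP::voice_t*)
1.4610 PocketSNES SNES::SPC_DSP::voice_V8_V5_V2(SNES::SPC_DSP::voice_t*)
1.0413 PocketSNES _ZN4SNES7SPC_DSP3runEi.constprop.496
0.5100 PocketSNES _ZN4SNES3SMP4tickEj.local.0.constprop.476
0.3772 PocketSNES SNES::SPC_DSP::voice_V7_V4_V1(SNES::SPC_DSP::voice_t*)
"""

print(f"Audio chip emulation = {share(AUTO_FRAMESKIP, AUDIO_MARKERS):.2f}%")
# prints "Audio chip emulation = 25.98%"
```

Pasting a full profile into the string and swapping the marker tuple should work the same way for other groups, such as rendering, though the grouping there is more of a judgment call.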
SNES S-CPU interrupt checks are twice as slow as in 1.43; opcodes are 2x to 3x slower than in 1.43; the S-SMP and S-DSP (together forming the audio chip, or APU) are anywhere from 6x to 15x slower than in 1.43, which could deal with all sound in up to 4% of total GCW CPU; and the tile rendering is 3x slower than in 1.43, mainly due to backdrops.
But here's the kicker: That's after I loosened the timings on S-CPU and SA-1 emulation and made other optimisations in a lot of places in the code. The vanilla Snes9x Git code ran twice as slow as this, getting 35 FPS in a simple no-chip game like Super Mario World.
I would like to know if other people are interested in trying to optimise Snes9x Git with me for the GCW Zero. You can see my work thus far at https://github.com/Nebuleon/PocketSNES/commits/snes9x-git-experimental . Make yourself a fork if interested.
To be clear: I would like this to become PocketSNES in the future, but its current performance is unacceptable. I will stay with 1.43 until Snes9x Git's performance is acceptable, as defined by the community.