Looking at the code, it's obvious that it'll only work for x64, and it looks like it'd fail if it autostarts code at $c000. On machines with bank switching (like the C128 and Plus/4), I'm not sure how you'd go about it without adding what I suspect is a fairly costly check in the core CPU emulation.
I improved the code a bit: it now waits for the LOADING stage and only then does the PC >= $e000 check.
It would be even better if warp was enabled as soon as the user autostarts, so you don't have to sit around and wait for the C64's reset sequence. PC should be >= $e000 during the reset sequence as well, so the same checks should still apply.
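A rough sketch of the check being discussed, in C. All names here (the stage enum, `autostart_begin`, `autostart_poll` and so on) are made up for illustration and are not taken from the actual patch. It assumes warp is switched on as soon as autostart begins, and that the PC dropping below $e000 after the LOADING stage means the Kernal load has returned:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical autostart stages; the real code may use different ones. */
typedef enum { AS_IDLE, AS_RESET, AS_LOADING, AS_DONE } autostart_stage_t;

typedef struct {
    autostart_stage_t stage;
    bool warp;                    /* true while warp mode should be on */
} autostart_t;

#define KERNAL_START 0xe000u      /* C64 Kernal ROM occupies $e000-$ffff */

/* User triggers autostart: enable warp right away so the reset
   sequence is fast-forwarded as well. */
static void autostart_begin(autostart_t *as)
{
    as->stage = AS_RESET;
    as->warp = true;
}

/* Called once the LOAD has actually been issued. */
static void autostart_loading(autostart_t *as)
{
    if (as->stage == AS_RESET) {
        as->stage = AS_LOADING;
    }
}

/* Called with the current PC. Before the LOADING stage the PC value
   is ignored; afterwards, leaving the Kernal area ends warp. */
static void autostart_poll(autostart_t *as, uint16_t pc)
{
    if (as->stage == AS_LOADING && pc < KERNAL_START) {
        as->warp = false;
        as->stage = AS_DONE;
    }
}
```

Where `autostart_poll` gets called from is left open here. Calling it from outside the per-instruction CPU loop would avoid the cost concern raised earlier, but whether that is accurate enough is untested.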
I agree that other machines can cause trouble, and I have no fix for them currently. Fortunately, the worst thing that might happen is warp being toggled at the wrong point in AutostartWarp mode. Without warp, everything should work as before.
Yeah, worst case should only mean slow loading, and that's not too bad - you can still press CMD-W manually. The only scenario I can think of when it'll fail (as in not disable warp mode automatically) is if one of the bank switching machines loads autostart code above $e000, but I'm not sure if that's feasible - it's certainly not likely.
Just pushing the bytes in there, and adjusting the basic end address seems like a much simpler solution, yeah.
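For reference, "pushing the bytes in" could look roughly like the following C sketch. It is not the code from any branch: `prg_dma_load` is a hypothetical helper, RAM is modelled as a flat 64 KB array, and only the C64 BASIC end-of-program pointer at $2d/$2e is adjusted. Other machines may keep that pointer elsewhere, which is where the machine-specific part would come in.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RAM_SIZE  0x10000u
#define VARTAB_LO 0x2d   /* C64 BASIC: end of program / start of variables */
#define VARTAB_HI 0x2e

/* Copy a PRG image straight into RAM and fix up the BASIC end address.
   A PRG file starts with a 2-byte little-endian load address.
   Returns 0 on success, -1 if the image is too short or does not fit. */
static int prg_dma_load(uint8_t *ram, const uint8_t *prg, size_t len)
{
    if (len < 3) {
        return -1;                       /* need load address + 1 data byte */
    }

    uint16_t start = (uint16_t)(prg[0] | (prg[1] << 8));
    size_t body = len - 2;
    size_t end = (size_t)start + body;   /* one past the last byte loaded */

    if (end > 0xffffu) {
        return -1;                       /* end pointer must fit in 16 bits */
    }

    memcpy(ram + start, prg + 2, body);

    ram[VARTAB_LO] = (uint8_t)(end & 0xff);
    ram[VARTAB_HI] = (uint8_t)(end >> 8);
    return 0;
}
```

Since nothing goes through the drive or the serial bus here, there is no load to wait for and so no need for warp at all.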
I did a first very rough experiment on this in the chris/v2.1.9-filedma branch.
It does a DMA load of PRG files for x64.
The machine-specific code is only a skeleton, as I need external help there.
Feel free to play with the code, and keep your patches coming.
Interesting, I just did a quick test, and it seems to work perfectly: it loads instantly, and TDE state is not touched.