
Author Topic: Computed goto  (Read 25970 times)

skalogryz

Re: Computed goto
« Reply #30 on: November 02, 2016, 09:30:45 pm »
Because with the nested procedure it gives
...
Although when it confuses the registers, it is worse than the procedure
I guess GotoAddr is the way to go to  :D
(at least platform/cpu specific code is isolated within the procedure)

BeniBela

Re: Computed goto
« Reply #31 on: November 03, 2016, 11:41:31 am »
Because with the nested procedure it gives
...
Although when it confuses the registers, it is worse than the procedure
I guess GotoAddr is the way to go to  :D
(at least platform/cpu specific code is isolated within the procedure)

On the other hand, if it is run in valgrind, the gotoaddr variant takes 10 billion instructions and the macro variant takes 8 billion.

creaothceann

Re: Computed goto
« Reply #32 on: December 02, 2016, 07:43:26 pm »
Code: Pascal
// Needs {$MACRO ON} and {$GOTO ON}; the JMP syntax assumes {$ASMMODE Intel}.
{$define DISPATCH := p:= dispatch_table[code[pc]]; Inc(pc); asm JMP p;end;}

function interp_cgoto(code: PByte; initval: integer): integer;
label
  do_halt, do_inc, do_dec, do_mul2, do_div2, do_add7, do_neg;
var
  pc, val: integer;
  // one label address per opcode, indexed by the opcode byte
  dispatch_table: array[0..6] of
  pointer = (@do_halt, @do_inc, @do_dec, @do_mul2, @do_div2, @do_add7, @do_neg);
  p: pointer;
begin
  pc := 0;
  val := initval;
  DISPATCH;
  while True do begin
    do_halt: Exit(val);
    do_inc: Inc(val);
    DISPATCH;
    do_dec: Dec(val);
    DISPATCH;
    do_mul2: val := val * 2;
    DISPATCH;
    do_div2: val := val div 2;
    DISPATCH;
    do_add7: Inc(val, 7);
    DISPATCH;
    do_neg: val := -val;
    DISPATCH;
  end;
end;

Do the labels and the ASM block have to be in the same procedure?

I'm playing around with creating a SNES emulator. There must be at least 256*2 labels because there are 256 opcodes and the CPU can be in 2 operating modes. I have a procedure SetMode that $INCLUDEs the label lists, the label addresses, and the actual labels with the code in between. The label addresses are stored in arrays, and the address of the currently active opcode cycle handler is stored in ResumePoint; this is necessary to easily resume execution, because unlike BeniBela I'm pausing the emulation after every system clock tick. (I fully expect it to be slower than this emulator.)


Code: Pascal
program Test;


{$GOTO on}
{$MODESWITCH AdvancedRecords}  // class-like records


type bool = boolean;
type u8   = byte;
type u16  = word;

type T_Opcode                 = u8;
type T_OpcodeHandlerAddress   = pointer;
type T_OpcodeHandlerAddresses = array[T_Opcode] of T_OpcodeHandlerAddress;  // pointers to each opcode's first label


type CPU = record  // static fields & methods => singleton (only one instance ever, declaration of the type is enough to create the instance)
        class var D                          : u8;                        // data bus value
        class var NativeMode                 : bool;
        class var Opcode                     : T_Opcode;                  // currently executed opcode
        class var OpcodeHandlerAddresses     : T_OpcodeHandlerAddresses;  // currently selected labels
        class var OpcodeHandlers_not_updated : bool;
        class var PC                         : u16;                       // program counter
        class var ResumePoint                : pointer;
        class procedure Latch_Opcode;       static;
        class procedure PowerOn;            static;
        class procedure Reset;              static;
        class procedure SetMode;            static;
        class procedure SetMode_Emulation;  static;
        class procedure SetMode_Native;     static;
        class procedure Step;               static;
        end;


class procedure CPU.Latch_Opcode;  {inline;}  // get new opcode, set opcode handler accordingly
begin
Opcode := D;
ResumePoint := OpcodeHandlerAddresses[Opcode];
Inc(PC);
end;


class procedure CPU.PowerOn;  {inline;}
begin
Reset;
end;


class procedure CPU.Reset;  {inline;}
begin
SetMode_Emulation;  // go back to "6502 emulation" mode
D  := 0;
PC := 0;            // TODO: use reset vector
Latch_Opcode;
end;


class procedure CPU.SetMode;
label {$INCLUDE I_65816_EmulationMode_labels};  // 256 labels
label {$INCLUDE I_65816_NativeMode_labels};     // 256 labels
const e : T_OpcodeHandlerAddresses = ({$INCLUDE I_65816_EmulationMode_addresses});  // 256 label addresses
const n : T_OpcodeHandlerAddresses = ({$INCLUDE I_65816_NativeMode_addresses});     // 256 label addresses
begin
if OpcodeHandlers_not_updated then begin  // actually do set the requested mode, then quit
        if NativeMode
                then OpcodeHandlerAddresses := n
                else OpcodeHandlerAddresses := e;
        OpcodeHandlers_not_updated := False;
        exit;
end;
WriteLn('This line should never be executed! Can we actually use something like exit here, or will the compiler then optimize away the labels?');
{$INCLUDE I_65816_EmulationMode_code}  // 256 opcode handlers (with multiple opcode cycle labels)
{$INCLUDE I_65816_NativeMode_code}     // 256 opcode handlers (with multiple opcode cycle labels)
end;


class procedure CPU.SetMode_Emulation;  begin  {inline;}  NativeMode := False;  OpcodeHandlers_not_updated := True;  SetMode;  end;
class procedure CPU.SetMode_Native;     begin  {inline;}  NativeMode := True ;  OpcodeHandlers_not_updated := True;  SetMode;  end;


class procedure CPU.Step;
{$IFDEF CPU64}  var p : pointer;  {$ENDIF}
begin
{$ASMMODE Intel}
{$IFDEF CPU32}                     ASM  JMP ResumePoint  END;  {$ENDIF}  // jump to previously stored label address
{$IFDEF CPU64}  p := ResumePoint;  ASM  JMP p            END;  {$ENDIF}  // same jump, via a local copy of the address
end;


////////////////////////////////////////////////////////////////////////////////


const LoopCount = 1000 * 1000 * 1000;

var i : integer;


begin
CPU.PowerOn;
for i := 1 to LoopCount do begin
        CPU.Step;
        if (i mod (1024 * 1024) = 0) then Write('.');
end;
WriteLn;
WriteLn;
WriteLn('done');
ReadLn;
end.

The above code results in an exception class "External: SIGILL" in 32-bit and "SIGSEGV" in 64-bit mode. (OS: 64-bit Windows 10)

Do I have to "resort" to function pointers instead of label pointers? Would there be differences in execution speed, assuming the procedures don't use parameters and only access the record?
« Last Edit: December 02, 2016, 07:45:08 pm by creaothceann »

derek.john.evans

Re: Computed goto
« Reply #33 on: December 02, 2016, 11:14:31 pm »
I'm playing around with creating a SNES emulator. There must be at least 256*2 labels because there are 256 opcodes and the CPU can be in 2 operating modes....

Have you written any other processor emulators? If not, I wouldn't try to code a Ricoh 5A22 emulator.

If you have coded some others, some researching brings up these projects:

https://www.libretro.com/index.php/develop/
http://www.snes9x.com/
http://lifehacker.com/the-best-snes-emulator-for-windows-1745316428
https://en.wikipedia.org/wiki/Super_Nintendo_Entertainment_System#Emulation

They have implemented asm code for the CPU emulator. Even if you had a working processor, there are still the custom 3D chips.

I wouldn't say this is a task to take lightly.

creaothceann

Re: Computed goto
« Reply #34 on: December 03, 2016, 12:25:37 am »
Have you written any other processor emulators?
No, but I've written some tools for SNES and PSX. I've also studied the architecture of the 6502 for some time now, which is where I'll have to begin.


some researching brings up these projects:

https://www.libretro.com/index.php/develop/
http://www.snes9x.com/
http://lifehacker.com/the-best-snes-emulator-for-windows-1745316428
https://en.wikipedia.org/wiki/Super_Nintendo_Entertainment_System#Emulation

They have implemented asm code for the CPU emulator. Even if you had a working processor, there are still the custom 3D chips.
Yeah, as a regular member of the bsnes forum I've heard of them :)

I'm not too concerned about the custom chips; the SA1 for example would probably halve any framerate I'd get. So I'm going to implement them after the main system, if at all.

As for the other projects, afaik only ZSNES uses assembler code extensively, which is why it's going to stay restricted to x86. SNES9x seems to mostly use C++. bsnes has several assembler implementations of its "libco" library for cooperative multithreading (which would be the greatest difference between bsnes and my emulator, because I use a state machine instead), and uses the newest C++ standard features for the rest.

derek.john.evans

Re: Computed goto
« Reply #35 on: December 03, 2016, 01:10:54 am »
Oh, cool. So the question is "label jumps" vs "function jumps" vs "case/of" and how that affects inline functions....

SNES9x looks like they went with a function jump table and global vars.

https://github.com/snes9xgit/snes9x/blob/master/cpuops.cpp
https://github.com/snes9xgit/snes9x/blob/master/cpuexec.cpp

I'd go with the same so at least you have a working reference point. It looks like a lot of work, even just to translate it.

The CPU emulator looks mostly C, so have you thought about trying to compile it to a dll/obj?

I guess it depends on how much work you want to do yourself.

creaothceann

Re: Computed goto
« Reply #36 on: December 03, 2016, 01:22:55 am »
It looks like a lot of work, even just to translate it.

The CPU emulator looks mostly C, so have you thought about trying to compile it to a dll/obj?

I guess it depends on how much work you want to do yourself.
I want to write the emulator myself, of course. Otherwise, what's the point? ::)

And I can't copy it 1:1 anyway because SNES9x steps at the opcode level (~3.58MHz at most), whereas I want to step like bsnes at the system clock level (~21.477MHz).

derek.john.evans

Re: Computed goto
« Reply #37 on: December 03, 2016, 01:43:24 am »
I want to write the emulator myself, of course. Otherwise, what's the point? ::)
And I can't copy it 1:1 anyway because SNES9x steps at the opcode level (~3.58MHz at most), whereas I want to step like bsnes at the system clock level (~21.477MHz).

Yep, I get that. So SNES9x throttles? Wouldn't that mean, you just leave out the cycle counting and syncing?

Either way, this file is pretty tight emulator code:
https://github.com/snes9xgit/snes9x/blob/master/cpumacro.h

Lots of inlining and macros. My point/view is, the dispatch method looks insignificant.

Your questions are about speed. You want your emulator to run ~6 times faster than C code. My guess is, if you removed the throttling from SNES9x, it would do that.

So, I wouldn't focus on a complex, buggy dispatch method. Yes, you want to code the emulator yourself, which is admirable, but from what I can see, the SNES9x op inline/macro code is pretty slick. You won't write better, and using a non-standard dispatch will only cause you issues and affect what you obviously want to do.

I.e.: enjoy writing your own emulator for a processor that you are fond of.

« Last Edit: December 03, 2016, 01:56:42 am by Geepster »

creaothceann

Re: Computed goto
« Reply #38 on: December 03, 2016, 02:11:38 am »
Yep, I get that. So SNES9x throttles? Wouldn't that mean, you just leave out the cycle counting and syncing?
SNES9x just doesn't emulate all the system clock cycles in an instruction, but still counts them to get the overall instruction timing right. [1]
I can skip the cycle counting there (because I write a label/function for every cycle anyway), and I sync the CPU with the rest of the system at every system clock cycle.

Either way, this file is pretty tight emulator code:
https://github.com/snes9xgit/snes9x/blob/master/cpumacro.h

Lots of inlining and macros.
Sure, it makes the code look smaller, but behind the scenes it still gets executed.

using a non-standard dispatch will only cause you issues and affect what you obviously want to do.
That's why I'm asking if there is a way to make it work reliably with labels :) If not then I'm going with function pointers.


[1] Of course some instructions take more cycles to execute than others, but the SNES is also unique in that the MMU can stall the CPU depending on which area of the memory map is accessed.

creaothceann

Re: Computed goto
« Reply #39 on: December 03, 2016, 01:30:54 pm »
It works well with function pointers. For the record, here's the code together with some high-resolution timing functions:

Code: Pascal
program Test;
{$MODESWITCH AdvancedRecords}  // class-like records
{$IFDEF WINDOWS}  uses Windows;  {$ENDIF}


type bool   = boolean;
type float8 = double;
type int    = integer;
type u8     = byte;
type u16    = word;


type T_Opcode         = u8;
type T_OpcodeHandler  = procedure;
type T_OpcodeHandlers = array[T_Opcode] of T_OpcodeHandler;  // function pointers to each opcode's first procedure


type CPU = record  // static fields & methods => singleton (only one instance ever, declaration of the type is enough to create the instance)
        class var D              : u8;                           // data bus value
        class var next           : T_OpcodeHandler;
        class var Opcode         : T_Opcode;                     // currently executed opcode
        class var OpcodeHandlers : T_OpcodeHandlers;             // currently selected set of procedure pointers
        class var PC             : u16;                          // program counter
        class procedure Latch_Opcode;                   static;
        class procedure PowerOn;                        static;
        class procedure Reset;                          static;
        class procedure Set_Mode(const native : bool);  static;
        class procedure Step;                           static;
        {$INCLUDE I_65816_EmulationMode_list}                    // 256 opcode handlers (with multiple opcode cycle procedures)
        {$INCLUDE I_65816_NativeMode_list}                       // 256 opcode handlers (with multiple opcode cycle procedures)
        end;


class procedure CPU.Latch_Opcode;  inline;  // get new opcode, set opcode handler accordingly
begin
Opcode := D;
next   := OpcodeHandlers[Opcode];
Inc(PC);
end;


class procedure CPU.PowerOn;  inline;
begin
Reset;
end;


class procedure CPU.Reset;  inline;
begin
Set_Mode(False);  // go back to "6502 emulation" mode
D  := 0;
PC := 0;          // TODO: use reset vector
Latch_Opcode;
end;


{$INCLUDE I_65816_EmulationMode_code}  // 256 opcode handlers (with multiple opcode cycle procedures)
{$INCLUDE I_65816_NativeMode_code}     // 256 opcode handlers (with multiple opcode cycle procedures)


class procedure CPU.Set_Mode(const native : bool);  inline;
const Addresses : array[bool] of T_OpcodeHandlers = (
        ({$INCLUDE I_65816_EmulationMode_addresses}),   // 256 procedure addresses
        ({$INCLUDE I_65816_NativeMode_addresses}   ));  // 256 procedure addresses
begin
OpcodeHandlers := Addresses[native];
end;


class procedure CPU.Step;  inline;
begin
next;
end;


// high-resolution timer  ------------------------------------------------------


var _HRT_TicksPerSecond : int64;


function HRT_CurrentValue : int64;  inline;
begin
{$IFDEF WINDOWS} Result := 0;  QueryPerformanceCounter(Result);  {$ELSE}  {$FATAL code required for non-Windows platforms}  {$ENDIF}
end;


procedure HRT_Init;  inline;
begin
_HRT_TicksPerSecond := 0;
{$IFDEF WINDOWS} QueryPerformanceFrequency(_HRT_TicksPerSecond);  {$ELSE}  {$FATAL code required for non-Windows platforms}  {$ENDIF}
end;


function HRT_TicksPerSecond : int64;  inline;
begin
Result := _HRT_TicksPerSecond;
end;


////////////////////////////////////////////////////////////////////////////////


const LoopCount = 1000 * 1000 * 1000;
const SNES_MHz  = 1890 / 88;


var i         : int;
var MHz       : float8;
var t         : float8;
var TimeStart : int64;
var TimeEnd   : int64;


begin
HRT_Init;
CPU.PowerOn;
TimeStart := HRT_CurrentValue;  for i := 1 to LoopCount do  CPU.Step;
TimeEnd   := HRT_CurrentValue;
t         := abs(TimeEnd - TimeStart) / HRT_TicksPerSecond;
MHz       := LoopCount / t / 1000000;
WriteLn(t:0:4, ' seconds, ', MHz :0:4, ' MHz, x', MHz / SNES_MHz :0:1);
ReadLn;
end.

On my machine (i7 4790K @ 4.4GHz, DRAM @ 1200MHz although for now it's probably all in the L1 cache) I get the following values:

Code:
target OS              |  Win32  |  Win32  |  Win64    |  Win64
target CPU family      |  i386   |  i386   |  x86_64   |  x86_64
target processor       |  P4     |  P4     |  default  |  default
optimization level     |  3      |  3      |  3        |  3
small vs. fast         |  fast   |  small  |  fast     |  small
debugging              |  no     |  no     |  no       |  no
-----------------------+---------+---------+-----------+-----------
emulated speed (MHz)   |  875    |  731    |  878      |  731
SNES speed multiplier  |  40.9   |  34.1   |  40.9     |  34.1

This means that with only the framework in place, and no actual emulation yet (not even fetching new opcodes and setting the opcode handlers), emulating 1 MHz of the SNES costs 4400 / 878 = 5.01 MHz of the host CPU. As more emulation is added, the emulated speed will drop dramatically; once it falls below ~21 MHz, the games can no longer run at real-time speed.
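
The same arithmetic can be turned around to get a rough per-tick budget. The following is only a sketch: the constant names are made up for illustration, and the figures are the 4.4 GHz host clock and the best measured value (878 MHz) from the table above. It prints how many host cycles are currently spent per emulated clock tick (about 5) and how many are available per tick if the emulator must keep up with the real SNES master clock (about 205).

```pascal
program HostBudgetSketch;
{$mode objfpc}

const
  HostMHz     = 4400.0;     // host CPU clock in MHz (i7 4790K @ 4.4 GHz, from the post)
  MeasuredMHz = 878.0;      // best emulated speed in MHz, framework only
  SNES_MHz    = 1890 / 88;  // SNES master clock in MHz, about 21.477

begin
  // host cycles currently spent per emulated clock tick
  WriteLn('spent per tick : ', HostMHz / MeasuredMHz :0:2);
  // host cycles available per emulated clock tick at real-time speed
  WriteLn('budget per tick: ', HostMHz / SNES_MHz :0:1);
end.
```

Everything the emulator does for one master clock tick (CPU, PPU, APU, bus) would have to fit into that second number, ignoring cache misses and other stalls.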

But we'll see. :)
« Last Edit: December 03, 2016, 01:41:20 pm by creaothceann »

BeniBela

Re: Computed goto
« Reply #40 on: December 04, 2016, 02:21:19 pm »
I wrote a reply yesterday and it disappeared. Did I forget to click "post"?

>next;

Does that become a JMP? It seems that if it became a CALL, it would overflow the stack.


However, I have other problems.

I want it to be thread safe. That rules out global variables.

And unfortunately I put my data in interfaces. Now every function has an implicit exception block and that becomes quite slow. And the simplest way to get rid of that is to put everything in a single function via byte code.


creaothceann

Re: Computed goto
« Reply #41 on: December 04, 2016, 03:03:26 pm »
>next;

Does that become a JMP? It seems that if it became a CALL, it would overflow the stack.
That is a function pointer (signature "procedure;"), so it becomes a CALL. But this is intended since I return to the main program after every clock cycle.

It seems that if you jump around between the labels all the time, you have to put the GUI / program control (message loop) in there too, unless you have only short code chunks to execute and can guarantee that there isn't an endless loop.


However, I have other problems.

I want it to be thread safe. That rules out global variables.

And unfortunately I put my data in interfaces. Now every function has an implicit exception block and that becomes quite slow. And the simplest way to get rid of that is to put everything in a single function via byte code.
Can't you safeguard global resources with critical sections and the like? However: thread synchronization tends to be slow, which is why e.g. bsnes uses cooperative multithreading (it emulates a part of the machine without synchronizing at every step until it's unavoidable, then switches to the other involved part.)
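
Another way to avoid shared globals without any locking is to pass the interpreter state to every handler explicitly, so each thread can own its own instance. The sketch below only illustrates that idea and is not taken from either poster's code: TCpuState, TOpHandler, Handlers and Run are hypothetical names, and the three opcodes are borrowed from the small interpreter earlier in the thread.

```pascal
program StateParamSketch;
{$mode objfpc}

type
  // All interpreter state lives in a record instead of global variables.
  TCpuState = record
    Value  : Integer;
    PC     : Integer;
    Halted : Boolean;
  end;
  TOpHandler = procedure(var S: TCpuState);

procedure OpHalt(var S: TCpuState); begin S.Halted := True;    end;
procedure OpInc (var S: TCpuState); begin Inc(S.Value);        end;
procedure OpNeg (var S: TCpuState); begin S.Value := -S.Value; end;

const
  // read-only dispatch table, safe to share between threads
  Handlers: array[0..2] of TOpHandler = (@OpHalt, @OpInc, @OpNeg);

// Runs the byte code on a state that is local to this call.
// The code must end with opcode 0 (halt); there is no bounds check.
function Run(const Code: array of Byte; InitVal: Integer): Integer;
var
  S: TCpuState;
begin
  S.Value  := InitVal;
  S.PC     := 0;
  S.Halted := False;
  while not S.Halted do begin
    Handlers[Code[S.PC]](S);  // call through the table, passing the state
    Inc(S.PC);
  end;
  Result := S.Value;
end;

begin
  WriteLn(Run([1, 1, 2, 0], 5));  // inc, inc, neg, halt
end.
```

The price is one extra parameter per call; whether that is cheaper than critical sections or the implicit exception frames mentioned above would have to be measured.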

Thaddy

Re: Computed goto
« Reply #42 on: December 04, 2016, 03:25:30 pm »
I have a Pascal version of fake6502.c that also uses function pointers in a const array of procedures. It is the exact equivalent of the C code, which also uses calls rather than jumps. A code snippet:
Code: Pascal
// imp etc. are implemented procedures of type AddrMode, the operands too; the cycle table is a const array of dword
type
  AddrMode = procedure;
const addrtable:array[0..255] of AddrMode = (
(*        |  0  |  1  |  2  |  3  |  4  |  5  |  6  |  7  |  8  |  9  |  A  |  B  |  C  |  D  |  E  |  F  |     *)
(* 0 *)     imp, indx,  imp, indx,   zp,   zp,   zp,   zp,  imp,  imm,  acc,  imm, abso, abso, abso, abso, (* 0 *)
(* 1 *)     rel, indy,  imp, indy,  zpx,  zpx,  zpx,  zpx,  imp, absy,  imp, absy, absx, absx, absx, absx, (* 1 *)
(* 2 *)    abso, indx,  imp, indx,   zp,   zp,   zp,   zp,  imp,  imm,  acc,  imm, abso, abso, abso, abso, (* 2 *)
(* 3 *)     rel, indy,  imp, indy,  zpx,  zpx,  zpx,  zpx,  imp, absy,  imp, absy, absx, absx, absx, absx, (* 3 *)
(* 4 *)     imp, indx,  imp, indx,   zp,   zp,   zp,   zp,  imp,  imm,  acc,  imm, abso, abso, abso, abso, (* 4 *)
(* 5 *)     rel, indy,  imp, indy,  zpx,  zpx,  zpx,  zpx,  imp, absy,  imp, absy, absx, absx, absx, absx, (* 5 *)
(* 6 *)     imp, indx,  imp, indx,   zp,   zp,   zp,   zp,  imp,  imm,  acc,  imm,  ind, abso, abso, abso, (* 6 *)
(* 7 *)     rel, indy,  imp, indy,  zpx,  zpx,  zpx,  zpx,  imp, absy,  imp, absy, absx, absx, absx, absx, (* 7 *)
(* 8 *)     imm, indx,  imm, indx,   zp,   zp,   zp,   zp,  imp,  imm,  imp,  imm, abso, abso, abso, abso, (* 8 *)
(* 9 *)     rel, indy,  imp, indy,  zpx,  zpx,  zpy,  zpy,  imp, absy,  imp, absy, absx, absx, absy, absy, (* 9 *)
(* A *)     imm, indx,  imm, indx,   zp,   zp,   zp,   zp,  imp,  imm,  imp,  imm, abso, abso, abso, abso, (* A *)
(* B *)     rel, indy,  imp, indy,  zpx,  zpx,  zpy,  zpy,  imp, absy,  imp, absy, absx, absx, absy, absy, (* B *)
(* C *)     imm, indx,  imm, indx,   zp,   zp,   zp,   zp,  imp,  imm,  imp,  imm, abso, abso, abso, abso, (* C *)
(* D *)     rel, indy,  imp, indy,  zpx,  zpx,  zpx,  zpx,  imp, absy,  imp, absy, absx, absx, absx, absx, (* D *)
(* E *)     imm, indx,  imm, indx,   zp,   zp,   zp,   zp,  imp,  imm,  imp,  imm, abso, abso, abso, abso, (* E *)
(* F *)     rel, indy,  imp, indy,  zpx,  zpx,  zpx,  zpx,  imp, absy,  imp, absy, absx, absx, absx, absx  (* F *)
);
begin
  readln;
  addrtable[$F1];  // calls the handler for opcode $F1
  addrtable[0];
end.

The emulation has good performance, even on Raspbian ARM.
« Last Edit: December 04, 2016, 03:27:33 pm by Thaddy »

creaothceann

Re: Computed goto
« Reply #43 on: December 04, 2016, 04:15:15 pm »
I still don't know why my code above didn't work, but I think now that using labels wouldn't be any faster than using function pointers...

Originally I had a function "Step" that would be entered from the main program. Then it would use assembler code to load the next label's address, and jump to it ("JMP"). The code at the label would then call "exit" ("RET").

With the function pointer "next" I can avoid the "Step" function (it is inlined and just calls "next" anyway), so the only cost is calling the function pointer and returning from it. Since the "Step" function from before couldn't be inlined because of the assembler code, the current version might even be faster, especially if the compiler doesn't generate stack frames for the procedures I'm using now (haven't checked yet).
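
The per-cycle scheme described here can be sketched in a few lines. This is a toy, not the emulator's real code: the names Next, Fetch, IncA_Cycle1 and IncA_Cycle2 are invented for the example, and the "CPU" knows a single made-up opcode that takes one fetch cycle and two execute cycles. Each handler does the work of exactly one clock cycle, stores the handler for the following cycle, and returns to the caller.

```pascal
program CycleStepSketch;
{$mode objfpc}

type
  TCycleHandler = procedure;

var
  Next   : TCycleHandler;  // handler to run on the next clock cycle
  A      : Byte;           // accumulator of the toy CPU
  Cycles : Integer;        // clock cycles executed so far

procedure Fetch; forward;

// Second (last) cycle of the made-up "increment A" opcode.
procedure IncA_Cycle2;
begin
  Inc(A);
  Next := @Fetch;          // opcode finished, fetch again on the next cycle
end;

// First cycle of the same opcode: only schedules the second cycle.
procedure IncA_Cycle1;
begin
  Next := @IncA_Cycle2;
end;

// Opcode fetch: this toy CPU always "decodes" the same opcode.
procedure Fetch;
begin
  Next := @IncA_Cycle1;
end;

// One clock tick: run exactly one handler, then return to the caller.
procedure Step;
begin
  Inc(Cycles);
  Next();
end;

var
  i : Integer;
begin
  A      := 0;
  Cycles := 0;
  Next   := @Fetch;
  for i := 1 to 9 do       // 9 ticks = 3 x (fetch + 2 execute cycles)
    Step;
  WriteLn('A = ', A, ', cycles = ', Cycles);  // prints: A = 3, cycles = 9
end.
```

Because control returns to the main loop after every tick, the rest of the system can be advanced between two Step calls, which is the property the label-based version tried to get with a stored resume address.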


Thaddy:
You don't have to use the @ operator to get the address of the functions?

Thaddy

Re: Computed goto
« Reply #44 on: December 04, 2016, 04:35:00 pm »
Thaddy:
You don't have to use the @ operator to get the address of the functions?

No, my version of fake6502 compiles in Delphi and with FPC in Delphi mode. Hence @ is superfluous.

By the way, it also has step etc. and should also be usable as the core for the Ricoh processor.
It simply reads the binary from a byte stream and feeds it into the 6502. The processor itself is completely hidden in the implementation section.

And note that the compiler will indeed omit the stack frame for parameterless procedures (except maybe below -O2) because it can see that the local stack is never used.
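
For comparison, here is a small sketch of the same kind of table in ObjFPC mode, where the @ operator is required to take a procedure's address. The two handlers are stand-ins that just print their name; only the table syntax matters here.

```pascal
program AddrOperatorSketch;
{$mode objfpc}   // with {$mode delphi} the @ in the table below can be omitted

type
  TAddrMode = procedure;

// Stand-in handlers for the example.
procedure imp; begin WriteLn('imp'); end;
procedure imm; begin WriteLn('imm'); end;

const
  AddrTable: array[0..1] of TAddrMode = (@imp, @imm);

begin
  AddrTable[0]();  // prints imp
  AddrTable[1]();  // prints imm
end.
```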
« Last Edit: December 04, 2016, 04:40:15 pm by Thaddy »