Optimization “tricks”

Looking for cycle exactness

First, some background…

As mentioned on many other occasions, the length of a vector is determined by two factors:
a) scale
b) strength
The scale is a “duration” and the strength is a “speed”; multiplied together, they give the length.
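
As a quick illustration of that relationship, here is a minimal C sketch. It assumes the length is simply the product of the two values, as stated above, and ignores any hardware effects. The function name vector_length is made up for this example and is not a BIOS call.

```c
#include <assert.h>

/* Hypothetical helper (not a BIOS call): nominal vector length as
   "duration" times "speed". Only the magnitude of the strength counts;
   its sign would just select the direction. */
int vector_length(int scale, int strength)
{
    assert(scale >= 0);   /* the scale is a duration, so never negative */
    int speed = strength < 0 ? -strength : strength;
    return scale * speed;
}
```

For example, vector_length(3, 120) and vector_length(90, 4) both give 360: the same nominal length, but the first one needs far less drawing time.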

Since you usually want everything done as fast as possible, one way to draw vectors really fast is to reduce the scale factor as much as possible and use large vector strengths instead (to compensate for the low scale).

If you take that to the extreme, you might end up with something like:

scale = 3
strength = 120

The catch is that vector lists drawn with these settings through the BIOS functions look rather bad.

Background to the background…
The gist of the problem is that the Vectrex cannot do two things at the same time.
It would be great if it could switch on the light and start moving the vector beam at the same time, but it cannot.

Movement is switched using the ~RAMP flag, and light is switched using the ~BLANK flag. You cannot switch both of them at the same time.
You either switch the light on too early (bright spot at the beginning) or too late (empty space at the beginning of the vector). The same goes for the end of the vector.

BIOS routines use timer 1 (VIA) to switch movement on and off (~RAMP) and the SHIFTREG (VIA) to switch the light (~BLANK).

The greater the strength of the vector (the faster it moves), the more noticeable that small difference is.
If, in addition, you only draw the vector for a very short time, the difference is even more noticeable (in percentage terms, about 20%-25% for the values above!)

As a result, vector lists with the above values displayed via BIOS routines look “broken”.
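
To get a feel for why a short draw time makes things worse, here is a rough C sketch. It treats the error as the mismatch between the two signals divided by the total draw time. The concrete figures used with it (a one-cycle mismatch, four to five cycles of drawing at scale 3) are illustrative assumptions, not measurements, and the function name is made up.

```c
#include <assert.h>

/* Hypothetical helper: share of a vector (in per mille) that is affected
   when ~BLANK and ~RAMP switch `mismatch_cycles` apart while the vector
   is drawn for `draw_cycles` in total. Both inputs are assumptions made
   for illustration, not measured values. */
int mismatch_permille(int mismatch_cycles, int draw_cycles)
{
    assert(mismatch_cycles >= 0 && draw_cycles > 0);
    return (1000 * mismatch_cycles) / draw_cycles;   /* integer division */
}
```

Under those assumptions, a one-cycle mismatch on a draw of 4 or 5 cycles comes out at 250 or 200 per mille (25% or 20%), whereas the same mismatch on a 128-cycle draw is only 7 per mille.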

(This is an outlined number “6”)

 

One solution I found is based on two “facts”:
– one can set ~BLANK without using the SHIFTREG if the VIA is configured correctly (btw, ~RAMP can also be set without using the timer)
– “somewhere” within the Vectrex (or the monitor), one or both signals are delayed. The delay seems to differ depending on the way ~RAMP/~BLANK are set (experiments show…)

I have written an assembler routine that uses the experimentally determined “delays” and is able to time both signals almost exactly. There are no bright “dots” and no “empty spaces” (verified experimentally on several Vectrex systems).

Back from the “backgrounds”.
Since I wanted to program the Vectrex in “C”, I wanted to port that new routine to “C” so as not to depend on external “asm” libraries.

That led to the following challenge. One time-critical assembler block is:

 ldb #0xee 
 CLR *VIA_t1_cnt_hi   ; [] enable timer 1 
 stb *VIA_cntl        ; [4] ZERO disabled, and BLANK disabled

The third line “stb…” must come exactly 4 cycles after enabling the timer.

The corresponding C lines look like this:

 VIA_t1_cnt_hi = 0;       // enable timer 1
 VIA_cntl = (int)0xee;    // ZERO disabled, and BLANK disabled

and the compiler generates (-O2):

 clr -12283 ; 
 ldb #-18 ;
 stb -12276 ;

The timer start and the CNTL poke are now 6 cycles apart (the 2-cycle ldb plus the 4-cycle stb, counting with direct page addressing), which again results in a (very small) gap at the beginning of a vector.
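
The cycle arithmetic can be sketched in C as follows. The cycle counts are the standard 6809 ones for these instructions; the assumption that each write lands on the last cycle of its instruction, and the helper name gap_after_clr, are mine for the sake of the example.

```c
#include <assert.h>

/* 6809 cycle counts for the instructions involved. */
enum {
    CYC_LDB_IMM = 2,   /* ldb #value                  */
    CYC_STB_DIR = 4,   /* stb with direct page address */
    CYC_STB_EXT = 5    /* stb with extended address    */
};

/* Hypothetical helper: cycles between the write done by clr and the
   write done by stb, assuming each write happens on the last cycle of
   its instruction. Everything executed after clr, up to and including
   stb, adds to the gap. */
int gap_after_clr(int cycles_between, int stb_cycles)
{
    assert(cycles_between >= 0 && stb_cycles > 0);
    return cycles_between + stb_cycles;
}
```

With the hand-written order, gap_after_clr(0, CYC_STB_DIR) is 4. With the compiler's order, the ldb sits in between and gap_after_clr(CYC_LDB_IMM, CYC_STB_DIR) is 6; if the store ends up with an extended address, gap_after_clr(CYC_LDB_IMM, CYC_STB_EXT) is 7.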

WHAT THIS IS ALL ABOUT
All I want to say is: if you want to write highly optimized “C”, don’t give up, and try things out.

Peer* actually came up with a way to produce the cycle-exact asm code within “C”:

 register int c = (int)0xee;
 asm volatile ("" :: "r" (c));
 VIA_t1_cnt_hi = 0;     // enable timer 1
 VIA_cntl = (int)0xee;  // ZERO disabled, and BLANK disabled

The resulting code is:

ldb #-18
clr -12283 ; 
stb -12276 ;

(if direct page addressing is used, then this is exactly the same code as the hand-written assembler!)
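
A likely explanation of why this works: the empty asm statement emits no instruction, but its "r" input operand forces the compiler to have the value in a register at that point, so the load ends up before the two volatile stores. The sketch below shows the same pattern in a form that compiles and runs on a desktop GCC or Clang. The two fake_via_* variables are hypothetical stand-ins for the real VIA registers, and unsigned char replaces the 8-bit int of the Vectrex toolchain. The compiler does not formally promise this ordering, so it is worth checking the generated assembly after every compiler or flag change.

```c
/* Stand-ins for the memory-mapped VIA registers (hypothetical names),
   so that the sketch runs on a desktop machine. On the Vectrex these
   would be the real VIA_t1_cnt_hi and VIA_cntl. */
volatile unsigned char fake_via_t1_cnt_hi = 0xff;
volatile unsigned char fake_via_cntl = 0;

void start_ramp_then_unblank(void)
{
    register int c = 0xee;

    /* Empty asm with an input operand: no instruction is emitted, but
       the value must sit in a register here, i.e. before the stores. */
    __asm__ volatile ("" :: "r" (c));

    fake_via_t1_cnt_hi = 0;               /* stands in for: enable timer 1 */
    fake_via_cntl = (unsigned char)0xee;  /* stands in for: ZERO and BLANK disabled */
}
```

Calling start_ramp_then_unblank() leaves 0 in fake_via_t1_cnt_hi and 0xee in fake_via_cntl; the interesting part is not the result but the instruction order in the compiler's output.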

 

Resulting outlined “6”:

 

 

* Prof. Dr. Peer Johannsen of Pforzheim University, Germany.