Circular buffer NEON
Posted 02 February 2012 - 10:01 AM
It is a good idea to use a circular buffer when you need a temporary memory zone.
There is no restriction on modifying an ARM register that is used as a pointer in a NEON instruction.
Look at this instruction:
vld1.32 q0, [r4]!
What happens when the Cortex decodes this instruction?
First, the instruction is put into the NEON queue, but the pushed instruction does not contain any reference to R4; it contains the value of R4.
If R4 = 0x0000ab00, the pushed instruction is
vld1.32 q0, [0x0000ab00]
The ARM can then add 16 to r4 immediately (the post-increment advances by the size of the transfer, and q0 is 16 bytes wide).
So even if the load is pushed into the queue and executed a few cycles later, you can modify R4 as soon as you want.
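If your memory zone is 0x0000a000 to 0x0000afff, you can use
vld1.32 q0, [r4]!
and r4, r4, r5 @ with r5 = 0xffffefff
The mask clears only bit 12, so when r4 reaches 0x0000b000 it wraps back to 0x0000a000, and any address still inside the zone is left unchanged. This works because the zone size is a power of two (0x1000) and the base address has bit 12 clear.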
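To check the wrap-around arithmetic without running it on a device, here is a minimal C sketch that models only the address computation of the load/and pair. It assumes a mask that clears just bit 12 (0xffffefff) and a 16-byte step, since q0 is 128 bits wide. The names ZONE_BASE, ZONE_SIZE, WRAP_MASK and advance_and_wrap are hypothetical and only for illustration; nothing here models the NEON queue itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants modelling the zone 0x0000a000..0x0000afff. */
#define ZONE_BASE 0x0000a000u
#define ZONE_SIZE 0x00001000u      /* 4 KiB, a power of two */
#define WRAP_MASK (~ZONE_SIZE)     /* 0xffffefff: clears bit 12 only (assumed mask) */

/* Models the address side of:
 *     vld1.32 q0, [r4]!     (post-increment r4 by 16 bytes)
 *     and     r4, r4, r5    (r5 = WRAP_MASK)
 * and returns the new value of r4. */
static uint32_t advance_and_wrap(uint32_t r4)
{
    /* The single-AND trick needs a power-of-two size and a base
     * aligned to twice that size, so the "overflow" bit is clear. */
    assert((ZONE_SIZE & (ZONE_SIZE - 1u)) == 0u);
    assert((ZONE_BASE & (2u * ZONE_SIZE - 1u)) == 0u);

    r4 += 16u;                     /* one Q register = 16 bytes */
    return r4 & WRAP_MASK;         /* 0x0000b000 becomes 0x0000a000 */
}
```

Starting from 0x0000a000 and calling advance_and_wrap 256 times (0x1000 / 16) brings the pointer back to 0x0000a000. If the zone were not aligned this way, a single AND would not be enough and a compare-and-reset would be needed instead.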
This post has been edited by webshaker: 02 February 2012 - 10:02 AM