ARM Community: VFP vector example


#1 webshaker

  • Regular Contributor
  • Group: Members
  • Posts: 220
  • Joined: 07-October 10

Posted 05 January 2011 - 10:12 AM

Hi.

The Cortex documentation talks about register banks for vector usage.
That's great, but I do not really understand what a vector instruction (one that uses those banks) is.

Can anybody give me an example using the register banks and a VFP vector instruction?

Thanks
When you have eliminated the impossible, whatever remains, however improbable, must be the truth

#2 scott

  • Regular Contributor
  • Group: Members
  • Posts: 205
  • Joined: 05-October 06

Posted 05 January 2011 - 04:59 PM

In architecture v7-A processors that have the feature (for example Cortex-A5, -A8, -A9, -A15), you want to use the Advanced SIMD (also known as NEON) instructions, e.g. VLD1.16/VADD.I16/VST1.16. They use a register bank whose registers are named D0-D31 (overlapped with Q0-Q15) and which is separate from the integer registers R0-R14.
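
As a rough illustration of what those instructions do, here is a sketch in Python rather than real NEON code. The helper names `q_to_d` and `vadd_i16` are made up for this example. It shows how each Q register aliases a pair of D registers, and how VADD.I16 adds 16-bit lanes independently with wraparound.

```python
LANES_PER_D = 4   # a 64-bit D register holds four 16-bit lanes
MASK16 = 0xFFFF   # keep results within 16 bits


def q_to_d(q):
    """Return the two D register numbers that alias Q register q."""
    if not 0 <= q <= 15:
        raise ValueError("Q register number must be 0..15")
    return (2 * q, 2 * q + 1)


def vadd_i16(a, b):
    """Model VADD.I16 on one D register: lane-wise add, modulo 2**16."""
    if len(a) != LANES_PER_D or len(b) != LANES_PER_D:
        raise ValueError("a D register holds exactly four 16-bit lanes")
    return [(x + y) & MASK16 for x, y in zip(a, b)]


print(q_to_d(1))                                      # (2, 3)
print(vadd_i16([1, 2, 3, 0xFFFF], [10, 20, 30, 1]))   # [11, 22, 33, 0]
```

The last lane wraps to 0 because VADD.I16 is a plain modular add, not a saturating one.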

You can find some examples in various threads in this forum, for example: http://forums.arm.co...th-arm-or-neon/.

It's a bit confusing because the VFP instructions (which are older than NEON) use the same D0-D31* register bank and could originally do short vector operations. But the short vectors were somewhat difficult to use and the feature was not used much, if at all. In fact, the most recent implementations of the VFP instructions no longer perform short vector operations in hardware.

[*] In ARM11 processors with VFP there are only D0-D15.

#3 webshaker

  • Regular Contributor
  • Group: Members
  • Posts: 220
  • Joined: 07-October 10

Posted 05 January 2011 - 05:39 PM

Hmm.

Actually, that was not my question.

My question was about the floating-point vector instructions.
I finally found this:

Quote

VFPASSERT VECTOR
fadds s10<3>, s0, s2<3>


If I understand correctly, this code will execute
fadds s10, s0, s2
fadds s11, s0, s3
fadds s12, s0, s4

#4 scott

  • Regular Contributor
  • Group: Members
  • Posts: 205
  • Joined: 05-October 06

Posted 06 January 2011 - 03:14 PM

That is the older, deprecated "VFP vector" or "short vector" mode. It's not supported (in hardware) on the Cortex-A or Cortex-R processors (as far as I know).

Details are in the ARM Architecture Reference Manual (ARM ARM) for v5 http://infocenter.ar...100i/index.html or for v7-A & -R http://infocenter.ar...406b/index.html

This post has been edited by scott: 06 January 2011 - 03:24 PM


#5 webshaker

  • Regular Contributor
  • Group: Members
  • Posts: 220
  • Joined: 07-October 10

Posted 07 January 2011 - 08:27 AM

scott, on 06 January 2011 - 03:14 PM, said:

That is the older, deprecated "VFP vector" or "short vector" mode. It's not supported (in hardware) on the Cortex-A or Cortex-R processors (as far as I know).



I agree with you.
But this is the only use I found for the register banks.

NEON does not use the register banks.
Most NEON instructions can use all NEON registers without any restriction.



#6 JesseT

  • Member
  • Group: Members
  • Posts: 1
  • Joined: 19-January 11

Posted 19 January 2011 - 03:26 AM

webshaker, on 07 January 2011 - 08:27 AM, said:

I agree with you.
But this is the only use I found for the register banks.

NEON does not use the register banks.
Most NEON instructions can use all NEON registers without any restriction.

With the VFP architecture, the VFP registers were divided into 4 banks. So you had:

Bank #0: S0-S7 and D0-D3
Bank #1: S8-S15 and D4-D7
Bank #2: S16-S23 and D8-D11
Bank #3: S24-S31 and D12-D15

When you set the VFP vector arity (the vector length) to greater than 1, Banks #1 through #3 were used for vector operations, while Bank #0 was reserved for scalar operations. That way, even if you had set the vector arity to, say, 4 and were performing operations on 32-bit floating point 4-vectors, you could still use registers S0-S7 for scalar 32-bit floating point operations without having to switch the VFP unit's vector arity back to 1.
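
To make the bank rule concrete, here is a small Python model (not VFP code). The function names are hypothetical. The "mixed" case, where the destination is outside Bank #0 but the last source operand is inside it, is an assumption based on the short-vector rules in the ARM ARM, not something stated in this thread.

```python
BANK_SIZE_S = 8  # eight single-precision registers per bank


def bank_of(s_reg):
    """Return the bank number (0..3) of single-precision register s_reg."""
    if not 0 <= s_reg <= 31:
        raise ValueError("single-precision register must be 0..31")
    return s_reg // BANK_SIZE_S


def operation_kind(arity, sd, sm):
    """Classify 'op sd, sn, sm' as 'scalar', 'mixed' or 'vector'.

    arity is the configured vector length; sd is the destination and
    sm the last source operand. The 'mixed' rule is an assumption.
    """
    if arity == 1 or bank_of(sd) == 0:
        return "scalar"   # a destination in Bank #0 always means scalar
    if bank_of(sm) == 0:
        return "mixed"    # vector destination, scalar last operand
    return "vector"


print(operation_kind(4, 0, 4))     # scalar
print(operation_kind(4, 16, 20))   # vector
```

With the arity at 4, a destination in S0-S7 still gives a scalar operation, which is the point of keeping Bank #0 separate.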

Of course, now that the VFP short-vector mode is deprecated on ARMv7-based processors such as the Cortex-A family, NEON is the way to go. VFP instructions with the arity set above 1 on ARMv7/NEON processors will run much slower, so you should avoid them on those platforms and use the NEON pipeline instead.

Edit: There was another interesting property of register addressing when used in vector operations that I had forgotten to mention. The subsequent registers comprising a vector would wrap around on the register bank boundaries. So if you issued the following instruction when the vector arity was set to 4:

fadds s16, s14, s20

The first vector operand starting at register s14 would wrap around so that it would be {s14, s15, s8, s9}. It would be the equivalent of:

s16 = s14 + s20
s17 = s15 + s21
s18 = s8 + s22
s19 = s9 + s23

You could exploit this trick to perform shuffling of vector components without additional instructions.
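
The wrap-around can be modelled in a few lines of Python. This is only a sketch: `vector_registers` and `fadds_vector` are hypothetical names, the stride is assumed to be 1 by default, and all operands are read before any result is written, which sidesteps the question of overlapping vectors.

```python
BANK_SIZE = 8  # eight single-precision registers per bank


def vector_registers(start, arity, stride=1):
    """List the registers of a vector starting at s<start>.

    Register numbers wrap around inside the bank that contains start.
    """
    bank_base = (start // BANK_SIZE) * BANK_SIZE
    offset = start % BANK_SIZE
    return [bank_base + (offset + i * stride) % BANK_SIZE
            for i in range(arity)]


def fadds_vector(regs, sd, sn, sm, arity):
    """Model a vector 'fadds sd, sn, sm' on a list of 32 floats, in place."""
    dst = vector_registers(sd, arity)
    lhs = vector_registers(sn, arity)
    rhs = vector_registers(sm, arity)
    # Simplification: compute every sum before storing any of them.
    sums = [regs[a] + regs[b] for a, b in zip(lhs, rhs)]
    for d, value in zip(dst, sums):
        regs[d] = value


print(vector_registers(14, 4))      # [14, 15, 8, 9]

regs = [float(i) for i in range(32)]  # s<n> initially holds the value n
fadds_vector(regs, 16, 14, 20, 4)
print(regs[16:20])                  # [34.0, 36.0, 30.0, 32.0]
```

The second print matches the four scalar additions listed above: s16 = s14 + s20, s17 = s15 + s21, s18 = s8 + s22, s19 = s9 + s23.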

This post has been edited by JesseT: 19 January 2011 - 03:45 AM
