ARM NEON equivalent of Intel SSE
#1
Posted 15 May 2012 - 02:35 AM
I'm trying to convert some Intel SSE code to ARM NEON, and I can't find the NEON equivalents of the SSE single-precision scalar instructions _mm_mul_ss and _mm_add_ss. Please help.
thanks
#2
Posted 15 May 2012 - 07:09 AM
* multiply out everything and merge two result registers to get what you want
* do the scalar multiply on the ARM core using normal FPU instructions, and then merge that in. You could use single-lane load/store rather than register mangling in this case, which _may_ be faster.
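
For illustration, here is a minimal sketch of both options in C using the standard `arm_neon.h` intrinsics. The function names (`mul_ss_merge`, `add_ss_merge`, `mul_ss`, `add_ss`, `mul_ss_scalar`) are made up for this example, and the array-based wrappers with a plain-C fallback are only there so the sketch also builds on a machine without NEON. Picking lane 0 out of the full result with `vgetq_lane_f32`/`vsetq_lane_f32` is just one way to do the merge, and it has not been benchmarked.

```c
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>

/* Option 1: multiply all four lanes, then merge lane 0 of the product
   into a copy of `a`, so lanes 1..3 keep the values from `a`
   (the behaviour of _mm_mul_ss). */
static float32x4_t mul_ss_merge(float32x4_t a, float32x4_t b)
{
    float32x4_t prod = vmulq_f32(a, b);
    return vsetq_lane_f32(vgetq_lane_f32(prod, 0), a, 0);
}

/* Same idea for addition (the behaviour of _mm_add_ss). */
static float32x4_t add_ss_merge(float32x4_t a, float32x4_t b)
{
    float32x4_t sum = vaddq_f32(a, b);
    return vsetq_lane_f32(vgetq_lane_f32(sum, 0), a, 0);
}
#endif

/* Hypothetical wrapper: out = { a[0]*b[0], a[1], a[2], a[3] } */
void mul_ss(const float a[4], const float b[4], float out[4])
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    vst1q_f32(out, mul_ss_merge(vld1q_f32(a), vld1q_f32(b)));
#else
    /* Plain-C reference for the expected result. */
    out[0] = a[0] * b[0];
    out[1] = a[1]; out[2] = a[2]; out[3] = a[3];
#endif
}

/* Hypothetical wrapper: out = { a[0]+b[0], a[1], a[2], a[3] } */
void add_ss(const float a[4], const float b[4], float out[4])
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    vst1q_f32(out, add_ss_merge(vld1q_f32(a), vld1q_f32(b)));
#else
    out[0] = a[0] + b[0];
    out[1] = a[1]; out[2] = a[2]; out[3] = a[3];
#endif
}

/* Option 2: do the multiply as ordinary scalar float code, then bring
   the result back into lane 0 with a single-lane load. */
void mul_ss_scalar(const float a[4], const float b[4], float out[4])
{
    float s = a[0] * b[0];
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    vst1q_f32(out, vld1q_lane_f32(&s, vld1q_f32(a), 0));
#else
    out[0] = s;
    out[1] = a[1]; out[2] = a[2]; out[3] = a[3];
#endif
}
```

In real code you would normally keep the values in `float32x4_t` registers and call the `*_merge` helpers directly. Which option is faster depends on the core, so it is worth measuring both.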
#3
Posted 15 May 2012 - 08:08 AM
isogen74, on 15 May 2012 - 07:09 AM, said:
* multiply out everything and merge two result registers to get what you want
* do the scalar multiply on the ARM core using normal FPU instructions, and then merge that in. You could use single-lane load/store rather than register mangling in this case, which _may_ be faster.
Thanks for the quick reply. At least that explains why I couldn't find those instructions. I think I'll use the first version to avoid going to and from the FPU.















