So far I have the following:
(1) Using the VFP floating point comparison:
vcmp.f64 d0, d6
vmrs APSR_nzcv, fpscr
vcmpeq.f64 d1, d7
vmrseq APSR_nzcv, fpscr
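If either 64-bit half happens to hold a NaN bit pattern, this version will not work: a NaN never compares equal, not even to itself, so two identical registers are reported as different.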
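To make the NaN problem concrete, here is a minimal portable C sketch (not ARM code) that models what happens when a 64-bit pattern is compared as a double instead of as raw bits. The helper names `bits_to_double`, `equal_as_double`, `equal_as_bits` and `demo_nan_mismatch` are hypothetical and exist only for this illustration. It assumes IEEE 754 doubles, which is what VFP uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a 64-bit pattern as a double, roughly the way
   vcmp.f64 looks at the contents of a D register. */
double bits_to_double(uint64_t bits) {
    double d;
    memcpy(&d, &bits, sizeof d);
    return d;
}

/* Hypothetical helper: equality decided by a floating-point compare. */
bool equal_as_double(uint64_t a, uint64_t b) {
    return bits_to_double(a) == bits_to_double(b);
}

/* Hypothetical helper: the bitwise equality that is actually wanted. */
bool equal_as_bits(uint64_t a, uint64_t b) {
    return a == b;
}

/* Shows the two cases where the floating-point view disagrees with the bits. */
void demo_nan_mismatch(void) {
    const uint64_t quiet_nan = 0x7FF8000000000000ull;
    const uint64_t plus_zero = 0x0000000000000000ull;
    const uint64_t minus_zero = 0x8000000000000000ull;

    /* Identical bits, but a NaN is never equal to anything. */
    assert(equal_as_bits(quiet_nan, quiet_nan));
    assert(!equal_as_double(quiet_nan, quiet_nan));

    /* Different bits, but +0.0 and -0.0 compare equal. */
    assert(!equal_as_bits(plus_zero, minus_zero));
    assert(equal_as_double(plus_zero, minus_zero));
}
```

Besides NaN, the sketch also shows the opposite failure: +0.0 and -0.0 have different bit patterns yet compare equal as floats, so version (1) can report a false match there as well.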
(2) Using the NEON narrowing and the VFP comparison (this time only once and in a NaN-safe manner):
vceq.i32 q15, q0, q3
vmovn.i32 d31, q15
vshl.s16 d31, d31, #8
vcmp.f64 d31, d29
vmrs APSR_nzcv, fpscr
The D29 register is preloaded beforehand with the matching 16-bit pattern in every lane:
vmov.i16 d29, #65280 ; 0xff00
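The reason version (2) is NaN-safe can be checked with a small portable C model of the three NEON steps (compare, narrow, shift). This is only a sketch: `narrowed_mask` and `is_nan_pattern` are hypothetical names, and lane 0 is assumed to sit in the low 16 bits of the 64-bit result.

```c
#include <stdbool.h>
#include <stdint.h>

/* Models vceq.i32 + vmovn.i32 + vshl.s16 #8 on four 32-bit lanes and
   returns the resulting 64-bit D register contents. */
uint64_t narrowed_mask(const uint32_t a[4], const uint32_t b[4]) {
    uint64_t d = 0;
    for (int lane = 0; lane < 4; ++lane) {
        /* vceq.i32: all ones if the lanes match, zero otherwise. */
        uint32_t eq = (a[lane] == b[lane]) ? 0xFFFFFFFFu : 0u;
        /* vmovn.i32: keep the low 16 bits of each lane. */
        uint16_t narrow = (uint16_t)eq;
        /* vshl.s16 #8: 0xFFFF becomes 0xFF00, zero stays zero. */
        uint16_t shifted = (uint16_t)(narrow << 8);
        d |= (uint64_t)shifted << (16 * lane);
    }
    return d;
}

/* True if the 64-bit pattern encodes an IEEE 754 double NaN:
   exponent field all ones and a non-zero mantissa. */
bool is_nan_pattern(uint64_t bits) {
    uint64_t exponent = (bits >> 52) & 0x7FFu;
    uint64_t mantissa = bits & 0x000FFFFFFFFFFFFFull;
    return exponent == 0x7FFu && mantissa != 0;
}
```

When all four lanes match, the result is 0xFF00FF00FF00FF00, which is the value held in D29. The top 16 bits are always either 0xFF00 or 0x0000, so the exponent field is 0x7F0 or 0x000 and can never be the all-ones 0x7FF that a NaN requires.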
My question is: is there anything better than this? Am I overlooking some obvious way to do it?
This post has been edited by Mircea: 30 January 2012 - 06:54 PM
















