author | David Mosberger-Tang <davidm@hpl.hp.com> | 2005-04-21 11:07:59 -0700 |
---|---|---|
committer | Tony Luck <tony.luck@intel.com> | 2005-04-21 11:07:59 -0700 |
commit | 821376bf15e692941f9235f13a14987009fd0b10 (patch) | |
tree | 2179380ee3eb38fb393719e6ce32b15e934c4a44 /include/asm-ia64/bitops.h | |
parent | d8470b7c13e11c18cf14a7e3180f0b00e715e4f0 (diff) |
[IA64] fix fls()

The ia64-version of fls() never worked as intended (the bitnumbering
was off by 1 and fls(0) was undefined). This patch fixes the problem
by using a popcnt-based fls(), which on McKinley-derived cores is
slightly faster than both ia64_fls() and generic_fls(). The resulting
code, however, is bigger (7-8 bundles instead of about 3 bundles).

Also switch ia64_popcnt() to __builtin_popcountl() for GCC v3.4 or
newer since the compiler can predicate that and schedule it better.

Thanks to Simon Derr and Matt Mackall for tracking down this bug.

Signed-off-by: David Mosberger-Tang <davidm@hpl.hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Diffstat (limited to 'include/asm-ia64/bitops.h')
-rw-r--r-- | include/asm-ia64/bitops.h | 21 |
1 files changed, 17 insertions, 4 deletions
diff --git a/include/asm-ia64/bitops.h b/include/asm-ia64/bitops.h
index 925d54cee47..7232528e2d0 100644
--- a/include/asm-ia64/bitops.h
+++ b/include/asm-ia64/bitops.h
@@ -314,8 +314,8 @@ __ffs (unsigned long x)
 #ifdef __KERNEL__
 
 /*
- * find_last_zero_bit - find the last zero bit in a 64 bit quantity
- * @x: The value to search
+ * Return bit number of last (most-significant) bit set. Undefined
+ * for x==0. Bits are numbered from 0..63 (e.g., ia64_fls(9) == 3).
  */
 static inline unsigned long
 ia64_fls (unsigned long x)
@@ -327,10 +327,23 @@ ia64_fls (unsigned long x)
 	return exp - 0xffff;
 }
 
+/*
+ * Find the last (most significant) bit set. Returns 0 for x==0 and
+ * bits are numbered from 1..32 (e.g., fls(9) == 4).
+ */
 static inline int
-fls (int x)
+fls (int t)
 {
-	return ia64_fls((unsigned int) x);
+	unsigned long x = t & 0xffffffffu;
+
+	if (!x)
+		return 0;
+	x |= x >> 1;
+	x |= x >> 2;
+	x |= x >> 4;
+	x |= x >> 8;
+	x |= x >> 16;
+	return ia64_popcnt(x);
 }
 
 /*
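
For readers who want to try the popcnt-based approach outside the kernel tree, the following is a minimal user-space sketch of the same technique. The names fls_sketch(), popcnt_ul() and fls_sketch_selfcheck() are hypothetical and not part of the patch; popcnt_ul() stands in for ia64_popcnt() and assumes a GCC-compatible compiler that provides __builtin_popcountl(). The shifts smear the most-significant set bit into every lower position, so the population count of the result equals the 1-based index of that bit.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for ia64_popcnt(): counts the set bits in x.
 * Assumes a GCC-compatible compiler with __builtin_popcountl().
 */
static inline unsigned long popcnt_ul(unsigned long x)
{
	return (unsigned long) __builtin_popcountl(x);
}

/*
 * User-space sketch of the popcnt-based fls() from the patch.
 * Returns 0 for t == 0; otherwise bits are numbered 1..32.
 */
static inline int fls_sketch(int t)
{
	/* Keep only the low 32 bits so a negative t does not sign-extend. */
	unsigned long x = t & 0xffffffffu;

	if (!x)
		return 0;
	/* Propagate the highest set bit into all lower bit positions. */
	x |= x >> 1;
	x |= x >> 2;
	x |= x >> 4;
	x |= x >> 8;
	x |= x >> 16;
	/* x is now a contiguous run of ones; its length is the answer. */
	return (int) popcnt_ul(x);
}

/* Checks the two examples given in the patch comments, plus edge cases. */
static inline void fls_sketch_selfcheck(void)
{
	assert(fls_sketch(0) == 0);   /* defined for zero, unlike ia64_fls() */
	assert(fls_sketch(1) == 1);
	assert(fls_sketch(9) == 4);   /* matches "fls(9) == 4" in the comment */
	assert(fls_sketch(-1) == 32); /* all 32 low bits set */
}
```

Calling fls_sketch_selfcheck() from a test program exercises the 1-based numbering and the fls(0) == 0 case that the commit message describes as previously broken.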