Age | Commit message (Collapse) | Author |
|
Add support for the hardware version of the Hamming weight function,
popcnt, present in CPUs which advertize it under CPUID, Function
0x0000_0001_ECX[23]. On CPUs which don't support it, we fallback to the
default lib/hweight.c sw versions.
A synthetic benchmark comparing popcnt with __sw_hweight64 showed almost
a 3x speedup on a F10h machine.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
LKML-Reference: <20100318112015.GC11152@aftab>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
|
|
Rename the extisting runtime hweight() implementations to
__arch_hweight(), rename the compile-time versions to __const_hweight()
and then have hweight() pick between them.
Suggested-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20100318111929.GB11152@aftab>
Acked-by: H. Peter Anvin <hpa@zytor.com>
LKML-Reference: <1265028224.24455.154.camel@laptop>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
|
|
|
|
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
sunxvr500: Ignore secondary output PCI devices.
sparc64: Implement perf_arch_fetch_caller_regs
sparc64: Update defconfig.
sparc64: Fix array size reported by vmemmap_populate()
sparc: Fix regset register window handling.
drivers/serial/sunsu.c: Correct use after free
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf: Always build the powerpc perf_arch_fetch_caller_regs version
perf: Always build the stub perf_arch_fetch_caller_regs version
perf, probe-finder: Build fix on Debian
perf/scripts: Tuple was set from long in both branches in python_process_event()
perf: Fix 'perf sched record' deadlock
perf, x86: Fix callgraphs of 32-bit processes on 64-bit kernels
perf, x86: Fix AMD hotplug & constraint initialization
x86: Move notify_cpu_starting() callback to a later stage
x86,kgdb: Always initialize the hw breakpoint attribute
perf: Use hot regs with software sched switch/migrate events
perf: Correctly align perf event tracing buffer
|
|
We provide regs->tstate, regs->tpc, regs->tnpc and
regs->u_regs[UREG_FP].
regs->tstate is necessary for:
user_mode() (via perf_exclude_event())
perf_misc_flags() (via perf_prepare_sample())
regs->tpc is necessary for:
perf_instruction_pointer() (via perf_prepare_sample())
and regs->u_regs[UREG_FP] is necessary for:
perf_callchain() (via perf_prepare_sample())
The regs->tnpc value is provided just to be tidy.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
vmemmap_populate() attempts to report the used index and total size of
vmemmap_table, but it wrongly shifts the total size so that it is
always shown as 0.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Now that software events use perf_arch_fetch_caller_regs() too, we
need the powerpc version to be always built.
Fixes the following build error:
(.text+0x3210): undefined reference to `perf_arch_fetch_caller_regs'
(.text+0x3324): undefined reference to `perf_arch_fetch_caller_regs'
(.text+0x33bc): undefined reference to `perf_arch_fetch_caller_regs'
(.text+0x33ec): undefined reference to `perf_arch_fetch_caller_regs'
(.text+0xd4a0): undefined reference to `perf_arch_fetch_caller_regs'
arch/powerpc/kernel/built-in.o:(.text+0xd528): more undefined references to `perf_arch_fetch_caller_regs' follow
make[1]: *** [.tmp_vmlinux1] Error 1
make: *** [sub-make] Error 2
Reported-by: Michael Ellerman <michael@ellerman.id.au>
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
|
|
* master.kernel.org:/home/rmk/linux-2.6-arm:
ARM: 5965/1: Fix soft lockup in at91 udc driver
ARM: 6006/1: ARM: Use the correct NOP size in memmove for Thumb-2 kernel builds
ARM: 6005/1: arm: kprobes: fix register corruption with jprobes
ARM: 6003/1: removing compilation warning from pl061.h
ARM: 6001/1: removing compilation warning comming from clkdev.h
ARM: 6000/1: removing compilation warning comming from <asm/irq.h>
ARM: 5999/1: Including device.h and resource.h header files in linux/amba/bus.h
ARM: 5997/1: ARM: Correct the VFPv3 detection
ARM: 5996/1: ARM: Change the mandatory barriers implementation (4/4)
ARM: 5995/1: ARM: Add L2x0 outer_sync() support (3/4)
ARM: 5994/1: ARM: Add outer_cache_fns.sync function pointer (2/4)
ARM: 5993/1: ARM: Move the outer_cache definitions into a separate file (1/4)
|
|
* 'merge' of git://git.secretlab.ca/git/linux-2.6:
powerpc/5200: in lpbfifo, flag DMA irqs as enabled after requesting them
powerpc/fsl: add device tree binding for QE firmware
of/flattree: Fix unhandled OF_DT_NOP tag when unflattening the device tree
|
|
When profiling a 32-bit process on a 64-bit kernel, callgraph tracing
stopped after the first function, because it has seen a garbage memory
address (tried to interpret the frame pointer, and return address as a
64-bit pointer).
Fix this by using a struct stack_frame with 32-bit pointers when the
TIF_IA32 flag is set.
Note that TIF_IA32 flag must be used, and not is_compat_task(), because
the latter is only set when the 32-bit process is executing a syscall,
which may not always be the case (when tracing page fault events for
example).
Signed-off-by: Török Edwin <edwintorok@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
LKML-Reference: <1268820436-13145-1-git-send-email-edwintorok@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Commit 3f6da39 ("perf: Rework and fix the arch CPU-hotplug hooks") moved
the amd northbridge allocation from CPUS_ONLINE to CPUS_PREPARE_UP
however amd_nb_id() doesn't work yet on prepare so it would simply bail
basically reverting to a state where we do not properly track node wide
constraints - causing weird perf results.
Fix up the AMD NorthBridge initialization code by allocating from
CPU_UP_PREPARE and installing it from CPU_STARTING once we have the
proper nb_id. It also properly deals with the allocation failing.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
[ robustify using amd_has_nb() ]
Signed-off-by: Stephane Eranian <eranian@google.com>
LKML-Reference: <1269353485.5109.48.camel@twins>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Because we need to have cpu identification things done by the time we run
CPU_STARTING notifiers.
( This init ordering will be relied on by the next fix. )
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1269353485.5109.48.camel@twins>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into perf/urgent
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6
* 'sh/for-2.6.34' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
sh: Fix up the SH-3 build for recent TLB changes.
sh: export return_address() symbol.
sh: Enable the mmu in start_secondary()
sh: Fix FDPIC binary loader
arch/sh/kernel: Use set_cpus_allowed_ptr
sh: Update ecovec_defconfig
USB gadget r8a66597-udc.c: duplicated include
sh: update the TLB replacement counter for entry wiring.
|
|
While the MMUCR.URB and ITLB/UTLB differentiation works fine for all SH-4
and later TLBs, these features are absent on SH-3. This splits out
local_flush_tlb_all() in to SH-4 and PTEAEX copies while restoring the
old SH-3 one, subsequently fixing up the build.
This will probably want some further reordering and tidying in the
future, but that's out of scope at present.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
|
|
This is needed with some of the tracing code built as modules, so provide
the export.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
|
|
Word copying is used only for aligned addresses.
Here is space for improving to use any better copying technique.
Look at memcpy implementation.
Signed-off-by: Michal Simek <monstr@monstr.eu>
|
|
If early printk console is not enabled then all messages
are written to log buffer.
Signed-off-by: Michal Simek <monstr@monstr.eu>
|
|
I forget to change register name in comments.
Signed-off-by: Michal Simek <monstr@monstr.eu>
|
|
TLB size was hardcoded in asm code. This patch brings ability
to change TLB size only in one place. (mmu.h).
Signed-off-by: Michal Simek <monstr@monstr.eu>
|
|
I forget to remove pci Kconfig option.
Signed-off-by: Michal Simek <monstr@monstr.eu>
|
|
On the base on GCOV analytics is helpful to add likely/unlikely
macros.
Signed-off-by: Michal Simek <monstr@monstr.eu>
|
|
Cachegrind analysis need this fix to be able to log asm functions.
Signed-off-by: Michal Simek <monstr@monstr.eu>
|
|
To be able to do trace TLB operations.
Signed-off-by: Michal Simek <monstr@monstr.eu>
|
|
Sync labels.
Signed-off-by: Michal Simek <monstr@monstr.eu>
|
|
RESR and REAR uses the same regs in whole file.
Signed-off-by: Michal Simek <monstr@monstr.eu>
|
|
This change synchronize register usage in code.
ESR = R4
EAR = R3
Signed-off-by: Michal Simek <monstr@monstr.eu>
|
|
Any sync branch must follow mts instructions not mfs.
Signed-off-by: Michal Simek <monstr@monstr.eu>
|
|
Disable debug option in asm code.
Signed-off-by: Michal Simek <monstr@monstr.eu>
|
|
When the system has no lmb bram, main memory should be start from
zero because of microblaze vectors.
DTS fragment could look like:
DDR2_SDRAM: memory@0 {
device_type = "memory";
reg = < 0x0 0x10000000 >;
} ;
Then you have to setup CONFIG_KERNEL_BASE_ADDR=0 which caused
that kernel physical start address will be zero. On reset vector place
will be jump to 0x100 and on 0x100 starts kernel text.
You have to solve how to load the kernel before cpu starts.
Tested with XMD.
Signed-off-by: Michal Simek <monstr@monstr.eu>
|
|
Last sync.
Signed-off-by: Michal Simek <monstr@monstr.eu>
|
|
Move to generic location.
Signed-off-by: Michal Simek <monstr@monstr.eu>
|
|
noMMU and MMU use them.
Signed-off-by: Michal Simek <monstr@monstr.eu>
|
|
Here is small regression on dhrystone tests and I think
that on all benchmarking tests. It is due to better checking
mechanism in put_user macro
Signed-off-by: Michal Simek <monstr@monstr.eu>
|
|
Use unified version.
Signed-off-by: Michal Simek <monstr@monstr.eu>
|
|
Previous patches fixed only MMU version and this is the first
patch for noMMU kernel
Signed-off-by: Michal Simek <monstr@monstr.eu>
|
|
Generic implementation for noMMU and MMU version
Signed-off-by: Michal Simek <monstr@monstr.eu>
|
|
copy_from_user macro also use copy_tofrom_user function
Signed-off-by: Michal Simek <monstr@monstr.eu>
|
|
noMMU and MMU kernel will use copy copy_tofrom_user
asm implementation.
Signed-off-by: Michal Simek <monstr@monstr.eu>
|
|
Add macro description and resort.
Signed-off-by: Michal Simek <monstr@monstr.eu>
|
|
Use FIXUP macros and resort them.
Signed-off-by: Michal Simek <monstr@monstr.eu>
|
|
It is used __FIXUP_SECTION and __EX_TABLE_SECTION macros.
Signed-off-by: Michal Simek <monstr@monstr.eu>
|
|
This is the first patch which does uaccess unification.
I choosed to do several patches to be able to use bisect
in future if any fault happens.
Signed-off-by: Michal Simek <monstr@monstr.eu>
|
|
The same noMMU and MMU functions should be placed in
one file.
Signed-off-by: Michal Simek <monstr@monstr.eu>
|
|
Just sort to be able remove whole block.
Signed-off-by: Michal Simek <monstr@monstr.eu>
|
|
I would like to use asm-generic uaccess.h where are segment
macros defined. This is just first step.
Signed-off-by: Michal Simek <monstr@monstr.eu>
|
|
We don't need to do it.
Signed-off-by: Michal Simek <monstr@monstr.eu>
|