Age | Commit message (Collapse) | Author |
|
Commit d57af9b (taskstats: use real microsecond granularity for CPU times)
renamed msecs_to_cputime to usecs_to_cputime, but failed to update all
numbers on the way. This causes nonsensical cpu idle/iowait values to be
displayed in /proc/stat (the only user of usecs_to_cputime so far).
This also renames __cputime_msec_factor to __cputime_usec_factor, adapting
its value and using it directly in cputime_to_usecs instead of doing two
multiplications.
Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Acked-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
read_n_cells() cannot be marked as .devinit.text since it is referenced
from two functions that are not in that section: of_get_lmb_size() and
hot_add_drconf_scn_to_nid().
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
mark_reserved_regions_for_nid() is only called from do_init_bootmem(),
which is in .init.text, so it must be in the same section to avoid a
section mismatch warning.
Reported-by: Subrata Modak <subrata@linux.vnet.ibm.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
PPC64 uses long long for u64 in the kernel, but powerpc's asm/types.h
prevents 64-bit userland from seeing this definition, instead defaulting
to u64 == long in userspace. Some user programs (e.g. kvmtool) may actually
want LL64, so this patch adds a check for __SANE_USERSPACE_TYPES__ so that,
if defined, int-ll64.h is included instead.
Signed-off-by: Matt Evans <matt@ozlabs.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Implement a POWER7 optimised copy_to_user/copy_from_user using VMX.
For large aligned copies this new loop is over 10% faster, and for
large unaligned copies it is over 200% faster.
If we take a fault we fall back to the old version, this keeps
things relatively simple and easy to verify.
On POWER7 unaligned stores rarely slow down - they only flush when
a store crosses a 4KB page boundary. Furthermore this flush is
handled completely in hardware and should be 20-30 cycles.
Unaligned loads on the other hand flush much more often - whenever
crossing a 128 byte cache line, or a 32 byte sector if either sector
is an L1 miss.
Considering this information we really want to get the loads aligned
and not worry about the alignment of the stores. Microbenchmarks
confirm that this approach is much faster than the current unaligned
copy loop that uses shifts and rotates to ensure both loads and
stores are aligned.
We also want to try and do the stores in cacheline aligned, cacheline
sized chunks. If the store queue is unable to merge an entire
cacheline of stores then the L2 cache will have to do a
read/modify/write. Even worse, we will serialise this with the stores
in the next iteration of the copy loop since both iterations hit
the same cacheline.
Based on this, the new loop does the following things:
1 - 127 bytes
Get the source 8 byte aligned and use 8 byte loads and stores. Pretty
boring and similar to how the current loop works.
128 - 4095 bytes
Get the source 8 byte aligned and use 8 byte loads and stores,
1 cacheline at a time. We aren't doing the stores in cacheline
aligned chunks so we will potentially serialise once per cacheline.
Even so it is much better than the loop we have today.
4096 - bytes
If both source and destination have the same alignment get them both
16 byte aligned, then get the destination cacheline aligned. Do
cacheline sized loads and stores using VMX.
If source and destination do not have the same alignment, we get the
destination cacheline aligned, and use permute to do aligned loads.
In both cases the VMX loop should be optimal - we always do aligned
loads and stores and are always doing stores in cacheline aligned,
cacheline sized chunks.
To be able to use VMX we must be careful about interrupts and
sleeping. We don't use the VMX loop when in an interrupt (which should
be rare anyway) and we wrap the VMX loop in disable/enable_pagefault
and fall back to the existing copy_tofrom_user loop if we do need to
sleep.
The VMX breakpoint of 4096 bytes was chosen using this microbenchmark:
http://ozlabs.org/~anton/junkcode/copy_to_user.c
Since we are using VMX and there is a cost to saving and restoring
the user VMX state there are two broad cases we need to benchmark:
- Best case - userspace never uses VMX
- Worst case - userspace always uses VMX
In reality a userspace process will sit somewhere between these two
extremes. Since we need to test both aligned and unaligned copies we
end up with 4 combinations. The point at which the VMX loop begins to
win is:
0% VMX
aligned 2048 bytes
unaligned 2048 bytes
100% VMX
aligned 16384 bytes
unaligned 8192 bytes
Considering this is a microbenchmark, the data is hot in cache and
the VMX loop has better store queue merging properties we set the
breakpoint to 4096 bytes, a little below the unaligned breakpoints.
Some future optimisations we can look at:
- Looking at the perf data, a significant part of the cost when a
task is always using VMX is the extra exception we take to restore
the VMX state. As such we should do something similar to the x86
optimisation that restores FPU state for heavy users. ie:
/*
* If the task has used fpu the last 5 timeslices, just do a full
* restore of the math state immediately to avoid the trap; the
* chances of needing FPU soon are obviously high now
*/
preload_fpu = tsk_used_math(next_p) && next_p->fpu_counter > 5;
and
/*
* fpu_counter contains the number of consecutive context switches
* that the FPU is used. If this is over a threshold, the lazy fpu
* saving becomes unlazy to save the trap. This is an unsigned char
* so that after 256 times the counter wraps and the behavior turns
* lazy again; this to deal with bursty apps that only use FPU for
* a short time
*/
- We could create a paca bit to mirror the VMX enabled MSR bit and check
that first, avoiding multiple calls to calling enable_kernel_altivec.
That should help with iovec based system calls like readv.
- We could have two VMX breakpoints, one for when we know the user VMX
state is loaded into the registers and one when it isn't. This could
be a second bit in the paca so we can calculate the break points quickly.
- One suggestion from Ben was to save and restore the VSX registers
we use inline instead of using enable_kernel_altivec.
[BenH: Fixed a problem with preempt and fixed build without CONFIG_ALTIVEC]
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Last merge window the memory maps for U300 were simplified so
we can now safely delete memory.h.
Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
The clock for the smp_twd block is not equal to the CPU
frequency, actually it is divided by two, so fix this,
and set the initial frequency to half of 1GHz which is
the most common case.
Reported-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
The DB8500 ED (Early Drop) and V1 are only available inside of
ST-Ericsson or partners, we have actively replaced and scrapped
these prototypes. All Nova products on the open market (such as
the Snowball board) are based on V2 and later ASIC variants.
So let us focus on supporting the silicon that will be used and
delete this to get a clear overview.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
A few new addresses for newly supported peripherals and SRAM base
offsets.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
Use platform_device_register_simple() rather than a static
struct, so we create and register the PMU device on-the-fly.
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Rabin Vincent <rabin.vincent@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
This adds a few CPU identification functions for the U5500 variants.
Contains portions of code written by Rabin Vincent.
Cc: Rabin Vincent <rabin.vincent@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
Extend the ux500 ID table to cover the DB8520 variant.
Signed-off-by: Rabin Vincent <rabin.vincent@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
The overlapping iotable mapping entries for the ux500 Cortex
A9 SCU, CPU control and TWD are no longer accepted by the
kernel. Remove the overlaps so the machine boots again.
Cc: Srinidhi Kasagar <srinidhi.kasagar@stericsson.com>
Cc: Rabin Vincent <rabin.vincent@stericsson.com>
Reported-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
sched_clock() is yet another blocker on the road to the single
image. This patch implements an idea by Russell King:
http://www.spinics.net/lists/linux-omap/msg49561.html
Instead of asking the platform to implement both sched_clock()
itself and the rollover callback, simply register a read()
function, and let the ARM code care about sched_clock() itself,
the conversion to ns and the rollover. sched_clock() uses
this read() function as an indirection to the platform code.
If the platform doesn't provide a read(), the code falls back
to the jiffy counter (just like the default sched_clock).
This allow some simplifications and possibly some footprint gain
when multiple platforms are compiled in. Among the drawbacks,
the removal of the *_fixed_sched_clock optimization which could
negatively impact some platforms (sa1100, tegra, versatile
and omap).
Tested on 11MPCore, OMAP4 and Tegra.
Cc: Imre Kaloz <kaloz@openwrt.org>
Cc: Eric Miao <eric.y.miao@gmail.com>
Cc: Colin Cross <ccross@android.com>
Cc: Erik Gilling <konkers@android.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Sascha Hauer <kernel@pengutronix.de>
Cc: Alessandro Rubini <rubini@unipv.it>
Cc: STEricsson <STEricsson_nomadik_linux@list.st.com>
Cc: Lennert Buytenhek <kernel@wantstofly.org>
Cc: Ben Dooks <ben-linux@fluff.org>
Tested-by: Jamie Iles <jamie@jamieiles.com>
Tested-by: Tony Lindgren <tony@atomide.com>
Tested-by: Kyungmin Park <kyungmin.park@samsung.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Nicolas Pitre <nico@linaro.org>
Acked-by: Krzysztof Halasa <khc@pm.waw.pl>
Acked-by: Kukjin Kim <kgene.kim@samsung.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
Allow the platform to be restarted by triggering the watchdog to expire
with the shortest possible expiry. This should reset the CPU core and
all on-chip peripherals.
v2: - use writel_relaxed().
Signed-off-by: Jamie Iles <jamie@jamieiles.com>
|
|
Now that we have lost our machine specific ioremap() we just have one
mapping that covers all peripherals. Move this to common.c to simplify
things a little.
Signed-off-by: Jamie Iles <jamie@jamieiles.com>
|
|
All irq_desc's are now dynamically allocated so we don't need to
statically reserve them.
v2: - select SPARSE_IRQ and set .nr_irqs to NR_IRQS_LEGACY to skip
ISA and IRQ 0.
Signed-off-by: Jamie Iles <jamie@jamieiles.com>
|
|
mach/memory.h is no longer required for simple platforms so remove it
for picoxcell.
Signed-off-by: Jamie Iles <jamie@jamieiles.com>
|
|
LAPIC related statistics are grouped inside the per-cpu
structure irq_stat, so there is no need for icr_read_retry_count
to be a standalone per-cpu variable.
This patch moves icr_read_retry_count to where it belongs.
Suggested-y: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Cc: Jörn Engel <joern@logfs.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
|
|
Added additional debug output that we always seem to add
during power ons to validate firmware operation.
Signed-off-by: Michael Demeter <michael.demeter@intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Link: http://lkml.kernel.org/r/20111215223116.10166.50803.stgit@bob.linux.org.uk
[ fixed line breaks, formatting and commit title. ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras into x86/mce
|
|
This sets all up the other bits that need to be INTEL_MID
specific rather than Moorestown specific.
Signed-off-by: Alan Cox <alan@linux.intel.com>
Link: http://lkml.kernel.org/r/20111217174318.7207.91543.stgit@bob.linux.org.uk
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
gcc noticed (when using -Wempty-body) that our use of
lock_cmos() and unlock_cmos() in
arch/x86/include/asm/mach_traps.h is potentially problematic :
arch/x86/include/asm/mach_traps.h:32:15: warning: suggest braces around empty body in an ¡else¢ statement [-Wempty-body]
arch/x86/include/asm/mach_traps.h:40:16: warning: suggest braces around empty body in an ¡else¢ statement [-Wempty-body]
Let's just use the standard 'do {} while (0)' solution. That
shuts up gcc and also prevents future problems if the macros
should end up being used in a similar situation elsewhere.
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Link: http://lkml.kernel.org/r/alpine.LNX.2.00.1112180103130.21784@swampdragon.chaosbits.net
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
If one builds the kernel with -Wempty-body one gets this
warning:
mm/memory.c:3432:46: warning: suggest braces around empty body in an ¡if¢ statement [-Wempty-body]
due to the fact that 'flush_tlb_fix_spurious_fault' is a macro
that can sometimes be defined to nothing.
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: linux-mm@kvack.org
Cc: Michel Lespinasse <walken@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Link: http://lkml.kernel.org/r/alpine.LNX.2.00.1112180128070.21784@swampdragon.chaosbits.net
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
The APB timer requires SFI, SCU and MID support
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Link: http://lkml.kernel.org/r/20111217215719.3743.93550.stgit@bob.linux.org.uk
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Add support for the tegra30 based cardhu development board. Cardhu is a tablet
formfactor reference design for tegra30. The patch provides a device tree for
the board, updates Makefile.boot to build the dtb, includes the platform in
Kconfig and updates board-dt.c.
Signed-off-by: Peter De Schrijver <pdeschrijver@nvidia.com>
Acked-by: Stephen Warren <swarren@nvidia.com>
Acked-by: Colin Cross <ccross@android.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
Add support for tegra30 SoC. This includes a device tree compatible type for
this SoC ("nvidia,tegra30") and adds L2 cache initialization for this new SoC.
The clock framework is still missing, which prevents most drivers from working.
The basic IRQs are the same, so remove the dependency on
CONFIG_ARCH_TEGRA_2x_SOC.
Signed-off-by: Peter De Schrijver <pdeschrijver@nvidia.com>
Acked-by: Stephen Warren <swarren@nvidia.com>
Acked-by: Colin Cross <ccross@android.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
Define the pinmuxing and pindrive tables for tegra30. The pinmux table defines
the available functions for each pinmux group. The pindrive table defines the
default pullup or pulldowns for each group.
Derived from code by Scott Williams (scwilliams@nvidia.com)
Signed-off-by: Peter De Schrijver <pdeschrijver@nvidia.com>
Acked-by: Stephen Warren <swarren@nvidia.com>
Acked-by: Colin Cross <ccross@android.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
Add new fields to struct tegra_pingroup_desc to support new hardware features
introduced in the tegra30 SoC. The pinmux driver won't use those fields yet,
but the tegra30 pinmux tables will already provide the necessary data.
Signed-off-by: Peter De Schrijver <pdeschrijver@nvidia.com>
Acked-by: Stephen Warren <swarren@nvidia.com>
Acked-by: Colin Cross <ccross@android.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
This patch modifies the pinmux code to be useable for multiple tegra variants.
Some tegra20 specific constants will be replaced by variables which will be
initialized to the appropriate value at runtime.
Signed-off-by: Peter De Schrijver <pdeschrijver@nvidia.com>
Acked-by: Stephen Warren <swarren@nvidia.com>
Acked-by: Colin Cross <ccross@android.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
Rename pinmux-t2.h and pinmux-t2-tables.c to the new tegra naming. This file
will be reworked somewhat in the next patch to support multiple tegra SoC
types.
Signed-off-by: Peter De Schrijver <pdeschrijver@nvidia.com>
Acked-by: Stephen Warren <swarren@nvidia.com>
Acked-by: Colin Cross <ccross@android.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
Generalize L2 cache initialization and discover L2 cache associativity at
runtime.
Signed-off-by: Peter De Schrijver <pdeschrijver@nvidia.com>
Acked-by: Stephen Warren <swarren@nvidia.com>
Acked-by: Colin Cross <ccross@android.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
Use PMC reset rather then CAR system reset as recommended by the hardware
team.
Signed-off-by: Peter De Schrijver <pdeschrijver@nvidia.com>
Acked-by: Stephen Warren <swarren@nvidia.com>
Acked-by: Colin Cross <ccross@android.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
Tegra20 based boards will be handled by the current board-dt.c file. Tegra30
based boards will be handled by a new board-dt-tegra30.c file. Hence rename
the existing board-dt.c to board-dt-tegra20.c to reflect its use.
Signed-off-by: Peter De Schrijver <pdeschrijver@nvidia.com>
Acked-by: Stephen Warren <swarren@nvidia.com>
Acked-by: Colin Cross <ccross@android.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
This patch splits the early init code in a common and a tegra20 specific part.
Signed-off-by: Peter De Schrijver <pdeschrijver@nvidia.com>
Acked-by: Stephen Warren <swarren@nvidia.com>
Acked-by: Colin Cross <ccross@android.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
don't export clk_measure_input_freq as its functionality is also available
using clk_get_rate().
Signed-off-by: Peter De Schrijver <pdeschrijver@nvidia.com>
Acked-by: Stephen Warren <swarren@nvidia.com>
Acked-by: Colin Cross <ccross@android.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
Rework the tegra20 clock code to support multiple tegra variants :
* remove tegra2_periph_reset_assert/tegra2_periph_reset_deassert. This
functionality should be in clock.c.
* remove tegra_sdmmc_tap_delay and export tegra2_sdmmc_tap_delay
directly. This feature is handled inside the sdmmc block from tegra30
onwards. So there is no need for support in the clock code beyond
tegra20. There are no in tree users of this function.
Signed-off-by: Peter De Schrijver <pdeschrijver@nvidia.com>
Acked-by: Stephen Warren <swarren@nvidia.com>
Acked-by: Colin Cross <ccross@android.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
* add a dependency to ARCH_TEGRA_2x_SOC in Kconfig to all tegra20 based boards
and TEGRA_PCI
* make powergating dependent on ARCH_TEGRA_2x_SOC
* remove dependency on ARCH_TEGRA_2x_SOC for clock.c
Signed-off-by: Peter De Schrijver <pdeschrijver@nvidia.com>
Acked-by: Stephen Warren <swarren@nvidia.com>
Acked-by: Colin Cross <ccross@android.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
The timer and rtc-timer clocks aren't gated by default, so there is no reason
to crash the system if the dummy enable call failed.
Signed-off-by: Peter De Schrijver <pdeschrijver@nvidia.com>
Acked-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
Conflicts:
arch/arm/mach-tegra/board-dt.c
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
This patch adds the initial device tree for tegra30
Signed-off-by: Peter De Schrijver <pdeschrijver@nvidia.com>
Acked-by: Stephen Warren <swarren@nvidia.com>
Acked-by: Colin Cross <ccross@android.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
|
|
Use snd_soc_register_card() instead of creating a "soc-audio" platform device.
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
|
|
Use snd_soc_register_card() instead of creating a "soc-audio" platform device.
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
|
|
Use snd_soc_register_card() instead of creating a "soc-audio" platform device.
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
|
|
Use snd_soc_register_card() instead of creating a "soc-audio" platform device.
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
|
|
Use snd_soc_register_card() instead of creating a "soc-audio" platform device.
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
|
|
All McBSP instances on OMAP4 has 128 word long FIFO
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
|
|
git://git.pwsan.com/linux-2.6 into ehci
|