|
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC driver update from Kevin Hilman:
"This contains the ARM SoC related driver updates for v3.12. The only
things this cycle are core PM updates and CPUidle support for ARM's TC2
big.LITTLE development platform"
* tag 'drivers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
cpuidle: big.LITTLE: vexpress-TC2 CPU idle driver
ARM: vexpress: tc2: disable GIC CPU IF in tc2_pm_suspend
drivers: irq-chip: irq-gic: introduce gic_cpu_if_down()
|
|
The coupled cpuidle waiting loop clears pending pokes before
entering the safe state. If a poke arrives just before the
pokes are cleared, but after the while loop condition checks,
the poke will be lost and the cpu will stay in the safe state
until another interrupt arrives. This may cause the cpu that
sent the poke to spin in the ready loop with interrupts off
until another cpu receives an interrupt, and if no other cpus
have interrupts routed to them it can spin forever.
Change the return value of cpuidle_coupled_clear_pokes to
indicate whether a poke was cleared, and move the need_resched()
checks into the callers. In the waiting loop, if
a poke was cleared, restart the loop to repeat the while
condition checks.
Reported-by: Neil Zhang <zhangwm@marvell.com>
Signed-off-by: Colin Cross <ccross@android.com>
Cc: 3.6+ <stable@vger.kernel.org> # 3.6+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Joseph Lo <josephl@nvidia.com> reported a lockup on Tegra20 caused
by a race condition in coupled cpuidle. When two or more cpus
enter idle at the same time, the first cpus to arrive may go to the
ready loop without processing pending pokes from the last cpu to
arrive.
This patch adds a check for pending pokes once all cpus have been
synchronized in the ready loop and resets the coupled state and
retries if any cpus failed to handle their pending poke.
Retrying on all cpus may trigger the same issue again, so this patch
also adds a check to ensure that each cpu has received at least one
poke between when it enters the waiting loop and when it moves on to
the ready loop.
Reported-and-tested-by: Joseph Lo <josephl@nvidia.com>
Tested-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Colin Cross <ccross@android.com>
Cc: 3.6+ <stable@vger.kernel.org> # 3.6+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Calling cpuidle_enter_state is expected to return with interrupts
enabled, but interrupts must be disabled before starting the
ready loop synchronization stage. Call local_irq_disable after
each call to cpuidle_enter_state for the safe state.
Tested-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Colin Cross <ccross@android.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
From Lorenzo Pieralisi:
This patch series contains:
- GIC driver update to add a method to disable the GIC CPU IF
- TC2 MCPM update to add GIC CPU disabling to suspend method
- TC2 CPU idle big.LITTLE driver
* cpuidle/biglittle:
cpuidle: big.LITTLE: vexpress-TC2 CPU idle driver
ARM: vexpress: tc2: disable GIC CPU IF in tc2_pm_suspend
drivers: irq-chip: irq-gic: introduce gic_cpu_if_down()
ARM: vexpress/TC2: implement PM suspend method
ARM: vexpress/TC2: basic PM support
ARM: vexpress: Add SCC to V2P-CA15_A7's device tree
ARM: vexpress/TC2: add Serial Power Controller (SPC) support
ARM: vexpress/dcscb: fix cache disabling sequences
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
The big.LITTLE architecture is composed of two clusters of cpus. One cluster
contains less powerful but more energy efficient processors and the other
cluster groups the powerful but energy-intensive cpus.
The TC2 testchip implements two clusters of CPUs (A7 and A15 clusters in
a big.LITTLE configuration) connected through a CCI interconnect that manages
coherency of their respective L2 caches and intercluster distributed
virtual memory messages (DVM).
The TC2 testchip integrates a power controller that manages core resets, wake-up
IRQs and cluster low-power states. Power states are managed at cluster
level, which means that voltage is removed from a cluster iff all cores
in a cluster are in a wfi state. Single cores can enter a reset state
which is identical to wfi in terms of power consumption but simplifies the
way cluster states are entered.
This patch provides a multiple driver CPU idle implementation for TC2
which paves the way for a generic big.LITTLE idle driver for all
upcoming big.LITTLE based systems on chip.
The driver relies on the MCPM infrastructure to coordinate and manage
core power states; in particular, MCPM allows specific cores to be suspended
and hides the CPU coordination required to shut down clusters of CPUs.
Power down sequences for the respective clusters are implemented in the
MCPM TC2 backend, with all code needed to clean caches and exit coherency.
The multiple driver CPU idle infrastructure allows different C-states to be
defined for big and little cores, determined at boot by checking the
part id of the possible CPUs and initializing the respective logical
masks in the big and little drivers.
Current big.LITTLE systems are composed of A7 and A15 clusters, as
implemented in TC2, but in the future that may change and the driver
will have to evolve to retrieve what is a 'big' cpu and what is a 'little'
cpu in order to build the correct topology.
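The part id check described above can be sketched in plain C. This is only an illustration, not the driver's code: the helper names (midr_part, build_cpu_mask) and the array of MIDR readings are assumptions, and a real driver would use the kernel's own CPU id helpers and cpumask types. The part numbers 0xC07 (Cortex-A7) and 0xC0F (Cortex-A15) are the values found in MIDR bits [15:4].

```c
#include <stdint.h>

/* Primary part numbers as encoded in MIDR bits [15:4]. */
#define PART_CORTEX_A7   0xC07u
#define PART_CORTEX_A15  0xC0Fu

/* Extract the primary part number field from a MIDR value. */
static unsigned int midr_part(uint32_t midr)
{
	return (midr >> 4) & 0xFFFu;
}

/*
 * Build a bitmask of the logical CPUs whose part number matches 'part'.
 * 'midr' holds one (hypothetical) MIDR reading per possible CPU.
 */
static unsigned long build_cpu_mask(const uint32_t *midr, unsigned int ncpus,
				    unsigned int part)
{
	unsigned long mask = 0;
	unsigned int cpu;

	for (cpu = 0; cpu < ncpus; cpu++)
		if (midr_part(midr[cpu]) == part)
			mask |= 1UL << cpu;

	return mask;
}
```

For example, with two A15 cores enumerated first and three A7 cores after them, the 'big' mask comes out as 0x3 and the 'little' mask as 0x1c.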
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Amit Kucheria <amit.kucheria@linaro.org>
Cc: Olof Johansson <olof@lixom.net>
Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
Field predicted_us value can never exceed expected_us value, but it has
a potentially larger type. As there is no need for additional 32 bits of
zeroes on 32 bit platforms, change the type of predicted_us to match the
type of expected_us.
Field correction_factor is used to store a value that cannot exceed the
product of RESOLUTION and DECAY (default 1024*8 = 8192). In practice the
constants cannot be raised to values large enough to overflow an unsigned
int, even on 32 bit systems, so the type is changed to avoid unnecessary
64 bit arithmetic on 32 bit systems.
One multiplication of (now) 32 bit values needs a cast to avoid
truncation of the result, and that cast has been added.
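Why the cast matters can be shown with a small userspace sketch. The function names are made up for illustration and the example assumes a platform where unsigned int is 32 bits wide: without the cast the product is computed in 32 bits and only then widened.

```c
#include <stdint.h>

/* Product is computed in 32 bits and wraps before being widened. */
static uint64_t mul_truncating(uint32_t a, uint32_t b)
{
	return a * b;
}

/* Casting one operand first makes the multiplication itself 64 bit. */
static uint64_t mul_widening(uint32_t a, uint32_t b)
{
	return (uint64_t)a * b;
}
```

With a = 1000000 and b = 8192, the widening form gives 8192000000 while the truncating form wraps modulo 2^32 to 3897032704.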
In order to avoid another multiplication from 32 bit domain to 64 bit
domain, the new correction_factor calculation has been changed from
new = old * (DECAY-1) / DECAY
to
new = old - old / DECAY,
which with infinite precision would yield exactly the same result, but
now changes the direction of rounding. The impact is not significant as
the maximum accumulated difference cannot exceed the value of DECAY,
which is relatively small compared to product of RESOLUTION and DECAY
(8 / 8192).
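The two forms of the decay step can be compared directly. The sketch below is a standalone illustration using the default RESOLUTION and DECAY values quoted above; the function names are assumptions and only the decay part of the correction_factor update is shown.

```c
#include <stdint.h>

#define RESOLUTION 1024u
#define DECAY      8u

/* Old form: multiply first, which in general needs a 64 bit intermediate. */
static unsigned int decay_mul(unsigned int old)
{
	return (unsigned int)(((uint64_t)old * (DECAY - 1)) / DECAY);
}

/* New form: stays within 32 bits, but rounds up instead of down. */
static unsigned int decay_sub(unsigned int old)
{
	return old - old / DECAY;
}
```

For old = 8192 both forms give 7168. For old = 8191 the old form gives 7167 and the new form 7168, so a single step differs by at most one.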
Signed-off-by: Tuukka Tikkanen <tuukka.tikkanen@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The menu governor has a number of tunable constants that may be changed
in the source. If certain combinations of values are chosen, an overflow
is possible when the correction_factor is being recalculated.
This patch adds a warning regarding this possibility and describes the
change needed for fixing the issue. The change should not be permanently
enabled, as it will hurt performance when it is not needed.
Signed-off-by: Tuukka Tikkanen <tuukka.tikkanen@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The menu governor uses a static function get_typical_interval() to
try to detect a repeating pattern of wakeups. The previous interval
durations are stored as an array of unsigned ints, but the arithmetic
in the function is performed exclusively as 64 bit values, even when
the value stored in a variable is known not to exceed unsigned int,
which may be smaller and more efficient on some platforms.
This patch changes the types of variables used to store some
intermediates, the maximum and the cutoff threshold to unsigned
ints. Average and standard deviation are still treated as 64 bit values,
even when the values are known to be within the domain of unsigned int,
to avoid casts to ensure correct integer promotion for arithmetic
operations.
Signed-off-by: Tuukka Tikkanen <tuukka.tikkanen@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Struct menu_device member intervals is declared as u32, but the value
stored is (unsigned) int. The type is changed to match the value being
stored.
Signed-off-by: Tuukka Tikkanen <tuukka.tikkanen@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The function get_typical_interval() initializes a number of variables
that are assigned constant values immediately after their declarations.
In addition, there are multiple assignments on a single line, which
is explicitly forbidden by Documentation/CodingStyle.
This patch removes redundant initial values for the variables and
breaks up the multiple assignment line.
Signed-off-by: Tuukka Tikkanen <tuukka.tikkanen@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
get_typical_interval() uses int_sqrt() in calculation of standard
deviation. The formal parameter of int_sqrt() is unsigned long, which
may on some platforms be smaller than the 64 bit unsigned integer used
as the actual parameter. The overflow can occur frequently when actual
idle period lengths are in hundreds of milliseconds.
This patch adds a check for such overflow and rejects the candidate
average when an overflow would occur.
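The overflow is easy to reproduce in userspace. The sketch below emulates a platform with a 32 bit unsigned long by using uint32_t for the square root argument; isqrt32 and stddev_from_variance are hypothetical names, not the kernel's int_sqrt() or the governor's actual code. A deviation of 65536 us (about 65 ms) already squares to 2^32, which no longer fits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Integer square root over a 32 bit argument (digit-by-digit method). */
static uint32_t isqrt32(uint32_t x)
{
	uint32_t result = 0;
	uint32_t bit = 1u << 30; /* highest power of four that fits */

	while (bit > x)
		bit >>= 2;

	while (bit != 0) {
		if (x >= result + bit) {
			x -= result + bit;
			result = (result >> 1) + bit;
		} else {
			result >>= 1;
		}
		bit >>= 2;
	}
	return result;
}

/*
 * Returns false (candidate rejected) when the 64 bit variance would be
 * truncated by the 32 bit parameter; otherwise stores the deviation.
 */
static bool stddev_from_variance(uint64_t variance, uint32_t *stddev)
{
	if (variance > UINT32_MAX)
		return false;

	*stddev = isqrt32((uint32_t)variance);
	return true;
}
```

A variance of 10000 yields a deviation of 100, while a variance of 100000 * 100000 (a 100 ms deviation, in microseconds) is rejected.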
Signed-off-by: Tuukka Tikkanen <tuukka.tikkanen@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
This patch rearranges an if-return-elsif-goto-fi-return sequence into
an if-return-fi-if-return-fi-goto sequence. The functionality remains the
same. Also, a lengthy comment that did not describe the functionality
in the order it occurs is split in half and the top half is moved closer
to the actual implementation it describes.
Signed-off-by: Tuukka Tikkanen <tuukka.tikkanen@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
This patch prevents the cpuidle menu governor from using the repeating
interval prediction result if the predicted idle period is longer than the
one allowed by the shortest running timer.
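The rule can be written as a one-line decision. The following is a minimal sketch with hypothetical names (choose_prediction, fallback_us), assuming the governor has some other estimate to fall back on; it is not the governor's actual code.

```c
/*
 * expected_us: time until the next timer expires
 * fallback_us: prediction used when the repeating pattern is not usable
 * typical_us:  interval suggested by the repeating pattern detector
 */
static unsigned int choose_prediction(unsigned int expected_us,
				      unsigned int fallback_us,
				      unsigned int typical_us)
{
	/* A pattern that outlasts the next timer cannot be honoured. */
	if (typical_us > expected_us)
		return fallback_us;

	return typical_us;
}
```

With a timer due in 500 us, a detected pattern of 900 us is discarded, while a pattern of 120 us is used as is.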
Signed-off-by: Tuukka Tikkanen <tuukka.tikkanen@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Remove unneeded error handling on the result of a call to
platform_get_resource when the value is passed to
devm_ioremap_resource().
A simplified version of the semantic patch that makes this change is
as follows: (http://coccinelle.lip6.fr/)
// <smpl>
@@
expression pdev,res,n,e,e1;
expression ret != 0;
identifier l;
@@
- res = platform_get_resource(pdev, IORESOURCE_MEM, n);
... when != res
- if (res == NULL) { ... \(goto l;\|return ret;\) }
... when != res
+ res = platform_get_resource(pdev, IORESOURCE_MEM, n);
e = devm_ioremap_resource(e1, res);
// </smpl>
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
git://git.linaro.org/people/dlezcano/linux into pm-cpuidle
Pull ARM cpuidle updates from Daniel Lezcano.
* 'cpuidle/arm-next' of git://git.linaro.org/people/dlezcano/linux:
cpuidle: kirkwood: Make kirkwood_cpuidle_remove function static
cpuidle: calxeda: Add missing __iomem annotation
SH: cpuidle: Add missing parameter for cpuidle_register()
|
|
This local symbol is used only in this file.
Fix the following sparse warnings:
drivers/cpuidle/cpuidle-kirkwood.c:73:5: warning: symbol 'kirkwood_cpuidle_remove' was not declared. Should it be static ?
Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Acked-by: Jason Cooper <jason@lakedaemon.net>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
|
|
Added missing __iomem annotation in order to fix the following
sparse warnings:
drivers/cpuidle/cpuidle-calxeda.c:44:24: warning: incorrect type in argument 1 (different address spaces)
drivers/cpuidle/cpuidle-calxeda.c:44:24: expected void [noderef] <asn:2>*<noident>
drivers/cpuidle/cpuidle-calxeda.c:44:24: got void *extern [addressable] [toplevel] scu_base_addr
drivers/cpuidle/cpuidle-calxeda.c:56:24: warning: incorrect type in argument 1 (different address spaces)
drivers/cpuidle/cpuidle-calxeda.c:56:24: expected void [noderef] <asn:2>*<noident>
drivers/cpuidle/cpuidle-calxeda.c:56:24: got void *extern [addressable] [toplevel] scu_base_addr
Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
|
|
Revert commit 69a37bea (cpuidle: Quickly notice prediction failure for
repeat mode), because it has been identified as the source of a
significant performance regression in v3.8 and later as explained by
Jeremy Eder:
We believe we've identified a particular commit to the cpuidle code
that seems to be impacting performance of a variety of workloads.
The simplest way to reproduce is using netperf TCP_RR test, so
we're using that, on a pair of Sandy Bridge based servers. We also
have data from a large database setup where performance is also
measurably/positively impacted, though that test data isn't easily
share-able.
Included below are test results from 3 test kernels:
kernel          reverts
-----------------------------------------------------------
1) vanilla      upstream (no reverts)
2) perfteam2    reverts e11538d1f03914eb92af5a1a378375c05ae8520c
3) test         reverts 69a37beabf1f0a6705c08e879bdd5d82ff6486c4
                and e11538d1f03914eb92af5a1a378375c05ae8520c
In summary, netperf TCP_RR numbers improve by approximately 4%
after reverting 69a37beabf1f0a6705c08e879bdd5d82ff6486c4. When
69a37beabf1f0a6705c08e879bdd5d82ff6486c4 is included, C0 residency
never seems to get above 40%. Taking that patch out gets C0 near
100% quite often, and performance increases.
The below data are histograms representing the %c0 residency @
1-second sample rates (using turbostat), while under netperf test.
- If you look at the first 4 histograms, you can see %c0 residency
almost entirely in the 30,40% bin.
- The last pair, which reverts 69a37beabf1f0a6705c08e879bdd5d82ff6486c4,
shows %c0 in the 80,90,100% bins.
Below each kernel name are netperf TCP_RR trans/s numbers for the
particular kernel that can be disclosed publicly, comparing the 3
test kernels. We ran a 4th test with the vanilla kernel where
we've also set /dev/cpu_dma_latency=0 to show overall impact
boosting single-threaded TCP_RR performance over 11% above
baseline.
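For reference, the "c0 lock" setting is applied by writing a 32 bit latency value (in microseconds) to /dev/cpu_dma_latency and keeping the file descriptor open; the request is dropped when the descriptor is closed. A minimal userspace sketch follows. The function name is an assumption and the path is passed in as a parameter only to keep the example self-contained.

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Open the PM QoS device node at 'path' and request a maximum wakeup
 * latency. Returns the open descriptor (the request stays active until
 * it is closed) or -1 on failure.
 */
static int hold_latency_request(const char *path, int32_t max_latency_us)
{
	ssize_t written;
	int fd;

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;

	written = write(fd, &max_latency_us, sizeof(max_latency_us));
	if (written != (ssize_t)sizeof(max_latency_us)) {
		close(fd);
		return -1;
	}
	return fd;
}
```

A test harness would call hold_latency_request("/dev/cpu_dma_latency", 0) before starting the workload and close the descriptor afterwards; this normally requires root privileges.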
3.10-rc2 vanilla RX + c0 lock (/dev/cpu_dma_latency=0):
TCP_RR trans/s 54323.78
-----------------------------------------------------------
3.10-rc2 vanilla RX (no reverts)
TCP_RR trans/s 48192.47
Receiver %c0
0.0000 - 10.0000 [ 1]: *
10.0000 - 20.0000 [ 0]:
20.0000 - 30.0000 [ 0]:
30.0000 - 40.0000 [ 59]: ***********************************************************
40.0000 - 50.0000 [ 1]: *
50.0000 - 60.0000 [ 0]:
60.0000 - 70.0000 [ 0]:
70.0000 - 80.0000 [ 0]:
80.0000 - 90.0000 [ 0]:
90.0000 - 100.0000 [ 0]:
Sender %c0
0.0000 - 10.0000 [ 1]: *
10.0000 - 20.0000 [ 0]:
20.0000 - 30.0000 [ 0]:
30.0000 - 40.0000 [ 11]: ***********
40.0000 - 50.0000 [ 49]: *************************************************
50.0000 - 60.0000 [ 0]:
60.0000 - 70.0000 [ 0]:
70.0000 - 80.0000 [ 0]:
80.0000 - 90.0000 [ 0]:
90.0000 - 100.0000 [ 0]:
-----------------------------------------------------------
3.10-rc2 perfteam2 RX (reverts commit
e11538d1f03914eb92af5a1a378375c05ae8520c)
TCP_RR trans/s 49698.69
Receiver %c0
0.0000 - 10.0000 [ 1]: *
10.0000 - 20.0000 [ 1]: *
20.0000 - 30.0000 [ 0]:
30.0000 - 40.0000 [ 59]: ***********************************************************
40.0000 - 50.0000 [ 0]:
50.0000 - 60.0000 [ 0]:
60.0000 - 70.0000 [ 0]:
70.0000 - 80.0000 [ 0]:
80.0000 - 90.0000 [ 0]:
90.0000 - 100.0000 [ 0]:
Sender %c0
0.0000 - 10.0000 [ 1]: *
10.0000 - 20.0000 [ 0]:
20.0000 - 30.0000 [ 0]:
30.0000 - 40.0000 [ 2]: **
40.0000 - 50.0000 [ 58]: **********************************************************
50.0000 - 60.0000 [ 0]:
60.0000 - 70.0000 [ 0]:
70.0000 - 80.0000 [ 0]:
80.0000 - 90.0000 [ 0]:
90.0000 - 100.0000 [ 0]:
-----------------------------------------------------------
3.10-rc2 test RX (reverts 69a37beabf1f0a6705c08e879bdd5d82ff6486c4
and e11538d1f03914eb92af5a1a378375c05ae8520c)
TCP_RR trans/s 47766.95
Receiver %c0
0.0000 - 10.0000 [ 1]: *
10.0000 - 20.0000 [ 1]: *
20.0000 - 30.0000 [ 0]:
30.0000 - 40.0000 [ 27]: ***************************
40.0000 - 50.0000 [ 2]: **
50.0000 - 60.0000 [ 0]:
60.0000 - 70.0000 [ 2]: **
70.0000 - 80.0000 [ 0]:
80.0000 - 90.0000 [ 0]:
90.0000 - 100.0000 [ 28]: ****************************
Sender %c0
0.0000 - 10.0000 [ 1]: *
10.0000 - 20.0000 [ 0]:
20.0000 - 30.0000 [ 0]:
30.0000 - 40.0000 [ 11]: ***********
40.0000 - 50.0000 [ 0]:
50.0000 - 60.0000 [ 1]: *
60.0000 - 70.0000 [ 0]:
70.0000 - 80.0000 [ 3]: ***
80.0000 - 90.0000 [ 7]: *******
90.0000 - 100.0000 [ 38]: **************************************
These results demonstrate gaining back the tendency of the CPU to
stay in more responsive, performant C-states (and thus yield
measurably better performance), by reverting commit
69a37beabf1f0a6705c08e879bdd5d82ff6486c4.
Requested-by: Jeremy Eder <jeder@redhat.com>
Tested-by: Len Brown <len.brown@intel.com>
Cc: 3.8+ <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Revert commit e11538d1 (cpuidle: Quickly notice prediction failure in
general case), since it depends on commit 69a37be (cpuidle: Quickly
notice prediction failure for repeat mode) that has been identified
as the source of a significant performance regression in v3.8 and
later.
Requested-by: Jeremy Eder <jeder@redhat.com>
Tested-by: Len Brown <len.brown@intel.com>
Cc: 3.8+ <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
* cpuidle-arm:
ARM: ux500: cpuidle: Move ux500 cpuidle driver to drivers/cpuidle
ARM: ux500: cpuidle: Remove pointless include
ARM: ux500: cpuidle: Instantiate the driver from platform device
ARM: davinci: cpuidle: Fix target residency
cpuidle: Add Kconfig.arm and move calxeda, kirkwood and zynq
|
|
There is no more dependency with arch/arm headers, so we can safely move the
driver to the drivers/cpuidle directory.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
|
|
Add Kconfig.arm for ARM cpuidle drivers and move calxeda, kirkwood
and zynq to Kconfig.arm. As with the cpufreq menu, a "CPU Idle" menu
is added to drivers/cpuidle/Kconfig.
Signed-off-by: Sahara <keun-o.park@windriver.com>
|
|
Make __cpuidle_register_device() check whether or not the device has
been registered already and return -EBUSY immediately if that's the
case.
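The check amounts to an early return on a per-device flag. The sketch below is a userspace illustration with a simplified, hypothetical structure (struct idle_device) rather than the kernel's struct cpuidle_device.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for a cpuidle device; fields are illustrative. */
struct idle_device {
	unsigned int cpu;
	bool registered;
};

/* Register the device once; a second attempt fails with -EBUSY. */
static int register_idle_device(struct idle_device *dev)
{
	if (dev == NULL)
		return -EINVAL;

	if (dev->registered)
		return -EBUSY;

	dev->registered = true;
	return 0;
}
```

Calling register_idle_device() twice on the same device returns 0 the first time and -EBUSY the second time, leaving the device state untouched.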
[rjw: Changelog]
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Add __cpuidle_device_init() for initializing the cpuidle_device
structure.
[rjw: Changelog]
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
To reduce code duplication related to the unregistration of cpuidle
devices, introduce __cpuidle_unregister_device() and move all of the
unregistration code to that function.
[rjw: Changelog]
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The cpuidle sysfs code is designed to have a single instance of the
per-CPU cpuidle directory. It is not possible to remove the sysfs entry
and create it again. This is not a problem with the current code but
future changes will add CPU hotplug support to enable/disable the
device, so it will need to remove the sysfs entry like other
subsystems do. That won't be possible without this change, because
the kobj is a static object which can't be reused for
kobj_init_and_add().
Add cpuidle_device_kobj to be allocated dynamically when
adding/removing a sysfs entry which is consistent with the other
cpuidle's sysfs entries.
An added benefit is that the sysfs code is now more self-contained
and the includes needed for sysfs can be moved from cpuidle.h
directly into sysfs.c so as to reduce the total number of headers
dragged along with cpuidle.h.
[rjw: Changelog]
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Fix white space in the cpuidle code to follow the rules described in
CodingStyle.
No changes in behavior should result from this.
[rjw: Changelog]
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
We previously changed the ordering of the cpuidle framework
initialization so that the governors are registered before the
drivers which can register their devices right from the start.
Now, we can safely remove the __cpuidle_register_device() call hack
in cpuidle_enable_device() and check if the driver has been
registered before enabling it. Then, cpuidle_register_device() can
consistently check the cpuidle_enable_device() return value when
enabling the device.
[rjw: Changelog]
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
cpuidle governors are defined as modules in the code, but the Kconfig
options do not allow them to be built as modules. This is not really
a problem, but the cpuidle init ordering is: the cpuidle init
functions (framework and driver) and then the governors. That leads
to some weirdness in the cpuidle framework.
Namely, cpuidle_register_device() calls cpuidle_enable_device() which
fails at the first attempt, because governors have not been registered
yet. When a governor is registered, the framework calls
cpuidle_enable_device() again which runs __cpuidle_register_device()
only then. Of course, for that to work, the cpuidle_enable_device()
return value has to be ignored by cpuidle_register_device().
Instead of having this cyclic call graph and relying on the positive
side effects of the hackish back and forth cpuidle_enable_device()
calls, it is better to fix the cpuidle init ordering.
To that end, replace the module init code with postcore_initcall()
so we have:
* cpuidle framework : core_initcall
* cpuidle governors : postcore_initcall
* cpuidle drivers : device_initcall
and remove the corresponding module exit code as it is dead anyway
(governors can't be built as modules).
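The effect of the ordering can be modelled in userspace. The sketch below is only a simulation: the level numbers follow the kernel's initcall levels (core = 1, postcore = 2, device = 6), but the init functions, flags and the run_initcalls() helper are invented for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

enum init_level { LEVEL_CORE = 1, LEVEL_POSTCORE = 2, LEVEL_DEVICE = 6 };

static bool framework_ready;
static bool governor_ready;
static int device_enable_result = 1; /* 1 means "not attempted yet" */

static void framework_init(void) { framework_ready = true; }

/* A governor can only register once the framework exists. */
static void governor_init(void) { governor_ready = framework_ready; }

/* Enabling a device needs both the framework and a governor. */
static void driver_init(void)
{
	device_enable_result = (framework_ready && governor_ready) ? 0 : -1;
}

struct initcall {
	int level;
	void (*fn)(void);
};

/* Run every registered call in ascending level order. */
static void run_initcalls(const struct initcall *calls, size_t n)
{
	int level;
	size_t i;

	for (level = 0; level <= 7; level++)
		for (i = 0; i < n; i++)
			if (calls[i].level == level)
				calls[i].fn();
}
```

Whatever order the three calls are listed in, the device is enabled successfully on the first attempt, so the return value no longer has to be ignored.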
[rjw: Changelog]
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management and ACPI updates from Rafael Wysocki:
"This time the total number of ACPI commits is slightly greater than
the number of cpufreq commits, but Viresh Kumar (who works on cpufreq)
remains the most active patch submitter.
To me, the most significant change is the addition of offline/online
device operations to the driver core (with Greg's blessing) and
the related modifications of the ACPI core hotplug code. Next are the
freezer updates from Colin Cross that should make the freezing of
tasks a bit less heavy weight.
We also have a couple of regression fixes, a number of fixes for
issues that have not been identified as regressions, two new drivers
and a bunch of cleanups all over.
Highlights:
- Hotplug changes to support graceful hot-removal failures.
It sometimes is necessary to fail device hot-removal operations
gracefully if they cannot be carried out completely. For example,
if memory from a memory module being hot-removed has been allocated
for the kernel's own use and cannot be moved elsewhere, it's
desirable to fail the hot-removal operation in a graceful way
rather than to crash the kernel, but currently a success or a kernel
crash are the only possible outcomes of an attempted memory
hot-removal. Needless to say, that is not a very attractive
alternative and it had to be addressed.
However, in order to make it work for memory, I first had to make
it work for CPUs and for this purpose I needed to modify the ACPI
processor driver. It's been split into two parts, a resident one
handling the low-level initialization/cleanup and a modular one
playing the actual driver's role (but it binds to the CPU system
device objects rather than to the ACPI device objects representing
processors). That's been sort of like a live brain surgery on a
patient who's riding a bike.
So this is a little scary, but since we found and fixed a couple of
regressions it caused to happen during the early linux-next testing
(a month ago), nobody has complained.
As a bonus we remove some duplicated ACPI hotplug code, because the
ACPI-based CPU hotplug is now going to use the common ACPI hotplug
code.
- Lighter weight freezing of tasks.
These changes from Colin Cross and Mandeep Singh Baines are
targeted at making the freezing of tasks a somewhat less heavyweight
operation. They reduce the number of tasks woken up every time
during the freezing, by using the observation that the freezer
simply doesn't need to wake up some of them and wait for them all
to call refrigerator(). The time needed for the freezer to decide
to report a failure is reduced too.
Also reintroduced is the check causing a lockdep warning to
trigger when try_to_freeze() is called with locks held (which is
generally unsafe and shouldn't happen).
- cpufreq updates
First off, a commit from Srivatsa S Bhat fixes a resume regression
introduced during the 3.10 cycle causing some cpufreq sysfs
attributes to return wrong values to user space after resume. The
fix is kind of fresh, but also it's pretty obvious once Srivatsa
has identified the root cause.
Second, we have a new freqdomain_cpus sysfs attribute for the
acpi-cpufreq driver to provide information previously available via
related_cpus. From Lan Tianyu.
Finally, we fix a number of issues, mostly related to the
CPUFREQ_POSTCHANGE notifier and cpufreq Kconfig options and clean
up some code. The majority of changes from Viresh Kumar with bits
from Jacob Shin, Heiko Stübner, Xiaoguang Chen, Ezequiel Garcia,
Arnd Bergmann, and Tang Yuantian.
- ACPICA update
A usual bunch of updates from the ACPICA upstream.
During the 3.4 cycle we introduced support for ACPI 5 extended
sleep registers, but they are only supposed to be used if the
HW-reduced mode bit is set in the FADT flags and the code attempted
to use them without checking that bit. That caused suspend/resume
regressions to happen on some systems. Fix from Lv Zheng causes
those registers to be used only if the HW-reduced mode bit is set.
Apart from this some other ACPICA bugs are fixed and code cleanups
are made by Bob Moore, Tomasz Nowicki, Lv Zheng, Chao Guan, and
Zhang Rui.
- cpuidle updates
New driver for Xilinx Zynq processors is added by Michal Simek.
Multidriver support simplification, addition of some missing
kerneldoc comments and Kconfig-related fixes come from Daniel
Lezcano.
- ACPI power management updates
Changes to make suspend/resume work correctly in Xen guests from
Konrad Rzeszutek Wilk, sparse warning fix from Fengguang Wu and
cleanups and fixes of the ACPI device power state selection
routine.
- ACPI documentation updates
Some previously missing pieces of ACPI documentation are added by
Lv Zheng and Aaron Lu (hopefully, that will help people to
understand how the ACPI subsystem works) and one outdated doc is
updated by Hanjun Guo.
- Assorted ACPI updates
We finally nailed down the IA-64 issue that was the reason for
reverting commit 9f29ab11ddbf ("ACPI / scan: do not match drivers
against objects having scan handlers"), so we can fix it and move
the ACPI scan handler check added to the ACPI video driver back to
the core.
A mechanism for adding CMOS RTC address space handlers is
introduced by Lan Tianyu to allow some EC-related breakage to be
fixed on some systems.
A spec-compliant implementation of acpi_os_get_timer() is added by
Mika Westerberg.
The evaluation of _STA is added to do_acpi_find_child() to avoid
situations in which a pointer to a disabled device object is
returned instead of an enabled one with the same _ADR value. From
Jeff Wu.
Intel BayTrail PCH (Platform Controller Hub) support is added to
the ACPI driver for Intel Low-Power Subsystems (LPSS) and that
driver is modified to work around a couple of known BIOS issues.
Changes from Mika Westerberg and Heikki Krogerus.
The EC driver is fixed by Vasiliy Kulikov to use get_user() and
put_user() instead of dereferencing user space pointers blindly.
Code cleanups are made by Bjorn Helgaas, Nicholas Mazzuca and Toshi
Kani.
- Assorted power management updates
The "runtime idle" helper routine is changed to take the return
values of the callbacks executed by it into account and to call
rpm_suspend() if they return 0, which allows us to reduce the
overall code bloat a bit (by dropping some code that's not
necessary any more after that modification).
The runtime PM documentation is updated by Alan Stern (to reflect
the "runtime idle" behavior change).
New trace points for PM QoS are added by Sahara
(<keun-o.park@windriver.com>).
PM QoS documentation is updated by Lan Tianyu.
Code cleanups are made and minor issues are addressed by Bernie
Thompson, Bjorn Helgaas, Julius Werner, and Shuah Khan.
- devfreq updates
New driver for the Exynos5-bus device from Abhilash Kesavan.
Minor cleanups, fixes and MAINTAINERS update from MyungJoo Ham,
Abhilash Kesavan, Paul Bolle, Rajagopal Venkat, and Wei Yongjun.
- OMAP power management updates
Adaptive Voltage Scaling (AVS) SmartReflex voltage control driver
updates from Andrii Tseglytskyi and Nishanth Menon."
* tag 'pm+acpi-3.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (162 commits)
cpufreq: Fix cpufreq regression after suspend/resume
ACPI / PM: Fix possible NULL pointer deref in acpi_pm_device_sleep_state()
PM / Sleep: Warn about system time after resume with pm_trace
cpufreq: don't leave stale policy pointer in cdbs->cur_policy
acpi-cpufreq: Add new sysfs attribute freqdomain_cpus
cpufreq: make sure frequency transitions are serialized
ACPI: implement acpi_os_get_timer() according the spec
ACPI / EC: Add HP Folio 13 to ec_dmi_table in order to skip DSDT scan
ACPI: Add CMOS RTC Operation Region handler support
ACPI / processor: Drop unused variable from processor_perflib.c
cpufreq: tegra: call CPUFREQ_POSTCHANGE notfier in error cases
cpufreq: s3c64xx: call CPUFREQ_POSTCHANGE notfier in error cases
cpufreq: omap: call CPUFREQ_POSTCHANGE notfier in error cases
cpufreq: imx6q: call CPUFREQ_POSTCHANGE notfier in error cases
cpufreq: exynos: call CPUFREQ_POSTCHANGE notfier in error cases
cpufreq: dbx500: call CPUFREQ_POSTCHANGE notfier in error cases
cpufreq: davinci: call CPUFREQ_POSTCHANGE notfier in error cases
cpufreq: arm-big-little: call CPUFREQ_POSTCHANGE notfier in error cases
cpufreq: powernow-k8: call CPUFREQ_POSTCHANGE notfier in error cases
cpufreq: pcc: call CPUFREQ_POSTCHANGE notfier in error cases
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC specific changes from Arnd Bergmann:
"These changes are all to SoC-specific code, a total of 33 branches on
17 platforms were pulled into this. Like last time, Renesas sh-mobile
is now the platform with the most changes, followed by OMAP and
EXYNOS.
Two new platforms, TI Keystone and Rockchip RK3xxx, are added in this
branch, both containing almost no platform specific code at all, since
they are using generic subsystem interfaces for clocks, pinctrl,
interrupts etc. The device drivers are getting merged through the
respective subsystem maintainer trees.
One more SoC (u300) is now multiplatform capable and several others
(shmobile, exynos, msm, integrator, kirkwood, clps711x) are moving
towards that goal with this series but need more work.
Also noteworthy is the work on PCI here, which is traditionally part
of the SoC specific code. With the changes done by Thomas Petazzoni,
we can now more easily have PCI host controller drivers as loadable
modules and keep them separate from the platform code in
drivers/pci/host. This has already led to the discovery that three
platforms (exynos, spear and imx) are actually using an identical PCIe
host controller and will be able to share a driver once support for
spear and imx is added."
* tag 'soc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (480 commits)
ARM: integrator: let pciv3 use mem/premem from device tree
ARM: integrator: set local side PCI addresses right
ARM: dts: Add pcie controller node for exynos5440-ssdk5440
ARM: dts: Add pcie controller node for Samsung EXYNOS5440 SoC
ARM: EXYNOS: Enable PCIe support for Exynos5440
pci: Add PCIe driver for Samsung Exynos
ARM: OMAP5: voltagedomain data: remove temporary OMAP4 voltage data
ARM: keystone: Move CPU bringup code to dedicated asm file
ARM: multiplatform: always pick one CPU type
ARM: imx: select syscon for IMX6SL
ARM: keystone: select ARM_ERRATA_798181 only for SMP
ARM: imx: Synertronixx scb9328 needs to select SOC_IMX1
ARM: OMAP2+: AM43x: resolve SMP related build error
dmaengine: edma: enable build for AM33XX
ARM: edma: Add EDMA crossbar event mux support
ARM: edma: Add DT and runtime PM support to the private EDMA API
dmaengine: edma: Add TI EDMA device tree binding
arm: add basic support for Rockchip RK3066a boards
arm: add debug uarts for rockchip rk29xx and rk3xxx series
arm: Add basic clocks for Rockchip rk3066a SoCs
...
|
|
Like other ARM specific drivers, this one requires ARM_CPU_SUSPEND,
as shown by this linker error:
drivers/built-in.o: In function `calxeda_pwrdown_idle':
drivers/cpuidle/cpuidle-calxeda.c:84: undefined reference to `cpu_suspend'
drivers/cpuidle/cpuidle-calxeda.c:86: undefined reference to `cpu_resume'
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: linux-pm@vger.kernel.org
|
|
Before commit d6f346f (cpuidle: improve governor Kconfig options),
the CONFIG_ARCH_NEEDS_CPU_IDLE_COUPLED option didn't depend on
CONFIG_CPU_IDLE but now it has been moved under the CPU_IDLE
menuconfig.
That raises the following warnings:
warning: (ARCH_OMAP4 && ARCH_TEGRA_2x_SOC) selects ARCH_NEEDS_CPU_IDLE_COUPLED
which has unmet direct dependencies (CPU_IDLE)
warning: (ARCH_OMAP4 && ARCH_TEGRA_2x_SOC) selects ARCH_NEEDS_CPU_IDLE_COUPLED
which has unmet direct dependencies (CPU_IDLE)
because the tegra2 and omap4 Kconfig files select this option
without checking if CPU_IDLE is set.
Fix that by moving ARCH_NEEDS_CPU_IDLE_COUPLED outside of CPU_IDLE.
[rjw: Changelog]
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Add kerneldoc (and other) comments to the cpuidle driver's framework
code.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Commit bf4d1b5 (cpuidle: support multiple drivers) introduced support
for using multiple cpuidle drivers at the same time. It added a
couple of new APIs to register the driver per CPU, but that led to
some unnecessary code complexity related to the kernel config options
deciding whether or not the multiple driver support is enabled. The
code has to work as it did before when the multiple driver support is
not enabled and the multiple driver support has to be compatible with
the previously existing API.
Remove the new API, not used by any driver in the tree yet (but
needed for the HMP cpuidle drivers that will be submitted soon), and
add a new cpumask pointer to the cpuidle driver structure that will
point to the mask of CPUs handled by the given driver. That will
allow the cpuidle_[un]register_driver() API to be used for the
multiple driver support along with the cpuidle_[un]register()
functions added recently.
[rjw: Changelog]
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Add cpuidle support for Xilinx Zynq.
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Each governor suits a different kernel configuration: the menu governor is
better suited to a tickless system, while the ladder governor is better suited
to a system with a periodic timer tick.
The Kconfig does not allow a governor to be [un]selected, so both are compiled
into the kernel, and the init order makes the menu governor the last one to be
registered, so it becomes the default. The only way to switch back to the ladder
governor is to enable the sysfs governor switch on the kernel command line.
Nobody seems to have complained about this, so the menu governor is what most
systems use by default. Having both governors is not really necessary on a
tickless system, but there is no config option to disable one or the other
governor.
Create a submenu for cpuidle and add a label for each governor, so we can see
the option in the menu config and enable/disable it.
The governors will be enabled depending on the CONFIG_NO_HZ option:
- If CONFIG_NO_HZ is set, then the menu governor is selected and the ladder
governor is optional, defaulting to 'yes'
- If CONFIG_NO_HZ is not set, then the ladder governor is selected and the
menu governor is optional, defaulting to 'yes'
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Move the private set_auxcr/get_auxcr functions from
drivers/cpuidle/cpuidle-calxeda.c so they can be used across platforms.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Cc: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Acked-by: Tony Lindgren <tony@atomide.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
|
|
Currently cpuidle drivers are spread across different archs.
As a result, there are several different paths for cpuidle patch
submissions: cpuidle core changes go through linux-pm, ARM driver
changes go to the arm-soc or SoC-specific trees, sh changes go
through the sh arch tree, pseries changes go through the PowerPC tree
and finally Intel changes go through Len's tree while ACPI idle
changes go through linux-pm.
That makes it difficult to consolidate code and to propagate
modifications from the cpuidle core to the different drivers.
Fortunately, a movement has started to put the majority of cpuidle
drivers under drivers/cpuidle like cpuidle-calxeda.c and
cpuidle-kirkwood.c.
Add a maintainer entry for cpuidle to MAINTAINERS to clarify the
situation and to indicate to new cpuidle driver authors that those
drivers should not go into arch-specific directories.
The upstreaming process is unchanged: Rafael takes patches for
merging into his tree, but with an Acked-by: tag from the driver's
maintainer, so indicate in the drivers' headers who maintains them.
The arrangement will be the same as for cpufreq.
[rjw: Changelog]
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Andrew Lunn <andrew@lunn.ch> #for kirkwood
Acked-by: Jason Cooper <jason@lakedaemon.net> #for kirkwood
Acked-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Fix comment format for the kernel doc script.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Remove the duplicated code and use the cpuidle common code for initialization.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Tested-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Remove the duplicated code and use the cpuidle common code for initialization.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The usual scheme to initialize a cpuidle driver on an SMP system is:
	cpuidle_register_driver(drv);
	for_each_possible_cpu(cpu) {
		device = &per_cpu(cpuidle_dev, cpu);
		cpuidle_register_device(device);
	}
This code is duplicated in each cpuidle driver.
On UP systems, it is done this way:
	cpuidle_register_driver(drv);
	device = &per_cpu(cpuidle_dev, cpu);
	cpuidle_register_device(device);
On UP, the macro 'for_each_cpu' does one iteration:
	#define for_each_cpu(cpu, mask) \
		for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)mask)
Hence, the initialization loop is the same for UP as for SMP.
Besides, we saw various bugs, mis-initializations and unchecked return codes in
the different drivers: the code is duplicated, bugs included. After fixing all
of these, it appears the initialization pattern is the same for everyone.
Please note, some drivers are doing dev->state_count = drv->state_count. This is
not necessary because it is done by the cpuidle_enable_device function in the
cpuidle framework. This holds as long as you have the same states for all your
devices. Otherwise, the 'low level' API should be used instead with the specific
initialization for the driver.
Let's add a wrapper function doing this initialization with a cpumask parameter
for the coupled idle states and use it for all the drivers.
That will save a lot of LOC and consolidate the code, and future modifications
can be done in a single place. Another benefit is the consolidation of
the cpuidle_device variable, which is now in the cpuidle framework and no longer
spread across the different arch specific drivers.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
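The wrapper described above can be pictured with a small user-space model. This is only a sketch under stated assumptions: every reg_* name is hypothetical and is not the kernel's API, the CPU count is fixed at four, the cpumask parameter for coupled states is left out, and reg_fail_cpu exists purely to exercise the error path. It shows the shape of the consolidated pattern: register the driver once, register one device per CPU, and roll everything back if any step fails.

```c
#include <assert.h>
#include <stddef.h>

#define REG_NR_CPUS 4	/* assumption: fixed CPU count for the model */

struct reg_driver {
	int registered;
};

struct reg_device {
	int cpu;
	int registered;
};

/* One device per CPU, owned by the "framework" rather than by each driver. */
struct reg_device reg_devices[REG_NR_CPUS];

/* Hypothetical test hook: CPU whose device registration fails, -1 for none. */
int reg_fail_cpu = -1;

static int reg_register_driver(struct reg_driver *drv)
{
	if (drv->registered)
		return -1;
	drv->registered = 1;
	return 0;
}

static void reg_unregister_driver(struct reg_driver *drv)
{
	drv->registered = 0;
}

static int reg_register_device(struct reg_device *dev)
{
	if (dev->cpu == reg_fail_cpu)
		return -1;
	dev->registered = 1;
	return 0;
}

static void reg_unregister_device(struct reg_device *dev)
{
	dev->registered = 0;
}

/* Undo a full registration: all devices first, then the driver. */
void reg_unregister(struct reg_driver *drv)
{
	assert(drv != NULL);
	for (int cpu = 0; cpu < REG_NR_CPUS; cpu++)
		reg_unregister_device(&reg_devices[cpu]);
	reg_unregister_driver(drv);
}

/* Register the driver, then one device per CPU; roll back on any failure. */
int reg_register(struct reg_driver *drv)
{
	int cpu;
	int ret;

	assert(drv != NULL);
	ret = reg_register_driver(drv);
	if (ret)
		return ret;

	for (cpu = 0; cpu < REG_NR_CPUS; cpu++) {
		struct reg_device *dev = &reg_devices[cpu];

		dev->cpu = cpu;
		ret = reg_register_device(dev);
		if (ret)
			goto unwind;
	}
	return 0;

unwind:
	/* Only the CPUs before the failing one were registered. */
	while (--cpu >= 0)
		reg_unregister_device(&reg_devices[cpu]);
	reg_unregister_driver(drv);
	return ret;
}

/* Number of devices currently registered. */
int reg_registered_count(void)
{
	int count = 0;

	for (int cpu = 0; cpu < REG_NR_CPUS; cpu++)
		count += reg_devices[cpu].registered;
	return count;
}
```

A driver in this model would call reg_register() once from its init path and reg_unregister() on removal, instead of open-coding the loop and its error handling.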
|
|
The en_core_tk_irqen flag is set in all the cpuidle drivers, which
means it is not necessary to specify this flag.
Remove the flag and the code related to it.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Kevin Hilman <khilman@linaro.org> # for mach-omap2/*
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Commit 89878baa73f0f1c679355006bd8632e5d78f96c2 introduced
the CPUIDLE_FLAG_TIMER_STOP flag, which specifies that a given idle
state stops the local timer.
Now use this flag to check at init time whether any state will need
the broadcast timer and, if so, set up the broadcast timer
framework. That avoids duplicating this code in the drivers.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
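The init-time check described above can be modelled in user space as follows. This is a sketch under stated assumptions, not the kernel code: all bc_* names, the flag value and the setup counter are hypothetical. It shows the idea of scanning a driver's state table once and setting up the broadcast timer only when at least one state carries the timer-stop flag.

```c
#include <assert.h>
#include <stddef.h>

#define BC_FLAG_TIMER_STOP (1u << 0)	/* hypothetical stand-in for the flag */

struct bc_state {
	const char *name;
	unsigned int flags;
};

struct bc_driver {
	const struct bc_state *states;
	size_t state_count;
	int broadcast_ready;
};

/* Counts how often the simulated broadcast timer was set up. */
int bc_setup_calls;

static void bc_setup_broadcast_timer(void)
{
	bc_setup_calls++;
}

/* Return 1 if any state of the driver stops the local timer, else 0. */
int bc_needs_broadcast(const struct bc_driver *drv)
{
	assert(drv != NULL);
	for (size_t i = 0; i < drv->state_count; i++)
		if (drv->states[i].flags & BC_FLAG_TIMER_STOP)
			return 1;
	return 0;
}

/* Init path: set up the broadcast timer only when a state needs it. */
void bc_driver_init(struct bc_driver *drv)
{
	drv->broadcast_ready = 0;
	if (bc_needs_broadcast(drv)) {
		bc_setup_broadcast_timer();
		drv->broadcast_ready = 1;
	}
}
```

With the check in the common init path, a driver in this model only has to tag its deep states; it carries no setup code of its own.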
|
|
Convert all uses of devm_request_and_ioremap() to the newly introduced
devm_ioremap_resource() which provides more consistent error handling.
devm_ioremap_resource() provides its own error messages so all explicit
error messages can be removed from the failure code paths.
Signed-off-by: Silviu-Mihai Popescu <silviupopescu1990@gmail.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
When the CPU_IDLE and the ARCH_KIRKWOOD options are set it is
pointless to define a new option CPU_IDLE_KIRKWOOD because it
is redundant.
The Makefile in the drivers directory contains a condition to compile
the cpuidle drivers:
obj-$(CONFIG_CPU_IDLE) += cpuidle/
Hence, if CPU_IDLE is not set we won't enter this directory.
This patch removes the useless Kconfig option and replaces the
condition in the Makefile by CONFIG_ARCH_KIRKWOOD.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Jason Cooper <jason@lakedaemon.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
When a cpu enters a deep idle state, the local timers are stopped and
the time framework falls back to the timer device used as a broadcast
timer.
The different cpuidle drivers are calling clockevents_notify ENTER/EXIT
when the idle state stops the local timer.
Add a new flag CPUIDLE_FLAG_TIMER_STOP which can be set by the cpuidle
drivers. If the flag is set, the cpuidle core code takes care of the
notification on behalf of the driver to avoid pointless code duplication.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
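The effect of the new flag on the idle entry path can be illustrated with a user-space model. This is a sketch under stated assumptions: the ts_* names, the flag value and the event trace are hypothetical, and recording an event stands in for the real ENTER/EXIT notification. It shows the core wrapping the driver's enter callback with the two notifications only when the state is flagged, so the driver itself contains no notification code.

```c
#include <assert.h>
#include <stddef.h>

#define TS_FLAG_TIMER_STOP (1u << 0)	/* hypothetical stand-in for the flag */
#define TS_TRACE_MAX 16

enum ts_event { TS_BROADCAST_ENTER, TS_IDLE, TS_BROADCAST_EXIT };

struct ts_state {
	unsigned int flags;
};

/* Ordered record of what happened, used here in place of real notifications. */
enum ts_event ts_trace[TS_TRACE_MAX];
size_t ts_trace_len;

static void ts_record(enum ts_event ev)
{
	assert(ts_trace_len < TS_TRACE_MAX);
	ts_trace[ts_trace_len++] = ev;
}

/* Driver callback: only idles, knows nothing about the broadcast timer. */
static int ts_driver_enter(const struct ts_state *state)
{
	(void)state;
	ts_record(TS_IDLE);
	return 0;
}

/* Core entry path: notifies on behalf of the driver when the flag is set. */
int ts_core_enter(const struct ts_state *state)
{
	int timer_stops = (state->flags & TS_FLAG_TIMER_STOP) != 0;
	int ret;

	if (timer_stops)
		ts_record(TS_BROADCAST_ENTER);
	ret = ts_driver_enter(state);
	if (timer_stops)
		ts_record(TS_BROADCAST_EXIT);
	return ret;
}
```

Entering an unflagged state in this model records only the idle event, while a flagged state records the enter notification, the idle event and the exit notification, in that order.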
|