diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2014-04-01 12:48:54 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2014-04-01 12:48:54 -0700 |
commit | 4dedde7c7a18f55180574f934dbc1be84ca0400b (patch) | |
tree | d7cc511e8ba8ffceadf3f45b9a63395c4e4183c5 /drivers/cpuidle | |
parent | 683b6c6f82a60fabf47012581c2cfbf1b037ab95 (diff) | |
parent | 0ecfe310f4517d7505599be738158087c165be7c (diff) |
Merge tag 'pm+acpi-3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI and power management updates from Rafael Wysocki:
"The majority of this material spent some time in linux-next, some of
it even several weeks. There are a few relatively fresh commits in
it, but they are mostly fixes and simple cleanups.
ACPI took the lead this time, both in terms of the number of commits
and the number of modified lines of code, cpufreq follows and there
are a few changes in the PM core and in cpuidle too.
A new feature that already got some LWN.net's attention is the device
PM QoS extension allowing latency tolerance requirements to be
propagated from leaf devices to their ancestors with hardware
interfaces for specifying latency tolerance. That should help systems
with hardware-driven power management to avoid going too far with it
in cases when there are latency tolerance constraints.
There also are some significant changes in the ACPI core related to
the way in which hotplug notifications are handled. They affect PCI
hotplug (ACPIPHP) and the ACPI dock station code too. The bottom line
is that all those notification now go through the root notify handler
and are propagated to the interested subsystems by means of callbacks
instead of having to install a notify handler for each device object
that we can potentially get hotplug notifications for.
In addition to that ACPICA will now advertise "Windows 2013"
compatibility for _OSI, because some systems out there don't work
correctly if that is not done (some of them don't even boot).
On the system suspend side of things, all of the device suspend and
resume callbacks, except for ->prepare() and ->complete(), are now
going to be executed asynchronously as that turns out to speed up
system suspend and resume on some platforms quite significantly and we
have a few more optimizations in that area.
Apart from that, there are some new device IDs and fixes and cleanups
all over. In particular, the system suspend and resume handling by
cpufreq should be improved and the cpuidle menu governor should be a
bit more robust now.
Specifics:
- Device PM QoS support for latency tolerance constraints on systems
with hardware interfaces allowing such constraints to be specified.
That is necessary to prevent hardware-driven power management from
becoming overly aggressive on some systems and to prevent power
management features leading to excessive latencies from being used
in some cases.
- Consolidation of the handling of ACPI hotplug notifications for
device objects. This causes all device hotplug notifications to go
through the root notify handler (that was executed for all of them
anyway before) that propagates them to individual subsystems, if
necessary, by executing callbacks provided by those subsystems
(those callbacks are associated with struct acpi_device objects
during device enumeration). As a result, the code in question
becomes both smaller in size and more straightforward and all of
those changes should not affect users.
- ACPICA update, including fixes related to the handling of _PRT in
cases when it is broken and the addition of "Windows 2013" to the
list of supported "features" for _OSI (which is necessary to
support systems that work incorrectly or don't even boot without
it). Changes from Bob Moore and Lv Zheng.
- Consolidation of ACPI _OST handling from Jiang Liu.
- ACPI battery and AC fixes allowing unusual system configurations to
be handled by that code from Alexander Mezin.
- New device IDs for the ACPI LPSS driver from Chiau Ee Chew.
- ACPI fan and thermal optimizations related to system suspend and
resume from Aaron Lu.
- Cleanups related to ACPI video from Jean Delvare.
- Assorted ACPI fixes and cleanups from Al Stone, Hanjun Guo, Lan
Tianyu, Paul Bolle, Tomasz Nowicki.
- Intel RAPL (Running Average Power Limits) driver cleanups from
Jacob Pan.
- intel_pstate fixes and cleanups from Dirk Brandewie.
- cpufreq fixes related to system suspend/resume handling from Viresh
Kumar.
- cpufreq core fixes and cleanups from Viresh Kumar, Stratos
Karafotis, Saravana Kannan, Rashika Kheria, Joe Perches.
- cpufreq drivers updates from Viresh Kumar, Zhuoyu Zhang, Rob
Herring.
- cpuidle fixes related to the menu governor from Tuukka Tikkanen.
- cpuidle fix related to coupled CPUs handling from Paul Burton.
- Asynchronous execution of all device suspend and resume callbacks,
except for ->prepare and ->complete, during system suspend and
resume from Chuansheng Liu.
- Delayed resuming of runtime-suspended devices during system suspend
for the PCI bus type and ACPI PM domain.
- New set of PM helper routines to allow device runtime PM callbacks
to be used during system suspend and resume more easily from Ulf
Hansson.
- Assorted fixes and cleanups in the PM core from Geert Uytterhoeven,
Prabhakar Lad, Philipp Zabel, Rashika Kheria, Sebastian Capella.
- devfreq fix from Saravana Kannan"
* tag 'pm+acpi-3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (162 commits)
PM / devfreq: Rewrite devfreq_update_status() to fix multiple bugs
PM / sleep: Correct whitespace errors in <linux/pm.h>
intel_pstate: Set core to min P state during core offline
cpufreq: Add stop CPU callback to cpufreq_driver interface
cpufreq: Remove unnecessary braces
cpufreq: Fix checkpatch errors and warnings
cpufreq: powerpc: add cpufreq transition latency for FSL e500mc SoCs
MAINTAINERS: Reorder maintainer addresses for PM and ACPI
PM / Runtime: Update runtime_idle() documentation for return value meaning
video / output: Drop display output class support
fujitsu-laptop: Drop unneeded include
acer-wmi: Stop selecting VIDEO_OUTPUT_CONTROL
ACPI / gpu / drm: Stop selecting VIDEO_OUTPUT_CONTROL
ACPI / video: fix ACPI_VIDEO dependencies
cpufreq: remove unused notifier: CPUFREQ_{SUSPENDCHANGE|RESUMECHANGE}
cpufreq: Do not allow ->setpolicy drivers to provide ->target
cpufreq: arm_big_little: set 'physical_cluster' for each CPU
cpufreq: arm_big_little: make vexpress driver depend on bL core driver
ACPI / button: Add ACPI Button event via netlink routine
ACPI: Remove duplicate definitions of PREFIX
...
Diffstat (limited to 'drivers/cpuidle')
-rw-r--r-- | drivers/cpuidle/cpuidle.c | 3 | ||||
-rw-r--r-- | drivers/cpuidle/driver.c | 2 | ||||
-rw-r--r-- | drivers/cpuidle/governors/menu.c | 75 |
3 files changed, 47 insertions, 33 deletions
diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c index 09d05ab262b..cb20fd915be 100644 --- a/drivers/cpuidle/cpuidle.c +++ b/drivers/cpuidle/cpuidle.c @@ -85,7 +85,8 @@ int cpuidle_enter_state(struct cpuidle_device *dev, struct cpuidle_driver *drv, time_end = ktime_get(); - local_irq_enable(); + if (!cpuidle_state_is_coupled(dev, drv, entered_state)) + local_irq_enable(); diff = ktime_to_us(ktime_sub(time_end, time_start)); if (diff > INT_MAX) diff --git a/drivers/cpuidle/driver.c b/drivers/cpuidle/driver.c index 06dbe7c8619..136d6a283e0 100644 --- a/drivers/cpuidle/driver.c +++ b/drivers/cpuidle/driver.c @@ -209,7 +209,7 @@ static void poll_idle_init(struct cpuidle_driver *drv) state->exit_latency = 0; state->target_residency = 0; state->power_usage = -1; - state->flags = 0; + state->flags = CPUIDLE_FLAG_TIME_VALID; state->enter = poll_idle; state->disabled = false; } diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c index cf7f2f0e4ef..71b52329335 100644 --- a/drivers/cpuidle/governors/menu.c +++ b/drivers/cpuidle/governors/menu.c @@ -122,9 +122,8 @@ struct menu_device { int last_state_idx; int needs_update; - unsigned int expected_us; + unsigned int next_timer_us; unsigned int predicted_us; - unsigned int exit_us; unsigned int bucket; unsigned int correction_factor[BUCKETS]; unsigned int intervals[INTERVALS]; @@ -257,7 +256,7 @@ again: stddev = int_sqrt(stddev); if (((avg > stddev * 6) && (divisor * 4 >= INTERVALS * 3)) || stddev <= 20) { - if (data->expected_us > avg) + if (data->next_timer_us > avg) data->predicted_us = avg; return; } @@ -289,7 +288,7 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev) struct menu_device *data = &__get_cpu_var(menu_devices); int latency_req = pm_qos_request(PM_QOS_CPU_DMA_LATENCY); int i; - int multiplier; + unsigned int interactivity_req; struct timespec t; if (data->needs_update) { @@ -298,7 +297,6 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev) } data->last_state_idx = 0; - data->exit_us = 0; /* Special case when user has set very strict latency requirement */ if (unlikely(latency_req == 0)) @@ -306,13 +304,11 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev) /* determine the expected residency time, round up */ t = ktime_to_timespec(tick_nohz_get_sleep_length()); - data->expected_us = + data->next_timer_us = t.tv_sec * USEC_PER_SEC + t.tv_nsec / NSEC_PER_USEC; - data->bucket = which_bucket(data->expected_us); - - multiplier = performance_multiplier(); + data->bucket = which_bucket(data->next_timer_us); /* * if the correction factor is 0 (eg first time init or cpu hotplug @@ -326,17 +322,26 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev) * operands are 32 bits. * Make sure to round up for half microseconds. */ - data->predicted_us = div_round64((uint64_t)data->expected_us * + data->predicted_us = div_round64((uint64_t)data->next_timer_us * data->correction_factor[data->bucket], RESOLUTION * DECAY); get_typical_interval(data); /* + * Performance multiplier defines a minimum predicted idle + * duration / latency ratio. Adjust the latency limit if + * necessary. + */ + interactivity_req = data->predicted_us / performance_multiplier(); + if (latency_req > interactivity_req) + latency_req = interactivity_req; + + /* * We want to default to C1 (hlt), not to busy polling * unless the timer is happening really really soon. */ - if (data->expected_us > 5 && + if (data->next_timer_us > 5 && !drv->states[CPUIDLE_DRIVER_STATE_START].disabled && dev->states_usage[CPUIDLE_DRIVER_STATE_START].disable == 0) data->last_state_idx = CPUIDLE_DRIVER_STATE_START; @@ -355,11 +360,8 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev) continue; if (s->exit_latency > latency_req) continue; - if (s->exit_latency * multiplier > data->predicted_us) - continue; data->last_state_idx = i; - data->exit_us = s->exit_latency; } return data->last_state_idx; @@ -390,36 +392,47 @@ static void menu_update(struct cpuidle_driver *drv, struct cpuidle_device *dev) { struct menu_device *data = &__get_cpu_var(menu_devices); int last_idx = data->last_state_idx; - unsigned int last_idle_us = cpuidle_get_last_residency(dev); struct cpuidle_state *target = &drv->states[last_idx]; unsigned int measured_us; unsigned int new_factor; /* - * Ugh, this idle state doesn't support residency measurements, so we - * are basically lost in the dark. As a compromise, assume we slept - * for the whole expected time. + * Try to figure out how much time passed between entry to low + * power state and occurrence of the wakeup event. + * + * If the entered idle state didn't support residency measurements, + * we are basically lost in the dark how much time passed. + * As a compromise, assume we slept for the whole expected time. + * + * Any measured amount of time will include the exit latency. + * Since we are interested in when the wakeup begun, not when it + * was completed, we must substract the exit latency. However, if + * the measured amount of time is less than the exit latency, + * assume the state was never reached and the exit latency is 0. */ - if (unlikely(!(target->flags & CPUIDLE_FLAG_TIME_VALID))) - last_idle_us = data->expected_us; + if (unlikely(!(target->flags & CPUIDLE_FLAG_TIME_VALID))) { + /* Use timer value as is */ + measured_us = data->next_timer_us; + } else { + /* Use measured value */ + measured_us = cpuidle_get_last_residency(dev); - measured_us = last_idle_us; - - /* - * We correct for the exit latency; we are assuming here that the - * exit latency happens after the event that we're interested in. - */ - if (measured_us > data->exit_us) - measured_us -= data->exit_us; + /* Deduct exit latency */ + if (measured_us > target->exit_latency) + measured_us -= target->exit_latency; + /* Make sure our coefficients do not exceed unity */ + if (measured_us > data->next_timer_us) + measured_us = data->next_timer_us; + } /* Update our correction ratio */ new_factor = data->correction_factor[data->bucket]; new_factor -= new_factor / DECAY; - if (data->expected_us > 0 && measured_us < MAX_INTERESTING) - new_factor += RESOLUTION * measured_us / data->expected_us; + if (data->next_timer_us > 0 && measured_us < MAX_INTERESTING) + new_factor += RESOLUTION * measured_us / data->next_timer_us; else /* * we were idle so long that we count it as a perfect @@ -439,7 +452,7 @@ static void menu_update(struct cpuidle_driver *drv, struct cpuidle_device *dev) data->correction_factor[data->bucket] = new_factor; /* update the repeating-pattern data */ - data->intervals[data->interval_ptr++] = last_idle_us; + data->intervals[data->interval_ptr++] = measured_us; if (data->interval_ptr >= INTERVALS) data->interval_ptr = 0; } |