Age | Commit message (Collapse) | Author |
|
[akpm@linux-foundation.org: don't overwrite kstrtoull()'s errno]
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This patch also fixes one function declaration over 80 characters.
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Fix checkpatch warnings about EXPORT_SYMBOL and return()
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
- EXPORT_SYMBOL
- typo: unexpectidly->unexpectedly
- function prototype over 80 characters
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
no level printk converted to pr_warn (if err)
no level printk converted to pr_info (disabling non-boot cpus)
Other printk converted to respective level.
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
sys_sgetmask and sys_ssetmask are obsolete system calls no longer
supported in libc.
This patch replaces architecture related __ARCH_WANT_SYS_SGETMAX by expert
mode configuration.That option is enabled by default for those
architectures.
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Steven Miao <realmz6@gmail.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Greg Ungerer <gerg@uclinux.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
If cpusets are not in use then we still check a global variable on every
page allocation. Use jump labels to avoid the overhead.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Jan Kara <jack@suse.cz>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
for_each_process_thread() is sub-optimal. All threads share the same
->mm, we can swicth to the next process once we found a thread with
->mm != NULL and ->mm != mm.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Peter Chiang <pchiang@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
"Search through everything else" in mm_update_next_owner() can hit a
kthread which adopted this "mm" via use_mm(), it should not be used as
mm->owner. Add the PF_KTHREAD check.
While at it, change this code to use for_each_process_thread() instead
of deprecated do_each_thread/while_each_thread.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Peter Chiang <pchiang@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
CONFIG_MM_OWNER makes no sense. It is not user-selectable, it is only
selected by CONFIG_MEMCG automatically. So we can kill this option in
init/Kconfig and do s/CONFIG_MM_OWNER/CONFIG_MEMCG/ globally.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Currently to allocate a page that should be charged to kmemcg (e.g.
threadinfo), we pass __GFP_KMEMCG flag to the page allocator. The page
allocated is then to be freed by free_memcg_kmem_pages. Apart from
looking asymmetrical, this also requires intrusion to the general
allocation path. So let's introduce separate functions that will
alloc/free pages charged to kmemcg.
The new functions are called alloc_kmem_pages and free_kmem_pages. They
should be used when the caller actually would like to use kmalloc, but
has to fall back to the page allocator for the allocation is large.
They only differ from alloc_pages and free_pages in that besides
allocating or freeing pages they also charge them to the kmem resource
counter of the current memory cgroup.
[sfr@canb.auug.org.au: export kmalloc_order() to modules]
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Acked-by: Greg Thelen <gthelen@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Glauber Costa <glommer@gmail.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Commit 786235eeba0e ("kthread: make kthread_create() killable") meant
for allowing kthread_create() to abort as soon as killed by the
OOM-killer. But returning -ENOMEM is wrong if killed by SIGKILL from
userspace. Change kthread_create() to return -EINTR upon SIGKILL.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Oleg Nesterov <oleg@redhat.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: <stable@vger.kernel.org> [3.13+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next
Pull core irq updates from Thomas Gleixner:
"The irq department delivers:
- Another tree wide update to get rid of the horrible create_irq
interface along with its even more horrible variants. That also
gets rid of the last leftovers of the initial sparse irq hackery.
arch/driver specific changes have been either acked or ignored.
- A fix for the spurious interrupt detection logic with threaded
interrupts.
- A new ARM SoC interrupt controller
- The usual pile of fixes and improvements all over the place"
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits)
Documentation: brcmstb-l2: Add Broadcom STB Level-2 interrupt controller binding
irqchip: brcmstb-l2: Add Broadcom Set Top Box Level-2 interrupt controller
genirq: Improve documentation to match current implementation
ARM: iop13xx: fix msi support with sparse IRQ
genirq: Provide !SMP stub for irq_set_affinity_notifier()
irqchip: armada-370-xp: Move the devicetree binding documentation
irqchip: gic: Use mask field in GICC_IAR
genirq: Remove dynamic_irq mess
ia64: Use irq_init_desc
genirq: Replace dynamic_irq_init/cleanup
genirq: Remove irq_reserve_irq[s]
genirq: Replace reserve_irqs in core code
s390: Avoid call to irq_reserve_irqs()
s390: Remove pointless arch_show_interrupts()
s390: pci: Check return value of alloc_irq_desc() proper
sh: intc: Remove pointless irq_reserve_irqs() invocation
x86, irq: Remove pointless irq_reserve_irqs() call
genirq: Make create/destroy_irq() ia64 private
tile: Use SPARSE_IRQ
tile: pci: Use irq_alloc/free_hwirq()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next
Pull timer core updates from Thomas Gleixner:
"This time you get nothing really exciting:
- A huge update to the sh* clocksource drivers
- Support for two more ARM SoCs
- Removal of the deprecated setup_sched_clock() API
- The usual pile of fixlets all over the place"
* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
clocksource: Add Freescale FlexTimer Module (FTM) timer support
ARM: dts: vf610: Add Freescale FlexTimer Module timer node.
clocksource: ftm: Add FlexTimer Module (FTM) Timer devicetree Documentation
clocksource: sh_tmu: Remove unnecessary OOM messages
clocksource: sh_mtu2: Remove unnecessary OOM messages
clocksource: sh_cmt: Remove unnecessary OOM messages
clocksource: em_sti: Remove unnecessary OOM messages
clocksource: dw_apb_timer_of: Do not trace read_sched_clock
clocksource: Fix clocksource_mmio_readX_down
clocksource: Fix type confusion for clocksource_mmio_readX_Y
clocksource: sh_tmu: Fix channel IRQ retrieval in legacy case
clocksource: qcom: Implement read_current_timer for udelay
ntp: Make is_error_status() use its argument
ntp: Convert simple_strtol to kstrtol
timer_stats/doc: Fix /proc/timer_stats documentation
sched_clock: Remove deprecated setup_sched_clock() API
ARM: sun6i: a31: Add support for the High Speed Timers
clocksource: sun5i: Add support for reset controller
clocksource: efm32: use $vendor,$device scheme for compatible string
KConfig: Vexpress: build the ARM_GLOBAL_TIMER with vexpress platform
...
|
|
Somehow this unused variable warning sneaked past my warnings check
(probably due to it depending on a new config).
kernel/trace/trace_benchmark.c: In function 'trace_do_benchmark':
kernel/trace/trace_benchmark.c:38:6: warning: unused variable 'seedsq' [-Wunused-variable]
u64 seedsq;
^
Link: http://lkml.kernel.org/r/20140604160921.4f4e69c4@canb.auug.org.au
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm into next
Pull ACPI and power management updates from Rafael Wysocki:
"ACPICA is the leader this time (63 commits), followed by cpufreq (28
commits), devfreq (15 commits), system suspend/hibernation (12
commits), ACPI video and ACPI device enumeration (10 commits each).
We have no major new features this time, but there are a few
significant changes of how things work. The most visible one will
probably be that we are now going to create platform devices rather
than PNP devices by default for ACPI device objects with _HID. That
was long overdue and will be really necessary to be able to use the
same drivers for the same hardware blocks on ACPI and DT-based systems
going forward. We're not expecting fallout from this one (as usual),
but it's something to watch nevertheless.
The second change having a chance to be visible is that ACPI video
will now default to using native backlight rather than the ACPI
backlight interface which should generally help systems with broken
Win8 BIOSes. We're hoping that all problems with the native backlight
handling that we had previously have been addressed and we are in a
good enough shape to flip the default, but this change should be easy
enough to revert if need be.
In addition to that, the system suspend core has a new mechanism to
allow runtime-suspended devices to stay suspended throughout system
suspend/resume transitions if some extra conditions are met
(generally, they are related to coordination within device hierarchy).
However, enabling this feature requires cooperation from the bus type
layer and for now it has only been implemented for the ACPI PM domain
(used by ACPI-enumerated platform devices mostly today).
Also, the acpidump utility that was previously shipped as a separate
tool will now be provided by the upstream ACPICA along with the rest
of ACPICA code, which will allow it to be more up to date and better
supported, and we have one new cpuidle driver (ARM clps711x).
The rest is improvements related to certain specific use cases,
cleanups and fixes all over the place.
Specifics:
- ACPICA update to upstream version 20140424. That includes a number
of fixes and improvements related to things like GPE handling,
table loading, headers, memory mapping and unmapping, DSDT/SSDT
overriding, and the Unload() operator. The acpidump utility from
upstream ACPICA is included too. From Bob Moore, Lv Zheng, David
Box, David Binderman, and Colin Ian King.
- Fixes and cleanups related to ACPI video and backlight interfaces
from Hans de Goede. That includes blacklist entries for some new
machines and using native backlight by default.
- ACPI device enumeration changes to create platform devices rather
than PNP devices for ACPI device objects with _HID by default. PNP
devices will still be created for the ACPI device object with
device IDs corresponding to real PNP devices, so that change should
not break things left and right, and we're expecting to see more
and more ACPI-enumerated platform devices in the future. From
Zhang Rui and Rafael J Wysocki.
- Updates for the ACPI LPSS (Low-Power Subsystem) driver allowing it
to handle system suspend/resume on Asus T100 correctly. From
Heikki Krogerus and Rafael J Wysocki.
- PM core update introducing a mechanism to allow runtime-suspended
devices to stay suspended over system suspend/resume transitions if
certain additional conditions related to coordination within device
hierarchy are met. Related PM documentation update and ACPI PM
domain support for the new feature. From Rafael J Wysocki.
- Fixes and improvements related to the "freeze" sleep state. They
affect several places including cpuidle, PM core, ACPI core, and
the ACPI battery driver. From Rafael J Wysocki and Zhang Rui.
- Miscellaneous fixes and updates of the ACPI core from Aaron Lu,
Bjørn Mork, Hanjun Guo, Lan Tianyu, and Rafael J Wysocki.
- Fixes and cleanups for the ACPI processor and ACPI PAD (Processor
Aggregator Device) drivers from Baoquan He, Manuel Schölling, Tony
Camuso, and Toshi Kani.
- System suspend/resume optimization in the ACPI battery driver from
Lan Tianyu.
- OPP (Operating Performance Points) subsystem updates from Chander
Kashyap, Mark Brown, and Nishanth Menon.
- cpufreq core fixes, updates and cleanups from Srivatsa S Bhat,
Stratos Karafotis, and Viresh Kumar.
- Updates, fixes and cleanups for the Tegra, powernow-k8, imx6q,
s5pv210, nforce2, and powernv cpufreq drivers from Brian Norris,
Jingoo Han, Paul Bolle, Philipp Zabel, Stratos Karafotis, and
Viresh Kumar.
- intel_pstate driver fixes and cleanups from Dirk Brandewie, Doug
Smythies, and Stratos Karafotis.
- Enabling the big.LITTLE cpufreq driver on arm64 from Mark Brown.
- Fix for the cpuidle menu governor from Chander Kashyap.
- New ARM clps711x cpuidle driver from Alexander Shiyan.
- Hibernate core fixes and cleanups from Chen Gang, Dan Carpenter,
Fabian Frederick, Pali Rohár, and Sebastian Capella.
- Intel RAPL (Running Average Power Limit) driver updates from Jacob
Pan.
- PNP subsystem updates from Bjorn Helgaas and Fabian Frederick.
- devfreq core updates from Chanwoo Choi and Paul Bolle.
- devfreq updates for exynos4 and exynos5 from Chanwoo Choi and
Bartlomiej Zolnierkiewicz.
- turbostat tool fix from Jean Delvare.
- cpupower tool updates from Prarit Bhargava, Ramkumar Ramachandra
and Thomas Renninger.
- New ACPI ec_access.c tool for poking at the EC in a safe way from
Thomas Renninger"
* tag 'pm+acpi-3.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (187 commits)
ACPICA: Namespace: Remove _PRP method support.
intel_pstate: Improve initial busy calculation
intel_pstate: add sample time scaling
intel_pstate: Correct rounding in busy calculation
intel_pstate: Remove C0 tracking
PM / hibernate: fixed typo in comment
ACPI: Fix x86 regression related to early mapping size limitation
ACPICA: Tables: Add mechanism to control early table checksum verification.
ACPI / scan: use platform bus type by default for _HID enumeration
ACPI / scan: always register ACPI LPSS scan handler
ACPI / scan: always register memory hotplug scan handler
ACPI / scan: always register container scan handler
ACPI / scan: Change the meaning of missing .attach() in scan handlers
ACPI / scan: introduce platform_id device PNP type flag
ACPI / scan: drop unsupported serial IDs from PNP ACPI scan handler ID list
ACPI / scan: drop IDs that do not comply with the ACPI PNP ID rule
ACPI / PNP: use device ID list for PNPACPI device enumeration
ACPI / scan: .match() callback for ACPI scan handlers
ACPI / battery: wakeup the system only when necessary
power_supply: allow power supply devices registered w/o wakeup source
...
|
|
Conflicts:
include/net/inetpeer.h
net/ipv6/output_core.c
Changes in net were fixing bugs in code removed in net-next.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If allocation of the max_buffer fails on boot up, the error path will
free both per_cpu data structures from the buffers. With the new redesign
of the code, those structures are freed if allocations failed. That is,
the helper function that allocates the buffers will free the per cpu data
on failure. No need to do it again. In fact, the second free will cause
a bug as the code can not handle a double free.
Link: http://lkml.kernel.org/p/20140603042803.27308.30956.stgit@yunodevel
Signed-off-by: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|
* pnp:
MAINTAINERS: Remove Bjorn Helgaas as PNP maintainer
PNP / resources: remove positive test on unsigned values
* powercap:
powercap / RAPL: add new CPU IDs
powercap / RAPL: further relax energy counter checks
* pm-runtime:
PM / runtime: Update documentation to reflect the current code flow
* pm-opp:
PM / OPP: discard duplicate OPPs
PM / OPP: Make OPP invisible to users in Kconfig
PM / OPP: fix incorrect OPP count handling in of_init_opp_table
|
|
* acpi-pm:
ACPI / PM: Export rest of the subsys PM callbacks
ACPI / PM: Avoid resuming devices in ACPI PM domain during system suspend
ACPI / PM: Hold ACPI scan lock over the "freeze" sleep state
ACPI / PM: Export acpi_target_system_state() to modules
|
|
* pm-sleep:
PM / hibernate: fixed typo in comment
PM / sleep: unregister wakeup source when disabling device wakeup
PM / sleep: Introduce command line argument for sleep state enumeration
PM / sleep: Use valid_state() for platform-dependent sleep states only
PM / sleep: Add state field to pm_states[] entries
PM / sleep: Update device PM documentation to cover direct_complete
PM / sleep: Mechanism to avoid resuming runtime-suspended devices unnecessarily
PM / hibernate: Fix memory corruption in resumedelay_setup()
PM / hibernate: convert simple_strtoul to kstrtoul
PM / hibernate: Documentation: Fix script for unswapping
PM / hibernate: no kernel_power_off when pm_power_off NULL
PM / hibernate: use unsigned local variables in swsusp_show_speed()
|
|
* pm-cpuidle:
PM / suspend: Always use deepest C-state in the "freeze" sleep state
cpuidle / menu: move repeated correction factor check to init
cpuidle / menu: Return (-1) if there are no suitable states
cpuidle: Combine cpuidle_enabled() with cpuidle_select()
ARM: clps711x: Add cpuidle driver
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next
Pull scheduler updates from Ingo Molnar:
"The main scheduling related changes in this cycle were:
- various sched/numa updates, for better performance
- tree wide cleanup of open coded nice levels
- nohz fix related to rq->nr_running use
- cpuidle changes and continued consolidation to improve the
kernel/sched/idle.c high level idle scheduling logic. As part of
this effort I pulled cpuidle driver changes from Rafael as well.
- standardized idle polling amongst architectures
- continued work on preparing better power/energy aware scheduling
- sched/rt updates
- misc fixlets and cleanups"
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (49 commits)
sched/numa: Decay ->wakee_flips instead of zeroing
sched/numa: Update migrate_improves/degrades_locality()
sched/numa: Allow task switch if load imbalance improves
sched/rt: Fix 'struct sched_dl_entity' and dl_task_time() comments, to match the current upstream code
sched: Consolidate open coded implementations of nice level frobbing into nice_to_rlimit() and rlimit_to_nice()
sched: Initialize rq->age_stamp on processor start
sched, nohz: Change rq->nr_running to always use wrappers
sched: Fix the rq->next_balance logic in rebalance_domains() and idle_balance()
sched: Use clamp() and clamp_val() to make sys_nice() more readable
sched: Do not zero sg->cpumask and sg->sgp->power in build_sched_groups()
sched/numa: Fix initialization of sched_domain_topology for NUMA
sched: Call select_idle_sibling() when not affine_sd
sched: Simplify return logic in sched_read_attr()
sched: Simplify return logic in sched_copy_attr()
sched: Fix exec_start/task_hot on migrated tasks
arm64: Remove TIF_POLLING_NRFLAG
metag: Remove TIF_POLLING_NRFLAG
sched/idle: Make cpuidle_idle_call() void
sched/idle: Reflow cpuidle_idle_call()
sched/idle: Delay clearing the polling bit
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next
Pull perf updates from Ingo Molnar:
"The tooling changes maintained by Jiri Olsa until Arnaldo is on
vacation:
User visible changes:
- Add -F option for specifying output fields (Namhyung Kim)
- Propagate exit status of a command line workload for record command
(Namhyung Kim)
- Use tid for finding thread (Namhyung Kim)
- Clarify the output of perf sched map plus small sched command
fixes (Dongsheng Yang)
- Wire up perf_regs and unwind support for ARM64 (Jean Pihet)
- Factor hists statistics counts processing which in turn also fixes
several bugs in TUI report command (Namhyung Kim)
- Add --percentage option to control absolute/relative percentage
output (Namhyung Kim)
- Add --list-cmds to 'kmem', 'mem', 'lock' and 'sched', for use by
completion scripts (Ramkumar Ramachandra)
Development/infrastructure changes and fixes:
- Android related fixes for pager and map dso resolving (Michael
Lentine)
- Add libdw DWARF post unwind support for ARM (Jean Pihet)
- Consolidate types.h for ARM and ARM64 (Jean Pihet)
- Fix possible null pointer dereference in session.c (Masanari Iida)
- Cleanup, remove unused variables in map_switch_event() (Dongsheng
Yang)
- Remove nr_state_machine_bugs in perf latency (Dongsheng Yang)
- Remove usage of trace_sched_wakeup(.success) (Peter Zijlstra)
- Cleanups for perf.h header (Jiri Olsa)
- Consolidate types.h and export.h within tools (Borislav Petkov)
- Move u64_swap union to its single user's header, evsel.h (Borislav
Petkov)
- Fix for s390 to properly parse tracepoints plus test code
(Alexander Yarygin)
- Handle EINTR error for readn/writen (Namhyung Kim)
- Add a test case for hists filtering (Namhyung Kim)
- Share map_groups among threads of the same group (Arnaldo Carvalho
de Melo, Jiri Olsa)
- Making some code (cpu node map and report parse callchain callback)
global to be usable by upcomming changes (Don Zickus)
- Fix pmu object compilation error (Jiri Olsa)
Kernel side changes:
- intrusive uprobes fixes from Oleg Nesterov. Since the interface is
admin-only, and the bug only affects user-space ("any probed
jmp/call can kill the application"), we queued these fixes via the
development tree, as a special exception.
- more fuzzer motivated race fixes and related refactoring and
robustization.
- allow PMU drivers to be built as modules. (No actual module yet,
because the x86 Intel uncore module wasn't ready in time for this)"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (114 commits)
perf tools: Add automatic remapping of Android libraries
perf tools: Add cat as fallback pager
perf tests: Add a testcase for histogram output sorting
perf tests: Factor out print_hists_*()
perf tools: Introduce reset_output_field()
perf tools: Get rid of obsolete hist_entry__sort_list
perf hists: Reset width of output fields with header length
perf tools: Skip elided sort entries
perf top: Add --fields option to specify output fields
perf report/tui: Fix a bug when --fields/sort is given
perf tools: Add ->sort() member to struct sort_entry
perf report: Add -F option to specify output fields
perf tools: Call perf_hpp__init() before setting up GUI browsers
perf tools: Consolidate management of default sort orders
perf tools: Allow hpp fields to be sort keys
perf ui: Get rid of callback from __hpp__fmt()
perf tools: Consolidate output field handling to hpp format routines
perf tools: Use hpp formats to sort final output
perf tools: Support event grouping in hpp ->sort()
perf tools: Use hpp formats to sort hist entries
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next
Pull core locking updates from Ingo Molnar:
"The main changes in this cycle were:
- reduced/streamlined smp_mb__*() interface that allows more usecases
and makes the existing ones less buggy, especially in rarer
architectures
- add rwsem implementation comments
- bump up lockdep limits"
* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (33 commits)
rwsem: Add comments to explain the meaning of the rwsem's count field
lockdep: Increase static allocations
arch: Mass conversion of smp_mb__*()
arch,doc: Convert smp_mb__*()
arch,xtensa: Convert smp_mb__*()
arch,x86: Convert smp_mb__*()
arch,tile: Convert smp_mb__*()
arch,sparc: Convert smp_mb__*()
arch,sh: Convert smp_mb__*()
arch,score: Convert smp_mb__*()
arch,s390: Convert smp_mb__*()
arch,powerpc: Convert smp_mb__*()
arch,parisc: Convert smp_mb__*()
arch,openrisc: Convert smp_mb__*()
arch,mn10300: Convert smp_mb__*()
arch,mips: Convert smp_mb__*()
arch,metag: Convert smp_mb__*()
arch,m68k: Convert smp_mb__*()
arch,m32r: Convert smp_mb__*()
arch,ia64: Convert smp_mb__*()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next
Pull RCU changes from Ingo Molnar:
"The main RCU changes in this cycle were:
- RCU torture-test changes.
- variable-name renaming cleanup.
- update RCU documentation.
- miscellaneous fixes.
- patch to suppress RCU stall warnings while sysrq requests are being
processed"
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (68 commits)
rcu: Provide API to suppress stall warnings while sysrc runs
rcu: Variable name changed in tree_plugin.h and used in tree.c
torture: Remove unused definition
torture: Remove __init from torture_init_begin/end
torture: Check for multiple concurrent torture tests
locktorture: Remove reference to nonexistent Kconfig parameter
rcutorture: Run rcu_torture_writer at normal priority
rcutorture: Note diffs from git commits
rcutorture: Add missing destroy_timer_on_stack()
rcutorture: Explicitly test synchronous grace-period primitives
rcutorture: Add tests for get_state_synchronize_rcu()
rcutorture: Test RCU-sched primitives in TREE_PREEMPT_RCU kernels
torture: Use elapsed time to detect hangs
rcutorture: Check for rcu_torture_fqs creation errors
torture: Better summary diagnostics for build failures
torture: Notice if an all-zero cpumask is passed inside a critical section
rcutorture: Make rcu_torture_reader() use cond_resched()
sched,rcu: Make cond_resched() report RCU quiescent states
percpu: Fix raw_cpu_inc_return()
rcutorture: Export RCU grace-period kthread wait state to rcutorture
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty into next
Pull tty/serial driver updates from Greg KH:
"Here is the big tty / serial driver pull request for 3.16-rc1.
A variety of different serial driver fixes and updates and additions,
nothing huge, and no real major core tty changes at all.
All have been in linux-next for a while"
* tag 'tty-3.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (84 commits)
Revert "serial: imx: remove the DMA wait queue"
serial: kgdb_nmi: Improve console integration with KDB I/O
serial: kgdb_nmi: Switch from tasklets to real timers
serial: kgdb_nmi: Use container_of() to locate private data
serial: cpm_uart: No LF conversion in put_poll_char()
serial: sirf: Fix compilation failure
console: Remove superfluous readonly check
console: Use explicit pointer type for vc_uni_pagedir* fields
vgacon: Fix & cleanup refcounting
ARM: tty: Move HVC DCC assembly to arch/arm
tty/hvc/hvc_console: Fix wakeup of HVC thread on hvc_kick()
drivers/tty/n_hdlc.c: replace kmalloc/memset by kzalloc
vt: emulate 8- and 24-bit colour codes.
printk/of_serial: fix serial console cessation part way through boot.
serial: 8250_dma: check the result of TX buffer mapping
serial: uart: add hw flow control support configuration
tty/serial: at91: add interrupts for modem control lines
tty/serial: at91: use mctrl_gpio helpers
tty/serial: Add GPIOLIB helpers for controlling modem lines
ARM: at91: gpio: implement get_direction
...
|
|
There is still one residue of sysfs remaining: the sb_magic
SYSFS_MAGIC. However this should be kernfs user specific,
so this patch moves it out. Kerrnfs user should specify their
magic number while mouting.
Signed-off-by: Jianyu Zhan <nasa4836@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core into next
Pull driver core / kernfs changes from Greg KH:
"Here is the "big" pull request for 3.16-rc1.
Not a lot of changes here, some kernfs work, a revert of a very old
driver core change that ended up cauing some memory leaks on driver
probe error paths, and other minor things.
As was pointed out earlier today, one commit here, 26fc9cd200ec
("kernfs: move the last knowledge of sysfs out from kernfs") is also
needed in your 3.15-final branch as well. If you could cherry-pick it
there, it would be most appreciated by Andy Lutomirski to prevent a
regression there.
All of these have been in linux-next for a while"
* tag 'driver-core-3.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
crypto/nx/nx-842: dev_set_drvdata can no longer fail
kernfs: move the last knowledge of sysfs out from kernfs
sysfs: fix attribute_group bin file path on removal
sysfs.h: don't return a void-valued expression in sysfs_remove_file
init.h: Update initcall_sync variants to fix build errors
driver core: Inline dev_set/get_drvdata
driver core: dev_get_drvdata: Don't check for NULL dev
driver core: dev_set_drvdata returns void
driver core: dev_set_drvdata can no longer fail
driver core: Move driver_data back to struct device
lib/devres.c: fix checkpatch warnings
lib/devres.c: use dev in devm_request_and_ioremap
kobject: Make support for uevent_helper optional.
kernfs: make kernfs_notify() trigger inotify events too
kernfs: implement kernfs_root->supers list
|
|
While I played with my own feature(ex, something on the way to reclaim),
the kernel would easily oops. I guessed that the reason had to do with
stack overflow and wanted to prove it.
I discovered the stack tracer which proved to be very useful for me but
the kernel would oops before my user program gather the information via
"watch cat /sys/kernel/debug/tracing/stack_trace" so I couldn't get any
message from that. What I needed was to have the stack tracer emit the
kernel stack usage before it does the oops so I could find what was
hogging the stack.
This patch shows the callstack of max stack usage right before an oops so
we can find a culprit.
So, the result is as follows.
[ 1116.522206] init: lightdm main process (1246) terminated with status 1
[ 1119.922916] init: failsafe-x main process (1272) terminated with status 1
[ 3887.728131] kworker/u24:1 (6637) used greatest stack depth: 256 bytes left
[ 6397.629227] cc1 (9554) used greatest stack depth: 128 bytes left
[ 7174.467392] Depth Size Location (47 entries)
[ 7174.467392] ----- ---- --------
[ 7174.467785] 0) 7248 256 get_page_from_freelist+0xa7/0x920
[ 7174.468506] 1) 6992 352 __alloc_pages_nodemask+0x1cd/0xb20
[ 7174.469224] 2) 6640 8 alloc_pages_current+0x10f/0x1f0
[ 7174.469413] 3) 6632 168 new_slab+0x2c5/0x370
[ 7174.469413] 4) 6464 8 __slab_alloc+0x3a9/0x501
[ 7174.469413] 5) 6456 80 __kmalloc+0x1cb/0x200
[ 7174.469413] 6) 6376 376 vring_add_indirect+0x36/0x200
[ 7174.469413] 7) 6000 144 virtqueue_add_sgs+0x2e2/0x320
[ 7174.469413] 8) 5856 288 __virtblk_add_req+0xda/0x1b0
[ 7174.469413] 9) 5568 96 virtio_queue_rq+0xd3/0x1d0
[ 7174.469413] 10) 5472 128 __blk_mq_run_hw_queue+0x1ef/0x440
[ 7174.469413] 11) 5344 16 blk_mq_run_hw_queue+0x35/0x40
[ 7174.469413] 12) 5328 96 blk_mq_insert_requests+0xdb/0x160
[ 7174.469413] 13) 5232 112 blk_mq_flush_plug_list+0x12b/0x140
[ 7174.469413] 14) 5120 112 blk_flush_plug_list+0xc7/0x220
[ 7174.469413] 15) 5008 64 io_schedule_timeout+0x88/0x100
[ 7174.469413] 16) 4944 128 mempool_alloc+0x145/0x170
[ 7174.469413] 17) 4816 96 bio_alloc_bioset+0x10b/0x1d0
[ 7174.469413] 18) 4720 48 get_swap_bio+0x30/0x90
[ 7174.469413] 19) 4672 160 __swap_writepage+0x150/0x230
[ 7174.469413] 20) 4512 32 swap_writepage+0x42/0x90
[ 7174.469413] 21) 4480 320 shrink_page_list+0x676/0xa80
[ 7174.469413] 22) 4160 208 shrink_inactive_list+0x262/0x4e0
[ 7174.469413] 23) 3952 304 shrink_lruvec+0x3e1/0x6a0
[ 7174.469413] 24) 3648 80 shrink_zone+0x3f/0x110
[ 7174.469413] 25) 3568 128 do_try_to_free_pages+0x156/0x4c0
[ 7174.469413] 26) 3440 208 try_to_free_pages+0xf7/0x1e0
[ 7174.469413] 27) 3232 352 __alloc_pages_nodemask+0x783/0xb20
[ 7174.469413] 28) 2880 8 alloc_pages_current+0x10f/0x1f0
[ 7174.469413] 29) 2872 200 __page_cache_alloc+0x13f/0x160
[ 7174.469413] 30) 2672 80 find_or_create_page+0x4c/0xb0
[ 7174.469413] 31) 2592 80 ext4_mb_load_buddy+0x1e9/0x370
[ 7174.469413] 32) 2512 176 ext4_mb_regular_allocator+0x1b7/0x460
[ 7174.469413] 33) 2336 128 ext4_mb_new_blocks+0x458/0x5f0
[ 7174.469413] 34) 2208 256 ext4_ext_map_blocks+0x70b/0x1010
[ 7174.469413] 35) 1952 160 ext4_map_blocks+0x325/0x530
[ 7174.469413] 36) 1792 384 ext4_writepages+0x6d1/0xce0
[ 7174.469413] 37) 1408 16 do_writepages+0x23/0x40
[ 7174.469413] 38) 1392 96 __writeback_single_inode+0x45/0x2e0
[ 7174.469413] 39) 1296 176 writeback_sb_inodes+0x2ad/0x500
[ 7174.469413] 40) 1120 80 __writeback_inodes_wb+0x9e/0xd0
[ 7174.469413] 41) 1040 160 wb_writeback+0x29b/0x350
[ 7174.469413] 42) 880 208 bdi_writeback_workfn+0x11c/0x480
[ 7174.469413] 43) 672 144 process_one_work+0x1d2/0x570
[ 7174.469413] 44) 528 112 worker_thread+0x116/0x370
[ 7174.469413] 45) 416 240 kthread+0xf3/0x110
[ 7174.469413] 46) 176 176 ret_from_fork+0x7c/0xb0
[ 7174.469413] ------------[ cut here ]------------
[ 7174.469413] kernel BUG at kernel/trace/trace_stack.c:174!
[ 7174.469413] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
[ 7174.469413] Dumping ftrace buffer:
[ 7174.469413] (ftrace buffer empty)
[ 7174.469413] Modules linked in:
[ 7174.469413] CPU: 0 PID: 440 Comm: kworker/u24:0 Not tainted 3.14.0+ #212
[ 7174.469413] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 7174.469413] Workqueue: writeback bdi_writeback_workfn (flush-253:0)
[ 7174.469413] task: ffff880034170000 ti: ffff880029518000 task.ti: ffff880029518000
[ 7174.469413] RIP: 0010:[<ffffffff8112336e>] [<ffffffff8112336e>] stack_trace_call+0x2de/0x340
[ 7174.469413] RSP: 0000:ffff880029518290 EFLAGS: 00010046
[ 7174.469413] RAX: 0000000000000030 RBX: 000000000000002f RCX: 0000000000000000
[ 7174.469413] RDX: 0000000000000000 RSI: 000000000000002f RDI: ffffffff810b7159
[ 7174.469413] RBP: ffff8800295182f0 R08: ffffffffffffffff R09: 0000000000000000
[ 7174.469413] R10: 0000000000000001 R11: 0000000000000001 R12: ffffffff82768dfc
[ 7174.469413] R13: 000000000000f2e8 R14: ffff8800295182b8 R15: 00000000000000f8
[ 7174.469413] FS: 0000000000000000(0000) GS:ffff880037c00000(0000) knlGS:0000000000000000
[ 7174.469413] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 7174.469413] CR2: 00002acd0b994000 CR3: 0000000001c0b000 CR4: 00000000000006f0
[ 7174.469413] Stack:
[ 7174.469413] 0000000000000000 ffffffff8114fdb7 0000000000000087 0000000000001c50
[ 7174.469413] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 7174.469413] 0000000000000002 ffff880034170000 ffff880034171028 0000000000000000
[ 7174.469413] Call Trace:
[ 7174.469413] [<ffffffff8114fdb7>] ? get_page_from_freelist+0xa7/0x920
[ 7174.469413] [<ffffffff816eee3f>] ftrace_call+0x5/0x2f
[ 7174.469413] [<ffffffff81165065>] ? next_zones_zonelist+0x5/0x70
[ 7174.469413] [<ffffffff810a23fa>] ? __bfs+0x11a/0x270
[ 7174.469413] [<ffffffff81165065>] ? next_zones_zonelist+0x5/0x70
[ 7174.469413] [<ffffffff8114fdb7>] ? get_page_from_freelist+0xa7/0x920
[ 7174.469413] [<ffffffff8119092f>] ? alloc_pages_current+0x10f/0x1f0
[ 7174.469413] [<ffffffff811507fd>] __alloc_pages_nodemask+0x1cd/0xb20
[ 7174.469413] [<ffffffff810a4de6>] ? check_irq_usage+0x96/0xe0
[ 7174.469413] [<ffffffff816eee3f>] ? ftrace_call+0x5/0x2f
[ 7174.469413] [<ffffffff8119092f>] alloc_pages_current+0x10f/0x1f0
[ 7174.469413] [<ffffffff81199cd5>] ? new_slab+0x2c5/0x370
[ 7174.469413] [<ffffffff81199cd5>] new_slab+0x2c5/0x370
[ 7174.469413] [<ffffffff816eee3f>] ? ftrace_call+0x5/0x2f
[ 7174.469413] [<ffffffff816db002>] __slab_alloc+0x3a9/0x501
[ 7174.469413] [<ffffffff8119af8b>] ? __kmalloc+0x1cb/0x200
[ 7174.469413] [<ffffffff8141dc46>] ? vring_add_indirect+0x36/0x200
[ 7174.469413] [<ffffffff8141dc46>] ? vring_add_indirect+0x36/0x200
[ 7174.469413] [<ffffffff8141dc46>] ? vring_add_indirect+0x36/0x200
[ 7174.469413] [<ffffffff8119af8b>] __kmalloc+0x1cb/0x200
[ 7174.469413] [<ffffffff8141de10>] ? vring_add_indirect+0x200/0x200
[ 7174.469413] [<ffffffff8141dc46>] vring_add_indirect+0x36/0x200
[ 7174.469413] [<ffffffff8141e402>] virtqueue_add_sgs+0x2e2/0x320
[ 7174.469413] [<ffffffff8148e35a>] __virtblk_add_req+0xda/0x1b0
[ 7174.469413] [<ffffffff8148e503>] virtio_queue_rq+0xd3/0x1d0
[ 7174.469413] [<ffffffff8134aa0f>] __blk_mq_run_hw_queue+0x1ef/0x440
[ 7174.469413] [<ffffffff8134b0d5>] blk_mq_run_hw_queue+0x35/0x40
[ 7174.469413] [<ffffffff8134b7bb>] blk_mq_insert_requests+0xdb/0x160
[ 7174.469413] [<ffffffff8134be5b>] blk_mq_flush_plug_list+0x12b/0x140
[ 7174.469413] [<ffffffff81342237>] blk_flush_plug_list+0xc7/0x220
[ 7174.469413] [<ffffffff816e60ef>] ? _raw_spin_unlock_irqrestore+0x3f/0x70
[ 7174.469413] [<ffffffff816e16e8>] io_schedule_timeout+0x88/0x100
[ 7174.469413] [<ffffffff816e1665>] ? io_schedule_timeout+0x5/0x100
[ 7174.469413] [<ffffffff81149415>] mempool_alloc+0x145/0x170
[ 7174.469413] [<ffffffff8109baf0>] ? __init_waitqueue_head+0x60/0x60
[ 7174.469413] [<ffffffff811e246b>] bio_alloc_bioset+0x10b/0x1d0
[ 7174.469413] [<ffffffff81184230>] ? end_swap_bio_read+0xc0/0xc0
[ 7174.469413] [<ffffffff81184230>] ? end_swap_bio_read+0xc0/0xc0
[ 7174.469413] [<ffffffff81184110>] get_swap_bio+0x30/0x90
[ 7174.469413] [<ffffffff81184230>] ? end_swap_bio_read+0xc0/0xc0
[ 7174.469413] [<ffffffff81184660>] __swap_writepage+0x150/0x230
[ 7174.469413] [<ffffffff810ab405>] ? do_raw_spin_unlock+0x5/0xa0
[ 7174.469413] [<ffffffff81184230>] ? end_swap_bio_read+0xc0/0xc0
[ 7174.469413] [<ffffffff81184515>] ? __swap_writepage+0x5/0x230
[ 7174.469413] [<ffffffff81184782>] swap_writepage+0x42/0x90
[ 7174.469413] [<ffffffff8115ae96>] shrink_page_list+0x676/0xa80
[ 7174.469413] [<ffffffff816eee3f>] ? ftrace_call+0x5/0x2f
[ 7174.469413] [<ffffffff8115b872>] shrink_inactive_list+0x262/0x4e0
[ 7174.469413] [<ffffffff8115c1c1>] shrink_lruvec+0x3e1/0x6a0
[ 7174.469413] [<ffffffff8115c4bf>] shrink_zone+0x3f/0x110
[ 7174.469413] [<ffffffff816eee3f>] ? ftrace_call+0x5/0x2f
[ 7174.469413] [<ffffffff8115c9e6>] do_try_to_free_pages+0x156/0x4c0
[ 7174.469413] [<ffffffff8115cf47>] try_to_free_pages+0xf7/0x1e0
[ 7174.469413] [<ffffffff81150db3>] __alloc_pages_nodemask+0x783/0xb20
[ 7174.469413] [<ffffffff8119092f>] alloc_pages_current+0x10f/0x1f0
[ 7174.469413] [<ffffffff81145c0f>] ? __page_cache_alloc+0x13f/0x160
[ 7174.469413] [<ffffffff81145c0f>] __page_cache_alloc+0x13f/0x160
[ 7174.469413] [<ffffffff81146c6c>] find_or_create_page+0x4c/0xb0
[ 7174.469413] [<ffffffff811463e5>] ? find_get_page+0x5/0x130
[ 7174.469413] [<ffffffff812837b9>] ext4_mb_load_buddy+0x1e9/0x370
[ 7174.469413] [<ffffffff81284c07>] ext4_mb_regular_allocator+0x1b7/0x460
[ 7174.469413] [<ffffffff81281070>] ? ext4_mb_use_preallocated+0x40/0x360
[ 7174.469413] [<ffffffff816eee3f>] ? ftrace_call+0x5/0x2f
[ 7174.469413] [<ffffffff81287eb8>] ext4_mb_new_blocks+0x458/0x5f0
[ 7174.469413] [<ffffffff8127d83b>] ext4_ext_map_blocks+0x70b/0x1010
[ 7174.469413] [<ffffffff8124e6d5>] ext4_map_blocks+0x325/0x530
[ 7174.469413] [<ffffffff81253871>] ext4_writepages+0x6d1/0xce0
[ 7174.469413] [<ffffffff812531a0>] ? ext4_journalled_write_end+0x330/0x330
[ 7174.469413] [<ffffffff811539b3>] do_writepages+0x23/0x40
[ 7174.469413] [<ffffffff811d2365>] __writeback_single_inode+0x45/0x2e0
[ 7174.469413] [<ffffffff811d36ed>] writeback_sb_inodes+0x2ad/0x500
[ 7174.469413] [<ffffffff811d39de>] __writeback_inodes_wb+0x9e/0xd0
[ 7174.469413] [<ffffffff811d40bb>] wb_writeback+0x29b/0x350
[ 7174.469413] [<ffffffff81057c3d>] ? __local_bh_enable_ip+0x6d/0xd0
[ 7174.469413] [<ffffffff811d6e9c>] bdi_writeback_workfn+0x11c/0x480
[ 7174.469413] [<ffffffff81070610>] ? process_one_work+0x170/0x570
[ 7174.469413] [<ffffffff81070672>] process_one_work+0x1d2/0x570
[ 7174.469413] [<ffffffff81070610>] ? process_one_work+0x170/0x570
[ 7174.469413] [<ffffffff81071bb6>] worker_thread+0x116/0x370
[ 7174.469413] [<ffffffff81071aa0>] ? manage_workers.isra.19+0x2e0/0x2e0
[ 7174.469413] [<ffffffff81078e53>] kthread+0xf3/0x110
[ 7174.469413] [<ffffffff81078d60>] ? flush_kthread_worker+0x150/0x150
[ 7174.469413] [<ffffffff816ef0ec>] ret_from_fork+0x7c/0xb0
[ 7174.469413] [<ffffffff81078d60>] ? flush_kthread_worker+0x150/0x150
[ 7174.469413] Code: c0 49 bc fc 8d 76 82 ff ff ff ff e8 44 5a 5b 00 31 f6 8b 05 95 2b b3 00 48 39 c6 7d 0e 4c 8b 04 f5 20 5f c5 81 49 83 f8 ff 75 11 <0f> 0b 48 63 05 71 5a 64 01 48 29 c3 e9 d0 fd ff ff 48 8d 5e 01
[ 7174.469413] RIP [<ffffffff8112336e>] stack_trace_call+0x2de/0x340
[ 7174.469413] RSP <ffff880029518290>
[ 7174.469413] ---[ end trace c97d325b36b718f3 ]---
Link: http://lkml.kernel.org/p/1401683592-1651-1-git-send-email-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci into next
Pull PCI changes from Bjorn Helgaas:
"Enumeration
- Notify driver before and after device reset (Keith Busch)
- Use reset notification in NVMe (Keith Busch)
NUMA
- Warn if we have to guess host bridge node information (Myron Stowe)
- Work around AMD Fam15h BIOSes that fail to provide _PXM (Suravee
Suthikulpanit)
- Clean up and mark early_root_info_init() as deprecated (Suravee
Suthikulpanit)
Driver binding
- Add "driver_override" for force specific binding (Alex Williamson)
- Fail "new_id" addition for devices we already know about (Bandan
Das)
Resource management
- Support BAR sizes up to 8GB (Nikhil Rao, Alan Cox)
- Don't move IORESOURCE_PCI_FIXED resources (Bjorn Helgaas)
- Mark SBx00 HPET BAR as IORESOURCE_PCI_FIXED (Bjorn Helgaas)
- Fail safely if we can't handle BARs larger than 4GB (Bjorn Helgaas)
- Reject BAR above 4GB if dma_addr_t is too small (Bjorn Helgaas)
- Don't convert BAR address to resource if dma_addr_t is too small
(Bjorn Helgaas)
- Don't set BAR to zero if dma_addr_t is too small (Bjorn Helgaas)
- Don't print anything while decoding is disabled (Bjorn Helgaas)
- Don't add disabled subtractive decode bus resources (Bjorn Helgaas)
- Add resource allocation comments (Bjorn Helgaas)
- Restrict 64-bit prefetchable bridge windows to 64-bit resources
(Yinghai Lu)
- Assign i82875p_edac PCI resources before adding device (Yinghai Lu)
PCI device hotplug
- Remove unnecessary "dev->bus" test (Bjorn Helgaas)
- Use PCI_EXP_SLTCAP_PSN define (Bjorn Helgaas)
- Fix rphahp endianess issues (Laurent Dufour)
- Acknowledge spurious "cmd completed" event (Rajat Jain)
- Allow hotplug service drivers to operate in polling mode (Rajat Jain)
- Fix cpqphp possible NULL dereference (Rickard Strandqvist)
MSI
- Replace pci_enable_msi_block() by pci_enable_msi_exact()
(Alexander Gordeev)
- Replace pci_enable_msix() by pci_enable_msix_exact() (Alexander Gordeev)
- Simplify populate_msi_sysfs() (Jan Beulich)
Virtualization
- Add Intel Patsburg (X79) root port ACS quirk (Alex Williamson)
- Mark RTL8110SC INTx masking as broken (Alex Williamson)
Generic host bridge driver
- Add generic PCI host controller driver (Will Deacon)
Freescale i.MX6
- Use new clock names (Lucas Stach)
- Drop old IRQ mapping (Lucas Stach)
- Remove optional (and unused) IRQs (Lucas Stach)
- Add support for MSI (Lucas Stach)
- Fix imx6_add_pcie_port() section mismatch warning (Sachin Kamat)
Renesas R-Car
- Add gen2 device tree support (Ben Dooks)
- Use new OF interrupt mapping when possible (Lucas Stach)
- Add PCIe driver (Phil Edworthy)
- Add PCIe MSI support (Phil Edworthy)
- Add PCIe device tree bindings (Phil Edworthy)
Samsung Exynos
- Remove unnecessary OOM messages (Jingoo Han)
- Fix add_pcie_port() section mismatch warning (Sachin Kamat)
Synopsys DesignWare
- Make MSI ISR shared IRQ aware (Lucas Stach)
Miscellaneous
- Check for broken config space aliasing (Alex Williamson)
- Update email address (Ben Hutchings)
- Fix Broadcom CNB20LE unintended sign extension (Bjorn Helgaas)
- Fix incorrect vgaarb conditional in WARN_ON() (Bjorn Helgaas)
- Remove unnecessary __ref annotations (Bjorn Helgaas)
- Add arch/x86/kernel/quirks.c to MAINTAINERS PCI file patterns
(Bjorn Helgaas)
- Fix use of uninitialized MPS value (Bjorn Helgaas)
- Tidy x86/gart messages (Bjorn Helgaas)
- Fix return value from pci_user_{read,write}_config_*() (Gavin Shan)
- Turn pcibios_penalize_isa_irq() into a weak function (Hanjun Guo)
- Remove unused serial device IDs (Jean Delvare)
- Use designated initialization in PCI_VDEVICE (Mark Rustad)
- Fix powerpc NULL dereference in pci_root_buses traversal (Mike Qiu)
- Configure MPS on ARM (Murali Karicheri)
- Remove unnecessary includes of <linux/init.h> (Paul Gortmaker)
- Move Open Firmware devspec attribute to PCI common code (Sebastian Ott)
- Use pdev->dev.groups for attribute creation on s390 (Sebastian Ott)
- Remove pcibios_add_platform_entries() (Sebastian Ott)
- Add new ID for Intel GPU "spurious interrupt" quirk (Thomas Jarosch)
- Rename pci_is_bridge() to pci_has_subordinate() (Yijing Wang)
- Add and use new pci_is_bridge() interface (Yijing Wang)
- Make pci_bus_add_device() void (Yijing Wang)
DMA API
- Clarify physical/bus address distinction in docs (Bjorn Helgaas)
- Fix typos in docs (Emilio López)
- Update dma_pool_create ()and dma_pool_alloc() descriptions (Gioh Kim)
- Change dma_declare_coherent_memory() CPU address to phys_addr_t
(Bjorn Helgaas)
- Pass GAPSPCI_DMA_BASE CPU & bus address to dma_declare_coherent_memory()
(Bjorn Helgaas)"
* tag 'pci-v3.16-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (92 commits)
MAINTAINERS: Add generic PCI host controller driver
PCI: generic: Add generic PCI host controller driver
PCI: imx6: Add support for MSI
PCI: designware: Make MSI ISR shared IRQ aware
PCI: imx6: Remove optional (and unused) IRQs
PCI: imx6: Drop old IRQ mapping
PCI: imx6: Use new clock names
i82875p_edac: Assign PCI resources before adding device
ARM/PCI: Call pcie_bus_configure_settings() to set MPS
PCI: imx6: Fix imx6_add_pcie_port() section mismatch warning
PCI: Make pci_bus_add_device() void
PCI: exynos: Fix add_pcie_port() section mismatch warning
PCI: Introduce new device binding path using pci_dev.driver_override
PCI: rcar: Add gen2 device tree support
PCI: cpqphp: Fix possible null pointer dereference
PCI: rcar: Add R-Car PCIe device tree bindings
PCI: rcar: Add MSI support for PCIe
PCI: rcar: Add Renesas R-Car PCIe driver
PCI: Fix return value from pci_user_{read,write}_config_*()
PCI: exynos: Remove unnecessary OOM messages
...
|
|
This patch finally allows us to get rid of the BPF_S_* enum.
Currently, the code performs unnecessary encode and decode
workarounds in seccomp and filter migration itself when a filter
is being attached in order to overcome BPF_S_* encoding which
is not used anymore by the new interpreter resp. JIT compilers.
Keeping it around would mean that also in future we would need
to extend and maintain this enum and related encoders/decoders.
We can get rid of all that and save us these operations during
filter attaching. Naturally, also JIT compilers need to be updated
by this.
Before JIT conversion is being done, each compiler checks if A
is being loaded at startup to obtain information if it needs to
emit instructions to clear A first. Since BPF extensions are a
subset of BPF_LD | BPF_{W,H,B} | BPF_ABS variants, case statements
for extensions can be removed at that point. To ease and minimalize
code changes in the classic JITs, we have introduced bpf_anc_helper().
Tested with test_bpf on x86_64 (JIT, int), s390x (JIT, int),
arm (JIT, int), i368 (int), ppc64 (JIT, int); for sparc we
unfortunately didn't have access, but changes are analogous to
the rest.
Joint work with Alexei Starovoitov.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Mircea Gherzan <mgherzan@gmail.com>
Cc: Kees Cook <keescook@chromium.org>
Acked-by: Chema Gonzalez <chemag@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
"Various fixlets, mostly related to the (root-only) SCHED_DEADLINE
policy, but also a hotplug bug fix and a fix for a NR_CPUS related
overallocation bug causing a suspend/resume regression"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched: Fix hotplug vs. set_cpus_allowed_ptr()
sched/cpupri: Replace NR_CPUS arrays
sched/deadline: Replace NR_CPUS arrays
sched/deadline: Restrict user params max value to 2^63 ns
sched/deadline: Change sched_getparam() behaviour vs SCHED_DEADLINE
sched: Disallow sched_attr::sched_policy < 0
sched: Make sched_setattr() correctly return -EFBIG
|
|
Fix a trivial comment typo (s/mam/map) in kernel/power/swap.c.
Signed-off-by: Niv Yehezkel <executerx@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core futex/rtmutex fixes from Thomas Gleixner:
"Three fixlets for long standing issues in the futex/rtmutex code
unearthed by Dave Jones syscall fuzzer:
- Add missing early deadlock detection checks in the futex code
- Prevent user space from attaching a futex to kernel threads
- Make the deadlock detector of rtmutex work again
Looks large, but is more comments than code change"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
rtmutex: Fix deadlock detector for real
futex: Prevent attaching to kernel threads
futex: Add another early deadlock detection check
|
|
With the conversion of the saved_cmdlines output to use seq_read, there
is now a race between accessing the values of the saved_cmdlines and
the writing to them. The trace_cmdline_lock needs to be taken at
the start and stop of the seq calls.
A new __trace_find_cmdline() call is created to allow for the look up
to happen without taking the lock.
Fixes: 42584c81c5ad tracing: Have saved_cmdlines use the seq_read infrastructure
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|
In order to prevent the saved cmdline cache from being filled when
tracing is not active, the comms are only recorded after a trace event
is recorded.
The problem is, a comm can fail to be recorded if the trace_cmdline_lock
is held. That lock is taken via a trylock to allow it to happen from
any context (including NMI). If the lock fails to be taken, the comm
is skipped. No big deal, as we will try again later.
But! Because of the code that was added to only record after an event,
we may not try again later as the recording is made as a oneshot per
event per CPU.
Only disable the recording of the comm if the comm is actually recorded.
Fixes: 7ffbd48d5cab "tracing: Cache comms only after an event occurred"
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|
Current tracing_saved_cmdlines_read() implementation is naive; It allocates
a large buffer, constructs output data to that buffer for each read
operation, and then copies a portion of the buffer to the user space
buffer. This has several issues such as slow memory allocation, high
CPU usage, and even corruption of the output data.
The seq_read infrastructure is made to handle this type of work.
By converting it to use seq_read() the code becomes smaller, simplified,
as well as correct.
Link: http://lkml.kernel.org/p/20140220084431.3839.51793.stgit@yunodevel
Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|
In order to help benchmark the time tracepoints take, a new config
option is added called CONFIG_TRACEPOINT_BENCHMARK. When this option
is set a tracepoint is created called "benchmark:benchmark_event".
When the tracepoint is enabled, it kicks off a kernel thread that
goes into an infinite loop (calling cond_sched() to let other tasks
run), and calls the tracepoint. Each iteration will record the time
it took to write to the tracepoint and the next iteration that
data will be passed to the tracepoint itself. That is, the tracepoint
will report the time it took to do the previous tracepoint.
The string written to the tracepoint is a static string of 128 bytes
to keep the time the same. The initial string is simply a write of
"START". The second string records the cold cache time of the first
write which is not added to the rest of the calculations.
As it is a tight loop, it benchmarks as hot cache. That's fine because
we care most about hot paths that are probably in cache already.
An example of the output:
START
first=3672 [COLD CACHED]
last=632 first=3672 max=632 min=632 avg=316 std=446 std^2=199712
last=278 first=3672 max=632 min=278 avg=303 std=316 std^2=100337
last=277 first=3672 max=632 min=277 avg=296 std=258 std^2=67064
last=273 first=3672 max=632 min=273 avg=292 std=224 std^2=50411
last=273 first=3672 max=632 min=273 avg=288 std=200 std^2=40389
last=281 first=3672 max=632 min=273 avg=287 std=183 std^2=33666
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|
trace_printk() is used to debug fast paths within the kernel. Places
that gets called in any context (interrupt or NMI) or thousands of
times a second. Something you do not want to do with a printk().
In order to make it completely lockless as it needs a temporary buffer
to handle some of the string formatting, a page is created per cpu for
every context (four per cpu; normal, softirq, irq, NMI).
Since trace_printk() should only be used for debugging purposes,
there's no reason to waste memory on these buffers on a production
system. That means, trace_printk() should never be used unless a
developer is debugging their kernel. There's macro magic to allocate
the buffers if trace_printk() is used anywhere in the kernel.
To help enforce that trace_printk() isn't used outside of development,
when it is used, a nasty banner is displayed on bootup (or when a module
is loaded that uses trace_printk() and the kernel core does not).
Here's the banner:
**********************************************************
** NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE **
** **
** trace_printk() being used. Allocating extra memory. **
** **
** This means that this is a DEBUG kernel and it is **
** unsafe for produciton use. **
** **
** If you see this message and you are not debugging **
** the kernel, report this immediately to your vendor! **
** **
** NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE **
**********************************************************
That should hopefully keep developers from trying to sneak in a
trace_printk() or two.
Link: http://lkml.kernel.org/p/20140528131440.2283213c@gandalf.local.home
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|
Commit 5f5c9ae56c38942623f69c3e6dc6ec78e4da2076
"serial_core: Unregister console in uart_remove_one_port()"
fixed a crash where a serial port was removed but
not deregistered as a console.
There is a side effect of that commit for platforms having serial consoles
and of_serial configured (CONFIG_SERIAL_OF_PLATFORM). The serial console
is disabled midway through the boot process.
This cessation of the serial console affects PowerPC computers
such as the MVME5100 and SAM440EP.
The sequence is:
bootconsole [udbg0] enabled
....
serial8250/16550 driver initialises and registers its UARTS,
one of these is the serial console.
console [ttyS0] enabled
....
of_serial probes "platform" devices, registering them as it goes.
One of these is the serial console.
console [ttyS0] disabled.
The disabling of the serial console is due to:
a. unregister_console in printk not clearing the
CONS_ENABLED bit in the console flags,
even though it has announced that the console is disabled; and
b. of_platform_serial_probe in of_serial not setting the port type
before it registers with serial8250_register_8250_port.
This patch ensures that the serial console is re-enabled when of_serial
registers a serial port that corresponds to the designated console.
Signed-off-by: Stephen Chivers <schivers@csc.com>
Tested-by: Stephen Chivers <schivers@csc.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [unregister_console]
Cc: stable <stable@vger.kernel.org> # 3.15
===
The above failure was identified in Linux-3.15-rc2.
Tested using MVME5100 and SAM440EP PowerPC computers with
kernels built from Linux-3.15-rc5 and tty-next.
The continued operation of the serial console is vital for computers
such as the MVME5100 as that Single Board Computer does not
have any grapical/display hardware.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The current deadlock detection logic does not work reliably due to the
following early exit path:
/*
* Drop out, when the task has no waiters. Note,
* top_waiter can be NULL, when we are in the deboosting
* mode!
*/
if (top_waiter && (!task_has_pi_waiters(task) ||
top_waiter != task_top_pi_waiter(task)))
goto out_unlock_pi;
So this not only exits when the task has no waiters, it also exits
unconditionally when the current waiter is not the top priority waiter
of the task.
So in a nested locking scenario, it might abort the lock chain walk
and therefor miss a potential deadlock.
Simple fix: Continue the chain walk, when deadlock detection is
enabled.
We also avoid the whole enqueue, if we detect the deadlock right away
(A-A). It's an optimization, but also prevents that another waiter who
comes in after the detection and before the task has undone the damage
observes the situation and detects the deadlock and returns
-EDEADLOCK, which is wrong as the other task is not in a deadlock
situation.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20140522031949.725272460@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Pull two powerpc fixes from Ben Herrenschmidt:
"Here's a pair of powerpc fixes for 3.15 which are also going to
stable.
One's a fix for building with newer binutils (the problem currently
only affects the BookE kernels but the affected macro might come back
into use on BookS platforms at any time). Unfortunately, the binutils
maintainer did a backward incompatible change to a construct that we
use so we have to add Makefile check.
The other one is a fix for CPUs getting stuck in kexec when running
single threaded. Since we routinely use kexec on power (including in
our newer bootloaders), I deemed that important enough"
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc, kexec: Fix "Processor X is stuck" issue during kexec from ST mode
powerpc: Fix 64 bit builds with binutils 2.24
|
|
This commit did an incorrect printk->pr_info conversion. If we were
converting to pr_info() we should lose the log_level parameter. The problem is
that this is called (indirectly) by show_regs_print_info(), which is called
with various log_levels (from _INFO clear to _EMERG). So we leave it as
a printk() call so the desired log_level is applied.
Not a full revert, as the other half of the patch is correct.
Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
If we try to perform a kexec when the machine is in ST (Single-Threaded) mode
(ppc64_cpu --smt=off), the kexec operation doesn't succeed properly, and we
get the following messages during boot:
[ 0.089866] POWER8 performance monitor hardware support registered
[ 0.089985] power8-pmu: PMAO restore workaround active.
[ 5.095419] Processor 1 is stuck.
[ 10.097933] Processor 2 is stuck.
[ 15.100480] Processor 3 is stuck.
[ 20.102982] Processor 4 is stuck.
[ 25.105489] Processor 5 is stuck.
[ 30.108005] Processor 6 is stuck.
[ 35.110518] Processor 7 is stuck.
[ 40.113369] Processor 9 is stuck.
[ 45.115879] Processor 10 is stuck.
[ 50.118389] Processor 11 is stuck.
[ 55.120904] Processor 12 is stuck.
[ 60.123425] Processor 13 is stuck.
[ 65.125970] Processor 14 is stuck.
[ 70.128495] Processor 15 is stuck.
[ 75.131316] Processor 17 is stuck.
Note that only the sibling threads are stuck, while the primary threads (0, 8,
16 etc) boot just fine. Looking closer at the previous step of kexec, we observe
that kexec tries to wakeup (bring online) the sibling threads of all the cores,
before performing kexec:
[ 9464.131231] Starting new kernel
[ 9464.148507] kexec: Waking offline cpu 1.
[ 9464.148552] kexec: Waking offline cpu 2.
[ 9464.148600] kexec: Waking offline cpu 3.
[ 9464.148636] kexec: Waking offline cpu 4.
[ 9464.148671] kexec: Waking offline cpu 5.
[ 9464.148708] kexec: Waking offline cpu 6.
[ 9464.148743] kexec: Waking offline cpu 7.
[ 9464.148779] kexec: Waking offline cpu 9.
[ 9464.148815] kexec: Waking offline cpu 10.
[ 9464.148851] kexec: Waking offline cpu 11.
[ 9464.148887] kexec: Waking offline cpu 12.
[ 9464.148922] kexec: Waking offline cpu 13.
[ 9464.148958] kexec: Waking offline cpu 14.
[ 9464.148994] kexec: Waking offline cpu 15.
[ 9464.149030] kexec: Waking offline cpu 17.
Instrumenting this piece of code revealed that the cpu_up() operation actually
fails with -EBUSY. Thus, only the primary threads of all the cores are online
during kexec, and hence this is a sure-shot receipe for disaster, as explained
in commit e8e5c2155b (powerpc/kexec: Fix orphaned offline CPUs across kexec),
as well as in the comment above wake_offline_cpus().
It turns out that cpu_up() was returning -EBUSY because the variable
'cpu_hotplug_disabled' was set to 1; and this disabling of CPU hotplug was done
by migrate_to_reboot_cpu() inside kernel_kexec().
Now, migrate_to_reboot_cpu() was originally written with the assumption that
any further code will not need to perform CPU hotplug, since we are anyway in
the reboot path. However, kexec is clearly not such a case, since we depend on
onlining CPUs, atleast on powerpc.
So re-enable cpu-hotplug after returning from migrate_to_reboot_cpu() in the
kexec path, to fix this regression in kexec on powerpc.
Also, wrap the cpu_up() in powerpc kexec code within a WARN_ON(), so that we
can catch such issues more easily in the future.
Fixes: c97102ba963 (kexec: migrate to reboot cpu)
Cc: stable@vger.kernel.org
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
There is still one residue of sysfs remaining: the sb_magic
SYSFS_MAGIC. However this should be kernfs user specific,
so this patch moves it out. Kerrnfs user should specify their
magic number while mouting.
Signed-off-by: Jianyu Zhan <nasa4836@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Jiri Kosina <trivial@kernel.org>
Link: http://lkml.kernel.org/r/1401178092-1228-3-git-send-email-jiang.liu@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|