summaryrefslogtreecommitdiffstats
path: root/include
AgeCommit message (Collapse)Author
2012-03-27sched: Fix select_fallback_rq() vs cpu_active/cpu_onlinePeter Zijlstra
Commit 5fbd036b55 ("sched: Cleanup cpu_active madness"), which was supposed to finally sort the cpu_active mess, instead uncovered more. Since CPU_STARTING is ran before setting the cpu online, there's a (small) window where the cpu has active,!online. If during this time there's a wakeup of a task that used to reside on that cpu select_task_rq() will use select_fallback_rq() to compute an alternative cpu to run on since we find !online. select_fallback_rq() however will compute the new cpu against cpu_active, this means that it can return the same cpu it started out with, the !online one, since that cpu is in fact marked active. This results in us trying to scheduling a task on an offline cpu and triggering a WARN in the IPI code. The solution proposed by Chuansheng Liu of setting cpu_active in set_cpu_online() is buggy, firstly not all archs actually use set_cpu_online(), secondly, not all archs call set_cpu_online() with IRQs disabled, this means we would introduce either the same race or the race from fd8a7de17 ("x86: cpu-hotplug: Prevent softirq wakeup on wrong CPU") -- albeit much narrower. [ By setting online first and active later we have a window of online,!active, fresh and bound kthreads have task_cpu() of 0 and since cpu0 isn't in tsk_cpus_allowed() we end up in select_fallback_rq() which excludes !active, resulting in a reset of ->cpus_allowed and the thread running all over the place. ] The solution is to re-work select_fallback_rq() to require active _and_ online. This makes the active,!online case work as expected, OTOH archs running CPU_STARTING after setting online are now vulnerable to the issue from fd8a7de17 -- these are alpha and blackfin. Reported-by: Chuansheng Liu <chuansheng.liu@intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Frysinger <vapier@gentoo.org> Cc: linux-alpha@vger.kernel.org Link: http://lkml.kernel.org/n/tip-hubqk1i10o4dpvlm06gq7v6j@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-03-13sched/x86: Fix overflow in cyc2ns_offsetSalman Qazi
When a machine boots up, the TSC generally gets reset. However, when kexec is used to boot into a kernel, the TSC value would be carried over from the previous kernel. The computation of cycns_offset in set_cyc2ns_scale is prone to an overflow, if the machine has been up more than 208 days prior to the kexec. The overflow happens when we multiply *scale, even though there is enough room to store the final answer. We fix this issue by decomposing tsc_now into the quotient and remainder of division by CYC2NS_SCALE_FACTOR and then performing the multiplication separately on the two components. Refactor code to share the calculation with the previous fix in __cycles_2_ns(). Signed-off-by: Salman Qazi <sqazi@google.com> Acked-by: John Stultz <john.stultz@linaro.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Turner <pjt@google.com> Cc: john stultz <johnstul@us.ibm.com> Link: http://lkml.kernel.org/r/20120310004027.19291.88460.stgit@dungbeetle.mtv.corp.google.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-03-13Merge tag 'v3.3-rc7' into sched/coreIngo Molnar
Merge reason: merge back final fixes, prepare for the merge window. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-03-12printk/sched: Introduce special printk_sched() for those awkward momentsPeter Zijlstra
There's a few awkward printk()s inside of scheduler guts that people prefer to keep but really are rather deadlock prone. Fudge around it by storing the text in a per-cpu buffer and poll it using the existing printk_tick() handler. This will drop output when its more frequent than once a tick, however only the affinity thing could possible go that fast and for that just one should suffice to notify the admin he's done something silly.. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/n/tip-wua3lmkt3dg8nfts66o6brne@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-03-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking from David Miller: 1) IPV4 routing metrics can become stale when routes are changed by the administrator, fix from Steffen Klassert. 2) atl1c does "val |= XXX;" where XXX is a bit number not a bit mask, fix by using set_bit. From Dan Carpenter. 3) Memory accounting bug in carl9170 driver results in wedged TX queue. Fix from Nicolas Cavallari. 4) iwlwifi accidently uses "sizeof(ptr)" instead of "sizeof(*ptr)", fix from Johannes Berg. 5) Openvswitch doesn't honor dp_ifindex when doing vport lookups, fix from Ben Pfaff. 6) ehea conversion to 64-bit stats lost multicast and rx_errors accounting, fix from Eric Dumazet. 7) Bridge state transition logging in br_stp_disable_port() is busted, it's emitted at the wrong time and the message is in the wrong tense, fix from Paulius Zaleckas. 8) mlx4 device erroneously invokes the queue resize firmware operation twice, fix from Jack Morgenstein. 9) Fix deadlock in usbnet, need to drop lock when invoking usb_unlink_urb() otherwise we recurse into taking it again. Fix from Sebastian Siewior. 10) hyperv network driver uses the wrong driver name string, fix from Haiyang Zhang. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: net/hyperv: Use the built-in macro KBUILD_MODNAME for this driver net/usbnet: avoid recursive locking in usbnet_stop() route: Remove redirect_genid inetpeer: Invalidate the inetpeer tree along with the routing cache mlx4_core: fix bug in modify_cq wrapper for resize flow. atl1c: set ATL1C_WORK_EVENT_RESET bit correctly bridge: fix state reporting when port is disabled bridge: br_log_state() s/entering/entered/ ehea: restore multicast and rx_errors fields openvswitch: Fix checksum update for actions on UDP packets. openvswitch: Honor dp_ifindex, when specified, for vport lookup by name. iwlwifi: fix wowlan suspend mwifiex: reset encryption mode flag before association carl9170: fix frame delivery if sta is in powersave mode carl9170: Fix memory accounting when sta is in power-save mode.
2012-03-08Merge tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux-2.6Linus Torvalds
Pull minor devicetree bug fixes and documentation updates from Grant Likely: "Fixes up a duplicate #include, adds an empty implementation of of_find_compatible_node() and make git ignore .dtb files. And fix up bus name on OF described PHYs. Nothing exciting here." * tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux-2.6: doc: dt: Fix broken reference in gpio-leds documentation of/mdio: fix fixed link bus name of/fdt.c: asm/setup.h included twice of: add picochip vendor prefix dt: add empty of_find_compatible_node function ARM: devicetree: Add .dtb files to arch/arm/boot/.gitignore
2012-03-08route: Remove redirect_genidSteffen Klassert
As we invalidate the inetpeer tree along with the routing cache now, we don't need a genid to reset the redirect handling when the routing cache is flushed. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-08inetpeer: Invalidate the inetpeer tree along with the routing cacheSteffen Klassert
We initialize the routing metrics with the values cached on the inetpeer in rt_init_metrics(). So if we have the metrics cached on the inetpeer, we ignore the user configured fib_metrics. To fix this issue, we replace the old tree with a fresh initialized inet_peer_base. The old tree is removed later with a delayed work queue. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-07Merge branch 'fixes' of git://git.linaro.org/people/rmk/linux-armLinus Torvalds
Pull ARM updates from Russell King. * 'fixes' of git://git.linaro.org/people/rmk/linux-arm: ARM: 7358/1: perf: add PMU hotplug notifier ARM: 7357/1: perf: fix overflow handling for xscale2 PMUs ARM: 7356/1: perf: check that we have an event in the PMU IRQ handlers ARM: 7355/1: perf: clear overflow flag when disabling counter on ARMv7 PMU ARM: 7354/1: perf: limit sample_period to half max_period in non-sampling mode ARM: ecard: ensure fake vma vm_flags is setup ARM: 7346/1: errata: fix PL310 erratum #753970 workaround selection ARM: 7345/1: errata: update workaround for A9 erratum #743622 ARM: 7348/1: arm/spear600: fix one-shot timer ARM: 7339/1: amba/serial.h: Include types.h for resolving dependency of type bool
2012-03-05Merge branch 'akpm' (Andrew's patch bomb)Linus Torvalds
Merge the emailed seties of 19 patches from Andrew Morton * akpm: rapidio/tsi721: fix queue wrapping bug in inbound doorbell handler memcg: fix mapcount check in move charge code for anonymous page mm: thp: fix BUG on mm->nr_ptes alpha: fix 32/64-bit bug in futex support memcg: fix GPF when cgroup removal races with last exit debugobjects: Fix selftest for static warnings floppy/scsi: fix setting of BIO flags memcg: fix deadlock by inverting lrucare nesting drivers/rtc/rtc-r9701.c: fix crash in r9701_remove() c2port: class_create() returns an ERR_PTR pps: class_create() returns an ERR_PTR, not NULL hung_task: fix the broken rcu_lock_break() logic vfork: kill PF_STARTING coredump_wait: don't call complete_vfork_done() vfork: make it killable vfork: introduce complete_vfork_done() aio: wake up waiters when freeing unused kiocbs kprobes: return proper error code from register_kprobe() kmsg_dump: don't run on non-error paths by default
2012-03-05memcg: fix GPF when cgroup removal races with last exitHugh Dickins
When moving tasks from old memcg (with move_charge_at_immigrate on new memcg), followed by removal of old memcg, hit General Protection Fault in mem_cgroup_lru_del_list() (called from release_pages called from free_pages_and_swap_cache from tlb_flush_mmu from tlb_finish_mmu from exit_mmap from mmput from exit_mm from do_exit). Somewhat reproducible, takes a few hours: the old struct mem_cgroup has been freed and poisoned by SLAB_DEBUG, but mem_cgroup_lru_del_list() is still trying to update its stats, and take page off lru before freeing. A task, or a charge, or a page on lru: each secures a memcg against removal. In this case, the last task has been moved out of the old memcg, and it is exiting: anonymous pages are uncharged one by one from the memcg, as they are zapped from its pagetables, so the charge gets down to 0; but the pages themselves are queued in an mmu_gather for freeing. Most of those pages will be on lru (and force_empty is careful to lru_add_drain_all, to add pages from pagevec to lru first), but not necessarily all: perhaps some have been isolated for page reclaim, perhaps some isolated for other reasons. So, force_empty may find no task, no charge and no page on lru, and let the removal proceed. There would still be no problem if these pages were immediately freed; but typically (and the put_page_testzero protocol demands it) they have to be added back to lru before they are found freeable, then removed from lru and freed. We don't see the issue when adding, because the mem_cgroup_iter() loops keep their own reference to the memcg being scanned; but when it comes to mem_cgroup_lru_del_list(). I believe this was not an issue in v3.2: there, PageCgroupAcctLRU and PageCgroupUsed flags were used (like a trick with mirrors) to deflect view of pc->mem_cgroup to the stable root_mem_cgroup when neither set. 38c5d72f3ebe ("memcg: simplify LRU handling by new rule") mercifully removed those convolutions, but left this General Protection Fault. But it's surprisingly easy to restore the old behaviour: just check PageCgroupUsed in mem_cgroup_lru_add_list() (which decides on which lruvec to add), and reset pc to root_mem_cgroup if page is uncharged. A risky change? just going back to how it worked before; testing, and an audit of uses of pc->mem_cgroup, show no problem. And there's a nice bonus: with mem_cgroup_lru_add_list() itself making sure that an uncharged page goes to root lru, mem_cgroup_reset_owner() no longer has any purpose, and we can safely revert 4e5f01c2b9b9 ("memcg: clear pc->mem_cgroup if necessary"). Calling update_page_reclaim_stat() after add_page_to_lru_list() in swap.c is not strictly necessary: the lru_lock there, with RCU before memcg structures are freed, makes mem_cgroup_get_reclaim_stat_from_page safe without that; but it seems cleaner to rely on one dependency less. Signed-off-by: Hugh Dickins <hughd@google.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Konstantin Khlebnikov <khlebnikov@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-05vfork: kill PF_STARTINGOleg Nesterov
Previously it was (ab)used by utrace. Then it was wrongly used by the scheduler code. Currently it is not used, kill it before it finds the new erroneous user. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-05coredump_wait: don't call complete_vfork_done()Oleg Nesterov
Now that CLONE_VFORK is killable, coredump_wait() no longer needs complete_vfork_done(). zap_threads() should find and kill all tasks with the same ->mm, this includes our parent if ->vfork_done is set. mm_release() becomes the only caller, unexport complete_vfork_done(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-05vfork: make it killableOleg Nesterov
Make vfork() killable. Change do_fork(CLONE_VFORK) to do wait_for_completion_killable(). If it fails we do not return to the user-mode and never touch the memory shared with our child. However, in this case we should clear child->vfork_done before return, we use task_lock() in do_fork()->wait_for_vfork_done() and complete_vfork_done() to serialize with each other. Note: now that we use task_lock() we don't really need completion, we could turn task->vfork_done into "task_struct *wake_up_me" but this needs some complications. NOTE: this and the next patches do not affect in-kernel users of CLONE_VFORK, kernel threads run with all signals ignored including SIGKILL/SIGSTOP. However this is obviously the user-visible change. Not only a fatal signal can kill the vforking parent, a sub-thread can do execve or exit_group() and kill the thread sleeping in vfork(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-05vfork: introduce complete_vfork_done()Oleg Nesterov
No functional changes. Move the clear-and-complete-vfork_done code into the new trivial helper, complete_vfork_done(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-05kmsg_dump: don't run on non-error paths by defaultMatthew Garrett
Since commit 04c6862c055f ("kmsg_dump: add kmsg_dump() calls to the reboot, halt, poweroff and emergency_restart paths"), kmsg_dump() gets run on normal paths including poweroff and reboot. This is less than ideal given pstore implementations that can only represent single backtraces, since a reboot may overwrite a stored oops before it's been picked up by userspace. In addition, some pstore backends may have low performance and provide a significant delay in reboot as a result. This patch adds a printk.always_kmsg_dump kernel parameter (which can also be changed from userspace). Without it, the code will only be run on failure paths rather than on normal paths. The option can be enabled in environments where there's a desire to attempt to audit whether or not a reboot was cleanly requested or not. Signed-off-by: Matthew Garrett <mjg@redhat.com> Acked-by: Seiji Aguchi <seiji.aguchi@hds.com> Cc: Seiji Aguchi <seiji.aguchi@hds.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Marco Stornelli <marco.stornelli@gmail.com> Cc: Artem Bityutskiy <Artem.Bityutskiy@nokia.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Don Zickus <dzickus@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) TCP SACK processing can calculate an incorrect reordering value in some cases, fix from Neal Cardwell. 2) tcp_mark_head_lost() can split SKBs in situations where it should not, violating send queue invariants expected by other pieces of code and thus resulting (eventually) in corrupted retransmit state counters. Also from Neal Cardwell. 3) qla3xxx erroneously calls spin_lock_irqrestore() with constant hw_flags of zero. Fix from Santosh Nayak. 4) Fix NULL deref in rt2x00, from Gabor Juhos. 5) pch_gbe passes address of wrong typed object to pch_gbe_validate_option thus corrupting part of the value. From Dan Carpenter. 6) We must check the return value of nlmsg_parse() before trying to use the results. From Eric Dumazet. 7) Bridging code fails to check return value of ipv6_dev_get_saddr() thus potentially leaving uninitialized garbage in the outgoing ipv6 header. From Ulrich Weber. 8) Due to rounding and a reversed operation on jiffies, bridge message ages can go backwards instead of forwards, thus breaking STP. Fixes from Joakim Tjernlund. 9) r8169 modifies Config* registers without properly holding the Config9346 lock, resulting in corrupted IP fragments on some chips. Fix from Francois Romieu. 10) NET_PACKET_ENGINE default wan't set properly during the network driver mega-move. Fix from Stephen Hemminger. 11) vmxnet3 uses TCP header size where it actually should use the UDP header size, fix from Shreyas Bhatewara. 12) Netfilter bridge module autoload is busted in the compat case, fix from Florian Westphal. 13) Wireless Key removal was not setting multicast bits correctly thus accidently killing the unicast key 0 and thus all traffic stops. Fix from Johannes Berg. 14) Fix endless retries of A-MPDU transmissions in brcm80211 driver. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (22 commits) qla3xxx: ethernet: Fix bogus interrupt state flag. bridge: check return value of ipv6_dev_get_saddr() rtnetlink: fix rtnl_calcit() and rtnl_dump_ifinfo() bridge: message age needs to increase, not decrease. bridge: Adjust min age inc for HZ > 256 tcp: don't fragment SACKed skbs in tcp_mark_head_lost() r8169: corrupted IP fragments fix for large mtu. packetengines: fix config default vmxnet3: Fix transport header size enic: fix an endian bug in enic_probe() pch_gbe: memory corruption calling pch_gbe_validate_option() tg3: Fix tg3_get_stats64 for 5700 / 5701 devs tcp: fix false reordering signal in tcp_shifted_skb tcp: fix comment for tp->highest_sack netfilter: bridge: fix module autoload in compat case brcm80211: smac: only print block-ack timeout message at trace level brcm80211: smac: fix endless retry of A-MPDU transmissions mac80211: Fix a warning on changing to monitor mode from STA mac80211: zero initialize count field in ieee80211_tx_rate iwlwifi: fix key removal ...
2012-03-05Merge branch 'for-3.3-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu Pull per-cpu patches from Tejun Heo: "This pull request contains four patches. One replaces manual clearing with bitmap_clear(), two fix generic definition of __this_cpu ops so that they don't choose unnecessarily strict arch version. One makes _this_cpu definition use raw_local_irq_*() so that it doesn't end up wrecking irq on/off state tracking when used from inside lockdep. Of the four patches, the raw_local_irq_*() update is the most important, so please feel free to cherry pick only that one patch and ignore the rest if you want to - commit e920d5971d 'percpu: use raw_local_irq_* in _this_cpu op'." * 'for-3.3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: percpu: fix __this_cpu_{sub,inc,dec}_return() definition percpu: use raw_local_irq_* in _this_cpu op percpu: fix generic definition of __this_cpu_add_and_return() percpu: use bitmap_clear
2012-03-04vfs: move dentry_cmp from <linux/dcache.h> to fs/dcache.cLinus Torvalds
It's only used inside fs/dcache.c, and we're going to play games with it for the word-at-a-time patches. This time we really don't even want to export it, because it really is an internal function to fs/dcache.c, and has been since it was introduced. Having it in that extremely hot header file (it's included in pretty much everything, thanks to <linux/fs.h>) is a disaster for testing different versions, and is utterly pointless. We really should have some kind of header file diet thing, where we figure out which parts of header files are really better off private and only result in more expensive compiles. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-04percpu: fix __this_cpu_{sub,inc,dec}_return() definitionKonstantin Khlebnikov
This patch adds missed "__" prefixes, otherwise these functions works as irq/preemption safe. Reported-by: Torsten Kaiser <just.for.lkml@googlemail.com> Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Signed-off-by: Tejun Heo <tj@kernel.org>
2012-03-03Merge tag 'parisc-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/parisc-2.6 PARISC fixes from James Bottomley: "This is a set of build fixes to get the cross compiled architecture testbeds building again" * tag 'parisc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/parisc-2.6: [PARISC] don't unconditionally override CROSS_COMPILE for 64 bit. [PARISC] include <linux/prefetch.h> in drivers/parisc/iommu-helpers.h [PARISC] fix compile break caused by iomap: make IOPORT/PCI mapping functions conditional
2012-03-02vfs: clarify and clean up dentry_cmp()Linus Torvalds
It did some odd things for unclear reasons. As this is one of the functions that gets changed when doing word-at-a-time compares, this is yet another of the "don't change any semantics, but clean things up so that subsequent patches don't get obscured by the cleanups". Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-02vfs: uninline full_name_hash()Linus Torvalds
.. and also use it in lookup_one_len() rather than open-coding it. There aren't any performance-critical users, so inlining it is silly. But it wouldn't matter if it wasn't for the fact that the word-at-a-time dentry name patches want to conditionally replace the function, and uninlining it sets the stage for that. So again, this is a preparatory patch that doesn't change any semantics, and only prepares for a much cleaner and testable word-at-a-time dentry name accessor patch. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-02vfs: trivial __d_lookup_rcu() cleanupsLinus Torvalds
These don't change any semantics, but they clean up the code a bit and mark some arguments appropriately 'const'. They came up as I was doing the word-at-a-time dcache name accessor code, and cleaning this up now allows me to send out a smaller relevant interesting patch for the experimental stuff. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-02regset: Return -EFAULT, not -EIO, on host-side memory faultH. Peter Anvin
There is only one error code to return for a bad user-space buffer pointer passed to a system call in the same address space as the system call is executed, and that is EFAULT. Furthermore, the low-level access routines, which catch most of the faults, return EFAULT already. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Roland McGrath <roland@hack.frob.com> Cc: <stable@vger.kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-02regset: Prevent null pointer reference on readonly regsetsH. Peter Anvin
The regset common infrastructure assumed that regsets would always have .get and .set methods, but not necessarily .active methods. Unfortunately people have since written regsets without .set methods. Rather than putting in stub functions everywhere, handle regsets with null .get or .set methods explicitly. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Roland McGrath <roland@hack.frob.com> Cc: <stable@vger.kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-02sched: Clean up parameter passing of proc_sched_autogroup_set_nice()Hiroshi Shimamoto
Pass nice as a value to proc_sched_autogroup_set_nice(). No side effect is expected, and the variable err will be overwritten with the return value. Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/4F45FBB7.5090607@ct.jp.nec.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-03-01Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds
DRM fixes from Dave Airlie: intel: fixes for output regression on 965GM, an oops and a machine hang radeon: uninitialised var (that gcc didn't warn about for some reason) + a couple of correctness fixes. exynos: fixes for various things, drop some chunks of unused code. * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: drm/radeon/kms/vm: fix possible bug in radeon_vm_bo_rmv() drm/radeon: fix uninitialized variable drm/radeon/kms: fix radeon_dp_get_modes for LVDS bridges (v2) drm/i915: Remove use of the autoreported ringbuffer HEAD position drm/i915: Prevent a machine hang by checking crtc->active before loading lut drm/i915: fix operator precedence when enabling RC6p drm/i915: fix a sprite watermark computation to avoid divide by zero if xpos<0 drm/i915: fix mode set on load pipe. (v2) drm/exynos: exynos_drm.h header file fixes drm/exynos: added panel physical size. drm/exynos: added postclose to release resource. drm/exynos: removed exynos_drm_fbdev_recreate function. drm/exynos: fixed page flip issue. drm/exynos: added possible_clones setup function. drm/exynos: removed pageflip_event_list init code when closed. drm/exynos: changed priority of mixer layers. drm/exynos: Fix typo in exynos_mixer.c
2012-03-01sched/rt: Do not submit new work when PI-blockedThomas Gleixner
When we are PI-blocked then we want to get things done ASAP. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-vw8et3445km5b8mpihf4trae@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-03-01sched/wait: Add __wake_up_all_locked() APIThomas Gleixner
For code which protects the waitqueue itself with another lock it makes no sense to acquire the waitqueue lock for wakeup all. Provide __wake_up_all_locked(). This is an optimization on the vanilla kernel (to be used by the PCI code) and an important semantic distinction on -rt. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-ux6m4b8jonb9inx8xafh77ds@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-03-01sched/rt: Document scheduler related skip-resched-check sitesThomas Gleixner
Create a distinction between scheduler related preempt_enable_no_resched() calls and the nearly one hundred other places in the kernel that do not want to reschedule, for one reason or another. This distinction matters for -rt, where the scheduler and the non-scheduler preempt models (and checks) are different. For upstream it's purely documentational. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/n/tip-gs88fvx2mdv5psnzxnv575ke@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-03-01sched/rt: Add schedule_preempt_disabled()Thomas Gleixner
Add helper to get rid of the ever repeating: preempt_enable_no_resched(); schedule(); preempt_disable(); patterns. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-wxx7btox7coby6ifv5vzhzgp@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-03-01Merge branch 'linus' into sched/coreIngo Molnar
Merge reason: we'll queue up dependent patches. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-02-29Merge branch 'exynos-drm-fixes' of ↵Dave Airlie
git://git.infradead.org/users/kmpark/linux-2.6-samsung into HEAD * 'exynos-drm-fixes' of git://git.infradead.org/users/kmpark/linux-2.6-samsung: drm/exynos: exynos_drm.h header file fixes drm/exynos: added panel physical size. drm/exynos: added postclose to release resource. drm/exynos: removed exynos_drm_fbdev_recreate function. drm/exynos: fixed page flip issue. drm/exynos: added possible_clones setup function. drm/exynos: removed pageflip_event_list init code when closed. drm/exynos: changed priority of mixer layers. drm/exynos: Fix typo in exynos_mixer.c
2012-02-28tcp: fix comment for tp->highest_sackNeal Cardwell
There was an off-by-one error in the comments describing the highest_sack field in struct tcp_sock. The comments previously claimed that it was the "start sequence of the highest skb with SACKed bit". This commit fixes the comments to note that it is the "start sequence of the skb just *after* the highest skb with SACKed bit". Signed-off-by: Neal Cardwell <ncardwell@google.com> Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-27Merge branch 'fixes-for-grant' of git://sources.calxeda.com/kernel/linux ↵Grant Likely
into devicetree/merge
2012-02-27Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/events: Revert trace_sched_stat_sleeptime()
2012-02-27[PARISC] fix compile break caused by iomap: make IOPORT/PCI mapping ↵James Bottomley
functions conditional The problem in commit fea80311a939a746533a6d7e7c3183729d6a3faf Author: Randy Dunlap <rdunlap@xenotime.net> Date: Sun Jul 24 11:39:14 2011 -0700 iomap: make IOPORT/PCI mapping functions conditional is that if your architecture supplies pci_iomap/pci_iounmap, it expects always to supply them. Adding empty body defitions in the !CONFIG_PCI case, which is what this patch does, breaks the parisc compile because the functions become doubly defined. It took us a while to spot this, because we don't actually build !CONFIG_PCI very often (only if someone is brave enough to test the snake/asp machines). Since the note in the commit log says this is to fix a CONFIG_GENERIC_IOMAP issue (which it does because CONFIG_GENERIC_IOMAP supplies pci_iounmap only if CONFIG_PCI is set), there should actually have been a condition upon this. This should make sure no other architecture's !CONFIG_PCI compile breaks in the same way as parisc. The fix had to be updated to take account of the GENERIC_PCI_IOMAP separation. Reported-by: Rolf Eike Beer <eike@sf-mail.de> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
1) ICMP sockets leave err uninitialized but we try to return it for the unsupported MSG_OOB case, reported by Dave Jones. 2) Add new Zaurus device ID entries, from Dave Jones. 3) Pointer calculation in hso driver memset is wrong, from Dan Carpenter. 4) ks8851_probe() checks unsigned value as negative, fix also from Dan Carpenter. 5) Fix crashes in atl1c driver due to TX queue handling, from Eric Dumazet. I anticipate some TX side locking fixes coming in the near future for this driver as well. 6) The inline directive fix in Bluetooth which was breaking the build only with very new versions of GCC, from Johan Hedberg. 7) Fix crashes in the ATP CLIP code due to ARP cleanups this merge window, reported by Meelis Roos and fixed by Eric Dumazet. 8) JME driver doesn't flush RX FIFO correctly, from Guo-Fu Tseng. 9) Some ip6_route_output() callers test the return value for NULL, but this never happens as the convention is to return a dst entry with dst->error set. Fixes from RonQing Li. 10) Logitech Harmony 900 should be handled by zaurus driver not cdc_ether, update white lists and black lists accordingly. From Scott Talbert. 11) Receiving from certain kinds of devices there won't be a MAC header, so there is no MAC header to fixup in the IPSEC code, and if we try to do it we'll crash. Fix from Eric Dumazet. 12) Port type array indexing off-by-one in mlx4 driver, fix from Yevgeny Petrilin. 13) Fix regression in link-down handling in davinci_emac which causes all RX descriptors to be freed up and therefore RX to wedge completely, from Christian Riesch. 14) It took two attempts, but ctnetlink soft lockups seem to be cured now, from Pablo Neira Ayuso. 15) Endianness bug fix in ENIC driver, from Santosh Nayak. 16) The long ago conversion of the PPP fragmentation code over to abstracted SKB list handling wasn't perfect, once we get an out of sequence SKB we don't flush the rest of them like we should. From Ben McKeegan. 17) Fix regression of ->ip_summed initialization in sfc driver. From Ben Hutchings. 18) Bluetooth timeout mistakenly using msecs instead of jiffies, from Andrzej Kaczmarek. 19) Using _sync variant of work cancellation results in deadlocks, use the non _sync variants instead. From Andre Guedes. 20) Bluetooth rfcomm code had reference counting problems leading to crashes, fix from Octavian Purdila. 21) The conversion of netem over to classful qdisc handling added two bugs to netem_dequeue(), fixes from Eric Dumazet. 22) Missing pci_iounmap() in ATM Solos driver. Fix from Julia Lawall. 23) b44_pci_exit() should not have __exit tag since it's invoked from non-__exit code. From Nikola Pajkovsky. 24) The conversion of the neighbour hash tables over to RCU added a race, fixed here by adding the necessary reread of tbl->nht, fix from Michel Machado. 25) When we added VF (virtual function) attributes for network device dumps, this potentially bloats up the size of the dump of one network device such that the dump size is too large for the buffer allocated by properly written netlink applications. In particular, if you add 255 VFs to a network device, parts of GLIBC stop working. To fix this, we add an attribute that is used to turn on these extended portions of the network device dump. Sophisticaed applications like 'ip' that want to see this stuff will be changed to set the attribute, whereas things like GLIBC that don't care about VFs simply will not, and therefore won't be busted by the mere presence of VFs on a network device. Thanks to the tireless work of Greg Rose on this fix. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (53 commits) sfc: Fix assignment of ip_summed for pre-allocated skbs ppp: fix 'ppp_mp_reconstruct bad seq' errors enic: Fix endianness bug. gre: fix spelling in comments netfilter: ctnetlink: fix soft lockup when netlink adds new entries (v2) Revert "netfilter: ctnetlink: fix soft lockup when netlink adds new entries" davinci_emac: Do not free all rx dma descriptors during init mlx4_core: Fixing array indexes when setting port types phy: IC+101G and PHY_HAS_INTERRUPT flag netdev/phy/icplus: Correct broken phy_init code ipsec: be careful of non existing mac headers Move Logitech Harmony 900 from cdc_ether to zaurus hso: memsetting wrong data in hso_get_count() netfilter: ip6_route_output() never returns NULL. ethernet/broadcom: ip6_route_output() never returns NULL. ipv6: ip6_route_output() never returns NULL. jme: Fix FIFO flush issue atm: clip: remove clip_tbl ipv4: ping: Fix recvmsg MSG_OOB error handling. rtnetlink: Fix problem with buffer allocation ...
2012-02-26Fix autofs compile without CONFIG_COMPATLinus Torvalds
The autofs compat handling fix caused a compile failure when CONFIG_COMPAT isn't defined. Instead of adding random #ifdef'fery in autofs, let's just make the compat helpers earlier to use: without CONFIG_COMPAT, is_compat_task() just hardcodes to zero. We could probably do something similar for a number of other cases where we have #ifdef's in code, but this is the low-hanging fruit. Reported-and-tested-by: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-02-24Merge branch 'master' of git://1984.lsi.us.es/netDavid S. Miller
2012-02-24epoll: introduce POLLFREE to flush ->signalfd_wqh before kfree()Oleg Nesterov
This patch is intentionally incomplete to simplify the review. It ignores ep_unregister_pollwait() which plays with the same wqh. See the next change. epoll assumes that the EPOLL_CTL_ADD'ed file controls everything f_op->poll() needs. In particular it assumes that the wait queue can't go away until eventpoll_release(). This is not true in case of signalfd, the task which does EPOLL_CTL_ADD uses its ->sighand which is not connected to the file. This patch adds the special event, POLLFREE, currently only for epoll. It expects that init_poll_funcptr()'ed hook should do the necessary cleanup. Perhaps it should be defined as EPOLLFREE in eventpoll. __cleanup_sighand() is changed to do wake_up_poll(POLLFREE) if ->signalfd_wqh is not empty, we add the new signalfd_cleanup() helper. ep_poll_callback(POLLFREE) simply does list_del_init(task_list). This make this poll entry inconsistent, but we don't care. If you share epoll fd which contains our sigfd with another process you should blame yourself. signalfd is "really special". I simply do not know how we can define the "right" semantics if it used with epoll. The main problem is, epoll calls signalfd_poll() once to establish the connection with the wait queue, after that signalfd_poll(NULL) returns the different/inconsistent results depending on who does EPOLL_CTL_MOD/signalfd_read/etc. IOW: apart from sigmask, signalfd has nothing to do with the file, it works with the current thread. In short: this patch is the hack which tries to fix the symptoms. It also assumes that nobody can take tasklist_lock under epoll locks, this seems to be true. Note: - we do not have wake_up_all_poll() but wake_up_poll() is fine, poll/epoll doesn't use WQ_FLAG_EXCLUSIVE. - signalfd_cleanup() uses POLLHUP along with POLLFREE, we need a couple of simple changes in eventpoll.c to make sure it can't be "lost". Reported-by: Maxime Bizon <mbizon@freebox.fr> Cc: <stable@kernel.org> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-02-24netfilter: ctnetlink: fix soft lockup when netlink adds new entries (v2)Jozsef Kadlecsik
Marcell Zambo and Janos Farago noticed and reported that when new conntrack entries are added via netlink and the conntrack table gets full, soft lockup happens. This is because the nf_conntrack_lock is held while nf_conntrack_alloc is called, which is in turn wants to lock nf_conntrack_lock while evicting entries from the full table. The patch fixes the soft lockup with limiting the holding of the nf_conntrack_lock to the minimum, where it's absolutely required. It required to extend (and thus change) nf_conntrack_hash_insert so that it makes sure conntrack and ctnetlink do not add the same entry twice to the conntrack table. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-02-23ARM: 7339/1: amba/serial.h: Include types.h for resolving dependency of type ↵viresh kumar
bool serial.h uses bool, but its definition is missing, as it doesn't include types.h. Fix this by including types.h Signed-off-by: Viresh Kumar <viresh.kumar@st.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-02-23ipsec: be careful of non existing mac headersEric Dumazet
Niccolo Belli reported ipsec crashes in case we handle a frame without mac header (atm in his case) Before copying mac header, better make sure it is present. Bugzilla reference: https://bugzilla.kernel.org/show_bug.cgi?id=42809 Reported-by: Niccolò Belli <darkbasic@linuxsystems.it> Tested-by: Niccolò Belli <darkbasic@linuxsystems.it> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-23Merge branch 'nf' of git://1984.lsi.us.es/netDavid S. Miller
2012-02-22Merge tag 'usb-3.3-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb USB bugfixes for 3.3-rc4 A number of new device ids, and a cleanup/fix for some of the option device ids that shouldn't have been added in the first place. There's also a few USB 3 fixes for problems that people have reported, and a usb-storage bugfix to round it out. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> * tag 'usb-3.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: USB: Added Kamstrup VID/PIDs to cp210x serial driver. USB: Serial: ti_usb_3410_5052: Add Abbot Diabetes Care cable id usb-storage: fix freezing of the scanning thread xhci: Fix encoding for HS bulk/control NAK rate. USB: Set hub depth after USB3 hub reset USB: Fix handoff when BIOS disables host PCI device. USB: option: cleanup zte 3g-dongle's pid in option.c USB: Don't fail USB3 probe on missing legacy PCI IRQ. xhci: Fix oops caused by more USB2 ports than USB3 ports. USB: Remove duplicate USB 3.0 hub feature #defines.
2012-02-22Merge tag 'nfs-for-3.3-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds
Bugfixes for the NFS client. Fix a nasty Oops in the NFSv4 getacl code, another source of infinite loops in the NFSv4 state recovery code, and a regression in NFSv4.1 session initialisation. Also deal with an NFSv4.1 memory leak. * tag 'nfs-for-3.3-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFSv4: fix server_scope memory leak NFSv4.1: Fix a NFSv4.1 session initialisation regression NFSv4: Ensure we throw out bad delegation stateids on NFS4ERR_BAD_STATEID NFSv4: Fix an Oops in the NFSv4 getacl code
2012-02-22sched: Make initial SCHED_RR timeslace DEF_TIMESLICEHiroshi Shimamoto
Current the initial SCHED_RR timeslice of init_task is HZ, which means 1s, and is not same as the default SCHED_RR timeslice DEF_TIMESLICE. Change that initial timeslice to the DEF_TIMESLICE. Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com> [ s/DEF_TIMESLICE/RR_TIMESLICE/g ] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/4F3C9995.3010800@ct.jp.nec.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-02-22sched/events: Revert trace_sched_stat_sleeptime()Peter Zijlstra
Commit 1ac9bc69 ("sched/tracing: Add a new tracepoint for sleeptime") added a new sched:sched_stat_sleeptime tracepoint. It's broken: the first sample we get on a task might be bad because of a stale sleep_start value that wasn't reset at the last task switch because the tracepoint was not active. It also breaks the existing schedstat samples due to the side effects of: - se->statistics.sleep_start = 0; ... - se->statistics.block_start = 0; Nor do I see means to fix it without adding overhead to the scheduler fast path, which I'm not willing to for the sake of redundant instrumentation. Most importantly, sleep time information can already be constructed by tracing context switches and wakeups, and taking the timestamp difference between the schedule-out, the wakeup and the schedule-in. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Andrew Vagin <avagin@openvz.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/n/tip-pc4c9qhl8q6vg3bs4j6k0rbd@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>