path: root/include
Age  Commit message  Author
2007-09-20  signalfd simplification  Davide Libenzi
This simplifies the signalfd code by no longer keeping it attached to the sighand for its whole lifetime. Instead, the signalfd remains attached to the sighand only during poll(2) (and select and epoll) and read(2). This also allows removing all the custom "tsk == current" checks in kernel/signal.c, since dequeue_signal() will only be called by "current". I think this is also what Ben was suggesting some time ago. The external effect of this is that a thread can extract only its own private signals and the group ones. I think this is acceptable behaviour, in that those are the signals the thread would be able to fetch without signalfd. Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-09-19  sched: add /proc/sys/kernel/sched_compat_yield  Ingo Molnar
Add /proc/sys/kernel/sched_compat_yield to make sys_sched_yield() more aggressive, by moving the yielding task to the last position in the rbtree.
with sched_compat_yield=0:
  PID USER  PR NI VIRT RES SHR S %CPU %MEM   TIME+ COMMAND
 2539 mingo 20  0 1576 252 204 R   50  0.0 0:02.03 loop_yield
 2541 mingo 20  0 1576 244 196 R   50  0.0 0:02.05 loop
with sched_compat_yield=1:
  PID USER  PR NI VIRT RES SHR S %CPU %MEM   TIME+ COMMAND
 2584 mingo 20  0 1576 248 196 R   99  0.0 0:52.45 loop
 2582 mingo 20  0 1576 256 204 R    0  0.0 0:00.00 loop_yield
Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
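The loop_yield workload named in the entry above is not included in the commit message. A minimal sketch of such a program, assuming it simply calls sched_yield() repeatedly (the function name yield_n_times and the bounded iteration count are illustrative, not from the source):

```c
#include <sched.h>

/*
 * Call sched_yield() n times and return how many calls succeeded.
 * The original loop_yield test presumably spins forever; the bound
 * here is an assumption so that the sketch terminates.
 */
static long yield_n_times(long n)
{
	long ok = 0;
	long i;

	for (i = 0; i < n; i++) {
		if (sched_yield() == 0)
			ok++;
	}
	return ok;
}
```

Run next to a plain busy loop, a program built around this call would be the task that either shares the CPU or gives it up almost entirely, depending on the sysctl setting.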
2007-09-19  Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus  Linus Torvalds
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus: [MIPS] cpu-bugs64.c: GCC 3.3 constraint workaround [MIPS] DEC: Initialise ioasic_ssr_lock
2007-09-19  Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc  Linus Torvalds
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: [POWERPC] Fix timekeeping on PowerPC 601 [POWERPC] Don't expose clock vDSO functions when CPU has no timebase [POWERPC] spusched: Fix null pointer dereference in find_victim
2007-09-19  [MIPS] cpu-bugs64.c: GCC 3.3 constraint workaround  Maciej W. Rozycki
Add a workaround to address warnings generated on the "n" constraint by GCC 3.3 and below. Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2007-09-19  Fix NUMA Memory Policy Reference Counting  Lee Schermerhorn
This patch proposes fixes to the reference counting of memory policy in the page allocation paths and in show_numa_map(). Extracted from my "Memory Policy Cleanups and Enhancements" series as stand-alone.
Shared policy lookup [shmem] has always added a reference to the policy, but this was never unrefed after page allocation or after formatting the numa map data.
Default system policy should not require additional ref counting, nor should the current task's task policy. However, show_numa_map() calls get_vma_policy() to examine what may be [likely is] another task's policy. The latter case needs protection against freeing of the policy.
This patch adds a reference count to a mempolicy returned by get_vma_policy() when the policy is a vma policy or another task's mempolicy. Again, shared policy is already reference counted on lookup. A matching "unref" [__mpol_free()] is performed in alloc_page_vma() for shared and vma policies, and in show_numa_map() for shared and another task's mempolicy. We can call __mpol_free() directly, saving an admittedly inexpensive inline NULL test, because we know we have a non-NULL policy.
Handling policy ref counts for hugepages is a bit trickier. huge_zonelist() returns a zone list that might come from a shared or vma 'BIND policy. In this case, we should hold the reference until after the huge page allocation in dequeue_hugepage(). The patch modifies huge_zonelist() to return a pointer to the mempolicy if it needs to be unref'd after allocation.
Kernel Build [16cpu, 32GB, ia64] - average of 10 runs:
           w/o patch            w/ refcount patch
           Avg      Std Devn    Avg      Std Devn
Real:       100.59  0.38         100.63  0.43
User:      1209.60  0.37        1209.91  0.31
System:      81.52  0.42          81.64  0.34
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Acked-by: Andi Kleen <ak@suse.de> Cc: Christoph Lameter <clameter@sgi.com> Acked-by: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
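The pairing rule in the entry above (take a reference only for policies that another context could free, and drop it after use) can be sketched in plain C. Every name below (mempolicy_sketch, pol_origin, policy_get, policy_put) is hypothetical, and the counter is a plain int rather than the kernel's atomic type; this illustrates the rule and is not the patch's code.

```c
/* Where a looked-up policy came from (hypothetical classification). */
enum pol_origin {
	POL_DEFAULT,		/* system default: never freed */
	POL_CURRENT_TASK,	/* current task's own policy */
	POL_OTHER_TASK,		/* another task's policy */
	POL_VMA,		/* policy attached to a vma */
	POL_SHARED		/* shared [shmem] policy */
};

struct mempolicy_sketch {
	int refcnt;		/* not atomic: single-threaded sketch only */
	enum pol_origin origin;
};

/* Nonzero if a lookup of this policy must be paired with an unref. */
static int needs_unref(const struct mempolicy_sketch *pol)
{
	return pol->origin == POL_OTHER_TASK ||
	       pol->origin == POL_VMA ||
	       pol->origin == POL_SHARED;
}

/* Lookup side: pin the policy only when it could otherwise be freed. */
static struct mempolicy_sketch *policy_get(struct mempolicy_sketch *pol)
{
	if (needs_unref(pol))
		pol->refcnt++;
	return pol;
}

/* Matching release after page allocation or after formatting output. */
static void policy_put(struct mempolicy_sketch *pol)
{
	if (needs_unref(pol))
		pol->refcnt--;
}
```

The default and current-task cases skip the counter entirely, which matches the message's point that they should not need extra ref counting.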
2007-09-19  Fix user namespace exiting OOPs  Pavel Emelyanov
It turned out, that the user namespace is released during the do_exit() in exit_task_namespaces(), but the struct user_struct is released only during the put_task_struct(), i.e. MUCH later. On debug kernels with poisoned slabs this will cause the oops in uid_hash_remove() because the head of the chain, which resides inside the struct user_namespace, will be already freed and poisoned. Since the uid hash itself is required only when someone can search it, i.e. when the namespace is alive, we can safely unhash all the user_struct-s from it during the namespace exiting. The subsequent free_uid() will complete the user_struct destruction.
For example simple program
    #include <sched.h>
    char stack[2 * 1024 * 1024];
    int f(void *foo) { return 0; }
    int main(void)
    {
        clone(f, stack + 1 * 1024 * 1024, 0x10000000, 0);
        return 0;
    }
run on kernel with CONFIG_USER_NS turned on will oops the kernel immediately. This was spotted during OpenVZ kernel testing.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org> Acked-by: "Serge E. Hallyn" <serue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-09-19  Convert uid hash to hlist  Pavel Emelyanov
Surprisingly (as spotted by Alexey Dobriyan), the uid hash still uses list_heads, thus occupying twice as much space as it could. Convert it to hlist_heads. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
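The factor of two comes from the bucket heads: a list_head carries two pointers, an hlist_head only one. A small userspace sketch, assuming struct layouts that mirror the kernel's include/linux/list.h; the bucket count and the two helper functions are illustrative only.

```c
#include <stddef.h>

/* Userspace copies of the two kernel list-head layouts. */
struct list_head {
	struct list_head *next, *prev;
};

struct hlist_node {
	struct hlist_node *next, **pprev;
};

struct hlist_head {
	struct hlist_node *first;
};

/* Hypothetical bucket count, for illustration only. */
#define UIDHASH_BUCKETS 256

/* Bytes needed for a hash table of n list_head buckets. */
static size_t list_table_bytes(size_t n)
{
	return n * sizeof(struct list_head);
}

/* Bytes needed for the same table built from hlist_head buckets. */
static size_t hlist_table_bytes(size_t n)
{
	return n * sizeof(struct hlist_head);
}
```

Only the table of heads shrinks; each chained node still holds two pointers in both schemes.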
2007-09-19  [POWERPC] Fix timekeeping on PowerPC 601  Benjamin Herrenschmidt
Recent changes to the timekeeping code broke support for the PowerPC 601 processor which doesn't have the usual timebase facility but a slightly different thing called (yuck) the RTC. This fixes it, boot tested on an old 601 based PowerMac 7200. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-09-16  Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6  Linus Torvalds
* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6: [SPARC64]: Warn user if cpu is ignored. [SPARC64]: Fix lockdep, particularly on SMP. [SPARC64]: Update defconfig.
2007-09-16  Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6  Linus Torvalds
* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: [VLAN]: Fix net_device leak. [PPP] generic: Fix receive path data clobbering & non-linear handling [PPP] generic: Call skb_cow_head before scribbling over skb [NET] skbuff: Add skb_cow_head [BRIDGE]: Kill clone argument to br_flood_* [PPP] pppoe: Fill in header directly in __pppoe_xmit [PPP] pppoe: Fix data clobbering in __pppoe_xmit and return value [PPP] pppoe: Fix skb_unshare_check call position [SCTP]: Convert bind_addr_list locking to RCU [SCTP]: Add RCU synchronization around sctp_localaddr_list [PKT_SCHED]: sch_cbq.c: Shut up uninitialized variable warning [PKTGEN]: srcmac fix [IPV6]: Fix source address selection. [IPV4]: Just increment OutDatagrams once per a datagram. [IPV6]: Just increment OutDatagrams once per a datagram. [IPV6]: Fix unbalanced socket reference with MSG_CONFIRM. [NET_SCHED] protect action config/dump from irqs [NET]: Fix two issues wrt. SO_BINDTODEVICE.
2007-09-16  Fix non-ISA link error in drivers/scsi/advansys.c  Matthew Wilcox
When CONFIG_ISA is disabled, the isa_driver support will not be compiled in. Define stubs so that we don't get link-time errors. Signed-off-by: Matthew Wilcox <matthew@wil.cx> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-09-16  [NET] skbuff: Add skb_cow_head  Herbert Xu
This patch adds an optimised version of skb_cow that avoids the copy if the header can be modified even if the rest of the payload is cloned. This can be used in encapsulating paths where we only need to modify the header. As it is, this can be used in PPPOE and bridging. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-16  [SCTP]: Convert bind_addr_list locking to RCU  Vlad Yasevich
Since the sctp_sockaddr_entry is now RCU enabled as part of the patch to synchronize sctp_localaddr_list, it makes sense to change all handling of these entries to RCU. This includes the sctp_bind_addrs structure and its list of bound addresses. This list is currently protected by an external rw_lock and that looks like overkill. There are only 2 writers to the list: bind()/bindx() calls, and BH processing of ASCONF-ACK chunks. These are already serialized via the socket lock, so they will not step on each other. These are also relatively rare, so we should be good with RCU. The readers are varied and they are easily converted to RCU. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Sridhar Samdurala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-16  [SCTP]: Add RCU synchronization around sctp_localaddr_list  Vlad Yasevich
sctp_localaddr_list is modified dynamically via NETDEV_UP and NETDEV_DOWN events, but there is no synchronization between the writer (event handler) and readers. As a result, the readers can access an entry that has been freed and crash the system. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Sridhar Samdurala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-16  [SPARC64]: Fix lockdep, particularly on SMP.  David S. Miller
As noted by Al Viro, when we try to call prom_set_trap_table() in the SMP trampoline code we try to take the PROM call spinlock which doesn't work because the current thread pointer isn't valid yet and lockdep depends upon that being correct. Furthermore, we cannot set the current thread pointer register because it can't be properly dereferenced until we return from prom_set_trap_table(). Kernel TLB misses only work after that call. So do the PROM call to set the trap table directly instead of going through the OBP library C code, and thus avoid the lock altogether. These calls are guaranteed to be serialized fully. Since there are now no calls to the prom_set_trap_table{_sun4v}() library functions, they can be deleted. Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-14  Merge git://git.linux-xtensa.org/kernel/xtensa-feed  Linus Torvalds
* git://git.linux-xtensa.org/kernel/xtensa-feed: [patch 1/2] Xtensa: enable arbitary tty speed setting ioctls [patch 2/2] xtensa console.c: remove duplicate #include [XTENSA] Add support for cache-aliasing [XTENSA] Add kernel module support [XTENSA] Add support for executable/non-executable feature in the mmu [XTENSA] Use the generic version of get_order [XTENSA] Initialize semaphore_wake_lock [XTENSA] Add typecast macro for constants [XTENSA] Fix timer instabilities. [XTENSA] Fix fadvise64_64 [XTENSA] Remove extraneous include statement [XTENSA] Move string-io functions to io.c from pci.c [XTENSA] Move pre-initialized structures to init_task.c [XTENSA] Add freestanding option to CFLAGS [XTENSA] Add getpgrp system-call to unistd.h [XTENSA] add missing system calls [XTENSA] fix wrong usage of __init and __initdata in traps.c
2007-09-14  Merge branch 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/cooloney/blackfin-2.6  Linus Torvalds
* 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/cooloney/blackfin-2.6: Blackfin arch: fix some bugs in lib/string.h functions found by our string testing modules Blackfin arch: fix the aliased write macros Blackfin arch: Update/Fix PM support add new pm_ops valid
2007-09-14  Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus  Linus Torvalds
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus: [MIPS] 20Kc: Disable use of WAIT instruction. [MIPS] Workaround for 4Kc machine check exception [MIPS] Malta: Fix off by one bug in interrupt handler. [MIPS] No ide_default_io_base() if PCI IDE was not found [MIPS] Add #include <linux/profile.h> to arch/mips/kernel/time.c [MIPS] N32 needs to use compat_sys_futimesat [MIPS] rtlx: Fix build error. [MIPS] rtlx: fix int vs. long bug.
2007-09-14  [MIPS] No ide_default_io_base() if PCI IDE was not found  Atsushi Nemoto
Revert b5438582090406e2ccb4169d9b2df7c9939ae42b and add no_pci_devices() check to avoid crash due to early calling of pci_get_class(). Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2007-09-14  V4L/DVB (6220a): fix build error for et61x251 driver  Linus Torvalds
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2007-09-12  Merge branch 'for-linus' of git://git.o-hand.com/linux-rpurdie-leds  Linus Torvalds
* 'for-linus' of git://git.o-hand.com/linux-rpurdie-leds: leds: Add missing include for leds.h
2007-09-12  Define termios_1 functions for powerpc, s390, avr32 and frv  Paul Mackerras
Commit f629307c857c030d5a3dd777fee37c8bb395e171 introduced uses of kernel_termios_to_user_termios_1 and user_termios_to_kernel_termios_1 on all architectures. However, powerpc, s390, avr32 and frv don't currently define those functions since their termios struct didn't need to be changed when the arbitrary baud rate stuff was added, and thus the kernel won't currently build on those architectures. This adds definitions of kernel_termios_to_user_termios_1 and user_termios_to_kernel_termios_1 to include/asm-generic/termios.h which are identical to kernel_termios_to_user_termios and user_termios_to_kernel_termios respectively. The definitions are the same because the "old" termios and "new" termios are in fact the same on these architectures (which are the same ones that use asm-generic/termios.h). Signed-off-by: Paul Mackerras <paulus@samba.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Alan Cox <alan@redhat.com> Cc: David Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-09-12  Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input  Linus Torvalds
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: usbtouchscreen - correctly set 'phys' Input: i8042 - add HP Pavilion DV4270ca to the MUX blacklist Input: i8042 - fix modpost warning Input: add more Braille keycodes
2007-09-12  Blackfin arch: fix some bugs in lib/string.h functions found by our string testing modules  Mike Frysinger
- use ints for the return value rather than char since we actually return an int and we don't want it improperly being sign extended during the reload http://blackfin.uclinux.org/gf/project/uclinux-dist/tracker/?action=TrackerItemEdit&tracker_item_id=3525
- if src is shorter than the requested number of copy bytes, we need to null pad the rest http://blackfin.uclinux.org/gf/project/uclinux-dist/tracker/?action=TrackerItemEdit&tracker_item_id=3524
- mark these as __volatile__ and add memory to the clobber list so gcc does not optimize around buffers we may be using
- rewrite asm code to be readable/maintainable
Signed-off-by: Mike Frysinger <michael.frysinger@analog.com> Signed-off-by: Bryan Wu <bryan.wu@analog.com>
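The null-padding requirement in the second item above is the standard strncpy() contract. A portable C sketch of that behaviour follows; it is not the Blackfin assembly from the patch, and the name strncpy_padded is hypothetical.

```c
#include <stddef.h>

/*
 * Copy at most n bytes from src to dest. If src is shorter than n,
 * fill the remainder of dest with NUL bytes, as strncpy() requires.
 */
static char *strncpy_padded(char *dest, const char *src, size_t n)
{
	size_t i = 0;

	/* Copy up to n bytes, stopping at src's terminator. */
	while (i < n && src[i] != '\0') {
		dest[i] = src[i];
		i++;
	}

	/* src was shorter than n: NUL-pad the rest of the window. */
	while (i < n)
		dest[i++] = '\0';

	return dest;
}
```

Bytes beyond the first n are left untouched, and no terminator is written when src is n bytes or longer.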
2007-09-11  m68k(nommu): add missing syscalls  Geert Uytterhoeven
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-09-11  Fix select on /proc files without ->poll  Alexey Dobriyan
Taneli Vähäkangas <vahakang@cs.helsinki.fi> reported that commit 786d7e1612f0b0adb6046f19b906609e4fe8b1ba aka "Fix rmmod/read/write races in /proc entries" broke the SBCL + SLIME combo. The old code in do_select() used DEFAULT_POLLMASK if it couldn't find a ->poll handler. The new code makes ->poll always present and returns 0 by default, which is not correct. Return DEFAULT_POLLMASK instead. Steps to reproduce: install emacs, SBCL, SLIME emacs M-x slime in *inferior-lisp* buffer [watch it doing "Connecting to Swank on port X.."] Please, apply before 2.6.23. P.S.: why SBCL can't just read(2) /proc/cpuinfo is a mystery. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: T Taneli Vahakangas <vahakang@cs.helsinki.fi> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-09-11  PTR_ALIGN  Matthew Wilcox
The AdvanSys driver wants to align some pointers, and the ALIGN macro doesn't work for pointers. Rather than try to make it work, add a new PTR_ALIGN macro which is typesafe. Signed-off-by: Matthew Wilcox <matthew@wil.cx> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
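A userspace sketch of the idea in the entry above: round the pointer up as an integer, then cast the result back to the pointer's own type. The macro and helper names are illustrative rather than the kernel's definitions, and __typeof__ is a GNU C extension (GCC and Clang).

```c
#include <stdint.h>

/* Round integer x up to a multiple of a; a must be a power of two. */
#define ALIGN_UP(x, a) (((x) + ((a) - 1)) & ~((uintptr_t)(a) - 1))

/*
 * Pointer variant: align via uintptr_t, then cast back to the type
 * of p so the result cannot silently become a different pointer type.
 */
#define PTR_ALIGN_SKETCH(p, a) \
	((__typeof__(p))ALIGN_UP((uintptr_t)(p), (a)))

/* Nonzero if p is a multiple of a (a must be a power of two). */
static int is_aligned(const void *p, uintptr_t a)
{
	return ((uintptr_t)p & (a - 1)) == 0;
}
```

Assigning the result of PTR_ALIGN_SKETCH(buf, 16) for a char *buf to an int * draws a compiler diagnostic, which is the type safety the message refers to.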
2007-09-11  BCM1480 serial build fix  Thiemo Seufer
Restores serial functionality for the BCM1480. Signed-off-by: Thiemo Seufer <ths@networkno.de> Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-09-11  Merge master.kernel.org:/pub/scm/linux/kernel/git/bart/ide-2.6  Linus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/bart/ide-2.6: pdc202xx_new: PLL detection fix via82cxxx: add Arima W730-K8 and other rebadgings to short cables list pmac: build fix pata_ali/alim15x3: override 80-wire cable detection for Toshiba S1800-814 hpt366: UltraDMA filter for SATA cards (take 2) ide: add ide_dev_is_sata() helper (take 2) hpt366: fix PCI clock detection for HPT374 (take 4) pdc202xx_new: fix PCI refcounting ide: fix PCI refcounting mpc8xx: Only build mpc8xx on arch/ppc
2007-09-11  leds: Add missing include for leds.h  Yoichi Yuasa
This patch has added #include <linux/spinlock.h> to include/linux/leds.h for rwlock_t. Signed-off-by: Yoichi Yuasa <yoichi_yuasa@tripeaks.co.jp> Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
2007-09-11  ide: add ide_dev_is_sata() helper (take 2)  Sergei Shtylyov
Make the SATA drive detection code from eighty_ninty_three() into inline ide_dev_is_sata() helper fixing it along the way to be more strict while checking word 80 for the reserved values... Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
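The commit message above does not show the check itself. One way a stricter word 80 test could look is sketched below, assuming the IDENTIFY DEVICE data is available as an array of 256 16-bit words, that SATA devices report word 93 as zero, and that bit 5 of word 80 marks ATA-5 support. The function name and index macros are hypothetical.

```c
#include <stdint.h>

/* Word indices into the ATA IDENTIFY DEVICE data (assumed layout). */
#define ID_WORD_MAJOR_REV 80
#define ID_WORD_HW_CONFIG 93

/*
 * Return 1 if the identify data looks like a SATA device.
 * Casting word 80 to a signed 16-bit type turns the reserved value
 * 0xffff into a negative number, so a single comparison rejects
 * 0x0000, 0xffff and revisions older than ATA-5 (bit 5 = 0x0020).
 */
static int dev_is_sata_sketch(const uint16_t id[256])
{
	if (id[ID_WORD_HW_CONFIG] == 0 &&
	    (int16_t)id[ID_WORD_MAJOR_REV] >= 0x0020)
		return 1;
	return 0;
}
```

The signed cast is the design choice worth noting: it folds the two reserved values and the minimum-revision test into one comparison.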
2007-09-11  Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6  Linus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6: PCI: irq and pci_ids patch for Intel Tolapai PCI: unhide SMBus on Compaq Deskpro EP 401963-001 motherboard PCI: Remove __devinit from pcibios_get_irq_routing_table PCI: remove devinit from pci_read_bridge_bases PCI AER: fix warnings when PCIEAER=n
2007-09-11  Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6  Linus Torvalds
* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: [INET_DIAG]: Fix oops in netlink_rcv_skb [IPv6]: Fix NULL pointer dereference in ip6_flush_pending_frames [NETFILTER]: Fix/improve deadlock condition on module removal netfilter [NETFILTER]: nf_conntrack_ipv4: fix "Frag of proto ..." messages [NET] DOC: Update networking/multiqueue.txt with correct information. [IPV6]: Freeing alive inet6 address [DECNET]: Fix interface address listing regression. [IPV4] devinet: show all addresses assigned to interface [NET]: Do not dereference iov if length is zero [TG3]: Workaround MSI bug on 5714/5780. [Bluetooth] Fix parameter list for event filter command [Bluetooth] Update security filter for Bluetooth 2.1 [Bluetooth] Add compat handling for timestamp structure [Bluetooth] Add missing stat.byte_rx counter modification
2007-09-11  Merge master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6  Linus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6: [SCSI] libiscsi: sync up iscsi and scsi eh's access to the connection [SCSI] libiscsi: fix null ptr regression when aborting a command with data to transfer [SCSI] qla2xxx: Update version number to 8.02.00-k3. [SCSI] qla2xxx: Correct mailbox register dump for FWI2 capable ISPs. [SCSI] qla2xxx: Correct 8GB iIDMA support. [SCSI] qla2xxx: Correct management-server login-state synchronization issue. [SCSI] qla2xxx: Don't modify parity bits during ISP25XX restart. [SCSI] qla2xxx: Allocate enough space for the full PCI descriptor. [SCSI] zfcp: fix the data buffer accessor patch [SCSI] zfcp: allocate gid_pn_data objects from gid_pn_cache [SCSI] zfcp: fix memory leak
2007-09-11  PCI: irq and pci_ids patch for Intel Tolapai  Jason Gaston
This patch adds the Intel Tolapai LPC and SMBus Controller DID's. Signed-off-by: Jason Gaston <jason.d.gaston@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-09-11  PCI AER: fix warnings when PCIEAER=n  Randy Dunlap
Fix warnings when CONFIG_PCIEAER=n: drivers/pci/pcie/portdrv_pci.c:105: warning: statement with no effect drivers/pci/pcie/portdrv_pci.c:226: warning: statement with no effect drivers/scsi/arcmsr/arcmsr_hba.c:352: warning: statement with no effect Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Acked-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
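The commit message above does not say how the warnings were fixed. A common cause of "statement with no effect" is a config-disabled stub written as a macro that expands to a bare constant; one way to avoid it is a static inline stub, sketched here with hypothetical names (pci_dev_sketch, enable_error_reporting_stub).

```c
#include <errno.h>

/* Hypothetical stand-in for the kernel's struct pci_dev. */
struct pci_dev_sketch {
	int unused;
};

/*
 * A macro stub such as
 *     #define enable_error_reporting(dev) (-EINVAL)
 * turns the bare call "enable_error_reporting(dev);" into the
 * statement "(-EINVAL);", which GCC reports as having no effect.
 *
 * A static inline stub is a real function call, so discarding its
 * result is fine, and the argument is still type-checked.
 */
static inline int enable_error_reporting_stub(struct pci_dev_sketch *dev)
{
	(void)dev;
	return -EINVAL;
}
```

Callers that ignore the return value then compile cleanly whether or not the feature is configured in.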
2007-09-11  [NETFILTER]: Fix/improve deadlock condition on module removal netfilter  Neil Horman
So I've had a deadlock reported to me. I've found that the sequence of events goes like this:
1) process A (modprobe) runs to remove ip_tables.ko
2) process B (iptables-restore) runs and calls setsockopt on a netfilter socket, increasing the ip_tables socket_ops use count
3) process A acquires a file lock on the file ip_tables.ko, calls remove_module in the kernel, which in turn executes the ip_tables module cleanup routine, which calls nf_unregister_sockopt
4) nf_unregister_sockopt, seeing that the use count is non-zero, puts the calling process into uninterruptible sleep, expecting the process using the socket option code to wake it up when it exits the kernel
5) the user of the socket option code (process B) in do_ipt_get_ctl, calls ipt_find_table_lock, which in this case calls request_module to load ip_tables_nat.ko
6) request_module forks a copy of modprobe (process C) to load the module and blocks until modprobe exits.
7) Process C, forked by request_module, processes the dependencies of ip_tables_nat.ko, of which ip_tables.ko is one.
8) Process C attempts to lock the requested module and all its dependencies; it blocks when it attempts to lock ip_tables.ko (which was previously locked in step 3)
There's not really any great permanent solution to this that I can see, but I've developed a two-part solution that corrects the problem:
Part 1) Modifies the nf_sockopt registration code so that, instead of using a use counter internal to the nf_sockopt_ops structure, we instead use a pointer to the registering module's owner to do module reference counting when nf_sockopt calls a module's set/get routine. This prevents the deadlock by preventing step 4 from happening.
Part 2) Enhances the modprobe utility so that by default it performs non-blocking remove operations (the same way rmmod does), and adds an option to explicitly request blocking operation. So if you select blocking operation in modprobe you can still cause the above deadlock, but only if you explicitly try (and since root can do any old stupid thing it would like.... :) ).
Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-10  Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev  Linus Torvalds
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev: libata clear horkage on ata_dev_init() [libata, IDE] add new VIA bridge to VIA PATA drivers pata_it821x: fix lost interrupt with atapi devices Fix broken pata_via cable detection
2007-09-10  [libata, IDE] add new VIA bridge to VIA PATA drivers  Joseph Chan
Signed-off-by: Joseph Chan <josephchan@via.com.tw> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-09-10  UML: Fix ELF_CORE_COPY_REGS build botch  Jeff Dike
The earlier crash dump fix on x86_64 depended on patches in -mm which are intended for post-2.6.23. Without those, it broke the build when it went into 2.6.23-rc5. This changes the field references in ELF_CORE_COPY_REGS back to those still used in mainline. Signed-off-by: Jeff Dike <jdike@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-09-10  Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus  Linus Torvalds
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus: [MIPS] Ocelot: remove remaining bits [MIPS] TLB: Fix instruction bitmasks [MIPS] R10000: Fix wrong test in dma-default.c [MIPS] Provide empty irq_enable_hazard definition for legacy and R1 cores. [MIPS] Sibyte: Remove broken dependency on EXPERIMENTAL from SIBYTE_SB1xxx_SOC. [MIPS] Kconfig: whitespace cleanup. [MIPS] PCI: Set need_domain_info if controller domain index is non-zero. [MIPS] BCM1480: Fix computation of interrupt mask address register. [MIPS] i8259: Add disable method. [MIPS] tty: add the new ioctls and definitions.
2007-09-10  Merge branch 'for-linus' of git://www.linux-m32r.org/git/takata/linux-2.6_dev  Linus Torvalds
* 'for-linus' of git://www.linux-m32r.org/git/takata/linux-2.6_dev: m32r: Rename STI/CLI macros m32r: build fix of entry.S m32r: Separate syscall table from entry.S m32r: Cosmetic updates of arch/m32r/kernel/entry.S m32r: Exit ei_handler directly for no IRQ case or IPI operations m32r: Simplify ei_handler code m32r: Define symbols to unify platform-dependent ICU checks m32r: Move dot.gdbinit files m32r: Rearrange platform-dependent codes m32r: Add defconfig file for the usrv platform. m32r: Update defconfig files for 2.6.23-rc1 m32r: Move defconfig files to arch/m32r/configs/
2007-09-10  [MIPS] Ocelot: remove remaining bits  Yoichi Yuasa
Signed-off-by: Yoichi Yuasa <yoichi_yuasa@tripeaks.co.jp> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2007-09-10  [MIPS] Provide empty irq_enable_hazard definition for legacy and R1 cores.  Ralf Baechle
Following a strict interpretation the empty definition of irq_enable_hazard has always been a bug - but an intentional one because it didn't bite. This has now changed, for uniprocessor kernels mm/slab.c:do_drain() [...] on_each_cpu(do_drain, cachep, 1, 1); check_irq_on(); [...] may be compiled into a mtc0 c0_status; mfc0 c0_status sequence resulting in a back-to-back hazard. Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2007-09-10  [MIPS] tty: add the new ioctls and definitions.  Alan Cox
Same as all the others, just put in the constants for the existing kernel code and termios2 structure Signed-off-by: Alan Cox <alan@redhat.com> Acked-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2007-09-11  [POWERPC] cell/PS3: Fix a bug that causes the PS3 to hang on the SPU Class 0 interrupt.  Masato Noguchi
The Cell BE Architecture spec states that the SPU MFC Class 0 interrupt is edge-triggered. The current spu interrupt handler assumes this behavior and does not clear the interrupt status. The PS3 hypervisor virtualizes all SPU interrupts as level, and on return from the interrupt handler the hypervisor will deliver a new virtual interrupt for any unmasked interrupts for which the status has not been cleared. This fix clears the interrupt status in the interrupt handler. Signed-off-by: Masato Noguchi <Masato.Noguchi@jp.sony.com> Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com> Signed-off-by: Jeremy Kerr <jk@ozlabs.org> Acked-by: Arnd Bergmann <arnd.bergmann@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-09-06  m32r: Rename STI/CLI macros  Hirokazu Takata
The names of the STI and CLI macros were historically derived from the i386 arch, but they are hard to understand. To make them clearer, rename these macros to ENABLE_INTERRUPTS and DISABLE_INTERRUPTS, respectively. Signed-off-by: Hirokazu Takata <takata@linux-m32r.org>
2007-09-04  Input: add more Braille keycodes  Samuel Thibault
Some braille keyboards have 10 dots, so extend the Input braille keys definitions. Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2007-09-04  Merge branch 'for_linus' of git://git.linux-nfs.org/pub/linux/nfs-2.6  Linus Torvalds