summaryrefslogtreecommitdiffstats
path: root/include
AgeCommit message (Collapse)Author
2006-10-03[PATCH] ide: Fix crash on repeated resetAlan Cox
Michal Miroslaw reported a problem (bugzilla #7023) where a user initiated reset while the IDE layer was already resetting the channel caused a crash, and provided a rough fix. This is a slightly cleaner version of the fix which tracks the reset state and blocks further reset requests while a reset is in progress. Note this is not a security issue - random end users can't access the ioctl in question anyway. Signed-off-by: Alan Cox <alan@redhat.com> Cc: Michal Miroslaw <mirq-linux@rere.qmqm.pl> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03[PATCH] ide: remove dma_base2 field from ide_hwif_tSergei Shtylyov
Remove dma_base2 field from ide_hwif_t as it's used only in 2 drivers and without great need. Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: John Keller <jpk@sgi.com> Signed-off-by: Jeremy Higdon <jeremy@sgi.com> Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Bartlomiej Zolnierkiewicz <B.Zolnierkiewicz@elka.pw.edu.pl> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03[PATCH] drivers/ide/: cleanupsAdrian Bunk
- setup-pci.c: remove the unused ide_pci_unregister_driver() - ide-dma.c: remove the unused EXPORT_SYMBOL_GPL(ide_in_drive_list) Signed-off-by: Adrian Bunk <bunk@stusta.de> Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Bartlomiej Zolnierkiewicz <B.Zolnierkiewicz@elka.pw.edu.pl> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03[PATCH] Make number of IDE interfaces configurableMatt Mackall
Make IDE_HWIFS configurable if EMBEDDED This lets us lop as much as 16k off an x86 build. It's a little ugly, but it's dead simple. Note the fix for HWIFS < 2. Sizing interfaces dynamically unfortunately turns out to be pretty major surgery. add/remove: 0/1 grow/shrink: 0/11 up/down: 0/-16182 (-16182) function old new delta ide_hwifs 16920 1692 -15228 init_irq 1113 750 -363 ideprobe_init 283 138 -145 ide_pci_setup_ports 1329 1193 -136 save_match 85 - -85 ide_register_hw_with_fixup 367 287 -80 ide_setup 1364 1308 -56 is_chipset_set 40 4 -36 create_proc_ide_interfaces 225 205 -20 init_ide_data 84 67 -17 ide_probe_for_cmd640x 1198 1183 -15 ide_unregister 1452 1451 -1 Signed-off-by: Matt Mackall <mpm@selenic.com> Cc: Bartlomiej Zolnierkiewicz <B.Zolnierkiewicz@elka.pw.edu.pl> Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03[PATCH] IDE: claim extra DMA ports regardless of channelSergei Shtylylov
- Claim extra DMA I/O ports regardless of what IDE channels are present/enabled. - Remove extra ports handling from ide_mapped_mmio_dma() since it's not applicable to the custom-mapping IDE drivers. Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: Bartlomiej Zolnierkiewicz <B.Zolnierkiewicz@elka.pw.edu.pl> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03[PATCH] sched: cleanup sched_group cpu_power setupSiddha, Suresh B
Up to now sched group's cpu_power for each sched domain is initialized independently. This made the setup code ugly as the new sched domains are getting added. Make the sched group cpu_power setup code generic, by using domain child field and new domain flag in sched_domain. For most of the sched domains(except NUMA), sched group's cpu_power is now computed generically using the domain properties of itself and of the child domain. sched groups in NUMA domains are setup little differently and hence they don't use this generic mechanism. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> Acked-by: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03[PATCH] sched: introduce child field in sched_domainSiddha, Suresh B
Introduce the child field in sched_domain struct and use it in sched_balance_self(). We will also use this field in cleaning up the sched group cpu_power setup(done in a different patch) code. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> Acked-by: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03[PATCH] restore parport_pc probing on powermacOlaf Hering
The last change for partport_pc did fix the common case for all PowerMacs, but it broke the case for PCI multiport IO cards. In fact, the config option CONFIG_PARPORT_PC_SUPERIO=y lead to a hard crash when cups probed the parport driver. It enables the winbond and smsc probing. Remove the PARPORT_BASE check again, parport_pc_find_nonpci_ports() will take care of it. All powerpc configs should have CONFIG_PARPORT_PC_SUPERIO=n, the code did not find anything on the chrp boards we tested it on. Tested on a G4/466 with a PCI card: 0001:10:13.0 Serial controller: Timedia Technology Co Ltd PCI2S550 (Dual 16550 UART) (rev 01) (prog-if 02 [16550]) Subsystem: Timedia Technology Co Ltd Unknown device 5079 Control: I/O+ Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping+ SERR- FastB2B- Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- Interrupt: pin A routed to IRQ 53 Region 0: I/O ports at f2000800 [size=32] Region 2: I/O ports at f2000870 [size=8] Region 3: I/O ports at f2000860 [size=8] Signed-off-by: Olaf Hering <olaf@aepfle.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Adam Belay <ambx1@neo.rr.com> Cc: Dmitry Torokhov <dtor@mail.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03[PATCH] kernel-doc for kernel/dma.cRandy Dunlap
Add kernel-doc function headers in kernel/dma.c and use it in DocBook. Clean up kernel-doc in mca_dma.h (the colon (':') represents a section header). Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03[PATCH] Create kallsyms_lookup_size_offset()Franck Bui-Huu
Some uses of kallsyms_lookup() do not need to find out the name of a symbol and its module's name it belongs. This is specially true in arch specific code, which needs to unwind the stack to show the back trace during oops (mips is an example). In this specific case, we just need to retreive the function's size and the offset of the active intruction inside it. Adds a new entry "kallsyms_lookup_size_offset()" This new entry does exactly the same as kallsyms_lookup() but does not require any buffers to store any names. It returns 0 if it fails otherwise 1. Signed-off-by: Franck Bui-Huu <vagabon.xyz@gmail.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03[PATCH] VFS: Make filldir_t and struct kstat deal in 64-bit inode numbersDavid Howells
These patches make the kernel pass 64-bit inode numbers internally when communicating to userspace, even on a 32-bit system. They are required because some filesystems have intrinsic 64-bit inode numbers: NFS3+ and XFS for example. The 64-bit inode numbers are then propagated to userspace automatically where the arch supports it. Problems have been seen with userspace (eg: ld.so) using the 64-bit inode number returned by stat64() or getdents64() to differentiate files, and failing because the 64-bit inode number space was compressed to 32-bits, and so overlaps occur. This patch: Make filldir_t take a 64-bit inode number and struct kstat carry a 64-bit inode number so that 64-bit inode numbers can be passed back to userspace. The stat functions then returns the full 64-bit inode number where available and where possible. If it is not possible to represent the inode number supplied by the filesystem in the field provided by userspace, then error EOVERFLOW will be issued. Similarly, the getdents/readdir functions now pass the full 64-bit inode number to userspace where possible, returning EOVERFLOW instead when a directory entry is encountered that can't be properly represented. Note that this means that some inodes will not be stat'able on a 32-bit system with old libraries where they were before - but it does mean that there will be no ambiguity over what a 32-bit inode number refers to. Note similarly that directory scans may be cut short with an error on a 32-bit system with old libraries where the scan would work before for the same reasons. It is judged unlikely that this situation will occur because modern glibc uses 64-bit capable versions of stat and getdents class functions exclusively, and that older systems are unlikely to encounter unrepresentable inode numbers anyway. [akpm: alpha build fix] Signed-off-by: David Howells <dhowells@redhat.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03[PATCH] pid.h cleanupAndrew Morton
Make the pid.h macros look less revolting in an 80-col window. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02Merge master.kernel.org:/pub/scm/linux/kernel/git/wim/linux-2.6-watchdogLinus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog: [WATCHDOG] improve machzwd detection [WATCHDOG] use ENOTTY instead of ENOIOCTLCMD in ioctl() [WATCHDOG] s3c24XX nowayout [WATCHDOG] pnx4008: add cpu_relax() [WATCHDOG] pnx4008_wdt.c - spinlock fixes. [WATCHDOG] pnx4008_wdt.c - remove patch [WATCHDOG] pnx4008_wdt.c - nowayout patch [WATCHDOG] pnx4008: add watchdog support [WATCHDOG] i8xx_tco remove pci_find_device. [WATCHDOG] alim remove pci_find_device
2006-10-02Add prototype for sigset_from_compat()Linus Torvalds
Duh. I screwed up editing David Howells patch in commit 3f2e05e90e0846c42626e3d272454f26be34a1bc, and the actual declaration for the sigset_from_compat() function went missing. My bad. Olaf Hering saved the day and noticed that I'm a moron. Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02[WATCHDOG] pnx4008: add watchdog supportVitaly Wool
Add watchdog support for Philips PNX4008 ARM board inlined. Signed-off-by: Vitaly Wool <vitalywool@gmail.com> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2006-10-02Merge git://git.infradead.org/mtd-2.6Linus Torvalds
* git://git.infradead.org/mtd-2.6: [MTD] Cleanup of 'ioremap balanced with iounmap for drivers/mtd subsystem' [MTD] fix nftl_write warning [MTD] fix printk warning [MTD ONENAND] Check OneNAND lock scheme & all block unlock command support [MTD ONENAND] Remove unused MTD_ONENAND_SYNC_READ configuration [MTD ONENAND] Fix OneNAND probe [MTD NAND] Provide prototype for newly-exported nand_wait_ready() [MTD] Remove #ifndef __KERNEL__ hack in <mtd/mtd-abi.h> [MTD NAND] Allow override of page read and write functions. [MTD NAND] Allocate chip->buffers separately to allow it to be overridden [MTD NAND] Split nand_scan() into two parts; allow board driver to intervene [MTD NAND] Export nand_wait_ready() for use by board drivers
2006-10-02Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (35 commits) Input: wistron - add support for Acer TravelMate 2424NWXCi Input: wistron - fix setting up special buttons Input: add KEY_BLUETOOTH and KEY_WLAN definitions Input: add new BUS_VIRTUAL bus type Input: add driver for stowaway serial keyboards Input: make input_register_handler() return error codes Input: remove cruft that was needed for transition to sysfs Input: fix input module refcounting Input: constify input core Input: libps2 - rearrange exports Input: atkbd - support Microsoft Natural Elite Pro keyboards Input: i8042 - disable MUX mode on Toshiba Equium A110 Input: i8042 - get rid of polling timer Input: send key up events at disconnect Input: constify psmouse driver Input: i8042 - add Amoi to the MUX blacklist Input: logips2pp - add sugnature 56 (Cordless MouseMan Wheel), cleanup Input: add driver for Touchwin serial touchscreens Input: add driver for Touchright serial touchscreens Input: add driver for Penmount serial touchscreens ...
2006-10-02Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linusLinus Torvalds
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus: [MIPS] Remove unused galileo-boars header files [MIPS] Rename SERIAL_PORT_DEFNS for EV64120 [MIPS] Add UART IRQ number for EV64120 [MIPS] Remove excite_flash.c [MIPS] Update i8259 resources. [MIPS] Make unwind_stack() can dig into interrupted context [MIPS] Stacktrace build-fix and improvement [MIPS] QEMU: Add support for little endian mips [MIPS] Remove __flush_icache_page [MIPS] lockdep: update defconfigs [MIPS] lockdep: Add STACKTRACE_SUPPORT and enable LOCKDEP_SUPPORT [MIPS] lockdep: fix TRACE_IRQFLAGS_SUPPORT
2006-10-02[PATCH] BLOCK: Revert patch to hack around undeclared sigset_t in linux/compat.hDavid Howells
Revert Andrew Morton's patch to temporarily hack around the lack of a declaration of sigset_t in linux/compat.h to make the block-disablement patches build on IA64. This got accidentally pushed to Linus and should be fixed in a different manner. Also make linux/compat.h #include asm/signal.h to gain a definition of sigset_t so that it can externally declare sigset_from_compat(). This has been compile-tested for i386, x86_64, ia64, mips, mips64, frv, ppc and ppc64 and run-tested on frv. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02[PATCH] replace cad_pid by a struct pidCedric Le Goater
There are a few places in the kernel where the init task is signaled. The ctrl+alt+del sequence is one them. It kills a task, usually init, using a cached pid (cad_pid). This patch replaces the pid_t by a struct pid to avoid pid wrap around problem. The struct pid is initialized at boot time in init() and can be modified through systctl with /proc/sys/kernel/cad_pid [ I haven't found any distro using it ? ] It also introduces a small helper routine kill_cad_pid() which is used where it seemed ok to use cad_pid instead of pid 1. [akpm@osdl.org: cleanups, build fix] Signed-off-by: Cedric Le Goater <clg@fr.ibm.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02[PATCH] introduce get_task_pid() to fix unsafe get_pid()Oleg Nesterov
proc_pid_make_inode: ei->pid = get_pid(task_pid(task)); I think this is not safe. get_pid() can be preempted after checking "pid != NULL". Then the task exits, does detach_pid(), and RCU frees the pid. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02[PATCH] AVR32: Implement kernel_execveHaavard Skinnemoen
Move execve() into arch/avr32/kernel/sys_avr32.c, rename it to kernel_execve() and return the syscall return value directly without setting errno. This also gets rid of the __KERNEL_SYSCALLS__ stuff from unistd.h and expands #ifdef __KERNEL__ to cover everything in unistd.h except the __NR_foo definitions. Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02[PATCH] remove remaining errno and __KERNEL_SYSCALLS__ referencesArnd Bergmann
The last in-kernel user of errno is gone, so we should remove the definition and everything referring to it. This also removes the now-unused lib/execve.c file that was introduced earlier. Also remove every trace of __KERNEL_SYSCALLS__ that still remained in the kernel. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Andi Kleen <ak@muc.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Ian Molton <spyro@f2s.com> Cc: Mikael Starvik <starvik@axis.com> Cc: David Howells <dhowells@redhat.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Hirokazu Takata <takata.hirokazu@renesas.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp> Cc: Richard Curnow <rc@rc0.org.uk> Cc: William Lee Irwin III <wli@holomorphy.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jeff Dike <jdike@addtoit.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp> Cc: Chris Zankel <chris@zankel.net> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02[PATCH] rename the provided execve functions to kernel_execveArnd Bergmann
Some architectures provide an execve function that does not set errno, but instead returns the result code directly. Rename these to kernel_execve to get the right semantics there. Moreover, there is no reasone for any of these architectures to still provide __KERNEL_SYSCALLS__ or _syscallN macros, so remove these right away. [akpm@osdl.org: build fix] [bunk@stusta.de: build fix] Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Andi Kleen <ak@muc.de> Acked-by: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Ian Molton <spyro@f2s.com> Cc: Mikael Starvik <starvik@axis.com> Cc: David Howells <dhowells@redhat.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Hirokazu Takata <takata.hirokazu@renesas.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp> Cc: Richard Curnow <rc@rc0.org.uk> Cc: William Lee Irwin III <wli@holomorphy.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jeff Dike <jdike@addtoit.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp> Cc: Chris Zankel <chris@zankel.net> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Adrian Bunk <bunk@stusta.de> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02[PATCH] IPC namespace - utilsKirill Korotaev
This patch adds basic IPC namespace functionality to IPC utils: - init_ipc_ns - copy/clone/unshare/free IPC ns - /proc preparations Signed-off-by: Pavel Emelianov <xemul@openvz.org> Signed-off-by: Kirill Korotaev <dev@openvz.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02[PATCH] IPC namespace coreKirill Korotaev
This patch set allows to unshare IPCs and have a private set of IPC objects (sem, shm, msg) inside namespace. Basically, it is another building block of containers functionality. This patch implements core IPC namespace changes: - ipc_namespace structure - new config option CONFIG_IPC_NS - adds CLONE_NEWIPC flag - unshare support [clg@fr.ibm.com: small fix for unshare of ipc namespace] [akpm@osdl.org: build fix] Signed-off-by: Pavel Emelianov <xemul@openvz.org> Signed-off-by: Kirill Korotaev <dev@openvz.org> Signed-off-by: Cedric Le Goater <clg@fr.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02[PATCH] namespaces: utsname: implement CLONE_NEWUTS flagSerge E. Hallyn
Implement a CLONE_NEWUTS flag, and use it at clone and sys_unshare. [clg@fr.ibm.com: IPC unshare fix] [bunk@stusta.de: cleanup] Signed-off-by: Serge Hallyn <serue@us.ibm.com> Cc: Kirill Korotaev <dev@openvz.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: Andrey Savochkin <saw@sw.ru> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Cedric Le Goater <clg@fr.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02[PATCH] namespaces: utsname: remove system_utsnameSerge E. Hallyn
The system_utsname isn't needed now that kernel/sysctl.c is fixed. Nuke it. Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Cc: Kirill Korotaev <dev@openvz.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: Andrey Savochkin <saw@sw.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02[PATCH] namespaces: utsname: implement utsname namespacesSerge E. Hallyn
This patch defines the uts namespace and some manipulators. Adds the uts namespace to task_struct, and initializes a system-wide init namespace. It leaves a #define for system_utsname so sysctl will compile. This define will be removed in a separate patch. [akpm@osdl.org: build fix, cleanup] Signed-off-by: Serge Hallyn <serue@us.ibm.com> Cc: Kirill Korotaev <dev@openvz.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: Andrey Savochkin <saw@sw.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02[PATCH] namespaces: utsname: use init_utsname when appropriateSerge E. Hallyn
In some places, particularly drivers and __init code, the init utsns is the appropriate one to use. This patch replaces those with a the init_utsname helper. Changes: Removed several uses of init_utsname(). Hope I picked all the right ones in net/ipv4/ipconfig.c. These are now changed to utsname() (the per-process namespace utsname) in the previous patch (2/7) [akpm@osdl.org: CIFS fix] Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Cc: Kirill Korotaev <dev@openvz.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: Andrey Savochkin <saw@sw.ru> Cc: Serge Hallyn <serue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02[PATCH] namespaces: utsname: switch to using uts namespacesSerge E. Hallyn
Replace references to system_utsname to the per-process uts namespace where appropriate. This includes things like uname. Changes: Per Eric Biederman's comments, use the per-process uts namespace for ELF_PLATFORM, sunrpc, and parts of net/ipv4/ipconfig.c [jdike@addtoit.com: UML fix] [clg@fr.ibm.com: cleanup] [akpm@osdl.org: build fix] Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Cc: Kirill Korotaev <dev@openvz.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: Andrey Savochkin <saw@sw.ru> Signed-off-by: Cedric Le Goater <clg@fr.ibm.com> Cc: Jeff Dike <jdike@addtoit.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02[PATCH] namespaces: utsname: introduce temporary helpersSerge E. Hallyn
Define utsname() and init_utsname() which return &system_utsname. Users of system_utsname will be changed to use these helpers, after which system_utsname will disappear. Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Cc: Kirill Korotaev <dev@openvz.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: Andrey Savochkin <saw@sw.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02[PATCH] namespaces: incorporate fs namespace into nsproxySerge E. Hallyn
This moves the mount namespace into the nsproxy. The mount namespace count now refers to the number of nsproxies point to it, rather than the number of tasks. As a result, the unshare_namespace() function in kernel/fork.c no longer checks whether it is being shared. Signed-off-by: Serge Hallyn <serue@us.ibm.com> Cc: Kirill Korotaev <dev@openvz.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: Andrey Savochkin <saw@sw.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02[PATCH] namespaces: add nsproxySerge E. Hallyn
This patch adds a nsproxy structure to the task struct. Later patches will move the fs namespace pointer into this structure, and introduce a new utsname namespace into the nsproxy. The vserver and openvz functionality, then, would be implemented in large part by virtualizing/isolating more and more resources into namespaces, each contained in the nsproxy. [akpm@osdl.org: build fix] Signed-off-by: Serge Hallyn <serue@us.ibm.com> Cc: Kirill Korotaev <dev@openvz.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: Andrey Savochkin <saw@sw.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02[PATCH] nfsd: lockdep annotationPeter Zijlstra
while doing a kernel make modules_install install over an NFS mount. ============================================= [ INFO: possible recursive locking detected ] --------------------------------------------- nfsd/9550 is trying to acquire lock: (&inode->i_mutex){--..}, at: [<c034c845>] mutex_lock+0x1c/0x1f but task is already holding lock: (&inode->i_mutex){--..}, at: [<c034c845>] mutex_lock+0x1c/0x1f other info that might help us debug this: 2 locks held by nfsd/9550: #0: (hash_sem){..--}, at: [<cc895223>] exp_readlock+0xd/0xf [nfsd] #1: (&inode->i_mutex){--..}, at: [<c034c845>] mutex_lock+0x1c/0x1f stack backtrace: [<c0103508>] show_trace_log_lvl+0x58/0x152 [<c0103b8b>] show_trace+0xd/0x10 [<c0103c2f>] dump_stack+0x19/0x1b [<c012aa57>] __lock_acquire+0x77a/0x9a3 [<c012af4a>] lock_acquire+0x60/0x80 [<c034c6c2>] __mutex_lock_slowpath+0xa7/0x20e [<c034c845>] mutex_lock+0x1c/0x1f [<c0162edc>] vfs_unlink+0x34/0x8a [<cc891d98>] nfsd_unlink+0x18f/0x1e2 [nfsd] [<cc89884f>] nfsd3_proc_remove+0x95/0xa2 [nfsd] [<cc88f0d4>] nfsd_dispatch+0xc0/0x178 [nfsd] [<c033e84d>] svc_process+0x3a5/0x5ed [<cc88f5ba>] nfsd+0x1a7/0x305 [nfsd] [<c0101005>] kernel_thread_helper+0x5/0xb DWARF2 unwinder stuck at kernel_thread_helper+0x5/0xb Leftover inexact backtrace: [<c0103b8b>] show_trace+0xd/0x10 [<c0103c2f>] dump_stack+0x19/0x1b [<c012aa57>] __lock_acquire+0x77a/0x9a3 [<c012af4a>] lock_acquire+0x60/0x80 [<c034c6c2>] __mutex_lock_slowpath+0xa7/0x20e [<c034c845>] mutex_lock+0x1c/0x1f [<c0162edc>] vfs_unlink+0x34/0x8a [<cc891d98>] nfsd_unlink+0x18f/0x1e2 [nfsd] [<cc89884f>] nfsd3_proc_remove+0x95/0xa2 [nfsd] [<cc88f0d4>] nfsd_dispatch+0xc0/0x178 [nfsd] [<c033e84d>] svc_process+0x3a5/0x5ed [<cc88f5ba>] nfsd+0x1a7/0x305 [nfsd] [<c0101005>] kernel_thread_helper+0x5/0xb ============================================= [ INFO: possible recursive locking detected ] --------------------------------------------- nfsd/9580 is trying to acquire lock: (&inode->i_mutex){--..}, at: [<c034cc1d>] mutex_lock+0x1c/0x1f but task is already holding lock: (&inode->i_mutex){--..}, at: [<c034cc1d>] mutex_lock+0x1c/0x1f other info that might help us debug this: 2 locks held by nfsd/9580: #0: (hash_sem){..--}, at: [<cc89522b>] exp_readlock+0xd/0xf [nfsd] #1: (&inode->i_mutex){--..}, at: [<c034cc1d>] mutex_lock+0x1c/0x1f stack backtrace: [<c0103508>] show_trace_log_lvl+0x58/0x152 [<c0103b8b>] show_trace+0xd/0x10 [<c0103c2f>] dump_stack+0x19/0x1b [<c012aa63>] __lock_acquire+0x77a/0x9a3 [<c012af56>] lock_acquire+0x60/0x80 [<c034ca9a>] __mutex_lock_slowpath+0xa7/0x20e [<c034cc1d>] mutex_lock+0x1c/0x1f [<cc892ad1>] nfsd_setattr+0x2c8/0x499 [nfsd] [<cc893ede>] nfsd_create_v3+0x31b/0x4ac [nfsd] [<cc8984a1>] nfsd3_proc_create+0x128/0x138 [nfsd] [<cc88f0d4>] nfsd_dispatch+0xc0/0x178 [nfsd] [<c033ec1d>] svc_process+0x3a5/0x5ed [<cc88f5ba>] nfsd+0x1a7/0x305 [nfsd] [<c0101005>] kernel_thread_helper+0x5/0xb DWARF2 unwinder stuck at kernel_thread_helper+0x5/0xb Leftover inexact backtrace: [<c0103b8b>] show_trace+0xd/0x10 [<c0103c2f>] dump_stack+0x19/0x1b [<c012aa63>] __lock_acquire+0x77a/0x9a3 [<c012af56>] lock_acquire+0x60/0x80 [<c034ca9a>] __mutex_lock_slowpath+0xa7/0x20e [<c034cc1d>] mutex_lock+0x1c/0x1f [<cc892ad1>] nfsd_setattr+0x2c8/0x499 [nfsd] [<cc893ede>] nfsd_create_v3+0x31b/0x4ac [nfsd] [<cc8984a1>] nfsd3_proc_create+0x128/0x138 [nfsd] [<cc88f0d4>] nfsd_dispatch+0xc0/0x178 [nfsd] [<c033ec1d>] svc_process+0x3a5/0x5ed [<cc88f5ba>] nfsd+0x1a7/0x305 [nfsd] [<c0101005>] kernel_thread_helper+0x5/0xb Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Neil Brown <neilb@suse.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02[PATCH] knfsd: make rpc threads pools numa awareGreg Banks
Actually implement multiple pools. On NUMA machines, allocate a svc_pool per NUMA node; on SMP a svc_pool per CPU; otherwise a single global pool. Enqueue sockets on the svc_pool corresponding to the CPU on which the socket bh is run (i.e. the NIC interrupt CPU). Threads have their cpu mask set to limit them to the CPUs in the svc_pool that owns them. This is the patch that allows an Altix to scale NFS traffic linearly beyond 4 CPUs and 4 NICs. Incorporates changes and feedback from Neil Brown, Trond Myklebust, and Christoph Hellwig. Signed-off-by: Greg Banks <gnb@melbourne.sgi.com> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02[PATCH] knfsd: add svc_set_num_threadsGreg Banks
Currently knfsd keeps its own list of all nfsd threads in nfssvc.c; add a new way of managing the list of all threads in a svc_serv. Add svc_create_pooled() to allow creation of a svc_serv whose threads are managed by the sunrpc code. Add svc_set_num_threads() to manage the number of threads in a service, either per-pool or globally across the service. Signed-off-by: Greg Banks <gnb@melbourne.sgi.com> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02[PATCH] knfsd: add svc_getGreg Banks
add svc_get() for those occasions when we need to temporarily bump up svc_serv->sv_nrthreads as a pseudo refcount. Signed-off-by: Greg Banks <gnb@melbourne.sgi.com> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02[PATCH] knfsd: split svc_serv into poolsGreg Banks
Split out the list of idle threads and pending sockets from svc_serv into a new svc_pool structure, and allocate a fixed number (in this patch, 1) of pools per svc_serv. The new structure contains a lock which takes over several of the duties of svc_serv->sv_lock, which is now relegated to protecting only sv_tempsocks, sv_permsocks, and sv_tmpcnt in svc_serv. The point is to move the hottest fields out of svc_serv and into svc_pool, allowing a following patch to arrange for a svc_pool per NUMA node or per CPU. This is a major step towards making the NFS server NUMA-friendly. Signed-off-by: Greg Banks <gnb@melbourne.sgi.com> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02[PATCH] knfsd: convert sk_reserved to atomic_tGreg Banks
Convert the svc_sock->sk_reserved variable from an int protected by svc_serv->sv_lock, to an atomic. This reduces (by 1) the number of places we need to take the (effectively global) svc_serv->sv_lock. Signed-off-by: Greg Banks <gnb@melbourne.sgi.com> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02[PATCH] knfsd: use new lock for svc_sock deferred listGreg Banks
Protect the svc_sock->sk_deferred list with a new lock svc_sock->sk_defer_lock instead of svc_serv->sv_lock. Using the more fine-grained lock reduces the number of places we need to take the svc_serv lock. Signed-off-by: Greg Banks <gnb@melbourne.sgi.com> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02[PATCH] knfsd: convert sk_inuse to atomic_tGreg Banks
Convert the svc_sock->sk_inuse counter from an int protected by svc_serv->sv_lock, to an atomic. This reduces the number of places we need to take the (effectively global) svc_serv->sv_lock. Signed-off-by: Greg Banks <gnb@melbourne.sgi.com> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02[PATCH] knfsd: move tempsock aging to a timerGreg Banks
Following are 11 patches from Greg Banks which combine to make knfsd more Numa-aware. They reduce hitting on 'global' data structures, and create some data-structures that can be node-local. knfsd threads are bound to a particular node, and the thread to handle a new request is chosen from the threads that are attach to the node that received the interrupt. The distribution of threads across nodes can be controlled by a new file in the 'nfsd' filesystem, though the default approach of an even spread is probably fine for most sites. Some (old) numbers that show the efficacy of these patches: N == number of NICs == number of CPUs == nmber of clients. Number of NUMA nodes == N/2 N Throughput, MiB/s CPU usage, % (max=N*100) Before After Before After --- ------ ---- ----- ----- 4 312 435 350 228 6 500 656 501 418 8 562 804 690 589 This patch: Move the aging of RPC/TCP connection sockets from the main svc_recv() loop to a timer which uses a mark-and-sweep algorithm every 6 minutes. This reduces the amount of work that needs to be done in the main RPC loop and the length of time we need to hold the (effectively global) svc_serv->sv_lock. [akpm@osdl.org: cleanup] Signed-off-by: Greg Banks <gnb@melbourne.sgi.com> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02[PATCH] knfsd: Drop 'serv' option to svc_recv and svc_processNeilBrown
It isn't needed as it is available in rqstp->rq_server, and dropping it allows some local vars to be dropped. [akpm@osdl.org: build fix] Cc: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02[PATCH] knfsd: allow sockets to be passed to nfsd via 'portlist'NeilBrown
Userspace should create and bind a socket (but not connectted) and write the 'fd' to portlist. This will cause the nfs server to listen on that socket. To close a socket, the name of the socket - as read from 'portlist' can be written to 'portlist' with a preceding '-'. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02[PATCH] knfsd: define new nfsdfs file: portlist - contains list of portsNeilBrown
This file will list all ports that nfsd has open. Default when TCP enabled will be ipv4 udp 0.0.0.0 2049 ipv4 tcp 0.0.0.0 2049 Later, the list of ports will be settable. 'portlist' chosen rather than 'ports', to avoid unnecessary confusion with non-mainline patches which created 'ports' with different semantics. [akpm@osdl.org: cleanups, build fix] Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02[PATCH] knfsd: remove nfsd_versbits as intermediate storage for desired versionsNeilBrown
We have an array 'nfsd_version' which lists the available versions of nfsd, and 'nfsd_versions' (poor choice there :-() which lists the currently active versions. Then we have a bitmap - nfsd_versbits which says which versions are wanted. The bits in this bitset cause content to be copied from nfsd_version to nfsd_versions when nfsd starts. This patch removes nfsd_versbits and moves information directly from nfsd_version to nfsd_versions when requests for version changes arrive. Note that this doesn't make it possible to change versions while the server is running. This is because serv->sv_xdrsize is calculated when a service is created, and used when threads are created, and xdrsize depends on the active versions. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02[PATCH] knfsd: be more selective in which sockets lockd listens onNeilBrown
Currently lockd listens on UDP always, and TCP if CONFIG_NFSD_TCP is set. However as lockd performs services of the client as well, this is a problem. If CONFIG_NfSD_TCP is not set, and a tcp mount is used, the server will not be able to call back to lockd. So: - add an option to lockd_up saying which protocol is needed - Always open sockets for which an explicit port was given, otherwise only open a socket of the type required - Change nfsd to do one lockd_up per socket rather than one per thread. This - removes the dependancy on CONFIG_NFSD_TCP - means that lockd may open sockets other than at startup - means that lockd will *not* listen on UDP if the only mounts are TCP mount (and nfsd hasn't started). The latter is the only one that concerns me at all - I don't know if this might be a problem with some servers. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02[PATCH] knfsd: add a callback for when last rpc thread finishesNeilBrown
nfsd has some cleanup that it wants to do when the last thread exits, and there will shortly be some more. So collect this all into one place and define a callback for an rpc service to call when the service is about to be destroyed. [akpm@osdl.org: cleanups, build fix] Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02[PATCH] cpumask: add highest_possible_node_idGreg Banks
cpumask: add highest_possible_node_id(), analogous to highest_possible_processor_id(). [pj@sgi.com: fix typo] Signed-off-by: Greg Banks <gnb@melbourne.sgi.com> Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>