summaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)Author
2007-05-11libata: implement libata.spindown_compatTejun Heo
Now that libata uses sd->manage_start_stop, libata spins down disk on shutdown. In an attempt to compensate libata's previous shortcoming, some distros sync and spin down disks attached via libata in their shutdown(8). Some disks spin back up just to spin down again on STANDBYNOW1 if the command is issued when the disk is spun down, so this double spinning down causes problem. This patch implements module parameter libata.spindown_compat which, when set to one (default value), prevents libata from spinning down disks on shutdown thus avoiding double spinning down. Note that libata spins down disks for suspend to mem and disk, so with libata.spindown_compat set to one, disks should be properly spun down in all cases without modifying shutdown(8). shutdown(8) should be fixed eventually. Some drive do spin up on SYNCHRONZE_CACHE even when their cache is clean. Those disks currently spin up briefly when sd tries to shutdown the device and then the machine powers off immediately, which can't be good for the head. We can't skip SYNCHRONIZE_CACHE during shudown as it can be dangerous data integrity-wise. So, this spindown_compat parameter is already scheduled for removal by the end of the next year and here's what shutdown(8) should do. * Check whether /sys/modules/libata/parameters/spindown_compat exists. If it does, write 0 to it. * For each libata harddisk { * Check whether /sys/class/scsi_disk/h:c:i:l/manage_start_stop exists. Iff it doesn't, synchronize cache and spin the disk down as before. } The above procedure will make shutdown(8) work properly with kernels before this change, ones with this workaround and later ones without it. To accelerate shutdown(8) updates, if the compat mode is in use, this patch prints BIG FAT warning for five seconds during shutdown (the optimal interval to annoy the user just the right amount discovered by hours of tireless usability testing). Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-05-11libata: reimplement suspend/resume support using sdev->manage_start_stopTejun Heo
Reimplement suspend/resume support using sdev->manage_start_stop. * Device suspend/resume is now SCSI layer's responsibility and the code is simplified a lot. * DPM is dropped. This also simplifies code a lot. Suspend/resume status is port-wide now. * ata_scsi_device_suspend/resume() and ata_dev_ready() removed. * Resume now has to wait for disk to spin up before proceeding. I couldn't find easy way out as libata is in EH waiting for the disk to be ready and sd is waiting for EH to complete to issue START_STOP. * sdev->manage_start_stop is set to 1 in ata_scsi_slave_config(). This fixes spindown on shutdown and suspend-to-disk. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-05-11Merge branch 'linus' of master.kernel.org:/pub/scm/linux/kernel/git/perex/alsaLinus Torvalds
* 'linus' of master.kernel.org:/pub/scm/linux/kernel/git/perex/alsa: (122 commits) [ALSA] version 1.0.14rc4 [ALSA] Add speaker pin sequencing to hda_codec.c:snd_hda_parse_pin_def_config() [ALSA] hda-codec - Add ALC861VD Lenovo support [ALSA] hda-codec - Fix connection list in generic parser [ALSA] usb-audio: work around wrong wMaxPacketSize on ESI M4U [ALSA] usb-audio: work around broken M-Audio MidiSport Uno firmware [ALSA] usb-audio: explicitly match Logitech QuickCam [ALSA] hda-codec - Fix a typo [ALSA] hda-codec - Fix ALC880 uniwill auto-mutes [ALSA] hda-codec - Fix AD1988 SPDIF playback route control [ALSA] wm8750 typo fix [ALSA] wavefront: only declare isapnp on CONFIG_PNP [ALSA] hda-codec - bug fixes for stac92xx HDA codecs. [ALSA] add MODULE_FIRMWARE entries [ALSA] do not depend on FW_LOADER when internal firmware images are used [ALSA] hda-codec - Fix resume of STAC92xx codecs [ALSA] usbaudio - Revert the minimal period size fix patch [ALSA] hda-codec - Add support for new HP DV series laptops [ALSA] usb-audio - Fix the minimum period size per transfer mode [ALSA] sound/pcmcia/vx/vxpocket.c: fix an if() condition ...
2007-05-11Merge branch 'master' of ↵Linus Torvalds
ssh://master.kernel.org/pub/scm/linux/kernel/git/mchehab/v4l-dvb * 'master' of ssh://master.kernel.org/pub/scm/linux/kernel/git/mchehab/v4l-dvb: (44 commits) V4L/DVB (5571): V4l1-compat: Make VIDIOCSPICT return errors in a useful way V4L/DVB (5624): Radio-maestro.c cleanup V4L/DVB (5623): Dsbr100.c Replace usb_dsbr100_do_ioctl to use video_ioctl2 V4L/DVB (5622): Radio-zoltrix.c cleanup V4L/DVB (5621): Radio-cadet.c Replace cadet_do_ioctl to use video_ioctl2 V4L/DVB (5619): Dvb-usb: fix typo V4L/DVB (5618): Cx88: Drop the generic i2c client from cx88-vp3054-i2c V4L/DVB (5617): V4L2: videodev, allow debugging V4L/DVB (5614): M920x: Disable second adapter on LifeView TV Walker Twin V4L/DVB (5613): M920x: loosen up 80-col limit V4L/DVB (5612): M920x: rename function prefixes from m9206_foo to m920x_foo V4L/DVB (5611): M920x: replace deb_rc with deb V4L/DVB (5610): M920x: remove duplicated code V4L/DVB (5609): M920x: group like functions together V4L/DVB (5608): M920x: various whitespace cleanups V4L/DVB (5607): M920x: Initial support for devices likely manufactured by Dposh V4L/DVB (5606): M920x: add "c-basic-offset: 8" to help emacs to enforce tabbing V4L/DVB (5605): M920x: Add support for LifeView TV Walker Twin V4L/DVB (5603): V4L: Prevent queueing queued buffers. V4L/DVB (5602): Enable DiSEqC in Starbox II (vp7021a) ...
2007-05-11Merge branch 'release' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6: [IA64] Quicklist support for IA64 [IA64] fix Kprobes reentrancy [IA64] SN: validate smp_affinity mask on intr redirect [IA64] drivers/char/snsc_event.c:206: warning: unused variable `p' [IA64] mca.c:121: warning: 'cpe_poll_timer' defined but not used [IA64] Fix - Section mismatch: reference to .init.data:mvec_name [IA64] more warning cleanups [IA64] Wire up epoll_pwait and utimensat [IA64] Fix warnings resulting from type-checking in dev_dbg() [IA64] typo s/kenrel/kernel/
2007-05-11x86_64: Don't call mtrr_bp_init from identify_cpuAndi Kleen
The code was ok, but triggered warnings for calling __init from __cpuinit. Instead call it from check_bugs instead. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-11x86_64: off-by-two error in aperture.cAndrew Hastings
I'm using a custom BIOS to configure the northbridge GART at address 0x80000000, size 2G. Linux complains: "Aperture from northbridge cpu 0 beyond 4GB. Ignoring." I think there's an off-by-two error in arch/x86_64/kernel/aperture.c: AK: use correct types for i386 Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-11i386: Fix compilation of verify_cpu.S on old binutilsAndi Kleen
Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-11Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivialLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: further UTF-8 fixes and name correction Fix wrong identifier name in Documentation/kref.txt
2007-05-11further UTF-8 fixes and name correctionDavid Woodhouse
> -** Copyright 1994 by Bj<94>rn Brauel > +** Copyright 1994 by Bj”rn Brauel I think these were cp437, and it should read 'Björn'. (asm-m68k/atari*.h) Also note that Arnaldo just put more legacy noise into CREDITS... Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-05-11Fix wrong identifier name in Documentation/kref.txtSatyam Sharma
There's a typo / wrong identifier name in Documentation/kref.txt. Fix it. Signed-off-by: Satyam Sharma <ssatyam@cse.iitk.ac.in> Acked-by: Corey Minyard <minyard@acm.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-05-11Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-ip22Linus Torvalds
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-ip22: Convert SGI IP22 and specific drivers to platform_device.
2007-05-11Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linusLinus Torvalds
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus: (28 commits) [MIPS] Rework cobalt_board_id [MIPS] Use RTC_CMOS for Cobalt [MIPS] Use platform_device for Cobalt UART [MIPS] Separate Alchemy processor based boards config [MIPS] Fix build error in atomic64_cmpxchg [MIPS] Run checksyscalls for N32 and O32 ABI [MIPS] tlbex: use __maybe_unused [MIPS] excite: use __maybe_unused [MIPS] Add extern cobalt_board_id [MIPS] Remove unused CONFIG_TOSHIBA_BOARDS [MIPS] Rename tb0229_defconfig to tb0219_defconfig [MIPS] Update tb0229_defconfig; add CONFIG_GPIO_TB0219. [MIPS] Add minimum defconfig for RBHMA4200 [MIPS] SB1: Build fix. [MIPS] Drop __devinit tag from allocate_irqno() and free_irqno() [MIPS] clocksource: use CLOCKSOURCE_MASK() macro [MIPS] Remove LIMITED_DMA support [MIPS] Remove Momenco Jaguar ATX support [MIPS] Remove Momenco Ocelot G support [MIPS] FPU hazard handling ...
2007-05-11Merge branch 'for-linus' of git://git.kernel.dk/data/git/linux-2.6-blockLinus Torvalds
* 'for-linus' of git://git.kernel.dk/data/git/linux-2.6-block: Fix compile/link of init/do_mounts.c with !CONFIG_BLOCK When stacked block devices are in-use (e.g. md or dm), the recursive calls
2007-05-11Merge branch 'audit.b38' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current * 'audit.b38' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current: [PATCH] Abnormal End of Processes [PATCH] match audit name data [PATCH] complete message queue auditing [PATCH] audit inode for all xattr syscalls [PATCH] initialize name osid [PATCH] audit signal recipients [PATCH] add SIGNAL syscall class (v3) [PATCH] auditing ptrace
2007-05-11Merge branch 'upstream-fixes' of ↵Linus Torvalds
master.kernel.org:/pub/scm/linux/kernel/git/jikos/hid * 'upstream-fixes' of master.kernel.org:/pub/scm/linux/kernel/git/jikos/hid: USB HID: hiddev - fix race between hiddev_send_event() and hiddev_release() HID: add hooks for getkeycode() and setkeycode() methods HID: switch to using input_dev->dev.parent USB HID: Logitech wheel 0x046d/0xc294 needs HID_QUIRK_NOGET quirk USB HID: usb_buffer_free() cleanup USB HID: report descriptor of Cypress USB barcode readers needs fixup Bluetooth HID: HIDP - don't initialize force feedback USB HID: update CONFIG_USB_HIDINPUT_POWERBOOK description HID: add input mappings for non-working keys on Logitech S510 remote
2007-05-11[IA64] Quicklist support for IA64Christoph Lameter
IA64 is the origin of the quicklist implementation. So cut out the pieces that are now in core code and modify the functions called. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-05-11[IA64] fix Kprobes reentrancyAnil S Keshavamurthy
In case of reentrance i.e when a probe handler calls a functions which inturn has a probe, we save a previous kprobe information and just single step the reentrant probe without calling the actual probe handler. During this reentracy period, if an interrupt occurs and if probe happens to trigger in the inturrupt path, then we were corrupting the previous kprobe( as we were overriding the previous kprobe info) info their by crashing the system. This patch fixes this issues by having a an array of previous kprobe info struct(with the array size of 2). This similar technique is not needed on i386 and x86_64 because by default interrupts are turn off in the break/int3 exception handler. Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-05-11[IA64] SN: validate smp_affinity mask on intr redirectJohn Keller
On SN, only allow one bit to be set in the smp_affinty mask when redirecting an interrupt. Currently setting multiple bits is allowed, but only the first bit is used in determining the CPU to redirect to. This has caused confusion among some customers. [akpm@linux-foundation.org: fixes] Signed-off-by: John Keller <jpk@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-05-11Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (31 commits) [NETFILTER]: xt_conntrack: add compat support [NETFILTER]: iptable_raw: ignore short packets sent by SOCK_RAW sockets [NETFILTER]: iptable_{filter,mangle}: more descriptive "happy cracking" message [NETFILTER]: nf_nat: Clears helper private area when NATing [NETFILTER]: ctnetlink: clear helper area and handle unchanged helper [NETFILTER]: nf_conntrack: Removes unused destroy operation of l3proto [NETFILTER]: nf_conntrack: Removes duplicated declarations [NETFILTER]: nf_nat: remove unused argument of function allocating binding [NETFILTER]: Clean up table initialization [NET_SCHED]: Avoid requeue warning on dev_deactivate [NET_SCHED]: Reread dev->qdisc for NETDEV_TX_OK [NET_SCHED]: Rationalise return value of qdisc_restart [NET]: Fix dev->qdisc race for NETDEV_TX_LOCKED case [UDP]: Fix AF-specific references in AF-agnostic code. [IrDA]: KingSun/DonShine USB IrDA dongle support. [IPV6] ROUTE: Assign rt6i_idev for ip6_{prohibit,blk_hole}_entry. [IPV6]: Do no rely on skb->dst before it is assigned. [IPV6]: Send ICMPv6 error on scope violations. [SCTP]: Do not include ABORT chunk header in the notification. [SCTP]: Correctly copy addresses in sctp_copy_laddrs ...
2007-05-11Input: evdev - fix overflow in compat_ioctlKenichi Nagai
When exporting input device bitmaps via compat_ioctl on BIG_ENDIAN platforms evdev calculates data size incorrectly. This causes buffer overflow if user specifies buffer smaller than maxlen. Signed-off-by: Kenichi Nagai <kenichi3.nagai@toshiba.co.jp> Signed-off-by: Dmitry Torokhov <dtor@mail.ru> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-11Convert SGI IP22 and specific drivers to platform_device.Ralf Baechle
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2007-05-11md: improve the is_mddev_idle testNeilBrown
During a 'resync' or similar activity, md checks if the devices in the array are otherwise active and winds back resync activity when they are. This test in done in is_mddev_idle, and it is somewhat fragile - it sometimes thinks there is non-sync io when there isn't. The test compares the total sectors of io (disk_stat_read) with the sectors of resync io (disk->sync_io). This has problems because total sectors gets updated when a request completes, while resync io gets updated when the request is submitted. The time difference can cause large differenced between the two which do not actually imply non-resync activity. The test currently allows for some fuzz (+/- 4096) but there are some cases when it is not enough. The test currently looks for any (non-fuzz) difference, either positive or negative. This clearly is not needed. Any non-sync activity will cause the total sectors to grow faster than the sync_io count (never slower) so we only need to look for a positive differences. If we do this then the amount of in-flight sync io will never cause the appearance of non-sync IO. Once enough non-sync IO to worry about starts happening, resync will be slowed down and the measurements will thus be more precise (as there is less in-flight) and control of resync will still be suitably responsive. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-11VIDEO: remove archaic if[] construct from Kconfig fileRobert P. J. Day
Remove the obsolete "if [ ]" construct from the video console Kconfig file. Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Acked-by: James Simmons <jsimmons@infradead.org> Cc: "Antonino A. Daplas" <adaplas@pol.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-11pm2fb: fb_sync addedAntonino A. Daplas
Convert internal wait_pm2() function to fb API fb_sync() method. Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl> Signed-off-by: Antonino Daplas <adaplas@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-11nvidiafb: Enable debugging messages a Kconfig optionJean Delvare
Let the user enable debugging messages in nvidiafb. Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Antonino Daplas <adaplas@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-11rivafb: Fix I2C getscl callback functionJean Delvare
Fix rivafb's I2C getscl callback function, as was done in nvidiafb recently. Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Antonino Daplas <adaplas@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-11atmel_lcdfb: AT91/AT32 LCD Controller framebuffer driverNicolas Ferre
Adds a framebuffer driver to ATMEL AT91SAM9x and AT32 aka AVR32 platforms. Those chips share quite the same IP and this code is suitable for both architectures. Signed-off-by: Nicolas Ferre <nicolas.ferre@rfo.atmel.com> Signed-off-by: Antonino Daplas <adaplas@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-11pm3fb: Preliminary 2.4 to 2.6 portKrzysztof Helt
This is a basic port from 2.4 kernel to 2.6. Acceleration is lost and big endian support probably too. The driver works in 8, 16 and 32 bit mode. [adaplas] - change VESA_* to FB_BLANK_* constants - removed unused function clear_memory - fix uninitialized variable compiler warning - some whitespace cleaning [akpm@linux-foundation.org: Nuke pestiferous CVS string] Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl> Signed-off-by: Antonino Daplas <adaplas@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-11fbdev: geforce 7300 cleanupMichal Piotrowski
ups... coding style. Signed-off-by: Michal Piotrowski <michal.k.k.piotrowski@gmail.com> Cc: "Antonino A. Daplas" <adaplas@pol.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-11epoll cleanups: epoll remove static pre-declarations and akpm-ize the codeDavide Libenzi
Re-arrange epoll code to avoid static functions pre-declarations, and apply akpm-filter on it. Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-11epoll cleanups: epoll no moduleDavide Libenzi
Epoll is either compiled it, or not (if EMBEDDED). Remove the module code and use fs_initcall(). Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-11epoll: use anonymous inodesDavide Libenzi
Cut out lots of code from epoll, by reusing the anonymous inode source patch (fs/anon_inodes.c). Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-11signal/timer/event: KAIO eventfd support exampleDavide Libenzi
This is an example about how to add eventfd support to the current KAIO code, in order to enable KAIO to post readiness events to a pollable fd (hence compatible with POSIX select/poll). The KAIO code simply signals the eventfd fd when events are ready, and this triggers a POLLIN in the fd. This patch uses a reserved for future use member of the struct iocb to pass an eventfd file descriptor, that KAIO will use to post events every time a request completes. At that point, an aio_getevents() will return the completed result to a struct io_event. I made a quick test program to verify the patch, and it runs fine here: http://www.xmailserver.org/eventfd-aio-test.c The test program uses poll(2), but it'd, of course, work with select and epoll too. This can allow to schedule both block I/O and other poll-able devices requests, and wait for results using select/poll/epoll. In a typical scenario, an application would submit KAIO request using aio_submit(), and will also use epoll_ctl() on the whole other class of devices (that with the addition of signals, timers and user events, now it's pretty much complete), and then would: epoll_wait(...); for_each_event { if (curr_event_is_kaiofd) { aio_getevents(); dispatch_aio_events(); } else { dispatch_epoll_event(); } } Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-11signal/timer/event: eventfd wire up x86 archesDavide Libenzi
This patch wires the eventfd system call to the x86 architectures. Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk-manpages@gmx.net> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-11signal/timer/event: eventfd coreDavide Libenzi
This is a very simple and light file descriptor, that can be used as event wait/dispatch by userspace (both wait and dispatch) and by the kernel (dispatch only). It can be used instead of pipe(2) in all cases where those would simply be used to signal events. Their kernel overhead is much lower than pipes, and they do not consume two fds. When used in the kernel, it can offer an fd-bridge to enable, for example, functionalities like KAIO or syslets/threadlets to signal to an fd the completion of certain operations. But more in general, an eventfd can be used by the kernel to signal readiness, in a POSIX poll/select way, of interfaces that would otherwise be incompatible with it. The API is: int eventfd(unsigned int count); The eventfd API accepts an initial "count" parameter, and returns an eventfd fd. It supports poll(2) (POLLIN, POLLOUT, POLLERR), read(2) and write(2). The POLLIN flag is raised when the internal counter is greater than zero. The POLLOUT flag is raised when at least a value of "1" can be written to the internal counter. The POLLERR flag is raised when an overflow in the counter value is detected. The write(2) operation can never overflow the counter, since it blocks (unless O_NONBLOCK is set, in which case -EAGAIN is returned). But the eventfd_signal() function can do it, since it's supposed to not sleep during its operation. The read(2) function reads the __u64 counter value, and reset the internal value to zero. If the value read is equal to (__u64) -1, an overflow happened on the internal counter (due to 2^64 eventfd_signal() posts that has never been retired - unlickely, but possible). The write(2) call writes an __u64 count value, and adds it to the current counter. The eventfd fd supports O_NONBLOCK also. On the kernel side, we have: struct file *eventfd_fget(int fd); int eventfd_signal(struct file *file, unsigned int n); The eventfd_fget() should be called to get a struct file* from an eventfd fd (this is an fget() + check of f_op being an eventfd fops pointer). The kernel can then call eventfd_signal() every time it wants to post an event to userspace. The eventfd_signal() function can be called from any context. An eventfd() simple test and bench is available here: http://www.xmailserver.org/eventfd-bench.c This is the eventfd-based version of pipetest-4 (pipe(2) based): http://www.xmailserver.org/pipetest-4.c Not that performance matters much in the eventfd case, but eventfd-bench shows almost as double as performance than pipetest-4. [akpm@linux-foundation.org: fix i386 build] [akpm@linux-foundation.org: add sys_eventfd to sys_ni.c] Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-11signal/timer/event: timerfd compat codeDavide Libenzi
This patch implements the necessary compat code for the timerfd system call. Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-11signal/timer/event: timerfd wire up x86 archesDavide Libenzi
This patch wires the timerfd system call to the x86 architectures. Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Cc: Andi Kleen <ak@suse.de> Cc: Michael Kerrisk <mtk-manpages@gmx.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-11signal/timer/event: timerfd coreDavide Libenzi
This patch introduces a new system call for timers events delivered though file descriptors. This allows timer event to be used with standard POSIX poll(2), select(2) and read(2). As a consequence of supporting the Linux f_op->poll subsystem, they can be used with epoll(2) too. The system call is defined as: int timerfd(int ufd, int clockid, int flags, const struct itimerspec *utmr); The "ufd" parameter allows for re-use (re-programming) of an existing timerfd w/out going through the close/open cycle (same as signalfd). If "ufd" is -1, s new file descriptor will be created, otherwise the existing "ufd" will be re-programmed. The "clockid" parameter is either CLOCK_MONOTONIC or CLOCK_REALTIME. The time specified in the "utmr->it_value" parameter is the expiry time for the timer. If the TFD_TIMER_ABSTIME flag is set in "flags", this is an absolute time, otherwise it's a relative time. If the time specified in the "utmr->it_interval" is not zero (.tv_sec == 0, tv_nsec == 0), this is the period at which the following ticks should be generated. The "utmr->it_interval" should be set to zero if only one tick is requested. Setting the "utmr->it_value" to zero will disable the timer, or will create a timerfd without the timer enabled. The function returns the new (or same, in case "ufd" is a valid timerfd descriptor) file, or -1 in case of error. As stated before, the timerfd file descriptor supports poll(2), select(2) and epoll(2). When a timer event happened on the timerfd, a POLLIN mask will be returned. The read(2) call can be used, and it will return a u32 variable holding the number of "ticks" that happened on the interface since the last call to read(2). The read(2) call supportes the O_NONBLOCK flag too, and EAGAIN will be returned if no ticks happened. A quick test program, shows timerfd working correctly on my amd64 box: http://www.xmailserver.org/timerfd-test.c [akpm@linux-foundation.org: add sys_timerfd to sys_ni.c] Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-11signal/timer/event: signalfd compat codeDavide Libenzi
This patch implements the necessary compat code for the signalfd system call. Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-11signal/timer/event: signalfd wire up x86 archesDavide Libenzi
This patch wires the signalfd system call to the x86 architectures. Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk-manpages@gmx.net> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-11signal/timer/event: signalfd coreDavide Libenzi
This patch series implements the new signalfd() system call. I took part of the original Linus code (and you know how badly it can be broken :), and I added even more breakage ;) Signals are fetched from the same signal queue used by the process, so signalfd will compete with standard kernel delivery in dequeue_signal(). If you want to reliably fetch signals on the signalfd file, you need to block them with sigprocmask(SIG_BLOCK). This seems to be working fine on my Dual Opteron machine. I made a quick test program for it: http://www.xmailserver.org/signafd-test.c The signalfd() system call implements signal delivery into a file descriptor receiver. The signalfd file descriptor if created with the following API: int signalfd(int ufd, const sigset_t *mask, size_t masksize); The "ufd" parameter allows to change an existing signalfd sigmask, w/out going to close/create cycle (Linus idea). Use "ufd" == -1 if you want a brand new signalfd file. The "mask" allows to specify the signal mask of signals that we are interested in. The "masksize" parameter is the size of "mask". The signalfd fd supports the poll(2) and read(2) system calls. The poll(2) will return POLLIN when signals are available to be dequeued. As a direct consequence of supporting the Linux poll subsystem, the signalfd fd can use used together with epoll(2) too. The read(2) system call will return a "struct signalfd_siginfo" structure in the userspace supplied buffer. The return value is the number of bytes copied in the supplied buffer, or -1 in case of error. The read(2) call can also return 0, in case the sighand structure to which the signalfd was attached, has been orphaned. The O_NONBLOCK flag is also supported, and read(2) will return -EAGAIN in case no signal is available. If the size of the buffer passed to read(2) is lower than sizeof(struct signalfd_siginfo), -EINVAL is returned. A read from the signalfd can also return -ERESTARTSYS in case a signal hits the process. The format of the struct signalfd_siginfo is, and the valid fields depends of the (->code & __SI_MASK) value, in the same way a struct siginfo would: struct signalfd_siginfo { __u32 signo; /* si_signo */ __s32 err; /* si_errno */ __s32 code; /* si_code */ __u32 pid; /* si_pid */ __u32 uid; /* si_uid */ __s32 fd; /* si_fd */ __u32 tid; /* si_fd */ __u32 band; /* si_band */ __u32 overrun; /* si_overrun */ __u32 trapno; /* si_trapno */ __s32 status; /* si_status */ __s32 svint; /* si_int */ __u64 svptr; /* si_ptr */ __u64 utime; /* si_utime */ __u64 stime; /* si_stime */ __u64 addr; /* si_addr */ }; [akpm@linux-foundation.org: fix signalfd_copyinfo() on i386] Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-11signal/timer/event fds: anonymous inode sourceDavide Libenzi
This patch add an anonymous inode source, to be used for files that need and inode only in order to create a file*. We do not care of having an inode for each file, and we do not even care of having different names in the associated dentries (dentry names will be same for classes of file*). This allow code reuse, and will be used by epoll, signalfd and timerfd (and whatever else there'll be). Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-11Don't init pgrp and __session in INIT_SIGNALSSukadev Bhattiprolu
Remove initialization of pgrp and __session in INIT_SIGNALS, as these are later set by the call to __set_special_pids() in init/main.c by the patch: explicitly-set-pgid-and-sid-of-init-process.patch Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Serge Hallyn <serue@us.ibm.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-11Replace pid_t in autofs with struct pid referenceSukadev Bhattiprolu
Make autofs container-friendly by caching struct pid reference rather than pid_t and using pid_nr() to retreive a task's pid_t. ChangeLog: - Fix Eric Biederman's comments - Use find_get_pid() to hold a reference to oz_pgrp and release while unmounting; separate out changes to autofs and autofs4. - Fix Cedric's comments: retain old prototype of parse_options() and move necessary change to its caller. Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Serge Hallyn <serue@us.ibm.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: containers@lists.osdl.org Acked-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-11Fix some coding-style errors in autofsSukadev Bhattiprolu
Fix coding style errors (extra spaces, long lines) in autofs and autofs4 files being modified for container/pidspace issues. Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Serge Hallyn <serue@us.ibm.com> Cc: <containers@lists.osdl.org> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-11Kill unused sesssion and group values in rocket driverSukadev Bhattiprolu
The process_session() and process_group() values are not really used by the driver. Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Serge Hallyn <serue@us.ibm.com> Cc: <containers@lists.osdl.org> Cc: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-11Use task_pgrp() task_session() in copy_process()Sukadev Bhattiprolu
Use task_pgrp() and task_session() in copy_process(), and avoid find_pid() call when attaching the task to its process group and session. Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Serge Hallyn <serue@us.ibm.com> Cc: <containers@lists.osdl.org> Acked-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-11Use struct pid parameter in copy_process()Sukadev Bhattiprolu
Modify copy_process() to take a struct pid * parameter instead of a pid_t. This simplifies the code a bit and also avoids having to call find_pid() to convert the pid_t to a struct pid. Changelog: - Fixed Badari Pulavarty's comments and passed in &init_struct_pid from fork_idle(). - Fixed Eric Biederman's comments and simplified this patch and used a new patch to remove the likely(pid) check. Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Serge Hallyn <serue@us.ibm.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: <containers@lists.osdl.org> Acked-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-11Explicitly set pgid and sid of init processSukadev Bhattiprolu
Explicitly set pgid and sid of init process to 1. Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Serge Hallyn <serue@us.ibm.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: <containers@lists.osdl.org> Acked-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>