summaryrefslogtreecommitdiffstats
path: root/include
AgeCommit message (Collapse)Author
2008-10-10Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: selinux: Fix an uninitialized variable BUG/panic in selinux_secattr_to_sid() selinux: use default proc sid on symlinks file capabilities: uninline cap_safe_nice Update selinux info in MAINTAINERS and Kconfig help text SELinux: add gitignore file for mdp script SELinux: add boundary support and thread context assignment securityfs: do not depend on CONFIG_SECURITY selinux: add support for installing a dummy policy (v2) security: add/fix security kernel-doc selinux: Unify for- and while-loop style selinux: conditional expression type validation was off-by-one smack: limit privilege by label SELinux: Fix a potentially uninitialised variable in SELinux hooks SELinux: trivial, remove unneeded local variable SELinux: Trivial minor fixes that change C null character style make selinux_write_opts() static
2008-10-10Merge branch 'sched-v28-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-v28-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (38 commits) sched debug: add name to sched_domain sysctl entries sched: sync wakeups vs avg_overlap sched: remove redundant code in cpu_cgroup_create() sched_rt.c: resch needed in rt_rq_enqueue() for the root rt_rq cpusets: scan_for_empty_cpusets(), cpuset doesn't seem to be so const sched: minor optimizations in wake_affine and select_task_rq_fair sched: maintain only task entities in cfs_rq->tasks list sched: fixup buddy selection sched: more sanity checks on the bandwidth settings sched: add some comments to the bandwidth code sched: fixlet for group load balance sched: rework wakeup preemption CFS scheduler: documentation about scheduling policies sched: clarify ifdef tangle sched: fix list traversal to use _rcu variant sched: turn off WAKEUP_OVERLAP sched: wakeup preempt when small overlap kernel/cpu.c: create a CPU_STARTING cpu_chain notifier kernel/cpu.c: Move the CPU_DYING notifiers sched: fix __load_balance_iterator() for cfq with only one task ...
2008-10-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: skcipher - Use RNG interface instead of get_random_bytes crypto: rng - RNG interface and implementation crypto: api - Add fips_enable flag crypto: skcipher - Move IV generators into their own modules crypto: cryptomgr - Test ciphers using ECB crypto: api - Use test infrastructure crypto: cryptomgr - Add test infrastructure crypto: tcrypt - Add alg_test interface crypto: tcrypt - Abort and only log if there is an error crypto: crc32c - Use Intel CRC32 instruction crypto: tcrypt - Avoid using contiguous pages crypto: api - Display larval objects properly crypto: api - Export crypto_alg_lookup instead of __crypto_alg_lookup crypto: Kconfig - Replace leading spaces with tabs
2008-10-10Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm: dlm: choose better identifiers dlm: remove bkl dlm: fix address compare dlm: fix locking of lockspace list in dlm_scand dlm: detect available userspace daemon dlm: allow multiple lockspace creates
2008-10-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dmLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm: dm: detect lost queue dm: publish dm_vcalloc dm: publish dm_table_unplug_all dm: publish dm_get_mapinfo dm: export struct dm_dev dm crypt: avoid unnecessary wait when splitting bio dm crypt: tidy ctx pending dm crypt: fix async inc_pending dm crypt: move dec_pending on error into write_io_submit dm crypt: remove inc_pending from write_io_submit dm crypt: tidy write loop pending dm crypt: tidy crypt alloc dm crypt: tidy inc pending dm exception store: use chunk_t for_areas dm exception store: introduce area_location function dm raid1: kcopyd should stop on error if errors handled dm mpath: remove is_active from struct dm_path dm mpath: use more error codes Fixed up trivial conflict in drivers/md/dm-mpath.c manually.
2008-10-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmwLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw: GFS2: Support for I/O barriers GFS2: Add UUID to GFS2 sb GFS2: high time to take some time over atime GFS2: The war on bloat GFS2: GFS2 will panic if you misspell any mount options GFS2: Direct IO write at end of file error GFS2: Use an IS_ERR test rather than a NULL test GFS2: Fix race relating to glock min-hold time GFS2: Fix & clean up GFS2 rename GFS2: rm on multiple nodes causes panic GFS2: Fix metafs mounts GFS2: Fix debugfs glock file iterator
2008-10-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (37 commits) [SCSI] zfcp: fix double dbf id usage [SCSI] zfcp: wait on SCSI work to be finished before proceeding with init dev [SCSI] zfcp: fix erp list usage without using locks [SCSI] zfcp: prevent fc_remote_port_delete calls for unregistered rport [SCSI] zfcp: fix deadlock caused by shared work queue tasks [SCSI] zfcp: put threshold data in hba trace [SCSI] zfcp: Simplify zfcp data structures [SCSI] zfcp: Simplify get_adapter_by_busid [SCSI] zfcp: remove all typedefs and replace them with standards [SCSI] zfcp: attach and release SAN nameserver port on demand [SCSI] zfcp: remove unused references, declarations and flags [SCSI] zfcp: Update message with input from review [SCSI] zfcp: add queue_full sysfs attribute [SCSI] scsi_dh: suppress comparison warning [SCSI] scsi_dh: add Dell product information into rdac device handler [SCSI] qla2xxx: remove the unused SCSI_QLOGIC_FC_FIRMWARE option [SCSI] qla2xxx: fix printk format warnings [SCSI] qla2xxx: Update version number to 8.02.01-k8. [SCSI] qla2xxx: Ignore payload reserved-bits during RSCN processing. [SCSI] qla2xxx: Additional residual-count corrections during UNDERRUN handling. ...
2008-10-10Merge branch 'for-2.6.28' of git://git.kernel.dk/linux-2.6-blockLinus Torvalds
* 'for-2.6.28' of git://git.kernel.dk/linux-2.6-block: (132 commits) doc/cdrom: Trvial documentation error, file not present block_dev: fix kernel-doc in new functions block: add some comments around the bio read-write flags block: mark bio_split_pool static block: Find bio sector offset given idx and offset block: gendisk integrity wrapper block: Switch blk_integrity_compare from bdev to gendisk block: Fix double put in blk_integrity_unregister block: Introduce integrity data ownership flag block: revert part of d7533ad0e132f92e75c1b2eb7c26387b25a583c1 bio.h: Remove unused conditional code block: remove end_{queued|dequeued}_request() block: change elevator to use __blk_end_request() gdrom: change to use __blk_end_request() memstick: change to use __blk_end_request() virtio_blk: change to use __blk_end_request() blktrace: use BLKTRACE_BDEV_SIZE as the name size for setup structure block: add lld busy state exporting interface block: Fix blk_start_queueing() to not kick a stopped queue include blktrace_api.h in headers_install ...
2008-10-10Merge phase #1 of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip This merges phase 1 of the x86 tree, which is a collection of branches: x86/alternatives, x86/cleanups, x86/commandline, x86/crashdump, x86/debug, x86/defconfig, x86/doc, x86/exports, x86/fpu, x86/gart, x86/idle, x86/mm, x86/mtrr, x86/nmi-watchdog, x86/oprofile, x86/paravirt, x86/reboot, x86/sparse-fixes, x86/tsc, x86/urgent and x86/vmalloc and as Ingo says: "these are the easiest, purely independent x86 topics with no conflicts, in one nice Octopus merge". * 'x86-v28-for-linus-phase1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (147 commits) x86: mtrr_cleanup: treat WRPROT as UNCACHEABLE x86: mtrr_cleanup: first 1M may be covered in var mtrrs x86: mtrr_cleanup: print out correct type v2 x86: trivial printk fix in efi.c x86, debug: mtrr_cleanup print out var mtrr before change it x86: mtrr_cleanup try gran_size to less than 1M, v3 x86: mtrr_cleanup try gran_size to less than 1M, cleanup x86: change MTRR_SANITIZER to def_bool y x86, debug printouts: IOMMU setup failures should not be KERN_ERR x86: export set_memory_ro and set_memory_rw x86: mtrr_cleanup try gran_size to less than 1M x86: mtrr_cleanup prepare to make gran_size to less 1M x86: mtrr_cleanup safe to get more spare regs now x86_64: be less annoying on boot, v2 x86: mtrr_cleanup hole size should be less than half of chunk_size, v2 x86: add mtrr_cleanup_debug command line x86: mtrr_cleanup optimization, v2 x86: don't need to go to chunksize to 4G x86_64: be less annoying on boot x86, olpc: fix endian bug in openfirmware workaround ...
2008-10-10Merge branch 'upstream-2.6.28' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev * 'upstream-2.6.28' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev: ata_piix: IDE Mode SATA patch for Intel Ibex Peak DeviceIDs libata-eh: clear UNIT ATTENTION after reset ata_piix: add Hercules EC-900 mini-notebook to ich_laptop short cable list libata: reorder ata_device to remove 8 bytes of padding on 64 bits [libata] pata_bf54x: Add proper PM operation pata_sil680: convert CONFIG_PPC_MERGE to CONFIG_PPC libata: Implement disk shock protection support [libata] Introduce ata_id_has_unload() PATA: RPC now selects HAVE_PATA_PLATFORM for pata platform driver ata_piix: drop merged SCR access and use slave_link instead libata: implement slave_link libata: misc updates to prepare for slave link libata: reimplement link iterator libata: make SCR access ops per-link
2008-10-10dm: publish dm_vcallocMikulas Patocka
Publish dm_vcalloc in include/linux/device-mapper.h because this function is used by targets. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2008-10-10dm: publish dm_table_unplug_allMikulas Patocka
Publish dm_table_unplug_all in include/linux/device-mapper.h because this function is used by targets. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2008-10-10dm: publish dm_get_mapinfoMikulas Patocka
Publish dm_get_mapinfo in include/linux/device-mapper.h because this function is used by targets. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2008-10-10dm: export struct dm_devMikulas Patocka
Split struct dm_dev in two and publish the part that other targets need in include/linux/device-mapper.h. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2008-10-10Merge branch 'next' into for-linusJames Morris
2008-10-09sched debug: add name to sched_domain sysctl entriesIngo Molnar
add /proc/sys/kernel/sched_domain/cpu0/domain0/name, to make it easier to see which specific scheduler domain remained at that entry. Since we process the scheduler domain tree and simplify it, it's not always immediately clear during debugging which domain came from where. depends on CONFIG_SCHED_DEBUG=y. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-09block: add some comments around the bio read-write flagsJens Axboe
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09block: mark bio_split_pool staticDenis ChengRq
Since all bio_split calls refer the same single bio_split_pool, the bio_split function can use bio_split_pool directly instead of the mempool_t parameter; then the mempool_t parameter can be removed from bio_split param list, and bio_split_pool is only referred in fs/bio.c file, can be marked static. Signed-off-by: Denis ChengRq <crquan@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09block: Find bio sector offset given idx and offsetMartin K. Petersen
Helper function to find the sector offset in a bio given bvec index and page offset. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09block: gendisk integrity wrapperMartin K. Petersen
This is a wrapper for accessing a gendisk's integrity bits. It allows the integrity support in MD to be compiled with BLK_DEV_INTEGRITY off. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09block: Switch blk_integrity_compare from bdev to gendiskMartin K. Petersen
The DM and MD integrity support now depends on being able to use gendisks instead of block_devices when comparing integrity profiles. Change function parameters accordingly. Also update comparison logic so that two NULL profiles are a valid configuration. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09block: Introduce integrity data ownership flagMartin K. Petersen
A filesystem might supply its own integrity metadata. Introduce a flag that indicates whether the filesystem or the block layer owns the integrity buffer. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09block: revert part of d7533ad0e132f92e75c1b2eb7c26387b25a583c1Jens Axboe
We need bdev_get_integrity() to support the pending md/dm patches. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09bio.h: Remove unused conditional codeAlberto Bertogli
The whole bio_integrity() definition is inside an #ifdef CONFIG_BLK_DEV_INTEGRITY, there's no need for the conditional code. Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09block: remove end_{queued|dequeued}_request()Kiyoshi Ueda
This patch removes end_queued_request() and end_dequeued_request(), which are no longer used. As a results, users of __end_request() became only end_request(). So the actual code in __end_request() is moved to end_request() and __end_request() is removed. Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09blktrace: use BLKTRACE_BDEV_SIZE as the name size for setup structureJens Axboe
Define as 32, which is is what BDEVNAME_SIZE is/was as well. This keeps the user interface the same and gets rid of the difference between kernel and user api here. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09block: add lld busy state exporting interfaceKiyoshi Ueda
This patch adds an new interface, blk_lld_busy(), to check lld's busy state from the block layer. blk_lld_busy() calls down into low-level drivers for the checking if the drivers set q->lld_busy_fn() using blk_queue_lld_busy(). This resolves a performance problem on request stacking devices below. Some drivers like scsi mid layer stop dispatching request when they detect busy state on its low-level device like host/target/device. It allows other requests to stay in the I/O scheduler's queue for a chance of merging. Request stacking drivers like request-based dm should follow the same logic. However, there is no generic interface for the stacked device to check if the underlying device(s) are busy. If the request stacking driver dispatches and submits requests to the busy underlying device, the requests will stay in the underlying device's queue without a chance of merging. This causes performance problem on burst I/O load. With this patch, busy state of the underlying device is exported via q->lld_busy_fn(). So the request stacking driver can check it and stop dispatching requests if busy. The underlying device driver must return the busy state appropriately: 1: when the device driver can't process requests immediately. 0: when the device driver can process requests immediately, including abnormal situations where the device driver needs to kill all requests. Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09include blktrace_api.h in headers_installSven Schuetz
This header file is of interest for user space programming, i.e. for tools that process blktrace data. We would like to use it for a tool on-top of blktrace which processes data provided by blktrace. For this purpose, it would be helpful if the blktrace API would make it to /usr/include/linux. The git tree for the blktrace tools comes with its own copy of this header file. I didn't manage to replace that copy with the file generated by the patch below yet. A few more cleanups would be needed. For example, the blktrace ioctl numbers, which are currently defined in usr/include/fs.h, might need to be moved. Should be feasible, though. Signed-off-by: Sven Schuetz <sven@linux.vnet.ibm.com> Signed-off-by: Martin Peschke <mp3@de.ibm.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09libata: set queue SSD flag for SSD devicesJens Axboe
SSD devices should give an RPM setting of 1 in word 217 of the ID page. If we see such a device, tell the block layer about it. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09block: add queue flag for SSD/non-rotational devicesJens Axboe
We don't want to idle in AS/CFQ if the device doesn't have a seek penalty. So add a QUEUE_FLAG_NONROT to indicate a non-rotational device, low level drivers should set this flag upon discovery of an SSD or similar device type. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09floppy: support arbitrary first-sector numbersKeith Wansbrough
The current floppy_struct allows floppies to number sectors starting from 0 or 1. This patch allows arbitrary first-sector numbers - for example, 0xC1 for Amstrad CPC disks. This extends the existing 1-bit field (FD_ZEROBASED, bit 2 of stretch) to 8 bits (FD_SECTMASK, bits 2 to 9). Currently 0x00 denotes a first sector number of 1, and 0x01 denotes a first sector number of 0. We extend this by interpreting FD_SECTMASK as the first sector number with the LSB flipped. Signed-off-by: Keith Wansbrough <keith@lochan.org> Cc: Alain Knaff <alain@linux.lu> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Cc: Karel Zak <kzak@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09block: add a queue flag for request stacking supportKiyoshi Ueda
This patch adds a queue flag to indicate the block device can be used for request stacking. Request stacking drivers need to stack their devices on top of only devices of which q->request_fn is functional. Since bio stacking drivers (e.g. md, loop) basically initialize their queue using blk_alloc_queue() and don't set q->request_fn, the check of (q->request_fn == NULL) looks enough for that purpose. However, dm will become both types of stacking driver (bio-based and request-based). And dm will always set q->request_fn even if the dm device is bio-based of which q->request_fn is not functional actually. So we need something else to distinguish the type of the device. Adding a queue flag is a solution for that. The reason why dm always sets q->request_fn is to keep the compatibility of dm user-space tools. Currently, all dm user-space tools are using bio-based dm without specifying the type of the dm device they use. To use request-based dm without changing such tools, the kernel must decide the type of the dm device automatically. The automatic type decision can't be done at the device creation time and needs to be deferred until such tools load a mapping table, since the actual type is decided by dm target type included in the mapping table. So a dm device has to be initialized using blk_init_queue() so that we can load either type of table. Then, all queue stuffs are set (e.g. q->request_fn) and we have no element to distinguish that it is bio-based or request-based, even after a table is loaded and the type of the device is decided. By the way, some stuffs of the queue (e.g. request_list, elevator) are needless when the dm device is used as bio-based. But the memory size is not so large (about 20[KB] per queue on ia64), so I hope the memory loss can be acceptable for bio-based dm users. Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09block: add request submission interfaceKiyoshi Ueda
This patch adds blk_insert_cloned_request(), a generic request submission interface for request stacking drivers. Request-based dm will use it to submit their clones to underlying devices. blk_rq_check_limits() is also added because it is possible that the lower queue has stronger limitations than the upper queue if multiple drivers are stacking at request-level. Not only for blk_insert_cloned_request()'s internal use, the function will be used by request-based dm when the queue limitation is modified (e.g. by replacing dm's table). Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09block: add request update interfaceKiyoshi Ueda
This patch adds blk_update_request(), which updates struct request with completing its data part, but doesn't complete the struct request itself. Though it looks like end_that_request_first() of older kernels, blk_update_request() should be used only by request stacking drivers. Request-based dm will use it in bio->bi_end_io callback to update the original request when a data part of a cloned request completes. Followings are additional background information of why request-based dm needs this interface. - Request stacking drivers can't use blk_end_request() directly from the lower driver's completion context (bio->bi_end_io or rq->end_io), because some device drivers (e.g. ide) may try to complete their request with queue lock held, and it may cause deadlock. See below for detailed description of possible deadlock: <http://marc.info/?l=linux-kernel&m=120311479108569&w=2> - To solve that, request-based dm offloads the completion of cloned struct request to softirq context (i.e. using blk_complete_request() from rq->end_io). - Though it is possible to use the same solution from bio->bi_end_io, it will delay the notification of bio completion to the original submitter. Also, it will cause inefficient partial completion, because the lower driver can't perform the cloned request anymore and request-based dm needs to requeue and redispatch it to the lower driver again later. That's not good. - So request-based dm needs blk_update_request() to perform the bio completion in the lower driver's completion context, which is more efficient. Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09block: cleanup some of the integrity stuff in blkdev.hJens Axboe
Don't put functions that are only used in fs/bio-integrity.c in blkdev.h, it's much cleaner to just keep it in there. Also kill completely unused bdev_get_tag_size() Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09block: add fault injection mechanism for faking request timeoutsJens Axboe
Only works for the generic request timer handling. Allows one to sporadically ignore request completions, thus exercising the timeout handling. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09block: add bio_kmalloc()Jens Axboe
Not all callers need (or want!) the mempool backing guarentee, it essentially means that you can only use bio_alloc() for short allocations and not for preallocating some bio's at setup or init time. So add bio_kmalloc() which does the same thing as bio_alloc(), except it just uses kmalloc() as the backing instead of the bio mempools. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09block: adjust blkdev_issue_discard for swapHugh Dickins
Two mods to blkdev_issue_discard(), thinking ahead to its use on swap: 1. Add gfp_mask argument, so swap allocation can use it where GFP_KERNEL might deadlock but GFP_NOIO is safe. 2. Enlarge nr_sects argument from unsigned to sector_t: unsigned long is enough to cover a whole swap area, but sector_t suits any partition. Change sb_issue_discard()'s nr_blocks to sector_t too; but no need seen for a gfp_mask there, just pass GFP_KERNEL down to blkdev_issue_discard(). Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09block: Add interface to abort queued requestsMike Anderson
Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09block: unify request timeout handlingJens Axboe
Right now SCSI and others do their own command timeout handling. Move those bits to the block layer. Instead of having a timer per command, we try to be a bit more clever and simply have one per-queue. This avoids the overhead of having to tear down and setup a timer for each command, so it will result in a lot less timer fiddling. Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09Adjust block device size after an online resize of a disk.Andrew Patterson
The revalidate_disk routine now checks if a disk has been resized by comparing the gendisk capacity to the bdev inode size. If they are different (usually because the disk has been resized underneath the kernel) the bdev inode size is adjusted to match the capacity. Signed-off-by: Andrew Patterson <andrew.patterson@hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09Wrapper for lower-level revalidate_disk routines.Andrew Patterson
This is a wrapper for the lower-level revalidate_disk call-backs such as sd_revalidate_disk(). It allows us to perform pre and post operations when calling them. We will use this wrapper in a later patch to adjust block device sizes after an online resize (a _post_ operation). Signed-off-by: Andrew Patterson <andrew.patterson@hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09block: make blk_rq_map_user take a NULL user-space bufferFUJITA Tomonori
This patch changes blk_rq_map_user to accept a NULL user-space buffer with a READ command if rq_map_data is not NULL. Thus a caller can pass page frames to lk_rq_map_user to just set up a request and bios with page frames propely. bio_uncopy_user (called via blk_rq_unmap_user) doesn't copy data to user space with such request. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09block: add blk_rq_aligned helper functionFUJITA Tomonori
This adds blk_rq_aligned helper function to see if alignment and padding requirement is satisfied for DMA transfer. This also converts blk_rq_map_kern and __blk_rq_map_user to use the helper function. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09block: introduce struct rq_map_data to use reserved pagesFUJITA Tomonori
This patch introduces struct rq_map_data to enable bio_copy_use_iov() use reserved pages. Currently, bio_copy_user_iov allocates bounce pages but drivers/scsi/sg.c wants to allocate pages by itself and use them. struct rq_map_data can be used to pass allocated pages to bio_copy_user_iov. The current users of bio_copy_user_iov simply passes NULL (they don't want to use pre-allocated pages). Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Douglas Gilbert <dougg@torque.net> Cc: Mike Christie <michaelc@cs.wisc.edu> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09block: add gfp_mask argument to blk_rq_map_user and blk_rq_map_user_iovFUJITA Tomonori
Currently, blk_rq_map_user and blk_rq_map_user_iov always do GFP_KERNEL allocation. This adds gfp_mask argument to blk_rq_map_user and blk_rq_map_user_iov so sg can use it (sg always does GFP_ATOMIC allocation). Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Douglas Gilbert <dougg@torque.net> Cc: Mike Christie <michaelc@cs.wisc.edu> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09block: inherit CPU completion on bio->rq and rq->rq mergesJens Axboe
Somewhat incomplete, as we do allow merges of requests and bios that have different completion CPUs given. This is done on the assumption that a larger IO is still more beneficial than CPU locality. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09block: add support for IO CPU affinityJens Axboe
This patch adds support for controlling the IO completion CPU of either all requests on a queue, or on a per-request basis. We export a sysfs variable (rq_affinity) which, if set, migrates completions of requests to the CPU that originally submitted it. A bio helper (bio_set_completion_cpu()) is also added, so that queuers can ask for completion on that specific CPU. In testing, this has been show to cut the system time by as much as 20-40% on synthetic workloads where CPU affinity is desired. This requires a little help from the architecture, so it'll only work as designed for archs that are using the new generic smp helper infrastructure. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09block: make kblockd_schedule_work() take the queue as parameterJens Axboe
Preparatory patch for checking queuing affinity. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09block: allow disk to have extended device numberTejun Heo
Now that disk and partition handlings are mostly unified, it's easy to allow disk to have extended device number. This patch makes add_disk() use extended device number if disk->minors is zero. Both sd and ide-disk are updated to use this. * sd_format_disk_name() is implemented which can generically determine the drive name. This removes disk number restriction stemming from limited device names. * If sd index goes over SD_MAX_DISKS (which can be increased now BTW), sd simply doesn't initialize minors letting block layer choose extended device number. * If CONFIG_DEBUG_EXT_DEVT is set, both sd and ide-disk always set minors to 0 and use extended device numbers. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>