summaryrefslogtreecommitdiffstats
path: root/include
AgeCommit message (Collapse)Author
2006-06-23[PATCH] add page_mkwrite() vm_operations methodDavid Howells
Add a new VMA operation to notify a filesystem or other driver about the MMU generating a fault because userspace attempted to write to a page mapped through a read-only PTE. This facility permits the filesystem or driver to: (*) Implement storage allocation/reservation on attempted write, and so to deal with problems such as ENOSPC more gracefully (perhaps by generating SIGBUS). (*) Delay making the page writable until the contents have been written to a backing cache. This is useful for NFS/AFS when using FS-Cache/CacheFS. It permits the filesystem to have some guarantee about the state of the cache. (*) Account and limit number of dirty pages. This is one piece of the puzzle needed to make shared writable mapping work safely in FUSE. Needed by cachefs (Or is it cachefiles? Or fscache? <head spins>). At least four other groups have stated an interest in it or a desire to use the functionality it provides: FUSE, OCFS2, NTFS and JFFS2. Also, things like EXT3 really ought to use it to deal with the case of shared-writable mmap encountering ENOSPC before we permit the page to be dirtied. From: Peter Zijlstra <a.p.zijlstra@chello.nl> get_user_pages(.write=1, .force=1) can generate COW hits on read-only shared mappings, this patch traps those as mkpage_write candidates and fails to handle them the old way. Signed-off-by: David Howells <dhowells@redhat.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Joel Becker <Joel.Becker@oracle.com> Cc: Mark Fasheh <mark.fasheh@oracle.com> Cc: Anton Altaparmakov <aia21@cantab.net> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23[PATCH] mm: fix swap unused warningCon Kolivas
If CONFIG_SWAP is not defined we get: mm/vmscan.c: In function ‘remove_mapping’: mm/vmscan.c:387: warning: unused variable ‘swap’ Convert defines in swap.h into blank inline functions to fix this warning and be consistent. Signed-off-by: Con Kolivas <kernel@kolivas.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23[PATCH] sparsemem: record nid during memory presentAndy Whitcroft
Record the node id as we mark sections for instantiation. Use this nid during instantiation to direct allocations. Signed-off-by: Andy Whitcroft <apw@shadowen.org> Cc: Mike Kravetz <kravetz@us.ibm.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Bob Picco <bob.picco@hp.com> Cc: Jack Steiner <steiner@sgi.com> Cc: Yasunori Goto <y-goto@jp.fujitsu.com> Cc: Martin Bligh <mbligh@google.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23[PATCH] More page migration: use migration entries for file pagesChristoph Lameter
This implements the use of migration entries to preserve ptes of file backed pages during migration. Processes can therefore be migrated back and forth without loosing their connection to pagecache pages. Note that we implement the migration entries only for linear mappings. Nonlinear mappings still require the unmapping of the ptes for migration. And another writepage() ugliness shows up. writepage() can drop the page lock. Therefore we have to remove migration ptes before calling writepages() in order to avoid having migration entries point to unlocked pages. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23[PATCH] Swapless page migration: rip out swap based logicChristoph Lameter
Rip the page migration logic out. Remove all code that has to do with swapping during page migration. This also guts the ability to migrate pages to swap. No one used that so lets let it go for good. Page migration should be a bit broken after this patch. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23[PATCH] Swapless page migration: add R/W migration entriesChristoph Lameter
Implement read/write migration ptes We take the upper two swapfiles for the two types of migration ptes and define a series of macros in swapops.h. The VM is modified to handle the migration entries. migration entries can only be encountered when the page they are pointing to is locked. This limits the number of places one has to fix. We also check in copy_pte_range and in mprotect_pte_range() for migration ptes. We check for migration ptes in do_swap_cache and call a function that will then wait on the page lock. This allows us to effectively stop all accesses to apge. Migration entries are created by try_to_unmap if called for migration and removed by local functions in migrate.c From: Hugh Dickins <hugh@veritas.com> Several times while testing swapless page migration (I've no NUMA, just hacking it up to migrate recklessly while running load), I've hit the BUG_ON(!PageLocked(p)) in migration_entry_to_page. This comes from an orphaned migration entry, unrelated to the current correctly locked migration, but hit by remove_anon_migration_ptes as it checks an address in each vma of the anon_vma list. Such an orphan may be left behind if an earlier migration raced with fork: copy_one_pte can duplicate a migration entry from parent to child, after remove_anon_migration_ptes has checked the child vma, but before it has removed it from the parent vma. (If the process were later to fault on this orphaned entry, it would hit the same BUG from migration_entry_wait.) This could be fixed by locking anon_vma in copy_one_pte, but we'd rather not. There's no such problem with file pages, because vma_prio_tree_add adds child vma after parent vma, and the page table locking at each end is enough to serialize. Follow that example with anon_vma: add new vmas to the tail instead of the head. (There's no corresponding problem when inserting migration entries, because a missed pte will leave the page count and mapcount high, which is allowed for. And there's no corresponding problem when migrating via swap, because a leftover swap entry will be correctly faulted. But the swapless method has no refcounting of its entries.) From: Ingo Molnar <mingo@elte.hu> pte_unmap_unlock() takes the pte pointer as an argument. From: Hugh Dickins <hugh@veritas.com> Several times while testing swapless page migration, gcc has tried to exec a pointer instead of a string: smells like COW mappings are not being properly write-protected on fork. The protection in copy_one_pte looks very convincing, until at last you realize that the second arg to make_migration_entry is a boolean "write", and SWP_MIGRATION_READ is 30. Anyway, it's better done like in change_pte_range, using is_write_migration_entry and make_migration_entry_read. From: Hugh Dickins <hugh@veritas.com> Remove unnecessary obfuscation from sys_swapon's range check on swap type, which blew up causing memory corruption once swapless migration made MAX_SWAPFILES no longer 2 ^ MAX_SWAPFILES_SHIFT. Signed-off-by: Hugh Dickins <hugh@veritas.com> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Christoph Lameter <clameter@engr.sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> From: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23[PATCH] page migration cleanup: pass "mapping" to migration functionsChristoph Lameter
Change handling of address spaces. Pass a pointer to the address space in which the page is migrated to all migration function. This avoids repeatedly having to retrieve the address space pointer from the page and checking it for validity. The old page mapping will change once migration has gone to a certain step, so it is less confusing to have the pointer always available. Move the setting of the mapping and index for the new page into migrate_pages(). Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23[PATCH] page migration cleanup: remove useless definitionsChristoph Lameter
Remove the export for migrate_page_remove_references() and migrate_page_copy() that are unlikely to be used directly by filesystems implementing migration. The export was useful when buffer_migrate_page() lived in fs/buffer.c but it has now been moved to migrate.c in the migration reorg. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23[PATCH] writeback: fix range handlingOGAWA Hirofumi
When a writeback_control's `start' and `end' fields are used to indicate a one-byte-range starting at file offset zero, the required values of .start=0,.end=0 mean that the ->writepages() implementation has no way of telling that it is being asked to perform a range request. Because we're currently overloading (start == 0 && end == 0) to mean "this is not a write-a-range request". To make all this sane, the patch changes range of writeback_control. So caller does: If it is calling ->writepages() to write pages, it sets range (range_start/end or range_cyclic) always. And if range_cyclic is true, ->writepages() thinks the range is cyclic, otherwise it just uses range_start and range_end. This patch does, - Add LLONG_MAX, LLONG_MIN, ULLONG_MAX to include/linux/kernel.h -1 is usually ok for range_end (type is long long). But, if someone did, range_end += val; range_end is "val - 1" u64val = range_end >> bits; u64val is "~(0ULL)" or something, they are wrong. So, this adds LLONG_MAX to avoid nasty things, and uses LLONG_MAX for range_end. - All callers of ->writepages() sets range_start/end or range_cyclic. - Fix updates of ->writeback_index. It seems already bit strange. If it starts at 0 and ended by check of nr_to_write, this last index may reduce chance to scan end of file. So, this updates ->writeback_index only if range_cyclic is true or whole-file is scanned. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Nathan Scott <nathans@sgi.com> Cc: Anton Altaparmakov <aia21@cantab.net> Cc: Steven French <sfrench@us.ibm.com> Cc: "Vladimir V. Saveliev" <vs@namesys.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23[PATCH] radix-tree: direct dataNick Piggin
The ability to have height 0 radix trees (a direct pointer to the data item rather than going through a full node->slot) quietly disappeared with old-2.6-bkcvs commit ffee171812d51652f9ba284302d9e5c5cc14bdfd. On 64-bit machines this causes nearly 600 bytes to be used for every <= 4K file in pagecache. Re-introduce this feature, root tags stored in spare ->gfp_mask bits. Simplify radix_tree_delete's complex tag clearing arrangement (which would become even more complex) by just falling back to tag clearing functions (the pagecache radix-tree never uses this path anyway, so the icache savings will mean it's actually a speedup). On my 4GB G5, this saves 8MB RAM per kernel kernel source+object tree in pagecache. Pagecache lookup, insertion, and removal speed for small files will also be improved. This makes RCU radix tree harder, but it's worth it. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23[PATCH] change gen_pool allocator to not touch managed memoryDean Nelson
Modify the gen_pool allocator (lib/genalloc.c) to utilize a bitmap scheme instead of the buddy scheme. The purpose of this change is to eliminate the touching of the actual memory being allocated. Since the change modifies the interface, a change to the uncached allocator (arch/ia64/kernel/uncached.c) is also required. Both Andrey Volkov and Jes Sorenson have expressed a desire that the gen_pool allocator not write to the memory being managed. See the following: http://marc.theaimsgroup.com/?l=linux-kernel&m=113518602713125&w=2 http://marc.theaimsgroup.com/?l=linux-kernel&m=113533568827916&w=2 Signed-off-by: Dean Nelson <dcn@sgi.com> Cc: Andrey Volkov <avolkov@varma-el.com> Acked-by: Jes Sorensen <jes@trained-monkey.org> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23[PATCH] mm: introduce remap_vmalloc_range()Nick Piggin
Add remap_vmalloc_range, vmalloc_user, and vmalloc_32_user so that drivers can have a nice interface for remapping vmalloc memory. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23[PATCH] Unify pxm_to_node() and node_to_pxm()Yasunori Goto
Consolidate the various arch-specific implementations of pxm_to_node() and node_to_pxm() into a single generic version. Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Andi Kleen <ak@muc.de> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: "Brown, Len" <len.brown@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23[PATCH] tightening hugetlb strict accountingChen, Kenneth W
Current hugetlb strict accounting for shared mapping always assume mapping starts at zero file offset and reserves pages between zero and size of the file. This assumption often reserves (or lock down) a lot more pages then necessary if application maps at none zero file offset. libhugetlbfs is one example that requires proper reservation on shared mapping starts at none zero offset. This patch extends the reservation and hugetlb strict accounting to support any arbitrary pair of (offset, len), resulting a much more robust and accurate scheme. More importantly, it won't lock down any hugetlb pages outside file mapping. Signed-off-by: Ken Chen <kenneth.w.chen@intel.com> Acked-by: Adam Litke <agl@us.ibm.com> Cc: David Gibson <david@gibson.dropbear.id.au> Cc: William Lee Irwin III <wli@holomorphy.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23[PATCH] reserve space for swap labelAndreas Dilger
Reserve space in the swap disk header for a LABEL and UUID to be specified. This has been possible with util-linux-2.12b (via e2fsprogs 1.36 libblkid), and is used by at least FC3 and later. The kernel doesn't really care about this, but the space shouldn't accidentally be used by something else either. Also make the on-disk structures be fixed-size types, instead of "int", though I don't know of any architecture in use where an "int" isn't the same size as a "__u32" (all current kernel arches have it as "unsigned int"). Signed-off-by: Andreas Dilger <adilger@shaw.ca> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23[PATCH] support for panic at OOMKAMEZAWA Hiroyuki
This patch adds panic_on_oom sysctl under sys.vm. When sysctl vm.panic_on_oom = 1, the kernel panics intead of killing rogue processes. And if vm.panic_on_oom is 0 the kernel will do oom_kill() in the same way as it does today. Of course, the default value is 0 and only root can modifies it. In general, oom_killer works well and kill rogue processes. So the whole system can survive. But there are environments where panic is preferable rather than kill some processes. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23[PATCH] squash duplicate page_to_pfn and pfn_to_pageAndy Whitcroft
We have architectures where the size of page_to_pfn and pfn_to_page are significant enough to overall image size that they wish to push them out of line. However, in the process we have grown a second copy of the implementation of each of these routines for each memory model. Share the implmentation exposing it either inline or out-of-line as required. Signed-off-by: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23[PATCH] wait_table and zonelist initializing for memory hotadd: add return ↵Yasunori Goto
code for init_current_empty_zone When add_zone() is called against empty zone (not populated zone), we have to initialize the zone which didn't initialize at boot time. But, init_currently_empty_zone() may fail due to allocation of wait table. So, this patch is to catch its error code. Changes against wait_table is in the next patch. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23[PATCH] wait_table and zonelist initializing for memory hotadd: change to ↵Yasunori Goto
meminit for build_zonelist Change definitions of some functions and data from __init to __meminit. These functions and data can be used after bootup by this patch to be used for hot-add codes. Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23[PATCH] wait_table and zonelist initializing for memory hotadd: change name ↵Yasunori Goto
of wait_table_size() This is just to rename from wait_table_size() to wait_table_hash_nr_entries(). Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23[PATCH] PG_uncached is ia64 onlyAndrew Morton
As Nick points out, only ia64 uses PG_uncached. So we can push it up into the higher bits of the lower half of page->flags and make room for another flag on 32-bit machines. Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Jesse Barnes <jbarnes@sgi.com> Cc: Jes Sorensen <jes@trained-monkey.org> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23[PATCH] zone handle unaligned zone boundariesAndy Whitcroft
The buddy allocator has a requirement that boundaries between contigious zones occur aligned with the the MAX_ORDER ranges. Where they do not we will incorrectly merge pages cross zone boundaries. This can lead to pages from the wrong zone being handed out. Originally the buddy allocator would check that buddies were in the same zone by referencing the zone start and end page frame numbers. This was removed as it became very expensive and the buddy allocator already made the assumption that zones boundaries were aligned. It is clear that not all configurations and architectures are honouring this alignment requirement. Therefore it seems safest to reintroduce support for non-aligned zone boundaries. This patch introduces a new check when considering a page a buddy it compares the zone_table index for the two pages and refuses to merge the pages where they do not match. The zone_table index is unique for each node/zone combination when FLATMEM/DISCONTIGMEM is enabled and for each section/zone combination when SPARSEMEM is enabled (a SPARSEMEM section is at least a MAX_ORDER size). Signed-off-by: Andy Whitcroft <apw@shadowen.org> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Yasunori Goto <y-goto@jp.fujitsu.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23[PATCH] VFS: Permit filesystem to perform statfs with a known root dentryDavid Howells
Give the statfs superblock operation a dentry pointer rather than a superblock pointer. This complements the get_sb() patch. That reduced the significance of sb->s_root, allowing NFS to place a fake root there. However, NFS does require a dentry to use as a target for the statfs operation. This permits the root in the vfsmount to be used instead. linux/mount.h has been added where necessary to make allyesconfig build successfully. Interest has also been expressed for use with the FUSE and XFS filesystems. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Cc: Nathan Scott <nathans@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23[PATCH] VFS: Permit filesystem to override root dentry on mountDavid Howells
Extend the get_sb() filesystem operation to take an extra argument that permits the VFS to pass in the target vfsmount that defines the mountpoint. The filesystem is then required to manually set the superblock and root dentry pointers. For most filesystems, this should be done with simple_set_mnt() which will set the superblock pointer and then set the root dentry to the superblock's s_root (as per the old default behaviour). The get_sb() op now returns an integer as there's now no need to return the superblock pointer. This patch permits a superblock to be implicitly shared amongst several mount points, such as can be done with NFS to avoid potential inode aliasing. In such a case, simple_set_mnt() would not be called, and instead the mnt_root and mnt_sb would be set directly. The patch also makes the following changes: (*) the get_sb_*() convenience functions in the core kernel now take a vfsmount pointer argument and return an integer, so most filesystems have to change very little. (*) If one of the convenience function is not used, then get_sb() should normally call simple_set_mnt() to instantiate the vfsmount. This will always return 0, and so can be tail-called from get_sb(). (*) generic_shutdown_super() now calls shrink_dcache_sb() to clean up the dcache upon superblock destruction rather than shrink_dcache_anon(). This is required because the superblock may now have multiple trees that aren't actually bound to s_root, but that still need to be cleaned up. The currently called functions assume that the whole tree is rooted at s_root, and that anonymous dentries are not the roots of trees which results in dentries being left unculled. However, with the way NFS superblock sharing are currently set to be implemented, these assumptions are violated: the root of the filesystem is simply a dummy dentry and inode (the real inode for '/' may well be inaccessible), and all the vfsmounts are rooted on anonymous[*] dentries with child trees. [*] Anonymous until discovered from another tree. (*) The documentation has been adjusted, including the additional bit of changing ext2_* into foo_* in the documentation. [akpm@osdl.org: convert ipath_fs, do other stuff] Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Cc: Nathan Scott <nathans@sgi.com> Cc: Roland Dreier <rolandd@cisco.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-22Merge branch 'devel' of master.kernel.org:/home/rmk/linux-2.6-armLinus Torvalds
* 'devel' of master.kernel.org:/home/rmk/linux-2.6-arm: (21 commits) [ARM] 3629/1: S3C24XX: fix missing bracket in regs-dsc.h [ARM] 3537/1: Rework DMA-bounce locking for finer granularity [ARM] 3601/1: i.MX/MX1 DMA error handling for signaled channels only [ARM] 3597/1: ixp4xx/nslu2: Board support for new LED subsystem [ARM] 3595/1: ixp4xx/nas100d: Board support for new LED subsystem [ARM] 3626/1: ARM EABI: fix syscall restarting [ARM] 3628/1: S3C24XX: add get_rate call to struct clk [ARM] 3627/1: S3C24XX: split s3c2410 clocks from core clocks [ARM] 3613/1: S3C2410: Add sysdev and sysclass [ARM] 3624/1: Report true modem control line states [ARM] 3620/2: ixp23xx: add uengine loader support [ARM] 3618/1: add defconfig for logicpd pxa270 card engine [ARM] 3617/1: ep93xx: fix slightly incorrect timer tick rate [ARM] 3616/1: fix timer handler wrap logic for a number of platforms [ARM] 3615/1: ixp23xx: use platform devices for physmap flash [ARM] 3614/1: ep93xx: use platform devices for physmap flash [ARM] 3621/1: fix compilation breakage for pnx4008 [ARM] 3623/1: pnx4008: move GPIO-related defines to gpio.h [ARM] 3622/1: pnx4008: remove clk_use/clk_unuse [ARM] Enable VFP to be built when non-VFP capable CPUs are selected ...
2006-06-22Merge branch 'upstream-linus' of ↵Linus Torvalds
master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6 * 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6: (33 commits) [PATCH] myri10ge - drop workaround pci_save_state() disabling MSI [PATCH] myri10ge - drop workaround for the missing AER ext cap on nVidia CK804 via-velocity: the link is not correctly detected when the device starts [PATCH] add b44 to maintainers [PATCH] WAN: ioremap() failure checks in drivers [PATCH] WAN: register_hdlc_device() doesn't need dev_alloc_name() [PATCH] skb_padto()-area fixes in 8390, wavelan [PATCH] make drivers/net/forcedeth.c:nv_update_pause() static [PATCH] network driver for Hilscher netx [PATCH] Dereference in tokenring/olympic.c [PATCH] Array overrun in drivers/net/wireless/wavelan.c [PATCH] Remove useless check in drivers/net/pcmcia/xirc2ps_cs.c [PATCH] 8139cp: add ethtool eeprom support [PATCH] 8139cp: fix eeprom read command length [PATCH] b44: update b44 Kconfig entry [PATCH] b44: update version to 1.01 [PATCH] b44: add wol for old nic [PATCH] b44: add parameter [PATCH] b44: add wol [PATCH] b44: fix manual speed/duplex/autoneg settings ...
2006-06-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpcLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (139 commits) [POWERPC] re-enable OProfile for iSeries, using timer interrupt [POWERPC] support ibm,extended-*-frequency properties [POWERPC] Extra sanity check in EEH code [POWERPC] Dont look for class-code in pci children [POWERPC] Fix mdelay badness on shared processor partitions [POWERPC] disable floating point exceptions for init [POWERPC] Unify ppc syscall tables [POWERPC] mpic: add support for serial mode interrupts [POWERPC] pseries: Print PCI slot location code on failure [POWERPC] spufs: one more fix for 64k pages [POWERPC] spufs: fail spu_create with invalid flags [POWERPC] spufs: clear class2 interrupt status before wakeup [POWERPC] spufs: fix Makefile for "make clean" [POWERPC] spufs: remove stop_code from struct spu [POWERPC] spufs: fix spu irq affinity setting [POWERPC] spufs: further abstract priv1 register access [POWERPC] spufs: split the Cell BE support into generic and platform dependant parts [POWERPC] spufs: dont try to access SPE channel 1 count [POWERPC] spufs: use kzalloc in create_spu [POWERPC] spufs: fix initial state of wbox file ... Manually resolved conflicts in: drivers/net/phy/Makefile include/asm-powerpc/spu.h
2006-06-22[PATCH] WAN: register_hdlc_device() doesn't need dev_alloc_name()Krzysztof Halasa
David Boggs noticed that register_hdlc_device() no longer needs to call dev_alloc_name() as it's called by register_netdev(). register_hdlc_device() is currently equivalent to register_netdev(). hdlc_setup() is now EXPORTed as per David's request. Signed-off-by: Krzysztof Halasa <khc@pm.waw.pl> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-06-22[PATCH] network driver for Hilscher netxSascha Hauer
This is a patch for the Hilscher netx builtin ethernet ports. The netx board support was merged into 2.6.17-git2. The netx is a arm926 based SoC. Signed-off-by: Robert Schwebel <r.schwebel@pengutronix.de> Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> -- drivers/net/Kconfig | 11 drivers/net/Makefile | 1 drivers/net/netx-eth.c | 516 ++++++++++++++++++++++++++++++++++++++++ include/asm-arm/arch-netx/eth.h | 27 ++ 4 files changed, 555 insertions(+) Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-06-22Merge branch 'master' into upstreamJeff Garzik
2006-06-22Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/i2c-2.6Linus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/i2c-2.6: (44 commits) [PATCH] I2C: I2C controllers go into right place on sysfs [PATCH] hwmon-vid: Add support for Intel Core and Conroe [PATCH] lm70: New hardware monitoring driver [PATCH] hwmon: Fix the Kconfig header [PATCH] i2c-i801: Merge setup function [PATCH] i2c-i801: Better pci subsystem integration [PATCH] i2c-i801: Cleanups [PATCH] i2c-i801: Remove PCI function check [PATCH] i2c-i801: Remove force_addr parameter [PATCH] i2c-i801: Fix block transaction poll loops [PATCH] scx200_acb: Documentation update [PATCH] scx200_acb: Mark scx200_acb_probe __init [PATCH] scx200_acb: Use PCI I/O resource when appropriate [PATCH] i2c: Mark block write buffers as const [PATCH] i2c-ocores: Minor cleanups [PATCH] abituguru: Fix fan detection [PATCH] abituguru: Review fixes [PATCH] abituguru: New hardware monitoring driver [PATCH] w83792d: Add missing data access locks [PATCH] w83792d: Fix setting the PWM value ...
2006-06-22Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/w1-2.6Linus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/w1-2.6: [PATCH] w1: warning fix [PATCH] w1: clean up W1_CON dependency. [PATCH] drivers/w1/w1.c: fix a compile error [PATCH] W1: fix dependencies of W1_SLAVE_DS2433_CRC [PATCH] W1: possible cleanups [PATCH] W1: cleanups [PATCH] w1 exports [PATCH] w1: Use mutexes instead of semaphores. [PATCH] w1: Make w1 connector notifications depend on connector. [PATCH] w1: netlink: Mark netlink group 1 as unused. [PATCH] w1: Move w1-connector definitions into linux/include/connector.h [PATCH] w1: Userspace communication protocol over connector. [PATCH] w1: Replace dscore and ds_w1_bridge with ds2490 driver. [PATCH] w1: Added default generic read/write operations.
2006-06-22Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6Linus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6: (27 commits) [PATCH] PCI: nVidia quirk to make AER PCI-E extended capability visible [PATCH] PCI: fix issues with extended conf space when MMCONFIG disabled because of e820 [PATCH] PCI: Bus Parity Status sysfs interface [PATCH] PCI: fix memory leak in MMCONFIG error path [PATCH] PCI: fix error with pci_get_device() call in the mpc85xx driver [PATCH] PCI: MSI-K8T-Neo2-Fir: run only where needed [PATCH] PCI: fix race with pci_walk_bus and pci_destroy_dev [PATCH] PCI: clean up pci documentation to be more specific [PATCH] PCI: remove unneeded msi code [PATCH] PCI: don't move ioapics below PCI bridge [PATCH] PCI: cleanup unused variable about msi driver [PATCH] PCI: disable msi mode in pci_disable_device [PATCH] PCI: Allow MSI to work on kexec kernel [PATCH] PCI: AMD 8131 MSI quirk called too late, bus_flags not inherited ? [PATCH] PCI: Move various PCI IDs to header file [PATCH] PCI Bus Parity Status-broken hardware attribute, EDAC foundation [PATCH] PCI: i386/x86_84: disable PCI resource decode on device disable [PATCH] PCI ACPI: Rename the functions to avoid multiple instances. [PATCH] PCI: don't enable device if already enabled [PATCH] PCI: Add a "enable" sysfs attribute to the pci devices to allow userspace (Xorg) to enable devices without doing foul direct access ...
2006-06-22[PATCH] zlib_inflate: Upgrade library code to a recent versionRichard Purdie
Upgrade the zlib_inflate implementation in the kernel from a patched version 1.1.3/4 to a patched 1.2.3. The code in the kernel is about seven years old and I noticed that the external zlib library's inflate performance was significantly faster (~50%) than the code in the kernel on ARM (and faster again on x86_32). For comparison the newer deflate code is 20% slower on ARM and 50% slower on x86_32 but gives an approx 1% compression ratio improvement. I don't consider this to be an improvement for kernel use so have no plans to change the zlib_deflate code. Various changes have been made to the zlib code in the kernel, the most significant being the extra functions/flush option used by ppp_deflate. This update reimplements the features PPP needs to ensure it continues to work. This code has been tested on ARM under both JFFS2 (with zlib compression enabled) and ppp_deflate and on x86_32. JFFS2 sees an approx. 10% real world file read speed improvement. This patch also removes ZLIB_VERSION as it no longer has a correct value. We don't need version checks anyway as the kernel's module handling will take care of that for us. This removal is also more in keeping with the zlib author's wishes (http://www.zlib.net/zlib_faq.html#faq24) and I've added something to the zlib.h header to note its a modified version. Signed-off-by: Richard Purdie <rpurdie@rpsys.net> Acked-by: Joern Engel <joern@wh.fh-wedel.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-22[PATCH] vgacon: make VGA_MAP_MEM take size, remove extra useBjorn Helgaas
VGA_MAP_MEM translates to ioremap() on some architectures. It makes sense to do this to vga_vram_base, because we're going to access memory between vga_vram_base and vga_vram_end. But it doesn't really make sense to map starting at vga_vram_end, because we aren't going to access memory starting there. On ia64, which always has to be different, ioremapping vga_vram_end gives you something completely incompatible with ioremapped vga_vram_start, so vga_vram_size ends up being nonsense. As a bonus, we often know the size up front, so we can use ioremap() correctly, rather than giving it a zero size. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Cc: "Antonino A. Daplas" <adaplas@pol.net> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-22[PATCH] Fix dcache race during umountNeilBrown
The race is that the shrink_dcache_memory shrinker could get called while a filesystem is being unmounted, and could try to prune a dentry belonging to that filesystem. If it does, then it will call in to iput on the inode while the dentry is no longer able to be found by the umounting process. If iput takes a while, generic_shutdown_super could get all the way though shrink_dcache_parent and shrink_dcache_anon and invalidate_inodes without ever waiting on this particular inode. Eventually the superblock gets freed anyway and if the iput tried to touch it (which some filesystems certainly do), it will lose. The promised "Self-destruct in 5 seconds" doesn't lead to a nice day. The race is closed by holding s_umount while calling prune_one_dentry on someone else's dentry. As a down_read_trylock is used, shrink_dcache_memory will no longer try to prune the dentry of a filesystem that is being unmounted, and unmount will not be able to start until any such active prune_one_dentry completes. This requires that prune_dcache *knows* which filesystem (if any) it is doing the prune on behalf of so that it can be careful of other filesystems. shrink_dcache_memory isn't called it on behalf of any filesystem, and so is careful of everything. shrink_dcache_anon is now passed a super_block rather than the s_anon list out of the superblock, so it can get the s_anon list itself, and can pass the superblock down to prune_dcache. If prune_dcache finds a dentry that it cannot free, it leaves it where it is (at the tail of the list) and exits, on the assumption that some other thread will be removing that dentry soon. To try to make sure that some work gets done, a limited number of dnetries which are untouchable are skipped over while choosing the dentry to work on. I believe this race was first found by Kirill Korotaev. Cc: Jan Blunck <jblunck@suse.de> Acked-by: Kirill Korotaev <dev@openvz.org> Cc: Olaf Hering <olh@suse.de> Acked-by: Balbir Singh <balbir@in.ibm.com> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Balbir Singh <balbir@in.ibm.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-22[PATCH] remove steal_locks()Miklos Szeredi
This patch removes the steal_locks() function. steal_locks() doesn't work correctly with any filesystem that does it's own lock management, including NFS, CIFS, etc. In addition it has weird semantics on local filesystems in case tasks sharing file-descriptor tables are doing POSIX locking operations in parallel to execve(). The steal_locks() function has an effect on applications doing: clone(CLONE_FILES) /* in child */ lock execve lock POSIX locks acquired before execve (by "child", "parent" or any further task sharing files_struct) will after the execve be owned exclusively by "child". According to Chris Wright some LSB/LTP kind of suite triggers without the stealing behavior, but there's no known real-world application that would also fail. Apps using NPTL are not affected, since all other threads are killed before execve. Apps using LinuxThreads are only affected if they - have multiple threads during exec (LinuxThreads doesn't kill other threads, the app may do it with pthread_kill_other_threads_np()) - rely on POSIX locks being inherited across exec Both conditions are documented, but not their interaction. Apps using clone() natively are affected if they - use clone(CLONE_FILES) - rely on POSIX locks being inherited across exec The above scenarios are unlikely, but possible. If the patch is vetoed, there's a plan B, that involves mostly keeping the weird stealing semantics, but changing the way lock ownership is handled so that network and local filesystems work consistently. That would add more complexity though, so this solution seems to be preferred by most people. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Matthew Wilcox <willy@debian.org> Cc: Chris Wright <chrisw@sous-sol.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Steven French <sfrench@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-22[PATCH] PCI: Add PCI_CAP_ID_VNDRBrice Goglin
Add the vendor-specific extended capability PCI_CAP_ID_VNDR. It is required by the Myri-10G Ethernet driver. Signed-off-by: Brice Goglin <brice@myri.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Cc: Jeff Garzik <jeff@garzik.org> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-22[PATCH] Keys: Fix race between two instantiators of a keyDavid Howells
Add a revocation notification method to the key type and calls it whilst the key's semaphore is still write-locked after setting the revocation flag. The patch then uses this to maintain a reference on the task_struct of the process that calls request_key() for as long as the authorisation key remains unrevoked. This fixes a potential race between two processes both of which have assumed the authority to instantiate a key (one may have forked the other for example). The problem is that there's no locking around the check for revocation of the auth key and the use of the task_struct it points to, nor does the auth key keep a reference on the task_struct. Access to the "context" pointer in the auth key must thenceforth be done with the auth key semaphore held. The revocation method is called with the target key semaphore held write-locked and the search of the context process's keyrings is done with the auth key semaphore read-locked. The check for the revocation state of the auth key just prior to searching it is done after the auth key is read-locked for the search. This ensures that the auth key can't be revoked between the check and the search. The revocation notification method is added so that the context task_struct can be released as soon as instantiation happens rather than waiting for the auth key to be destroyed, thus avoiding the unnecessary pinning of the requesting process. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-22[PATCH] selinux: add hooks for key subsystemMichael LeMay
Introduce SELinux hooks to support the access key retention subsystem within the kernel. Incorporate new flask headers from a modified version of the SELinux reference policy, with support for the new security class representing retained keys. Extend the "key_alloc" security hook with a task parameter representing the intended ownership context for the key being allocated. Attach security information to root's default keyrings within the SELinux initialization routine. Has passed David's testsuite. Signed-off-by: Michael LeMay <mdlemay@epoch.ncsc.mil> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <jmorris@namei.org> Acked-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-22[ARM] 3629/1: S3C24XX: fix missing bracket in regs-dsc.hBen Dooks
Patch from Ben Dooks Fix missing bracket in include/asm-arm/arch-s3c2410/regs-dsc.h Signed-off-by: Ben Dooks <ben-linux@fluff.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2006-06-22[ARM] 3601/1: i.MX/MX1 DMA error handling for signaled channels onlyPavel Pisa
Patch from Pavel Pisa There has been bug, that dma_err_handler() touches even channels not signaling error condition. Problem noticed by Andrea Paterniani. Signed-off-by: Pavel Pisa <pisa@cmp.felk.cvut.cz> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2006-06-22[ALSA] version 1.0.12rc1Jaroslav Kysela
2006-06-22[ALSA] Disable AC97 AUX and VIDEO controls for WM9705 touchscreenRodolfo Giometti
This patch by Rodolfo Giometti disables the AC97 AUX and VIDEO controls on the WM9705 when the touchscreen is selected as the AUX and VIDEO lines are shared with the touch controller. Changes:- o Added AC97_HAS_NO_AUX flag o Test for AC97_HAS_NO_AUX flag in snd_ac97_mixer_build() o Sets AC97_HAS_NO_VIDEO and AC97_HAS_NO_AUX in patch_wolfson05() when WM9705 touch driver is selected. Signed-off-by: Rodolfo Giometti <giometti@linux.it> Signed-off-by: Liam Girdwood <liam.girdwood@wolfsonmicro.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2006-06-22[ALSA] Change an arugment of snd_mpu401_uart_new() to bit flagsTakashi Iwai
Change the 5th argument of snd_mpu401_uart_new() to bit flags instead of a boolean. The argument takes bits that consist of MPU401_INFO_XXX flags. The callers that used the value 1 there are replaced with MPU401_INFO_INTEGRATED. Signed-off-by: Takashi Iwai <tiwai@suse.de>
2006-06-22[ALSA] Fix rwlock around snd_iprintf() in sound coreTakashi Iwai
Fixed rwlock around snd_iprintf() in sound core part. Replaced with mutex. Also, make mutex and flags static variables with addition of snd_card_locked() function (just for sound.c). Signed-off-by: Takashi Iwai <tiwai@suse.de>
2006-06-22[ALSA] rawmidi: add get_port_info callback for sequencer information flagsClemens Ladisch
Add a get_port_info callback to the snd_rawmidi_global_ops structure to allow the USB MIDI driver to supply information flags for the sequencer ports created by seq_midi. Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
2006-06-22[ALSA] add more sequencer port type information bitsClemens Ladisch
Add four new information flags SNDRV_SEQ_PORT_TYPE_HARDWARE, _SOFTWARE, _SYNTHESIZER, _PORT for sequencer ports. This makes it easier for apps like Rosegarden to make policy decisions based on the port type. Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
2006-06-22[ALSA] Fix mmap_count with O_APPEND opened streamsTakashi Iwai
Move mmap_count to snd_pcm_substream instead of runtime struct so that multiplly opened substreams via O_APPEND can be handled correctly. Signed-off-by: Takashi Iwai <tiwai@suse.de>
2006-06-22[ALSA] Add O_APPEND flag support to PCMTakashi Iwai
Added O_APPEND flag support to PCM to enable shared substreams among multiple processes. This mechanism is used by dmix and dsnoop plugins. Signed-off-by: Takashi Iwai <tiwai@suse.de>