summaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)Author
2013-02-11nfsd4: free_stid can be staticFengguang Wu
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
2013-02-08nfsd: keep a checksum of the first 256 bytes of requestJeff Layton
Now that we're allowing more DRC entries, it becomes a lot easier to hit problems with XID collisions. In order to mitigate those, calculate a checksum of up to the first 256 bytes of each request coming in and store that in the cache entry, along with the total length of the request. This initially used crc32, but Chuck Lever and Jim Rees pointed out that crc32 is probably more heavyweight than we really need for generating these checksums, and recommended looking at using the same routines that are used to generate checksums for IP packets. On an x86_64 KVM guest measurements with ftrace showed ~800ns to use csum_partial vs ~1750ns for crc32. The difference probably isn't terribly significant, but for now we may as well use csum_partial. Signed-off-by: Jeff Layton <jlayton@redhat.com> Stones-thrown-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-02-08sunrpc: trim off trailing checksum before returning decrypted or integrity ↵Jeff Layton
authenticated buffer When GSSAPI integrity signatures are in use, or when we're using GSSAPI privacy with the v2 token format, there is a trailing checksum on the xdr_buf that is returned. It's checked during the authentication stage, and afterward nothing cares about it. Ordinarily, it's not a problem since the XDR code generally ignores it, but it will be when we try to compute a checksum over the buffer to help prevent XID collisions in the duplicate reply cache. Fix the code to trim off the checksums after verifying them. Note that in unwrap_integ_data, we must avoid trying to reverify the checksum if the request was deferred since it will no longer be present when it's revisited. Signed-off-by: Jeff Layton <jlayton@redhat.com>
2013-02-05sunrpc: fix comment in struct xdr_buf definitionJeff Layton
...these pages aren't necessarily contiguous. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-02-05sunrpc: move address copy/cmp/convert routines and prototypes from clnt.h to ↵Jeff Layton
addr.h These routines are used by server and client code, so having them in a separate header would be best. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-02-05sunrpc: copy scope ID in __rpc_copy_addr6Jeff Layton
When copying an address, we should also copy the scopeid in the event that this is a link-local address and the scope matters. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-02-05nfsd4: simplify idr allocationJ. Bruce Fields
We don't really need to preallocate at all; just allocate and initialize everything at once, but leave the sc_type field initially 0 to prevent finding the stateid till it's fully initialized. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-02-05nfsd: Fix memleakmajianpeng
When free nfs-client, it must free the ->cl_stateids. Cc: stable@kernel.org Signed-off-by: Jianpeng Ma <majianpeng@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-02-04nfsd: register a shrinker for DRC cache entriesJeff Layton
Since we dynamically allocate them now, allow the system to call us up to release them if it gets low on memory. Since these entries aren't replaceable, only free ones that are expired or that are over the cap. The the seeks value is set to '1' however to indicate that freeing the these entries is low-cost. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-02-04nfsd: add recurring workqueue job to clean the cacheJeff Layton
It's not sufficient to only clean the cache when requests come in. What if we have a flurry of activity and then the server goes idle? Add a workqueue job that will clean the cache every RC_EXPIRE period. Care is taken to only run this when we expect to have entries expiring. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-02-04nfsd: when updating an entry with RC_NOCACHE, just free itJeff Layton
There's no need to keep entries around that we're declaring RC_NOCACHE. Ditto if there's a problem with the entry. With this change too, there's no need to test for RC_UNUSED in the search function. If the entry's in the hash table then it's either INPROG or DONE. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-02-04nfsd: remove the cache_disabled flagJeff Layton
With the change to dynamically allocate entries, the cache is never disabled on the fly. Remove this flag. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-02-04nfsd: dynamically allocate DRC entriesJeff Layton
The existing code keeps a fixed-size cache of 1024 entries. This is much too small for a busy server, and wastes memory on an idle one. This patch changes the code to dynamically allocate and free these cache entries. A cap on the number of entries is retained, but it's much larger than the existing value and now scales with the amount of low memory in the machine. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-02-04nfsd: track the number of DRC entries in the cacheJeff Layton
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-02-04nfsd: always move DRC entries to the end of LRU list when updating timestampJeff Layton
...otherwise, we end up with the list ordering wrong. Currently, it's not a problem since we skip RC_INPROG entries, but keeping the ordering strict will be necessary for a later patch that adds a cache cleaner. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-02-04nfsd: initialize the exp->ex_uuid field in svc_export_initJeff Layton
commit 885c91f7466 in Bruce's tree was causing oopses for me: general protection fault: 0000 [#1] SMP Modules linked in: nfsd(OF) nfs_acl(OF) auth_rpcgss(OF) lockd(OF) sunrpc(OF) kvm_amd kvm microcode i2c_piix4 virtio_net virtio_balloon cirrus drm_kms_helper ttm drm virtio_blk i2c_core CPU 0 Pid: 564, comm: exportfs Tainted: GF O 3.8.0-0.rc5.git2.1.fc19.x86_64 #1 Bochs Bochs RIP: 0010:[<ffffffff811b1509>] [<ffffffff811b1509>] kfree+0x49/0x280 RSP: 0018:ffff88007a3d7c50 EFLAGS: 00010203 RAX: 01adaf8dadadad80 RBX: 6b6b6b6b6b6b6b6b RCX: 0000000000000001 RDX: ffffffff7fffffff RSI: 0000000000000000 RDI: 6b6b6b6b6b6b6b6b RBP: ffff88007a3d7c80 R08: 6b6b6b6b6b6b6b6b R09: 0000000000000000 R10: 0000000000000018 R11: 0000000000000000 R12: ffff88006a117b50 R13: ffffffffa01a589c R14: ffff8800631b0f50 R15: 01ad998dadadad80 FS: 00007fcaa3616740(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00007f5d84b6fdd8 CR3: 0000000064db4000 CR4: 00000000000006f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process exportfs (pid: 564, threadinfo ffff88007a3d6000, task ffff88006af28000) Stack: ffff88007a3d7c80 ffff88006a117b68 ffff88006a117b50 0000000000000000 ffff8800631b0f50 ffff88006a117b50 ffff88007a3d7ca0 ffffffffa01a589c ffff880036be1148 ffff88007a3d7cf8 ffff88007a3d7e28 ffffffffa01a6a98 Call Trace: [<ffffffffa01a589c>] svc_export_put+0x5c/0x70 [nfsd] [<ffffffffa01a6a98>] svc_export_parse+0x328/0x7e0 [nfsd] [<ffffffffa016f1c7>] cache_do_downcall+0x57/0x70 [sunrpc] [<ffffffffa016f25e>] cache_downcall+0x7e/0x100 [sunrpc] [<ffffffffa016f338>] cache_write_procfs+0x58/0x90 [sunrpc] [<ffffffffa016f2e0>] ? cache_downcall+0x100/0x100 [sunrpc] [<ffffffff8123b0e5>] proc_reg_write+0x75/0xb0 [<ffffffff811ccecf>] vfs_write+0x9f/0x170 [<ffffffff811cd089>] sys_write+0x49/0xa0 [<ffffffff816e0919>] system_call_fastpath+0x16/0x1b Code: 66 66 66 90 48 83 fb 10 0f 86 c3 00 00 00 48 89 df 49 bf 00 00 00 00 00 ea ff ff e8 f2 12 ea ff 48 c1 e8 0c 48 c1 e0 06 49 01 c7 <49> 8b 07 f6 c4 80 0f 85 1d 02 00 00 49 8b 07 a8 80 0f 84 ee 01 RIP [<ffffffff811b1509>] kfree+0x49/0x280 RSP <ffff88007a3d7c50> I think Majianpeng's patch is correct, but incomplete. In order for it to be safe to free the ex_uuid unconditionally in svc_export_put, we need to make sure it's initialized to NULL in the init routine. Cc: majianpeng <majianpeng@gmail.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-02-04nfsd: break out hashtable search into separate functionJeff Layton
Later, we'll need more than one call site for this, so break it out into a new function. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-02-04nfsd: clean up and clarify the cache expiration codeJeff Layton
Add a preprocessor constant for the expiry time of cache entries, and move the test for an expired entry into a function. Note that the current code does not test for RC_INPROG. It just assumes that it won't take more than 2 minutes to fill out an in-progress entry. I'm not sure how valid that assumption is though, so let's just ensure that we never consider an RC_INPROG entry to be expired. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-02-04nfsd: remove redundant test from nfsd_reply_cache_freeJeff Layton
Entries can only get a c_type of RC_REPLBUFF iff they are RC_DONE. Therefore the test for RC_DONE isn't necessary here. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-02-04nfsd: add alloc and free functions for DRC entriesJeff Layton
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-02-04nfsd: create a dedicated slabcache for DRC entriesJeff Layton
Currently we use kmalloc() which wastes a little bit of memory on each allocation since it's a power of 2 allocator. Since we're allocating a 1024 of these now, and may need even more later, let's create a new slabcache for them. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-02-04nfsd: get rid of RC_INTRJeff Layton
The reply cache code never returns this status. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-02-04nfsd: remove unneeded spinlock in nfsd_cache_updateJeff Layton
The locking rules for cache entries say that locking the cache_lock isn't needed if you're just touching the current entry. Earlier in this function we set rp->c_state to RC_UNUSED without any locking, so I believe it's ok to do the same here. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-02-04nfsd: fix IPv6 address handling in the DRCJeff Layton
Currently, it only stores the first 16 bytes of any address. struct sockaddr_in6 is 28 bytes however, so we're currently ignoring the last 12 bytes of the address. Expand the c_addr field to a sockaddr_in6, and cast it to a sockaddr_in as necessary. Also fix the comparitor to use the existing RPC helpers for this. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-01-29nfsd: Fix memleak in svc_export_putmajianpeng
In func svc_export_parse, the uuid which used kmemdup to alloc will be changed in func export_update.So the later kfree don't free this memory. And it can't be free in func svc_export_parse because other place still used.So put this operation in func svc_export_put. Signed-off-by: Jianpeng Ma <majianpeng@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-01-23nfsd4: require version 4 when enabling or disabling minorversionJ. Bruce Fields
The current code will allow silly things like: echo "+2 +3 +4 +7.1">/proc/fs/nfsd/versions Reported-by: Fan Chaoting <fanchaoting@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-01-23nfsd: fix unused "nn" variable warning in free_client()Stanislav Kinsbursky
If CONFIG_LOCKDEP is disabled, then there would be a warning like this: CC [M] fs/nfsd/nfs4state.o fs/nfsd/nfs4state.c: In function ‘free_client’: fs/nfsd/nfs4state.c:1051:19: warning: unused variable ‘nn’ [-Wunused-variable] So, let's add "maybe_unused" tag to this variable. Reported-by: Toralf Förster <toralf.foerster@gmx.de> Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-01-23sunrpc: Fix lockd sleeping until timeoutAndriy Skulysh
There is a race in enqueueing thread to a pool and waking up a thread. lockd doesn't wake up on reception of lock granted callback if svc_wake_up() is called before lockd's thread is added to a pool. Signed-off-by: Andriy Skulysh <Andriy_Skulysh@xyratex.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-01-23svcrpc: silence "unused variable" warning in !RPC_DEBUG caseJ. Bruce Fields
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-01-23nfsd: Remove write permission from file contentYanchuan Nian
The write function doesn't be implemented in file content, and it's meaningless to write data into this file directly. Remove write permission from it. Signed-off-by: Yanchuan Nian <ycnian@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-01-23nfsd: Don't unlock the state while it's not lockedYanchuan Nian
In the procedure of CREATE_SESSION, the state is locked after alloc_conn_from_crses(). If the allocation fails, the function goes to "out_free_session", and then "out" where there is an unlock function. Signed-off-by: Yanchuan Nian <ycnian@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-01-23nfsd: Pass correct slot number to nfsd4_put_drc_mem()Yanchuan Nian
In alloc_session(), numslots is the correct slot number used by the session. But the slot number passed to nfsd4_put_drc_mem() is the one from nfs client. Signed-off-by: Yanchuan Nian <ycnian@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-01-23nfsd4: simplify nfsd4_encode_fattr interface slightlyJ. Bruce Fields
It seems slightly simpler to make nfsd4_encode_fattr rather than its callers responsible for advancing the write pointer on success. (Also: the count == 0 check in the verify case looks superfluous. Running out of buffer space is really the only reason fattr encoding should fail with eresource.) Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-01-02Linux 3.8-rc2v3.8-rc2Linus Torvalds
2013-01-02Merge branch 'fixes-for-3.8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/linux-leds Pull LED fix from Bryan Wu. * 'fixes-for-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/linux-leds: leds: leds-gpio: set devm_gpio_request_one() flags param correctly
2013-01-02leds: leds-gpio: set devm_gpio_request_one() flags param correctlyJavier Martinez Canillas
commit a99d76f leds: leds-gpio: use gpio_request_one changed the leds-gpio driver to use gpio_request_one() instead of gpio_request() + gpio_direction_output() Unfortunately, it also made a semantic change that breaks the leds-gpio driver. The gpio_request_one() flags parameter was set to: GPIOF_DIR_OUT | (led_dat->active_low ^ state) Since GPIOF_DIR_OUT is 0, the final flags value will just be the XOR'ed value of led_dat->active_low and state. This value were used to distinguish between HIGH/LOW output initial level and call gpio_direction_output() accordingly. With this new semantic gpio_request_one() will take the flags value of 1 as a configuration of input direction (GPIOF_DIR_IN) and will call gpio_direction_input() instead of gpio_direction_output(). int gpio_request_one(unsigned gpio, unsigned long flags, const char *label) { .. if (flags & GPIOF_DIR_IN) err = gpio_direction_input(gpio); else err = gpio_direction_output(gpio, (flags & GPIOF_INIT_HIGH) ? 1 : 0); .. } The right semantic is to evaluate led_dat->active_low ^ state and set the output initial level explicitly. Signed-off-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk> Reported-by: Arnaud Patard <arnaud.patard@rtp-net.org> Tested-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com> Signed-off-by: Bryan Wu <cooloney@gmail.com>
2013-01-02Merge git://www.linux-watchdog.org/linux-watchdogLinus Torvalds
Pull watchdog fixes from Wim Van Sebroeck: "This fixes some small errors in the new da9055 driver, eliminates a compiler warning and adds DT support for the twl4030_wdt driver (so that we can have multiple watchdogs with DT on the omap platforms)." * git://www.linux-watchdog.org/linux-watchdog: watchdog: twl4030_wdt: add DT support watchdog: omap_wdt: eliminate unused variable and a compiler warning watchdog: da9055: Don't update wdt_dev->timeout in da9055_wdt_set_timeout error path watchdog: da9055: Fix invalid free of devm_ allocated data
2013-01-02Merge tag '3.8-pci-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI updates from Bjorn Helgaas: "Some fixes for v3.8. They include a fix for the new SR-IOV sysfs management support, an expanded quirk for Ricoh SD card readers, a Stratus DMI quirk fix, and a PME polling fix." * tag '3.8-pci-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: Reduce Ricoh 0xe822 SD card reader base clock frequency to 50MHz PCI/PM: Do not suspend port if any subordinate device needs PME polling PCI: Add PCIe Link Capability link speed and width names PCI: Work around Stratus ftServer broken PCIe hierarchy (fix DMI check) PCI: Remove spurious error for sriov_numvfs store and simplify flow
2013-01-02UAPI: Strip _UAPI prefix on header install no matter the whitespaceDavid Howells
Commit 56c176c9cac9 ("UAPI: strip the _UAPI prefix from header guards during header installation") strips the _UAPI prefix from header guards, but only if there's a single space between the cpp directive and the label. Make it more flexible and able to handle tabs and multiple white space characters. Signed-off-by: David Howells <dhowell@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-01-02UAPI: Remove empty Kbuild filesDavid Howells
Empty files can get deleted by the patch program, so remove empty Kbuild files and their links from the parent Kbuilds. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-01-02Merge tag 'ecryptfs-3.8-rc2-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs Pull ecryptfs fixes from Tyler Hicks: "Two self-explanatory fixes and a third patch which improves performance: when overwriting a full page in the eCryptfs page cache, skip reading in and decrypting the corresponding lower page." * tag 'ecryptfs-3.8-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs: fs/ecryptfs/crypto.c: make ecryptfs_encode_for_filename() static eCryptfs: fix to use list_for_each_entry_safe() when delete items eCryptfs: Avoid unnecessary disk read and data decryption during writing
2013-01-02Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull Ceph fixes from Sage Weil: "Two of Alex's patches deal with a race when reseting server connections for open RBD images, one demotes some non-fatal BUGs to WARNs, and my patch fixes a protocol feature bit failure path." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: libceph: fix protocol feature mismatch failure path libceph: WARN, don't BUG on unexpected connection states libceph: always reset osds when kicking libceph: move linger requests sooner in kick_requests()
2013-01-02mm: mempolicy: Convert shared_policy mutex to spinlockMel Gorman
Sasha was fuzzing with trinity and reported the following problem: BUG: sleeping function called from invalid context at kernel/mutex.c:269 in_atomic(): 1, irqs_disabled(): 0, pid: 6361, name: trinity-main 2 locks held by trinity-main/6361: #0: (&mm->mmap_sem){++++++}, at: [<ffffffff810aa314>] __do_page_fault+0x1e4/0x4f0 #1: (&(&mm->page_table_lock)->rlock){+.+...}, at: [<ffffffff8122f017>] handle_pte_fault+0x3f7/0x6a0 Pid: 6361, comm: trinity-main Tainted: G W 3.7.0-rc2-next-20121024-sasha-00001-gd95ef01-dirty #74 Call Trace: __might_sleep+0x1c3/0x1e0 mutex_lock_nested+0x29/0x50 mpol_shared_policy_lookup+0x2e/0x90 shmem_get_policy+0x2e/0x30 get_vma_policy+0x5a/0xa0 mpol_misplaced+0x41/0x1d0 handle_pte_fault+0x465/0x6a0 This was triggered by a different version of automatic NUMA balancing but in theory the current version is vunerable to the same problem. do_numa_page -> numa_migrate_prep -> mpol_misplaced -> get_vma_policy -> shmem_get_policy It's very unlikely this will happen as shared pages are not marked pte_numa -- see the page_mapcount() check in change_pte_range() -- but it is possible. To address this, this patch restores sp->lock as originally implemented by Kosaki Motohiro. In the path where get_vma_policy() is called, it should not be calling sp_alloc() so it is not necessary to treat the PTL specially. Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Tested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Mel Gorman <mgorman@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-01-02Merge tag 'ext4_for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 bug fixes from Ted Ts'o: "Various bug fixes for ext4. Perhaps the most serious bug fixed is one which could cause file system corruptions when performing file punch operations." * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: avoid hang when mounting non-journal filesystems with orphan list ext4: lock i_mutex when truncating orphan inodes ext4: do not try to write superblock on ro remount w/o journal ext4: include journal blocks in df overhead calcs ext4: remove unaligned AIO warning printk ext4: fix an incorrect comment about i_mutex ext4: fix deadlock in journal_unmap_buffer() ext4: split off ext4_journalled_invalidatepage() jbd2: fix assertion failure in jbd2_journal_flush() ext4: check dioread_nolock on remount ext4: fix extent tree corruption caused by hole punch
2013-01-02mempolicy: remove arg from mpol_parse_str, mpol_to_strHugh Dickins
Remove the unused argument (formerly no_context) from mpol_parse_str() and from mpol_to_str(). Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-01-02tmpfs mempolicy: fix /proc/mounts corrupting memoryHugh Dickins
Recently I suggested using "mount -o remount,mpol=local /tmp" in NUMA mempolicy testing. Very nasty. Reading /proc/mounts, /proc/pid/mounts or /proc/pid/mountinfo may then corrupt one bit of kernel memory, often in a page table (causing "Bad swap" or "Bad page map" warning or "Bad pagetable" oops), sometimes in a vm_area_struct or rbnode or somewhere worse. "mpol=prefer" and "mpol=prefer:Node" are equally toxic. Recent NUMA enhancements are not to blame: this dates back to 2.6.35, when commit e17f74af351c "mempolicy: don't call mpol_set_nodemask() when no_context" skipped mpol_parse_str()'s call to mpol_set_nodemask(), which used to initialize v.preferred_node, or set MPOL_F_LOCAL in flags. With slab poisoning, you can then rely on mpol_to_str() to set the bit for node 0x6b6b, probably in the next page above the caller's stack. mpol_parse_str() is only called from shmem_parse_options(): no_context is always true, so call it unused for now, and remove !no_context code. Set v.nodes or v.preferred_node or MPOL_F_LOCAL as mpol_to_str() might expect. Then mpol_to_str() can ignore its no_context argument also, the mpol being appropriately initialized whether contextualized or not. Rename its no_context unused too, and let subsequent patch remove them (that's not needed for stable backporting, which would involve rejects). I don't understand why MPOL_LOCAL is described as a pseudo-policy: it's a reasonable policy which suffers from a confusing implementation in terms of MPOL_PREFERRED with MPOL_F_LOCAL. I believe this would be much more robust if MPOL_LOCAL were recognized in switch statements throughout, MPOL_F_LOCAL deleted, and MPOL_PREFERRED use the (possibly empty) nodes mask like everyone else, instead of its preferred_node variant (I presume an optimization from the days before MPOL_LOCAL). But that would take me too long to get right and fully tested. Signed-off-by: Hugh Dickins <hughd@google.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-01-02epoll: prevent missed events on EPOLL_CTL_MODEric Wong
EPOLL_CTL_MOD sets the interest mask before calling f_op->poll() to ensure events are not missed. Since the modifications to the interest mask are not protected by the same lock as ep_poll_callback, we need to ensure the change is visible to other CPUs calling ep_poll_callback. We also need to ensure f_op->poll() has an up-to-date view of past events which occured before we modified the interest mask. So this barrier also pairs with the barrier in wq_has_sleeper(). This should guarantee either ep_poll_callback or f_op->poll() (or both) will notice the readiness of a recently-ready/modified item. This issue was encountered by Andreas Voellmy and Junchang(Jason) Wang in: http://thread.gmane.org/gmane.linux.kernel/1408782/ Signed-off-by: Eric Wong <normalperson@yhbt.net> Cc: Hans Verkuil <hans.verkuil@cisco.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Davide Libenzi <davidel@xmailserver.org> Cc: Hans de Goede <hdegoede@redhat.com> Cc: Mauro Carvalho Chehab <mchehab@infradead.org> Cc: David Miller <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andreas Voellmy <andreas.voellmy@yale.edu> Tested-by: "Junchang(Jason) Wang" <junchang.wang@yale.edu> Cc: netdev@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-01-02watchdog: twl4030_wdt: add DT supportAaro Koskinen
Add DT support for twl4030_wdt. This is needed to get twl4030_wdt to probe when booting with DT. Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2013-01-02watchdog: omap_wdt: eliminate unused variable and a compiler warningAaro Koskinen
We forgot to delete this in the commit 4f4753d9 (watchdog: omap_wdt: convert to devm_ functions), and as a result the following compilation warning was introduced: drivers/watchdog/omap_wdt.c: In function 'omap_wdt_remove': drivers/watchdog/omap_wdt.c:299:19: warning: unused variable 'res' [-Wunused-variable] Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi> Reviewed-by: Paul Walmsley <paul@pwsan.com> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2013-01-02watchdog: da9055: Don't update wdt_dev->timeout in da9055_wdt_set_timeout ↵Axel Lin
error path Otherwise, WDIOC_GETTIMEOUT returns wrong value if set_timeout fails. This patch also removes unnecessary ret variable in da9055_wdt_ping function. Signed-off-by: Axel Lin <axel.lin@ingics.com> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>