summaryrefslogtreecommitdiffstats
path: root/fs
AgeCommit message (Collapse)Author
2014-08-06mm: softdirty: respect VM_SOFTDIRTY in PTE holesPeter Feiner
After a VMA is created with the VM_SOFTDIRTY flag set, /proc/pid/pagemap should report that the VMA's virtual pages are soft-dirty until VM_SOFTDIRTY is cleared (i.e., by the next write of "4" to /proc/pid/clear_refs). However, pagemap ignores the VM_SOFTDIRTY flag for virtual addresses that fall in PTE holes (i.e., virtual addresses that don't have a PMD, PUD, or PGD allocated yet). To observe this bug, use mmap to create a VMA large enough such that there's a good chance that the VMA will occupy an unused PMD, then test the soft-dirty bit on its pages. In practice, I found that a VMA that covered a PMD's worth of address space was big enough. This patch adds the necessary VMA lookup to the PTE hole callback in /proc/pid/pagemap's page walk and sets soft-dirty according to the VMAs' VM_SOFTDIRTY flag. Signed-off-by: Peter Feiner <pfeiner@google.com> Acked-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Hugh Dickins <hughd@google.com> Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-06mm: export NR_SHMEM via sysinfo(2) / si_meminfo() interfacesRafael Aquini
Historically, we exported shared pages to userspace via sysinfo(2) sharedram and /proc/meminfo's "MemShared" fields. With the advent of tmpfs, from kernel v2.4 onward, that old way for accounting shared mem was deemed inaccurate and we started to export a hard-coded 0 for sysinfo.sharedram. Later on, during the 2.6 timeframe, "MemShared" got re-introduced to /proc/meminfo re-branded as "Shmem", but we're still reporting sysinfo.sharedmem as that old hard-coded zero, which makes the "shared memory" report inconsistent across interfaces. This patch leverages the addition of explicit accounting for pages used by shmem/tmpfs -- "4b02108 mm: oom analysis: add shmem vmstat" -- in order to make the users of sysinfo(2) and si_meminfo*() friends aware of that vmstat entry and make them report it consistently across the interfaces, as well to make sysinfo(2) returned data consistent with our current API documentation states. Signed-off-by: Rafael Aquini <aquini@redhat.com> Acked-by: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-06fs/ocfs2/slot_map.c: replace count*size kzalloc by kcallocFabian Frederick
kcalloc manages count*sizeof overflow. Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-06ocfs2: race between umount and unfinished remastering during recoveryTariq Saeed
Orabug: 19074140 When umount is issued during recovery on the new master that has not finished remastering locks, it triggers BUG() in dlm_send_mig_lockres_msg(). Here is the situation: 1) node A has a lock on resource X mastered by node B. 2) node B dies -> node A sets recovering flag for res X 3) Node C becomes the new master for resources owned by the dead node and is remastering locks of the dead node but has not finished the remastering process yet. 4) umount is issued on node C. 5) During processing of umount, ignoring unfished recovery, node C attempts to migrate resource X to node A. 6) node A finds res X in DLM_LOCK_RES_RECOVERING state, considers it a logic error and sends back -EFAULT. 7) node C asserts BUG() upon seeing EFAULT resp from node B. Fix is to delay migrating res X till remastering is finished at which point recovering flag will be cleared on both A and C. Signed-off-by: Tariq Saeed <tariq.x.saeed@oracle.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-06ocfs2: remove conversion of total_backoff in dlm_join_domain()Xue jiufei
The unit of total_backoff is msecs not jiffies, so no need to do the conversion. Otherwise, the join timeout is not 90 sec. Signed-off-by: Yiwen Jiang <jiangyiwen@huawei.com> Signed-off-by: joyce.xue <xuejiufei@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-06ocfs2: correctly check the return value of ocfs2_search_extent_listYingtai Xie
ocfs2_search_extent_list may return -1, so we should check the return value in ocfs2_split_and_insert, otherwise it may cause array index out of bound. And ocfs2_search_extent_list can only return value less than el->l_next_free_rec, so check if it is equal or larger than le16_to_cpu(el->l_next_free_rec) is meaningless. Signed-off-by: Yingtai Xie <xieyingtai@huawei.com> Signed-off-by: Joseph Qi <joseph.qi@huawei.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-06fs/squashfs/super.c: logging cleanupFabian Frederick
- Convert printk to pr_foo() - Add pr_fmt for future logging entries - Coalesce formats Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Phillip Lougher <phillip@squashfs.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-06fs/squashfs/file_direct.c: replace count*size kmalloc by kmalloc_arrayFabian Frederick
kmalloc_array() manages count*sizeof overflow. Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Phillip Lougher <phillip@squashfs.org.uk> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-06ntfs: kernel-doc warning fixesFabian Frederick
cached_page and lru_pvec were removed from ntfs_attr_extend_initialized in commit 2ec93b0bf35f ("ntfs: clean up ntfs_attr_extend_initialized") lru_pvec has been removed from __ntfs_grab_cache_pages in commit 4c99000ac47c ("ntfs: use add_to_page_cache_lru()") Signed-off-by: Fabian Frederick <fabf@skynet.be> Acked-by: Anton Altaparmakov <anton@tuxera.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-06fs/logfs/readwrite.c: kernel-doc warning fixesFabian Frederick
s/-/:/ and fix variable names. Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Joern Engel <joern@logfs.org> Cc: Prasad Joshi <prasadjoshi.linux@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-06fanotify: fix double free of pending permission eventsJan Kara
Commit 85816794240b ("fanotify: Fix use after free for permission events") introduced a double free issue for permission events which are pending in group's notification queue while group is being destroyed. These events are freed from fanotify_handle_event() but they are not removed from groups notification queue and thus they get freed again from fsnotify_flush_notify(). Fix the problem by removing permission events from notification queue before freeing them if we skip processing access response. Also expand comments in fanotify_release() to explain group shutdown in detail. Fixes: 85816794240b9659e66e4d9b0df7c6e814e5f603 Signed-off-by: Jan Kara <jack@suse.cz> Reported-by: Douglas Leeder <douglas.leeder@sophos.com> Tested-by: Douglas Leeder <douglas.leeder@sophos.com> Reported-by: Heinrich Schuchard <xypron.glpk@gmx.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-06fsnotify: rename event handling functionsJan Kara
Rename fsnotify_add_notify_event() to fsnotify_add_event() since the "notify" part is duplicit. Rename fsnotify_remove_notify_event() and fsnotify_peek_notify_event() to fsnotify_remove_first_event() and fsnotify_peek_first_event() respectively since "notify" part is duplicit and they really look at the first event in the queue. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Jan Kara <jack@suse.cz> Cc: Eric Paris <eparis@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-06fs/fscache: make ctl_table staticFabian Frederick
fscache_sysctls and fscache_sysctls_root are only used in main.c Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: David Howells <dhowells@redhat.com> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-06Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds
Pull networking updates from David Miller: "Highlights: 1) Steady transitioning of the BPF instructure to a generic spot so all kernel subsystems can make use of it, from Alexei Starovoitov. 2) SFC driver supports busy polling, from Alexandre Rames. 3) Take advantage of hash table in UDP multicast delivery, from David Held. 4) Lighten locking, in particular by getting rid of the LRU lists, in inet frag handling. From Florian Westphal. 5) Add support for various RFC6458 control messages in SCTP, from Geir Ola Vaagland. 6) Allow to filter bridge forwarding database dumps by device, from Jamal Hadi Salim. 7) virtio-net also now supports busy polling, from Jason Wang. 8) Some low level optimization tweaks in pktgen from Jesper Dangaard Brouer. 9) Add support for ipv6 address generation modes, so that userland can have some input into the process. From Jiri Pirko. 10) Consolidate common TCP connection request code in ipv4 and ipv6, from Octavian Purdila. 11) New ARP packet logger in netfilter, from Pablo Neira Ayuso. 12) Generic resizable RCU hash table, with intial users in netlink and nftables. From Thomas Graf. 13) Maintain a name assignment type so that userspace can see where a network device name came from (enumerated by kernel, assigned explicitly by userspace, etc.) From Tom Gundersen. 14) Automatic flow label generation on transmit in ipv6, from Tom Herbert. 15) New packet timestamping facilities from Willem de Bruijn, meant to assist in measuring latencies going into/out-of the packet scheduler, latency from TCP data transmission to ACK, etc" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1536 commits) cxgb4 : Disable recursive mailbox commands when enabling vi net: reduce USB network driver config options. tg3: Modify tg3_tso_bug() to handle multiple TX rings amd-xgbe: Perform phy connect/disconnect at dev open/stop amd-xgbe: Use dma_set_mask_and_coherent to set DMA mask net: sun4i-emac: fix memory leak on bad packet sctp: fix possible seqlock seadlock in sctp_packet_transmit() Revert "net: phy: Set the driver when registering an MDIO bus device" cxgb4vf: Turn off SGE RX/TX Callback Timers and interrupts in PCI shutdown routine team: Simplify return path of team_newlink bridge: Update outdated comment on promiscuous mode net-timestamp: ACK timestamp for bytestreams net-timestamp: TCP timestamping net-timestamp: SCHED timestamp on entering packet scheduler net-timestamp: add key to disambiguate concurrent datagrams net-timestamp: move timestamp flags out of sk_flags net-timestamp: extend SCM_TIMESTAMPING ancillary data struct cxgb4i : Move stray CPL definitions to cxgb4 driver tcp: reduce spurious retransmits due to transient SACK reneging qlcnic: Initialize dcbnl_ops before register_netdev ...
2014-08-06Merge branch 'next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security subsystem updates from James Morris: "In this release: - PKCS#7 parser for the key management subsystem from David Howells - appoint Kees Cook as seccomp maintainer - bugfixes and general maintenance across the subsystem" * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (94 commits) X.509: Need to export x509_request_asymmetric_key() netlabel: shorter names for the NetLabel catmap funcs/structs netlabel: fix the catmap walking functions netlabel: fix the horribly broken catmap functions netlabel: fix a problem when setting bits below the previously lowest bit PKCS#7: X.509 certificate issuer and subject are mandatory fields in the ASN.1 tpm: simplify code by using %*phN specifier tpm: Provide a generic means to override the chip returned timeouts tpm: missing tpm_chip_put in tpm_get_random() tpm: Properly clean sysfs entries in error path tpm: Add missing tpm_do_selftest to ST33 I2C driver PKCS#7: Use x509_request_asymmetric_key() Revert "selinux: fix the default socket labeling in sock_graft()" X.509: x509_request_asymmetric_keys() doesn't need string length arguments PKCS#7: fix sparse non static symbol warning KEYS: revert encrypted key change ima: add support for measuring and appraising firmware firmware_class: perform new LSM checks security: introduce kernel_fw_from_file hook PKCS#7: Missing inclusion of linux/err.h ...
2014-08-05Merge branch 'timers-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer and time updates from Thomas Gleixner: "A rather large update of timers, timekeeping & co - Core timekeeping code is year-2038 safe now for 32bit machines. Now we just need to fix all in kernel users and the gazillion of user space interfaces which rely on timespec/timeval :) - Better cache layout for the timekeeping internal data structures. - Proper nanosecond based interfaces for in kernel users. - Tree wide cleanup of code which wants nanoseconds but does hoops and loops to convert back and forth from timespecs. Some of it definitely belongs into the ugly code museum. - Consolidation of the timekeeping interface zoo. - A fast NMI safe accessor to clock monotonic for tracing. This is a long standing request to support correlated user/kernel space traces. With proper NTP frequency correction it's also suitable for correlation of traces accross separate machines. - Checkpoint/restart support for timerfd. - A few NOHZ[_FULL] improvements in the [hr]timer code. - Code move from kernel to kernel/time of all time* related code. - New clocksource/event drivers from the ARM universe. I'm really impressed that despite an architected timer in the newer chips SoC manufacturers insist on inventing new and differently broken SoC specific timers. [ Ed. "Impressed"? I don't think that word means what you think it means ] - Another round of code move from arch to drivers. Looks like most of the legacy mess in ARM regarding timers is sorted out except for a few obnoxious strongholds. - The usual updates and fixlets all over the place" * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (114 commits) timekeeping: Fixup typo in update_vsyscall_old definition clocksource: document some basic timekeeping concepts timekeeping: Use cached ntp_tick_length when accumulating error timekeeping: Rework frequency adjustments to work better w/ nohz timekeeping: Minor fixup for timespec64->timespec assignment ftrace: Provide trace clocks monotonic timekeeping: Provide fast and NMI safe access to CLOCK_MONOTONIC seqcount: Add raw_write_seqcount_latch() seqcount: Provide raw_read_seqcount() timekeeping: Use tk_read_base as argument for timekeeping_get_ns() timekeeping: Create struct tk_read_base and use it in struct timekeeper timekeeping: Restructure the timekeeper some more clocksource: Get rid of cycle_last clocksource: Move cycle_last validation to core code clocksource: Make delta calculation a function wireless: ath9k: Get rid of timespec conversions drm: vmwgfx: Use nsec based interfaces drm: i915: Use nsec based interfaces timekeeping: Provide ktime_get_raw() hangcheck-timer: Use ktime_get_ns() ...
2014-08-05reiserfs: fix corruption introduced by balance_leaf refactorJeff Mahoney
Commits f1f007c308e (reiserfs: balance_leaf refactor, pull out balance_leaf_insert_left) and cf22df182bf (reiserfs: balance_leaf refactor, pull out balance_leaf_paste_left) missed that the `body' pointer was getting repositioned. Subsequent users of the pointer would expect it to be repositioned, and as a result, parts of the tree would get overwritten. The most common observed corruption is indirect block pointers being overwritten. Since the body value isn't actually used anymore in the called routines, we can pass back the offset it should be shifted. We constify the body and ih pointers in the balance_leaf as a mostly-free preventative measure. Cc: <stable@vger.kernel.org> # 3.16 Reported-and-tested-by: Jeff Chua <jeff.chua.linux@gmail.com> Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Jan Kara <jack@suse.cz>
2014-08-05nfsd: add some comments to the nfsd4 object definitionsJeff Layton
Add some comments that describe what each of these objects is, and how they related to one another. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-08-05nfsd: remove the client_mutex and the nfs4_lock/unlock_state wrappersJeff Layton
Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-08-05Add worker function to set allocation sizeSteve French
Adds setinfo worker function for SMB2/SMB3 support of SET_ALLOCATION_INFORMATION Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Pavel Shilovsky <pshilovsky@samba.org>
2014-08-05nfsd: remove nfs4_lock_state: nfs4_state_shutdown_netJeff Layton
Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-08-05nfsd: remove nfs4_lock_state: nfs4_laundromatJeff Layton
Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-08-05nfsd: Remove nfs4_lock_state(): reclaim_complete()Trond Myklebust
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-08-05nfsd: Remove nfs4_lock_state(): setclientid, setclientid_confirm, renewTrond Myklebust
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-08-05nfsd: Remove nfs4_lock_state(): exchange_id, create/destroy_session()Trond Myklebust
Also destroy_clientid and bind_conn_to_session. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-08-05nfsd: Remove nfs4_lock_state(): nfsd4_open and nfsd4_open_confirmTrond Myklebust
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-08-05nfsd: Remove nfs4_lock_state(): nfsd4_delegreturn()Trond Myklebust
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-08-05nfsd: Remove nfs4_lock_state(): nfsd4_open_downgrade + nfsd4_closeTrond Myklebust
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-08-05nfsd: Remove nfs4_lock_state(): nfsd4_lock/locku/lockt()Trond Myklebust
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-08-05nfsd: Remove nfs4_lock_state(): nfsd4_release_lockownerTrond Myklebust
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-08-05nfsd: Remove nfs4_lock_state(): nfsd4_test_stateid/nfsd4_free_stateidTrond Myklebust
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-08-05nfsd: Remove nfs4_lock_state(): nfs4_preprocess_stateid_op()Trond Myklebust
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-08-05nfsd: remove old fault injection infrastructureJeff Layton
Remove the old nfsd_for_n_state function and move nfsd_find_client higher up into the file to get rid of forward declaration. Remove the struct nfsd_fault_inject_op arguments from the operations as they are no longer needed by any of them. Finally, remove the old "standard" get and set routines, which also eliminates the client_mutex from this code. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-08-05nfsd: add more granular locking to *_delegations fault injectorsJeff Layton
...instead of relying on the client_mutex. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-08-05nfsd: add more granular locking to forget_openowners fault injectorJeff Layton
...instead of relying on the client_mutex. Also, fix up the printk output that is generated when the file is read. It currently says that it's reporting the number of open files, but it's actually reporting the number of openowners. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-08-05nfsd: add more granular locking to forget_locks fault injectorJeff Layton
...instead of relying on the client_mutex. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-08-05nfsd: add a list_head arg to nfsd_foreach_client_lockJeff Layton
In a later patch, we'll want to collect the locks onto a list for later destruction. If "func" is defined and "collect" is defined, then we'll add the lock stateid to the list. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-08-05nfsd: add nfsd_inject_forget_clientsJeff Layton
...which uses the client_lock for protection instead of client_mutex. Also remove nfsd_forget_client as there are no more callers. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-08-05nfsd: add a forget_client set_clnt routineJeff Layton
...that relies on the client_lock instead of client_mutex. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-08-05nfsd: add a forget_clients "get" routine with proper lockingJeff Layton
Add a new "get" routine for forget_clients that relies on the client_lock instead of the client_mutex. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-08-05nfsd: abstract out the get and set routines into the fault injection opsJeff Layton
Now that we've added more granular locking in other places, it's time to address the fault injection code. This code is currently quite reliant on the client_mutex for protection. Start to change this by adding a new set of fault injection op vectors. For now they all use the legacy ones. In later patches we'll add new routines that can deal with more granular locking. Also, move some of the printk routines into the callers to make the results of the operations more uniform. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-08-05nfsd: protect clid and verifier generation with client_lockJeff Layton
The clid counter is a global counter currently. Move it to be a per-net property so that it can be properly protected by the nn->client_lock instead of relying on the client_mutex. The verifier generator is also potentially racy if there are two simultaneous callers. Generate the verifier when we generate the clid value, so it's also created under the client_lock. With this, there's no need to keep two counters as they'd always be in sync anyway, so just use the clientid_counter for both. As Trond points out, what would be best is to eventually move this code to use IDR instead of the hash tables. That would also help ensure uniqueness, but that's probably best done as a separate project. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-08-05nfsd: don't destroy clients that are busyJeff Layton
It's possible that we'll have an in-progress call on some of the clients while a rogue EXCHANGE_ID or DESTROY_CLIENTID call comes in. Be sure to try and mark the client expired first, so that the refcount is respected. This will only be a problem once the client_mutex is removed. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-08-05NFSD: Put the reference of nfs4_file when freeing stidKinglong Mee
After testing nfs4 lock, I restart the nfsd service, got messages as, [ 5677.403419] nfsd: last server has exited, flushing export cache [ 5677.463728] ============================================================================= [ 5677.463942] BUG nfsd4_files (Tainted: G B OE): Objects remaining in nfsd4_files on kmem_cache_close() [ 5677.464055] ----------------------------------------------------------------------------- [ 5677.464203] INFO: Slab 0xffffea0000233400 objects=28 used=1 fp=0xffff880008cd3d98 flags=0x3ffc0000004080 [ 5677.464318] CPU: 0 PID: 3772 Comm: rmmod Tainted: G B OE 3.16.0-rc2+ #29 [ 5677.464420] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/31/2013 [ 5677.464538] 0000000000000000 0000000036af2c9f ffff88000ce97d68 ffffffff816eacfa [ 5677.464643] ffffea0000233400 ffff88000ce97e40 ffffffff811cda44 ffffffff00000020 [ 5677.464774] ffff88000ce97e50 ffff88000ce97e00 656a624f00000008 616d657220737463 [ 5677.464875] Call Trace: [ 5677.464925] [<ffffffff816eacfa>] dump_stack+0x45/0x56 [ 5677.464983] [<ffffffff811cda44>] slab_err+0xb4/0xe0 [ 5677.465040] [<ffffffff811d0457>] ? __kmalloc+0x117/0x290 [ 5677.465099] [<ffffffff81100eec>] ? on_each_cpu_cond+0xac/0xf0 [ 5677.465158] [<ffffffff811d1bc0>] ? kmem_cache_close+0x110/0x2e0 [ 5677.465218] [<ffffffff811d1be0>] kmem_cache_close+0x130/0x2e0 [ 5677.465279] [<ffffffff8135a0c1>] ? kobject_cleanup+0x91/0x1b0 [ 5677.465338] [<ffffffff811d22be>] __kmem_cache_shutdown+0xe/0x10 [ 5677.465399] [<ffffffff8119bd28>] kmem_cache_destroy+0x48/0x100 [ 5677.465466] [<ffffffffa05ef78d>] nfsd4_free_slabs+0x2d/0x50 [nfsd] [ 5677.465530] [<ffffffffa05fa987>] exit_nfsd+0x34/0x6ad [nfsd] [ 5677.465589] [<ffffffff81104ac2>] SyS_delete_module+0x162/0x200 [ 5677.465649] [<ffffffff81013b69>] ? do_notify_resume+0x59/0x90 [ 5677.465759] [<ffffffff816f2369>] system_call_fastpath+0x16/0x1b [ 5677.465822] INFO: Object 0xffff880008cd0000 @offset=0 [ 5677.465882] INFO: Allocated in nfsd4_process_open1+0x61/0x350 [nfsd] age=7599 cpu=0 pid=3253 [ 5677.466115] __slab_alloc+0x3b0/0x4b1 [ 5677.466166] kmem_cache_alloc+0x1e4/0x240 [ 5677.466220] nfsd4_process_open1+0x61/0x350 [nfsd] [ 5677.466276] nfsd4_open+0xee/0x860 [nfsd] [ 5677.466329] nfsd4_proc_compound+0x4d7/0x7f0 [nfsd] [ 5677.466384] nfsd_dispatch+0xbb/0x200 [nfsd] [ 5677.466447] svc_process_common+0x453/0x6f0 [sunrpc] [ 5677.466506] svc_process+0x103/0x170 [sunrpc] [ 5677.466559] nfsd+0x117/0x190 [nfsd] [ 5677.466609] kthread+0xd8/0xf0 [ 5677.466656] ret_from_fork+0x7c/0xb0 [ 5677.466775] kmem_cache_destroy nfsd4_files: Slab cache still has objects [ 5677.466839] CPU: 0 PID: 3772 Comm: rmmod Tainted: G B OE 3.16.0-rc2+ #29 [ 5677.466937] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/31/2013 [ 5677.467049] 0000000000000000 0000000036af2c9f ffff88000ce97eb0 ffffffff816eacfa [ 5677.467150] ffff880020bb2d00 ffff88000ce97ed0 ffffffff8119bdd9 0000000000000000 [ 5677.467250] ffffffffa06065c0 ffff88000ce97ee0 ffffffffa05ef78d ffff88000ce97ef0 [ 5677.467351] Call Trace: [ 5677.467397] [<ffffffff816eacfa>] dump_stack+0x45/0x56 [ 5677.467454] [<ffffffff8119bdd9>] kmem_cache_destroy+0xf9/0x100 [ 5677.467516] [<ffffffffa05ef78d>] nfsd4_free_slabs+0x2d/0x50 [nfsd] [ 5677.467579] [<ffffffffa05fa987>] exit_nfsd+0x34/0x6ad [nfsd] [ 5677.467639] [<ffffffff81104ac2>] SyS_delete_module+0x162/0x200 [ 5677.467765] [<ffffffff81013b69>] ? do_notify_resume+0x59/0x90 [ 5677.467826] [<ffffffff816f2369>] system_call_fastpath+0x16/0x1b Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Reviewed-by: Jeff Layton <jlayton@primarydata.com> Fixes: 11b9164adad7 "nfsd: Add a struct nfs4_file field to struct nfs4_stid" Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-08-04Merge tag 'ext4_for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 updates from Ted Ts'o: "Bug fixes and clean ups for the 3.17 merge window" * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: fix ext4_discard_allocated_blocks() if we can't allocate the pa struct ext4: fix COLLAPSE RANGE test for bigalloc file systems ext4: check inline directory before converting ext4: fix incorrect locking in move_extent_per_page ext4: use correct depth value ext4: add i_data_sem sanity check ext4: fix wrong size computation in ext4_mb_normalize_request() ext4: make ext4_has_inline_data() as a inline function ext4: remove readpage() check in ext4_mmap_file() ext4: fix punch hole on files with indirect mapping ext4: remove metadata reservation checks ext4: rearrange initialization to fix EXT4FS_DEBUG
2014-08-04Merge tag 'for-f2fs-3.17' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "This series includes patches to: - add nobarrier mount option - support tmpfile and rename2 - enhance the fdatasync behavior - fix the error path - fix the recovery routine - refactor a part of the checkpoint procedure - reduce some lock contentions" * tag 'for-f2fs-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (40 commits) f2fs: use for_each_set_bit to simplify the code f2fs: add f2fs_balance_fs for expand_inode_data f2fs: invalidate xattr node page when evict inode f2fs: avoid skipping recover_inline_xattr after recover_inline_data f2fs: add tracepoint for f2fs_direct_IO f2fs: reduce competition among node page writes f2fs: fix coding style f2fs: remove redundant lines in allocate_data_block f2fs: add tracepoint for f2fs_issue_flush f2fs: avoid retrying wrong recovery routine when error was occurred f2fs: test before set/clear bits f2fs: fix wrong condition for unlikely f2fs: enable in-place-update for fdatasync f2fs: skip unnecessary data writes during fsync f2fs: add info of appended or updated data writes f2fs: use radix_tree for ino management f2fs: add infra for ino management f2fs: punch the core function for inode management f2fs: add nobarrier mount option f2fs: fix to put root inode in error path of fill_super ...
2014-08-04Merge tag 'driver-core-3.17-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here's the big driver-core pull request for 3.17-rc1. Largest thing in here is the dma-buf rework and fence code, that touched many different subsystems so it was agreed it should go through this tree to handle merge issues. There's also some firmware loading updates, as well as tests added, and a few other tiny changes, the changelog has the details. All have been in linux-next for a long time" * tag 'driver-core-3.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (32 commits) ARM: imx: Remove references to platform_bus in mxc code firmware loader: Fix _request_firmware_load() return val for fw load abort platform: Remove most references to platform_bus device test: add firmware_class loader test doc: fix minor typos in firmware_class README staging: android: Cleanup style issues Documentation: devres: Sort managed interfaces Documentation: devres: Add devm_kmalloc() et al fs: debugfs: remove trailing whitespace kernfs: kernel-doc warning fix debugfs: Fix corrupted loop in debugfs_remove_recursive stable_kernel_rules: Add pointer to netdev-FAQ for network patches driver core: platform: add device binding path 'driver_override' driver core/platform: remove unused implicit padding in platform_object firmware loader: inform direct failure when udev loader is disabled firmware: replace ALIGN(PAGE_SIZE) by PAGE_ALIGN firmware: read firmware size using i_size_read() firmware loader: allow disabling of udev as firmware loader reservation: add suppport for read-only access using rcu reservation: update api and add some helpers ... Conflicts: drivers/base/platform.c
2014-08-04Merge branch 'sched-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: - Move the nohz kick code out of the scheduler tick to a dedicated IPI, from Frederic Weisbecker. This necessiated quite some background infrastructure rework, including: * Clean up some irq-work internals * Implement remote irq-work * Implement nohz kick on top of remote irq-work * Move full dynticks timer enqueue notification to new kick * Move multi-task notification to new kick * Remove unecessary barriers on multi-task notification - Remove proliferation of wait_on_bit() action functions and allow wait_on_bit_action() functions to support a timeout. (Neil Brown) - Another round of sched/numa improvements, cleanups and fixes. (Rik van Riel) - Implement fast idling of CPUs when the system is partially loaded, for better scalability. (Tim Chen) - Restructure and fix the CPU hotplug handling code that may leave cfs_rq and rt_rq's throttled when tasks are migrated away from a dead cpu. (Kirill Tkhai) - Robustify the sched topology setup code. (Peterz Zijlstra) - Improve sched_feat() handling wrt. static_keys (Jason Baron) - Misc fixes. * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (37 commits) sched/fair: Fix 'make xmldocs' warning caused by missing description sched: Use macro for magic number of -1 for setparam sched: Robustify topology setup sched: Fix sched_setparam() policy == -1 logic sched: Allow wait_on_bit_action() functions to support a timeout sched: Remove proliferation of wait_on_bit() action functions sched/numa: Revert "Use effective_load() to balance NUMA loads" sched: Fix static_key race with sched_feat() sched: Remove extra static_key*() function indirection sched/rt: Fix replenish_dl_entity() comments to match the current upstream code sched: Transform resched_task() into resched_curr() sched/deadline: Kill task_struct->pi_top_task sched: Rework check_for_tasks() sched/rt: Enqueue just unthrottled rt_rq back on the stack in __disable_runtime() sched/fair: Disable runtime_enabled on dying rq sched/numa: Change scan period code to match intent sched/numa: Rework best node setting in task_numa_migrate() sched/numa: Examine a task move when examining a task swap sched/numa: Simplify task_numa_compare() sched/numa: Use effective_load() to balance NUMA loads ...
2014-08-04nfs: reject changes to resvport and sharecache during remountScott Mayhew
Commit c8e47028 made it possible to change resvport/noresvport and sharecache/nosharecache via a remount operation, neither of which should be allowed. Signed-off-by: Scott Mayhew <smayhew@redhat.com> Fixes: c8e47028 (nfs: Apply NFS_MOUNT_CMP_FLAGMASK to nfs_compare_remount_data) Cc: stable@vger.kernel.org # 3.16+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-08-04NFS: Avoid infinite loop when RELEASE_LOCKOWNER getting expired errorKinglong Mee
Fix Commit 60ea681299 (NFS: Migration support for RELEASE_LOCKOWNER) If getting expired error, client will enter a infinite loop as, client server RELEASE_LOCKOWNER(old clid) -----> <--- expired error RENEW(old clid) -----> <--- expired error SETCLIENTID -----> <--- a new clid SETCLIENTID_CONFIRM (new clid) --> <--- ok RELEASE_LOCKOWNER(old clid) -----> <--- expired error RENEW(new clid) -----> <-- ok RELEASE_LOCKOWNER(old clid) -----> <--- expired error RENEW(new clid) -----> <-- ok ... ... Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> [Trond: replace call to nfs4_async_handle_error() with nfs4_schedule_lease_recovery()] Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>