2014-12-10 mm: introduce single zone pcplists drain (Vlastimil Babka)
The functions for draining per-cpu pages back to buddy allocators currently always operate on all zones. There are however several cases where the drain is only needed in the context of a single zone, and spilling other pcplists is a waste of time both due to the extra spilling and later refilling. This patch introduces new zone pointer parameter to drain_all_pages() and changes the dummy parameter of drain_local_pages() to be also a zone pointer. When NULL is passed, the functions operate on all zones as usual. Passing a specific zone pointer reduces the work to the single zone. All callers are updated to pass the NULL pointer in this patch. Conversion to single zone (where appropriate) is done in further patches. Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Rik van Riel <riel@redhat.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: Xishi Qiu <qiuxishi@huawei.com> Cc: Vladimir Davydov <vdavydov@parallels.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Michal Nazarewicz <mina86@mina86.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
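The NULL-means-all-zones convention described above can be sketched in userspace C. This is only an illustration: `struct zone` here is a toy stand-in, and `drain_zone()` and `drain_pages_sketch()` are hypothetical names, not the kernel's drain_all_pages() implementation.

```c
#include <assert.h>
#include <stddef.h>

#define NR_ZONES 3  /* hypothetical zone count for the sketch */

/* Toy stand-in for struct zone (assumption, not the kernel layout). */
struct zone {
    int pcp_pages;   /* pages cached on the per-cpu list */
    int free_pages;  /* pages already returned to the buddy allocator */
};

struct zone zones[NR_ZONES];

/* Move one zone's cached pages back to its free pool. */
void drain_zone(struct zone *z)
{
    assert(z != NULL);
    z->free_pages += z->pcp_pages;
    z->pcp_pages = 0;
}

/* NULL drains every zone; a non-NULL pointer limits the work to that zone. */
void drain_pages_sketch(struct zone *zone)
{
    if (zone) {
        drain_zone(zone);
        return;
    }
    for (size_t i = 0; i < NR_ZONES; i++)
        drain_zone(&zones[i]);
}
```

A caller that only cares about one zone passes its pointer and leaves the other zones' cached pages untouched, which is the saving the commit message describes.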
2014-12-10 mm/vmscan.c: replace printk with pr_err (Pintu Kumar)
This patch replaces printk(KERN_ERR..) with pr_err found under shrink_slab. Thus it also reduces one line extra because of formatting. Signed-off-by: Pintu Kumar <pintu.k@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-10 mm/vmalloc.c: replace printk with pr_warn (Pintu Kumar)
This patch replaces printk(KERN_WARNING..) with pr_warn. Thus it also reduces one line extra because of formatting. Signed-off-by: Pintu Kumar <pintu.k@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-10 mm/page_alloc.c: convert boot printks without log level to pr_info (Anton Blanchard)
Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-10 mm: memcontrol: remove synchronous stock draining code (Johannes Weiner)
With charge reparenting, the last synchronous stock drainer left. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Vladimir Davydov <vdavydov@parallels.com> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: David Rientjes <rientjes@google.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-10 mm: memcontrol: continue cache reclaim from offlined groups (Johannes Weiner)
On cgroup deletion, outstanding page cache charges are moved to the parent group so that they're not lost and can be reclaimed during pressure on/inside said parent. But this reparenting is fairly tricky and its synchronous nature has led to several lock-ups in the past. Since c2931b70a32c ("cgroup: iterate cgroup_subsys_states directly") css iterators now also include offlined css, so memcg iterators can be changed to include offlined children during reclaim of a group, and leftover cache can just stay put. There is a slight change of behavior in that charges of deleted groups no longer show up as local charges in the parent. But they are still included in the parent's hierarchical statistics. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Vladimir Davydov <vdavydov@parallels.com> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: David Rientjes <rientjes@google.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-10 mm: memcontrol: remove obsolete kmemcg pinning tricks (Johannes Weiner)
As charges now pin the css explicitly, there is no more need for kmemcg to acquire a proxy reference for outstanding pages during offlining, or maintain state to identify such "dead" groups. This was the last user of the uncharge functions' return values, so remove them as well. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Vladimir Davydov <vdavydov@parallels.com> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: David Rientjes <rientjes@google.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-10 mm: memcontrol: take a css reference for each charged page (Johannes Weiner)
Charges currently pin the css indirectly by playing tricks during css_offline(): user pages stall the offlining process until all of them have been reparented, whereas kmemcg acquires a keep-alive reference if outstanding kernel pages are detected at that point. In preparation for removing all this complexity, make the pinning explicit and acquire a css reference for every charged page. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Vladimir Davydov <vdavydov@parallels.com> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: David Rientjes <rientjes@google.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-10 mm: memcontrol: convert reclaim iterator to simple css refcounting (Johannes Weiner)
The memcg reclaim iterators use a complicated weak reference scheme to prevent pinning cgroups indefinitely in the absence of memory pressure. However, during the ongoing cgroup core rework, css lifetime has been decoupled such that a pinned css no longer interferes with removal of the user-visible cgroup, and all this complexity is now unnecessary. [mhocko@suse.cz: ensure that the cached reference is always released] Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Vladimir Davydov <vdavydov@parallels.com> Cc: David Rientjes <rientjes@google.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Michal Hocko <mhocko@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-10 kernel: res_counter: remove the unused API (Johannes Weiner)
All memory accounting and limiting has been switched over to the lockless page counters. Bye, res_counter! [akpm@linux-foundation.org: update Documentation/cgroups/memory.txt] [mhocko@suse.cz: ditch the last remainings of res_counter] Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Vladimir Davydov <vdavydov@parallels.com> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: Tejun Heo <tj@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Michal Hocko <mhocko@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-10 mm: hugetlb_cgroup: convert to lockless page counters (Johannes Weiner)
Abandon the spinlock-protected byte counters in favor of the unlocked page counters in the hugetlb controller as well. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Vladimir Davydov <vdavydov@parallels.com> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: Tejun Heo <tj@kernel.org> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-10 mm: memcontrol: lockless page counters (Johannes Weiner)
Memory is internally accounted in bytes, using spinlock-protected 64-bit counters, even though the smallest accounting delta is a page. The counter interface is also convoluted and does too many things.

Introduce a new lockless word-sized page counter API, then change all memory accounting over to it. The translation from and to bytes then only happens when interfacing with userspace.

The removed locking overhead is noticeable when scaling beyond the per-cpu charge caches - on a 4-socket machine with 144 threads, the following test shows the performance differences of 288 memcgs concurrently running a page fault benchmark:

vanilla:

     18631648.500498      task-clock (msec)         #  140.643 CPUs utilized     ( +-  0.33% )
           1,380,638      context-switches          #    0.074 K/sec             ( +-  0.75% )
              24,390      cpu-migrations            #    0.001 K/sec             ( +-  8.44% )
       1,843,305,768      page-faults               #    0.099 M/sec             ( +-  0.00% )
  50,134,994,088,218      cycles                    #    2.691 GHz               ( +-  0.33% )
     <not supported>      stalled-cycles-frontend
     <not supported>      stalled-cycles-backend
   8,049,712,224,651      instructions              #    0.16  insns per cycle   ( +-  0.04% )
   1,586,970,584,979      branches                  #   85.176 M/sec             ( +-  0.05% )
       1,724,989,949      branch-misses             #    0.11% of all branches   ( +-  0.48% )

       132.474343877 seconds time elapsed                                        ( +-  0.21% )

lockless:

     12195979.037525      task-clock (msec)         #  133.480 CPUs utilized     ( +-  0.18% )
             832,850      context-switches          #    0.068 K/sec             ( +-  0.54% )
              15,624      cpu-migrations            #    0.001 K/sec             ( +- 10.17% )
       1,843,304,774      page-faults               #    0.151 M/sec             ( +-  0.00% )
  32,811,216,801,141      cycles                    #    2.690 GHz               ( +-  0.18% )
     <not supported>      stalled-cycles-frontend
     <not supported>      stalled-cycles-backend
   9,999,265,091,727      instructions              #    0.30  insns per cycle   ( +-  0.10% )
   2,076,759,325,203      branches                  #  170.282 M/sec             ( +-  0.12% )
       1,656,917,214      branch-misses             #    0.08% of all branches   ( +-  0.55% )

        91.369330729 seconds time elapsed                                        ( +-  0.45% )

On top of improved scalability, this also gets rid of the icky long long types in the very heart of memcg, which is great for 32 bit and also makes the code a lot more readable.

Notable differences between the old and new API:

- res_counter_charge() and res_counter_charge_nofail() become page_counter_try_charge() and page_counter_charge() resp. to match the more common kernel naming scheme of try_do()/do()

- res_counter_uncharge_until() is only ever used to cancel a local counter and never to uncharge bigger segments of a hierarchy, so it's replaced by the simpler page_counter_cancel()

- res_counter_set_limit() is replaced by page_counter_limit(), which expects its callers to serialize against themselves

- res_counter_memparse_write_strategy() is replaced by page_counter_memparse(), which rounds down to the nearest page size - rather than up. This is more reasonable for explicitly requested hard upper limits.

- to keep charging light-weight, page_counter_try_charge() charges speculatively, only to roll back if the result exceeds the limit. Because of this, a failing bigger charge can temporarily lock out smaller charges that would otherwise succeed. The error is bounded to the difference between the smallest and the biggest possible charge size, so for memcg, this means that a failing THP charge can send base page charges into reclaim up to 2MB (4MB) before the limit would have been reached. This should be acceptable.

[akpm@linux-foundation.org: add includes for WARN_ON_ONCE and memparse] [akpm@linux-foundation.org: add includes for WARN_ON_ONCE, memparse, strncmp, and PAGE_SIZE] Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.cz> Acked-by: Vladimir Davydov <vdavydov@parallels.com> Cc: Tejun Heo <tj@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
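The speculative charge-then-roll-back idea from the last bullet can be sketched with C11 atomics. This is a minimal single-level illustration, assuming a hypothetical `struct counter_sketch`; the real page counter API also propagates charges through parent counters, which is left out here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical single-level counter (assumption: no hierarchy). */
struct counter_sketch {
    atomic_long count;  /* pages currently charged */
    long limit;         /* hard limit, in pages */
};

/* Unconditional charge: may push the counter past the limit. */
void counter_charge(struct counter_sketch *c, long nr_pages)
{
    assert(nr_pages > 0);
    atomic_fetch_add(&c->count, nr_pages);
}

/* Charge first, then roll back if the new total exceeds the limit. */
bool counter_try_charge(struct counter_sketch *c, long nr_pages)
{
    long new_count;

    assert(nr_pages > 0);
    new_count = atomic_fetch_add(&c->count, nr_pages) + nr_pages;
    if (new_count > c->limit) {
        /* Between the add and this sub, other chargers see the inflated count. */
        atomic_fetch_sub(&c->count, nr_pages);
        return false;
    }
    return true;
}
```

The window between the add and the roll-back is where a large failing charge can briefly make a smaller concurrent charge fail, which is the bounded error the commit message accepts in exchange for avoiding a lock.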
2014-12-10 slab: replace smp_read_barrier_depends() with lockless_dereference() (Pranith Kumar)
Recently lockless_dereference() was added which can be used in place of hard-coding smp_read_barrier_depends(). The following PATCH makes the change. Signed-off-by: Pranith Kumar <bobby.prani@gmail.com> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-10 slab: improve checking for invalid gfp_flags (Andrew Morton)
The code goes BUG, but doesn't tell us which bits were unexpectedly set. Print that out. Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-10 mm: slub: fix format mismatches in slab_err() callers (Andrey Ryabinin)
Adding __printf(3, 4) to slab_err exposed the following:

  mm/slub.c: In function `check_slab':
  mm/slub.c:852:4: warning: format `%u' expects argument of type `unsigned int', but argument 4 has type `const char *' [-Wformat=]
      s->name, page->objects, maxobj);
      ^
  mm/slub.c:852:4: warning: too many arguments for format [-Wformat-extra-args]
  mm/slub.c:857:4: warning: format `%u' expects argument of type `unsigned int', but argument 4 has type `const char *' [-Wformat=]
      s->name, page->inuse, page->objects);
      ^
  mm/slub.c:857:4: warning: too many arguments for format [-Wformat-extra-args]
  mm/slub.c: In function `on_freelist':
  mm/slub.c:905:4: warning: format `%d' expects argument of type `int', but argument 5 has type `long unsigned int' [-Wformat=]
      "should be %d", page->objects, max_objects);

Fix the first two warnings by removing the redundant s->name. Fix the last by changing the type of max_objects from unsigned long to int. Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Acked-by: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-10 mm/slab: reverse iteration on find_mergeable() (Joonsoo Kim)
Unlike in SLUB, objects in SLAB do not always start at the beginning of the slab. This caused an unalignment problem once slab merging was supported by commit 12220dea07f1 ("mm/slab: support slab merge"). An alignment mismatch check was introduced ("mm/slab: fix unalignment problem on Malta with EVA due to slab merge") to prevent merging in this case. An undesirable side effect is that merging now happens between infrequently used kmem_caches: kmem_caches with the same size of 256 bytes, for example, are merged into pool_workqueue rather than kmalloc-256, because the kmem_caches for kmalloc are at the tail of the list. To prevent this situation, this patch reverses the iteration order in find_mergeable() so that frequently used kmem_caches are found first. This change helps to merge a kmem_cache into frequently used kmem_caches, such as the kmalloc kmem_caches. Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Acked-by: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-10 slab: print slabinfo header in seq show (Vladimir Davydov)
Currently we print the slabinfo header in the seq start method, which makes it unusable for showing leaks, so we have leaks_show, which does practically the same as s_show except it doesn't show the header. However, we can print the header in the seq show method - we only need to check if the current element is the first on the list. This will allow us to use the same set of seq iterators for both leaks and slabinfo reporting, which is nice. Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
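The "print the header from show when the element is the first on the list" idea can be sketched in plain C. The names below (`struct cache_entry`, `show_entry()`) and the header text are hypothetical; the kernel code works on seq_file and list heads rather than an array and a char buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical list element for the sketch. */
struct cache_entry {
    const char *name;
    unsigned long objects;
};

/*
 * Show callback: writes one entry into buf and returns the number of
 * characters written. The header is emitted only when cur is the first
 * element of list, so no separate start-time header is needed.
 */
int show_entry(char *buf, size_t len, const struct cache_entry *list,
               const struct cache_entry *cur)
{
    int n = 0;

    if (cur == list)
        n = snprintf(buf, len, "# name objects\n");
    /* The sketch assumes the caller's buffer is large enough. */
    assert(n >= 0 && (size_t)n < len);
    n += snprintf(buf + n, len - (size_t)n, "%s %lu\n", cur->name, cur->objects);
    return n;
}
```

Because the header decision lives in the show step, a second report that wants no header at all could reuse the same iteration logic and only differ in its show callback.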
2014-12-10 mm: slab/slub: coding style: whitespaces and tabs mixture (LQYMGT)
Some code in mm/slab.c and mm/slub.c use whitespaces in indent. Clean them up. Signed-off-by: LQYMGT <lqymgt@gmail.com> Acked-by: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-10 fs/char_dev.c: remove pointless assignment from __register_chrdev_region() (Jan Kara)
At one place we assign major number we found to ret. That assignment is then never used and actually doesn't make any sense given how the code is currently structured (the assignment comes from pre-git times). Just remove it. Coverity id: 1226852. Signed-off-by: Jan Kara <jack@suse.cz> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-10 ocfs2: remove unneeded NULL check (Dan Carpenter)
In commit 1faf289454b9 ("ocfs2_dlm: disallow a domain join if node maps mismatch") we introduced a new earlier NULL check so this one is not needed. Also static checkers complain because we dereference it first and then check for NULL. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-10 ocfs2: remove bogus NULL check in ocfs2_move_extents() (Dan Carpenter)
"inode" isn't NULL here, and also we dereference it on the previous line so static checkers get annoyed. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-10 ocfs2: do not set filesystem readonly if link down (jiangyiwen)
Do not set the filesystem readonly if the storage link is down. In this case, metadata is not corrupted and only -EIO is returned. And if it is indeed corrupted metadata, it has already called ocfs2_error() in ocfs2_validate_inode_block(). Signed-off-by: Yiwen Jiang <jiangyiwen@huawei.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-10 ocfs2: do not set OCFS2_LOCK_UPCONVERT_FINISHING if nonblocking lock can not be granted at once (Xue jiufei)
ocfs2_readpages() uses the nonblocking flag to avoid page lock inversion. This can trigger a cluster hang because the flag OCFS2_LOCK_UPCONVERT_FINISHING is not cleared if the nonblocking lock cannot be granted at once. The flag then prevents the dc thread from downconverting, so other nodes can never acquire this lockres. So we should not set OCFS2_LOCK_UPCONVERT_FINISHING when receiving the ast if the nonblocking lock has already returned. Signed-off-by: joyce.xue <xuejiufei@huawei.com> Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-10 ocfs2: fix error handling when creating debugfs root in ocfs2_init() (Jan Kara)
Error handling if creation of root of debugfs in ocfs2_init() fails is broken. Although error code is set we fail to exit ocfs2_init() with error and thus initialization ends with success. Later when mounting a filesystem, ocfs2 debugfs entries end up being created in the root of debugfs filesystem which is confusing. Fix the error handling to bail out. Coverity id: 1227009. Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Joseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-10 ocfs2: remove filesize checks for sync I/O journal commit (Goldwyn Rodrigues)
Filesize is not a good indication that the file needs to be synced. An example where this breaks is:

1. Open the file in O_SYNC|O_RDWR
2. Read a small portion of the file (say 64 bytes)
3. Lseek to the start of the file
4. Write 64 bytes

If the node crashes, it is not written out to disk because this was not committed in the journal and the other node which reads the file after recovery reads stale data (even if the write on the other node was successful). Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.de> Reviewed-by: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-10 ocfs2: o2net: fix connect expired (Junxiao Bi)
Setting nn_persistent_error to -ENOTCONN will stop reconnecting, since the "stop" condition in o2net_start_connect() will be true:

  stop = (nn->nn_sc || (nn->nn_persistent_error && (nn->nn_persistent_error != -ENOTCONN || timeout == 0)));

This means the connection is never established if the first connection request is lost. Set nn_persistent_error to 0 when the connect expires to fix this. With this change, dlm will not be woken up when the connect expires. This is OK: dlm depends on the network and can do nothing in this case if woken up. Let it wait for the network to recover and the connection to be built again before continuing. Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
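To see how the quoted "stop" expression behaves, it can be lifted into a small standalone predicate. This is a sketch only: `connect_should_stop()` is a hypothetical name, and `has_sc` stands in for a non-NULL nn->nn_sc.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Same boolean structure as the quoted expression, with plain parameters. */
bool connect_should_stop(bool has_sc, int persistent_error, unsigned int timeout)
{
    return has_sc ||
           (persistent_error &&
            (persistent_error != -ENOTCONN || timeout == 0));
}
```

With a persistent error of -ENOTCONN and a timeout of 0 the predicate is true, while a persistent error of 0 makes it depend only on `has_sc`, which matches the fix of resetting the error to 0 when the connect expires.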
2014-12-10 ocfs2: o2dlm: fix a race between purge and master query (Srinivas Eeda)
Node A sends a master query request to node B, which is the master. At this time the lockres happens to be on the purgelist. dlm_master_request_handler gets the dlm spinlock, finds the resource and releases the dlm spinlock. Right at this point, dlm_thread on this node could purge the lockres. dlm_master_request_handler can then acquire the lockres spinlock and reply to node A that node B is the master even though the lockres on node B is purged. The above scenario will make node A falsely think node B is the master, which is inconsistent. Further, if another node C tries to master the same resource, every node will respond that it is not the master. Node C then masters the resource and sends an assert master to all nodes. This will make node A crash with the following message:

  dlm_assert_master_handler:1831 ERROR: DIE! Mastery assert from 9, but current owner is 10!

Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Reviewed-by: Wengang Wang <wen.gang.wang@oracle.com> Tested-by: Joseph Qi <joseph.qi@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-10 ocfs2: report error from o2hb_do_disk_heartbeat() to user (Jan Kara)
Report return value of o2hb_do_disk_heartbeat() as a part of ML_HEARTBEAT message so that we know whether a heartbeat actually happened or not. This also makes assigned but otherwise unused 'ret' variable used. Coverity id: 1227053. Signed-off-by: Jan Kara <jack@suse.cz> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-10 ocfs2: remove bogus test from ocfs2_read_locked_inode() (Jan Kara)
'args' are always set for ocfs2_read_locked_inode() and brelse() checks whether bh is NULL. So the test (args && bh) is unnecessary (plus the args part is really confusing anyway). Remove it. Coverity id: 1128856. Signed-off-by: Jan Kara <jack@suse.cz> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-10 ocfs2: Fix xattr check in ocfs2_get_xattr_nolock() (Jan Kara)
ocfs2_get_xattr_nolock() checks whether inode has any extended attributes (OCFS2_HAS_XATTR_FL). If not, it just sets 'ret' to -ENODATA but continues with checking inline and external attributes anyway (which is pointless although it does not harm). Just return immediately when we know there are no extended attributes in the inode. Coverity id: 1226906. Signed-off-by: Jan Kara <jack@suse.cz> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-10 ocfs2: fix an off-by-one BUG_ON() statement (Dan Carpenter)
The ->si_slots[] array is allocated in ocfs2_init_slot_info() and has "->max_slots" number of elements, so this test should be >= instead of >. Static checker work. Compile tested only. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
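The difference between the two comparisons is easiest to see at the boundary index. A minimal sketch, assuming a hypothetical array size `MAX_SLOTS` and helper names that are not from the ocfs2 source:

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_SLOTS 8  /* hypothetical number of array elements */

/* Valid indexes for an array of MAX_SLOTS elements are 0 .. MAX_SLOTS - 1. */

/* Buggy bound: lets slot == MAX_SLOTS through, one past the last element. */
bool slot_out_of_range_buggy(int slot)
{
    return slot < 0 || slot > MAX_SLOTS;
}

/* Fixed bound: rejects slot == MAX_SLOTS as well. */
bool slot_out_of_range_fixed(int slot)
{
    return slot < 0 || slot >= MAX_SLOTS;
}
```

Only the index equal to the element count separates the two checks, which is why such bugs tend to be found by static checkers rather than by testing.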
2014-12-10 ocfs2/dlm: let sender retry if dlm_dispatch_assert_master failed with -ENOMEM (Joseph Qi)
Do not BUG() if GFP_ATOMIC allocation fails in dlm_dispatch_assert_master. Instead, return -ENOMEM to the sender and then retry. Signed-off-by: Joseph Qi <joseph.qi@huawei.com> Reviewed-by: Alex Chen <alex.chen@huawei.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-10 sh: off by one BUG_ON() in setup_bootmem_node() (Dan Carpenter)
This off by one bug is harmless but it upsets the static checkers and the code is obvious so it doesn't hurt to fix it. The Smatch warning is: arch/sh/mm/numa.c:47 setup_bootmem_node() error: buffer overflow 'node_data' 1024 <= 1024 Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Paul Mundt <lethal@linux-sh.org> Acked-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-10 scripts/kernel-doc: don't eat struct members with __aligned (Johannes Berg)
The change from \d+ to .+ inside __aligned() means that the following structure:

  struct test {
          u8 a __aligned(2);
          u8 b __aligned(2);
  };

essentially gets modified to

  struct test {
          u8 a;
  };

for purposes of kernel-doc, thus dropping a struct member, which in turn causes warnings and invalid kernel-doc generation. Fix this by replacing the catch-all (".") with anything that's not a semicolon ("[^;]"). Fixes: 9dc30918b23f ("scripts/kernel-doc: handle struct member __aligned without numbers") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Cc: Nishanth Menon <nm@ti.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Michal Marek <mmarek@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-10 dma-debug: prevent early callers from crashing (Florian Fainelli)
dma_debug_init() is called by architecture specific code at different levels, but typically as a fs_initcall due to the debugfs initialization. Some platforms may have early callers of the DMA-API, running prior to the fs_initcall() level, which is not much of an issue unless CONFIG_DMA_API_DEBUG is set. When the DMA-API debugging facilities are turned on a caller will go through: debug_dma_map_{single,page} -> dma_mapping_error (inline function usually) -> debug_dma_mapping_error -> get_hash_bucket. Calling get_hash_bucket() returns a valid hash value since we hash on high bits of the dma_addr cookie, but we will grab an uninitialized spinlock, which typically won't crash but will produce a warning; the real crash, however, will happen during the bucket list traversal because the list has not been initialized yet. An obvious solution is of course to move some of the offenders to run after the fs_initcall level, but since this might not always be an option, we add a flag "dma_debug_initialized" which is set to false by default, and set to true once dma_debug_init() has had a chance to run. The dma_debug_disabled() helper function previously introduced just needs to check for dma_debug_initialized to allow the caller to proceed or not. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Horia Geanta <horia.geanta@freescale.com> Cc: Brian Norris <computersforpeace@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
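The guard described above, where callers bail out until initialization has run, can be sketched as follows. All names carry a `_sketch` suffix to mark them as hypothetical; `entries_tracked` is an invented stand-in for the hash bucket state that early callers must not touch.

```c
#include <assert.h>
#include <stdbool.h>

bool global_disable_sketch;     /* debugging switched off globally */
bool debug_initialized_sketch;  /* false by default, set once init has run */
int entries_tracked;            /* stand-in for the bucket lists */

/* Stand-in for the init function: marks the state as ready to use. */
void debug_init_sketch(void)
{
    debug_initialized_sketch = true;
}

/* Disabled if switched off globally or if init has not run yet. */
bool debug_disabled_sketch(void)
{
    return global_disable_sketch || !debug_initialized_sketch;
}

/* A mapping hook: early callers return before touching any tracked state. */
void debug_map_sketch(void)
{
    if (debug_disabled_sketch())
        return;
    entries_tracked++;
}
```

Folding the flag into the existing "disabled" helper means every hook that already checks the helper gets the early-caller protection without further changes.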
2014-12-10 dma-debug: introduce dma_debug_disabled (Florian Fainelli)
Add a helper function which returns whether the DMA debugging API is disabled, right now we only check for global_disable, but in order to accommodate early callers of the DMA-API, we will check for more initialization flags in the next patch. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Horia Geanta <horia.geanta@freescale.com> Cc: Brian Norris <computersforpeace@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-10 fs/cifs/smb2file.c: replace count*size kzalloc by kcalloc (Fabian Frederick)
kcalloc manages count*sizeof overflow. Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Steve French <sfrench@samba.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-10 fs/cifs/file.c: replace count*size kzalloc by kcalloc (Fabian Frederick)
kcalloc manages count*sizeof overflow. Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Steve French <sfrench@samba.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
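The kind of count*size overflow check that motivates this conversion can be sketched in userspace C. This is an analogue, not kernel code: `zalloc_array_sketch()` is a hypothetical helper built on the standard calloc().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Zeroed array allocation that refuses requests whose total size
 * n * size would not fit in size_t, instead of silently wrapping
 * around and allocating a buffer that is too small.
 */
void *zalloc_array_sketch(size_t n, size_t size)
{
    if (size != 0 && n > SIZE_MAX / size)
        return NULL;
    return calloc(n, size);
}
```

An open-coded multiplication passed to a single-size allocator has no such check, so a wrapped product yields a short buffer and later out-of-bounds writes.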
2014-12-10 fs/cifs: remove obsolete __constant (Fabian Frederick)
Replace all __constant_foo to foo() except in smb2status.h (1700 lines to update). Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Steve French <sfrench@samba.org> Cc: Jeff Layton <jlayton@poochiereds.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-10 mm/CMA: fix boot regression due to physical address of high_memory (Joonsoo Kim)
high_memory isn't direct mapped memory so retrieving its physical address isn't appropriate. But, it would be useful to check the physical address of the highmem boundary so it's justifiable to get the physical address from it. In x86, there is a validation check if CONFIG_DEBUG_VIRTUAL and it triggers the following boot failure reported by Ingo.

  ...
  BUG: Int 6: CR2 00f06f53
  ...
  Call Trace:
    dump_stack+0x41/0x52
    early_idt_handler+0x6b/0x6b
    cma_declare_contiguous+0x33/0x212
    dma_contiguous_reserve_area+0x31/0x4e
    dma_contiguous_reserve+0x11d/0x125
    setup_arch+0x7b5/0xb63
    start_kernel+0xb8/0x3e6
    i386_start_kernel+0x79/0x7d

To fix the boot regression, this patch implements a workaround to avoid the validation check in x86 when retrieving the physical address of high_memory. __pa_nodebug() used by this patch is implemented only in x86 so there is no choice but to use a dirty #ifdef. [akpm@linux-foundation.org: tweak comment] Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Reported-by: Ingo Molnar <mingo@kernel.org> Tested-by: Ingo Molnar <mingo@kernel.org> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-11drm/doc: Document drm_add_modes_noedid() usageLaurent Pinchart
And fix a spelling mistake. Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-12-11Merge tag 'topic/core-stuff-2014-12-10' of ↵Dave Airlie
git://anongit.freedesktop.org/drm-intel into drm-next Merge drm core fixes from Daniel. * tag 'topic/core-stuff-2014-12-10' of git://anongit.freedesktop.org/drm-intel: drm: Zero out DRM object memory upon cleanup drm: fix a typo in a comment drm: fix a word repetition in a comment drm: Fix memory leak at error path of drm_read() drm/Documentation: Fix rowspan value in drm-kms-properties drm/edid: Restore kerneldoc consistency drm/edid: new drm_edid_block_checksum helper function V3 drm/edid: shorten log output in case of all zeroes edid block drm/edid: move drm_edid_is_zero to top, make edid argument const
2014-12-10Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull VFS changes from Al Viro: "First pile out of several (there _definitely_ will be more). Stuff in this one: - unification of d_splice_alias()/d_materialize_unique() - iov_iter rewrite - killing a bunch of ->f_path.dentry users (and f_dentry macro). Getting that completed will make life much simpler for unionmount/overlayfs, since then we'll be able to limit the places sensitive to file _dentry_ to reasonably few. Which allows to have file_inode(file) pointing to inode in a covered layer, with dentry pointing to (negative) dentry in union one. Still not complete, but much closer now. - crapectomy in lustre (dead code removal, mostly) - "let's make seq_printf return nothing" preparations - assorted cleanups and fixes There _definitely_ will be more piles" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits) copy_from_iter_nocache() new helper: iov_iter_kvec() csum_and_copy_..._iter() iov_iter.c: handle ITER_KVEC directly iov_iter.c: convert copy_to_iter() to iterate_and_advance iov_iter.c: convert copy_from_iter() to iterate_and_advance iov_iter.c: get rid of bvec_copy_page_{to,from}_iter() iov_iter.c: convert iov_iter_zero() to iterate_and_advance iov_iter.c: convert iov_iter_get_pages_alloc() to iterate_all_kinds iov_iter.c: convert iov_iter_get_pages() to iterate_all_kinds iov_iter.c: convert iov_iter_npages() to iterate_all_kinds iov_iter.c: iterate_and_advance iov_iter.c: macros for iterating over iov_iter kill f_dentry macro dcache: fix kmemcheck warning in switch_names new helper: audit_file() nfsd_vfs_write(): use file_inode() ncpfs: use file_inode() kill f_dentry uses lockd: get rid of ->f_path.dentry->d_sb ...
2014-12-10Merge tag 'dlm-3.19' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm Pull dlm update from David Teigland: "This set includes one feature, which allows locks that have been orphaned to be reacquired" * tag 'dlm-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm: dlm: adopt orphan locks
2014-12-10Merge branch 'for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull quota updates from Jan Kara: "Quota improvements and some minor cleanups. The main portion in the pull request are changes which move i_dquot array from struct inode into fs-private part of an inode which saves memory for filesystems which don't use VFS quotas" * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: udf: One function call less in udf_fill_super() after error detection udf: Deletion of unnecessary checks before the function call "iput" jbd: Deletion of an unnecessary check before the function call "iput" vfs: Remove i_dquot field from inode jfs: Convert to private i_dquot field reiserfs: Convert to private i_dquot field ocfs2: Convert to private i_dquot field ext4: Convert to private i_dquot field ext3: Convert to private i_dquot field ext2: Convert to private i_dquot field quota: Use function to provide i_dquot pointers xfs: Set allowed quota types gfs2: Set allowed quota types quota: Allow each filesystem to specify which quota types it supports quota: Remove const from function declarations quota: Add log level to printk
2014-12-10Merge tag 'for-f2fs-3.19' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "This patch-set includes lots of bug fixes based on clean-ups and refactored codes. And inline_dir was introduced and two minor mount options were added. Details from signed tag: This series includes the following enhancement with refactored flows. - fix inmemory page operations - fix wrong inline_data & inline_dir logics - enhance memory and IO control under memory pressure - consider preemption on radix_tree operation - fix memory leaks and deadlocks But also, there are a couple of new features: - support inline_dir to store dentries inside inode page - add -o fastboot to reduce booting time - implement -o dirsync And a lot of clean-ups and minor bug fixes as well" * tag 'for-f2fs-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (88 commits) f2fs: avoid to ra unneeded blocks in recover flow f2fs: introduce is_valid_blkaddr to cleanup codes in ra_meta_pages f2fs: fix to enable readahead for SSA/CP blocks f2fs: use atomic for counting inode with inline_{dir,inode} flag f2fs: cleanup path to need cp at fsync f2fs: check if inode state is dirty at fsync f2fs: count the number of inmemory pages f2fs: release inmemory pages when the file was closed f2fs: set page private for inmemory pages for truncation f2fs: count inline_xx in do_read_inode f2fs: do retry operations with cond_resched f2fs: call radix_tree_preload before radix_tree_insert f2fs: use rw_semaphore for nat entry lock f2fs: fix missing kmem_cache_free f2fs: more fast lookup for gc_inode list f2fs: cleanup redundant macro f2fs: fix to return correct error number in f2fs_write_begin f2fs: cleanup if-statement of phase in gc_data_segment f2fs: fix to recover converted inline_data f2fs: make clean the page before writing ...
2014-12-10Merge branch 'for-linus' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull cifs update from Steve French: "Mostly cifs cleanup but also a few cifs fixes" * 'for-linus' of git://git.samba.org/sfrench/cifs-2.6: cifs: remove unneeded condition check Set UID in sess_auth_rawntlmssp_authenticate too cifs: convert printk(LEVEL...) to pr_<level> cifs: convert to print_hex_dump() instead of custom implementation cifs: call strtobool instead of custom implementation Update MAINTAINERS entry Update modinfo cifs version for cifs.ko decode_negTokenInit had wrong calling sequence Add missing defines for ACL query support Add support for original fallocate
2014-12-10Merge tag 'gfs2-merge-window' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-nmw Pull GFS2 update from Steven Whitehouse: "In contrast to recent merge windows, there are a number of interesting features this time: There is a set of patches to improve performance in relation to block reservations. Some correctness fixes for fallocate, and an update to the freeze/thaw code which greatly simplifies this code path. In addition there is a set of clean ups from Al Viro too" * tag 'gfs2-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-nmw: GFS2: gfs2_atomic_open(): simplify the use of finish_no_open() GFS2: gfs2_dir_get_hash_table(): avoiding deferred vfree() is easy here... GFS2: use kvfree() instead of open-coding it GFS2: gfs2_create_inode(): don't bother with d_splice_alias() GFS2: bugger off early if O_CREAT open finds a directory GFS2: Deletion of unnecessary checks before two function calls GFS2: update freeze code to use freeze/thaw_super on all nodes fs: add freeze_super/thaw_super fs hooks GFS2: Update timestamps on fallocate GFS2: Update i_size properly on fallocate GFS2: Use inode_newsize_ok and get_write_access in fallocate GFS2: If we use up our block reservation, request more next time GFS2: Only increase rs_sizehint GFS2: Set of distributed preferences for rgrps GFS2: directly return gfs2_dir_check()
2014-12-11ACPI / Fan: Use bus id as the name for non PNP0C0B (Fan) devicesSrinivas Pandruvada
The _ART (Active Cooling Relationship Table) specifies the relationship between heat generating sources and a target active cooling device such as a fan. The _ART table refers to the actual bus id name when specifying a relationship. Naming the device "Fan" is not enough to establish the relationship for user space thermal controllers, as the name in the _ART table can change on every platform. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-12-11intel_pstate: Add a few commentsKristen Carlson Accardi
Add a few comments in the code which calculates busyness to clarify parts of the algorithm. Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>