summaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)Author
2012-12-18memcg: add documentation about the kmem controllerGlauber Costa
Signed-off-by: Glauber Costa <glommer@parallels.com> Acked-by: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Frederic Weisbecker <fweisbec@redhat.com> Cc: Greg Thelen <gthelen@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: JoonSoo Kim <js1304@gmail.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Rik van Riel <riel@redhat.com> Cc: Suleiman Souhlal <suleiman@google.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18fork: protect architectures where THREAD_SIZE >= PAGE_SIZE against fork bombsGlauber Costa
Because those architectures will draw their stacks directly from the page allocator, rather than the slab cache, we can directly pass __GFP_KMEMCG flag, and issue the corresponding free_pages. This code path is taken when the architecture doesn't define CONFIG_ARCH_THREAD_INFO_ALLOCATOR (only ia64 seems to), and has THREAD_SIZE >= PAGE_SIZE. Luckily, most - if not all - of the remaining architectures fall in this category. This will guarantee that every stack page is accounted to the memcg the process currently lives on, and will have the allocations to fail if they go over limit. For the time being, I am defining a new variant of THREADINFO_GFP, not to mess with the other path. Once the slab is also tracked by memcg, we can get rid of that flag. Tested to successfully protect against :(){ :|:& };: Signed-off-by: Glauber Costa <glommer@parallels.com> Acked-by: Frederic Weisbecker <fweisbec@redhat.com> Acked-by: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Reviewed-by: Michal Hocko <mhocko@suse.cz> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Greg Thelen <gthelen@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: JoonSoo Kim <js1304@gmail.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Rik van Riel <riel@redhat.com> Cc: Suleiman Souhlal <suleiman@google.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18memcg: execute the whole memcg freeing in free_worker()Glauber Costa
A lot of the initialization we do in mem_cgroup_create() is done with softirqs enabled. This include grabbing a css id, which holds &ss->id_lock->rlock, and the per-zone trees, which holds rtpz->lock->rlock. All of those signal to the lockdep mechanism that those locks can be used in SOFTIRQ-ON-W context. This means that the freeing of memcg structure must happen in a compatible context, otherwise we'll get a deadlock, like the one below, caught by lockdep: free_accounted_pages+0x47/0x4c free_task+0x31/0x5c __put_task_struct+0xc2/0xdb put_task_struct+0x1e/0x22 delayed_put_task_struct+0x7a/0x98 __rcu_process_callbacks+0x269/0x3df rcu_process_callbacks+0x31/0x5b __do_softirq+0x122/0x277 This usage pattern could not be triggered before kmem came into play. With the introduction of kmem stack handling, it is possible that we call the last mem_cgroup_put() from the task destructor, which is run in an rcu callback. Such callbacks are run with softirqs disabled, leading to the offensive usage pattern. In general, we have little, if any, means to guarantee in which context the last memcg_put will happen. The best we can do is test it and try to make sure no invalid context releases are happening. But as we add more code to memcg, the possible interactions grow in number and expose more ways to get context conflicts. One thing to keep in mind, is that part of the freeing process is already deferred to a worker, such as vfree(), that can only be called from process context. For the moment, the only two functions we really need moved away are: * free_css_id(), and * mem_cgroup_remove_from_trees(). But because the later accesses per-zone info, free_mem_cgroup_per_zone_info() needs to be moved as well. With that, we are left with the per_cpu stats only. Better move it all. Signed-off-by: Glauber Costa <glommer@parallels.com> Tested-by: Greg Thelen <gthelen@google.com> Acked-by: Michal Hocko <mhocko@suse.cz> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Frederic Weisbecker <fweisbec@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: JoonSoo Kim <js1304@gmail.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Rik van Riel <riel@redhat.com> Cc: Suleiman Souhlal <suleiman@google.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18memcg: allow a memcg with kmem charges to be destructedGlauber Costa
Because the ultimate goal of the kmem tracking in memcg is to track slab pages as well, we can't guarantee that we'll always be able to point a page to a particular process, and migrate the charges along with it - since in the common case, a page will contain data belonging to multiple processes. Because of that, when we destroy a memcg, we only make sure the destruction will succeed by discounting the kmem charges from the user charges when we try to empty the cgroup. Signed-off-by: Glauber Costa <glommer@parallels.com> Acked-by: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Reviewed-by: Michal Hocko <mhocko@suse.cz> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Frederic Weisbecker <fweisbec@redhat.com> Cc: Greg Thelen <gthelen@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: JoonSoo Kim <js1304@gmail.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Rik van Riel <riel@redhat.com> Cc: Suleiman Souhlal <suleiman@google.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18memcg: use static branches when code not in useGlauber Costa
We can use static branches to patch the code in or out when not used. Because the _ACTIVE bit on kmem_accounted is only set after the increment is done, we guarantee that the root memcg will always be selected for kmem charges until all call sites are patched (see memcg_kmem_enabled). This guarantees that no mischarges are applied. Static branch decrement happens when the last reference count from the kmem accounting in memcg dies. This will only happen when the charges drop down to 0. When that happens, we need to disable the static branch only on those memcgs that enabled it. To achieve this, we would be forced to complicate the code by keeping track of which memcgs were the ones that actually enabled limits, and which ones got it from its parents. It is a lot simpler just to do static_key_slow_inc() on every child that is accounted. Signed-off-by: Glauber Costa <glommer@parallels.com> Acked-by: Michal Hocko <mhocko@suse.cz> Acked-by: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Suleiman Souhlal <suleiman@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Frederic Weisbecker <fweisbec@redhat.com> Cc: Greg Thelen <gthelen@google.com> Cc: JoonSoo Kim <js1304@gmail.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18memcg: kmem accounting lifecycle managementGlauber Costa
Because kmem charges can outlive the cgroup, we need to make sure that we won't free the memcg structure while charges are still in flight. For reviewing simplicity, the charge functions will issue mem_cgroup_get() at every charge, and mem_cgroup_put() at every uncharge. This can get expensive, however, and we can do better. mem_cgroup_get() only really needs to be issued once: when the first limit is set. In the same spirit, we only need to issue mem_cgroup_put() when the last charge is gone. We'll need an extra bit in kmem_account_flags for that: KMEM_ACCOUNTED_DEAD. it will be set when the cgroup dies, if there are charges in the group. If there aren't, we can proceed right away. Our uncharge function will have to test that bit every time the charges drop to 0. Because that is not the likely output of res_counter_uncharge, this should not impose a big hit on us: it is certainly much better than a reference count decrease at every operation. Signed-off-by: Glauber Costa <glommer@parallels.com> Acked-by: Michal Hocko <mhocko@suse.cz> Acked-by: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Suleiman Souhlal <suleiman@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Frederic Weisbecker <fweisbec@redhat.com> Cc: Greg Thelen <gthelen@google.com> Cc: JoonSoo Kim <js1304@gmail.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18res_counter: return amount of charges after res_counter_uncharge()Glauber Costa
It is useful to know how many charges are still left after a call to res_counter_uncharge. While it is possible to issue a res_counter_read after uncharge, this can be racy. If we need, for instance, to take some action when the counters drop down to 0, only one of the callers should see it. This is the same semantics as the atomic variables in the kernel. Since the current return value is void, we don't need to worry about anything breaking due to this change: nobody relied on that, and only users appearing from now on will be checking this value. Signed-off-by: Glauber Costa <glommer@parallels.com> Reviewed-by: Michal Hocko <mhocko@suse.cz> Acked-by: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Suleiman Souhlal <suleiman@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Christoph Lameter <cl@linux.com> Cc: Frederic Weisbecker <fweisbec@redhat.com> Cc: Greg Thelen <gthelen@google.com> Cc: JoonSoo Kim <js1304@gmail.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18mm: allocate kernel pages to the right memcgGlauber Costa
When a process tries to allocate a page with the __GFP_KMEMCG flag, the page allocator will call the corresponding memcg functions to validate the allocation. Tasks in the root memcg can always proceed. To avoid adding markers to the page - and a kmem flag that would necessarily follow, as much as doing page_cgroup lookups for no reason, whoever is marking its allocations with __GFP_KMEMCG flag is responsible for telling the page allocator that this is such an allocation at free_pages() time. This is done by the invocation of __free_accounted_pages() and free_accounted_pages(). Signed-off-by: Glauber Costa <glommer@parallels.com> Acked-by: Michal Hocko <mhocko@suse.cz> Acked-by: Mel Gorman <mgorman@suse.de> Acked-by: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Suleiman Souhlal <suleiman@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Frederic Weisbecker <fweisbec@redhat.com> Cc: Greg Thelen <gthelen@google.com> Cc: JoonSoo Kim <js1304@gmail.com> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18memcg: kmem controller infrastructureGlauber Costa
Introduce infrastructure for tracking kernel memory pages to a given memcg. This will happen whenever the caller includes the flag __GFP_KMEMCG flag, and the task belong to a memcg other than the root. In memcontrol.h those functions are wrapped in inline acessors. The idea is to later on, patch those with static branches, so we don't incur any overhead when no mem cgroups with limited kmem are being used. Users of this functionality shall interact with the memcg core code through the following functions: memcg_kmem_newpage_charge: will return true if the group can handle the allocation. At this point, struct page is not yet allocated. memcg_kmem_commit_charge: will either revert the charge, if struct page allocation failed, or embed memcg information into page_cgroup. memcg_kmem_uncharge_page: called at free time, will revert the charge. Signed-off-by: Glauber Costa <glommer@parallels.com> Acked-by: Michal Hocko <mhocko@suse.cz> Acked-by: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Tejun Heo <tj@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Frederic Weisbecker <fweisbec@redhat.com> Cc: Greg Thelen <gthelen@google.com> Cc: JoonSoo Kim <js1304@gmail.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Rik van Riel <riel@redhat.com> Cc: Suleiman Souhlal <suleiman@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18mm: add a __GFP_KMEMCG flagGlauber Costa
This flag is used to indicate to the callees that this allocation is a kernel allocation in process context, and should be accounted to current's memcg. Signed-off-by: Glauber Costa <glommer@parallels.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Rik van Riel <riel@redhat.com> Acked-by: Mel Gorman <mel@csn.ul.ie> Acked-by: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Michal Hocko <mhocko@suse.cz> Acked-by: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Suleiman Souhlal <suleiman@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Frederic Weisbecker <fweisbec@redhat.com> Cc: Greg Thelen <gthelen@google.com> Cc: JoonSoo Kim <js1304@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18memcg: kmem accounting basic infrastructureGlauber Costa
Add the basic infrastructure for the accounting of kernel memory. To control that, the following files are created: * memory.kmem.usage_in_bytes * memory.kmem.limit_in_bytes * memory.kmem.failcnt * memory.kmem.max_usage_in_bytes They have the same meaning of their user memory counterparts. They reflect the state of the "kmem" res_counter. Per cgroup kmem memory accounting is not enabled until a limit is set for the group. Once the limit is set the accounting cannot be disabled for that group. This means that after the patch is applied, no behavioral changes exists for whoever is still using memcg to control their memory usage, until memory.kmem.limit_in_bytes is set for the first time. We always account to both user and kernel resource_counters. This effectively means that an independent kernel limit is in place when the limit is set to a lower value than the user memory. A equal or higher value means that the user limit will always hit first, meaning that kmem is effectively unlimited. People who want to track kernel memory but not limit it, can set this limit to a very high number (like RESOURCE_MAX - 1page - that no one will ever hit, or equal to the user memory) [akpm@linux-foundation.org: MEMCG_MMEM only works with slab and slub] Signed-off-by: Glauber Costa <glommer@parallels.com> Acked-by: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Tejun Heo <tj@kernel.org> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Frederic Weisbecker <fweisbec@redhat.com> Cc: Greg Thelen <gthelen@google.com> Cc: JoonSoo Kim <js1304@gmail.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18memcg: change defines to an enumGlauber Costa
This is just a cleanup patch for clarity of expression. In earlier submissions, people asked it to be in a separate patch, so here it is. Signed-off-by: Glauber Costa <glommer@parallels.com> Acked-by: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Michal Hocko <mhocko@suse.cz> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: David Rientjes <rientjes@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Christoph Lameter <cl@linux.com> Cc: Frederic Weisbecker <fweisbec@redhat.com> Cc: Greg Thelen <gthelen@google.com> Cc: JoonSoo Kim <js1304@gmail.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Rik van Riel <riel@redhat.com> Cc: Suleiman Souhlal <suleiman@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18memcg: reclaim when more than one page neededSuleiman Souhlal
mem_cgroup_do_charge() was written before kmem accounting, and expects three cases: being called for 1 page, being called for a stock of 32 pages, or being called for a hugepage. If we call for 2 or 3 pages (and both the stack and several slabs used in process creation are such, at least with the debug options I had), it assumed it's being called for stock and just retried without reclaiming. Fix that by passing down a minsize argument in addition to the csize. And what to do about that (csize == PAGE_SIZE && ret) retry? If it's needed at all (and presumably is since it's there, perhaps to handle races), then it should be extended to more than PAGE_SIZE, yet how far? And should there be a retry count limit, of what? For now retry up to COSTLY_ORDER (as page_alloc.c does) and make sure not to do it if __GFP_NORETRY. v4: fixed nr pages calculation pointed out by Christoph Lameter. Signed-off-by: Suleiman Souhlal <suleiman@google.com> Signed-off-by: Glauber Costa <glommer@parallels.com> Acked-by: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Michal Hocko <mhocko@suse.cz> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: David Rientjes <rientjes@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Frederic Weisbecker <fweisbec@redhat.com> Cc: Greg Thelen <gthelen@google.com> Cc: JoonSoo Kim <js1304@gmail.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18memcg: make it possible to use the stock for more than one pageSuleiman Souhlal
We currently have a percpu stock cache scheme that charges one page at a time from memcg->res, the user counter. When the kernel memory controller comes into play, we'll need to charge more than that. This is because kernel memory allocations will also draw from the user counter, and can be bigger than a single page, as it is the case with the stack (usually 2 pages) or some higher order slabs. [glommer@parallels.com: added a changelog ] Signed-off-by: Suleiman Souhlal <suleiman@google.com> Signed-off-by: Glauber Costa <glommer@parallels.com> Acked-by: David Rientjes <rientjes@google.com> Acked-by: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Michal Hocko <mhocko@suse.cz> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Tejun Heo <tj@kernel.org> Cc: Christoph Lameter <cl@linux.com> Cc: Frederic Weisbecker <fweisbec@redhat.com> Cc: Greg Thelen <gthelen@google.com> Cc: JoonSoo Kim <js1304@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Rik van Riel <riel@redhat.com> Cc: Suleiman Souhlal <suleiman@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18memory-hotplug: document and enable CONFIG_MOVABLE_NODETang Chen
Add help info for CONFIG_MOVABLE_NODE and permit its selection. This option allows the user to online all memory of a node as movable memory. So that the whole node can be hotplugged. Users who don't use the hotplug feature are also fine with this option on since they won't online memory as movable. Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com> Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Wen Congyang <wency@cn.fujitsu.com> Cc: Ingo Molnar <mingo@elte.hu> [akpm@linux-foundation.org: tweak help text] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18mm/page_alloc.c: remove duplicate checkGavin Shan
While allocating pages using buddy allocator, the compound page is probably split up to free pages. Under these circumstances, the compound page should be destroyed by destroy_compound_page(). However, there is a duplicate check to judge if the page is compound. Remove the duplicate check since the compound_order() returns 0 when the page doesn't have PG_head set in destroy_compound_page(). That is to say, destroy_compound_page() needn't check PageHead(). Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18drivers/message/fusion/mptscsih.c: missing breakAlan Cox
This happens to do the right thing in all cases on fibre channel but not on other media types Signed-off-by: Alan Cox <alan@linux.intel.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com> Cc: Kashyap Desai <kashyap.desai@lsi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18h8300: select generic atomic64_t supportFengguang Wu
Rationales from Eric: So I just looked a little deeper and it appears architectures that do not support atomic64_t are broken. The generic atomic64 support came in 2009 to support the perf subsystem with the expectation that all architectures would implement atomic64 support. Furthermore upon inspection of the kernel atomic64_t is used in a fair number of places beyond the performance counters: block/blk-cgroup.c drivers/acpi/apei/ drivers/block/rbd.c drivers/crypto/nx/nx.h drivers/gpu/drm/radeon/radeon.h drivers/infiniband/hw/ipath/ drivers/infiniband/hw/qib/ drivers/staging/octeon/ fs/xfs/ include/linux/perf_event.h include/net/netfilter/nf_conntrack_acct.h kernel/events/ kernel/trace/ net/mac80211/key.h net/rds/ The block control group, infiniband, xfs, crypto, 802.11, netfilter. Nothing quite so fundamental as fs/namespace.c but definitely in multiplatform-code that should work, and is already broken on those architecutres. Looking at the implementation of atomic64_add_return in lib/atomic64.c the code looks as efficient as these kinds of things get. Which leads me to the conclusion that we need atomic64 support on all architectures. Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18Coccinelle: add api/d_find_alias.cocciCyril Roelandt
Ensure that calls to d_find_alias() have a corresponding dput(). Signed-off-by: Cyril Roelandt <tipecaml@gmail.com> Cc: Julia Lawall <Julia.Lawall@lip6.fr> Cc: Gilles Muller <Gilles.Muller@lip6.fr> Cc: Nicolas Palix <nicolas.palix@imag.fr> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18irq: tsk->comm is an arrayAlan Cox
The array check is useless so remove it. [akpm@linux-foundation.org: remove comment, per David] Signed-off-by: Alan Cox <alan@linux.intel.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18ceph: fix dentry reference leak in ceph_encode_fh()Cyril Roelandt
dput() was not called in the error path. Signed-off-by: Cyril Roelandt <tipecaml@gmail.com> Cc: Sage Weil <sage@inktank.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18arch/x86/platform/iris/iris.c: register a platform device and a platform driverShérab
This makes the iris driver use the platform API, so it is properly exposed in /sys. [akpm@linux-foundation.org: remove commented-out code, add missing space to printk, clean up code layout] Signed-off-by: Shérab <Sebastien.Hinderer@ens-lyon.org> Cc: Len Brown <lenb@kernel.org> Cc: Matthew Garrett <mjg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18CRIS: fix I/O macrosCorey Minyard
The inb/outb macros for CRIS are broken from a number of points of view, missing () around parameters and they have an unprotected if statement in them. This was breaking the compile of IPMI on CRIS and thus I was being annoyed by build regressions, so I fixed them. Plus I don't think they would have worked at all, since the data values were missing "&" and the outsl had a "3" instead of a "4" for the size. From what I can tell, this stuff is not used at all, so this can't be any more broken than it was before, anyway. Signed-off-by: Corey Minyard <cminyard@mvista.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Mikael Starvik <starvik@axis.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18backlight: locomolcd: fix checkpatch error and warningJingoo Han
This patch fixes the checkpatch error and warning as below: WARNING: space prohibited between function name and open parenthesis '(' ERROR: trailing statements should be on next line Also, long comments are fixed for the preferred style and unnecessary lines are removed. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Cc: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18drm/i915: Preallocate the drm_mm_node prior to manipulating the GTT drm_mm ↵Chris Wilson
manager As we may reap neighbouring objects in order to free up pages for allocations, we need to be careful not to allocate in the middle of the drm_mm manager. To accomplish this, we can simply allocate the drm_mm_node up front and then use the combined search & insert drm_mm routines, reducing our code footprint in the process. Fixes (partially) i-g-t/gem_tiled_swapping Reported-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Jani Nikula <jani.nikula@intel.com> [danvet: Again fixup atomic bikeshed.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-12-18drm: Export routines for inserting preallocated nodes into the mm managerChris Wilson
Required by i915 in order to avoid the allocation in the middle of manipulating the drm_mm lists. Use a pair of stubs to preserve the existing EXPORT_SYMBOLs for backporting; to be removed later. Cc: Dave Airlie <airlied@redhat.com> Cc: dri-devel@lists.freedesktop.org Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Jani Nikula <jani.nikula@intel.com> [danvet: bikeshedded-away the atomic parameter, it's not yet used anywhere.] Acked-by: Dave Airlie <airlied@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-12-18Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull second round of input updates from Dmitry Torokhov: "As usual, there are a couple of new drivers, input core now supports managed input devices (devres), a slew of drivers now have device tree support and a bunch of fixes and cleanups." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (71 commits) Input: walkera0701 - fix crash on startup Input: matrix-keymap - provide a proper module license Input: gpio_keys_polled - switch to using gpio_request_one() Input: gpio_keys - switch to using gpio_request_one() Input: wacom - fix touch support for Bamboo Fun CTH-461 Input: xpad - add a few new VID/PID combinations Input: xpad - minor formatting fixes Input: gpio-keys-polled - honor 'autorepeat' setting in platform data Input: tca8418-keypad - switch to using managed resources Input: tca8418_keypad - increase severity of failures in probe() Input: tca8418_keypad - move device ID tables closer to where they are used Input: tca8418_keypad - use dev_get_platdata() to retrieve platform data Input: tca8418_keypad - use a temporary variable for parent device Input: tca8418_keypad - add support for shared interrupt Input: tca8418_keypad - add support for device tree bindings Input: remove Compaq iPAQ H3600 (Bitsy) touchscreen driver Input: bu21013_ts - add support for Device Tree booting Input: bu21013_ts - move GPIO init and exit functions into the driver Input: bu21013_ts - request regulator that actually exists ARM: ux500: Strip out duplicate touch screen platform information ...
2012-12-18Revert "Btrfs: MOD_LOG_KEY_REMOVE_WHILE_MOVING never change node's nritems"Chris Mason
This reverts commit 95c80bb1f6b24b57058d971ed252b2c1c5121b51. The bug addressed by this commit was fixed differently back in 3.6 Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-12-18Merge branch 'release' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux Pull powertool update from Len Brown: "This updates the tree w/ the latest version of turbostat, which reports temperature and - on SNB and later - Watts." Fix up semantic merge conflict as per Len. * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: tools: Allow tools to be installed in a user specified location tools/power: turbostat: make Makefile a bit more capable tools/power x86_energy_perf_policy: close /proc/stat in for_every_cpu() tools/power turbostat: v3.0: monitor Watts and Temperature tools/power turbostat: fix output buffering issue tools/power turbostat: prevent infinite loop on migration error path x86 power: define RAPL MSRs tools/power/x86/turbostat: share kernel MSR #defines
2012-12-18drm/i915: don't disable disconnected outputsDaniel Vetter
This piece of neat lore has been ported painstakingly and bug-for-bug compatible from the old crtc helper code. Imo it's utter nonsense. If you disconnected a cable and before you reconnect it, userspace (or the kernel) does an set_crtc call, this will result in that connector getting disabled. Which will result in a nice black screen when plugging in the cable again. There's absolutely no reason the kernel does such policy enforcements - if userspace tries to set up a mode on something disconnected we might fail loudly (since the dp link training fails), but silently adjusting the output configuration behind userspace's back is a recipe for disaster. Specifically I think that this could explain some of our MI_WAIT hangs around suspend, where userspace issues a scanline wait on a disable pipe. This mechanisims here could explain how that pipe got disabled without userspace noticing. Note that this fixes a NULL deref at BIOS takeover when the firmware sets up a disconnected output in a clone configuration with a connected output on the 2nd pipe: When doing the full modeset we don't have a mode for the 2nd pipe and OOPS. On the first pipe this doesn't matter, since at boot-up the fbdev helpers will set up the choosen configuration on that on first. Since this is now the umptenth bug around handling this imo brain-dead semantics correctly, I think it's time to kill it and see whether there's any userspace out there which relies on this. It also nicely demonstrates that we have a tiny window where DP hotplug can still kill the driver. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=58396 Cc: stable@vger.kernel.org Tested-by: Peter Ujfalusi <peter.ujfalusi@gmail.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-12-18Merge branch 'tip/perf/core-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull minor tracing updates and fixes from Steven Rostedt: "It seems that one of my old pull requests have slipped through. The changes are contained to just the files that I maintain, and are changes from others that I told I would get into this merge window. They have already been in linux-next for several weeks, and should be well tested." * 'tip/perf/core-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Remove unnecessary WARN_ONCE's from tracing_buffers_splice_read tracing: Remove unneeded checks from the stack tracer tracing: Add a resize function to make one buffer equivalent to another buffer
2012-12-18Merge tag 'stable/for-linus-3.8-rc0-bugfix-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen Pull Xen bugfixes from Konrad Rzeszutek Wilk: "Two fixes. One of them is caused by the recent change introduced by the 'x86-bsp-hotplug-for-linus' tip tree that inhibited bootup (old function does not do what it used to do). The other one is just a vanilla bug. - Fix to bootup regression introduced by 'x86-bsp-hotplug-for-linus' tip branch. - Fix to vcpu hotplug code." * tag 'stable/for-linus-3.8-rc0-bugfix-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen/vcpu: Fix vcpu restore path. xen: Add EVTCHNOP_reset in Xen interface header files. xen/smp: Use smp_store_boot_cpu_info() to store cpu info for BSP during boot time.
2012-12-18arch/tile: set CORE_DUMP_USE_REGSET on tileSimon Marchi
Following the previous patch which adds support for user_regset, tile can now use this feature. Signed-off-by: Simon Marchi <simon.marchi@polymtl.ca> Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
2012-12-18arch/tile: implement arch_ptrace using user_regset on tileSimon Marchi
This patch changes arch_ptrace on tile so that it uses user_regset to implement the PTRACE_GETREGS and PTRACE_SETREGS operations. Signed-off-by: Simon Marchi <simon.marchi@polymtl.ca> Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
2012-12-18arch/tile: implement user_regset interface on tileSimon Marchi
This is a basic implementation of user_regset for the tile architecture. It reuses the basic blocks that were already there. Signed-off-by: Simon Marchi <simon.marchi@polymtl.ca> Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
2012-12-18Merge branch 'slab/for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux Pull SLAB changes from Pekka Enberg: "This contains preparational work from Christoph Lameter and Glauber Costa for SLAB memcg and cleanups and improvements from Ezequiel Garcia and Joonsoo Kim. Please note that the SLOB cleanup commit from Arnd Bergmann already appears in your tree but I had also merged it myself which is why it shows up in the shortlog." * 'slab/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux: mm/sl[aou]b: Common alignment code slab: Use the new create_boot_cache function to simplify bootstrap slub: Use statically allocated kmem_cache boot structure for bootstrap mm, sl[au]b: create common functions for boot slab creation slab: Simplify bootstrap slub: Use correct cpu_slab on dead cpu mm: fix slab.c kernel-doc warnings mm/slob: use min_t() to compare ARCH_SLAB_MINALIGN slab: Ignore internal flags in cache creation mm/slob: Use free_page instead of put_page for page-size kmalloc allocations mm/sl[aou]b: Move common kmem_cache_size() to slab.h mm/slob: Use object_size field in kmem_cache_size() mm/slob: Drop usage of page->private for storing page-sized allocations slub: Commonize slab_cache field in struct page sl[au]b: Process slabinfo_show in common code mm/sl[au]b: Move print_slabinfo_header to slab_common.c mm/sl[au]b: Move slabinfo processing to slab_common.c slub: remove one code path and reduce lock contention in __slab_free()
2012-12-18Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull (again) user namespace infrastructure changes from Eric Biederman: "Those bugs, those darn embarrasing bugs just want don't want to get fixed. Linus I just updated my mirror of your kernel.org tree and it appears you successfully pulled everything except the last 4 commits that fix those embarrasing bugs. When you get a chance can you please repull my branch" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: userns: Fix typo in description of the limitation of userns_install userns: Add a more complete capability subset test to commit_creds userns: Require CAP_SYS_ADMIN for most uses of setns. Fix cap_capable to only allow owners in the parent user namespace to have caps.
2012-12-18Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/lliubbo/blackfin Pull blackfin update from Bob Liu. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lliubbo/blackfin: blackfin: SEC: clean up SEC interrupt initialization blackfin: kgdb: call generic_exec_single() directly blackfin: anomaly: add anomaly 16000030 for bf5xx Blackfin: dpmc: use module_platform_driver macro Blackfin: remove unused is_in_rom() Blackfin: remove unnecessary prototype for kobjsize() Blackfin: twi: Add missing __iomem annotation Blackfin: Annotate strnlen_user and strlen_user 'src' parameter with __user Blackfin: Annotate clear_user 'to' parameter with __user Blackfin: Add missing __user annotations to put_user Blackfin: Annotate strncpy_from_user src parameter with __user blackfin: Use Kbuild infrastructure for kvm_para.h UAPI: (Scripted) Disintegrate arch/blackfin/include/asm
2012-12-18Merge tag 'disintegrate-alpha-20121217' of ↵Linus Torvalds
git://git.infradead.org/users/dhowells/linux-headers Pull UAPI disintegration for Alpha from David Howells: "I've been asked to send the Alpha UAPI disintegration to you directly. The acks I have been given have been added into the patch." * tag 'disintegrate-alpha-20121217' of git://git.infradead.org/users/dhowells/linux-headers: UAPI: (Scripted) Disintegrate arch/alpha/include/asm
2012-12-18Merge tag 'for-3.8' of git://openrisc.net/~jonas/linuxLinus Torvalds
Pull OpenRISC update from Jonas Bonn: "Trivial cleanups for OpenRISC." * tag 'for-3.8' of git://openrisc.net/~jonas/linux: openrisc: use kbuild.h instead of defining macros in asm-offset.c openrisc: Use Kbuild infrastructure for kvm_para.h
2012-12-18Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 update #2 from Martin Schwidefsky: "The main patch is the function measurement blocks extension for PCI to do performance statistics and help with debugging. The other patch is a small cleanup in ccwdev.h." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/ccwdev: Include asm/schid.h. s390/pci: performance statistics and debug infrastructure
2012-12-18Merge branch 'next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc update from Benjamin Herrenschmidt: "The main highlight is probably some base POWER8 support. There's more to come such as transactional memory support but that will wait for the next one. Overall it's pretty quiet, or rather I've been pretty poor at picking things up from patchwork and reviewing them this time around and Kumar no better on the FSL side it seems..." * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (73 commits) powerpc+of: Rename and fix OF reconfig notifier error inject module powerpc: mpc5200: Add a3m071 board support powerpc/512x: don't compile any platform DIU code if the DIU is not enabled powerpc/mpc52xx: use module_platform_driver macro powerpc+of: Export of_reconfig_notifier_[register,unregister] powerpc/dma/raidengine: add raidengine device powerpc/iommu/fsl: Add PAMU bypass enable register to ccsr_guts struct powerpc/mpc85xx: Change spin table to cached memory powerpc/fsl-pci: Add PCI controller ATMU PM support powerpc/86xx: fsl_pcibios_fixup_bus requires CONFIG_PCI drivers/virt: the Freescale hypervisor driver doesn't need to check MSR[GS] powerpc/85xx: p1022ds: Use NULL instead of 0 for pointers powerpc: Disable relocation on exceptions when kexecing powerpc: Enable relocation on during exceptions at boot powerpc: Move get_longbusy_msecs into hvcall.h and remove duplicate function powerpc: Add wrappers to enable/disable relocation on exceptions powerpc: Add set_mode hcall powerpc: Setup relocation on exceptions for bare metal systems powerpc: Move initial mfspr LPCR out of __init_LPCR powerpc: Add relocation on exception vector handlers ...
2012-12-18x86, paravirt: fix build error when thp is disabledDavid Rientjes
With CONFIG_PARAVIRT=y and CONFIG_TRANSPARENT_HUGEPAGE=n, the build breaks because set_pmd_at() is undeclared: mm/memory.c: In function 'do_pmd_numa_page': mm/memory.c:3520: error: implicit declaration of function 'set_pmd_at' mm/mprotect.c: In function 'change_pmd_protnuma': mm/mprotect.c:120: error: implicit declaration of function 'set_pmd_at' This is because paravirt defines set_pmd_at() only when CONFIG_TRANSPARENT_HUGEPAGE=y and such a restriction is unneeded. The fix is to define it for all CONFIG_PARAVIRT configurations. Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18Merge branch 'for-linus' of git://git.open-osd.org/linux-open-osdLinus Torvalds
Pull exofs changes from Boaz Harrosh: "These are just 3 patches, the last two are bug fixes on the error paths in exofs. The important patch is the one to osd_uld which adds sysfs info to osd devices for use by user-mode clustering discovery software. I'm already sitting on this patch since before February this year, It is important for some of the big installation cluster systems, who's been compiling their own kernel just for that patch." Ugh. The osd_uld patch already went through the SCSI tree, so this was kind of pointless. But at least it has the two small error-path fixes.. * 'for-linus' of git://git.open-osd.org/linux-open-osd: exofs: don't leak io_state and pages on read error exofs: clean up the correct page collection on write error osduld: Add osdname & systemid sysfs at scsi_osd class
2012-12-18Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs update from Chris Mason: "A big set of fixes and features. In terms of line count, most of the code comes from Stefan, who added the ability to replace a single drive in place. This is different from how btrfs normally replaces drives, and is much much much faster. Josef is plowing through our synchronous write performance. This pull request does not include the DIO_OWN_WAITING patch that was discussed on the list, but it has a number of other improvements to cut down our latencies and CPU time during fsync/O_DIRECT writes. Miao Xie has a big series of fixes and is spreading out ordered operations over more CPUs. This improves performance and reduces contention. I've put in fixes for error handling around hash collisions. These are going back to individual stable kernels as I test against them. Otherwise we have a lot of fixes and cleanups, thanks everyone! raid5/6 is being rebased against the device replacement code. I'll have it posted this Friday along with a nice series of benchmarks." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (115 commits) Btrfs: fix a bug of per-file nocow Btrfs: fix hash overflow handling Btrfs: don't take inode delalloc mutex if we're a free space inode Btrfs: fix autodefrag and umount lockup Btrfs: fix permissions of empty files not affected by umask Btrfs: put raid properties into global table Btrfs: fix BUG() in scrub when first superblock reading gives EIO Btrfs: do not call file_update_time in aio_write Btrfs: only unlock and relock if we have to Btrfs: use tokens where we can in the tree log Btrfs: optimize leaf_space_used Btrfs: don't memset new tokens Btrfs: only clear dirty on the buffer if it is marked as dirty Btrfs: move checks in set_page_dirty under DEBUG Btrfs: log changed inodes based on the extent map tree Btrfs: add path->really_keep_locks Btrfs: do not mark ems as prealloc if we are writing to them Btrfs: keep track of the extents original block length Btrfs: inline csums if we're fsyncing Btrfs: don't bother copying if we're only logging the inode ...
2012-12-18Merge tag 'nfs-for-3.8-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds
Pull NFS client updates from Trond Myklebust: "Features include: - Full audit of BUG_ON asserts in the NFS, SUNRPC and lockd client code. Remove altogether where possible, and replace with WARN_ON_ONCE and appropriate error returns where not. - NFSv4.1 client adds session dynamic slot table management. There is matching server side code that has been submitted to Bruce for consideration. Together, this code allows the server to dynamically manage the amount of memory it allocates to the duplicate request cache for each client. It will constantly resize those caches to reserve more memory for clients that are hot while shrinking caches for those that are quiescent. In addition, there are assorted bugfixes for the generic NFS write code, fixes to deal with the drop_nlink() warnings, and yet another fix for NFSv4 getacl." * tag 'nfs-for-3.8-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (106 commits) SUNRPC: continue run over clients list on PipeFS event instead of break NFS: Don't use SetPageError in the NFS writeback code SUNRPC: variable 'svsk' is unused in function bc_send_request SUNRPC: Handle ECONNREFUSED in xs_local_setup_socket NFSv4.1: Deal effectively with interrupted RPC calls. NFSv4.1: Move the RPC timestamp out of the slot. NFSv4.1: Try to deal with NFS4ERR_SEQ_MISORDERED. NFS: nfs_lookup_revalidate should not trust an inode with i_nlink == 0 NFS: Fix calls to drop_nlink() NFS: Ensure that we always drop inodes that have been marked as stale nfs: Remove unused list nfs4_clientid_list nfs: Remove duplicate function declaration in internal.h NFS: avoid NULL dereference in nfs_destroy_server SUNRPC handle EKEYEXPIRED in call_refreshresult SUNRPC set gss gc_expiry to full lifetime nfs: fix page dirtying in NFS DIO read codepath nfs: don't zero out the rest of the page if we hit the EOF on a DIO READ NFSv4.1: Be conservative about the client highest slotid NFSv4.1: Handle NFS4ERR_BADSLOT errors correctly nfs: don't extend writes to cover entire page if pagecache is invalid ...
2012-12-18Merge tag 'md-3.8' of git://neil.brown.name/mdLinus Torvalds
Pull md update from Neil Brown: "Mostly just little fixes. Probably biggest part is AVX accelerated RAID6 calculations." * tag 'md-3.8' of git://neil.brown.name/md: md/raid5: add blktrace calls md/raid5: use async_tx_quiesce() instead of open-coding it. md: Use ->curr_resync as last completed request when cleanly aborting resync. lib/raid6: build proper files on corresponding arch lib/raid6: Add AVX2 optimized gen_syndrome functions lib/raid6: Add AVX2 optimized recovery functions md: Update checkpoint of resync/recovery based on time. md:Add place to update ->recovery_cp. md.c: re-indent various 'switch' statements. md: close race between removing and adding a device. md: removed unused variable in calc_sb_1_csm.
2012-12-18fs/ecryptfs/crypto.c: make ecryptfs_encode_for_filename() staticCong Ding
the function ecryptfs_encode_for_filename() is only used in this file Signed-off-by: Cong Ding <dinggnu@gmail.com> Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
2012-12-18eCryptfs: fix to use list_for_each_entry_safe() when delete itemsWei Yongjun
Since we will be removing items off the list using list_del() we need to use a safer version of the list_for_each_entry() macro aptly named list_for_each_entry_safe(). We should use the safe macro if the loop involves deletions of items. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> [tyhicks: Fixed compiler err - missing list_for_each_entry_safe() param] Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
2012-12-18ALSA: HDA: Fix sound resume hangDaniel J Blueman
Resuming a switcheroo'd HDA controller hangs since the completion is one-shot (thus works the first time). Fix by using completions that explictly need rearming, so remain fired before. Signed-off-by: Daniel J Blueman <daniel@quora.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>