Age | Commit message (Collapse) | Author |
|
This patch removes an artificial RapidIO bus root device and establishes
actual device hierarchy by providing reference to real parent devices.
It also introduces device class for RapidIO controller devices (on-chip
or an eternal bridge, known as "mport").
Existing implementation was sufficient for SoC-based platforms that have
a single RapidIO controller. With introduction of devices using
multiple RapidIO controllers and PCIe-to-RapidIO bridges the old scheme
is very limiting or does not work at all. The implemented changes allow
to properly reference platform's local RapidIO mport devices and provide
device details needed for upper layers.
This change to RapidIO device hierarchy does not break any known
existing kernel or user space interfaces.
Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Li Yang <leoli@freescale.com>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
Cc: Stef van Os <stef.van.os@prodrive-technologies.com>
Cc: Jerry Jacobs <jerry.jacobs@prodrive-technologies.com>
Cc: Arno Tiemersma <arno.tiemersma@prodrive-technologies.com>
Cc: Rob Landley <rob@landley.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Remove no longer used deprecated code, and make local functions
static.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Jean Delvare <jdelvare@suse.de>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Jeff Layton <jlayton@redhat.com>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: George Spelvin <linux@horizon.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Eliminate the following warning in proc/vmcore.c:
fs/proc/vmcore.c:1088:6: warning: no previous prototype for `vmcore_cleanup' [-Wmissing-prototypes]
[akpm@linux-foundation.org: clean up powerpc, remove unneeded EXPORT_SYMBOL]
Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
get_task_state() uses the most significant bit to report the state to
user-space, this means that EXIT_ZOMBIE->EXIT_TRACE->EXIT_DEAD transition
can be noticed via /proc as Z -> X -> Z change. Note that this was
possible even before EXIT_TRACE was introduced.
This is not really bad but imho it make sense to hide EXIT_TRACE from
user-space completely. So the patch simply swaps EXIT_ZOMBIE and
EXIT_DEAD, this way EXIT_TRACE will be seen as EXIT_ZOMBIE by user-space.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Jan Kratochvil <jan.kratochvil@redhat.com>
Cc: Michal Schmidt <mschmidt@redhat.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Lennart Poettering <lpoetter@redhat.com>
Cc: Roland McGrath <roland@hack.frob.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
wait_task_zombie() first does EXIT_ZOMBIE->EXIT_DEAD transition and
drops tasklist_lock. If this task is not the natural child and it is
traced, we change its state back to EXIT_ZOMBIE for ->real_parent.
The last transition is racy, this is even documented in 50b8d257486a
"ptrace: partially fix the do_wait(WEXITED) vs EXIT_DEAD->EXIT_ZOMBIE
race". wait_consider_task() tries to detect this transition and clear
->notask_error but we can't rely on ptrace_reparented(), debugger can
exit and do ptrace_unlink() before its sub-thread sets EXIT_ZOMBIE.
And there is another problem which were missed before: this transition
can also race with reparent_leader() which doesn't reset >exit_signal if
EXIT_DEAD, assuming that this task must be reaped by someone else. So
the tracee can be re-parented with ->exit_signal != SIGCHLD, and if
/sbin/init doesn't use __WALL it becomes unreapable. This was fixed by
the previous commit, but it was the temporary hack.
1. Add the new exit_state, EXIT_TRACE. It means that the task is the
traced zombie, debugger is going to detach and notify its natural
parent.
This new state is actually EXIT_ZOMBIE | EXIT_DEAD. This way we
can avoid the changes in proc/kgdb code, get_task_state() still
reports "X (dead)" in this case.
Note: with or without this change userspace can see Z -> X -> Z
transition. Not really bad, but probably makes sense to fix.
2. Change wait_task_zombie() to use EXIT_TRACE instead of EXIT_DEAD
if we need to notify the ->real_parent.
3. Revert the previous hack in reparent_leader(), now that EXIT_DEAD
is always the final state we can safely ignore such a task.
4. Change wait_consider_task() to check EXIT_TRACE separately and kill
the racy and no longer needed ptrace_reparented() case.
If ptrace == T an EXIT_TRACE thread should be simply ignored, the
owner of this state is going to ptrace_unlink() this task. We can
pretend that it was already removed from ->ptraced list.
Otherwise we should skip this thread too but clear ->notask_error,
we must be the natural parent and debugger is going to untrace and
notify us. IOW, this doesn't differ from "EXIT_ZOMBIE && p->ptrace"
even if the task was already untraced.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Jan Kratochvil <jan.kratochvil@redhat.com>
Reported-by: Michal Schmidt <mschmidt@redhat.com>
Tested-by: Michal Schmidt <mschmidt@redhat.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Lennart Poettering <lpoetter@redhat.com>
Cc: Roland McGrath <roland@hack.frob.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Starting from commit c4ad8f98bef7 ("execve: use 'struct filename *' for
executable name passing") bprm->filename can not go away after
flush_old_exec(), so we do not need to save the binary name in
bprm->tcomm[] added by 96e02d158678 ("exec: fix use-after-free bug in
setup_new_exec()").
And there was never need for filename_to_taskname-like code, we can
simply do set_task_comm(kbasename(filename).
This patch has to change set_task_comm() and trace_task_rename() to
accept "const char *", but I think this change is also good.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
LAST_CPUPID_MASK is calculated using LAST_CPUPID_WIDTH. However
LAST_CPUPID_WIDTH itself can be 0. (when LAST_CPUPID_NOT_IN_PAGE_FLAGS is
set). In such a case LAST_CPUPID_MASK turns out to be 0.
But with recent commit 1ae71d0319: (mm: numa: bugfix for
LAST_CPUPID_NOT_IN_PAGE_FLAGS) if LAST_CPUPID_MASK is 0,
page_cpupid_xchg_last() and page_cpupid_reset_last() causes
page->_last_cpupid to be set to 0.
This causes performance regression. Its almost as if numa_balancing is
off.
Fix LAST_CPUPID_MASK by using LAST_CPUPID_SHIFT instead of
LAST_CPUPID_WIDTH.
Some performance numbers and perf stats with and without the fix.
(3.14-rc6)
----------
numa01
Performance counter stats for '/usr/bin/time -f %e %S %U %c %w -o start_bench.out -a ./numa01':
12,27,462 cs [100.00%]
2,41,957 migrations [100.00%]
1,68,01,713 faults [100.00%]
7,99,35,29,041 cache-misses
98,808 migrate:mm_migrate_pages [100.00%]
1407.690148814 seconds time elapsed
numa02
Performance counter stats for '/usr/bin/time -f %e %S %U %c %w -o start_bench.out -a ./numa02':
63,065 cs [100.00%]
14,364 migrations [100.00%]
2,08,118 faults [100.00%]
25,32,59,404 cache-misses
12 migrate:mm_migrate_pages [100.00%]
63.840827219 seconds time elapsed
(3.14-rc6 with fix)
-------------------
numa01
Performance counter stats for '/usr/bin/time -f %e %S %U %c %w -o start_bench.out -a ./numa01':
9,68,911 cs [100.00%]
1,01,414 migrations [100.00%]
88,38,697 faults [100.00%]
4,42,92,51,042 cache-misses
4,25,060 migrate:mm_migrate_pages [100.00%]
685.965331189 seconds time elapsed
numa02
Performance counter stats for '/usr/bin/time -f %e %S %U %c %w -o start_bench.out -a ./numa02':
17,543 cs [100.00%]
2,962 migrations [100.00%]
1,17,843 faults [100.00%]
11,80,61,644 cache-misses
12,358 migrate:mm_migrate_pages [100.00%]
20.380132343 seconds time elapsed
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
s/MADV_NODUMP/MADV_DONTDUMP/
Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Commit f9acc8c7b35a ("readahead: sanify file_ra_state names") left
ra_submit with a single function call.
Move ra_submit to internal.h and inline it to save some stack. Thanks
to Andrew Morton for commenting different versions.
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
There's only one caller of set_page_dirty_balance() and that will call it
with page_mkwrite == 0.
The page_mkwrite argument was unused since commit b827e496c893 "mm: close
page_mkwrite races".
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
mem_cgroup_newpage_charge is used only for charging anonymous memory so
it is better to rename it to mem_cgroup_charge_anon.
mem_cgroup_cache_charge is used for file backed memory so rename it to
mem_cgroup_charge_file.
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Instead of returning NULL from try_get_mem_cgroup_from_mm() when the mm
owner is exiting, just return root_mem_cgroup. This makes sense for all
callsites and gets rid of some of them having to fallback manually.
[fengguang.wu@intel.com: fix warnings]
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
I tried to use 'dump_page(page, __func__)' for debugging, but it triggers
warning:
warning: passing argument 2 of `dump_page' discards `const' qualifier from pointer target type [enabled by default]
Let's convert 'reason' to 'const char *' in dump_page() and friends: we
shouldn't modify it anyway.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The res_counter_{charge,uncharge}_locked() variants are not used in the
kernel outside of the resource counter code itself, so remove the
interface.
Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Tim Hockin <thockin@google.com>
Cc: Christoph Lameter <cl@linux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
PF_MEMPOLICY is an unnecessary optimization for CONFIG_SLAB users.
There's no significant performance degradation to checking
current->mempolicy rather than current->flags & PF_MEMPOLICY in the
allocation path, especially since this is considered unlikely().
Running TCP_RR with netperf-2.4.5 through localhost on 16 cpu machine with
64GB of memory and without a mempolicy:
threads before after
16 1249409 1244487
32 1281786 1246783
48 1239175 1239138
64 1244642 1241841
80 1244346 1248918
96 1266436 1254316
112 1307398 1312135
128 1327607 1326502
Per-process flags are a scarce resource so we should free them up whenever
possible and make them available. We'll be using it shortly for memcg oom
reserves.
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Tim Hockin <thockin@google.com>
Cc: Christoph Lameter <cl@linux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
slab_node() is actually a mempolicy function, so rename it to
mempolicy_slab_node() to make it clearer that it used for processes with
mempolicies.
At the same time, cleanup its code by saving numa_mem_id() in a local
variable (since we require a node with memory, not just any node) and
remove an obsolete comment that assumes the mempolicy is actually passed
into the function.
Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Tim Hockin <thockin@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This patch is a continuation of efforts trying to optimize find_vma(),
avoiding potentially expensive rbtree walks to locate a vma upon faults.
The original approach (https://lkml.org/lkml/2013/11/1/410), where the
largest vma was also cached, ended up being too specific and random,
thus further comparison with other approaches were needed. There are
two things to consider when dealing with this, the cache hit rate and
the latency of find_vma(). Improving the hit-rate does not necessarily
translate in finding the vma any faster, as the overhead of any fancy
caching schemes can be too high to consider.
We currently cache the last used vma for the whole address space, which
provides a nice optimization, reducing the total cycles in find_vma() by
up to 250%, for workloads with good locality. On the other hand, this
simple scheme is pretty much useless for workloads with poor locality.
Analyzing ebizzy runs shows that, no matter how many threads are
running, the mmap_cache hit rate is less than 2%, and in many situations
below 1%.
The proposed approach is to replace this scheme with a small per-thread
cache, maximizing hit rates at a very low maintenance cost.
Invalidations are performed by simply bumping up a 32-bit sequence
number. The only expensive operation is in the rare case of a seq
number overflow, where all caches that share the same address space are
flushed. Upon a miss, the proposed replacement policy is based on the
page number that contains the virtual address in question. Concretely,
the following results are seen on an 80 core, 8 socket x86-64 box:
1) System bootup: Most programs are single threaded, so the per-thread
scheme does improve ~50% hit rate by just adding a few more slots to
the cache.
+----------------+----------+------------------+
| caching scheme | hit-rate | cycles (billion) |
+----------------+----------+------------------+
| baseline | 50.61% | 19.90 |
| patched | 73.45% | 13.58 |
+----------------+----------+------------------+
2) Kernel build: This one is already pretty good with the current
approach as we're dealing with good locality.
+----------------+----------+------------------+
| caching scheme | hit-rate | cycles (billion) |
+----------------+----------+------------------+
| baseline | 75.28% | 11.03 |
| patched | 88.09% | 9.31 |
+----------------+----------+------------------+
3) Oracle 11g Data Mining (4k pages): Similar to the kernel build workload.
+----------------+----------+------------------+
| caching scheme | hit-rate | cycles (billion) |
+----------------+----------+------------------+
| baseline | 70.66% | 17.14 |
| patched | 91.15% | 12.57 |
+----------------+----------+------------------+
4) Ebizzy: There's a fair amount of variation from run to run, but this
approach always shows nearly perfect hit rates, while baseline is just
about non-existent. The amounts of cycles can fluctuate between
anywhere from ~60 to ~116 for the baseline scheme, but this approach
reduces it considerably. For instance, with 80 threads:
+----------------+----------+------------------+
| caching scheme | hit-rate | cycles (billion) |
+----------------+----------+------------------+
| baseline | 1.06% | 91.54 |
| patched | 99.97% | 14.18 |
+----------------+----------+------------------+
[akpm@linux-foundation.org: fix nommu build, per Davidlohr]
[akpm@linux-foundation.org: document vmacache_valid() logic]
[akpm@linux-foundation.org: attempt to untangle header files]
[akpm@linux-foundation.org: add vmacache_find() BUG_ON]
[hughd@google.com: add vmacache_valid_mm() (from Oleg)]
[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: adjust and enhance comments]
Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Michel Lespinasse <walken@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Tested-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
filemap_map_pages() is generic implementation of ->map_pages() for
filesystems who uses page cache.
It should be safe to use filemap_map_pages() for ->map_pages() if
filesystem use filemap_fault() for ->fault().
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Ning Qu <quning@gmail.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Here's new version of faultaround patchset. It took a while to tune it
and collect performance data.
First patch adds new callback ->map_pages to vm_operations_struct.
->map_pages() is called when VM asks to map easy accessible pages.
Filesystem should find and map pages associated with offsets from
"pgoff" till "max_pgoff". ->map_pages() is called with page table
locked and must not block. If it's not possible to reach a page without
blocking, filesystem should skip it. Filesystem should use do_set_pte()
to setup page table entry. Pointer to entry associated with offset
"pgoff" is passed in "pte" field in vm_fault structure. Pointers to
entries for other offsets should be calculated relative to "pte".
Currently VM use ->map_pages only on read page fault path. We try to
map FAULT_AROUND_PAGES a time. FAULT_AROUND_PAGES is 16 for now.
Performance data for different FAULT_AROUND_ORDER is below.
TODO:
- implement ->map_pages() for shmem/tmpfs;
- modify get_user_pages() to be able to use ->map_pages() and implement
mmap(MAP_POPULATE|MAP_NONBLOCK) on top.
=========================================================================
Tested on 4-socket machine (120 threads) with 128GiB of RAM.
Few real-world workloads. The sweet spot for FAULT_AROUND_ORDER here is
somewhere between 3 and 5. Let's say 4 :)
Linux build (make -j60)
FAULT_AROUND_ORDER Baseline 1 3 4 5 7 9
minor-faults 283,301,572 247,151,987 212,215,789 204,772,882 199,568,944 194,703,779 193,381,485
time, seconds 151.227629483 153.920996480 151.356125472 150.863792049 150.879207877 151.150764954 151.450962358
Linux rebuild (make -j60)
FAULT_AROUND_ORDER Baseline 1 3 4 5 7 9
minor-faults 5,396,854 4,148,444 2,855,286 2,577,282 2,361,957 2,169,573 2,112,643
time, seconds 27.404543757 27.559725591 27.030057426 26.855045126 26.678618635 26.974523490 26.761320095
Git test suite (make -j60 test)
FAULT_AROUND_ORDER Baseline 1 3 4 5 7 9
minor-faults 129,591,823 99,200,751 66,106,718 57,606,410 51,510,808 45,776,813 44,085,515
time, seconds 66.087215026 64.784546905 64.401156567 65.282708668 66.034016829 66.793780811 67.237810413
Two synthetic tests: access every word in file in sequential/random order.
It doesn't improve much after FAULT_AROUND_ORDER == 4.
Sequential access 16GiB file
FAULT_AROUND_ORDER Baseline 1 3 4 5 7 9
1 thread
minor-faults 4,195,437 2,098,275 525,068 262,251 131,170 32,856 8,282
time, seconds 7.250461742 6.461711074 5.493859139 5.488488147 5.707213983 5.898510832 5.109232856
8 threads
minor-faults 33,557,540 16,892,728 4,515,848 2,366,999 1,423,382 442,732 142,339
time, seconds 16.649304881 9.312555263 6.612490639 6.394316732 6.669827501 6.75078944 6.371900528
32 threads
minor-faults 134,228,222 67,526,810 17,725,386 9,716,537 4,763,731 1,668,921 537,200
time, seconds 49.164430543 29.712060103 12.938649729 10.175151004 11.840094583 9.594081325 9.928461797
60 threads
minor-faults 251,687,988 126,146,952 32,919,406 18,208,804 10,458,947 2,733,907 928,217
time, seconds 86.260656897 49.626551828 22.335007632 17.608243696 16.523119035 16.339489186 16.326390902
120 threads
minor-faults 503,352,863 252,939,677 67,039,168 35,191,827 19,170,091 4,688,357 1,471,862
time, seconds 124.589206333 79.757867787 39.508707872 32.167281632 29.972989292 28.729834575 28.042251622
Random access 1GiB file
1 thread
minor-faults 262,636 132,743 34,369 17,299 8,527 3,451 1,222
time, seconds 15.351890914 16.613802482 16.569227308 15.179220992 16.557356122 16.578247824 15.365266994
8 threads
minor-faults 2,098,948 1,061,871 273,690 154,501 87,110 25,663 7,384
time, seconds 15.040026343 15.096933500 14.474757288 14.289129964 14.411537468 14.296316837 14.395635804
32 threads
minor-faults 8,390,734 4,231,023 1,054,432 528,847 269,242 97,746 26,881
time, seconds 20.430433109 21.585235358 22.115062928 14.872878951 14.880856305 14.883370649 14.821261690
60 threads
minor-faults 15,733,258 7,892,809 1,973,393 988,266 594,789 164,994 51,691
time, seconds 26.577302548 25.692397770 18.728863715 20.153026398 21.619101933 17.745086260 17.613215273
120 threads
minor-faults 31,471,111 15,816,616 3,959,209 1,978,685 1,008,299 264,635 96,010
time, seconds 41.835322703 40.459786095 36.085306105 35.313894834 35.814445675 36.552633793 34.289210594
Touch only one page in page table in 16GiB file
FAULT_AROUND_ORDER Baseline 1 3 4 5 7 9
1 thread
minor-faults 8,372 8,324 8,270 8,260 8,249 8,239 8,237
time, seconds 0.039892712 0.045369149 0.051846126 0.063681685 0.079095975 0.17652406 0.541213386
8 threads
minor-faults 65,731 65,681 65,628 65,620 65,608 65,599 65,596
time, seconds 0.124159196 0.488600638 0.156854426 0.191901957 0.242631486 0.543569456 1.677303984
32 threads
minor-faults 262,388 262,341 262,285 262,276 262,266 262,257 263,183
time, seconds 0.452421421 0.488600638 0.565020946 0.648229739 0.789850823 1.651584361 5.000361559
60 threads
minor-faults 491,822 491,792 491,723 491,711 491,701 491,691 491,825
time, seconds 0.763288616 0.869620515 0.980727360 1.161732354 1.466915814 3.04041448 9.308612938
120 threads
minor-faults 983,466 983,655 983,366 983,372 983,363 984,083 984,164
time, seconds 1.595846553 1.667902182 2.008959376 2.425380942 2.941368804 5.977807890 18.401846125
This patch (of 2):
Introduce new vm_ops callback ->map_pages() and uses it for mapping easy
accessible pages around fault address.
On read page fault, if filesystem provides ->map_pages(), we try to map up
to FAULT_AROUND_PAGES pages around page fault address in hope to reduce
number of minor page faults.
We call ->map_pages first and use ->fault() as fallback if page by the
offset is not ready to be mapped (cold page cache or something).
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Ning Qu <quning@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Add VM_INIT_DEF_MASK, to allow us to set the default flags for VMs. It
also adds a prctl control which allows us to set the THP disable bit in
mm->def_flags so that VMs will pick up the setting as they are created.
Signed-off-by: Alex Thorlton <athorlton@sgi.com>
Suggested-by: Oleg Nesterov <oleg@redhat.com>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull CPU hotplug notifiers registration fixes from Rafael Wysocki:
"The purpose of this single series of commits from Srivatsa S Bhat
(with a small piece from Gautham R Shenoy) touching multiple
subsystems that use CPU hotplug notifiers is to provide a way to
register them that will not lead to deadlocks with CPU online/offline
operations as described in the changelog of commit 93ae4f978ca7f ("CPU
hotplug: Provide lockless versions of callback registration
functions").
The first three commits in the series introduce the API and document
it and the rest simply goes through the users of CPU hotplug notifiers
and converts them to using the new method"
* tag 'cpu-hotplug-3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (52 commits)
net/iucv/iucv.c: Fix CPU hotplug callback registration
net/core/flow.c: Fix CPU hotplug callback registration
mm, zswap: Fix CPU hotplug callback registration
mm, vmstat: Fix CPU hotplug callback registration
profile: Fix CPU hotplug callback registration
trace, ring-buffer: Fix CPU hotplug callback registration
xen, balloon: Fix CPU hotplug callback registration
hwmon, via-cputemp: Fix CPU hotplug callback registration
hwmon, coretemp: Fix CPU hotplug callback registration
thermal, x86-pkg-temp: Fix CPU hotplug callback registration
octeon, watchdog: Fix CPU hotplug callback registration
oprofile, nmi-timer: Fix CPU hotplug callback registration
intel-idle: Fix CPU hotplug callback registration
clocksource, dummy-timer: Fix CPU hotplug callback registration
drivers/base/topology.c: Fix CPU hotplug callback registration
acpi-cpufreq: Fix CPU hotplug callback registration
zsmalloc: Fix CPU hotplug callback registration
scsi, fcoe: Fix CPU hotplug callback registration
scsi, bnx2fc: Fix CPU hotplug callback registration
scsi, bnx2i: Fix CPU hotplug callback registration
...
|
|
This is the final piece in the puzzle, as all patches to remove the
last users of \(interruptible_\|\)sleep_on\(_timeout\|\) have made it
into the 3.15 merge window. The work was long overdue, and this
interface in particular should not have survived the BKL removal
that was done a couple of years ago.
Citing Jon Corbet from http://lwn.net/2001/0201/kernel.php3":
"[...] it was suggested that the janitors look for and fix all code
that calls sleep_on() [...] since (1) almost all such code is
incorrect, and (2) Linus has agreed that those functions should
be removed in the 2.5 development series".
We haven't quite made it for 2.5, but maybe we can merge this for 3.15.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
Pull Ceph updates from Sage Weil:
"The biggest chunk is a series of patches from Ilya that add support
for new Ceph osd and crush map features, including some new tunables,
primary affinity, and the new encoding that is needed for erasure
coding support. This brings things into parity with the server side
and the looming firefly release. There is also support for allocation
hints in RBD that help limit fragmentation on the server side.
There is also a series of patches from Zheng fixing NFS reexport,
directory fragmentation support, flock vs fnctl behavior, and some
issues with clustered MDS.
Finally, there are some miscellaneous fixes from Yunchuan Wen for
fscache, Fabian Frederick for ACLs, and from me for fsync(dirfd)
behavior"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (79 commits)
ceph: skip invalid dentry during dcache readdir
libceph: dump pool {read,write}_tier to debugfs
libceph: output primary affinity values on osdmap updates
ceph: flush cap release queue when trimming session caps
ceph: don't grabs open file reference for aborted request
ceph: drop extra open file reference in ceph_atomic_open()
ceph: preallocate buffer for readdir reply
libceph: enable PRIMARY_AFFINITY feature bit
libceph: redo ceph_calc_pg_primary() in terms of ceph_calc_pg_acting()
libceph: add support for osd primary affinity
libceph: add support for primary_temp mappings
libceph: return primary from ceph_calc_pg_acting()
libceph: switch ceph_calc_pg_acting() to new helpers
libceph: introduce apply_temps() helper
libceph: introduce pg_to_raw_osds() and raw_to_up_osds() helpers
libceph: ceph_can_shift_osds(pool) and pool type defines
libceph: ceph_osd_{exists,is_up,is_down}(osd) definitions
libceph: enable OSDMAP_ENC feature bit
libceph: primary_affinity decode bits
libceph: primary_affinity infrastructure
...
|
|
Some white space and 80 char overruns corrected.
Signed-off-by: Jon Mason <jon.mason@intel.com>
|
|
Provide a better event interface between the client and transport
Signed-off-by: Jon Mason <jon.mason@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs updates from Jaegeuk Kim:
"This patch-set includes the following major enhancement patches.
- introduce large directory support
- introduce f2fs_issue_flush to merge redundant flush commands
- merge write IOs as much as possible aligned to the segment
- add sysfs entries to tune the f2fs configuration
- use radix_tree for the free_nid_list to reduce in-memory operations
- remove costly bit operations in f2fs_find_entry
- enhance the readahead flow for CP/NAT/SIT/SSA blocks
The other bug fixes are as follows:
- recover xattr node blocks correctly after sudden-power-cut
- fix to calculate the maximum number of node ids
- enhance to handle many error cases
And, there are a bunch of cleanups"
* tag 'for-f2fs-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (62 commits)
f2fs: fix wrong statistics of inline data
f2fs: check the acl's validity before setting
f2fs: introduce f2fs_issue_flush to avoid redundant flush issue
f2fs: fix to cover io->bio with io_rwsem
f2fs: fix error path when fail to read inline data
f2fs: use list_for_each_entry{_safe} for simplyfying code
f2fs: avoid free slab cache under spinlock
f2fs: avoid unneeded lookup when xattr name length is too long
f2fs: avoid unnecessary bio submit when wait page writeback
f2fs: return -EIO when node id is not matched
f2fs: avoid RECLAIM_FS-ON-W warning
f2fs: skip unnecessary node writes during fsync
f2fs: introduce fi->i_sem to protect fi's info
f2fs: change reclaim rate in percentage
f2fs: add missing documentation for dir_level
f2fs: remove unnecessary threshold
f2fs: throttle the memory footprint with a sysfs entry
f2fs: avoid to drop nat entries due to the negative nr_shrink
f2fs: call f2fs_wait_on_page_writeback instead of native function
f2fs: introduce nr_pages_to_write for segment alignment
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux
Pull OMAP fbdev changes from Tomi Valkeinen:
"This is based on the already pulled fbdev-main changes, and this also
merges .dts branch from Tony Lindgren (which has also been pulled), so
that I was able to add the display related .dts changes.
This contains OMAP related fbdev changes for 3.15. The bulk of the
patches are for adding Device Tree support for OMAP Display Subsystem:
- SoCs: OMAP2/3/4
- Boards: OMAP4 Panda, OMAP4 SDP, OMAP3 Beagle, OMAP3 Beagle-xM,
OMAP3 IGEP0020, OMAP3 N900
- Devices: TFP410 Encoder, tpd12s015 HDMI companion chip, Sony
acx565akm panel, MIPI DSI Command mode panel and HDMI, DVI and
Analog TV connectors"
* tag 'fbdev-omap-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux: (45 commits)
OMAPDSS: HDMI: fix interlace output
OMAPDSS: add missing __init for dss_init_ports
ARM: OMAP2+: remove pdata quirks for displays
OMAPDSS: remove DT hacks for regulators
Doc/DT: Add DT binding documentation for tpd12s015 encoder
Doc/DT: Add DT binding documentation for TFP410 encoder
Doc/DT: Add DT binding documentation for Sony acx565akm panel
Doc/DT: Add DT binding documentation for MIPI DSI CM Panel
Doc/DT: Add DT binding documentation for HDMI Connector
Doc/DT: Add DT binding documentation for DVI Connector
Doc/DT: Add DT binding documentation for Analog TV Connector
ARM: omap3-n900.dts: add display information
ARM: omap3-igep0020.dts: add display information
ARM: omap3-beagle-xm.dts: add display information
ARM: omap3-beagle.dts: add display information
ARM: omap4-sdp.dts: add display information
Doc/DT: Add DT binding documentation for OMAP DSS
OMAPDSS: acx565akm: Add DT support
OMAPDSS: connector-analog-tv: Add DT support
OMAPDSS: hdmi-connector: Add DT support
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd
Pull MFD updates from Lee Jones:
"Changes to existing drivers:
- Use of managed resources - omap, twl4030, ti_am335x_tscadc
- Advanced error handling - omap
- Rework clk management - omap
- Device Tree (re-)work - tc3589x, pm8921, da9055, sec
- IRC management overhaul and !BROKEN - pm8921
- Convert to regmap - ssbi, pm8921
- Use simple power-management ops - ucb1x00
- Include file clean-up - adp5520, cs5535, janz, lpc_ich,
- lpc_sch, max14577, mcp-sa11x0, pcf50633-adc, rc5t583,
rdc321x-southbridge, retu, smsc-ece1099, ti-ssp, ti_am335x_tscadc,
tps65912, vexpress-config, wm8350, ywm8350
- Various bug fixes across the subsystem
- NULL/invalid pointer dereference prevention
- Resource leak mitigation,
- Variable used initialised
- Staticise various containers
- Enforce return value checks
New drivers/supported devices:
- Add support for s2mps14 and s2mpa01 to sec
- Add support for da9063 (v5) to da9063
- Add support for atom-c2000 to gpio-ich
- Add support for come-{mbt10,cbt6,chl6} to kempld
- Add support for da9053 to da9052
- Add support for itco-wdt (v3) and baytrail to lpc_ich
- Add new drivers for tps65218, rtsx_usb, bcm590xx
(Re-)moved drivers:
- twl4030 ==> drivers/iio
- ti-ssp ==> /dev/null"
* tag 'mfd-for-linus-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: (103 commits)
mfd: wm5110: Correct default for HEADPHONE_DETECT_1
mfd: arizona: Correct small errors in the DT binding documentation
mfd: arizona: Mark DSP clocking register as volatile
mfd: devicetree: bindings: Add pm8xxx RTC description
mfd: kempld-core: Fix potential hang-up during boot
mfd: sec-core: Fix uninitialized 'regmap_rtc' on S2MPA01
mfd: tps65910: Fix regmap_irq_chip_data leak on mfd_add_devices fail
mfd: tps65910: Fix possible invalid pointer dereference on regmap_add_irq_chip fail
mfd: sec-core: Fix I2C dummy device resource leak on probe failure
mfd: sec-core: Add of_compatible strings for clock MFD cells
mfd: Remove obsolete ti-ssp driver
Documentation: mfd: s2mps11: Describe S5M8767 and S2MPS14 clocks
mfd: bcm590xx: Fix type argument for module device table
mfd: lpc_ich: Add support for Intel Bay Trail SoC
mfd: lpc_ich: Add support for NM10 GPIO
mfd: lpc_ich: Change Avoton to iTCO v3
watchdog: iTCO_wdt: Add support for v3 silicon
mfd: lpc_ich: Add support for iTCO v3
mfd: lpc_ich: Remove lpc_ich_cfg struct use
mfd: lpc_ich: Only configure watchdog or GPIO when present
...
|
|
Pull MTD updates from Brian Norris:
- A few SPI NOR ID definitions
- Kill the NAND "max pagesize" restriction
- Fix some x16 bus-width NAND support
- Add NAND JEDEC parameter page support
- DT bindings for NAND ECC
- GPMI NAND updates (subpage reads)
- More OMAP NAND refactoring
- New STMicro SPI NOR driver (now in 40 patches!)
- A few other random bugfixes
* tag 'for-linus-20140405' of git://git.infradead.org/linux-mtd: (120 commits)
Fix index regression in nand_read_subpage
mtd: diskonchip: mem resource name is not optional
mtd: nand: fix mention to CONFIG_MTD_NAND_ECC_BCH
mtd: nand: fix GET/SET_FEATURES address on 16-bit devices
mtd: omap2: Use devm_ioremap_resource()
mtd: denali_dt: Use devm_ioremap_resource()
mtd: devices: elm: update DRIVER_NAME as "omap-elm"
mtd: devices: elm: configure parallel channels based on ecc_steps
mtd: devices: elm: clean elm_load_syndrome
mtd: devices: elm: check for hardware engine's design constraints
mtd: st_spi_fsm: Succinctly reorganise .remove()
mtd: st_spi_fsm: Allow loop to run at least once before giving up CPU
mtd: st_spi_fsm: Correct vendor name spelling issue - missing "M"
mtd: st_spi_fsm: Avoid duplicating MTD core code
mtd: st_spi_fsm: Remove useless consts from function arguments
mtd: st_spi_fsm: Convert ST SPI FSM (NOR) Flash driver to new DT partitions
mtd: st_spi_fsm: Move runtime configurable msg sequences into device's struct
mtd: st_spi_fsm: Supply the W25Qxxx chip specific configuration call-back
mtd: st_spi_fsm: Supply the S25FLxxx chip specific configuration call-back
mtd: st_spi_fsm: Supply the MX25xxx chip specific configuration call-back
...
|
|
Currently cpufreq frequency table has two fields: frequency and driver_data.
driver_data is only for drivers' internal use and cpufreq core shouldn't use
it at all. But with the introduction of BOOST frequencies, this assumption
was broken and we started using it as a flag instead.
There are two problems due to this:
- It is against the description of this field, as driver's data is used by
the core now.
- if drivers fill it with -3 for any frequency, then those frequencies are
never considered by cpufreq core as it is exactly same as value of
CPUFREQ_BOOST_FREQ, i.e. ~2.
The best way to get this fixed is by creating another field flags which
will be used for such flags. This patch does that. Along with that various
drivers need modifications due to the change of struct cpufreq_frequency_table.
Reviewed-by: Gautham R Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Split up __sbc_dif_verify_read() so that VERIFY READ emulation can
perform target-core specific READ_STRIP, seperate from the existing
FILEIO/RAMDISK backend emulation code.
Also add sbc_dif_read_strip() in order to determine number of sectors
using cmd->prot_length, and skip the extra sbc_dif_copy_prot().
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Cc: Quinn Tran <quinn.tran@qlogic.com>
Cc: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
|
|
This patch adds WRITE_INSERT emulation within target-core
using TYPE1 / TYPE3 PI modes in sbc_dif_generate() code.
This is useful in order for existing legacy fabrics that do not
support protection offloads to interact with backend devices that
currently have T10 PI enabled.
v2 changes:
- Rename to sbc_dif_generate() (Sagi)
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Cc: Quinn Tran <quinn.tran@qlogic.com>
Cc: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
|
|
In order to support local WRITE_INSERT + READ_STRIP operations for
non PI enabled fabrics, the fabric driver needs to be able signal
what protection offload operations are supported.
This is done at session initialization time so the modes can be
signaled by individual se_wwn + se_portal_group endpoints, as well
as optionally across different transports on the same endpoint.
For iser-target, set TARGET_PROT_ALL if the underlying ib_device
has already signaled PI offload support, and allow this to be
exposed via a new iscsit_transport->iscsit_get_sup_prot_ops()
callback.
For loopback, set TARGET_PROT_ALL to signal SCSI initiator mode
operation.
For all other drivers, set TARGET_PROT_NORMAL to disable fabric
level PI.
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Cc: Quinn Tran <quinn.tran@qlogic.com>
Cc: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
|
|
Now that TASK_ABORTED status is not generated for all cases by
TMR ABORT_TASK + LUN_RESET, a new TFO->abort_task() caller is
necessary in order to give fabric drivers a chance to unmap
hardware / software resources before the se_cmd descriptor is
released via the normal TFO->release_cmd() codepath.
This patch adds TFO->aborted_task() in core_tmr_abort_task()
in place of the original transport_send_task_abort(), and
also updates all fabric drivers to implement this caller.
The fabric drivers that include changes to perform cleanup
via ->aborted_task() are:
- iscsi-target
- iser-target
- srpt
- tcm_qla2xxx
The fabric drivers that currently set ->aborted_task() to
NOPs are:
- loopback
- tcm_fc
- usb-gadget
- sbp-target
- vhost-scsi
For the latter five, there appears to be no additional cleanup
required before invoking TFO->release_cmd() to release the
se_cmd descriptor.
v2 changes:
- Move ->aborted_task() call into transport_cmd_finish_abort (Alex)
Cc: Alex Leung <amleung21@yahoo.com>
Cc: Mark Rustad <mark.d.rustad@intel.com>
Cc: Roland Dreier <roland@kernel.org>
Cc: Vu Pham <vu@mellanox.com>
Cc: Chris Boot <bootc@bootc.net>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Giridhar Malavali <giridhar.malavali@qlogic.com>
Cc: Saurav Kashyap <saurav.kashyap@qlogic.com>
Cc: Quinn Tran <quinn.tran@qlogic.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
|
|
This patch addresses three of long standing issues wrt to Task
Aborted Status (TAS) handling.
The first is the incorrect assumption in core_tmr_handle_tas_abort()
that TASK_ABORTED status is sent for the task referenced by TMR
ABORT_TASK, and sending TASK_ABORTED status for TMR LUN_RESET on
the same nexus the LUN_RESET was received.
The second is to ensure the lun reference count is dropped within
transport_cmd_finish_abort() by calling transport_lun_remove_cmd()
before invoking transport_cmd_check_stop_to_fabric().
The last is to fix the delayed TAS handling to allow outstanding
WRITEs to complete before sending the TASK_ABORTED status. This
includes changing transport_check_aborted_status() to avoid
processing when SCF_SEND_DELAYED_TAS has not be set, and updating
transport_send_task_abort() to drop the SCF_SENT_DELAYED_TAS
check.
Signed-off-by: Alex Leung <amleung21@yahoo.com>
Cc: Alex Leung <amleung21@yahoo.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
|
|
This is not going to be supported soon - so drop it.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
|
|
Some transports (iSCSI/iSER/SRP/FC) support hardware INSERT/STRIP
capabilities while other transports like loopback/vhost-scsi need
perform this is software.
This patch allows fabrics using SCF_PASSTHROUGH_SG_TO_MEM_NOALLOC
to signal the early LUN scan handling case where PROTECT CDB bits
are set, but no fabric buffer has been provided.
For transports which use generic new command these buffers have yet
to be allocated.
Also this way, target may support protection information
against legacy initiators (writes are inserted and reads
are stripped).
(Only set prot_pto for loopback during early special case - nab)
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
|
|
Pull NFS client updates from Trond Myklebust:
"Highlights include:
- Stable fix for a use after free issue in the NFSv4.1 open code
- Fix the SUNRPC bi-directional RPC code to account for TCP segmentation
- Optimise usage of readdirplus when confronted with 'ls -l' situations
- Soft mount bugfixes
- NFS over RDMA bugfixes
- NFSv4 close locking fixes
- Various NFSv4.x client state management optimisations
- Rename/unlink code cleanups"
* tag 'nfs-for-3.15-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (28 commits)
nfs: pass string length to pr_notice message about readdir loops
NFSv4: Fix a use-after-free problem in open()
SUNRPC: rpc_restart_call/rpc_restart_call_prepare should clear task->tk_status
SUNRPC: Don't let rpc_delay() clobber non-timeout errors
SUNRPC: Ensure call_connect_status() deals correctly with SOFTCONN tasks
SUNRPC: Ensure call_status() deals correctly with SOFTCONN tasks
NFSv4: Ensure we respect soft mount timeouts during trunking discovery
NFSv4: Schedule recovery if nfs40_walk_client_list() is interrupted
NFS: advertise only supported callback netids
SUNRPC: remove KERN_INFO from dprintk() call sites
SUNRPC: Fix large reads on NFS/RDMA
NFS: Clean up: revert increase in READDIR RPC buffer max size
SUNRPC: Ensure that call_bind times out correctly
SUNRPC: Ensure that call_connect times out correctly
nfs: emit a fsnotify_nameremove call in sillyrename codepath
nfs: remove synchronous rename code
nfs: convert nfs_rename to use async_rename infrastructure
nfs: make nfs_async_rename non-static
nfs: abstract out code needed to complete a sillyrename
NFSv4: Clear the open state flags if the new stateid does not match
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux
Pull module updates from Rusty Russell:
"Nothing major: the stricter permissions checking for sysfs broke a
staging driver; fix included. Greg KH said he'd take the patch but
hadn't as the merge window opened, so it's included here to avoid
breaking build"
* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
staging: fix up speakup kobject mode
Use 'E' instead of 'X' for unsigned module taint flag.
VERIFY_OCTAL_PERMISSIONS: stricter checking for sysfs perms.
kallsyms: fix percpu vars on x86-64 with relocation.
kallsyms: generalize address range checking
module: LLVMLinux: Remove unused function warning from __param_check macro
Fix: module signature vs tracepoints: add new TAINT_UNSIGNED_MODULE
module: remove MODULE_GENERIC_TABLE
module: allow multiple calls to MODULE_DEVICE_TABLE() per module
module: use pr_cont
|
|
The generic scancode filtering has questionable value and makes it
impossible to determine from userspace if there is an actual
scancode hw filter present or not.
So revert the generic parts.
Based on a patch from James Hogan <james.hogan@imgtec.com>, but this
version also makes sure that only the valid sysfs files are created
in the first place.
Signed-off-by: David Härdeman <david@hardeman.nu>
Acked-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
|
|
Overloading dev->s_filter to do two different functions (set wakeup filters
and generic hardware filters) makes it impossible to tell what the
hardware actually supports, so create a separate dev->s_wakeup_filter and
make the distinction explicit.
Signed-off-by: David Härdeman <david@hardeman.nu>
Acked-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
|
|
As reported by Linus, make headers_check is reporting:
usr/include/linux/v4l2-common.h:72: found __[us]{8,16,32,64} type without #include <linux/types.h>
which seems to have come in through commits 777f4f85b75f1 and
254a47770163f.
That happens because struct v4l2_edid should be visible by both
subdev and V4L2 APIs. So, it was moved to v4l2-common.h.
As Linus pointed, the proper fix is to just add an include for
linux/types.h at v4l2-common.h.
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper changes from Mike Snitzer:
- Fix dm-cache corruption caused by discard_block_size > cache_block_size
- Fix a lock-inversion detected by LOCKDEP in dm-cache
- Fix a dangling bio bug in the dm-thinp target's process_deferred_bios
error path
- Fix corruption due to non-atomic transaction commit which allowed a
metadata superblock to be written before all other metadata was
successfully written -- this is common to all targets that use the
persistent-data library's transaction manager (dm-thinp, dm-cache and
dm-era).
- Various small cleanups in the DM core
- Add the dm-era target which is useful for keeping track of which
blocks were written within a user defined period of time called an
'era'. Use cases include tracking changed blocks for backup
software, and partially invalidating the contents of a cache to
restore cache coherency after rolling back a vendor snapshot.
- Improve the on-disk layout of multithreaded writes to the
dm-thin-pool by splitting the pool's deferred bio list to be a
per-thin device list and then sorting that list using an rb_tree.
The subsequent read throughput of the data written via multiple
threads improved by ~70%.
- Simplify the multipath target's handling of queuing IO by pushing
requests back to the request queue rather than queueing the IO
internally.
* tag 'dm-3.15-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (24 commits)
dm cache: fix a lock-inversion
dm thin: sort the per thin deferred bios using an rb_tree
dm thin: use per thin device deferred bio lists
dm thin: simplify pool_is_congested
dm thin: fix dangling bio in process_deferred_bios error path
dm mpath: print more useful warnings in multipath_message()
dm-mpath: do not activate failed paths
dm mpath: remove extra nesting in map function
dm mpath: remove map_io()
dm mpath: reduce memory pressure when requeuing
dm mpath: remove process_queued_ios()
dm mpath: push back requests instead of queueing
dm table: add dm_table_run_md_queue_async
dm mpath: do not call pg_init when it is already running
dm: use RCU_INIT_POINTER instead of rcu_assign_pointer in __unbind
dm: stop using bi_private
dm: remove dm_get_mapinfo
dm: make dm_table_alloc_md_mempools static
dm: take care to copy the space map roots before locking the superblock
dm transaction manager: fix corruption due to non-atomic transaction commit
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU upates from Joerg Roedel:
"This time a few more updates queued up.
- Rework VT-d code to support ACPI devices
- Improvements for memory and PCI hotplug support in the VT-d driver
- Device-tree support for OMAP IOMMU
- Convert OMAP IOMMU to use devm_* interfaces
- Fixed PASID support for AMD IOMMU
- Other random cleanups and fixes for OMAP, ARM-SMMU and SHMOBILE
IOMMU
Most of the changes are in the VT-d driver because some rework was
necessary for better hotplug and ACPI device support"
* tag 'iommu-updates-v3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (75 commits)
iommu/vt-d: Fix error handling in ANDD processing
iommu/vt-d: returning free pointer in get_domain_for_dev()
iommu/vt-d: Only call dmar_acpi_dev_scope_init() if DRHD units present
iommu/vt-d: Check for NULL pointer in dmar_acpi_dev_scope_init()
iommu/amd: Fix logic to determine and checking max PASID
iommu/vt-d: Include ACPI devices in iommu=pt
iommu/vt-d: Finally enable translation for non-PCI devices
iommu/vt-d: Remove to_pci_dev() in intel_map_page()
iommu/vt-d: Remove pdev from intel_iommu_attach_device()
iommu/vt-d: Remove pdev from iommu_no_mapping()
iommu/vt-d: Make domain_add_dev_info() take struct device
iommu/vt-d: Make domain_remove_one_dev_info() take struct device
iommu/vt-d: Rename 'hwdev' variables to 'dev' now that that's the norm
iommu/vt-d: Remove some pointless to_pci_dev() calls
iommu/vt-d: Make get_valid_domain_for_dev() take struct device
iommu/vt-d: Make iommu_should_identity_map() take struct device
iommu/vt-d: Handle RMRRs for non-PCI devices
iommu/vt-d: Make get_domain_for_dev() take struct device
iommu/vt-d: Make domain_context_mapp{ed,ing}() take struct device
iommu/vt-d: Make device_to_iommu() cope with non-PCI devices
...
|
|
git://git.linaro.org/people/mike.turquette/linux
Pull clock framework changes from Mike Turquette:
"The clock framework changes for 3.15 look similar to past pull
requests. Mostly clock driver updates, more Device Tree support in
the form of common functions useful across platforms and a handful of
features and fixes to the framework core"
* tag 'clk-for-linus-3.15' of git://git.linaro.org/people/mike.turquette/linux: (86 commits)
clk: shmobile: fix setting paretn clock rate
clk: shmobile: rcar-gen2: fix lb/sd0/sd1/sdh clock parent to pll1
clk: Fix minor errors in of_clk_init() function comments
clk: reverse default clk provider initialization order in of_clk_init()
clk: sirf: update copyright years to 2014
clk: mmp: try to use closer one when do round rate
clk: mmp: fix the wrong calculation formula
clk: mmp: fix wrong mask when calculate denominator
clk: st: Adds quadfs clock binding
clk: st: Adds clockgen-vcc and clockgen-mux clock binding
clk: st: Adds clockgen clock binding
clk: st: Adds divmux and prediv clock binding
clk: st: Support for A9 MUX clocks
clk: st: Support for ClockGenA9/DDR/GPU
clk: st: Support for QUADFS inside ClockGenB/C/D/E/F
clk: st: Support for VCC-mux and MUX clocks
clk: st: Support for PLLs inside ClockGenA(s)
clk: st: Support for DIVMUX and PreDiv Clocks
clk: support hardware-specific debugfs entries
clk: s2mps11: Use of_get_child_by_name
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm
Pull pwm changes from Thierry Reding:
"The legacy HAVE_PWM Kconfig symbol is finally being retired. Thanks a
lot to Sascha Hauer for doing that.
Three new drivers are added: Freescale FTM, Cirrus Logic CLPS711X and
Intel Low Power Subsystem.
An assortment of fixes and cleanups rounds things off for this release
cycle"
* tag 'pwm/for-3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm:
pwm: pxa: Constify OF match table
pwm: pxa: Fix typo "pwm" -> "PWM"
Revert "pwm: pxa: Use of_match_ptr()"
pwm: add support for Intel Low Power Subsystem PWM
pwm: Add CLPS711X PWM support
pwm: atmel: correct CDTY calculation
pwm: atmel: Fix polarity handling
Documentation: Add device tree bindings for Freescale FTM PWM.
pwm: Add Freescale FTM PWM driver support
pwm: pxa: Use of_match_ptr()
pwm: samsung: Use SIMPLE_DEV_PM_OPS macro
pwm: renesas-tpu: Add dependency on HAS_IOMEM
pwm: Remove obsolete HAVE_PWM Kconfig symbol
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC late cleanups from Arnd Bergmann:
"These could not be part of the first cleanup branch, because they
either came too late in the cycle, or they have dependencies on other
branches. Important changes are:
- The integrator platform is almost multiplatform capable after some
reorganization (Linus Walleij)
- Minor cleanups on Zynq (Michal Simek)
- Lots of changes for Exynos and other Samsung platforms, including
further preparations for multiplatform support and the clocks
bindings are rearranged"
* tag 'tags/cleanup2-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (54 commits)
devicetree: fix newly added exynos sata bindings
ARM: EXYNOS: Fix compilation error in cpuidle.c
ARM: S5P64X0: Explicitly include linux/serial_s3c.h in mach/pm-core.h
ARM: EXYNOS: Remove hardware.h file
ARM: SAMSUNG: Remove hardware.h inclusion
ARM: S3C24XX: Remove invalid code from hardware.h
dt-bindings: clock: Move exynos-audss-clk.h to dt-bindings/clock
ARM: dts: Keep some essential LDOs enabled for arndale-octa board
ARM: dts: Disable MDMA1 node for arndale-octa board
ARM: S3C64XX: Fix build for implicit serial_s3c.h inclusion
serial: s3c: Fix build of header without serial_core.h preinclusion
ARM: EXYNOS: Allow wake-up using GIC interrupts
ARM: EXYNOS: Stop using legacy Samsung PM code
ARM: EXYNOS: Remove PM initcalls and useless indirection
ARM: EXYNOS: Fix abuse of CONFIG_PM
ARM: SAMSUNG: Move s3c_pm_check_* prototypes to plat/pm-common.h
ARM: SAMSUNG: Move common save/restore helpers to separate file
ARM: SAMSUNG: Move Samsung PM debug code into separate file
ARM: SAMSUNG: Consolidate PM debug functions
ARM: SAMSUNG: Use debug_ll_addr() to get UART base address
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC driver changes from Arnd Bergmann:
"These changes are mostly for ARM specific device drivers that either
don't have an upstream maintainer, or that had the maintainer ask us
to pick up the changes to avoid conflicts.
A large chunk of this are clock drivers (bcm281xx, exynos, versatile,
shmobile), aside from that, reset controllers for STi as well as a
large rework of the Marvell Orion/EBU watchdog driver are notable"
* tag 'drivers-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (99 commits)
Revert "dts: socfpga: Add DTS entry for adding the stmmac glue layer for stmmac."
Revert "net: stmmac: Add SOCFPGA glue driver"
ARM: shmobile: r8a7791: Fix SCIFA3-5 clocks
ARM: STi: Add reset controller support to mach-sti Kconfig
drivers: reset: stih416: add softreset controller
drivers: reset: stih415: add softreset controller
drivers: reset: Reset controller driver for STiH416
drivers: reset: Reset controller driver for STiH415
drivers: reset: STi SoC system configuration reset controller support
dts: socfpga: Add sysmgr node so the gmac can use to reference
dts: socfpga: Add support for SD/MMC on the SOCFPGA platform
reset: Add optional resets and stubs
ARM: shmobile: r7s72100: fix bus clock calculation
Power: Reset: Generalize qnap-poweroff to work on Synology devices.
dts: socfpga: Update clock entry to support multiple parents
ARM: socfpga: Update socfpga_defconfig
dts: socfpga: Add DTS entry for adding the stmmac glue layer for stmmac.
net: stmmac: Add SOCFPGA glue driver
watchdog: orion_wdt: Use %pa to print 'phys_addr_t'
drivers: cci: Export CCI PMU revision
...
|
|
Pull ARM SoC device tree changes from Arnd Bergmann:
"A large part of the arm-soc patches are nowadays DT changes, adding
support for new SoCs, boards and devices without changing kernel
source. The plan is still to move the devicetree files out of the
kernel tree and reduce the amount of churn going on here, but we keep
finding reasons to delay doing that.
Changes are really all over the place, with little sticking out
particularly. We have contributions from a total of 116 people in
this branch.
Unfortunately, the size of this branch also causes a significant
number of conflicts at the moment, typically when subsystem
maintainers merge patches that change the driver at the same time as
the dts files. In most cases this could be avoided because the dts
changes are supposed to be compatible in both ways, and we are asking
everyone to send ARM dts changes through our tree only"
* tag 'dt-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (541 commits)
dts: stmmac: Document the clocks property in the stmmac base document
dts: socfpga: Add DTS entry for adding the stmmac glue layer for stmmac.
ARM: STi: stih41x: Add support for the FSM Serial Flash Controller
ARM: STi: stih416: Add support for the FSM Serial Flash Controller
ARM: tegra: fix Dalmore pinctrl configuration
ARM: dts: keystone: use common "ti,keystone" compatible instead of -evm
ARM: dts: k2hk-evm: set ubifs partition size for 512M NAND
ARM: dts: Build all keystone dt blobs
ARM: dts: keystone: Fix control register range for clktsip
ARM: dts: keystone: Fix domain register range for clkfftc1
ARM: dts: bcm28155-ap: leave camldo1 on to fix reboot
ARM: dts: add bcm590xx pmu support and enable for bcm28155-ap
ARM: dts: bcm21664: Add device tree files.
ARM: DT: bcm21664: Device tree bindings
ARM: efm32: properly namespace i2c location property
ARM: efm32: fix unit address part in USART2 device nodes' names
ARM: mvebu: Enable NAND controller in Armada 385-DB
ARM: mvebu: Add support for NAND controller in Armada 38x SoC
ARM: mvebu: Add the Core Divider clock to Armada 38x SoCs
ARM: mvebu: Add a 2 GHz fixed-clock on Armada 38x SoCs
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC specific changes from Arnd Bergmann:
"Lots of changes specific to one of the SoC families. Some that stick
out are:
- mach-qcom gains new features, most importantly SMP support for the
newer chips (Stephen Boyd, Rohit Vaswani)
- mvebu gains support for three new SoCs: Armada 375, 380 and 385
(Thomas Petazzoni and Free-electrons team)
- SMP support for Rockchips (Heiko Stübner)
- Lots of i.MX changes (Shawn Guo)
- Added support for BCM5301x SoC (Hauke Mehrtens)
- Multiplatform support for Marvell Kirkwood and Dove (Andrew Lunn
and Sebastian Hesselbarth doing the final part of a long journey)
- Unify davinci platforms and remove obsolete ones (Sekhar Nori, Arnd
Bergmann)"
* tag 'soc-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (126 commits)
ARM: sunxi: Select HAVE_ARM_ARCH_TIMER
ARM: cache-tauros2: remove ARMv6 code
ARM: mvebu: don't select CONFIG_NEON
ARM: davinci: fix DT booting with default defconfig
ARM: configs: bcm_defconfig: enable bcm590xx regulator support
ARM: davinci: remove tnetv107x support
MAINTAINERS: Update ARM STi maintainers
ARM: restrict BCM_KONA_UART to ARCH_BCM_MOBILE
ARM: bcm21664: Add board support.
ARM: sunxi: Add the new watchog compatibles to the reboot code
ARM: enable ARM_HAS_SG_CHAIN for multiplatform
ARM: davinci: remove da8xx_omapl_defconfig
ARM: davinci: da8xx: fix multiple watchdog device registration
ARM: davinci: add da8xx specific configs to davinci_all_defconfig
ARM: davinci: enable da8xx build concurrently with older devices
ARM: BCM5301X: workaround suppress fault
ARM: BCM5301X: add early debugging support
ARM: BCM5301X: initial support for the BCM5301X/BCM470X SoCs with ARM CPU
ARM: mach-bcm: Remove GENERIC_TIME
ARM: shmobile: APMU: Fix warnings due to improper printk formats
...
|