summaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)Author
2014-12-17libceph: specify position of extent operationYan, Zheng
allow specifying position of extent operation in multi-operations osd request. This is required for cephfs to convert inline data to normal data (compare xattr, then write object). Signed-off-by: Yan, Zheng <zyan@redhat.com> Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
2014-12-17libceph: add CREATE osd operation supportYan, Zheng
Add CEPH_OSD_OP_CREATE support. Also change libceph to not treat CEPH_OSD_OP_DELETE as an extent op and add an assert to that end. Signed-off-by: Yan, Zheng <zyan@redhat.com> Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
2014-12-17libceph: add SETXATTR/CMPXATTR osd operations supportYan, Zheng
Signed-off-by: Yan, Zheng <zyan@redhat.com> Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
2014-12-17rbd: don't treat CEPH_OSD_OP_DELETE as extent opIlya Dryomov
CEPH_OSD_OP_DELETE is not an extent op, stop treating it as such. This sneaked in with discard patches - it's one of the three osd ops (the other two are CEPH_OSD_OP_TRUNCATE and CEPH_OSD_OP_ZERO) that discard is implemented with. Signed-off-by: Ilya Dryomov <idryomov@redhat.com> Reviewed-by: Alex Elder <elder@linaro.org>
2014-12-17ceph: remove unused stringification macrosIlya Dryomov
These were used to report git versions a long time ago. Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
2014-12-17libceph: require cephx message signature by defaultYan, Zheng
Signed-off-by: Yan, Zheng <zyan@redhat.com> Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
2014-12-17ceph: introduce global empty snap contextYan, Zheng
Current snaphost code does not properly handle moving inode from one empty snap realm to another empty snap realm. After changing inode's snap realm, some dirty pages' snap context can be not equal to inode's i_head_snap. This can trigger BUG() in ceph_put_wrbuffer_cap_refs() The fix is introduce a global empty snap context for all empty snap realm. This avoids triggering the BUG() for filesystem with no snapshot. Fixes: http://tracker.ceph.com/issues/9928 Signed-off-by: Yan, Zheng <zyan@redhat.com> Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
2014-12-17ceph: message versioning fixesJohn Spray
There were two places we were assigning version in host byte order instead of network byte order. Also in MSG_CLIENT_SESSION we weren't setting compat_version in the header to reflect continued compatability with older MDSs. Fixes: http://tracker.ceph.com/issues/9945 Signed-off-by: John Spray <john.spray@redhat.com> Reviewed-by: Sage Weil <sage@redhat.com>
2014-12-17libceph: update ceph_msg_header structureJohn Spray
2 bytes of what was reserved space is now used by userspace for the compat_version field. Signed-off-by: John Spray <john.spray@redhat.com> Reviewed-by: Sage Weil <sage@redhat.com>
2014-12-17libceph: message signature supportYan, Zheng
Signed-off-by: Yan, Zheng <zyan@redhat.com>
2014-12-17libceph: store session key in cephx authorizerYan, Zheng
Session key is required when calculating message signature. Save the session key in authorizer, this avoid lookup ticket handler for each message Signed-off-by: Yan, Zheng <zyan@redhat.com>
2014-12-17ceph, rbd: delete unnecessary checks before two function callsSF Markus Elfring
The functions ceph_put_snap_context() and iput() test whether their argument is NULL and then return immediately. Thus the test around the call is not needed. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> [idryomov@redhat.com: squashed rbd.c hunk, changelog] Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
2014-12-17ceph: introduce a new inode flag indicating if cached dentries are orderedYan, Zheng
After creating/deleting/renaming file, offsets of sibling dentries may change. So we can not use cached dentries to satisfy readdir. But we can still use the cached dentries to conclude -ENOENT for lookup. This patch introduces a new inode flag indicating if child dentries are ordered. The flag is set at the same time marking a directory complete. After creating/deleting/renaming file, we clear the flag on directory inode. This prevents ceph_readdir() from using cached dentries to satisfy readdir syscall. Signed-off-by: Yan, Zheng <zyan@redhat.com>
2014-12-17libceph: nuke ceph_kvfree()Ilya Dryomov
Use kvfree() from linux/mm.h instead, which is identical. Also fix the ceph_buffer comment: we will allocate with kmalloc() up to 32k - the value of PAGE_ALLOC_COSTLY_ORDER, but that really is just an implementation detail so don't mention it at all. Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
2014-12-17ceph: fix file lock interruptionYan, Zheng
When a lock operation is interrupted, current code sends a unlock request to MDS to undo the lock operation. This method does not work as expected because the unlock request can drop locks that have already been acquired. The fix is use the newly introduced CEPH_LOCK_FCNTL_INTR/CEPH_LOCK_FLOCK_INTR requests to interrupt blocked file lock request. These requests do not drop locks that have alread been acquired, they only interrupt blocked file lock request. Signed-off-by: Yan, Zheng <zyan@redhat.com>
2014-12-17ALSA: hda - Add quirk for Packard Bell EasyNote MX65Takashi Iwai
Packard Bell EasyNote MX65 with AD1986A codec needs a few fixups, namely, the pin config overrides to set only the known I/O pins and the EAPD has to be turned on. In addition, add stereo mix input forcibly for avoiding the weird KDE behavior by this update. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=88251 Signed-off-by: Takashi Iwai <tiwai@suse.de>
2014-12-17ALSA: usb-audio: add native DSD support for Matrix Audio DACsJurgen Kramer
This patch adds native DSD support for two XMOS based DACs from Matrix Audio: - X-Sabre - Mini-i Pro Signed-off-by: Jurgen Kramer <gtmkramer@xs4all.nl> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2014-12-17clk: samsung: Fix Exynos 5420 pinctrl setup and clock disable failure due to ↵Krzysztof Kozlowski
domain being gated Audio subsystem clocks are located in separate block. On Exynos 5420 if clock for this block (from main clock domain) 'mau_epll' is gated then any read or write to audss registers will block. This kind of boot hang was observed on Arndale Octa and Peach Pi/Pit after introducing runtime PM to pl330 DMA driver. After that commit the 'mau_epll' was gated, because the "amba" clock was disabled and there were no more users of mau_epll. The system hang on one of steps: 1. Disabling unused clocks from audss block. 2. During audss GPIO setup (just before probing i2s0 because samsung_pinmux_setup() tried to access memory from audss block which was gated. Add a workaround for this by enabling the 'mau_epll' clock in probe. Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Acked-by: Sylwester Nawrocki <s.nawrocki@samsung.com> Tested-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk> Tested-by: Kevin Hilman <khilman@linaro.org> Signed-off-by: Michael Turquette <mturquette@linaro.org>
2014-12-17perf symbols: Fix use after free in filename__read_build_idMitchell Krome
In filename__read_build_id, phdr points to memory in buf, which gets realloced before a call to fseek that uses phdr->p_offset. This change stores the value of p_offset before buf is realloced, so the fseek can use the value safely. Signed-off-by: Mitchell Krome <mitchellkrome@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20141216021612.GA7199@mitchell Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-12-17perf evlist: Use roundup_pow_of_twoArnaldo Carvalho de Melo
And remove the equivalent next_pow2{_l} functions. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-hl9ct3wcbs5deai3v5ljmuws@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-12-17tools: Adopt roundup_pow_of_twoArnaldo Carvalho de Melo
To replace equivalent code used in the mmap_pages command line parameter handling in tools/perf. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-i44zs02xt4zexfxywpklo7km@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-12-17perf tools: Make the mmap length autotuning more robustArnaldo Carvalho de Melo
If /proc/sys/kernel/perf_event_mlock_kb is not (power of 2 + PAGE_SIZE_in_kb) and we let the perf tools do mmap length autosizing based on that, then, for non-CAP_IPC_LOCK users when /proc/sys/kernel/perf_event_paranoid is > -1, then we get an -EINVAL that ends up in: [acme@ssdandy linux]$ trace usleep 1 Invalid argument [acme@ssdandy linux]$ perf record usleep 1 failed to mmap with 22 (Invalid argument) After this fix: [acme@ssdandy linux]$ trace usleep 1 <SNIP> 0.806 ( 0.006 ms): munmap(addr: 0x7f7e4740a000, len: 66467) = 0 0.869 ( 0.002 ms): brk( ) = 0x7bb000 0.873 ( 0.003 ms): brk(brk: 0x7dc000 ) = 0x7dc000 0.877 ( 0.001 ms): brk( ) = 0x7dc000 0.953 ( 0.058 ms): nanosleep(rqtp: 0x7fff26ab9420 ) = 0 0.959 ( 0.000 ms): exit_group( [acme@ssdandy linux]$ perf record usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.017 MB perf.data (~759 samples) ] [acme@ssdandy linux]$ Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-6p6l5ou6jev6o7ymc4nn1n2a@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-12-17tools: Adopt rounddown_pow_of_two and depsArnaldo Carvalho de Melo
Will be used to make sure we pass a power of two when automatically setting up the perf_mmap addr range length, as the kernel code validating input on /proc/sys/kernel/perf_event_mlock_kb accepts any integer, if we plain use it to set up the mmap lenght, we may get an EINVAL when passing a non power of two. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-zflvep0q01dmkruf4o291l4p@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-12-17tools: Adopt fls_long and depsArnaldo Carvalho de Melo
Will be used when adopting rounddown_pow_of_two. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-9m0tt5300q1ygv51hejjas82@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-12-17tools: Move bitops.h from tools/perf/util to tools/Arnaldo Carvalho de Melo
So that we better mirror the kernel sources and make it available for other tools. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-mvfu6x753tksnto3t6412m93@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-12-17tools: Introduce asm-generic/bitops.hArnaldo Carvalho de Melo
In preparation for moving linux/bitops.h from tools/perf/util/ to tools/include/. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-2wuk8vahl7voz0ie55f07c9k@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-12-17tools lib: Move asm-generic/bitops/find.h code to tools/include and tools/libArnaldo Carvalho de Melo
To match the Linux kernel source code structure from where this code came from. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-fkbma1h04ki0zzdmp0dpgfyy@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-12-17tools: Whitespace prep patches for moving bitops.hArnaldo Carvalho de Melo
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-6xmwcvgm2rvoayv2mf9n5sf8@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-12-17tools: Move code originally from asm-generic/atomic.h into ↵Arnaldo Carvalho de Melo
tools/include/asm-generic/ To match the Linux kernel source code structure from where this code came from. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-1ldjhvioch1uczilno5e1epl@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-12-17tools: Move code originally from linux/log2.h to tools/include/linux/Arnaldo Carvalho de Melo
From tools/perf/util/include/linux, so that it becomes accessible to other tools/. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-uqohgzilp3ebd3cbybnf3luc@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-12-17tools: Move __ffs implementation to tools/include/asm-generic/bitops/__ffs.hArnaldo Carvalho de Melo
To match the Linux kernel source code structure from where this code came from. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-gubysnp4a8hd98lxoeruak13@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-12-17vfs: make mounts and mountstats honor root dir like mountinfo doesDmitry V. Levin
As we already show mountpoints relative to the root directory, thanks to the change made back in 2000, change show_vfsmnt() and show_vfsstat() to skip out-of-root mountpoints the same way as show_mountinfo() does. Signed-off-by: Dmitry V. Levin <ldv@altlinux.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-12-17vfs: cleanup show_mountinfoDmitry V. Levin
Starting with commit v3.2-rc4-1-g02125a8, seq_path_root() no longer changes the value of its "struct path *root" argument. Starting with commit v3.2-rc7-104-g8c9379e, the "struct path *root" argument of seq_path_root() is const. As result, the temporary variable "root" in show_mountinfo() that holds a copy of struct path root is no longer needed. Signed-off-by: Dmitry V. Levin <ldv@altlinux.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-12-17init: fix read-write root mountMiklos Szeredi
If mount flags don't have MS_RDONLY, iso9660 returns EACCES without actually checking if it's an iso image. This tricks mount_block_root() into retrying with MS_RDONLY. This results in a read-only root despite the "rw" boot parameter if the actual filesystem was checked after iso9660. I believe the behavior of iso9660 is okay, while that of mount_block_root() is not. It should rather try all types without MS_RDONLY and only then retry with MS_RDONLY. This change also makes the code more robust against the case when EACCES is returned despite MS_RDONLY, which would've resulted in a lockup. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-12-17unfuck binfmt_misc.c (broken by commit e6084d4)Al Viro
scanarg(s, del) never returns s; the empty field results in s + 1. Restore the correct checks, and move NUL-termination into scanarg(), while we are at it. Incidentally, mixing "coding style cleanups" (for small values of cleanup) with functional changes is a Bad Idea(tm)... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-12-17vm_area_operations: kill ->migrate()Al Viro
the only instance this method has ever grown was one in kernfs - one that call ->migrate() of another vm_ops if it exists. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-12-17mac80211: free management frame keys when removing stationJohannes Berg
When writing the code to allow per-station GTKs, I neglected to take into account the management frame keys (index 4 and 5) when freeing the station and only added code to free the first four data frame keys. Fix this by iterating the array of keys over the right length. Cc: stable@vger.kernel.org Fixes: e31b82136d1a ("cfg80211/mac80211: allow per-station GTKs") Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-12-17KVM: PPC: Book3S HV: Improve H_CONFER implementationSam Bobroff
Currently the H_CONFER hcall is implemented in kernel virtual mode, meaning that whenever a guest thread does an H_CONFER, all the threads in that virtual core have to exit the guest. This is bad for performance because it interrupts the other threads even if they are doing useful work. The H_CONFER hcall is called by a guest VCPU when it is spinning on a spinlock and it detects that the spinlock is held by a guest VCPU that is currently not running on a physical CPU. The idea is to give this VCPU's time slice to the holder VCPU so that it can make progress towards releasing the lock. To avoid having the other threads exit the guest unnecessarily, we add a real-mode implementation of H_CONFER that checks whether the other threads are doing anything. If all the other threads are idle (i.e. in H_CEDE) or trying to confer (i.e. in H_CONFER), it returns H_TOO_HARD which causes a guest exit and allows the H_CONFER to be handled in virtual mode. Otherwise it spins for a short time (up to 10 microseconds) to give other threads the chance to observe that this thread is trying to confer. The spin loop also terminates when any thread exits the guest or when all other threads are idle or trying to confer. If the timeout is reached, the H_CONFER returns H_SUCCESS. In this case the guest VCPU will recheck the spinlock word and most likely call H_CONFER again. This also improves the implementation of the H_CONFER virtual mode handler. If the VCPU is part of a virtual core (vcore) which is runnable, there will be a 'runner' VCPU which has taken responsibility for running the vcore. In this case we yield to the runner VCPU rather than the target VCPU. We also introduce a check on the target VCPU's yield count: if it differs from the yield count passed to H_CONFER, the target VCPU has run since H_CONFER was called and may have already released the lock. This check is required by PAPR. Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-12-17KVM: PPC: Book3S HV: Fix endianness of instruction obtained from HEIR registerPaul Mackerras
There are two ways in which a guest instruction can be obtained from the guest in the guest exit code in book3s_hv_rmhandlers.S. If the exit was caused by a Hypervisor Emulation interrupt (i.e. an illegal instruction), the offending instruction is in the HEIR register (Hypervisor Emulation Instruction Register). If the exit was caused by a load or store to an emulated MMIO device, we load the instruction from the guest by turning data relocation on and loading the instruction with an lwz instruction. Unfortunately, in the case where the guest has opposite endianness to the host, these two methods give results of different endianness, but both get put into vcpu->arch.last_inst. The HEIR value has been loaded using guest endianness, whereas the lwz will load the instruction using host endianness. The rest of the code that uses vcpu->arch.last_inst assumes it was loaded using host endianness. To fix this, we define a new vcpu field to store the HEIR value. Then, in kvmppc_handle_exit_hv(), we transfer the value from this new field to vcpu->arch.last_inst, doing a byte-swap if the guest and host endianness differ. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-12-17KVM: PPC: Book3S HV: Remove code for PPC970 processorsPaul Mackerras
This removes the code that was added to enable HV KVM to work on PPC970 processors. The PPC970 is an old CPU that doesn't support virtualizing guest memory. Removing PPC970 support also lets us remove the code for allocating and managing contiguous real-mode areas, the code for the !kvm->arch.using_mmu_notifiers case, the code for pinning pages of guest memory when first accessed and keeping track of which pages have been pinned, and the code for handling H_ENTER hypercalls in virtual mode. Book3S HV KVM is now supported only on POWER7 and POWER8 processors. The KVM_CAP_PPC_RMA capability now always returns 0. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-12-17KVM: PPC: Book3S HV: Tracepoints for KVM HV guest interactionsSuresh E. Warrier
This patch adds trace points in the guest entry and exit code and also for exceptions handled by the host in kernel mode - hypercalls and page faults. The new events are added to /sys/kernel/debug/tracing/events under a new subsystem called kvm_hv. Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-12-17KVM: PPC: Book3S HV: Simplify locking around stolen time calculationsPaul Mackerras
Currently the calculations of stolen time for PPC Book3S HV guests uses fields in both the vcpu struct and the kvmppc_vcore struct. The fields in the kvmppc_vcore struct are protected by the vcpu->arch.tbacct_lock of the vcpu that has taken responsibility for running the virtual core. This works correctly but confuses lockdep, because it sees that the code takes the tbacct_lock for a vcpu in kvmppc_remove_runnable() and then takes another vcpu's tbacct_lock in vcore_stolen_time(), and it thinks there is a possibility of deadlock, causing it to print reports like this: ============================================= [ INFO: possible recursive locking detected ] 3.18.0-rc7-kvm-00016-g8db4bc6 #89 Not tainted --------------------------------------------- qemu-system-ppc/6188 is trying to acquire lock: (&(&vcpu->arch.tbacct_lock)->rlock){......}, at: [<d00000000ecb1fe8>] .vcore_stolen_time+0x48/0xd0 [kvm_hv] but task is already holding lock: (&(&vcpu->arch.tbacct_lock)->rlock){......}, at: [<d00000000ecb25a0>] .kvmppc_remove_runnable.part.3+0x30/0xd0 [kvm_hv] other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&(&vcpu->arch.tbacct_lock)->rlock); lock(&(&vcpu->arch.tbacct_lock)->rlock); *** DEADLOCK *** May be due to missing lock nesting notation 3 locks held by qemu-system-ppc/6188: #0: (&vcpu->mutex){+.+.+.}, at: [<d00000000eb93f98>] .vcpu_load+0x28/0xe0 [kvm] #1: (&(&vcore->lock)->rlock){+.+...}, at: [<d00000000ecb41b0>] .kvmppc_vcpu_run_hv+0x530/0x1530 [kvm_hv] #2: (&(&vcpu->arch.tbacct_lock)->rlock){......}, at: [<d00000000ecb25a0>] .kvmppc_remove_runnable.part.3+0x30/0xd0 [kvm_hv] stack backtrace: CPU: 40 PID: 6188 Comm: qemu-system-ppc Not tainted 3.18.0-rc7-kvm-00016-g8db4bc6 #89 Call Trace: [c000000b2754f3f0] [c000000000b31b6c] .dump_stack+0x88/0xb4 (unreliable) [c000000b2754f470] [c0000000000faeb8] .__lock_acquire+0x1878/0x2190 [c000000b2754f600] [c0000000000fbf0c] .lock_acquire+0xcc/0x1a0 [c000000b2754f6d0] [c000000000b2954c] ._raw_spin_lock_irq+0x4c/0x70 [c000000b2754f760] [d00000000ecb1fe8] .vcore_stolen_time+0x48/0xd0 [kvm_hv] [c000000b2754f7f0] [d00000000ecb25b4] .kvmppc_remove_runnable.part.3+0x44/0xd0 [kvm_hv] [c000000b2754f880] [d00000000ecb43ec] .kvmppc_vcpu_run_hv+0x76c/0x1530 [kvm_hv] [c000000b2754f9f0] [d00000000eb9f46c] .kvmppc_vcpu_run+0x2c/0x40 [kvm] [c000000b2754fa60] [d00000000eb9c9a4] .kvm_arch_vcpu_ioctl_run+0x54/0x160 [kvm] [c000000b2754faf0] [d00000000eb94538] .kvm_vcpu_ioctl+0x498/0x760 [kvm] [c000000b2754fcb0] [c000000000267eb4] .do_vfs_ioctl+0x444/0x770 [c000000b2754fd90] [c0000000002682a4] .SyS_ioctl+0xc4/0xe0 [c000000b2754fe30] [c0000000000092e4] syscall_exit+0x0/0x98 In order to make the locking easier to analyse, we change the code to use a spinlock in the kvmppc_vcore struct to protect the stolen_tb and preempt_tb fields. This lock needs to be an irq-safe lock since it is used in the kvmppc_core_vcpu_load_hv() and kvmppc_core_vcpu_put_hv() functions, which are called with the scheduler rq lock held, which is an irq-safe lock. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-12-17arch: powerpc: kvm: book3s_paired_singles.c: Remove unused functionRickard Strandqvist
Remove the function inst_set_field() that is not used anywhere. This was partially found by using a static code analysis program called cppcheck. Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-12-17arch: powerpc: kvm: book3s_pr.c: Remove unused functionRickard Strandqvist
Remove the function get_fpr_index() that is not used anywhere. This was partially found by using a static code analysis program called cppcheck. Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-12-17arch: powerpc: kvm: book3s.c: Remove some unused functionsRickard Strandqvist
Removes some functions that are not used anywhere: kvmppc_core_load_guest_debugstate() kvmppc_core_load_host_debugstate() This was partially found by using a static code analysis program called cppcheck. Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-12-17arch: powerpc: kvm: book3s_32_mmu.c: Remove unused functionRickard Strandqvist
Remove the function sr_nx() that is not used anywhere. This was partially found by using a static code analysis program called cppcheck. Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-12-17microblaze: Fix mmap for cache coherent memoryLars-Peter Clausen
When running in non-cache coherent configuration the memory that was allocated with dma_alloc_coherent() has a custom mapping and so there is no 1-to-1 relationship between the kernel virtual address and the PFN. This means that virt_to_pfn() will not work correctly for those addresses and the default mmap implementation in the form of dma_common_mmap() will map some random, but not the requested, memory area. Fix this by providing a custom mmap implementation that looks up the PFN from the page table rather than using virt_to_pfn. Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Michal Simek <michal.simek@xilinx.com>
2014-12-17new helper: iter_is_iovec()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-12-17move_extent_per_page(): get rid of unused w_flagsAl Viro
... and comparing get_fs() with KERNEL_DS used only to initialize that Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-12-17lustre: get rid of playing with ->fsAl Viro
* removed several pieces of dead code in lustre_compat25.h * don't open-code current_umask() (and BTW, 0755 & (S_IRWXUGO | S_ISVTX) is better spelled as 0755) * fix broken attempt to get the pathname by dentry - abusing d_path() for that is simply wrong. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>