summaryrefslogtreecommitdiffstats
path: root/drivers/xen
AgeCommit message (Collapse)Author
2013-03-27Merge tag 'stable/for-linus-3.9-rc4-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen Pull Xen bug-fixes from Konrad Rzeszutek Wilk: "This is mostly just the last stragglers of the regression bugs that this merge window had. There are also two bug-fixes: one that adds an extra layer of security, and a regression fix for a change that was added in v3.7 (the v1 was faulty, the v2 works). - Regression fixes for C-and-P states not being parsed properly. - Fix possible security issue with guests triggering DoS via non-assigned MSI-Xs. - Fix regression (introduced in v3.7) with raising an event (v2). - Fix hastily introduced band-aid during c0 for the CR3 blowup." * tag 'stable/for-linus-3.9-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen/events: avoid race with raising an event in unmask_evtchn() xen/mmu: Move the setting of pvops.write_cr3 to later phase in bootup. xen/acpi-stub: Disable it b/c the acpi_processor_add is no longer called. xen-pciback: notify hypervisor about devices intended to be assigned to guests xen/acpi-processor: Don't dereference struct acpi_processor on all CPUs.
2013-03-27xen/events: avoid race with raising an event in unmask_evtchn()David Vrabel
In unmask_evtchn(), when the mask bit is cleared after testing for pending and the event becomes pending between the test and clear, then the upcall will not become pending and the event may be lost or delayed. Avoid this by always clearing the mask bit before checking for pending. If a hypercall is needed, remask the event as EVTCHNOP_unmask will only retrigger pending events if they were masked. This fixes a regression introduced in 3.7 by b5e579232d635b79a3da052964cb357ccda8d9ea (xen/events: fix unmask_evtchn for PV on HVM guests) which reordered the clear mask and check pending operations. Changes in v2: - set mask before hypercall. Cc: stable@vger.kernel.org Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-03-27xen/acpi-stub: Disable it b/c the acpi_processor_add is no longer called.Konrad Rzeszutek Wilk
With the Xen ACPI stub code (CONFIG_XEN_STUB=y) enabled, the power C and P states are no longer uploaded to the hypervisor. The reason is that the Xen CPU hotplug code: xen-acpi-cpuhotplug.c and the xen-acpi-stub.c register themselves as the "processor" type object. That means the generic processor (processor_driver.c) stops working and it does not call (acpi_processor_add) which populates the per_cpu(processors, pr->id) = pr; structure. The 'pr' is gathered from the acpi_processor_get_info function which does the job of finding the C-states and figuring out PBLK address. The 'processors->pr' is then later used by xen-acpi-processor.c (the one that uploads C and P states to the hypervisor). Since it is NULL, we end skip the gathering of _PSD, _PSS, _PCT, etc and never upload the power management data. The end result is that enabling the CONFIG_XEN_STUB in the build means that xen-acpi-processor is not working anymore. This temporary patch fixes it by marking the XEN_STUB driver as BROKEN until this can be properly fixed. CC: jinsong.liu@intel.com Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-03-22xen-pciback: notify hypervisor about devices intended to be assigned to guestsJan Beulich
For MSI-X capable devices the hypervisor wants to write protect the MSI-X table and PBA, yet it can't assume that resources have been assigned to their final values at device enumeration time. Thus have pciback do that notification, as having the device controlled by it is a prerequisite to assigning the device to guests anyway. This is the kernel part of hypervisor side commit 4245d33 ("x86/MSI: add mechanism to fully protect MSI-X table from PV guest accesses") on the master branch of git://xenbits.xen.org/xen.git. CC: stable@vger.kernel.org Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-03-22xen/acpi-processor: Don't dereference struct acpi_processor on all CPUs.Konrad Rzeszutek Wilk
With git commit c705c78c0d0835a4aa5d0d9a3422e3218462030c "acpi: Export the acpi_processor_get_performance_info" we are now using a different mechanism to access the P-states. The acpi_processor per-cpu structure is set and filtered by the core ACPI code which shrinks the per_cpu contents to only online CPUs. In the past we would call acpi_processor_register_performance() which would have not tried to dereference offline cpus. With the new patch and the fact that the loop we take is for for_all_possible_cpus we end up crashing on some machines. We could modify the loop to be for online_cpus - but all the other loops in the code use possible_cpus (for a good reason) - so lets leave it as so and just check if per_cpu(processor) is NULL. With this patch we will bypass the !online but possible CPUs. This fixes: IP: [<ffffffffa00d13b5>] xen_acpi_processor_init+0x1b6/0xe01 [xen_acpi_processor] PGD 4126e6067 PUD 4126e3067 PMD 0 Oops: 0002 [#1] SMP Pid: 432, comm: modprobe Not tainted 3.9.0-rc3+ #28 To be filled by O.E.M. To be filled by O.E.M./M5A97 RIP: e030:[<ffffffffa00d13b5>] [<ffffffffa00d13b5>] xen_acpi_processor_init+0x1b6/0xe01 [xen_acpi_processor] RSP: e02b:ffff88040c8a3ce8 EFLAGS: 00010282 .. snip.. Call Trace: [<ffffffffa00d11ff>] ? read_acpi_id+0x12b/0x12b [xen_acpi_processor] [<ffffffff8100215a>] do_one_initcall+0x12a/0x180 [<ffffffff810c42c3>] load_module+0x1cd3/0x2870 [<ffffffff81319b70>] ? ddebug_proc_open+0xc0/0xc0 [<ffffffff810c4f37>] sys_init_module+0xd7/0x120 [<ffffffff8166ce19>] system_call_fastpath+0x16/0x1b on some machines. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-03-12Merge tag 'stable/for-linus-3.9-rc2-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen Pull Xen fixes from Konrad Rzeszutek Wilk: - Compile warnings and errors (one on x86, two on ARM) - WARNING in xen-pciback - Use the acpi_processor_get_performance_info instead of the 'register' version * tag 'stable/for-linus-3.9-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen/acpi: remove redundant acpi/acpi_drivers.h include xen: arm: mandate EABI and use generic atomic operations. acpi: Export the acpi_processor_get_performance_info xen/pciback: Don't disable a PCI device that is already disabled.
2013-03-11xen/acpi: remove redundant acpi/acpi_drivers.h includeLiu Jinsong
It's redundant since linux/acpi.h has include it when CONFIG_ACPI enabled, and when CONFIG_ACPI disabled it will trigger compiling warning In file included from drivers/xen/xen-stub.c:28:0: include/acpi/acpi_drivers.h:103:31: warning: 'struct acpi_device' declared inside parameter list [enabled by default] include/acpi/acpi_drivers.h:103:31: warning: its scope is only this definition or declaration, which is probably not what you want [enabled by default] include/acpi/acpi_drivers.h:107:43: warning: 'struct acpi_pci_root' declared inside parameter list [enabled by default] Reported-by: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Liu Jinsong <jinsong.liu@intel.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-03-06acpi: Export the acpi_processor_get_performance_infoKonrad Rzeszutek Wilk
The git commit d5aaffa9dd531c978c6f3fea06a2972653bd7fc8 (cpufreq: handle cpufreq being disabled for all exported function) tightens the cpufreq API by returning errors when disable_cpufreq() had been called. The problem we are hitting is that the module xen-acpi-processor which uses the ACPI's functions: acpi_processor_register_performance, acpi_processor_preregister_performance, and acpi_processor_notify_smm fails at acpi_processor_register_performance with -22. Note that earlier during bootup in arch/x86/xen/setup.c there is also an call to cpufreq's API: disable_cpufreq(). This is b/c we want the Linux kernel to parse the ACPI data, but leave the cpufreq decisions to the hypervisor. In v3.9 all the checks that d5aaffa9dd531c978c6f3fea06a2972653bd7fc8 added are now hit and the calls to cpufreq_register_notifier will now fail. This means that acpi_processor_ppc_init ends up printing: "Warning: Processor Platform Limit not supported" and the acpi_processor_ppc_status is not set. The repercussions of that is that the call to acpi_processor_register_performance fails right away at: if (!(acpi_processor_ppc_status & PPC_REGISTERED)) and we don't progress any further on parsing and extracting the _P* objects. The only reason the Xen code called that function was b/c it was exported and the only way to gather the P-states. But we can also just make acpi_processor_get_performance_info be exported and not use acpi_processor_register_performance. This patch does so. Acked-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-03-06xen/pciback: Don't disable a PCI device that is already disabled.Konrad Rzeszutek Wilk
While shuting down a HVM guest with pci devices passed through we get this: pciback 0000:04:00.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100002) ------------[ cut here ]------------ WARNING: at drivers/pci/pci.c:1397 pci_disable_device+0x88/0xa0() Hardware name: MS-7640 Device pciback disabling already-disabled device Modules linked in: Pid: 53, comm: xenwatch Not tainted 3.9.0-rc1-20130304a+ #1 Call Trace: [<ffffffff8106994a>] warn_slowpath_common+0x7a/0xc0 [<ffffffff81069a31>] warn_slowpath_fmt+0x41/0x50 [<ffffffff813cf288>] pci_disable_device+0x88/0xa0 [<ffffffff814554a7>] xen_pcibk_reset_device+0x37/0xd0 [<ffffffff81454b6f>] ? pcistub_put_pci_dev+0x6f/0x120 [<ffffffff81454b8d>] pcistub_put_pci_dev+0x8d/0x120 [<ffffffff814582a9>] __xen_pcibk_release_devices+0x59/0xa0 This fixes the bug. CC: stable@vger.kernel.org Reported-and-Tested-by: Sander Eikelenboom <linux@eikelenboom.it> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-03-03fs: Limit sys_mount to only request filesystem modules.Eric W. Biederman
Modify the request_module to prefix the file system type with "fs-" and add aliases to all of the filesystems that can be built as modules to match. A common practice is to build all of the kernel code and leave code that is not commonly needed as modules, with the result that many users are exposed to any bug anywhere in the kernel. Looking for filesystems with a fs- prefix limits the pool of possible modules that can be loaded by mount to just filesystems trivially making things safer with no real cost. Using aliases means user space can control the policy of which filesystem modules are auto-loaded by editing /etc/modprobe.d/*.conf with blacklist and alias directives. Allowing simple, safe, well understood work-arounds to known problematic software. This also addresses a rare but unfortunate problem where the filesystem name is not the same as it's module name and module auto-loading would not work. While writing this patch I saw a handful of such cases. The most significant being autofs that lives in the module autofs4. This is relevant to user namespaces because we can reach the request module in get_fs_type() without having any special permissions, and people get uncomfortable when a user specified string (in this case the filesystem type) goes all of the way to request_module. After having looked at this issue I don't think there is any particular reason to perform any filtering or permission checks beyond making it clear in the module request that we want a filesystem module. The common pattern in the kernel is to call request_module() without regards to the users permissions. In general all a filesystem module does once loaded is call register_filesystem() and go to sleep. Which means there is not much attack surface exposed by loading a filesytem module unless the filesystem is mounted. In a user namespace filesystems are not mounted unless .fs_flags = FS_USERNS_MOUNT, which most filesystems do not set today. Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Acked-by: Kees Cook <keescook@chromium.org> Reported-by: Kees Cook <keescook@google.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2013-03-03Merge tag 'stable/for-linus-3.9-rc1-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen Pull Xen bug-fixes from Konrad Rzeszutek Wilk: - Update the Xen ACPI memory and CPU hotplug locking mechanism. - Fix PAT issues wherein various applications would not start - Fix handling of multiple MSI as AHCI now does it. - Fix ARM compile failures. * tag 'stable/for-linus-3.9-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xenbus: fix compile failure on ARM with Xen enabled xen/pci: We don't do multiple MSI's. xen/pat: Disable PAT using pat_enabled value. xen/acpi: xen cpu hotplug minor updates xen/acpi: xen memory hotplug minor updates
2013-03-01xenbus: fix compile failure on ARM with Xen enabledSteven Noonan
Adding an include of linux/mm.h resolves this: drivers/xen/xenbus/xenbus_client.c: In function ‘xenbus_map_ring_valloc_hvm’: drivers/xen/xenbus/xenbus_client.c:532:66: error: implicit declaration of function ‘page_to_section’ [-Werror=implicit-function-declaration] CC: stable@vger.kernel.org Signed-off-by: Steven Noonan <steven@uplinklabs.net> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-02-26Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs pile (part one) from Al Viro: "Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent locking violations, etc. The most visible changes here are death of FS_REVAL_DOT (replaced with "has ->d_weak_revalidate()") and a new helper getting from struct file to inode. Some bits of preparation to xattr method interface changes. Misc patches by various people sent this cycle *and* ocfs2 fixes from several cycles ago that should've been upstream right then. PS: the next vfs pile will be xattr stuff." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits) saner proc_get_inode() calling conventions proc: avoid extra pde_put() in proc_fill_super() fs: change return values from -EACCES to -EPERM fs/exec.c: make bprm_mm_init() static ocfs2/dlm: use GFP_ATOMIC inside a spin_lock ocfs2: fix possible use-after-free with AIO ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero target: writev() on single-element vector is pointless export kernel_write(), convert open-coded instances fs: encode_fh: return FILEID_INVALID if invalid fid_type kill f_vfsmnt vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op nfsd: handle vfs_getattr errors in acl protocol switch vfs_getattr() to struct path default SET_PERSONALITY() in linux/elf.h ceph: prepopulate inodes only when request is aborted d_hash_and_lookup(): export, switch open-coded instances 9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate() 9p: split dropping the acls from v9fs_set_create_acl() ...
2013-02-25xen/acpi: xen cpu hotplug minor updatesLiu Jinsong
Recently at native Rafael did some cleanup for acpi, say, drop acpi_bus_add, remove unnecessary argument of acpi_bus_scan, and run acpi_bus_scan under acpi_scan_lock. This patch does similar cleanup for xen cpu hotplug, removing redundant logic, and adding lock. Signed-off-by: Liu Jinsong <jinsong.liu@intel.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-02-25xen/acpi: xen memory hotplug minor updatesLiu Jinsong
Dan Carpenter found current xen memory hotplug logic has potential issue: at func acpi_memory_get_device() *mem_device = acpi_driver_data(device); while the device may be NULL and then dereference. At native side, Rafael recently updated acpi_memory_get_device(), dropping acpi_bus_add, adding lock, and avoiding above issue. This patch updates xen memory hotplug logic accordingly, removing redundant logic, adding lock, and avoiding dereference. Signed-off-by: Liu Jinsong <jinsong.liu@intel.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-02-24Merge tag 'stable/for-linus-3.9-rc0-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen Pull Xen update from Konrad Rzeszutek Wilk: "This has two new ACPI drivers for Xen - a physical CPU offline/online and a memory hotplug. The way this works is that ACPI kicks the drivers and they make the appropiate hypercall to the hypervisor to tell it that there is a new CPU or memory. There also some changes to the Xen ARM ABIs and couple of fixes. One particularly nasty bug in the Xen PV spinlock code was fixed by Stefan Bader - and has been there since the 2.6.32! Features: - Xen ACPI memory and CPU hotplug drivers - allowing Xen hypervisor to be aware of new CPU and new DIMMs - Cleanups Bug-fixes: - Fixes a long-standing bug in the PV spinlock wherein we did not kick VCPUs that were in a tight loop. - Fixes in the error paths for the event channel machinery" Fix up a few semantic conflicts with the ACPI interface changes in drivers/xen/xen-acpi-{cpu,mem}hotplug.c. * tag 'stable/for-linus-3.9-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen: event channel arrays are xen_ulong_t and not unsigned long xen: Send spinlock IPI to all waiters xen: introduce xen_remap, use it instead of ioremap xen: close evtchn port if binding to irq fails xen-evtchn: correct comment and error output xen/tmem: Add missing %s in the printk statement. xen/acpi: move xen_acpi_get_pxm under CONFIG_XEN_DOM0 xen/acpi: ACPI cpu hotplug xen/acpi: Move xen_acpi_get_pxm to Xen's acpi.h xen/stub: driver for CPU hotplug xen/acpi: ACPI memory hotplug xen/stub: driver for memory hotplug xen: implement updated XENMEM_add_to_physmap_range ABI xen/smp: Move the common CPU init code a bit to prep for PVH patch.
2013-02-22xenfs: switch to pure simple_fill_super()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-21Merge branch 'x86-mm-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 mm changes from Peter Anvin: "This is a huge set of several partly interrelated (and concurrently developed) changes, which is why the branch history is messier than one would like. The *really* big items are two humonguous patchsets mostly developed by Yinghai Lu at my request, which completely revamps the way we create initial page tables. In particular, rather than estimating how much memory we will need for page tables and then build them into that memory -- a calculation that has shown to be incredibly fragile -- we now build them (on 64 bits) with the aid of a "pseudo-linear mode" -- a #PF handler which creates temporary page tables on demand. This has several advantages: 1. It makes it much easier to support things that need access to data very early (a followon patchset uses this to load microcode way early in the kernel startup). 2. It allows the kernel and all the kernel data objects to be invoked from above the 4 GB limit. This allows kdump to work on very large systems. 3. It greatly reduces the difference between Xen and native (Xen's equivalent of the #PF handler are the temporary page tables created by the domain builder), eliminating a bunch of fragile hooks. The patch series also gets us a bit closer to W^X. Additional work in this pull is the 64-bit get_user() work which you were also involved with, and a bunch of cleanups/speedups to __phys_addr()/__pa()." * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (105 commits) x86, mm: Move reserving low memory later in initialization x86, doc: Clarify the use of asm("%edx") in uaccess.h x86, mm: Redesign get_user with a __builtin_choose_expr hack x86: Be consistent with data size in getuser.S x86, mm: Use a bitfield to mask nuisance get_user() warnings x86/kvm: Fix compile warning in kvm_register_steal_time() x86-32: Add support for 64bit get_user() x86-32, mm: Remove reference to alloc_remap() x86-32, mm: Remove reference to resume_map_numa_kva() x86-32, mm: Rip out x86_32 NUMA remapping code x86/numa: Use __pa_nodebug() instead x86: Don't panic if can not alloc buffer for swiotlb mm: Add alloc_bootmem_low_pages_nopanic() x86, 64bit, mm: hibernate use generic mapping_init x86, 64bit, mm: Mark data/bss/brk to nx x86: Merge early kernel reserve for 32bit and 64bit x86: Add Crash kernel low reservation x86, kdump: Remove crashkernel range find limit for 64bit memblock: Add memblock_mem_size() x86, boot: Not need to check setup_header version for setup_data ...
2013-02-20Merge tag 'pm+acpi-3.9-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI and power management updates from Rafael Wysocki: - Rework of the ACPI namespace scanning code from Rafael J. Wysocki with contributions from Bjorn Helgaas, Jiang Liu, Mika Westerberg, Toshi Kani, and Yinghai Lu. - ACPI power resources handling and ACPI device PM update from Rafael J Wysocki. - ACPICA update to version 20130117 from Bob Moore and Lv Zheng with contributions from Aaron Lu, Chao Guan, Jesper Juhl, and Tim Gardner. - Support for Intel Lynxpoint LPSS from Mika Westerberg. - cpuidle update from Len Brown including Intel Haswell support, C1 state for intel_idle, removal of global pm_idle. - cpuidle fixes and cleanups from Daniel Lezcano. - cpufreq fixes and cleanups from Viresh Kumar and Fabio Baltieri with contributions from Stratos Karafotis and Rickard Andersson. - Intel P-states driver for Sandy Bridge processors from Dirk Brandewie. - cpufreq driver for Marvell Kirkwood SoCs from Andrew Lunn. - cpufreq fixes related to ordering issues between acpi-cpufreq and powernow-k8 from Borislav Petkov and Matthew Garrett. - cpufreq support for Calxeda Highbank processors from Mark Langsdorf and Rob Herring. - cpufreq driver for the Freescale i.MX6Q SoC and cpufreq-cpu0 update from Shawn Guo. - cpufreq Exynos fixes and cleanups from Jonghwan Choi, Sachin Kamat, and Inderpal Singh. - Support for "lightweight suspend" from Zhang Rui. - Removal of the deprecated power trace API from Paul Gortmaker. - Assorted updates from Andreas Fleig, Colin Ian King, Davidlohr Bueso, Joseph Salisbury, Kees Cook, Li Fei, Nishanth Menon, ShuoX Liu, Srinivas Pandruvada, Tejun Heo, Thomas Renninger, and Yasuaki Ishimatsu. * tag 'pm+acpi-3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (267 commits) PM idle: remove global declaration of pm_idle unicore32 idle: delete stray pm_idle comment openrisc idle: delete pm_idle mn10300 idle: delete pm_idle microblaze idle: delete pm_idle m32r idle: delete pm_idle, and other dead idle code ia64 idle: delete pm_idle cris idle: delete idle and pm_idle ARM64 idle: delete pm_idle ARM idle: delete pm_idle blackfin idle: delete pm_idle sparc idle: rename pm_idle to sparc_idle sh idle: rename global pm_idle to static sh_idle x86 idle: rename global pm_idle to static x86_idle APM idle: register apm_cpu_idle via cpuidle cpufreq / intel_pstate: Add kernel command line option disable intel_pstate. cpufreq / intel_pstate: Change to disallow module build tools/power turbostat: display SMI count by default intel_idle: export both C1 and C1E ACPI / hotplug: Fix concurrency issues and memory leaks ...
2013-02-20xen: event channel arrays are xen_ulong_t and not unsigned longIan Campbell
On ARM we want these to be the same size on 32- and 64-bit. This is an ABI change on ARM. X86 does not change. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: Jan Beulich <JBeulich@suse.com> Cc: Keir (Xen.org) <keir@xen.org> Cc: Tim Deegan <tim@xen.org> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Cc: linux-arm-kernel@lists.infradead.org Cc: xen-devel@lists.xen.org Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-02-19Merge branch 'x86-hyperv-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/hyperv changes from Ingo Molnar: "The biggest change is support for Windows 8's improved hypervisor interrupt model on the Linux Hyper-V guest subsystem code side. Smallish fixes otherwise." * 'x86-hyperv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, hyperv: HYPERV depends on X86_LOCAL_APIC X86: Handle Hyper-V vmbus interrupts as special hypervisor interrupts X86: Add a check to catch Xen emulation of Hyper-V x86: Hyper-V: register clocksource only if its advertised
2013-02-19xen: introduce xen_remap, use it instead of ioremapStefano Stabellini
ioremap can't be used to map ring pages on ARM because it uses device memory caching attributes (MT_DEVICE*). Introduce a Xen specific abstraction to map ring pages, called xen_remap, that is defined as ioremap on x86 (no behavioral changes). On ARM it explicitly calls __arm_ioremap with the right caching attributes: MT_MEMORY. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-02-19xen: close evtchn port if binding to irq failsWei Liu
CC: stable@vger.kernel.org Signed-off-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-02-19xen-evtchn: correct comment and error outputWei Liu
The evtchn device has been moved to /dev/xen. Also change log level to KERN_ERR as other xen drivers. Signed-off-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-02-19xen/tmem: Add missing %s in the printk statement.Konrad Rzeszutek Wilk
Seems that it got lost. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-02-19xen/acpi: ACPI cpu hotplugLiu Jinsong
This patch implement real Xen ACPI cpu hotplug driver as module. When loaded, it replaces Xen stub driver. For booting existed cpus, the driver enumerates them. For hotadded cpus, which added at runtime and notify OS via device or container event, the driver is invoked to add them, parsing cpu information, hypercalling to Xen hypervisor to add them, and finally setting up new /sys interface for them. Signed-off-by: Liu Jinsong <jinsong.liu@intel.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-02-19xen/acpi: Move xen_acpi_get_pxm to Xen's acpi.hLiu Jinsong
So that it could be reused by Xen CPU hotplug logic. Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-02-19xen/stub: driver for CPU hotplugLiu Jinsong
Add Xen stub driver for CPU hotplug, early occupy to block native, will be replaced later by real Xen processor driver module. Signed-off-by: Liu Jinsong <jinsong.liu@intel.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-02-19xen/acpi: ACPI memory hotplugLiu Jinsong
This patch implements real Xen acpi memory hotplug driver as module. When loaded, it replaces Xen stub driver. When an acpi memory device hotadd event occurs, it notifies OS and invokes notification callback, adding related memory device and parsing memory information, finally hypercall to xen hypervisor to add memory. Signed-off-by: Liu Jinsong <jinsong.liu@intel.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-02-19xen/stub: driver for memory hotplugLiu Jinsong
This patch create a file (xen-stub.c) for Xen stub drivers. Xen stub drivers are used to reserve space for Xen drivers, i.e. memory hotplug and cpu hotplug, and to block native drivers loaded, so that real Xen drivers can be modular and loaded on demand. This patch is specific for Xen memory hotplug (other Xen logic can add stub drivers on their own). The xen stub driver will occupied earlier via subsys_initcall (than native memory hotplug driver via module_init and so blocking native). Later real Xen memory hotplug logic will unregister the stub driver and register itself to take effect on demand. Signed-off-by: Liu Jinsong <jinsong.liu@intel.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-02-15Merge tag 'stable/for-linus-3.8-rc7-tag-two' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen Pull xen fixes from Konrad Rzeszutek Wilk: "Two fixes: - A simple bug-fix for redundant NULL check. - CVE-2013-0228/XSA-42: x86/xen: don't assume %ds is usable in xen_iret for 32-bit PVOPS and two reverts: - Revert the PVonHVM kexec. The patch introduces a regression with older hypervisor stacks, such as Xen 4.1." * tag 'stable/for-linus-3.8-rc7-tag-two' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: Revert "xen PVonHVM: use E820_Reserved area for shared_info" Revert "xen/PVonHVM: fix compile warning in init_hvm_pv_info" xen: remove redundant NULL check before unregister_and_remove_pcpu(). x86/xen: don't assume %ds is usable in xen_iret for 32-bit PVOPS.
2013-02-15Merge branch 'acpi-cleanup'Rafael J. Wysocki
* acpi-cleanup: (21 commits) ACPI / hotplug: Fix concurrency issues and memory leaks ACPI: Remove the use of CONFIG_ACPI_CONTAINER_MODULE ACPI / scan: Full transition to D3cold in acpi_device_unregister() ACPI / scan: Make acpi_bus_hot_remove_device() acquire the scan lock ACPI: Drop the container.h header file ACPI / Documentation: refer to correct file for acpi_platform_device_ids[] table ACPI / scan: Make container driver use struct acpi_scan_handler ACPI / scan: Remove useless #ifndef from acpi_eject_store() ACPI: Unbind ACPI drv when probe failed ACPI: sysfs eject support for ACPI scan handlers ACPI / scan: Follow priorities of IDs when matching scan handlers ACPI / PCI: pci_slot: replace printk(KERN_xxx) with pr_xxx() ACPI / dock: Fix acpi_bus_get_device() check in drivers/acpi/dock.c ACPI / scan: Clean up acpi_bus_get_parent() ACPI / platform: Use struct acpi_scan_handler for creating devices ACPI / PCI: Make PCI IRQ link driver use struct acpi_scan_handler ACPI / PCI: Make PCI root driver use struct acpi_scan_handler ACPI / scan: Introduce struct acpi_scan_handler ACPI / scan: Make scanning of fixed devices follow the general scheme ACPI: Drop device start operation that is not used ...
2013-02-13xen: remove redundant NULL check before unregister_and_remove_pcpu().Cyril Roelandt
unregister_and_remove_pcpu on a NULL pointer is a no-op, so the NULL check in sync_pcpu can be removed. Signed-off-by: Cyril Roelandt <tipecaml@gmail.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-02-12X86: Handle Hyper-V vmbus interrupts as special hypervisor interruptsK. Y. Srinivasan
Starting with win8, vmbus interrupts can be delivered on any VCPU in the guest and furthermore can be concurrently active on multiple VCPUs. Support this interrupt delivery model by setting up a separate IDT entry for Hyper-V vmbus. interrupts. I would like to thank Jan Beulich <JBeulich@suse.com> and Thomas Gleixner <tglx@linutronix.de>, for their help. In this version of the patch, based on the feedback, I have merged the IDT vector for Xen and Hyper-V and made the necessary adjustments. Furhermore, based on Jan's feedback I have added the necessary compilation switches. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Link: http://lkml.kernel.org/r/1359940959-32168-3-git-send-email-kys@microsoft.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-02-08Merge tag 'stable/for-linus-3.8-rc6-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen Pull Xen fixes from Konrad Rzeszutek Wilk: "This has two fixes. One is a security fix wherein we would spam the kernel printk buffer if one of the guests was misbehaving. The other is much tamer and it was us only checking for one type of error from the IRQ subsystem (when allocating new IRQs) instead of for all of them. - Fix an IRQ allocation where we only check for a specific error (-1). - CVE-2013-0231 / XSA-43. Make xen-pciback rate limit error messages from xen_pcibk_enable_msi{,x}()" * tag 'stable/for-linus-3.8-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen: fix error handling path if xen_allocate_irq_dynamic fails xen-pciback: rate limit error messages from xen_pcibk_enable_msi{,x}()
2013-02-06xen: fix error handling path if xen_allocate_irq_dynamic failsWei Liu
It is possible that the call to xen_allocate_irq_dynamic() returns negative number other than -1. Reviewed-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-02-06xen-pciback: rate limit error messages from xen_pcibk_enable_msi{,x}()Jan Beulich
... as being guest triggerable (e.g. by invoking XEN_PCI_OP_enable_msi{,x} on a device not being MSI/MSI-X capable). This is CVE-2013-0231 / XSA-43. Also make the two messages uniform in both their wording and severity. Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-01-29x86: Don't panic if can not alloc buffer for swiotlbYinghai Lu
Normal boot path on system with iommu support: swiotlb buffer will be allocated early at first and then try to initialize iommu, if iommu for intel or AMD could setup properly, swiotlb buffer will be freed. The early allocating is with bootmem, and could panic when we try to use kdump with buffer above 4G only, or with memmap to limit mem under 4G. for example: memmap=4095M$1M to remove memory under 4G. According to Eric, add _nopanic version and no_iotlb_memory to fail map single later if swiotlb is still needed. -v2: don't pass nopanic, and use -ENOMEM return value according to Eric. panic early instead of using swiotlb_full to panic...according to Eric/Konrad. -v3: make swiotlb_init to be notpanic, but will affect: arm64, ia64, powerpc, tile, unicore32, x86. -v4: cleanup swiotlb_init by removing swiotlb_init_with_default_size. Suggested-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1359058816-7615-36-git-send-email-yinghai@kernel.org Reviewed-and-tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Kyungmin Park <kyungmin.park@samsung.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Andrzej Pietrasiewicz <andrzej.p@samsung.com> Cc: linux-mips@linux-mips.org Cc: xen-devel@lists.xensource.com Cc: virtualization@lists.linux-foundation.org Cc: Shuah Khan <shuahkhan@gmail.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-01-26ACPI: Remove useless type argument of driver .remove() operationRafael J. Wysocki
The second argument of ACPI driver .remove() operation is only used by the ACPI processor driver and the value passed to that driver through it is always available from the given struct acpi_device object's removal_type field. For this reason, the second ACPI driver .remove() argument is in fact useless, so drop it. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Jiang Liu <jiang.liu@huawei.com> Acked-by: Toshi Kani <toshi.kani@hp.com> Acked-by: Yinghai Lu <yinghai@kernel.org>
2013-01-18Merge tag 'stable/for-linus-3.8-rc3-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen Pull Xen fixes from Konrad Rzeszutek Wilk: - CVE-2013-0190/XSA-40 (or stack corruption for 32-bit PV kernels) - Fix racy vma access spotted by Al Viro - Fix mmap batch ioctl potentially resulting in large O(n) page allcations. - Fix vcpu online/offline BUG:scheduling while atomic.. - Fix unbound buffer scanning for more than 32 vCPUs. - Fix grant table being incorrectly initialized - Fix incorrect check in pciback - Allow privcmd in backend domains. Fix up whitespace conflict due to ugly merge resolution in Xen tree in arch/arm/xen/enlighten.c * tag 'stable/for-linus-3.8-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen: Fix stack corruption in xen_failsafe_callback for 32bit PVOPS guests. Revert "xen/smp: Fix CPU online/offline bug triggering a BUG: scheduling while atomic." xen/gntdev: remove erronous use of copy_to_user xen/gntdev: correctly unmap unlinked maps in mmu notifier xen/gntdev: fix unsafe vma access xen/privcmd: Fix mmap batch ioctl. Xen: properly bound buffer access when parsing cpu/*/availability xen/grant-table: correctly initialize grant table version 1 x86/xen : Fix the wrong check in pciback xen/privcmd: Relax access control in privcmd_ioctl_mmap
2013-01-15xen/gntdev: remove erronous use of copy_to_userDaniel De Graaf
Since there is now a mapping of granted pages in kernel address space in both PV and HVM, use it for UNMAP_NOTIFY_CLEAR_BYTE instead of accessing memory via copy_to_user and triggering sleep-in-atomic warnings. Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-01-15xen/gntdev: correctly unmap unlinked maps in mmu notifierDaniel De Graaf
If gntdev_ioctl_unmap_grant_ref is called on a range before unmapping it, the entry is removed from priv->maps and the later call to mn_invl_range_start won't find it to do the unmapping. Fix this by creating another list of freeable maps that the mmu notifier can search and use to unmap grants. Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-01-15xen/gntdev: fix unsafe vma accessDaniel De Graaf
In gntdev_ioctl_get_offset_for_vaddr, we need to hold mmap_sem while calling find_vma() to avoid potentially having the result freed out from under us. Similarly, the MMU notifier functions need to synchronize with gntdev_vma_close to avoid map->vma being freed during their iteration. Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Reported-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-01-15xen/privcmd: Fix mmap batch ioctl.Andres Lagar-Cavilla
1. If any individual mapping error happens, the V1 case will mark *all* operations as failed. Fixed. 2. The err_array was allocated with kcalloc, resulting in potentially O(n) page allocations. Refactor code to not use this array. Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-01-15Merge tag 'v3.7' into stable/for-linus-3.8Konrad Rzeszutek Wilk
Linux 3.7 * tag 'v3.7': (833 commits) Linux 3.7 Input: matrix-keymap - provide proper module license Revert "revert "Revert "mm: remove __GFP_NO_KSWAPD""" and associated damage ipv4: ip_check_defrag must not modify skb before unsharing Revert "mm: avoid waking kswapd for THP allocations when compaction is deferred or contended" inet_diag: validate port comparison byte code to prevent unsafe reads inet_diag: avoid unsafe and nonsensical prefix matches in inet_diag_bc_run() inet_diag: validate byte code to prevent oops in inet_diag_bc_run() inet_diag: fix oops for IPv4 AF_INET6 TCP SYN-RECV state mm: vmscan: fix inappropriate zone congestion clearing vfs: fix O_DIRECT read past end of block device net: gro: fix possible panic in skb_gro_receive() tcp: bug fix Fast Open client retransmission tmpfs: fix shared mempolicy leak mm: vmscan: do not keep kswapd looping forever due to individual uncompactable zones mm: compaction: validate pfn range passed to isolate_freepages_block mmc: sh-mmcif: avoid oops on spurious interrupts (second try) Revert misapplied "mmc: sh-mmcif: avoid oops on spurious interrupts" mmc: sdhci-s3c: fix missing clock for gpio card-detect lib/Makefile: Fix oid_registry build dependency ... Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Conflicts: arch/arm/xen/enlighten.c drivers/xen/Makefile [We need to have the v3.7 base as the 'for-3.8' was based off v3.7-rc3 and there are some patches in v3.7-rc6 that we to have in our branch]
2013-01-15Xen: properly bound buffer access when parsing cpu/*/availabilityJan Beulich
At the same time reduce the local buffers to 16 bytes each. Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-01-15xen/grant-table: correctly initialize grant table version 1Matt Wilson
Commit 85ff6acb075a484780b3d763fdf41596d8fc0970 (xen/granttable: Grant tables V2 implementation) changed the GREFS_PER_GRANT_FRAME macro from a constant to a conditional expression. The expression depends on grant_table_version being appropriately set. Unfortunately, at init time grant_table_version will be 0. The GREFS_PER_GRANT_FRAME conditional expression checks for "grant_table_version == 1", and therefore returns the number of grant references per frame for v2. This causes gnttab_init() to allocate fewer pages for gnttab_list, as a frame can old half the number of v2 entries than v1 entries. After gnttab_resume() is called, grant_table_version is appropriately set. nr_init_grefs will then be miscalculated and gnttab_free_count will hold a value larger than the actual number of free gref entries. If a guest is heavily utilizing improperly initialized v1 grant tables, memory corruption can occur. One common manifestation is corruption of the vmalloc list, resulting in a poisoned pointer derefrence when accessing /proc/meminfo or /proc/vmallocinfo: [ 40.770064] BUG: unable to handle kernel paging request at 0000200200001407 [ 40.770083] IP: [<ffffffff811a6fb0>] get_vmalloc_info+0x70/0x110 [ 40.770102] PGD 0 [ 40.770107] Oops: 0000 [#1] SMP [ 40.770114] CPU 10 This patch introduces a static variable, grefs_per_grant_frame, to cache the calculated value. gnttab_init() now calls gnttab_request_version() early so that grant_table_version and grefs_per_grant_frame can be appropriately set. A few BUG_ON()s have been added to prevent this type of bug from reoccurring in the future. Signed-off-by: Matt Wilson <msw@amazon.com> Reviewed-and-Tested-by: Steven Noonan <snoonan@amazon.com> Acked-by: Ian Campbell <Ian.Campbell@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Annie Li <annie.li@oracle.com> Cc: xen-devel@lists.xen.org Cc: linux-kernel@vger.kernel.org Cc: stable@vger.kernel.org # v3.3 and newer Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-01-15x86/xen : Fix the wrong check in pcibackYang Zhang
Fix the wrong check in pciback. Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-01-11xen/privcmd: Relax access control in privcmd_ioctl_mmapTamas Lengyel
In the privcmd Linux driver two checks in the functions privcmd_ioctl_mmap and privcmd_ioctl_mmap_batch are not needed as they are trying to enforce hypervisor-level access control. They should be removed as they break secondary control domains when performing dom0 disaggregation. Xen itself provides adequate security controls around these hypercalls and these checks prevent those controls from functioning as intended. Signed-off-by: Tamas K Lengyel <tamas.lengyel@zentific.com> Cc: Daniel De Graaf <dgdegra@tycho.nsa.gov> [v1: Fixed up the patch and commit description] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-01-03Drivers: xen: remove __dev* attributes.Greg Kroah-Hartman
CONFIG_HOTPLUG is going away as an option. As a result, the __dev* markings need to be removed. This change removes the use of __devinit, and __devinitdata from these drivers. Based on patches originally written by Bill Pemberton, but redone by me in order to handle some of the coding style issues better, by hand. Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Jan Beulich <jbeulich@suse.com> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Bill Pemberton <wfp5p@virginia.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>