From 7f1fc268c47491fd5e63548f6415fc8604e13003 Mon Sep 17 00:00:00 2001 From: Konrad Rzeszutek Wilk Date: Sun, 5 May 2013 09:30:09 -0400 Subject: xen/vcpu/pvhvm: Fix vcpu hotplugging hanging. If a user did: echo 0 > /sys/devices/system/cpu/cpu1/online echo 1 > /sys/devices/system/cpu/cpu1/online we would (this a build with DEBUG enabled) get to: smpboot: ++++++++++++++++++++=_---CPU UP 1 .. snip.. smpboot: Stack at about ffff880074c0ff44 smpboot: CPU1: has booted. and hang. The RCU mechanism would kick in an try to IPI the CPU1 but the IPIs (and all other interrupts) would never arrive at the CPU1. At first glance at least. A bit digging in the hypervisor trace shows that (using xenanalyze): [vla] d4v1 vec 243 injecting 0.043163027 --|x d4v1 intr_window vec 243 src 5(vector) intr f3 ] 0.043163639 --|x d4v1 vmentry cycles 1468 ] 0.043164913 --|x d4v1 vmexit exit_reason PENDING_INTERRUPT eip ffffffff81673254 0.043164913 --|x d4v1 inj_virq vec 243 real [vla] d4v1 vec 243 injecting 0.043164913 --|x d4v1 intr_window vec 243 src 5(vector) intr f3 ] 0.043165526 --|x d4v1 vmentry cycles 1472 ] 0.043166800 --|x d4v1 vmexit exit_reason PENDING_INTERRUPT eip ffffffff81673254 0.043166800 --|x d4v1 inj_virq vec 243 real [vla] d4v1 vec 243 injecting there is a pending event (subsequent debugging shows it is the IPI from the VCPU0 when smpboot.c on VCPU1 has done "set_cpu_online(smp_processor_id(), true)") and the guest VCPU1 is interrupted with the callback IPI (0xf3 aka 243) which ends up calling __xen_evtchn_do_upcall. The __xen_evtchn_do_upcall seems to do *something* but not acknowledge the pending events. And the moment the guest does a 'cli' (that is the ffffffff81673254 in the log above) the hypervisor is invoked again to inject the IPI (0xf3) to tell the guest it has pending interrupts. This repeats itself forever. The culprit was the per_cpu(xen_vcpu, cpu) pointer. At the bootup we set each per_cpu(xen_vcpu, cpu) to point to the shared_info->vcpu_info[vcpu] but later on use the VCPUOP_register_vcpu_info to register per-CPU structures (xen_vcpu_setup). This is used to allow events for more than 32 VCPUs and for performance optimizations reasons. When the user performs the VCPU hotplug we end up calling the the xen_vcpu_setup once more. We make the hypercall which returns -EINVAL as it does not allow multiple registration calls (and already has re-assigned where the events are being set). We pick the fallback case and set per_cpu(xen_vcpu, cpu) to point to the shared_info->vcpu_info[vcpu] (which is a good fallback during bootup). However the hypervisor is still setting events in the register per-cpu structure (per_cpu(xen_vcpu_info, cpu)). As such when the events are set by the hypervisor (such as timer one), and when we iterate in __xen_evtchn_do_upcall we end up reading stale events from the shared_info->vcpu_info[vcpu] instead of the per_cpu(xen_vcpu_info, cpu) structures. Hence we never acknowledge the events that the hypervisor has set and the hypervisor keeps on reminding us to ack the events which we never do. The fix is simple. Don't on the second time when xen_vcpu_setup is called over-write the per_cpu(xen_vcpu, cpu) if it points to per_cpu(xen_vcpu_info). Acked-by: Stefano Stabellini CC: stable@vger.kernel.org Signed-off-by: Konrad Rzeszutek Wilk --- arch/x86/xen/enlighten.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+) (limited to 'arch/x86/xen/enlighten.c') diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c index ddbd54a9b84..94a81f41e8a 100644 --- a/arch/x86/xen/enlighten.c +++ b/arch/x86/xen/enlighten.c @@ -157,6 +157,21 @@ static void xen_vcpu_setup(int cpu) BUG_ON(HYPERVISOR_shared_info == &xen_dummy_shared_info); + /* + * This path is called twice on PVHVM - first during bootup via + * smp_init -> xen_hvm_cpu_notify, and then if the VCPU is being + * hotplugged: cpu_up -> xen_hvm_cpu_notify. + * As we can only do the VCPUOP_register_vcpu_info once lets + * not over-write its result. + * + * For PV it is called during restore (xen_vcpu_restore) and bootup + * (xen_setup_vcpu_info_placement). The hotplug mechanism does not + * use this function. + */ + if (xen_hvm_domain()) { + if (per_cpu(xen_vcpu, cpu) == &per_cpu(xen_vcpu_info, cpu)) + return; + } if (cpu < MAX_VIRT_CPUS) per_cpu(xen_vcpu,cpu) = &HYPERVISOR_shared_info->vcpu_info[cpu]; -- cgit v1.2.3-70-g09d2 From a520996ae2e2792e1f90b74e67c974120c8a3b83 Mon Sep 17 00:00:00 2001 From: Konrad Rzeszutek Wilk Date: Sun, 5 May 2013 08:51:47 -0400 Subject: xen/vcpu: Document the xen_vcpu_info and xen_vcpu They are important structures and it is not clear at first look what they are for. The xen_vcpu is a pointer. By default it points to the shared_info structure (at the CPU offset location). However if the VCPUOP_register_vcpu_info hypercall is implemented we can make the xen_vcpu pointer point to a per-CPU location. Acked-by: Stefano Stabellini [v1: Added comments from Ian Campbell] Signed-off-by: Konrad Rzeszutek Wilk --- arch/x86/xen/enlighten.c | 29 ++++++++++++++++++++++++++++- 1 file changed, 28 insertions(+), 1 deletion(-) (limited to 'arch/x86/xen/enlighten.c') diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c index 94a81f41e8a..a2babdb13a2 100644 --- a/arch/x86/xen/enlighten.c +++ b/arch/x86/xen/enlighten.c @@ -85,7 +85,29 @@ EXPORT_SYMBOL_GPL(hypercall_page); +/* + * Pointer to the xen_vcpu_info structure or + * &HYPERVISOR_shared_info->vcpu_info[cpu]. See xen_hvm_init_shared_info + * and xen_vcpu_setup for details. By default it points to share_info->vcpu_info + * but if the hypervisor supports VCPUOP_register_vcpu_info then it can point + * to xen_vcpu_info. The pointer is used in __xen_evtchn_do_upcall to + * acknowledge pending events. + * Also more subtly it is used by the patched version of irq enable/disable + * e.g. xen_irq_enable_direct and xen_iret in PV mode. + * + * The desire to be able to do those mask/unmask operations as a single + * instruction by using the per-cpu offset held in %gs is the real reason + * vcpu info is in a per-cpu pointer and the original reason for this + * hypercall. + * + */ DEFINE_PER_CPU(struct vcpu_info *, xen_vcpu); + +/* + * Per CPU pages used if hypervisor supports VCPUOP_register_vcpu_info + * hypercall. This can be used both in PV and PVHVM mode. The structure + * overrides the default per_cpu(xen_vcpu, cpu) value. + */ DEFINE_PER_CPU(struct vcpu_info, xen_vcpu_info); enum xen_domain_type xen_domain_type = XEN_NATIVE; @@ -187,7 +209,12 @@ static void xen_vcpu_setup(int cpu) /* Check to see if the hypervisor will put the vcpu_info structure where we want it, which allows direct access via - a percpu-variable. */ + a percpu-variable. + N.B. This hypercall can _only_ be called once per CPU. Subsequent + calls will error out with -EINVAL. This is due to the fact that + hypervisor has no unregister variant and this hypercall does not + allow to over-write info.mfn and info.offset. + */ err = HYPERVISOR_vcpu_op(VCPUOP_register_vcpu_info, cpu, &info); if (err) { -- cgit v1.2.3-70-g09d2 From d5b17dbff83d63fb6bf35daec21c8ebfb8d695b5 Mon Sep 17 00:00:00 2001 From: Konrad Rzeszutek Wilk Date: Sun, 5 May 2013 09:24:07 -0400 Subject: xen/smp/pvhvm: Don't point per_cpu(xen_vpcu, 33 and larger) to shared_info As it will point to some data, but not event channel data (the shared_info has an array limited to 32). This means that for PVHVM guests with more than 32 VCPUs without the usage of VCPUOP_register_info any interrupts to VCPUs larger than 32 would have gone unnoticed during early bootup. That is OK, as during early bootup, in smp_init we end up calling the hotplug mechanism (xen_hvm_cpu_notify) which makes the VCPUOP_register_vcpu_info call for all VCPUs and we can receive interrupts on VCPUs 33 and further. This is just a cleanup. Acked-by: Stefano Stabellini Signed-off-by: Konrad Rzeszutek Wilk --- arch/x86/xen/enlighten.c | 3 +++ 1 file changed, 3 insertions(+) (limited to 'arch/x86/xen/enlighten.c') diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c index a2babdb13a2..31e19f97fb5 100644 --- a/arch/x86/xen/enlighten.c +++ b/arch/x86/xen/enlighten.c @@ -1646,6 +1646,9 @@ void __ref xen_hvm_init_shared_info(void) * online but xen_hvm_init_shared_info is run at resume time too and * in that case multiple vcpus might be online. */ for_each_online_cpu(cpu) { + /* Leave it to be NULL. */ + if (cpu >= MAX_VIRT_CPUS) + continue; per_cpu(xen_vcpu, cpu) = &HYPERVISOR_shared_info->vcpu_info[cpu]; } } -- cgit v1.2.3-70-g09d2 From 4ea9b9aca90cfc71e6872ed3522356755162932c Mon Sep 17 00:00:00 2001 From: Zhenzhong Duan Date: Fri, 29 Mar 2013 09:14:00 +0800 Subject: xen: mask x2APIC feature in PV On x2apic enabled pvm, doing sysrq+l, got NULL pointer dereference as below. SysRq : Show backtrace of all active CPUs BUG: unable to handle kernel NULL pointer dereference at (null) IP: [] memcpy+0xb/0x120 Call Trace: [] ? __x2apic_send_IPI_mask+0x73/0x160 [] x2apic_send_IPI_all+0x1e/0x20 [] arch_trigger_all_cpu_backtrace+0x6c/0xb0 [] ? _raw_spin_lock_irqsave+0x34/0x50 [] sysrq_handle_showallcpus+0xe/0x10 [] __handle_sysrq+0x7d/0x140 [] ? __handle_sysrq+0x140/0x140 [] write_sysrq_trigger+0x57/0x60 [] proc_reg_write+0x86/0xc0 [] vfs_write+0xce/0x190 [] sys_write+0x55/0x90 [] system_call_fastpath+0x16/0x1b That's because apic points to apic_x2apic_cluster or apic_x2apic_phys but the basic element like cpumask isn't initialized. Mask x2APIC feature in pvm to avoid overwrite of apic pointer, update commit message per Konrad's suggestion. Signed-off-by: Zhenzhong Duan Tested-by: Tamon Shiose Signed-off-by: Konrad Rzeszutek Wilk --- arch/x86/xen/enlighten.c | 3 +++ 1 file changed, 3 insertions(+) (limited to 'arch/x86/xen/enlighten.c') diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c index 31e19f97fb5..4bd0066a1ef 100644 --- a/arch/x86/xen/enlighten.c +++ b/arch/x86/xen/enlighten.c @@ -429,6 +429,9 @@ static void __init xen_init_cpuid_mask(void) cpuid_leaf1_edx_mask &= ~((1 << X86_FEATURE_APIC) | /* disable local APIC */ (1 << X86_FEATURE_ACPI)); /* disable ACPI */ + + cpuid_leaf1_ecx_mask &= ~(1 << (X86_FEATURE_X2APIC % 32)); + ax = 1; cx = 0; xen_cpuid(&ax, &bx, &cx, &dx); -- cgit v1.2.3-70-g09d2