summaryrefslogtreecommitdiffstats
path: root/arch/x86
AgeCommit message (Collapse)Author
2009-11-11x86: Make sure wakeup trampoline code is below 1MBYinghai Lu
Instead of using bootmem, try find_e820_area()/reserve_early(), and call acpi_reserve_memory() early, to allocate the wakeup trampoline code area below 1M. This is more reliable, and it also removes a dependency on bootmem. -v2: change function name to acpi_reserve_wakeup_memory(), as suggested by Rafael. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Acked-by: H. Peter Anvin <hpa@zytor.com> Acked-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: pm list <linux-pm@lists.linux-foundation.org> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <4AFA210B.3020207@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-08x86: k8.h: Add struct bootnodeRandy Dunlap
k8.h uses struct bootnode but does not #include a header file for it, so provide a simple declaration for it. arch/x86/include/asm/k8.h:13: warning: 'struct bootnode' declared inside parameter list arch/x86/include/asm/k8.h:13: warning: its scope is only this definition or declaration, which is probably not what you want Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> LKML-Reference: <20091028160955.d27ccb16.randy.dunlap@oracle.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-02x86_64, cpa: Use only text section in set_kernel_text_rw/roSuresh Siddha
set_kernel_text_rw()/set_kernel_text_ro() are marking pages starting from _text to __start_rodata as RW or RO. With CONFIG_DEBUG_RODATA, there might be free pages (associated with padding the sections to 2MB large page boundary) between text and rodata sections that are given back to page allocator. So we should use only use the start (__text) and end (__stop___ex_table) of the text section in set_kernel_text_rw()/set_kernel_text_ro(). Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Tested-by: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <20091029024821.164525222@sbs-t61.sc.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-02x86_64, ftrace: Make ftrace use kernel identity mapping to modify codeSuresh Siddha
On x86_64, kernel text mappings are mapped read-only with CONFIG_DEBUG_RODATA. So use the kernel identity mapping instead of the kernel text mapping to modify the kernel text. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Tested-by: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <20091029024821.080941108@sbs-t61.sc.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-02x86, cpa: Fix kernel text RO checks in static_protection()Suresh Siddha
Steven Rostedt reported that we are unconditionally making the kernel text mapping as read-only. i.e., if someone does cpa() to the kernel text area for setting/clearing any page table attribute, we unconditionally clear the read-write attribute for the kernel text mapping that is set at compile time. We should delay (to forbid the write attribute) and enforce only after the kernel has mapped the text as read-only. Reported-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Tested-by: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <20091029024820.996634347@sbs-t61.sc.intel.com> [ marked kernel_set_to_readonly as __read_mostly ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-27tracing: allow to change permissions for text with dynamic ftrace enabledSteven Rostedt
The commit 74e081797bd9d2a7d8005fe519e719df343a2ba8 x86-64: align RODATA kernel section to 2MB with CONFIG_DEBUG_RODATA prevents text sections from becoming read/write using set_memory_rw. The dynamic ftrace changes all text pages to read/write just before converting the calls to tracing to nops, and vice versa. I orginally just added a flag to allow this transaction when ftrace did the change, but I also found that when the CPA testing was running it would remove the read/write as well, and ftrace does not do the text conversion on boot up, and the CPA changes caused the dynamic tracer to fail on self tests. The current solution I have is to simply not to prevent change_page_attr from setting the RW bit for kernel text pages. Reported-by: Ingo Molnar <mingo@elte.hu> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-10-24x86, boot: Simplify setting of the PAE bitAlexander Potashev
A single 'movl' is shorter than the 'xorl'-'orl' pair. No change in behaviour. Signed-off-by: Alexander Potashev <aspotashev@gmail.com> LKML-Reference: <1256341043-4928-1-git-send-email-aspotashev@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-23x86: Remove pfn in add_one_highpage_init()Minchan Kim
commit cc9f7a0ccf000d4db5fbdc7b0ae48eefea102f69 changed add_one_highpage_init. We don't use pfn any more. Let's remove unnecessary argument. This patch doesn't chage function behavior. This patch is based on v2.6.32-rc5. Signed-off-by: Minchan Kim <minchan.kim@gmail.com> Cc: Yinghai Lu <yhlu.kernel@gmail.com> LKML-Reference: <20091022112722.adc8e55c.minchan.kim@barrios-desktop> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-20x86-64: add comment for RODATA large page retainmentSuresh Siddha
Add a comment explaining why RODATA is aligned to 2 MB. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-10-20x86-64: align RODATA kernel section to 2MB with CONFIG_DEBUG_RODATASuresh Siddha
CONFIG_DEBUG_RODATA chops the large pages spanning boundaries of kernel text/rodata/data to small 4KB pages as they are mapped with different attributes (text as RO, RODATA as RO and NX etc). On x86_64, preserve the large page mappings for kernel text/rodata/data boundaries when CONFIG_DEBUG_RODATA is enabled. This is done by allowing the RODATA section to be hugepage aligned and having same RWX attributes for the 2MB page boundaries Extra Memory pages padding the sections will be freed during the end of the boot and the kernel identity mappings will have different RWX permissions compared to the kernel text mappings. Kernel identity mappings to these physical pages will be mapped with smaller pages but large page mappings are still retained for kernel text,rodata,data mappings. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <20091014220254.190119924@sbs-t61.sc.intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-10-20x86-64: preserve large page mapping for 1st 2MB kernel txt with ↵Suresh Siddha
CONFIG_DEBUG_RODATA In the first 2MB, kernel text is co-located with kernel static page tables setup by head_64.S. CONFIG_DEBUG_RODATA chops this 2MB large page mapping to small 4KB pages as we mark the kernel text as RO, leaving the static page tables as RW. With CONFIG_DEBUG_RODATA disabled, OLTP run on NHM-EP shows 1% improvement with 2% reduction in system time and 1% improvement in iowait idle time. To recover this, move the kernel static page tables to .data section, so that we don't have to break the first 2MB of kernel text to small pages with CONFIG_DEBUG_RODATA. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <20091014220254.063193621@sbs-t61.sc.intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-10-12x86: Interleave emulated nodes over physical nodesDavid Rientjes
Add interleaved NUMA emulation support This patch interleaves emulated nodes over the system's physical nodes. This is required for interleave optimizations since mempolicies, for example, operate by iterating over a nodemask and act without knowledge of node distances. It can also be used for testing memory latencies and NUMA bugs in the kernel. There're a couple of ways to do this: - divide the number of emulated nodes by the number of physical nodes and allocate the result on each physical node, or - allocate each successive emulated node on a different physical node until all memory is exhausted. The disadvantage of the first option is, depending on the asymmetry in node capacities of each physical node, emulated nodes may substantially differ in size on a particular physical node compared to another. The disadvantage of the second option is, also depending on the asymmetry in node capacities of each physical node, there may be more emulated nodes allocated on a single physical node as another. This patch implements the second option; we sacrifice the possibility that we may have slightly more emulated nodes on a particular physical node compared to another in lieu of node size asymmetry. [ Note that "node capacity" of a physical node is not only a function of its addressable range, but also is affected by subtracting out the amount of reserved memory over that range. NUMA emulation only deals with available, non-reserved memory quantities. ] We ensure there is at least a minimal amount of available memory allocated to each node. We also make sure that at least this amount of available memory is available in ZONE_DMA32 for any node that includes both ZONE_DMA32 and ZONE_NORMAL. This patch also cleans the emulation code up by no longer passing the statically allocated struct bootnode array among the various functions. This init.data array is not allocated on the stack since it may be very large and thus it may be accessed at file scope. The WARN_ON() for nodes_cover_memory() when faking proximity domains is removed since it relies on successive nodes always having greater start addresses than previous nodes; with interleaving this is no longer always true. Signed-off-by: David Rientjes <rientjes@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Ankita Garg <ankita@in.ibm.com> Cc: Len Brown <len.brown@intel.com> LKML-Reference: <alpine.DEB.1.00.0909251519150.14754@chino.kir.corp.google.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-12x86: Export srat physical topologyDavid Rientjes
This is the counterpart to "x86: export k8 physical topology" for SRAT. It is not as invasive because the acpi code already seperates node setup into detection and registration steps, with the exception of registering e820 active regions in acpi_numa_memory_affinity_init(). This is now moved to acpi_scan_nodes() if NUMA emulation is disabled or deferred. acpi_numa_init() now returns a value which specifies whether an underlying SRAT was located. If so, that topology can be used by the emulation code to interleave emulated nodes over physical nodes or to register the nodes for ACPI. acpi_get_nodes() may now be used to export the srat physical topology of the machine for NUMA emulation. Signed-off-by: David Rientjes <rientjes@google.com> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Ankita Garg <ankita@in.ibm.com> Cc: Len Brown <len.brown@intel.com> LKML-Reference: <alpine.DEB.1.00.0909251518580.14754@chino.kir.corp.google.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-12x86: Export k8 physical topologyDavid Rientjes
To eventually interleave emulated nodes over physical nodes, we need to know the physical topology of the machine without actually registering it. This does the k8 node setup in two parts: detection and registration. NUMA emulation can then used the physical topology detected to setup the address ranges of emulated nodes accordingly. If emulation isn't used, the k8 nodes are registered as normal. Two formals are added to the x86 NUMA setup functions: `acpi' and `k8'. These represent whether ACPI or K8 NUMA has been detected; both cannot be true at the same time. This specifies to the NUMA emulation code whether an underlying physical NUMA topology exists and which interface to use. This patch deals solely with separating the k8 setup path into Northbridge detection and registration steps and leaves the ACPI changes for a subsequent patch. The `acpi' formal is added here, however, to avoid touching all the header files again in the next patch. This approach also ensures emulated nodes will not span physical nodes so the true memory latency is not misrepresented. k8_get_nodes() may now be used to export the k8 physical topology of the machine for NUMA emulation. Signed-off-by: David Rientjes <rientjes@google.com> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Ankita Garg <ankita@in.ibm.com> Cc: Len Brown <len.brown@intel.com> LKML-Reference: <alpine.DEB.1.00.0909251518400.14754@chino.kir.corp.google.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-12x86: Clean up and add missing log levels for k8David Rientjes
Convert all printk's in arch/x86/mm/k8topology_64.c to use pr_info() or pr_err() appropriately. Adds log levels for messages currently lacking them. Signed-off-by: David Rientjes <rientjes@google.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Ankita Garg <ankita@in.ibm.com> Cc: Len Brown <len.brown@intel.com> LKML-Reference: <alpine.DEB.1.00.0909251517440.14754@chino.kir.corp.google.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-11pci: increase alignment to make more space for hidden codeYinghai Lu
As reported in http://bugzilla.kernel.org/show_bug.cgi?id=13940 on some system when acpi are enabled, acpi clears some BAR for some devices without reason, and kernel will need to allocate devices for them. It then apparently hits some undocumented resource conflict, resulting in non-working devices. Try to increase alignment to get more safe range for unassigned devices. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-11headers: remove sched.h from interrupt.hAlexey Dobriyan
After m68k's task_thread_info() doesn't refer to current, it's possible to remove sched.h from interrupt.h and not break m68k! Many thanks to Heiko Carstens for allowing this. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2009-10-08Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, pci: Correct spelling in a comment x86: Simplify bound checks in the MTRR code x86: EDAC: carve out AMD MCE decoding logic initcalls: Add early_initcall() for modules x86: EDAC: MCE: Fix MCE decoding callback logic
2009-10-08x86, timers: Check for pending timers after (device) interruptsArjan van de Ven
Now that range timers and deferred timers are common, I found a problem with these using the "perf timechart" tool. Frans Pop also reported high scheduler latencies via LatencyTop, when using iwlagn. It turns out that on x86, these two 'opportunistic' timers only get checked when another "real" timer happens. These opportunistic timers have the objective to save power by hitchhiking on other wakeups, as to avoid CPU wakeups by themselves as much as possible. The change in this patch runs this check not only at timer interrupts, but at all (device) interrupts. The effect is that: 1) the deferred timers/range timers get delayed less 2) the range timers cause less wakeups by themselves because the percentage of hitchhiking on existing wakeup events goes up. I've verified the working of the patch using "perf timechart", the original exposed bug is gone with this patch. Frans also reported success - the latencies are now down in the expected ~10 msec range. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Tested-by: Frans Pop <elendil@planet.nl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <20091008064041.67219b13@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-05Merge branch 'kvm-updates/2.6.32' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
* 'kvm-updates/2.6.32' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: add support for change_pte mmu notifiers KVM: MMU: add SPTE_HOST_WRITEABLE flag to the shadow ptes KVM: MMU: dont hold pagecount reference for mapped sptes pages KVM: Prevent overflow in KVM_GET_SUPPORTED_CPUID KVM: VMX: flush TLB with INVEPT on cpu migration KVM: fix LAPIC timer period overflow KVM: s390: fix memsize >= 4G KVM: SVM: Handle tsc in svm_get_msr/svm_set_msr correctly KVM: SVM: Fix tsc offset adjustment when running nested
2009-10-05Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Don't leak 64-bit kernel register values to 32-bit processes x86, SLUB: Remove unused CONFIG FAST_CMPXCHG_LOCAL x86: earlyprintk: Fix regression to handle serial,ttySn as 1 arg x86: Don't generate cmpxchg8b_emu if CONFIG_X86_CMPXCHG64=y x86: Fix csum_ipv6_magic asm memory clobber x86: Optimize cmpxchg64() at build-time some more
2009-10-04KVM: add support for change_pte mmu notifiersIzik Eidus
this is needed for kvm if it want ksm to directly map pages into its shadow page tables. [marcelo: cast pfn assignment to u64] Signed-off-by: Izik Eidus <ieidus@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-10-04KVM: MMU: add SPTE_HOST_WRITEABLE flag to the shadow ptesIzik Eidus
this flag notify that the host physical page we are pointing to from the spte is write protected, and therefore we cant change its access to be write unless we run get_user_pages(write = 1). (this is needed for change_pte support in kvm) Signed-off-by: Izik Eidus <ieidus@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-10-04KVM: MMU: dont hold pagecount reference for mapped sptes pagesIzik Eidus
When using mmu notifiers, we are allowed to remove the page count reference tooken by get_user_pages to a specific page that is mapped inside the shadow page tables. This is needed so we can balance the pagecount against mapcount checking. (Right now kvm increase the pagecount and does not increase the mapcount when mapping page into shadow page table entry, so when comparing pagecount against mapcount, you have no reliable result.) Signed-off-by: Izik Eidus <ieidus@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-10-04KVM: Prevent overflow in KVM_GET_SUPPORTED_CPUIDAvi Kivity
The number of entries is multiplied by the entry size, which can overflow on 32-bit hosts. Bound the entry count instead. Reported-by: David Wagner <daw@cs.berkeley.edu> Cc: stable@kernel.org Signed-off-by: Avi Kivity <avi@redhat.com>
2009-10-04KVM: VMX: flush TLB with INVEPT on cpu migrationMarcelo Tosatti
It is possible that stale EPTP-tagged mappings are used, if a vcpu migrates to a different pcpu. Set KVM_REQ_TLB_FLUSH in vmx_vcpu_load, when switching pcpus, which will invalidate both VPID and EPT mappings on the next vm-entry. Cc: stable@kernel.org Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-10-04KVM: fix LAPIC timer period overflowAurelien Jarno
Don't overflow when computing the 64-bit period from 32-bit registers. Fixes sourceforge bug #2826486. Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Cc: stable@kernel.org Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-10-04KVM: SVM: Handle tsc in svm_get_msr/svm_set_msr correctlyJoerg Roedel
When running nested we need to touch the l1 guests tsc_offset. Otherwise changes will be lost or a wrong value be read. Cc: stable@kernel.org Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-10-04KVM: SVM: Fix tsc offset adjustment when running nestedJoerg Roedel
When svm_vcpu_load is called while the vcpu is running in guest mode the tsc adjustment made there is lost on the next emulated #vmexit. This causes the tsc running backwards in the guest. This patch fixes the issue by also adjusting the tsc_offset in the emulated hsave area so that it will not get lost. Cc: stable@kernel.org Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-10-03x86, pci: Correct spelling in a commentMarin Mitov
Signed-off-by: Marin Mitov <mitov@issp.bas.bg> Cc: Joerg Roedel <joerg.roedel@amd.com> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com> LKML-Reference: <200910032045.02523.mitov@issp.bas.bg> Signed-off-by: Ingo Molnar <mingo@elte.hu> ======================================================
2009-10-02x86: Simplify bound checks in the MTRR codeArjan van de Ven
The current bound checks for copy_from_user in the MTRR driver are not as obvious as they could be, and gcc agrees with that. This patch simplifies the boundary checks to the point that gcc can now prove to itself that the copy_from_user() is never going past its bounds. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <20090926205150.30797709@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-02x86: EDAC: MCE: Fix MCE decoding callback logicIngo Molnar
Make decoding of MCEs happen only on AMD hardware by registering a non-default callback only on CPU families which support it. While looking at the interaction of decode_mce() with the other MCE code i also noticed a few other things and made the following cleanups/fixes: - Fixed the mce_decode() weak alias - a weak alias is really not good here, it should be a proper callback. A weak alias will be overriden if a piece of code is built into the kernel - not good, obviously. - The patch initializes the callback on AMD family 10h and 11h. - Added the more correct fallback printk of: No support for human readable MCE decoding on this CPU type. Transcribe the message and run it through 'mcelog --ascii' to decode. On CPUs that dont have a decoder. - Made the surrounding code more readable. Note that the callback allows us to have a default fallback - without having to check the CPU versions during the printout itself. When an EDAC module registers itself, it can install the decode-print function. (there's no unregister needed as this is core code.) version -v2 by Borislav Petkov: - add K8 to the set of supported CPUs - always build in edac_mce_amd since we use an early_initcall now - fix checkpatch warnings Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andi Kleen <andi@firstfloor.org> LKML-Reference: <20091001141432.GA11410@aftab> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-01x86: fix csum_ipv6_magic asm memory clobberSamuel Thibault
Just like ip_fast_csum, the assembly snippet in csum_ipv6_magic needs a memory clobber, as it is only passed the address of the buffer, not a memory reference to the buffer itself. This caused failures in Hurd's pfinetv4 when we tried to compile it with gcc-4.3 (bogus checksums). Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Acked-by: "David S. Miller" <davem@davemloft.net> Cc: Andi Kleen <andi@firstfloor.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-01const: constify remaining file_operationsAlexey Dobriyan
[akpm@linux-foundation.org: fix KVM] Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-01x86: Don't leak 64-bit kernel register values to 32-bit processesJan Beulich
While 32-bit processes can't directly access R8...R15, they can gain access to these registers by temporarily switching themselves into 64-bit mode. Therefore, registers not preserved anyway by called C functions (i.e. R8...R11) must be cleared prior to returning to user mode. Signed-off-by: Jan Beulich <jbeulich@novell.com> Cc: <stable@kernel.org> LKML-Reference: <4AC34D73020000780001744A@vpn.id2.novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-01x86, SLUB: Remove unused CONFIG FAST_CMPXCHG_LOCALJaswinder Singh Rajput
Remove unused CONFIG FAST_CMPXCHG_LOCAL from Kconfig. Reported-by: Robert P. J. Day <rpjday@crashcourse.ca> Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com> Acked-by: Christoph Lameter <cl@linux-foundation.org> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Matt Mackall <mpm@selenic.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: "Robert P. J. Day" <rpjday@crashcourse.ca> Cc: linux-mm@kvack.org LKML-Reference: <1253981501.4568.61.camel@ht.satnam> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-01x86: earlyprintk: Fix regression to handle serial,ttySn as 1 argJason Wessel
Commit c953094 ("early_printk: Allow more than one early console") introduced a regression in the parsing of the earlyprintk= kernel arguments. If you specify "earlyprintk=serial,ttyS0,115200" as a kernel argument, the "serial,ttyS" should be parsed as a single argument and not as "serial" and then "ttyS". Also update the documentation to reflect you can specify the ttyS directly without the "serial" argument. Signed-off-by: Jason Wessel <jason.wessel@windriver.com> Cc: Len Brown <lenb@kernel.org> Cc: Greg KH <gregkh@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Johannes Weiner <hannes@cmpxchg.org> LKML-Reference: <4ABB7D5E.6000301@windriver.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-01x86: Don't generate cmpxchg8b_emu if CONFIG_X86_CMPXCHG64=yEric Dumazet
Conditionaly compile cmpxchg8b_emu.o and EXPORT_SYMBOL(cmpxchg8b_emu). This reduces the kernel size a bit. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: John Stultz <johnstul@us.ibm.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <4AC43E7E.1000600@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-01x86: Fix csum_ipv6_magic asm memory clobberSamuel Thibault
Just like ip_fast_csum, the assembly snippet in csum_ipv6_magic needs a memory clobber, as it is only passed the address of the buffer, not a memory reference to the buffer itself. This caused failures in Hurd's pfinetv4 when we tried to compile it with gcc-4.3 (bogus checksums). Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org> Acked-by: David S. Miller <davem@davemloft.net> Cc: Andi Kleen <andi@firstfloor.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-01x86: Optimize cmpxchg64() at build-time some moreLinus Torvalds
Try to avoid the 'alternates()' code when we can statically determine that cmpxchg8b is fine. We already have that CONFIG_x86_CMPXCHG64 (enabled by PAE support), and we could easily also enable it for some of the CPU cases. Note, this patch only adds CMPXCHG8B for the obvious Intel CPU's, not for others. (There was something really messy about cmpxchg8b and clone CPU's, so if you enable it on other CPUs later, do it carefully.) If we avoid that asm-alternative thing when we can assume the instruction exists, we'll generate less support crud, and we'll avoid the whole issue with that extra 'nop' for padding instruction sizes etc. LKML-Reference: <alpine.LFD.2.01.0909301743150.6996@localhost.localdomain> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-30Merge branch 'sched-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched_clock: Fix atomicity/continuity bug by using cmpxchg64() x86: Provide an alternative() based cmpxchg64()
2009-09-30x86: Provide an alternative() based cmpxchg64()Arjan van de Ven
cmpxchg64() today generates, to quote Linus, "barf bag" code. cmpxchg64() is about to get used in the scheduler to fix a bug there, but it's a prerequisite that cmpxchg64() first be made non-sucking. This patch turns cmpxchg64() into an efficient implementation that uses the alternative() mechanism to just use the raw instruction on all modern systems. Note: the fallback is NOT smp safe, just like the current fallback is not SMP safe. (Interested parties with i486 based SMP systems are welcome to submit fix patches for that.) Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> [ fixed asm constraint bug ] Fixed-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: John Stultz <johnstul@us.ibm.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20090930170754.0886ff2e@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-30Revert "x86, mce: do not compile mcelog message on AMD"Linus Torvalds
This reverts commit 22223c9b417be5fd0ab2cf9ad17eb7bd1e19f7b9, as requested by Andi Kleen: "Obviously kernels compiled with AMD support can still run on non AMD systems, so messages like this can never be removed at compile time." Requsted-by: Andi Kleen <andi@firstfloor.org> Cc: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-27const: mark struct vm_struct_operationsAlexey Dobriyan
* mark struct vm_area_struct::vm_ops as const * mark vm_ops in AGP code But leave TTM code alone, something is fishy there with global vm_ops being used. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-27Merge branch 'release' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: ACPI: IA64=y ACPI=n build fix ACPI: Kill overly verbose "power state" log messages ACPI: fix Compaq Evo N800c (Pentium 4m) boot hang regression ACPI: Clarify resource conflict message thinkpad-acpi: fix CONFIG_THINKPAD_ACPI_HOTKEY_POLL build problem
2009-09-27x86: Fix hwpoison code related build failure on 32-bit NUMAQLinus Torvalds
This build failure triggers: In file included from include/linux/suspend.h:8, from arch/x86/kernel/asm-offsets_32.c:11, from arch/x86/kernel/asm-offsets.c:2: include/linux/mm.h:503:2: error: #error SECTIONS_WIDTH+NODES_WIDTH+ZONES_WIDTH > BITS_PER_LONG - NR_PAGEFLAGS Because due to the hwpoison page flag we ran out of page flags on 32-bit. Dont turn on hwpoison on 32-bit NUMA (it's rare in any case). Also clean up the Kconfig dependencies in the generic MM code by introducing ARCH_SUPPORTS_MEMORY_FAILURE. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-27ACPI: fix Compaq Evo N800c (Pentium 4m) boot hang regressionZhao Yakui
Don't disable ARB_DISABLE when the familary ID is 0x0F. http://bugzilla.kernel.org/show_bug.cgi?id=14211 This was a 2.6.31 regression, and so this patch needs to be applied to 2.6.31.stable Signed-off-by: Zhao Yakui <yakui.zhao@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2009-09-26Merge branch 'perf-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf tools: Dont use openat() perf tools: Fix buffer allocation perf tools: .gitignore += perf*.html perf tools: Handle relative paths while loading module symbols perf tools: Fix module symbol loading bug perf_event, x86: Fix 'perf sched record' crashing the machine perf_event: Update PERF_EVENT_FORK header definition perf stat: Fix zero total printouts
2009-09-26Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Remove redundant non-NUMA topology functions x86: early_printk: Protect against using the same device twice x86: Reduce verbosity of "PAT enabled" kernel message x86: Reduce verbosity of "TSC is reliable" message x86: mce: Use safer ways to access MCE registers x86: mce, inject: Use real inject-msg in raise_local x86: mce: Fix thermal throttling message storm x86: mce: Clean up thermal throttling state tracking code x86: split NX setup into separate file to limit unstack-protected code xen: check EFER for NX before setting up GDT mapping x86: Cleanup linker script using new linker script macros. x86: Use section .data.page_aligned for the idt_table. x86: convert to use __HEAD and HEAD_TEXT macros. x86: convert compressed loader to use __HEAD and HEAD_TEXT macros. x86: fix fragile computation of vsyscall address
2009-09-25Merge branch 'x86/asm' into x86/urgentIngo Molnar
Merge reason: The linker script cleanups are ready for upstream. Signed-off-by: Ingo Molnar <mingo@elte.hu>