path: root/arch/x86/include/asm
Age | Commit message | Author
2011-03-16 | Merge branch 'for-2.6.39' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu | Linus Torvalds
* 'for-2.6.39' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
  percpu, x86: Add arch-specific this_cpu_cmpxchg_double() support
  percpu: Generic support for this_cpu_cmpxchg_double()
  alpha: use L1_CACHE_BYTES for cacheline size in the linker script
  percpu: align percpu readmostly subsection to cacheline
Fix up trivial conflict in arch/x86/kernel/vmlinux.lds.S due to the percpu alignment having changed ("x86: Reduce back the alignment of the per-CPU data section")
2011-03-15 | Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip | Linus Torvalds
* 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (27 commits)
  x86: Clean up apic.c and apic.h
  x86: Remove superflous goal definition of tsc_sync
  x86: dt: Correct local apic documentation in device tree bindings
  x86: dt: Cleanup local apic setup
  x86: dt: Fix OLPC=y/INTEL_CE=n build
  rtc: cmos: Add OF bindings
  x86: ce4100: Use OF to setup devices
  x86: ioapic: Add OF bindings for IO_APIC
  x86: dtb: Add generic bus probe
  x86: dtb: Add support for PCI devices backed by dtb nodes
  x86: dtb: Add device tree support for HPET
  x86: dtb: Add early parsing of IO_APIC
  x86: dtb: Add irq domain abstraction
  x86: dtb: Add a device tree for CE4100
  x86: Add device tree support
  x86: e820: Remove conditional early mapping in parse_e820_ext
  x86: OLPC: Make OLPC=n build again
  x86: OLPC: Remove extra OLPC_OPENFIRMWARE_DT indirection
  x86: OLPC: Cleanup config maze completely
  x86: OLPC: Hide OLPC_OPENFIRMWARE config switch
  ...
Fix up conflicts in arch/x86/platform/ce4100/ce4100.c
2011-03-15 | Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip | Linus Torvalds
* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (93 commits)
  x86, tlb, UV: Do small micro-optimization for native_flush_tlb_others()
  x86-64, NUMA: Don't call numa_set_distanc() for all possible node combinations during emulation
  x86-64, NUMA: Don't assume phys node 0 is always online in numa_emulation()
  x86-64, NUMA: Clean up initmem_init()
  x86-64, NUMA: Fix numa_emulation code with node0 without RAM
  x86-64, NUMA: Revert NUMA affine page table allocation
  x86: Work around old gas bug
  x86-64, NUMA: Better explain numa_distance handling
  x86-64, NUMA: Fix distance table handling
  mm: Move early_node_map[] reverse scan helpers under HAVE_MEMBLOCK
  x86-64, NUMA: Fix size of numa_distance array
  x86: Rename e820_table_* to pgt_buf_*
  bootmem: Move __alloc_memory_core_early() to nobootmem.c
  bootmem: Move contig_page_data definition to bootmem.c/nobootmem.c
  bootmem: Separate out CONFIG_NO_BOOTMEM code into nobootmem.c
  x86-64, NUMA: Seperate out numa_alloc_distance() from numa_set_distance()
  x86-64, NUMA: Add proper function comments to global functions
  x86-64, NUMA: Move NUMA emulation into numa_emulation.c
  x86-64, NUMA: Prepare numa_emulation() for moving NUMA emulation into a separate file
  x86-64, NUMA: Do not scan two times for setup_node_bootmem()
  ...
Fix up conflicts in arch/x86/kernel/smpboot.c
2011-03-15 | Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip | Linus Torvalds
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (116 commits)
  x86: Enable forced interrupt threading support
  x86: Mark low level interrupts IRQF_NO_THREAD
  x86: Use generic show_interrupts
  x86: ioapic: Avoid redundant lookup of irq_cfg
  x86: ioapic: Use new move_irq functions
  x86: Use the proper accessors in fixup_irqs()
  x86: ioapic: Use irq_data->state
  x86: ioapic: Simplify irq chip and handler setup
  x86: Cleanup the genirq name space
  genirq: Add chip flag to force mask on suspend
  genirq: Add desc->irq_data accessor
  genirq: Add comments to Kconfig switches
  genirq: Fixup fasteoi handler for oneshot mode
  genirq: Provide forced interrupt threading
  sched: Switch wait_task_inactive to schedule_hrtimeout()
  genirq: Add IRQF_NO_THREAD
  genirq: Allow shared oneshot interrupts
  genirq: Prepare the handling of shared oneshot interrupts
  genirq: Make warning in handle_percpu_event useful
  x86: ioapic: Move trigger defines to io_apic.h
  ...
Fix up trivial(?) conflicts in arch/x86/pci/xen.c due to genirq name space changes clashing with the Xen cleanups. The set_irq_msi() had moved to xen_bind_pirq_msi_to_irq().
2011-03-15 | Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip | Linus Torvalds
* 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: Fix and clean up generic_processor_info()
  x86: Don't copy per_cpu cpuinfo for BSP two times
  x86: Move llc_shared_map out of cpu_info
2011-03-15 | Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip | Linus Torvalds
* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, binutils, xen: Fix another wrong size directive
  x86: Remove dead config option X86_CPU
  x86: Really print supported CPUs if PROCESSOR_SELECT=y
  x86: Fix a bogus unwind annotation in lib/semaphore_32.S
  um, x86-64: Fix UML build after adding CFI annotations to lib/rwsem_64.S
  x86: Remove unused bits from lib/thunk_*.S
  x86: Use {push,pop}_cfi in more places
  x86-64: Add CFI annotations to lib/rwsem_64.S
  x86, asm: Cleanup unnecssary macros in asm-offsets.c
  x86, system.h: Drop unused __SAVE/__RESTORE macros
  x86: Use bitmap library functions
  x86: Partly unify asm-offsets_{32,64}.c
  x86: Reduce back the alignment of the per-CPU data section
2011-03-15 | Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip | Linus Torvalds
* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (62 commits)
  posix-clocks: Check write permissions in posix syscalls
  hrtimer: Remove empty hrtimer_init_hres_timer()
  hrtimer: Update hrtimer->state documentation
  hrtimer: Update base[CLOCK_BOOTTIME].offset correctly
  timers: Export CLOCK_BOOTTIME via the posix timers interface
  timers: Add CLOCK_BOOTTIME hrtimer base
  time: Extend get_xtime_and_monotonic_offset() to also return sleep
  time: Introduce get_monotonic_boottime and ktime_get_boottime
  hrtimers: extend hrtimer base code to handle more then 2 clockids
  ntp: Remove redundant and incorrect parameter check
  mn10300: Switch do_timer() to xtimer_update()
  posix clocks: Introduce dynamic clocks
  posix-timers: Cleanup namespace
  posix-timers: Add support for fd based clocks
  x86: Add clock_adjtime for x86
  posix-timers: Introduce a syscall for clock tuning.
  time: Splitout compat timex accessors
  ntp: Add ADJ_SETOFFSET mode bit
  time: Introduce timekeeping_inject_offset
  posix-timer: Update comment
  ...
Fix up new system-call-related conflicts in
  arch/x86/ia32/ia32entry.S
  arch/x86/include/asm/unistd_32.h
  arch/x86/include/asm/unistd_64.h
  arch/x86/kernel/syscall_table_32.S
(name_to_handle_at()/open_by_handle_at() vs clock_adjtime()), and some due to movement of get_jiffies_64() in:
  kernel/time.c
2011-03-15 | Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip | Linus Torvalds
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (184 commits)
  perf probe: Clean up probe_point_lazy_walker() return value
  tracing: Fix irqoff selftest expanding max buffer
  tracing: Align 4 byte ints together in struct tracer
  tracing: Export trace_set_clr_event()
  tracing: Explain about unstable clock on resume with ring buffer warning
  ftrace/graph: Trace function entry before updating index
  ftrace: Add .ref.text as one of the safe areas to trace
  tracing: Adjust conditional expression latency formatting.
  tracing: Fix event alignment: skb:kfree_skb
  tracing: Fix event alignment: mce:mce_record
  tracing: Fix event alignment: kvm:kvm_hv_hypercall
  tracing: Fix event alignment: module:module_request
  tracing: Fix event alignment: ftrace:context_switch and ftrace:wakeup
  tracing: Remove lock_depth from event entry
  perf header: Stop using 'self'
  perf session: Use evlist/evsel for managing perf.data attributes
  perf top: Don't let events to eat up whole header line
  perf top: Fix events overflow in top command
  ring-buffer: Remove unused #include <linux/trace_irq.h>
  tracing: Add an 'overwrite' trace_option.
  ...
2011-03-15 | Merge branch 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip | Linus Torvalds
* 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  rtmutex: tester: Remove the remaining BKL leftovers
  lockdep/timers: Explain in detail the locking problems del_timer_sync() may cause
  rtmutex: Simplify PI algorithm and make highest prio task get lock
  rwsem: Remove redundant asmregparm annotation
  rwsem: Move duplicate function prototypes to linux/rwsem.h
  rwsem: Unify the duplicate rwsem_is_locked() inlines
  rwsem: Move duplicate init macros and functions to linux/rwsem.h
  rwsem: Move duplicate struct rwsem declaration to linux/rwsem.h
  x86: Cleanup rwsem_count_t typedef
  rwsem: Cleanup includes
  locking: Remove deprecated lock initializers
  cred: Replace deprecated spinlock initialization
  kthread: Replace deprecated spinlock initialization
  xtensa: Replace deprecated spinlock initialization
  um: Replace deprecated spinlock initialization
  sparc: Replace deprecated spinlock initialization
  mips: Replace deprecated spinlock initialization
  cris: Replace deprecated spinlock initialization
  alpha: Replace deprecated spinlock initialization
  rtmutex-tester: Remove BKL tests
2011-03-15 | Merge branch 'core-futexes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip | Linus Torvalds
* 'core-futexes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  arm: Remove bogus comment in futex_atomic_cmpxchg_inatomic()
  futex: Deobfuscate handle_futex_death()
  plist: Add priority list test
  plist: Shrink struct plist_head
  futex,plist: Remove debug lock assignment from plist_node
  futex,plist: Pass the real head of the priority list to plist_del()
  futex: Sanitize futex ops argument types
  futex: Sanitize cmpxchg_futex_value_locked API
  futex: Remove redundant pagefault_disable in futex_atomic_cmpxchg_inatomic()
  futex: Avoid redudant evaluation of task_pid_vnr()
  futex: Update futex_wait_setup comments about locking
2011-03-15 | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 | Linus Torvalds
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (57 commits)
  tidy the trailing symlinks traversal up
  Turn resolution of trailing symlinks iterative everywhere
  simplify link_path_walk() tail
  Make trailing symlink resolution in path_lookupat() iterative
  update nd->inode in __do_follow_link() instead of after do_follow_link()
  pull handling of one pathname component into a helper
  fs: allow AT_EMPTY_PATH in linkat(), limit that to CAP_DAC_READ_SEARCH
  Allow passing O_PATH descriptors via SCM_RIGHTS datagrams
  readlinkat(), fchownat() and fstatat() with empty relative pathnames
  Allow O_PATH for symlinks
  New kind of open files - "location only".
  ext4: Copy fs UUID to superblock
  ext3: Copy fs UUID to superblock.
  vfs: Export file system uuid via /proc/<pid>/mountinfo
  unistd.h: Add new syscalls numbers to asm-generic
  x86: Add new syscalls for x86_64
  x86: Add new syscalls for x86_32
  fs: Remove i_nlink check from file system link callback
  fs: Don't allow to create hardlink for deleted file
  vfs: Add open by file handle support
  ...
2011-03-15 | Merge branch 'for-linus' of git://xenbits.xen.org/people/sstabellini/linux-pvhvm | Linus Torvalds
* 'for-linus' of git://xenbits.xen.org/people/sstabellini/linux-pvhvm:
  xen: suspend: remove xen_hvm_suspend
  xen: suspend: pull pre/post suspend hooks out into suspend_info
  xen: suspend: move arch specific pre/post suspend hooks into generic hooks
  xen: suspend: refactor non-arch specific pre/post suspend hooks
  xen: suspend: add "arch" to pre/post suspend hooks
  xen: suspend: pass extra hypercall argument via suspend_info struct
  xen: suspend: refactor cancellation flag into a structure
  xen: suspend: use HYPERVISOR_suspend for PVHVM case instead of open coding
  xen: switch to new schedop hypercall by default.
  xen: use new schedop interface for suspend
  xen: do not respond to unknown xenstore control requests
  xen: fix compile issue if XEN is enabled but XEN_PVHVM is disabled
  xen: PV on HVM: support PV spinlocks and IPIs
  xen: make the ballon driver work for hvm domains
  xen-blkfront: handle Xen major numbers other than XENVBD
  xen: do not use xen_info on HVM, set pv_info name to "Xen HVM"
  xen: no need to delay xen_setup_shutdown_event for hvm guests anymore
2011-03-15 | Merge branches 'stable/irq.rework' and 'stable/pcifront-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen | Linus Torvalds
* 'stable/irq.rework' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen/irq: Cleanup up the pirq_to_irq for DomU PV PCI passthrough guests as well.
  xen: Use IRQF_FORCE_RESUME
  xen/timer: Missing IRQF_NO_SUSPEND in timer code broke suspend.
  xen: Fix compile error introduced by "switch to new irq_chip functions"
  xen: Switch to new irq_chip functions
  xen: Remove stale irq_chip.end
  xen: events: do not free legacy IRQs
  xen: events: allocate GSIs and dynamic IRQs from separate IRQ ranges.
  xen: events: add xen_allocate_irq_{dynamic, gsi} and xen_free_irq
  xen:events: move find_unbound_irq inside CONFIG_PCI_MSI
  xen: handled remapped IRQs when enabling a pcifront PCI device.
  genirq: Add IRQF_FORCE_RESUME
* 'stable/pcifront-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  pci/xen: When free-ing MSI-X/MSI irq->desc also use generic code.
  pci/xen: Cleanup: convert int** to int[]
  pci/xen: Use xen_allocate_pirq_msi instead of xen_allocate_pirq
  xen-pcifront: Sanity check the MSI/MSI-X values
  xen-pcifront: don't use flush_scheduled_work()
2011-03-15 | Merge branches 'stable/p2m-identity.v4.9.1' and 'stable/e820' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen | Linus Torvalds
* 'stable/p2m-identity.v4.9.1' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen/m2p: Check whether the MFN has IDENTITY_FRAME bit set..
  xen/m2p: No need to catch exceptions when we know that there is no RAM
  xen/debug: WARN_ON when identity PFN has no _PAGE_IOMAP flag set.
  xen/debugfs: Add 'p2m' file for printing out the P2M layout.
  xen/setup: Set identity mapping for non-RAM E820 and E820 gaps.
  xen/mmu: WARN_ON when racing to swap middle leaf.
  xen/mmu: Set _PAGE_IOMAP if PFN is an identity PFN.
  xen/mmu: Add the notion of identity (1-1) mapping.
  xen: Mark all initial reserved pages for the balloon as INVALID_P2M_ENTRY.
* 'stable/e820' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen/e820: Don't mark balloon memory as E820_UNUSABLE when running as guest and fix overflow.
  xen/setup: Inhibit resource API from using System RAM E820 gaps as PCI mem gaps.
2011-03-15 | Merge commit 'v2.6.38' into x86/mm | Ingo Molnar
Conflicts:
  arch/x86/mm/numa_64.c
Merge reason: Resolve the conflict, update the branch to .38.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-15 | x86: Add new syscalls for x86_64 | Aneesh Kumar K.V
This patch adds new syscalls to x86_64.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-15 | x86: Add new syscalls for x86_32 | Aneesh Kumar K.V
This patch adds new syscalls to x86_32.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-14 | Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip | Linus Torvalds
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: ce4100: Set pci ops via callback instead of module init
  x86/mm: Fix pgd_lock deadlock
  x86/mm: Handle mm_fault_error() in kernel space
  x86: Don't check for BIOS corruption in first 64K when there's no need to
2011-03-14 | xen/m2p: Check whether the MFN has IDENTITY_FRAME bit set.. | Stefano Stabellini
If there is no proper PFN value in the M2P for the MFN (so we get 0xFFFFF.. or 0x55555, or 0x0), we should consult the M2P override to see if there is an entry for this. [Note: we also consult the M2P override if the MFN is past our machine_to_phys size].

We consult the P2M with the PFN. In case the returned MFN is one of the special values: 0xFFF.., 0x5555 (which signify that the MFN can be either "missing" or it belongs to DOMID_IO) or the p2m(m2p(mfn)) != mfn, we check the M2P override. If we fail the M2P override check, we reset the PFN value to INVALID_P2M_ENTRY.

Next we try to find the MFN in the P2M using the MFN value (not the PFN value) and, if found, we know that this MFN is an identity value and return it as such.

Otherwise we have exhausted all the possibilities and we return the PFN, which at this stage can either be a real PFN value found in the machine_to_phys.. array, or the INVALID_P2M_ENTRY value.

[v1: Added Review-by tag]
Reviewed-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
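
The lookup order described in this commit message can be sketched as a small user-space model. This is only an illustration under stated assumptions, not the kernel's mfn_to_pfn(): the toy_* tables and helpers are hypothetical stand-ins for the machine_to_phys array, the P2M and the M2P override, and placing IDENTITY_FRAME_BIT at the second-highest bit is an assumption made for the example.

```c
#include <assert.h>
#include <limits.h>

#define INVALID_P2M_ENTRY  (~0UL)
/* Assumed bit position, high enough that it cannot collide with a small frame number. */
#define IDENTITY_FRAME_BIT (1UL << (sizeof(unsigned long) * CHAR_BIT - 2))
#define IDENTITY_FRAME(m)  ((m) | IDENTITY_FRAME_BIT)

#define TOY_NR_FRAMES 8UL

/* Hypothetical M2P: index is an MFN, value is the PFN recorded for that frame. */
static const unsigned long toy_m2p[TOY_NR_FRAMES] = {
    ~0UL, ~0UL, ~0UL, ~0UL, 1, 0, 1 /* stale entry */, ~0UL
};

/* Hypothetical P2M: index is a PFN, value is an MFN, an identity marker, or "missing". */
static unsigned long toy_get_p2m(unsigned long pfn)
{
    static const unsigned long p2m[TOY_NR_FRAMES] = {
        5, 4, ~0UL, IDENTITY_FRAME(3UL), ~0UL, ~0UL, ~0UL, ~0UL
    };
    return pfn < TOY_NR_FRAMES ? p2m[pfn] : INVALID_P2M_ENTRY;
}

/* Hypothetical M2P override: one foreign frame, MFN 7, mapped at PFN 2. */
static unsigned long toy_m2p_override(unsigned long mfn)
{
    return mfn == 7 ? 2 : INVALID_P2M_ENTRY;
}

static unsigned long toy_mfn_to_pfn(unsigned long mfn)
{
    /* An MFN past the end of the M2P has no entry to read at all. */
    unsigned long pfn = mfn < TOY_NR_FRAMES ? toy_m2p[mfn] : INVALID_P2M_ENTRY;

    /* No usable PFN, or the PFN does not map back to this MFN: ask the override. */
    if (pfn == INVALID_P2M_ENTRY || toy_get_p2m(pfn) != mfn)
        pfn = toy_m2p_override(mfn);  /* INVALID_P2M_ENTRY when nothing is found */

    /* Last resort: look the MFN itself up in the P2M; a match means identity. */
    if (pfn == INVALID_P2M_ENTRY && toy_get_p2m(mfn) == IDENTITY_FRAME(mfn))
        pfn = mfn;

    return pfn;
}
```

In this model, MFN 5 resolves through the M2P, MFN 7 through the override, MFN 3 through the identity check, and MFN 6 (whose M2P entry does not map back) ends up as INVALID_P2M_ENTRY.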
2011-03-14 | xen/m2p: No need to catch exceptions when we know that there is no RAM | Konrad Rzeszutek Wilk
.. beyond what we think is the end of memory. However there might be more System RAM - but assigned to a guest. Hence jump to the M2P override check and consult.
[v1: Added Review-by tag]
Reviewed-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-03-14 | xen/debugfs: Add 'p2m' file for printing out the P2M layout. | Konrad Rzeszutek Wilk
We walk over the whole P2M tree and construct a simplified view of which PFN regions belong to what level and what type they are. Only enabled if CONFIG_XEN_DEBUG_FS is set.
[v2: UNKN->UNKNOWN, use uninitialized_var]
[v3: Rebased on top of mmu->p2m code split]
[v4: Fixed the else if]
Reviewed-by: Ian Campbell <Ian.Campbell@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-03-14 | xen/mmu: Add the notion of identity (1-1) mapping. | Konrad Rzeszutek Wilk
Our P2M tree structure is three-level. On the leaf nodes we set the Machine Frame Number (MFN) of the PFN. What this means is that when one does: pfn_to_mfn(pfn), which is used when creating PTE entries, you get the real MFN of the hardware. When Xen sets up a guest it initially populates an array which has descending (or ascending) MFN values, as so:

 idx: 0,      1,      2
 [0x290F, 0x290E, 0x290D, ..]

so pfn_to_mfn(2)==0x290D. If you start, restart many guests that list starts looking quite random.

We graft this structure on our P2M tree structure and stick in those MFN in the leafs. But for all other leaf entries, or for the top root, or middle one, for which there is a void entry, we assume it is "missing". So pfn_to_mfn(0xc0000)=INVALID_P2M_ENTRY.

We add the possibility of setting 1-1 mappings on certain regions, so that:

 pfn_to_mfn(0xc0000)=0xc0000

The benefit of this is that, for non-RAM regions (think PCI BARs, or ACPI spaces), we can create mappings easily b/c we get the PFN value to match the MFN.

For this to work efficiently we introduce one new page p2m_identity and allocate (via reserved_brk) any other pages we need to cover the sides (1GB or 4MB boundary violations). All entries in p2m_identity are set to INVALID_P2M_ENTRY type (Xen toolstack only recognizes that and MFNs, no other fancy value).

On lookup we spot that the entry points to p2m_identity and return the identity value instead of dereferencing and returning INVALID_P2M_ENTRY. If the entry points to an allocated page, we just proceed as before and return the PFN. If the PFN has IDENTITY_FRAME_BIT set we unmask that in appropriate functions (pfn_to_mfn).

The reason for having the IDENTITY_FRAME_BIT instead of just returning the PFN is that we could find ourselves where pfn_to_mfn(pfn)==pfn for a non-identity pfn. To protect ourselves against that, we elect to set (and get) the IDENTITY_FRAME_BIT on all identity mapped PFNs.

This simplistic diagram is used to explain the more subtle piece of code. There is also a diagram of the P2M at the end that can help. Imagine your E820 looking as so:

                   1GB                                           2GB
/-------------------+---------\/----\         /----------\    /---+-----\
| System RAM        | Sys RAM ||ACPI|         | reserved |    | Sys RAM |
\-------------------+---------/\----/         \----------/    \---+-----/
                              ^- 1029MB                       ^- 2001MB

[1029MB = 263424 (0x40500), 2001MB = 512256 (0x7D100), 2048MB = 524288 (0x80000)]

And dom0_mem=max:3GB,1GB is passed in to the guest, meaning memory past 1GB is actually not present (would have to kick the balloon driver to put it in).

When we are told to set the PFNs for identity mapping (see patch: "xen/setup: Set identity mapping for non-RAM E820 and E820 gaps.") we pass in the start of the PFN and the end PFN (263424 and 512256 respectively). The first step is to reserve_brk a top leaf page if the p2m[1] is missing. The top leaf page covers 512^2 of page estate (1GB) and in case the start or end PFN is not aligned on 512^2*PAGE_SIZE (1GB) we loop on aligned 1GB PFNs from start pfn to end pfn. We reserve_brk top leaf pages if they are missing (means they point to p2m_mid_missing).

With the E820 example above, 263424 is not 1GB aligned so we allocate a reserve_brk page which will cover the PFNs estate from 0x40000 to 0x80000. Each entry in the allocated page is "missing" (points to p2m_missing).

Next stage is to determine if we need to do a more granular boundary check on the 4MB (or 2MB depending on architecture) off the start and end pfn's. We check if the start pfn and end pfn violate that boundary check, and if so reserve_brk a middle (p2m[x][y]) leaf page. This way we have a much finer granularity of setting which PFNs are missing and which ones are identity. In our example 263424 and 512256 both fail the check so we reserve_brk two pages. Populate them with INVALID_P2M_ENTRY (so they both have "missing" values) and assign them to p2m[1][2] and p2m[1][488] respectively.

At this point we would at minimum reserve_brk one page, but could be up to three. Each call to set_phys_range_identity has at maximum a three page cost. If we were to query the P2M at this stage, all those entries from start PFN through end PFN (so 1029MB -> 2001MB) would return INVALID_P2M_ENTRY ("missing").

The next step is to walk from the start pfn to the end pfn setting the IDENTITY_FRAME_BIT on each PFN. This is done in 'set_phys_range_identity'. If we find that the middle leaf is pointing to p2m_missing we can swap it over to p2m_identity - this way covering 4MB (or 2MB) PFN space. At this point we do not need to worry about boundary alignment (so no need to reserve_brk a middle page, figure out which PFNs are "missing" and which ones are identity), as that has been done earlier. If we find that the middle leaf is not occupied by p2m_identity or p2m_missing, we dereference that page (which covers 512 PFNs) and set the appropriate PFN with IDENTITY_FRAME_BIT. In our example 263424 and 512256 end up there, and we set from p2m[1][2][256->511] and p2m[1][488][0->256] with IDENTITY_FRAME_BIT set.

All other regions that are void (or not filled) either point to p2m_missing (considered missing) or have the default value of INVALID_P2M_ENTRY (also considered missing). In our case, p2m[1][2][0->255] and p2m[1][488][257->511] contain the INVALID_P2M_ENTRY value and are considered "missing."

This is what the p2m ends up looking (for the E820 above) with this fabulous drawing:

  p2m         /--------------\
/-----\       | &mfn_list[0],|                           /-----------------\
|  0  |------>| &mfn_list[1],|    /---------------\      | ~0, ~0, ..      |
|-----|       |  ..., ~0, ~0 |    | ~0, ~0, [x]---+----->| IDENTITY [@256] |
|  1  |---\   \--------------/    | [p2m_identity]+\     | IDENTITY [@257] |
|-----|    \                      | [p2m_identity]+\\    | ....            |
|  2  |--\  \-------------------->|  ...          | \\   \-----------------/
|-----|   \                       \---------------/  \\
|  3  |\   \                                          \\  p2m_identity
|-----| \   \-------------------->/---------------\   /-----------------\
| ..  +->+                        | [p2m_identity]+-->| ~0, ~0, ~0, ... |
\-----/ /                         | [p2m_identity]+-->| ..., ~0         |
       / /---------------\        | ....          |   \-----------------/
      /  | IDENTITY[@0]  |      /-+-[x], ~0, ~0.. |
     /   | IDENTITY[@256]|<----/  \---------------/
    /    | ~0, ~0, ....  |
    |    \---------------/
    |
    p2m_missing             p2m_missing
/------------------\     /------------\
| [p2m_mid_missing]+---->| ~0, ~0, ~0 |
| [p2m_mid_missing]+---->| ..., ~0    |
\------------------/     \------------/

where ~0 is INVALID_P2M_ENTRY. IDENTITY is (PFN | IDENTITY_BIT)

Reviewed-by: Ian Campbell <ian.campbell@citrix.com>
[v5: Changed code to use ranges, added ASCII art]
[v6: Rebased on top of xen->p2m code split]
[v4: Squished patches in just this one]
[v7: Added RESERVE_BRK for potentially allocated pages]
[v8: Fixed alignment problem]
[v9: Changed 1<<3X to 1<<BITS_PER_LONG-X]
[v10: Copied git commit description in the p2m code + Add Review tag]
[v11: Title had '2-1' - should be '1-1' mapping]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
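
The index arithmetic behind the p2m[1][2] and p2m[1][488] figures in this commit message can be checked with a short sketch. It assumes the 64-bit layout the text implies (4KB pages, 512 entries per page at every level); the names p2m_slot, p2m_slot_of and mb_to_pfn are hypothetical and exist only for this example.

```c
#include <assert.h>

#define P2M_PER_PAGE 512UL  /* assumed: 4096-byte page / 8-byte entry */
#define PAGES_PER_MB 256UL  /* 1MB / 4KB */

/* Position of a PFN in the three-level tree: p2m[top][mid][idx]. */
struct p2m_slot {
    unsigned long top;
    unsigned long mid;
    unsigned long idx;
};

/* Convert a megabyte boundary into a page frame number. */
static unsigned long mb_to_pfn(unsigned long mb)
{
    return mb * PAGES_PER_MB;
}

/* Split a PFN into its three indices; each top entry spans 512*512 PFNs (1GB). */
static struct p2m_slot p2m_slot_of(unsigned long pfn)
{
    struct p2m_slot s;

    s.top = pfn / (P2M_PER_PAGE * P2M_PER_PAGE);
    s.mid = (pfn / P2M_PER_PAGE) % P2M_PER_PAGE;
    s.idx = pfn % P2M_PER_PAGE;
    return s;
}
```

Under these assumptions, 1029MB is PFN 263424 and lands at p2m[1][2][256], while 2001MB is PFN 512256 and lands at p2m[1][488][256]. Neither PFN is a multiple of 512, which is why both middle leaves in the example need their own reserve_brk page.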
2011-03-14 | x86: ce4100: Set pci ops via callback instead of module init | Sebastian Andrzej Siewior
Setting the pci ops on subsys initcall unconditionally will break multi platform kernels on anything except ce4100. Use x86_init.pci.init ops to call this only on real ce4100 platforms.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: sodaville@linutronix.de
LKML-Reference: <20110314093340.GA21026@www.tglx.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-03-11 | futex: Sanitize futex ops argument types | Michel Lespinasse
Change futex_atomic_op_inuser and futex_atomic_cmpxchg_inatomic prototypes to use u32 types for the futex as this is the data type the futex core code uses all over the place.
Signed-off-by: Michel Lespinasse <walken@google.com>
Cc: Darren Hart <darren@dvhart.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: David Howells <dhowells@redhat.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <20110311025058.GD26122@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-03-11 | futex: Sanitize cmpxchg_futex_value_locked API | Michel Lespinasse
The cmpxchg_futex_value_locked API was funny in that it returned either the original, user-exposed futex value OR an error code such as -EFAULT. This was confusing at best, and could be a source of livelocks in places that retry the cmpxchg_futex_value_locked after trying to fix the issue by running fault_in_user_writeable().

This change makes the cmpxchg_futex_value_locked API more similar to the get_futex_value_locked one, returning an error code and updating the original value through a reference argument.

Signed-off-by: Michel Lespinasse <walken@google.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com> [tile]
Acked-by: Tony Luck <tony.luck@intel.com> [ia64]
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michal Simek <monstr@monstr.eu> [microblaze]
Acked-by: David Howells <dhowells@redhat.com> [frv]
Cc: Darren Hart <darren@dvhart.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <20110311024851.GC26122@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
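
The ambiguity this commit removes can be illustrated with a user-space sketch of the two calling conventions. The toy_* functions are hypothetical and are not the kernel implementation: they use plain, non-atomic loads and stores, and a NULL pointer stands in for a faulting user address.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Old convention: the return value is either the futex word or an error code. */
static int toy_cmpxchg_old(uint32_t *uaddr, uint32_t oldval, uint32_t newval)
{
    uint32_t cur;

    if (uaddr == NULL)
        return -EFAULT;          /* simulated fault */
    cur = *uaddr;
    if (cur == oldval)
        *uaddr = newval;
    /* A futex word holding (uint32_t)-EFAULT looks exactly like a fault here. */
    return (int)cur;
}

/* New convention: the status is returned, the observed value goes through curval. */
static int toy_cmpxchg_new(uint32_t *curval, uint32_t *uaddr,
                           uint32_t oldval, uint32_t newval)
{
    if (uaddr == NULL)
        return -EFAULT;          /* simulated fault, *curval left untouched */
    *curval = *uaddr;
    if (*curval == oldval)
        *uaddr = newval;
    return 0;
}
```

With the old shape a caller cannot tell a futex word whose bit pattern equals -EFAULT from a real fault, which is how a retry loop could spin forever. With the new shape the return value is always a status and the observed word is reported separately.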
2011-03-11 | x86: Clean up apic.c and apic.h | Henrik Kretzschmar
This patch moves some functions and variables into init sections, makes a function static and removes some lines of cruft.
Signed-off-by: Henrik Kretzschmar <henne@nachtwindheim.de>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
LKML-Reference: <1299826956-8607-2-git-send-email-henne@nachtwindheim.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-10 | Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip | Linus Torvalds
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, UV: Initialize the broadcast assist unit base destination node id properly
  x86, numa: Fix numa_emulation code with memory-less node0
  x86, build: Make sure mkpiggy fails on read error
2011-03-09 | x86, UV: Initialize the broadcast assist unit base destination node id properly | Cliff Wickman
The BAU's initialization of the broadcast description header is lacking the coherence domain (high bits) in the nasid. This causes a catastrophic system failure when running on a system with multiple coherence domains.
Signed-off-by: Cliff Wickman <cpw@sgi.com>
LKML-Reference: <E1PxKBB-0005F0-3U@eag09.americas.sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-09 | Merge commit 'v2.6.38-rc8' into x86/asm | Ingo Molnar
Merge reason: Update with the latest fixes.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-08 | Merge commit 'v2.6.38-rc8' into perf/core | Ingo Molnar
Merge reason: Merge latest fixes.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-05 | Merge branch 'x86-mm' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc into x86/mm | Ingo Molnar
2011-03-05 | perf: Avoid the percore allocations if the CPU is not HT capable | Lin Ming
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1299119690-13991-5-git-send-email-ming.m.lin@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-04 | perf: Add support for supplementary event registers | Andi Kleen
Change logs against Andi's original version:
- Extends perf_event_attr:config to config{,1,2} (Peter Zijlstra)
- Fixed a major event scheduling issue. There cannot be a ref++ on an event that has already done ref++ once and without calling put_constraint() in between. (Stephane Eranian)
- Use thread_cpumask for percore allocation. (Lin Ming)
- Use MSR names in the extra reg lists. (Lin Ming)
- Remove redundant "c = NULL" in intel_percore_constraints
- Fix comment of perf_event_attr::config1

Intel Nehalem/Westmere have a special OFFCORE_RESPONSE event that can be used to monitor any offcore accesses from a core. This is a very useful event for various tunings, and it's also needed to implement the generic LLC-* events correctly.

Unfortunately this event requires programming a mask in a separate register. And worse, this separate register is per core, not per CPU thread.

This patch:
- Teaches perf_events that OFFCORE_RESPONSE needs extra parameters. The extra parameters are passed by user space in the perf_event_attr::config1 field.
- Adds support to the Intel perf_event core to schedule per core resources. This adds fairly generic infrastructure that can be also used for other per core resources. The basic code is patterned after the similar AMD northbridge constraints code.

Thanks to Stephane Eranian who pointed out some problems in the original version and suggested improvements.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1299119690-13991-2-git-send-email-ming.m.lin@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-04 | x86-64, NUMA: Revert NUMA affine page table allocation | Tejun Heo
This patch reverts NUMA affine page table allocation added by commit 1411e0ec31 (x86-64, numa: Put pgtable to local node memory).

The commit made an undocumented change where the kernel linear mapping strictly follows the intersection of the e820 memory map and the NUMA configuration. If the physical memory configuration has holes or NUMA nodes are not properly aligned, this leads to using an unnecessarily small mapping size, which in turn increases TLB pressure. For details,

 http://thread.gmane.org/gmane.linux.kernel/1104672

Patches to fix the problem have been proposed, but the underlying code needs more cleanup and the approach itself seems a bit heavy handed, so it has been determined to revert the feature for now and come back to it in the next development cycle.

 http://thread.gmane.org/gmane.linux.kernel/1105959

As init_memory_mapping_high() callsites have been consolidated since the commit, reverting is done manually. Also, the RED-PEN comment in arch/x86/mm/init.c is not restored as the problem no longer exists with memblock based top-down early memory allocation.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
2011-03-03  xen: Mark all initial reserved pages for the balloon as INVALID_P2M_ENTRY.  Konrad Rzeszutek Wilk
With this patch, we diligently set regions that will be used by the balloon driver to be INVALID_P2M_ENTRY and under the ownership of the balloon driver. We are OK using the __set_phys_to_machine as we do not expect to be allocating any P2M middle or entries pages. The set_phys_to_machine has the side-effect of potentially allocating new pages and we do not want that at this stage.

We can do this because xen_build_mfn_list_list will have already allocated all such pages up to xen_max_p2m_pfn.

We also move the check for auto translated physmap down the stack so it is present in __set_phys_to_machine.

[v2: Rebased with mmu->p2m code split]

Reviewed-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-03-03  x86: Work around old gas bug  Jan Beulich
Add extra parentheses around a couple of definitions introduced by "x86: Cleanup vector usage" and used in assembly macro arguments, and remove spaces. Without that, old (2.16.1) gas would see more macro arguments than were actually specified.

Reported-and-tested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: Shaohua Li <shaohua.li@intel.com>
LKML-Reference: <4D6F81B10200007800034B0B@vpn.id2.novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-02  Merge branch 'idle-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-idle-2.6  Linus Torvalds
* 'idle-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-idle-2.6:
  intel_idle: disable Atom/Lincroft HW C-state auto-demotion
  intel_idle: disable NHM/WSM HW C-state auto-demotion
2011-02-28  x86: Use {push,pop}_cfi in more places  Jan Beulich
Cleaning up and shortening code...

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: Alexander van Heukelum <heukelum@fastmail.fm>
LKML-Reference: <4D6BD35002000078000341DA@vpn.id2.novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-02-28  x86: Use u32 instead of long to set reset vector back to 0  Don Zickus
A customer of ours complained that when setting the reset vector back to 0, it trashed other data and hung their box. They noticed that when only 4 bytes were set to 0 instead of 8, everything worked correctly.

Matthew pointed out:

| We're supposed to be resetting trampoline_phys_low and
| trampoline_phys_high here, which are two 16-bit values.
| Writing 64 bits is definitely going to overwrite space
| that we're not supposed to be touching.

So limit the area modified to u32.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Acked-by: Matthew Garrett <mjg@redhat.com>
Cc: <stable@kernel.org>
LKML-Reference: <1297139100-424-1-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
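The effect of the store width can be shown with a small user-space sketch. The struct layout below is a hypothetical stand-in (only the two field names come from the commit message; `neighbour` and both helper functions are invented for illustration), and `memcpy` is used in place of the kernel's pointer store so that the example stays well-defined C.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: two 16-bit fields followed by unrelated data. */
struct low_mem {
    uint16_t trampoline_phys_low;
    uint16_t trampoline_phys_high;
    uint32_t neighbour;            /* stands in for data that must survive */
};

/* Clears 8 bytes: both 16-bit fields AND the following 4 bytes. */
static void clear_vector_64(struct low_mem *m)
{
    uint64_t zero = 0;
    assert(sizeof(*m) >= sizeof(zero)); /* keep the demo store in bounds */
    memcpy(m, &zero, sizeof(zero));
}

/* Clears 4 bytes: exactly the two 16-bit fields. */
static void clear_vector_32(struct low_mem *m)
{
    uint32_t zero = 0;
    memcpy(m, &zero, sizeof(zero));
}
```

With the 32-bit variant `neighbour` keeps its value; with the 64-bit variant it is zeroed along with the two fields, which corresponds to the corruption described in the report.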
2011-02-28  percpu, x86: Add arch-specific this_cpu_cmpxchg_double() support  Christoph Lameter
Support this_cpu_cmpxchg_double() using the cmpxchg16b and cmpxchg8b instructions.

-tj: s/percpu_cmpxchg16b/percpu_cmpxchg16b_double/ for consistency and other cosmetic changes.

Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
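As a rough illustration of what a double-word compare-and-exchange does, the sketch below models the semantics in plain C. It is deliberately not atomic and is not the kernel macro: `struct pair`, `cmpxchg_double_model()` and its boolean return are assumptions for illustration. The cmpxchg8b/cmpxchg16b instructions perform the compare and the conditional store of both words as one instruction, which this model does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Two adjacent machine words that are always updated together. */
struct pair {
    uintptr_t first;
    uintptr_t second;
};

/*
 * Non-atomic model: replace both words with (new1, new2) only if they
 * currently equal (old1, old2). Returns true when the swap happened.
 */
static bool cmpxchg_double_model(struct pair *p,
                                 uintptr_t old1, uintptr_t old2,
                                 uintptr_t new1, uintptr_t new2)
{
    assert(p != NULL);
    if (p->first != old1 || p->second != old2)
        return false;   /* someone changed at least one word: leave as-is */
    p->first = new1;
    p->second = new2;
    return true;
}
```

A typical caller would read both words, compute the new pair, and retry if the model returns false.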
2011-02-25  Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip  Linus Torvalds
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86 quirk: Fix polarity for IRQ0 pin2 override on SB800 systems
  x86/mrst: Fix apb timer rating when lapic timer is used
  x86: Fix reboot problem on VersaLogic Menlow boards
2011-02-25  xen: switch to new schedop hypercall by default.  Ian Campbell
Rename old interface to sched_op_compat and rename sched_op_new to simply sched_op.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-02-25  xen: use new schedop interface for suspend  Ian Campbell
Take the opportunity to comment on the semantics of the PV guest suspend hypercall arguments.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-02-25  x86: dt: Cleanup local apic setup  Thomas Gleixner
Up to now we force enable the local apic in the devicetree setup unconditionally and set smp_found_config unconditionally to 1 when a devicetree blob is available. This breaks when the local apic is disabled in the Kconfig.

Make it consistent by initializing the device tree explicitly before smp_get_config() so a non lapic configuration could be used as well.

To be functional that would require implementing PIT as an interrupt host, but the only user of this code until now is ce4100, which requires apics to be available. So we leave this up to those who need it.

Tested-by: Sebastian Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-02-24  x86 quirk: Fix polarity for IRQ0 pin2 override on SB800 systems  Andreas Herrmann
On some SB800 systems polarity for IOAPIC pin2 is wrongly specified as low active by BIOS. This caused system hangs after resume from S3 when HPET was used in one-shot mode on such systems because a timer interrupt was missed (HPET signal is high active).

For more details see:

  http://marc.info/?l=linux-kernel&m=129623757413868

Tested-by: Manoj Iyer <manoj.iyer@canonical.com>
Tested-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: stable@kernel.org # 37.x, 32.x
LKML-Reference: <20110224145346.GD3658@alberich.amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-02-24  x86: Rename e820_table_* to pgt_buf_*  Yinghai Lu
e820_table_{start|end|top}, which are used to buffer page table allocation during early boot, are now derived from memblock and don't have much to do with e820. Change the names so that they reflect what they're used for.

This patch doesn't introduce any behavior change.

-v2: Ingo found that the earlier patch "x86: Use early pre-allocated page table buffer top-down" caused a crash on 32bit and needed to be dropped. This patch was updated to reflect the change.

-tj: Updated commit description.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
2011-02-24  x86: dt: Fix OLPC=y/INTEL_CE=n build  Sebastian Andrzej Siewior
Both OLPC and CE4100 activate CONFIG_OF. OLPC uses PROMTREE while CE uses FLATTREE. Compiling for OLPC only breaks due to missing flat tree functions and variables.

Use proper wrappers and provide an empty x86_flattree_get_config() inline so OF=y FLATTREE=n builds and works.

[ tglx: Make it work with HPET_TIMER=n and make a function static ]

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-02-23  x86: ioapic: Add OF bindings for IO_APIC  Sebastian Andrzej Siewior
ioapic_xlate provides a translation from the information in the device tree to ioapic related information. This includes:

- obtaining hw irq which is the vector number "=> pin number + gsi"
- obtaining type (level/edge/..)
- programming this information into ioapic

ioapic_add_ofnode adds an irq_domain based on information from the device tree. This information (irq_domain) is required in order to map a device to its proper interrupt controller.

[ tglx: Adapted to the io_apic changes, which let us move that whole code to devicetree.c ]

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Dirk Brandewie <dirk.brandewie@gmail.com>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Cc: sodaville@linutronix.de
Cc: devicetree-discuss@lists.ozlabs.org
LKML-Reference: <1298405266-1624-10-git-send-email-bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-02-23  x86: dtb: Add support for PCI devices backed by dtb nodes  Sebastian Andrzej Siewior
x86_of_pci_init() does two things:

- it provides a generic irq enable and disable function. enable queries the device tree for the interrupt information, calls ->xlate on the irq host and updates the pci->irq information for the device.
- it walks through PCI bus(es) in the device tree and adds its children (device) nodes to appropriate pci_dev nodes in kernel. So the dtb node information is available at probe time of the PCI device.

Adding a PCI bus based on the information in the device tree is currently not supported. Right now direct access via ioports is used.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Tested-by: Dirk Brandewie <dirk.brandewie@gmail.com>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Cc: sodaville@linutronix.de
Cc: devicetree-discuss@lists.ozlabs.org
LKML-Reference: <1298405266-1624-8-git-send-email-bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-02-23  x86: dtb: Add early parsing of IO_APIC  Sebastian Andrzej Siewior
APIC and IO_APIC have to be added to the system early because native_init_IRQ() requires it. In order to obtain the address of the ioapic the device tree has to be unflattened so of_address_to_resource() works.

The device tree is relocated to ensure it is always covered by the kernel mapping. That way the boot loader does not have to make any assumptions about kernel's memory layout.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Cc: sodaville@linutronix.de
Cc: devicetree-discuss@lists.ozlabs.org
Cc: Dirk Brandewie <dirk.brandewie@gmail.com>
LKML-Reference: <1298405266-1624-6-git-send-email-bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>