path: root/arch/x86_64/mm
Age | Commit message | Author
2006-02-04 | [PATCH] x86_64: minor ordering correction to dump_pagetable() | Jan Beulich
Checking of the validity of pointers should be consistently done before dereferencing the pointer. Signed-Off-By: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-04 | [PATCH] x86_64: Do more checking in the SRAT header code | Andi Kleen
- Check if the processor/memory affinity entries are long enough according to the ACPI 3.0 spec. - Ignore memory affinity entries that define a zero length region. All based on BIOS issues found in the field @) Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
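Illustration for the commit above (not part of the patch): a minimal sketch of the two checks it describes, rejecting entries that are too short and ignoring zero-length regions. The struct layout, the function name and the `min_len` parameter are hypothetical simplifications, not the kernel's ACPI definitions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an SRAT memory affinity entry.
 * Only the fields needed by the two checks are kept. */
struct mem_affinity {
    uint8_t  length;        /* entry length as reported by the BIOS */
    uint64_t base;          /* start of the described memory range */
    uint64_t range_length;  /* size of the range in bytes */
};

/* Return true when the entry should be used, false when it should be
 * ignored. min_len is the minimum entry length the caller expects. */
static bool mem_affinity_usable(const struct mem_affinity *ma, size_t min_len)
{
    if (ma == NULL)
        return false;
    if (ma->length < min_len)   /* entry shorter than required */
        return false;
    if (ma->range_length == 0)  /* zero-length region: nothing to describe */
        return false;
    return true;
}
```

A caller would skip any entry for which this returns false instead of trusting the BIOS-provided data.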
2006-02-04 | [PATCH] x86_64: Clear more state when ignoring empty node in SRAT parsing | Andi Kleen
Might fix boot failures on systems with empty PXMs in SRAT Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-16 | [PATCH] x86_64: add x86-64 support for memory hot-add | Matt Tolentino
Add x86-64 specific memory hot-add functions, Kconfig options, and runtime kernel page table update functions to make hot-add usable on x86-64 machines. Also, fixup the nefarious conditional locking and exports pointed out by Andi. Tested on Intel and IBM x86-64 memory hot-add capable systems. Signed-off-by: Matt Tolentino <matthew.e.tolentino@intel.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-16 | [PATCH] x86_64: Flexmap for 32bit and randomized mappings for 64bit | Andi Kleen
Another try at this. For 32bit follow the 32bit implementation from Ingo - mappings are growing down from the end of stack now and vary randomly by 1GB. Randomized mappings for 64bit just vary the normal mmap break by 1TB. I didn't bother implementing full flex mmap for 64bit because it shouldn't be needed there. Cc: mingo@elte.hu Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-16 | x86-64: fix initrd freeing | Linus Torvalds
The comparison of the initrd start address against "&_end" is unnecessary and incorrect. Make it match the x86 code that just compares the passed-in arguments. Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-16 | [PATCH] x86_64: Don't try to put kernel page tables beyond ZONE_DMA32. | Andi Kleen
For not fully explained reasons it broke mem=... on several setups. Also minor cleanup. Cc: axboe@suse.de Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11 | [PATCH] x86_64: Allow kernel page tables up to the end of memory | Andi Kleen
Previously they would be only allocated before the kernel text at 1MB. This limited the maximum supported memory to 128GB. Now allow the e820 allocator to put them everywhere. Try to put them beyond any DMA zones to avoid filling them up. This should free some GFP_DMA memory compared to earlier kernels. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11 | [PATCH] x86_64: Move NUMA page_to_pfn/pfn_to_page functions out of line | Andi Kleen
Saves about 18K of .text in defconfig. There would be more optimization potential, but that's for later. Suggestion originally from Bill Irwin. Fix from Andy Whitcroft. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11 | [PATCH] x86_64: Node local pda take 2 -- cpu_pda preparation | Ravikiran G Thirumalai
Helper patch to change cpu_pda users to use macros to access cpu_pda instead of the cpu_pda[] array. Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org> Signed-off-by: Shai Fultheim <shai@scalex86.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11 | [PATCH] x86_64: Early initialization of cpu_to_node | Ravikiran Thirumalai
Patch enables early initialization of cpu_to_node. apicid_to_node is built by reading the SRAT table, from acpi_numa_init with ACPI_NUMA and k8_scan_nodes with K8_NUMA. x86_cpu_to_apicid is built by parsing the ACPI MADT table, from acpi_boot_init. We combine these two tables and set up cpu_to_node. Early initialization helps the static per_cpu_areas in getting pages from the correct node. Change since last release: Do not initialize early init_cpu_to_node for faking node cases. Patch tested on TYAN dual core 4P board with K8 only, ACPI_NUMA. Tested on EM64T NUMA. Also tested with numa=off, numa=fake, and running a kernel compiled with NUMA on a regular EM64 2 way SMP. Signed-off-by: Alok N Kataria <alokk@calsoftinc.com> Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org> Signed-off-by: Shai Fultheim <shai@scalex86.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11 | [PATCH] x86_64: Use function pointers to call DMA mapping functions | Muli Ben-Yehuda
AK: I hacked Muli's original patch a lot and there were a lot of changes - all bugs are probably to blame on me now. There were also some changes in the fallback behaviour for swiotlb - in particular it doesn't try to use GFP_DMA anymore. Also all DMA mapping operations now use the same core dma_alloc_coherent code with proper fallbacks. And various other changes and cleanups. Known problems: iommu=force and swiotlb=force together break; needs more testing. This patch cleans up x86_64's DMA mapping dispatching code. Right now we have three possible IOMMU types: AGP GART, swiotlb and nommu, and in the future we will also have Xen's x86_64 swiotlb and other HW IOMMUs for x86_64. In order to support all of them cleanly, this patch: - introduces a struct dma_mapping_ops with function pointers for each of the DMA mapping operations of gart (AMD HW IOMMU), swiotlb (software IOMMU) and nommu (no IOMMU). - gets rid of: if (swiotlb) return swiotlb_xxx(); - PCI_DMA_BUS_IS_PHYS is now checked against the dma_ops being set. This makes swiotlb faster by avoiding double copying in some cases. Signed-Off-By: Muli Ben-Yehuda <mulix@mulix.org> Signed-Off-By: Jon D. Mason <jdmason@us.ibm.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
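Illustration for the commit above (not part of the patch): a minimal user-space sketch of dispatching through an ops table instead of `if (swiotlb) return swiotlb_xxx();` chains. The struct here is heavily reduced and hypothetical: it has one operation only, and the `bounce_*` backend and its address are made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t dma_addr_t;

/* Hypothetical, reduced ops table. A real table would carry one
 * function pointer per DMA mapping operation. */
struct dma_mapping_ops {
    const char *name;
    dma_addr_t (*map_single)(uintptr_t phys, size_t size);
    int is_phys;  /* nonzero when bus addresses equal physical addresses */
};

/* "No IOMMU" backend: the device sees physical addresses unchanged. */
static dma_addr_t nommu_map_single(uintptr_t phys, size_t size)
{
    (void)size;
    return (dma_addr_t)phys;
}

/* Made-up bounce-buffer backend: every mapping lands in one fixed buffer. */
#define BOUNCE_BASE 0x1000000ULL
static dma_addr_t bounce_map_single(uintptr_t phys, size_t size)
{
    (void)phys;
    (void)size;
    return BOUNCE_BASE;
}

static const struct dma_mapping_ops nommu_ops  = { "nommu",  nommu_map_single,  1 };
static const struct dma_mapping_ops bounce_ops = { "bounce", bounce_map_single, 0 };

/* Selected once at init time; callers never test which backend is active. */
static const struct dma_mapping_ops *dma_ops = &nommu_ops;

static dma_addr_t dma_map_single(uintptr_t phys, size_t size)
{
    return dma_ops->map_single(phys, size);
}
```

Switching backends is then a single pointer assignment (`dma_ops = &bounce_ops;`), and a "bus is physical" style check can read `dma_ops->is_phys` rather than a per-backend global.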
2006-01-11 | [PATCH] x86_64: Reject SRAT tables that don't cover all memory | Andi Kleen
Broken BIOS on Iwill 8way systems reports these and it causes the bootmem allocator to crash. Add a sanity check if all the PXMs in the SRAT table cover all memory as reported by e820. If the sanity check fails the SRAT is rejected and the code will fall back to discover the NUMA topology using the K8 northbridge registers when applicable. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
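Illustration for the commit above (not part of the patch): a rough sketch of a coverage sanity check. It assumes the node ranges do not overlap and ignores memory holes, which the real code would have to account for. The type and function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical node memory range; end is the first byte past the range. */
struct mem_range {
    uint64_t start;
    uint64_t end;
};

/* Return true when the node ranges together describe at least as much
 * memory as the firmware memory map reports (e820_bytes). */
static bool ranges_cover_memory(const struct mem_range *nodes, size_t n,
                                uint64_t e820_bytes)
{
    uint64_t covered = 0;

    for (size_t i = 0; i < n; i++) {
        if (nodes[i].end > nodes[i].start)  /* skip empty or inverted ranges */
            covered += nodes[i].end - nodes[i].start;
    }
    return covered >= e820_bytes;
}
```

When the check fails, the caller would discard the table and fall back to another source of topology information, as the commit message describes.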
2006-01-11 | [PATCH] x86_64: Clean up some printks in NUMA code | Andi Kleen
Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11 | [PATCH] x86_64: Fix up coding style in numa.c | Andi Kleen
No functional changes Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11 | [PATCH] x86_64: Fix off by one in IOMMU check | Andi Kleen
Fix off by one when checking if the machine has enough memory to need IOMMU. This caused the IOMMUs to be needlessly enabled for mem=4G. Based on a patch from Jon Mason. Signed-off-by: jdmason@us.ibm.com Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
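Illustration for the commit above (not part of the patch): the log does not show the exact comparison that was changed, so the following is only an assumed reconstruction of this kind of boundary bug. It treats `end_pfn` as the first page frame number past the end of memory; the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT    12
/* First page frame number at or above the 4GB boundary. */
#define MAX_DMA32_PFN ((4ULL << 30) >> PAGE_SHIFT)

/* Assumed buggy form: memory ending exactly at 4GB already counts as
 * "has memory above 4GB". */
static bool needs_iommu_buggy(uint64_t end_pfn)
{
    return end_pfn >= MAX_DMA32_PFN;
}

/* Assumed fixed form: only memory that really extends past 4GB counts. */
static bool needs_iommu(uint64_t end_pfn)
{
    return end_pfn > MAX_DMA32_PFN;
}
```

With memory ending exactly at the 4GB boundary, the first predicate is true and the second is false, which matches the "needlessly enabled for mem=4G" symptom.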
2006-01-11 | [PATCH] x86_64: Convert page fault error codes to symbolic constants. | Andi Kleen
Much better to deal with these than with the magic numbers. And remove the comment describing the bits - kernel source is no replacement for an architecture manual. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
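Illustration for the commit above (not part of the patch): a small sketch of what symbolic constants for the x86 page fault error code can look like. The bit meanings follow the architecture manuals; the `PF_*` names and the helper function are assumptions for this example, not quoted from the patch.

```c
#include <stdbool.h>

/* x86 page fault error code bits (names are assumed for illustration). */
#define PF_PROT  (1UL << 0)  /* 0: page not present, 1: protection violation */
#define PF_WRITE (1UL << 1)  /* 0: read access, 1: write access */
#define PF_USER  (1UL << 2)  /* 0: kernel mode, 1: user mode */
#define PF_RSVD  (1UL << 3)  /* reserved bit was set in a paging entry */
#define PF_INSTR (1UL << 4)  /* fault was caused by an instruction fetch */

/* True for a user-mode write to a page that is present but not writable. */
static bool is_user_write_protection_fault(unsigned long error_code)
{
    const unsigned long want = PF_PROT | PF_WRITE | PF_USER;

    return (error_code & want) == want;
}
```

Compared with a test such as `(error_code & 7) == 7`, the named form documents itself at the call site.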
2006-01-11 | [PATCH] x86_64: Remove unnecessary case from the page fault handler | Andi Kleen
Don't need to do the vmalloc check for the module range because its PML4 is shared with the kernel text. Also removed an unnecessary TLB flush. Pointed out by Jan Beulich Cc: jbeulich@novell.com Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11 | [PATCH] x86_64: Return -1 for unknown PCI bus affinity | Andi Kleen
When we don't know the node a PCI bus is connected to return -1. This matches the generic code. Noticed by Ravikiran G Thirumalai <kiran@scalex86.org> Cc: Ravikiran G Thirumalai <kiran@scalex86.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11 | [PATCH] x86_64: Validate SLIT table | Andi Kleen
A lot of Opteron BIOS just pass 10 in all SLIT entries (10 is the normalized unit). This is actually worse than the default heuristic because it leads to pci_distance not knowing the difference between local and remote nodes anymore. This messes up some NUMA heuristics in generic code. In this case it's better to fall back to the default heuristic which just does nodea == nodeb ? 10 : 20. This patch does some basic sanity checking on the SLIT and only accepts the SLIT when it passes. Invariants enforced are: - Node to itself shall be 10 - Any other distance shouldn't be 10 - Distances smaller than 10 are illegal Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
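Illustration for the commit above (not part of the patch): a minimal sketch of the three invariants listed in the message, applied to a distance matrix stored row by row. The function names and the flat array layout are assumptions for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* dist holds an n x n matrix in row-major order: dist[i * n + j] is the
 * distance from node i to node j. */
static bool slit_valid(const uint8_t *dist, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            uint8_t d = dist[i * n + j];

            if (i == j) {
                if (d != 10)   /* node to itself must be 10 */
                    return false;
            } else if (d <= 10) {
                /* remote distance must not be 10, and values below 10
                 * are illegal */
                return false;
            }
        }
    }
    return true;
}

/* Default heuristic quoted in the message, used when the table is rejected. */
static int default_distance(int nodea, int nodeb)
{
    return nodea == nodeb ? 10 : 20;
}
```

A table filled with 10 everywhere fails the check, so the caller would fall back to `default_distance()`.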
2006-01-11 | [PATCH] x86_64: Adjust page fault handling | Jan Beulich
Adjust page fault protection error check before considering it to be a vmalloc synchronization candidate. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11 | [PATCH] x86_64: make trap information available to die notification handlers | Jan Beulich
This adjusts things so that handlers of the die() notifier will have sufficient information about the trap currently being handled. It also adjusts the notify_die() prototype to (again) match that of i386. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-06 | [PATCH] x86/x86_64: mark rodata section read-only: x86-64 support | Arjan van de Ven
x86-64 specific parts to make the .rodata section read only Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-06 | [PATCH] x86/x86_64: mark rodata section read only: generic x86-64 bugfix | Arjan van de Ven
Bug fix required for the .rodata work on x86-64: when change_page_attr() and friends need to break up a 2Mb page into 4Kb pages, it always set the NX bit on the PMD, which causes the cpu to consider the entire 2Mb region to be NX regardless of the actual PTE perms. This is fine in general, with one big exception: the 2Mb page that covers the last part of the kernel .text! The fix is to not invent a new permission for the new PMD entry, but to just inherit the existing one minus the PSE bit. Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
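Illustration for the commit above (not part of the patch): a sketch of "inherit the existing permissions minus the PSE bit" when a 2MB mapping is split into 4KB pages. The bit positions are the architectural ones for x86-64 paging entries; the macro and function names are hypothetical.

```c
#include <stdint.h>

/* Architectural flag bits of an x86-64 paging entry (names are made up). */
#define ENTRY_PRESENT 0x001ULL
#define ENTRY_RW      0x002ULL
#define ENTRY_PSE     0x080ULL       /* entry maps a large page directly */
#define ENTRY_NX      (1ULL << 63)   /* no-execute */

/* Described bug: the entry pointing at the new 4KB page table always got
 * NX, which makes the whole 2MB region non-executable. */
static uint64_t split_flags_buggy(uint64_t large_flags)
{
    return (large_flags & ~ENTRY_PSE) | ENTRY_NX;
}

/* Described fix: keep whatever the large page had, only drop PSE. */
static uint64_t split_flags(uint64_t large_flags)
{
    return large_flags & ~ENTRY_PSE;
}
```

For a large page that covers kernel text (present, no NX), the fixed version leaves the region executable, while a data page that already had NX keeps it.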
2005-12-29 | [PATCH] x86_64: Fix incorrect node_present_pages on NUMA | Ravikiran G Thirumalai
Currently, we do not pass the correct start_pfn to e820_hole_size, to calculate holes. Following patch fixes that. The bug results in incorrect number of node_present_pages for each pgdat and causes ugly output in /sys and probably VM imbalances. Signed-off-by: Alok N Kataria <alokk@calsoftinc.com> Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Shai Fultheim <shai@scalex86.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-12-15 | [PATCH] i386,amd64: ioremap.c __iomem annotations | Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-12-12 | [PATCH] x86_64: Bug correction in populate_memnodemap() | Eric Dumazet
As reported by Keith Mannthey, there are problems in populate_memnodemap(). The bug was that compute_hash_shift() was returning 31, with incorrect initialization of memnodemap[]. To correct the bug, we must use (1UL << shift) instead of (1 << shift) to avoid an integer overflow, and we must check that shift < 64 to avoid an infinite loop. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
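Illustration for the commit above (not part of the patch): a user-space sketch of filling an address-to-node table with a 64-bit step and a guard on the shift. It uses `1ULL` and `uint64_t` so the constant is 64 bits wide on any platform (in the kernel on x86-64, `1UL` is already 64 bits). All names are hypothetical and the loop is a simplification, not the kernel function.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NO_NODE 0xff

/* Hypothetical node range; end is the first byte past the range. */
struct node_range {
    uint64_t start;
    uint64_t end;
};

/* Fill map[] so that map[addr >> shift] names the node owning addr.
 * Supports fewer than 255 nodes. Returns 0 on success, -1 when the shift
 * is unusable, the map is too small, or two nodes share a slot. */
static int populate_map(uint8_t *map, size_t map_len,
                        const struct node_range *nodes, size_t n,
                        unsigned shift)
{
    if (shift >= 64)  /* a 64-bit value cannot be shifted this far */
        return -1;

    /* 64-bit constant: a plain int 1 would overflow once shift reaches 31. */
    const uint64_t step = 1ULL << shift;

    memset(map, NO_NODE, map_len);
    for (size_t i = 0; i < n; i++) {
        uint64_t addr = nodes[i].start;

        while (addr < nodes[i].end) {
            uint64_t idx = addr >> shift;

            if (idx >= map_len || map[idx] != NO_NODE)
                return -1;
            map[idx] = (uint8_t)i;
            if (addr > UINT64_MAX - step)  /* next step would wrap around */
                break;
            addr += step;
        }
    }
    return 0;
}
```

With a shift of 31, an `int`-typed step would overflow, which is the failure mode the commit message describes; the 64-bit step advances correctly.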
2005-12-12 | [PATCH] i386/x86-64: Don't call change_page_attr with a spinlock held | Andi Kleen
It's illegal because it can sleep. Use a two step lookup scheme instead. First look up the vm_struct, then change the direct mapping, then finally unmap it. That's ok because nobody can change the particular virtual address range as long as the vm_struct is still in the global list. Also added some LinuxDoc documentation to iounmap. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 | [PATCH] x86_64: Fix sparse mem | Bob Picco
Fix up booting with sparse mem enabled. Otherwise it would just cause an early PANIC at boot. Signed-off-by: Bob Picco <bob.picco@hp.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 | [PATCH] x86_64: Remove CONFIG_CHECKING and add command line option for pagefault tracing | Andi Kleen
CONFIG_CHECKING covered some debugging code used in the early times of the port. But it wasn't even SMP safe for quite some time and the bugs it checked for seem to be gone. This patch removes all the code to verify GS at kernel entry. There haven't been any new bugs in this area for a long time. Previously it also covered the sysctl for the page fault tracing. That didn't make much sense because that code was unconditionally compiled in. I made that a boot option now because it is typically only useful at boot. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 | [PATCH] x86_64: Make node boundaries consistent | Magnus Damm
The current x86_64 NUMA memory code is inconsistent when it comes to node memory ranges. The exact behaviour varies depending on which config option is used. setup_node_bootmem() has start and end as arguments and these are used to calculate the size of the node like this: (end - start). This is all fine if end is pointing to the first non-available byte. The problem is that the current x86_64 code sometimes treats it as the last present byte and sometimes as the first non-available byte. The result is that some configurations might lose a page at the end of the range. This patch tries to fix CONFIG_ACPI_NUMA, CONFIG_K8_NUMA and CONFIG_NUMA_EMU so they all treat the end variable as the first non-available byte. This is the same way as the single node code. The patch is boot tested on dual x86_64 hardware with the above configurations, but maybe the removed code is needed as some workaround? Signed-off-by: Magnus Damm <magnus@valinux.co.jp> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
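Illustration for the commit above (not part of the patch): a tiny sketch showing why `(end - start)` only gives the right size when `end` is the first non-available byte. The function name and the 4KB page size constant are assumptions for the example.

```c
#include <stdint.h>

#define PAGE_SHIFT 12  /* 4KB pages */

/* Number of whole pages in [start, end), where end is exclusive. */
static uint64_t node_pages(uint64_t start, uint64_t end)
{
    return (end - start) >> PAGE_SHIFT;
}
```

For a 1GB node starting at 0, passing the exclusive end `0x40000000` yields `0x40000` pages, while passing the last present byte `0x3fffffff` yields `0x3ffff`, one page short.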
2005-11-14 | [PATCH] x86_64: Optimize NUMA node hash function | Eric Dumazet
Compute the highest possible value for memnode_shift, in order to reduce footprint of memnodemap[] to the minimum, thus making all users (phys_to_nid(), kfree()), more cache friendly. Before the patch : Node 0 MemBase 0000000000000000 Limit 00000001ffffffff Node 1 MemBase 0000000200000000 Limit 00000003ffffffff Using 23 for the hash shift. Max adder is 3ffffffff After the patch : Node 0 MemBase 0000000000000000 Limit 00000001ffffffff Node 1 MemBase 0000000200000000 Limit 00000003ffffffff Using 33 for the hash shift. In this case, only 2 bytes of memnodemap[] are used, instead of 2048 Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
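Illustration for the commit above (not part of the patch): one way to find the largest usable shift is to take the lowest set bit across all node start addresses, so that every node starts on a multiple of `1 << shift`. This is an assumed technique chosen for the sketch; the log does not show how the patch computes the value. The function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Largest shift such that every start address is a multiple of 1 << shift.
 * Returns 63 when all starts are 0 (for example with a single node). */
static unsigned highest_shift(const uint64_t *starts, size_t n)
{
    uint64_t bits = 0;
    unsigned shift = 0;

    for (size_t i = 0; i < n; i++)
        bits |= starts[i];          /* collect every set bit of every start */
    if (bits == 0)
        return 63;
    while ((bits & 1) == 0) {       /* count trailing zero bits */
        bits >>= 1;
        shift++;
    }
    return shift;
}

/* Number of table entries needed to index addresses up to max_addr. */
static uint64_t map_entries(uint64_t max_addr, unsigned shift)
{
    return (max_addr >> shift) + 1;
}
```

For the two nodes in the log output (starts 0 and 0x200000000, highest address 0x3ffffffff), this gives a shift of 33 and 2 table entries, against 2048 entries for a shift of 23, which agrees with the numbers in the message.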
2005-11-14 | [PATCH] x86_64: Replace swiotlb extern with include | Andi Kleen
Minor victory on the continuous quest against all stray extern. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 | [PATCH] x86_64: Replace cpu_pda extern with include | Andi Kleen
Minor cleanup - remove obsolete extern Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 | [PATCH] x86_64: Only use asm/sections.h to declare section symbols | Andi Kleen
Adding __initdata_* to asm-generic/sections.h Replaces a lot of open coded externs in arch/x86_64/* I had to change __bss_end to __bss_stop to match the other architectures. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 | [PATCH] x86_64: Unmap NULL during early bootup | Siddha, Suresh B
We should zap the low mappings, as soon as possible, so that we can catch kernel bugs more effectively. Previously early boot had NULL mapped and didn't trap on NULL references. This patch introduces boot_level4_pgt, which will always have low identity addresses mapped. During boot, all the processors will use this as their level4 pgt. On BP, we will switch to init_level4_pgt as soon as we enter C code and zap the low mappings as soon as we are done with the usage of identity low mapped addresses. On APs we will zap the low mappings as soon as we jump to C code. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 | [PATCH] x86_64: Speed up numa_node_id by putting it directly into the PDA | Andi Kleen
Don't go from the CPU number to a mapping array. Node number is often used now in fast paths. This also adds a generic numa_node_id to all the topology includes. Suggested by Eric Dumazet. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 | [PATCH] x86_64: Account mem_map in VM holes accounting | Andi Kleen
The VM needs to know about lost memory in zones to accurately balance dirty pages. This patch accounts mem_map in there too, which fixes a constant error of a few percent. Also some other misc mappings and the kernel text itself are accounted too. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 | [PATCH] x86_64: Add 4GB DMA32 zone | Andi Kleen
Add a new 4GB GFP_DMA32 zone between the GFP_DMA and GFP_NORMAL zones. As a bit of historical background: when the x86-64 port was originally designed we had some discussion if we should use a 16MB DMA zone like i386 or a 4GB DMA zone like IA64 or both. Both was ruled out at this point because it was in early 2.4 when the VM was still quite shaky and had bad trouble even dealing with one DMA zone. We settled on the 16MB DMA zone mainly because we worried about older soundcards and the floppy. But this has always caused problems since then because device drivers had trouble getting enough DMA-able memory. These days the VM works much better and the wide use of NUMA has proven it can deal with many zones successfully. So this patch adds both zones. This helps drivers that need a lot of memory below 4GB because their hardware cannot address more (graphic drivers - proprietary and free ones, video frame buffer drivers, sound drivers etc.). Previously they could only use IOMMU+16MB GFP_DMA, which was not enough memory. Another common problem is that hardware that has full memory addressing for >4GB misses it for some control structures in memory (like transmit rings or other metadata). They tended to allocate memory in the 16MB GFP_DMA or the IOMMU/swiotlb then using pci_alloc_consistent, but that can tie up a lot of precious 16MB GFP_DMA/IOMMU/swiotlb memory (even on AMD systems the IOMMU tends to be quite small) especially if you have many devices. With the new zone pci_alloc_consistent can just put this stuff into memory below 4GB which works better. One argument was still if the zone should be 4GB or 2GB. The main motivation for 2GB would be an unnamed not so unpopular hardware raid controller (mostly found in older machines from a particular four letter company) which has a strange 2GB restriction in firmware. But that one works ok with swiotlb/IOMMU anyways, so it doesn't really need GFP_DMA32.
I chose 4GB to be compatible with IA64 and because it seems to be the most common restriction. The new zone is so far added only for x86-64. For other architectures who don't set up this new zone nothing changes. Architectures can set a compatibility define in Kconfig CONFIG_DMA_IS_DMA32 that will define GFP_DMA32 as GFP_DMA. Otherwise it's a nop because on 32bit architectures it's normally not needed because GFP_NORMAL (=0) is DMA able enough. One problem is still that GFP_DMA means different things on different architectures. e.g. some drivers used to have #ifdef ia64 use GFP_DMA (trusting it to be 4GB) #elif __x86_64__ (use other hacks like the swiotlb because 16MB is not enough) ... . This was quite ugly and is now obsolete. These should be now converted to use GFP_DMA32 unconditionally. I haven't done this yet. Or best only use pci_alloc_consistent/dma_alloc_coherent which will use GFP_DMA32 transparently. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
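Illustration for the commit above (not part of the patch): a sketch of the resulting three-zone layout by physical address, using the 16MB and 4GB limits given in the message. The enum and function names are hypothetical and only model the address split, not the allocator.

```c
#include <stdint.h>

/* Hypothetical zone labels for the sketch. */
enum mem_zone { MEM_ZONE_DMA, MEM_ZONE_DMA32, MEM_ZONE_NORMAL };

#define DMA_LIMIT   (16ULL << 20)  /* 16MB: legacy DMA zone */
#define DMA32_LIMIT (4ULL << 30)   /* 4GB: devices with 32-bit addressing */

/* Zone that a physical address falls into. */
static enum mem_zone zone_for_phys(uint64_t phys)
{
    if (phys < DMA_LIMIT)
        return MEM_ZONE_DMA;
    if (phys < DMA32_LIMIT)
        return MEM_ZONE_DMA32;
    return MEM_ZONE_NORMAL;
}
```

A device limited to 32-bit addresses can use memory from the first two zones, so requests on its behalf no longer have to compete for the 16MB zone alone.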
2005-10-29 | [PATCH] mm: init_mm without ptlock | Hugh Dickins
First step in pushing down the page_table_lock. init_mm.page_table_lock has been used throughout the architectures (usually for ioremap): not to serialize kernel address space allocation (that's usually vmlist_lock), but because pud_alloc,pmd_alloc,pte_alloc_kernel expect caller holds it. Reverse that: don't lock or unlock init_mm.page_table_lock in any of the architectures; instead rely on pud_alloc,pmd_alloc,pte_alloc_kernel to take and drop it when allocating a new one, to check lest a racing task already did. Similarly no page_table_lock in vmalloc's map_vm_area. Some temporary ugliness in __pud_alloc and __pmd_alloc: since they also handle user mms, which are converted only by a later patch, for now they have to lock differently according to whether or not it's init_mm. If sources get muddled, there's a danger that an arch source taking init_mm.page_table_lock will be mixed with common source also taking it (or neither take it). So break the rules and make another change, which should break the build for such a mismatch: remove the redundant mm arg from pte_alloc_kernel (ppc64 scrapped its distinct ioremap_mm in 2.6.13). Exceptions: arm26 used pte_alloc_kernel on user mm, now pte_alloc_map; ia64 used pte_alloc_map on init_mm, now pte_alloc_kernel; parisc had bad args to pmd_alloc and pte_alloc_kernel in unused USE_HPPA_IOREMAP code; ppc64 map_io_page forgot to unlock on failure; ppc mmu_mapin_ram and ppc64 im_free took page_table_lock for no good reason. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-10 | [PATCH] x86_64: Fix change_page_attr cache flushing | Andi Kleen
Noticed by Terence Ripperda. Undo wrong change in global_flush_tlb. We need to flush the caches in all cases, not just when pages were reverted. This was a bogus optimization added earlier, but it was wrong. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-30 | [PATCH] x86_64 early numa init fix | Ravikiran G Thirumalai
The tests Alok carried out on Petr's box confirmed that cpu_to_node[BP] is not setup early enough by numa_init_array due to the x86_64 changes in 2.6.14-rc*, and unfortunately set wrongly by the work around code in numa_init_array(). cpu_to_node[0] gets set with 1 early and later gets set properly to 0 during identify_cpu() when all cpus are brought up, but confusing the numa slab in the process. Here is a quick fix for this. The right fix obviously is to have cpu_to_node[bsp] setup early for numa_init_array(). The following patch will fix the problem now, and the code can stay on even when cpu_to_node[BP] gets fixed early correctly. Thanks to Petr for access to his box. Signed off by: Ravikiran Thirumalai <kiran@scalex86.org> Signed-off-by: Alok N Kataria <alokk@calsoftinc.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-30 | [PATCH] x86_64: fix the BP node_to_cpumask | Ravikiran G Thirumalai
Fix the BP node_to_cpumask. 2.6.14-rc* broke the boot cpu bit as the cpu_to_node(0) is now not setup early enough for numa_init_array. cpu_to_node[] is setup much later at srat_detect_node on acpi srat based em64t machines. This seems like a problem on amd machines too, Tested on em64t though. /sys/devices/system/node/node0/cpumap shows up sanely after this patch. Signed off by: Ravikiran Thirumalai <kiran@scalex86.org> Signed-off-by: Shai Fultheim <shai@scalex86.org> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 | [PATCH] x86-64: Use correct mask to compute conflicting nodes in SRAT | Andi Kleen
The nodes are not set online yet at this point. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 | [PATCH] x86-64: reset apicid<->node tables when SRAT cannot be parsed | Andi Kleen
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 | [PATCH] x86-64: Clean up the SRAT node list before computing the hash function | Andi Kleen
Also use for_each_node_mask instead of hand crafted loops. Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 | [PATCH] x86-64: Safe interrupts in oops_begin/end | Jan Beulich
Rather than blindly re-enabling interrupts in oops_end(), save their state in oops_begin() and then restore that state. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 | [PATCH] x86-64: Improve error handling for overlapping PXMs in SRAT. | Andi Kleen
- Report PXMs instead of nodes - Report the correct PXM, not always the one of node 1. - Only warn for the case of a PXM overlapping by itself Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 | [PATCH] x86-64: Fix show_mem a little bit | Andi Kleen
- Add KERN_INFO to printks (from i386) - Use longs instead of ints to accumulate pages. - Fix broken indenting. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 | [PATCH] x86-64: Use ACPI PXM to parse PCI<->node assignments | Andi Kleen
Since this is shared code I had to implement it for i386 too Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>