summaryrefslogtreecommitdiffstats
path: root/arch/x86/kernel/i387.c
AgeCommit message (Collapse)Author
2010-05-10x86: Introduce 'struct fpu' and related APIAvi Kivity
Currently all fpu state access is through tsk->thread.xstate. Since we wish to generalize fpu access to non-task contexts, wrap the state in a new 'struct fpu' and convert existing access to use an fpu API. Signal frame handlers are not converted to the API since they will remain task context only things. Signed-off-by: Avi Kivity <avi@redhat.com> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <1273135546-29690-3-git-send-email-avi@redhat.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-05-10x86: Eliminate TS_XSAVEAvi Kivity
The fpu code currently uses current->thread_info->status & TS_XSAVE as a way to distinguish between XSAVE capable processors and older processors. The decision is not really task specific; instead we use the task status to avoid a global memory reference - the value should be the same across all threads. Eliminate this tie-in into the task structure by using an alternative instruction keyed off the XSAVE cpu feature; this results in shorter and faster code, without introducing a global memory reference. [ hpa: in the future, this probably should use an asm jmp ] Signed-off-by: Avi Kivity <avi@redhat.com> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <1273135546-29690-2-git-send-email-avi@redhat.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-03-30include cleanup: Update gfp.h and slab.h includes to prepare for breaking ↵Tejun Heo
implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-02-23x86, ptrace: Remove set_stopped_child_used_math() in [x]fpregs_setSuresh Siddha
init_fpu() already ensures that the used_math() is set for the stopped child. Remove the redundant set_stopped_child_used_math() in [x]fpregs_set() Reported-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <20100222225240.642169080@sbs-t61.sc.intel.com> Acked-by: Rolan McGrath <roland@redhat.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-02-23x86, ptrace: Simplify xstateregs_get()Suresh Siddha
48 bytes (bytes 464..511) of the xstateregs payload come from the kernel defined structure (xstate_fx_sw_bytes). Rest comes from the xstate regs structure in the thread struct. Instead of having multiple user_regset_copyout()'s, simplify the xstateregs_get() by first copying the SW bytes into the xstate regs structure in the thread structure and then using one user_regset_copyout() to copyout the xstateregs. Requested-by: Roland McGrath <roland@redhat.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <20100222225240.494688491@sbs-t61.sc.intel.com> Acked-by: Roland McGrath <roland@redhat.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Oleg Nesterov <oleg@redhat.com>
2010-02-11x86, ptrace: regset extensions to support xstateSuresh Siddha
Add the xstate regset support which helps extend the kernel ptrace and the core-dump interfaces to support AVX state etc. This regset interface is designed to support all the future state that gets supported using xsave/xrstor infrastructure. Looking at the memory layout saved by "xsave", one can't say which state is represented in the memory layout. This is because if a particular state is in init state, in the xsave hdr it can be represented by bit '0'. And hence we can't really say by the xsave header wether a state is in init state or the state is not saved in the memory layout. And hence the xsave memory layout available through this regset interface uses SW usable bytes [464..511] to convey what state is represented in the memory layout. First 8 bytes of the sw_usable_bytes[464..467] will be set to OS enabled xstate mask(which is same as the 64bit mask returned by the xgetbv's xCR0). The note NT_X86_XSTATE represents the extended state information in the core file, using the above mentioned memory layout. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <20100211195614.802495327@sbs-t61.sc.intel.com> Signed-off-by: Hongjiu Lu <hjl.tools@gmail.com> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-03-04x86, math-emu: fix init_fpu for task != currentDaniel Glöckner
Impact: fix math-emu related crash while using GDB/ptrace init_fpu() calls finit to initialize a task's xstate, while finit always works on the current task. If we use PTRACE_GETFPREGS on another process and both processes did not already use floating point, we get a null pointer exception in finit. This patch creates a new function finit_task that takes a task_struct parameter. finit becomes a wrapper that simply calls finit_task with current. On the plus side this avoids many calls to get_current which would each resolve to an inline assembler mov instruction. An empty finit_task has been added to i387.h to avoid linker errors in case the compiler still emits the call in init_fpu when CONFIG_MATH_EMULATION is not defined. The declaration of finit in i387.h has been removed as the remaining code using this function gets its prototype from fpu_proto.h. Signed-off-by: Daniel Glöckner <dg@emlix.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: "Pallipadi Venkatesh" <venkatesh.pallipadi@intel.com> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Bill Metzenthen <billm@melbpc.org.au> LKML-Reference: <E1Lew31-0004il-Fg@mailer.emlix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-20x86: fix __cpuinit/__init tangle in init_thread_xstate()Rakib Mullick
Impact: fix incorrect __init annotation This patch removes the following section mismatch warning. A patch set was send previously (http://lkml.org/lkml/2008/11/10/407). But introduce some other problem, reported by Rufus (http://lkml.org/lkml/2008/11/11/46). Then Ingo Molnar suggest that, it's best to remove __init from xsave_cntxt_init(void). Which is the second patch in this series. Now, this one removes the following warning. WARNING: arch/x86/kernel/built-in.o(.cpuinit.text+0x2237): Section mismatch in reference from the function cpu_init() to the function .init.text:init_thread_xstate() The function __cpuinit cpu_init() references a function __init init_thread_xstate(). If init_thread_xstate is only used by cpu_init then annotate init_thread_xstate with a matching annotation. Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-07x86: xsave: set FP, SSE bits in the xsave header in the user sigcontextSuresh Siddha
If a processor implementation discern that a processor state component is in its initialized state, it may modify the corresponding bit in the xsave header.xstate_bv as '0'. State in the memory layout setup by 'xsave' will be consistent with the bit values in the header. During signal handling, legacy applications may change the FP/SSE bits in the sigcontext memory layout without touching the FP/SSE header bits in the xsave header. So always set FP/SSE bits in the xsave header while saving the sigcontext state to the user space. During signal return, this will enable the kernel to capture any changes to the FP/SSE bits by the legacy applications which don't touch xsave headers. xsave aware apps can change the xstate_bv in the xsave header aswell as change any contents in the memory layout. xrestor as part of sigreturn will capture all the changes. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-07-30x86, xsave: keep the XSAVE feature mask as an u64H. Peter Anvin
The XSAVE feature mask is a 64-bit number; keep it that way, in order to avoid the mistake done with rdmsr/wrmsr. Use the xsetbv() function provided in the previous patch. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-30x86, xsave: update xsave header bits during ptrace fpregs setSuresh Siddha
FP/SSE bits may be zero in the xsave header(representing the init state). Update these bits during the ptrace fpregs set operation, to indicate the non-init state. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-30x86, xsave: save/restore the extended state context in sigframeSuresh Siddha
On cpu's supporting xsave/xrstor, fpstate pointer in the sigcontext, will include the extended state information along with fpstate information. Presence of extended state information is indicated by the presence of FP_XSTATE_MAGIC1 at fpstate.sw_reserved.magic1 and FP_XSTATE_MAGIC2 at fpstate + (fpstate.sw_reserved.extended_size - FP_XSTATE_MAGIC2_SIZE). Extended feature bit mask that is saved in the memory layout is represented by the fpstate.sw_reserved.xstate_bv For RT signal frames, UC_FP_XSTATE in the uc_flags also indicate the presence of extended state information in the sigcontext's fpstate pointer. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-30x86, xsave: reorganization of signal save/restore fpstate code layoutSuresh Siddha
move 64bit routines that saves/restores fpstate in/from user stack from signal_64.c to xsave.c restore_i387_xstate() now handles the condition when user passes NULL fpstate. Other misc changes for prepartion of xsave/xrstor sigcontext support. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-30x86, xsave: dynamically allocate sigframes fpstate instead of static allocationSuresh Siddha
dynamically allocate fpstate on the stack, instead of static allocation in the current sigframe layout on the user stack. This will allow the fpstate structure to grow in the future, which includes extended state information supporting xsave/xrstor. signal handlers will be able to access the fpstate pointer from the sigcontext structure asusual, with no change. For the non RT sigframe's (which are supported only for 32bit apps), current static fpstate layout in the sigframe will be unused(so that we don't change the extramask[] offset in the sigframe and thus prevent breaking app's which modify extramask[]). Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-30x86, xsave: context switch support using xsave/xrstorSuresh Siddha
Uses xsave/xrstor (instead of traditional fxsave/fxrstor) in context switch when available. Introduces TS_XSAVE flag, which determine the need to use xsave/xrstor instructions during context switch instead of the legacy fxsave/fxrstor instructions. Thread-synchronous status word is already in L1 cache during this code patch and thus minimizes the performance penality compared to (cpu_has_xsave) checks. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-30x86, xsave: enable xsave/xrstor on cpus with xsave supportSuresh Siddha
Enables xsave/xrstor by turning on cr4.osxsave on cpu's which have the xsave support. For now, features that OS supports/enabled are FP and SSE. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-04x86: fix broken math-emu with lazy allocation of fpu areaSuresh Siddha
Fix the math emulation that got broken with the recent lazy allocation of FPU area. init_fpu() need to be added for the math-emulation path aswell for the FPU area allocation. math emulation enabled kernel booted fine with this, in the presence of "no387 nofxsr" boot param. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: hpa@zytor.com Cc: mingo@elte.hu Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-10x86: fix fpu restore from sig returnSuresh Siddha
If the task never used fpu, initialize the fpu before restoring the FP state from the signal handler context. This will allocate the fpu state, if the task never needed it before. Reported-and-bisected-by: Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Tested-by: Eric Sesterhenn <snakebyte@gmx.de> Cc: Frederik Deweerdt <deweerdt@free.fr> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-19x86, fpu: lazy allocation of FPU area - v5Suresh Siddha
Only allocate the FPU area when the application actually uses FPU, i.e., in the first lazy FPU trap. This could save memory for non-fpu using apps. for example: on my system after boot, there are around 300 processes, with only 17 using FPU. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-19x86, fpu: split FPU state from task struct - v5Suresh Siddha
Split the FPU save area from the task struct. This allows easy migration of FPU context, and it's generally cleaner. It also allows the following two optimizations: 1) only allocate when the application actually uses FPU, so in the first lazy FPU trap. This could save memory for non-fpu using apps. Next patch does this lazy allocation. 2) allocate the right size for the actual cpu rather than 512 bytes always. Patches enabling xsave/xrstor support (coming shortly) will take advantage of this. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-17x86: clean up i387.cIngo Molnar
minor coding style cleanups. Before: total: 0 errors, 3 warnings, 479 lines checked After: total: 0 errors, 1 warnings, 483 lines checked No code changed: arch/x86/kernel/i387.o: text data bss dec hex filename 2379 4 8 2391 957 i387.o.before 2379 4 8 2391 957 i387.o.after md5: e1434553a3b4ff1f52ad97a68b1fad8a i387.o.before.asm e1434553a3b4ff1f52ad97a68b1fad8a i387.o.after.asm Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-03-07x86: fix merge mistake in i387.cJan Beulich
convert_fxsr_to_user() in 2.6.24's i387_32.c did this, and convert_to_fxsr() also does the inverse, so I assume it's an oversight that it is no longer being done. [ mingo@elte.hu: we encode it this way because there's no space for the 'FPU Last Instruction Opcode' (->fop) field in the legacy user_i387_ia32_struct that PTRACE_GETFPREGS/PTRACE_SETFPREGS uses. it's probably pure legacy - i'd be surprised if any user-space relied on the FPU Last Opcode in any way. But indeed we used to do it previously so the most conservative thing is to preserve that piece of information. ] Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-03-04x86, i387: fix ptrace leakage using init_fpu()Suresh Siddha
This bug got introduced by the recent i387 merge: commit 4421011120b2304e5c248ae4165a2704588aedf1 Author: Roland McGrath <roland@redhat.com> Date: Wed Jan 30 13:31:50 2008 +0100 x86: x86 i387 user_regset Current usage of unlazy_fpu() in ptrace specific routines is wrong. unlazy_fpu() will not init fpu if the task never used math. So the ptrace calls can expose the parent tasks FPU data in some cases. Replace it with the init_fpu() which will init the math state, if the task never used math before. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Thomas Gleixner <tglx@linutronix.de>
2008-02-19x86: make mxcsr_feature_mask static againAdrian Bunk
Signed-off-by: Adrian Bunk <bunk@kernel.org> Cc: Roland McGrath <roland@redhat.com> Cc: hpa@zytor.com Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: x86 user_regset cleanupRoland McGrath
This removes a bunch of dead code that is no longer needed now that the user_regset interfaces are being used for all these jobs. Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: x86 i387 user_regsetRoland McGrath
This revamps the i387 code to be shared across 32-bit, 64-bit, and 32-on-64. It does so by consolidating the code in one place based on the user_regset accessor interfaces. This switches 32-bit to using the i387_64.h header and 64-bit to using the i387.c that was previously i387_32.c, but that's what took the least cleanup in each file. Here i387.h is stubbed to always include i387_64.h rather than renaming the file, to keep this diff smaller and easier to read. Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: i387 renamingRoland McGrath
This renames arch/x86/kernel/{i387_32.c => i387.c}. This is a pure renaming, but paves the way for merging the 32-bit and 64-bit versions of this code. Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>