Age | Commit message (Collapse) | Author |
|
When a secondary hardware thread has finished running a KVM guest, we
currently put that thread into nap mode using a nap instruction in
the KVM code. This changes the code so that instead of doing a nap
instruction directly, we instead cause the call to power7_nap() that
put the thread into nap mode to return. The reason for doing this is
to avoid having the KVM code having to know what low-power mode to
put the thread into.
In the case of a secondary thread used to run a KVM guest, the thread
will be offline from the point of view of the host kernel, and the
relevant power7_nap() call is the one in pnv_smp_cpu_disable().
In this case we don't want to clear pending IPIs in the offline loop
in that function, since that might cause us to miss the wakeup for
the next time the thread needs to run a guest. To tell whether or
not to clear the interrupt, we use the SRR1 value returned from
power7_nap(), and check if it indicates an external interrupt. We
arrange that the return from power7_nap() when we have finished running
a guest returns 0, so pending interrupts don't get flushed in that
case.
Note that it is important a secondary thread that has finished
executing in the guest, or that didn't have a guest to run, should
not return to power7_nap's caller while the kvm_hstate.hwthread_req
flag in the PACA is non-zero, because the return from power7_nap
will reenable the MMU, and the MMU might still be in guest context.
In this situation we spin at low priority in real mode waiting for
hwthread_req to become zero.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
introduce new setsockopt() command:
setsockopt(sock, SOL_SOCKET, SO_ATTACH_BPF, &prog_fd, sizeof(prog_fd))
where prog_fd was received from syscall bpf(BPF_PROG_LOAD, attr, ...)
and attr->prog_type == BPF_PROG_TYPE_SOCKET_FILTER
setsockopt() calls bpf_prog_get() which increments refcnt of the program,
so it doesn't get unloaded while socket is using the program.
The same eBPF program can be attached to multiple sockets.
User task exit automatically closes socket which calls sk_filter_uncharge()
which decrements refcnt of eBPF program
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The existing MCE code calls flush_tlb hook with IS=0 (single page) resulting
in partial invalidation of TLBs which is not right. This patch fixes
that by passing IS=0xc00 to invalidate whole TLB for successful recovery
from TLB and ERAT errors.
Cc: stable@vger.kernel.org
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
upatepp can get called for a nohpte fault when we find from the linux
page table that the translation was hashed before. In that case
we are sure that there is no existing translation, hence we could
avoid doing tlbie.
We could possibly race with a parallel fault filling the TLB. But
that should be ok because updatepp is only ever relaxing permissions.
We also look at linux pte permission bits when filling hash pte
permission bits. We also hold the linux pte busy bits while
inserting/updating a hashpte entry, hence a paralle update of
linux pte is not possible. On the other hand mprotect involves
ptep_modify_prot_start which cause a hpte invalidate and not updatepp.
Performance number:
We use randbox_access_bench written by Anton.
Kernel with THP disabled and smaller hash page table size.
86.60% random_access_b [kernel.kallsyms] [k] .native_hpte_updatepp
2.10% random_access_b random_access_bench [.] doit
1.99% random_access_b [kernel.kallsyms] [k] .do_raw_spin_lock
1.85% random_access_b [kernel.kallsyms] [k] .native_hpte_insert
1.26% random_access_b [kernel.kallsyms] [k] .native_flush_hash_range
1.18% random_access_b [kernel.kallsyms] [k] .__delay
0.69% random_access_b [kernel.kallsyms] [k] .native_hpte_remove
0.37% random_access_b [kernel.kallsyms] [k] .clear_user_page
0.34% random_access_b [kernel.kallsyms] [k] .__hash_page_64K
0.32% random_access_b [kernel.kallsyms] [k] fast_exception_return
0.30% random_access_b [kernel.kallsyms] [k] .hash_page_mm
With Fix:
27.54% random_access_b random_access_bench [.] doit
22.90% random_access_b [kernel.kallsyms] [k] .native_hpte_insert
5.76% random_access_b [kernel.kallsyms] [k] .native_hpte_remove
5.20% random_access_b [kernel.kallsyms] [k] fast_exception_return
5.12% random_access_b [kernel.kallsyms] [k] .__hash_page_64K
4.80% random_access_b [kernel.kallsyms] [k] .hash_page_mm
3.31% random_access_b [kernel.kallsyms] [k] data_access_common
1.84% random_access_b [kernel.kallsyms] [k] .trace_hardirqs_on_caller
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Memset on a local variable may be removed when it is called just before the
variable goes out of scope. Using memzero_explicit defeats this
optimization. A simplified version of the semantic patch that makes this
change is as follows: (http://coccinelle.lip6.fr/)
// <smpl>
@@
identifier x;
type T;
@@
{
... when any
T x[...];
... when any
when exists
- memset
+ memzero_explicit
(x,
-0,
...)
... when != x
when strict
}
// </smpl>
This change was suggested by Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Drop BP_IABR_TE, which though used, does not do anything useful. Rename
BP_IABR to BP_CIABR. Renumber the flags.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
This patch enables support for hardware instruction breakpoint in xmon
on POWER8 platform with the help of a new register called the CIABR
(Completed Instruction Address Breakpoint Register). With this patch, a
single hardware instruction breakpoint can be added and cleared during
any active xmon debug session. The hardware based instruction breakpoint
mechanism works correctly with the existing TRAP based instruction
breakpoint available on xmon.
There are no powerpc CPU with CPU_FTR_IABR feature any more. This patch
has re-purposed all the existing IABR related code to work with CIABR
register based HW instruction breakpoint.
This has one odd feature, which is that when we hit a breakpoint xmon
doesn't tell us we have hit the breakpoint. This is because xmon is
expecting bp->address == regs->nip. Because CIABR fires on completition
regs->nip points to the instruction after the breakpoint. We could fix
that, but it would then confuse other parts of the xmon code which think
we need to emulate the instruction. [mpe]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
|
|
Merge updates collected & acked by Ben. A few EEH patches from Gavin,
some mm updates from Aneesh and a few odds and ends.
|
|
If we know that user address space has never executed on other cpus
we could use tlbiel.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Rename invalidate_old_hpte to flush_hash_hugepage and use that in
other places.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Limit the number of gigantic hugepages specified by the
hugepages= parameter to MAX_NUMBER_GPAGES.
Signed-off-by: James Yang <James.Yang@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
A page fault occurred during reading user stack in oprofile backtrace
would lead following calltrace:
WARNING: at linux/kernel/smp.c:210
Modules linked in:
CPU: 5 PID: 736 Comm: sh Tainted: G W 3.14.23-WR7.0.0.0_standard #1
task: c0000000f6208bc0 ti: c00000007c72c000 task.ti: c00000007c72c000
NIP: c0000000000ed6e4 LR: c0000000000ed5b8 CTR: 0000000000000000
REGS: c00000007c72f050 TRAP: 0700 Tainted: G W (3.14.23-WR7.0.0
tandard)
MSR: 0000000080021000 <CE,ME> CR: 48222482 XER: 00000000
SOFTE: 0
GPR00: c0000000000ed5b8 c00000007c72f2d0 c0000000010aa048 0000000000000005
GPR04: c000000000fdb820 c00000007c72f410 0000000000000001 0000000000000005
GPR08: c0000000010b5768 c000000000f8a048 0000000000000001 0000000000000000
GPR12: 0000000048222482 c00000000fffe580 0000000022222222 0000000010129664
GPR16: 0000000010143cc0 0000000000000000 0000000044444444 0000000000000000
GPR20: c00000007c7221d8 c0000000f638e3c8 000003f15a20120d 0000000000000001
GPR24: 000000005a20120d c00000007c722000 c00000007cdedda8 00003fffef23b160
GPR28: 0000000000000001 c00000007c72f410 c000000000fdb820 0000000000000006
NIP [c0000000000ed6e4] .smp_call_function_single+0x18c/0x248
LR [c0000000000ed5b8] .smp_call_function_single+0x60/0x248
Call Trace:
[c00000007c72f2d0] [c0000000000ed5b8] .smp_call_function_single+0x60/0x248 (unreliable)
[c00000007c72f3a0] [c000000000030810] .__flush_tlb_page+0x164/0x1b0
[c00000007c72f460] [c00000000002e054] .ptep_set_access_flags+0xb8/0x168
[c00000007c72f500] [c0000000001ad3d8] .handle_mm_fault+0x4a8/0xbac
[c00000007c72f5e0] [c000000000bb3238] .do_page_fault+0x3b8/0x868
[c00000007c72f810] [c00000000001e1d0] storage_fault_common+0x20/0x44
Exception: 301 at .__copy_tofrom_user_base+0x54/0x5b0
LR = .op_powerpc_backtrace+0x190/0x20c
[c00000007c72fb00] [c000000000a2ec34] .op_powerpc_backtrace+0x204/0x20c (unreliable)
[c00000007c72fbc0] [c000000000a2b5fc] .oprofile_add_ext_sample+0xe8/0x118
[c00000007c72fc70] [c000000000a2eee0] .fsl_emb_handle_interrupt+0x20c/0x27c
[c00000007c72fd30] [c000000000a2e440] .op_handle_interrupt+0x44/0x58
[c00000007c72fdb0] [c000000000016d68] .performance_monitor_exception+0x74/0x90
[c00000007c72fe30] [c00000000001d8b4] exc_0x260_common+0xfc/0x100
performance_monitor_exception() is executed in a context with interrupt
disabled and preemption enabled. When there is a user space page fault
happened, do_page_fault() invoke in_atomic() to decide whether kernel
should handle such page fault. in_atomic() only check preempt_count.
So need call pagefault_disable() to disable preemption before reading
user stack.
Signed-off-by: Jiang Lu <lu.jiang@windriver.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
With smaller hash page table config, we would end up in situation
where we would be replacing hash page table slot frequently. In
such config, we will find the hpte to be not matching, and we
can do that check without holding the hpte lock. We need to
recheck the hpte again after holding lock.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
This is what we get in dmesg when booting a pseries guest and
the hypervisor doesn't provide EEH support.
[ 0.166655] EEH functionality not supported
[ 0.166778] eeh_init: Failed to call platform init function (-22)
Since both powernv_eeh_init() and pseries_eeh_init() already complain when
hitting an error, it is not needed to print more (especially such an
uninformative message).
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Cleanup OpalMCE_* definitions/declarations and other related code which
is not used anymore.
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Acked-by: Benjamin Herrrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
On PowerNV platform, PHB diag-data is dumped after stopping device
drivers. In case of recursive EEH errors, the kernel is usually
crashed before dumping PHB diag-data for the second EEH error. It's
hard to locate the root cause of the second EEH error without PHB
diag-data.
The patch adds one more EEH option "eeh=early_log", which helps
dumping PHB diag-data immediately once frozen PE is detected, in
order to get the PHB diag-data for the second EEH error.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
In PCI passthrou scenario, we need simulate EEH recovery for Emulex
adapters when their ownership changes, as we did in commit 5cfb20b96
("powerpc/eeh: Emulate EEH recovery for VFIO devices"). Broadcom
BCM5719 adpaters are facing same problem and needs same cure.
Reported-by: Rajeshkumar Subramanian <rajeshkumars@in.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
The patch introduces additional flag EEH_PE_RESET to indicate the
corresponding PE is under reset. In turn, the PE retrieval bakcend
on PowerNV platform can return unfrozen state for the EEH core to
moving forward. Flag EEH_PE_CFG_BLOCKED isn't the correct one for
the purpose.
In PCI passthrou case, the problem is more worse: Guest doesn't
recover 6th EEH error. The PE is left in isolated (frozen) and
config blocked state on Broadcom adapters. We can't retrieve the
PE's state correctly any more, even from the host side via sysfs
/sys/bus/pci/devices/xxx/eeh_pe_state.
Reported-by: Rajeshkumar Subramanian <rajeshkumars@in.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
The patch refactors eeh_reset_pe() in order for:
* Varied return values for different failure cases.
* Replace pr_err() with pr_warn() and print function name.
* Coding style cleanup.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
The current driver probe() function assumes the sensor device to be
always present and gets executed every time if the driver is loaded,
but the appropriate hardware could not be present.
So, move the platform device creation as part of platform init code
and use the 'id_table' to check if the device is present or not.
Signed-off-by: Neelesh Gupta <neelegup@linux.vnet.ibm.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
In commit 9c62a68d13119a1ca9718381d97b0cb415ff4e9d ("netpoll:
Remove dead packet receive code (CONFIG_NETPOLL_TRAP)") this
Kconfig option was removed. So remove references to it from
all defconfigs as well.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux
Pull powerpc fixes from Michael Ellerman:
"Here are five fixes for you to pull please.
They're all CC'ed to stable except the "Fix PE state format" one which
went in this release"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux:
powerpc: 32 bit getcpu VDSO function uses 64 bit instructions
powerpc/powernv: Replace OPAL_DEASSERT_RESET with EEH_RESET_DEACTIVATE
powerpc/eeh: Fix PE state format
powerpc/pseries: Fix endiannes issue in RTAS call from xmon
powerpc/powernv: Fix the hmi event version check.
|
|
I used some 64 bit instructions when adding the 32 bit getcpu VDSO
function. Fix it.
Fixes: 18ad51dd342a ("powerpc: Add VDSO version of getcpu")
Cc: stable@vger.kernel.org
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
The flag passed to ioda_eeh_phb_reset() should be EEH_RESET_DEACTIVATE,
which is translated to OPAL_DEASSERT_RESET or something else by the
EEH backend accordingly.
The patch replaces OPAL_DEASSERT_RESET with EEH_RESET_DEACTIVATE for
ioda_eeh_phb_reset().
Cc: stable@vger.kernel.org
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Obviously I had wrong format given to the PE state output from
/sys/bus/pci/devices/xxxx/eeh_pe_state with some typoes, which
was introduced by commit 2013add4ce73. The patch fixes it up.
Fixes: 2013add4ce73 ("powerpc/eeh: Show hex prefix for PE state sysfs")
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
On pseries system (LPAR) xmon failed to enter when running in LE mode,
system is hunging. Inititating xmon will lead to such an output on the
console:
SysRq : Entering xmon
cpu 0x15: Vector: 0 at [c0000003f39ffb10]
pc: c00000000007ed7c: sysrq_handle_xmon+0x5c/0x70
lr: c00000000007ed7c: sysrq_handle_xmon+0x5c/0x70
sp: c0000003f39ffc70
msr: 8000000000009033
current = 0xc0000003fafa7180
paca = 0xc000000007d75e80 softe: 0 irq_happened: 0x01
pid = 14617, comm = bash
Bad kernel stack pointer fafb4b0 at eca7cc4
cpu 0x15: Vector: 300 (Data Access) at [c000000007f07d40]
pc: 000000000eca7cc4
lr: 000000000eca7c44
sp: fafb4b0
msr: 8000000000001000
dar: 10000000
dsisr: 42000000
current = 0xc0000003fafa7180
paca = 0xc000000007d75e80 softe: 0 irq_happened: 0x01
pid = 14617, comm = bash
cpu 0x15: Exception 300 (Data Access) in xmon, returning to main loop
xmon: WARNING: bad recursive fault on cpu 0x15
The root cause is that xmon is calling RTAS to turn off the surveillance
when entering xmon, and RTAS is requiring big endian parameters.
This patch is byte swapping the RTAS arguments when running in LE mode.
Cc: stable@vger.kernel.org
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
The current HMI event structure is an ABI and carries a version field to
accommodate future changes without affecting/rearranging current structure
members that are valid for previous versions.
The current version check "if (hmi_evt->version != OpalHMIEvt_V1)"
doesn't accomodate the fact that the version number may change in
future.
If firmware starts returning an HMI event with version > 1, this check
will fail and no HMI information will be printed on older kernels.
This patch fixes this issue.
Cc: stable@vger.kernel.org # 3.17+
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
[mpe: Reword changelog]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
The OF_RECONFIG notifier callback uses a different structure depending
on whether it is a node change or a property change. This is silly, and
not very safe. Rework the code to use the same data structure regardless
of the type of notifier.
Signed-off-by: Grant Likely <grant.likely@linaro.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Pantelis Antoniou <pantelis.antoniou@konsulko.com>
Cc: <linuxppc-dev@lists.ozlabs.org>
|
|
This prefixes all crypto module loading with "crypto-" so we never run
the risk of exposing module auto-loading to userspace via a crypto API,
as demonstrated by Mathias Krause:
https://lkml.org/lkml/2013/3/4/70
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
This is now fully replaced with the generic "no_64bit_msi" one
that is set by the respective drivers directly.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Instead of the arch specific quirk which we are deprecating
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: <stable@vger.kernel.org>
|
|
Instead of the arch specific quirk which we are deprecating
and that drivers don't understand.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: <stable@vger.kernel.org>
|
|
The PCI/MSI irq chip callbacks mask/unmask_msi_irq have been renamed
to pci_msi_mask/unmask_irq to mark them PCI specific. Rename all usage
sites. The conversion helper functions are kept around to avoid
conflicts in next and will be removed after merging into mainline.
Coccinelle assisted conversion. No functional change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: x86@kernel.org
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Murali Karicheri <m-karicheri2@ti.com>
Cc: Thierry Reding <thierry.reding@gmail.com>
Cc: Mohit Kumar <mohit.kumar@st.com>
Cc: Simon Horman <horms@verge.net.au>
Cc: Michal Simek <michal.simek@xilinx.com>
Cc: Yijing Wang <wangyijing@huawei.com>
|
|
Rename write_msi_msg() to pci_write_msi_msg() to mark it as PCI
specific.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Yingjoe Chen <yingjoe.chen@mediatek.com>
Cc: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Rename __read_msi_msg() to __pci_read_msi_msg() and kill unused
read_msi_msg(). It's a preparation to separate generic MSI code from
PCI core.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Yingjoe Chen <yingjoe.chen@mediatek.com>
Cc: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Conflicts:
drivers/net/ieee802154/fakehard.c
A bug fix went into 'net' for ieee802154/fakehard.c, which is removed
in 'net-next'.
Add build fix into the merge from Stephen Rothwell in openvswitch, the
logging macros take a new initial 'log' argument, a new call was added
in 'net' so when we merge that in here we have to explicitly add the
new 'log' arg to it else the build fails.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch fix spelling typo in printk and Kconfig within
various part of kernel sources.
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
|
current trivial.git base
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Although we are now selecting NO_BOOTMEM, we still have some traces of
bootmem lying around. That is because even with NO_BOOTMEM there is
still a shim that converts bootmem calls into memblock calls, but
ultimately we want to remove all traces of bootmem.
Most of the patch is conversions from alloc_bootmem() to
memblock_virt_alloc(). In general a call such as:
p = (struct foo *)alloc_bootmem(x);
Becomes:
p = memblock_virt_alloc(x, 0);
We don't need the cast because memblock_virt_alloc() returns a void *.
The alignment value of zero tells memblock to use the default alignment,
which is SMP_CACHE_BYTES, the same value alloc_bootmem() uses.
We remove a number of NULL checks on the result of
memblock_virt_alloc(). That is because memblock_virt_alloc() will panic
if it can't allocate, in exactly the same way as alloc_bootmem(), so the
NULL checks are and always have been redundant.
The memory returned by memblock_virt_alloc() is already zeroed, so we
remove several memsets of the result of memblock_virt_alloc().
Finally we convert a few uses of __alloc_bootmem(x, y, MAX_DMA_ADDRESS)
to just plain memblock_virt_alloc(). We don't use memblock_alloc_base()
because MAX_DMA_ADDRESS is ~0ul on powerpc, so limiting the allocation
to that is pointless, 16XB ought to be enough for anyone.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
nvram_pstore_info's buf_lock is not initialized before registering,
which is clearly incorrect.
It causes some strange behavior when trying to obtain the lock during
kdump process.
On a UP configuration, the console stopped for a couple of seconds, then
"lockup suspected" warning printed out, but then it continued to run.
So try lock fails, and lockup reported, but then arch_spin_lock()
passes.
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
[mpe: Edited changelog]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Reduce duplicated code by unifying
BPF_ALU | BPF_MOD | BPF_X and BPF_ALU | BPF_DIV | BPF_X
CC: Alexei Starovoitov<alexei.starovoitov@gmail.com>
CC: Daniel Borkmann<dborkman@redhat.com>
CC: Philippe Bergheaud<felix@linux.vnet.ibm.com>
Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The IOMMU-API gained support for a new iommu_map_sg
function. This causes compile failures on powerpc because
the function name is already globally used there.
This patch renames adds a ppc_ prefix to these functions to
solve the compile problem.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Scott says:
"Highlights include a bunch of 8xx optimizations, device tree bindings
for Freescale BMan, QMan, and FMan datapath components, misc device tree
updates, and inbound rio window support."
|
|
The commit 543c043cbae7 ("powerpc/fsl_msi: change the irq handler from
chained to normal") changes the msi cascade handler from chained to
normal. Since cascade handler must run in hard interrupt context, this
will cause kernel panic if we force threading of all the interrupt
handler via kernel command parameter 'threadirqs'. So mark the irq
handler IRQF_NO_THREAD explicitly.
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
|
|
As Freescale IFC controller has been moved to driver to driver/memory.
So enable memory driver in powerpc config
Signed-off-by: Prabhakar Kushwaha <prabhakar@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
|
|
On architectures with hardware broadcasting of TLB invalidation messages
, it makes sense to reduce the range of the mmu_gather structure when
unmapping page ranges based on the dirty address information passed to
tlb_remove_tlb_entry.
arm64 already does this by directly manipulating the start/end fields
of the gather structure, but this confuses the generic code which
does not expect these fields to change and can end up calculating
invalid, negative ranges when forcing a flush in zap_pte_range.
This patch moves the minimal range calculation out of the arm64 code
and into the generic implementation, simplifying zap_pte_range in the
process (which no longer needs to care about start/end, since they will
point to the appropriate ranges already). With the range being tracked
by core code, the need_flush flag is dropped in favour of checking that
the end of the range has actually been set.
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King - ARM Linux <linux@arm.linux.org.uk>
Cc: Michal Simek <monstr@monstr.eu>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
The patch implements the OPAL rtc driver that binds with the rtc
driver subsystem. The driver uses the platform device infrastructure
to probe the rtc device and register it to rtc class framework. The
'wakeup' is supported depending upon the property 'has-tpo' present
in the OF node. It provides a way to load the generic rtc driver in
in the absence of an OPAL driver.
The patch also moves the existing OPAL rtc get/set time interfaces to the
new driver and exposes the necessary OPAL calls using EXPORT_SYMBOL_GPL.
Test results:
-------------
Host:
[root@tul169p1 ~]# ls -l /sys/class/rtc/
total 0
lrwxrwxrwx 1 root root 0 Oct 14 03:07 rtc0 -> ../../devices/opal-rtc/rtc/rtc0
[root@tul169p1 ~]# cat /sys/devices/opal-rtc/rtc/rtc0/time
08:10:07
[root@tul169p1 ~]# echo `date '+%s' -d '+ 2 minutes'` > /sys/class/rtc/rtc0/wakealarm
[root@tul169p1 ~]# cat /sys/class/rtc/rtc0/wakealarm
1413274345
[root@tul169p1 ~]#
FSP:
$ smgr mfgState
standby
$ rtim timeofday
System time is valid: 2014/10/14 08:12:04.225115
$ smgr mfgState
ipling
$
CC: devicetree@vger.kernel.org
CC: tglx@linutronix.de
CC: rtc-linux@googlegroups.com
CC: a.zummo@towertech.it
Signed-off-by: Neelesh Gupta <neelegup@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Back in 2009 we merged 501cb16d3cfd "Randomise PIEs", which added support for
randomizing PIE (Position Independent Executable) binaries.
That commit added randomize_et_dyn(), which correctly randomized the addresses,
but failed to honor PF_RANDOMIZE. That means it was not possible to disable PIE
randomization via the personality flag, or /proc/sys/kernel/randomize_va_space.
Since then there has been generic support for PIE randomization added to
binfmt_elf.c, selectable via ARCH_BINFMT_ELF_RANDOMIZE_PIE.
Enabling that allows us to drop randomize_et_dyn(), which means we start
honoring PF_RANDOMIZE correctly.
It also causes a fairly major change to how we layout PIE binaries.
Currently we will place the binary at 512MB-520MB for 32 bit binaries, or
512MB-1.5GB for 64 bit binaries, eg:
$ cat /proc/$$/maps
4e550000-4e580000 r-xp 00000000 08:02 129813 /bin/dash
4e580000-4e590000 rw-p 00020000 08:02 129813 /bin/dash
10014110000-10014140000 rw-p 00000000 00:00 0 [heap]
3fffaa3f0000-3fffaa5a0000 r-xp 00000000 08:02 921 /lib/powerpc64le-linux-gnu/libc-2.19.so
3fffaa5a0000-3fffaa5b0000 rw-p 001a0000 08:02 921 /lib/powerpc64le-linux-gnu/libc-2.19.so
3fffaa5c0000-3fffaa5d0000 rw-p 00000000 00:00 0
3fffaa5d0000-3fffaa5f0000 r-xp 00000000 00:00 0 [vdso]
3fffaa5f0000-3fffaa620000 r-xp 00000000 08:02 1246 /lib/powerpc64le-linux-gnu/ld-2.19.so
3fffaa620000-3fffaa630000 rw-p 00020000 08:02 1246 /lib/powerpc64le-linux-gnu/ld-2.19.so
3ffffc340000-3ffffc370000 rw-p 00000000 00:00 0 [stack]
With this commit applied we don't do any special randomisation for the binary,
and instead rely on mmap randomisation. This means the binary ends up at high
addresses, eg:
$ cat /proc/$$/maps
3fff99820000-3fff999d0000 r-xp 00000000 08:02 921 /lib/powerpc64le-linux-gnu/libc-2.19.so
3fff999d0000-3fff999e0000 rw-p 001a0000 08:02 921 /lib/powerpc64le-linux-gnu/libc-2.19.so
3fff999f0000-3fff99a00000 rw-p 00000000 00:00 0
3fff99a00000-3fff99a20000 r-xp 00000000 00:00 0 [vdso]
3fff99a20000-3fff99a50000 r-xp 00000000 08:02 1246 /lib/powerpc64le-linux-gnu/ld-2.19.so
3fff99a50000-3fff99a60000 rw-p 00020000 08:02 1246 /lib/powerpc64le-linux-gnu/ld-2.19.so
3fff99a60000-3fff99a90000 r-xp 00000000 08:02 129813 /bin/dash
3fff99a90000-3fff99aa0000 rw-p 00020000 08:02 129813 /bin/dash
3fffc3de0000-3fffc3e10000 rw-p 00000000 00:00 0 [stack]
3fffc55e0000-3fffc5610000 rw-p 00000000 00:00 0 [heap]
Although this should be OK, it's possible it might break badly written
binaries that make assumptions about the address space layout.
Signed-off-by: Vineeth Vijayan <vvijayan@mvista.com>
[mpe: Rewrite changelog]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|