Age | Commit message (Collapse) | Author |
|
local_irq_disable/enable()
Ran into this while looking at some new crypto code using FPU
hitting a WARN_ON_ONCE(!irq_fpu_usable()) in the kernel_fpu_begin()
on a x86 kernel that uses the new eagerfpu model. In short, current eagerfpu
changes return 0 for interrupted_kernel_fpu_idle() and the in_interrupt()
thinks it is in the interrupt context because of the local_bh_disable().
Thus resulting in the WARN_ON().
Remove the local_bh_disable/enable() calls around the existing
local_irq_disable/enable() calls. local_irq_disable/enable() already
disables the BH.
[ If there are any other legitimate users calling kernel_fpu_begin() from
the process context but with BH disabled, then we can look into fixing the
irq_fpu_usable() in future. ]
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Convert a nonnegative error return code to a negative one, as returned
elsewhere in the function.
A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)
// <smpl>
(
if@p1 (\(ret < 0\|ret != 0\))
{ ... return ret; }
|
ret@p1 = 0
)
... when != ret = e1
when != &ret
*if(...)
{
... when != ret = e2
when forall
return ret;
}
// </smpl>
Signed-off-by: Peter Senna Tschudin <peter.senna@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Convert a nonnegative error return code to a negative one, as returned
elsewhere in the function.
A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)
// <smpl>
(
if@p1 (\(ret < 0\|ret != 0\))
{ ... return ret; }
|
ret@p1 = 0
)
... when != ret = e1
when != &ret
*if(...)
{
... when != ret = e2
when forall
return ret;
}
// </smpl>
Signed-off-by: Peter Senna Tschudin <peter.senna@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Convert a nonnegative error return code to a negative one, as returned
elsewhere in the function.
A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)
// <smpl>
(
if@p1 (\(ret < 0\|ret != 0\))
{ ... return ret; }
|
ret@p1 = 0
)
... when != ret = e1
when != &ret
*if(...)
{
... when != ret = e2
when forall
return ret;
}
// </smpl>
Signed-off-by: Peter Senna Tschudin <peter.senna@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Convert a nonnegative error return code to a negative one, as returned
elsewhere in the function.
A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)
// <smpl>
(
if@p1 (\(ret < 0\|ret != 0\))
{ ... return ret; }
|
ret@p1 = 0
)
... when != ret = e1
when != &ret
*if(...)
{
... when != ret = e2
when forall
return ret;
}
// </smpl>
Signed-off-by: Peter Senna Tschudin <peter.senna@gmail.com>
Reviewed-by: Arun Murthy <arunrmurthy83@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
According to SEC v5.0-v5.3 reference manuals.
Signed-off-by: Horia Geanta <horia.geanta@freescale.com>
Acked-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
In current driver, everytime we need to access the rng clock
,ie to enable or disable it, a call to clk_get is done.
This is not correct and the preferred way is to provide a rng data structure
that could be used for accessing rng resources.
Acked-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Adapt clocks to the new i.mx clock framework and fix the following warning:
------------[ cut here ]------------
WARNING: at drivers/clk/clk.c:511 __clk_enable+0x9c/0xac()
Modules linked in:
Backtrace:
[<800124c8>] (dump_backtrace+0x0/0x10c) from [<804172dc>] (dump_stack+0x18/0x1c)
r7:00000009 r6:000001ff r5:8032cb50 r4:00000000
[<804172c4>] (dump_stack+0x0/0x1c) from [<80021834>] (warn_slowpath_common+0x54)
[<800217e0>] (warn_slowpath_common+0x0/0x6c) from [<80021870>] (warn_slowpath_n)
r9:80581cac r8:8700a9c0 r7:805ab070 r6:80000013 r5:806133d4
r4:8700a9c0
[<8002184c>] (warn_slowpath_null+0x0/0x2c) from [<8032cb50>] (__clk_enable+0x9c)
[<8032cab4>] (__clk_enable+0x0/0xac) from [<8032cb88>] (clk_enable+0x28/0x44)
r5:806133d4 r4:8700a9c0
[<8032cb60>] (clk_enable+0x0/0x44) from [<80560f14>] (mxc_rnga_probe+0x68/0x164)
r7:805ab070 r6:8706ec00 r5:80611314 r4:00000000
[<80560eac>] (mxc_rnga_probe+0x0/0x164) from [<8025914c>] (platform_drv_probe+0)
[<8025912c>] (platform_drv_probe+0x0/0x24) from [<80257c7c>] (driver_probe_devi)
[<80257bfc>] (driver_probe_device+0x0/0x204) from [<80257e94>] (__driver_attach)
r9:80581cac r8:0000008e r7:00000000 r6:8706ec3c r5:805ab070
r4:8706ec08
[<80257e00>] (__driver_attach+0x0/0x98) from [<8025642c>] (bus_for_each_dev+0x6)
r7:00000000 r6:80257e00 r5:87035e98 r4:805ab070
[<802563c4>] (bus_for_each_dev+0x0/0x94) from [<80257adc>] (driver_attach+0x20/)
r7:00000000 r6:873f2380 r5:805ab338 r4:805ab070
[<80257abc>] (driver_attach+0x0/0x28) from [<80256d50>] (bus_add_driver+0x18c/0)
[<80256bc4>] (bus_add_driver+0x0/0x268) from [<802584c4>] (driver_register+0x80)
[<80258444>] (driver_register+0x0/0x134) from [<802594f4>] (platform_driver_reg)
r7:00000000 r6:805c2e00 r5:00000007 r4:805ab05c
[<802594a8>] (platform_driver_register+0x0/0x60) from [<80259528>] (platform_dr)
[<80259508>] (platform_driver_probe+0x0/0xa4) from [<80560ea0>] (mod_init+0x18/)
r7:00000000 r6:805c2e00 r5:00000007 r4:87034000
[<80560e88>] (mod_init+0x0/0x24) from [<800086b4>] (do_one_initcall+0x40/0x194)
[<80008674>] (do_one_initcall+0x0/0x194) from [<8053d3f4>] (kernel_init+0xfc/0x)
[<8053d2f8>] (kernel_init+0x0/0x1cc) from [<80027190>] (do_exit+0x0/0x7ec)
---[ end trace 4198eed02050f461 ]---
Acked-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Support for ESNs (extended sequence numbers).
Tested with strongswan by connecting back-to-back P1010RDB with P2020RDB.
Signed-off-by: Horia Geanta <horia.geanta@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
.cra_list initialization is unneeded and have been removed from all other
crypto modules except 842.
Cc: Robert Jennings <rcj@linux.vnet.ibm.com>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Acked-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
This reverts commit e6ccc727f30a02670f6a00df6d548942bc988f43.
Above commit caused performance regression for CAST6. Reverting gives
following increase in tcrypt speed tests (revert-vs-old ratios).
AMD Phenom II X6 1055T, x86-64:
size ecb cbc ctr lrw xts
enc dec enc dec enc dec enc dec enc dec
16b 1.15x 1.17x 1.16x 1.17x 1.16x 1.16x 1.14x 1.19x 1.05x 1.07x
64b 1.19x 1.23x 1.20x 1.22x 1.19x 1.19x 1.16x 1.24x 1.12x 1.12x
256b 1.21x 1.24x 1.22x 1.24x 1.20x 1.20x 1.17x 1.21x 1.16x 1.14x
1kb 1.21x 1.25x 1.22x 1.24x 1.21x 1.21x 1.18x 1.22x 1.17x 1.15x
8kb 1.21x 1.25x 1.22x 1.24x 1.21x 1.21x 1.18x 1.22x 1.18x 1.15x
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Fix "symbol 'x' was not declared. Should it be static?" sparse warnings.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Fix "symbol 'x' was not declared. Should it be static?" sparse warnings.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Fix "constant 0xXXXXXXXXXXXXXXXX is so big it's unsigned long" sparse warnings.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
be static?)
Fix "symbol 'x' was not declared. Should it be static?" sparse warnings.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Patch replaces 'movb' instructions with 'movzbl' to break false register
dependencies, interleaves instructions better for out-of-order scheduling
and merges constant 16-bit rotation with round-key variable rotation.
tcrypt ECB results:
Intel Core i5-2450M:
size old-vs-new new-vs-generic old-vs-generic
enc dec enc dec enc dec
256 1.13x 1.19x 2.05x 2.17x 1.82x 1.82x
1k 1.18x 1.21x 2.26x 2.33x 1.93x 1.93x
8k 1.19x 1.19x 2.32x 2.33x 1.95x 1.95x
[v2]
- Do instruction interleaving another way to avoid adding new FPU<=>CPU
register moves as these cause performance drop on Bulldozer.
- Improvements to round-key variable rotation handling.
- Further interleaving improvements for better out-of-order scheduling.
Cc: Johannes Goetzfried <Johannes.Goetzfried@informatik.stud.uni-erlangen.de>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Patch replaces 'movb' instructions with 'movzbl' to break false register
dependencies, interleaves instructions better for out-of-order scheduling
and merges constant 16-bit rotation with round-key variable rotation.
tcrypt ECB results (128bit key):
Intel Core i5-2450M:
size old-vs-new new-vs-generic old-vs-generic
enc dec enc dec enc dec
256 1.18x 1.18x 2.45x 2.47x 2.08x 2.10x
1k 1.20x 1.20x 2.73x 2.73x 2.28x 2.28x
8k 1.20x 1.19x 2.73x 2.73x 2.28x 2.29x
[v2]
- Do instruction interleaving another way to avoid adding new FPU<=>CPU
register moves as these cause performance drop on Bulldozer.
- Improvements to round-key variable rotation handling.
- Further interleaving improvements for better out-of-order scheduling.
Cc: Johannes Goetzfried <Johannes.Goetzfried@informatik.stud.uni-erlangen.de>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Patch replaces 'movb' instructions with 'movzbl' to break false register
dependencies and interleaves instructions better for out-of-order scheduling.
Tested on Intel Core i5-2450M and AMD FX-8100.
tcrypt ECB results:
Intel Core i5-2450M:
size old-vs-new new-vs-3way old-vs-3way
enc dec enc dec enc dec
256 1.12x 1.13x 1.36x 1.37x 1.21x 1.22x
1k 1.14x 1.14x 1.48x 1.49x 1.29x 1.31x
8k 1.14x 1.14x 1.50x 1.52x 1.32x 1.33x
AMD FX-8100:
size old-vs-new new-vs-3way old-vs-3way
enc dec enc dec enc dec
256 1.10x 1.11x 1.01x 1.01x 0.92x 0.91x
1k 1.11x 1.12x 1.08x 1.07x 0.97x 0.96x
8k 1.11x 1.13x 1.10x 1.08x 0.99x 0.97x
[v2]
- Do instruction interleaving another way to avoid adding new FPU<=>CPU
register moves as these cause performance drop on Bulldozer.
- Further interleaving improvements for better out-of-order scheduling.
Tested-by: Borislav Petkov <bp@alien8.de>
Cc: Johannes Goetzfried <Johannes.Goetzfried@informatik.stud.uni-erlangen.de>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
module_pci_driver makes the code simpler by eliminating
module_init and module_exit calls.
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
From: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Remove duplicated include.
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
use true/false for bool, fix code alignment, and fix two allocs with
no test.
Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Signed-off-by: Devendra Naga <develkernel412222@gmail.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Add assembler versions of AES and SHA1 for ARM platforms. This has provided
up to a 50% improvement in IPsec/TCP throughout for tunnels using AES128/SHA1.
Platform CPU SPeed Endian Before (bps) After (bps) Improvement
IXP425 533 MHz big 11217042 15566294 ~38%
KS8695 166 MHz little 3828549 5795373 ~51%
Signed-off-by: David McCullough <ucdevel@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Add a MAINTAINERS entry for the IBM Power in-Nest Crypto Acceleators
driver.
Signed-off-by: Kent Yoder <key@linux.vnet.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Support for ESNs (extended sequence numbers).
Tested with strongswan on a P2020RDB back-to-back setup.
Extracted from /etc/ipsec.conf:
esp=aes-sha1-esn-modp4096!
Signed-off-by: Horia Geanta <horia.geanta@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Generate a link table in case assoc data is a scatterlist.
While at it, add support for handling non-contiguous assoc data and iv.
Signed-off-by: Horia Geanta <horia.geanta@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
It's more natural to think of these vars as bool rather than int.
Signed-off-by: Horia Geanta <horia.geanta@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
talitos_edesc_alloc does not need hash_result param.
Checking whether dst scatterlist is NULL or not is all that is required.
Signed-off-by: Horia Geanta <horia.geanta@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
For IPsec encryption, in the case when:
-the input buffer is fragmented (edesc->src_nents > 0)
-the output buffer is not fragmented (edesc->dst_nents = 0)
the ICV is not output in the link table, but after the encrypted payload.
Copying the ICV must be avoided in this case; consequently the condition
edesc->dma_len > 0 must be more specific, i.e. must depend on the type
of the output buffer - fragmented or not.
Testing was performed by modifying testmgr to support src != dst,
since currently native kernel IPsec does in-place encryption
(src == dst).
Signed-off-by: Horia Geanta <horia.geanta@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
the entry points and geniv definitions for all aead,
ablkcipher, and hash algorithms are all common; move them to a
single assignment in talitos_alg_alloc().
This assumes it's ok to assign a setkey() on non-hmac algs.
Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
lighten driver_algs[] by moving them to talitos_alg_alloc().
Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Commit bd3c7b5c2aba ("crypto: atmel - add Atmel AES driver") possibly
has a typo error of adding an extra CONFIG_.
CC: Nicolas Royer <nicolas@eukrea.com>
Signed-off-by: Tushar Behera <tushar.behera@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
devm_kfree and devm_iounmap should not have to be explicitly used.
The semantic patch that fixes this problem is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@@
expression x,d;
@@
x = devm_kzalloc(...)
...
?-devm_kfree(d,x);
// </smpl>
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
virt_to_abs() is just a wrapper around __pa(), use __pa() directly.
We should be including <asm/page.h> to get __pa(). abs_addr.h will be
removed shortly so drop that.
We were getting of.h via abs_addr.h so we need to include that directly.
Having done all that, clean up the ordering of the includes.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
AES-NI hardware pipelines
Use parallel LRW and XTS encryption facilities to better utilize AES-NI
hardware pipelines and gain extra performance.
Tcrypt benchmark results (async), old vs new ratios:
Intel Core i5-2450M CPU (fam: 6, model: 42, step: 7)
aes:128bit
lrw:256bit xts:256bit
size lrw-enc lrw-dec xts-dec xts-dec
16B 0.99x 1.00x 1.22x 1.19x
64B 1.38x 1.50x 1.58x 1.61x
256B 2.04x 2.02x 2.27x 2.29x
1024B 2.56x 2.54x 2.89x 2.92x
8192B 2.85x 2.99x 3.40x 3.23x
aes:192bit
lrw:320bit xts:384bit
size lrw-enc lrw-dec xts-dec xts-dec
16B 1.08x 1.08x 1.16x 1.17x
64B 1.48x 1.54x 1.59x 1.65x
256B 2.18x 2.17x 2.29x 2.28x
1024B 2.67x 2.67x 2.87x 3.05x
8192B 2.93x 2.84x 3.28x 3.33x
aes:256bit
lrw:348bit xts:512bit
size lrw-enc lrw-dec xts-dec xts-dec
16B 1.07x 1.07x 1.18x 1.19x
64B 1.56x 1.56x 1.70x 1.71x
256B 2.22x 2.24x 2.46x 2.46x
1024B 2.76x 2.77x 3.13x 3.05x
8192B 2.99x 3.05x 3.40x 3.30x
Cc: Huang Ying <ying.huang@intel.com>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Reviewed-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
This patch add the 842 cryptographic API driver that
submits compression requests to the 842 hardware compression
accelerator driver (nx-compress).
If the hardware accelerator goes offline for any reason
(dynamic disable, migration, etc...), this driver will use LZO
as a software failover for all future compression requests.
For decompression requests, the 842 hardware driver contains
a software implementation of the 842 decompressor to support
the decompression of data that was compressed before the accelerator
went offline.
Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
This patch adds the driver for interacting with the 842
compression accelerator on IBM Power7+ systems.
The device is a child of the Platform Facilities Option (PFO)
and shows up as a child of the IBM VIO bus.
The compression/decompression API takes the same arguments
as existing compression methods like lzo and deflate. The 842
hardware operates on 4K hardware pages and the driver breaks up
input on 4K boundaries to submit it to the hardware accelerator.
Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
This patch enables compression engine support in the
architecture vector. This causes the Power hypervisor
to allow access to the nx comrpession accelerator.
Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
This patch creates a new submenu for the NX cryptographic
hardware accelerator and breaks the NX options into their own
Kconfig file under drivers/crypto/nx/Kconfig.
This will permit additional NX functionality to be easily
and more cleanly added in the future without touching
drivers/crypto/Makefile|Kconfig.
Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
SHARE_WAIT, whilst more optimal for association-less crypto,
has the ability to start thrashing the CCB descriptor/key
caches, given high levels of traffic across multiple security
associations (and thus keys).
Switch to using the SERIAL sharing type, which prefers
the last used CCB for the SA. On a 2-DECO platform
such as the P3041, this can improve performance by
about 3.7%.
Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
In some device trees of previous version, there were string "fsl,sec4.0".
To be backward compatible with device trees, we first check "fsl,sec-v4.0",
if it fails, then check for "fsl,sec4.0".
Signed-off-by: Shengzhou Liu <Shengzhou.Liu@freescale.com>
extended to include new hash and rng code, which was omitted from
the previous version of this patch during a rebase of the SDK
version.
Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
commit "crypto: caam - use non-irq versions of spinlocks for job rings"
made two bad assumptions:
(a) The caam_jr_enqueue lock isn't used in softirq context.
Not true: jr_enqueue can be interrupted by an incoming net
interrupt and the received packet may be sent for encryption,
via caam_jr_enqueue in softirq context, thereby inducing a
deadlock.
This is evidenced when running netperf over an IPSec tunnel
between two P4080's, with spinlock debugging turned on:
[ 892.092569] BUG: spinlock lockup on CPU#7, netperf/10634, e8bf5f70
[ 892.098747] Call Trace:
[ 892.101197] [eff9fc10] [c00084c0] show_stack+0x48/0x15c (unreliable)
[ 892.107563] [eff9fc50] [c0239c2c] do_raw_spin_lock+0x16c/0x174
[ 892.113399] [eff9fc80] [c0596494] _raw_spin_lock+0x3c/0x50
[ 892.118889] [eff9fc90] [c0445e74] caam_jr_enqueue+0xf8/0x250
[ 892.124550] [eff9fcd0] [c044a644] aead_decrypt+0x6c/0xc8
[ 892.129625] BUG: spinlock lockup on CPU#5, swapper/5/0, e8bf5f70
[ 892.129629] Call Trace:
[ 892.129637] [effa7c10] [c00084c0] show_stack+0x48/0x15c (unreliable)
[ 892.129645] [effa7c50] [c0239c2c] do_raw_spin_lock+0x16c/0x174
[ 892.129652] [effa7c80] [c0596494] _raw_spin_lock+0x3c/0x50
[ 892.129660] [effa7c90] [c0445e74] caam_jr_enqueue+0xf8/0x250
[ 892.129666] [effa7cd0] [c044a644] aead_decrypt+0x6c/0xc8
[ 892.129674] [effa7d00] [c0509724] esp_input+0x178/0x334
[ 892.129681] [effa7d50] [c0519778] xfrm_input+0x77c/0x818
[ 892.129688] [effa7da0] [c050e344] xfrm4_rcv_encap+0x20/0x30
[ 892.129697] [effa7db0] [c04b90c8] ip_local_deliver+0x190/0x408
[ 892.129703] [effa7de0] [c04b966c] ip_rcv+0x32c/0x898
[ 892.129709] [effa7e10] [c048b998] __netif_receive_skb+0x27c/0x4e8
[ 892.129715] [effa7e80] [c048d744] netif_receive_skb+0x4c/0x13c
[ 892.129726] [effa7eb0] [c03c28ac] _dpa_rx+0x1a8/0x354
[ 892.129732] [effa7ef0] [c03c2ac4] ingress_rx_default_dqrr+0x6c/0x108
[ 892.129742] [effa7f10] [c0467ae0] qman_poll_dqrr+0x170/0x1d4
[ 892.129748] [effa7f40] [c03c153c] dpaa_eth_poll+0x20/0x94
[ 892.129754] [effa7f60] [c048dbd0] net_rx_action+0x13c/0x1f4
[ 892.129763] [effa7fa0] [c003d1b8] __do_softirq+0x108/0x1b0
[ 892.129769] [effa7ff0] [c000df58] call_do_softirq+0x14/0x24
[ 892.129775] [ebacfe70] [c0004868] do_softirq+0xd8/0x104
[ 892.129780] [ebacfe90] [c003d5a4] irq_exit+0xb8/0xd8
[ 892.129786] [ebacfea0] [c0004498] do_IRQ+0xa4/0x1b0
[ 892.129792] [ebacfed0] [c000fad8] ret_from_except+0x0/0x18
[ 892.129798] [ebacff90] [c0009010] cpu_idle+0x94/0xf0
[ 892.129804] [ebacffb0] [c059ff88] start_secondary+0x42c/0x430
[ 892.129809] [ebacfff0] [c0001e28] __secondary_start+0x30/0x84
[ 892.281474]
[ 892.282959] [eff9fd00] [c0509724] esp_input+0x178/0x334
[ 892.288186] [eff9fd50] [c0519778] xfrm_input+0x77c/0x818
[ 892.293499] [eff9fda0] [c050e344] xfrm4_rcv_encap+0x20/0x30
[ 892.299074] [eff9fdb0] [c04b90c8] ip_local_deliver+0x190/0x408
[ 892.304907] [eff9fde0] [c04b966c] ip_rcv+0x32c/0x898
[ 892.309872] [eff9fe10] [c048b998] __netif_receive_skb+0x27c/0x4e8
[ 892.315966] [eff9fe80] [c048d744] netif_receive_skb+0x4c/0x13c
[ 892.321803] [eff9feb0] [c03c28ac] _dpa_rx+0x1a8/0x354
[ 892.326855] [eff9fef0] [c03c2ac4] ingress_rx_default_dqrr+0x6c/0x108
[ 892.333212] [eff9ff10] [c0467ae0] qman_poll_dqrr+0x170/0x1d4
[ 892.338872] [eff9ff40] [c03c153c] dpaa_eth_poll+0x20/0x94
[ 892.344271] [eff9ff60] [c048dbd0] net_rx_action+0x13c/0x1f4
[ 892.349846] [eff9ffa0] [c003d1b8] __do_softirq+0x108/0x1b0
[ 892.355338] [eff9fff0] [c000df58] call_do_softirq+0x14/0x24
[ 892.360910] [e7169950] [c0004868] do_softirq+0xd8/0x104
[ 892.366135] [e7169970] [c003d5a4] irq_exit+0xb8/0xd8
[ 892.371101] [e7169980] [c0004498] do_IRQ+0xa4/0x1b0
[ 892.375979] [e71699b0] [c000fad8] ret_from_except+0x0/0x18
[ 892.381466] [e7169a70] [c0445e74] caam_jr_enqueue+0xf8/0x250
[ 892.387127] [e7169ab0] [c044ad4c] aead_givencrypt+0x6ac/0xa70
[ 892.392873] [e7169b20] [c050a0b8] esp_output+0x2b4/0x570
[ 892.398186] [e7169b80] [c0519b9c] xfrm_output_resume+0x248/0x7c0
[ 892.404194] [e7169bb0] [c050e89c] xfrm4_output_finish+0x18/0x28
[ 892.410113] [e7169bc0] [c050e8f4] xfrm4_output+0x48/0x98
[ 892.415427] [e7169bd0] [c04beac0] ip_local_out+0x48/0x98
[ 892.420740] [e7169be0] [c04bec7c] ip_queue_xmit+0x16c/0x490
[ 892.426314] [e7169c10] [c04d6128] tcp_transmit_skb+0x35c/0x9a4
[ 892.432147] [e7169c70] [c04d6f98] tcp_write_xmit+0x200/0xa04
[ 892.437808] [e7169cc0] [c04c8ccc] tcp_sendmsg+0x994/0xcec
[ 892.443213] [e7169d40] [c04eebfc] inet_sendmsg+0xd0/0x164
[ 892.448617] [e7169d70] [c04792f8] sock_sendmsg+0x8c/0xbc
[ 892.453931] [e7169e40] [c047aecc] sys_sendto+0xc0/0xfc
[ 892.459069] [e7169f10] [c047b934] sys_socketcall+0x110/0x25c
[ 892.464729] [e7169f40] [c000f480] ret_from_syscall+0x0/0x3c
(b) since the caam_jr_dequeue lock is only used in bh context,
then semantically it should use _bh spin_lock types. spin_lock_bh
semantics are to disable back-halves, and used when a lock is shared
between softirq (bh) context and process and/or h/w IRQ context.
Since the lock is only used within softirq context, and this tasklet
is atomic, there is no need to do the additional work to disable
back halves.
This patch adds back-half disabling protection to caam_jr_enqueue
spin_locks to fix (a), and drops it from caam_jr_dequeue to fix (b).
Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
This patch adds a x86_64/avx assembler implementation of the Cast6 block
cipher. The implementation processes eight blocks in parallel (two 4 block
chunk AVX operations). The table-lookups are done in general-purpose registers.
For small blocksizes the functions from the generic module are called. A good
performance increase is provided for blocksizes greater or equal to 128B.
Patch has been tested with tcrypt and automated filesystem tests.
Tcrypt benchmark results:
Intel Core i5-2500 CPU (fam:6, model:42, step:7)
cast6-avx-x86_64 vs. cast6-generic
128bit key: (lrw:256bit) (xts:256bit)
size ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
16B 0.97x 1.00x 1.01x 1.01x 0.99x 0.97x 0.98x 1.01x 0.96x 0.98x
64B 0.98x 0.99x 1.02x 1.01x 0.99x 1.00x 1.01x 0.99x 1.00x 0.99x
256B 1.77x 1.84x 0.99x 1.85x 1.77x 1.77x 1.70x 1.74x 1.69x 1.72x
1024B 1.93x 1.95x 0.99x 1.96x 1.93x 1.93x 1.84x 1.85x 1.89x 1.87x
8192B 1.91x 1.95x 0.99x 1.97x 1.95x 1.91x 1.86x 1.87x 1.93x 1.90x
256bit key: (lrw:384bit) (xts:512bit)
size ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
16B 0.97x 0.99x 1.02x 1.01x 0.98x 0.99x 1.00x 1.00x 0.98x 0.98x
64B 0.98x 0.99x 1.01x 1.00x 1.00x 1.00x 1.01x 1.01x 0.97x 1.00x
256B 1.77x 1.83x 1.00x 1.86x 1.79x 1.78x 1.70x 1.76x 1.71x 1.69x
1024B 1.92x 1.95x 0.99x 1.96x 1.93x 1.93x 1.83x 1.86x 1.89x 1.87x
8192B 1.94x 1.95x 0.99x 1.97x 1.95x 1.95x 1.87x 1.87x 1.93x 1.91x
Signed-off-by: Johannes Goetzfried <Johannes.Goetzfried@informatik.stud.uni-erlangen.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
New ECB, CBC, CTR, LRW and XTS testvectors for cast6. We need larger
testvectors to check parallel code paths in the optimized implementation. Tests
have also been added to the tcrypt module.
Signed-off-by: Johannes Goetzfried <Johannes.Goetzfried@informatik.stud.uni-erlangen.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Rename cast6 module to cast6_generic to allow autoloading of optimized
implementations. Generic functions and s-boxes are exported to be able to use
them within optimized implementations.
Signed-off-by: Johannes Goetzfried <Johannes.Goetzfried@informatik.stud.uni-erlangen.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
This patch adds a x86_64/avx assembler implementation of the Cast5 block
cipher. The implementation processes sixteen blocks in parallel (four 4 block
chunk AVX operations). The table-lookups are done in general-purpose registers.
For small blocksizes the functions from the generic module are called. A good
performance increase is provided for blocksizes greater or equal to 128B.
Patch has been tested with tcrypt and automated filesystem tests.
Tcrypt benchmark results:
Intel Core i5-2500 CPU (fam:6, model:42, step:7)
cast5-avx-x86_64 vs. cast5-generic
64bit key:
size ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec
16B 0.99x 0.99x 1.00x 1.00x 1.02x 1.01x
64B 1.00x 1.00x 0.98x 1.00x 1.01x 1.02x
256B 2.03x 2.01x 0.95x 2.11x 2.12x 2.13x
1024B 2.30x 2.24x 0.95x 2.29x 2.35x 2.35x
8192B 2.31x 2.27x 0.95x 2.31x 2.39x 2.39x
128bit key:
size ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec
16B 0.99x 0.99x 1.00x 1.00x 1.01x 1.01x
64B 1.00x 1.00x 0.98x 1.01x 1.02x 1.01x
256B 2.17x 2.13x 0.96x 2.19x 2.19x 2.19x
1024B 2.29x 2.32x 0.95x 2.34x 2.37x 2.38x
8192B 2.35x 2.32x 0.95x 2.35x 2.39x 2.39x
Signed-off-by: Johannes Goetzfried <Johannes.Goetzfried@informatik.stud.uni-erlangen.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
New ECB, CBC and CTR testvectors for cast5. We need larger testvectors to check
parallel code paths in the optimized implementation. Tests have also been added
to the tcrypt module.
Signed-off-by: Johannes Goetzfried <Johannes.Goetzfried@informatik.stud.uni-erlangen.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Rename cast5 module to cast5_generic to allow autoloading of optimized
implementations. Generic functions and s-boxes are exported to be able to use
them within optimized implementations.
Signed-off-by: Johannes Goetzfried <Johannes.Goetzfried@informatik.stud.uni-erlangen.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Initialization of cra_list is currently mixed, most ciphers initialize this
field and most shashes do not. Initialization however is not needed at all
since cra_list is initialized/overwritten in __crypto_register_alg() with
list_add(). Therefore perform cleanup to remove all unneeded initializations
of this field in 'arch/s390/crypto/'
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: linux-s390@vger.kernel.org
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Initialization of cra_list is currently mixed, most ciphers initialize this
field and most shashes do not. Initialization however is not needed at all
since cra_list is initialized/overwritten in __crypto_register_alg() with
list_add(). Therefore perform cleanup to remove all unneeded initializations
of this field in 'crypto/drivers/'.
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: linux-geode@lists.infradead.org
Cc: Michal Ludvig <michal@logix.cz>
Cc: Dmitry Kasatkin <dmitry.kasatkin@nokia.com>
Cc: Varun Wadekar <vwadekar@nvidia.com>
Cc: Eric Bénard <eric@eukrea.com>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Acked-by: Kent Yoder <key@linux.vnet.ibm.com>
Acked-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|