Age | Commit message (Collapse) | Author |
|
So far, only net_device * could be passed along with netdevice notifier
event. This patch provides a possibility to pass custom structure
able to provide info that event listener needs to know.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
v2->v3: fix typo on simeth
shortened dev_getter
shortened notifier_info struct name
v1->v2: fix notifier_call parameter in call_netdevice_notifier()
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The netpoll_rx_disable() will always return 0, it is no use and looks wordy,
so remove the unnecessary code and get rid of it in _dev_open and _dev_close.
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In the case where a non-MPLS packet is received and an MPLS stack is
added it may well be the case that the original skb is GSO but the
NIC used for transmit does not support GSO of MPLS packets.
The aim of this code is to provide GSO in software for MPLS packets
whose skbs are GSO.
SKB Usage:
When an implementation adds an MPLS stack to a non-MPLS packet it should do
the following to skb metadata:
* Set skb->inner_protocol to the old non-MPLS ethertype of the packet.
skb->inner_protocol is added by this patch.
* Set skb->protocol to the new MPLS ethertype of the packet.
* Set skb->network_header to correspond to the
end of the L3 header, including the MPLS label stack.
I have posted a patch, "[PATCH v3.29] datapath: Add basic MPLS support to
kernel" which adds MPLS support to the kernel datapath of Open vSwtich.
That patch sets the above requirements in datapath/actions.c:push_mpls()
and was used to exercise this code. The datapath patch is against the Open
vSwtich tree but it is intended that it be added to the Open vSwtich code
present in the mainline Linux kernel at some point.
Features:
I believe that the approach that I have taken is at least partially
consistent with the handling of other protocols. Jesse, I understand that
you have some ideas here. I am more than happy to change my implementation.
This patch adds dev->mpls_features which may be used by devices
to advertise features supported for MPLS packets.
A new NETIF_F_MPLS_GSO feature is added for devices which support
hardware MPLS GSO offload. Currently no devices support this
and MPLS GSO always falls back to software.
Alternate Implementation:
One possible alternate implementation is to teach netif_skb_features()
and skb_network_protocol() about MPLS, in a similar way to their
understanding of VLANs. I believe this would avoid the need
for net/mpls/mpls_gso.c and in particular the calls to
__skb_push() and __skb_push() in mpls_gso_segment().
I have decided on the implementation in this patch as it should
not introduce any overhead in the case where mpls_gso is not compiled
into the kernel or inserted as a module.
MPLS GSO suggested by Jesse Gross.
Based in part on "v4 GRE: Add TCP segmentation offload for GRE"
by Pravin B Shelar.
Cc: Jesse Gross <jesse@nicira.com>
Cc: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In order to mitigate ongoing incresase in the size of struct skbuff
use 16 bit integer offsets rather than pointers for inner_*_headers.
This appears to reduce the size of struct skbuff from 0xd0 to 0xc0
bytes on x86_64 with the following all unset.
CONFIG_XFRM
CONFIG_NF_CONNTRACK
CONFIG_NF_CONNTRACK_MODULE
NET_SKBUFF_NF_DEFRAG_NEEDED
CONFIG_BRIDGE_NETFILTER
CONFIG_NET_SCHED
CONFIG_IPV6_NDISC_NODETYPE
CONFIG_NET_DMA
CONFIG_NETWORK_SECMARK
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
libphy currently always reports a PHY as an external transceiver from
the ethtool output. This is inaccurate, because some drivers should be
able to tell that a PHY device is an internal transceiver of an Ethernet
MAC. Add a new flag (PHY_IS_INTERNAL) which can be set by PHY drivers
just like other flags, and a corresponding helper: phy_is_internal()
which can be used by networking drivers to query if a given
PHY device is internal.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add a comment which explains the real meaning of XCVR_INTERNAL (PHY and
Ethernet MAC in the same package/die) and XCVR_EXTERNAL (PHY and
Ethernet MAC in a different package/die). Most if not all of the drivers
setting their transceiver type already do it the way the comment
describes it.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Now when upper device is changed, event is not propagated via RT Netlink
to userspace. Userspace might never now about the change. Fix this by
adding upper-device-change notifier event.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This adds the ability to send ICMPv6 echo requests without a
raw socket. The equivalent ability for ICMPv4 was added in
2011.
Instead of having separate code paths for IPv4 and IPv6, make
most of the code in net/ipv4/ping.c dual-stack and only add a
few IPv6-specific bits (like the protocol definition) to a new
net/ipv6/ping.c. Hopefully this will reduce divergence and/or
duplication of bugs in the future.
Caveats:
- Setting options via ancillary data (e.g., using IPV6_PKTINFO
to specify the outgoing interface) is not yet supported.
- There are no separate security settings for IPv4 and IPv6;
everything is controlled by /proc/net/ipv4/ping_group_range.
- The proc interface does not yet display IPv6 ping sockets
properly.
Tested with a patched copy of ping6 and using raw socket calls.
Compiles and works with all of CONFIG_IPV6={n,m,y}.
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Merge net into net-next because some upcoming net-next changes
build on top of bug fixes that went into net.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pull networking fixes from David Miller:
"It's been a while since my last pull request so quite a few fixes have
piled up."
Indeed.
1) Fix nf_{log,queue} compilation with PROC_FS disabled, from Pablo
Neira Ayuso.
2) Fix data corruption on some tg3 chips with TSO enabled, from Michael
Chan.
3) Fix double insertion of VLAN tags in be2net driver, from Sarveshwar
Bandi.
4) Don't have TCP's MD5 support pass > PAGE_SIZE page offsets in
scatter-gather entries into the crypto layer, the crypto layer can't
handle that. From Eric Dumazet.
5) Fix lockdep splat in 802.1Q MRP code, also from Eric Dumazet.
6) Fix OOPS in netfilter log module when called from conntrack, from
Hans Schillstrom.
7) FEC driver needs to use netif_tx_{lock,unlock}_bh() rather than the
non-BH disabling variants. From Fabio Estevam.
8) TCP GSO can generate out-of-order packets, fix from Eric Dumazet.
9) vxlan driver doesn't update 'used' field of fdb entries when it
should, from Sridhar Samudrala.
10) ipv6 should use kzalloc() to allocate inet6 socket cork options,
otherwise we can OOPS in ip6_cork_release(). From Eric Dumazet.
11) Fix races in bonding set mode, from Nikolay Aleksandrov.
12) Fix checksum generation regression added by "r8169: fix 8168evl
frame padding.", from Francois Romieu.
13) ip_gre can look at stale SKB data pointer, fix from Eric Dumazet.
14) Fix checksum handling when GSO is enabled in bnx2x driver with
certain chips, from Yuval Mintz.
15) Fix double free in batman-adv, from Martin Hundebøll.
16) Fix device startup synchronization with firmware in tg3 driver, from
Nithin Sujit.
17) perf networking dropmonitor doesn't work at all due to mixed up
trace parameter ordering, from Ben Hutchings.
18) Fix proportional rate reduction handling in tcp_ack(), from Nandita
Dukkipati.
19) IPSEC layer doesn't return an error when a valid state is detected,
causing an OOPS. Fix from Timo Teräs.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (85 commits)
be2net: bug fix on returning an invalid nic descriptor
tcp: xps: fix reordering issues
net: Revert unused variable changes.
xfrm: properly handle invalid states as an error
virtio_net: enable napi for all possible queues during open
tcp: bug fix in proportional rate reduction.
net: ethernet: sun: drop unused variable
net: ethernet: korina: drop unused variable
net: ethernet: apple: drop unused variable
qmi_wwan: Added support for Cinterion's PLxx WWAN Interface
perf: net_dropmonitor: Remove progress indicator
perf: net_dropmonitor: Use bisection in symbol lookup
perf: net_dropmonitor: Do not assume ordering of dictionaries
perf: net_dropmonitor: Fix symbol-relative addresses
perf: net_dropmonitor: Fix trace parameter order
net: fec: use a more proper compatible string for MVF type device
qlcnic: Fix updating netdev->features
qlcnic: remove netdev->trans_start updates within the driver
qlcnic: Return proper error codes from probe failure paths
tg3: Update version to 3.132
...
|
|
This patch synchronises documentation for feature-split-event-channels from
Xen canonical header file.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI updates from Bjorn Helgaas:
"Here are some more fixes for v3.10. The Moorestown update broke Intel
Medfield devices, so I reverted it. The acpiphp change fixes a
regression: we broke hotplug notifications to host bridges when we
split acpiphp into the host-bridge related part and the
endpoint-related part.
Moorestown
Revert "x86/pci/mrst: Use configuration mechanism 1 for 00:00.0, 00:02.0, 00:03.0"
Hotplug
PCI: acpiphp: Re-enumerate devices when host bridge receives Bus Check"
* tag 'pci-v3.10-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
Revert "x86/pci/mrst: Use configuration mechanism 1 for 00:00.0, 00:02.0, 00:03.0"
PCI: acpiphp: Re-enumerate devices when host bridge receives Bus Check
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial fixes from Greg Kroah-Hartman:
"Here are some tty / serial driver fixes for 3.10-rc2.
Nothing huge, although the rocket driver fix looks large, it's just
moving the code around to fix the reported build issues in it. Other
than that, this has the fix for the of-reported lockdep warning from
the vt layer, as well as some other needed bugfixes.
All of these have been in linux-next for a while"
* tag 'tty-3.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
tty: mxser: Fix build warning introduced by dfc7b837c7f9 (Re: linux-next: build warning after merge of the tty.current tree)
tty: mxser: fix usage of opmode_ioaddr
serial: 8250_dw: add ACPI ID for Intel BayTrail
TTY: Fix tty miss restart after we turn off flow-control
tty/vt: Fix vc_deallocate() lock order
TTY: ehv_bytechan: add missing platform_driver_unregister() when module exit
TTY: rocket, fix more no-PCI warnings
serial: mcf: missing uart_unregister_driver() on error in mcf_init()
tty: serial: mpc5xxx: fix error handing in mpc52xx_uart_init()
serial: samsung: add missing platform_driver_unregister() when module exit
serial: pl011: protect attribute read from NULL platform data struct
tty: nwpserial: Pass correct pointer to free_irq()
serial: 8250_dw: Add valid clk pointer check
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg Kroah-Hartman:
"Here are a number of tiny USB bugfixes / new device ids for 3.10-rc2
The majority of these are USB gadget fixes, but they are all small.
Other than that, some USB host controller fixes, and USB serial driver
fixes for problems reported with them.
Also hopefully a fixed up USB_OTG Kconfig dependancy, that one seems
to be almost impossible to get right for all of the different
platforms these days."
* tag 'usb-3.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (56 commits)
USB: cxacru: potential underflow in cxacru_cm_get_array()
USB: ftdi_sio: Add support for Newport CONEX motor drivers
USB: option: add device IDs for Dell 5804 (Novatel E371) WWAN card
usb: ohci: fix goto wrong tag in err case
usb: isp1760-if: fix memleak when platform_get_resource fail
usb: ehci-s5p: fix memleak when fallback to pdata
USB: serial: clean up chars_in_buffer
USB: ti_usb_3410_5052: fix chars_in_buffer overhead
USB: io_ti: fix chars_in_buffer overhead
USB: ftdi_sio: fix chars_in_buffer overhead
USB: ftdi_sio: clean up get_modem_status
USB: serial: add generic wait_until_sent implementation
USB: serial: add wait_until_sent operation
USB: set device dma_mask without reference to global data
USB: Blacklisted Cinterion's PLxx WWAN Interface
usb: option: Add Telewell TW-LTE 4G
USB: EHCI: remove bogus #error
USB: reset resume quirk needed by a hub
USB: usb-stor: realtek_cr: Fix compile error
usb, chipidea: fix link error when USB_EHCI_HCD is a module
...
|
|
Pull MIPS update from Ralf Baechle:
- Fix a build error if <linux/printk.h> is included without
<linux/linkage.h> having been included before.
- Cleanup and fix the damage done by the generic idle loop patch.
- A kprobes fix that brings the MIPS code in line with what other
architectures are for quite a while already.
- Wire up the native getdents64(2) syscall for 64 bit - for some reason
it was only for the compat ABIs. This has been reported to cause an
application issue. This turned out bigger than I meant but the wait
instruction support code was driving me nuts.
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: N64: Wire getdents64(2)
kprobes/mips: Fix to check double free of insn slot
MIPS: Idle: Break r4k_wait into two functions and fix it.
MIPS: Idle: Do address fiddlery in helper functions.
MIPS: Idle: Consolidate all declarations in <asm/idle.h>.
MIPS: Idle: Don't call local_irq_disable() in cpu_wait() implementations.
MIPS: Idle: Re-enable irqs at the end of r3081, au1k and loongson2 cpu_wait.
MIPS: Idle: Make call of function pointer readable.
MIPS: Idle: Consistently reformat inline assembler.
MIPS: Idle: cleaup SMTC idle hook as per Linux coding style.
MIPS: Consolidate idle loop / WAIT instruction support in a single file.
MIPS: clock.h: Remove declaration of cpu_wait.
Add include dependencies to <linux/printk.h>.
MIPS: Rewrite pfn_valid to work in modules, too.
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-fixes
Pull mfd fixes from Samuel Ortiz:
"This is the first batch of MFD fixes for 3.10.
It's bigger than I would like, most of it is due to the big ab/db8500
merge that went through during the 3.10 merge window.
So we have:
- Some build fixes for the tps65912 and ab8500 drivers.
- A couple of build fixes for the the si476x driver with pre 4.3 gcc
compilers.
- A few runtime breakage fixes (probe failures or oopses) for the
ab8500 and db8500 drivers.
- Some sparse or regular gcc warning fixes for the si476x, ab8500 and
cros_ec drivers."
* tag 'mfd-fixes-3.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-fixes:
mfd: ab8500-sysctrl: Let sysctrl driver work without pdata
mfd: db8500-prcmu: Update stored DSI PLL divider value
mfd: ab8500-sysctrl: Always enable pm_power_off handler
mfd: ab8500-core: Pass GPADC compatible string to MFD core
mfd: db8500-prcmu: Supply the pdata_size attribute for db8500-thermal
mfd: ab8500-core: Use the correct driver name when enabling gpio/pinctrl
mfd: ab8500: Pass AB8500 IRQ to debugfs code by resource
mfd: ab8500-gpadc: Suppress 'ignoring regulator_enable() return value' warning
mfd: ab8500-sysctrl: Set sysctrl_dev during probe
mfd: ab8500-sysctrl: Fix sparse warning
mfd: abx500-core: Fix sparse warning
mfd: ab8500: Debugfs code depends on gpadc
mfd: si476x: Use get_unaligned_be16() for unaligned be16 loads
mfd: cros_ec_spi: Use %z to format pointer differences
mfd: si476x: Do not use binary constants
mfd: tps65912: Select MFD_CORE
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux
Pull virtio fixes from Rusty Russell:
"A build fix and a uapi exposure fix. The build fix is later than I
liked, but my first version broke linux-next due to overzealous header
clean."
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
virtio_console: fix uapi header
Hoist memcpy_fromiovec/memcpy_toiovec into lib/
|
|
If <linux/linkage.h> has not been included before <linux/printk.h>,
a build error like the below one will result:
CC arch/mips/kernel/idle.o
In file included from arch/mips/kernel/idle.c:17:0:
include/linux/printk.h:109:1: error: data definition has no type or storage class [-Werror]
include/linux/printk.h:109:1: error: type defaults to ‘int’ in declaration of ‘asmlinkage’ [-Werror=implicit-int]
include/linux/printk.h:110:1: error: ‘format’ attribute only applies to function types [-Werror=attributes]
include/linux/printk.h:110:1: error: expected ‘,’ or ‘;’ before ‘int’
include/linux/printk.h:114:1: error: data definition has no type or storage class [-Werror]
include/linux/printk.h:114:1: error: type defaults to ‘int’ in declaration of ‘asmlinkage’ [-Werror=implicit-int]
include/linux/printk.h:115:1: error: ‘format’ attribute only applies to function types [-Werror=attributes]
include/linux/printk.h:115:1: error: expected ‘,’ or ‘;’ before ‘int’
include/linux/printk.h:117:1: error: data definition has no type or storage class [-Werror]
include/linux/printk.h:117:1: error: type defaults to ‘int’ in declaration of ‘asmlinkage’ [-Werror=implicit-int]
include/linux/printk.h:118:1: error: ‘format’ attribute only applies to function types [-Werror=attributes]
include/linux/printk.h:118:1: error: ‘__cold__’ attribute ignored [-Werror=attributes]
include/linux/printk.h:118:1: error: expected ‘,’ or ‘;’ before ‘asmlinkage’
include/linux/printk.h:122:1: error: data definition has no type or storage class [-Werror]
include/linux/printk.h:122:1: error: type defaults to ‘int’ in declaration of ‘asmlinkage’ [-Werror=implicit-int]
include/linux/printk.h:123:1: error: ‘format’ attribute only applies to function types [-Werror=attributes]
include/linux/printk.h:123:1: error: ‘__cold__’ attribute ignored [-Werror=attributes]
include/linux/printk.h:123:1: error: expected ‘,’ or ‘;’ before ‘int’
In file included from include/linux/kernel.h:14:0,
from include/linux/sched.h:15,
from arch/mips/kernel/idle.c:18:
include/linux/dynamic_debug.h: In function ‘ddebug_dyndbg_module_param_cb’:
include/linux/dynamic_debug.h:124:3: error: implicit declaration of function ‘printk’ [-Werror=implicit-function-declaration]
Fixed by including <linux/linkage.h>.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
git://people.freedesktop.org/~airlied/linux
Pull radeon sun/hainan support from Dave Airlie:
"Since I know its outside the merge window, but since this is new hw I
thought I'd try and provoke the new hw exception, it just fills in the
blanks in the driver for the new AMD sun and hainan chipsets."
* 'drm-radeon-sun-hainan' of git://people.freedesktop.org/~airlied/linux:
drm/radeon: add Hainan pci ids
drm/radeon: add golden register settings for Hainan (v2)
drm/radeon: sun/hainan chips do not have UVD (v2)
drm/radeon: track which asics have UVD
drm/radeon: radeon-asic updates for Hainan
drm/radeon: fill in ucode loading support for Hainan
drm/radeon: don't touch DCE or VGA regs on Hainan (v3)
drm/radeon: fill in GPU init for Hainan (v2)
drm/radeon: add chip family for Hainan
|
|
There is currently no way for an Ethernet MAC driver servicing PHY link
interrupts to notify this to the PHY state machine without defining its
own state machine. Since most drivers are not so special, introduce a
helper: phy_mac_interrupt() which can be called from a link up/down
interrupt routine to update the PHY state machine. To avoid code
duplication some refactoring has been done to expose the workqueue and
its corresponding callback internally.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When a PHY device is registered with the special IRQ value
PHY_IGNORE_INTERRUPT (-2) it will not properly be handled by the PHY
library:
- it continues to poll its register, while we do not want this
because such PHY link events or register changes are serviced by an
Ethernet MAC
- it will still try to configure PHY interrupts at the PHY level, such
interrupts do not exist at the PHY but at the MAC level
- the state machine only handles PHY_POLL, but should also handle
PHY_IGNORE_INTERRUPT similarly
This patch updates the PHY state machine and initialization paths to
account for the specific PHY_IGNORE_INTERRUPT. Based on an earlier patch
by Thomas Petazzoni, and reworked to add the missing bits. Add a helper
phy_interrupt_is_valid() which specifically tests for a PHY interrupt
not to be PHY_POLL or PHY_IGNORE_INTERRUPT and use it throughout the
code.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
TCP md5 code uses per cpu variables but protects access to them with
a shared spinlock, which is a contention point.
[ tcp_md5sig_pool_lock is locked twice per incoming packet ]
Makes things much simpler, by allocating crypto structures once, first
time a socket needs md5 keys, and not deallocating them as they are
really small.
Next step would be to allow crypto allocations being done in a NUMA
aware way.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The next pointer within the inet6_dev structure seems not to be used
anywhere. So just remove it. Tested with allmodconfig on x86_64.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
A cpu executing the network receive path sheds packets when its input
queue grows to netdev_max_backlog. A single high rate flow (such as a
spoofed source DoS) can exceed a single cpu processing rate and will
degrade throughput of other flows hashed onto the same cpu.
This patch adds a more fine grained hashtable. If the netdev backlog
is above a threshold, IRQ cpus track the ratio of total traffic of
each flow (using 4096 buckets, configurable). The ratio is measured
by counting the number of packets per flow over the last 256 packets
from the source cpu. Any flow that occupies a large fraction of this
(set at 50%) will see packet drop while above the threshold.
Tested:
Setup is a muli-threaded UDP echo server with network rx IRQ on cpu0,
kernel receive (RPS) on cpu0 and application threads on cpus 2--7
each handling 20k req/s. Throughput halves when hit with a 400 kpps
antagonist storm. With this patch applied, antagonist overload is
dropped and the server processes its complete load.
The patch is effective when kernel receive processing is the
bottleneck. The above RPS scenario is a extreme, but the same is
reached with RFS and sufficient kernel processing (iptables, packet
socket tap, ..).
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem
|
|
Now that the tty port owns the flip buffers and i/o is allowed
from the driver even when no tty is attached, the destruction
of the tty port (and the flip buffers) must ensure that no
outstanding work is pending.
Unfortunately, this creates a lock order problem with the
console_lock (see attached lockdep report [1] below).
For single console deallocation, drop the console_lock prior
to port destruction. When multiple console deallocation,
defer port destruction until the consoles have been
deallocated.
tty_port_destroy() is not required if the port has not
been used; remove from vc_allocate() failure path.
[1] lockdep report from Dave Jones <davej@redhat.com>
======================================================
[ INFO: possible circular locking dependency detected ]
3.9.0+ #16 Not tainted
-------------------------------------------------------
(agetty)/26163 is trying to acquire lock:
blocked: ((&buf->work)){+.+...}, instance: ffff88011c8b0020, at: [<ffffffff81062065>] flush_work+0x5/0x2e0
but task is already holding lock:
blocked: (console_lock){+.+.+.}, instance: ffffffff81c2fde0, at: [<ffffffff813bc201>] vt_ioctl+0xb61/0x1230
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (console_lock){+.+.+.}:
[<ffffffff810b3f74>] lock_acquire+0xa4/0x210
[<ffffffff810416c7>] console_lock+0x77/0x80
[<ffffffff813c3dcd>] con_flush_chars+0x2d/0x50
[<ffffffff813b32b2>] n_tty_receive_buf+0x122/0x14d0
[<ffffffff813b7709>] flush_to_ldisc+0x119/0x170
[<ffffffff81064381>] process_one_work+0x211/0x700
[<ffffffff8106498b>] worker_thread+0x11b/0x3a0
[<ffffffff8106ce5d>] kthread+0xed/0x100
[<ffffffff81601cac>] ret_from_fork+0x7c/0xb0
-> #0 ((&buf->work)){+.+...}:
[<ffffffff810b349a>] __lock_acquire+0x193a/0x1c00
[<ffffffff810b3f74>] lock_acquire+0xa4/0x210
[<ffffffff810620ae>] flush_work+0x4e/0x2e0
[<ffffffff81065305>] __cancel_work_timer+0x95/0x130
[<ffffffff810653b0>] cancel_work_sync+0x10/0x20
[<ffffffff813b8212>] tty_port_destroy+0x12/0x20
[<ffffffff813c65e8>] vc_deallocate+0xf8/0x110
[<ffffffff813bc20c>] vt_ioctl+0xb6c/0x1230
[<ffffffff813b01a5>] tty_ioctl+0x285/0xd50
[<ffffffff811ba825>] do_vfs_ioctl+0x305/0x530
[<ffffffff811baad1>] sys_ioctl+0x81/0xa0
[<ffffffff81601d59>] system_call_fastpath+0x16/0x1b
other info that might help us debug this:
[ 6760.076175] Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(console_lock);
lock((&buf->work));
lock(console_lock);
lock((&buf->work));
*** DEADLOCK ***
1 lock on stack by (agetty)/26163:
#0: blocked: (console_lock){+.+.+.}, instance: ffffffff81c2fde0, at: [<ffffffff813bc201>] vt_ioctl+0xb61/0x1230
stack backtrace:
Pid: 26163, comm: (agetty) Not tainted 3.9.0+ #16
Call Trace:
[<ffffffff815edb14>] print_circular_bug+0x200/0x20e
[<ffffffff810b349a>] __lock_acquire+0x193a/0x1c00
[<ffffffff8100a269>] ? sched_clock+0x9/0x10
[<ffffffff8100a269>] ? sched_clock+0x9/0x10
[<ffffffff8100a200>] ? native_sched_clock+0x20/0x80
[<ffffffff810b3f74>] lock_acquire+0xa4/0x210
[<ffffffff81062065>] ? flush_work+0x5/0x2e0
[<ffffffff810620ae>] flush_work+0x4e/0x2e0
[<ffffffff81062065>] ? flush_work+0x5/0x2e0
[<ffffffff810b15db>] ? mark_held_locks+0xbb/0x140
[<ffffffff8113c8a3>] ? __free_pages_ok.part.57+0x93/0xc0
[<ffffffff810b15db>] ? mark_held_locks+0xbb/0x140
[<ffffffff810652f2>] ? __cancel_work_timer+0x82/0x130
[<ffffffff81065305>] __cancel_work_timer+0x95/0x130
[<ffffffff810653b0>] cancel_work_sync+0x10/0x20
[<ffffffff813b8212>] tty_port_destroy+0x12/0x20
[<ffffffff813c65e8>] vc_deallocate+0xf8/0x110
[<ffffffff813bc20c>] vt_ioctl+0xb6c/0x1230
[<ffffffff810aec41>] ? lock_release_holdtime.part.30+0xa1/0x170
[<ffffffff813b01a5>] tty_ioctl+0x285/0xd50
[<ffffffff812b00f6>] ? inode_has_perm.isra.46.constprop.61+0x56/0x80
[<ffffffff811ba825>] do_vfs_ioctl+0x305/0x530
[<ffffffff812b04db>] ? selinux_file_ioctl+0x5b/0x110
[<ffffffff811baad1>] sys_ioctl+0x81/0xa0
[<ffffffff81601d59>] system_call_fastpath+0x16/0x1b
Cc: Dave Jones <davej@redhat.com>
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pinctrl fixes from Linus Walleij:
- Three fixes to make the boot path for device tree work properly on
the Nomadik pin controller.
- Compile warning fix for the vt8500 driver.
- Fix error path in pinctrl-single.
- Free mappings in error path of the Lantiq controller.
- Documentation fixes.
* tag 'pinctrl-fixes-v3.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl/lantiq: Free mapping configs for both pin and groups
pinctrl: single: fix error return code in pcs_parse_one_pinctrl_entry()
pinctrl: generic: Fix typos and clarify comments
pinctrl: vt8500: Fix incorrect data in WM8750 pinctrl table
pinctrl: abx500: Rejiggle platform data and DT initialisation
pinctrl: abx500: Specify failed sub-driver by ID instead of driver_data
|
|
Do not leak starting address of BPF JIT code for non root users,
as it might help intruders to perform an attack.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Cc: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
tcp_timeout_skb() was intended to trigger fast recovery on timeout,
unfortunately in reality it often causes spurious retransmission
storms during fast recovery. The particular sign is a fast retransmit
over the highest sacked sequence (SND.FACK).
Currently the RTO timer re-arming (as in RFC6298) offers a nice cushion
to avoid spurious timeout: when SND.UNA advances the sender re-arms
RTO and extends the timeout by icsk_rto. The sender does not offset
the time elapsed since the packet at SND.UNA was sent.
But if the next (DUP)ACK arrives later than ~RTTVAR and triggers
tcp_fastretrans_alert(), then tcp_timeout_skb() will mark any packet
sent before the icsk_rto interval lost, including one that's above the
highest sacked sequence. Most likely a large part of scorebard will be
marked.
If most packets are not lost then the subsequent DUPACKs with new SACK
blocks will cause the sender to continue to retransmit packets beyond
SND.FACK spuriously. Even if only one packet is lost the sender may
falsely retransmit almost the entire window.
The situation becomes common in the world of bufferbloat: the RTT
continues to grow as the queue builds up but RTTVAR remains small and
close to the minimum 200ms. If a data packet is lost and the DUPACK
triggered by the next data packet is slightly delayed, then a spurious
retransmission storm forms.
As the original comment on tcp_timeout_skb() suggests: the usefulness
of this feature is questionable. It also wastes cycles walking the
sack scoreboard and is actually harmful because of false recovery.
It's time to remove this.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Nandita Dukkipati <nanditad@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
uapi should use __u32 not u32.
Fix a macro in virtio_console.h which uses u32.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: stable@kernel.org
|
|
ERROR: "memcpy_fromiovec" [drivers/vhost/vhost_scsi.ko] undefined!
That function is only present with CONFIG_NET. Turns out that
crypto/algif_skcipher.c also uses that outside net, but it actually
needs sockets anyway.
In addition, commit 6d4f0139d642c45411a47879325891ce2a7c164a added
CONFIG_NET dependency to CONFIG_VMCI for memcpy_toiovec, so hoist
that function and revert that commit too.
socket.h already includes uio.h, so no callers need updating; trying
only broke things fo x86_64 randconfig (thanks Fengguang!).
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
This patch adds the support of peer address for IPv6. For example, it is
possible to specify the remote end of a 6inY tunnel.
This was already possible in IPv4:
ip addr add ip1 peer ip2 dev dev1
The peer address is specified with IFA_ADDRESS and the local address with
IFA_LOCAL (like explained in include/uapi/linux/if_addr.h).
Note that the API is not changed, because before this patch, it was not
possible to specify two different addresses in IFA_LOCAL and IFA_REMOTE.
There is a small change for the dump: if the peer is different from ::,
IFA_ADDRESS will contain the peer address instead of the local address.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pull device tree fixes from Grant Likely:
"Device tree bug fixes and documentation updates for v3.10
Nothing earth shattering here. A build failure fix, and fix for
releasing nodes and some documenation updates."
* tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux:
Documentation/devicetree: make semantic of initrd-end more explicit
of/base: release the node correctly in of_parse_phandle_with_args()
of/documentation: move video device bindings to a common place
<linux/of_platform.h>: fix compilation warnings with DT disabled
|
|
When a PCI host bridge device receives a Bus Check notification, we
must re-enumerate starting with the bridge to discover changes (devices
that have been added or removed).
Prior to 668192b678 ("PCI: acpiphp: Move host bridge hotplug to
pci_root.c"), this happened in _handle_hotplug_event_bridge(). After that
commit, _handle_hotplug_event_bridge() is not installed for host bridges,
and the host bridge notify handler, _handle_hotplug_event_root() did not
re-enumerate.
This patch adds re-enumeration to _handle_hotplug_event_root().
This fixes cases where we don't notice the addition or removal of
PCI devices, e.g., the PCI-to-USB ExpressCard in the bugzilla below.
[bhelgaas: changelog, references]
Reference: https://lkml.kernel.org/r/CAAh6nkmbKR3HTqm5ommevsBwhL_u0N8Rk7Wsms_LfP=nBgKNew@mail.gmail.com
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=57961
Reported-by: Gavin Guo <tuffkidtt@gmail.com>
Tested-by: Gavin Guo <tuffkidtt@gmail.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jiang Liu <jiang.liu@huawei.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
CC: stable@vger.kernel.org # v3.9+
|
|
PCIe and ARM CR4 cores were found on 14e4:43b1 AKA BCM4352.
Reported-by: Gabriel Thörnblad <gabriel@thornblad.com>
Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
|
|
Pull drm fixes from Dave Airlie:
"Fix for radeon nomodeset regression, old radeon interface cliprects
fix, 2 qxl crasher fixes, and a couple of minor cleanups.
I may have a new AMD hw support branch next week, its one of those
doesn't affect anything existing just adds new support, I'll see how
it shapes up and I might ask you to take it, just thought I'd warn in
advance."
* 'drm-next' of git://people.freedesktop.org/~airlied/linux:
drm/radeon: restore nomodeset operation (v2)
qxl: fix bug with object eviction and update area
drm/qxl: drop active_user_framebuffer as its unneeded
qxl: drop unused variable.
drm/qxl: fix ioport interactions for kernel submitted commands.
drm: remove unused wrapper macros
drm/radeon: check incoming cliprects pointer
|
|
Add generic wait_until_sent implementation which polls for empty
hardware buffers using the new port-operation tx_empty.
The generic implementation will be used for all sub-drivers that
implement tx_empty but does not define wait_until_sent.
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Add wait_until_sent operation which can be used to wait for hardware
buffers to drain.
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
AB8500 sysctrl driver implements a pm_power_off handler, but that is
currently not registered until a specific platform data field is
enabled.
This patch drops the platform data field and always registers
ab8500_power_off if no other pm_power_off handler was defined before,
and also introduces the necessary cleanup code in the driver's remove
function.
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Fabio Baltieri <fabio.baltieri@linaro.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management and ACPI fixes from Rafael Wysocki:
- intel_pstate driver fixes and cleanups from Dirk Brandewie and Wei
Yongjun.
- cpufreq fixes related to ARM big.LITTLE support and the cpufreq-cpu0
driver from Viresh Kumar.
- Assorted cpufreq fixes from Srivatsa S Bhat, Borislav Petkov, Wolfram
Sang, Alexander Shiyan, and Nishanth Menon.
- Assorted ACPI fixes from Catalin Marinas, Lan Tianyu, Alex Hung,
Jan-Simon Möller, and Rafael J Wysocki.
- Fix for a kfree() under spinlock in the PM core from Shuah Khan.
- PM documentation updates from Borislav Petkov and Zhang Rui.
* tag 'pm+acpi-3.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (30 commits)
cpufreq: Preserve sysfs files across suspend/resume
ACPI / scan: Fix memory leak on acpi_scan_init_hotplug() error path
PM / hibernate: Correct documentation
PM / Documentation: remove inaccurate suspend/hibernate transition lantency statement
PM: Documentation update for freeze state
cpufreq / intel_pstate: use vzalloc() instead of vmalloc()/memset(0)
cpufreq, ondemand: Remove leftover debug line
PM: Avoid calling kfree() under spinlock in dev_pm_put_subsys_data()
cpufreq / kirkwood: don't check resource with devm_ioremap_resource
cpufreq / intel_pstate: remove #ifdef MODULE compile fence
cpufreq / intel_pstate: Remove idle mode PID
cpufreq / intel_pstate: fix ffmpeg regression
cpufreq / intel_pstate: use lowest requested max performance
cpufreq / intel_pstate: remove idle time and duration from sample and calculations
cpufreq: Fix incorrect dependecies for ARM SA11xx drivers
cpufreq: ARM big LITTLE: Fix Kconfig entries
cpufreq: cpufreq-cpu0: Free parent node for error cases
cpufreq: cpufreq-cpu0: defer probe when regulator is not ready
cpufreq: Issue CPUFREQ_GOV_POLICY_EXIT notifier before dropping policy refcount
cpufreq: governors: Fix CPUFREQ_GOV_POLICY_{INIT|EXIT} notifiers
...
|
|
In some situations, we need to disable TSO on bonding slaves.
bonding device automatically unset TSO in bond_fix_features(), and
performance is not good because :
1) We consume more cpu cycles.
2) GSO segmentation has some bugs leading to out of order TCP packets
if this segmentation is done before virtual device. This particular
problem will be addressed in a separate patch.
This patch allows TSO being set/unset on the bonding master,
so that GSO segmentation is done after bonding layer.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Michał Mirosław <mirqus@gmail.com>
Cc: Jay Vosburgh <fubar@us.ibm.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pablo Neira Ayuso says:
====================
The following patchset contains three Netfilter fixes and update
for the MAINTAINER file for your net tree, they are:
* Fix crash if nf_log_packet is called from conntrack, in that case
both interfaces are NULL, from Hans Schillstrom. This bug introduced
with the logging netns support in the previous merge window.
* Fix compilation of nf_log and nf_queue without CONFIG_PROC_FS,
from myself. This bug was introduced in the previous merge window
with the new netns support for the netfilter logging infrastructure.
* Fix possible crash in xt_TCPOPTSTRIP due to missing sanity
checkings to validate that the TCP header is well-formed, from
myself. I can find this bug in 2.6.25, probably it's been there
since the beginning. I'll pass this to -stable.
* Update MAINTAINER file to point to new nf trees at git.kernel.org,
remove Harald and use M: instead of P: (now obsolete tag) to
keep Jozsef in the list of people.
Please, consider pulling this. Thanks!
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Document rx vs tx status concurrency requirements.
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull target fixes from Nicholas Bellinger:
"A handful of fixes + minor changes this time around, along with one
important >= v3.9 regression fix for IBLOCK backends. The highlights
include:
- Use FD_MAX_SECTORS in FILEIO for block_device as
well as files (agrover)
- Fix processing of out-of-order CmdSNs with
iSBD driver (shlomo)
- Close long-standing target_put_sess_cmd() vs.
core_tmr_abort_task() race with the addition of
kref_put_spinlock_irqsave() (joern + greg-kh)
- Fix IBLOCK WCE=1 + DPOFUA=1 backend WRITE
regression in >= v3.9 (nab + bootc)
Note these four patches are CC'ed to stable.
Also, there is still some work left to be done on the active I/O
shutdown path in target_wait_for_sess_cmds() used by tcm_qla2xxx +
ib_isert fabrics that is still being discussed on the list, and will
hopefully be resolved soon."
* 'queue' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
target: close target_put_sess_cmd() vs. core_tmr_abort_task() race
target: removed unused transport_state flag
target/iblock: Fix WCE=1 + DPOFUA=1 backend WRITE regression
MAINTAINERS: Update target git tree URL
iscsi-target: Fix typos in RDMAEXTENSIONS macro usage
target/rd: Add ramdisk bit for NULLIO operation
iscsi-target: Fix processing of OOO commands
iscsi-target: Make buf param of iscsit_do_crypto_hash_buf() const void *
iscsi-target: Fix NULL pointer dereference in iscsit_send_reject
target: Have dev/enable show if TCM device is configured
target: Use FD_MAX_SECTORS/FD_BLOCKSIZE for blockdevs using fileio
target: Remove unused struct members in se_dev_entry
|
|
include/linux/brcmphy.h is currently not protected against double
inclusion, add ifdefs guard to fix that.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Thomas Gleixner:
- Cure for not using zalloc in the first place, which leads to random
crashes with CPUMASK_OFF_STACK.
- Revert a user space visible change which broke udev
- Add a missing cpu_online early return introduced by the new full
dyntick conversions
- Plug a long standing race in the timer wheel cpu hotplug code.
Sigh...
- Cleanup NOHZ per cpu data on cpu down to prevent stale data on cpu
up.
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
time: Revert ALWAYS_USE_PERSISTENT_CLOCK compile time optimizaitons
timer: Don't reinitialize the cpu base lock during CPU_UP_PREPARE
tick: Don't invoke tick_nohz_stop_sched_tick() if the cpu is offline
tick: Cleanup NOHZ per cpu data on cpu down
tick: Use zalloc_cpumask_var for allocating offstack cpumasks
|
|
Tidy up kernel-doc content for USB GADGET. No functional change.
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Felipe Balbi <balbi@ti.com>
|
|
Since (69b34fb netfilter: xt_LOG: add net namespace support
for xt_LOG), we hit this:
[ 4224.708977] BUG: unable to handle kernel NULL pointer dereference at 0000000000000388
[ 4224.709074] IP: [<ffffffff8147f699>] ipt_log_packet+0x29/0x270
when callling log functions from conntrack both in and out
are NULL i.e. the net pointer is invalid.
Adding struct net *net in call to nf_logfn() will secure that
there always is a vaild net ptr.
Reported as netfilter's bugzilla bug 818:
https://bugzilla.netfilter.org/show_bug.cgi?id=818
Reported-by: Ronald <ronald645@gmail.com>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
It is possible for one thread to to take se_sess->sess_cmd_lock in
core_tmr_abort_task() before taking a reference count on
se_cmd->cmd_kref, while another thread in target_put_sess_cmd() drops
se_cmd->cmd_kref before taking se_sess->sess_cmd_lock.
This introduces kref_put_spinlock_irqsave() and uses it in
target_put_sess_cmd() to close the race window.
Signed-off-by: Joern Engel <joern@logfs.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
|