Age | Commit message (Collapse) | Author |
|
Because of rounding, in certain conditions, i.e. when in congestion
avoidance state rho is smaller than 1/128 of the current cwnd, TCP
Hybla congestion control starves and the cwnd is kept constant
forever.
This patch forces an increment by one segment after #send_cwnd calls
without increments(newreno behavior).
Signed-off-by: Daniele Lacamera <root@danielinux.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Benjamin Thery tracked down a bug that explains many instances
of the error
unregister_netdevice: waiting for %s to become free. Usage count = %d
It turns out that netdev_run_todo can dead-lock with itself if
a second instance of it is run in a thread that will then free
a reference to the device waited on by the first instance.
The problem is really quite silly. We were trying to create
parallelism where none was required. As netdev_run_todo always
follows a RTNL section, and that todo tasks can only be added
with the RTNL held, by definition you should only need to wait
for the very ones that you've added and be done with it.
There is no need for a second mutex or spinlock.
This is exactly what the following patch does.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch add mc_count to struct in_device and updates
increment/decrement/initilaize of this field in IPv4 and in IPv6.
- Also printing the vfs /proc entry (/proc/net/igmp) is adjusted to
use the new mc_count.
Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
From: Ali Saidi <saidi@engin.umich.edu>
When TCP receive copy offload is enabled it's possible that
tcp_rcv_established() will cause two acks to be sent for a single
packet. In the case that a tcp_dma_early_copy() is successful,
copied_early is set to true which causes tcp_cleanup_rbuf() to be
called early which can send an ack. Further along in
tcp_rcv_established(), __tcp_ack_snd_check() is called and will
schedule a delayed ACK. If no packets are processed before the delayed
ack timer expires the packet will be acked twice.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Jesper Dangaard Brouer <hawk@comx.dk> reported a bug when setting a VLAN
device down that is in promiscous mode:
When the VLAN device is set down, the promiscous count on the real
device is decremented by one by vlan_dev_stop(). When removing the
promiscous flag from the VLAN device afterwards, the promiscous
count on the real device is decremented a second time by the
vlan_change_rx_flags() callback.
The root cause for this is that the ->change_rx_flags() callback is
invoked while the device is down. The synchronization is meant to mirror
the behaviour of the ->set_rx_mode callbacks, meaning the ->open function
is responsible for doing a full sync on open, the ->close() function is
responsible for doing full cleanup on ->stop() and ->change_rx_flags()
is meant to do incremental changes while the device is UP.
Only invoke ->change_rx_flags() while the device is UP to provide the
intended behaviour.
Tested-by: Jesper Dangaard Brouer <jdb@comx.dk>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
On a system with nfs mounts, if a task unshares its mount namespace,
a oops can occur when the system is rebooted if the task is the last
to unreference the nfs mount. It will try to create a rpc request
using utsname() which has been invalidated by free_nsproxy().
The patch fixes the issue by using the global init_utsname() which is
always valid. the capability of identifying rpc clients per uts namespace
stills needs some extra work so this should not be a problem.
BUG: unable to handle kernel NULL pointer dereference at 00000004
IP: [<c024c9ab>] rpc_create+0x332/0x42f
Oops: 0000 [#1] DEBUG_PAGEALLOC
Pid: 1857, comm: uts-oops Not tainted (2.6.27-rc5-00319-g7686ad5 #4)
EIP: 0060:[<c024c9ab>] EFLAGS: 00210287 CPU: 0
EIP is at rpc_create+0x332/0x42f
EAX: 00000000 EBX: df26adf0 ECX: c0251887 EDX: 00000001
ESI: df26ae58 EDI: c02f293c EBP: dda0fc9c ESP: dda0fc2c
DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
Process uts-oops (pid: 1857, ti=dda0e000 task=dd9a0778 task.ti=dda0e000)
Stack: c0104532 dda0fffc dda0fcac dda0e000 dda0e000 dd93b7f0 00000009 c02f2880
df26aefc dda0fc68 c01096b7 00000000 c0266ee0 c039a070 c039a070 dda0fc74
c012ca67 c039a064 dda0fc8c c012cb20 c03daf74 00000011 00000000 c0275c90
Call Trace:
[<c0104532>] ? dump_trace+0xc2/0xe2
[<c01096b7>] ? save_stack_trace+0x1c/0x3a
[<c012ca67>] ? save_trace+0x37/0x8c
[<c012cb20>] ? add_lock_to_list+0x64/0x96
[<c0256fc4>] ? rpcb_register_call+0x62/0xbb
[<c02570c8>] ? rpcb_register+0xab/0xb3
[<c0252f4d>] ? svc_register+0xb4/0x128
[<c0253114>] ? svc_destroy+0xec/0x103
[<c02531b2>] ? svc_exit_thread+0x87/0x8d
[<c01a75cd>] ? lockd_down+0x61/0x81
[<c01a577b>] ? nlmclnt_done+0xd/0xf
[<c01941fe>] ? nfs_destroy_server+0x14/0x16
[<c0194328>] ? nfs_free_server+0x4c/0xaa
[<c019a066>] ? nfs_kill_super+0x23/0x27
[<c0158585>] ? deactivate_super+0x3f/0x51
[<c01695d1>] ? mntput_no_expire+0x95/0xb4
[<c016965b>] ? release_mounts+0x6b/0x7a
[<c01696cc>] ? __put_mnt_ns+0x62/0x70
[<c0127501>] ? free_nsproxy+0x25/0x80
[<c012759a>] ? switch_task_namespaces+0x3e/0x43
[<c01275a9>] ? exit_task_namespaces+0xa/0xc
[<c0117fed>] ? do_exit+0x4fd/0x666
[<c01181b3>] ? do_group_exit+0x5d/0x83
[<c011fa8c>] ? get_signal_to_deliver+0x2c8/0x2e0
[<c0102630>] ? do_notify_resume+0x69/0x700
[<c011d85a>] ? do_sigaction+0x134/0x145
[<c0127205>] ? hrtimer_nanosleep+0x8f/0xce
[<c0126d1a>] ? hrtimer_wakeup+0x0/0x1c
[<c0103488>] ? work_notifysig+0x13/0x1b
=======================
Code: 70 20 68 cb c1 2c c0 e8 75 4e 01 00 8b 83 ac 00 00 00 59 3d 00 f0 ff ff 5f 77 63 eb 57 a1 00 80 2d c0 8b 80 a8 02 00 00 8d 73 68 <8b> 40 04 83 c0 45 e8 41 46 f7 ff ba 20 00 00 00 83 f8 21 0f 4c
EIP: [<c024c9ab>] rpc_create+0x332/0x42f SS:ESP 0068:dda0fc2c
Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: "Serge E. Hallyn" <serue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
Despite the fact that cloned rpc clients won't have the cl_autobind flag
set, they may still find themselves calling rpcb_getport_async(). For this
to happen, it suffices for a _parent_ rpc_clnt to use autobinding, in which
case any clone may find itself triggering the !xprt_bound() case in
call_bind().
The correct fix for this is to walk back up the tree of cloned rpc clients,
in order to find the parent that 'owns' the transport, either because it
has clnt->cl_autobind set, or because it originally created the
transport...
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
Basically, try_module_get here are pretty useless. Any other module using
this API will pin sunrpc in memory due using exported symbols.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The content of init_ipv6_mibs/cleanup_ipv6_mibs will be moved to new
calls one by one next.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Unused net variable will become used very soon.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
idev has been stored on seq->private. NULL has been stored for global
statistics.
The situation is changed with net namespace. We need to store pointer to
struct net and the only place is seq->private. So, we'll have for
/proc/net/dev_snmp6/* and for /proc/net/snmp6 pointers of two different
types stored in the same field.
This effectively requires to separate seq_ops of these files.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Simple, comsolidate sockstat6 staff in one place, at the beginning of
the file. Right now sockstat6_seq_open/sockstat6_seq_fops looks like an
intrusion in the middle of snmp6 code.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Do the same for /proc/net/snmp6.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
I'm quite sure that if I give this function in its old format
for you to inspect, you start to wonder what is the type of
demanded or if it's a global variable.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
It all started from me noticing that this urgent check in
tcp_clean_rtx_queue is unnecessarily inside the loop. Then
I took a longer look to it and found out that the users of
urg_mode can trivially do without, well almost, there was
one gotcha.
Bonus: those funny people who use urg with >= 2^31 write_seq -
snd_una could now rejoice too (that's the only purpose for the
between being there, otherwise a simple compare would have done
the thing). Not that I assume that the rest of the tcp code
happily lives with such mind-boggling numbers :-). Alas, it
turned out to be impossible to set wmem to such numbers anyway,
yes I really tried a big sendfile after setting some wmem but
nothing happened :-). ...Tcp_wmem is int and so is sk_sndbuf...
So I hacked a bit variable to long and found out that it seems
to work... :-)
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add some packet-split receive hooks.
For one this allows to do NUMA node affine page allocs. Later on these
hooks will be extended to do emergency reserve allocations for
fragments.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Wrap calling sk->sk_backlog_rcv() in a function. This will allow extending the
generic sk_backlog_rcv behaviour.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This makes that ip6_route_net_init() does all of the route init code.
There used to be a race between ip6_route_net_init() and ip6_net_init()
and someone relying on the combined result was left out cold.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
ip6_route_net_init() error handling looked less than solid, fix 'er up.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Use the socket cached in the skb if it's present.
Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
To be able to use the cached socket reference in the skb during input
processing we add a new set of lookup functions that receive the skb on
their argument list.
Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
To be able to use the cached socket reference in the skb during input
processing we add a new set of lookup functions that receive the skb on
their argument list.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The association response IEs are sent to userland with an IWEVCUSTOM
event, which unfortunately is limited to a little more than 100 bytes
of IE information with the encoding used. Many APs send so much
IE information that this message overflows. When the IWEVCUSTOM
event is too large, the kernel doesn't send it to userland anyway --
better just not to send it.
An attempt was made by Jouni Malinen to correct this issue by
converting to use IWEVASSOCREQIE and IWEVASSOCRESPIE messages instead
("mac80211: Use IWEVASSOCREQIE instead of IWEVCUSTOM"). Unfortunately,
that caused a problem due to 32-/64-bit interactions on some systems and
was reverted after the 'userland ABI' rule was invoked. That leaves
us with this option instead of a proper fix, at least until we move
to a cfg80211-based solution.
Signed-off-by: John W. Linville <linville@tuxdriver.com>
|
|
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
|
|
This patch adjusts the rate control API to allow multi-rate retry
if supported by the driver. The ieee80211_hw struct specifies how
many alternate rate selections the driver supports.
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
|
|
Free up 2 bytes in skb->cb to be used for multi-rate retry later.
Move iv_len and icv_len initialization into key alloc.
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
|
|
The LED state was not being updated by rfkill_force_state(), which
will cause regressions in wireless drivers that had old-style rfkill
support and are updated to use rfkill_force_state().
The LED state was not being updated when a change was detected through
the rfkill->get_state() hook, either.
Move the LED trigger update calls into notify_rfkill_state_change(),
where it should have been in the first place. This takes care of both
issues above.
Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Acked-by: Ivo van Doorn <IvDoorn@gmail.com>
Cc: stable@kernel.org
Signed-off-by: John W. Linville <linville@tuxdriver.com>
|
|
(net/mac80211/tx.c)
- This patch (against the linux-wireless-next git tree) removes a
redundant check in ieee80211_master_start_xmit (net/mac80211/tx.c)
and adjust indentation in this method accordingly.
In this method, there is no need to call again the
ieee80211_is_data() method; this is checked immediately before, in the
"if" command (we will not enter this block unless ieee80211_is_data()
is true, so that the "and" (&&) condition in that "if" command will be
fullfilled ).
Signed-off-by: Rami Rosen <ramirose@gmail.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
|
|
Signed-off-by: Davide Pesavento <davidepesa@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
|
|
This patch removes doubly defined variables in ieee80211_master_start_xmit
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
|
|
Restore revert "mac80211: Use IWEVASSOCREQIE instead of IWEVCUSTOM",
originally reverted in commit bf7394ccc13fe291d9258f01113b4c61214ddeae.
Signed-off-by: John W. Linville <linville@tuxdriver.com>
|
|
lvs-next-2.6
|
|
Since IPVS now has partial IPv6 support, this patch moves IPVS from
net/ipv4/ipvs to net/netfilter/ipvs. It's a result of:
$ git mv net/ipv4/ipvs net/netfilter
and adapting the relevant Kconfigs/Makefiles to the new path.
Signed-off-by: Julius Volz <juliusv@google.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
While debugging another bug it was found that NetRom socks
are sometimes seen unorphaned in sk_free(). This patch moves
sock_orphan() in nr_release() to the beginning (like in ax25,
or rose).
Reported-and-tested-by: Bernard Pidoux f6bvp <f6bvp@free.fr>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Since we reverted 30902dc3cb0ea1cfc7ac2b17bcf478ff98420d74 ("ax25: Fix
std timer socket destroy handling.") we have to put some kind of fix
in to cure the issue whereby unaccepted connections do not get destroyed.
The approach used here is from Tihomir Heidelberg - 9a4gl
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This reverts commit 30902dc3cb0ea1cfc7ac2b17bcf478ff98420d74.
It causes all kinds of problems, based upon a report by
Bernard (f6bvp) and analysis by Jarek Poplawski.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The inititator/responder resources in the event have been swapped. They
no represent what the local peer would set their values to in order to
match the peer. Note that iWARP does not exchange these on the wire and
the provider is simply putting in the local device max.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
|
|
Update the svc_rdma_send_error code to use the DMA LKEY which is valid
regardless of the memory registration strategy in use.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
|
|
Use FRMR to map local RPC reply data. This allows RDMA_WRITE to send reply
data using a single WR. The FRMR is invalidated by linking the LOCAL_INV WR
to the RDMA_SEND message used to complete the reply.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
|
|
RPCRDMA requests that specify a read-list are fetched with RDMA_READ. Using
an FRMR to map the data sink improves NFSRDMA security on transports that
place the RDMA_READ data sink LKEY on the wire because the valid lifetime
of the MR is only the duration of the RDMA_READ. The LKEY is invalidated
when the last RDMA_READ WR completes.
Mapping the data sink also allows for very large amounts to data to be
fetched with a single WR, so if the client is also using FRMR, the entire
RPC read-list can be fetched with a single WR.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
|
|
WR can be submitted as linked lists of WR. Update the svc_rdma_send
routine to handle WR chains. This will be used to submit a WR that
uses an FRMR with another WR that invalidates the FRMR.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
|
|
Update the svc_rdma_post_recv routine to use the adapter's global LKEY
instead of sc_phys_mr which is only valid when using a DMA MR.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
|
|
Fast Reg MR introduces a new WR type. Add a service to register the
region with the adapter and update the completion handling to support
completions with a NULL WR context.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
|
|
Query the device capabilities in the svc_rdma_accept function to determine
what advanced memory management capabilities are supported by the device.
Based on the query, select the most secure model available given the
requirements of the transport and capabilities of the adapter.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
|
|
Add services for the allocating, freeing, and unmapping Fast Reg MR. These
services will be used by the transport connection setup, send and receive
routines.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
|