Age | Commit message (Collapse) | Author |
|
Maintain a local hash table of TCP dynamic metrics blobs.
Computed TCP metrics are no longer maintained in the route metrics.
The table uses RCU and an extremely simple hash so that it has low
latency and low overhead. A simple hash is legitimate because we only
make metrics blobs for fully established connections.
Some tweaking of the default hash table sizes, metric timeouts, and
the hash chain length limit certainly could use some tweaking. But
the basic design seems sound.
With help from Eric Dumazet and Joe Perches.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
All paths assume, when CONFIG_IP_MULTIPLE_TABLES is enabled, that any
successful call to fib_lookup() will initialize the fib_result->r
value to something.
We violated that expectation in the new fib_lookup() fast path.
Reported-by: Or Gerlitz <ogerlitz@mellanox.com>
Tested-by: Eric Dumazet <eric.dumazet@gmail.com>
Tested-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Some NFC chips will statically create and open pipes for both standard
and proprietary gates. The driver can now pass this information to HCI
such that HCI will not attempt to create and open them, but will instead
directly use the passed pipe ids.
Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
|
|
This API should be used by drivers, HCI, SHDLC or NCI stacks to report an
unrecoverable error.
Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
|
|
Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
|
|
The management frame and remain-on-channel APIs will be
needed in the P2P device abstraction, so move them over
to the new wdev-based APIs. Userspace can still use both
the interface index and wdev identifier for them so it's
backward compatible, but for the P2P Device wdev it will
be able to use the wdev identifier only.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
In order to support a P2P device abstraction and
Bluetooth high-speed AMPs, we need to have a way
to identify virtual interfaces that don't have a
netdev associated.
Do this by adding a NL80211_ATTR_WDEV attribute
to identify a wdev which may or may not also be
a netdev.
To simplify things, use a 64-bit value with the
high 32 bits being the wiphy index for this new
wdev identifier in the nl80211 API.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
This API call was intended to be used by drivers
if they want to optimize key handling by removing
one key when another is added. Remove it since no
driver is using it. If needed, it can always be
added back.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Hans reports that he's still hitting:
BUG: unable to handle kernel NULL pointer dereference at 000000000000027c
IP: [<ffffffff813615db>] netlink_has_listeners+0xb/0x60
PGD 0
Oops: 0000 [#3] PREEMPT SMP
CPU 0
It happens when adding a number of containers with do:
nfct_query(h, NFCT_Q_CREATE, ct);
and most likely one namespace shuts down.
this problem was supposed to be fixed by:
70e9942 netfilter: nf_conntrack: make event callback registration per-netns
Still, it was missing one rcu_access_pointer to check if the callback
is set or not.
Reported-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
If the user hasn't actually installed any custom rules, or fiddled
with the default ones, don't go through the whole FIB rules layer.
It's just pure overhead.
Instead do what we do with CONFIG_IP_MULTIPLE_TABLES disabled, check
the individual tables by hand, one by one.
Also, move fib_num_tclassid_users into the ipv4 network namespace.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
60g band uses different from .11n MCS scheme, so bitrate
should be calculated differently
Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Until now, a u16 value was used to represent bitrate value.
With VHT bitrates this becomes too small.
Introduce a new 32-bit bitrate attribute. nl80211 will report
both the new and the old attribute, unless the bitrate doesn't
fit into the old u16 attribute in which case only the new one
will be reported.
User space tools encouraged to prefer the 32-bit attribute, if
available (since it won't be available on older kernels.)
Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
[reword commit message and comments a bit]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
|
|
No longer used.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This makes for a simplified conversion away from dst_get_neighbour*().
All code outside of ipv6 will use neigh lookups via dst_neigh_lookup*().
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This allows an easy conversion away from dst_get_neighbour*().
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Causes the handler to use the daddr in the ipv4/ipv6 header when
the route gateway is unspecified (local subnet).
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When a dst_confirm() happens, mark the confirmation as pending in the
dst. Then on the next packet out, when we have the neigh in-hand, do
the update.
This removes the dependency in dst_confirm() of dst's having an
attached neigh.
While we're here, remove the explicit 'dst' NULL check, all except 2
or 3 call sites ensure it's not NULL. So just fix those cases up.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Do not use the dst cached neigh, we'll be getting rid of that.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch generalizes nf_ct_l4proto_net by splitting it into chunks and
moving the corresponding protocol part to where it really belongs to.
To clarify, note that we follow two different approaches to support per-net
depending if it's built-in or run-time loadable protocol tracker.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Gao feng <gaofeng@cn.fujitsu.com>
|
|
Some drivers require setup before being able to send
management frames in managed mode, in particular in
multi-channel cases.
Introduce API to allow the drivers to do such setup
while being able to sleep waiting for the setup to
finish in the device. This isn't possible inside the
TX call since that can't sleep.
A future patch may also restructure the TX retry to
wait for the driver to report the frame status, as
suggested by Arik in
http://mid.gmane.org/CA+XVXffKSEL6ZQPQ98x-zO-NL2=TNF1uN==mprRyUmAaRn254g@mail.gmail.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
IEEE80211_TX_MAX_RATES can be reduced from 5 to 4 as there
is no current hardware supporting a rate chain with 5 multi
rate stages (mrr), so 4 mrr stages are sufficient.
The memory that is freed within the ieee80211_tx_info struct
will be used in the upcoming Transmission Power Control (TPC)
implementation.
Suggested-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Thomas Huehn <thomas@net.t-labs.tu-berlin.de>
[reword commit message]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
The implementation of tx_frags is buggy due to
not handling queue stop, and there's no driver
implementing it so remove it.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Add enumerations for both cfg80211 and nl80211.
This expands wiphy.bands etc. arrays.
Extend channel <-> frequency translation to cover 60g band
and modify the rate check logic since there are no legacy
mandatory rates (only MCS is used.)
Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
It was noticed recently that when we send data on a transport, its possible that
we might bundle a sack that arrived on a different transport. While this isn't
a major problem, it does go against the SHOULD requirement in section 6.4 of RFC
2960:
An endpoint SHOULD transmit reply chunks (e.g., SACK, HEARTBEAT ACK,
etc.) to the same destination transport address from which it
received the DATA or control chunk to which it is replying. This
rule should also be followed if the endpoint is bundling DATA chunks
together with the reply chunk.
This patch seeks to correct that. It restricts the bundling of sack operations
to only those transports which have moved the ctsn of the association forward
since the last sack. By doing this we guarantee that we only bundle outbound
saks on a transport that has received a chunk since the last sack. This brings
us into stricter compliance with the RFC.
Vlad had initially suggested that we strictly allow only sack bundling on the
transport that last moved the ctsn forward. While this makes sense, I was
concerned that doing so prevented us from bundling in the case where we had
received chunks that moved the ctsn on multiple transports. In those cases, the
RFC allows us to select any of the transports having received chunks to bundle
the sack on. so I've modified the approach to allow for that, by adding a state
variable to each transport that tracks weather it has moved the ctsn since the
last sack. This I think keeps our behavior (and performance), close enough to
our current profile that I think we can do this without a sysctl knob to
enable/disable it.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Vlad Yaseivch <vyasevich@gmail.com>
CC: David S. Miller <davem@davemloft.net>
CC: linux-sctp@vger.kernel.org
Reported-by: Michele Baldessari <michele@redhat.com>
Reported-by: sorin serban <sserban@redhat.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Improve debugging of hci_conn objects by: adding print to hci_conn
refcounting, adding object spcifier when missing, change conn to hcon
since conn is heavily used for l2cap_conn objects and this is misleading.
Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem
Conflicts:
drivers/net/wireless/brcm80211/brcmfmac/dhd_sdio.c
|
|
We do not need it anymore since cfg80211 tracks
monitor channel and monitor channel type.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Implements .set_monitor_enabled(wiphy, enabled).
Notifies driver upon change of interface layout.
If only monitor interfaces become present it is
called with 2nd argument being true. If
non-monitor interface appears then 2nd argument
is false. Driver is notified only upon change.
This makes it more obvious about the fact that
cfg80211 supports single monitor channel. Once we
implement multi-channel we don't want to allow
setting monitor channel while other interface
types are running. Otherwise it would be ambiguous
once we start considering num_different_channels.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
IBSS may hop between channels. It is necessary to
account this special case when considering
interface combinations.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
We need to know which channel is used by a running
AP and mesh for channel context accounting and
finding matching/active interface combination.
STA/IBSS have current_bss already which allows us
to check which channel a vif is tuned to.
Non-fixed channel IBSS can be handled with
additional changes.
Monitor mode is going to be handled differently.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
If rpfilter is off (or the SKB has an IPSEC path) and there are not
tclassid users, we don't have to do anything at all when
fib_validate_source() is invoked besides setting the itag to zero.
We monitor tclassid uses with a counter (modified only under RTNL and
marked __read_mostly) and we protect the fib_validate_source() real
work with a test against this counter and whether rpfilter is to be
done.
Having a way to know whether we need no tclassid processing or not
also opens the door for future optimized rpfilter algorithms that do
not perform full FIB lookups.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
sanity checks
At Facebook, we do Layer-3 DSR via IP-in-IP tunneling. Our load balancers wrap
an extra IP header on incoming packets so they can be routed to the backend.
In the v4 tunnel driver, when these packets fall on the default tunl0 device,
the behavior is to decapsulate them and drop them back on the stack. So our
setup is that tunl0 has the VIP and eth0 has (obviously) the backend's real
address.
In IPv6 we do the same thing, but the v6 tunnel driver didn't have this same
behavior - if you didn't have an explicit tunnel setup, it would drop the
packet.
This patch brings that v4 feature to the v6 driver.
The same IPv6 address checks are performed as with any normal tunnel,
but as the fallback tunnel endpoint addresses are unspecified, the checks
must be performed on a per-packet basis, rather than at tunnel
configuration time.
[Patch description modified by phil@ipom.com]
Signed-off-by: Ville Nuorvala <ville.nuorvala@gmail.com>
Tested-by: Phil Dibowitz <phil@ipom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Checking for in_dev being NULL is pointless.
In fact, all of our callers have in_dev precomputed already,
so just pass it in and remove the NULL checking.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Using NLMSG_GOODSIZE results in multiple pages being used as
nlmsg_new() will automatically add the size of the netlink
header to the payload thus exceeding the page limit.
NLMSG_DEFAULT_SIZE takes this into account.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Cc: Jiri Pirko <jpirko@redhat.com>
Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Cc: Sergey Lapin <slapin@ossfans.org>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Lauro Ramos Venancio <lauro.venancio@openbossa.org>
Cc: Aloisio Almeida Jr <aloisio.almeida@openbossa.org>
Cc: Samuel Ortiz <sameo@linux.intel.com>
Reviewed-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This commit changes inet_csk_route_req() so that it uses a pointer to
a struct flowi6, rather than allocating its own on the stack. This
brings its behavior in line with its IPv4 cousin,
inet_csk_route_req(), and allows a follow-on patch to fix a dst leak.
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
Allow drivers to advertise their VHT capabilities
and export them to userspace via nl80211.
Signed-off-by: Mahesh Palivela <maheshp@posedge.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The specific destination is the host we direct unicast replies to.
Usually this is the original packet source address, but if we are
responding to a multicast or broadcast packet we have to use something
different.
Specifically we must use the source address we would use if we were to
send a packet to the unicast source of the original packet.
The routing cache precomputes this value, but we want to remove that
precomputation because it creates a hard dependency on the expensive
rpfilter source address validation which we'd like to make cheaper.
There are only three places where this matters:
1) ICMP replies.
2) pktinfo CMSG
3) IP options
Now there will be no real users of rt->rt_spec_dst and we can simply
remove it altogether.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Rename it to ip_send_unicast_reply() and add explicit 'saddr'
argument.
This removed one of the few users of rt->rt_spec_dst.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This and ieee80211_add_ext_srates_ie() aren't
exported, so can't be used by drivers anyway,
but there's also no reason that they should be
so make them private to mac80211 and use sdata
instead of vif arguments.
Acked-by: Arik Nemtsov <arik@wizery.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
It's completely unnecessary.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Instead of using a fixed value of "-1" or "-EMSGSIZE", propagate what
the nla_*() interfaces actually return.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This reverts commit c074da2810c118b3812f32d6754bd9ead2f169e7.
This change has several unwanted side effects:
1) Sockets will cache the DST_NOCACHE route in sk->sk_rx_dst and we'll
thus never create a real cached route.
2) All TCP traffic will use DST_NOCACHE and never use the routing
cache at all.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
DDOS synflood attacks hit badly IP route cache.
On typical machines, this cache is allowed to hold up to 8 Millions dst
entries, 256 bytes for each, for a total of 2GB of memory.
rt_garbage_collect() triggers and tries to cleanup things.
Eventually route cache is disabled but machine is under fire and might
OOM and crash.
This patch exploits the new TCP early demux, to set a nocache
boolean in case incoming TCP frame is for a not yet ESTABLISHED or
TIMEWAIT socket.
This 'nocache' boolean is then used in case dst entry is not found in
route cache, to create an unhashed dst entry (DST_NOCACHE)
SYN-cookie-ACK sent use a similar mechanism (ipv4: tcp: dont cache
output dst for syncookies), so after this patch, a machine is able to
absorb a DDOS synflood attack without polluting its IP route cache.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Hans Schillstrom <hans.schillstrom@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|