summaryrefslogtreecommitdiffstats
path: root/net
AgeCommit message (Collapse)Author
2009-05-20Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6
2009-05-20net: fix rtable leak in net/ipv4/route.cEric Dumazet
Alexander V. Lukyanov found a regression in 2.6.29 and made a complete analysis found in http://bugzilla.kernel.org/show_bug.cgi?id=13339 Quoted here because its a perfect one : begin_of_quotation 2.6.29 patch has introduced flexible route cache rebuilding. Unfortunately the patch has at least one critical flaw, and another problem. rt_intern_hash calculates rthi pointer, which is later used for new entry insertion. The same loop calculates cand pointer which is used to clean the list. If the pointers are the same, rtable leak occurs, as first the cand is removed then the new entry is appended to it. This leak leads to unregister_netdevice problem (usage count > 0). Another problem of the patch is that it tries to insert the entries in certain order, to facilitate counting of entries distinct by all but QoS parameters. Unfortunately, referencing an existing rtable entry moves it to list beginning, to speed up further lookups, so the carefully built order is destroyed. For the first problem the simplest patch it to set rthi=0 when rthi==cand, but it will also destroy the ordering. end_of_quotation Problematic commit is 1080d709fb9d8cd4392f93476ee46a9d6ea05a5b (net: implement emergency route cache rebulds when gc_elasticity is exceeded) Trying to keep dst_entries ordered is too complex and breaks the fact that order should depend on the frequency of use for garbage collection. A possible fix is to make rt_intern_hash() simpler, and only makes rt_check_expire() a litle bit smarter, being able to cope with an arbitrary entries order. The added loop is running on cache hot data, while cpu is prefetching next object, so should be unnoticied. Reported-and-analyzed-by: Alexander V. Lukyanov <lav@yar.ru> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-20net: fix length computation in rt_check_expire()Eric Dumazet
rt_check_expire() computes average and standard deviation of chain lengths, but not correclty reset length to 0 at beginning of each chain. This probably gives overflows for sum2 (and sum) on loaded machines instead of meaningful results. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-20cfg80211: fix race between core hint and driver's custom applyLuis R. Rodriguez
Its possible for cfg80211 to have scheduled the work and for the global workqueue to not have kicked in prior to a cfg80211 driver's regulatory hint or wiphy_apply_custom_regulatory(). Although this is very unlikely its possible and should fix this race. When this race would happen you are expected to have hit a null pointer dereference panic. Cc: stable@kernel.org Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Tested-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-05-20wext: verify buffer size for SIOCSIWENCODEEXTJohannes Berg
Another design flaw in wireless extensions (is anybody surprised?) in the way it handles the iw_encode_ext structure: The structure is part of the 'extra' memory but contains the key length explicitly, instead of it just being the length of the extra buffer - size of the struct and using the explicit key length only for the get operation (which only writes it). Therefore, we have this layout: extra: +-------------------------+ | struct iw_encode_ext { | | ... | | u16 key_len; | | u8 key[0]; | | }; | +-------------------------+ | key material | +-------------------------+ Now, all drivers I checked use ext->key_len without checking that both key_len and the struct fit into the extra buffer that has been copied from userspace. This leads to a buffer overrun while reading that buffer, depending on the driver it may be possible to specify arbitrary key_len or it may need to be a proper length for the key algorithm specified. Thankfully, this is only exploitable by root, but root can actually cause a segfault or use kernel memory as a key (which you can even get back with siocgiwencode or siocgiwencodeext from the key buffer). Fix this by verifying that key_len fits into the buffer along with struct iw_encode_ext. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-05-18ipv4: make default for INET_LRO consistent with help textFrans Pop
Commit e81963b1 ("ipv4: Make INET_LRO a bool instead of tristate.") changed this config from tristate to bool. Add default so that it is consistent with the help text. Signed-off-by: Frans Pop <elendil@planet.nl> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-18net: fix skb_seq_read returning wrong offset/length for page frag dataThomas Chenault
When called with a consumed value that is less than skb_headlen(skb) bytes into a page frag, skb_seq_read() incorrectly returns an offset/length relative to skb->data. Ensure that data which should come from a page frag does. Signed-off-by: Thomas Chenault <thomas_chenault@dell.com> Tested-by: Shyam Iyer <shyam_iyer@dell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-18pkt_sched: gen_estimator: use 64 bit intermediate counters for bpsEric Dumazet
gen_estimator can overflow bps (bytes per second) with Gb links, while it was designed with a u32 API, with a theorical limit of 34360Mbit (2^32 bytes) Using 64 bit intermediate avbps/brate counters can allow us to reach this theorical limit. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-18sch_teql: should not dereference skb after ndo_start_xmit()Eric Dumazet
It is illegal to dereference a skb after a successful ndo_start_xmit() call. We must store skb length in a local variable instead. Bug was introduced in 2.6.27 by commit 0abf77e55a2459aa9905be4b226e4729d5b4f0cb (net_sched: Add accessor function for packet length for qdiscs) Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-18tcp: fix MSG_PEEK race checkIlpo Järvinen
Commit 518a09ef11 (tcp: Fix recvmsg MSG_PEEK influence of blocking behavior) lets the loop run longer than the race check did previously expect, so we need to be more careful with this check and consider the work we have been doing. I tried my best to deal with urg hole madness too which happens here: if (!sock_flag(sk, SOCK_URGINLINE)) { ++*seq; ... by using additional offset by one but I certainly have very little interest in testing that part. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Tested-by: Frans Pop <elendil@planet.nl> Tested-by: Ian Zimmermann <itz@buug.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-17bridge: fix initial packet flood if !STPStephen Hemminger
If bridge is configured with no STP and forwarding delay of 0 (which is typical for virtualization) then when link starts it will flood all packets for the first 20 seconds. This bug was introduced by a combination of earlier changes: * forwarding database uses hold time of zero to indicate user wants to always flood packets * optimzation of the case of forwarding delay of 0 avoids the initial timer tick The fix is to just skip all the topology change detection code if kernel STP is not being used. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-17bridge: relay bridge multicast pkgs if !STPStephen Hemminger
Currently the bridge catches all STP packets; even if STP is turned off. This prevents other systems (which do have STP turned on) from being able to detect loops in the network. With this patch, if STP is off, then any packet sent to the STP multicast group address is forwarded to all ports. Based on earlier patch by Joakim Tjernlund with changes to go through forwarding (not local chain), and optimization that only last octet needs to be checked. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-17ipconfig: handle case of delayed DHCP serverChris Friesen
If a DHCP server is delayed, it's possible for the client to receive the DHCPOFFER after it has already sent out a new DHCPDISCOVER message from a second interface. The client then sends out a DHCPREQUEST from the second interface, but the server doesn't recognize the device and rejects the request. This patch simply tracks the current device being configured and throws away the OFFER if it is not intended for the current device. A more sophisticated approach would be to put the OFFER information into the struct ic_device rather than storing it globally. Signed-off-by: Chris Friesen <cfriesen@nortel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-17netpoll: don't dereference NULL dev from npPavel Emelyanov
It looks like the dev in netpoll_poll can be NULL - at lease it's checked at the function beginning. Thus the dev->netde_ops dereference looks dangerous. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/holtmann/bluetooth-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/holtmann/bluetooth-2.6: Bluetooth: Don't trigger disconnect timeout for security mode 3 pairing Bluetooth: Don't use hci_acl_connect_cancel() for incoming connections Bluetooth: Fix wrong module refcount when connection setup fails Another case of me handling the fallout from Davem's unfortunate addiction to shuffleboard. Won't anybody think of the children? Join the anti-shuffleboard league today!
2009-05-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6: iwlwifi: fix device id registration for 6000 series 2x2 devices ath5k: update channel in sw state after stopping RX and TX rtl8187: use DMA-aware buffers with usb_control_msg mac80211: avoid NULL ptr deref when finding max_rates in PID and minstrel airo: airo_get_encode{,ext} potential buffer overflow Pulled directly by Linus because Davem is off playing shuffle-board at some Alaskan cruise, and the NULL ptr deref issue hits people and should get merged sooner rather than later. David - make us proud on the shuffle-board tournament!
2009-05-12Merge branch 'for-2.6.30' of git://linux-nfs.org/~bfields/linuxLinus Torvalds
* 'for-2.6.30' of git://linux-nfs.org/~bfields/linux: nfsd: silence lockdep warning lockd: fix list corruption on lockd restart nfsd4: check for negative dentry before use in nfsv4 readdir nfsd41: slots are freed with session svcrdma: clean up error paths. svcrdma: Fix dma map direction for rdma read targets
2009-05-11mac80211: avoid NULL ptr deref when finding max_rates in PID and minstrelJohn W. Linville
"There is another problem with this piece of code. The sband will be NULL after second iteration on single band device and cause null pointer dereference. Everything is working with dual band card. Sorry, but i don't know how to explain this clearly in English. I have looked on the second patch for pid algorithm and found similar bug." Reported-by: Karol Szuster <qflon@o2.pl> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-05-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (26 commits) bonding: fix panic if initialization fails IXP4xx: complete Ethernet netdev setup before calling register_netdev(). IXP4xx: use "ENODEV" instead of "ENOSYS" in module initialization. ipvs: Fix IPv4 FWMARK virtual services ipv4: Make INET_LRO a bool instead of tristate. net: remove stale reference to fastroute from Kconfig help text net: update skb_recycle_check() for hardware timestamping changes bnx2: Fix panic in bnx2_poll_work(). net-sched: fix bfifo default limit igb: resolve panic on shutdown when SR-IOV is enabled wimax: oops: wimax_dev_add() is the only one that can initialize the state wimax: fix oops if netlink fails to add attribute Bluetooth: Move dev_set_name() to a context that can sleep netfilter: ctnetlink: fix wrong message type in user updates netfilter: xt_cluster: fix use of cluster match with 32 nodes netfilter: ip6t_ipv6header: fix match on packets ending with NEXTHDR_NONE netfilter: add missing linux/types.h include to xt_LED.h mac80211: pid, fix memory corruption mac80211: minstrel, fix memory corruption cfg80211: fix comment on regulatory hint processing ...
2009-05-09Bluetooth: Don't trigger disconnect timeout for security mode 3 pairingMarcel Holtmann
A remote device in security mode 3 that tries to connect will require the pairing during the connection setup phase. The disconnect timeout is now triggered within 10 milliseconds and causes the pairing to fail. If a connection is not fully established and a PIN code request is received, don't trigger the disconnect timeout. The either successful or failing connection complete event will make sure that the timeout is triggered at the right time. The biggest problem with security mode 3 is that many Bluetooth 2.0 device and before use a temporary security mode 3 for dedicated bonding. Based on a report by Johan Hedberg <johan.hedberg@nokia.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Tested-by: Johan Hedberg <johan.hedberg@nokia.com>
2009-05-09Bluetooth: Don't use hci_acl_connect_cancel() for incoming connectionsMarcel Holtmann
The connection setup phase takes around 2 seconds or longer and in that time it is possible that the need for an ACL connection is no longer present. If that happens then, the connection attempt will be canceled. This only applies to outgoing connections, but currently it can also be triggered by incoming connection. Don't call hci_acl_connect_cancel() on incoming connection since these have to be either accepted or rejected in this state. Once they are successfully connected they need to be fully disconnected anyway. Also remove the wrong hci_acl_disconn() call for SCO and eSCO links since at this stage they can't be disconnected either, because the connection handle is still unknown. Based on a report by Johan Hedberg <johan.hedberg@nokia.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Tested-by: Johan Hedberg <johan.hedberg@nokia.com>
2009-05-09Bluetooth: Fix wrong module refcount when connection setup failsMarcel Holtmann
The module refcount is increased by hci_dev_hold() call in hci_conn_add() and decreased by hci_dev_put() call in del_conn(). In case the connection setup fails, hci_dev_put() is never called. Procedure to reproduce the issue: # hciconfig hci0 up # lsmod | grep btusb -> "used by" refcount = 1 # hcitool cc <non-exisiting bdaddr> -> will get timeout # lsmod | grep btusb -> "used by" refcount = 2 # hciconfig hci0 down # lsmod | grep btusb -> "used by" refcount = 1 # rmmod btusb -> ERROR: Module btusb is in use The hci_dev_put() call got moved into del_conn() with the 2.6.25 kernel to fix an issue with hci_dev going away before hci_conn. However that change was wrong and introduced this problem. When calling hci_conn_del() it has to call hci_dev_put() after freeing the connection details. This handling should be fully symmetric. The execution of del_conn() is done in a work queue and needs it own calls to hci_dev_hold() and hci_dev_put() to ensure that the hci_dev stays until the connection cleanup has been finished. Based on a report by Bing Zhao <bzhao@marvell.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Tested-by: Bing Zhao <bzhao@marvell.com>
2009-05-08ipvs: Fix IPv4 FWMARK virtual servicesSimon Horman
This fixes the use of fwmarks to denote IPv4 virtual services which was unfortunately broken as a result of the integration of IPv6 support into IPVS, which was included in 2.6.28. The problem arises because fwmarks are stored in the 4th octet of a union nf_inet_addr .all, however in the case of IPv4 only the first octet, corresponding to .ip, is assigned and compared. In other words, using .all = { 0, 0, 0, htonl(svc->fwmark) always results in a value of 0 (32bits) being stored for IPv4. This means that one fwmark can be used, as it ends up being mapped to 0, but things break down when multiple fwmarks are used, as they all end up being mapped to 0. As fwmarks are 32bits a reasonable fix seems to be to just store the fwmark in .ip, and comparing and storing .ip when fwmarks are used. This patch makes the assumption that in calls to ip_vs_ct_in_get() and ip_vs_sched_persist() if the proto parameter is IPPROTO_IP then we are dealing with an fwmark. I believe this is valid as ip_vs_in() does fairly strict filtering on the protocol and IPPROTO_IP should not be used in these calls unless explicitly passed when making these calls for fwmarks in ip_vs_sched_persist(). Tested-by: Fabien Duchêne <fabien.duchene@student.uclouvain.be> Cc: Joseph Mack NA3T <jmack@wm7d.net> Cc: Julius Volz <julius.volz@gmail.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-08ipv4: Make INET_LRO a bool instead of tristate.David S. Miller
This code is used as a library by several device drivers, which select INET_LRO. If some are modules and some are statically built into the kernel, we get build failures if INET_LRO is modular. Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-07net: remove stale reference to fastroute from Kconfig help textAshish Karkare
Signed-off-by: Ashish Karkare <akarkare@marvell.com> Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-06net: update skb_recycle_check() for hardware timestamping changesLennert Buytenhek
Commit ac45f602ee3d1b6f326f68bc0c2591ceebf05ba4 ("net: infrastructure for hardware time stamping") added two skb initialization actions to __alloc_skb(), which need to be added to skb_recycle_check() as well. Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org> Signed-off-by: Patrick Ohly <patrick.ohly@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-06net-sched: fix bfifo default limitPatrick McHardy
When no limit is given, the bfifo uses a default of tx_queue_len * mtu. Packets handled by qdiscs include the link layer header, so this should be taken into account, similar to what other qdiscs do. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-06Merge branch 'linux-2.6.30.y' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/inaky/wimax
2009-05-06wimax: oops: wimax_dev_add() is the only one that can initialize the stateInaky Perez-Gonzalez
When a new wimax_dev is created, it's state has to be __WIMAX_ST_NULL until wimax_dev_add() is succesfully called. This allows calls into the stack that happen before said time to be rejected. Until now, the state was being set (by mistake) to UNINITIALIZED, which was allowing calls such as wimax_report_rfkill_hw() to go through even when a call to wimax_dev_add() had failed; that was causing an oops when touching uninitialized data. This situation is normal when the device starts reporting state before the whole initialization has been completed. It just has to be dealt with. Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com>
2009-05-06wimax: fix oops if netlink fails to add attributeInaky Perez-Gonzalez
When sending a message to user space using wimax_msg(), if nla_put() fails, correctly interpret the return code from wimax_msg_alloc() as an err ptr and return the error code instead of crashing (as it is assuming than non-NULL means the pointer is ok). Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com>
2009-05-05Bluetooth: Move dev_set_name() to a context that can sleepMarcel Holtmann
Setting the name of a sysfs device has to be done in a context that can actually sleep. It allocates its memory with GFP_KERNEL. Previously it was a static (size limited) string and that got changed to accommodate longer device names. So move the dev_set_name() just before calling device_add() which is executed in a work queue. This fixes the following error: [ 110.012125] BUG: sleeping function called from invalid context at mm/slub.c:1595 [ 110.012135] in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper [ 110.012141] 2 locks held by swapper/0: [ 110.012145] #0: (hci_task_lock){++.-.+}, at: [<ffffffffa01f822f>] hci_rx_task+0x2f/0x2d0 [bluetooth] [ 110.012173] #1: (&hdev->lock){+.-.+.}, at: [<ffffffffa01fb9e2>] hci_event_packet+0x72/0x25c0 [bluetooth] [ 110.012198] Pid: 0, comm: swapper Tainted: G W 2.6.30-rc4-g953cdaa #1 [ 110.012203] Call Trace: [ 110.012207] <IRQ> [<ffffffff8023eabd>] __might_sleep+0x14d/0x170 [ 110.012228] [<ffffffff802cfbe1>] __kmalloc+0x111/0x170 [ 110.012239] [<ffffffff803c2094>] kvasprintf+0x64/0xb0 [ 110.012248] [<ffffffff803b7a5b>] kobject_set_name_vargs+0x3b/0xa0 [ 110.012257] [<ffffffff80465326>] dev_set_name+0x76/0xa0 [ 110.012273] [<ffffffffa01fb9e2>] ? hci_event_packet+0x72/0x25c0 [bluetooth] [ 110.012289] [<ffffffffa01ffc1d>] hci_conn_add_sysfs+0x3d/0x70 [bluetooth] [ 110.012303] [<ffffffffa01fba2c>] hci_event_packet+0xbc/0x25c0 [bluetooth] [ 110.012312] [<ffffffff80516eb0>] ? sock_def_readable+0x80/0xa0 [ 110.012328] [<ffffffffa01fee0c>] ? hci_send_to_sock+0xfc/0x1c0 [bluetooth] [ 110.012343] [<ffffffff80516eb0>] ? sock_def_readable+0x80/0xa0 [ 110.012347] [<ffffffff805e88c5>] ? _read_unlock+0x75/0x80 [ 110.012354] [<ffffffffa01fee0c>] ? hci_send_to_sock+0xfc/0x1c0 [bluetooth] [ 110.012360] [<ffffffffa01f8403>] hci_rx_task+0x203/0x2d0 [bluetooth] [ 110.012365] [<ffffffff80250ab5>] tasklet_action+0xb5/0x160 [ 110.012369] [<ffffffff8025116c>] __do_softirq+0x9c/0x150 [ 110.012372] [<ffffffff805e850f>] ? _spin_unlock+0x3f/0x80 [ 110.012376] [<ffffffff8020cbbc>] call_softirq+0x1c/0x30 [ 110.012380] [<ffffffff8020f01d>] do_softirq+0x8d/0xe0 [ 110.012383] [<ffffffff80250df5>] irq_exit+0xc5/0xe0 [ 110.012386] [<ffffffff8020e71d>] do_IRQ+0x9d/0x120 [ 110.012389] [<ffffffff8020c3d3>] ret_from_intr+0x0/0xf [ 110.012391] <EOI> [<ffffffff80431832>] ? acpi_idle_enter_bm+0x264/0x2a6 [ 110.012399] [<ffffffff80431828>] ? acpi_idle_enter_bm+0x25a/0x2a6 [ 110.012403] [<ffffffff804f50d5>] ? cpuidle_idle_call+0xc5/0x130 [ 110.012407] [<ffffffff8020a4b4>] ? cpu_idle+0xc4/0x130 [ 110.012411] [<ffffffff805d2268>] ? rest_init+0x88/0xb0 [ 110.012416] [<ffffffff807e2fbd>] ? start_kernel+0x3b5/0x412 [ 110.012420] [<ffffffff807e2281>] ? x86_64_start_reservations+0x91/0xb5 [ 110.012424] [<ffffffff807e2394>] ? x86_64_start_kernel+0xef/0x11b Based on a report by Davide Pesavento <davidepesa@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Tested-by: Hugo Mildenberger <hugo.mildenberger@namir.de> Tested-by: Bing Zhao <bzhao@marvell.com>
2009-05-05Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6
2009-05-05Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6
2009-05-05netfilter: ctnetlink: fix wrong message type in user updatesPablo Neira Ayuso
This patch fixes the wrong message type that are triggered by user updates, the following commands: (term1)# conntrack -I -p tcp -s 1.1.1.1 -d 2.2.2.2 -t 10 --sport 10 --dport 20 --state LISTEN (term1)# conntrack -U -p tcp -s 1.1.1.1 -d 2.2.2.2 -t 10 --sport 10 --dport 20 --state SYN_SENT (term1)# conntrack -U -p tcp -s 1.1.1.1 -d 2.2.2.2 -t 10 --sport 10 --dport 20 --state SYN_RECV only trigger event message of type NEW, when only the first is NEW while others should be UPDATE. (term2)# conntrack -E [NEW] tcp 6 10 LISTEN src=1.1.1.1 dst=2.2.2.2 sport=10 dport=20 [UNREPLIED] src=2.2.2.2 dst=1.1.1.1 sport=20 dport=10 mark=0 [NEW] tcp 6 10 SYN_SENT src=1.1.1.1 dst=2.2.2.2 sport=10 dport=20 [UNREPLIED] src=2.2.2.2 dst=1.1.1.1 sport=20 dport=10 mark=0 [NEW] tcp 6 10 SYN_RECV src=1.1.1.1 dst=2.2.2.2 sport=10 dport=20 [UNREPLIED] src=2.2.2.2 dst=1.1.1.1 sport=20 dport=10 mark=0 This patch also removes IPCT_REFRESH from the bitmask since it is not of any use. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-05-05netfilter: xt_cluster: fix use of cluster match with 32 nodesPablo Neira Ayuso
This patch fixes a problem when you use 32 nodes in the cluster match: % iptables -I PREROUTING -t mangle -i eth0 -m cluster \ --cluster-total-nodes 32 --cluster-local-node 32 \ --cluster-hash-seed 0xdeadbeef -j MARK --set-mark 0xffff iptables: Invalid argument. Run `dmesg' for more information. % dmesg | tail -1 xt_cluster: this node mask cannot be higher than the total number of nodes The problem is related to this checking: if (info->node_mask >= (1 << info->total_nodes)) { printk(KERN_ERR "xt_cluster: this node mask cannot be " "higher than the total number of nodes\n"); return false; } (1 << 32) is 1. Thus, the checking fails. BTW, I said this before but I insist: I have only tested the cluster match with 2 nodes getting ~45% extra performance in an active-active setup. The maximum limit of 32 nodes is still completely arbitrary. I'd really appreciate if people that have more nodes in their setups let me know. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-05-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (30 commits) e1000: fix virtualization bug bonding: fix alb mode locking regression Bluetooth: Fix issue with sysfs handling for connections usbnet: CDC EEM support (v5) tcp: Fix tcp_prequeue() to get correct rto_min value ehea: fix invalid pointer access ne2k-pci: Do not register device until initialized. Subject: [PATCH] br2684: restore net_dev initialization net: Only store high 16 bits of kernel generated filter priorities virtio_net: Fix function name typo virtio_net: Cleanup command queue scatterlist usage bonding: correct the cleanup in bond_create() virtio: add missing include to virtio_net.h smsc95xx: add support for LAN9512 and LAN9514 smsc95xx: configure LED outputs netconsole: take care of NETDEV_UNREGISTER event xt_socket: checks for the state of nf_conntrack bonding: bond_slave_info_query() fix cxgb3: fixing gcc 4.4 compiler warning: suggest parentheses around operand of ‘!’ netfilter: use likely() in xt_info_rdlock_bh() ...
2009-05-05netfilter: ip6t_ipv6header: fix match on packets ending with NEXTHDR_NONEChristoph Paasch
As packets ending with NEXTHDR_NONE don't have a last extension header, the check for the length needs to be after the check for NEXTHDR_NONE. Signed-off-by: Christoph Paasch <christoph.paasch@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-05-04Bluetooth: Fix issue with sysfs handling for connectionsMarcel Holtmann
Due to a semantic changes in flush_workqueue() the current approach of synchronizing the sysfs handling for connections doesn't work anymore. The whole approach is actually fully broken and based on assumptions that are no longer valid. With the introduction of Simple Pairing support, the creation of low-level ACL links got changed. This change invalidates the reason why in the past two independent work queues have been used for adding/removing sysfs devices. The adding of the actual sysfs device is now postponed until the host controller successfully assigns an unique handle to that link. So the real synchronization happens inside the controller and not the host. The only left-over problem is that some internals of the sysfs device handling are not initialized ahead of time. This leaves potential access to invalid data and can cause various NULL pointer dereferences. To fix this a new function makes sure that all sysfs details are initialized when an connection attempt is made. The actual sysfs device is only registered when the connection has been successfully established. To avoid a race condition with the registration, the check if a device is registered has been moved into the removal work. As an extra protection two flush_work() calls are left in place to make sure a previous add/del work has been completed first. Based on a report by Marc Pignat <marc.pignat@hevs.ch> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Tested-by: Justin P. Mattock <justinmattock@gmail.com> Tested-by: Roger Quadros <ext-roger.quadros@nokia.com> Tested-by: Marc Pignat <marc.pignat@hevs.ch>
2009-05-04mac80211: pid, fix memory corruptionJiri Slaby
pid doesn't count with some band having more bitrates than the one associated the first time. Fix that by counting the maximal available bitrate count and allocate big enough space. Secondly, fix touching uninitialized memory which causes panics. Index sucked from this random memory points to the hell. The fix is to sort the rates on each band change. Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-05-04mac80211: minstrel, fix memory corruptionJiri Slaby
minstrel doesn't count max rate count in fact, since it doesn't use a loop variable `i' and hence allocs space only for bitrates found in the first band. Fix it by involving the `i' as an index so that it traverses all the bands now and finds the real max bitrate count. Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Cc: Felix Fietkau <nbd@openwrt.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-05-04cfg80211: fix comment on regulatory hint processingLuis R. Rodriguez
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-05-04cfg80211: fix bug while trying to process beacon hints on initLuis R. Rodriguez
During initialization we would not have received any beacons so skip processing reg beacon hints, also adds a check to reg_is_world_roaming() for last_request before accessing its fields. This should fix this: BUG: unable to handle kernel NULL pointer dereference at IP: [<e0171332>] wiphy_update_regulatory+0x20f/0x295 *pdpt = 0000000008bf1001 *pde = 0000000000000000 Oops: 0000 [#1] last sysfs file: /sys/class/backlight/eeepc/brightness Modules linked in: ath5k(+) mac80211 led_class cfg80211 go_bit cfbcopyarea cfbimgblt cfbfillrect ipv6 ydev usual_tables(P) snd_hda_codec_realtek snd_hda_intel nd_hwdep uhci_hcd snd_pcm_oss snd_mixer_oss i2c_i801 e serio_raw i2c_core pcspkr atl2 snd_pcm intel_agp re agpgart eeepc_laptop snd_page_alloc ac video backlight rfkill button processor evdev thermal fan ata_generic Pid: 2909, comm: modprobe Tainted: Pc #112) 701 EIP: 0060:[<e0171332>] EFLAGS: 00010246 CPU: 0 EIP is at wiphy_update_regulatory+0x20f/0x295 [cfg80211] EAX: 00000000 EBX: c5da0000 ECX: 00000000 EDX: c5da0060 ESI: 0000001a EDI: c5da0060 EBP: df3bdd70 ESP: df3bdd40 DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068 Process modprobe (pid: 2909, ti=df3bc000 task=c5d030000) Stack: df3bdd90 c5da0060 c04277e0 00000001 00000044 c04277e402 00000002 c5da0000 0000001a c5da0060 df3bdda8 e01706a2 02 00000282 000080d0 00000068 c5d53500 00000080 0000028240 Call Trace: [<e01706a2>] ? wiphy_register+0x122/0x1b7 [cfg80211] [<e0328e02>] ? ieee80211_register_hw+0xd8/0x346 [<e06a7c9f>] ? ath5k_hw_set_bssid_mask+0x71/0x78 [ath5k] [<e06b0c52>] ? ath5k_pci_probe+0xa5c/0xd0a [ath5k] [<c01a6037>] ? sysfs_find_dirent+0x16/0x27 [<c01fec95>] ? local_pci_probe+0xe/0x10 [<c01ff526>] ? pci_device_probe+0x48/0x66 [<c024c9fd>] ? driver_probe_device+0x7f/0xf2 [<c024cab3>] ? __driver_attach+0x43/0x5f [<c024c0af>] ? bus_for_each_dev+0x39/0x5a [<c024c8d0>] ? driver_attach+0x14/0x16 [<c024ca70>] ? __driver_attach+0x0/0x5f [<c024c5b3>] ? bus_add_driver+0xd7/0x1e7 [<c024ccb9>] ? driver_register+0x7b/0xd7 [<c01ff827>] ? __pci_register_driver+0x32/0x85 [<e00a8018>] ? init_ath5k_pci+0x18/0x30 [ath5k] [<c0101131>] ? _stext+0x49/0x10b [<e00a8000>] ? init_ath5k_pci+0x0/0x30 [ath5k] [<c012f452>] ? __blocking_notifier_call_chain+0x40/0x4c [<c013a714>] ? sys_init_module+0x87/0x18b [<c0102804>] ? sysenter_do_call+0x12/0x22 Code: b8 da 17 e0 83 c0 04 e8 92 f9 ff ff 84 c0 75 2a 8b 85 c0 74 0c 83 c0 04 e8 7c f9 ff ff 84 c0 75 14 a1 bc da 4 03 74 66 8b 4d d4 80 79 08 00 74 5d a1 e0 d2 17 e0 48 EIP: [<e0171332>] wiphy_update_regulatory+0x20f/0x295 SP 0068:df3bdd40 CR2: 0000000000000004 ---[ end trace 830f2dd2a95fd1a8 ]--- This issue is hard to reproduce, but it was noticed and discussed on this thread: http://marc.info/?t=123938022700005&r=1&w=2 Cc: stable@kernel.org Reported-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk> Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-05-04cfg80211: fix race condition with wiphy_apply_custom_regulatory()Luis R. Rodriguez
We forgot to lock using the cfg80211_mutex in wiphy_apply_custom_regulatory(). Without the lock there is possible race between processing a reply from CRDA and a driver calling wiphy_apply_custom_regulatory(). During the processing of the reply from CRDA we free last_request and wiphy_apply_custom_regulatory() eventually accesses an element from last_request in the through freq_reg_info_regd(). This is very difficult to reproduce (I haven't), it takes us 3 hours and you need to be banging hard, but the race is obvious by looking at the code. This should only affect those who use this caller, which currently is ath5k, ath9k, and ar9170. EIP: 0060:[<f8ebec50>] EFLAGS: 00210282 CPU: 1 EIP is at freq_reg_info_regd+0x24/0x121 [cfg80211] EAX: 00000000 EBX: f7ca0060 ECX: f5183d94 EDX: 0024cde0 ESI: f8f56edc EDI: 00000000 EBP: 00000000 ESP: f5183d44 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 Process modprobe (pid: 14617, ti=f5182000 task=f3934d10 task.ti=f5182000) Stack: c0505300 f7ca0ab4 f5183d94 0024cde0 f8f403a6 f8f63160 f7ca0060 00000000 00000000 f8ebedf8 f5183d90 f8f56edc 00000000 00000004 00000f40 f8f56edc f7ca0060 f7ca1234 00000000 00000000 00000000 f7ca14f0 f7ca0ab4 f7ca1289 Call Trace: [<f8ebedf8>] wiphy_apply_custom_regulatory+0x8f/0x122 [cfg80211] [<f8f3f798>] ath_attach+0x707/0x9e6 [ath9k] [<f8f45e46>] ath_pci_probe+0x18d/0x29a [ath9k] [<c023c7ba>] pci_device_probe+0xa3/0xe4 [<c02a860b>] really_probe+0xd7/0x1de [<c02a87e7>] __driver_attach+0x37/0x55 [<c02a7eed>] bus_for_each_dev+0x31/0x57 [<c02a83bd>] driver_attach+0x16/0x18 [<c02a78e6>] bus_add_driver+0xec/0x21b [<c02a8959>] driver_register+0x85/0xe2 [<c023c9bb>] __pci_register_driver+0x3c/0x69 [<f8e93043>] ath9k_init+0x43/0x68 [ath9k] [<c010112b>] _stext+0x3b/0x116 [<c014a872>] sys_init_module+0x8a/0x19e [<c01049ad>] sysenter_do_call+0x12/0x21 [<ffffe430>] 0xffffe430 ======================= Code: 0f 94 c0 c3 31 c0 c3 55 57 56 53 89 c3 83 ec 14 8b 74 24 2c 89 54 24 0c 89 4c 24 08 85 f6 75 06 8b 35 c8 bb ec f8 a1 cc bb ec f8 <8b> 40 04 83 f8 03 74 3a 48 74 37 8b 43 28 85 c0 74 30 89 c6 8b EIP: [<f8ebec50>] freq_reg_info_regd+0x24/0x121 [cfg80211] SS:ESP 0068:f5183d44 Cc: stable@kernel.org Reported-by: Nataraj Sadasivam <Nataraj.Sadasivam@Atheros.com> Reported-by: Vivek Natarajan <Vivek.Natarajan@Atheros.com> Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-05-04cfg80211: fix truncated IEsJohannes Berg
Another bug in the "cfg80211: do not replace BSS structs" patch, a forgotten length update leads to bogus data being stored and passed to userspace, often truncated. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-05-04mac80211: correct fragmentation threshold checkJohannes Berg
The fragmentation threshold is defined to be including the FCS, and the code that sets the TX_FRAGMENTED flag correctly accounts for those four bytes. The code that verifies this doesn't though, which could lead to spurious warnings and frames being dropped although everything is ok. Correct the code by accounting for the FCS. (JWL -- The problem is described here: http://article.gmane.org/gmane.linux.kernel.wireless.general/32205 ) Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-05-04tcp: Fix tcp_prequeue() to get correct rto_min valueSatoru SATOH
tcp_prequeue() refers to the constant value (TCP_RTO_MIN) regardless of the actual value might be tuned. The following patches fix this and make tcp_prequeue get the actual value returns from tcp_rto_min(). Signed-off-by: Satoru SATOH <satoru.satoh@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-03svcrdma: clean up error paths.Steve Wise
These fixes resolved crashes due to resource leak BUG_ON checks. The resource leaks were detected by introducing asynchronous transport errors. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-05-02Subject: [PATCH] br2684: restore net_dev initializationRabin Vincent
Commit 0ba25ff4c669e5395110ba6ab4958a97a9f96922 ("br2684: convert to net_device_ops") inadvertently deleted the initialization of the net_dev pointer in the br2684_dev structure, leading to crashes. This patch adds it back. Reported-by: Mikko Vinni <mmvinni@yahoo.com> Tested-by: Mikko Vinni <mmvinni@yahoo.com> Signed-off-by: Rabin Vincent <rabin@rab.in> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-02net: Only store high 16 bits of kernel generated filter prioritiesRobert Love
The kernel should only be using the high 16 bits of a kernel generated priority. Filter priorities in all other cases only use the upper 16 bits of the u32 'prio' field of 'struct tcf_proto', but when the kernel generates the priority of a filter is saves all 32 bits which can result in incorrect lookup failures when a filter needs to be deleted or modified. Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-01xt_socket: checks for the state of nf_conntrackLaszlo Attila Toth
xt_socket can use connection tracking, and checks whether it is a module. Signed-off-by: Laszlo Attila Toth <panther@balabit.hu> Signed-off-by: David S. Miller <davem@davemloft.net>