summaryrefslogtreecommitdiffstats
path: root/net
AgeCommit message (Collapse)Author
2005-11-15[PATCH] knfsd: make sure nfsd doesn't hog a cpu foreverNeilBrown
Being kernel-threads, nfsd servers don't get pre-empted (depending on CONFIG). If there is a steady stream of NFS requests that can be served from cache, an nfsd thread may hold on to a cpu indefinitely, which isn't very friendly. So it is good to have a cond_resched in there (just before looking for a new request to serve), to make sure we play nice. Signed-off-by: Neil Brown <neilb@suse.de> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14[NETFILTER] fix leak of fragment queue at unloading nf_conntrack_ipv6Yasuyuki Kozakai
This patch makes nf_conntrack_ipv6 free all IPv6 fragment queues at module unloading time. Also introduce a BUG_ON if we ever again have leaks in the memory accounting. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-14[NETFILTER] nf_conntrack: fix possibility of infinite loop while evicting ↵Yasuyuki Kozakai
nf_ct_frag6_queue This synchronizes nf_ct_reasm with ipv6 reassembly, and fixes a possibility of an infinite loop if CPUs evict and create nf_ct_frag6_queue in parallel. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-14[NETFILTER]: fix type of sysctl variables in nf_conntrack_ipv6Yasuyuki Kozakai
These variables should be unsigned. This fixes sysctl handler for nf_ct_frag6_{low,high}_thresh. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-14[NETFILTER]: cleanup IPv6 Netfilter KconfigYasuyuki Kozakai
This removes linux 2.4 configs in comments as TODO lists. And this also move the entry of nf_conntrack to top like IPv4 Netfilter Kconfig. Based on original patch by Krzysztof Piotr Oledzki <ole@ans.pl>. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-14[NETFILTER]: link 'netfilter' before ipv4Krzysztof Oledzki
Staticaly linked nf_conntrack_ipv4 requires nf_conntrack. but currently nf_conntrack is linked after it. This changes the order of ipv4 and netfilter to fix this. Signed-off-by: Krzysztof Oledzki <olenf@ans.pl> Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-14[NETFILTER] nfnetlink: unconditionally require CAP_NET_ADMINHarald Welte
This patch unconditionally requires CAP_NET_ADMIN for all nfnetlink messages. It also removes the per-message cap_required field, since all existing subsystems use CAP_NET_ADMIN for all their messages anyway. Patrick McHardy owes me a beer if we ever need to re-introduce this. Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-14[NETFILTER] nf_conntrack: Add missing code to TCP conntrack moduleKOVACS Krisztian
Looks like the nf_conntrack TCP code was slightly mismerged: it does not contain an else branch present in the IPv4 version. Let's add that code and make the testsuite happy. Signed-off-by: KOVACS Krisztian <hidden@balabit.hu> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-14[NETFILTER] ctnetlink: More thorough size checking of attributesPablo Neira Ayuso
Add missing size checks. Thanks Patrick McHardy for the hint. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-14[NETFILTER] ctnetlink: use size_t to make gcc-4.x happyPablo Neira Ayuso
Make gcc-4.x happy. Use size_t instead of int. Thanks to Patrick McHardy for the hint. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-12[IPV6]: Fix unnecessary GFP_ATOMIC allocation in fib6 dumpThomas Graf
Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-12[NETFILTER] {ip,nf}_conntrack TCP: Accept SYN+PUSH like SYNVlad Drukker
Some devices (e.g. Qlogic iSCSI HBA hardware like QLA4010 up to firmware 3.0.0.4) initiates TCP with SYN and PUSH flags set. The Linux TCP/IP stack deals fine with that, but the connection tracking code doesn't. This patch alters TCP connection tracking to accept SYN+PUSH as a valid flag combination. Signed-off-by: Vlad Drukker <vlad@storewiz.com> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-12[IPV6]: Fix rtnetlink dump infinite loopHerbert Xu
The recent change to netlink dump "done" callback handling broke IPv6 which played dirty tricks with the "done" callback. This causes an infinite loop during a dump. The following patch fixes it. This bug was reported by Jeff Garzik. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-11[SCTP]: Include ulpevents in socket receive buffer accounting.Neil Horman
Also introduces a sysctl option to configure the receive buffer accounting policy to be either at socket or association level. Default is all the associations on the same socket share the receive buffer. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-11[SCTP]: Remove timeouts[] array from sctp_endpoint.Vladislav Yasevich
The socket level timeout values are maintained in sctp_sock and association level timeouts are in sctp_association. So there is no need for ep->timeouts. Signed-off-by: Vladislav Yasevich <vladislav.yasevich@hp.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-11[SCTP]: Fix potential NULL pointer dereference in sctp_v4_get_saddrVladislav Yasevich
It is possible to get to sctp_v4_get_saddr() without a valid association. This happens when processing OOTB packets and the cached route entry is no longer valid. However, when responding to OOTB packets we already properly set the source address based on the information in the OOTB packet. So, if we we get to sctp_v4_get_saddr() without an association we can simply return. Signed-off-by: Vladislav Yasevich <vladislav.yasevich@hp.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-11[IPV6]: Fix inet6_init missing unregister.David S. Miller
Based mostly upon a patch from Olaf Kirch <okir@suse.de> When initialization fails in inet6_init(), we should unregister the PF_INET6 socket ops. Also, check sock_register()'s return value for errors. Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-11[DECNET]: fix SIGPIPEPatrick Caulfield
Currently recvmsg generates SIGPIPE whereas sendmsg does not; for the other stacks it seems to be the other way round! It also fixes the bug where reading from a socket whose peer has shutdown returned -EINVAL rather than 0. Signed-off-by: Patrick Caulfield <patrick@tykepenguin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-11[PATCH] TCP: fix vegas buildJeff Garzik
Recent TCP changes broke the build. Signed-off-by: Jeff Garzik <jgarzik@pobox.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-10[TCP]: speed up SACK processingStephen Hemminger
Use "hints" to speed up the SACK processing. Various forms of this have been used by TCP developers (Web100, STCP, BIC) to avoid the 2x linear search of outstanding segments. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-10[TCP]: spelling fixesStephen Hemminger
Minor spelling fixes for TCP code. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-10[TCP]: receive buffer growth limiting with mixed MTUJohn Heffner
This is a patch for discussion addressing some receive buffer growing issues. This is partially related to the thread "Possible BUG in IPv4 TCP window handling..." last week. Specifically it addresses the problem of an interaction between rcvbuf moderation (receiver autotuning) and rcv_ssthresh. The problem occurs when sending small packets to a receiver with a larger MTU. (A very common case I have is a host with a 1500 byte MTU sending to a host with a 9k MTU.) In such a case, the rcv_ssthresh code is targeting a window size corresponding to filling up the current rcvbuf, not taking into account that the new rcvbuf moderation may increase the rcvbuf size. One hunk makes rcv_ssthresh use tcp_rmem[2] as the size target rather than rcvbuf. The other changes the behavior when it overflows its memory bounds with in-order data so that it tries to grow rcvbuf (the same as with out-of-order data). These changes should help my problem of mixed MTUs, and should also help the case from last week's thread I think. (In both cases though you still need tcp_rmem[2] to be set much larger than the TCP window.) One question is if this is too aggressive at trying to increase rcvbuf if it's under memory stress. Orignally-from: John Heffner <jheffner@psc.edu> Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-10[TCP]: Appropriate Byte Count supportStephen Hemminger
This is an updated version of the RFC3465 ABC patch originally for Linux 2.6.11-rc4 by Yee-Ting Li. ABC is a way of counting bytes ack'd rather than packets when updating congestion control. The orignal ABC described in the RFC applied to a Reno style algorithm. For advanced congestion control there is little change after leaving slow start. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-10[TCP]: add tcp_slow_start helperStephen Hemminger
Move all the code that does linear TCP slowstart to one inline function to ease later patch to add ABC support. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-10[TCP]: simplify microsecond rtt samplingStephen Hemminger
Simplify the code that comuputes microsecond rtt estimate used by TCP Vegas. Move the callback out of the RTT sampler and into the end of the ack cleanup. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-10[TCP]: fix congestion window update when using TSO deferalStephen Hemminger
TCP peformance with TSO over networks with delay is awful. On a 100Mbit link with 150ms delay, we get 4Mbits/sec with TSO and 50Mbits/sec without TSO. The problem is with TSO, we intentionally do not keep the maximum number of packets in flight to fill the window, we hold out to until we can send a MSS chunk. But, we also don't update the congestion window unless we have filled, as per RFC2861. This patch replaces the check for the congestion window being full with something smarter that accounts for TSO. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-10[NET]: Detect hardware rx checksum faults correctlyHerbert Xu
Here is the patch that introduces the generic skb_checksum_complete which also checks for hardware RX checksum faults. If that happens, it'll call netdev_rx_csum_fault which currently prints out a stack trace with the device name. In future it can turn off RX checksum. I've converted every spot under net/ that does RX checksum checks to use skb_checksum_complete or __skb_checksum_complete with the exceptions of: * Those places where checksums are done bit by bit. These will call netdev_rx_csum_fault directly. * The following have not been completely checked/converted: ipmr ip_vs netfilter dccp This patch is based on patches and suggestions from Stephen Hemminger and David S. Miller. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-09Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
2005-11-09[PATCH] SUNRPC: don't reencode when looping in call transmit.Trond Myklebust
If the call to xprt_transmit() fails due to socket buffer space exhaustion, we do not need to re-encode the RPC message when we loop back through call_transmit. Re-encoding can actually end up triggering the WARN_ON() in call_decode() if we re-encode something like a read() request and auth->au_rslack has changed. It can also cause us to increment the RPCSEC_GSS sequence number beyond the limits of the allowed window. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-10[NETLINK]: Generic netlink familyThomas Graf
The generic netlink family builds on top of netlink and provides simplifies access for the less demanding netlink users. It solves the problem of protocol numbers running out by introducing a so called controller taking care of id management and name resolving. Generic netlink modules register themself after filling out their id card (struct genl_family), after successful registration the modules are able to register callbacks to command numbers by filling out a struct genl_ops and calling genl_register_op(). The registered callbacks are invoked with attributes parsed making life of simple modules a lot easier. Although generic netlink modules can request static identifiers, it is recommended to use GENL_ID_GENERATE and to let the controller assign a unique identifier to the module. Userspace applications will then ask the controller and lookup the idenfier by the module name. Due to the current multicast implementation of netlink, the number of generic netlink modules is restricted to 1024 to avoid wasting memory for the per socket multiacst subscription bitmask. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-10[RTNETLINK]: Use generic netlink receive queue processorThomas Graf
Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-10[XFRM]: Use generic netlink receive queue processorThomas Graf
Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-10[NETLINK]: Generic netlink receive queue processorThomas Graf
Introduces netlink_run_queue() to handle the receive queue of a netlink socket in a generic way. Processes as much as there was in the queue upon entry and invokes a callback function for each netlink message found. The callback function may refuse a message by returning a negative error code but setting the error pointer to 0 in which case netlink_run_queue() will return with a qlen != 0. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-10[NETLINK]: Make netlink_callback->done() optionalThomas Graf
Most netlink families make no use of the done() callback, making it optional gets rid of all unnecessary dummy implementations. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-10[NETLINK]: Type-safe netlink messages/attributes interfaceThomas Graf
Introduces a new type-safe interface for netlink message and attributes handling. The interface is fully binary compatible with the old interface towards userspace. Besides type safety, this interface features attribute validation capabilities, simplified message contstruction, and documentation. The resulting netlink code should be smaller, less error prone and easier to understand. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-09[NETFILTER]: Add nf_conntrack subsystem.Yasuyuki Kozakai
The existing connection tracking subsystem in netfilter can only handle ipv4. There were basically two choices present to add connection tracking support for ipv6. We could either duplicate all of the ipv4 connection tracking code into an ipv6 counterpart, or (the choice taken by these patches) we could design a generic layer that could handle both ipv4 and ipv6 and thus requiring only one sub-protocol (TCP, UDP, etc.) connection tracking helper module to be written. In fact nf_conntrack is capable of working with any layer 3 protocol. The existing ipv4 specific conntrack code could also not deal with the pecularities of doing connection tracking on ipv6, which is also cured here. For example, these issues include: 1) ICMPv6 handling, which is used for neighbour discovery in ipv6 thus some messages such as these should not participate in connection tracking since effectively they are like ARP messages 2) fragmentation must be handled differently in ipv6, because the simplistic "defrag, connection track and NAT, refrag" (which the existing ipv4 connection tracking does) approach simply isn't feasible in ipv6 3) ipv6 extension header parsing must occur at the correct spots before and after connection tracking decisions, and there were no provisions for this in the existing connection tracking design 4) ipv6 has no need for stateful NAT The ipv4 specific conntrack layer is kept around, until all of the ipv4 specific conntrack helpers are ported over to nf_conntrack and it is feature complete. Once that occurs, the old conntrack stuff will get placed into the feature-removal-schedule and we will fully kill it off 6 months later. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-09[IPV6]: ip6ip6_lock is not unlocked in error path.Ken-ichirou MATSUZAWA
From: Ken-ichirou MATSUZAWA <chamas@h4.dion.ne.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-09[IPV6]: Fix fallout from CONFIG_IPV6_PRIVACYPeter Chubb
Trying to build today's 2.6.14+git snapshot gives undefined references to use_tempaddr Looks like an ifdef got left out. Signed-off-by: Peter Chubb <peterc@gelato.unsw.edu.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-09[NETFILTER] ctnetlink: ICMP_ID is u_int16_t not u_int8_t.Krzysztof Piotr Oledzki
Signed-off-by: Krzysztof Piotr Oledzki <ole@ans.pl> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-09[NETFILTER] ctnetlink: Fix oops when no ICMP ID info in messageKrzysztof Piotr Oledzki
This patch fixes an userspace triggered oops. If there is no ICMP_ID info the reference to attr will be NULL. Signed-off-by: Krzysztof Piotr Oledzki <ole@ans.pl> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-09[NETFILTER] ctnetlink: Add support to identify expectations by ID'sPablo Neira Ayuso
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-09[NETFILTER] ctnetlink: propagate error instaed of returning -EPERMPablo Neira Ayuso
Propagate the error to userspace instead of returning -EPERM if the get conntrack operation fails. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-09[NETFILTER] ctnetlink: return -EINVAL if size is wrongPablo Neira Ayuso
Return -EINVAL if the size isn't OK instead of -EPERM. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-09[NETFILTER]: stop tracking ICMP error at early pointYasuyuki Kozakai
Currently connection tracking handles ICMP error like normal packets if it failed to get related connection. But it fails that after all. This makes connection tracking stop tracking ICMP error at early point. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-09[NETFILTER] nfnetlink: only load subsystems if CAP_NET_ADMIN is setHarald Welte
Without this patch, any user can cause nfnetlink subsystems to be autoloaded. Those subsystems however could add significant processing overhead to packet processing, and would refuse any configuration messages from non-CAP_NET_ADMIN processes anyway. This patch follows a suggestion from Patrick McHardy. Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-09[NETFILTER] PPTP helper: fix PNS-PAC expectation call idPhilip Craig
The reply tuple of the PNS->PAC expectation was using the wrong call id. So we had the following situation: - PNS behind NAT firewall - PNS call id requires NATing - PNS->PAC gre packet arrives first then the PNS->PAC expectation is matched, and the other expectation is deleted, but the PAC->PNS gre packets do not match the gre conntrack because the call id is wrong. We also cannot use ip_nat_follow_master(). Signed-off-by: Philip Craig <philipc@snapgear.com> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-09[NETFILTER] ctnetlink: get_conntrack can use GFP_KERNELPablo Neira Ayuso
ctnetlink_get_conntrack is always called from user context, so GFP_KERNEL is enough. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-09[NETFILTER] ctnetlink: kill unused includesPablo Neira Ayuso
Kill some useless headers included in ctnetlink. They aren't used in any way. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-09[NETFILTER] ctnetlink: add module alias to fix autoloadingPablo Neira Ayuso
Add missing module alias. This is a must to load ctnetlink on demand. For example, the conntrack tool will fail if the module isn't loaded. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-09[NETFILTER] ctnetlink: add marking support from userspacePablo Neira Ayuso
This patch adds support for conntrack marking from user space. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>