summaryrefslogtreecommitdiffstats
path: root/net/dccp/ccids/ccid3.h
AgeCommit message (Collapse)Author
2008-09-09This reverts "Merge branch 'dccp' of git://eden-feed.erg.abdn.ac.uk/dccp_exp"Gerrit Renker
as it accentally contained the wrong set of patches. These will be submitted separately. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2008-09-04dccp ccid-3: Preventing OscillationsGerrit Renker
This implements [RFC 3448, 4.5], which performs congestion avoidance behaviour by reducing the transmit rate as the queueing delay (measured in terms of long-term RTT) increases. Oscillation can be turned on/off via a module option (do_osc_prev) and via sysfs (using mode 0644), the default is off. Overflow analysis: ------------------ * oscillation prevention is done after update_x(), so that t_ipi <= 64000; * hence the multiplication "t_ipi * sqrt(R_sample)" needs 64 bits; * done using u64 for sqrt_sample and explicit typecast of t_ipi; * the divisor, R_sqmean, is non-zero because oscillation prevention is first called when receiving the second feedback packet, and tfrc_scaled_rtt() > 0. A detailed discussion of the algorithm (with plots) is on http://www.erg.abdn.ac.uk/users/gerrit/dccp/notes/ccid3/sender_notes/oscillation_prevention/ The algorithm has negative side effects: * when allowing to decrease t_ipi (leads to a large RTT) and * when using it during slow-start; both uses are therefore disabled. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2008-09-04dccp ccid-3: Always perform receiver RTT samplingGerrit Renker
This updates the CCID-3 receiver in part with regard to errata 610 and 611 (http://www.rfc-editor.org/errata_list.php), which change RFC 4342 to use the Receive Rate as specified in rfc3448bis, requiring to constantly sample the RTT (or use a sender RTT). Doing this requires reusing the RX history structure after dealing with a loss. The patch does not resolve how to compute X_recv if the interval is less than 1 RTT. A FIXME has been added (and is resolved in subsequent patch). Furthermore, since this is all TFRC-based functionality, the RTT estimation is now also performed by the dccp_tfrc_lib module. This further simplifies the CCID-3 code. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2008-09-04dccp ccid-3: Remove duplicate RX statesGerrit Renker
The only state information that the CCID-3 receiver keeps is whether initial feedback has been sent or not. Further, this overlaps with use of feedback: * state == TFRC_RSTATE_NO_DATA as long as no feedback has been sent; * state == TFRC_RSTATE_DATA as soon as the first feedback has been sent. This patch reduces the duplication, by memorising the type of the last feedback. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2008-09-04dccp tfrc: Let dccp_tfrc_lib do the sampling workGerrit Renker
This migrates more TFRC-related code into the dccp_tfrc_lib: * sampling of the packet size `s' (which is only needed until the first loss interval is computed (ccid3_first_li)); * updating the byte-counter `bytes_recvd' in between sending feedbacks. The result is a better separation of CCID-3 specific and TFRC specific code, which aids future integration with ECN and e.g. CCID-4. Further changes: ---------------- * replaced magic number of 536 with equivalent constant TCP_MIN_RCVMSS; (this constant is also used when no estimate for `s' is available). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2008-09-04dccp ccid-3: Simplified handling of TX statesGerrit Renker
Since CCIDs are only used during the established phase of a connection, they have very little internal state; this specifically reduces to: * "no packet sent" if and only if s == 0, for the TX packet size s; * when the first packet has been sent (i.e. `s' > 0), the question is whether or not feedback has been received: - if a feedback packet is received, "feedback = yes" is set, - if the nofeedback timer expires, "feedback = no" is set. Thus the CCID only needs to remember state about whether or not feedback has been received. This is now implemented using a boolean flag, which is toggled when a feedback packet arrives or the nofeedback timer expires. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2008-09-04dccp ccid-3: Remove dead statesGerrit Renker
This patch is thanks to an investigation by Leandro Sales de Melo and his colleagues. They worked out two state diagrams which highlight the fact that the xxx_TERM states in CCID-3/4 are in fact not necessary. And this can be confirmed by in turn looking at the code: the xxx_TERM states are only ever set in ccid3_hc_{rx,tx}_exit(). These two functions are part of the following call chain: * ccid_hc_{tx,rx}_exit() are called from ccid_delete() only; * ccid_delete() invokes ccid_hc_{tx,rx}_exit() in the way of a destructor: after calling ccid_hc_{tx,rx}_exit(), the CCID is released from memory; * ccid_delete() is in turn called only by ccid_hc_{tx,rx}_delete(); * ccid_hc_{tx,rx}_delete() is called only if - feature negotiation failed (dccp_feat_activate_values()), - when changing the RX/TX CCID (to eject the current CCID), - when destroying the socket (in dccp_destroy_sock()). In other words, when CCID-3 sets the state to xxx_TERM, it is at a time where no more processing should be going on, hence it is not necessary to introduce a dedicated exit state - this is implicit when unloading the CCID. The patch removes this state, one switch-statement collapses as a result. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2008-09-04dccp ccid-3: Remove duplicate documentationGerrit Renker
This removes RX-socket documentation which is either duplicate or non-existent. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2008-09-04dccp ccid-3: Remove redundant 'options_received' structGerrit Renker
The `options_received' struct is redundant, since it re-duplicates the existing `p' and `x_recv' fields. This patch removes the sub-struct and migrates the format conversion operations (cf. below) to ccid3_hc_tx_parse_options(). Why the fields are redundant ---------------------------- The Loss Event Rate p and the Receive Rate x_recv are initially 0 when first loading CCID-3, as ccid_new() zeroes out the entire ccid3_hc_tx_sock. When Loss Event Rate or Receive Rate options are received, they are stored by ccid3_hc_tx_parse_options() into the fields `ccid3or_loss_event_rate' and `ccid3or_receive_rate' of the sub-struct `options_received' in ccid3_hc_tx_sock. After parsing (considering only the established state - dccp_rcv_established()), the packet is passed on to ccid_hc_tx_packet_recv(). This calls the CCID-3 specific routine ccid3_hc_tx_packet_recv(), which performs the following copy operations between fields of ccid3_hc_tx_sock: * hctx->options_received.ccid3or_receive_rate is copied into hctx->x_recv, after scaling it for fixpoint arithmetic, by 2^64; * hctx->options_received.ccid3or_loss_event_rate is copied into hctx->p, considering the above special cases; in addition, a value of 0 here needs to be mapped into p=0 (when no Loss Event Rate option has been received yet). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2008-09-04dccp ccid-3: Simplify and consolidate tx_parse_optionsGerrit Renker
This simplifies and consolidates the TX option-parsing code: 1. The Loss Intervals option is not currently used, so dead code related to this option is removed. I am aware of no plans to support the option, but if someone wants to implement it (e.g. for inter-op tests), it is better to start afresh than having to also update currently unused code. 2. The Loss Event and Receive Rate options have a lot of code in common (both are 32 bit, both have same length etc.), so this is consolidated. 3. The test against GSR is not necessary, because - on first loading CCID3, ccid_new() zeroes out all fields in the socket; - ccid3_hc_tx_packet_recv() treats 0 and ~0U equivalently, due to pinv = opt_recv->ccid3or_loss_event_rate; if (pinv == ~0U || pinv == 0) hctx->p = 0; - as a result, the sequence number field is removed from opt_recv. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2008-09-04dccp ccid-3: Bug fix for the inter-packet scheduling algorithmGerrit Renker
This fixes a subtle bug in the calculation of the inter-packet gap and shows that t_delta, as it is currently used, is not needed. And hence replaced. The algorithm from RFC 3448, 4.6 below continually computes a send time t_nom, which is initialised with the current time t_now; t_gran = 1E6 / HZ specifies the scheduling granularity, s the packet size, and X the sending rate: t_distance = t_nom - t_now; // in microseconds t_delta = min(t_ipi, t_gran) / 2; // `delta' parameter in microseconds if (t_distance >= t_delta) { reschedule after (t_distance / 1000) milliseconds; } else { t_ipi = s / X; // inter-packet interval in usec t_nom += t_ipi; // compute the next send time send packet now; } 1) Description of the bug ------------------------- Rescheduling requires a conversion into milliseconds, due to this call chain: * ccid3_hc_tx_send_packet() returns a timeout in milliseconds, * this value is converted by msecs_to_jiffies() in dccp_write_xmit(), * and finally used as jiffy-expires-value for sk_reset_timer(). The highest jiffy resolution with HZ=1000 is 1 millisecond, so using a higher granularity does not make much sense here. As a consequence, values of t_distance < 1000 are truncated to 0. This issue has so far been resolved by using instead if (t_distance >= t_delta + 1000) reschedule after (t_distance / 1000) milliseconds; The bug is in artificially inflating t_delta to t_delta' = t_delta + 1000. This is unnecessarily large, a more adequate value is t_delta' = max(t_delta, 1000). 2) Consequences of using the corrected t_delta' ----------------------------------------------- Since t_delta <= t_gran/2 = 10^6/(2*HZ), we have t_delta <= 1000 as long as HZ >= 500. This means that t_delta' = max(1000, t_delta) is constant at 1000. On the other hand, when using a coarse HZ value of HZ < 500, we have three sub-cases that can all be reduced to using another constant of t_gran/2. (a) The first case arises when t_ipi > t_gran. Here t_delta' is the constant t_delta' = max(1000, t_gran/2) = t_gran/2. (b) If t_ipi <= 2000 < t_gran = 10^6/HZ usec, then t_delta = t_ipi/2 <= 1000, so that t_delta' = max(1000, t_delta) = 1000 < t_gran/2. (c) If 2000 < t_ipi <= t_gran, we have t_delta' = max(t_delta, 1000) = t_ipi/2. In the second and third cases we have delay values less than t_gran/2, which is in the order of less than or equal to half a jiffy. How these are treated depends on how fractions of a jiffy are handled: they are either always rounded down to 0, or always rounded up to 1 jiffy (assuming non-zero values). In both cases the error is on average in the order of 50%. Thus we are not increasing the error when in the second/third case we replace a value less than t_gran/2 with 0, by setting t_delta' to the constant t_gran/2. 3) Summary ---------- Fixing (1) and considering (2), the patch replaces t_delta with a constant, whose value depends on CONFIG_HZ, changing the above algorithm to: if (t_distance >= t_delta') reschedule after (t_distance / 1000) milliseconds; where t_delta' = 10^6/(2*HZ) if HZ < 500, and t_delta' = 1000 otherwise. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2008-09-04dccp ccid-3: Remove ccid3hc{tx,rx}_ prefixesGerrit Renker
This patch does the same for CCID-3 as the previous patch for CCID-2: s#ccid3hctx_##g; s#ccid3hcrx_##g; plus manual editing to retain consistency. Please note: expanded the fields of the `struct tfrc_tx_info' in the hc_tx_sock, since using short #define identifiers is not a good idea. The only place where this embedded struct was used is ccid3_hc_tx_getsockopt(). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2008-01-28[CCID3]: Use a function to update p_inv, and p is never usedGerrit Renker
This patch 1) concentrates previously scattered computation of p_inv into one function; 2) removes the `p' element of the CCID3 RX sock (it is redundant); 3) makes the tfrc_rx_info structure standalone, only used on demand. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[CCID3]: Interface CCID3 code with newer Loss Intervals DatabaseGerrit Renker
This hooks up the TFRC Loss Interval database with CCID 3 packet reception. In addition, it makes the CCID-specific computation of the first loss interval (which requires access to all the guts of CCID3) local to ccid3.c. The patch also fixes an omission in the DCCP code, that of a default / fallback RTT value (defined in section 3.4 of RFC 4340 as 0.2 sec); while at it, the upper bound of 4 seconds for an RTT sample has been reduced to match the initial TCP RTO value of 3 seconds from[RFC 1122, 4.2.3.1]. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[CCID3]: Redundant debugging output / documentationGerrit Renker
Each time feedback is sent two lines are printed: ccid3_hc_rx_send_feedback: client ... - entry ccid3_hc_rx_send_feedback: Interval ...usec, X_recv=..., 1/p=... The first line is redundant and thus removed. Further, documentation of ccid3_hc_rx_sock (capitalisation) is made consistent. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[TFRC]: New rx history codeArnaldo Carvalho de Melo
Credit here goes to Gerrit Renker, that provided the initial implementation for this new codebase. I modified it just to try to make it closer to the existing API, renaming some functions, add namespacing and fix one bug where the tfrc_rx_hist_alloc was not freeing the allocated ring entries on the error path. Original changeset comment from Gerrit: ----------- This provides a new, self-contained and generic RX history service for TFRC based protocols. Details: * new data structure, initialisation and cleanup routines; * allocation of dccp_rx_hist entries local to packet_history.c, as a service exported by the dccp_tfrc_lib module. * interface to automatically track highest-received seqno; * receiver-based RTT estimation (needed for instance by RFC 3448, 6.3.1); * a generic function to test for `data packets' as per RFC 4340, sec. 7.7. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[TFRC]: Migrate TX history to singly-linked lisArnaldo Carvalho de Melo
This patch was based on another made by Gerrit Renker, his changelog was: ------------------------------------------------------ The patch set migrates TFRC TX history to a singly-linked list. The details are: * use of a consistent naming scheme (all TFRC functions now begin with `tfrc_'); * allocation and cleanup are taken care of internally; * provision of a lookup function, which is used by the CCID TX infrastructure to determine the time a packet was sent (in turn used for RTT sampling); * integration of the new interface with the present use in CCID3. ------------------------------------------------------ Simplifications I did: . removing the tfrc_tx_hist_head that had a pointer to the list head and another for the slabcache. . No need for creating a slabcache for each CCID that wants to use the TFRC tx history routines, create a single slabcache when the dccp_tfrc_lib module init routine is called. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[CCID3]: Accurately determine idle & application-limited periodsGerrit Renker
This fixes/updates the handling of idle and application-limited periods in CCID3, which currently is broken: there is no detection as to how long a sender has been idle - there is only one flag which is toggled in between function calls. Being obsolete now, the `idle' flag is removed. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[CCID3]: Ignore trivial amounts of elapsed timeGerrit Renker
This patch fixes a previously undiscovered bug; the problem is in computing the elapsed time as the time between `receiving' the packet (i.e. skb enters CCID module) and sending feedback: - there is no layer-processing, queueing, or delay involved, - hence the elapsed time is in the order of 1 function call - this is in the dimension of maximally 50..100usec - which renders the use of elapsed time almost entirely useless. The fix is simply to ignore such trivial amounts of elapsed time. As a further advantage, the now useless elapsed_time field can be removed from the socket, which reduces the socket structure by another four bytes. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10[CCID3]: Remove ifdef surrounding BUG_ONArnaldo Carvalho de Melo
As suggested by DaveM. Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
2007-10-10[CCID3]: Move NULL-protection into functionGerrit Renker
This moves several instances of testing against NULL into the function which is used to de-reference the CCID-private data. Committer note: Made the BUG_ON depend on having CONFIG_IP_DCCP_CCID3_DEBUG, as it is too much to have this on production code. Also made sure that the macro is used only after checking if sk_state is not LISTEN, to make it equivalent to what we had before. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
2007-10-10[DCCP]: Convert ccid3hcrx_tstamp_last_feedback to ktime_tArnaldo Carvalho de Melo
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10[DCCP]: Convert ccid3hcrx_tstamp_last_ack to ktime_tArnaldo Carvalho de Melo
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10[DCCP]: Convert ccid3hctx_t_ld to ktime_tArnaldo Carvalho de Melo
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10[CCID3]: Sending time: update to ktime_tGerrit Renker
This updates the computation of t_nom and t_last_win_count to use the newer gettimeofday interface. Committer note: used ktime_to_timeval to set the 'now' variable to t_ld in ccid3hctx_no_feedback_timer Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
2007-04-25[CCID3]: Use function for RTT samplingGerrit Renker
This replaces the existing occurrences of RTT sampling with the use of the new function dccp_sample_rtt. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-11[DCCP]: Whitespace cleanupsArnaldo Carvalho de Melo
That accumulated over the last months hackaton, shame on me for not using git-apply whitespace helping hand, will do that from now on. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-11[DCCP] ccid3: Fixup some type conversions related to rttsArnaldo Carvalho de Melo
Spotted by David Miller when compiling on sparc64, I reproduced it here on parisc64, that are the only platforms to define __kernel_suseconds_t as an 'int', all the others, x86_64 and x86 included typedef it as a 'long', but from the definition of suseconds_t it should just be an 'int' on platforms where it is >= 32bits, it would not require all the castings from suseconds_t to (int) when printking variables of this type, that are not needed on parisc64 and sparc64. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-11[DCCP] ccid3: Sanity-check RTT samplesGerrit Renker
CCID3 performance depends much on the accuracy of RTT samples. If RTT samples grow too large, performance can be catastrophically poor. To limit the amount of possible damage in such cases, the patch * introduces an upper limit which identifies a maximum `sane' RTT value; * uses a macro to enforce this upper limit. Using a macro was given preference, since it is necessary to identify the calling function in the warning message. Since exceeding this threshold identifies a critical condition, DCCP_CRIT is used and not DCCP_WARN. Many thanks to Ian McDonald for collaboration on this issue. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-11[DCCP]: Simplify TFRC calculationGerrit Renker
In migrating towards using the newer functions scaled_div/scaled_div32 for TFRC computations mapped from floating-point onto integer arithmetic, this completes the last stage of modifications. In particular, the overflow case for computing X_calc is circumvented by * breaking the computation into two stages * the first stage, res = (s*1E6)/R, cannot overflow due to use of u64 * in the second stage, res = (res*1E6)/f, overflow on u32 is avoided due to (i) returning UINT_MAX in this case (which is logically appropriate) and (ii) issuing a warning message into the system log (since very likely there is a problem somewhere else with the parameters) Lastly, all such scaling operations are now exported into tfrc.h, since actually this form of scaled computation is specific to TFRC and not to CCID3. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-11[DCCP] ccid3: Finer-grained resolution of sending ratesGerrit Renker
This patch * resolves a bug where packets smaller than 32/64 bytes resulted in sending rates of 0 * supports all sending rates from 1/64 bytes/second up to 4Gbyte/second * simplifies the present overflow problems in calculations Current sending rate X and the cached value X_recv of the receiver-estimated sending rate are both scaled by 64 (2^6) in order to * cope with low sending rates (minimally 1 byte/second) * allow upgrading to use a packets-per-second implementation of CCID 3 * avoid calculation errors due to integer arithmetic cut-off The patch implements a revised strategy from http://www.mail-archive.com/dccp@vger.kernel.org/msg01040.html The only difference with regard to that strategy is that t_ipi is already used in the calculation of the nofeedback timeout, which saves one division. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-03[DCCP] ccid3: Deprecate TFRC_SMALLEST_PGerrit Renker
This patch deprecates the existing use of an arbitrary value TFRC_SMALLEST_P for low-threshold values of p. This avoids masking low-resolution errors. Instead, the code now checks against real boundaries (implemented by preceding patch) and provides warnings whenever a real value falls below the threshold. If such messages are observed, it is a better solution to take this as an indication that the lookup table needs to be re-engineered. Changelog: ---------- This patch * makes handling all TFRC resolution errors local to the TFRC library * removes unnecessary test whether X_calc is 'infinity' due to p==0 -- this condition is already caught by tfrc_calc_x() * removes setting ccid3hctx_p = TFRC_SMALLEST_P in ccid3_hc_tx_packet_recv since this is now done by the TFRC library * updates BUG_ON test in ccid3_hc_tx_no_feedback_timer to take into account that p now is either 0 (and then X_calc is irrelevant), or it is > 0; since the handling of TFRC_SMALLEST_P is now taken care of in the tfrc library Justification: -------------- The TFRC code uses a lookup table which has a bounded resolution. The lowest possible value of the loss event rate `p' which can be resolved is currently 0.0001. Substituting this lower threshold for p when p is less than 0.0001 results in a huge, exponentially-growing error. The error can be computed by the following formula: (f(0.0001) - f(p))/f(p) * 100 for p < 0.0001 Currently the solution is to use an (arbitrary) value TFRC_SMALLEST_P = 40 * 1E-6 = 0.00004 and to consider all values below this value as `virtually zero'. Due to the exponentially growing resolution error, this is not a good idea, since it hides the fact that the table can not resolve practically occurring cases. Already at p == TFRC_SMALLEST_P, the error is as high as 58.19%! Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-02[DCCP] ccid3: Larger initial windowsGerrit Renker
This implements the larger-initial-windows feature for CCID 3, as described in section 5 of RFC 4342. When the first feedback packet arrives, the sender can send up to 2..4 packets per RTT, instead of just one. The patch further * reduces the number of timestamping calls by passing the timestamp value (which is computed in one of the calling functions anyway) as argument * renames one constant with a very long name into one which is shorter and resembles the one in RFC 3448 (t_mbi) * simplifies some of the min_t/max_t cases where both `x', `y' have the same type Commiter note: renamed TFRC_t_mbi to TFRC_T_MBI, to follow Linux coding style. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-02[DCCP]: Tidy up unused structuresGerrit Renker
This removes and cleans up unused variables and structures which have become unnecessary following the introduction of the EWMA patch to automatically track the CCID 3 receiver/sender packet sizes `s'. It deprecates the PACKET_SIZE socket option by returning an error code and printing a deprecation warning if an application tries to read or write this socket option. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-02[DCCP] ccid3: Consolidate timer resetsGerrit Renker
This patch concerns updating the value of the nofeedback timer when no feedback has been received so far. Since in this case the value of R is still undefined according to [RFC 3448, 4.2], we can not perform step (3) of [RFC 3448, 4.3]. A clarification is provided in [RFC 4342, sec. 5], which states that in these cases the nofeedback timer (still) expires "after two seconds". Many thanks to Ian McDonald for pointing this out and providing the clarification. The patch * implements [RFC 4342, sec. 5] with regard to the above case * consolidates handling timer restart by - adding an appropriate jump label and - initialising the timeout value Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-02[DCCP] ccid3: Fix bug in calculation of first t_nom and first t_ipiGerrit Renker
Problem:
2006-12-02[CCID 3]: Add annotations for socket structuresGerrit Renker
This adds documentation to the CCID 3 rx/tx socket fields, plus some minor re-formatting. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-02[DCCP]: Introduce DCCP_{BUG{_ON},CRIT} macros, use enum:8 for the ccid3 statesGerrit Renker
This patch tackles the following problem: * the ccid3_hc_{t,r}x_sock define ccid3hc{t,r}x_state as `u8', but in reality there can only be a few, pre-defined enum names * this necessitates addiditional checking for unexpected values which would otherwise be caught by the compiler Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-08-26[DCCP]: Fix CCID3Ian McDonald
This fixes CCID3 to give much closer performance to RFC4342. CCID3 is meant to alter sending rate based on RTT and loss. The performance was verified against: http://wand.net.nz/~perry/max_download.php For example I tested with netem and had the following parameters: Delayed Acks 1, MSS 256 bytes, RTT 105 ms, packet loss 5%. This gives a theoretical speed of 71.9 Kbits/s. I measured across three runs with this patch set and got 70.1 Kbits/s. Without this patchset the average was 232 Kbits/s which means Linux can't be used for CCID3 research properly. I also tested with netem turned off so box just acting as router with 1.2 msec RTT. The performance with this is the same with or without the patch at around 30 Mbit/s. Signed off by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-26[DCCP]: Update contact details and copyrightIan McDonald
Just updating copyright and contacts Signed off by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-30Remove obsolete #include <linux/config.h>Jörn Engel
Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-03-20[DCCP] CCID: Improve CCID infrastructureArnaldo Carvalho de Melo
1. No need for ->ccid_init nor ->ccid_exit, this is what module_{init,exit} does and anynways neither ccid2 nor ccid3 were using it. 2. Rename struct ccid to struct ccid_operations and introduce struct ccid with a pointer to ccid_operations and rigth after it the rx or tx private state. 3. Remove the pointer to the state of the half connections from struct dccp_sock, now its derived thru ccid_priv() from the ccid pointer. Now we also can implement the setsockopt for changing the CCID easily as no ccid init routines can affect struct dccp_sock in any way that prevents other CCIDs from working if a CCID switch operation is asked by apps. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-18[CCID3]: Introduce include/linux/tfrc.hArnaldo Carvalho de Melo
Moving the TFRC sender and receiver variables to separate structs, so that we can copy these structs to userspace thru getsockopt, dccp_diag, etc. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-09[CCID3] Initialize ccid3hctx_t_ipi to 250msArnaldo Carvalho de Melo
To match more closely what is described in RFC 3448. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: Ian McDonald <iam4@cs.waikato.ac.nz>
2005-09-09[CCID3] Introduce ccid3_hc_[rt]x_sk() for overal consistencyArnaldo Carvalho de Melo
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-09[DCCP] Introduce dccp_timestampArnaldo Carvalho de Melo
To start the timestamps with 0.0ms, easing the integer maths in the CCIDs, this probably will be reworked to use the to be introduced struct timeval_offset infrastructure out of skb_get_timestamp, etc. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-08-29[CCID3]: Move ccid3_hc_rx_add_hist to packet_history.cArnaldo Carvalho de Melo
Renaming it to dccp_rx_hist_add_packet. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29[CCID3]: Move the loss interval code to loss_interval.[ch]Arnaldo Carvalho de Melo
And put this into net/dccp/ccids/lib/, where packet_history.[ch] will also be moved and then we'll have a tfrc_lib.ko module that will be used by dccp_ccid3.ko and other CCIDs that are variations of TFRC (RFC 3448). Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29[CCID3]: Move the CCID3 defines to ccid3.hArnaldo Carvalho de Melo
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29[CCID3]: Reintroduce ccid3hctx_t_rtoArnaldo Carvalho de Melo
CCID3 keeps this variable in usecs, inet_connection_socks in jiffies, so to avoid Mars orbiter losses lets reintroduce ccid3hctx_t_rto 8) Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>