asmadeus/linux.git - The linux kernel

Age	Commit message (Collapse)	Author
2011-07-20	fs: add SEEK_HOLE and SEEK_DATA flags	Josef Bacik
	This just gets us ready to support the SEEK_HOLE and SEEK_DATA flags. Turns out using fiemap in things like cp cause more problems than it solves, so lets try and give userspace an interface that doesn't suck. We need to match solaris here, and the definitions are o If /whence/ is SEEK_HOLE, the offset of the start of the next hole greater than or equal to the supplied offset is returned. The definition of a hole is provided near the end of the DESCRIPTION. o If /whence/ is SEEK_DATA, the file pointer is set to the start of the next non-hole file region greater than or equal to the supplied offset. So in the generic case the entire file is data and there is a virtual hole at the end. That means we will just return i_size for SEEK_HOLE and will return the same offset for SEEK_DATA. This is how Solaris does it so we have to do it the same way. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	fs: seq_file - add event counter to simplify poll() support	Kay Sievers
	Moving the event counter into the dynamically allocated 'struc seq_file' allows poll() support without the need to allocate its own tracking structure. All current users are switched over to use the new counter. Requested-by: Andrew Morton akpm@linux-foundation.org Acked-by: NeilBrown <neilb@suse.de> Tested-by: Lucas De Marchi lucas.demarchi@profusion.mobi Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	fs: simplify the blockdev_direct_IO prototype	Christoph Hellwig
	Simple filesystems always pass inode->i_sb_bdev as the block device argument, and never need a end_io handler. Let's simply things for them and for my grepping activity by dropping these arguments. The only thing not falling into that scheme is ext4, which passes and end_io handler without needing special flags (yet), but given how messy the direct I/O code there is use of __blockdev_direct_IO in one instead of two out of three cases isn't going to make a large difference anyway. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	rw_semaphore: remove up/down_read_non_owner	Christoph Hellwig
	Now that the last users is gone these can be removed. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	fs: kill i_alloc_sem	Christoph Hellwig
	i_alloc_sem is a rather special rw_semaphore. It's the last one that may be released by a non-owner, and it's write side is always mirrored by real exclusion. It's intended use it to wait for all pending direct I/O requests to finish before starting a truncate. Replace it with a hand-grown construct: - exclusion for truncates is already guaranteed by i_mutex, so it can simply fall way - the reader side is replaced by an i_dio_count member in struct inode that counts the number of pending direct I/O requests. Truncate can't proceed as long as it's non-zero - when i_dio_count reaches non-zero we wake up a pending truncate using wake_up_bit on a new bit in i_flags - new references to i_dio_count can't appear while we are waiting for it to read zero because the direct I/O count always needs i_mutex (or an equivalent like XFS's i_iolock) for starting a new operation. This scheme is much simpler, and saves the space of a spinlock_t and a struct list_head in struct inode (typically 160 bits on a non-debug 64-bit system). Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	anonfd: fix missing declaration	Tomasz Stanislawski
	The forward declaration of struct file_operations is added to avoid compilation warnings. Signed-off-by: Tomasz Stanislawski <t.stanislaws@samsung.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	superblock: add filesystem shrinker operations	Dave Chinner
	Now we have a per-superblock shrinker implementation, we can add a filesystem specific callout to it to allow filesystem internal caches to be shrunk by the superblock shrinker. Rather than perpetuate the multipurpose shrinker callback API (i.e. nr_to_scan == 0 meaning "tell me how many objects freeable in the cache), two operations will be added. The first will return the number of objects that are freeable, the second is the actual shrinker call. Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	superblock: introduce per-sb cache shrinker infrastructure	Dave Chinner
	With context based shrinkers, we can implement a per-superblock shrinker that shrinks the caches attached to the superblock. We currently have global shrinkers for the inode and dentry caches that split up into per-superblock operations via a coarse proportioning method that does not batch very well. The global shrinkers also have a dependency - dentries pin inodes - so we have to be very careful about how we register the global shrinkers so that the implicit call order is always correct. With a per-sb shrinker callout, we can encode this dependency directly into the per-sb shrinker, hence avoiding the need for strictly ordering shrinker registrations. We also have no need for any proportioning code for the shrinker subsystem already provides this functionality across all shrinkers. Allowing the shrinker to operate on a single superblock at a time means that we do less superblock list traversals and locking and reclaim should batch more effectively. This should result in less CPU overhead for reclaim and potentially faster reclaim of items from each filesystem. Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	inode: move to per-sb LRU locks	Dave Chinner
	With the inode LRUs moving to per-sb structures, there is no longer a need for a global inode_lru_lock. The locking can be made more fine-grained by moving to a per-sb LRU lock, isolating the LRU operations of different filesytsems completely from each other. Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	inode: Make unused inode LRU per superblock	Dave Chinner
	The inode unused list is currently a global LRU. This does not match the other global filesystem cache - the dentry cache - which uses per-superblock LRU lists. Hence we have related filesystem object types using different LRU reclaimation schemes. To enable a per-superblock filesystem cache shrinker, both of these caches need to have per-sb unused object LRU lists. Hence this patch converts the global inode LRU to per-sb LRUs. The patch only does rudimentary per-sb propotioning in the shrinker infrastructure, as this gets removed when the per-sb shrinker callouts are introduced later on. Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	vmscan: add customisable shrinker batch size	Dave Chinner
	For shrinkers that have their own cond_resched* calls, having shrink_slab break the work down into small batches is not paticularly efficient. Add a custom batchsize field to the struct shrinker so that shrinkers can use a larger batch size if they desire. A value of zero (uninitialised) means "use the default", so behaviour is unchanged by this patch. Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	vmscan: add shrink_slab tracepoints	Dave Chinner
	It is impossible to understand what the shrinkers are actually doing without instrumenting the code, so add a some tracepoints to allow insight to be gained. Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	btrfs: kill magical embedded struct superblock	Al Viro
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	switch vfs_path_lookup() to struct path	Al Viro
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	kill lookup_create()	Al Viro
	folded into the only caller (kern_path_create()) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	make sure that nsproxy_cache is initialized early enough	Al Viro
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	new helpers: kern_path_create/user_path_create	Al Viro
	combination of kern_path_parent() and lookup_create(). Does not expose struct nameidata to caller. Syscalls converted to that... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	kill LOOKUP_CONTINUE	Al Viro
	LOOKUP_PARENT is equivalent to it now Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	nfs_open_context doesn't need struct path either	Al Viro
	just dentry, please... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	kill IPERM_FLAG_RCU	Al Viro
	not used anymore Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	->permission() sanitizing: don't pass flags to exec_permission()	Al Viro
	pass mask instead; kill security_inode_exec_permission() since we can use security_inode_permission() instead. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	->permission() sanitizing: don't pass flags to ->inode_permission()	Al Viro
	pass that via mask instead. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	->permission() sanitizing: don't pass flags to ->permission()	Al Viro
	not used by the instances anymore. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	->permission() sanitizing: don't pass flags to generic_permission()	Al Viro
	redundant; all callers get it duplicated in mask & MAY_NOT_BLOCK and none of them removes that bit. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	->permission() sanitizing: don't pass flags to ->check_acl()	Al Viro
	not used in the instances anymore. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	->permission() sanitizing: MAY_NOT_BLOCK	Al Viro
	Duplicate the flags argument into mask bitmap. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	kill check_acl callback of generic_permission()	Al Viro
	its value depends only on inode and does not change; we might as well store it in ->i_op->check_acl and be done with that. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	lockless get_write_access/deny_write_access	Al Viro
	new helpers: atomic_inc_unless_negative()/atomic_dec_unless_positive() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	kill file_permission() completely	Al Viro
	convert the last remaining caller to inode_permission() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	consolidate BINPRM_FLAGS_ENFORCE_NONDUMP handling	Al Viro
	new helper: would_dump(bprm, file). Checks if we are allowed to read the file and if we are not - sets ENFORCE_NODUMP. Exported, used in places that previously open-coded the same logics. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	new helper: iterate_supers_type()	Al Viro
	Call the given function for all superblocks of given type. Function gets a superblock (with s_umount locked shared) and (void *) argument supplied by caller of iterator. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	fs: add a DCACHE_NEED_LOOKUP flag for d_flags	Josef Bacik
	Btrfs (and I'd venture most other fs's) stores its indexes in nice disk order for readdir, but unfortunately in the case of anything that stats the files in order that readdir spits back (like oh say ls) that means we still have to do the normal lookup of the file, which means looking up our other index and then looking up the inode. What I want is a way to create dummy dentries when we find them in readdir so that when ls or anything else subsequently does a stat(), we already have the location information in the dentry and can go straight to the inode itself. The lookup stuff just assumes that if it finds a dentry it is done, it doesn't perform a lookup. So add a DCACHE_NEED_LOOKUP flag so that the lookup code knows it still needs to run i_op->lookup() on the parent to get the inode for the dentry. I have tested this with btrfs and I went from something that looks like this http://people.redhat.com/jwhiter/ls-noreada.png To this http://people.redhat.com/jwhiter/ls-good.png Thats a savings of 1300 seconds, or 22 minutes. That is a significant savings. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-18	include/linux/sdla.h: remove the prototype of sdla()	WANG Cong
	`make headers_check` complains that linux-2.6/usr/include/linux/sdla.h:116: userspace cannot reference function or variable defined in the kernel this is due to that there is no such a kernel function, void sdla(void cfg_info, char dev, struct frad_conf *conf, int quiet); I don't know why we have it in a kernel header, so remove it. Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-17	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6	Linus Torvalds
	* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: Bluetooth: Fix crash with incoming L2CAP connections Bluetooth: Fix regression in L2CAP connection procedure gianfar: rx parser r6040: only disable RX interrupt if napi_schedule_prep is successful net: remove NETIF_F_ALL_TX_OFFLOADS net: sctp: fix checksum marking for outgoing packets
2011-07-17	Merge branch 'release' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: ACPI: Fixes device power states array overflow ACPI, APEI, HEST, Detect duplicated hardware error source ID ACPI: Fix lockdep false positives in acpi_power_off()
2011-07-15	drm/radeon/kms: add new NI pci ids	Alex Deucher
	Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@kernel.org Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-07-14	net: remove NETIF_F_ALL_TX_OFFLOADS	Michał Mirosław
	There is no software fallback implemented for SCTP or FCoE checksumming, and so it should not be passed on by software devices like bridge or bonding. For VLAN devices, this is different. First, the driver for underlying device should be prepared to get offloaded packets even when the feature is disabled (especially if it advertises it in vlan_features). Second, devices under VLANs do not get replaced without tearing down the VLAN first. This fixes a mess I accidentally introduced while converting bonding to ndo_fix_features. NETIF_F_SOFT_FEATURES are removed from BOND_VLAN_FEATURES because they are unused as of commit 712ae51afd. Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-14	Merge branches 'd3cold', 'bugzilla-37412' and 'bugzilla-38152' into release	Len Brown

2011-07-13	ACPI: Fixes device power states array overflow	Lin Ming
	Commit 28c2103 added new state ACPI_STATE_D3_COLD, so the device power states array must be expanded by one also. v2: Use ACPI_D_STATE_COUNT instead of number 5 for the array size. Reported-by: Dan Carpenter <error27@gmail.com> Suggested-by: Oldřich Jedlička <oldium.pro@seznam.cz> Signed-off-by: Lin Ming <ming.m.lin@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2011-07-13	Merge branch 'for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc: mmc: core: Bus width testing needs to handle suspend/resume
2011-07-13	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6	Linus Torvalds
	* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (21 commits) slip: fix wrong SLIP6 ifdef-endif placing natsemi: fix another dma-debug report sctp: ABORT if receive, reassmbly, or reodering queue is not empty while closing socket net: Fix default in docs for tcp_orphan_retries. hso: fix a use after free condition net/natsemi: Fix module parameter permissions XFRM: Fix memory leak in xfrm_state_update sctp: Enforce retransmission limit during shutdown mac80211: fix TKIP replay vulnerability mac80211: fix ie memory allocation for scheduled scans ssb: fix init regression of hostmode PCI core rtlwifi: rtl8192cu: Add new USB ID for Netgear WNA1000M ath9k: Fix tx throughput drops for AR9003 chips with AES encryption carl9170: add NEC WL300NU-AG usbid cfg80211: fix deadlock with rfkill/sched_scan by adding new mutex ath5k: fix incorrect use of drvdata in PCI suspend/resume code ath5k: fix incorrect use of drvdata in sysfs code Bluetooth: Fix memory leak under page timeouts Bluetooth: Fix regression with incoming L2CAP connections Bluetooth: Fix hidp disconnect deadlocks and lost wakeup ...
2011-07-13	mmc: core: Bus width testing needs to handle suspend/resume	Philip Rakity
	On reading the ext_csd for the first time (in 1 bit mode), save the ext_csd information needed for bus width compare. On every pass we make re-reading the ext_csd, compare the data against the saved ext_csd data. This fixes a regression introduced in 3.0-rc1 by 08ee80cc397ac1a3 ("mmc: core: eMMC bus width may not work on all platforms"), which incorrectly assumed we would be re-reading the ext_csd at resume- time. Signed-off-by: Philip Rakity <prakity@marvell.com> Tested-by: Jaehoon Chung <jh80.chung@samsung.com> Signed-off-by: Chris Ball <cjb@laptop.org>
2011-07-13	ACPI: Fix lockdep false positives in acpi_power_off()	Rafael J. Wysocki
	All ACPICA locks are allocated by the same function, acpi_os_create_lock(), with the help of a local variable called "lock". Thus, when lockdep is enabled, it uses "lock" as the name of all those locks and regards them as instances of the same lock, which causes it to report possible locking problems with them when there aren't any. To work around this problem, define acpi_os_create_lock() as a macro and make it pass its argument to spin_lock_init(), so that lockdep uses it as the name of the new lock. Define this macron in a Linux-specific file, to minimize the resulting modifications of the OS-independent ACPICA parts. This change is based on an earlier patch from Andrea Righi and it addresses a regression from 2.6.39 tracked as https://bugzilla.kernel.org/show_bug.cgi?id=38152 Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Reported-and-tested-by: Borislav Petkov <bp@alien8.de> Tested-by: Andrea Righi <andrea@betterlinux.com> Reviewed-by: Florian Mickler <florian@mickler.org> Signed-off-by: Len Brown <len.brown@intel.com>
2011-07-12	Merge branch 'merge' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: powerpc/mm: Fix memory_block_size_bytes() for non-pseries mm: Move definition of MIN_MEMORY_BLOCK_SIZE to a header
2011-07-12	Merge branch 'fixes' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/arm/linux-arm-soc * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/linux-arm-soc: pcmcia: pxa2xx/vpac270: free gpios on exist rather than requesting ARM: pxa/raumfeld: fix device name for codec ak4104 ARM: pxa/raumfeld: display initialisation fixes ARM: pxa/raumfeld: adapt to upcoming hardware change ARM: pxa: fix gpio_to_chip() clash with gpiolib namespace genirq: replace irq_gc_ack() with {set,clr}_bit variants (fwd) arm: mach-vt8500: add forgotten irq_data conversion ARM: pxa168: correct nand pmu setting ARM: pxa910: correct nand pmu setting ARM: pxa: fix PGSR register address calculation
2011-07-12	mm: Move definition of MIN_MEMORY_BLOCK_SIZE to a header	Benjamin Herrenschmidt
	The macro MIN_MEMORY_BLOCK_SIZE is currently defined twice in two .c files, and I need it in a third one to fix a powerpc bug, so let's first move it into a header Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Ingo Molnar <mingo@elte.hu>
2011-07-11	Merge branch 'v4l_for_linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6 * 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6: [media] msp3400: fill in v4l2_tuner based on vt->type field [media] tuner-core.c: don't change type field in g_tuner or g_frequency [media] cx18/ivtv: fix g_tuner support [media] tuner-core: power up tuner when called with s_power(1) [media] v4l2-ioctl.c: check for valid tuner type in S_HW_FREQ_SEEK [media] tuner-core: simplify the standard fixup [media] tuner-core/v4l2-subdev: document that the type field has to be filled in [media] v4l2-subdev.h: remove unused s_mode tuner op [media] feature-removal-schedule: change in how radio device nodes are handled [media] bttv: fix s_tuner for radio [media] pvrusb2: fix g/s_tuner support [media] v4l2-ioctl.c: prefill tuner type for g_frequency and g/s_tuner [media] tuner-core: fix tuner_resume: use t->mode instead of t->type [media] tuner-core: fix s_std and s_tuner
2011-07-08	w1: ds1wm: add a reset recovery parameter	Jean-François Dagenais
	This fixes a regression in 3.0 reported by Paul Parsons regarding the removal of the msleep(1) in the ds1wm_reset() function: : The linux-3.0-rc4 DS1WM 1-wire driver is logging "bus error, retrying" : error messages on an HP iPAQ hx4700 PDA (XScale-PXA270): : : <snip> : Driver for 1-wire Dallas network protocol. : DS1WM w1 busmaster driver - (c) 2004 Szabolcs Gyurko : 1-Wire driver for the DS2760 battery monitor chip - (c) 2004-2005, Szabolcs Gyurko : ds1wm ds1wm: pass: 1 bus error, retrying : ds1wm ds1wm: pass: 2 bus error, retrying : ds1wm ds1wm: pass: 3 bus error, retrying : ds1wm ds1wm: pass: 4 bus error, retrying : ds1wm ds1wm: pass: 5 bus error, retrying : ... : : The visible result is that the battery charging LED is erratic; sometimes : it works, mostly it doesn't. : : The linux-2.6.39 DS1WM 1-wire driver worked OK. I haven't tried 3.0-rc1, : 3.0-rc2, or 3.0-rc3. This sleep should not be required on normal circuitry provided the pull-ups on the bus are correctly adapted to the slaves. Unfortunately, this is not always the case. The sleep is restored but as a parameter to the probe function in the pdata. [akpm@linux-foundation.org: coding-style fixes] Reported-by: Paul Parsons <lost.distance@yahoo.com> Tested-by: Paul Parsons <lost.distance@yahoo.com> Signed-off-by: Jean-François Dagenais <dagenaisj@sonatest.com> Cc: Evgeniy Polyakov <johnpol@2ka.mipt.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-08	sctp: ABORT if receive, reassmbly, or reodering queue is not empty while ↵	Thomas Graf
	closing socket Trigger user ABORT if application closes a socket which has data queued on the socket receive queue or chunks waiting on the reassembly or ordering queue as this would imply data being lost which defeats the point of a graceful shutdown. This behavior is already practiced in TCP. We do not check the input queue because that would mean to parse all chunks on it to look for unacknowledged data which seems too much of an effort. Control chunks or duplicated chunks may also be in the input queue and should not be stopping a graceful shutdown. Signed-off-by: Thomas Graf <tgraf@infradead.org> Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-07	sctp: Enforce retransmission limit during shutdown	Thomas Graf
	When initiating a graceful shutdown while having data chunks on the retransmission queue with a peer which is in zero window mode the shutdown is never completed because the retransmission error count is reset periodically by the following two rules: - Do not timeout association while doing zero window probe. - Reset overall error count when a heartbeat request has been acknowledged. The graceful shutdown will wait for all outstanding TSN to be acknowledged before sending the SHUTDOWN request. This never happens due to the peer's zero window not acknowledging the continuously retransmitted data chunks. Although the error counter is incremented for each failed retransmission, the receiving of the SACK announcing the zero window clears the error count again immediately. Also heartbeat requests continue to be sent periodically. The peer acknowledges these requests causing the error counter to be reset as well. This patch changes behaviour to only reset the overall error counter for the above rules while not in shutdown. After reaching the maximum number of retransmission attempts, the T5 shutdown guard timer is scheduled to give the receiver some additional time to recover. The timer is stopped as soon as the receiver acknowledges any data. The issue can be easily reproduced by establishing a sctp association over the loopback device, constantly queueing data at the sender while not reading any at the receiver. Wait for the window to reach zero, then initiate a shutdown by killing both processes simultaneously. The association will never be freed and the chunks on the retransmission queue will be retransmitted indefinitely. Signed-off-by: Thomas Graf <tgraf@infradead.org> Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>