asmadeus/linux.git - The linux kernel

Age	Commit message (Collapse)	Author
2015-02-05	Merge tag 'asoc-fix-v3.19-rc7' of ↵	Takashi Iwai
	git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus ASoC: Fixes for v3.19 A few last minute fixes for v3.19, all driver specific. None of them stand out particularly - it's all the standard people who are affected will care stuff. The Samsung fix is a DT only fix for the audio controller, it's being merged via the ASoC tree due to process messups (the submitter sent it at the end of a tangentally related series rather than separately to the ARM folks) in order to make sure that it gets to people sooner.
2015-02-05	MMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	Linus Torvalds
	Pull networking fixes from David Miller: 1) Stretch ACKs can kill performance with Reno and CUBIC congestion control, largely due to LRO and GRO. Fix from Neal Cardwell. 2) Fix userland breakage because we accidently emit zero length netlink messages from the bridging code. From Roopa Prabhu. 3) Carry handling in generic csum_tcpudp_nofold is broken, fix from Karl Beldan. 4) Remove bogus dev_set_net() calls from CAIF driver, from Nicolas Dichtel. 5) Make sure PPP deflation never returns a length greater then the output buffer, otherwise we overflow and trigger skb_over_panic(). Fix from Florian Westphal. 6) COSA driver needs VIRT_TO_BUS Kconfig dependencies, from Arnd Bergmann. 7) Don't increase route cached MTU on datagram too big ICMPs. From Li Wei. 8) Fix error path leaks in nf_tables, from Pablo Neira Ayuso. 9) Fix bitmask handling regression in netlink that broke things like acpi userland tools. From Pablo Neira Ayuso. 10) Wrong header pointer passed to param_type2af() in SCTP code, from Saran Maruti Ramanara. 11) Stacked vlans not handled correctly by vlan_get_protocol(), from Toshiaki Makita. 12) Add missing DMA memory barrier to xgene driver, from Iyappan Subramanian. 13) Fix crash in rate estimators, from Eric Dumazet. 14) We've been adding various workarounds, one after another, for the change which added the per-net tcp_sock. It was meant to reduce socket contention but added lots of problems. Reduce this instead to a proper per-cpu socket and that rids us of all the daemons. From Eric Dumazet. 15) Fix memory corruption and OOPS in mlx4 driver, from Jack Morgenstein. 16) When we disabled UFO in the virtio_net device, it introduces some serious performance regressions. The orignal problem was IPV6 fragment ID generation, so fix that properly instead. From Vlad Yasevich. 17) sr9700 driver build breaks on xtensa because it defines macros with the same name as those used by the arch code. Use more unique names. From Chen Gang. 18) Fix endianness in new virio 1.0 mode of the vhost net driver, from Michael S Tsirkin. 19) Several sysctls were setting the maxlen attribute incorrectly, from Sasha Levin. 20) Don't accept an FQ scheduler quantum of zero, that leads to crashes. From Kenneth Klette Jonassen. 21) Fix dumping of non-existing actions in the packet scheduler classifier. From Ignacy Gawędzki. 22) Return the write work_done value when doing TX work in the qlcnic driver. 23) ip6gre_err accesses the info field with the wrong endianness, from Sabrina Dubroca. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (54 commits) sit: fix some __be16/u16 mismatches ipv6: fix sparse errors in ip6_make_flowlabel() net: remove some sparse warnings flow_keys: n_proto type should be __be16 ip6_gre: fix endianness errors in ip6gre_err qlcnic: Fix NAPI poll routine for Tx completion amd-xgbe: Set RSS enablement based on hardware features amd-xgbe: Adjust for zero-based traffic class count cls_api.c: Fix dumping of non-existing actions' stats. pkt_sched: fq: avoid hang when quantum 0 net: rds: use correct size for max unacked packets and bytes vhost/net: fix up num_buffers endian-ness gianfar: correct the bad expression while writing bit-pattern net: usb: sr9700: Use 'SR_' prefix for the common register macros Revert "drivers/net: Disable UFO through virtio" Revert "drivers/net, ipv6: Select IPv6 fragment idents for virtio UFO packets" ipv6: Select fragment id during UFO segmentation if not set. xen-netback: stop the guest rx thread after a fatal error net/mlx4_core: Fix kernel Oops (mem corruption) when working with more than 80 VFs isdn: off by one in connect_res() ...
2015-02-05	block: merge __bio_map_user_iov into bio_map_user_iov	Christoph Hellwig
	And also remove the unused bdev argument. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ming Lei <tom.leiming@gmail.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-02-05	block: pass iov_iter to the BLOCK_PC mapping functions	Kent Overstreet
	Make use of a new interface provided by iov_iter, backed by scatter-gather list of iovec, instead of the old interface based on sg_iovec. Also use iov_iter_advance() instead of manual iteration. This commit should contain only literal replacements, without functional changes. Cc: Christoph Hellwig <hch@infradead.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Doug Gilbert <dgilbert@interlog.com> Cc: "James E.J. Bottomley" <JBottomley@parallels.com> Signed-off-by: Kent Overstreet <kmo@daterainc.com> [dpark: add more description in commit message] Signed-off-by: Dongsu Park <dongsu.park@profitbricks.com> [hch: fixed to do a deep clone of the iov_iter, and to properly use the iov_iter direction] Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ming Lei <tom.leiming@gmail.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-02-05	block: use blk_rq_map_user_iov to implement blk_rq_map_user	Christoph Hellwig
	Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ming Lei <tom.leiming@gmail.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-02-05	ACPICA: Events: Introduce ACPI_GPE_DISPATCH_RAW_HANDLER to fix 2 issues for ↵	Lv Zheng
	the current GPE APIs ACPICA commit 199cad16530a45aea2bec98e528866e20c5927e1 Since whether the GPE should be disabled/enabled/cleared should only be determined by the GPE driver's state machine: 1. GPE should be disabled if the driver wants to switch to the GPE polling mode when a GPE storm condition is indicated and should be enabled if the driver wants to switch back to the GPE interrupt mode when all of the storm conditions are cleared. The conditions should be protected by the driver's specific lock. 2. GPE should be enabled if the driver has accepted more than one request and should be disabled if the driver has completed all of the requests. The request count should be protected by the driver's specific lock. 3. GPE should be cleared either when the driver is about to handle an edge triggered GPE or when the driver has completed to handle a level triggered GPE. The handling code should be protected by the driver's specific lock. Thus the GPE enabling/disabling/clearing operations are likely to be performed with the driver's specific lock held while we currently cannot do this. This is because: 1. We have the acpi_gbl_gpe_lock held before invoking the GPE driver's handler. Driver's specific lock is likely to be held inside of the handler, thus we can see some dead lock issues due to the reversed locking order or recursive locking. In order to solve such dead lock issues, we need to unlock the acpi_gbl_gpe_lock before invoking the handler. BZ 1100. 2. Since GPE disabling/enabling/clearing should be determined by the GPE driver's state machine, we shouldn't perform such operations inside of ACPICA for a GPE handler to mess up the driver's state machine. BZ 1101. Originally this patch includes a logic to flush GPE handlers, it is dropped due to the following reasons: 1. This is a different issue; 2. Linux OSL has fixed this by flushing SCI in acpi_os_wait_events_complete(). We will pick up this topic when the Linux OSL fix turns out to be not sufficient. Note that currently the internal operations and the acpi_gbl_gpe_lock are also used by ACPI_GPE_DISPATCH_METHOD and ACPI_GPE_DISPATCH_NOTIFY. In order not to introduce regressions, we add one ACPI_GPE_DISPATCH_RAW_HANDLER type to be distiguished from ACPI_GPE_DISPATCH_HANDLER. For which the acpi_gbl_gpe_lock is unlocked before invoking the GPE handler and the internal enabling/disabling operations are bypassed to allow drivers to perform them at a proper position using the GPE APIs and ACPI_GPE_DISPATCH_RAW_HANDLER users should invoke acpi_set_gpe() instead of acpi_enable_gpe()/acpi_disable_gpe() to bypass the internal GPE clearing code in acpi_enable_gpe(). Lv Zheng. Known issues: 1. Edge-triggered GPE lost for frequent enablings On some buggy silicon platforms, GPE enable line may not be directly wired to the GPE trigger line. In that case, when GPE enabling is frequently performed for edge-triggered GPEs, GPE status may stay set without being triggered. This patch may maginify this problem as it allows GPE enabling to be parallel performed during the process the GPEs are handled. This is an existing issue, because: 1. For task context: Current ACPI_GPE_DISPATCH_METHOD practices have proven that this isn't a real issue - we can re-enable edge-triggered GPE in a work queue where the GPE status bit might already be set. 2. For IRQ context: This can even happen when the GPE enabling occurs before returning from the GPE handler and after unlocking the GPE lock. Thus currently no code is included to protect this. Link: https://github.com/acpica/acpica/commit/199cad16 Signed-off-by: Lv Zheng <lv.zheng@intel.com> Signed-off-by: Bob Moore <robert.moore@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-02-05	ACPICA: Update version to 20150204	David E. Box
	ACPICA commit e06b1624b02dc8317d144e9a6fe9d684c5fa2f90 Version 20150204. Link: https://github.com/acpica/acpica/commit/e06b1624 Signed-off-by: David E. Box <david.e.box@linux.intel.com> Signed-off-by: Bob Moore <robert.moore@intel.com> Signed-off-by: Lv Zheng <lv.zheng@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-02-05	ACPICA: Update Copyright headers to 2015	David E. Box
	ACPICA commit 8990e73ab2aa15d6a0068b860ab54feff25bee36 Link: https://github.com/acpica/acpica/commit/8990e73a Signed-off-by: David E. Box <david.e.box@linux.intel.com> Signed-off-by: Bob Moore <robert.moore@intel.com> Signed-off-by: Lv Zheng <lv.zheng@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-02-05	ACPICA: Events: Cleanup GPE dispatcher type obtaining code	Lv Zheng
	ACPICA commit 7926d5ca9452c87f866938dcea8f12e1efb58f89 There is an issue in acpi_install_gpe_handler() and acpi_remove_gpe_handler(). The code to obtain the GPE dispatcher type from the Handler->original_flags is wrong: if (((Handler->original_flags & ACPI_GPE_DISPATCH_METHOD) \|\| (Handler->original_flags & ACPI_GPE_DISPATCH_NOTIFY)) && ACPI_GPE_DISPATCH_NOTIFY is 0x03 and ACPI_GPE_DISPATCH_METHOD is 0x02, thus this statement is TRUE for the following dispatcher types: 0x01 (ACPI_GPE_DISPATCH_HANDLER): not expected 0x02 (ACPI_GPE_DISPATCH_METHOD): expected 0x03 (ACPI_GPE_DISPATCH_NOTIFY): expected There is no functional issue due to this because Handler->original_flags is only set in acpi_install_gpe_handler(), and an earlier checker has excluded the ACPI_GPE_DISPATCH_HANDLER: if ((gpe_event_info->Flags & ACPI_GPE_DISPATCH_MASK) == ACPI_GPE_DISPATCH_HANDLER) { Status = AE_ALREADY_EXISTS; goto free_and_exit; } ... Handler->original_flags = (u8) (gpe_event_info->Flags & (ACPI_GPE_XRUPT_TYPE_MASK \| ACPI_GPE_DISPATCH_MASK)); We need to clean this up before modifying the GPE dispatcher type values. In order to prevent such issue from happening in the future, this patch introduces ACPI_GPE_DISPATCH_TYPE() macro to be used to obtain the GPE dispatcher types. Lv Zheng. Link: https://github.com/acpica/acpica/commit/7926d5ca Signed-off-by: Lv Zheng <lv.zheng@intel.com> Signed-off-by: David E. Box <david.e.box@linux.intel.com> Signed-off-by: Bob Moore <robert.moore@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-02-05	ACPI: Add interfaces to parse IOAPIC ID for IOAPIC hotplug	Yinghai Lu
	We need to parse APIC ID for IOAPIC registration for IOAPIC hotplug. ACPI _MAT method and MADT table are used to figure out IOAPIC ID, just like parsing CPU APIC ID for CPU hotplug. [ tglx: Fixed docbook comment ] Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Len Brown <lenb@kernel.org> Link: http://lkml.kernel.org/r/1414387308-27148-8-git-send-email-jiang.liu@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-02-05	PCI: Use common resource list management code instead of private implementation	Jiang Liu
	Use common resource list management data structure and interfaces instead of private implementation. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Acked-by: Will Deacon <will.deacon@arm.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-02-05	resources: Move struct resource_list_entry from ACPI into resource core	Jiang Liu
	Currently ACPI, PCI and pnp all implement the same resource list management with different data structure. We need to transfer from one data structure into another when passing resources from one subsystem into another subsystem. So move struct resource_list_entry from ACPI into resource core and rename it as resource_entry, then it could be reused by different subystems and avoid the data structure conversion. Introduce dedicated header file resource_ext.h instead of embedding it into ioport.h to avoid header file inclusion order issues. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Acked-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-02-05	MIPS: cevt-r4k: Drop GIC special case	James Hogan
	The cevt-r4k driver used to call into the GIC driver to find whether the timer was pending, but only with External Interrupt Controller (EIC) mode, where the Cause.IP bits can't be used as they encode the interrupt priority level (Cause.RIPL) instead. However commit e9de688dac65 ("irqchip: mips-gic: Support local interrupts") changed the condition from cpu_has_veic to gic_present. This fails on cores such as P5600 which have a GIC but the local interrupts aren't routable by the GIC, causing c0_compare_int_usable() to consider the interrupt unusable so r4k_clockevent_init() fails. The previous behaviour, added in commit 98b67c37db33 ("MIPS: Add EIC support for GIC."), wasn't really correct either as far as I can tell, since P5600 apparently supports EIC mode too, and in any case the use of Cause.TI with r2 should have been sufficient anyway since commit 010c108d7af7 ("MIPS: PowerTV: Fix support for timer interrupts with > 64 external IRQs"). Therefore drop the call into the gic driver altogether, and add a comment in c0_compare_int_pending() to clarify that Cause.TI does get checked since MIPS r2. Signed-off-by: James Hogan <james.hogan@imgtec.com> Fixes: e9de688dac65 ("irqchip: mips-gic: Support local interrupts") Reviewed-by: Andrew Bresticker <abrestic@chromium.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Steven J. Hill <steven.hill@imgtec.com> Cc: Qais Yousef <qais.yousef@imgtec.com> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/9077/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-02-05	exportfs: add methods for block layout exports	Christoph Hellwig
	Add three methods to allow exporting pnfs block layout volumes: - get_uuid: get a filesystem unique signature exposed to clients - map_blocks: map and if nessecary allocate blocks for a layout - commit_blocks: commit blocks in a layout once the client is done with them For now we stick the external pnfs block layout interfaces into s_export_op to avoid mixing them up with the internal interface between the NFS server and the layout drivers. Once we've fully internalized the latter interface we can redecide if these methods should stay in s_export_ops. Signed-off-by: Christoph Hellwig <hch@lst.de>
2015-02-05	kvm: remove KVM_MMIO_SIZE	Tiejun Chen
	After f78146b0f923, "KVM: Fix page-crossing MMIO", and 87da7e66a405, "KVM: x86: fix vcpu->mmio_fragments overflow", actually KVM_MMIO_SIZE is gone. Signed-off-by: Tiejun Chen <tiejun.chen@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-02-05	ipv6: fix sparse errors in ip6_make_flowlabel()	Eric Dumazet
	include/net/ipv6.h:713:22: warning: incorrect type in assignment (different base types) include/net/ipv6.h:713:22: expected restricted __be32 [usertype] hash include/net/ipv6.h:713:22: got unsigned int include/net/ipv6.h:719:25: warning: restricted __be32 degrades to integer include/net/ipv6.h:719:22: warning: invalid assignment: ^= include/net/ipv6.h:719:22: left side has type restricted __be32 include/net/ipv6.h:719:22: right side has type unsigned int Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-05	flow_keys: n_proto type should be __be16	Eric Dumazet
	(struct flow_keys)->n_proto is in network order, use proper type for this. Fixes following sparse errors : net/core/flow_dissector.c:139:39: warning: incorrect type in assignment (different base types) net/core/flow_dissector.c:139:39: expected unsigned short [unsigned] [usertype] n_proto net/core/flow_dissector.c:139:39: got restricted __be16 [assigned] [usertype] proto net/core/flow_dissector.c:237:23: warning: incorrect type in assignment (different base types) net/core/flow_dissector.c:237:23: expected unsigned short [unsigned] [usertype] n_proto net/core/flow_dissector.c:237:23: got restricted __be16 [assigned] [usertype] proto Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: e0f31d849867 ("flow_keys: Record IP layer protocol in skb_flow_dissect()") Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-05	ext4: add optimization for the lazytime mount option	Theodore Ts'o
	Add an optimization for the MS_LAZYTIME mount option so that we will opportunistically write out any inodes with the I_DIRTY_TIME flag set in a particular inode table block when we need to update some inode in that inode table block anyway. Also add some temporary code so that we can set the lazytime mount option without needing a modified /sbin/mount program which can set MS_LAZYTIME. We can eventually make this go away once util-linux has added support. Google-Bug-Id: 18297052 Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-05	vfs: add find_inode_nowait() function	Theodore Ts'o
	Add a new function find_inode_nowait() which is an even more general version of ilookup5_nowait(). It is designed for callers which need very fine grained control over when the function is allowed to block or increment the inode's reference count. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-05	vfs: add support for a lazytime mount option	Theodore Ts'o
	Add a new mount option which enables a new "lazytime" mode. This mode causes atime, mtime, and ctime updates to only be made to the in-memory version of the inode. The on-disk times will only get updated when (a) if the inode needs to be updated for some non-time related change, (b) if userspace calls fsync(), syncfs() or sync(), or (c) just before an undeleted inode is evicted from memory. This is OK according to POSIX because there are no guarantees after a crash unless userspace explicitly requests via a fsync(2) call. For workloads which feature a large number of random write to a preallocated file, the lazytime mount option significantly reduces writes to the inode table. The repeated 4k writes to a single block will result in undesirable stress on flash devices and SMR disk drives. Even on conventional HDD's, the repeated writes to the inode table block will trigger Adjacent Track Interference (ATI) remediation latencies, which very negatively impact long tail latencies --- which is a very big deal for web serving tiers (for example). Google-Bug-Id: 18297052 Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-04	dmaengine: dw: define DW_DMA_MAX_NR_MASTERS	Andy Shevchenko
	Instead of using magic number in the code the patch provides DW_DMA_MAX_NR_MASTERS constant. While here, restrict the reading of data width array by amount of the actual number of AHB masters. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2015-02-04	dmaengine: dw: amend description of dma_dev field	Andy Shevchenko
	The dma_dev field is widely used in filter functions to mach with a proper DMA controller device. Thus it's not deprecated. The patch fixes the description of that field. There is no functional change. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2015-02-04	pkt_sched: fq: better control of DDOS traffic	Eric Dumazet
	FQ has a fast path for skb attached to a socket, as it does not have to compute a flow hash. But for other packets, FQ being non stochastic means that hosts exposed to random Internet traffic can allocate million of flows structure (104 bytes each) pretty easily. Not only host can OOM, but lookup in RB trees can take too much cpu and memory resources. This patch adds a new attribute, orphan_mask, that is adding possibility of having a stochastic hash for orphaned skb. Its default value is 1024 slots, to mimic SFQ behavior. Note: This does not apply to locally generated TCP traffic, and no locally generated traffic will share a flow structure with another perfect or stochastic flow. This patch also handles the specific case of SYNACK messages: They are attached to the listener socket, and therefore all map to a single hash bucket. If listener have set SO_MAX_PACING_RATE, hoping to have new accepted socket inherit this rate, SYNACK might be paced and even dropped. This is very similar to an internal patch Google have used more than one year. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-05	Merge tag 'asoc-v3.20-2' of ↵	Takashi Iwai
	git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-next ASoC: Updates for v3.20 More updates for v3.20: - Lots of refactoring from Lars-Peter Clausen, moving drivers to more data driven initialization and rationalizing a lot of DAPM usage. - Much improved handling of CDCLK clocks on Samsung I2S controllers. - Lots of driver specific cleanups and feature improvements. - CODEC support for TI PCM514x and TLV320AIC3104 devices. - Board support for Tegra systems with Realtek RT5677. Conflicts: sound/soc/intel/sst-mfld-platform-pcm.c
2015-02-04	Merge branch 'for-davem' of ↵	David S. Miller
	git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs More iov_iter work from Al Viro. Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04	tcp: do not pace pure ack packets	Eric Dumazet
	When we added pacing to TCP, we decided to let sch_fq take care of actual pacing. All TCP had to do was to compute sk->pacing_rate using simple formula: sk->pacing_rate = 2 * cwnd * mss / rtt It works well for senders (bulk flows), but not very well for receivers or even RPC : cwnd on the receiver can be less than 10, rtt can be around 100ms, so we can end up pacing ACK packets, slowing down the sender. Really, only the sender should pace, according to its own logic. Instead of adding a new bit in skb, or call yet another flow dissection, we tweak skb->truesize to a small value (2), and we instruct sch_fq to use new helper and not pace pure ack. Note this also helps TCP small queue, as ack packets present in qdisc/NIC do not prevent sending a data packet (RPC workload) This helps to reduce tx completion overhead, ack packets can use regular sock_wfree() instead of tcp_wfree() which is a bit more expensive. This has no impact in the case packets are sent to loopback interface, as we do not coalesce ack packets (were we would detect skb->truesize lie) In case netem (with a delay) is used, skb_orphan_partial() also sets skb->truesize to 1. This patch is a combination of two patches we used for about one year at Google. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04	rhashtable: Introduce rhashtable_walk_*	Herbert Xu
	Some existing rhashtable users get too intimate with it by walking the buckets directly. This prevents us from easily changing the internals of rhashtable. This patch adds the helpers rhashtable_walk_init/exit/start/next/stop which will replace these custom walkers. They are meant to be usable for both procfs seq_file walks as well as walking by a netlink dump. The iterator structure should fit inside a netlink dump cb structure, with at least one element to spare. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-05	Merge branch 'drm-intel-next' of git://anongit.freedesktop.org/drm-intel ↵	Dave Airlie
	into drm-next drm-intel-next-2015-01-30: - chv rps improvements from Ville - atomic state handling prep work from Ander - execlist request tracking refactoring from Nick Hoath - forcewake code consolidation from Chris&Mika - fastboot plane config refactoring and skl support from Damien - some more skl pm patches all over (Damien) - refactor dsi code to use drm dsi helpers and drm_panel infrastructure (Jani) - first cut at experimental atomic plane updates (Matt Roper) - piles of smaller things all over, as usual * 'drm-intel-next' of git://anongit.freedesktop.org/drm-intel: (102 commits) drm/i915: Remove bogus locking check in the hangcheck code drm/i915: Update DRIVER_DATE to 20150130 drm/i915: Use pipe_config's cpu_transcoder for reading encoder hw state drm/i915: Fix a use-after-free in intel_execlists_retire_requests drm/i915: Split shared dpll setup out of __intel_set_mode() drm/i915: Don't do posting reads on getting forcewake drm/i915: Do uncore early sanitize after domain init drm/i915: Handle CHV in vlv_set_rps_idle() drm/i915: Remove nested work in gpu error handling drm/i915/documentation: Add intel_uncore.c to drm.tmpl drm/i915/dsi: remove intel_dsi_cmd.c and the unused functions therein drm/i915/dsi: move dpi_send_cmd() to intel_dsi.c and make it static drm/i915/dsi: remove old read/write functions in favor of new stuff drm/i915/dsi: make the vbt panel driver use mipi_dsi_device for transfers drm/i915/dsi: add drm mipi dsi host support drm/i915/dsi: switch to drm_panel interface drm/i915/skl: Enabling PSR on Skylake Revert "drm/i915: Fix mutex->owner inspection race under DEBUG_MUTEXES" drm/i915: Be consistent on printing seqnos drm/i915: Display current hangcheck status in debugfs ...
2015-02-04	net/mlx4_core: Port aggregation upper layer interface	Moni Shoua
	Supply interface functions to bond and unbond ports of a mlx4 internal interfaces. Example for such an interface is the one registered by the mlx4 IB driver under RoCE. There are 1. Functions to go in/out to/from bonded mode 2. Function to remap virtual ports to physical ports The bond_mutex prevents simultaneous access to data that keep status of the device in bonded mode. The upper mlx4 interface marks to the mlx4 core module that they want to be subject for such bonding by setting the MLX4_INTFF_BONDING flag. Interface which goes to/from bonded mode is re-created. The mlx4 Ethernet driver does not set this flag when registering the interface, the IB driver does. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04	net/mlx4_core: Port aggregation low level interface	Moni Shoua
	Implement the hardware interface required for port aggregation. 1. Disable RX port check on receive - don't perform a validity check that matches to QP's port and the port where the packet is received. 2. Virtual to physical port remap - configure virtual to physical port mapping. Port remap capability for virtual functions. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04	net/bonding: Notify state change on slaves	Moni Shoua
	Use notifier chain to dispatch an event upon a change in slave state. Event is dispatched with slave specific info. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04	net/bonding: Move slave state changes to a helper function	Moni Shoua
	Move slave state changes to a helper function, this is a pre-step for adding functionality of dispatching an event when this helper is called. This commit doesn't add new functionality. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04	net/core: Add event for a change in slave state	Moni Shoua
	Add event which provides an indication on a change in the state of a bonding slave. The event handler should cast the pointer to the appropriate type (struct netdev_bonding_info) in order to get the full info about the slave. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04	Merge tag 'mac80211-next-for-davem-2015-02-03' of ↵	David S. Miller
	git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Last round of updates for net-next: * revert a patch that caused a regression with mesh userspace (Bob) * fix a number of suspend/resume related races (from Emmanuel, Luca and myself - we'll look at backporting later) * add software implementations for new ciphers (Jouni) * add a new ACPI ID for Broadcom's rfkill (Mika) * allow using netns FD for wireless (Vadim) * some other cleanups (various) Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04	Merge branch 'for-upstream' of ↵	David S. Miller
	git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Johan Hedberg says: ==================== pull request: bluetooth-next 2015-02-03 Here's what's likely the last bluetooth-next pull request for 3.20. Notable changes include: - xHCI workaround + a new id for the ath3k driver - Several new ids for the btusb driver - Support for new Intel Bluetooth controllers - Minor cleanups to ieee802154 code - Nested sleep warning fix in socket accept() code path - Fixes for Out of Band pairing handling - Support for LE scan restarting for HCI_QUIRK_STRICT_DUPLICATE_FILTER - Improvements to data we expose through debugfs - Proper handling of Hardware Error HCI events Please let me know if there are any issues pulling. Thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04	net: add skb functions to process remote checksum offload	Tom Herbert
	This patch adds skb_remcsum_process and skb_gro_remcsum_process to perform the appropriate adjustments to the skb when receiving remote checksum offload. Updated vxlan and gue to use these functions. Tested: Ran TCP_RR and TCP_STREAM netperf for VXLAN and GUE, did not see any change in performance. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04	xps: fix xps for stacked devices	Eric Dumazet
	A typical qdisc setup is the following : bond0 : bonding device, using HTB hierarchy eth1/eth2 : slaves, multiqueue NIC, using MQ + FQ qdisc XPS allows to spread packets on specific tx queues, based on the cpu doing the send. Problem is that dequeues from bond0 qdisc can happen on random cpus, due to the fact that qdisc_run() can dequeue a batch of packets. CPUA -> queue packet P1 on bond0 qdisc, P1->ooo_okay=1 CPUA -> queue packet P2 on bond0 qdisc, P2->ooo_okay=0 CPUB -> dequeue packet P1 from bond0 enqueue packet on eth1/eth2 CPUC -> dequeue packet P2 from bond0 enqueue packet on eth1/eth2 using sk cache (ooo_okay is 0) get_xps_queue() then might select wrong queue for P1, since current cpu might be different than CPUA. P2 might be sent on the old queue (stored in sk->sk_tx_queue_mapping), if CPUC runs a bit faster (or CPUB spins a bit on qdisc lock) Effect of this bug is TCP reorders, and more generally not optimal TX queue placement. (A victim bulk flow can be migrated to the wrong TX queue for a while) To fix this, we have to record sender cpu number the first time dev_queue_xmit() is called for one tx skb. We can union napi_id (used on receive path) and sender_cpu, granted we clear sender_cpu in skb_scrub_packet() (credit to Willem for this union idea) Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Cc: Nandita Dukkipati <nanditad@google.com> Cc: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04	Merge remote-tracking branches 'asoc/topic/rx51', 'asoc/topic/samsung', ↵	Mark Brown
	'asoc/topic/sh', 'asoc/topic/simple' and 'asoc/topic/sta32x' into asoc-next
2015-02-04	Merge remote-tracking branches 'asoc/topic/omap', 'asoc/topic/pxa' and ↵	Mark Brown
	'asoc/topic/rcar' into asoc-next
2015-02-04	Merge remote-tracking branches 'asoc/topic/cs42l73', 'asoc/topic/dai' and ↵	Mark Brown
	'asoc/topic/davinci' into asoc-next
2015-02-04	Merge remote-tracking branch 'asoc/topic/w-codec' into asoc-next	Mark Brown

2015-02-04	Merge remote-tracking branch 'asoc/topic/pcm512x' into asoc-next	Mark Brown

2015-02-04	Merge remote-tracking branch 'asoc/topic/dapm' into asoc-next	Mark Brown

2015-02-04	Merge remote-tracking branch 'asoc/topic/core' into asoc-next	Mark Brown

2015-02-04	Merge remote-tracking branches 'asoc/fix/ac97', 'asoc/fix/atmel', ↵	Mark Brown
	'asoc/fix/intel', 'asoc/fix/rt286', 'asoc/fix/rt5640', 'asoc/fix/sgtl5000', 'asoc/fix/sta32x', 'asoc/fix/tlv320aic3x' and 'asoc/fix/wm8731' into asoc-linus
2015-02-04	spi: spi-pxa2xx: only include mach/dma.h for legacy DMA	Rob Herring
	Move the include of mach/dma.h to the legacy PXA DMA code where it is used. This enables building spi-pxa2xx on ARM64. Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Mark Brown <broonie@kernel.org>
2015-02-04	Merge tag 'usb-for-v3.20' of ↵	Greg Kroah-Hartman
	git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb into usb-next Felipe writes: usb: patches for v3.20 merge window Here's the big pull request for Gadgets and PHYs. It's a total of 217 non-merge commits with pretty much everything being touched. The most important bits are a ton of new documentation for almost all usb gadget functions, a new isp1760 UDC driver, several improvements to the old net2280 UDC driver, and some minor tracepoint improvements to dwc3. Other than that, a big list of minor cleanups, smaller bugfixes and new features all over the place. Signed-off-by: Felipe Balbi <balbi@ti.com>
2015-02-04	iscsi-target: Introduce session_get_next_ttt	Sagi Grimberg
	Reduce code duplication. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-02-04	coresight: remove the unnecessary function coresight_is_bit_set()	Kaixu Xia
	This function coresight_is_bit_set() isn't called, so we should remove it. Signed-off-by: Kaixu Xia <xiakaixu@huawei.com> Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-02-04	block: Simplify bsg complete all	Peter Zijlstra
	It took me a few tries to figure out what this code did; lets rewrite it into a more regular form. The thing that makes this one 'special' is the BSG_F_BLOCK flag, if that is not set we're not supposed/allowed to block and should spin wait for completion. The (new) io_wait_event() will never see a false condition in case of the spinning and we will therefore not block. Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Jens Axboe <axboe@fb.com>