Age | Commit message | Author
2014-10-15 | virtio_console: enable VQs early on restore | Michael S. Tsirkin
virtio spec requires drivers to set DRIVER_OK before using VQs. This is set automatically after resume returns, virtio console violated this rule by adding inbufs, which causes the VQ to be used directly within restore. To fix, call virtio_device_ready before using VQs. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-10-15 | virtio_scsi: enable VQs early on restore | Michael S. Tsirkin
virtio spec requires drivers to set DRIVER_OK before using VQs. This is set automatically after restore returns, virtio scsi violated this rule on restore by kicking event vq within restore. To fix, call virtio_device_ready before using event queue. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-10-15 | virtio_blk: enable VQs early on restore | Michael S. Tsirkin
virtio spec requires drivers to set DRIVER_OK before using VQs. This is set automatically after restore returns; virtio block violated this rule on restore by restarting queues, which might in theory cause the VQ to be used directly within restore. To fix, call virtio_device_ready before starting queues. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-10-15 | virtio_scsi: move kick event out from virtscsi_init | Michael S. Tsirkin
We currently kick event within virtscsi_init, before host is fully initialized. This can in theory confuse guest if device consumes the buffers immediately. To fix, move virtscsi_kick_event_all out to scan/restore. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-10-15 | virtio_net: fix use after free on allocation failure | Michael S. Tsirkin
In the extremely unlikely event that driver initialization fails after RX buffers are added, virtio net frees RX buffers while VQs are still active, potentially causing device to use a freed buffer. To fix, reset device first - same as we do on device removal. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-10-15 | 9p/trans_virtio: enable VQs early | Michael S. Tsirkin
virtio spec requires drivers to set DRIVER_OK before using VQs. This is set automatically after probe returns, but virtio 9p device adds self to channel list within probe, at which point VQ can be used in violation of the spec. To fix, call virtio_device_ready before using VQs. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-10-15 | virtio_console: enable VQs early | Michael S. Tsirkin
virtio spec requires drivers to set DRIVER_OK before using VQs. This is set automatically after probe returns, virtio console violated this rule by adding inbufs, which causes the VQ to be used directly within probe. To fix, call virtio_device_ready before using VQs. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-10-15 | virtio_blk: enable VQs early | Michael S. Tsirkin
virtio spec requires drivers to set DRIVER_OK before using VQs. This is set automatically after probe returns, virtio block violated this rule by calling add_disk, which causes the VQ to be used directly within probe. To fix, call virtio_device_ready before using VQs. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-10-15 | virtio_net: enable VQs early | Michael S. Tsirkin
virtio spec requires drivers to set DRIVER_OK before using VQs. This is set automatically after probe returns, virtio net violated this rule by using receive VQs within probe. To fix, call virtio_device_ready before using VQs. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-10-15 | virtio: add API to enable VQs early | Michael S. Tsirkin
virtio spec 0.9.X requires DRIVER_OK to be set before VQs are used, but some drivers use VQs before probe function returns. Since DRIVER_OK is set after probe, this violates the spec. Even though under virtio 1.0 transitional devices support this behaviour, we want to make it possible for those early callers to become spec compliant and eventually support non-transitional devices. Add API for drivers to call before using VQs. Sets DRIVER_OK internally. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
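The helper described above can be pictured with a small user-space model. This is only a sketch: the status bit values are the ones the virtio spec assigns to ACKNOWLEDGE, DRIVER and DRIVER_OK, but `struct fake_vdev` and every `fake_` function are hypothetical stand-ins, not the kernel's types or its actual implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Status bits as assigned by the virtio specification. */
#define VIRTIO_CONFIG_S_ACKNOWLEDGE 1u
#define VIRTIO_CONFIG_S_DRIVER      2u
#define VIRTIO_CONFIG_S_DRIVER_OK   4u

/* Hypothetical stand-in for a virtio device; not the kernel structure. */
struct fake_vdev {
	unsigned int status;
};

/* Model of the new helper: set DRIVER_OK once, before any VQ is used. */
static void fake_virtio_device_ready(struct fake_vdev *dev)
{
	/* Setting DRIVER_OK twice would indicate a driver bug. */
	assert(!(dev->status & VIRTIO_CONFIG_S_DRIVER_OK));
	dev->status |= VIRTIO_CONFIG_S_DRIVER_OK;
}

/* Per the spec, a VQ may only be used once DRIVER_OK is set. */
static bool fake_vq_use_allowed(const struct fake_vdev *dev)
{
	return (dev->status & VIRTIO_CONFIG_S_DRIVER_OK) != 0;
}

/* Model of a probe routine that needs its VQs before it returns. */
static bool fake_probe(struct fake_vdev *dev)
{
	dev->status = VIRTIO_CONFIG_S_ACKNOWLEDGE | VIRTIO_CONFIG_S_DRIVER;
	/* ... virtqueues would be found and set up here ... */
	fake_virtio_device_ready(dev);
	return fake_vq_use_allowed(dev); /* buffers may be added from here on */
}
```

A driver that never touches its VQs inside probe would not need the explicit call, since DRIVER_OK is set for it after probe returns.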
2014-10-15 | virtio_net: minor cleanup | Michael S. Tsirkin
"goto done; done: return;" is ugly; it was put there to make diff review easier. Replace it with an open-coded return. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-10-15 | virtio-net: drop config_mutex | Michael S. Tsirkin
config_mutex served two purposes: prevent multiple concurrent config change handlers, and synchronize access to config_enable flag. Since commit dbf2576e37da0fcc7aacbfbb9fd5d3de7888a3c1 ("workqueue: make all workqueues non-reentrant"), all workqueues are non-reentrant, and config_enable is now gone. Get rid of the unnecessary lock. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-10-15 | virtio_net: drop config_enable | Michael S. Tsirkin
Now that virtio core ensures config changes don't arrive during probing, drop config_enable flag in virtio net. On removal, flush is now sufficient to guarantee that no change work is queued. This helps simplify the driver, and will allow setting DRIVER_OK earlier without losing config change notifications. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-10-15 | virtio-blk: drop config_mutex | Michael S. Tsirkin
config_mutex served two purposes: prevent multiple concurrent config change handlers, and synchronize access to config_enable flag. Since commit dbf2576e37da0fcc7aacbfbb9fd5d3de7888a3c1 ("workqueue: make all workqueues non-reentrant"), all workqueues are non-reentrant, and config_enable is now gone. Get rid of the unnecessary lock. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-10-15 | virtio_blk: drop config_enable | Michael S. Tsirkin
Now that virtio core ensures config changes don't arrive during probing, drop config_enable flag in virtio blk. On removal, flush is now sufficient to guarantee that no change work is queued. This helps simplify the driver, and will allow setting DRIVER_OK earlier without losing config change notifications. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-10-15 | virtio: defer config changed notifications | Michael S. Tsirkin
Defer config changed notifications that arrive during probe/scan/freeze/restore. This will allow drivers to set DRIVER_OK earlier, without worrying about racing with config change interrupts. This change will also benefit old hypervisors (before 2009) that send interrupts without checking DRIVER_OK: previously, the callback could race with driver-specific initialization. This will also help simplify drivers. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (cosmetic changes)
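The deferral scheme described above can be modelled roughly as follows. This is a single-threaded sketch under stated assumptions: the structure, its field names and all `fake_` functions are hypothetical, and the locking a real implementation needs against concurrent interrupts is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a device; field names are illustrative only. */
struct fake_vdev {
	bool config_enabled;        /* may the driver callback run right now? */
	bool config_change_pending; /* did a change arrive while disabled? */
	int  callbacks_run;         /* counts driver callback invocations */
};

/* Stand-in for the driver's config-changed callback. */
static void fake_driver_config_changed(struct fake_vdev *dev)
{
	/* The core must never call into the driver while disabled. */
	assert(dev->config_enabled);
	dev->callbacks_run++;
}

/* Called by the transport when the device signals a config change. */
static void fake_config_changed(struct fake_vdev *dev)
{
	if (!dev->config_enabled)
		dev->config_change_pending = true; /* remember it for later */
	else
		fake_driver_config_changed(dev);
}

/* Entered before probe/scan/freeze/restore. */
static void fake_config_disable(struct fake_vdev *dev)
{
	dev->config_enabled = false;
}

/* Entered once the driver is ready; replays one deferred change. */
static void fake_config_enable(struct fake_vdev *dev)
{
	dev->config_enabled = true;
	if (dev->config_change_pending)
		fake_config_changed(dev);
	dev->config_change_pending = false;
}
```

Several changes arriving while disabled collapse into one replayed callback, which is sufficient because the callback re-reads the whole config space rather than consuming individual events.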
2014-10-15 | virtio-pci: move freeze/restore to virtio core | Michael S. Tsirkin
This is in preparation for extending config changed event handling in core. Wrapping these in an API also seems to make for cleaner code. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-10-15 | virtio: unify config_changed handling | Michael S. Tsirkin
Replace duplicated code in all transports with a single wrapper in virtio.c. The only functional change is in virtio_mmio.c: if a buggy device sends us an interrupt before driver is set, we previously returned IRQ_NONE, now we return IRQ_HANDLED. As this must not happen in practice, this does not look like a big deal. See also commit 3fff0179e33cd7d0a688dab65700c46ad089e934 ("virtio-pci: do not oops on config change if driver not loaded") for the original motivation behind the driver check. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-10-15 | virtio_pci: fix virtio spec compliance on restore | Michael S. Tsirkin
On restore, virtio pci does the following:
    + set features
    + init vqs etc - device can be used at this point!
    + set ACKNOWLEDGE, DRIVER and DRIVER_OK status bits
This is in violation of the virtio spec, which requires the following order:
    - ACKNOWLEDGE
    - DRIVER
    - init vqs
    - DRIVER_OK
This behaviour will break with hypervisors that assume spec compliant behaviour. It seems like a good idea to have this patch applied to stable branches to reduce the support burden for the hypervisors.
Cc: stable@vger.kernel.org Cc: Amit Shah <amit.shah@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
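The two orderings in this message can be compared with a tiny checker. It is a sketch only: the enum, the function and the two arrays are hypothetical names that encode the step lists quoted above, not anything from the virtio code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Steps in the order the spec requires; the enum value encodes the rank. */
enum init_step {
	STEP_ACKNOWLEDGE,
	STEP_DRIVER,
	STEP_INIT_VQS,
	STEP_DRIVER_OK,
	STEP_COUNT
};

/* True if every required step occurs exactly once, in rank order. */
static bool order_is_spec_compliant(const enum init_step *steps, size_t n)
{
	if (n != (size_t)STEP_COUNT)
		return false;
	for (size_t i = 0; i < n; i++)
		if (steps[i] != (enum init_step)i)
			return false;
	return true;
}

/* The sequence the message attributes to the old restore path. */
static const enum init_step old_restore[] = {
	STEP_INIT_VQS, STEP_ACKNOWLEDGE, STEP_DRIVER, STEP_DRIVER_OK
};

/* The sequence the spec asks for. */
static const enum init_step fixed_restore[] = {
	STEP_ACKNOWLEDGE, STEP_DRIVER, STEP_INIT_VQS, STEP_DRIVER_OK
};
```

Feeding `old_restore` to the checker fails at the first step, because the VQs are initialised before the device has been acknowledged.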
2014-10-15 | modules, lock around setting of MODULE_STATE_UNFORMED | Prarit Bhargava
A panic was seen in the following situation.
There are two threads running on the system. The first thread is a system monitoring thread that is reading /proc/modules. The second thread is loading and unloading a module (in this example I'm using my simple dummy-module.ko). Note, in the "real world" this occurred with the qlogic driver module.
When doing this, the following panic occurred:
    ------------[ cut here ]------------
    kernel BUG at kernel/module.c:3739!
    invalid opcode: 0000 [#1] SMP
    Modules linked in: binfmt_misc sg nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel lrw igb gf128mul glue_helper iTCO_wdt iTCO_vendor_support ablk_helper ptp sb_edac cryptd pps_core edac_core shpchp i2c_i801 pcspkr wmi lpc_ich ioatdma mfd_core dca ipmi_si nfsd ipmi_msghandler auth_rpcgss nfs_acl lockd sunrpc xfs libcrc32c sr_mod cdrom sd_mod crc_t10dif crct10dif_common mgag200 syscopyarea sysfillrect sysimgblt i2c_algo_bit drm_kms_helper ttm isci drm libsas ahci libahci scsi_transport_sas libata i2c_core dm_mirror dm_region_hash dm_log dm_mod [last unloaded: dummy_module]
    CPU: 37 PID: 186343 Comm: cat Tainted: GF O-------------- 3.10.0+ #7
    Hardware name: Intel Corporation S2600CP/S2600CP, BIOS RMLSDP.86I.00.29.D696.1311111329 11/11/2013
    task: ffff8807fd2d8000 ti: ffff88080fa7c000 task.ti: ffff88080fa7c000
    RIP: 0010:[<ffffffff810d64c5>] [<ffffffff810d64c5>] module_flags+0xb5/0xc0
    RSP: 0018:ffff88080fa7fe18 EFLAGS: 00010246
    RAX: 0000000000000003 RBX: ffffffffa03b5200 RCX: 0000000000000000
    RDX: 0000000000001000 RSI: ffff88080fa7fe38 RDI: ffffffffa03b5000
    RBP: ffff88080fa7fe28 R08: 0000000000000010 R09: 0000000000000000
    R10: 0000000000000000 R11: 000000000000000f R12: ffffffffa03b5000
    R13: ffffffffa03b5008 R14: ffffffffa03b5200 R15: ffffffffa03b5000
    FS: 00007f6ae57ef740(0000) GS:ffff88101e7a0000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000404f70 CR3: 0000000ffed48000 CR4: 00000000001407e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Stack:
     ffffffffa03b5200 ffff8810101e4800 ffff88080fa7fe70 ffffffff810d666c
     ffff88081e807300 000000002e0f2fbf 0000000000000000 ffff88100f257b00
     ffffffffa03b5008 ffff88080fa7ff48 ffff8810101e4800 ffff88080fa7fee0
    Call Trace:
     [<ffffffff810d666c>] m_show+0x19c/0x1e0
     [<ffffffff811e4d7e>] seq_read+0x16e/0x3b0
     [<ffffffff812281ed>] proc_reg_read+0x3d/0x80
     [<ffffffff811c0f2c>] vfs_read+0x9c/0x170
     [<ffffffff811c1a58>] SyS_read+0x58/0xb0
     [<ffffffff81605829>] system_call_fastpath+0x16/0x1b
    Code: 48 63 c2 83 c2 01 c6 04 03 29 48 63 d2 eb d9 0f 1f 80 00 00 00 00 48 63 d2 c6 04 13 2d 41 8b 0c 24 8d 50 02 83 f9 01 75 b2 eb cb <0f> 0b 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41
    RIP [<ffffffff810d64c5>] module_flags+0xb5/0xc0
    RSP <ffff88080fa7fe18>
Consider the two processes running on the system.
    CPU 0 (/proc/modules reader)
    CPU 1 (loading/unloading module)
CPU 0 opens /proc/modules, and starts displaying data for each module by traversing the modules list via fs/seq_file.c:seq_open() and fs/seq_file.c:seq_read(). For each module in the modules list, seq_read does
    op->start() <-- this is a pointer to m_start()
    op->show()  <-- this is a pointer to m_show()
    op->stop()  <-- this is a pointer to m_stop()
The m_start(), m_show(), and m_stop() module functions are defined in kernel/module.c. The m_start() and m_stop() functions acquire and release the module_mutex respectively, i.e. when reading /proc/modules, the module_mutex is acquired and released for each module.
m_show() is called with the module_mutex held. It accesses the module struct data and attempts to write out module data. It is in this code path that the above BUG_ON() warning is encountered, specifically m_show() calls
    static char *module_flags(struct module *mod, char *buf)
    {
            int bx = 0;
            BUG_ON(mod->state == MODULE_STATE_UNFORMED);
            ...
The other thread, CPU 1, in unloading the module calls the syscall delete_module() defined in kernel/module.c. The module_mutex is acquired for a short time, and then released. free_module() is called without the module_mutex. free_module() then sets mod->state = MODULE_STATE_UNFORMED, also without the module_mutex. Some additional code is called and then the module_mutex is reacquired to remove the module from the modules list:
    /* Now we can delete it from the lists */
    mutex_lock(&module_mutex);
    stop_machine(__unlink_module, mod, NULL);
    mutex_unlock(&module_mutex);
This is the sequence of events that leads to the panic.
CPU 1 is removing dummy_module via delete_module(). It acquires the module_mutex, and then releases it. CPU 1 has NOT set dummy_module->state to MODULE_STATE_UNFORMED yet.
CPU 0, which is reading the /proc/modules, acquires the module_mutex and acquires a pointer to the dummy_module which is still in the modules list. CPU 0 calls m_show for dummy_module. The check in m_show() for MODULE_STATE_UNFORMED passed for dummy_module even though it is being torn down.
Meanwhile CPU 1, which has been continuing to remove dummy_module without holding the module_mutex, now calls free_module() and sets dummy_module->state to MODULE_STATE_UNFORMED.
CPU 0 now calls module_flags() with dummy_module and ...
    static char *module_flags(struct module *mod, char *buf)
    {
            int bx = 0;
            BUG_ON(mod->state == MODULE_STATE_UNFORMED);
and BOOM.
Acquire and release the module_mutex lock around the setting of MODULE_STATE_UNFORMED in the teardown path, which should resolve the problem.
Testing: In the unpatched kernel I can panic the system within 1 minute by doing
    while (true) do insmod dummy_module.ko; rmmod dummy_module.ko; done
and
    while (true) do cat /proc/modules; done
in separate terminals. In the patched kernel I was able to run just over one hour without seeing any issues. I also verified the output of panic via sysrq-c and the output of /proc/modules looks correct for all three states for the dummy_module.
    dummy_module 12661 0 - Unloading 0xffffffffa03a5000 (OE-)
    dummy_module 12661 0 - Live 0xffffffffa03bb000 (OE)
    dummy_module 14015 1 - Loading 0xffffffffa03a5000 (OE+)
Signed-off-by: Prarit Bhargava <prarit@redhat.com> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: stable@kernel.org
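The locking rule this fix relies on can be sketched in user space with a pthread mutex. Everything here is an assumption made for illustration: the `fake_` names, the reduced state enum and the flag characters only mirror the shape of the problem, not the code in kernel/module.c.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

enum fake_module_state { FAKE_LIVE, FAKE_COMING, FAKE_GOING, FAKE_UNFORMED };

/* Hypothetical, heavily reduced stand-in for struct module. */
struct fake_module {
	enum fake_module_state state;
};

static pthread_mutex_t fake_module_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Teardown path: publish UNFORMED only while holding the list mutex. */
static void fake_mark_unformed(struct fake_module *mod)
{
	pthread_mutex_lock(&fake_module_mutex);
	mod->state = FAKE_UNFORMED;
	pthread_mutex_unlock(&fake_module_mutex);
}

/*
 * Reader path: the state check and the later use happen under one lock
 * hold. Returns false if the module was skipped because it is unformed.
 */
static bool fake_show_module(const struct fake_module *mod, char *flag)
{
	bool shown = false;

	pthread_mutex_lock(&fake_module_mutex);
	if (mod->state != FAKE_UNFORMED) {
		/* Cannot flip to UNFORMED here: the writer needs the mutex. */
		assert(mod->state != FAKE_UNFORMED);
		*flag = (mod->state == FAKE_GOING) ? '-' :
			(mod->state == FAKE_COMING) ? '+' : ' ';
		shown = true;
	}
	pthread_mutex_unlock(&fake_module_mutex);
	return shown;
}
```

Without the lock in `fake_mark_unformed()`, the write could land between the reader's check and its inner assertion, which corresponds to the window described in the sequence of events above.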
2014-10-14 | mnt: Prevent pivot_root from creating a loop in the mount tree | Eric W. Biederman
Andy Lutomirski recently demonstrated that when chroot is used to set the root path below the path for the new ``root'' passed to pivot_root, the pivot_root system call succeeds and leaks mounts. In examining the code I see that starting with a new root that is below the current root in the mount tree will result in a loop in the mount tree after the mounts are detached and then reattached to one another, resulting in all kinds of ugliness, including a leak of the mounts involved in the mount loop. Prevent this problem by ensuring that the new mount is reachable from the current root of the mount tree. [Added stable cc. Fixes CVE-2014-7970. --Andy] Cc: stable@vger.kernel.org Reported-by: Andy Lutomirski <luto@amacapital.net> Reviewed-by: Andy Lutomirski <luto@amacapital.net> Link: http://lkml.kernel.org/r/87bnpmihks.fsf@x220.int.ebiederm.org Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andy Lutomirski <luto@amacapital.net>
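The reachability requirement in the last sentence can be illustrated with a toy mount tree. This is a sketch under simplifying assumptions: mounts are reduced to a parent pointer, dentries are ignored entirely, and the `fake_` names and the -1 error value are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mount: only the parent link matters for this check. */
struct fake_mount {
	struct fake_mount *parent; /* NULL at the top of the tree */
};

/* True if walking parent links from mnt reaches root (or mnt is root). */
static bool fake_is_reachable(const struct fake_mount *mnt,
			      const struct fake_mount *root)
{
	for (; mnt != NULL; mnt = mnt->parent)
		if (mnt == root)
			return true;
	return false;
}

/* Refuse a pivot whose new root cannot be reached from the current root. */
static int fake_pivot_root_check(const struct fake_mount *new_root,
				 const struct fake_mount *cur_root)
{
	return fake_is_reachable(new_root, cur_root) ? 0 : -1;
}
```

With a chain top -> a -> b and the caller's root set to a, proposing b as the new root passes, while proposing top is rejected because top lies outside the subtree rooted at a.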
2014-10-14 | tcp: TCP Small Queues and strange attractors | Eric Dumazet
TCP Small queues tries to keep number of packets in qdisc as small as possible, and depends on a tasklet to feed following packets at TX completion time. Choice of tasklet was driven by latency requirements.
Then, TCP stack tries to avoid reorders, by locking flows with outstanding packets in qdisc in a given TX queue.
What can happen is that many flows get attracted by a low performing TX queue, and cpu servicing TX completion has to feed packets for all of them, making this cpu 100% busy in softirq mode.
This became particularly visible with latest skb->xmit_more support.
Strategy adopted in this patch is to detect when tcp_wfree() is called from ksoftirqd and let the outstanding queue for this flow be drained before feeding additional packets, so that skb->ooo_okay can be set to allow select_queue() to select the optimal queue: incoming ACKS are normally handled by different cpus, so this patch gives more chance for these cpus to take over the burden of feeding qdisc with future packets.
Tested:
    lpaa23:~# ./super_netperf 1400 --google-pacing-rate 3028000 -H lpaa24 -l 3600 &
    lpaa23:~# sar -n DEV 1 10 | grep eth1
    06:16:18 AM eth1 595448.00 1190564.00 38381.09 1760253.12 0.00 0.00 1.00
    06:16:19 AM eth1 594858.00 1189686.00 38340.76 1758952.72 0.00 0.00 0.00
    06:16:20 AM eth1 597017.00 1194019.00 38480.79 1765370.29 0.00 0.00 1.00
    06:16:21 AM eth1 595450.00 1190936.00 38380.19 1760805.05 0.00 0.00 0.00
    06:16:22 AM eth1 596385.00 1193096.00 38442.56 1763976.29 0.00 0.00 1.00
    06:16:23 AM eth1 598155.00 1195978.00 38552.97 1768264.60 0.00 0.00 0.00
    06:16:24 AM eth1 594405.00 1188643.00 38312.57 1757414.89 0.00 0.00 1.00
    06:16:25 AM eth1 593366.00 1187154.00 38252.16 1755195.83 0.00 0.00 0.00
    06:16:26 AM eth1 593188.00 1186118.00 38232.88 1753682.57 0.00 0.00 1.00
    06:16:27 AM eth1 596301.00 1192241.00 38440.94 1762733.09 0.00 0.00 0.00
    Average: eth1 595457.30 1190843.50 38381.69 1760664.84 0.00 0.00 0.50
    lpaa23:~# ./tc -s -d qd sh dev eth1 | grep backlog
    backlog 7606336b 2513p requeues 167982
    backlog 224072b 74p requeues 566
    backlog 581376b 192p requeues 5598
    backlog 181680b 60p requeues 1070
    backlog 5305056b 1753p requeues 110166 // Here, this TX queue is attracting flows
    backlog 157456b 52p requeues 1758
    backlog 672216b 222p requeues 3025
    backlog 60560b 20p requeues 24541
    backlog 448144b 148p requeues 21258
    lpaa23:~# echo 1 >/proc/sys/net/ipv4/tcp_tsq_enable_tcp_wfree_ksoftirqd_detect
Immediate jump to full bandwidth, and traffic is properly sharded on all tx queues.
    lpaa23:~# sar -n DEV 1 10 | grep eth1
    06:16:46 AM eth1 1397632.00 2795397.00 90081.87 4133031.26 0.00 0.00 1.00
    06:16:47 AM eth1 1396874.00 2793614.00 90032.99 4130385.46 0.00 0.00 0.00
    06:16:48 AM eth1 1395842.00 2791600.00 89966.46 4127409.67 0.00 0.00 1.00
    06:16:49 AM eth1 1395528.00 2791017.00 89946.17 4126551.24 0.00 0.00 0.00
    06:16:50 AM eth1 1397891.00 2795716.00 90098.74 4133497.39 0.00 0.00 1.00
    06:16:51 AM eth1 1394951.00 2789984.00 89908.96 4125022.51 0.00 0.00 0.00
    06:16:52 AM eth1 1394608.00 2789190.00 89886.90 4123851.36 0.00 0.00 1.00
    06:16:53 AM eth1 1395314.00 2790653.00 89934.33 4125983.09 0.00 0.00 0.00
    06:16:54 AM eth1 1396115.00 2792276.00 89984.25 4128411.21 0.00 0.00 1.00
    06:16:55 AM eth1 1396829.00 2793523.00 90030.19 4130250.28 0.00 0.00 0.00
    Average: eth1 1396158.40 2792297.00 89987.09 4128439.35 0.00 0.00 0.50
    lpaa23:~# tc -s -d qd sh dev eth1 | grep backlog
    backlog 7900052b 2609p requeues 173287
    backlog 878120b 290p requeues 589
    backlog 1068884b 354p requeues 5621
    backlog 996212b 329p requeues 1088
    backlog 984100b 325p requeues 115316
    backlog 956848b 316p requeues 1781
    backlog 1080996b 357p requeues 3047
    backlog 975016b 322p requeues 24571
    backlog 990156b 327p requeues 21274
(All 8 TX queues get a fair share of the traffic)
Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-14 | Merge branches 'core', 'cxgb4', 'iser', 'mlx5' and 'ocrdma' into for-next | Roland Dreier
2014-10-14 | Merge branch 'qlcnic' | David S. Miller
Rajesh Borundia says:
====================
qlcnic: Bug fixes
This series fixes the following issues.
    * We were programming maximum number of arguments supported by adapter instead of required in a command.
    * Destroy tx command requires three arguments instead of two.
Please apply these patches to net.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-14 | qlcnic: Fix number of arguments in destroy tx context command | Rajesh Borundia
o Number of arguments taken by destroy tx command is three instead of two. Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-14 | qlcnic: Fix programming number of arguments in a command. | Rajesh Borundia
    o Initially we were programming maximum number of arguments. Instead we should program number of arguments required in a command.
    o Maximum number of arguments for 82xx adapter is four. Fix it for GET_ESWITCH_STATS command.
Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-14 | genl_magic: Resolve logical-op warnings | Mark Rustad
Resolve "logical 'and' applied to non-boolean constant" warnings that appear in W=2 builds by adding !! to a bit test. Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
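For readers unfamiliar with the idiom, the effect of `!!` on a bit test is shown below. The flag name and the helper are hypothetical and unrelated to the genl_magic macros; the point is only that the masked value becomes a proper 0 or 1 before it meets `&&`.

```c
#include <assert.h>

#define FAKE_FLAG_MANDATORY (1u << 3) /* hypothetical flag bit */

/*
 * (flags & FAKE_FLAG_MANDATORY) evaluates to 0 or 8, which is not a
 * boolean. "!!" collapses it to 0 or 1, so the logical "and" is applied
 * to a boolean operand.
 */
static int fake_should_emit(int present, unsigned int flags)
{
	return present && !!(flags & FAKE_FLAG_MANDATORY);
}
```

The result of the expression is unchanged by the `!!`; only the operand type that the compiler sees differs.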
2014-10-14 | net: Trap attempts to call sock_kfree_s() with a NULL pointer. | David S. Miller
Unlike normal kfree() it is never right to call sock_kfree_s() with a NULL pointer, because sock_kfree_s() also has the side effect of discharging the memory from the sockets quota. Signed-off-by: David S. Miller <davem@davemloft.net>
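Why a NULL pointer is harmful here can be seen in a small accounting model. It is a sketch only: `struct fake_sock` and both functions are hypothetical user-space stand-ins built on malloc/free, and the silent early return merely marks where a trap for the bad call would sit.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical socket: tracks bytes charged against its quota. */
struct fake_sock {
	long omem_alloc;
};

/* Allocate and charge the quota, but only if the allocation succeeded. */
static void *fake_sock_kmalloc(struct fake_sock *sk, size_t size)
{
	void *mem = malloc(size);

	if (mem)
		sk->omem_alloc += (long)size;
	return mem;
}

/* Free and discharge the quota; a NULL pointer means nothing was charged. */
static void fake_sock_kfree_s(struct fake_sock *sk, void *mem, size_t size)
{
	if (!mem)
		return; /* discharging here would corrupt the accounting */
	free(mem);
	sk->omem_alloc -= (long)size;
}
```

A plain free(NULL) is harmless because it has no side effects, whereas the discharge is a side effect that must be paired with a successful charge.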
2014-10-14 | rds: avoid calling sock_kfree_s() on allocation failure | Cong Wang
It is okay to free a NULL pointer but not okay to mischarge the socket optmem accounting. Compile test only. Reported-by: rucsoftsec@gmail.com Cc: Chien Yen <chien.yen@oracle.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Cong Wang <cwang@twopensource.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-14 | cxgb4: Fix FW flash logic using ethtool | Hariprasad Shenai
Use t4_fw_upgrade instead of t4_load_fw to write firmware into FLASH, since t4_load_fw doesn't co-ordinate with the firmware and the adapter can get hosed enough to require a power cycle of the system. Based on original work by Casey Leedom <leedom@chelsio.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-14 | perf symbols: Make sym->end be the first address after the symbol range | Arnaldo Carvalho de Melo
To follow vm_area_struct->vm_end convention. By adhering to the convention that ->end is the first address outside the symbol's range we can do things like:
    sym->end = start + len;
    len = sym->end - sym->start;
This is also now the convention used for struct map->end, fixing some off-by-one bugs.
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Chuck Ebbert <cebbert.lkml@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-agomujr7tuqaq6lu7kr6z7h6@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
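The half-open convention described in this message can be sketched as follows. The `fake_symbol` type and its helpers are hypothetical and exist only to show that size and containment need no +1/-1 adjustments once `end` is exclusive.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical symbol covering the half-open range [start, end). */
struct fake_symbol {
	uint64_t start;
	uint64_t end; /* first address after the symbol */
};

static void fake_symbol_init(struct fake_symbol *sym,
			     uint64_t start, uint64_t len)
{
	sym->start = start;
	sym->end = start + len;
}

/* The length falls out as a plain subtraction. */
static uint64_t fake_symbol_size(const struct fake_symbol *sym)
{
	return sym->end - sym->start;
}

/* Containment uses ">=" on the lower bound and "<" on the upper bound. */
static bool fake_symbol_contains(const struct fake_symbol *sym, uint64_t addr)
{
	return addr >= sym->start && addr < sym->end;
}
```

Two adjacent symbols then share a boundary value without overlapping: the end of one equals the start of the next.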
2014-10-14 | perf symbols: Fix map->end fixup | Arnaldo Carvalho de Melo
When synthesizing maps from files that have incomplete symbol information, like kallsyms, we need to fix up the end of each map by setting it from the ->start of the next map. Fix it to set prev_map->end to curr_map->start, since ->end is the first byte outside the prev_map address range. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-ivbrj08sjakxdwkrcndbkoig@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
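The end fixup can be sketched over a sorted array instead of the tree perf uses. This is an illustrative model: `fake_map` and `fake_fixup_ends()` are hypothetical, and giving the last map an all-ones end is an assumption made here so that every map has some end value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical map covering the half-open range [start, end). */
struct fake_map {
	uint64_t start;
	uint64_t end;
};

/*
 * maps[] is sorted by start and the ends are unknown. Each map ends
 * exactly where the next one starts, with no "- 1", because end is the
 * first byte outside the map.
 */
static void fake_fixup_ends(struct fake_map *maps, size_t n)
{
	for (size_t i = 1; i < n; i++)
		maps[i - 1].end = maps[i].start;
	if (n > 0)
		maps[n - 1].end = UINT64_MAX; /* assumption: no successor known */
}
```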
2014-10-14 | perf tools: Fixup off-by-one comparision in maps__find | Namhyung Kim
map->end is the first addr _outside_ the map, following the convention of vm_area_struct->vm_end. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Stephane Eranian <eranian@google.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/8761fwh1nc.fsf@sejong.aot.lge.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
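The comparison this fix concerns can be shown with a lookup over a sorted array. It is a sketch using a binary search as a stand-in for a tree walk; `fake_map` and `fake_maps_find()` are hypothetical names, and the only point is the `>=` against an exclusive end.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical map covering the half-open range [start, end). */
struct fake_map {
	uint64_t start;
	uint64_t end;
};

/* Find the map containing ip in an array sorted by start, or NULL. */
static const struct fake_map *fake_maps_find(const struct fake_map *maps,
					     size_t n, uint64_t ip)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (ip < maps[mid].start)
			hi = mid;
		else if (ip >= maps[mid].end) /* ">=": end is outside the map */
			lo = mid + 1;
		else
			return &maps[mid];
	}
	return NULL;
}
```

With a plain `>` in the second test, an address equal to one map's end would be attributed to that map instead of to the neighbour that starts there.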
2014-10-14 | perf tools: fix off-by-one error in maps | Stephane Eranian
This patch fixes off-by-one errors in the management of maps. A map is defined by start address and length as implemented by map__new():
    map__init(map, type, start, start + len, pgoff, dso);
    map->start = addr;
    map->end = end;
Consequently, the actual address range is [start; end[
map->end is the first byte outside the range. This patch fixes two bugs where upper bound checking was off-by-one. In V2, we fix map_groups__fixup_overlappings() some more where map->start was off-by-one as reported by Jiri.
Signed-off-by: Stephane Eranian <eranian@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20141006083532.GA4850@quad Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-10-14 | perf machine: Add missing dsos->root rbtree root initialization | Arnaldo Carvalho de Melo
A segfault happens on 'perf test hists_link' because we end up using a struct machines on the stack, and then machines__init() was not initializing the newly introduced rb_root, just the existing list_head. When we introduced struct dsos, to group the two ways to store dsos, i.e. the linked list and the rbtree, we didn't turn the initialization done in: machines__init(machines->host) -> machine__init() -> INIT_LIST_HEAD into a dsos__init() to keep on initializing the list_head but _as well_ initializing the rb_root, oops. All worked because outside perf-test we probably zalloc the whole thing, which ends up initializing it to NULL. So the problem looks contained to 'perf test' that uses it on stack, etc. Reported-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Waiman Long <Waiman.Long@hp.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Douglas Hatch <doug.hatch@hp.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Scott J Norton <scott.norton@hp.com> Cc: Waiman Long <Waiman.Long@hp.com> Link: http://lkml.kernel.org/r/20141014180353.GF3198@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
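The kind of initializer this message calls for can be sketched with reduced stand-ins for the list head and the rbtree root. All types and the `fake_dsos_init()` name are hypothetical; the sketch only shows that both containers get set up in one place, so a stack-allocated instance never carries a garbage root pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced, hypothetical stand-ins for the list head and rbtree root. */
struct fake_list_head {
	struct fake_list_head *next, *prev;
};

struct fake_rb_root {
	void *rb_node;
};

/* Groups the two ways of storing the same objects. */
struct fake_dsos {
	struct fake_list_head head;
	struct fake_rb_root root;
};

/* Initialize both members; zeroed heap memory would hide a missing one. */
static void fake_dsos_init(struct fake_dsos *dsos)
{
	dsos->head.next = &dsos->head; /* empty list points at itself */
	dsos->head.prev = &dsos->head;
	dsos->root.rb_node = NULL;     /* empty tree */
}
```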
2014-10-14 | Merge branch 'stmmac' | David S. Miller
Giuseppe Cavallaro says:
====================
stmmac: review and fix the dwmac-sti glue-logic
This patch reviews the whole glue logic adopted on STi SoCs, which was buggy. In the old glue-logic there was a lot of confusion when setting up the retiming, especially for STiD127 where, for example, bits 6 and 7 (in the GMAC control register) have a different meaning from the one used for STiH4xx SoCs. So we cannot adopt the same glue for all these SoCs. Moreover, Gigabit on STiD127 didn't work and, for all the SoCs, the RGMII couldn't run when the speed was 10Mbps (because the clock was not properly managed). Note that the phy clock needs to be provided by the platform, as well as documented in the related binding file (updated as a consequence). The old code supported too many configurations that were never adopted or validated. This made the code very complex to maintain and debug in case of issues. The patch simplifies all the configurations as commented in the tables inside the file and it has been tested on all the boards based on the SoCs mentioned. With this patch, the dwmac-sti is also ready to support new configurations that will be available on next SoC generations.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-14 | stmmac: dwmac-sti: review the glue-logic for STi4xx and STiD127 SoCs | Giuseppe CAVALLARO
This patch reviews the whole glue logic adopted on STi SoCs, which was buggy. In the old glue-logic there was a lot of confusion when setting up the retiming, especially for STiD127 where, for example, bits 6 and 7 (in the GMAC control register) have a different meaning from the one used for STiH4xx SoCs. So we cannot adopt the same glue for all these SoCs. Moreover, Gigabit on STiD127 didn't work and, for all the SoCs, the RGMII couldn't run when the speed was 10Mbps (because the clock was not properly managed). Note that the phy clock needs to be provided by the platform, as well as documented in the related binding file (updated as a consequence). The old code supported too many configurations that were never adopted or validated. This made the code very complex to maintain and debug in case of issues. The patch simplifies all the configurations as commented in the tables inside the file and it has been tested on all the boards based on the SoCs mentioned. With this patch, the dwmac-sti is also ready to support new configurations that will be available on next SoC generations. Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Srinivas Kandagatla <srinivas.kandagatla@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-14 | stmmac: make the STi Layer compatible to STiH407 | Giuseppe CAVALLARO
This adds the missing compatibility to the STiH407 SoC. Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-14 | stmmac: platform: fix FIXED_PHY support. | Giuseppe CAVALLARO
On several STi platforms, e.g. stihxxx-b2120, an Ethernet switch is embedded and connected to the stmmac via RGMII mode, so it is managed by using the FIXED_PHY. In that case, the support in the platform needs to be fixed to allow the stmmac to communicate with the switch via fixed-link by using the phy_bus_name property. Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-14perf evsel: Make some exit routines staticArnaldo Carvalho de Melo
Since they are automatically called by other methods used by tools. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jean Pihet <jean.pihet@linaro.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-ne3g4any7q6ty5d6yv8t1wws@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-10-14perf evsel: Add missing 'target' struct forward declarationArnaldo Carvalho de Melo
We use it in evsel.h but were getting it indirectly, fix it. Noticed while working on having evsel.h usable by rasd.c. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jean Pihet <jean.pihet@linaro.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-94t3jvw4tmzrq3dnovvpl65e@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-10-14perf evlist: Default to syswide target when no thread/cpu maps setArnaldo Carvalho de Melo
If all a tool wants is to do system wide event monitoring, there is no longer any need to set up thread_map and cpu_map objects: just call perf_evlist__open() and it will create one fd per CPU, monitoring all threads. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jean Pihet <jean.pihet@linaro.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-poovolkigu72brx4783uq4cf@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-10-14perf evlist: Check that there is a thread_map when preparing a workloadArnaldo Carvalho de Melo
The perf_evlist__prepare_workload expects a thread map to be in place so that it can store the pid of the workload being started, so check it and tell the developer about it instead of segfaulting. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jean Pihet <jean.pihet@linaro.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-jvlz2f264e7kpmhjmwltikqw@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-10-14perf thread_map: Create dummy constructor out of open coded equivalentArnaldo Carvalho de Melo
Create a dummy thread_map, one that has just one entry and it is -1, meaning 'all threads', as this ends up going down to perf_event_open(). Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jean Pihet <jean.pihet@linaro.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-8av26cz8uxmbnihl5mmrygp9@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
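A minimal sketch of what such a dummy constructor could look like, assuming a map made of a count plus a flexible array of pids. The names `thread_map_sketch` and `thread_map_sketch__new_dummy` are hypothetical and are not taken from the perf sources; the sketch only illustrates the "one entry, set to -1" idea described above.

```c
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical stand-in for a thread map: a count plus a flexible array of pids. */
struct thread_map_sketch {
	int   nr;
	pid_t map[];
};

/*
 * Build a map with exactly one entry, -1, which stands for "all threads".
 * Returns NULL if the allocation fails; the caller releases it with free().
 */
static struct thread_map_sketch *thread_map_sketch__new_dummy(void)
{
	struct thread_map_sketch *threads;

	threads = malloc(sizeof(*threads) + sizeof(pid_t));
	if (threads != NULL) {
		threads->nr = 1;
		threads->map[0] = -1;
	}
	return threads;
}
```

A caller would typically use this in place of open-coding the allocation wherever it needs an "all threads" target.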
2014-10-14perf tools: Remove hists from evselArnaldo Carvalho de Melo
Tools that want to have a hists per evsel now need to call hists__init() before creating any evsels, which can be as early as when parsing the command line, so do it before calling parse_options(). The current tools using hists/hist_entries are report, top and annotate; change them to request per evsel hists. This is in preparation for making evsels usable by 3rd party tools that do not necessarily live in perf's source code repository. Acked-by: Borislav Petkov <bp@suse.de> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jean Pihet <jean.pihet@linaro.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-usjx2la743f10ippj7p1b20x@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-10-14perf callchain: Move the callchain_param extern to callchain.hArnaldo Carvalho de Melo
It was lost in hist.h, move it to where it belongs, callchain.h, as there are places that get hist.h by means of evsel.h, and since evsel.h is being untangled from hist.h... Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jean Pihet <jean.pihet@linaro.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-0rg7ji1jnbm6q6gj35j37jby@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-10-14perf evsel: SubclassingArnaldo Carvalho de Melo
Provide a method to be called at tool start to configure the perf_evsel instance size, together with an optional constructor and destructor. This will be used so that perf_evsel doesn't always include a struct hists: tools that work with hists/hist_entries, like report, top and annotate, will, at start, tell the evsel class the size they need per instance. v2: Don't use exit as the name of a member or function parameter, as this breaks the build on at least fedora14 and rhel6. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jean Pihet <jean.pihet@linaro.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-7t8cay0ieryox4gqosie85ek@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
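The subclassing scheme described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the perf implementation: every name ending in `_sketch` is hypothetical, and the base struct is reduced to a single field. The idea shown is that a tool registers a per-instance size plus optional init/fini hooks once at start, and the base constructor then allocates that many bytes for every instance.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical base object; the real struct has many more fields. */
struct evsel_sketch {
	int idx;
};

/* Class-wide configuration: instance size plus optional hooks. */
static struct {
	size_t size;
	int  (*init)(struct evsel_sketch *evsel);
	void (*fini)(struct evsel_sketch *evsel);
} evsel_sketch_class = { sizeof(struct evsel_sketch), NULL, NULL };

/* Called once at tool start; the size must at least cover the base struct. */
static int evsel_sketch__object_config(size_t size,
				       int (*init)(struct evsel_sketch *evsel),
				       void (*fini)(struct evsel_sketch *evsel))
{
	if (size < sizeof(struct evsel_sketch))
		return -1;
	evsel_sketch_class.size = size;
	evsel_sketch_class.init = init;
	evsel_sketch_class.fini = fini;
	return 0;
}

/* Allocate the configured size, zeroed, then run the optional constructor. */
static struct evsel_sketch *evsel_sketch__new(int idx)
{
	struct evsel_sketch *evsel = calloc(1, evsel_sketch_class.size);

	if (evsel == NULL)
		return NULL;
	evsel->idx = idx;
	if (evsel_sketch_class.init && evsel_sketch_class.init(evsel) != 0) {
		free(evsel);
		return NULL;
	}
	return evsel;
}

/* Run the optional destructor, then free the instance. */
static void evsel_sketch__delete(struct evsel_sketch *evsel)
{
	if (evsel == NULL)
		return;
	if (evsel_sketch_class.fini)
		evsel_sketch_class.fini(evsel);
	free(evsel);
}

/* Tool-side "subclass": the base comes first so the pointer cast is valid. */
struct hists_evsel_sketch {
	struct evsel_sketch evsel;
	int initialized;
};

/* Constructor hook for the subclass: marks the extra state as set up. */
static int hists_evsel_sketch__init(struct evsel_sketch *evsel)
{
	((struct hists_evsel_sketch *)evsel)->initialized = 1;
	return 0;
}
```

A tool that needs the extra per-instance state would call `evsel_sketch__object_config(sizeof(struct hists_evsel_sketch), hists_evsel_sketch__init, NULL)` before creating any instances; tools that never call it keep getting the plain base object.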
2014-10-14dsa: mv88e6171: Fix tag_protocol checkGuenter Roeck
tag_protocol is now an enum, so drivers have to check against it. Cc: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-14dlm: fix missing endian conversion of rcom_status flagsNeale Ferguson
The flags are already converted to le when being sent, but are not being converted back to cpu when received. Signed-off-by: Neale Ferguson <neale@sinenomine.net> Signed-off-by: David Teigland <teigland@redhat.com>
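To illustrate the class of bug, here is a small userspace sketch of a symmetric little-endian encode/decode pair. It assumes a 32-bit flags field and a made-up flag value; the helper names are hypothetical, and the kernel itself uses its own byte-order helpers (such as `le32_to_cpu()`) rather than open-coded shifts. The point is that the receiver has to decode the wire bytes before testing any flag bit, mirroring the encoding done by the sender.

```c
#include <stdint.h>

/* Hypothetical flag bit, for illustration only. */
#define SKETCH_STATUS_FLAG 0x00000001u

/* Sender side: encode a host-order value as little-endian wire bytes. */
static void put_le32(uint8_t out[4], uint32_t v)
{
	out[0] = (uint8_t)(v & 0xff);
	out[1] = (uint8_t)((v >> 8) & 0xff);
	out[2] = (uint8_t)((v >> 16) & 0xff);
	out[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Receiver side: decode little-endian wire bytes back to host order. */
static uint32_t get_le32(const uint8_t in[4])
{
	return (uint32_t)in[0] |
	       ((uint32_t)in[1] << 8) |
	       ((uint32_t)in[2] << 16) |
	       ((uint32_t)in[3] << 24);
}

/* Test a flag only after converting the received field to host order. */
static int status_flag_set(const uint8_t wire[4])
{
	return (get_le32(wire) & SKETCH_STATUS_FLAG) != 0;
}
```

On a little-endian host, skipping the decode step happens to work, which is why this kind of omission tends to surface only on big-endian machines.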
2014-10-14Merge branch 'xgene'David S. Miller
Iyappan Subramanian says: ==================== Adding SGMII based 1GbE basic support to APM X-Gene SoC ethernet driver. v2: Address comments from v1 * Split the patchset into two, the first one being preparatory patch * Added link_state function pointer to the xgene_mac_ops structure * Added xgene_indirect_ctl structure for indirect read/write arguments v1: * Initial version ==================== Signed-off-by: David S. Miller <davem@davemloft.net>