summaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)Author
2009-08-10futex: Update futex_q lock_ptr on requeue proxy lockDarren Hart
futex_requeue() can acquire the lock on behalf of a waiter early on or during the requeue loop if it is uncontended or in the event of a lock steal or owner died. On wakeup, the waiter (in futex_wait_requeue_pi()) cleans up the pi_state owner using the lock_ptr to protect against concurrent access to the pi_state. The pi_state is hung off futex_q's on the requeue target futex hash bucket so the lock_ptr needs to be updated accordingly. The problem manifested by triggering the WARN_ON in lookup_pi_state() about the pid != pi_state->owner->pid. With this patch, the pi_state is properly guarded against concurrent access via the requeue target hb lock. The astute reviewer may notice that there is a window of time between when futex_requeue() unlocks the hb locks and when futex_wait_requeue_pi() will acquire hb2->lock. During this time the pi_state and uval are not in sync with the underlying rtmutex owner (but the uval does indicate there are waiters, so no atomic changes will occur in userspace). However, this is not a problem. Should a contending thread enter lookup_pi_state() and acquire hb2->lock before the ownership is fixed up, it will find the pi_state hung off a waiter's (possibly the pending owner's) futex_q and block on the rtmutex. Once futex_wait_requeue_pi() fixes up the owner, it will also move the pi_state from the old owner's task->pi_state_list to its own. v3: Fix plist lock name for application to mainline (rather than -rt) Compile tested against tip/v2.6.31-rc5. Signed-off-by: Darren Hart <dvhltc@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Dinakar Guniguntala <dino@in.ibm.com> Cc: John Stultz <johnstul@linux.vnet.ibm.com> LKML-Reference: <4A7F4EFF.6090903@us.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-10powerpc/dma: pci_set_dma_mask() shouldn't fail if mask fits in RAMBenjamin Herrenschmidt
On an iMac G5, the b43 driver is failing to initialise because trying to set the dma mask to 30-bit fails. Even though there's only 512MiB of RAM in the machine anyway: https://bugzilla.redhat.com/show_bug.cgi?id=514787 We should probably let it succeed if the available RAM in the system doesn't exceed the requested limit. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-09mlx4_en: Fix read buffer overflow in mlx4_en_complete_rx_desc()roel kluin
If the length is less or equal to frag_prefix_size in the first iteration we write skb_frags_rx[-1] and read from priv->frag_info[-1] Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-09zorro8390: Fix read buffer overflow in zorro8390_init_one()roel kluin
Prevent read from cards[-1] when no card was found. Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-09pcnet32: Read buffer overflowroel kluin
An `options[cards_found]' that equals `sizeof(options_mapping)' is already beyond the array. Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-09sctp: fix missing destroy of percpu counter variable in sctp_proc_exit()Rafael Laufer
Commit 1748376b6626acf59c24e9592ac67b3fe2a0e026, net: Use a percpu_counter for sockets_allocated added percpu_counter function calls to sctp_proc_init code path, but forgot to add them to sctp_proc_exit(). This resulted in a following Ooops when performing this test # modprobe sctp # rmmod -f sctp # modprobe sctp [ 573.862512] BUG: unable to handle kernel paging request at f8214a24 [ 573.862518] IP: [<c0308b8f>] __percpu_counter_init+0x3f/0x70 [ 573.862530] *pde = 37010067 *pte = 00000000 [ 573.862534] Oops: 0002 [#1] SMP [ 573.862537] last sysfs file: /sys/module/libcrc32c/initstate [ 573.862540] Modules linked in: sctp(+) crc32c libcrc32c binfmt_misc bridge stp bnep lp snd_hda_codec_analog snd_hda_intel snd_hda_codec snd_hwdep snd_pcm_oss snd_mixer_oss arc4 joydev snd_pcm ecb pcmcia snd_seq_dummy snd_seq_oss iwlagn iwlcore snd_seq_midi snd_rawmidi snd_seq_midi_event yenta_socket rsrc_nonstatic thinkpad_acpi snd_seq snd_timer snd_seq_device mac80211 psmouse sdhci_pci sdhci nvidia(P) ppdev video snd soundcore serio_raw pcspkr iTCO_wdt iTCO_vendor_support led_class ricoh_mmc pcmcia_core intel_agp nvram agpgart usbhid parport_pc parport output snd_page_alloc cfg80211 btusb ohci1394 ieee1394 e1000e [last unloaded: sctp] [ 573.862589] [ 573.862593] Pid: 5373, comm: modprobe Tainted: P R (2.6.31-rc3 #6) 7663B15 [ 573.862596] EIP: 0060:[<c0308b8f>] EFLAGS: 00010286 CPU: 1 [ 573.862599] EIP is at __percpu_counter_init+0x3f/0x70 [ 573.862602] EAX: f8214a20 EBX: f80faa14 ECX: c48c0000 EDX: f80faa20 [ 573.862604] ESI: f80a7000 EDI: 00000000 EBP: f69d5ef0 ESP: f69d5eec [ 573.862606] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 [ 573.862610] Process modprobe (pid: 5373, ti=f69d4000 task=c2130c70 task.ti=f69d4000) [ 573.862612] Stack: [ 573.862613] 00000000 f69d5f18 f80a70a8 f80fa9fc 00000000 fffffffc f69d5f30 c018e2d4 [ 573.862619] <0> 00000000 f80a7000 00000000 f69d5f88 c010112b 00000000 c07029c0 fffffffb [ 573.862626] <0> 00000000 f69d5f38 c018f83f f69d5f54 c0557cad f80fa860 00000001 c07010c0 [ 573.862634] Call Trace: [ 573.862644] [<f80a70a8>] ? sctp_init+0xa8/0x7d4 [sctp] [ 573.862650] [<c018e2d4>] ? marker_update_probe_range+0x184/0x260 [ 573.862659] [<f80a7000>] ? sctp_init+0x0/0x7d4 [sctp] [ 573.862662] [<c010112b>] ? do_one_initcall+0x2b/0x160 [ 573.862666] [<c018f83f>] ? tracepoint_module_notify+0x2f/0x40 [ 573.862671] [<c0557cad>] ? notifier_call_chain+0x2d/0x70 [ 573.862678] [<c01588fd>] ? __blocking_notifier_call_chain+0x4d/0x60 [ 573.862682] [<c016b2f1>] ? sys_init_module+0xb1/0x1f0 [ 573.862686] [<c0102ffc>] ? sysenter_do_call+0x12/0x28 [ 573.862688] Code: 89 48 08 b8 04 00 00 00 e8 df aa ec ff ba f4 ff ff ff 85 c0 89 43 14 74 31 b8 b0 18 71 c0 e8 19 b9 24 00 a1 c4 18 71 c0 8d 53 0c <89> 50 04 89 43 0c b8 b0 18 71 c0 c7 43 10 c4 18 71 c0 89 15 c4 [ 573.862725] EIP: [<c0308b8f>] __percpu_counter_init+0x3f/0x70 SS:ESP 0068:f69d5eec [ 573.862730] CR2: 00000000f8214a24 [ 573.862734] ---[ end trace 39c4e0b55e7cf54d ]--- Signed-off-by: Rafael Laufer <rlaufer@cisco.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-09MAINTAINERS: additional NETWORKING [GENERAL] and NETWORKING DRIVERS patternsJoe Perches
Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-09gianfar: keep vlan related state when restartYong Zhang
If vlan has been enabled. ifdown followed by ifup will lost hardware related state. Also remove duplicated operation in gfar_vlan_rx_register(). Signed-off-by: Yong Zhang <yong.zhang0@gmail.com> Acked-by: Dai Haruki <dai.haruki@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-09e1000e: fix potential NVM corruption on ICH9 with 8K bank sizeBruce Allan
The bank offset was being incorrectly calculated on ICH9 parts with a bank size of 8K (instead of the more common 4K bank) which would cause any NVM writes to be done on the wrong address after switching from bank 1 to bank 0. Additionally, assume we are meant to use bank 0 if a valid bank is not detected, and remove the unnecessary acquisition of the SW/FW/HW semaphore when writing to the shadow ram version of the NVM image. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-09e1000e: fix acquisition of SW/FW/HW semaphore for ICHx partsBruce Allan
For ICHx parts, write the EXTCNF_CTRL.SWFLAG bit once when trying to acquire the SW/FW/HW semaphore instead of multiple times to prevent the hardware from having problems (especially for systems with manageability enabled), and extend the timeout for the hardware to set the SWFLAG bit. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-09ixgbe: Disable packet split only on FCoE queues in 82599Yi Zou
For 82599, packet split has to be disabled for FCoE direct data placement. However, this is only required on received queues allocated for FCoE. This patch adds a per ring flags to indicate if packet split is disabled on a per queue basis, particularly for FCoE, as packet split must be disabled for large receive using direct data placement (DDP). Signed-off-by: Yi Zou <yi.zou@intel.com> Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-09ixgbe: Pass rx_ring directly in ixgbe_configure_srrctl()Yi Zou
Instead of passing the register index of the corresponding rx_ring and find the way back to get to corresponding rx_ring in ixgbe_configure_srrctl(), simplify the function ixgbe_configure_srrctl() by passing the rx_ring into it. Then the register index for that rx_ring is already available from rx_ring->reg_idx. Signed-off-by: Yi Zou <yi.zou@intel.com> Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-09tun: Extend RTNL lock coverage over whole ioctlHerbert Xu
As it is, parts of the ioctl runs under the RTNL and parts of it do not. The unlocked section is still protected by the BKL, but there can be subtle races. For example, Eric Biederman and Paul Moore observed that if two threads tried to create two tun devices on the same file descriptor, then unexpected results may occur. As there isn't anything in the ioctl that is expected to sleep indefinitely, we can prevent this from occurring by extending the RTNL lock coverage. This also allows to get rid of the BKL. Finally, I changed tun_get_iff to take a tun device in order to avoid calling tun_put which would dead-lock as it also tries to take the RTNL lock. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-09fec: fix FEC driver packet transmission breakageGreg Ungerer
Commit f0b3fbeae11a526c3d308b691684589ee37c359b ("FEC Buffer rework") breaks transmission of packets where the skb data buffer is not memory aligned according to FEC_ALIGNMENT. It incorrectly passes to dma_sync_single() the buffer address directly from the skb, instead of the address calculated for use (which may be the skb address or one of the bounce buffers). It seems there is no use converting the cpu address of the buffer to a physical either, since dma_map_single() expects the cpu address and will return the dma address to use in the descriptor. So remove the use of __pa() on the buffer address as well. This patch is against 2.6.30-rc5. This breakage is a regression over 2.6.30, which does not have this problem. Signed-off-by: Greg Ungerer <gerg@uclinux.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-09can: Fix raw_getname() leakEric Dumazet
raw_getname() can leak 10 bytes of kernel memory to user (two bytes hole between can_family and can_ifindex, 8 bytes at the end of sockaddr_can structure) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Oliver Hartkopp <oliver@hartkopp.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-09Fix xfrm hash collisions by changing __xfrm4_daddr_saddr_hash to hash ↵Jussi Mäki
addresses with addition This patch fixes hash collisions in cases where number of entries have incrementing IP source and destination addresses from single respective subnets (i.e. 192.168.0.1-172.16.0.1, 192.168.0.2-172.16.0.2, and so on.). Signed-off-by: Jussi Maki <joamaki@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-09atlx: strncpy does not null terminate stringRoel Kluin
strlcpy() will always null terminate the string. Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Cc: Jay Cliburn <jcliburn@gmail.com> Cc: Chris Snook <csnook@redhat.com> Cc: Jie Yang <jie.yang@atheros.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-09irda: fix read buffer overflowRoel Kluin
io[i] is read before the bounds check on i, order should be reversed. Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Cc: Samuel Ortiz <samuel@sortiz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-09MAINTAINERS: update atlx contact infoChris Snook
Update MAINTAINERS to reflect my current (non-)affiliation. Anyone hiring? Signed-off-by: Chris Snook <chris.snook@gmail.com> Cc: Jay Cliburn <jcliburn@gmail.com> Cc: Jie Yang <jie.yang@atheros.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-09Merge branch 'master' of /home/davem/src/GIT/linux-2.6/David S. Miller
2009-08-09Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6
2009-08-10Remove deadlock potential in md_openNeilBrown
A recent commit: commit 449aad3e25358812c43afc60918c5ad3819488e7 introduced the possibility of an A-B/B-A deadlock between bd_mutex and reconfig_mutex. __blkdev_get holds bd_mutex while calling md_open which takes reconfig_mutex, do_md_run is always called with reconfig_mutex held, and it now takes bd_mutex in the call the revalidate_disk. This potential deadlock was not caught by lockdep due to the use of mutex_lock_interruptible_nexted which was introduced by commit d63a5a74dee87883fda6b7d170244acaac5b05e8 do avoid a warning of an impossible deadlock. It is quite possible to split reconfig_mutex in to two locks. One protects the array data structures while it is being reconfigured, the other ensures that an array is never even partially open while it is being deactivated. In particular, the second lock prevents an open from completing between the time when do_md_stop checks if there are any active opens, and the time when the array is either set read-only, or when ->pers is set to NULL. So we can be certain that no IO is in flight as the array is being destroyed. So create a new lock, open_mutex, just to ensure exclusion between 'open' and 'stop'. This avoids the deadlock and also avoids the lockdep warning mentioned in commit d63a5a74d Reported-by: "Mike Snitzer" <snitzer@gmail.com> Reported-by: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: NeilBrown <neilb@suse.de>
2009-08-09Merge branch 'for-linus' of git://git.infradead.org/ubi-2.6Linus Torvalds
* 'for-linus' of git://git.infradead.org/ubi-2.6: UBI: compatible fallback in absense of sequence numbers UBI: fix double free on error path
2009-08-09Merge branch 'kvm-updates/2.6.31' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
* 'kvm-updates/2.6.31' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: Avoid redelivery of edge interrupt before next edge KVM: MMU: limit rmap chain length KVM: ia64: fix build failures due to ia64/unsigned long mismatches KVM: Make KVM_HPAGES_PER_HPAGE unsigned long to avoid build error on powerpc KVM: fix ack not being delivered when msi present KVM: s390: fix wait_queue handling KVM: VMX: Fix locking imbalance on emulation failure KVM: VMX: Fix locking order in handle_invalid_guest_state KVM: MMU: handle n_free_mmu_pages > n_alloc_mmu_pages in kvm_mmu_change_mmu_pages KVM: SVM: force new asid on vcpu migration KVM: x86: verify MTRR/PAT validity KVM: PIT: fix kpit_elapsed division by zero KVM: Fix KVM_GET_MSR_INDEX_LIST
2009-08-09Merge branch 'drm-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 * 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: drm/i915: silence vblank warnings drm: silence pointless vblank warning. drm: When adding probed modes, preserve duplicate mode types
2009-08-09Merge branch 'timers-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: posix_cpu_timers_exit_group(): Do not use thread_group_cputimer()
2009-08-09Merge branch 'tracing-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf_counter: Fix/complete ftrace event records sampling perf_counter, ftrace: Fix perf_counter integration tracing/filters: Always free pred on filter_add_subsystem_pred() failure tracing/filters: Don't use pred on alloc failure ring-buffer: Fix memleak in ring_buffer_free() tracing: Fix recordmcount.pl to handle sections with only weak functions ring-buffer: Fix advance of reader in rb_buffer_peek() tracing: do not use functions starting with .L in recordmcount.pl ring-buffer: do not disable ring buffer on oops_in_progress ring-buffer: fix check of try_to_discard result
2009-08-09Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: fix buffer overflow in efi_init() x86: Add quirk to make Apple MacBookPro5,1 use reboot=pci x86: Fix MSI-X initialization by using online_mask for x2apic target_cpus x86: Fix VMI && stack protector
2009-08-09Merge branch 'core-fixes-for-linus-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-fixes-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: lockdep: Fix typos in documentation lockdep: Fix file mode of lock_stat rtmutex: Avoid deadlock in rt_mutex_start_proxy_lock()
2009-08-09perf tools: callchain: Fix bad rounding of minimum rateFrederic Weisbecker
Sometimes we get callchain branches that have a rate under the limit given by the user. Say you launched: perf record -f -g -a ./hackbench 10 perf report -g fractal,10.0 And you got: 2.33% hackbench [kernel] [k] _spin_lock_irqsave | |--78.57%-- remove_wait_queue | poll_freewait | do_sys_poll | sys_poll | sysenter_dispatch | 0xf7ffa430 | 0x1ffadea3c | |--7.14%-- __up_read | up_read | do_page_fault | page_fault | 0xf7ffa430 | 0xa0df710000000a ... It is abnormal to get a 7.14% branch whereas we passed a 10% filter. The problem is that we round down the minimum threshold. This happens mostly when we have very low number of events. If the total amount of your branch is 4 and you have a subranch of 3 events, filtering to 90% will be computed like follows: limit = 4 * 0.9; The result is about 3.6, but the cast to integer will round down to 3. It means that our filter is actually of 75% We must then explicitly round up the minimum threshold. Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: acme@redhat.com Cc: peterz@infradead.org Cc: efault@gmx.de LKML-Reference: <20090809024235.GA10146@nowhere> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09perf_counter tools: Fix libbfd detection for systems with libz dependencyMike Galbraith
Due to a libz dependency in some distro's binutils package, C++ demangle support isn't compiled in despite the necessary libraries being available. Fix this by adding a -lz link test to the dependency detection rules. Signed-off-by: Mike Galbraith <efault@gmx.de> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1249733655.6929.5.camel@marge.simson.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09perf: "Longum est iter per praecepta, breve et efficax per exempla"Carlos R. Mafra
A few examples of how 'perf' can be used, from an e-mail by Ingo Molnar http://lkml.org/lkml/2009/8/4/346. Signed-off-by: Carlos R. Mafra <crmafra2@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Valdis.Kletnieks@vt.edu LKML-Reference: <20090805185334.GA4535@Pilar.aei.mpg.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09perf_counter: Fix a race on perf_counter_ctxPeter Zijlstra
While extending perfcounters with BTS hw-tracing, Markus Metzger managed to trigger this warning: [ 995.557128] WARNING: at kernel/perf_counter.c:1191 __perf_counter_task_sched_out+0x48/0x6b() triggers because commit 9f498cc5be7e013d8d6e4c616980ed0ffc8680d2 (perf_counter: Full task tracing) removed clearing of tsk->perf_counter_ctxp out from under ctx->lock which introduced a race (against perf_lock_task_context). Move it back and deal with the exit notification by explicitly passing along the former task context. Reported-by: Markus T Metzger <markus.t.metzger@intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1249667341.17467.5.camel@twins> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09perf_counter: Fix tracepoint sampling to be part of generic samplingFrederic Weisbecker
Based on Peter's comments, make tracepoint sampling generic just like all the other sampling bits are. This is a rename with no code changes: - PERF_SAMPLE_TP_RECORD to PERF_SAMPLE_RAW - struct perf_tracepoint_record to perf_raw_record We want the system in place that transport tracepoints raw samples events into the perf ring buffer to be generalized and usable by any type of counter. Reported-by; Peter Zijlstra <peterz@infradead.org> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1249698400-5441-4-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09perf_counter: Work around gcc warning by initializing tracepoint record ↵Frederic Weisbecker
unconditionally Despite that the tracepoint record is always present when the PERF_SAMPLE_TP_RECORD flag is set, gcc raises a warning, thinking it might not be initialized: kernel/perf_counter.c: In function ‘perf_counter_output’: kernel/perf_counter.c:2650: warning: ‘tp’ may be used uninitialized in this function Then, initialize it to NULL and always check if it's not NULL before dereference it. Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1249698400-5441-2-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09perf tools: callchain: Fix sum of percentages to be 100% by displaying ↵Frederic Weisbecker
amount of ignored chains in fractal mode When we filter the callchains below a given percentage, we ignore them and the end result only shows entries that have an upper percentage than the filter threshold. It seems to users then that we have an imbalance in the percentage, as if the sum inside a profiled branch doesn't reach 100%. Since in the past there have been real perf report bugs that showed the same sypmtom, it would be nice to assure the user that the data is perfect and trustable and it all sums up to 100.00%. So fix this by displaying the remaining hits that have been filtered but without more detail than their amount in each branches. Example while filtering below 50%: 7.73% [k] delay_tsc | |--98.22%-- __const_udelay | | | |--86.37%-- ath5k_hw_register_timeout | | ath5k_hw_noise_floor_calibration | | ath5k_hw_reset | | ath5k_reset | | ath5k_config | | ieee80211_hw_config | | | | | |--88.53%-- ieee80211_scan_work | | | worker_thread | | | kthread | | | child_rip | | --11.47%-- [...] | --13.63%-- [...] --1.78%-- [...] Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <1249690585-9145-4-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09perf tools: callchain: Fix 'perf report' display to be callchain by defaultFrederic Weisbecker
If we recorded with -g option to record the callchain, right now we require a -g option to perf report as well - and people reported this as unnecessary complication: the user already specified -g once, no need to require it a second time. So if the recording includes call-chains, display the callchain by default from perf report. ( The user can override this default using "-g none" option from perf report. ) Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <1249690585-9145-3-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09perf tools: callchain: Fix spurious 'perf report' warnings: ignore empty ↵Frederic Weisbecker
callchains When the callchain tree comes to insert an empty backtrace, it raises a spurious warning about the fact we are inserting an empty. This is spurious because the radix tree assumes it did something wrong to reach this state. But it didn't, we just met an empty callchain that has to be ignored. This happens occasionally with certain types of call-chain recordings. If it happens it's a big nuisance as perf report output starts with thousands of warning lines. Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <1249690585-9145-2-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09perf record: Fix the -A UI for empty or non-existent perf.dataPierre Habouzit
1. Ignore the -A argument if there is no perf.data file 2. Treat an empty file like a non existent file. Else, perf will try to read the perf.data header, and fail with an error. Treating an empty file like a non-existent file makes sense, since an interupted (as in SIGKILLed) perf could leave such files around, and you don't want to annoy the user with errors for files with no data in it. Signed-off-by: Pierre Habouzit <pierre.habouzit@intersec.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09perf util: Fix do_read() to fail on EOF instead of busy-loopingPierre Habouzit
While toying with perf, I've noticed that perf record can easily enter a busy loop when doing something as silly as: $ perf record -A ls Yeah, do_read here really wants to read a known size, not being able to should die(), not busy-loop ;) That was the cause for the bug. Signed-off-by: Pierre Habouzit <pierre.habouzit@intersec.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09perf list: Fix the output to not include tracepoints without an idPeter Zijlstra
Stop perf list from displaying tracepoints without an id file, those are special tracepoints that are not interfaced to perfcounters so listing them is erroneous and passing them as events will produce no output. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Jason Baron <jbaron@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Chris Mason <chris.mason@oracle.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09perf_counter/powerpc: Fix oops on cpus without perf_counter hardware supportPaul Mackerras
If we have the powerpc perf_counter backend compiled in, but the cpu we are running on is one where we don't support the PMU, we currently oops in hw_perf_group_sched_in if we try to use any counters, because ppmu is NULL in that case, and we unconditionally dereference ppmu. This fixes the problem by adding a check if ppmu is NULL at the beginning of hw_perf_group_sched_in, and also at the beginning of the other functions that get called from the perf_counter core, i.e. hw_perf_disable, hw_perf_enable, and hw_perf_counter_setup. Signed-off-by: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: benh@kernel.crashing.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09perf stat: Fix tool option consistency: rename -S/--scale to -c/--scaleBrice Goglin
We want to use a coherent flag for -S/--stat across all tools, so free up -S in perf stat. Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: paulus@samba.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09perf report: Add debug help for the finding of symbol bugs - show the symtab ↵Arnaldo Carvalho de Melo
origin (DSO, build-id, kernel, etc) Used with perf report --verbose: [acme@doppio linux-2.6-tip]$ perf report -v | head -16 5.17% firefox /usr/lib64/xulrunner-1.9.1/libxul.so 0x00000000005d8eee f [.] imgContainer::DrawFrameTo(gfxIImageFrame*, gfxIImageFrame*, nsRect&) 2.56% firefox /lib64/libpthread-2.10.1.so 0x0000000000008e02 d [.] __pthread_mutex_lock_internal 1.94% firefox /usr/lib64/xulrunner-1.9.1/libxul.so 0x0000000000d0af8f f [.] SearchTable 1.75% firefox [kernel] 0xffffffffff60013b k [.] vread_hpet 1.63% firefox /lib64/libpthread-2.10.1.so 0x000000000000a404 d [.] __pthread_mutex_unlock 1.47% firefox /usr/lib64/xulrunner-1.9.1/libmozjs.so 0x00000000000482ea f [.] js_Interpret 1.42% firefox /usr/lib64/xulrunner-1.9.1/libmozjs.so 0x000000000003eda3 f [.] JS_CallTracer 1.24% firefox [kernel] 0xffffffff8102ca4a k [k] read_hpet 1.16% firefox [kernel] 0xffffffff810f3dd4 k [k] fget_light 1.11% firefox /usr/lib64/xulrunner-1.9.1/libmozjs.so 0x00000000000567ff f [.] js_TraceObject 0.98% firefox /usr/lib64/firefox-3.5.2/firefox 0x000000000000dd23 b [.] arena_ralloc [acme@doppio linux-2.6-tip]$ The new field is just after the symbol address. To help in figuring out symbol resolution bugs. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09perf report: Fix per task mult-counter stat reportingPeter Zijlstra
Brice Goglin reported: > I can easily sort them by thread id, but I don't know how to match > my 4 events with each group of 4 lines. Also report the counter id and the time running/enabled stats (in case the counter got time-shared). Reported-by: Brice Goglin <Brice.Goglin@inria.fr> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Tested-by: Brice Goglin <Brice.Goglin@inria.fr> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09perf tools: Fix multi-counter stat bug caused by incorrect reading of ↵Peter Zijlstra
perf.data file header Brice Goglin reported that only the first result from a multi-counter perf record --stat run is accurate, the rest looks bogus. A silly mistake made us re-read the first attribute for every recorded attribute. Reported-by: Brice Goglin <Brice.Goglin@inria.fr> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Tested-by: Brice Goglin <Brice.Goglin@inria.fr> Cc: paulus@samba.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09perf tools: Fix call-chain cumul hit based sub-total (fractal mode)Frederic Weisbecker
The callchain fractal mode builds each new total hits in a new branch of profiling by using the parent's hits of the current branch plus the hits of the children. This is wrong, the total hits of a branch should be made of the sum of every children hits, we must ignore the parent hits in this scope. This patch also fixes another mistake with the hit counting. Now the rates are correct. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09perf top: Update man pageMike Galbraith
perf_counter tools: update perf top manual page to reflect current implementation. Signed-off-by: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09perf top: Improve interactive key handlingMike Galbraith
Pressing any key which is not currently mapped to functionality, based on startup command line options, displays currently mapped keys, and prompts for input. Pressing any unmapped key at the prompt returns the user to display mode with variables unchanged. eg, pressing ? <SPACE> <ESC> etc displays currently available keys, the value of the variable associated with that key, and prompts. Pressing same again aborts input. Signed-off-by: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09perf_counter: Fix software counters for fast moving event sourcesPeter Zijlstra
Reimplement the software counters to deal with fast moving event sources (such as tracepoints). This means being able to generate multiple overflows from a single 'event' as well as support throttling. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>