summaryrefslogtreecommitdiffstats
path: root/drivers/gpu/drm/i915/intel_ringbuffer.c
AgeCommit message (Collapse)Author
2012-05-03drm/i915: kill pointless clearing of dev_priv->hws_mapDaniel Vetter
We kzalloc dev_priv, and we never use hws_map in intel_ringbuffer.c. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-05-03drm/i915: remove do_retire from i915_wait_requestBen Widawsky
This originates from a hack by me to quickly fix a bug in an earlier patch where we needed control over whether or not waiting on a seqno actually did any retire list processing. Since the two operations aren't clearly related, we should pull the parameter out of the wait function, and make the caller responsible for retiring if the action is desired. The only function call site which did not get an explicit retire_request call (on purpose) is i915_gem_inactive_shrink(). That code was already calling retire_request a second time. v2: don't modify any behavior excepit i915_gem_inactive_shrink(Daniel) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-05-03drm/i915: rip out GEM drm feature checksDaniel Vetter
We always set it so there's no point in checking. We could instead add a bit that tells us whether gem is actually initialized (i.e. either kms or gem_init_ioctl called), but that's imho not worth it. So just rip it out. There's a little change in the wait_ring timeout, but we've never run with anything else than the 60 second timeout, even on dri1 userspace. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-05-03drm/i915: Use a global lock for modifying global irq flagsChris Wilson
We were attempting to use a per-ring spinlock whilst modifying global IRQ flags. A recipe for rare missed interrupts. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-05-03drm/i915: create macros to handle masked bitsDaniel Vetter
... and put them to so good use. Note that there's functional change in vlv clock gating code, we now no longer spuriously read back the current value of the bit. According to Bspec the high bits should always read zero, so ORing this in should have no effect. Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-05-03drm/i915: i8xx interrupt handlerChris Wilson
gen2 hardware has some significant differences from the other interrupt routines that were glossed over and then forgotten about in the transition to KMS. Such as - 16bit IIR - PendingFlip status bit This patch reintroduces a handler specifically for gen2 for the purpose of handling pageflips correctly, simplifying code in the process. v2: Also fixup ring get/put irq to only access 16bit registers (Daniel) Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=24202 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=41793 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: use posting_read16 in intel_ringbuffer.c and kill _driver from the function names.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-28drm/i915: Set the Stencil Cache eviction policy to non-LRA mode.Kenneth Graunke
Clearing bit 5 of CACHE_MODE_0 is necessary to prevent GPU hangs in OpenGL programs such as Google MapsGL, Google Earth, and gzdoom when using separate stencil buffers. Without it, the GPU tries to use the LRA eviction policy, which isn't supported. This was supposed to be off by default, but seems to be on for many machines. This cannot be done in gen6_init_clock_gating with most of the other workaround bits; the render ring needs to exist. Otherwise, the register write gets dropped on the floor (one printk will show it changed, but a second printk immediately following shows the value reverts to the old one). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=47535 Cc: stable@vger.kernel.org Cc: Rob Castle <futuredub@gmail.com> Cc: Eric Appleman <erappleman@gmail.com> Cc: aaron667@gmx.net Cc: Keith Packard <keithp@keithp.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-04-20drm/i915: invalidate render cache on gen2Daniel Vetter
It looks like we also need to flush the render cache when we just invalidate it. This fixes a regression in i-g-t/gem_tiled_blits on my i855gm. I guess the render cache there is virtually indexed, so we need to clean it when changing gtt mappings. This regression has been introduce in commit 46f0f8d120c4afae53a5670bf3ac80a928340ff3 Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Wed Apr 18 11:12:11 2012 +0100 drm/i915: Don't set a MBZ bit in gen2/3 MI_FLUSH Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-18drm/i915: Don't set a MBZ bit in gen2/3 MI_FLUSHChris Wilson
On gen2 MI_EXE_FLUSH is actually an AGP flush bit and on gen3 marked as reserved. On both it is documented as being must-be-zero. So obey the documentation, and separate the gen2 flush into its own little routine and share with gen3. This means that we can rename the existing render_ring_flush() to reflect the generation from which it first applies and remove the code for handling earlier generations from it. v2: Applies to gen3 as well v3: Make it compile and improve the commit message. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-18drm/i915: Replace open coded MI_BATCH_GTTChris Wilson
The (2<<6) virtual memory space selector harks back to gen3 and is mandatory given our use of GTT space for batchbuffers. On gen4+, use of the GTT became mandatory and bit6 marked reserved. However the code must now explicitly set (1<<7), which conveniently is also (2<<6). To clarify the meaning for future readers, replace the open coded (2<<6) with MI_BATCH_GTT. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-18drm/i915: [sparse] trivial sparse fixesBen Widawsky
This should contain all the changes which require no thought to make sparse happy. Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-17Merge tag 'v3.4-rc3' into drm-intel-next-queuedDaniel Vetter
Backmerge Linux 3.4-rc3 into drm-intel-next to resolve a few things that conflict/depend upon patches in -rc3: - Second part of the Sandybridge workaround series - it changes some of the same registers. - Preparation for Chris Wilson's fencing cleanup - we need the fix from -rc3 merged before we can move around all that code. - Resolve the gmbus conflict - gmbus has been disabled in 3.4 again, but should be enabled on all generations in 3.5. Conflicts: drivers/gpu/drm/i915/intel_i2c.c Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-13drm/i915: inline enable/disable_irq into ring->get/put_irqDaniel Vetter
Now that these are properly refactored this additional indirection doesn't really buy us anything but confusion. Hence inline them. This duplicates the ironlake gt enable/disable code snippet, but we've already separate ilk from gen6+ gt irq in i915_irq.c, so I think this makes more sense. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-13drm/i915: don't set up gem ring functions on gen5 for !kmsDaniel Vetter
We already disallow initialition of gem in this case in the corresponding ioctl, so don't bother setting up the gem support ring functions in the legacy dri render ring init. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-13drm/i915: consolidate ring->add_request a bitDaniel Vetter
They're indentical, so just kill one. Also give the other a prefix to distinguish it from the gen6+ functions - this add_request function is not really generic code. v2: Fixup commit message as noted by Ben Widawsky. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-13drm/i915: split up ring->dispatch_execbuffer functionsDaniel Vetter
Now that we can, we should split them up in a way that makes some sense and banishes the IS_ checks into init code. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-13drm/i915: don't enable the gen6 bsd ring tail write enable on gen7Daniel Vetter
HW engineers have fixed this issue for ivb. Again, a nice cleanup possible thanks to the more flexible ring initialization. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-13drm/i915: split out the gen5 ring irq get/put functionsDaniel Vetter
Now that we have sensibly split up, we can nicely get rid of that ugly is_gen5 check. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-13drm/i915: abstract away ring-specific irq_get/putDaniel Vetter
Inspired by Ben Widawsky's patch for gen6+. Now after restructuring how we set up the ring vtables and parameters, we can do this right. This kills the bsd specific get/put_irq functions, they're now the same. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-13drm/i915: consolidate ring->sync-to functionsDaniel Vetter
The waiter is always the ring itself (otherwise we'd have a decent snafu in a callsite), so we can unify this easily. Also give it the usual gen6_ prefix, in case anyone is foolish enough to implement hw semaphores for gen5. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-13drm/i915: don't set up rings on gen6+ for non-kmsDaniel Vetter
It's not supported, and with the patch to refuse loading on gen6+ without kms enabled, there's also no way we can hit this. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-13drm/i915: dynamically set up blt ring functions and parametersDaniel Vetter
Just for consistency. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-13drm/i915: dynamically set up bsd ring functions and paramsDaniel Vetter
The same treatment for the bsd ring. Again, this will be split up further by the irq rework. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-13drm/i915: dynamically set up the render ring functions and paramsDaniel Vetter
Our hw is simply not well-designed enough that it neatly fits into boxes. Everywhere else we set up vtables and similar things dynamically using switch statements - it's simply much more flexible. This is prep work to rework the pre-gen6 ring irq stuff - it'll add a few more differences. With the current const struct templates, that would be a mess. This leads to some unfortunate duplication with the old dri1 code, but we can reap that again because gen6 isn't actually supported there. But that's for a separate patch. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-13drm/i915: set ring->size in common ring setup codeDaniel Vetter
Eventually we want to scale the ring size depending upon available gtt space. For now just consolidate this instead of replicating it over all ringbuffer templates. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-13drm/i915: rip out ring->irq_maskDaniel Vetter
We only ever enable/disable one interrupt (namely user_interrupts and pipe_notify), so we don't need to track the interrupt masking state. Also rename irq_enable to irq_enable_mask, now that it won't collide - beforehand both a irq_mask and irq_enable_mask would have looked a bit strange. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-12drm/i915: hide (seqno-1) in ringbuffer codeBen Widawsky
Waiting for seqno-1 in our object synchronization code is an implementation detail given how we've decided to do the waits within the rest of our code. Requested-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-12Merge branch 'drm-intel-next' of ↵Dave Airlie
git://people.freedesktop.org/~danvet/drm-intel into drm-core-next Daniel Vetter wrote First pull request for 3.5-next, slightly large than usual because new things kept coming in since the last pull for 3.4. Highlights: - first batch of hw enablement for vlv (Jesse et al) and hsw (Eugeni). pci ids are not yet added, and there's still quite a few patches to merge (mostly modesetting). To make QA easier I've decided to merge this stuff in pieces. - loads of cleanups and prep patches spurred by the above. Especially vlv is a real frankenstein chip, but also hsw is stretching our driver's code design. Expect more to come in this area for 3.5. - more gmbus fixes, cleanups and improvements by Daniel Kurtz. Again, there are more patches needed (and some already queued up), but I wanted to split this a bit for better testing. - pwrite/pread rework and retuning. This series has been in the works for a few months already and a lot of i-g-t tests have been created for it. Now it's finally ready to be merged. Note that one patch in this series touches include/pagemap.h, that patch is acked-by akpm. - reduce mappable pressure and relocation throughput improvements from Chris. - mmap offset exhaustion mitigation by Chris Wilson. - a start at figuring out which codepaths in our messy dri1/ums+gem/kms driver we actually need to support by bailing out of unsupported case. The driver now refuses to load without kms on gen6+ and disallows a few ioctls that userspace never used in certain cases. More of this will definitely come. - More decoupling of global gtt and ppgtt. - Improved dual-link lvds detection by Takashi Iwai. - Shut up the compiler + plus fix the fallout (Ben) - Inverted panel brightness handling (mostly Acer manages to break things in this way). - Small fixlets and adjustements and some minor things to help debugging. Regression-wise QA reported quite a few issues on ivb, but all of them turned out to be hw stability issues which are already fixed in drm-intel-fixes (QA runs the nightly regression tests on -next alone, without -fixes automatically merged in). There's still one issue open on snb, it looks like occlusion query writes are not quite as cache coherent as we've expected. With some of the pwrite adjustements we can now reliably hit this. Kernel workaround for it is in the works." * 'drm-intel-next' of git://people.freedesktop.org/~danvet/drm-intel: (101 commits) drm/i915: VCS is not the last ring drm/i915: Add a dual link lvds quirk for MacBook Pro 8,2 drm/i915: make quirks more verbose drm/i915: dump the DMA fetch addr register on pre-gen6 drm/i915/sdvo: Include YRPB as an additional TV output type drm/i915: disallow gem init ioctl on ilk drm/i915: refuse to load on gen6+ without kms drm/i915: extract gt interrupt handler drm/i915: use render gen to switch ring irq functions drm/i915: rip out old HWSTAM missed irq WA for vlv drm/i915: open code gen6+ ring irqs drm/i915: ring irq cleanups drm/i915: add SFUSE_STRAP registers for digital port detection drm/i915: add WM_LINETIME registers drm/i915: add WRPLL clocks drm/i915: add LCPLL control registers drm/i915: add SSC offsets for SBI access drm/i915: add port clock selection support for HSW drm/i915: add S PLL control drm/i915: add PIXCLK_GATE register ... Conflicts: drivers/char/agp/intel-agp.h drivers/char/agp/intel-gtt.c drivers/gpu/drm/i915/i915_debugfs.c
2012-04-11drm/i915/ringbuffer: Exclude last 2 cachlines of ring on 845gChris Wilson
The 845g shares the errata with i830 whereby executing a command within 2 cachelines of the end of the ringbuffer may cause a GPU hang. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-09drm/i915: use render gen to switch ring irq functionsDaniel Vetter
Top-level interrupt bits are usually found in the display block. It therefore makes sense to use HAS_PCH_SPLIT in i915_irq.c But the irq stuff in intel_ring.c only concerns itself with render core/gt-level interrupt sources. It therefore makes more sense to switch based on gpu gen. Kills a vlv special case. Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-09drm/i915: open code gen6+ ring irqsBen Widawsky
We can now open-code the get/put irq functions as they were just abstracting single register definitions. It would be nice to merge this in with the IRQ handling code... but that is too much work for me at present. In addition I could probably collapse this in to a lot of the Ironlake stuff, but I don't think it's worth the potential regressions. This patch itself should not effect functionality. CC: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-09drm/i915: ring irq cleanupsBen Widawsky
- gen6 put/get only need one argument rflags and gflags are always the same (see above explanation) - remove a couple redundantly defined IRQs - reordered some lines to make things go in descending order Every ring has its own interrupts, enables, masks, and status bits that are fed into the main interrupt enable/mask/status registers. At one point in time it seemed like a good idea to make our functions support the notion that each interrupt may have a different bit position in the corresponding register (blitter parser error may be bit n in IMR, but bit m in blitter IMR). It turned out though that the HW designers did us a solid on Gen6+ and this unfortunate situation has been avoided. This allows our interrupt code to be cleaned up a bit. I jammed this into one commit because there should be no functional change with this commit, and staging it into multiple commits was unnecessarily artificial IMO. CC: Chris Wilson <chris@chris-wilson.co.uk> CC: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: - fixed up merged conflict with vlv changes. - added GEN6 to GT blitter bit, we only use it on gen6+. - added a comment to both ring irq bits and GT irq bits that on gen6+ these alias. - added comment that GT_BSD_USER_INTERRUPT is ilk-only. - I've got confused a bit that we still use GT_USER_INTERRUPT on ivb for the render ring - but this goes back to ilk where we have only gt interrupt bits and so we be equally confusing if changed.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-01drm/i915: apply CS reg readback trick against missed IRQ on snbDaniel Vetter
Ben Widawsky reported missed IRQ issues and this patch here helps. We have one other missed IRQ report still left on snb, reported by QA: https://bugs.freedesktop.org/show_bug.cgi?id=46145 This is _not_ a regression due to the forcewake voodoo though, it started showing up before that was applied and has been on-and-off for the past few weeks. According to QA this patch does not help. But the missed IRQ is always from the blt ring (despite running piglit, so also render activity expected), so I'm hopefully that this is an issue with the blt ring itself. Tested-by: Ben Widawsky <ben@bwidawsk.net> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-03-29drm/i915: ValleyView IRQ supportJesse Barnes
ValleyView has a new interrupt architecture; best to put it in a new set of functions. Also make sure the ring mask functions handle ValleyView. FIXME: fix flipping; need to enable interrupts and call prepare/finish Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-03-18drm/i915: Add wait_for in init_ring_commonSean Paul
I have seen a number of "blt ring initialization failed" messages where the ctl or start registers are not the correct value. Upon further inspection, if the code just waited a little bit, it would read the correct value. Adding the wait_for to these reads should eliminate the issue. Signed-off-by: Sean Paul <seanpaul@chromium.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-03-15drm: Merge tag 'v3.3-rc7' into drm-core-nextDave Airlie
Merge the fixes so far into core-next, needed to test intel driver. Conflicts: drivers/gpu/drm/i915/intel_ringbuffer.c
2012-02-27drm/i915: Remove use of the autoreported ringbuffer HEAD positionChris Wilson
This is a revert of 6aa56062eaba67adfb247cded244fd877329588d. This was originally introduced to workaround reads of the ringbuffer registers returning 0 on SandyBridge causing hangs due to ringbuffer overflow. The root cause here was reads through the GT powerwell require the forcewake dance, something we only learnt of later. Now it appears that reading the reported head position from the HWS is returning garbage, leading once again to hangs. For example, on q35 the autoreported head reports: [ 217.975608] head now 00010000, actual 00010000 [ 436.725613] head now 00200000, actual 00200000 [ 462.956033] head now 00210000, actual 00210010 [ 485.501409] head now 00400000, actual 00400020 [ 508.064280] head now 00410000, actual 00410000 [ 530.576078] head now 00600000, actual 00600020 [ 553.273489] head now 00610000, actual 00610018 which appears reasonably sane. In contrast, if we look at snb: [ 141.970680] head now 00e10000, actual 00008238 [ 141.974062] head now 02734000, actual 000083c8 [ 141.974425] head now 00e10000, actual 00008488 [ 141.980374] head now 032b5000, actual 000088b8 [ 141.980885] head now 03271000, actual 00008950 [ 142.040628] head now 02101000, actual 00008b40 [ 142.180173] head now 02734000, actual 00009050 [ 142.181090] head now 00000000, actual 00000ae0 [ 142.183737] head now 02734000, actual 00009050 In addition, the automatic reporting of the head position is scheduled to be defeatured in the future. It has no more utility, remove it. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45492 Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Tested-by: Eric Anholt <eric@anholt.net> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-15drm/i915: Record the tail at each request and use it to estimate the headChris Wilson
By recording the location of every request in the ringbuffer, we know that in order to retire the request the GPU must have finished reading it and so the GPU head is now beyond the tail of the request. We can therefore provide a conservative estimate of where the GPU is reading from in order to avoid having to read back the ring buffer registers when polling for space upon starting a new write into the ringbuffer. A secondary effect is that this allows us to convert intel_ring_buffer_wait() to use i915_wait_request() and so consolidate upon the single function to handle the complicated task of waiting upon the GPU. A necessary precaution is that we need to make that wait uninterruptible to match the existing conditions as all the callers of intel_ring_begin() have not been audited to handle ERESTARTSYS correctly. By using a conservative estimate for the head, and always processing all outstanding requests first, we prevent a race condition between using the estimate and direct reads of I915_RING_HEAD which could result in the value of the head going backwards, and the tail overflowing once again. We are also careful to mark any request that we skip over in order to free space in ring as consumed which provides a self-consistency check. Given sufficient abuse, such as a set of unthrottled GPU bound cairo-traces, avoiding the use of I915_RING_HEAD gives a 10-20% boost on Sandy Bridge (i5-2520m): firefox-paintball 18927ms -> 15646ms: 1.21x speedup firefox-fishtank 12563ms -> 11278ms: 1.11x speedup which is a mild consolation for the performance those traces achieved from exploiting the buggy autoreported head. v2: Add a few more comments and make request->tail a conservative estimate as suggested by Daniel Vetter. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: resolve conflicts with retirement defering and the lack of the autoreport head removal (that will go in through -fixes).] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-02-13drm/i915: enable forcewake voodoo also for gen6Daniel Vetter
We still have reports of missed irqs even on Sandybridge with the HWSTAM workaround in place. Testing by the bug reporter gets rid of them with the forcewake voodoo and no HWSTAM writes. Because I've slightly botched the rebasing I've left out the ACTHD readback which is also required to get IVB working. Seems to still work on the tester's machine, so I think we should go with the more minmal approach on SNB. Especially since I've only found weak evidence for holding forcewake while waiting for an interrupt to arrive, but none for the ACTHD readback. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45181 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45332 Tested-by: Nicolas Kalkhof nkalkhof()at()web.de Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-02-13drm/i915: fixup seqno allocation logic for lazy_requestDaniel Vetter
Currently we reserve seqnos only when we emit the request to the ring (by bumping dev_priv->next_seqno), but start using it much earlier for ring->oustanding_lazy_request. When 2 threads compete for the gpu and run on two different rings (e.g. ddx on blitter vs. compositor) hilarity ensued, especially when we get constantly interrupted while reserving buffers. Breakage seems to have been introduced in commit 6f392d548658a17600da7faaf8a5df25ee5f01f6 Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Sat Aug 7 11:01:22 2010 +0100 drm/i915: Use a common seqno for all rings. This patch fixes up the seqno reservation logic by moving it into i915_gem_next_request_seqno. The ring->add_request functions now superflously still return the new seqno through a pointer, that will be refactored in the next patch. Note that with this change we now unconditionally allocate a seqno, even when ->add_request might fail because the rings are full and the gpu died. But this does not open up a new can of worms because we can already leave behind an outstanding_request_seqno if e.g. the caller gets interrupted with a signal while stalling for the gpu in the eviciton paths. And with the bugfix we only ever have one seqno allocated per ring (and only that ring), so there are no ordering issues with multiple outstanding seqnos on the same ring. v2: Keep i915_gem_get_seqno (but move it to i915_gem.c) to make it clear that we only have one seqno counter for all rings. Suggested by Chris Wilson. v3: As suggested by Chris Wilson use i915_gem_next_request_seqno instead of ring->oustanding_lazy_request to make the follow-up refactoring more clearly correct. Also improve the commit message with issues discussed on irc. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45181 Tested-by: Nicolas Kalkhof nkalkhof()at()web.de Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-02-10Merge remote-tracking branch 'airlied/drm-fixes' into drm-intel-next-queuedDaniel Vetter
Back-merge from drm-fixes into drm-intel-next to sort out two things: - interlaced support: -fixes contains a bugfix to correctly clear interlaced configuration bits in case the bios sets up an interlaced mode and we want to set up the progressive mode (current kernels don't support interlaced). The actual feature work to support interlaced depends upon (and conflicts with) this bugfix. - forcewake voodoo to workaround missed IRQ issues: -fixes only enabled this for ivybridge, but some recent bug reports indicate that we need this on Sandybridge, too. But in a slightly different flavour and with other fixes and reworks on top. Additionally there are some forcewake cleanup patches heading to -next that would conflict with currrent -fixes. Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-01-29drm/i915/ringbuffer: kill snb blt workaroundDaniel Vetter
This was just to facilitate product enablement with pre-production hw. Allows us to kill quite a bit of cruft. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-01-29drm/i915: switch ring->id to be a real idDaniel Vetter
... and add a helpr function for the places where we want a flag. This way we can use ring->id to index into arrays. v2: Resurrect the missing beautification-space Chris Wilson noted. I'm moving this space around because I'll reuse ring_str in the next patch. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-01-25drm/i915: Remove the MI_FLUSH_ENABLE setting.Eric Anholt
We have always been using the wrong bit -- it's bit 12. However, the bit also doesn't do anything -- hardware has always accepted the MI_FLUSH command even when it was specced not to. Given that there is only one MI_FLUSH emitted in all of the driver stack on gen6+ (in i965_video.c of the 2d driver, and it should be using other code to do its flush instead), just remove the MI_FLUSH enable instead of trying to fix it. Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-01-20Revert "drm/i915: Work around gen7 BLT ring synchronization issues."Keith Packard
This reverts commit 42ff6572e5a4a7414330a4ca91f0335da67deca9. New forcewake voodoo makes this no longer necessary. Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Keith Packard <keithp@keithp.com>
2012-01-19drm/i915: paper over missed irq issues with force wake voodooDaniel Vetter
Two things seem to do the trick on my ivb machine here: - prevent the gt from powering down while waiting for seqno notification interrupts by grabbing the force_wake in get_irq (and dropping it in put_irq again). - ordering writes from the ring's CS by reading a CS register, ACTHD seems to work. Only the blt&bsd ring on ivb seem to be massively affected by this, but for paranoia do this dance also on the render ring and on snb (i.e. all gpus with forcewake). Tested with Eric's glCopyPixels loop which without this patch scores a missed irq every few seconds. This patch needs my forcewake rework to use a spinlock instead of dev->struct_mutex. After crawling through docs a lot I've found the following nugget: Internal doc "SNB GT PM Programming Guide", Section 4.3.1: "GT does not generate interrupts while in RC6 (by design)" So it looks like rc6 and irq generation are indeed related. v2: Improve the comment per Eugeni Dodonov's suggestion. v3: Add the documentation snipped. Also restrict the w/a to ivb only for -fixes, as suggested by Keith Packard. Cc: stable@kernel.org Cc: Eric Anholt <eric@anholt.net> Cc: Kenneth Graunke <kenneth@whitecape.org> Cc: Eugeni Dodonov <eugeni.dodonov@intel.com> Tested-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Keith Packard <keithp@keithp.com>
2012-01-03drm/i915: don't bail out of intel_wait_ring_buffer too earlyDaniel Vetter
In the pre-gem days with non-existing hangcheck and gpu reset code, this timeout of 3 seconds was pretty important to avoid stuck processes. But now we have the hangcheck code in gem that goes to great length to ensure that the gpu is really dead before declaring it wedged. So there's no need for this timeout anymore. Actually it's even harmful because we can bail out too early (e.g. with xscreensaver slip) when running giant batchbuffers. And our code isn't robust enough to properly unroll any state-changes, we pretty much rely on the gpu reset code cleaning up the mess (like cache tracking, fencing state, active list/request tracking, ...). With this change intel_begin_ring can only fail when the gpu is wedged, and it will return -EAGAIN (like wait_request in case the gpu reset is still outstanding). v2: Chris Wilson noted that on resume timers aren't running and hence we won't ever get kicked out of this loop by the hangcheck code. Use an insanely large timeout instead for the HAS_GEM case to prevent resume bugs from totally hanging the machine. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Signed-off-by: Keith Packard <keithp@keithp.com>
2012-01-03drm/i915: Work around gen7 BLT ring synchronization issues.Eric Anholt
Previous to this commit, testing easily reproduced a failure where the seqno would apparently arrive after the IRQ associated with it, with test programs as simple as: for (;;) { glCopyPixels(0, 0, 1, 1); glFinish(); } Various workarounds we've seen for previous generations didn't work to fix this issue, so until new information comes in, replace the IRQ waits on the BLT ring with polling. Signed-off-by: Eric Anholt <eric@anholt.net> Tested-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Keith Packard <keithp@keithp.com>
2012-01-03drm/i915: Force sync command ordering (Gen6+)Ben Widawsky
The docs say this is required for Gen7, and since the bit was added for Gen6, we are also setting it there pit pf paranoia. Particularly as Chris points out, if PIPE_CONTROL counts as a 3d state packet. This was found through doc inspection by Ken and applies to Gen6+; Reported-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Keith Packard <keithp@keithp.com>
2011-10-20drm/i915: Use PIPE_CONTROL for flushing on gen6+.Jesse Barnes
v2 by danvet: Use a new flag to flush the render target cache on gen6+ (hw reuses the old write flush bit), as suggested by Ben Widawsdy. Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> [danvet: this seems to fix cairo-perf-trace hangs on my snb] Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Keith Packard <keithp@keithp.com>