summaryrefslogtreecommitdiffstats
path: root/drivers/gpu/drm/i915/intel_ringbuffer.c
AgeCommit message (Collapse)Author
2014-03-28drm/i915: Remove the enabling of VS_TIMER_DISPATCH bit in MI MODE regAkash Goel
This patch Removes the VS_TIMER_DISPATCH bit enable in MI MODE reg for platforms > Gen6. VS_TIMER_DISPATCH bit enable was earlier required as a part of WA 'WaTimedSingleVertexDispatch', which is now applicable only to platforms < Gen7. v2: Enhancing the scope of the patch to full Gen7 (Chris) v3: Modifying the WA condition to the cover the applicable platforms, and adding the WA name in comments. (Ville) Signed-off-by: Akash Goel <akash.goel@intel.com> Signed-off-by: Sourab Gupta <sourab.gupta@intel.com> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> # ivb, hsw -Chris Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-13drm/i915/bdw: The TLB invalidation mechanism has been removed from INSTPMDamien Lespiau
While wandering in the spec, I noticed that BDW removes those 2 bits from INSTPM. I couldn't find any direct way to invalidate the TLB (ie without the ring working already). Maybe someone will be more lucky. At least, we now know we may be a problem. Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-12drm/i915: warn if ring is active before sync flushNaresh Kumar Kachhi
Based on Bspec the command parser must be stopped prior to issuing sync flush. This should be done by the caller of intel_ring_setup_status_page. Patch adds a warning if it is not done. v2: rebased based on new patch (wait for ring to become idle) Signed-off-by: Naresh Kumar Kachhi <naresh.kumar.kachhi@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-12drm/i915: wait for rings to become idle once disabledNaresh Kumar Kachhi
make sure we wait for rings to become idle once they are disabled. In case of timeout print an error message Signed-off-by: Naresh Kumar Kachhi <naresh.kumar.kachhi@intel.com> [danvet: Frob patch as suggested by Chris.] Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-12drm/i915: disable rings before HW status page setupNaresh Kumar Kachhi
Rings should be idle before issuing sync_flush (in intel_ring_setup_status_page). This patch moves the ring disabling before doing the HW status page setup. Signed-off-by: Naresh Kumar Kachhi <naresh.kumar.kachhi@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-10Merge tag 'v3.14-rc6' into drm-intel-next-queuedDaniel Vetter
Linux 3.14-rc6 I need the hdmi/dvi-dual link fixes in 3.14 to avoid ugly conflicts when merging Ville's new hdmi cloning support into my -next tree Conflicts: drivers/gpu/drm/i915/Makefile drivers/gpu/drm/i915/intel_dp.c Makefile cleanup conflicts with an acpi build fix, intel_dp.c is trivial. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-07drm/i915: Implement command buffer parsing logicBrad Volkin
The command parser scans batch buffers submitted via execbuffer ioctls before the driver submits them to hardware. At a high level, it looks for several things: 1) Commands which are explicitly defined as privileged or which should only be used by the kernel driver. The parser generally rejects such commands, with the provision that it may allow some from the drm master process. 2) Commands which access registers. To support correct/enhanced userspace functionality, particularly certain OpenGL extensions, the parser provides a whitelist of registers which userspace may safely access (for both normal and drm master processes). 3) Commands which access privileged memory (i.e. GGTT, HWS page, etc). The parser always rejects such commands. See the overview comment in the source for more details. This patch only implements the logic. Subsequent patches will build the tables that drive the parser. v2: Don't set the secure bit if the parser succeeds Fail harder during init Makefile cleanup Kerneldoc cleanup Clarify module param description Convert ints to bools in a few places Move client/subclient defs to i915_reg.h Remove the bits_count field OTC-Tracker: AXIA-4631 Change-Id: I50b98c71c6655893291c78a2d1b8954577b37a30 Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> [danvet: Appease checkpatch.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05drm/i915: We implement WaDisableAsyncFlipPerfMode:bdwVille Syrjälä
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-02-14drm/i915: Handle set_cache_level errors in the status page setupDaniel Vetter
Split out from Chris vma-bind rework. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-02-14drm/i915: Don't pin the status page as mappableDaniel Vetter
We access it through the cpu window. No functional difference expected atm since we default to a bottom-up allocation scheme. But that might eventually change so that we prefer the unmappable range for buffers that don't need cpu gtt access. Split out from Chris vma-bind rework. Note that this is only possible due to the split-up of the mappable pin flag into PIN_GLOBAL and PIN_MAPPABLE. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-02-14drm/i915: Don't set PIN_MAPPABLE for legacy ringbuffersDaniel Vetter
Tighter code since legacy gem has only mappable anyway. Split out from Chris vma-bind rework. Note that this is only possible due to the split-up of the mappable pin flag into PIN_GLOBAL and PIN_MAPPABLE. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-02-14drm/i915: Handle set_cache_level errors in the pipe control scratch setupDaniel Vetter
Split out from Chris vma-bind rework. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <benjamin.widawsky@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-02-14drm/i915: Consolidate binding parameters into flagsDaniel Vetter
Anything more than just one bool parameter is just a pain to read, symbolic constants are much better. Split out from Chris' vma-binding rework patch. v2: Undo the behaviour change in object_pin that Chris spotted. v3: Split out misplaced hunk to handle set_cache_level errors, spotted by Jani. v4: Keep the current over-zealous binding logic in the execbuffer code working with a quick hack while the overall binding code gets shuffled around. v5: Reorder the PIN_ flags for more natural patch splitup. v6: Pull out the PIN_GLOBAL split-up again. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-02-12drm/i915: protect ringbuffer sarea update behind !MODESETDaniel Vetter
Avoids surprises when userspace races open/closes against this. Cc: Stéphane Marchesin <marcheu@chromium.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-02-11drm/i915: Add intel_ring_cachline_align()Ville Syrjälä
intel_ring_cachline_align() emits MI_NOOPs until the ring tail is aligned to a cacheline boundary. Cc: Bjoern C <lkml@call-home.ch> Cc: Alexandru DAMIAN <alexandru.damian@intel.com> Cc: Enrico Tagliavini <enrico.tagliavini@gmail.com> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@vger.kernel.org (prereq for the next patch) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-02-06drm/i915: Prevent recursion by retiring requests when the ring is fullChris Wilson
As the VM do not track activity of objects and instead use a large hammer to forcibly idle and evict all of their associated objects when one is released, it is possible for that to cause a recursion when we need to wait for free space on a ring and call retire requests. (intel_ring_begin -> intel_ring_wait_request -> i915_gem_retire_requests_ring -> i915_gem_context_free -> i915_gem_evict_vm -> i915_gpu_idle -> intel_ring_begin etc) In order to remove the requirement for calling retire-requests from intel_ring_wait_request, we have to inline a couple of steps from retiring requests, notably we have to record the position of the request we wait for and use that to update the available ring space. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-01-25Merge branch 'topic/ppgtt' into drm-intel-next-queuedDaniel Vetter
Because whatever.* * This should contain a fairly long list of issues and still unresolved resgressions, but I didn't really get a vote. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-01-22drm/i915: Fix disabled semaphoresBen Widawsky
The ring will emit too many if semaphores are disabled since we do not add the correct number to num_dwords anymore. This was introduced: commit 52ed23253b68e1cf154b03d91bed619504cf955b Author: Ben Widawsky <benjamin.widawsky@intel.com> Date: Mon Dec 16 20:50:38 2013 -0800 drm/i915: Don't emit mbox updates without semaphores FWIW, the bug was fixed later in the series. /me hangs head in shame. Daniel: Also note that we should have merged the read-only semaphore modparam before this patch. Reported-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-01-07drm/i915: Flush outstanding requests before allocating new seqnoChris Wilson
In very rare cases (such as a memory failure stress test) it is possible to fill the entire ring without emitting a request. Under this circumstance, the outstanding request is flushed and waited upon. After space on the ring is cleared, we return to emitting the new command - except that we just cleared the seqno allocated for this operation and trigger the sanity check that a request is only ever emitted with a valid seqno. The fix is to rearrange the code to make sure the allocation of the seqno for this operation is after any required flushes of outstanding operations. The bug exists since the preallocation was introduced in commit 9d7730914f4cd496e356acfab95b41075aa8eae8 Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Tue Nov 27 16:22:52 2012 +0000 drm/i915: Preallocate next seqno before touching the ring Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: stable@vger.kernel.org Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18drm/i915: Make pin count per VMABen Widawsky
Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-17drm/i915: Don't emit mbox updates without semaphoresBen Widawsky
Aside from the fact that it leaves confusing dumps on error capture, it is entirely unnecessary, and potentially harmful in cases like BDW, where the instruction has changed. In reality (seemingly), this will have no behavioral impact. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-28drm/i915: Add power well arguments to force wake routines.Deepak S
Added power well arguments to all the force wake routines to help us individually control power well based on the scenario. Signed-off-by: Deepak S <deepak.s@intel.com> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> [danvet: Resolve conflict with the removed forcewake hack and drop one spurious hunk Jesse noticed.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-26drm/i915: Drop forcewake w/a for missed interrupts/seqno on SandybridgeChris Wilson
I believe, and an evening of i-g-t, that our original workaround for the missed interrupts on Sandybridge, that of holding forcewake whilst we wait for an interrupts, is no longer required. This leaves us dependent on the second workaround of forcing an UC read of the ACTHD before reading back the seqno from the snooped HWS. Dropping the forcewake should allow us to conserve a little power, not much as the GPU is meant to be busy whilst we wait for it! Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-21drm/i915: Emit SRM after the MSG_FBC_REND_STATE LRIVille Syrjälä
The spec tells us that we need to emit an SRM after the LRI to MSG_FBC_REND_STATE. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-21drm/i915: Limit FBC flush to post batch flushVille Syrjälä
Don't issue the FBC nuke/cache clean command when invalidate_domains!=0. That would indicate that we're not being called for the post-batch flush. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-14drm/i915/bdw: Add comment about gen8 HWS PGABen Widawsky
This confused me some many times that I think it is appropriate to add a small comment to instruct the reader of the code that it is indeed doing what it is supposed to do. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08drm/i915/bdw: Take render error interrupt out of the maskDaniel Vetter
The handling of the error interrupts isn't wired up at all. And it hasn't been ever since ilk happened, so don't bother. Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08drm/i915/bdw: BSD init for gen8 alsoBen Widawsky
This was an oversight and should have been in a previous series somewhere. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08drm/i915/bdw: Render ring flushingBen Widawsky
PIPE_CONTROL added the high address dword. I'm not sure how the simulator let me get away with this. I've explicitly left out all the workarounds from Gen7 because in the minimal digging that I did, most don't seem necessary, and the simulator doesn't complain without them Note that BLT and BSD ring commands had already been updated previously. Just render/pipe_control should have been broken. v2: Squash in a fixup from Ville to follow the recent IVB PIPE_CONTROL updates: "BDW uses the IVB PIPE_CONTROL style for specifying GTT vs. PPGTT for the PIPE_CONTROL QW/DW write." v3: Rebase on top of Chris' cleanup to have an explicit ring->scratch buffer object instead of an opaque ring->private where everyone stores the same stuff inside. Reported-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> (for the fixup) Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08drm/i915/bdw: unleash PPGTTBen Widawsky
v2: Squash in fix from Ben: Set PPGTT batches as necessary This fixes the regression in the last couple of days when we enabled PPGTT. v3: Squash in fixup to still use GTT for secure batches from Ville: BDW doesn't have a separate secure vs. non-secure bit in MI_BATCH_BUFFER_START. So for secure batches we have to simply leave the PPGTT bit unset. Fortunately older generations (except HSW) had similar limitations so execbuffer already creates a GTT mapping for all secure batches. Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08drm/i915/bdw: Update MI_FLUSH_DWBen Widawsky
The code is more verbose than necessary for the reader's sake, hopefully the compiler optimizes away the if. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08drm/i915/bdw: dispatch updates (64b related)Ben Widawsky
The command to emit batch buffers has changed to address 48b addresses. It seemed reasonable that we could still use the old instruction where emitting 0 for length would do the right thing, but it seems to bother the simulator when the code does that. Now the second dword in the command has the upper 16b of the address of the batchbuffer. v2: Remove duplicated vfun assignment. v3: Squash in VECS support changes from Zhao Yakui <yakui.zhao@intel.com> v4: Make checkpatch happy. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v2) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08drm/i915/bdw: Implement interrupt changesBen Widawsky
The interrupt handling implementation remains the same as previous generations with the 4 types of registers, status, identity, mask, and enable. However the layout of where the bits go have changed entirely. To address these changes, all of the interrupt vfuncs needed special gen8 code. The way it works is there is a top level status register now which informs the interrupt service routine which unit caused the interrupt, and therefore which interrupt registers to read to process the interrupt. For display the division is quite logical, a set of interrupt registers for each pipe, and in addition to those, a set each for "misc" and port. For GT the things get a bit hairy, as seen by the code. Each of the GT units has it's own bits defined. They all look *very similar* and resides in 16 bits of a GT register. As an example, RCS and BCS share register 0. To compact the code a bit, at a slight expense to complexity, this is exactly how the code works as well. 2 structures are added to the ring buffer so that our ring buffer interrupt handling code knows which ring shares the interrupt registers, and a shift value (ie. the top or bottom 16 bits of the register). The above allows us to kept the interrupt register caching scheme, the per interrupt enables, and the code to mask and unmask interrupts relatively clean (again at the cost of some more complexity). Most of the GT units mentioned above are command streamers, and so the symmetry should work quite well for even the yet to be implemented rings which Broadwell adds. v2: Fixes up a couple of bugs, and is more verbose about errors in the Broadwell interrupt handler. v3: fix DE_MISC IER offset v4: Simplify interrupts: I totally misread the docs the first time I implemented interrupts, and so this should greatly simplify the mess. Unlike GEN6, we never touch the regular mask registers in irq_get/put. v5: Rebased on to of recent pch hotplug setup changes. v6: Fixup on top of moving num_pipes to intel_info. v7: Rebased on top of Egbert Eich's hpd irq handling rework. Also wired up ibx_hpd_irq_setup for gen8. v8: Rebase on top of Jani's asle handling rework. v9: Rebase on top of Ben's VECS enabling for Haswell, where he unfortunately went OCD on the gt irq #defines. Not that they're still not yet fully consistent: - Used the GT_RENDER_ #defines + bdw shifts. - Dropped the shift from the L3_PARITY stuff, seemed clearer. - s/irq_refcount/irq_refcount.gt/ v10: Squash in VECS enabling patches and the gen8_gt_irq_handler refactoring from Zhao Yakui <yakui.zhao@intel.com> v11: Rebase on top of the interrupt cleanups in upstream. v12: Rebase on top of Ben's DPF changes in upstream. v13: Drop bdw from the HAS_L3_DPF feature flag for now, it's unclear what exactly needs to be done. Requested by Ben. v14: Fix the patch. - Drop the mask of reserved bits and assorted logic, it doesn't match the spec. - Do the posting read inconditionally instead of commenting it out. - Add a GEN8_MASTER_IRQ_CONTROL definition and use it. - Fix up the GEN8_PIPE interrupt defines and give the GEN8_ prefixes - we actually will need to use them. - Enclose macros in do {} while (0) (checkpatch). - Clear DE_MISC interrupt bits only after having processed them. - Fix whitespace fail (checkpatch). - Fix overtly long lines where appropriate (checkpatch). - Don't use typedef'ed private_t (maintainer-scripts). - Align the function parameter list correctly. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v4) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> bikeshed
2013-10-16drm/i915: Do a fuller init after resetBen Widawsky
I had this lying around from he original PPGTT series, and thought we might try to get it in by itself. It's convenient to just call i915_gem_init_hw at reset because we'll be adding new things to that function, and having just one function to call instead of reimplementing it in two places is nice. In order to accommodate we cleanup ringbuffers in order to bring them back up cleanly. Optionally, we could also teardown/re initialize the default context but this was causing some problems on reset which I wasn't able to fully debug, and is unnecessary with the previous context init/enable split. This essentially reverts: commit 8e88a2bd5987178d16d53686197404e149e996d9 Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Tue Jun 19 18:40:00 2012 +0200 drm/i915: don't call modeset_init_hw in i915_reset It seems to work for me on ILK now. Perhaps it's due to: commit 8a5c2ae753c588bcb2a4e38d1c6a39865dbf1ff3 Author: Jesse Barnes <jbarnes@virtuousgeek.org> Date: Thu Mar 28 13:57:19 2013 -0700 drm/i915: fix ILK GPU reset for render Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-10-10drm/i915: Remove gen specific checks in MMIOBen Widawsky
Now that MMIO has been split up into gen specific functions it is obvious when HAS_FPGA_DBG_UNCLAIMED, HAS_FORCE_WAKE are needed. As such, we can remove this extraneous condition. As a result of this, as well as previously existing function pointers for forcewake, we no longer need the has_force_wake member in the device specific data structure. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-09-19drm/i915: s/HAS_L3_GPU_CACHE/HAS_L3_DPFBen Widawsky
We'd only ever used this define to denote whether or not we have the dynamic parity feature (DPF) and never to determine whether or not L3 exists. Baytrail is a good example of where L3 exists, and not DPF. This patch provides clarify in the code for future use cases which might want to actually query whether or not L3 exists. v2: Add /* DPF == dynamic parity feature */ Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-09-19drm/i915: Add second slice l3 remappingBen Widawsky
Certain HSW SKUs have a second bank of L3. This L3 remapping has a separate register set, and interrupt from the first "slice". A slice is simply a term to define some subset of the GPU's l3 cache. This patch implements both the interrupt handler, and ability to communicate with userspace about this second slice. v2: Remove redundant check about non-existent slice. Change warning about interrupts of unknown slices to WARN_ON_ONCE Handle the case where we get 2 slice interrupts concurrently, and switch the tracking of interrupts to be non-destructive (all Ville) Don't enable/mask the second slice parity interrupt for ivb/vlv (even though all docs I can find claim it's rsvd) (Ville + Bryan) Keep BYT excluded from L3 parity v3: Fix the slice = ffs to be decremented by one (found by Ville). When I initially did my testing on the series, I was using 1-based slice counting, so this code was correct. Not sure why my simpler tests that I've been running since then didn't pick it up sooner. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-09-10drm/i915: Write RING_TAIL once per-requestChris Wilson
Ignoring the legacy DRI1 code, and a couple of special cases (to be discussed later), all access to the ring is mediated through requests. The first write to a ring will grab a seqno and mark the ring as having an outstanding_lazy_request. Either through explicitly adding a request after an execbuffer or through an implicit wait (either by the CPU or by a semaphore), that sequence of writes will be terminated with a request. So we can ellide all the intervening writes to the tail register and send the entire command stream to the GPU at once. This will reduce the number of *serialising* writes to the tail register by a factor or 3-5 times (depending upon architecture and number of workarounds, context switches, etc involved). This becomes even more noticeable when the register write is overloaded with a number of debugging tools. The astute reader will wonder if it is then possible to overflow the ring with a single command. It is not. When we start a command sequence to the ring, we check for available space and issue a wait in case we have not. The ring wait will in this case be forced to flush the outstanding register write and then poll the ACTHD for sufficient space to continue. The exception to the rule where everything is inside a request are a few initialisation cases where we may want to write GPU commands via the CS before userspace wakes up and page flips. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-09-05drm/i915; Preallocate the lazy requestChris Wilson
It is possible for us to be forced to perform an allocation for the lazy request whilst running the shrinker. This allocation may fail, leaving us unable to reclaim any memory leading to premature OOM. A neat solution to the problem is to preallocate the request at the same time as acquiring the seqno for the ring transaction. This means that we can report ENOMEM prior to touching the rings. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-09-05drm/i915: Rename ring->outstanding_lazy_requestChris Wilson
Prior to preallocating an request for lazy emission, rename the existing field to make way (and differentiate the seqno from the request struct). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-09-03drm/i915: Embed the ring->private within the struct intel_ring_bufferChris Wilson
We now have more devices using ring->private than not, and they all want the same structure. Worse, I would like to use a scratch page from outside of intel_ringbuffer.c and so for convenience would like to reuse ring->private. Embed the object into the struct intel_ringbuffer so that we can keep the code clean. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-09-02Merge branch 'drm-next-3.12' of git://people.freedesktop.org/~agd5f/linux ↵Dave Airlie
into drm-next Alex writes: This is the radeon drm-next request. Big changes include: - support for dpm on CIK parts - support for ASPM on CIK parts - support for berlin GPUs - major ring handling cleanup - remove the old 3D blit code for bo moves in favor of CP DMA or sDMA - lots of bug fixes [airlied: fix up a bunch of conflicts from drm_order removal] * 'drm-next-3.12' of git://people.freedesktop.org/~agd5f/linux: (898 commits) drm/radeon/dpm: make sure dc performance level limits are valid (CI) drm/radeon/dpm: make sure dc performance level limits are valid (BTC-SI) (v2) drm/radeon: gcc fixes for extended dpm tables drm/radeon: gcc fixes for kb/kv dpm drm/radeon: gcc fixes for ci dpm drm/radeon: gcc fixes for si dpm drm/radeon: gcc fixes for ni dpm drm/radeon: gcc fixes for trinity dpm drm/radeon: gcc fixes for sumo dpm drm/radeonn: gcc fixes for rv7xx/eg/btc dpm drm/radeon: gcc fixes for rv6xx dpm drm/radeon: gcc fixes for radeon_atombios.c drm/radeon: enable UVD interrupts on CIK drm/radeon: fix init ordering for r600+ drm/radeon/dpm: only need to reprogram uvd if uvd pg is enabled drm/radeon: check the return value of uvd_v1_0_start in uvd_v1_0_init drm/radeon: split out radeon_uvd_resume from uvd_v4_2_resume radeon kms: fix uninitialised hotplug work usage in r100_irq_process() drm/radeon/audio: set up the sads on DCE3.2 asics drm/radeon: fix handling of variable sized arrays for router objects ... Conflicts: drivers/gpu/drm/i915/i915_dma.c drivers/gpu/drm/i915/i915_gem_dmabuf.c drivers/gpu/drm/i915/intel_pm.c drivers/gpu/drm/radeon/cik.c drivers/gpu/drm/radeon/ni.c drivers/gpu/drm/radeon/r600.c
2013-08-23drm/i915: wrap GEN6_PMIMR changesPaulo Zanoni
Just like we're doing with the other IMR changes. One of the functional changes is that not every caller was doing the POSTING_READ. Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-23drm/i915: wrap GTIMR changesPaulo Zanoni
Just like the functions that touch DEIMR and SDEIMR, but for GTIMR. The new functions contain a POSTING_READ(GTIMR) which was not present at the 2 callers inside i915_irq.c. The implementation is based on ibx_display_interrupt_update. Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-22drm/i915: Initialize seqno for VECS tooBen Widawsky
We require n-1 mailboxes for proper semaphore synchronization. All semaphore synchronization code relies on proper values in these mailboxes. The fact that we failed to touch the vebox ring by itself was unlikely to be an issue since the HW should be initializing the values to 0. However the error framework for testing seqno wrap introduced by Mika, in addition to the hangcheck via seqno, and i915_error_first_batchbuffer() combined caused a nice explosion. The problem is caused by seqno wrap because the wrap condition is not properly setup. The wrap code attempts to set the sync mailboxes all to 0, and then set the current seqno to one less than 0. In all cases, the vebox mailbox wasn't properly being initialized. This caused a wrap to not occur. When hangcheck kicks in with the bogus seqno values, the rest just doesn't work. It makes me wonder if we shouldn't consider a dumber version of hangcheck... How we messed this up: VECS support was written before the aforementioned other features. Upon VECS being rebased, these facts were missed. Cc: Mika Kuoppala <mika.kuoppala@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65387 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67198 Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-18drm/i915: Invalidate TLBs for the rings after a resetChris Wilson
After any "soft gfx reset" we must manually invalidate the TLBs associated with each ring. Empirically, it seems that a suspend/resume or D3-D0 cycle count as a "soft reset". The symptom is that the hardware would fail to note the new address for its status page, and so it would continue to write the shadow registers and breadcrumbs into the old physical address (now used by something completely different, scary). Whereas the driver would read the new status page and never see any progress, it would appear that the GPU hung immediately upon resume. Based on a patch by naresh kumar kachhi <naresh.kumar.kacchi@intel.com> Reported-by: Thiago Macieira <thiago@kde.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64725 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Thiago Macieira <thiago@kde.org> Cc: stable@vger.kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-05drm/i915: Add VM to pinBen Widawsky
To verbalize it, one can say, "pin an object into the given address space." The semantics of pinning remain the same otherwise. Certain objects will always have to be bound into the global GTT. Therefore, global GTT is a special case, and keep a special interface around for it (i915_gem_obj_ggtt_pin). v2: s/i915_gem_ggtt_pin/i915_gem_obj_ggtt_pin Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-25Merge commit 'Merge branch 'drm-fixes' of ↵Daniel Vetter
git://people.freedesktop.org/~airlied/linux' This backmerges Linus' merge commit of the latest drm-fixes pull: commit 549f3a1218ba18fcde11ef0e22b07e6365645788 Merge: 42577ca 058ca4a Author: Linus Torvalds <torvalds@linux-foundation.org> Date: Tue Jul 23 15:47:08 2013 -0700 Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux We've accrued a few too many conflicts, but the real reason is that I want to merge the 100% solution for Haswell concurrent registers writes into drm-intel-next. But that depends upon the 90% bandaid merged into -fixes: commit a7cd1b8fea2f341b626b255d9898a5ca5fabbf0a Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Fri Jul 19 20:36:51 2013 +0100 drm/i915: Serialize almost all register access Also, we can roll up on accrued conflicts. Usually I'd backmerge a tagged -rc, but I want to get this done before heading off to vacations next week ;-) Conflicts: drivers/gpu/drm/i915/i915_dma.c drivers/gpu/drm/i915/i915_gem.c v2: For added hilarity we have a init sequence conflict around the gt_lock, so need to move that one, too. Spotted by Jani Nikula. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-11drm/i915: don't enable PM_VEBOX_CS_ERROR_INTERRUPTDaniel Vetter
The code to handle it is broken - there's simply no code to clear CS parser errors on gen5+. And behold, for all the other rings we also don't enable it! Leave the handling code itself in place just to be consistent with the existing mess though. And in case someone feels like fixing it all up. This has been errornously enabled in commit 12638c57f31952127c734c26315e1348fa1334c2 Author: Ben Widawsky <ben@bwidawsk.net> Date: Tue May 28 19:22:31 2013 -0700 drm/i915: Enable vebox interrupts Cc: Damien Lespiau <damien.lespiau@intel.com> Cc: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-11drm/i915: unify ring irq refcounts (again)Daniel Vetter
With the simplified locking there's no reason any more to keep the refcounts seperate. v2: Readd the lost comment that ring->irq_refcount is protected by dev_priv->irq_lock. Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>