summaryrefslogtreecommitdiffstats
path: root/drivers/gpu/drm/i915/intel_ringbuffer.c
AgeCommit message (Collapse)Author
2014-09-16Merge tag 'drm-intel-next-2014-09-05' of ↵Dave Airlie
git://anongit.freedesktop.org/drm-intel into drm-next - final bits (again) for the rotation support (Sonika Jindal) - support bl_power in the intel backlight (Jani) - vdd handling improvements from Ville - i830M fixes from Ville - piles of prep work all over to make skl enabling just plug in (Damien, Sonika) - rename DP training defines to reflect latest edp standards, this touches all drm drivers supporting DP (Sonika Jindal) - cache edids during single detect cycle to avoid re-reading it for e.g. audio, from Chris - move w/a for registers which are stored in the hw context to the context init code (Arun&Damien) - edp panel power sequencer fixes, helps chv a lot (Ville) - piles of other chv fixes all over - much more paranoid pageflip handling with stall detection and better recovery from Chris - small things all over, as usual * tag 'drm-intel-next-2014-09-05' of git://anongit.freedesktop.org/drm-intel: (114 commits) drm/i915: Update DRIVER_DATE to 20140905 drm/i915: Decouple the stuck pageflip on modeset drm/i915: Check for a stalled page flip after each vblank drm/i915: Introduce a for_each_plane() macro drm/i915: Rewrite ABS_DIFF() in a safer manner drm/i915: Add comments explaining the vdd on/off functions drm/i915: Move DP port disable to post_disable for pch platforms drm/i915: Enable DP port earlier drm/i915: Turn on panel power before doing aux transfers drm/i915: Be more careful when picking the initial power sequencer pipe drm/i915: Reset power sequencer pipe tracking when disp2d is off drm/i915: Track which port is using which pipe's power sequencer drm/i915: Fix edp vdd locking drm/i915: Reset the HEAD pointer for the ring after writing START drm/i915: Fix unsafe vma iteration in i915_drop_caches drm/i915: init sprites with univeral plane init function drm/i915: Check of !HAS_PCH_SPLIT() in PCH transcoder funcs drm/i915: Use HAS_GMCH_DISPLAY un underrun reporting code drm/i915: Use IS_BROADWELL() instead of IS_GEN8() in forcewake code drm/i915: Don't call gen8_fbc_sw_flush() on chv ...
2014-09-16drm: backmerge tag 'v3.17-rc5' into drm-nextDave Airlie
This is requested to get the fixes for intel and radeon into the same tree for future development work. i915_display.c: fix missing dev_priv conflict.
2014-09-08drm/i915: Evict CS TLBs between batchesChris Wilson
Running igt, I was encountering the invalid TLB bug on my 845g, despite that it was using the CS workaround. Examining the w/a buffer in the error state, showed that the copy from the user batch into the workaround itself was suffering from the invalid TLB bug (the first cacheline was broken with the first two words reversed). Time to try a fresh approach. This extends the workaround to write into each page of our scratch buffer in order to overflow the TLB and evict the invalid entries. This could be refined to only do so after we update the GTT, but for simplicity, we do it before each batch. I suspect this supersedes our current workaround, but for safety keep doing both. v2: The magic number shall be 2. This doesn't conclusively prove that it is the mythical TLB bug we've been trying to workaround for so long, that it requires touching a number of pages to prevent the corruption indicates to me that it is TLB related, but the corruption (the reversed cacheline) is more subtle than a TLB bug, where we would expect it to read the wrong page entirely. Oh well, it prevents a reliable hang for me and so probably for others as well. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: stable@vger.kernel.org Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2014-09-04drm/i915: Reset the HEAD pointer for the ring after writing STARTChris Wilson
Ville found an old w/a documented for g4x that suggested that we need to reset the HEAD after writing START. This is a useful fixup for some of the g4x ring initialisation woes, but as usual, not all. v2: Do the rewrite unconditionally anyway References: https://bugs.freedesktop.org/show_bug.cgi?id=76554 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-09-03drm/i915: Remove unneeded bracketsDamien Lespiau
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Arun Siluvery <arun.siluvery@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-09-03drm/i915: Don't silently discard workaroundsDamien Lespiau
If we happen to emit more than I915_MAX_WA_REGS workarounds, we will currently discard them, not even emit the LRI. Not really what we want, so warn loudly. Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Arun Siluvery <arun.siluvery@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-09-03drm/i915: Don't overrun the intel_wa_regs arrayDamien Lespiau
When entering intel_ring_emit_wa() with num_wa_regs equal to I915_MAX_WA_REGS, we end up indexing the intel_wa_regs array beyond its allocation. Fix the check then. Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Arun Siluvery <arun.siluvery@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-09-03drm/i915: Init some CHV workarounds via LRIs in ring->init_context()Ville Syrjälä
Follow the BDW example and apply the workarounds touching registers which are saved in the context image through LRIs in the new ring->init_context() hook. This makes Mesa much happier and eg. glxgears doesn't hang after the first frame. Cc: Arun Siluvery <arun.siluvery@linux.intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> [danvet: Add missing wa table initialization to avoid a functional conflict with Arun's wa table debugfs support.] Reviewed-by: "Barbalho, Rafael" <rafael.barbalho@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-09-03drm/i915/bdw: Export workaround data to debugfsArun Siluvery
The workarounds that are applied are exported to a debugfs file; this is used to verify their state after the test case (reset or suspend/resume etc). This patch is only required to support i-g-t. Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-09-03drm/i915/bdw: Apply workarounds in render ring init functionArun Siluvery
For BDW workarounds are currently initialized in init_clock_gating() but they are lost during reset, suspend/resume etc; this patch moves the WAs that are part of register state context to render ring init fn otherwise default context ends up with incorrect values as they don't get initialized until init_clock_gating fn. v2: Add workarounds to golden render state This method has its own issues, first of all this is different for each gen and it is generated using a tool so adding new workaround and mainitaining them across gens is not a straightforward process. v3: Use LRIs to emit these workarounds (Ville) Instead of modifying the golden render state the same LRIs are emitted from within the driver. v4: Use abstract name when exporting gen specific routines (Chris) For: VIZ-4092 Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-09-03drm/i915: FBC flush nuke for BDWRodrigo Vivi
According to spec FBC on BDW and HSW are identical without any gaps. So let's copy the nuke and let FBC really start compressing stuff. Without this patch we can verify with false color that nothing is being compressed. With the nuke in place and false color it is possible to see false color debugs. Unfortunatelly on some rings like BCS on BDW we have to avoid Bits 22:18 on LRIs due to a high risk of hung. So, when using Blt ring for frontbuffer rend cache would never been cleaned and FBC would stop compressing buffer. One alternative is to cache clean on software frontbuffer tracking. v2: Fix rebase conflict. v3: Do not clean cache on BCS ring. Instead use sw frontbuffer tracking. Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-08-20drm/i915/bdw: Make sure gpu reset still works with ExeclistsOscar Mateo
If we reset a ring after a hang, we have to make sure that we clear out all queued Execlists requests. v2: The ring is, at this point, already being correctly re-programmed for Execlists, and the hangcheck counters cleared. v3: Daniel suggests to drop the "if (execlists)" because the Execlists queue should be empty in legacy mode (which is true, if we do the INIT_LIST_HEAD). v4: Do the pending intel_runtime_pm_put Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-08-13drm/i915: Fix up checks for aliasing ppgttDaniel Vetter
A subsequent patch will no longer initialize the aliasing ppgtt if we have full ppgtt enabled, since we simply don't need that any more. Unfortunately a few places check for the aliasing ppgtt instead of checking for ppgtt in general. Fix them up. One special case are the gtt offset and size macros, which have some code to remap the aliasing ppgtt to the global gtt. The aliasing ppgtt is _not_ a logical address space, so passing that in as the vm is plain and simple a bug. So just WARN about it and carry on - we have a gracefully fall-through anyway if we can't find the vma. Reviewed-by: Michel Thierry <michel.thierry@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-08-11drm/i915/bdw: GEN-specific logical ring emit flushOscar Mateo
Same as the legacy-style ring->flush. v2: The BSD invalidate bit still exists in GEN8! Add it for the VCS rings (but still consolidate the blt and bsd ring flushes into one). This was noticed by Brad Volkin. v3: The command for BSD and for other rings is slightly different: get it exactly the same as in gen6_ring_flush + gen6_bsd_ring_flush Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> [danvet: Checkpatch.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-08-11drm/i915/bdw: New logical ring submission mechanismOscar Mateo
Well, new-ish: if all this code looks familiar, that's because it's a clone of the existing submission mechanism (with some modifications here and there to adapt it to LRCs and Execlists). And why did we do this instead of reusing code, one might wonder? Well, there are some fears that the differences are big enough that they will end up breaking all platforms. Also, Execlists offer several advantages, like control over when the GPU is done with a given workload, that can help simplify the submission mechanism, no doubt. I am interested in getting Execlists to work first and foremost, but in the future this parallel submission mechanism will help us to fine tune the mechanism without affecting old gens. v2: Pass the ringbuffer only (whenever possible). Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> [danvet: Appease checkpatch. Again. And drop the legacy sarea gunk that somehow crept in.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-08-11drm/i915/bdw: GEN-specific logical ring initOscar Mateo
Logical rings do not need most of the initialization their legacy ringbuffer counterparts do: we just need the pipe control object for the render ring, enable Execlists on the hardware and a few workarounds. v2: Squash with: "drm/i915: Extract pipe control fini & make init outside accesible". Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> [danvet: Make checkpatch happy.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-08-11drm/i915/bdw: Generic logical ring init and cleanupOscar Mateo
Allocate and populate the default LRC for every ring, call gen-specific init/cleanup, init/fini the command parser and set the status page (now inside the LRC object). These are things all engines/rings have in common. Stopping the ring before cleanup and initializing the seqnos is left as a TODO task (we need more infrastructure in place before we can achieve this). v2: Check the ringbuffer backing obj for ring_is_initialized, instead of the context backing obj (similar, but not exactly the same). Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-08-11drm/i915/bdw: Add a context and an engine pointers to the ringbufferDaniel Vetter
Any given ringbuffer is unequivocally tied to one context and one engine. By setting the appropriate pointers to them, the ringbuffer struct holds all the infromation you might need to submit a workload for processing, Execlists style. v2: Drop ring->ctx since that looks terribly ill-defined for legacy ringbuffer submission. Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> (v1) Acked-by: Damien Lespiau <damien.lespiau@intel.com> (v2) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-08-11drm/i915/bdw: Allocate ringbuffers for Logical Ring ContextsOscar Mateo
As we have said a couple of times by now, logical ring contexts have their own ringbuffers: not only the backing pages, but the whole management struct. In a previous version of the series, this was achieved with two separate patches: drm/i915/bdw: Allocate ringbuffer backing objects for default global LRC drm/i915/bdw: Allocate ringbuffer for user-created LRCs Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-08-11drm/i915: Double check ring is idle before declaring the GPU wedgedChris Wilson
During ring initialisation, sometimes we observe, though not in production hardware, that the idle flag is not set even though the ring is empty. Double check before giving up. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-08-08drm/i915: No busy-loop wait_for in the ring init codeDaniel Vetter
Doing a 1s wait (tops) with the cpu is a bit excessive. Tune it down like everything else in that code. v2: Also insert the missing space Chris spotted. Cc: Naresh Kumar Kachhi <naresh.kumar.kachhi@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-08-08drm/i915: read HEAD register back in init_ring_common() to enforce orderingJiri Kosina
Withtout this, ring initialization fails reliabily during resume with [drm:init_ring_common] *ERROR* render ring initialization failed ctl 0001f001 head ffffff8804 tail 00000000 start 000e4000 This is not a complete fix, but it is verified to make the ring initialization failures during resume much less likely. We were not able to root-cause this bug (likely HW-specific to Gen4 chips) yet. This is therefore used as a ducttape before problem is fully understood and proper fix created, so that people don't suffer from completely unusable systems in the meantime. The discussion and debugging is happening at https://bugs.freedesktop.org/show_bug.cgi?id=76554 Signed-off-by: Jiri Kosina <jkosina@suse.cz> Cc: stable@vger.kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-08-07drm/i915: Add the WaCsStallBeforeStateCacheInvalidate:bdw workaround.Kenneth Graunke
On Broadwell, any PIPE_CONTROL with the "State Cache Invalidate" bit set must be preceded by a PIPE_CONTROL with the "CS Stall" bit set. Documented on the BSpec 3D workarounds page. Reviewed-by: Rafael Barbalho <rafael.barbalho@intel.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> [vsyrjala: add chv w/a note too] Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-08-07drm/i915: Refactor Broadwell PIPE_CONTROL emission into a helper.Kenneth Graunke
We'll want to reuse this for a workaround. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> [danvet: Rmove now unused int.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-07-23drm/i915: Use genX_ prefix for gt irq enable/disable functionsDaniel Vetter
Traditionally we use genX_ for GT/render stuff and the codenames for display stuff. But the gt and pm interrupt handling functions on gen5/6+ stuck out as exceptions, so convert them. Looking at the diff this nicely realigns our ducks since almost all the callers are already platform-specific functions following the genX_ pattern. Spotted while reviewing some internal rps patches. No function change in this patch. Acked-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-07-08drm/i915: HWS must be in the mappable region for g33Chris Wilson
On g33, the documentation states "HWS_PGA: Format = Bits 28:12 of graphics memory address (bits 31:29 MBZ)." which translates to that the address of the HWS must be below 256MiB, which is conveniently the mappable aperture. This also appears to be true (but not documented as so) for gen4 and gen5. To generalise we force it into the low mappable region for all non-LLC platforms. If we locate the HWS at the top of the GTT the machine will hard hang during boot (fails on pnv, gm45, ilk and byt, but works on snb, ivb, hsw). v2: Add comments to explain why use PIN_MAPPABLE even though we have no intention of mapping the object. (Ville) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-07-08drm/i915: Generalize ring_space to take a ringbufOscar Mateo
It's simple enough that it doesn't need to know anything about the engine. Trivial change. Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-07-08drm/i915: Extract ringbuffer destroy & generalize alloc to take a ringbufOscar Mateo
More prep work: with Execlists, we are going to start creating a lot of extra ringbuffers soon, so these functions are handy. No functional changes. v2: rename allocate/destroy_ring_buffer to alloc/destroy_ringbuffer_obj because the name is more meaningful and to mirror a similar function in the context world: i915_gem_alloc_context_obj(). Change suggested by Brad Volkin. Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-07-07drm/i915/bdw: poll semaphoresBen Widawsky
As Ville points out, it's possible/probable we don't actually need this. Potentially, this validates the letter of the spec, and not the spirit. Ville: > I discussed this on irc w/ Ben, and I was suggesting we don't need to > poll. Polling apparently can be used as a workaround for certain > hardware issues, but it looks like those issues shouldn't affect us, > for the momemnt at least. So my suggestion was to try w/o polling > first (since there could be some power cost to polling) and add the > poll bit if problems arise. Rodrigo: Spec suggests this as an W/A for GT3. However semaphores didn't worked in my BDW GT2 on Signal Mode. So pool mode is definitely needed. Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Tested-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-07-07drm/i915/bdw: implement semaphore waitBen Widawsky
Semaphore waits use a new instruction, MI_SEMAPHORE_WAIT. The seqno to wait on is all well defined by the table in the previous patch. There is nothing else different from previous GEN's semaphore synchronization code. v2: Update macros to not require the other ring's ring->id (Chris) v3: Add missing VCS2 gen8_ring_wait init besides s/ring_buffer/engine_cs (Rodrigo) Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-07-07drm/i915/bdw: implement semaphore signalBen Widawsky
Semaphore signalling works similarly to previous GENs with the exception that the per ring mailboxes no longer exist. Instead you must define your own space, somewhere in the GTT. The comments in the code define the layout I've opted for, which should be fairly future proof. Ie. I tried to define offsets in abstract terms (NUM_RINGS, seqno size, etc). NOTE: If one wanted to move this to the HWSP they could. I've decided one 4k object would be easier to deal with, and provide potential wins with cache locality, but that's all speculative. v2: Update the macro to not need the other ring's ring->id (Chris) Update the comment to use the correct formula (Chris) v3: Move the macros the ringbuffer.h to prevent churn in next patch (Ville) v4: Fixed compilation rebase conflict commit 1ec9e26ddab06459e89a890431b2de064c5d1056 Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Fri Feb 14 14:01:11 2014 +0100 drm/i915: Consolidate binding parameters into flags v5: VCS2 rebase Replace hweight_long with hweight32 v6 (Rodrigo): * Add missed VC2 gen8 ring signal init * fixing conflicst on rebase * minor fixes on address table * remove WARN_ON Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> [danvet: s/BUG_ON/WARN_ON/] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-07-07drm/i915: Make semaphore updates more preciseBen Widawsky
With the ring mask we now have an easy way to know the number of rings in the system, and therefore can accurately predict the number of dwords to emit for semaphore signalling. This was not possible (easily) previously. There should be no functional impact, simply fewer instructions emitted. While we're here, simply do the round up to 2 instead of the fancier rounding we did before, which rounding up per mbox, ie 4. This also allows us to drop the unnecessary MI_NOOP, so not really 4, 3. v2: Use 3 dwords instead of 4 (Ville) Do the proper calculation to get the number of dwords to emit (Ville) Conditionally set .sync_to when semaphores are enabled (Ville) v3: Rebased on VCS2 Replace hweight_long with hweight32 (Ville) v4: Pull out the accidentally squashed hunk from the next patch after rebase (Daniel). v5: Fix conflict after rebase (Rodrigo) Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> (v1) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-07-07drm/i915: gen specific ring initBen Widawsky
Gen8 has already had some differentiation with how it handles rings. Semaphores bring yet more differences, and now is as good a time as any to do the split. Also, since gen8 doesn't actually use semaphores up until this point, put the proper "NULL" values in for the mbox info. v2: v1 had a stale commit message v3: Move everything in the is_semaphore_enabled() check v4: VCS2 rebase Remove double assignment of signal in render ring (Ville) v5: Adding missed VCS2 signal init on gen8+ (Rodrigo) Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-07-07drm/i915: Fix VCS2's ring name.Rodrigo Vivi
It just fix a typo. v2: removing underscore to let this like all other ring names (Oscar) Cc: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by (v1): Ben Widawsky <benjamin.widawsky@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-06-19drivers/i915: Fix unnoticed failure of init_ring_common()Konrad Zapalowicz
This commit add check for return value of init_ring_common() in the init_render_ring(). Now, when failure is detected the error code is propagated to the caller instead of being ignored. Signed-off-by: Konrad Zapalowicz <bergo.torino@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-06-17drm/i915: Added write-enable pte bit supporttAkash Goel
This adds support for a write-enable bit in the entry of GTT. This is handled via a read-only flag in the GEM buffer object which is then used to see how to set the bit when writing the GTT entries. Currently by default the Batch buffer & Ring buffers are marked as read only. v2: Moved the pte override code for read-only bit to 'byt_pte_encode'. (Chris) Fixed the issue of leaving 'gt_old_ro' as unused. (Chris) v3: Removed the 'gt_old_ro' field, now setting RO bit only for Ring Buffers(Daniel). v4: Added a new 'flags' parameter to all the pte(gen6) encode & insert_entries functions, in lieu of overloading the cache_level enum (Daniel). v5: Removed the superfluous VLV check & changed the definition location of PTE_READ_ONLY flag (Imre) Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Akash Goel <akash.goel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-06-13drm/i915/bdw: Do not write the Semaphore Sync Registers in GEN8+Oscar Mateo
These do not exist anymore. Spotted while reading through intel_ringbuffer.c Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-06-05drm/i915: Don't WARN about ring idle bit on gen2Ville Syrjälä
Gen2 doesn't have the ring idle/stop bits in the SCPD/MI_MODE register, so don't go spewing warnings about the state of those bits. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-05-22drm/i915: Split the ringbuffers from the rings (3/3)Oscar Mateo
Manual cleanup after the previous Coccinelle script. Yes, I could write another Coccinelle script to do this but I don't want labor-replacing robots making an honest programmer's work obsolete (also, I'm lazy). Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-05-22drm/i915: Split the ringbuffers from the rings (2/3)Oscar Mateo
This refactoring has been performed using the following Coccinelle semantic script: @@ struct intel_engine_cs r; @@ ( - (r).obj + r.buffer->obj | - (r).virtual_start + r.buffer->virtual_start | - (r).head + r.buffer->head | - (r).tail + r.buffer->tail | - (r).space + r.buffer->space | - (r).size + r.buffer->size | - (r).effective_size + r.buffer->effective_size | - (r).last_retired_head + r.buffer->last_retired_head ) @@ struct intel_engine_cs *r; @@ ( - (r)->obj + r->buffer->obj | - (r)->virtual_start + r->buffer->virtual_start | - (r)->head + r->buffer->head | - (r)->tail + r->buffer->tail | - (r)->space + r->buffer->space | - (r)->size + r->buffer->size | - (r)->effective_size + r->buffer->effective_size | - (r)->last_retired_head + r->buffer->last_retired_head ) @@ expression E; @@ ( - LP_RING(E)->obj + LP_RING(E)->buffer->obj | - LP_RING(E)->virtual_start + LP_RING(E)->buffer->virtual_start | - LP_RING(E)->head + LP_RING(E)->buffer->head | - LP_RING(E)->tail + LP_RING(E)->buffer->tail | - LP_RING(E)->space + LP_RING(E)->buffer->space | - LP_RING(E)->size + LP_RING(E)->buffer->size | - LP_RING(E)->effective_size + LP_RING(E)->buffer->effective_size | - LP_RING(E)->last_retired_head + LP_RING(E)->buffer->last_retired_head ) Note: On top of this this patch also removes the now unused ringbuffer fields in intel_engine_cs. Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> [danvet: Add note about fixup patch included here.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-05-22drm/i915: Split the ringbuffers from the rings (1/3)Oscar Mateo
As advanced by the previous patch, the ringbuffers and the engine command streamers belong in different structs. This is so because, while they used to be tightly coupled together, the new Logical Ring Contexts (LRC for short) have a ringbuffer each. In legacy code, we will use the buffer* pointer inside each ring to get to the pertaining ringbuffer (the actual switch will be done in the next patch). In the new Execlists code, this pointer will be NULL and we will use instead the one inside the context instead. Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-05-22drm/i915: s/intel_ring_buffer/intel_engine_csOscar Mateo
In the upcoming patches we plan to break the correlation between engine command streamers (a.k.a. rings) and ringbuffers, so it makes sense to refactor the code and make the change obvious. No functional changes. Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-05-20drm/i915/chv: Add some workaround notesVille Syrjälä
We implement the following workarounds: * WaDisableAsyncFlipPerfMode:chv * WaProgramMiArbOnOffAroundMiSetContext:chv v2: Drop WaDisableSemaphoreAndSyncFlipWait note Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-05-15drm/i915: Bail out early on gen6_signal if no semaphoresMika Kuoppala
If we dont have semaphores enabled, we allocate 4 dwords for signalling. But end up emitting more regardless. Fix this by bailing out early if semaphores are not enabled. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78274 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78283 Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-05-13drm/i915: Ringbuffer signal func for the second BSD ringOscar Mateo
This is missing in: commit 78325f2d270897c9ee0887125b7abb963eb8efea Author: Ben Widawsky <benjamin.widawsky@intel.com> Date: Tue Apr 29 14:52:29 2014 -0700 drm/i915: Virtualize the ringbuffer signal func Looks to me like a rebase side-effect... Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-05-12drm/i915: Use hash tables for the command parserBrad Volkin
For clients that submit large batch buffers the command parser has a substantial impact on performance. On my HSW ULT system performance drops as much as ~20% on some tests. Most of the time is spent in the command lookup code. Converting that from the current naive search to a hash table lookup reduces the performance drop to ~10%. The choice of value for I915_CMD_HASH_ORDER allows all commands currently used in the parser tables to hash to their own bucket (except for one collision on the render ring). The tradeoff is that it wastes memory. Because the opcodes for the commands in the tables are not particularly well distributed, reducing the order still leaves many buckets empty. The increased collisions don't seem to have a huge impact on the performance gain, but for now anyhow, the parser trades memory for performance. NB: Ville noticed that the error paths through the ring init code will leak memory. I've not addressed that here. We can do a follow up pass to handle all of the leaks. v2: improved comment describing selection of hash key mask (Damien) replace a BUG_ON() with an error return (Tvrtko, Ville) commit message improvements Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-05-08drm/i915: Flush request queue when waiting for ring spaceChris Wilson
During the review of commit 1f70999f9052f5a1b0ce1a55aff3808f2ec9fe42 Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Mon Jan 27 22:43:07 2014 +0000 drm/i915: Prevent recursion by retiring requests when the ring is full Ville raised the point that our interaction with request->tail was likely to foul up other uses elsewhere (such as hang check comparing ACTHD against requests). However, we also need to restore the implicit retire requests that certain test cases depend upon (e.g. igt/gem_exec_lut_handle), this raises the spectre that the ppgtt will randomly call i915_gpu_idle() and recurse back into intel_ring_begin(). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78023 Reviewed-by: Brad Volkin <bradley.d.volkin@intel.com> [danvet: Remove now unused 'tail' variable as spotted by Brad.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-05-08drm/i915: Improve fallback ring waitingChris Wilson
A few improvements to the fallback method for waiting upon ring space: 1. Fix the start/end wait tracepoints to always be paired. 2. Increase responsiveness of checking 3. Mark the process as waiting upon io 4. Check for signal interruptions Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Brad Volkin <bradley.d.volkin@intel.com> [danvet: Drop the s/msleep/io_schedule_timeout/ change again since the latter isn't exported.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-05-05drm/i915: Support 64b execbufBen Widawsky
Previously, our code only had a 32b offset value for where the batchbuffer starts. With full PPGTT, and 64b canonical GPU address space, that is an insufficient value. The code to expand is pretty straight forward, and only one platform needs to do anything with the extra bits. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Rafael Barbalho <rafael.barbalho@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-05-05drm/i915: Move ring_begin to signal()Ben Widawsky
Add_request has always contained both the semaphore mailbox updates as well as the breadcrumb writes. Since the semaphore signal is the one which actually knows about the number of dwords it needs to emit to the ring, we move the ring_begin to that function. This allows us to remove the hideously shared #define On a related not, gen8 will use a different number of dwords for semaphores, but not for add request. v2: Make number of dwords an explicit part of signalling (via function argument). (Chris) v3: very slight comment change Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>