summaryrefslogtreecommitdiffstats
path: root/drivers/gpu
AgeCommit message (Collapse)Author
2015-01-13drm/i915: Improve HiZ throughput on Cherryview.Kenneth Graunke
Found by reading the HIZ_CHICKEN documentation. Improves performance in a HiZ microbenchmark by around 50%. Improves performance in OglZBuffer by around 18%. Thanks to Chris Wilson for helping me figure out where to put this. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-01-13drm/i915: Reset CSB read pointer in ring initThomas Daniel
A previous commit enabled execlists by default: commit 27401d126b5b ("drm/i915/bdw: Enable execlists by default where supported") This allowed routine testing of execlists which exposed a regression when resuming from suspend. The cause was tracked down the to recent changes to the ring init sequence: commit 35a57ffbb108 ("drm/i915: Only init engines once") During a suspend/resume cycle the hardware Context Status Buffer write pointer is reset. However since the recent changes to the init sequence the software CSB read pointer is no longer reset. This means that context status events are not handled correctly and new contexts are not written to the ELSP, resulting in an apparent GPU hang. Pending further changes to the ring init code, just move the ring->next_context_status_buffer initialization into gen8_init_common_ring to fix this regression. v2: Moved init into gen8_init_common_ring rather than context_enable after feedback from Daniel Vetter. Updated commit msg to reflect this and also cite commits related to the regression. Fixed bz link to correct bug. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88096 Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Dave Gordon <david.s.gordon@intel.com> Signed-off-by: Thomas Daniel <thomas.daniel@intel.com> Reviewed-by: Dave Gordon <david.s.gordon@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-01-12drm/i915: Drop unused position fields (v2)Matt Roper
The userspace-requested plane coordinates are now always available via plane->state.base (and the i915-adjusted values are stored in plane->state), so we no longer use the coordinate fields in intel_plane and can drop them. Also, note that the error case for pageflip calls update_plane() to program the values from plane->state; it's simpler to just call intel_plane_restore() which does the same thing. v2: Replace manual update_plane() with intel_plane_restore() in pageflip error handler. Reviewed-by(v1): Bob Paauwe <bob.j.paauwe@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-01-12drm/i915: Move to atomic plane helpers (v9)Matt Roper
Switch plane handling to use the atomic plane helpers. This means that rather than provide our own implementations of .update_plane() and .disable_plane(), we expose the lower-level check/prepare/commit/cleanup entrypoints and let the DRM core implement update/disable for us using those entrypoints. The other main change that falls out of this patch is that our drm_plane's will now always have a valid plane->state that contains the relevant plane state (initial state is allocated at plane creation). The base drm_plane_state pointed to holds the requested source/dest coordinates, and the subclassed intel_plane_state holds the adjusted values that our driver actually uses. v2: - Renamed file from intel_atomic.c to intel_atomic_plane.c (Daniel) - Fix a copy/paste comment mistake (Bob) v3: - Use prepare/cleanup functions that we've already factored out - Use newly refactored pre_commit/commit/post_commit to avoid sleeping during vblank evasion v4: - Rebase to latest di-nightly requires adding an 'old_state' parameter to atomic_update; v5: - Must have botched a rebase somewhere and lost some work. Restore state 'dirty' flag to let begin/end code know which planes to run the pre_commit/post_commit hooks for. This would have actually shown up as broken in the next commit rather than this one. v6: - Squash kerneldoc patch into this one. - Previous patches have now already taken care of most of the infrastructure that used to be in this patch. All we're adding here now is some thin wrappers. v7: - Check return of intel_plane_duplicate_state() for allocation failures. v8: - Drop unused drm_plane_state -> intel_plane_state cast. (Ander) - Squash in actual transition to plane helpers. Significant refactoring earlier in the patchset has made the combined prep+transition much easier to swallow than it was in earlier iterations. (Ander) v9: - s/track_fbs/disabled_planes/ in the atomic crtc flags. The only fb's we need to update frontbuffer tracking for are those on a plane about to be disabled (since the atomic helpers never call prepare_fb() when disabling a plane), so the new name more accurately describes what we're actually tracking. Testcase: igt/kms_plane Testcase: igt/kms_universal_plane Testcase: igt/kms_cursor_crc Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-01-12drm/i915: Clarify sprite plane function names (v4)Matt Roper
A few of the sprite-related function names in i915 are very similar (e.g., intel_enable_planes() vs intel_crtc_enable_planes()) and don't make it clear whether they only operate on sprite planes, or whether they also apply to all universal plane types. Rename a few functions to be more consistent with our function naming for primary/cursor planes or to clarify that they apply specifically to sprite planes: - s/intel_disable_planes/intel_disable_sprite_planes/ - s/intel_enable_planes/intel_enable_sprite_planes/ Also, drop the sprite-specific intel_destroy_plane() and just use the type-agnostic intel_plane_destroy() function. The extra 'disable' call that intel_destroy_plane() did is unnecessary since the plane will already be disabled due to framebuffer destruction by the point it gets called. v2: Earlier consolidation patches have reduced the number of functions we need to rename here. v3: Also rename intel_plane_funcs vtable to intel_sprite_plane_funcs for consistency with primary/cursor. (Ander) v4: Convert comment for intel_plane_destroy() to kerneldoc now that it is no longer a static function. (Ander) Reviewed-by(v1): Bob Paauwe <bob.j.paauwe@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-01-12drm/i915: Move vblank evasion to commit (v4)Matt Roper
Move the vblank evasion up from the low-level, hw-specific update_plane() handlers to the general plane commit operation. Everything inside commit should now be non-sleeping, so this brings us closer to how vblank evasion will behave once we move over to atomic. v2: - Restore lost intel_crtc->active check on vblank evasion v3: - Replace assert_pipe_enabled() in intel_disable_primary_hw_plane() with an intel_crtc->active test; it turns out assert_pipe_enabled() grabs some mutexes and can sleep, which we can't do with interrupts disabled. v4: - Equivalent to v2; v3 change is now squashed into an earlier patch of the series. (Ander). Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-01-12drm/i915: Refactor work that can sleep out of commit (v7)Matt Roper
Once we integrate our work into the atomic pipeline, plane commit operations will need to happen with interrupts disabled, due to vblank evasion. Our commit functions today include sleepable work, so those operations need to be split out and run either before or after the atomic register programming. The solution here calculates which of those operations will need to be performed during the 'check' phase and sets flags in an intel_crtc sub-struct. New intel_begin_crtc_commit() and intel_finish_crtc_commit() functions are added before and after the actual register programming; these will eventually be called from the atomic plane helper's .atomic_begin() and .atomic_end() entrypoints. v2: Fix broken sprite code split v3: Make the pre/post commit work crtc-based to match how we eventually want this to be called from the atomic plane helpers. v4: Some platforms that haven't had their watermark code reworked were waiting for vblank, then calling update_sprite_watermarks in their platform-specific disable code. These also need to be flagged out of the critical section. v5: Sprite plane test for primary show/hide should just set the flag to wait for pending flips, not actually perform the wait. (Ander) v6: - Rebase onto latest di-nightly; picks up an important runtime PM fix. - Handle 'wait_for_flips' flag in intel_begin_crtc_commit(). (Ander) - Use wait_for_flips flag for primary plane update rather than performing the wait in the check routine. - Added kerneldoc to pre_disable/post_enable functions that are no longer static. (Ander) - Replace assert_pipe_enabled() in intel_disable_primary_hw_plane() with an intel_crtc->active test; it turns out assert_pipe_enabled() grabs some mutexes and can sleep, which we can't do with interrupts disabled. v7: - Check for fb != NULL when deciding whether the sprite plane hides the primary plane during a sprite update. (PRTS) Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-01-12drm/i915: fix build for CONFIG_BUG=nJani Nikula
If CONFIG_BUG=n __WARN_printf won't be defined leading to the below build failure. The double underscores should have told us to steer clear of it anyway. drivers/gpu/drm/i915/intel_display.c: In function ‘assert_pll’: drivers/gpu/drm/i915/intel_display.c:1027:2: error: implicit declaration of function ‘__WARN_printf’ [-Werror=implicit-function-declaration] I915_STATE_WARN(cur_state != state, Use WARN(1, ...) instead. It handles CONFIG_BUG=n gracefully and, with the constant condition, a sane compiler should reduce it to __WARN_printf. This is a regression introduced by commit e2c719b75c8c186deb86570d8466df9e9eff919b Author: Rob Clark <robdclark@gmail.com> Date: Mon Dec 15 13:56:32 2014 -0500 drm/i915: tame the chattermouth (v2) Reported-by: Jim Davis <jim.epost@gmail.com> Reference: http://mid.gmane.org/CA+r1ZhgHTi7bS2irhtuSUs9aO=Br1dumN8=oAOeaMJDZ_ZhwBw@mail.gmail.com Cc: Rob Clark <robdclark@gmail.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-01-12drm/radeon: add si dpm quirk listAlex Deucher
This adds a quirks list to fix stability problems with certain SI boards. bug: https://bugs.freedesktop.org/show_bug.cgi?id=76490 Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2015-01-12drm/radeon: don't print error on -ERESTARTSYSChristian König
Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-12Merge tag 'topic/i915-hda-componentized-2015-01-12' into drm-intel-next-queuedDaniel Vetter
Conflicts: drivers/gpu/drm/i915/intel_runtime_pm.c Separate branch so that Takashi can also pull just this refactoring into sound-next. Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2015-01-13drm: fix mismerge in drm_crtc.cDave Airlie
Daniel merged two things in 72a3697097b8dc92f5b8362598f5730a9986eb83, but he merged this code twice, Dan's static checker spotted it. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-01-12drm/i915: Fix mutex->owner inspection race under DEBUG_MUTEXESChris Wilson
If CONFIG_DEBUG_MUTEXES is set, the mutex->owner field is only cleared if the mutex debugging is enabled which introduces a race in our mutex_is_locked_by() - i.e. we may inspect the old owner value before it is acquired by the new task. This is the root cause of this error: diff --git a/kernel/locking/mutex-debug.c b/kernel/locking/mutex-debug.c index 5cf6731..3ef3736 100644 --- a/kernel/locking/mutex-debug.c +++ b/kernel/locking/mutex-debug.c @@ -80,13 +80,13 @@ void debug_mutex_unlock(struct mutex *lock) DEBUG_LOCKS_WARN_ON(lock->owner != current); DEBUG_LOCKS_WARN_ON(!lock->wait_list.prev && !lock->wait_list.next); - mutex_clear_owner(lock); } /* * __mutex_slowpath_needs_to_unlock() is explicitly 0 for debug * mutexes so that we can do it here after we've verified state. */ + mutex_clear_owner(lock); atomic_set(&lock->count, 1); } Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87955 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@vger.kernel.org Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2015-01-12drm/i915: Ban Haswell from using RCS flipsChris Wilson
Like Ivybridge, we have reports that we get random hangs when flipping with multiple pipes. Extend commit 2a92d5bca1999b69c78f3c3e97b5484985b094b9 Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Tue Jul 8 10:40:29 2014 +0100 drm/i915: Disable RCS flips on Ivybridge to also apply to Haswell. Reported-and-tested-by: Scott Tsai <scottt.tw@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87759 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: stable@vger.kernel.org # 2a92d5bca199 drm/i915: Disable RCS flips on Ivybridge Cc: stable@vger.kernel.org Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2015-01-12drm/i915: vlv: sanitize RPS interrupt mask during GPU idlingImre Deak
We apply the RPS interrupt workaround on VLV everywhere except when writing the mask directly during idling the GPU. For consistency do this also there. While at it also extend the code comment about affected platforms. I couldn't reproduce the issue on VLV fixed by this workaround, by removing the workaround from everywhere, while it's 100% reproducible on SNB using igt/gem_reset_stats/ban-ctx-render. So also add a note that it hasn't been verified if the workaround really applies to VLV/CHV. Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2015-01-12drm/i915: fix HW lockup due to missing RPS IRQ workaround on GEN6Imre Deak
In commit dbea3cea69508e9d548ed4a6be13de35492e5d15 Author: Imre Deak <imre.deak@intel.com> Date: Mon Dec 15 18:59:28 2014 +0200 drm/i915: sanitize RPS resetting during GPU reset we disable RPS interrupts during GPU resetting, but don't apply the necessary GEN6 HW workaround. This leads to a HW lockup during a subsequent "looping batchbuffer" workload. This is triggered by the testcase that submits exactly this kind of workload after a simulated GPU reset. I'm not sure how likely the bug would have triggered otherwise, since we would have applied the workaround anyway shortly after the GPU reset, when enabling GT powersaving from the deferred work. This may also fix unrelated issues, since during driver loading / suspending we also disable RPS interrupts and so we also had a short window during the rest of the loading / resuming where a similar workload could run without the workaround applied. v2: - separate the fix to route RPS interrupts to the CPU on GEN9 too to a separate patch (Daniel) Bisected-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Testcase: igt/gem_reset_stats/ban-ctx-render Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87429 Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2015-01-12drm/i915: gen9: fix RPS interrupt routing to CPU vs. GTImre Deak
GEN8+ HW has the option to route PM interrupts to either the CPU or to GT. For GEN8 this was already set correctly to routing to CPU, but not for GEN9, so fix this. Note that when disabling RPS interrupts this was set already correctly, though in that case it didn't matter much except for the possibility of spurious interrupts. Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2015-01-12drm/i915: remove unused power_well/get_cdclk_freq apiImre Deak
After switching to using the component interface this API isn't needed any more. v2-3: unchanged v4: - move the removal of i915_powerwell.h to this patch (Takashi) Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-01-12drm/i915: add component supportImre Deak
Register a component to be used to interface with the snd_hda_intel driver. This is meant to replace the same interface that is currently based on module symbol lookup. v2: - change roles between the hda and i915 components (Daniel) - add the implementation to a new file (Jani) - use better namespacing (Jani) v3: - move the implementation to intel_audio.c (Daniel) - rename display_component to audio_component (Daniel) - add kerneldoc (Daniel) v4: - run forgotten git rm i915_component.c (Jani) Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-01-12drm/i915: add dev_to_i915 helperImre Deak
This will be needed by later patches, so factor it out. No functional change. v2: - s/dev_to_i915_priv/dev_to_i915/ (Jani) - don't use the helper in i915_pm_suspend (Chris) - simplify the helper (Chris) v3: - remove redundant upcasting in the helper (Daniel) Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-01-12drm/exynos: remove the redundant machine checking codeHyungwon Hwang
This code is unnecessary, because same logic is already included. Refer this mail thread[1] for detail. [1] http://lists.freedesktop.org/archives/dri-devel/2015-January/075132.html Signed-off-by: Hyungwon Hwang <human.hwang@samsung.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2015-01-10Merge tag 'drm-intel-next-2014-12-19' of ↵Dave Airlie
git://anongit.freedesktop.org/drm-intel into drm-next - plane handling refactoring from Matt Roper and Gustavo Padovan in prep for atomic updates - fixes and more patches for the seqno to request transformation from John - docbook for fbc from Rodrigo - prep work for dual-link dsi from Gaurav Signh - crc fixes from Ville - special ggtt views infrastructure from Tvrtko Ursulin - shadow patch copying for the cmd parser from Brad Volkin - execlist and full ppgtt by default on gen8, for testing for now * tag 'drm-intel-next-2014-12-19' of git://anongit.freedesktop.org/drm-intel: (131 commits) drm/i915: Update DRIVER_DATE to 20141219 drm/i915: Hold runtime PM during plane commit drm/i915: Organize bind_vma funcs drm/i915: Organize INSTDONE report for future. drm/i915: Organize PDP regs report for future. drm/i915: Organize PPGTT init drm/i915: Organize Fence registers for future enablement. drm/i915: tame the chattermouth (v2) drm/i915: Warn about missing context state workarounds only once drm/i915: Use true PPGTT in Gen8+ when execlists are enabled drm/i915: Skip gunit save/restore for cherryview drm/i915/chv: Use timeout mode for RC6 on chv drm/i915: Add GPGPU_THREADS_DISPATCHED to the register whitelist drm/i915: Tidy up execbuffer command parsing code drm/i915: Mark shadow batch buffers as purgeable drm/i915: Use batch length instead of object size in command parser drm/i915: Use batch pools with the command parser drm/i915: Implement a framework for batch buffer pools drm/i915: fix use after free during eDP encoder destroying drm/i915/skl: Skylake also supports DP MST ...
2015-01-09drm/amd: Remove old radeon_sa funcs from kfd-->kgd interfaceOded Gabbay
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09drm/radeon: Remove old radeon_sa usage from kfd-->kgd interfaceOded Gabbay
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09drm/amdkfd: Using new gtt sa in amdkfdOded Gabbay
This patch change the calls throughout the amdkfd driver from the old kfd-->kgd interface to the new kfd gtt sa inside amdkfd v2: change the new call in sdma code that appeared because of the sdma feature Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09drm/amdkfd: Allocate gart memory using new interfaceOded Gabbay
This patch changes the calls to allocate the gart memory for amdkfd from the old interface (radeon_sa) to the new one (kfd_gtt_sa) The new gart sub-allocator is initialized with chunk size equal to 512 bytes. This is because the KV MQD is 512 Bytes and most of the sub-allocations are MQDs. Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09drm/amdkfd: Fixed calculation of gart buffer sizeOded Gabbay
This patch makes the gart's buffer size calculation more accurate. This buffer is needed per GPU. It takes into account maximum number of MQDs, runlist packets, kernel queues and reserves 512KB for other misc allocations. The total size is just shy of 4MB, for 32 processes and 128 queues per process, which are the defaults for amdkfd kernel module parameters. Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09drm/amdkfd: Add kfd gtt sub-allocator functionsOded Gabbay
This patch adds new kfd gtt sub-allocator functions that service the amdkfd driver when it wants to use gtt memory. The sub-allocator uses a bitmap to handle the memory area that was transferred to it during init. It divides the memory area into chunks, according to chunk size parameter. The allocation function will allocate contiguous chunks from that memory area, according to the requested size. If the requested size is smaller than the chunk size, a single chunk will be allocated. v2: Do some more verifications on parameters that are passed into kfd_gtt_sa_init() Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09drm/amdkfd: Add gtt sa related data to kfd_dev structOded Gabbay
This patch adds new fields to kfd_dev struct that are necessary for the new kfd gtt sa module Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09drm/radeon: Impl. new gtt allocate/free functionsOded Gabbay
This patch adds the implementation of the gtt interface functions. The allocate function will allocate a single bo, pin and map it to kernel memory. It will return the gpu address and cpu ptr as arguments. v2: The bulk of the allocations in the GART is for MQDs. MQDs represent active user-mode queues, which are on the current runlist. It is important to remember that active queues doesn't necessarily mean scheduled/running queues, especially if there is over-subscription of queues or more than a single HSA process. Because the scheduling of the user-mode queues is done by the CP firmware, amdkfd doesn't have any indication if the queue is scheduled or not. If the CP will try to schedule a queue, and its MQD is not present, this will probably stuck the CP permanently, as it will load garbage from the GART (the address of the MQD is given to the CP inside the runlist packet). In addition, there are a couple of small allocations which also should always be pinned - runlist packets (2 packets) and HPDs. runlist packets can be quite large, depending on number of processes and queues. This new allocate function represents the short/mid-term solution of limiting the total memory consumption to around 4MB by default. The long-term solution is to create a mechanism through which radeon/ttm can ask amdkfd to clear GART/VRAM memory due to memory pressure. Then, amdkfd will preempt the running queues and wait until the memory pressure is over. After that, amdkfd will reschedule the queues. Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09drm/amd: Add new kfd-->kgd interface for gart usageOded Gabbay
This patch adds two new functions to the kfd-->kgd interface: init_gtt_mem_allocation, which allocate a large enough buffer on the amdkfd needs, such as mqds, hpds, kernel queue, fence and runlists. This function is only called once per GPU device. The size of the allocated buffer is based on the maximum number of HSA processes and maximum number of queues per HSA process (two amdkfd kernel module parameters). free_gtt_mem, which frees a buffer that was allocated on the gart aperture. Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09drm/radeon: Enable sdma preemptionBen Goz
This patch adds to radeon the enablement of sdma preemption. This is needed to support HWS of SDMA user-mode queues. Signed-off-by: Ben Goz <ben.goz@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09drm/amdkfd: Pass queue type to pqm_create_queue()Ben Goz
This patch passes the correct queue type to pqm_create_queue() instead of a fixed KFD_QUEUE_TYPE_COMPUTE type. Signed-off-by: Ben Goz <ben.goz@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09drm/amdkfd: Identify SDMA queue in create queue ioctlBen Goz
This patch adds a check to the create queue ioctl path, which identifies SDMA queue type that is sent by userspace. Signed-off-by: Ben Goz <ben.goz@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09drm/amdkfd: Add SDMA user-mode queues support to QCMBen Goz
This patch adds support for SDMA user-mode queues to the QCM - the Queue management system that manages queues-per-device and queues-per-process. v2: Remove calls to interface function that initializes sdma engines. v3: Use the new names of some of the defines. Signed-off-by: Ben Goz <ben.goz@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09drm/amdkfd: Add SDMA mqd supportBen Goz
This patch adds support for SDMA mqd operations: - init_mqd_sdma - uninit_mqd_sdma - load_mqd_sdma - update_mqd_sdma - destroy_mqd_sdma - is_occupied_sdma It also adds SDMA queue information to some private structures of amdkfd. v3: Use the new names of some of the defines. Signed-off-by: Ben Goz <ben.goz@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09drm/radeon: Implement SDMA interface functionsBen Goz
This patch implements the new SDMA interface functions. It also adds defines and structures related to SDMA registers. v2: Removed init_sdma_engines() from interface. Initialization is done in radeon. v3: - Removed unused defines. - Added SDMA_ prefix to defines that didn't have them. Signed-off-by: Ben Goz <ben.goz@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09drm/amd: Add SDMA functions to kfd-->kgd interfaceBen Goz
This patch adds three new functions to the kfd2kgd interface: - hqd_sdma_load() - Loads SDMA mqd to a H/W SDMA hqd slot. Used only in no HWS mode. - hqd_sdma_is_occupied() - Checks if an SDMA hqd slot is occupied. Used only in no HWS mode. - hqd_sdma_destroy() - Destructs and preempts the SDMA queue assigned to that SDMA hqd slot. Used only in no HWS mode. These functions are needed to support SDMA queues scheduling when using no HWS mode (used for debug or bring-up). v2: Removed init_sdma_engines() from interface. Initialization is done in radeon. Signed-off-by: Ben Goz <ben.goz@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09drm/amdkfd: Process-device data creation and lookup splitAlexey Skidanov
This patch splits the current kfd_get_process_device_data() to two functions, one that specifically creates a pdd and another one which just do lookup. This is done to enhance the readability and maintainability of the code. Signed-off-by: Alexey Skidanov <Alexey.Skidanov@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2015-01-09drm/amdkfd: Add number of watch points to topologyAlexey Skidanov
This patch adds the number of watch points to the node capabilities in the topology module Signed-off-by: Alexey Skidanov <Alexey.Skidanov@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2015-01-09drm/rockchip: fix dma_alloc_attrs() error checkDaniel Kurtz
dma_alloc_attrs() returns NULL if it cannot allocate a dma buffer (or mapping), not a negative error code. Rerported-by: Pawel Osciak <posciak@chromium.org> Signed-off-by: Daniel Kurtz <djkurtz@chromium.org> Signed-off-by: Mark Yao <mark.yao@rock-chips.com>
2015-01-09Merge tag 'topic/atomic-core-2015-01-05' of ↵Dave Airlie
git://anongit.freedesktop.org/drm-intel into drm-next Next batch of atomic work. Most important is the propertification from Rob and the nth iteration of the actual atomic ioctl originally from Ville. Big differences compared to earlier revisions: - Core properties are now fully handled by the core, drivers can only handle driver-specific properties. - Atomic props&ioctl are opt-in per file_priv, userspace needs to explicitly ask for it (like universal plane support). - For now all hidden behind the atomic module option until this has settled a bit. - Atomic modesets are currently not possible since the exact abi for how to handle the mode property is still under discussion. Besides this some cleanup patches from me and the addition of per-object state to global state backpointers to simplify drivers. * tag 'topic/atomic-core-2015-01-05' of git://anongit.freedesktop.org/drm-intel: drm: Ensure universal_planes is set for atomic drm/atomic: Hide drm.ko internal interfaces drm: Atomic modeset ioctl drm/atomic: atomic connector properties drm/atomic: atomic plane properties drm: small property creation cleanup drm/atomic: atomic_check functions drm: add atomic properties drm: refactor getproperties/getconnector drm: tweak getconnector locking drm: add atomic_get_property drm: add atomic_set_property wrappers drm: get rid of direct property value access drm: store property instead of id in obj attachment drm: allow property validation for refcnted props drm/atomic: Introduce state->obj backpointers drm/atomic-helper: Again check modeset *before* plane states drm/atomic-helper: Export both plane and modeset check helpers
2015-01-09Merge tag 'topic/core-stuff-2014-12-19' of ↵Dave Airlie
git://anongit.freedesktop.org/drm-intel into drm-next Misc drm patches with mostly polish patches from Thierry, with a bit of generic mode validation from Ville and a few other oddball things. * tag 'topic/core-stuff-2014-12-19' of git://anongit.freedesktop.org/drm-intel: (25 commits) drm: Include drm_crtc_helper.h in DocBook drm: Make drm_crtc_helper.h standalone includible drm: Move IRQ related fields to proper section drm: Remove stale comment drm: Do basic sanity checks for user modes drm: Perform basic sanity checks on probed modes drm: Reorganize probed mode validation drm/doc: Remove duplicate "by" drm/info: Remove unused code drm/cache: Use wbinvd helpers drm/plane-helper: Test for plane disable earlier drm/doc: Document drm_add_modes_noedid() usage drm: bit of spell-check / editorializing. drm: Prefer sizeof(type) over sizeof type drm: Remove useless else block drm: Remove unneeded braces for single statement blocks drm: Do not assign in if condition drm: Prefer kmalloc_array() over kmalloc() with multiply drm: Prefer kcalloc() over kzalloc() with multiply drm: Miscellaneous checkpatch whitespace cleanups ...
2015-01-08drm/radeon: add a dpm quirk listAlex Deucher
Disable dpm on certain problematic boards rather than disabling dpm for the entire chip family since most boards work fine. https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1386534 https://bugzilla.kernel.org/show_bug.cgi?id=83731 Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2015-01-08drm/amdkfd: Fix sparse warning (different address space)Oded Gabbay
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2015-01-08drm/radeon: fix VM flush on CIK (v3)Alex Deucher
We need to wait for the GPUVM flush to complete. There was some confusion as to how this mechanism was supposed to work. The operation is not atomic. For GPU initiated invalidations you need to read back a VM register to introduce enough latency for the update to complete. v2: drop gart changes v3: just read back rather than polling Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2015-01-08drm/radeon: fix VM flush on SI (v3)Alex Deucher
We need to wait for the GPUVM flush to complete. There was some confusion as to how this mechanism was supposed to work. The operation is not atomic. For GPU initiated invalidations you need to read back a VM register to introduce enough latency for the update to complete. v2: drop gart changes v3: just read back rather than polling Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2015-01-08drm/radeon: fix VM flush on cayman/aruba (v3)Alex Deucher
We need to wait for the GPUVM flush to complete. There was some confusion as to how this mechanism was supposed to work. The operation is not atomic. For GPU initiated invalidations you need to read back a VM register to introduce enough latency for the update to complete. v2: drop gart changes v3: just read back rather than polling Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2015-01-08drm/i915: Reserve shadow batch VMA analogue to othersTvrtko Ursulin
If not pinned VMA can become an eviction target just before it needs to be executed which breaks the internal object lifetime rules. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87399 Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-01-08drm/amdkfd: Drop interrupt SW ring bufferMichel Dänzer
The work queue couldn't reliably prevent the SW ring buffer from overflowing, so dmesg was spammed by kfd kfd: Interrupt ring overflow, dropping interrupt. messages when running e.g. the Atlantis Substance demo from https://wiki.unrealengine.com/Linux_Demos on Kaveri. Since the SW ring buffer doesn't actually do anything at this point, just remove it for now. When actual interrupt processing code is added to amdkfd, it should try to do things immediately and only defer to work queues when necessary. Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>