|
Found by reading the HIZ_CHICKEN documentation.
Improves performance in a HiZ microbenchmark by around 50%.
Improves performance in OglZBuffer by around 18%.
Thanks to Chris Wilson for helping me figure out where to put this.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
A previous commit enabled execlists by default:
commit 27401d126b5b ("drm/i915/bdw: Enable execlists by default where supported")
This allowed routine testing of execlists which exposed a regression when
resuming from suspend. The cause was tracked down to the recent changes to the
ring init sequence:
commit 35a57ffbb108 ("drm/i915: Only init engines once")
During a suspend/resume cycle the hardware Context Status Buffer write pointer
is reset. However since the recent changes to the init sequence the software CSB
read pointer is no longer reset. This means that context status events are not
handled correctly and new contexts are not written to the ELSP, resulting in an
apparent GPU hang.
Pending further changes to the ring init code, just move the
ring->next_context_status_buffer initialization into gen8_init_common_ring to
fix this regression.
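The pointer mismatch described above can be illustrated with a small userspace model. This is only a sketch: the struct, the helper names, and the buffer size of 6 entries are assumptions made for illustration, not the driver's actual code or register semantics.

```c
#include <assert.h>
#include <stddef.h>

#define CSB_ENTRIES 6	/* assumed buffer size, for illustration only */

/* Hypothetical model of one engine's Context Status Buffer pointers. */
struct fake_engine {
	unsigned int hw_write;	/* advanced by "hardware" */
	unsigned int sw_read;	/* advanced by "software" */
};

/* Hardware posts one context status event. */
static void hw_post_event(struct fake_engine *e)
{
	e->hw_write = (e->hw_write + 1) % CSB_ENTRIES;
}

/* Number of events software believes are waiting to be handled. */
static unsigned int sw_pending(const struct fake_engine *e)
{
	return (e->hw_write + CSB_ENTRIES - e->sw_read) % CSB_ENTRIES;
}

/* Software handles everything it believes is pending. */
static unsigned int sw_drain(struct fake_engine *e)
{
	unsigned int n = sw_pending(e);

	e->sw_read = e->hw_write;
	return n;
}

/* Suspend/resume: only the hardware pointer is reset. */
static void hw_suspend_resume(struct fake_engine *e)
{
	e->hw_write = 0;
}

/* The fix in this model: reset the read pointer on every ring init. */
static void init_common_ring(struct fake_engine *e)
{
	assert(e != NULL);
	e->sw_read = 0;
}
```

In this model, handling three events before suspend leaves sw_read at 3. After resume, a single new event then looks like four pending events unless init_common_ring() has run first, in which case software sees exactly one.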
v2: Moved init into gen8_init_common_ring rather than context_enable after
feedback from Daniel Vetter. Updated commit msg to reflect this and also cite
commits related to the regression. Fixed bz link to correct bug.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88096
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
The userspace-requested plane coordinates are now always available via
plane->state.base (and the i915-adjusted values are stored in
plane->state), so we no longer use the coordinate fields in intel_plane
and can drop them.
Also, note that the error case for pageflip calls update_plane() to
program the values from plane->state; it's simpler to just call
intel_plane_restore() which does the same thing.
v2: Replace manual update_plane() with intel_plane_restore() in pageflip
error handler.
Reviewed-by(v1): Bob Paauwe <bob.j.paauwe@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
Switch plane handling to use the atomic plane helpers. This means that
rather than provide our own implementations of .update_plane() and
.disable_plane(), we expose the lower-level check/prepare/commit/cleanup
entrypoints and let the DRM core implement update/disable for us using
those entrypoints.
The other main change that falls out of this patch is that our
drm_plane's will now always have a valid plane->state that contains the
relevant plane state (initial state is allocated at plane creation).
The base drm_plane_state pointed to holds the requested source/dest
coordinates, and the subclassed intel_plane_state holds the adjusted
values that our driver actually uses.
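The base/subclass split described above can be pictured with a small userspace sketch. The struct names, field names, and the clipping rule below are hypothetical stand-ins chosen for illustration; they are not the drm_plane_state or intel_plane_state definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical base state: holds exactly what userspace requested. */
struct base_plane_state {
	int crtc_x, crtc_y;
	int crtc_w, crtc_h;
};

/* Hypothetical driver subclass: embeds the base, adds adjusted values. */
struct driver_plane_state {
	struct base_plane_state base;
	int adj_x, adj_y;
	int adj_w, adj_h;
};

/* Recover the subclass from a pointer to its embedded base. */
static struct driver_plane_state *to_driver_state(struct base_plane_state *s)
{
	assert(s != NULL);
	return (struct driver_plane_state *)
		((char *)s - offsetof(struct driver_plane_state, base));
}

/*
 * Illustrative 'check' step: clip the requested rectangle to the pipe
 * and store the result in the subclass, leaving the request untouched.
 */
static void check_plane(struct base_plane_state *s, int pipe_w, int pipe_h)
{
	struct driver_plane_state *ds = to_driver_state(s);
	int x1 = s->crtc_x, y1 = s->crtc_y;
	int x2 = s->crtc_x + s->crtc_w, y2 = s->crtc_y + s->crtc_h;

	if (x1 < 0)
		x1 = 0;
	if (y1 < 0)
		y1 = 0;
	if (x2 > pipe_w)
		x2 = pipe_w;
	if (y2 > pipe_h)
		y2 = pipe_h;
	if (x2 < x1)
		x2 = x1;
	if (y2 < y1)
		y2 = y1;

	ds->adj_x = x1;
	ds->adj_y = y1;
	ds->adj_w = x2 - x1;
	ds->adj_h = y2 - y1;
}
```

Keeping the request in the embedded base means a later check can always be rerun from the original values, while the hardware-facing code only reads the adjusted fields.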
v2:
- Renamed file from intel_atomic.c to intel_atomic_plane.c (Daniel)
- Fix a copy/paste comment mistake (Bob)
v3:
- Use prepare/cleanup functions that we've already factored out
- Use newly refactored pre_commit/commit/post_commit to avoid sleeping
during vblank evasion
v4:
- Rebase to latest di-nightly requires adding an 'old_state' parameter
to atomic_update;
v5:
- Must have botched a rebase somewhere and lost some work. Restore
state 'dirty' flag to let begin/end code know which planes to
run the pre_commit/post_commit hooks for. This would have actually
shown up as broken in the next commit rather than this one.
v6:
- Squash kerneldoc patch into this one.
- Previous patches have now already taken care of most of the
infrastructure that used to be in this patch. All we're adding here
now is some thin wrappers.
v7:
- Check return of intel_plane_duplicate_state() for allocation
failures.
v8:
- Drop unused drm_plane_state -> intel_plane_state cast. (Ander)
- Squash in actual transition to plane helpers. Significant
refactoring earlier in the patchset has made the combined
prep+transition much easier to swallow than it was in earlier
iterations. (Ander)
v9:
- s/track_fbs/disabled_planes/ in the atomic crtc flags. The only fb's
we need to update frontbuffer tracking for are those on a plane about
to be disabled (since the atomic helpers never call prepare_fb() when
disabling a plane), so the new name more accurately describes what
we're actually tracking.
Testcase: igt/kms_plane
Testcase: igt/kms_universal_plane
Testcase: igt/kms_cursor_crc
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
A few of the sprite-related function names in i915 are very similar
(e.g., intel_enable_planes() vs intel_crtc_enable_planes()) and don't
make it clear whether they only operate on sprite planes, or whether
they also apply to all universal plane types. Rename a few functions to
be more consistent with our function naming for primary/cursor planes or
to clarify that they apply specifically to sprite planes:
- s/intel_disable_planes/intel_disable_sprite_planes/
- s/intel_enable_planes/intel_enable_sprite_planes/
Also, drop the sprite-specific intel_destroy_plane() and just use
the type-agnostic intel_plane_destroy() function. The extra 'disable'
call that intel_destroy_plane() did is unnecessary since the plane will
already be disabled due to framebuffer destruction by the point it gets
called.
v2: Earlier consolidation patches have reduced the number of functions
we need to rename here.
v3: Also rename intel_plane_funcs vtable to intel_sprite_plane_funcs
for consistency with primary/cursor. (Ander)
v4: Convert comment for intel_plane_destroy() to kerneldoc now that it
is no longer a static function. (Ander)
Reviewed-by(v1): Bob Paauwe <bob.j.paauwe@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
Move the vblank evasion up from the low-level, hw-specific
update_plane() handlers to the general plane commit operation.
Everything inside commit should now be non-sleeping, so this brings us
closer to how vblank evasion will behave once we move over to atomic.
v2:
- Restore lost intel_crtc->active check on vblank evasion
v3:
- Replace assert_pipe_enabled() in intel_disable_primary_hw_plane()
with an intel_crtc->active test; it turns out assert_pipe_enabled()
grabs some mutexes and can sleep, which we can't do with interrupts
disabled.
v4:
- Equivalent to v2; v3 change is now squashed into an earlier patch
of the series. (Ander).
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
Once we integrate our work into the atomic pipeline, plane commit
operations will need to happen with interrupts disabled, due to vblank
evasion. Our commit functions today include sleepable work, so those
operations need to be split out and run either before or after the
atomic register programming.
The solution here calculates which of those operations will need to be
performed during the 'check' phase and sets flags in an intel_crtc
sub-struct. New intel_begin_crtc_commit() and
intel_finish_crtc_commit() functions are added before and after the
actual register programming; these will eventually be called from the
atomic plane helper's .atomic_begin() and .atomic_end() entrypoints.
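The split described above can be sketched in userspace as follows. Everything here is a hypothetical model: the struct, the flag names, and the function names only mirror the idea of deciding during 'check' and running sleepable work outside the interrupts-off section; none of it is the i915 implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-CRTC flags filled in by the 'check' phase. */
struct commit_flags {
	bool wait_for_flips;	/* sleepable, must run before programming */
	bool update_watermarks;	/* sleepable, must run after programming */
};

struct fake_crtc {
	struct commit_flags flags;
	bool irqs_off;			/* true inside the critical section */
	int slept_with_irqs_off;	/* counts rule violations */
	char log[8];			/* order of operations, one letter each */
	size_t log_len;
};

static void record(struct fake_crtc *c, char tag)
{
	assert(c->log_len < sizeof(c->log) - 1);
	c->log[c->log_len++] = tag;
}

/* Stand-in for any operation that may sleep. */
static void sleepable_work(struct fake_crtc *c, char tag)
{
	if (c->irqs_off)
		c->slept_with_irqs_off++;
	record(c, tag);
}

/* 'check': decide what will be needed, but do none of it yet. */
static void check_phase(struct fake_crtc *c, bool flip_pending,
			bool size_changed)
{
	c->flags.wait_for_flips = flip_pending;
	c->flags.update_watermarks = size_changed;
}

/* Runs before register programming; sleeping is still allowed here. */
static void begin_crtc_commit(struct fake_crtc *c)
{
	if (c->flags.wait_for_flips)
		sleepable_work(c, 'F');
	c->irqs_off = true;
}

/* Register programming: must not sleep. */
static void program_registers(struct fake_crtc *c)
{
	record(c, 'R');
}

/* Runs after register programming; sleeping is allowed again. */
static void finish_crtc_commit(struct fake_crtc *c)
{
	c->irqs_off = false;
	if (c->flags.update_watermarks)
		sleepable_work(c, 'W');
}
```

Calling check_phase(), begin_crtc_commit(), program_registers(), and finish_crtc_commit() in that order records the flip wait, then the register write, then the watermark update, with no sleepable work inside the interrupts-off window.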
v2: Fix broken sprite code split
v3: Make the pre/post commit work crtc-based to match how we eventually
want this to be called from the atomic plane helpers.
v4: Some platforms that haven't had their watermark code reworked were
waiting for vblank, then calling update_sprite_watermarks in their
platform-specific disable code. These also need to be flagged out
of the critical section.
v5: Sprite plane test for primary show/hide should just set the flag to
wait for pending flips, not actually perform the wait. (Ander)
v6:
- Rebase onto latest di-nightly; picks up an important runtime PM fix.
- Handle 'wait_for_flips' flag in intel_begin_crtc_commit(). (Ander)
- Use wait_for_flips flag for primary plane update rather than
performing the wait in the check routine.
- Added kerneldoc to pre_disable/post_enable functions that are no
longer static. (Ander)
- Replace assert_pipe_enabled() in intel_disable_primary_hw_plane()
with an intel_crtc->active test; it turns out assert_pipe_enabled()
grabs some mutexes and can sleep, which we can't do with interrupts
disabled.
v7:
- Check for fb != NULL when deciding whether the sprite plane hides the
primary plane during a sprite update. (PRTS)
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
If CONFIG_BUG=n __WARN_printf won't be defined leading to the below
build failure. The double underscores should have told us to steer clear
of it anyway.
drivers/gpu/drm/i915/intel_display.c: In function ‘assert_pll’:
drivers/gpu/drm/i915/intel_display.c:1027:2: error: implicit declaration
of function ‘__WARN_printf’ [-Werror=implicit-function-declaration]
I915_STATE_WARN(cur_state != state,
Use WARN(1, ...) instead. It handles CONFIG_BUG=n gracefully and, with
the constant condition, a sane compiler should reduce it to
__WARN_printf.
This is a regression introduced by
commit e2c719b75c8c186deb86570d8466df9e9eff919b
Author: Rob Clark <robdclark@gmail.com>
Date: Mon Dec 15 13:56:32 2014 -0500
drm/i915: tame the chattermouth (v2)
Reported-by: Jim Davis <jim.epost@gmail.com>
Reference: http://mid.gmane.org/CA+r1ZhgHTi7bS2irhtuSUs9aO=Br1dumN8=oAOeaMJDZ_ZhwBw@mail.gmail.com
Cc: Rob Clark <robdclark@gmail.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
This adds a quirks list to fix stability problems with
certain SI boards.
bug:
https://bugs.freedesktop.org/show_bug.cgi?id=76490
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Conflicts:
drivers/gpu/drm/i915/intel_runtime_pm.c
Separate branch so that Takashi can also pull just this refactoring
into sound-next.
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
|
|
Daniel merged two things in 72a3697097b8dc92f5b8362598f5730a9986eb83,
but he merged this code twice; Dan's static checker spotted it.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
If CONFIG_DEBUG_MUTEXES is set, the mutex->owner field is only cleared
if the mutex debugging is enabled which introduces a race in our
mutex_is_locked_by() - i.e. we may inspect the old owner value before it
is acquired by the new task.
This is the root cause of this error:
diff --git a/kernel/locking/mutex-debug.c b/kernel/locking/mutex-debug.c
index 5cf6731..3ef3736 100644
--- a/kernel/locking/mutex-debug.c
+++ b/kernel/locking/mutex-debug.c
@@ -80,13 +80,13 @@ void debug_mutex_unlock(struct mutex *lock)
DEBUG_LOCKS_WARN_ON(lock->owner != current);
DEBUG_LOCKS_WARN_ON(!lock->wait_list.prev && !lock->wait_list.next);
- mutex_clear_owner(lock);
}
/*
* __mutex_slowpath_needs_to_unlock() is explicitly 0 for debug
* mutexes so that we can do it here after we've verified state.
*/
+ mutex_clear_owner(lock);
atomic_set(&lock->count, 1);
}
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87955
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
Like Ivybridge, we have reports that we get random hangs when flipping
with multiple pipes. Extend
commit 2a92d5bca1999b69c78f3c3e97b5484985b094b9
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Tue Jul 8 10:40:29 2014 +0100
drm/i915: Disable RCS flips on Ivybridge
to also apply to Haswell.
Reported-and-tested-by: Scott Tsai <scottt.tw@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87759
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: stable@vger.kernel.org # 2a92d5bca199 drm/i915: Disable RCS flips on Ivybridge
Cc: stable@vger.kernel.org
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
We apply the RPS interrupt workaround on VLV everywhere except when
writing the mask directly while idling the GPU. For consistency, apply
it there too.
While at it also extend the code comment about affected platforms.
I couldn't reproduce the issue this workaround fixes on VLV, even after
removing the workaround everywhere, while it's 100% reproducible on
SNB using igt/gem_reset_stats/ban-ctx-render. So also add a note that
it hasn't been verified whether the workaround really applies to VLV/CHV.
Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
In
commit dbea3cea69508e9d548ed4a6be13de35492e5d15
Author: Imre Deak <imre.deak@intel.com>
Date: Mon Dec 15 18:59:28 2014 +0200
drm/i915: sanitize RPS resetting during GPU reset
we disable RPS interrupts during GPU resetting, but don't apply the
necessary GEN6 HW workaround. This leads to a HW lockup during a
subsequent "looping batchbuffer" workload. This is triggered by the
testcase that submits exactly this kind of workload after a simulated
GPU reset. I'm not sure how likely the bug would have been to trigger
otherwise, since we would have applied the workaround anyway shortly
after the GPU reset, when enabling GT powersaving from the deferred
work.
This may also fix unrelated issues, since during driver loading /
suspending we also disable RPS interrupts and so we also had a short
window during the rest of the loading / resuming where a similar
workload could run without the workaround applied.
v2:
- separate the fix to route RPS interrupts to the CPU on GEN9 too
to a separate patch (Daniel)
Bisected-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Testcase: igt/gem_reset_stats/ban-ctx-render
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87429
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
GEN8+ HW has the option to route PM interrupts to either the CPU or to
GT. For GEN8 this was already set correctly to routing to CPU, but not
for GEN9, so fix this. Note that when disabling RPS interrupts this was
already set correctly, though in that case it didn't matter much except
for the possibility of spurious interrupts.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
After switching to using the component interface this API isn't needed
any more.
v2-3: unchanged
v4:
- move the removal of i915_powerwell.h to this patch (Takashi)
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
Register a component to be used to interface with the snd_hda_intel
driver. This is meant to replace the same interface that is currently
based on module symbol lookup.
v2:
- change roles between the hda and i915 components (Daniel)
- add the implementation to a new file (Jani)
- use better namespacing (Jani)
v3:
- move the implementation to intel_audio.c (Daniel)
- rename display_component to audio_component (Daniel)
- add kerneldoc (Daniel)
v4:
- run forgotten git rm i915_component.c (Jani)
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
This will be needed by later patches, so factor it out.
No functional change.
v2:
- s/dev_to_i915_priv/dev_to_i915/ (Jani)
- don't use the helper in i915_pm_suspend (Chris)
- simplify the helper (Chris)
v3:
- remove redundant upcasting in the helper (Daniel)
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
This code is unnecessary because the same logic is already included. Refer
to this mail thread [1] for details.
[1] http://lists.freedesktop.org/archives/dri-devel/2015-January/075132.html
Signed-off-by: Hyungwon Hwang <human.hwang@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
|
|
git://anongit.freedesktop.org/drm-intel into drm-next
- plane handling refactoring from Matt Roper and Gustavo Padovan in prep for
atomic updates
- fixes and more patches for the seqno to request transformation from John
- docbook for fbc from Rodrigo
- prep work for dual-link dsi from Gaurav Singh
- crc fixes from Ville
- special ggtt views infrastructure from Tvrtko Ursulin
- shadow patch copying for the cmd parser from Brad Volkin
- execlist and full ppgtt by default on gen8, for testing for now
* tag 'drm-intel-next-2014-12-19' of git://anongit.freedesktop.org/drm-intel: (131 commits)
drm/i915: Update DRIVER_DATE to 20141219
drm/i915: Hold runtime PM during plane commit
drm/i915: Organize bind_vma funcs
drm/i915: Organize INSTDONE report for future.
drm/i915: Organize PDP regs report for future.
drm/i915: Organize PPGTT init
drm/i915: Organize Fence registers for future enablement.
drm/i915: tame the chattermouth (v2)
drm/i915: Warn about missing context state workarounds only once
drm/i915: Use true PPGTT in Gen8+ when execlists are enabled
drm/i915: Skip gunit save/restore for cherryview
drm/i915/chv: Use timeout mode for RC6 on chv
drm/i915: Add GPGPU_THREADS_DISPATCHED to the register whitelist
drm/i915: Tidy up execbuffer command parsing code
drm/i915: Mark shadow batch buffers as purgeable
drm/i915: Use batch length instead of object size in command parser
drm/i915: Use batch pools with the command parser
drm/i915: Implement a framework for batch buffer pools
drm/i915: fix use after free during eDP encoder destroying
drm/i915/skl: Skylake also supports DP MST
...
|
|
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This patch changes the calls throughout the amdkfd driver from the old kfd-->kgd
interface to the new kfd gtt sa inside amdkfd.
v2: change the new call in sdma code that appeared because of the sdma feature
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This patch changes the calls to allocate the gart memory for amdkfd from the
old interface (radeon_sa) to the new one (kfd_gtt_sa)
The new gart sub-allocator is initialized with chunk size equal to 512 bytes.
This is because the KV MQD is 512 Bytes and most of the sub-allocations are
MQDs.
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This patch makes the gart's buffer size calculation more accurate. This buffer
is needed per GPU.
It takes into account maximum number of MQDs, runlist packets, kernel queues
and reserves 512KB for other misc allocations.
The total size is just shy of 4MB, for 32 processes and 128 queues per
process, which are the defaults for amdkfd kernel module parameters.
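Part of this arithmetic can be checked with a short sketch. It covers only the terms whose sizes are stated in this series (the default process and queue counts, the 512-byte KV MQD from the sub-allocator patch, and the 512KB reserve); the runlist packet and kernel queue sizes are not given here, so they are left out. The function and constant names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum {
	MAX_PROCESSES = 32,		/* default stated in the message */
	QUEUES_PER_PROCESS = 128,	/* default stated in the message */
	MQD_SIZE = 512,			/* KV MQD size in bytes */
	MISC_RESERVE = 512 * 1024	/* reserve for misc allocations */
};

/* Bytes needed if every possible queue has an MQD resident. */
static size_t mqd_bytes(size_t processes, size_t queues_per_process)
{
	assert(processes > 0 && queues_per_process > 0);
	return processes * queues_per_process * MQD_SIZE;
}

/*
 * MQDs plus the misc reserve. Runlist packets and kernel queues come on
 * top of this; their sizes are not stated here, so they are omitted.
 */
static size_t known_subtotal(size_t processes, size_t queues_per_process)
{
	return mqd_bytes(processes, queues_per_process) + MISC_RESERVE;
}
```

With the defaults, the MQDs alone take 32 * 128 * 512 bytes = 2MB, and adding the 512KB reserve gives 2.5MB. The remaining space below the roughly 4MB total is what is left for runlist packets and kernel queues.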
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This patch adds new kfd gtt sub-allocator functions that service the amdkfd
driver when it wants to use gtt memory.
The sub-allocator uses a bitmap to handle the memory area that was transferred
to it during init. It divides the memory area into chunks, according to chunk
size parameter.
The allocation function will allocate contiguous chunks from that memory area,
according to the requested size. If the requested size is smaller than the
chunk size, a single chunk will be allocated.
v2: Do some more verifications on parameters that are passed into
kfd_gtt_sa_init()
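A bitmap sub-allocator of the kind described above might look like the following userspace sketch. It is not the kfd_gtt_sa code: the struct, the function names, the first-fit search, and the 64-chunk limit (one 64-bit word as the bitmap) are all assumptions made to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SA_MAX_CHUNKS 64	/* sketch limit: one bit per chunk in one word */

struct gtt_sa {
	size_t chunk_size;
	size_t num_chunks;
	uint64_t bitmap;	/* bit i set means chunk i is in use */
};

/* Round a byte size up to whole chunks; tiny requests get one chunk. */
static size_t chunks_for(const struct gtt_sa *sa, size_t size)
{
	return size == 0 ? 1 : (size + sa->chunk_size - 1) / sa->chunk_size;
}

/* Validate parameters and divide the memory area into chunks. */
static bool sa_init(struct gtt_sa *sa, size_t buf_size, size_t chunk_size)
{
	if (chunk_size == 0 || buf_size < chunk_size ||
	    buf_size / chunk_size > SA_MAX_CHUNKS)
		return false;
	sa->chunk_size = chunk_size;
	sa->num_chunks = buf_size / chunk_size;
	sa->bitmap = 0;
	return true;
}

/* First-fit search for a contiguous run; returns a byte offset or -1. */
static long sa_alloc(struct gtt_sa *sa, size_t size)
{
	size_t need = chunks_for(sa, size);
	size_t run = 0;
	size_t i, j;

	for (i = 0; i < sa->num_chunks; i++) {
		run = ((sa->bitmap >> i) & 1) ? 0 : run + 1;
		if (run == need) {
			size_t first = i + 1 - need;

			for (j = first; j <= i; j++)
				sa->bitmap |= (uint64_t)1 << j;
			return (long)(first * sa->chunk_size);
		}
	}
	return -1;
}

/* Clear the bits of a previous allocation of the given size. */
static void sa_free(struct gtt_sa *sa, long offset, size_t size)
{
	size_t first, need, j;

	assert(offset >= 0 && (size_t)offset % sa->chunk_size == 0);
	first = (size_t)offset / sa->chunk_size;
	need = chunks_for(sa, size);
	assert(first + need <= sa->num_chunks);
	for (j = first; j < first + need; j++)
		sa->bitmap &= ~((uint64_t)1 << j);
}
```

With a 4096-byte area and 512-byte chunks, a 100-byte request takes one chunk, and a 1024-byte request takes the next two. Freeing the 1024-byte block lets a later 513-byte request reuse the same two chunks.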
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This patch adds new fields to kfd_dev struct that are necessary for the new kfd
gtt sa module
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This patch adds the implementation of the gtt interface functions.
The allocate function will allocate a single bo, pin and map it to kernel
memory. It will return the gpu address and cpu ptr as arguments.
v2:
The bulk of the allocations in the GART is for MQDs. MQDs represent active
user-mode queues, which are on the current runlist. It is important to
remember that an active queue is not necessarily a scheduled/running
queue, especially if there is over-subscription of queues or more than a
single HSA process.
Because the scheduling of the user-mode queues is done by the CP firmware,
amdkfd has no indication of whether a queue is scheduled. If the
CP tries to schedule a queue whose MQD is not present, the CP will
probably get stuck permanently, as it will load garbage from the GART
(the address of the MQD is given to the CP inside the runlist packet).
In addition, there are a couple of small allocations which also should
always be pinned - runlist packets (2 packets) and HPDs. runlist packets can
be quite large, depending on number of processes and queues.
This new allocate function represents the short/mid-term solution of limiting
the total memory consumption to around 4MB by default.
The long-term solution is to create a mechanism through which radeon/ttm can
ask amdkfd to clear GART/VRAM memory due to memory pressure.
Then, amdkfd will preempt the running queues and wait until the memory pressure
is over. After that, amdkfd will reschedule the queues.
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This patch adds two new functions to the kfd-->kgd interface:
init_gtt_mem_allocation, which allocates a buffer large enough for amdkfd's
needs, such as mqds, hpds, kernel queue, fence and runlists. This function
is only called once per GPU device. The size of the allocated buffer is
based on the maximum number of HSA processes and maximum number of queues
per HSA process (two amdkfd kernel module parameters).
free_gtt_mem, which frees a buffer that was allocated on the gart aperture.
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This patch adds to radeon the enablement of sdma preemption.
This is needed to support HWS of SDMA user-mode queues.
Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This patch passes the correct queue type to pqm_create_queue() instead of a
fixed KFD_QUEUE_TYPE_COMPUTE type.
Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This patch adds a check to the create queue ioctl path, which identifies SDMA
queue type that is sent by userspace.
Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This patch adds support for SDMA user-mode queues to the QCM - the Queue
management system that manages queues-per-device and queues-per-process.
v2: Remove calls to interface function that initializes sdma engines.
v3: Use the new names of some of the defines.
Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This patch adds support for SDMA mqd operations:
- init_mqd_sdma
- uninit_mqd_sdma
- load_mqd_sdma
- update_mqd_sdma
- destroy_mqd_sdma
- is_occupied_sdma
It also adds SDMA queue information to some private structures of amdkfd.
v3: Use the new names of some of the defines.
Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This patch implements the new SDMA interface functions. It also adds defines
and structures related to SDMA registers.
v2: Removed init_sdma_engines() from interface. Initialization is done in
radeon.
v3:
- Removed unused defines.
- Added SDMA_ prefix to defines that didn't have them.
Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This patch adds three new functions to the kfd2kgd interface:
- hqd_sdma_load() - Loads SDMA mqd to a H/W SDMA hqd slot. Used only in no HWS
mode.
- hqd_sdma_is_occupied() - Checks if an SDMA hqd slot is occupied. Used only
in no HWS mode.
- hqd_sdma_destroy() - Destructs and preempts the SDMA queue assigned to
that SDMA hqd slot. Used only in no HWS mode.
These functions are needed to support SDMA queues scheduling when using no HWS
mode (used for debug or bring-up).
v2: Removed init_sdma_engines() from interface. Initialization is done in
radeon.
Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This patch splits the current kfd_get_process_device_data() into two
functions: one that specifically creates a pdd and another that
just does a lookup.
This is done to enhance the readability and maintainability of the code.
Signed-off-by: Alexey Skidanov <Alexey.Skidanov@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
|
|
This patch adds the number of watch points to the node capabilities in the
topology module
Signed-off-by: Alexey Skidanov <Alexey.Skidanov@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
|
|
dma_alloc_attrs() returns NULL if it cannot allocate a dma buffer (or
mapping), not a negative error code.
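The difference between the two failure conventions can be shown with a userspace sketch. The is_err() helper imitates the kernel's IS_ERR() test, and fake_dma_alloc() is a hypothetical allocator that fails by returning NULL; neither is the real kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Imitation of IS_ERR(): true only for the top MAX_ERRNO addresses. */
static bool is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical allocator that reports failure by returning NULL. */
static void *fake_dma_alloc(size_t size)
{
	static char buf[64];

	assert(size > 0);
	return size <= sizeof(buf) ? buf : NULL;
}

/* Wrong check for this convention: NULL is not an error pointer. */
static bool failure_detected_buggy(size_t size)
{
	return is_err(fake_dma_alloc(size));
}

/* Correct check for a NULL-on-failure allocator. */
static bool failure_detected_fixed(size_t size)
{
	return fake_dma_alloc(size) == NULL;
}
```

An error-pointer test lets a failed allocation through because NULL falls outside the error-pointer range, so the caller would go on to use a NULL buffer; the plain NULL comparison catches it.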
Reported-by: Pawel Osciak <posciak@chromium.org>
Signed-off-by: Daniel Kurtz <djkurtz@chromium.org>
Signed-off-by: Mark Yao <mark.yao@rock-chips.com>
|
|
git://anongit.freedesktop.org/drm-intel into drm-next
Next batch of atomic work. Most important is the propertification from Rob
and the nth iteration of the actual atomic ioctl originally from Ville.
Big differences compared to earlier revisions:
- Core properties are now fully handled by the core, drivers can only
handle driver-specific properties.
- Atomic props&ioctl are opt-in per file_priv, userspace needs to
explicitly ask for it (like universal plane support).
- For now all hidden behind the atomic module option until this has
settled a bit.
- Atomic modesets are currently not possible since the exact abi for how
to handle the mode property is still under discussion.
Besides this some cleanup patches from me and the addition of per-object
state to global state backpointers to simplify drivers.
* tag 'topic/atomic-core-2015-01-05' of git://anongit.freedesktop.org/drm-intel:
drm: Ensure universal_planes is set for atomic
drm/atomic: Hide drm.ko internal interfaces
drm: Atomic modeset ioctl
drm/atomic: atomic connector properties
drm/atomic: atomic plane properties
drm: small property creation cleanup
drm/atomic: atomic_check functions
drm: add atomic properties
drm: refactor getproperties/getconnector
drm: tweak getconnector locking
drm: add atomic_get_property
drm: add atomic_set_property wrappers
drm: get rid of direct property value access
drm: store property instead of id in obj attachment
drm: allow property validation for refcnted props
drm/atomic: Introduce state->obj backpointers
drm/atomic-helper: Again check modeset *before* plane states
drm/atomic-helper: Export both plane and modeset check helpers
|
|
git://anongit.freedesktop.org/drm-intel into drm-next
Misc drm patches with mostly polish patches from Thierry, with a bit of
generic mode validation from Ville and a few other oddball things.
* tag 'topic/core-stuff-2014-12-19' of git://anongit.freedesktop.org/drm-intel: (25 commits)
drm: Include drm_crtc_helper.h in DocBook
drm: Make drm_crtc_helper.h standalone includible
drm: Move IRQ related fields to proper section
drm: Remove stale comment
drm: Do basic sanity checks for user modes
drm: Perform basic sanity checks on probed modes
drm: Reorganize probed mode validation
drm/doc: Remove duplicate "by"
drm/info: Remove unused code
drm/cache: Use wbinvd helpers
drm/plane-helper: Test for plane disable earlier
drm/doc: Document drm_add_modes_noedid() usage
drm: bit of spell-check / editorializing.
drm: Prefer sizeof(type) over sizeof type
drm: Remove useless else block
drm: Remove unneeded braces for single statement blocks
drm: Do not assign in if condition
drm: Prefer kmalloc_array() over kmalloc() with multiply
drm: Prefer kcalloc() over kzalloc() with multiply
drm: Miscellaneous checkpatch whitespace cleanups
...
|
|
Disable dpm on certain problematic boards rather than
disabling dpm for the entire chip family since most
boards work fine.
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1386534
https://bugzilla.kernel.org/show_bug.cgi?id=83731
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
|
|
We need to wait for the GPUVM flush to complete. There
was some confusion as to how this mechanism was supposed
to work. The operation is not atomic. For GPU initiated
invalidations you need to read back a VM register to
introduce enough latency for the update to complete.
v2: drop gart changes
v3: just read back rather than polling
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
We need to wait for the GPUVM flush to complete. There
was some confusion as to how this mechanism was supposed
to work. The operation is not atomic. For GPU initiated
invalidations you need to read back a VM register to
introduce enough latency for the update to complete.
v2: drop gart changes
v3: just read back rather than polling
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
We need to wait for the GPUVM flush to complete. There
was some confusion as to how this mechanism was supposed
to work. The operation is not atomic. For GPU initiated
invalidations you need to read back a VM register to
introduce enough latency for the update to complete.
v2: drop gart changes
v3: just read back rather than polling
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
If not pinned, a VMA can become an eviction target just before it needs to be
executed, which breaks the internal object lifetime rules.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87399
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
The work queue couldn't reliably prevent the SW ring buffer from
overflowing, so dmesg was spammed by
kfd kfd: Interrupt ring overflow, dropping interrupt.
messages when running e.g. the Atlantis Substance demo from
https://wiki.unrealengine.com/Linux_Demos on Kaveri.
Since the SW ring buffer doesn't actually do anything at this point, just
remove it for now. When actual interrupt processing code is added to
amdkfd, it should try to do things immediately and only defer to work
queues when necessary.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
|