diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2020-01-30 08:04:01 -0800 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2020-01-30 08:04:01 -0800 |
commit | 9f68e3655aae6d49d6ba05dd263f99f33c2567af (patch) | |
tree | 42c2c4579c4acbbb456695326af4f4ad8f402813 /drivers/gpu/drm/i915/i915_active.c | |
parent | 4cadc60d6bcfee9c626d4b55e9dc1475d21ad3bb (diff) | |
parent | d47c7f06268082bc0082a15297a07c0da59b0fc4 (diff) |
Merge tag 'drm-next-2020-01-30' of git://anongit.freedesktop.org/drm/drm
Pull drm updates from Davbe Airlie:
"This is the main pull request for graphics for 5.6. Usual selection of
changes all over.
I've got one outstanding vmwgfx pull that touches mm so kept it
separate until after all of this lands. I'll try and get it to you
soon after this, but it might be early next week (nothing wrong with
code, just my schedule is messy)
This also hits a lot of fbdev drivers with some cleanups.
Other notables:
- vulkan timeline semaphore support added to syncobjs
- nouveau turing secureboot/graphics support
- Displayport MST display stream compression support
Detailed summary:
uapi:
- dma-buf heaps added (and fixed)
- command line add support for panel oreientation
- command line allow overriding penguin count
drm:
- mipi dsi definition updates
- lockdep annotations for dma_resv
- remove dma-buf kmap/kunmap support
- constify fb_ops in all fbdev drivers
- MST fix for daisy chained hotplug-
- CTA-861-G modes with VIC >= 193 added
- fix drm_panel_of_backlight export
- LVDS decoder support
- more device based logging support
- scanline alighment for dumb buffers
- MST DSC helpers
scheduler:
- documentation fixes
- job distribution improvements
panel:
- Logic PD type 28 panel support
- Jimax8729d MIPI-DSI
- igenic JZ4770
- generic DSI devicetree bindings
- sony acx424AKP panel
- Leadtek LTK500HD1829
- xinpeng XPP055C272
- AUO B116XAK01
- GiantPlus GPM940B0
- BOE NV140FHM-N49
- Satoz SAT050AT40H12R2
- Sharp LS020B1DD01D panels.
ttm:
- use blocking WW lock
i915:
- hw/uapi state separation
- Lock annotation improvements
- selftest improvements
- ICL/TGL DSI VDSC support
- VBT parsing improvments
- Display refactoring
- DSI updates + fixes
- HDCP 2.2 for CFL
- CML PCI ID fixes
- GLK+ fbc fix
- PSR fixes
- GEN/GT refactor improvments
- DP MST fixes
- switch context id alloc to xarray
- workaround updates
- LMEM debugfs support
- tiled monitor fixes
- ICL+ clock gating programming removed
- DP MST disable sequence fixed
- LMEM discontiguous object maps
- prefaulting for discontiguous objects
- use LMEM for dumb buffers if possible
- add LMEM mmap support
amdgpu:
- enable sync object timelines for vulkan
- MST atomic routines
- enable MST DSC support
- add DMCUB display microengine support
- DC OEM i2c support
- Renoir DC fixes
- Initial HDCP 2.x support
- BACO support for Arcturus
- Use BACO for runtime PM power save
- gfxoff on navi10
- gfx10 golden updates and fixes
- DCN support on POWER
- GFXOFF for raven1 refresh
- MM engine idle handlers cleanup
- 10bpc EDP panel fixes
- renoir watermark fixes
- SR-IOV fixes
- Arcturus VCN fixes
- GDDR6 training fixes
- freesync fixes
- Pollock support
amdkfd:
- unify more codepath with amdgpu
- use KIQ to setup HIQ rather than MMIO
radeon:
- fix vma fault handler race
- PPC DMA fix
- register check fixes for r100/r200
nouveau:
- mmap_sem vs dma_resv fix
- rewrite the ACR secure boot code for Turing
- TU10x graphics engine support (TU11x pending)
- Page kind mapping for turing
- 10-bit LUT support
- GP10B Tegra fixes
- HD audio regression fix
hisilicon/hibmc:
- use generic fbdev code and helpers
rockchip:
- dsi/px30 support
virtio:
- fb damage support
- static some functions
vc4:
- use dma_resv lock wrappers
msm:
- use dma_resv lock wrappers
- sc7180 display + DSI support
- a618 support
- UBWC support improvements
vmwgfx:
- updates + new logging uapi
exynos:
- enable/disable callback cleanups
etnaviv:
- use dma_resv lock wrappers
atmel-hlcdc:
- clock fixes
mediatek:
- cmdq support
- non-smooth cursor fixes
- ctm property support
sun4i:
- suspend support
- A64 mipi dsi support
rcar-du:
- Color management module support
- LVDS encoder dual-link support
- R8A77980 support
analogic:
- add support for an6345
ast:
- atomic modeset support
- primary plane garbage fix
arcgpu:
- fixes for fourcc handling
tegra:
- minor fixes and improvments
mcde:
- vblank support
meson:
- OSD1 plane AFBC commit
gma500:
- add pageflip support
- reomve global drm_dev
komeda:
- tweak debugfs output
- d32 support
- runtime PM suppotr
udl:
- use generic shmem helpers
- cleanup and fixes"
* tag 'drm-next-2020-01-30' of git://anongit.freedesktop.org/drm/drm: (1998 commits)
drm/nouveau/fb/gp102-: allow module to load even when scrubber binary is missing
drm/nouveau/acr: return error when registering LSF if ACR not supported
drm/nouveau/disp/gv100-: not all channel types support reporting error codes
drm/nouveau/disp/nv50-: prevent oops when no channel method map provided
drm/nouveau: support synchronous pushbuf submission
drm/nouveau: signal pending fences when channel has been killed
drm/nouveau: reject attempts to submit to dead channels
drm/nouveau: zero vma pointer even if we only unreference it rather than free
drm/nouveau: Add HD-audio component notifier support
drm/nouveau: fix build error without CONFIG_IOMMU_API
drm/nouveau/kms/nv04: remove set but not used variable 'width'
drm/nouveau/kms/nv50: remove set but not unused variable 'nv_connector'
drm/nouveau/mmu: fix comptag memory leak
drm/nouveau/gr/gp10b: Use gp100_grctx and gp100_gr_zbc
drm/nouveau/pmu/gm20b,gp10b: Fix Falcon bootstrapping
drm/exynos: Rename Exynos to lowercase
drm/exynos: change callback names
drm/mst: Don't do atomic checks over disabled managers
drm/amdgpu: add the lost mutex_init back
drm/amd/display: skip opp blank or unblank if test pattern enabled
...
Diffstat (limited to 'drivers/gpu/drm/i915/i915_active.c')
-rw-r--r-- | drivers/gpu/drm/i915/i915_active.c | 142 |
1 files changed, 84 insertions, 58 deletions
diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c index a19e7d89bc8a..f3da5c06f331 100644 --- a/drivers/gpu/drm/i915/i915_active.c +++ b/drivers/gpu/drm/i915/i915_active.c @@ -6,6 +6,7 @@ #include <linux/debugobjects.h> +#include "gt/intel_context.h" #include "gt/intel_engine_pm.h" #include "gt/intel_ring.h" @@ -91,10 +92,9 @@ static void debug_active_init(struct i915_active *ref) static void debug_active_activate(struct i915_active *ref) { - spin_lock_irq(&ref->tree_lock); + lockdep_assert_held(&ref->tree_lock); if (!atomic_read(&ref->count)) /* before the first inc */ debug_object_activate(ref, &active_debug_desc); - spin_unlock_irq(&ref->tree_lock); } static void debug_active_deactivate(struct i915_active *ref) @@ -186,18 +186,33 @@ active_retire(struct i915_active *ref) __active_retire(ref); } +static inline struct dma_fence ** +__active_fence_slot(struct i915_active_fence *active) +{ + return (struct dma_fence ** __force)&active->fence; +} + +static inline bool +active_fence_cb(struct dma_fence *fence, struct dma_fence_cb *cb) +{ + struct i915_active_fence *active = + container_of(cb, typeof(*active), cb); + + return cmpxchg(__active_fence_slot(active), fence, NULL) == fence; +} + static void node_retire(struct dma_fence *fence, struct dma_fence_cb *cb) { - i915_active_fence_cb(fence, cb); - active_retire(container_of(cb, struct active_node, base.cb)->ref); + if (active_fence_cb(fence, cb)) + active_retire(container_of(cb, struct active_node, base.cb)->ref); } static void excl_retire(struct dma_fence *fence, struct dma_fence_cb *cb) { - i915_active_fence_cb(fence, cb); - active_retire(container_of(cb, struct i915_active, excl.cb)); + if (active_fence_cb(fence, cb)) + active_retire(container_of(cb, struct i915_active, excl.cb)); } static struct i915_active_fence * @@ -244,7 +259,7 @@ active_instance(struct i915_active *ref, struct intel_timeline *tl) } node = prealloc; - __i915_active_fence_init(&node->base, &tl->mutex, NULL, node_retire); + __i915_active_fence_init(&node->base, NULL, node_retire); node->ref = ref; node->timeline = idx; @@ -262,7 +277,8 @@ out: void __i915_active_init(struct i915_active *ref, int (*active)(struct i915_active *ref), void (*retire)(struct i915_active *ref), - struct lock_class_key *key) + struct lock_class_key *mkey, + struct lock_class_key *wkey) { unsigned long bits; @@ -280,9 +296,12 @@ void __i915_active_init(struct i915_active *ref, init_llist_head(&ref->preallocated_barriers); atomic_set(&ref->count, 0); - __mutex_init(&ref->mutex, "i915_active", key); - __i915_active_fence_init(&ref->excl, &ref->mutex, NULL, excl_retire); + __mutex_init(&ref->mutex, "i915_active", mkey); + __i915_active_fence_init(&ref->excl, NULL, excl_retire); INIT_WORK(&ref->work, active_work); +#if IS_ENABLED(CONFIG_LOCKDEP) + lockdep_init_map(&ref->work.lockdep_map, "i915_active.work", wkey, 0); +#endif } static bool ____active_del_barrier(struct i915_active *ref, @@ -376,15 +395,8 @@ void i915_active_set_exclusive(struct i915_active *ref, struct dma_fence *f) /* We expect the caller to manage the exclusive timeline ordering */ GEM_BUG_ON(i915_active_is_idle(ref)); - /* - * As we don't know which mutex the caller is using, we told a small - * lie to the debug code that it is using the i915_active.mutex; - * and now we must stick to that lie. - */ - mutex_acquire(&ref->mutex.dep_map, 0, 0, _THIS_IP_); if (!__i915_active_fence_set(&ref->excl, f)) atomic_inc(&ref->count); - mutex_release(&ref->mutex.dep_map, _THIS_IP_); } bool i915_active_acquire_if_busy(struct i915_active *ref) @@ -407,8 +419,10 @@ int i915_active_acquire(struct i915_active *ref) if (!atomic_read(&ref->count) && ref->active) err = ref->active(ref); if (!err) { + spin_lock_irq(&ref->tree_lock); /* vs __active_retire() */ debug_active_activate(ref); atomic_inc(&ref->count); + spin_unlock_irq(&ref->tree_lock); } mutex_unlock(&ref->mutex); @@ -461,6 +475,7 @@ int i915_active_wait(struct i915_active *ref) if (wait_var_event_interruptible(ref, i915_active_is_idle(ref))) return -EINTR; + flush_work(&ref->work); return 0; } @@ -590,12 +605,15 @@ int i915_active_acquire_preallocate_barrier(struct i915_active *ref, struct intel_engine_cs *engine) { intel_engine_mask_t tmp, mask = engine->mask; + struct llist_node *pos = NULL, *next; struct intel_gt *gt = engine->gt; - struct llist_node *pos, *next; int err; GEM_BUG_ON(i915_active_is_idle(ref)); - GEM_BUG_ON(!llist_empty(&ref->preallocated_barriers)); + + /* Wait until the previous preallocation is completed */ + while (!llist_empty(&ref->preallocated_barriers)) + cond_resched(); /* * Preallocate a node for each physical engine supporting the target @@ -615,10 +633,6 @@ int i915_active_acquire_preallocate_barrier(struct i915_active *ref, goto unwind; } -#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM) - node->base.lock = - &engine->kernel_context->timeline->mutex; -#endif RCU_INIT_POINTER(node->base.fence, NULL); node->base.cb.func = node_retire; node->timeline = idx; @@ -639,18 +653,27 @@ int i915_active_acquire_preallocate_barrier(struct i915_active *ref, node->base.cb.node.prev = (void *)engine; atomic_inc(&ref->count); } + GEM_BUG_ON(rcu_access_pointer(node->base.fence) != ERR_PTR(-EAGAIN)); GEM_BUG_ON(barrier_to_engine(node) != engine); - llist_add(barrier_to_ll(node), &ref->preallocated_barriers); + next = barrier_to_ll(node); + next->next = pos; + if (!pos) + pos = next; intel_engine_pm_get(engine); } + GEM_BUG_ON(!llist_empty(&ref->preallocated_barriers)); + llist_add_batch(next, pos, &ref->preallocated_barriers); + return 0; unwind: - llist_for_each_safe(pos, next, take_preallocated_barriers(ref)) { + while (pos) { struct active_node *node = barrier_from_ll(pos); + pos = pos->next; + atomic_dec(&ref->count); intel_engine_pm_put(barrier_to_engine(node)); @@ -702,12 +725,18 @@ void i915_active_acquire_barrier(struct i915_active *ref) } } +static struct dma_fence **ll_to_fence_slot(struct llist_node *node) +{ + return __active_fence_slot(&barrier_from_ll(node)->base); +} + void i915_request_add_active_barriers(struct i915_request *rq) { struct intel_engine_cs *engine = rq->engine; struct llist_node *node, *next; unsigned long flags; + GEM_BUG_ON(!intel_context_is_barrier(rq->context)); GEM_BUG_ON(intel_engine_is_virtual(engine)); GEM_BUG_ON(i915_request_timeline(rq) != engine->kernel_context->timeline); @@ -721,19 +750,13 @@ void i915_request_add_active_barriers(struct i915_request *rq) */ spin_lock_irqsave(&rq->lock, flags); llist_for_each_safe(node, next, node) { - RCU_INIT_POINTER(barrier_from_ll(node)->base.fence, &rq->fence); - smp_wmb(); /* serialise with reuse_idle_barrier */ + /* serialise with reuse_idle_barrier */ + smp_store_mb(*ll_to_fence_slot(node), &rq->fence); list_add_tail((struct list_head *)node, &rq->fence.cb_list); } spin_unlock_irqrestore(&rq->lock, flags); } -#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM) -#define active_is_held(active) lockdep_is_held((active)->lock) -#else -#define active_is_held(active) true -#endif - /* * __i915_active_fence_set: Update the last active fence along its timeline * @active: the active tracker @@ -744,7 +767,7 @@ void i915_request_add_active_barriers(struct i915_request *rq) * fence onto this one. Returns the previous fence (if not already completed), * which the caller must ensure is executed before the new fence. To ensure * that the order of fences within the timeline of the i915_active_fence is - * maintained, it must be locked by the caller. + * understood, it should be locked by the caller. */ struct dma_fence * __i915_active_fence_set(struct i915_active_fence *active, @@ -753,34 +776,41 @@ __i915_active_fence_set(struct i915_active_fence *active, struct dma_fence *prev; unsigned long flags; - /* NB: must be serialised by an outer timeline mutex (active->lock) */ - spin_lock_irqsave(fence->lock, flags); + if (fence == rcu_access_pointer(active->fence)) + return fence; + GEM_BUG_ON(test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)); - prev = rcu_dereference_protected(active->fence, active_is_held(active)); + /* + * Consider that we have two threads arriving (A and B), with + * C already resident as the active->fence. + * + * A does the xchg first, and so it sees C or NULL depending + * on the timing of the interrupt handler. If it is NULL, the + * previous fence must have been signaled and we know that + * we are first on the timeline. If it is still present, + * we acquire the lock on that fence and serialise with the interrupt + * handler, in the process removing it from any future interrupt + * callback. A will then wait on C before executing (if present). + * + * As B is second, it sees A as the previous fence and so waits for + * it to complete its transition and takes over the occupancy for + * itself -- remembering that it needs to wait on A before executing. + * + * Note the strong ordering of the timeline also provides consistent + * nesting rules for the fence->lock; the inner lock is always the + * older lock. + */ + spin_lock_irqsave(fence->lock, flags); + prev = xchg(__active_fence_slot(active), fence); if (prev) { GEM_BUG_ON(prev == fence); spin_lock_nested(prev->lock, SINGLE_DEPTH_NESTING); __list_del_entry(&active->cb.node); spin_unlock(prev->lock); /* serialise with prev->cb_list */ - - /* - * active->fence is reset by the callback from inside - * interrupt context. We need to serialise our list - * manipulation with the fence->lock to prevent the prev - * being lost inside an interrupt (it can't be replaced as - * no other caller is allowed to enter __i915_active_fence_set - * as we hold the timeline lock). After serialising with - * the callback, we need to double check which ran first, - * our list_del() [decoupling prev from the callback] or - * the callback... - */ - prev = rcu_access_pointer(active->fence); } - - rcu_assign_pointer(active->fence, fence); + GEM_BUG_ON(rcu_access_pointer(active->fence) != fence); list_add_tail(&active->cb.node, &fence->cb_list); - spin_unlock_irqrestore(fence->lock, flags); return prev; @@ -792,10 +822,6 @@ int i915_active_fence_set(struct i915_active_fence *active, struct dma_fence *fence; int err = 0; -#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM) - lockdep_assert_held(active->lock); -#endif - /* Must maintain timeline ordering wrt previous active requests */ rcu_read_lock(); fence = __i915_active_fence_set(active, &rq->fence); @@ -812,7 +838,7 @@ int i915_active_fence_set(struct i915_active_fence *active, void i915_active_noop(struct dma_fence *fence, struct dma_fence_cb *cb) { - i915_active_fence_cb(fence, cb); + active_fence_cb(fence, cb); } #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST) |