summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
AgeCommit message (Collapse)Author
2017-04-04drm/amdgpu: use a 64bit interval tree for VM management v2Christian König
This only makes a difference for 32-bit systems. The idea is to have a fixed virtual address space size with 4-level page tables and to minimize differences between 32 and 64-bit systems. v2: Update commit message. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-03-29drm/amdgpu: Couple small warning fixesHarry Wentland
Signed-off-by: Harry Wentland <harry.wentland@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-03-29drm/amdgpu:changes in gfx DMAframe scheme (v2)Monk Liu
1) Adapt to vulkan: Now use double SWITCH BUFFER to replace the 128 nops w/a, because when vulkan introduced, umd can insert 7 ~ 16 IBs per submit which makes 256 DW size cannot hold the whole DMAframe (if we still insert those 128 nops), CP team suggests use double SWITCH_BUFFERs, instead of tricky 128 NOPs w/a. 2) To fix the CE VM fault issue when MCBP introduced: Need one more COND_EXEC wrapping IB part (original one us for VM switch part). this change can fix vm fault issue caused by below scenario without this change: >CE passed original COND_EXEC (no MCBP issued this moment), proceed as normal. >DE catch up to this COND_EXEC, but this time MCBP issued, thus DE treats all following packages as NOP. The following VM switch packages now looks just as NOP to DE, so DE dosen't do VM flush at all. >Now CE proceeds to the first IBc, and triggers VM fault, because DE didn't do VM flush for this DMAframe. 3) change estimated alloc size for gfx9. with new DMAframe scheme, we need modify emit_frame_size for gfx9 4) No need to insert 128 nops after gfx8 vm flush anymore because there was double SWITCH_BUFFER append to vm flush, and for gfx7 we already use double SWITCH_BUFFER following after vm_flush so no change needed for it. 5) Change emit_frame_size for gfx8 v2: squash in BUG removal from Monk Signed-off-by: Monk Liu <Monk.Liu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-03-29drm/amdgpu:fix the check in cs_ib_fill for SRIOVMonk Liu
1,the check is only appliable for SRIOV GFX engine. 2,use chunk_ib instead of ib. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Ken Wang <Qingqing.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-03-29drm/amdgpu:protect cs submitMonk Liu
to prevent submit two or more IBs with PREEMPT flags. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-03-29drm/amdgpu:fix cs_ib_fillMonk Liu
should use chunk_ib instead of ib, otherwise the logic is incorrect. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Ken Wang <Qingqing.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-03-29drm/amdgpu: handle multi level PD updates V2Christian König
Update all levels of the page directory. V2: a. sub level pdes always are written to incorrect place. b. sub levels need to update regardless of parent updates. Signed-off-by: Christian König <christian.koenig@amd.com> (V1) Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (V1) Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> (V2) Acked-by: Alex Deucher <alexander.deucher@amd.com> (V2) Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-03-29drm/amdgpu: generalize page table levelChristian König
No functional change, but the base for multi level page tables. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-03-29drm/amdgpu: rename page_directory_fence to last_dir_updateChristian König
Decribes better what this is used for. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-03-29drm/amdgpu: add optional fence out-parameter to amdgpu_vm_clear_freedNicolai Hähnle
We will add the fence to freed buffer objects in a later commit, to ensure that the underlying memory can only be re-used after all references in page tables have been cleared. Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-03-29drm/amdgpu: get cs support of AMDGPU_HW_IP_UVD_ENCLeo Liu
Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-03-29drm/amdgpu: IOCTL interface for PRT support v4Junwei Zhang
Till GFX8 we can only enable PRT support globally, but with the next hardware generation we can do this on a per page basis. Keep the interface consistent by adding PRT mappings and enable support globally on current hardware when the first mapping is made. v2: disable PRT support delayed and on all error paths v3: PRT and other permissions are mutal exclusive, PRT mappings don't need a BO. v4: update PRT mappings durign CS as well, make va_flags 64bit Signed-off-by: Junwei Zhang <Jerry.Zhang@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-03-10drm/amdgpu: fix parser init error path to avoid crash in parser finiDave Airlie
If we don't reset the chunk info in the error path, the subsequent fini path will double free. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-02-23Merge tag 'v4.10-rc8' into drm-nextDave Airlie
Linux 4.10-rc8 Backmerge Linus rc8 to fix some conflicts, but also to avoid pulling it in via a fixes pull from someone.
2017-02-09drm/amdgpu: report the number of bytes moved at buffer creationSamuel Pitoiset
Like ttm_bo_validate(), ttm_bo_init() might need to move BO and the number of bytes moved by TTM should be reported. This can help the throttle buffer migration mechanism to make a better decision. v2: fix computation Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-01-27drm/amdgpu: use the num_rings variable for checking vce ringsAlex Deucher
Difference families may have different numbers of rings. Use the variable rather than a hardcoded number. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-01-27drm/amdgpu:invoke CSA functions (v2)Monk Liu
Make sure the CSA is mapped. v2: agd: rebase. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-01-23drm/amdgpu: check ring being ready before usingDing Pixel
Return success when the ring is properly initialized, otherwise return failure. Tonga SRIOV VF doesn't have UVD and VCE engines, the initialization of these IPs is bypassed. The system crashes if application submit IB to their rings which are not ready to use. It could be a common issue if IP having ring buffer is disabled for some reason on specific ASIC, so it should check the ring being ready to use. Bug: amdgpu_test crashes system on Tonga VF. Signed-off-by: Ding Pixel <Pixel.Ding@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-12-06drm/amd/amdgpu: validate the shadow BO.Alex Xie
Fixes a rare NULL pointer dereference in amdgpu_ttm_bind. The issue was found by Nicolai Haehnle. The patch was tested by Nicolai Haehnle. Signed-off-by: Alex Xie <AlexBin.Xie@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2016-11-11drm/amdgpu: remove amdgpu_cs_handle_lockupHuang Rui
In fence waiting, it never return -EDEADLK yet, so drop this function here. Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-11-11drm/amdgpu: cleanup amdgpu_cs_ioctl to make code logicality clearHuang Rui
Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-11-11Merge tag 'topic/drm-misc-2016-11-10' of ↵Dave Airlie
git://anongit.freedesktop.org/drm-intel into drm-next - better atomic state debugging from Rob - fence prep from gustavo - sumits flushed out his backlog of pending dma-buf/fence patches from various people - drm_mm leak debugging plus trying to appease Kconfig (Chris) - a few misc things all over * tag 'topic/drm-misc-2016-11-10' of git://anongit.freedesktop.org/drm-intel: (35 commits) drm: Make DRM_DEBUG_MM depend on STACKTRACE_SUPPORT drm/i915: Restrict DRM_DEBUG_MM automatic selection drm: Restrict stackdepot usage to builtin drm.ko drm/msm: module param to dump state on error irq drm/msm/mdp5: add atomic_print_state support drm/atomic: add debugfs file to dump out atomic state drm/atomic: add new drm_debug bit to dump atomic state drm: add helpers to go from plane state to drm_rect drm: add helper for printing to log or seq_file drm: helper macros to print composite types reservation: revert "wait only with non-zero timeout specified (v3)" v2 drm/ttm: fix ttm_bo_wait dma-buf/fence: revert "don't wait when specified timeout is zero" (v2) dma-buf/fence: make timeout handling in fence_default_wait consistent (v2) drm/amdgpu: add the interface of waiting multiple fences (v4) dma-buf: return index of the first signaled fence (v2) MAINTAINERS: update Sync File Framework files dma-buf/sw_sync: put fence reference from the fence creation dma-buf/sw_sync: mark sync_timeline_create() static drm: Add stackdepot include for DRM_DEBUG_MM ...
2016-11-09drm/amdgpu: add the interface of waiting multiple fences (v4)Junwei Zhang
v2: agd: rebase and squash in all the previous optimizations and changes so everything compiles. v3: squash in Slava's 32bit build fix v4: rebase on drm-next (fence -> dma_fence), squash in Monk's ioctl update patch Signed-off-by: Junwei Zhang <Jerry.Zhang@amd.com> Reviewed-by: Monk Liu <monk.liu@amd.com> Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org> [sumits: fix checkpatch warnings] Link: http://patchwork.freedesktop.org/patch/msgid/1478290570-30982-2-git-send-email-alexander.deucher@amd.com
2016-11-07Backmerge tag 'v4.9-rc4' into drm-nextDave Airlie
Linux 4.9-rc4 This is needed for nouveau development.
2016-10-28Merge tag 'topic/drm-misc-2016-10-27' of ↵Dave Airlie
git://anongit.freedesktop.org/git/drm-intel into drm-next Pull request already again to get the s/fence/dma_fence/ stuff in and allow everyone to resync. Otherwise really just misc stuff all over, and a new bridge driver. * tag 'topic/drm-misc-2016-10-27' of git://anongit.freedesktop.org/git/drm-intel: drm/bridge: fix platform_no_drv_owner.cocci warnings drm/bridge: fix semicolon.cocci warnings drm: Print some debug/error info during DP dual mode detect drm: mark drm_of_component_match_add dummy inline drm/bridge: add Silicon Image SiI8620 driver dt-bindings: add Silicon Image SiI8620 bridge bindings video: add header file for Mobile High-Definition Link (MHL) interface drm: convert DT component matching to component_match_add_release() dma-buf: Rename struct fence to dma_fence dma-buf/fence: add an lockdep_assert_held() drm/dp: Factor out helper to distinguish between branch and sink devices drm/edid: Only print the bad edid when aborting drm/msm: add missing header dependencies drm/msm/adreno: move function declarations to header file drm/i2c/tda998x: mark symbol static where possible doc: add missing docbook parameter for fence-array drm: RIP mode_config->rotation_property drm/msm/mdp5: Advertize 180 degree rotation drm/msm/mdp5: Use per-plane rotation property
2016-10-25drm/amdgpu: improve parse_cs handling a bitChristian König
This way we can use parse_cs and still keep VM mode enabled. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-and-Tested by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25drm/amdgpu: move the ring type into the funcs structure (v2)Christian König
It's constant, so it doesn't make to much sense to keep it with the variable data. v2: update vce and uvd phys mode ring structures as well Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25drm/amdgpu: move PT validation back into VM code v2Christian König
Saves a bunch of CPU cycles when swapping things back in and allows us to split the VM headers into a separate file. v2: rename parameters Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25drm/amdgpu: remove adev pointer from struct amdgpu_bo v2Christian König
It's completely pointless to have two pointers to the device in the same structure. v2: rename function to amdgpu_ttm_adev, fix typos Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25drm/amdgpu: add AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS flag v3Christian König
Add a flag noting that a BO must be created using linear VRAM and set this flag on all in kernel users where appropriate. Hopefully I haven't missed anything. v2: add it in a few more places, fix CPU mapping. v3: rename to VRAM_CONTIGUOUS, fix typo in CS code. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Tested-by: Mike Lothian <mike@fireburn.co.uk> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25dma-buf: Rename struct fence to dma_fenceChris Wilson
I plan to usurp the short name of struct fence for a core kernel struct, and so I need to rename the specialised fence/timeline for DMA operations to make room. A consensus was reached in https://lists.freedesktop.org/archives/dri-devel/2016-July/113083.html that making clear this fence applies to DMA operations was a good thing. Since then the patch has grown a bit as usage increases, so hopefully it remains a good thing! (v2...: rebase, rerun spatch) v3: Compile on msm, spotted a manual fixup that I broke. v4: Try again for msm, sorry Daniel coccinelle script: @@ @@ - struct fence + struct dma_fence @@ @@ - struct fence_ops + struct dma_fence_ops @@ @@ - struct fence_cb + struct dma_fence_cb @@ @@ - struct fence_array + struct dma_fence_array @@ @@ - enum fence_flag_bits + enum dma_fence_flag_bits @@ @@ ( - fence_init + dma_fence_init | - fence_release + dma_fence_release | - fence_free + dma_fence_free | - fence_get + dma_fence_get | - fence_get_rcu + dma_fence_get_rcu | - fence_put + dma_fence_put | - fence_signal + dma_fence_signal | - fence_signal_locked + dma_fence_signal_locked | - fence_default_wait + dma_fence_default_wait | - fence_add_callback + dma_fence_add_callback | - fence_remove_callback + dma_fence_remove_callback | - fence_enable_sw_signaling + dma_fence_enable_sw_signaling | - fence_is_signaled_locked + dma_fence_is_signaled_locked | - fence_is_signaled + dma_fence_is_signaled | - fence_is_later + dma_fence_is_later | - fence_later + dma_fence_later | - fence_wait_timeout + dma_fence_wait_timeout | - fence_wait_any_timeout + dma_fence_wait_any_timeout | - fence_wait + dma_fence_wait | - fence_context_alloc + dma_fence_context_alloc | - fence_array_create + dma_fence_array_create | - to_fence_array + to_dma_fence_array | - fence_is_array + dma_fence_is_array | - trace_fence_emit + trace_dma_fence_emit | - FENCE_TRACE + DMA_FENCE_TRACE | - FENCE_WARN + DMA_FENCE_WARN | - FENCE_ERR + DMA_FENCE_ERR ) ( ... ) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk> Acked-by: Sumit Semwal <sumit.semwal@linaro.org> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20161025120045.28839-1-chris@chris-wilson.co.uk
2016-10-21drm/amdgpu: avoid drm error log during S3 on RHEL7.3jimqu
Signed-off-by: JimQu <Jim.Qu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-09-28drm/amdgpu: add a custom GTT memory manager v2Christian König
Only allocate address space when we really need it. v2: fix a typo, add correct function description, stop leaking the node in the error case. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-09-14drm/amdgpu: validate size and offset of user fence BOChristian König
We need to validate the offset to make sure that we don't write after the BO. Additional to that a page should be enough and can make address space handling much easier. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-09-14drm/amdgpu: mark symbols static where possibleBaoyou Xie
We get a few warnings when building kernel with W=1: drivers/gpu/drm/amd/amdgpu/cz_smc.c:51:5: warning: no previous prototype for 'cz_send_msg_to_smc_async' [-Wmissing-prototypes] drivers/gpu/drm/amd/amdgpu/cz_smc.c:143:5: warning: no previous prototype for 'cz_write_smc_sram_dword' [-Wmissing-prototypes] drivers/gpu/drm/amd/amdgpu/iceland_smc.c:124:6: warning: no previous prototype for 'iceland_start_smc' [-Wmissing-prototypes] drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c:3926:6: warning: no previous prototype for 'gfx_v8_0_rlc_stop' [-Wmissing-prototypes] drivers/gpu/drm/amd/amdgpu/amdgpu_job.c:94:6: warning: no previous prototype for 'amdgpu_job_free_cb' [-Wmissing-prototypes] .... In fact, these functions are only used in the file in which they are declared and don't need a declaration, but can be made static. So this patch marks these functions with 'static'. Reviewed-by: Christian König <christian.koenig@amd.com> Acked-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-09-14drm/amdgpu: bind GTT on demandChristian König
We don't really need the GTT table any more most of the time. So bind it only on demand. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-09-14drm/amdgpu:implement CONTEXT_CONTROL (v5)Monk Liu
v1: for gfx8, use CONTEXT_CONTROL package to dynamically skip preamble CEIB and other load_xxx command in sequence. v2: support GFX7 as well. remove cntxcntl in compute ring funcs because CPC doesn't support this packet. v3: fix reduntant judgement in cntxcntl. v4: some cleanups, don't change cs_submit() v5: keep old MESA supported & bump up KMS version. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Ack-by: Chunming Zhou <David1.Zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-09-12drm/amdgpu: change job->ctx field nameMonk Liu
job->ctx actually is a fence_context of the entity it belongs to, naming it as ctx is too vague, and we'll need add amdgpu_ctx into the job structure later. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-09-02drm/amdgpu: prevent command submission failures under memory pressure v2Christian König
As last resort try to evict BOs from the current working set into other memory domains. This effectively prevents command submission failures when VM page tables have been swapped out. v2: fix typos Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-09-02drm/amdgpu: only try again if we actually run into -ENOMEMChristian König
All other errors can't be fixed by using a different memory domain. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-08-30drm/amdgpu: throttle buffer migrations at CS using a fixed MBps limit (v2)Marek Olšák
The old mechanism used a per-submission limit that didn't take previous submissions within the same time frame into account. It also filled VRAM slowly when VRAM usage dropped due to a big eviction or buffer deallocation. This new method establishes a configurable MBps limit that is obeyed when VRAM usage is very high. When VRAM usage is not very high, it gives the driver the freedom to fill it quickly. The result is more consistent performance. It can't keep the BO move rate low if lots of evictions are happening due to VRAM fragmentation, or if a big buffer is being migrated. The amdgpu.moverate parameter can be used to set a non-default limit. Measurements can be done to find out which amdgpu.moverate setting gives the best results. Mainly APUs and cards with small VRAM will benefit from this. For F1 2015, anything with 2 GB VRAM or less will benefit. Some benchmark results - F1 2015 (Tonga 2GB): Limit MinFPS AvgFPS Old code: 14 32.6 128 MB/s: 28 41 64 MB/s: 15.5 43 32 MB/s: 28.7 43.4 8 MB/s: 27.8 44.4 8 MB/s: 21.9 42.8 (different run) Random drops in Min FPS can still occur (due to fragmented VRAM?), but the average FPS is much better. 8 MB/s is probably a good limit for this game & the current VRAM management. The random FPS drops are still to be tackled. v2: use a spinlock Signed-off-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-08-22drm/amdgpu: cleanup amdgpu_vm_bo_update paramsChristian König
Make it more obvious what we are doing here. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-08-16drm/amdgpu: validate shadow as well when validating boChunming Zhou
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-08-08drm/amdgpu: print more accurate error messages on IB submission failureMarek Olšák
It's useful for debugging. Signed-off-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07drm/amdgpu: remove fence parameter from amd_sched_job_initChristian König
We return the fence as part of the job structur anyway, no need to do this twice. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07drm/amdgpu: earlier free SA resourcesChristian König
Keep the time we don't have a fence associated with the resource smaller. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07drm/amdgpu: fix user fence handling once moreChristian König
Same problem as with the VM page tables. The user fence address must be determined before the job is scheduled, not when the IB is executed. This fixes a security problem where user fences could be used to overwrite any part of VRAM. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07drm/amdgpu: don't update page tables for VM emulationChristian König
It's just overhead to do so and allocating a VMID when we don't need one is actually a bit dangerous. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07drm/amdgpu: validate VM PTs only on evictionChristian König
We don't need to validate them again if the eviction counter didn't changed. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07drm/amdgpu: save the PD addr before scheduling the jobChristian König
When we pipeline evictions the page directory could already be moving somewhere else when grab_id is called. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>