Techniques to Optimise Multi-threaded Data Building During Game Development
- keys in sorted array - giving O(log n) access. Speaker notes: JOB SYSTEM • Schedules jobs on many worker threads • Uses a Counter to synchronise between Jobs ▪ Leading "wait" counter marks job as runnable when finished • Inspired by Naughty Dog and CD Projekt RED job systems. Jobs run on worker threads. Number of worker threads limited to logical processors. Counters used to sync between jobs. Each job uses … std::vector owned_nodes_; std::vector intersecting_nodes_; }; GridCell stores sorted arrays of GridNodes • Array of owned nodes - cell contains
0 credits | 99 pages | 2.40 MB | 5 months ago

Rethinking Task Based Concurrency and Parallelism for Low Latency C++
with their own logic and, if needed, data, queue, etc) ○ A Signal Tree (which has as many leaf nodes as there are work contracts in the group) ● Threads are brought to the “task” rather than the “task” … throughput than the average MPMC queue at scale ○ Approximately 1/2N memory requirement (N = number of nodes) Alternative: Work Contracts [Diagram: Work Contract Group; Work Contract (Logic); Work Contract; Work Contract]
0 credits | 142 pages | 2.80 MB | 5 months ago

Design patterns for error handling in C++ programs using parallel algorithms and executors
detect and handle recoverable errors. What is “parallel”? • Use multiple hardware resources – Nodes, cores, SIMD, … • To accomplish >1 work item at the same time • To improve performance – Latency … execution; handling constrains order • Errors could lead to deadlock (waiting forever) – e.g., 1 worker drops out before collective synchronization • Correct handling requires communication – Data movement
0 credits | 32 pages | 883.27 KB | 5 months ago

Building a Coroutine-Based Job System Without Standard Library
frames on the CPU timeline. COROUTINE JOB SYSTEM [Diagram: isa::await_suspend; Scheduler; Worker (x4); Job Queue (x3); Job; coroutine_handle; Job::Job; *isa == initial_suspend_awaitable; final_suspend; get_return_object] And voila! This is our system. … glues our system and coroutine semantics together.
0 credits | 120 pages | 2.20 MB | 5 months ago

Taro: Task graph-based Asynchronous Programming Using C++ Coroutine
[Diagram: tasks A, B, C, D; HPQ; LPQ; Worker 1; Worker 2; GPU stream; Block, Poll, Enqueue, Steal, Offload, Callback, Polling, Wait, Query] • Assume two CPU threads (workers) and one GPU stream • Each worker has: High-priority queue (HPQ): store suspended tasks; Low-priority queue (LPQ): store … the same worker as soon as possible to avoid another worker stealing the task. Taro’s Scheduler. Taro: https://github.com/dian-lun-lin/taro
0 credits | 84 pages | 8.82 MB | 5 months ago

The Roles of Symmetry And Orthogonality In Design
know what it solves and how to defend against edge cases) [Diagram: Thread-Stealing Work Queue; worker (x3); Work Engine; Foo; Subsystem A; Bar; Subsystem B; Baz] Charley Bay - charleyb123 at gmail dot … as special handling to execute one work item many times) … work progress (but indirect monitoring can be implemented)
0 credits | 151 pages | 3.20 MB | 5 months ago

Writing Python Bindings for C++ Libraries: Easy-to-use Performance
monitoring code ● Let’s try to add more worker threads to this. Complexity level 2: Multiple worker threads ● Worker threads cannot call the callback handlers … the worker threads and calls the callbacks? ■ Inefficient! Do you know why? ○ Main thread releases GIL before blocking, workers acquire GIL before callback ● Seems scalable ● C++ dispatching thread doesn’t even need GIL ○ Python is not inherently blocked! Complexity level 3: Python background tasks
0 credits | 118 pages | 2.18 MB | 5 months ago

Tracy: A Profiler You Don't Want to Miss
on_scheduler_entry(bool is_worker) override { if (is_worker) { int tid = tbb::this_task_arena::current_thread_index(); tracy::SetThreadName(std::string("tbb worker #").append(std::to_string(tid)).c_str()); } } void on_scheduler_exit(bool is_worker) override { } }; Tips & Tricks: Name your threads! … example: don’t call memcpy() directly… call some my_memcpy() which … (it’s basically a stackable clock/sustain plot!) e.g.: task starts in main thread, moves between worker threads, and gets retired on a reclaimer thread. Tips & Tricks: Begin/End (and track) zone on different
0 credits | 84 pages | 8.70 MB | 5 months ago

Tracy: A Profiler You Don't Want to Miss
on_scheduler_entry(bool is_worker) override { if (is_worker) { int tid = tbb::this_task_arena::current_thread_index(); tracy::SetThreadName(std::string("tbb worker #").append(std::to_string(tid)).c_str()); } } void on_scheduler_exit(bool is_worker) override { } }; Tips & Tricks: Name your threads! … example: don’t call memcpy() directly… call some my_memcpy() which … (it’s basically a stackable stairstep plot!) e.g.: task starts in main thread, moves between worker threads, and gets retired on a reclaimer thread. Tips & Tricks: Begin/End (and track) zone on different
0 credits | 85 pages | 6.51 MB | 5 months ago

Back to Basics: Concurrency
Variables (1/4) ● Perhaps a stranger thing when working with threads ○ Idea: ■ Given 2 threads (a worker and reporter), we want one thread to wait on the result of the other ■ Condition_variable allows …
0 credits | 141 pages | 6.02 MB | 5 months ago
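
The worker/reporter idea quoted in the Back to Basics: Concurrency snippet can be illustrated with a minimal C++ sketch. This is not code from the talk: the names `SharedResult`, `worker`, `reporter`, and `run_worker_and_reporter`, and the stand-in computation `6 * 7`, are assumptions made up for illustration. It only shows one common way to make a reporter thread wait on a worker's result with `std::condition_variable`.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>

// Hypothetical shared state for this sketch; not taken from the slides.
struct SharedResult {
    std::mutex mutex;
    std::condition_variable ready_cv;
    bool ready = false;  // predicate, only touched while holding mutex
    int value = 0;       // result produced by the worker
};

// Worker: computes a result, publishes it under the lock, then notifies.
void worker(SharedResult& shared) {
    int computed = 6 * 7;  // stand-in for real work
    {
        std::lock_guard<std::mutex> lock(shared.mutex);
        shared.value = computed;
        shared.ready = true;
    }
    // Notifying after releasing the lock avoids waking the waiter
    // only for it to block again on the mutex.
    shared.ready_cv.notify_one();
}

// Reporter: blocks until the worker has published, then reads the result.
int reporter(SharedResult& shared) {
    std::unique_lock<std::mutex> lock(shared.mutex);
    // The predicate overload re-checks `ready` on every wake-up, so
    // spurious wake-ups and early notifications are both handled.
    shared.ready_cv.wait(lock, [&shared] { return shared.ready; });
    assert(shared.ready);
    return shared.value;
}

// Runs one worker thread and reports its result on the calling thread.
int run_worker_and_reporter() {
    SharedResult shared;
    std::thread worker_thread(worker, std::ref(shared));
    int result = reporter(shared);
    worker_thread.join();
    return result;
}
```

Calling `run_worker_and_reporter()` from `main` returns the worker's value once it has been published. The boolean predicate is the important design choice here: without it, a notification sent before the reporter starts waiting would be lost and the reporter could block forever.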
122 results in total