C++20: An (Almost) Complete Overview
constexpr string & vector ... Concurrency Changes: Atomic Smart Pointers; Joining & Cancellable Threads; The C++20 Synchronization Library (semaphores, efficient atomic waiting, latches, and barriers) ... Joining & Cancellable Threads: std::jthread supports cooperative cancellation; its destructor automatically asks the thread to cancel and calls join() ... Cancelling threads: std::stop_token supports actively checking for a stop request and can be used with condition_variable_any; std::stop_source is used to request a thread to stop execution. Stop requests ...
85 pages | 512.18 KB | 5 months ago

hazard pointer synchronous reclamation
... custom domain hazard pointers. • Even worse if many instances of Container are used by thousands of threads. Is there a good solution using the default domain? Is there any good solution? Fast, Scalable, Robust ... shards don't even need to be reentrant. Side-effect: Cohort objects are reclaimed only by related threads (threads that retired objects to the specific cohort). No reclamation while holding any domain shard ...
31 pages | 856.38 KB | 5 months ago

Accelerating Tokio with Hardware - 戴翔
Queue-Based Modules in Tokio • Channel • Scheduler • Tokio uses Channel for communication between threads (incl. pthread, co-routines). • Channel allows a unidirectional flow of information between two ... and 3 consumer cores ... Throughput / Relative Value ... MPSC Test Scenario: an MPSC channel allows many threads to send to one place. Conclusion: • DLB channel is more stable than SW channels • Core count ...
17 pages | 1.66 MB | 1 year ago

Working with Asynchrony Generically: A Tour of C++ Executors
... ); }) | unifex::repeat_effect(); } Accept requests on low-latency threads. Process the requests on the worker threads. ... EXAMPLE: TRANSITIONING EXECUTION CONTEXT namespace ex = std::execution; ...
121 pages | 7.73 MB | 5 months ago

C++ High-Performance Parallel Programming and Optimization - Slides - 05 Multithreaded Programming Since C++11
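... But when we try to compile the code above directly, an error occurs at link time. • It turns out that the implementation of std::thread is based on pthread under the hood. • Fix: link Threads::Threads in CMakeLists.txt ... With multithreading: handling requests asynchronously • With multiple threads, the file download and the user interaction run in two separate threads, simultaneously and independently. The program can therefore respond to user requests while the download is in progress, which improves the experience. • But there is a problem: after I finish typing ...
79 pages | 14.11 MB | 1 year ago

C++ High-Performance Parallel Programming and Optimization - Slides - 07 Memory Access Optimization Made Accessible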
... result in up to 11 times the performance on the Intel Xeon Phi coprocessor. When using multiple threads (which the Intel Xeon Phi is designed for), using the Morton ordering, performance is up to 2.44 times faster for a large matrix transpose. Even on an Intel Xeon CPU, by using up to 32 threads, comparing the naïve implementation to one using Morton ordering, the speedup was up to over 4X while ...
147 pages | 18.88 MB | 1 year ago

Lock-Free Atomic Shared Pointers Without a Split Reference Count? It Can Be Done!
Workload: Proportion of reads vs writes • Hotness: Does the data fit in cache? • Contention: How many threads operate on the same location? • We will benchmark the lock-free stack implementation, using different ...
45 pages | 5.12 MB | 5 months ago