Building a Coroutine-Based Job System Without Standard Library
... frames on the CPU timeline.
[Diagram, "Coroutine Job System": isa::await_suspend, Scheduler, four Workers, Job Queues, Job, coroutine_handle, Job::Job, *isa == initial_suspend_awaitable, final_suspend]
And voila! This is our system. ... glues our system and coroutine semantics together. ... get_return_object
0 credits | 120 pages | 2.20 MB | 5 months ago

Taro: Task graph-based Asynchronous Programming Using C++ Coroutine
[Scheduler diagram: tasks A, B, C, D; HPQ; LPQ; Worker 1; Worker 2; GPU stream; actions Block, Poll, Enqueue, Steal, Offload, Callback, Polling, Wait, Query]
• Assume two CPU threads (workers) and one GPU stream
• Each worker has: High-priority ... the same worker as soon as possible to avoid another worker stealing the task
Taro's Scheduler (Taro: https://github.com/dian-lun-lin/taro)
• Assume two CPU threads (workers) and one GPU stream
• Each worker has:
  High-priority queue (HPQ): store suspended tasks
  Low-priority queue (LPQ): store ...
0 credits | 84 pages | 8.82 MB | 5 months ago

TiDB v8.3 Documentation
Variable name | Change type | Description
... reorg_batch_size | Modified | Adds the SESSION scope.
tidb_ddl_reorg_worker_cnt | Modified | Adds the SESSION scope.
tidb_gc ...
partial_worker:{wall_time:60.660079ms, concurrency:5, task_num:293, tot_wait:262.536669ms, tot_exec:40.171833ms, tot_time:302.827753ms, max:60.636886ms, p95:60.636886ms}, final_worker:{wall_time:60 ...
Look at the TTL Scan Worker Time By Phase and TTL Delete Worker Time By Phase panels. If the scan worker is in the dispatch phase for a large percentage of time and the delete worker is rarely in the idle ...
0 credits | 6606 pages | 109.48 MB | 9 months ago

TiDB v8.2 Documentation
partial_worker:{wall_time:60.660079ms, concurrency:5, task_num:293, tot_wait:262.536669ms, tot_exec:40.171833ms, tot_time:302.827753ms, max:60.636886ms, p95:60.636886ms}, final_worker:{wall_time:60 ...
Look at the TTL Scan Worker Time By Phase and TTL Delete Worker Time By Phase panels. If the scan worker is in the dispatch phase for a large percentage of time and the delete worker is rarely in the idle ... then the scan worker is waiting for the delete worker to finish the deletion. If the cluster resources are still free at this point, you can consider increasing tidb_ttl_delete_worker_count to increase ...
0 credits | 6549 pages | 108.77 MB | 9 months ago

TiDB v8.1 Documentation
partial_worker:{wall_time:60.660079ms, concurrency:5, task_num:293, tot_wait:262.536669ms, tot_exec:40.171833ms, tot_time:302.827753ms, max:60.636886ms, p95:60.636886ms}, final_worker:{wall_time:60 ...
Look at the TTL Scan Worker Time By Phase and TTL Delete Worker Time By Phase panels. If the scan worker is in the dispatch phase for a large percentage of time and the delete worker is rarely in the idle ... then the scan worker is waiting for the delete worker to finish the deletion. If the cluster resources are still free at this point, you can consider increasing tidb_ttl_delete_worker_count to increase ...
0 credits | 6479 pages | 108.61 MB | 9 months ago

TiDB v8.5 Documentation
... during graceful shutdown #55464 @YangKeao
• Fix the issue that reducing the value of tidb_ttl_delete_worker_count during TTL job execution makes the job fail to complete #55561 @lcwangchao
• Fix the issue ...
partial_worker:{wall_time:60.660079ms, concurrency:5, task_num:293, tot_wait:262.536669ms, tot_exec:40.171833ms, tot_time:302.827753ms, max:60.636886ms, p95:60.636886ms}, final_worker:{wall_time:60 ...
Look at the TTL Scan Worker Time By Phase and TTL Delete Worker Time By Phase panels. If the scan worker is in the dispatch phase for a large percentage of time and the delete worker is rarely in the idle ...
0 credits | 6730 pages | 111.36 MB | 9 months ago

TiDB v8.4 Documentation
... during graceful shutdown #55464 @YangKeao
• Fix the issue that reducing the value of tidb_ttl_delete_worker_count during TTL job execution makes the job fail to complete #55561 @lcwangchao
• Fix the issue ...
partial_worker:{wall_time:60.660079ms, concurrency:5, task_num:293, tot_wait:262.536669ms, tot_exec:40.171833ms, tot_time:302.827753ms, max:60.636886ms, p95:60.636886ms}, final_worker:{wall_time:60 ...
Look at the TTL Scan Worker Time By Phase and TTL Delete Worker Time By Phase panels. If the scan worker is in the dispatch phase for a large percentage of time and the delete worker is rarely in the idle ...
0 credits | 6705 pages | 110.86 MB | 9 months ago

BRPC and UCX Integration Guide
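... operation, tag match, stream
Typical RDMA stack
Some basic concepts of UCX programming
● Context: collects machine resources (memory, NICs, and so on) and is shared across all parts of the application
● Worker: carries out UCX functionality; it is a function the application can call, not a separately running thread
● Listener: accepts connection requests
● Ep: the connection object; send and receive requests are issued on an ep
UCP message interface types
● Active: messages are sent asynchronously
● Tag: used by MPI
● Stream: not officially recommended
WORKER
● The worker is the core concept in UCX communication; it is a progress engine
● A worker is neither a coroutine nor a thread but a state machine that does its work through repeated calls to ucp_worker_progress(worker). If you have used libuv or libevent's evbuffer, they are somewhat similar ... buffer reads and writes can then be completed automatically.
WORKER
● Busy poll: effectively reduces latency but wastes CPU when idle
● Wait: increases latency but saves CPU
● Obtain the polling file descriptor with ucp_worker_get_efd(*ucp_worker, efd)
● Call poll(efd) to wait until there is work to do, then call ucp_worker_progress()
● /dev/cpu_dma_latency
0 credits | 66 pages | 16.29 MB | 5 months ago

TiDB v8.2 Chinese Manual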
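partial_worker:{wall_time:60.660079ms, concurrency:5, task_num:293, tot_wait:262.536669ms, tot_exec:40.171833ms, tot_time:302.827753ms, max:60.636886ms, p95:60.636886ms}, final_worker:{wall_time:60 ...
Is the bottleneck of the task in scanning or in deleting? Look at the TTL Scan Worker Time By Phase and TTL Delete Worker Time By Phase panels. If the scan worker is in the dispatch phase for a large percentage of time and the delete worker is rarely in the idle phase, then the scan worker is waiting for the delete worker to finish the deletion. If the cluster resources are still free at this point, you can consider increasing tidb_ttl_delete_worker_count to increase the number of delete workers. For example:
Figure 20: scan fast example
In contrast, if the scan worker is rarely in the dispatch phase and the delete worker stays in the idle phase for a long time, then the delete worker is idle and the scan worker is relatively busy. For example:
Figure 21: delete ...
0 credits | 4987 pages | 102.91 MB | 9 months ago

TiDB v8.5 Chinese Manual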
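... @Defined2014
• Fix the issue that TiDB does not wait for auto commit transactions to complete during graceful shutdown #55464 @YangKeao
• Fix the issue that reducing the value of tidb_ttl_delete_worker_count during TTL job execution makes the job fail to complete #55561 @lcwangchao
• Fix the issue that when an index of a table contains generated columns, collecting statistics for that table with the ANALYZE statement might report the error Unknown ...
partial_worker:{wall_time:60.660079ms, concurrency:5, task_num:293, tot_wait:262.536669ms, tot_exec:40.171833ms, tot_time:302.827753ms, max:60.636886ms, p95:60.636886ms}, final_worker:{wall_time:60 ...
Is the bottleneck of the task in scanning or in deleting? Look at the TTL Scan Worker Time By Phase and TTL Delete Worker Time By Phase panels. If the scan worker is in the dispatch phase for a large percentage of time and the delete worker is rarely in the idle phase, then the scan worker is waiting for the delete worker to finish the deletion. If the cluster resources are still free at this point, you can consider increasing ...
0 credits | 5095 pages | 104.54 MB | 9 months ago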
122 results in total