Running in Parallel Without Conflict: OLAP Practice and Reflections at Internet Companies
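Running in Parallel Without Conflict: OLAP Practice and Reflections at Internet Companies. Zhao Feixiang (赵飞祥)
Agenda: 1. Data warehouse architecture; 2. Greenplum architecture; 3. Greenplum current status; 4. Greenplum operations and maintenance; 5. Greenplum development standards; 6. Greenplum expansion planning
Data warehouse architecture: classifying business data and data usage
Time dimension: past, present, future (the data lifecycle)
• "Present" data: OLTP
• "Past" data: OLAP
• "Future" data: trend analysis
Data warehouse architecture: business data and its characteristics
• Present data (OLTP)
  - Real-time, online systems, used by customers
  - Small transactions, high frequency, high concurrency
• Past data (OLAP)
  - Non-real-time (T+1 or hourly), offline systems, analysis and decision-making
  - Large transactions, relatively low frequency, low concurrency
• Future data (trend analysis)
  - Non-real-time, offline plus online streaming systems, trend analysis
  - Algorithmic analysis, continuous computation
Data warehouse architecture: OLAP scenario examples
• Business scenarios
  - User status (registrations, active users, concurrency, peak)
  - Gold coin status
  - Prop/item status
  - Reconciliation status
  - Campaign feedback
• Architecture scenarios
  - Different data volumes, transaction characteristics, and query requirements
  - Historical data archiving and hot/cold separation
  - Trade-offs between real-time and delayed requirements
Data warehouse architecture: data flow
• 1. Generation of business data: OLTP
0 credits | 43 pages | 9.66 MB | 1 year ago

The Vitess 6.0 Documentation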
... needs. However, OLAP mode has no limit to the number of rows returned. In order to change to this mode, you may issue the following command before executing your query: set workload='olap' You can also ...
... and reparent commands. The general convention is to send OLTP queries to REPLICA tablet types, and OLAP queries to RDONLY. Is there a list of supported/unsupported queries? Please see “SQL Syntax” under ...
... slightly stale data, the queries should be sent to REPLICA tablets for OLTP, and RDONLY tablets for OLAP workloads. This allows you to scale your read traffic more easily, and gives you the ability to distribute ...
0 credits | 210 pages | 846.79 KB | 1 year ago

The Vitess 5.0 Documentation
... needs. However, OLAP mode has no limit to the number of rows returned. In order to change to this mode, you may issue the following command before executing your query: set workload='olap' You can ...
... and reparent commands. The general convention is to send OLTP queries to REPLICA tablet types, and OLAP queries to RDONLY. Is there a list of supported/unsupported queries? Please see “SQL Syntax” under ...
... slightly stale data, the queries should be sent to REPLICA tablets for OLTP, and RDONLY tablets for OLAP workloads. This allows you to scale your read traffic more easily, and gives you the ability to distribute ...
0 credits | 206 pages | 875.06 KB | 1 year ago

Let's Run "TiDB", a Cloud-Native NewSQL Database That Combines the Benefits of RDBMS and NoSQL, on Kubernetes!
TiDB features:
• ... (MySQL compatible)
• Distributed Transactions
• Cloud Native (cloud-native oriented)
• Minimize ETL (supports both OLTP and OLAP)
• High Availability
Open Source Conference 2022 Online/Spring
0 credits | 71 pages | 6.65 MB | 1 year ago

The Vitess 7.0 Documentation
... needs. However, OLAP mode has no limit to the number of rows returned. In order to change to this mode, you may issue the following command before executing your query: set workload='olap' You can also ...
... and reparent commands. The general convention is to send OLTP queries to REPLICA tablet types, and OLAP queries to RDONLY. Is there a list of supported/unsupported queries? Please see “SQL Syntax” under ...
... slightly stale data, the queries should be sent to REPLICA tablets for OLTP, and RDONLY tablets for OLAP workloads. This allows you to scale your read traffic more easily, and gives you the ability to distribute ...
0 credits | 254 pages | 949.63 KB | 1 year ago

Greenplum 6: The Ideal Data Platform for Mixed Workloads
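[Diagram: segment hosts Node1, Node2, Node3 ... NodeN; Greenplum (MPP) vs. Oracle (SMP)]
OLAP: Online Analytical Processing
Gartner 2019 data analytics industry report: Pivotal Greenplum scored highly this ... here as an MPP relational database are well-showcased
Pivotal Confidential–Internal Use Only
Excellent OLAP features
• Columnar storage: partitioning, compression
• Advanced features: recursive queries, window functions
• Integrated analytics: multiple formats, multiple languages
• MADlib machine learning: in-database parallel model training, prediction, and classification
• ORCA, the optimizer for complex queries: mature and stable
Hybrid transactional/analytical processing (Gartner Hype Cycle)
Separate OLTP and OLAP deployment (OLTP database, OLAP data warehouse)
■ Real-time capability
■ Data synchronization complexity
■ Application complexity
HTAP = ?
■ Excellent OLAP features
■ Excellent OLTP features
■ Polymorphic storage
■ Effective concurrency and resource management
0 credits | 52 pages | 4.48 MB | 1 year ago

The Vitess 8.0 Documentation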
... needs. However, OLAP mode has no limit to the number of rows returned. In order to change to this mode, you may issue the following command before executing your query: set workload='olap' You can also ...
... and reparent commands. The general convention is to send OLTP queries to REPLICA tablet types, and OLAP queries to RDONLY. Is there a list of supported/unsupported queries? Please see “SQL Syntax” under ...
... slightly stale data, the queries should be sent to REPLICA tablets for OLTP, and RDONLY tablets for OLAP workloads. This allows you to scale your read traffic more easily, and gives you the ability to distribute ...
0 credits | 331 pages | 1.35 MB | 1 year ago

TiDB: An Open-Source Distributed Relational Database
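... has almost no impact on OLTP transactions. It provides data reads that are strongly consistent with TiDB, making it a true kernel-level HTAP distributed platform for mixed-workload data processing. This system handles the following problems well:
• The trade-off between row storage and column storage
• Resource isolation between OLTP and OLAP workloads
• Mixing fast bulk writes with transactional writes
• Coexistence of ad hoc queries, ad hoc mixed workloads, and batch jobs
• The risk of inconsistency caused by offloading data to a data warehouse
[Architecture diagram: distributed storage engine, cluster scheduler, TiSpark OLAP analytics engine, remote disaster recovery (primary and secondary clusters in asynchronous mode)]
... OLTP and OLAP workloads, supporting the 2019 Double 11 promotion with a peak QPS above 120,000 and tens of billions of inserts and updates. Prometheus and Grafana provide rich monitoring metrics that meet operations and management needs, and DataX synchronizes TiDB data to Hive on a T+1 basis as a backup.
ZTO Express builds its real-time data warehouse wide tables on TiDB: business OLTP data is written to TiDB in real time, and downstream OLAP workloads go through ...
0 credits | 58 pages | 9.51 MB | 1 year ago

Architecture Progress and Future Outlook of HTAP in TiDB - Wei Wan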
... and the improvement in v6.2. Wei Wan @ PingCAP
About Me: Wei Wan works at PingCAP as the leader of the OLAP Storage team. Over 11 years of experience in games, e-commerce, mobile apps, and database development ...
... provide users with a one-stop database solution that covers OLTP (Online Transactional Processing), OLAP (Online Analytical Processing), and HTAP services.
Agenda: 1. A typical user case 2. The challenges ... infrastructure adaptation
Challenges for the storage module in HTAP scenarios: isolation between OLTP and OLAP workloads
• Isolation is difficult if we mix them in the same node
• TP and AP scale separately
0 credits | 32 pages | 6.61 MB | 1 year ago

Introduction to Greenplum, an Open-Source MPP Database
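Confidential │ ©2022 VMware, Inc.
Introduction to Greenplum: What is Greenplum?
A relational data warehouse based on PostgreSQL: open source, distributed MPP, fully ACID-compliant, and optimized for OLAP.
https://greenplum.org
https://github.com/greenplum-db/gpdb
... (WAL replication)
- Automatic disaster recovery (FTS, primary/standby failover)
Distributed optimizer: OLAP
- SQL statements in OLTP systems are relatively simple (CRUD)
- SQL statements in OLAP systems are far more complex (OLTP avoids them where possible)
  - Joins are complex (multi-table, outer join, lateral…)
  - Subqueries, sublinks
  - Aggregation (grouping ...
... the optimizer is extremely important
- Rule-based and cost-based optimization
ORCA
- Ten years in the making, developed independently
- Cascades architecture
- Excellent OLAP performance
- https://db.cs.cmu.edu/events/vaccination-2022-orca-a-modular-query-optimizer-architectur
0 credits | 23 pages | 4.55 MB | 1 year ago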
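
The Greenplum excerpt above contrasts simple OLTP statements (CRUD) with the heavier joins and aggregations typical of OLAP. A minimal sketch of that contrast follows, using Python's built-in sqlite3 module rather than Greenplum itself. The `users` and `orders` tables, their rows, and the function names are all hypothetical and are not taken from any of the listed documents.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, amount REAL);
    INSERT INTO users VALUES (1, 'north'), (2, 'south'), (3, 'north');
    INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, 5.0), (3, 2, 7.5), (4, 3, 2.5);
""")


def oltp_lookup(order_id):
    """OLTP style: a point read by primary key that touches one row."""
    return conn.execute(
        "SELECT id, user_id, amount FROM orders WHERE id = ?", (order_id,)
    ).fetchone()


def olap_revenue_by_region():
    """OLAP style: a join plus aggregation that reads every order."""
    return conn.execute("""
        SELECT u.region, COUNT(*) AS order_count, SUM(o.amount) AS revenue
        FROM orders AS o
        JOIN users AS u ON u.id = o.user_id
        GROUP BY u.region
        ORDER BY u.region
    """).fetchall()


print(oltp_lookup(3))
print(olap_revenue_by_region())
```

The point lookup stays cheap as the table grows, while the aggregate has to scan and join all rows, which is why the slides stress the optimizer for OLAP workloads.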