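An In-Depth Look at Transparent Encryption in the Greenplum Open-Source Database: pgcrypto-based data encryption scheme; 3. GPDB transparent data encryption design; 4. GPDB transparent data encryption and decryption flow; 5. Summary ... The problems we face ... What is Greenplum Database? An open-source HTAP database: • MPP architecture • Full transactions + ACID + standard SQL support • Supports deployments of thousands of nodes • Supports petabyte-scale files • Rich ETL and external components • Supports direct access to and processing of database data from Python/R/Java • https://github.com/greenplum-db/gpdb ... Problems with pgcrypto: • Incompatible with existing queries • Incompatible with ETL tools. Poor performance: • No index support • The optimizer cannot be used, so full table scans are required. Highly limited: • Multi-table joins require decrypting entire tables first • Can only encrypt table data ... Recall ... GPDB transparent data encryption design (GPDB TDE). Encryption targets: • Table data • Write-ahead log data • All data on primary and secondary nodes | 48 pages | 10.19 MB | 1 year ago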
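Greenplum Selected Essays: ... batch (non-interactive) and not very sensitive to compute performance, then Hadoop is also a good choice, because Hadoop does not require you to spend much effort structuring your data, which saves investment in data model design and data loading design. Such systems include historical data systems, ETL staging areas, data exchange platforms, and so on. Remember: never do big data for the sake of big data (just as you should not innovate for the sake of innovating); otherwise your project's final output may end up far from what you originally envisioned, and the industry has no shortage of failed cases. Finally, it is worth mentioning that Greenplum MPP ... test results from a financial customer, about 8 times higher than Hive), so consider deploying both an MPP database and Hadoop in the same project: MPP for interactive high-performance analytics, and Hadoop for data staging, backups of MPP data, or some ETL batch data-cleansing jobs. The two complement each other, each bringing its features and strengths to the scenarios it handles best. ... 4. ETL server: The ETL server is a temporary storage area for data. Because Greenplum servers load data in parallel, data can be imported over the network directly from the ETL server into the Greenplum compute nodes, so the ETL server's network and disk I/O performance directly determines data loading and unloading performance. According to official test data, a Greenplum cluster with 16 compute nodes can reach a load rate of 16 TB/hour. The recommended ETL server ... | 64 pages | 2.73 MB | 1 year ago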
VMware Greenplum 6 Documentation: Interconnect Redundancy 309; Network Interface Configuration 309; Switch Configuration 310; About ETL Hosts for Data Loading 311; About VMware Greenplum Performance Monitoring 312; About Management and ... 2111; Examples 2111; list 2112; set 2112; database 2112; network base-vm 2112; network gp-virtual-etl-bar 2113; network gp-virtual-external 2113; network gp-virtual-internal 2113; vm 2113; vsphere 2114 ... Greenplum Streaming Server v1.5.3 - The VMware Greenplum Streaming Server is an ETL tool that provides high-speed, parallel data transfer from Informatica, Kafka, Apache NiFi and custom ... | 2445 pages | 18.05 MB | 1 year ago
VMware Greenplum v6.25 Documentation: Interconnect Redundancy 295; Network Interface Configuration 295; Switch Configuration 296; About ETL Hosts for Data Loading 297; About VMware Greenplum Performance Monitoring 298; About Management and ... 2068; Examples 2068; list 2069; set 2069; database 2069; network base-vm 2069; network gp-virtual-etl-bar 2070; network gp-virtual-external 2070; network gp-virtual-internal 2070; vm 2071; vsphere 2071 ... usage information. Greenplum Streaming Server v1.5.3 - The VMware Greenplum Streaming Server is an ETL tool that provides high-speed, parallel data transfer from Informatica, Kafka, Apache NiFi and custom ... | 2400 pages | 18.02 MB | 1 year ago
VMware Greenplum 6 Documentation: Interconnect Redundancy 277; Network Interface Configuration 277; Switch Configuration 278; About ETL Hosts for Data Loading 279; About VMware Greenplum Performance Monitoring 280; About Management and ... Greenplum Streaming Server v1.5.3 - The VMware Greenplum Streaming Server is an ETL tool that provides high-speed, parallel data transfer from Informatica, Kafka, Apache NiFi and custom ... parallel data transfer from a Kafka cluster to a Greenplum Database cluster for batch and streaming ETL operations. It requires Kafka version 0.11 or newer for exactly-once delivery assurance. Refer to ... | 2374 pages | 44.90 MB | 1 year ago
VMware Tanzu Greenplum v6.23 Documentation: Interconnect Redundancy 269; Network Interface Configuration 269; Switch Configuration 270; About ETL Hosts for Data Loading 271; About Tanzu Greenplum Performance Monitoring 272; About Management and ... PXF documentation. Greenplum Streaming Server v1.5.3 - The Tanzu Greenplum Streaming Server is an ETL tool that provides high-speed, parallel data transfer from Informatica, Kafka, Apache NiFi and custom ... parallel data transfer from a Kafka cluster to a Greenplum Database cluster for batch and streaming ETL operations. It requires Kafka version 0.11 or newer for exactly-once delivery assurance. Refer to ... | 2298 pages | 40.94 MB | 1 year ago
VMware Tanzu Greenplum 6 Documentation: Interconnect Redundancy 265; Network Interface Configuration 265; Switch Configuration 266; About ETL Hosts for Data Loading 267; About Tanzu Greenplum Performance Monitoring 268; About Management and ... PXF documentation. Greenplum Streaming Server v1.5.3 - The Tanzu Greenplum Streaming Server is an ETL tool that provides high-speed, parallel data transfer from Informatica, Kafka, Apache NiFi and custom ... parallel data transfer from a Kafka cluster to a Greenplum Database cluster for batch and streaming ETL operations. It requires Kafka version 0.11 or newer for exactly-once delivery assurance. Refer to ... | 2311 pages | 17.58 MB | 1 year ago
VMware Tanzu Greenplum v6.21 Documentation: Interconnect Redundancy 240; Network Interface Configuration 240; Switch Configuration 241; About ETL Hosts for Data Loading 242; About Tanzu Greenplum Performance Monitoring 243; About Management and ... PXF documentation. Greenplum Streaming Server v1.5.3 - The Tanzu Greenplum Streaming Server is an ETL tool that provides high-speed, parallel data transfer from Informatica, Kafka, Apache NiFi and custom ... parallel data transfer from a Kafka cluster to a Greenplum Database cluster for batch and streaming ETL operations. It requires Kafka version 0.11 or newer for exactly-once delivery assurance. Refer to ... | 2025 pages | 33.54 MB | 1 year ago
Pivotal HVR meetup 20190816: “dove-tailing” with subsequent “semantic” ETL steps ... Ingestion into relational data lake [diagram labels: semantic layer (cubes, marts, etc.); ODS, e.g. Redshift, Greenplum or RDS; ETL; users; HVR] ... Strengths • Persistent storage compaction ... [diagram labels: storage, e.g. CSV or Avro files on S3; semantic layer (cubes, marts, etc.); ETL; ODS, e.g. Redshift, Greenplum or RDS; users; HVR] ... Strengths • Publish/subscribe model, enhances business ... [diagram labels: (cubes, marts, etc.); ETL; ODS, e.g. Redshift, Greenplum or RDS; Kafka or Kinesis; streamed analytics; users; HVR] ... Simultaneous ingestion into streaming, relational and storage [diagram labels: ETL; users; streamed analytics] | 31 pages | 2.19 MB | 1 year ago
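Greenplum: A Next-Generation Data Management and Data Analytics Solution: Architecture ... Case study: Shanghai Airlines [architecture diagram labels: settlement system; source systems; Oracle; GreenPlum; settlement ETL; Staging; ETL; settlement ODS; Export; text; Query (Oracle native driver); BO front end; call center; route analysis; other; ETL; existing data warehouse portion (including EDW, DM, ODS; excluding settlement ODS); Query (ODBC)] | 45 pages | 2.07 MB | 1 year ago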
 













