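Greenplum Essentials Collection (Greenplum 精粹文集): ... IO throughput could not meet the computing demands of massive data volumes. The theory of distributed storage and distributed computing had only just been proposed. Two well-known Google papers drew industry attention after they were published: one on the GFS distributed file system, the other on the theory behind the MapReduce parallel computing framework. The distributed computing model achieved great success in the internet industry, particularly in search engines and word-segmentation retrieval. As a result, the industry ... the overall computing power provided by a cluster already far exceeds that of a traditional SMP host, at very low cost, and horizontal scalability also gives the system good room to grow. The question then arises: automatic parallel computing on an x86 cluster, whether through the later MapReduce computing framework or the MPP (massively parallel processing) computing framework, ultimately has to be implemented in software. Greenplum emerged against this background. Drawing on distributed computing ideas, Greenplum implemented database-based distributed data storage and ... Looking further, the Master-Slave architecture is widely used in the industry's big-data distributed computing and cloud computing systems. Today's mainstream distributed systems all use it, including Hadoop FS, HBase, MapReduce, Storm, Mesos ...; without exception, they are Master-Slave architectures. In contrast, software systems that use Multiple Active Masters must spend more resources and mechanisms to keep metadata ... | 64 pages | 2.73 MB | 1 year ago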
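Greenplum: A New Generation of Data Management and Data Analytics Solutions (Greenplum 新一代数据管理和数据分析解决方案): A strong and continually expanding partner network: hardware vendors, business intelligence tools, service providers. Industry support and recognition. Industry awards: "Greenplum lets enterprises get the best results on two fronts at once: MapReduce for programmers and SQL for database administration." (Curt Monash, Monash Research). Analyst praise: "Greenplum is advancing parallel databases with new technology to meet ..." ... Run parallel analysis on any data at every level using SQL, MapReduce, R, and more. Scale economically to petabytes: no need to worry about data growth or about starting too small; scale linearly and economically on commodity hardware. Greenplum data engine architecture: hosts, network interconnect, parallel query planning and dispatch, segment servers (processing and storage), SQL queries and MapReduce programs, MPP (massively parallel processing) "shared-nothing" architecture. Greenplum architecture, parallel dataflow: a general-purpose parallel dataflow engine executes SQL and MapReduce natively; it uses an MPP "shared-nothing" architecture optimized for commodity hardware; it scales to thousands of commodity processing cores across hundreds of servers; it moves all processing as close to the data as possible. Compute cores, the Greenplum parallel dataflow engine, direct high-performance access to local disks. | 45 pages | 2.07 MB | 1 year ago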
VMware Greenplum 6 Documentation: Contents: Limitations; Using Greenplum MapReduce; About the Greenplum MapReduce Configuration File; Example Greenplum MapReduce Job; Flow Diagram for MapReduce Example; Query Performance ... supercomputer performing tens or hundreds of times faster than a traditional database. It supports SQL, MapReduce parallel processing, and data volumes ranging from hundreds of gigabytes to hundreds of terabytes ... timestamp9_ntz datatypes. Greenplum Database 6.24.0 deprecates the following features: Greenplum MapReduce, PL/Container 3 Beta, and GreenplumR client. GPORCA now supports direct dispatch for randomly distributed ... | 2445 pages | 18.05 MB | 1 year ago
VMware Greenplum 6 Documentation: Contents: Limitations; Using Greenplum MapReduce; About the Greenplum MapReduce Configuration File; Example Greenplum MapReduce Job; Flow Diagram for MapReduce Example ... supercomputer performing tens or hundreds of times faster than a traditional database. It supports SQL, MapReduce parallel processing, and data volumes ranging from hundreds of gigabytes to hundreds of terabytes ... timestamp9_ntz datatypes. Greenplum Database 6.24.0 deprecates the following features: Greenplum MapReduce, PL/Container 3 Beta, and GreenplumR client. GPORCA now supports direct dispatch for randomly distributed ... | 2374 pages | 44.90 MB | 1 year ago
VMware Greenplum v6.25 Documentation: Contents: ... Greenplum MapReduce; About the Greenplum MapReduce Configuration File; Example Greenplum MapReduce Job; Flow Diagram for MapReduce Example ... supercomputer performing tens or hundreds of times faster than a traditional database. It supports SQL, MapReduce parallel processing, and data volumes ranging from hundreds of gigabytes to hundreds of terabytes ... timestamp9_ntz datatypes. Greenplum Database 6.24.0 deprecates the following features: Greenplum MapReduce, PL/Container 3 Beta, and GreenplumR client. GPORCA now supports direct dispatch for randomly distributed ... | 2400 pages | 18.02 MB | 1 year ago
VMware Tanzu Greenplum v6.21 Documentation: Contents: Limitations; Using Greenplum MapReduce; About the Greenplum MapReduce Configuration File; Example Greenplum MapReduce Job; Flow Diagram for MapReduce Example; Query Performance ... supercomputer performing tens or hundreds of times faster than a traditional database. It supports SQL, MapReduce parallel processing, and data volumes ranging from hundreds of gigabytes to hundreds of terabytes ... another database or ETL tool to load the data elsewhere. Receiving output from Greenplum parallel MapReduce calculations. Writable external tables allow only INSERT operations. External tables can be file-based ... | 2025 pages | 33.54 MB | 1 year ago
VMware Tanzu Greenplum v6.23 Documentation: Contents: Limitations; Using Greenplum MapReduce; About the Greenplum MapReduce Configuration File; Example Greenplum MapReduce Job; Flow Diagram for MapReduce Example; Query Performance ... supercomputer performing tens or hundreds of times faster than a traditional database. It supports SQL, MapReduce parallel processing, and data volumes ranging from hundreds of gigabytes to hundreds of terabytes ... another database or ETL tool to load the data elsewhere. Receiving output from Greenplum parallel MapReduce calculations. Writable external tables allow only INSERT operations. External tables can be file-based ... | 2298 pages | 40.94 MB | 1 year ago
VMware Tanzu Greenplum 6 Documentation: Contents: Limitations; Using Greenplum MapReduce; About the Greenplum MapReduce Configuration File; Example Greenplum MapReduce Job; Flow Diagram for MapReduce Example; Query Performance ... supercomputer performing tens or hundreds of times faster than a traditional database. It supports SQL, MapReduce parallel processing, and data volumes ranging from hundreds of gigabytes to hundreds of terabytes ... another database or ETL tool to load the data elsewhere. Receiving output from Greenplum parallel MapReduce calculations. Writable external tables allow only INSERT operations. External tables can be file-based ... | 2311 pages | 17.58 MB | 1 year ago
VMware Greenplum v6.18 Documentation: Contents: ... Using Greenplum MapReduce; About the Greenplum MapReduce Configuration File; Example Greenplum MapReduce Job; Flow Diagram for MapReduce Example; Query Performance ... supercomputer performing tens or hundreds of times faster than a traditional database. It supports SQL, MapReduce parallel processing, and data volumes ranging from hundreds of gigabytes to hundreds of terabytes ... another database or ETL tool to load the data elsewhere. Receiving output from Greenplum parallel MapReduce calculations. Writable external tables allow only INSERT operations. External tables can be file-based ... | 1959 pages | 19.73 MB | 1 year ago
VMware Greenplum v6.19 Documentation: Contents: ... Using Greenplum MapReduce; About the Greenplum MapReduce Configuration File; Example Greenplum MapReduce Job; Flow Diagram for MapReduce Example; Query Performance ... supercomputer performing tens or hundreds of times faster than a traditional database. It supports SQL, MapReduce parallel processing, and data volumes ranging from hundreds of gigabytes to hundreds of terabytes ... another database or ETL tool to load the data elsewhere. Receiving output from Greenplum parallel MapReduce calculations. Writable external tables allow only INSERT operations. External tables can be file-based ... | 1972 pages | 20.05 MB | 1 year ago