HBase Read Path. openinx@apache.org. Abstract: ❏ Client Side ❏ Server Side ❏ Tuning. Part-1 Client Side: HBase Client, ClientScanner, ClientScanner cache (queue), scanner.next(), RegionServer-0, RegionServer-1 … (old generation) ● Fewer mixed GCs and shorter STW time. End-to-end offheap on the read path (HBASE-11425): BucketCache, StoreFileScanner; copy the Block from BucketCache (offheap) to onheap. Rpc Handler, MaxResultSize ● Timeout ○ HeartBeat: abort the RPC once it times out and return the current results to the client. ○ Cursor: once the RPC times out, return a fake result carrying the current rowkey for the next RPC. ● Batch ○ … | 38 pages | 970.76 KB
HBase Practice At Xiaomi. huzheng@xiaomi.com. About This Talk: ● Async HBase Client ○ Why Async HBase Client ○ Implementation ○ Performance ● How we tune G1GC for HBase ○ CMS vs G1 ○ Tuning G1GC ○ G1GC in XiaoMi HBase Cluster. Part-1 Async HBase Client: Why Async HBase Client? [Diagram: Request-1 to Request-4 and Response-1 to Response-4, sent as RPC-1 to RPC-4; Blocking Client (Single Thread) vs. Non-Blocking Client (Single Thread)] Fault Amplification When Using Blocking Client: RegionServer, RegionServer, RegionServer, Handler-1 … | 45 pages | 1.32 MB
HBase Practice At XiaoMi. tianjy1990@gmail.com, openinx@apache.org. Part-1 Problems In Practice. Problems in XiaoMi: ❏ Problem 1. How to satisfy the regular demand for scanning tables without affecting … analysis needs to scan a large amount of data from HBase ❏ The scans are executed by MapReduce or Spark, which puts a heavy burden on HBase. Scan the snapshot directly: ❏ HBase already provides this feature: TableSnapshotInputFormat. TableSnapshotInputFormat (ClientSideRegionScanner): ❏ Constructs regions from snapshot files ❏ Reads data without any HBase RPC requests ❏ Requires READ access to reference files and HFiles. Snapshot ACL: ❏ HDFS ACL could … | 56 pages | 350.38 KB
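A Basic Introduction to HBase (HBase基本介绍). 田志鹏 (Tian Zhipeng), 2019-07-14. The two questions left unresolved in the previous talk on quantile estimation have been updated in the slides. Today's material is fairly basic and leans theoretical; I do not have much hands-on experience myself, so this is armchair talk. "Apache HBase™ is the Hadoop database, a distributed, scalable, big data store. Use Apache HBase™ … clusters of commodity hardware. Apache HBase is an open-source, distributed, versioned, non-relational database modeled after Google's Bigtable …" We start with the self-introduction from the official HBase website, translated roughly; focus on the words in red, such as "the Hadoop database". Redis stores key-value data and MongoDB stores document data, so what kind of data does HBase store? • "Table / row / column" • Row Key • ColumnFamily : ColumnQualifier • Version/Timestamp. Example column: score:Chinese. Data model, logical view: HBase as a whole looks a lot like a relational database, but the differences between the two must be kept in mind at all times. On the right, I continue with the example of students' scores in an exam … | 33 pages | 4.86 MB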
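HBase Best Practices and Optimization (HBase最佳实践及优化). Postgres Conference China 2016 (China User Conference). 陈飚 (Chen Biao), cb@cloudera.com, Cloudera. About me: pre-sales technical manager and senior solutions architect at Cloudera, http://biaobean.pro. Formerly a core developer of the Intel Hadoop distribution; successfully deployed and operated multiple … product development and solution consulting; responsible in turn for Hadoop productization, HBase performance tuning, and industry solution consulting. History of HBase: in 2006 Google published the BigTable paper; at the end of 2006 Chad Walters and Jim Kellerman of PowerSet started the HBase project, rebuilding the relational database along the lines of the BigTable paper; in February 2007 the HBase prototype was built; in October 2007 the first usable HBase version was built; in 2008 it became a subproject of Apache Hadoop. HBase is the open-source implementation of Google BigTable: • BigTable uses GFS as its file storage system • HBase uses HDFS as its file storage system … | 45 pages | 4.33 MB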
TiDB: Distributed Transactions and SQL Implementation on HBase (TiDB: HBase分布式事务与SQL实现). About me: ● TiDB & Codis founder ● Golang expert ● Distributed database developer ● Currently CEO and co-founder of PingCAP. liuqi@pingcap.com, https://github.com/pingcap/tidb, weibo: @goroutine. Agenda: ● HBase introduction ● TiDB features ● Google Percolator and Omid ● Internals of TiDB over HBase. Features of HBase: ● Linear and modular scalability. ● Strictly … side Filters ● MVCC. What did they say? "Nothing is hotter than SQL-on-Hadoop, and now SQL-on-HBase is fast approaching equal hotness status" (from HBaseCon 2015). We want more! | 34 pages | 526.15 KB
HBASE-21879: Read HFile's Block into ByteBuffer directly. 1. Background: To reduce the impact of Java GC on p99/p999 RPC latency, HBase 2.x introduced an offheap read and write path. The KVs are allocated … deallocate the memory explicitly by themselves. On the write path, the request packet received from the client is allocated offheap and retained until those key values are successfully written to the WAL … HFile and read the corresponding block. The workflow of reading a block from the cache or sending cells to the client involves basically no heap memory allocation. However, in our performance test at … | 18 pages | 1.14 MB
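Greenplum Data Warehouse UDW - UCloud, a neutral cloud computing provider. Contents: Access Hive; Access HBase; Migrate data with pg_dump; Install greenplum-db-clients; Export data with pg_dump; Rebuild data with psql; Migrate data with HDFS external tables … MPP architecture, suited to storing and computing over massive data. The UDW architecture, shown in the figure above, consists mainly of the Client, the Master Node, and the Compute Nodes. The basic components do the following: (Product architecture. Greenplum Data Warehouse UDW, Copyright © 2012-2021 UCloud 优刻得.) 1. Client: the client that accesses UDW; access is supported via JDBC, ODBC, PHP, Python, the command line … com/greenplum-client.tar; tar -zxvf greenplum-client.tar.gz. 2) Configure the UDW client: enter the greenplum-client installation directory, edit greenplum_client_path.sh, and set UDW_HOME (export UDW_HOME=<client installation directory>, for example /root/greenplum-client). 3) Apply the configuration: in ~/ … | 206 pages | 5.35 MB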
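6. ClickHouse in Practice at ZhongAn (ClickHouse在众安的实践). [Architecture diagram labels: 集智平台 (Jizhi platform); X-Brain AI open platform; compute frameworks: Hadoop, JStorm, Spark Streaming, Flink; offline/real-time job monitoring; data and model storage: Hive, HBase, ClickHouse, Kylin; data ingestion; message middleware; model and algorithm templates; machine learning platform; Antron; robot platform; X-Insight; data insight platform; X-Zatlas] … 0_0' | clickhouse-client --host=127.0.0.1 --port=10000 -u user --password password --query="INSERT INTO Insight_zhongan.baodan_yonghushuju FORMAT CSV". Results: 1 process: 26 million records per minute, client cores used = 1, server cores used = 1, import rate = 80 MB/s. 2 processes: 40 million records per minute, client cores used = 2, server cores used about 2-5, import rate = 140 MB/s. 4 processes: 80 million records per minute, cores used per client = 1, server cores used about 2-5, import rate = 280 MB/s. ClickHouse performance testing and optimization on tens of billions of rows: • Data queries (chart values: 4.48, 5.56, 4.71, 8.64, 18.6, 250.57) … | 28 pages | 4.00 MB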
TiDB: Principles and Practice (TiDB 原理与实战). A brief introduction of NewSQL: [Timeline: 1970s, 2010, 2015, Present. RDBMS: MySQL, PostgreSQL, Oracle, DB2 ... NoSQL: Redis, HBase, Cassandra, MongoDB ... NewSQL: Google Spanner, Google F1, TiDB] TiDB and TiKV. TiDB execution flow: … bucket size: 2; 12, 120, 200, 280, 789, 809, 999. Dist SQL, distributed computing: ● Reduce compute cost ● Reduce network overhead. [Diagram: Executor, DistSQL API, TiKV Client, Coprocessor, Computation Logic, API, Send requests, Computation] Dist SQL: ● select * from t where age … + optimistic locking) ○ Engine: RocksDB ● Horizontal scale-out/scale-in ○ Raft protocol + PlacementDriver ● Fault tolerance. [Diagram: TiKV Raft groups; Client connects over RPC; Store1 (TiKV Node1): Region 1, Region 3 ...; Store2 (TiKV Node2): Region 1, Region 2, Region 3 ...; Store3 (TiKV …); Store4] | 23 pages | 496.41 KB













