CurveBS IO Processing Flow
architecture, data organization and topology structure of CURVE. CurveBS uses the central node Metadata Server (MDS) to manage virtual disk mapping and data replica distribution. Decentralization of snapshots • Support asynchronous and incremental snapshots • Support lazy (pre-allocated space) and non-lazy (space allocated on demand) clones from snapshots/mirrors • Support rollback from snapshot ... scattered all over the storage nodes. ChunkServer provides 4KB random read/write capability to support 4KB-aligned read/write on block devices. ... CurveBS file structure of virtual block device: As mentioned ...
0 credits | 13 pages | 2.03 MB | 6 months ago
Curve for CNCF Main
container service (in Plan) • Config CurveBS by (Cluster and Pool CRDs) in Kubernetes (in Plan) • Support Operator capability level 5 (in Plan) • horizontal / vertical scaling, auto config tuning, abnormal ...
Comparison, (RAFT) vs. CEPH: WRITE SUCCESS: majority write successful vs. all write successful; READ: leader of copyset vs. node in PG; SLOW STORAGE/DISK FAILURE INFLUENCE: without I/O disruption vs. occasional I/O jitter; CAN SYNC WITH REMOTE DISK SERVER: Y vs. N.
I/O Jitter (vs. Ceph): 3 replicas on a 9-node cluster; each node has 20 x SSD, 2 x E5-2660 v4 and 256GB mem. Columns: FAULTS CASE | CURVE I/O JITTER | CEPH I/O JITTER | COMMENT. First row: ONE DISK ...
0 credits | 21 pages | 4.56 MB | 6 months ago
Open Flags Research
... for more details. If this request is answered with an error code of ENOSYS and FUSE_CAP_NO_OPEN_SUPPORT is set in fuse_conn_info.capable, this is treated as success and future calls to open and release ... uint64_t lock_owner; /** Requested poll events. Available in ->poll. Only set on kernels which support it. If unsupported, this field is set to zero. */ uint32_t poll_events; }; // fastcfs typedef ... 3.html https://juejin.cn/post/6844903923048792078 https://www.gnu.org/software/libc/manual/html_node/File-Status-Flags.html https://android.googlesource.com/platform/prebuilts/gcc/linux-x86/host/x86_64-linux-glibc2
0 credits | 23 pages | 524.47 KB | 6 months ago
OID CND Asia Slide: CurveFS
26% ... Performance comparison with Ceph. The test environment ● The cluster consists of six nodes. Each node has 20 x SSDs, 256GB memory, and 2 x E5-2660 CPUs. Performance using RDMA ● Compared with ... Restore data on a disk within 5 minutes. Data availability of 6 nines can be achieved. Cloud native support: Currently we offer a CSI Driver for block storage to provide PV/PVC resources on Kubernetes. Agenda ... api: Manage multiple types of storage (object storage, HDFS storage, Elastic block storage). Support both on-premise and public clouds. Agenda: Why develop storage, Design objectives, Achievements in ...
0 credits | 24 pages | 3.47 MB | 6 months ago
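Optimization of the TGT Server
• The initiator resends the SCSI READ CAPACITY command • Windows Disk Manager: refresh • Linux open-iscsi: iscsiadm --mode node -R
DPO & FUA • DPO stands for "disable page out"; FUA stands for "force unit access" • FUA lets some file systems perform a write without submitting a SCSI ... the driver has no local cache, so DPO & FUA can be turned on. • sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA • With the curve driver, the Linux initiator's dmesg does not show this message
TGT performance problems • The performance problem mainly shows up as an inability to use multiple CPUs effectively • For multiple sockets ...
0 credits | 15 pages | 637.11 KB | 6 months ago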
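Curve Core Components: ChunkServer
... • RaftService: braft's built-in service, which handles leader election among raft members, log replication, snapshot installation, and similar operations.
ChunkServer architecture: CopysetNode wraps braft's Node and implements braft's state machine, handling the interaction with raft. The detailed interaction flow is covered later. CopysetNodeManager manages the creation, initialization, deletion, etc. of CopysetNodes.
ChunkServer architecture: the heartbeat module has two responsibilities: ... this information is carried in heartbeat reports.
ChunkServer core module: CopysetNode. Write request: 1. The client sends the write request to the leader ChunkServer 2. The request is wrapped and submitted to the Raft node 3. The entry is persisted locally while also being sent to the other peers 4. Once the log entry has been persisted locally and one peer has also persisted it to disk, it is committed 5. After commit it is applied; at this point the write request is written ...
0 credits | 29 pages | 1.61 MB | 6 months ago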
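Step 4 of the write path above (leader plus one peer persisted, then commit) is the usual majority rule for a three-replica group. The following is a minimal Python sketch of that rule only; the function name `is_committed` and the replica names are hypothetical and are not taken from Curve or braft.

```python
from typing import Iterable


def is_committed(acks: Iterable[str], peers: Iterable[str]) -> bool:
    """Return True once a strict majority of the replica group has persisted the entry.

    `acks` holds the replicas (leader included) that have written the log
    entry to disk; `peers` is the full replica group. All names here are
    hypothetical and only illustrate the quorum rule.
    """
    group = set(peers)
    persisted = set(acks) & group  # ignore acks from nodes outside the group
    return len(persisted) > len(group) // 2


# Hypothetical three-replica copyset.
replicas = ["leader", "follower-1", "follower-2"]

# Only the leader has persisted the entry: 1 of 3 is not a majority.
leader_only = is_committed({"leader"}, replicas)

# The leader and one follower have persisted it: 2 of 3 is a majority.
leader_plus_one = is_committed({"leader", "follower-1"}, replicas)
```

With three replicas the quorum is two, which matches the "local persist plus one peer" condition in the snippet; with five replicas the same function would require three acknowledgements.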
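CurveFS: Mapping Between Copysets and FS
...on manages the remaining inode ids, up to 2^63-1. When a meta partition is created, the 3 selected meta nodes form a replication group. How are they selected? According to the paper, selection is based on the memory and disk usage of the storage nodes, usually picking the nodes with the lowest memory and disk usage, and the corresponding meta partition is then created on those meta nodes. The partition's hosts are selected through this function: func ...
0 credits | 19 pages | 383.29 KB | 6 months ago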
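The selection rule described above (pick the three meta nodes with the lowest memory and disk usage) can be sketched as follows. This is an illustration in Python, not the function the snippet refers to, which is truncated in the source. The `MetaNode` type, the field names, and the choice to rank by the sum of the two usage fractions are all assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MetaNode:
    # Hypothetical record; field names are illustrative only.
    name: str
    mem_usage: float   # fraction of memory in use, 0.0 to 1.0
    disk_usage: float  # fraction of disk in use, 0.0 to 1.0


def pick_replica_hosts(nodes: List[MetaNode], replicas: int = 3) -> List[str]:
    """Pick the `replicas` least-loaded nodes to host one meta partition.

    Assumption: load is ranked by mem_usage + disk_usage. The source only
    says both metrics are considered, not how they are combined.
    """
    if len(nodes) < replicas:
        raise ValueError("not enough meta nodes to form a replication group")
    # Tie-break on name so the result is deterministic.
    ranked = sorted(nodes, key=lambda n: (n.mem_usage + n.disk_usage, n.name))
    return [n.name for n in ranked[:replicas]]


cluster = [
    MetaNode("node-a", mem_usage=0.25, disk_usage=0.25),
    MetaNode("node-b", mem_usage=0.75, disk_usage=1.0),
    MetaNode("node-c", mem_usage=0.125, disk_usage=0.125),
    MetaNode("node-d", mem_usage=0.5, disk_usage=0.5),
]
hosts = pick_replica_hosts(cluster)
```

A real scheduler would likely also account for failure domains and node liveness; this sketch covers only the usage-based ranking mentioned in the text.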
NJSD eBPF Technical Document - 0924 Version
FUSE_LOOKUP / FUSE_GETATTR / FUSE_SETATTR / • map structure • dentry map: BPF_MAP_TYPE_HASH • key: (inode id, node name) • value: inode id • inode map: BPF_MAP_TYPE_HASH • key: inode id • value: fuse_attr (file attributes) ... based on data ...
0 credits | 20 pages | 7.40 MB | 6 months ago
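The two hash maps listed above suggest a two-step lookup: the dentry map resolves a name to an inode id, and the inode map resolves that id to its attributes. The sketch below models this with plain Python dicts standing in for the BPF_MAP_TYPE_HASH maps. It assumes the "inode id" in the dentry key is the parent directory's inode id, and the attribute fields shown are hypothetical.

```python
from typing import Dict, Optional, Tuple

# Plain dicts standing in for the two BPF hash maps (an assumption for
# illustration; real eBPF maps are accessed through bpf_map_lookup_elem).
dentry_map: Dict[Tuple[int, str], int] = {}   # (parent inode id, name) -> inode id
inode_map: Dict[int, Dict[str, int]] = {}     # inode id -> file attributes


def cached_lookup(parent_ino: int, name: str) -> Optional[Dict[str, int]]:
    """Resolve a name under a directory to its cached attributes.

    Returns None on a miss in either map, in which case a real
    implementation would presumably fall back to the user-space FUSE daemon.
    """
    ino = dentry_map.get((parent_ino, name))
    if ino is None:
        return None
    return inode_map.get(ino)


# Populate the caches with one hypothetical entry: file "a.txt" in the root (inode 1).
dentry_map[(1, "a.txt")] = 42
inode_map[42] = {"ino": 42, "size": 4096, "mode": 0o100644}

hit = cached_lookup(1, "a.txt")
miss = cached_lookup(1, "missing.txt")
```

Keying the dentry map on the (directory, name) pair rather than on a full path keeps each lookup to a single hash probe per path component, which is how a FUSE_LOOKUP request is shaped.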
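Engineering Practice of Raft in Curve Storage
...aft consistency protocol and replicated state machine, and it also provides a general-purpose base library. On top of braft, you can build your own distributed system around your own business logic. • braft itself provides no server functionality; the application must implement the state machine itself.
Node (one raft instance): int init(const NodeOptions& options); void apply(const Task& task); void add_peer(const ...
0 credits | 29 pages | 2.20 MB | 6 months ago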
Curve Cloud Native
LIFECYCLE: Plan to support app lifecycle, storage lifecycle (backup, failure, recovery). DEEP INSIGHTS: Plan to support metrics, alerts, log processing and workload analysis. AUTO PILOT: Plan to support horizontal/vertical ... metadata backup and recovery • MDS / ChunkServer should respect failure domains of Kubernetes • Support for public cloud environments • Dashboard-driven configuration after minimal Curve install. Feature ... alternative to host path • Support automatic detection of new nodes, adding / removing nodes and disk drives • Support dynamic volume resizing • chunkserver on PVC support for different data and metadata (HDD ...
0 credits | 9 pages | 2.85 MB | 6 months ago