CurveFs User Permission System Survey
…w_other” (this option takes no value). See the official libfuse documentation: https://github.com/libfuse/libfuse#security-implications
# The file /etc/fuse.conf allows for the following parameters:
#
# user_allow_other - Using the allow_other mount … file access to the filesystem owner, so that all users (including root) can access the files.
allow_root This option is similar to allow_other but file access is …
…nt$ touch file1
wanghai01@pubbeta1-nostest2:/tmp/fsmount$ ls -l
total 0
-rw-r--r-- 0 wanghai01 neteaseusers 0 Jan 7 2079 file1
wanghai01@pubbeta1-nostest2:/tmp/fsmount$ echo "hello" > file1
wanghai0…
33 pages | 732.13 KB | 6 months ago

Curve File System Metadata Proto (Interface Definition)
… Inc.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 * http://www… OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
syntax="proto2";
package curvefs.mds;
option cc_generic_services …
15 pages | 80.33 KB | 6 months ago

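Open Flags Survey
… FASYNC, O_TMPFILE; Conclusion; References; open interface prototype
# man page
open, openat, creat - open and possibly create a file
#include <fcntl.h>
int open(const char *pathname, int flags);
int open(const char *pathname, int …
File creation flags affect only the open operation; file status flags affect the subsequent read and write operations.
file creation flags: O_CLOEXEC, O_CREAT, O_DIRECTORY, O_EXCL, O_NOCTTY, O_NOFOLLOW, O_TMPFILE, and O_TRUNC
file status flags: O_APPEND, FASYNC, O_DIRECT, O_SYNC(O_DSYNC) …
O_NOCTTY: … all input will affect the user's process.
O_TRUNC: if the file exists, is a regular file, and the caller has write permission on it, this flag truncates the file length to 0.
O_APPEND: append mode; every write moves the file offset to the end of the file (the offset update and the write are performed as one atomic operation).
O_NONBLOCK, O_NDELAY: the results produced by O_NONBLOCK and O_NDELAY …
23 pages | 524.47 KB | 6 months ago
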
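The O_APPEND and O_TRUNC behaviour described in the Open Flags excerpt above can be observed with a small sketch. The helper names (`append_records`, `truncate_on_open`, `demo`) and the temporary file are illustrative assumptions, not part of the surveyed document; only the `os` calls are standard Python.

```python
import os
import tempfile


def append_records(path, records):
    """Write each record through an O_APPEND descriptor and return the file content."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        for record in records:
            # Deliberately rewind: with O_APPEND the offset is moved back
            # to end-of-file before every write, so nothing is overwritten.
            os.lseek(fd, 0, os.SEEK_SET)
            os.write(fd, record)
    finally:
        os.close(fd)
    with open(path, "rb") as f:
        return f.read()


def truncate_on_open(path):
    """Open an existing regular file with O_TRUNC and return its new size."""
    fd = os.open(path, os.O_WRONLY | os.O_TRUNC)
    os.close(fd)
    return os.path.getsize(path)


def demo():
    # Hypothetical scratch file, created only for this illustration.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "file1")
        content = append_records(path, [b"hello\n", b"world\n"])
        size_after_trunc = truncate_on_open(path)
        return content, size_after_trunc
```

Calling `demo()` returns the appended content together with the size after the O_TRUNC open, which makes the difference between a status flag (affects later writes) and a creation flag (affects only the open) visible.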
CurveBS IO Processing Flow
… virtual block device to a file. For example, block device /dev/sda corresponds to file /foo/bar in CurveBS.
2. The address space of the block device /dev/sda maps to chunks of the file in the system. For example …
Each file (/foo/bar) contains chunks scattered all over the storage nodes. ChunkServer provides 4KB random read/write capability to support 4KB-aligned read/write on block devices.
CurveBS file structure … look at the metadata for a file in CurveBS.
1. A file in CurveBS consists of chunks. The default chunk size is 16MB. If the file maps directly to chunks, a 4TB file will consist of 256K (262,144) chunks …
13 pages | 2.03 MB | 6 months ago

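The chunk arithmetic in the CurveBS excerpt above can be checked with a few lines. This is a minimal sketch that assumes 16MB and 4TB are meant as binary units (16 MiB, 4 TiB); `chunk_count` and `locate` are hypothetical helper names, not CurveBS APIs.

```python
MIB = 1024 ** 2
TIB = 1024 ** 4
CHUNK_SIZE = 16 * MIB  # default chunk size stated in the excerpt


def chunk_count(file_size, chunk_size=CHUNK_SIZE):
    """Number of chunks needed to cover file_size bytes (rounded up)."""
    return -(-file_size // chunk_size)


def locate(offset, chunk_size=CHUNK_SIZE):
    """Map a byte offset in the file to (chunk index, offset inside that chunk)."""
    return divmod(offset, chunk_size)
```

Under these assumptions `chunk_count(4 * TIB)` is 262144, and a 4KB-aligned offset such as `16 * MIB + 4096` falls in chunk 1 at inner offset 4096.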
curvefs client File and Directory Deletion Design
 … request from
 * the kernel even after calls to unlink, rmdir or (when
 * overwriting an existing file) rename. Filesystems must handle
 * such requests properly and it is recommended to defer removal
 … closely by forget
 * unless the file or directory is open, in which case the
 * kernel issues forget only after the release or releasedir
 * calls.
 *
 * Note that if a file system will be exported over
 … unmount the lookup count for all inodes implicitly drops
 * to zero. It is not guaranteed that the file system will
 * receive corresponding forget messages for the affected
 * inodes …
15 pages | 325.42 KB | 6 months ago

OID CND Asia Slide: CurveFS
○ apps bundled with data locations
● Requirements for elastic block storage
● Requirements for file system
open-source storage
● Requirements
○ Cloud Native
○ Easy operation and maintenance
○ High …
● CopySet pre-allocation algorithm
● Raft Consistency protocol
High performance
● pre-created file pool
● data striping like RAID
● Zero data copy
● RDMA
Cloud Native
Cluster topology
The physical … on a physical server
Curve metadata organization
Curve maps virtual block devices to files
Each file contains chunks scattered across storage nodes in the cluster
Chunkservers are grouped by failure …
24 pages | 3.47 MB | 6 months ago

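CurveFS Client Outline Design
… +retrieve_reply +forget_multi +flock +fallocate +readdirplus +copy_file_range +lseek
Key interface analysis
init
void (*init) (void *userdata, struct fuse_conn_info *conn);
According to …
void (*write) (fuse_req_t req, fuse_ino_t ino, const char *buf, size_t size, off_t off, struct fuse_file_info *fi);
First, look up the corresponding inode structure in the cache by inode id;
if the inode cache does not contain the inode, obtain from the mds the copyset where the inode resides, metaserver … structure, and cache it;
check whether the space at the requested [off, size] position in the inode structure has been allocated: if it is unallocated or only partly allocated, call the space allocator to allocate space and update the inode structure (including the file length) according to the allocator's result;
the inode change must be persisted to the underlying layer and the local cache updated;
call the curve client interface to write the data at the corresponding [offset, len] of the curve volume.
(This raises a question: whether from fus…
11 pages | 487.92 KB | 6 months ago
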
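Curve File System Space Allocation Scheme
Curve File System Space Allocation Scheme (block-based scheme, implemented)
Contents: Background; Local file system space allocation features; Locality; Delayed allocation/Allocate-on-flush; Inline file/data; Space allocation; Overall design; Space allocation flow; Special cases; Space reclamation; Small file handling; Concurrency issues; File system expansion; Interface design; RPC interface; Space allocator interface
Background
According to …, the file system, based on the current block, … space.
Delayed allocation/Allocate-on-flush
Before sync/flush, accumulate as many file data blocks as possible before allocating space. This improves locality and also reduces disk fragmentation.
Inline file/data
Small files of a few hundred bytes are not given separate disk space; their data is stored directly in the file's metadata.
Given the local file system features above, Curve file system allocation needs to focus on … .
Locality
Although Curve is a distributed … only needs to be recorded with …tent, (0, 100MiB, 2MiB).
So if the repeated space requests of a file can be served with contiguous address space, the number of extents recorded in the inode drops sharply, which reduces the metadata volume of the whole file system.
Delayed allocation and inline file both need cooperation from the fuse client.
Space allocation
Overall design
The allocator has a two-layer structure:
The first layer is a bitmap; each bit marks whether its corresponding block of space (4MiB in this example; the size is configurable) has been allocated.
11 pages | 159.17 KB | 6 months ago
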
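The first allocator layer described in the space allocation excerpt above (one bit per fixed-size block) can be sketched as follows. This is an illustration only: the class name `BitmapLayer`, the first-fit search, and the `alloc`/`free` signatures are assumptions, and the second layer that the excerpt mentions is not modelled because the excerpt does not describe it.

```python
class BitmapLayer:
    """Hypothetical first-layer allocator: one flag per block of block_size bytes."""

    def __init__(self, total_bytes, block_size=4 * 1024 * 1024):
        self.block_size = block_size  # 4 MiB, the example size from the excerpt
        self.bits = [False] * (total_bytes // block_size)  # False means free

    def alloc(self, nbytes):
        """First-fit search for a contiguous run; returns (offset, length) or None."""
        if nbytes <= 0:
            raise ValueError("nbytes must be positive")
        need = -(-nbytes // self.block_size)  # blocks required, rounded up
        run_start, run_len = 0, 0
        for i, used in enumerate(self.bits):
            if used:
                run_start, run_len = i + 1, 0
                continue
            run_len += 1
            if run_len == need:
                for j in range(run_start, run_start + need):
                    self.bits[j] = True
                return run_start * self.block_size, need * self.block_size
        return None

    def free(self, offset, length):
        """Mark the blocks covered by a previously returned (offset, length) as free."""
        first = offset // self.block_size
        for j in range(first, first + length // self.block_size):
            self.bits[j] = False
```

Returning one contiguous (offset, length) pair per request matches the excerpt's point about locality: a file whose requests are served contiguously needs fewer extents in its inode.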
Curve for CNCF Main
… performance cloud native distributed block storage
• Curve File System (CurveFS)
• CurveFS: a high performance cloud native file system
Use Cases
• Container
• Database
• Data apps (middleware/bigdata/ai)
Features
• RAFT for data consistency
• minor impact when a chunk server fails
• Precreated chunk file for volume space mapping
• high performance framework
• Use bthread (M bthreads map to N pthreads)
Engine Comparison (vs. Ceph)
META MANAGEMENT | CURVE CHUNK SERVER | BLUESTORE
META | Precreate Chunk File Pool on ext4 | RocksDB
META OVERHEAD | without ext4 meta overhead | increase read/write amplification
21 pages | 4.56 MB | 6 months ago

CurveFS S3 Integration Design
… offset, uint64_t length) = 0;
};
metaserver.proto
enum FileType {
    TYPE_DIRECTORY = 1;
    TYPE_FILE = 2;
    TYPE_SYM_LINK = 3;
    TYPE_S3 = 4;
};
// inodes3chunk
message S3ChunkInfo … version = 2;
    required uint64 offset = 3;
    required uint64 len = 4;  // file logic length
    required uint64 size = 5;  // file size in object storage
};
message S3ChunkInfoList {
    repeated S3ChunkInfo … gid = 8;
    optional uint32 mode = 9;
    optional VolumeExtentList volumeExtentList = 10;  // TYPE_FILE only
    optional S3ChunkInfoList s3ChunkInfoList = 11;  // TYPE_S3 only
}
message UpdateInodeResponse …
11 pages | 145.77 KB | 6 months ago

22 results in total