Containers and BPF: twagent story (9 pages, 427.42 KB)
    namespaces: cgroup, mount, pid, and optionally: ipc, net, user, uts ● cgroup v2 ● ... other usual building blocks ... ● cgroup-bpf programs. The vast majority of twagent tasks have one or more cgroup-bpf features, e.g.: ○ sysctl access control. Let's look at some of them ... Example of cgroup-bpf programs (bpftool cgroup tree). Task IP assignment (aka IP-per-task): Facebook DC network is … and UDP is enough. Solution: make the task use a specified IP via a set of BPF_PROG_TYPE_CGROUP_SOCK_ADDR and BPF_CGROUP_SOCK_OPS programs. Move TCP/UDP servers to the task IP: bind(2): ctx.user_ip6 = task_ip
Cilium v1.11 Documentation (1373 pages, 19.37 MB)
    ... kube-proxy), cgroup v2 needs to be enabled by setting the kernel systemd.unified_cgroup_hierarchy=1 parameter. Also, cgroup v1 controllers net_cls and net_prio have to be disabled, or cgroup v1 has to be disabled (e.g. by setting the kernel cgroup_no_v1="all" parameter). This ensures that Kind nodes have their own cgroup namespace, and Cilium can attach BPF programs at the right cgroup hierarchy. To verify this: $ sudo ls -al /proc/$(docker inspect -f '{{.State.Pid}}' kind-control-plane)/ns/cgroup $ sudo ls -al /proc/self/ns/cgroup. See the Pull Request [https://github.com/cilium/cilium/pull/16259] for more details
Cilium v1.10 Documentation (1307 pages, 19.26 MB)
    ... replacement (Kubernetes Without kube-proxy), cgroup v1 controllers net_cls and net_prio have to be disabled, or cgroup v1 has to be disabled (e.g. by setting the kernel cgroup_no_v1="all" parameter). Validate the overlapping BPF cgroup type programs attached to the parent cgroup hierarchy of the kind container nodes. In such cases, either tear down Cilium, or manually detach the overlapping BPF cgroup programs running in the parent cgroup hierarchy by following the bpftool documentation [https://manpages.ubuntu.com/manpages/focal/man8/bpftool-cgroup.8.html]. For more information, see the Pull Request [https://github
Cilium v1.9 Documentation (1263 pages, 18.62 MB)
    ... replacement (Kubernetes Without kube-proxy), cgroup v1 controllers net_cls and net_prio have to be disabled, or cgroup v1 has to be disabled (e.g. by setting the kernel cgroup_no_v1="all" parameter). Validate the overlapping BPF cgroup type programs attached to the parent cgroup hierarchy of the kind container nodes. In such cases, either tear down Cilium, or manually detach the overlapping BPF cgroup programs running in the parent cgroup hierarchy by following the bpftool documentation [https://manpages.ubuntu.com/manpages/focal/man8/bpftool-cgroup.8.html]. For more information, see the Pull Request [https://github
Cilium的网络加速秘诀 (Cilium's Network Acceleration Secrets) (14 pages, 11.97 MB)
    • sched_cls: Cilium implements packet forwarding, load balancing, and filtering at the kernel TC layer • xdp: Cilium implements packet forwarding, load balancing, and filtering at the XDP layer • cgroup_sock_addr: Cilium resolves services in cgroup hooks • sock_ops + sk_msg: records the sockets used by local applications communicating with each other, to accelerate local packet forwarding. Accelerating pod-to-pod communication on the same node: ... -> pod3: 172.20.0.30:80; step2: pod3: 172.20.0.30:80 -> pod1: 172.20.0.10:10000; cgroup eBPF service DNAT: connect, sendmsg, recvmsg, getpeername, bind. Cilium's Host-Reachable technique uses eBPF programs to intercept the application's connect ... in the kernel
Cilium v1.6 Documentation (734 pages, 11.45 MB)
    ... RESTARTS AGE cilium-crf7f 1/1 Running 0 10m. Limitations: the kernel BPF cgroup hooks operate at the connect(2), sendmsg(2) and recvmsg(2) system call layers for connecting the application ... The socket operations hook is attached to a specific cgroup and runs on TCP events. Cilium attaches a BPF socket operations program to the root cgroup and uses this to monitor for TCP state transitions ... do echo "cat $log"; cat $log; done ... cat /var/run/cilium/state/bpf_features.log: BPF/probes: CONFIG_CGROUP_BPF=y is not in kernel configuration; BPF/probes: CONFIG_LWTUNNEL_BPF=y is not in kernel configuration
Cilium v1.5 Documentation (740 pages, 12.52 MB)
    ... operations: The socket operations hook is attached to a specific cgroup and runs on TCP events. Cilium attaches a BPF socket operations program to the root cgroup and uses this to monitor for TCP state transitions, specifically ... do echo "cat $log"; cat $log; done ... cat /var/run/cilium/state/bpf_features.log: BPF/probes: CONFIG_CGROUP_BPF=y is not in kernel configuration; BPF/probes: CONFIG_LWTUNNEL_BPF=y is not in kernel configuration ... currently in the kernel are BPF_MAP_TYPE_PROG_ARRAY, BPF_MAP_TYPE_PERF_EVENT_ARRAY, BPF_MAP_TYPE_CGROUP_ARRAY, BPF_MAP_TYPE_STACK_TRACE, BPF_MAP_TYPE_ARRAY_OF_MAPS, BPF_MAP_TYPE_HASH_OF_MAPS. For
Cilium v1.7 Documentation (885 pages, 12.41 MB)
    ... RESTARTS AGE cilium-crf7f 1/1 Running 0 10m. Limitations: the kernel BPF cgroup hooks operate at the connect(2), sendmsg(2) and recvmsg(2) system call layers for connecting the application ... Cilium's BPF kube-proxy replacement relies upon the Host-Reachable Services feature, which uses BPF cgroup hooks to implement the service translation. The getpeername(2) hook is currently missing, which will ... The socket operations hook is attached to a specific cgroup and runs on TCP events. Cilium attaches a BPF socket operations program to the root cgroup and uses this to monitor for TCP state transitions
Cilium v1.8 Documentation (1124 pages, 21.33 MB)
    ... RESTARTS AGE cilium-crf7f 1/1 Running 0 10m. Limitations: the kernel BPF cgroup hooks operate at the connect(2), sendmsg(2) and recvmsg(2) system call layers for connecting the application ... Cilium's eBPF kube-proxy replacement relies upon the Host-Reachable Services feature, which uses eBPF cgroup hooks to implement the service translation. Using it with libceph deployments currently requires ... The socket operations hook is attached to a specific cgroup and runs on TCP events. Cilium attaches a BPF socket operations program to the root cgroup and uses this to monitor for TCP state transitions
How and When You Should Measure CPU Overhead of eBPF Programs (20 pages, 2.04 MB)
    ... a support matrix of program types (the column headers were not captured in this snippet), each marked ✔ in all five columns: BPF_PROG_TYPE_SCHED_CLS, BPF_PROG_TYPE_SCHED_ACT, BPF_PROG_TYPE_CGROUP_SKB, BPF_PROG_TYPE_LWT_IN, BPF_PROG_TYPE_LWT_OUT, BPF_PROG_TYPE_LWT_XMIT
12 results in total