Bypassing conntrack: Optimizing K8s Network Performance by Enhancing IPVS with eBPF
## Bypassing conntrack: Optimizing K8s Service by Enhancing IPVS with eBPF
Jianming Fan (kenieevan@github), Zhiguo Hong (honkiko@github)
## Tencent Cloud
## Agenda
01 Problems with K8s Service 02 How … rules are difficult to debug
## IPVS mode
• Services are organized in a hash table
• IPVS DNAT
• conntrack/iptables SNAT
## Pros
• O(1) time complexity in the control and data planes
• Has run stably for two decades
• … algorithm
## Cons
• Performance cost caused by conntrack
• Some bugs
[Figure: packet-path diagram; labels: ip_rcv, NF pre-route hooks, conntrack, route lookup, NF local-in hooks, ipvs, lookup service, hit?, alloc rs and DNAT, route lookup, call …]
0 码力 | 24 pages | 1.90 MB | 2 years ago
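Tencent Cloud Kubernetes High-Performance Networking Revealed: Optimizing K8s Network Performance by Enhancing IPVS with eBPF (Jianming Fan)
• Services are managed in a hash table
• IPVS provides only DNAT; SNAT still relies on iptables + conntrack
[Figure: packet-path diagram; labels: ip_rcv, NF pre-route hooks, conntrack, route lookup, NF local-in hooks, ipvs, lookup service, hit?, alloc rs and DNAT, virtio_xmit]
## IPVS mode
## Advantages
• O(1) algorithmic complexity in both the control plane and the data plane
• Stable and mature after more than two decades in operation
• Supports multiple scheduling algorithms
## Shortcomings
• Does not bypass conntrack, which incurs a performance cost
• Still has some bugs in real-world K8s use
## 02 Optimization approach
## Guiding principles
• Process each packet with as few CPU instructions as possible
• Must not monopolize the CPU
• Balance product stability with a sufficiently rich feature set
0 码力 | 27 pages | 1.19 MB | 1 year ago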
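The two slide decks above both note that IPVS keeps services in a hash table and performs DNAT to a real server. As a rough illustration of why the lookup cost stays O(1) as services are added, here is a minimal Python sketch. It is not IPVS code: the names `ServiceTable`, `VirtualService`, and `dnat`, the addresses, and the round-robin choice are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from itertools import cycle
from typing import Dict, Iterator, List, Optional, Tuple

# Hypothetical key for this sketch: (protocol, virtual IP, port).
ServiceKey = Tuple[str, str, int]


@dataclass
class VirtualService:
    """A virtual service whose real servers are picked round-robin."""
    backends: List[str]
    _next: Iterator[str] = field(init=False, repr=False)

    def __post_init__(self) -> None:
        if not self.backends:
            raise ValueError("a service needs at least one backend")
        # cycle() yields the backends in order, forever.
        self._next = cycle(self.backends)

    def pick_backend(self) -> str:
        return next(self._next)


class ServiceTable:
    """Services indexed by a dict, so lookup is one hash probe on average."""

    def __init__(self) -> None:
        self._services: Dict[ServiceKey, VirtualService] = {}

    def add(self, proto: str, vip: str, port: int, backends: List[str]) -> None:
        self._services[(proto, vip, port)] = VirtualService(list(backends))

    def dnat(self, proto: str, vip: str, port: int) -> Optional[str]:
        """Return the backend to rewrite the destination to, or None on a miss."""
        svc = self._services.get((proto, vip, port))
        return svc.pick_backend() if svc is not None else None


if __name__ == "__main__":
    table = ServiceTable()
    table.add("tcp", "10.0.0.1", 80, ["192.168.1.10:8080", "192.168.1.11:8080"])
    for _ in range(3):
        print(table.dnat("tcp", "10.0.0.1", 80))
```

The lookup cost does not depend on how many services are registered, which is the property the slides summarize as O(1) in the data plane. The sketch leaves out SNAT and connection tracking entirely.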
Building a Secure and Maintainable PaaS
… CILIUM_INPUT" -j CILIUM_INPUT
-A INPUT -m conntrack --ctstate NEW -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A INPUT -m conntrack --ctstate NEW -m comment --comment "kubernetes …
… FORWARD -m comment --comment "kubernetes forwarding rules" -j KUBE-FORWARD
-A FORWARD -m conntrack --ctstate NEW -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A OUTPUT …
… OUTPUT -m comment --comment "cilium-feeder: CILIUM_OUTPUT" -j CILIUM_OUTPUT
-A OUTPUT -m conntrack --ctstate NEW -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A OUTPUT …
0 码力 | 20 pages | 2.26 MB | 1 year ago
Cilium v1.5 Documentation
… or until pods were restarted.
###### Upgrading from >=1.4.0 to 1.5.y
In v1.4, the TCP conntrack table size ct-global-max-entries-tcp ConfigMap parameter was ineffective due to a bug and thus, … utilization below 25%. If needed, the interval can be set to a static interval with the option --conntrack-gc-interval. If connectivity fails and cilium monitor --type drop shows xx drop (CT: Map insertion … filling up and the automatic adjustment of the garbage collector interval is insufficient. Set --conntrack-gc-interval to an interval lower than the default. Alternatively, the value for bpf-ct-global-any-max …
0 码力 | 740 pages | 12.52 MB | 1 year ago
Cilium v1.7 Documentation
… mark --mark KUBE-MARK-MASQ -j ACCEPT
-s 10.233.64.0/18 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-d 10.233.64.0/18 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
KUBE-SERVICES ! -s …
… policy_l7_total instead.
1.5 Upgrade Notes
Upgrading from >=1.4.0 to 1.5.y
1. In v1.4, the TCP conntrack table size ct-global-max-entries-tcp ConfigMap parameter was ineffective due to a bug and thus, … utilization below 25%. If needed, the interval can be set to a static interval with the option --conntrack-gc-interval. If connectivity fails and cilium monitor --type drop shows xx drop (CT: Map insertion …
0 码力 | 885 pages | 12.41 MB | 1 year ago
Cilium v1.8 Documentation
… mark --mark KUBE-MARK-MASQ -j ACCEPT
-s 10.233.64.0/18 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-d 10.233.64.0/18 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
KUBE-SERVICES ! -s …
… required the following command can be used to check the currently configured maximum number of TCP conntrack entries:
sudo grep -R CT_MAP_SIZE_TCP /var/run/cilium/state/templates/
If the maximum number is …
0 码力 | 1124 pages | 21.33 MB | 1 year ago
Cilium v1.10 Documentation
… cilium_policy_import_errors_total instead.
- cilium_datapath_errors_total is removed. Please use cilium_datapath_conntrack_dump_resets_total instead.
- Label mapName in cilium_bpf_map_ops_total is removed. Please use label subnet_id and availability_zone instead.
## New Metrics
- cilium_datapath_conntrack_dump_resets_total: Number of conntrack dump resets. Happens when a BPF entry gets removed while dumping the map is in … to toggle Cilium installing itself as the only available CNI plugin on all nodes.
- install-no-conntrack-iptables-rules: This option, by default set to false, installs some extra iptables rules to skip …
0 码力 | 1307 pages | 19.26 MB | 1 year ago
Cilium v1.6 Documentation
… instead.
### 1.5 Upgrade Notes
###### Upgrading from >=1.4.0 to 1.5.y
1. In v1.4, the TCP conntrack table size ct-global-max-entries-tcp ConfigMap parameter was ineffective due to a bug and thus, … utilization below 25%. If needed, the interval can be set to a static interval with the option --conntrack-gc-interval. If connectivity fails and cilium monitor --type drop shows xx drop (CT: Map insertion … filling up and the automatic adjustment of the garbage collector interval is insufficient. Set --conntrack-gc-interval to an interval lower than the default. Alternatively, the value for bpf-ct-global-any-max …
0 码力 | 734 pages | 11.45 MB | 1 year ago
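k8s Operations Manual 2.3
… bridge-nf-call-arptables = 1
net.ipv4.ip_forward = 1
EOF
# The first 3 lines make a bridge device also apply the layer-3 rules configured in iptables (including conntrack) when it forwards at layer 2
# sysctl -p    # load the configuration
⑧ Open the firewall ports
TCP: 6443, 2379, 2380, 10250~10252 …
… ip_vs_rr
ip_vs_wrr
nf_conntrack_ipv4
EOF
# modprobe ip_vs
# modprobe ip_vs_sh
# modprobe ip_vs_rr
# modprobe ip_vs_wrr
# modprobe nf_conntrack_ipv4    # usually only ip_vs_rr is used by default
# lsmod | grep -e ip_vs -e nf_conntrack    # check whether the ip_vs modules are loaded
★ Finally, reboot the operating system
# reboot
### ★ Chapter 1: Deploying k8s versions <= 1.23
In version 1.23 and earlier, k8s calls docker as the underlying container runtime by default. Starting with 1.24, the dockershim component was removed and docker is no longer supported, so containerd is used by default …
0 码力 | 126 pages | 4.33 MB | 2 years ago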
Cilium v1.9 Documentation
… mark --mark KUBE-MARK-MASQ -j ACCEPT
-s 10.233.64.0/18 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-d 10.233.64.0/18 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
KUBE-SERVICES ! -s …
… required the following command can be used to check the currently configured maximum number of TCP conntrack entries:
sudo grep -R CT_MAP_SIZE_TCP /var/run/cilium/state/templates/
If the maximum number is …
… table size parameter bpf-nat-global-max in the daemon is derived from the default value of the conntrack table size parameter bpf-ct-global-tcp-max. Since the latter was changed (see above), the default …
0 码力 | 1263 pages | 18.62 MB | 1 year ago
52 results in total