Scaling a Multi-Tenant k8s Cluster in a Telco: Pablo Moncada, October 28, 2020. About MasMovil group ● 4th telecom company in Spain ● Provides voice and broadband services to +12M customers ● Several Services +3k CPU +2k Mem +5TB Nodes +300 kube-proxy replacement NetworkPolicy logging Multi-cluster DNS Aware NetworkPolicy Increased Istio security External Services TLS visibility Performance | 6 pages | 640.05 KB | 1 year ago
Cilium v1.10 Documentation: Advanced Networking Cluster Mesh Operations Istio Concepts Component Overview Terminology Networking Network Security eBPF Datapath Observability Kubernetes Integration Multi-Cluster (Cluster Mesh) Getting Troubleshooting Component & Cluster Health Observing Flows with Hubble Observing flows with Hubble Relay Connectivity Problems Policy Troubleshooting etcd (kvstore) Cluster Mesh Troubleshooting Symptom clusters? What is the 95th and 99th percentile latency between HTTP requests and responses in my cluster? Which services are performing the worst? What is the latency between two services? Security observability | 1307 pages | 19.26 MB | 1 year ago
Cilium v1.11 Documentation: Advanced Networking Cluster Mesh Operations Istio Concepts Component Overview Terminology Networking Network Security eBPF Datapath Observability Kubernetes Integration Multi-Cluster (Cluster Mesh) Getting Troubleshooting Component & Cluster Health Observing Flows with Hubble Observing flows with Hubble Relay Connectivity Problems Policy Troubleshooting etcd (kvstore) Cluster Mesh Troubleshooting Symptom clusters? What is the 95th and 99th percentile latency between HTTP requests and responses in my cluster? Which services are performing the worst? What is the latency between two services? Security observability | 1373 pages | 19.37 MB | 1 year ago
Cilium v1.9 Documentation: Terminology Networking Network Security eBPF Datapath Observability Kubernetes Integration Multi-Cluster (Cluster Mesh) Getting Help FAQ Slack GitHub Training Enterprise support Security Bugs Operations Scalability report Performance Evaluation Setup Evaluation Results Tuning Troubleshooting Component & Cluster Health Observing Flows with Hubble Observing flows with Hubble Relay Connectivity Problems Policy clusters? What is the 95th and 99th percentile latency between HTTP requests and responses in my cluster? Which services are performing the worst? What is the latency between two services? Security observability | 1263 pages | 18.62 MB | 1 year ago
Cilium v1.8 Documentation: Overview Terminology Networking Network Security eBPF Datapath Kubernetes Integration Multi-Cluster (Cluster Mesh) Getting Help FAQ Slack GitHub Security Bugs Operations System Requirements Summary Scalability report Performance Evaluation Setup Evaluation Results Tuning Troubleshooting Component & Cluster Health Observing Flows with Hubble Observing flows with Hubble Relay Connectivity Problems Policy clusters? What is the 95th and 99th percentile latency between HTTP requests and responses in my cluster? Which services are performing the worst? What is the latency between two services? Security observability | 1124 pages | 21.33 MB | 1 year ago
Cilium v1.7 Documentation: Agent Monitoring & Metrics Installation cilium-agent cilium-operator Troubleshooting Component & Cluster Health Connectivity Problems Policy Troubleshooting Symptom Library Useful Scripts Reporting a problem microservices. Traditional Linux network security approaches (e.g., iptables) filter on IP address and TCP/UDP ports, but IP addresses frequently churn in dynamic microservices environments. The highly volatile additional challenge is the ability to provide accurate visibility as traditional systems are using IP addresses as primary identification vehicle which may have a drastically reduced lifetime of just a | 885 pages | 12.41 MB | 1 year ago
Cilium v1.6 Documentation: Troubleshooting Monitoring & Metrics Installation cilium-agent cilium-operator Troubleshooting Component & Cluster Health Connectivity Problems Policy Troubleshooting Symptom Library Useful Scripts Reporting a problem microservices. Traditional Linux network security approaches (e.g., iptables) filter on IP address and TCP/UDP ports, but IP addresses frequently churn in dynamic microservices environments. The highly volatile additional challenge is the ability to provide accurate visibility as traditional systems are using IP addresses as primary identification vehicle which may have a drastically reduced lifetime of just a | 734 pages | 11.45 MB | 1 year ago
Cilium v1.5 Documentation: Exported Metrics Cilium as a Kubernetes pod Cilium as a host-agent on a node Troubleshooting Component & Cluster Health Connectivity Problems Policy Troubleshooting Automatic Diagnosis Symptom Library Useful Scripts microservices. Traditional Linux network security approaches (e.g., iptables) filter on IP address and TCP/UDP ports, but IP addresses frequently churn in dynamic microservices environments. The highly volatile An additional challenge is the ability to provide accurate visibility as traditional systems are using IP addresses as primary identification vehicle which may have a drastically reduced lifetime of just a few | 740 pages | 12.52 MB | 1 year ago
eBPF Summit 2020 Lightning Talk: corrections! Sad Rabbit Has No Memory • A faulty client spammed “AMQP consumers” • RabbitMQ cluster runs out of memory • Need a way to limit the number of consumers • But adding such a feature frame, IP header, TCP header • Only look at IPv4, TCP packet to AMQP port • Extract source IP & port as BPF map key Extract AMQP Methods Use BPF Maps • Using the source IP & port as map key • Map is a counter for consumers per connection • Increase when declare | 22 pages | 1.81 MB | 1 year ago
1.5 Years of Cilium Usage at DigitalOcean: on control plane to enable control/data plane connectivity ● Cilium state-keeping in shared cluster etcd Cilium in the DOKS Architecture Data Plane Node #1 cilium-agent Node #1 cilium-agent pretty smooth ○ moved from Cilium 1.4 initially to 1.8 today ○ retain old RBAC rules across certain cluster upgrades to avoid disruptions ● (Health checking) tooling really helpful in troubleshooting issues | 7 pages | 234.36 KB | 1 year ago













