A Day in the Life of a Data Scientist: Conquer Machine Learning Lifecycle on Kubernetes
Brian Redmond • Cloud Architect @ Microsoft (18 years) • Azure Global Black Belt Team • Lives in Pittsburgh, PA ... Repeatable/consistent • CI/CD • This has worked well for App Dev. Now time for AI/ML • But must ensure data scientists are not hindered by structure ... Why Containers, Kubernetes & Helm? • Container • Contains ... Scalable • Easy to explore hyper-parameter space • Easy to do distributed training ... But really, data scientists shouldn't have to care about containers, Kubernetes and all that stuff • Pachyderm can ...
0 credits | 21 pages | 68.69 MB | 1 year ago
Kubernetes Open Source Book (Kubernetes开源书) - Zhou Li (周立)
... "qa", "environment" : "production" ... "tier" : "frontend", "tier" : "backend", "tier" : "cache" ... "partition" : "customerA", "partition" : "customerB" ... "track" : "daily", "track" : "weekly" ... component: redis matchExpressions: - {key: tier, operator: In, values: [cache]} - {key: environment, operator: NotIn, values: [dev]} ... matchLabels is a map of {key, value} pairs. ... capacity: storage: 10Gi accessModes: - ReadWriteOnce hostPath: path: "/mnt/data" --- kind: PersistentVolumeClaim apiVersion: v1 metadata: name: pv-claim spec: storageClassName: ...
0 credits | 135 pages | 21.02 MB | 1 year ago
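The selector fragment quoted in the snippet above (matchLabels plus matchExpressions with In and NotIn) can be illustrated with a small sketch. This is not Kubernetes code: the function name `matches` and the `SELECTOR` dictionary are hypothetical, and the sketch only assumes the documented semantics that all requirements are ANDed and that NotIn also matches objects lacking the key.

```python
from typing import Any, Dict


def matches(labels: Dict[str, str], selector: Dict[str, Any]) -> bool:
    """Return True if `labels` satisfies every requirement in `selector`.

    Hypothetical helper for illustration; requirements are ANDed together.
    """
    # matchLabels: every key must be present with exactly the given value.
    for key, value in selector.get("matchLabels", {}).items():
        if labels.get(key) != value:
            return False

    # matchExpressions: set-based requirements.
    for expr in selector.get("matchExpressions", []):
        key = expr["key"]
        operator = expr["operator"]
        values = expr.get("values", [])
        if operator == "In":
            ok = key in labels and labels[key] in values
        elif operator == "NotIn":
            # An object without the key also satisfies NotIn.
            ok = key not in labels or labels[key] not in values
        elif operator == "Exists":
            ok = key in labels
        elif operator == "DoesNotExist":
            ok = key not in labels
        else:
            raise ValueError(f"unsupported operator: {operator}")
        if not ok:
            return False
    return True


# The selector from the snippet, written as a Python dictionary.
SELECTOR = {
    "matchLabels": {"component": "redis"},
    "matchExpressions": [
        {"key": "tier", "operator": "In", "values": ["cache"]},
        {"key": "environment", "operator": "NotIn", "values": ["dev"]},
    ],
}
```

For example, an object labelled component=redis, tier=cache, environment=production would be selected, while the same object with environment=dev would not.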
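Operator Pattern: Best Practices for Extending Kubernetes with Go
Patching & Upgrades: minor version upgrades, major version upgrades, security vulnerability fixes, and so on. Data Migrations: migration, synchronization, cleansing, cross-region, disaster recovery, multi-active, and so on. DB Operator Day-2 Operations ... Operator basic model, Part 2 ... K8s architecture ... Cache: the Informer mechanism. How the cache is fetched locally (into memory): after the Informer starts, it uses the reflector ... HTTP/2 long-lived connection ... How the cache stays consistent with the API Server: in the list & watch mechanism, list fetches a snapshot of the data in the API Server and records its ResourceVersion; watch starts from that ResourceVersion and fetches the subsequent incremental data. watch fetches incremental data asynchronously over the network, so the cache provides eventual consistency (eventual ... and completes initialization ... Cache caveats: all objects in the cache are held in memory, so memory usage can be large when there are many objects. On the one hand, estimate the controller's memory consumption from the size of a single object and the total number of objects. On the other hand, the informer provides a sharing mechanism for objects of the same type, which reduces memory overhead ... A close look at the list & watch mechanism ... The nature of the cache and development advice: trust the cache. Trust that the cache can eventually provide all the data you want ...
0 credits | 21 pages | 3.06 MB | 9 months ago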
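The list & watch description in the entry above can be sketched with a toy, in-memory model. This is a simplified simulation, not client-go or the real API Server: the classes `FakeAPIServer` and `LocalCache` and all their methods are hypothetical names. It shows only the idea that list returns a snapshot plus a resource version, and watch replays the changes made after that version.

```python
from typing import Any, Dict, List, Tuple

Event = Tuple[int, str, str, Any]  # (resource_version, type, name, object)


class FakeAPIServer:
    """Hypothetical stand-in for an API server with a versioned event log."""

    def __init__(self) -> None:
        self.resource_version = 0
        self.objects: Dict[str, Any] = {}
        self.events: List[Event] = []

    def apply(self, name: str, obj: Any) -> None:
        # Every write bumps the resource version and is logged as an event.
        self.resource_version += 1
        event_type = "MODIFIED" if name in self.objects else "ADDED"
        self.objects[name] = obj
        self.events.append((self.resource_version, event_type, name, obj))

    def delete(self, name: str) -> None:
        self.resource_version += 1
        obj = self.objects.pop(name)
        self.events.append((self.resource_version, "DELETED", name, obj))

    def list(self) -> Tuple[Dict[str, Any], int]:
        # A snapshot of current state plus the version it corresponds to.
        return dict(self.objects), self.resource_version

    def watch(self, since_rv: int) -> List[Event]:
        # Only the incremental changes made after `since_rv`.
        return [event for event in self.events if event[0] > since_rv]


class LocalCache:
    """Hypothetical informer-style cache: list once, then apply watch events."""

    def __init__(self, server: FakeAPIServer) -> None:
        self.server = server
        self.store: Dict[str, Any] = {}
        self.resource_version = 0

    def initial_list(self) -> None:
        self.store, self.resource_version = self.server.list()

    def sync(self) -> None:
        for rv, event_type, name, obj in self.server.watch(self.resource_version):
            if event_type == "DELETED":
                self.store.pop(name, None)
            else:
                self.store[name] = obj
            self.resource_version = rv
```

Between two `sync()` calls the cache can lag behind the server, which is the eventual consistency the snippet mentions; once the pending events are applied, the two converge.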
Kubernetes Native DevOps Practice
... scale • Reduce the learning curve for customers and ourselves • Get a consistent user experience and data, leveraging PaaS capability • Facilitate our PaaS and micro-service product ... Kubernetes Capabilities/Advantages ... scheduler policy ... Build tasks and the dependent environments (sidecar) ... Share files between containers, or cache build files ... Container Image - image of build / dependent environment ... Command - agent for collecting log data ... ElasticSearch, Monitor/Alert Service, CronJob, Node, Pod ... Unified logging, monitoring, and alerting with PaaS ... Consistent data ... Node group of build nodes
0 credits | 21 pages | 6.39 MB | 1 year ago
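Alluxio Empowers Kubernetes to Accelerate Deep Learning in the Cloud
Funded by the well-known Silicon Valley investment firm Andreessen Horowitz, the company was founded in the San Francisco Bay Area in 2015 and is dedicated to advancing the open-source project and community as well as commercialization ... A memory-speed data orchestration system for big data and AI applications ... Data Orchestration layer: Java File API, HDFS Interface, S3 Interface, REST API, POSIX Interface ... What is Alluxio ... HDFS, Synthetic, Alluxio ... 1. Cache metadata to reduce gRPC interactions: Client, Master, getStatus(), open(), read(), …() ... Client, Master, Meta Cache, First Access, LRU, listStatus() ... 2. Controlling Alluxio caching behavior (parameter, value, meaning): alluxio.user.ufs.block.read.location ... node; choose another node instead. alluxio.user.file.passive.cache.enabled = false: whether to cache the file to the local Alluxio worker when reading it from a remote Alluxio worker. alluxio.user.file.readtype.default = CACHE: the default CACHE_PROMOTE incurs significant performance overhead. Strategy: 1. Prefer loading the cache locally 2. Avoid data thrashing
0 credits | 22 pages | 11.79 MB | 1 year ago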
KubeCon2020 / Microservices Technology and Practice Forum / Best Practices for Microservice Governance with Spring Cloud Alibaba on Kubernetes - Fang Jian (方剑)
... management of these services, which may be written in different programming languages and use different data storage technologies." (Martin Fowler) ... What is a microservice architecture? ... Monolith - Microservices https://tanzu.vmware.com/content/blog/ ... Aliware: MQ, GTS, AHAS, MSE, Micro Gateway ... Spring Data: RDS, MySQL, Cassandra, MongoDB, ElasticSearch, PostgreSQL ... Spring Cache: Redis, Memcache ... Spring Resource: OSS ... Spring ...
0 credits | 27 pages | 7.10 MB | 1 year ago
Putting an Invisible Shield on Kubernetes Secrets
... information • Passwords • OAuth tokens • ssh keys etc. • Stored in etcd • distributed key-value data store • How about their security? • Default K8s setup • etcd contents not encrypted (only base64 ... • apiserver is responsible for • DEK generation • Secret en/decryption • kms-plugin • keeps KEK cache • only en/decrypts DEK, not secrets ... Encryption Workflow ... Decryption Workflow ... KMS Plugin (cont.)
0 credits | 33 pages | 20.81 MB | 1 year ago
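逐灵 & 木苏 - Alibaba's Experience Running K8S at Ultra-Large Scale
Call-chain RT/QPS, service exceptions, queue length, gRPC monitoring, long-lived connection distribution, request distribution, rate limiting ... Authorization, Authentication, serialization, compression, version conversion, Admission, Cache, Storage, Filter Chain, API, storage ... Kube-APIServer, Webhook, ETCD ... data construction, stress-test scenarios, stress-test environment, stress-test report, stress-test platform, monitoring & dashboards • APIServer Watch optimization: ETCD, Cache, Pod A V1, Pod A V2, Pod A V3, Reflector, APIServer, Watch Cache, List & Watch, Informer, Reflector, Store, List & Watch • Network jitter causes the informer to re-List & Watch ... List & Watch optimization: Cache, APIServer, 13, Watch Cache, Informer, Store, Kubelets, Watch (rv=3 node=x), Too old version err, rv=3, FIFO ... Cache, APIServer, 5 9 11 13, Watch Cache, Informer ...
0 credits | 33 pages | 8.67 MB | 6 months ago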
Jib Kubecon 2018 Talk
... speed 7. Switch to use a Maven plugin ... Download and install Docker ... Order of layers to optimize for cache hits ... Use of multi-stage builds ... Understanding Docker cache mechanism and quirks ... github.com/GoogleContainerTools/jib ... What did we do? 1. Write first Dockerfile 2. Reduce ...
0 credits | 90 pages | 2.84 MB | 1 year ago
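QCon Beijing 2018 / QCon Beijing 2018 - Building an Application Deployment Platform Based on Kubernetes and Helm - Zhang Xia (张夏) - Zhao Ming (赵明)
... + Kubernetes-based CI/CD ... Best practices for building Docker images: expected ... in practice ... Cache, Volume, yum/apk cache ... Managing Kubernetes applications with Helm • The only open-source subproject for K8s service orchestration; the K8s package manager • Used to deploy complex K8s applications and handle complex inter-service dependencies • View the release history and the specific configuration of a given release
0 credits | 28 pages | 12.18 MB | 1 year ago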













