7. UDF in ClickHouse
UDF in ClickHouse: Concept, Development, and Application in ML Systems ... About CraiditX: CraiditX (氪信), a finance AI startup ... randomly. Example: finding a shortest path in the graph • Iterating. Example: training a regression model • Handling domain-specific data. Example: computing the edit distance between two strings • ...
29 pages | 1.54 MB | 1 year ago
ClickHouse: настоящее и будущее (ClickHouse: Present and Future)
Telecom traffic analysis; DPI analysis; CDR records analysis; fraud & spam detection; DDoS protection; application performance monitoring; logs & metrics; security events and logs; SIEM; analytics of corporate networks ... platforms: • x86_64, aarch64 (ARM), PowerPC 64, RISC-V • Linux, FreeBSD, macOS ... ClickHouse is true open source: • Source code is publicly available • Community patches are accepted • Open processes. Maximum encouragement and engagement of the community. Talk "How to build a living community around an open-source product": youtube.com/watch?v=xddKLojmkus&t=4165s ... ClickHouse is a bad* system: It is not ...
32 pages | 2.62 MB | 1 year ago
ClickHouse: настоящее и будущее (ClickHouse: Present and Future)
Telecom traffic analysis; DPI analysis; CDR records analysis; fraud & spam detection; DDoS protection; application performance monitoring; logs & metrics; security events and logs; SIEM; analytics of corporate networks ... platforms: • x86_64, aarch64 (ARM), PowerPC 64, RISC-V • Linux, FreeBSD, macOS ... ClickHouse is true open source: • Source code is publicly available • Community patches are accepted • Open processes. Maximum encouragement and engagement of the community. Talk "How to build a living community around an open-source product": youtube.com/watch?v=xddKLojmkus&t=4165s ... ClickHouse is a bad* system: It is not ...
32 pages | 776.70 KB | 1 year ago
ClickHouse on Kubernetes
Actually it’s an open-source platform to: ● manage container-based systems ● build distributed applications declaratively ● allocate machine resources efficiently ● automate application deployment
34 pages | 5.06 MB | 1 year ago
ClickHouse on Kubernetes
Actually it’s an open-source platform to: ● manage container-based systems ● build distributed applications declaratively ● allocate machine resources efficiently ● automate application deployment
29 pages | 3.87 MB | 1 year ago
1. Machine Learning with ClickHouse
... stochastic gradient descent. Related page: https://www.jianshu.com/p/9329294d56d2 ... Stochastic model with default parameters: SELECT stochasticLinearRegression( total_amount, trip_distance, toYear(pickup_datetime) ... 0.08 + 5.91 ... Year doesn’t seem to matter a lot for the trained model ... Actually, our last feature was not ... 5418692076782445] That’s better! ... Models management in ClickHouse: How to store a trained model: You can store the model as an aggregate function state in a separate ...
64 pages | 1.38 MB | 1 year ago
0. Machine Learning with ClickHouse
... Related wiki page: https://en.wikipedia.org/wiki/Stochastic_gradient_descent ... Stochastic model with default parameters: SELECT stochasticLinearRegression( total_amount, trip_distance, toYear(pickup_datetime) ... 0.08 + 5.91 ... Year doesn’t seem to matter a lot for the trained model ... Actually, our last feature was not ... 5418692076782445] That’s better! ... Models management in ClickHouse: How to store a trained model: You can store the model as an aggregate function state in a separate ...
64 pages | 1.38 MB | 1 year ago
8. Continue to use ClickHouse as TSDB
... we choose it ... How we do ► ClickHouse implementation approaches ► (1) Column-Orient Model ► (2) Time-Series-Orient Model ... ► Column-Orient Model: CREATE TABLE demonstration.insert_view ( `Time` ... ENGINE = MergeTree() PARTITION BY toYYYYMM(Time) ORDER BY (Name, Time, Age, ...); ... ► Column-Orient Model: CREATE TABLE demonstration.insert_view ( `Time` DateTime, `Name` LowCardinality(String) ... ► Column-Orient Model: CPU: Intel Skylake, 8 cores; Memory: 64 GB; Disk: 500 GB SSD; Data set: TSBS, 12 hours, 40000 drivers, 10 metrics ≈ 16.9 billion rows ...
42 pages | 911.10 KB | 1 year ago
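6. ClickHouse在众安的实践 (ClickHouse in Practice at ZhongAn)
Infrastructure; model; feedback; intelligent applications; open and agile • Unified modeling and management of big data and streaming data • Vertical industry templates that simplify development • Multi-language, multi-runtime support; bring your own model • Full-lifecycle management of data flow, modeling, and machine learning tasks • Large-scale online task monitoring, automatic model performance monitoring, retraining, and release • Data lineage tracing and version management for data and algorithm models • Reproducible and auditable algorithm model results
28 pages | 4.00 MB | 1 year ago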
ClickHouse in Production
Flexible SQL dialect › Store petabytes of data › Fault-tolerant › 1000+ companies using in production › Open-source › Hundreds of contributors ... ClickHouse is NOT Good for › Frequent small inserts › ... iodbc › Compatible with Tableau › Open source https://github.com/ClickHouse/clickhouse-odbc ... JDBC › Allows to use different formats › Configurable › Actively supported › Open source https://github.com/Clic...
100 pages | 6.86 MB | 1 year ago













