<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Distributed on 黄文卓 | DevOps Engineer</title><link>https://socake.github.io/tags/%E5%88%86%E5%B8%83%E5%BC%8F/</link><description>Recent content in Distributed on 黄文卓 | DevOps Engineer</description><generator>Hugo -- gohugo.io</generator><language>zh-CN</language><managingEditor>17691281867@163.com (Wenzhuo Huang)</managingEditor><webMaster>17691281867@163.com (Wenzhuo Huang)</webMaster><copyright>© 2026 Wenzhuo Huang</copyright><lastBuildDate>Sun, 29 Mar 2026 10:45:00 +0800</lastBuildDate><atom:link href="https://socake.github.io/tags/%E5%88%86%E5%B8%83%E5%BC%8F/index.xml" rel="self" type="application/rss+xml"/><item><title>Ray Serve Model Deployment in Practice: Deployments, DAG Orchestration, and Autoscaling</title><link>https://socake.github.io/posts/ray-serve-model-deployment/</link><pubDate>Sun, 29 Mar 2026 10:45:00 +0800</pubDate><author>17691281867@163.com (Wenzhuo Huang)</author><guid>https://socake.github.io/posts/ray-serve-model-deployment/</guid><description>Ray Serve is a model-serving framework that many teams overlook. For complex DAGs, heterogeneous resources, and autoscaling, it goes well beyond what a plain FastAPI service offers. This post explains its core abstractions and how to run it in production.</description><media:content xmlns:media="http://search.yahoo.com/mrss/" url="https://socake.github.io/posts/ray-serve-model-deployment/featured.jpg"/></item><item><title>ClickHouse Production Operations in Practice: Cluster Deployment, Replication and Sharding, Performance Tuning, and Troubleshooting</title><link>https://socake.github.io/posts/clickhouse-ops-practice/</link><pubDate>Sun, 15 Mar 2026 10:00:00 +0800</pubDate><author>17691281867@163.com (Wenzhuo Huang)</author><guid>https://socake.github.io/posts/clickhouse-ops-practice/</guid><description>Behind ClickHouse's high-throughput OLAP capability is a distinctive operational model: ReplicatedMergeTree, ZooKeeper/Keeper, distributed tables, materialized views, TTL, and choosing among the MergeTree family of engines. Following the path to production, this post covers cluster planning, replication and sharding, write optimization, query tuning, materialized views, and slow-query troubleshooting, with reusable SQL and operations scripts.</description><media:content xmlns:media="http://search.yahoo.com/mrss/" url="https://socake.github.io/posts/clickhouse-ops-practice/featured.jpg"/></item><item><title>vLLM Multi-Node, Multi-GPU Distributed Inference: Tensor Parallel Tuning and Pitfalls</title><link>https://socake.github.io/posts/vllm-multi-node-distributed/</link><pubDate>Tue, 03 Mar 2026 09:30:00 +0800</pubDate><author>17691281867@163.com (Wenzhuo Huang)</author><guid>https://socake.github.io/posts/vllm-multi-node-distributed/</guid><description>Starting from a single node with 8 GPUs and moving to multi-node, multi-GPU setups, this post ties vLLM's TP/PP partitioning, Ray launch methods, NCCL tuning, PagedAttention GPU memory accounting, and common failure scenarios into one complete path to production.</description><media:content xmlns:media="http://search.yahoo.com/mrss/" url="https://socake.github.io/posts/vllm-multi-node-distributed/featured.jpg"/></item><item><title>ETCD Operations in Practice: Deployment, Backup and Restore, and K8s Cluster Data Management</title><link>https://socake.github.io/posts/etcd-ops-practice/</link><pubDate>Sun, 13 Apr 2025 13:37:00 +0800</pubDate><author>17691281867@163.com (Wenzhuo Huang)</author><guid>https://socake.github.io/posts/etcd-ops-practice/</guid><description>ETCD is the lifeline of Kubernetes: all cluster state is stored in it. From a hands-on operations perspective, this post walks through the full workflow of deployment, backup, restore, and dynamic configuration updates, including several pitfalls encountered along the way.</description><media:content xmlns:media="http://search.yahoo.com/mrss/" url="https://socake.github.io/posts/etcd-ops-practice/featured.jpg"/></item><item><title>MongoDB Sharded Clusters in Practice: The Full Path from Shard Key Design to Chunk Balancing</title><link>https://socake.github.io/posts/mongodb-sharding-practice/</link><pubDate>Wed, 20 Nov 2024 15:00:00 +0800</pubDate><author>17691281867@163.com (Wenzhuo Huang)</author><guid>https://socake.github.io/posts/mongodb-sharding-practice/</guid><description>Many teams treat MongoDB sharding as &amp;quot;just set a shard key and you're done&amp;quot;, only to find six months after launch that 80% of the data sits on a single shard, the balancer moves tens of GB a day yet never catches up, and some collection has a jumbo chunk that cannot be split. This post collects my experience from several MongoDB sharded clusters, in the hope of sparing you some detours before you shard.</description><media:content xmlns:media="http://search.yahoo.com/mrss/" url="https://socake.github.io/posts/mongodb-sharding-practice/featured.jpg"/></item></channel></rss>