<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>AI训练 on 黄文卓 | DevOps Engineer</title><link>https://socake.github.io/tags/ai%E8%AE%AD%E7%BB%83/</link><description>Recent content in AI训练 on 黄文卓 | DevOps Engineer</description><generator>Hugo -- gohugo.io</generator><language>zh-CN</language><managingEditor>17691281867@163.com (Wenzhuo Huang)</managingEditor><webMaster>17691281867@163.com (Wenzhuo Huang)</webMaster><copyright>© 2026 Wenzhuo Huang</copyright><lastBuildDate>Sat, 15 Mar 2025 09:40:00 +0800</lastBuildDate><atom:link href="https://socake.github.io/tags/ai%E8%AE%AD%E7%BB%83/index.xml" rel="self" type="application/rss+xml"/><item><title>Kueue 批处理调度实战：让 Kubernetes 真正承担 AI/HPC 工作负载</title><link>https://socake.github.io/posts/kueue-batch-workload/</link><pubDate>Sat, 15 Mar 2025 09:40:00 +0800</pubDate><author>17691281867@163.com (Wenzhuo Huang)</author><guid>https://socake.github.io/posts/kueue-batch-workload/</guid><description>把 AI 训练任务塞进 Kubernetes，第一天你会发现原生调度器完全不够用：没有队列、没有 quota、没有 gang scheduling、没有公平共享、preemption 语义一塌糊涂。Kueue 是 sig-scheduling 官方给出的答案，它比 Volcano 更贴近 Kubernetes 原生、比自研 controller 更成熟。这是一份真实的生产笔记。</description><media:content xmlns:media="http://search.yahoo.com/mrss/" url="https://socake.github.io/posts/kueue-batch-workload/featured.jpg"/></item></channel></rss>