<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>NVIDIA on 黄文卓 | DevOps Engineer</title><link>https://socake.github.io/tags/nvidia/</link><description>Recent content in NVIDIA on 黄文卓 | DevOps Engineer</description><generator>Hugo -- gohugo.io</generator><language>zh-CN</language><managingEditor>17691281867@163.com (Wenzhuo Huang)</managingEditor><webMaster>17691281867@163.com (Wenzhuo Huang)</webMaster><copyright>© 2026 Wenzhuo Huang</copyright><lastBuildDate>Wed, 11 Mar 2026 10:00:00 +0800</lastBuildDate><atom:link href="https://socake.github.io/tags/nvidia/index.xml" rel="self" type="application/rss+xml"/><item><title>Triton Inference Server in Production: Model Orchestration, Dynamic Batching, and Multi-Framework Co-deployment</title><link>https://socake.github.io/posts/triton-inference-server-production/</link><pubDate>Wed, 11 Mar 2026 10:00:00 +0800</pubDate><author>17691281867@163.com (Wenzhuo Huang)</author><guid>https://socake.github.io/posts/triton-inference-server-production/</guid><description>A clear walkthrough of Triton, NVIDIA's inference server, for readers new to it: model repository, backends, dynamic batching, ensembles, BLS, the Python backend, production monitoring, and lessons learned from real pitfalls.</description><media:content xmlns:media="http://search.yahoo.com/mrss/" url="https://socake.github.io/posts/triton-inference-server-production/featured.jpg"/></item><item><title>Kubernetes GPU Scheduling in Practice: Infrastructure for AI Training and Inference</title><link>https://socake.github.io/posts/kubernetes-gpu-scheduling/</link><pubDate>Wed, 05 Nov 2025 14:00:00 +0800</pubDate><author>17691281867@163.com (Wenzhuo Huang)</author><guid>https://socake.github.io/posts/kubernetes-gpu-scheduling/</guid><description>GPUs are the core resource of AI infrastructure, and how efficiently they are scheduled and managed on Kubernetes directly affects training efficiency and inference cost. This article covers building, monitoring, and optimizing K8s GPU infrastructure end to end, from low-level driver installation to high-level scheduling strategies.</description><media:content xmlns:media="http://search.yahoo.com/mrss/" url="https://socake.github.io/posts/kubernetes-gpu-scheduling/featured.jpg"/></item></channel></rss>