<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>AI on 黄文卓 | DevOps Engineer</title><link>https://socake.github.io/tags/ai/</link><description>Recent content in AI on 黄文卓 | DevOps Engineer</description><generator>Hugo -- gohugo.io</generator><language>zh-CN</language><managingEditor>17691281867@163.com (Wenzhuo Huang)</managingEditor><webMaster>17691281867@163.com (Wenzhuo Huang)</webMaster><copyright>© 2026 Wenzhuo Huang</copyright><lastBuildDate>Fri, 03 Apr 2026 11:20:00 +0800</lastBuildDate><atom:link href="https://socake.github.io/tags/ai/index.xml" rel="self" type="application/rss+xml"/><item><title>AI Tools in Practice for Operations Engineers</title><link>https://socake.github.io/posts/%E8%BF%90%E7%BB%B4%E5%B7%A5%E7%A8%8B%E5%B8%88ai%E5%B7%A5%E5%85%B7%E5%AE%9E%E8%B7%B5/</link><pubDate>Fri, 03 Apr 2026 11:20:00 +0800</pubDate><author>17691281867@163.com (Wenzhuo Huang)</author><guid>https://socake.github.io/posts/%E8%BF%90%E7%BB%B4%E5%B7%A5%E7%A8%8B%E5%B8%88ai%E5%B7%A5%E5%85%B7%E5%AE%9E%E8%B7%B5/</guid><description>From writing Shell scripts and interpreting error messages to assisting with troubleshooting, this post shares where AI tools actually help operations engineers, where they do not, useful Prompt techniques, and which scenarios each tool suits.</description><media:content xmlns:media="http://search.yahoo.com/mrss/" url="https://socake.github.io/posts/%E8%BF%90%E7%BB%B4%E5%B7%A5%E7%A8%8B%E5%B8%88ai%E5%B7%A5%E5%85%B7%E5%AE%9E%E8%B7%B5/featured.jpg"/></item><item><title>Running Large Models with Ollama on K8s: Operating Local LLMs in Practice</title><link>https://socake.github.io/posts/ollama-kubernetes-llm/</link><pubDate>Mon, 30 Mar 2026 09:08:00 +0800</pubDate><author>17691281867@163.com (Wenzhuo Huang)</author><guid>https://socake.github.io/posts/ollama-kubernetes-llm/</guid><description>Deploying Ollama on Kubernetes to run local large models: from GPU scheduling, to falling back to CPU inference, to integration into real operations scenarios, a complete record of the pitfalls and the practice.</description><media:content xmlns:media="http://search.yahoo.com/mrss/" url="https://socake.github.io/posts/ollama-kubernetes-llm/featured.jpg"/></item><item><title>Embedding Model Selection and Optimization in Practice: From BGE to OpenAI 
Embedding</title><link>https://socake.github.io/posts/embedding-model-selection-guide/</link><pubDate>Sat, 21 Feb 2026 09:30:00 +0800</pubDate><author>17691281867@163.com (Wenzhuo Huang)</author><guid>https://socake.github.io/posts/embedding-model-selection-guide/</guid><description>A systematic comparison of the mainstream Embedding models of 2026, from principles to engineering practice, covering selection decisions, cache design, and batch optimization.</description><media:content xmlns:media="http://search.yahoo.com/mrss/" url="https://socake.github.io/posts/embedding-model-selection-guide/featured.jpg"/></item><item><title>Advanced RAG: Retrieval-Augmentation Techniques Beyond Naive RAG</title><link>https://socake.github.io/posts/advanced-rag-techniques/</link><pubDate>Wed, 04 Feb 2026 11:33:00 +0800</pubDate><author>17691281867@163.com (Wenzhuo Huang)</author><guid>https://socake.github.io/posts/advanced-rag-techniques/</guid><description>A systematic breakdown of the three failure modes of Naive RAG, with complete implementations of advanced techniques such as hybrid retrieval, HyDE, query rewriting, and Parent-Child chunking.</description><media:content xmlns:media="http://search.yahoo.com/mrss/" url="https://socake.github.io/posts/advanced-rag-techniques/featured.jpg"/></item><item><title>Large Models for Operations: Practical Uses of LLMs in Troubleshooting and Automation</title><link>https://socake.github.io/posts/aiops-llm-devops/</link><pubDate>Sat, 31 Jan 2026 12:06:00 +0800</pubDate><author>17691281867@163.com (Wenzhuo Huang)</author><guid>https://socake.github.io/posts/aiops-llm-devops/</guid><description>LLMs cannot replace operations engineers, but they can automate away repetitive, low-value work. This post shares several scenarios I have put into practice with Claude in my day-to-day work.</description><media:content xmlns:media="http://search.yahoo.com/mrss/" url="https://socake.github.io/posts/aiops-llm-devops/featured.jpg"/></item><item><title>LLM Application Security: Prompt Injection Defense and AI Guardrails in Practice</title><link>https://socake.github.io/posts/llm-security-guardrails/</link><pubDate>Fri, 23 Jan 2026 11:01:00 +0800</pubDate><author>17691281867@163.com (Wenzhuo Huang)</author><guid>https://socake.github.io/posts/llm-security-guardrails/</guid><description>A user once bypassed every restriction in our AI customer-service system with a single sentence and got it to leak sensitive information from the internal knowledge base. This article systematically lays out the security threat model for LLM applications and the layers of defense we implemented in production.</description><media:content xmlns:media="http://search.yahoo.com/mrss/" 
url="https://socake.github.io/posts/llm-security-guardrails/featured.jpg"/></item><item><title>LLM Cost Optimization in Practice: From Token Budgets to Model Routing</title><link>https://socake.github.io/posts/llm-cost-optimization/</link><pubDate>Mon, 19 Jan 2026 13:03:00 +0800</pubDate><author>17691281867@163.com (Wenzhuo Huang)</author><guid>https://socake.github.io/posts/llm-cost-optimization/</guid><description>In the first month after our AI features launched, the LLM API bill was $18,000. With model routing, Prompt Caching, and the Batch API, it dropped to $3,200 by the third month. This article records exactly how we did it.</description><media:content xmlns:media="http://search.yahoo.com/mrss/" url="https://socake.github.io/posts/llm-cost-optimization/featured.jpg"/></item><item><title>The Complete Guide to LLM Tool Use: Function Calling Design Patterns and Production Practice</title><link>https://socake.github.io/posts/llm-tool-use-function-calling/</link><pubDate>Sun, 18 Jan 2026 12:36:00 +0800</pubDate><author>17691281867@163.com (Wenzhuo Huang)</author><guid>https://socake.github.io/posts/llm-tool-use-function-calling/</guid><description>An engineering-focused deep dive into LLM Tool Use: covers the differences between the OpenAI and Claude APIs, tool Schema design, concurrent calls, and error recovery, with a complete code example of an operations assistant.</description><media:content xmlns:media="http://search.yahoo.com/mrss/" url="https://socake.github.io/posts/llm-tool-use-function-calling/featured.jpg"/></item><item><title>Serving LLMs in Production: vLLM Deployment and GPU Inference Optimization in Practice</title><link>https://socake.github.io/posts/llm-production-serving-vllm/</link><pubDate>Tue, 13 Jan 2026 13:36:00 +0800</pubDate><author>17691281867@163.com (Wenzhuo Huang)</author><guid>https://socake.github.io/posts/llm-production-serving-vllm/</guid><description>After the team moved Ollama into production, requests queued for more than 30 seconds at peak times and users reported that the AI features were unusable. This article records our full migration to vLLM, including how PagedAttention and Continuous Batching work, and the complete configuration for GPU deployment on Kubernetes.</description><media:content xmlns:media="http://search.yahoo.com/mrss/" url="https://socake.github.io/posts/llm-production-serving-vllm/featured.jpg"/></item></channel></rss>