AI Engineering (AI工程化) on 黄文卓 | DevOps Engineer

Feed: https://socake.github.io/tags/ai%E5%B7%A5%E7%A8%8B%E5%8C%96/
Description: Recent content in AI Engineering (AI工程化) on 黄文卓 | DevOps Engineer
Generator: Hugo -- gohugo.io
Language: zh-CN
Author: 17691281867@163.com (Wenzhuo Huang)
Copyright: © 2026 Wenzhuo Huang
Last updated: Thu, 05 Feb 2026 10:20:00 +0800

RAG Evaluation: RAGAS Metrics and Hallucination Detection in Practice
https://socake.github.io/posts/rag-evaluation-ragas/
Thu, 05 Feb 2026 10:20:00 +0800
Once a RAG system is in production, "the answers feel pretty good" is not a sustainable way to evaluate it. RAGAS provides a quantifiable evaluation framework that lets you track metrics such as Faithfulness and Answer Relevancy over time, and automatically verify after every change that system quality has not regressed.

Core LLM Concepts: The Fundamentals Engineers Need to Understand
https://socake.github.io/posts/llm-core-concepts/
Mon, 17 Nov 2025 11:37:00 +0800
The first time a colleague wrote code against the GPT-4 API, they asked me: why does a passage of Chinese consume so many more tokens than English? Why does the model sometimes state nonsense with complete confidence? This article systematically covers the LLM concepts I think every engineer must understand. It skips the Transformer math and sticks to what helps you write code.

Python Async Programming in Practice: Using asyncio in AI Applications
https://socake.github.io/posts/python-async-programming/
Fri, 22 Nov 2024 12:44:00 +0800
AI applications are inherently I/O-bound: waiting for LLM responses, waiting for vector database retrieval, waiting for multiple tool calls to return. Synchronous code is a performance killer here. This article goes from how the event loop works to practical AI application patterns, focusing on concurrent calls with asyncio.gather, handling SSE streaming output, and troubleshooting common pitfalls.
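
As a rough illustration of the asyncio.gather pattern named in the last entry, here is a minimal sketch. It is not taken from the linked article: fake_io_call and run_concurrently are hypothetical names, and asyncio.sleep stands in for a real LLM, vector database, or tool call.

```python
import asyncio
import time


async def fake_io_call(name: str, delay: float) -> str:
    """Hypothetical stand-in for an I/O-bound call (LLM, vector DB, tool)."""
    await asyncio.sleep(delay)  # simulates waiting on the network
    return f"{name} done"


async def run_concurrently() -> tuple[list[str], float]:
    """Start three simulated calls at once and wait for all of them."""
    start = time.perf_counter()
    # gather schedules the coroutines concurrently and returns
    # their results in the order the awaitables were passed in.
    results = await asyncio.gather(
        fake_io_call("llm", 0.2),
        fake_io_call("vector_search", 0.2),
        fake_io_call("tool", 0.2),
    )
    return list(results), time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = asyncio.run(run_concurrently())
    print(results, f"{elapsed:.2f}s")
```

Because the three waits overlap, the total elapsed time should be close to the longest single delay rather than the sum of all three, which is the gain over a synchronous version.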