Blog archive for July 24, 2024 - paulwong

Deploying the latest GLM-4.7-Flash with vLLM

For now, a successful deployment mainly depends on running the newest version of every component:
torch: 2.9.1+cu130
vllm: nightly
transformers: 5.0.0rc3

Install commands

conda create -n vllm-glm47 python=3.12 -y
conda activate vllm-glm47
pip install torch==2.9.1+cu130 --index-url https://download.pytorch.org/whl/cu130
pip list
pip install -U vllm --pre --extra-index-url https://wheels.vllm.ai/nightly/cu130
pip list
pip install -U transformers==5.0.0rc3

Startup script _start-vllm.sh

BIN_PATH=$(cd `dirname $0`; pwd)
cd $BIN_PATH

#source /home/dgx/ai/miniconda3/bin/activate
#conda activate vllm-nightly

#uv pip install -U vllm --extra-index-url https://wheels.vllm.ai/nightly/cu130 --extra-index-url https://download.pytorch.org/whl/cu130

export CUDA_HOME=/usr/local/cuda
export TRITON_PTXAS_PATH="${CUDA_HOME}/bin/ptxas"
export PATH="${CUDA_HOME}/bin:$PATH"

nohup \
vllm serve /home/dgx/ai/models/models--zai-org--GLM-4.7-Flash \
  --served-model-name=zai-org/GLM-4.7-Flash \
  --host=0.0.0.0 \
  --port=8032 \
  --no-enable-prefix-caching \
  --mm-processor-cache-gb 0 \
  --gpu-memory-utilization 0.7 \
  --speculative-config.method mtp \
  --speculative-config.num_speculative_tokens 1 \
  --tool-call-parser glm47 \
  --reasoning-parser glm45 \
  --enable-auto-tool-choice \
  > vllm_server.log 2>&1 &

Startup script start-vllm.sh

BIN_PATH=$(cd `dirname $0`; pwd)
cd $BIN_PATH

./_start-vllm.sh && tail -f ./vllm_server.log
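Once the server is up, it can be queried like any OpenAI-compatible endpoint. The following is a minimal sketch, not part of the original scripts: it assumes the server is reachable at localhost:8032 and serves vLLM's /v1/chat/completions route, and the helper names build_chat_request and chat are made up for illustration.

```python
import json
import urllib.request

# Assumptions: host/port taken from _start-vllm.sh; adjust for your machine.
BASE_URL = "http://localhost:8032/v1"
MODEL = "zai-org/GLM-4.7-Flash"  # must match --served-model-name


def build_chat_request(prompt: str, max_tokens: int = 256) -> dict:
    """Build the JSON body for a single-turn chat completion."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }


def chat(prompt: str) -> str:
    """POST the request to the server and return the reply text."""
    body = json.dumps(build_chat_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        data = json.load(resp)
    return data["choices"][0]["message"]["content"]
```

Calling chat("Hello") only works while the server is running; build_chat_request can be inspected offline to confirm the model name matches the one passed to vllm serve.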

posted @ 2026-02-02 17:24 paulwong

Setting up a Python environment with uv

conda create -n my_ai_env python=3.10 -y
conda activate my_ai_env

# Bring uv in (as a replacement for plain pip)
conda install -c conda-forge uv -y

# Install vLLM or other heavy libraries
uv pip install vllm --torch-backend auto

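Why is having uv installed good for Conda?

Shared cache: a package already installed in Conda environment A is linked from uv's global cache when environment B needs it, so the install is close to instant and does not take double the disk space.
No environment pollution: Conda still sees packages installed through uv pip (they show up in conda list, usually marked as coming from pypi).
Performance: for heavy dependencies such as vLLM, which run to several GB, uv's parallel downloads can saturate your bandwidth.

uv handles PyTorch/CUDA compatibility in two main ways: fully automatic detection (recommended) or a manually specified index (more stable).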

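1. Automatic detection (most recommended: --torch-backend auto)

This is uv's killer feature. When you run:

uv pip install torch --torch-backend auto

How it works: uv inspects the local machine (via nvidia-smi or the driver version) and works out the highest CUDA version the installed driver supports.

What it does: it then picks the matching build (for example +cu121 or +cu124) from PyTorch's official index (download.pytorch.org), so you do not have to look it up in a compatibility table.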

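2. Manually specifying the official index (for specific version needs)

If you need a build for a specific CUDA version (for example because the system driver is old), use --index-url:

# Install the build for CUDA 12.1
uv pip install torch --index-url https://download.pytorch.org/whl/cu121

Note: on Windows, a plain pip install torch often ends up with the CPU-only build; using uv with this explicit URL forces the GPU build.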

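3. Pinning it in the project configuration (good for project management)

If you manage the project with pyproject.toml, you can configure tool.uv.index there so that everyone on the team installs builds for the same CUDA version:

[[tool.uv.index]]
name = "pytorch-cu124"
url = "https://download.pytorch.org/whl/cu124"
explicit = true # only packages pinned to this index (via tool.uv.sources) use it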

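4. Avoiding "environment pollution": dividing the work between Conda and uv

Because uv cannot install the system-level NVIDIA driver, the most robust split is:
Conda: installs the low-level cudatoolkit or nvidia/label/cuda-xx.x packages.

uv: works inside the environment Conda provides and pulls the matching Python libraries quickly with uv pip install.

Summary: try uv pip install torch --torch-backend auto first. If automatic detection fails, for example with multiple GPUs or under WSL2, fall back to specifying --index-url manually.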
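The manual fallback follows a simple naming pattern: the index URL ends in cu plus the CUDA major and minor digits. As a small sketch (the helper names are hypothetical, and not every CUDA version necessarily has a published index), the URL and the matching uv command can be derived like this:

```python
import re

PYTORCH_WHEEL_ROOT = "https://download.pytorch.org/whl"


def pytorch_index_url(cuda_version: str) -> str:
    """Map a CUDA version such as '12.1' to the matching wheel index URL."""
    match = re.fullmatch(r"(\d+)\.(\d+)", cuda_version.strip())
    if match is None:
        raise ValueError(f"expected MAJOR.MINOR, got {cuda_version!r}")
    major, minor = match.groups()
    return f"{PYTORCH_WHEEL_ROOT}/cu{major}{minor}"


def uv_install_command(cuda_version: str, package: str = "torch") -> list[str]:
    """Build the argument list for a manual-index install with uv."""
    return ["uv", "pip", "install", package,
            "--index-url", pytorch_index_url(cuda_version)]
```

The returned list can be passed to subprocess.run once you have confirmed that the index for your CUDA version exists.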

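posted @ 2026-01-28 14:52 paulwong

AGENT SKILL

When we ask a large model to do a job with a single sentence, such as "build me a personal website", the model just draws on more or less random examples to generate it, and the result is naturally unsatisfying.

An agent skill addresses this: each skill targets exactly one kind of job and comes with an instruction file describing the standard for doing that job well. For website building, that means standards for the color scheme, the layout, and so on. Sending these standards to the model together with the task makes the output follow the standard instead of being random, so quality improves.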
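The idea can be sketched in a few lines. This is only an illustration of "standards plus task in one prompt", not how any particular agent framework loads skills; the Skill class and build_prompt function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Skill:
    name: str
    standards: str  # content of the skill's instruction file


def build_prompt(skill: Skill, task: str) -> str:
    """Combine the skill's quality standards with the concrete task."""
    return (
        f"Follow the standards of the '{skill.name}' skill.\n\n"
        f"## Standards\n{skill.standards.strip()}\n\n"
        f"## Task\n{task.strip()}\n"
    )


web_skill = Skill(
    name="website-design",
    standards="- Use at most three brand colors.\n- Use a 12-column grid layout.",
)
prompt = build_prompt(web_skill, "Build me a personal website.")
```

The model then receives both the standard and the job in one message, which is the whole point of the skill file.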

posted @ 2026-01-27 00:01 paulwong

Agent example - an R&D team

Product manager (product-manager.md)

---
name: product-manager
description: Expert product manager specializing in product strategy, user-centric development, and business outcomes. Masters roadmap planning, feature prioritization, and cross-functional leadership with focus on delivering products that users love and drive business growth.
tools: Read, Write, Edit, Glob, Grep, WebFetch, WebSearch, Bash
---

You are a senior product manager with expertise in building successful products that delight users and achieve business objectives. Your focus spans product strategy, user research, feature prioritization, and go-to-market execution with emphasis on data-driven decisions and continuous iteration.

When invoked:
1. Query context manager for product vision and market context
2. Review user feedback, analytics data, and competitive landscape
3. Analyze opportunities, user needs, and business impact
4. Drive product decisions that balance user value and business goals

Product management checklist:
- User satisfaction > 80% achieved
- Feature adoption tracked thoroughly
- Business metrics achieved consistently
- Roadmap updated quarterly properly
- Backlog prioritized strategically
- Analytics implemented comprehensively
- Feedback loops active continuously
- Market position strong measurably

Full-stack developer (fullstack-developer.md)

---
name: fullstack-developer
description: End-to-end feature owner with expertise across the entire stack. Delivers solutions from database to UI with focus on seamless integration and optimal user exp[...]
tools: Read, Write, Edit, Bash, Glob, Grep
---

You are a senior fullstack developer specializing in complete feature development wit[...] across backend and frontend technologies. Your primary focus is delivering cohesive, [...] solutions that work seamlessly from database to user interface.

When invoked:
1. Query context manager for full-stack architecture and existing patterns
2. Analyze data flow from database through API to frontend
3. Review authentication and authorization across all layers
4. Design cohesive solution maintaining consistency throughout stack

Fullstack development checklist:
- Database schema aligned with API contracts
- Type-safe API implementation with shared types
- Frontend components matching backend capabilities
- Authentication flow spanning all layers
- Consistent error handling throughout stack
- End-to-end testing covering user journeys
- Performance optimization at each layer
- Deployment pipeline for entire feature

QA (qa-expert.md)

---
name: qa-expert
description: Expert QA engineer specializing in comprehensive and quality metrics. Masters manual and automated testing, tr[...] with focus on delivering high-quality software through system[...]
tools: Read, Grep, Glob, Bash
---

You are a senior QA expert with expertise in comprehensive q[...] methodologies, and quality metrics. Your focus spans test pl[...] quality advocacy with emphasis on preventing defects, ensure high quality standards throughout the development lifecycle.

When invoked:
1. Query context manager for quality requirements and applic[...]
2. Review existing test coverage, defect patterns, and quality
3. Analyze testing gaps, risks, and improvement opportunities
4. Implement comprehensive quality assurance strategies
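Each agent file above has the same shape: a header between two --- lines with name, description, and tools, followed by the system prompt. The sketch below shows one way to read such a file; it is a hand-rolled parser for flat key: value headers only (an assumption, real tools may use a full YAML parser), and parse_agent_file is a hypothetical name.

```python
def parse_agent_file(text: str) -> tuple[dict, str]:
    """Split an agent .md file into (header fields, prompt body)."""
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing opening '---'")
    end = None
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            end = i
            break
    if end is None:
        raise ValueError("missing closing '---'")

    meta: dict = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")  # split on the first colon only
        if sep:
            meta[key.strip()] = value.strip()
    if "tools" in meta:
        # "Read, Grep, Glob" -> ["Read", "Grep", "Glob"]
        meta["tools"] = [t.strip() for t in meta["tools"].split(",") if t.strip()]

    body = "\n".join(lines[end + 1:]).strip()
    return meta, body


sample = """---
name: qa-expert
description: Expert QA engineer
tools: Read, Grep, Glob, Bash
---

You are a senior QA expert.
"""
meta, body = parse_agent_file(sample)
```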

api-designer.md

code-reviewer.md

https://github.com/VoltAgent/awesome-claude-code-subagents/tree/main

Building a Harness for... Myself

https://www.subaud.io/building-a-harness-for-myself/

posted @ 2026-01-26 01:03 paulwong

Zhipu's text-to-image model zai-org/GLM-Image

Open source and downloadable from Hugging Face; model id: zai-org/GLM-Image https://huggingface.co/zai-org/GLM-Image

Text rendered in the generated images does not come out garbled

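posted @ 2026-01-25 23:26 paulwong

Agent example - making a PPT

Enter the prompt

A four-sentence prompt:
First, act as the world's top business presentation design expert and visual communication consultant.
Second, apply the core techniques of Presentation Zen, McKinsey's The Pyramid Principle, and the Takahashi method of simplicity to help me design a presentation that is logically clear and visually striking.
Third, I will describe in detail my presentation goal, its purpose, the target audience, the core content, the time limit, and the material I already have.
Fourth, give me a personalized design plan that I can act on immediately. It must be concrete and practical, with no filler.

Select Gemini-pro3 and send.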

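Enter the content

Presentation goal: the target audience should shift in both understanding and action, from anxiety about AI and fear of being replaced to seeing it as a capability amplifier and strategic partner. They start deliberately integrating AI deeply into their workflow and draw up a personal plan that invests in the four irreplaceable core abilities (insight, judgment, integration, and storytelling), upgrading from "executor" to "decision-maker".
Presentation purpose: ease anxiety and point the way; reshape value and build confidence; provide a framework and drive change.
Target audience: product managers, product designers, founders, and technical team leads in the internet and technology industries.
Core content: core claim: the problem is never the tool but how you choose to use it. AI is an amplifier of ability, not a replacement.
Key framework:
Where we are: we are at the turning point from "reasoner" to "agent", which raises the bar for "people who can use agents".
What changes and what does not: the efficiency threshold and the ways of collaborating are changing, but understanding of users, business judgment, handling complexity, creativity, and taste never will
Core competitiveness formula: product capability = depth of thinking x AI efficiency.
If depth of thinking is zero, everything is zero.
Four irreplaceable abilities: insight, judgment, integration, storytelling.
Three-stage practice path: master collaboration in the short term, deepen expertise in the medium term, build systems thinking in the long term.
Ultimate vision: the future product manager should be thinker, decision-maker, collaborator, and creator in one. The final competition is among "people who can use machines and also think deeply".
Time limit: the ideal talk length is about 30-45 minutes.

Enter the prompt, select nano-banana, and send

This is the content I am presenting today. Please generate a PPT deck from it. The design prompts I am giving you are: "Cinematic lighting, hyper-realistic, human hand shaking robot hand, warm tone" (for P3); "Minimalist style, a lighthouse in the dark ocean, vector art" (for P8, insight).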

posted @ 2026-01-25 22:21 paulwong

Calling a VL model through the API

import os,base64
from openai import OpenAI

client = OpenAI(
    api_key='xxx',
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1"
)

def encode_image(image_path):
    with open(image_path,"rb") as image_file:
        return base64.b64encode(image_file.read()).decode("utf-8")

result = []
for file_name in ["food1.png","food2.png","food3.png"]:
    image_path = os.path.join(os.path.dirname(__file__),file_name)
    completion = client.chat.completions.create(
        model="qwen-vl-max-latest",
        messages=[
            {
                "role": "system",
                "content": [{"type":"text","text":"You are a helpful assistant."}]
            },
            {
                "role": "user",
                "content":[
                    {
                        "type": "image_url",
                        "image_url": {
                            "url":"data:image/png;base64," + encode_image(image_path)
                        }
                    },
                    {
                        "type": "text",
                        "text": """
                        Extract the content of the image and output it as JSON in the format below. Output only the raw JSON string, with no line breaks or other extra characters:
                        {
                            "product_name":"xxxx",
                            "product_type":"xxxx",
                            "shelf_life":"xxxx",
                            "ingredients":"xxxx",
                            "product_standard_code":"xxxx",
                            "storage_conditions":"xxxx",
                            "food_production_license_number":"xxxx",
                            "production_date":"xxxx",
                            "manufacturer":"xxxx",
                            "address":"xxxx",
                            "phone":"xxxx",
                            "fax":"xxxx"
                        }
                        }
                        """
                    }
                ]
            }
        ]
    )

    result.append(completion.choices[0].message.content)

print(result)
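The script prints the raw reply strings. Even when asked for pure JSON, a model may wrap its reply in a Markdown fence or add a sentence around it, so a tolerant parsing step is useful before the data is stored. The helper below is a sketch under that assumption; parse_model_json is a hypothetical name and not part of the OpenAI client.

```python
import json


def parse_model_json(raw: str) -> dict:
    """Parse a reply that should contain exactly one JSON object.

    Anything before the first '{' or after the last '}' (code fences,
    stray prose) is ignored.
    """
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in model output")
    return json.loads(raw[start:end + 1])


def parse_all(replies: list[str]) -> list[dict]:
    """Parse every reply, keeping a marker dict for the ones that fail."""
    parsed = []
    for raw in replies:
        try:
            parsed.append(parse_model_json(raw))
        except ValueError:  # json.JSONDecodeError is a ValueError subclass
            parsed.append({"error": "unparseable", "raw": raw})
    return parsed
```

With the script above, parse_all(result) would turn the collected strings into dictionaries keyed by product_name, shelf_life, and so on.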

posted @ 2026-01-25 17:20 paulwong

Claude skill resources

Installation

/plugin marketplace add anthropics/skills anthropics/claude-plugins-official

awesome-claude-skills

https://github.com/ComposioHQ/awesome-claude-skills/tree/master

Claude Code Plugins Directory

https://github.com/anthropics/claude-plugins-official

Skills

https://github.com/anthropics/skills

Awesome Claude Code Subagents
https://github.com/VoltAgent/awesome-claude-code-subagents/blob/main/README.md

agent-skills-examples

https://github.com/tech-shrimp/agent-skills-examples/tree/main/字幕转markdown

THE OPEN AGENT SKILLS ECOSYSTEM

https://skills.sh

skills.marketplace

https://skillsmp.com/zh

UI design guideline skills for various industries

https://github.com/nextlevelbuilder/ui-ux-pro-max-skill

# Install
/plugin marketplace add nextlevelbuilder/ui-ux-pro-max-skill
/plugin install ui-ux-pro-max@ui-ux-pro-max-skill

# Automatic activation
Build a landing page for my SaaS product

# Manual activation
/ui-ux-pro-max Build a landing page for my SaaS product

# How It Works
You ask - Request any UI/UX task (build, design, create, implement, review, fix, improve)
Design System Generated - The AI automatically generates a complete design system using the reasoning engine
Smart recommendations - Based on your product type and requirements, it finds the best matching styles, colors, and typography
Code generation - Implements the UI with proper colors, fonts, spacing, and best practices
Pre-delivery checks - Validates against common UI/UX anti-patterns

posted @ 2026-01-24 20:04 paulwong

Installing Claude Code from mainland China

Install script

#!/bin/bash

set -e

# Parse command line arguments
TARGET="$1"  # Optional target parameter

# Validate target if provided
if [[ -n "$TARGET" ]] && [[ ! "$TARGET" =~ ^(stable|latest|[0-9]+\.[0-9]+\.[0-9]+(-[^[:space:]]+)?)$ ]]; then
    echo "Usage: $0 [stable|latest|VERSION]" >&2
    exit 1
fi

GCS_BUCKET="https://storage.googleapis.com/claude-code-dist-86c565f3-f756-42ad-8dfa-d59b1c096819/claude-code-releases"
DOWNLOAD_DIR="$HOME/.claude/downloads"

# Check for required dependencies
DOWNLOADER=""
if command -v curl >/dev/null 2>&1; then
    DOWNLOADER="curl"
elif command -v wget >/dev/null 2>&1; then
    DOWNLOADER="wget"
else
    echo "Either curl or wget is required but neither is installed" >&2
    exit 1
fi

# Check if jq is available (optional)
HAS_JQ=false
if command -v jq >/dev/null 2>&1; then
    HAS_JQ=true
fi

# Download function that works with both curl and wget
download_file() {
    local url="$1"
    local output="$2"

    if [ "$DOWNLOADER" = "curl" ]; then
        if [ -n "$output" ]; then
            curl -fsSL -o "$output" "$url"
        else
            curl -fsSL "$url"
        fi
    elif [ "$DOWNLOADER" = "wget" ]; then
        if [ -n "$output" ]; then
            wget -q -O "$output" "$url"
        else
            wget -q -O - "$url"
        fi
    else
        return 1
    fi
}

# Simple JSON parser for extracting checksum when jq is not available
get_checksum_from_manifest() {
    local json="$1"
    local platform="$2"

    # Normalize JSON to single line and extract checksum
    json=$(echo "$json" | tr -d '\n\r\t' | sed 's/ \+/ /g')

    # Extract checksum for platform using bash regex
    if [[ $json =~ \"$platform\"[^}]*\"checksum\"[[:space:]]*:[[:space:]]*\"([a-f0-9]{64})\" ]]; then
        echo "${BASH_REMATCH[1]}"
        return 0
    fi

    return 1
}

# Detect platform
case "$(uname -s)" in
    Darwin) os="darwin" ;;
    Linux) os="linux" ;;
    *) echo "Windows is not supported" >&2; exit 1 ;;
esac

case "$(uname -m)" in
    x86_64|amd64) arch="x64" ;;
    arm64|aarch64) arch="arm64" ;;
    *) echo "Unsupported architecture: $(uname -m)" >&2; exit 1 ;;
esac

# Check for musl on Linux and adjust platform accordingly
if [ "$os" = "linux" ]; then
    if [ -f /lib/libc.musl-x86_64.so.1 ] || [ -f /lib/libc.musl-aarch64.so.1 ] || ldd /bin/ls 2>&1 | grep -q musl; then
        platform="linux-${arch}-musl"
    else
        platform="linux-${arch}"
    fi
else
    platform="${os}-${arch}"
fi
mkdir -p "$DOWNLOAD_DIR"

# Always download latest version (which has the most up-to-date installer)
version=$(download_file "$GCS_BUCKET/latest")

# Download manifest and extract checksum
manifest_json=$(download_file "$GCS_BUCKET/$version/manifest.json")

# Use jq if available, otherwise fall back to pure bash parsing
if [ "$HAS_JQ" = true ]; then
    checksum=$(echo "$manifest_json" | jq -r ".platforms[\"$platform\"].checksum // empty")
else
    checksum=$(get_checksum_from_manifest "$manifest_json" "$platform")
fi

# Validate checksum format (SHA256 = 64 hex characters)
if [ -z "$checksum" ] || [[ ! "$checksum" =~ ^[a-f0-9]{64}$ ]]; then
    echo "Platform $platform not found in manifest" >&2
    exit 1
fi

# Download and verify
binary_path="$DOWNLOAD_DIR/claude-$version-$platform"
if ! download_file "$GCS_BUCKET/$version/$platform/claude" "$binary_path"; then
    echo "Download failed" >&2
    rm -f "$binary_path"
    exit 1
fi

# Pick the right checksum tool
if [ "$os" = "darwin" ]; then
    actual=$(shasum -a 256 "$binary_path" | cut -d' ' -f1)
else
    actual=$(sha256sum "$binary_path" | cut -d' ' -f1)
fi

if [ "$actual" != "$checksum" ]; then
    echo "Checksum verification failed" >&2
    rm -f "$binary_path"
    exit 1
fi

chmod +x "$binary_path"

# Run claude install to set up launcher and shell integration
echo "Setting up Claude Code..."
"$binary_path" install ${TARGET:+"$TARGET"}

# Clean up downloaded file
rm -f "$binary_path"

echo ""
echo "Installation complete!"
echo ""

Apply for an API key

https://0011.ai/i/NGRRWNT4

Configure Claude Code

# Current terminal session only
export CLAUDE_CODE_SKIP_AUTH=true
export ANTHROPIC_BASE_URL="https://aicoding.2233.ai"
export ANTHROPIC_API_KEY="your API key"
export ANTHROPIC_AUTH_TOKEN="your API key"

# Permanent setting (add to ~/.zshrc, then reload it)
export CLAUDE_CODE_SKIP_AUTH=true
export ANTHROPIC_BASE_URL="https://aicoding.2233.ai"
export ANTHROPIC_API_KEY="your API key"
export ANTHROPIC_AUTH_TOKEN="your API key"
source ~/.zshrc

Run

In a terminal, change to the project root directory and run claude

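posted @ 2026-01-24 12:22 paulwong

Prompts for generating city images

Prompt 1:

Generate an image for me: a city-render digital art poster for [Shanghai].
The main subject is a miniature island floating above white clouds, shaped like the chosen city and filling most of the frame. The island's outline resembles the city's shape on a map and seamlessly blends the city's distinctive landmarks, natural scenery, and cultural elements. Add birds native to the city, cinematic lighting, vivid colors, an aerial viewpoint, and sunlight reflections; the buildings should not be too numerous or too dense.
The island shows a seamless blend of history and modernity. One part is the city's most representative ancient architecture; the other transitions smoothly into the city's landmark buildings and skyline.
The island floats above a vast sea of clouds, rendered in the traditional art style of the cultural sphere the city belongs to.
3D lettering of the city's name in pinyin or English floats above the miniature island, and the lettering itself looks like a miniature installation where ecology and culture coexist.
Around the edges of the frame and around the subject, overlay a minimalist, elegant typographic information layer with the feel of a museum display panel. Look up the relevant city information; set the main information in a classic serif typeface and the supporting data in an ultra-thin minimalist sans-serif. In the corners, lay the text out like a classical atlas or the title page of a premium magazine. Use serif type to label the city's geographic coordinates, nickname or founding year, and the current weather as decorative background information. Leave plenty of white space overall; the layout should be restrained, clean, and balanced, like viewing a precious work of art.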

https://mp.weixin.qq.com/s/GqUcajzlA4dWYGJgaWBkCA

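Prompt 2:

A city-render digital art poster for [city name, e.g. 成都 Chengdu]. The main subject is a miniature island floating above white clouds, shaped like the chosen city and filling most of the frame. The island's outline resembles the city's shape on a map and seamlessly blends the city's distinctive landmarks, natural scenery, and cultural elements. Add birds native to the city, cinematic lighting, vivid colors, an aerial viewpoint, and sunlight reflections; the buildings should not be too numerous or too dense. The island shows a seamless blend of history and modernity. One part is the city's most representative ancient architecture; the other transitions smoothly into the city's landmark buildings and skyline. The island floats above a vast sea of clouds, rendered in the traditional art style of the cultural sphere the city belongs to. 3D lettering of the city's name in pinyin or English floats above the miniature island, and the lettering itself looks like a miniature installation where ecology and culture coexist. Around the edges of the frame and around the subject, overlay a minimalist, elegant typographic information layer with the feel of a museum display panel. Look up the relevant city information; set the main information in a classic serif typeface and the supporting data in an ultra-thin minimalist sans-serif. In the corners, lay the text out like a classical atlas or the title page of a premium magazine. Use serif type to label the city's geographic coordinates, nickname or founding year, and the current weather; keep the layout restrained, clean, and balanced. Style requirements: Octane Render, C4D, Isometric City, Micro World, Living Ecosystem, 8k Resolution. DreamWorks style, 3D modeling, delicate, soft light projection. --ar 9:16 --v 6.0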

https://mp.weixin.qq.com/s/U9zeuB8cJ7ZuEz91jp9ReQ?from=groupmessage&scene=1&subscene=10000&clicktime=1767657795&enterid=1767657795&sessionid=0&ascene=1&fasttmpl_type=4&fasttmpl_fullversion=8071695-zh_CN-zip&fasttmpl_flag=0&realreporttime=1767657795426

posted @ 2026-01-06 08:59 paulwong

AI application resources

https://www.zhujunyue.com/hangye/#0

posted @ 2026-01-03 11:21 paulwong

N8N resources

@bitovi/n8n-nodes-semantic-text-splitter

https://github.com/bitovi/n8n-nodes-semantic-text-splitter?tab=readme-ov-file

posted @ 2025-12-30 09:30 paulwong

Installing Docker MCP

Architecture

AI Client → MCP Gateway → MCP Servers (Docker Containers)

Installing a Docker MCP server

https://hub.docker.com/mcp/explore?categories=database

Install it as a Docker container

Installing the MCP Gateway

Download docker-mcp from https://github.com/docker/mcp-gateway/releases/latest

Move it to:

Linux    ~/.docker/cli-plugins/docker-mcp
macOS    ~/.docker/cli-plugins/docker-mcp
Windows    %USERPROFILE%\.docker\cli-plugins

Start

# Run the MCP gateway (stdio)
docker mcp gateway run

# Run the MCP gateway (streaming)
docker mcp gateway run --port 8080 --transport streaming

Enabling an MCP server in the gateway

# List enabled servers
docker mcp server ls

# Enable one or more servers
docker mcp server enable <server-name> [server-name...]

# Disable servers
docker mcp server disable <server-name> [server-name...]

# Get detailed information about a server
docker mcp server inspect <server-name>

# Reset (disable all servers)
docker mcp server reset

MCP clients

Connect through n8n or a similar tool

posted @ 2025-12-14 11:28 paulwong

Understanding RAG

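You give a large model a question and let it answer.

What the model does is first look for relevant information on its own and then summarize an answer. This can be unreliable: the information the model finds and believes to be correct may be wrong, and then the answer is wrong too.

So the approach changes: a program finds the relevant information and hands it to the model to summarize. Information retrieved by deterministic code is far more dependable, and summarizing is well within the model's ability, so the result is dependable as well.

This approach has a problem, though: the model gets called multiple times, and since the model is usually deployed remotely, that causes performance problems. So instead, the model itself is allowed to call tools.

Getting the model to call tools on its own depends on the system prompt. You do not have to write it by hand: export the workflow from n8n as JSON, give it to deepseek, and ask it to generate the prompt.

When there is more than one tool, coordination becomes an issue: how does the result of tool A become the argument for calling tool B? The answer is few-shot prompting, meaning the prompt must include examples. For instance: calling tool A produces result r1; then call tool B with the argument {"input": r1}. Inside tool B, fromAI("input") then retrieves the value of the input parameter.

In this process the model really only does the summarizing, which plays to its strengths and avoids its weaknesses.

When the program finds information by retrieving relevant documents from a database, that is what is usually called RAG.

In RAG work, however, you inevitably run into requirements such as: when the user enters certain keywords, reply directly with a pre-written answer.

Implementing this by turning the answers into documents, having the program retrieve the relevant document, and letting the model summarize is not reliable either, because retrieval is itself probabilistic.

So a different approach is needed: the program calls a tool, the tool's result becomes the relevant information, and the model summarizes it. The result of that process is reliable.

Why would the model call a tool?

If a tool's processing is not enough and some reasoning is required, a model has to be added inside the tool to assist. Such a tool is an agent, and the overall architecture is what is called multi-agent.

So the overall idea is to let the model do what it is best at, summarizing, and have the information supplied from outside. The whole result then becomes controllable.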
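The division of labor described above can be sketched as a small pipeline: deterministic code handles preset answers and retrieval, and the model is only asked to summarize. Everything here is illustrative; PRESET_ANSWERS, the function names, and the retrieve and summarize callables are assumptions standing in for a real keyword table, retriever, and model call.

```python
from typing import Callable, Optional

# Hypothetical keyword -> pre-written answer table.
PRESET_ANSWERS = {
    "refund": "Refunds are processed within 7 working days.",
}


def lookup_preset(question: str) -> Optional[str]:
    """Return a pre-written answer when a known keyword appears."""
    lowered = question.lower()
    for keyword, preset in PRESET_ANSWERS.items():
        if keyword in lowered:
            return preset
    return None


def build_summary_prompt(question: str, facts: list[str]) -> str:
    """Ask the model to summarize only the facts the program supplies."""
    fact_lines = "\n".join(f"- {fact}" for fact in facts)
    return (
        "Answer the question using only the facts below.\n"
        f"Facts:\n{fact_lines}\n"
        f"Question: {question}\n"
    )


def answer(question: str,
           retrieve: Callable[[str], list[str]],
           summarize: Callable[[str], str]) -> str:
    preset = lookup_preset(question)
    if preset is not None:
        return preset                      # deterministic path, no model call
    facts = retrieve(question)             # program finds the information
    return summarize(build_summary_prompt(question, facts))  # model summarizes
```

Passing the retriever and the model call in as parameters keeps the control flow testable without a live model.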

posted @ 2025-12-13 01:28 paulwong

Adding the Milvus MCP server

Clone the source:

git clone https://github.com/zilliztech/mcp-server-milvus.git

Add a Dockerfile

FROM python:3.12-slim

WORKDIR /app

# Install system dependencies
RUN apt-get update && apt-get install -y \
    curl \
    git \
    build-essential \
    && rm -rf /var/lib/apt/lists/*

# Install uv with pip (alternative to the curl installer)
#RUN pip install --no-cache-dir uv -i https://pypi.tuna.tsinghua.edu.cn/simple

RUN curl -LsSf https://astral.sh/uv/install.sh | sh

# Or pin a version for more reliable builds
# RUN pip install --no-cache-dir uv==0.3.0

# Copy the dependency files
COPY pyproject.toml uv.lock README.md ./

# Point uv at the Tsinghua PyPI mirror
ENV UV_INDEX_URL=https://pypi.tuna.tsinghua.edu.cn/simple

# Install dependencies with uv
# (Key step) install the Python dependencies into the system interpreter at build time, not into a virtual environment
RUN pip install uv && \
    uv pip install --system -r pyproject.toml
# RUN uv pip install --system -r pyproject.toml -i https://pypi.tuna.tsinghua.edu.cn/simple

# Copy the source code
COPY src/ ./src/

# Expose the port
EXPOSE 8000

CMD ["uv", "run", "src/mcp_server_milvus/server.py", "--sse", "--milvus-uri", "http://milvus:19530", "--port", "8000"]

docker-compose.yaml

services:

  mcp-milvus-server:
    build: .
    container_name: mcp-milvus-server
    extra_hosts:
      - "host.docker.internal:host-gateway"
    environment:
      - MILVUS_URI=http://host.docker.internal:19530
      # - MILVUS_TOKEN=http://localhost:19530
      # - MILVUS_DB=http://localhost:19530
    ports:
      - "8012:8000"
    volumes:
      - ./src:/app/src
    # depends_on:
    #   milvus:
    #     condition: service_healthy
    command: uv run src/mcp_server_milvus/server.py --sse --milvus-uri http://milvus:19530 --port 8000
    networks:
      - n8n_network

volumes:
  milvus_data:
  milvus_conf:
  etcd_data:
  minio_data:

networks:
  n8n_network:
    external: true

login-mcp-milvus-server.sh

BIN_PATH=$(cd `dirname $0`; pwd)
cd $BIN_PATH/mcp-server-milvus

docker compose exec -it mcp-milvus-server /bin/bash

logs-mcp-milvus-server.sh

BIN_PATH=$(cd `dirname $0`; pwd)
cd $BIN_PATH/mcp-server-milvus

docker compose logs -f

start-mcp-milvus-server.sh

BIN_PATH=$(cd `dirname $0`; pwd)
cd $BIN_PATH/mcp-server-milvus

docker compose up -d
docker compose logs -f

shutdown-mcp-milvus-server.sh

BIN_PATH=$(cd `dirname $0`; pwd)
cd $BIN_PATH/mcp-server-milvus

docker compose down

restart-mcp-milvus-server.sh

BIN_PATH=$(cd `dirname $0`; pwd)
cd $BIN_PATH

pwd
./shutdown-mcp-milvus-server.sh
./start-mcp-milvus-server.sh

posted @ 2025-12-08 01:49 paulwong

MCP resources

What is MCP?

https://code.visualstudio.com/docs/copilot/customization/mcp-servers#_add-an-mcp-server-to-your-user-settings

A comprehensive list of MCP servers

https://github.com/modelcontextprotocol/servers?tab=readme-ov-file

A commonly used MCP server:

https://github.com/modelcontextprotocol/servers-archived/tree/main/src/postgres

Official MCP site:

https://modelcontextprotocol.io/docs/develop/build-server

posted @ 2025-12-07 21:29 paulwong

AI model marketplace

To find good models for a particular vertical domain, see the following site:

https://ai.gitee.com/serverless-api

posted @ 2025-11-23 09:53 paulwong

Open WebUI + N8N streaming output

Integrating n8n with Open WebUI: Building advanced AI chatbots and workflows

https://www.pondhouse-data.com/blog/integrating-n8n-with-open-webui

n8nchatui

https://n8nchatui.com/docs

https://www.youtube.com/watch?v=_dmrr7kWRI0

open-webui function:

https://openwebui.com/f/webfox/n8n_streaming

Complete Guide to n8n Chat Streaming Setup

https://sudhanshu-sharma.notion.site/Complete-Guide-to-n8n-Chat-Streaming-Setup-24ed62104585806a8909d1b662209af3

posted @ 2025-11-08 09:03 paulwong

Fine-tuning case study

How I used prompt engineering to turn a large model into a risk-control expert

https://my.oschina.net/u/4090830/blog/18690981

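posted @ 2025-09-10 13:25 paulwong

Tips for improving RAG recall

Traditional search is full-text search: the user supplies a keyword, and the system looks for it in the text stored in the database, returning the texts that contain it.

The drawback is that if the user supplies a synonym of the keyword instead, nothing is found.

Recent AI techniques address this, so that a synonym or similar wording is enough to find the text.

How does that work? The system converts the texts to be searched into vectors, converts the query into a vector as well, finds the stored vectors nearest to it by Euclidean distance or cosine similarity, and returns the corresponding texts.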
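The two distance measures mentioned here are easy to show on toy data. The sketch below uses hand-made two-dimensional vectors in place of real embeddings (an assumption purely for illustration; a real system would get the vectors from an embedding model and the search from the vector database), and top_k is a hypothetical helper name.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def euclidean_distance(a: list[float], b: list[float]) -> float:
    """Straight-line distance between two vectors (0.0 means identical)."""
    return math.dist(a, b)


def top_k(query: list[float], items: list[tuple[str, list[float]]], k: int = 1) -> list[str]:
    """Return the k texts whose vectors are most similar to the query."""
    ranked = sorted(items, key=lambda item: cosine_similarity(query, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


# Toy "embeddings": the first axis loosely means insurance, the second cooking.
corpus = [
    ("car insurance clause", [1.0, 0.0]),
    ("cooking recipe", [0.0, 1.0]),
]
best = top_k([0.9, 0.1], corpus, k=1)
```

The query vector never has to share a word with the stored text; it only has to point in a similar direction, which is why synonyms can match.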

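Because a text is usually too long, it is split into chunks that are stored one by one. As a result, what comes back may be missing part of the text.

This leads to different storage strategies. The text's attributes can be saved as metadata; if you know an attribute exactly, you can look the text up directly by that attribute.

You can also generate a summary of the text and save it as metadata too; the query is matched against the summary first, and the text is returned if they are close.

You can also store the text in a full-text index and search by whether the text contains the keyword, returning it if so.

With more candidate texts returned this way, recall naturally improves.

Milvus is recommended here: it implements all of the mechanisms above on the server side, so users only need to call the client library, which greatly simplifies the code.

Code example:

https://milvus.io/docs/zh-hant/full_text_search_with_langchain.md

Code from books: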

https://github.com/huangjia2019/rag-in-action/blob/master/04-向量存储-VectorDB/Milvus/create_milvus_db.py#L100

https://github.com/Tylersuard/EnterpriseRAG/blob/main/Ch03/upload_sql_records_to_ai_search.py#L10

https://github.com/tomasonjo/kg-rag/blob/main/README.md

posted @ 2025-09-06 15:47 paulwong

RAG optimization

老王AIGC

https://mp.weixin.qq.com/mp/appmsgalbum?action=getalbum&__biz=MzkzMzc2MjAwOQ==&scene=24&album_id=3853666856078622721&count=3#wechat_redirect

posted @ 2025-08-26 16:57 paulwong

Docker image acceleration (registry mirrors)

https://github.com/DaoCloud/public-image-mirror

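posted @ 2025-08-26 09:59 paulwong

Supports A-shares and Hong Kong stocks: an open-source AI investing and stock-trading "agent"

It deploys several specialized LLM agents, each corresponding to a role in a trading firm: some are fundamentals analysts, some are sentiment analysts, some are technical analysts, and there are also traders, risk managers, and so on. These role-playing agents debate with one another, settle on the best strategy, and output a buy or sell decision.

https://mp.weixin.qq.com/s/mu1eF1l5ung-siVcUrEsTQ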

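posted @ 2025-07-11 19:06 paulwong

Design of an insurance underwriting system

To answer a user question such as "Is drunk driving covered?", first try to match it against the clause library; if a clause matches, return it directly.

If nothing matches, fall through to the LLM.

Extract the keyword: check the question against a keyword list one entry at a time; if an entry matches, return that keyword, otherwise return the default auto-insurance keyword.

Use that keyword to search the knowledge graph for a set of clauses.

Build the prompt for the model (role + clause list + question + "please answer"), send it to the model, and let it answer.

Check whether the answer is compliant, for example whether it contains disclaimer wording or lacks the clause list; if it is not compliant, return "Please contact a sales representative" directly.

If it is compliant, extract the trailing part of the reply as the answer and return it.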
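The steps above can be sketched as one function. This is a simplified illustration under stated assumptions: the keyword list, the banned phrase, the "Answer:" marker used to find the trailing part of the reply, and the search_graph and ask_llm callables are all hypothetical stand-ins for the real clause library, knowledge graph, and model.

```python
from typing import Callable

KEYWORDS = ["drunk driving", "theft", "flood"]   # hypothetical keyword list
DEFAULT_KEYWORD = "auto insurance"
BANNED_PHRASES = ["no liability"]                # hypothetical disclaimer wording
FALLBACK = "Please contact a sales representative"


def extract_keyword(question: str) -> str:
    """Return the first listed keyword found in the question, else the default."""
    lowered = question.lower()
    for keyword in KEYWORDS:
        if keyword in lowered:
            return keyword
    return DEFAULT_KEYWORD


def build_prompt(clauses: list[str], question: str) -> str:
    """Role + clause list + question + request to answer."""
    clause_list = "\n".join(f"- {clause}" for clause in clauses)
    return (
        "You are an insurance underwriting assistant.\n"
        f"Clauses:\n{clause_list}\n"
        f"Question: {question}\n"
        "Cite the clause you rely on, then finish with a line starting 'Answer:'."
    )


def is_compliant(reply: str, clauses: list[str]) -> bool:
    """Reject replies with disclaimer wording or without any cited clause."""
    lowered = reply.lower()
    if any(phrase in lowered for phrase in BANNED_PHRASES):
        return False
    return any(clause in reply for clause in clauses)


def answer(question: str,
           clause_library: dict[str, str],
           search_graph: Callable[[str], list[str]],
           ask_llm: Callable[[str], str]) -> str:
    lowered = question.lower()
    for key, clause in clause_library.items():   # step 1: direct clause match
        if key in lowered:
            return clause
    clauses = search_graph(extract_keyword(question))
    reply = ask_llm(build_prompt(clauses, question))
    if not is_compliant(reply, clauses):
        return FALLBACK
    return reply.rsplit("Answer:", 1)[-1].strip()
```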

posted @ 2025-07-02 00:43 paulwong

Installing Python on Debian and switching to the Tsinghua mirror

sudo cp /etc/apt/sources.list /etc/apt/sources.list.bak

sudo vi /etc/apt/sources.list.d/debian.sources

Add the following:

Types: deb
URIs: https://mirrors.tuna.tsinghua.edu.cn/debian/
Suites: bookworm bookworm-updates bookworm-backports
Components: main contrib non-free non-free-firmware
Signed-By: /usr/share/keyrings/debian-archive-keyring.gpg

Types: deb
URIs: https://mirrors.tuna.tsinghua.edu.cn/debian-security/
Suites: bookworm-security
Components: main contrib non-free non-free-firmware
Signed-By: /usr/share/keyrings/debian-archive-keyring.gpg

Update the package index

sudo apt update

Install Python

sudo apt-get install python3

sudo apt-get install python3-pip

Make the python command point to python3

sudo apt install python-is-python3

Install Miniconda

wget https://mirrors.tuna.tsinghua.edu.cn/anaconda/miniconda/Miniconda3-py310_25.3.1-1-Linux-x86_64.sh

bash Miniconda3-py310_25.3.1-1-Linux-x86_64.sh

conda config --set show_channel_urls yes

cat > ~/.condarc <<EOF
channels:
  - defaults
show_channel_urls: true
default_channels:
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/r
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/msys2
custom_channels:
  conda-forge: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  pytorch: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
EOF

Clear the index cache
conda clean -i

conda --version
conda info # check that the channels point to the Tsinghua mirror

posted @ 2025-06-23 11:32 paulwong

The most complete roundup of Docker tools to supercharge your server

https://mp.weixin.qq.com/s/gtyMdmCqBY7LfdBGUBldSA

posted @ 2025-06-21 23:01 paulwong

Deep thinking support in Bailian (Alibaba Cloud Model Studio) models

https://help.aliyun.com/zh/model-studio/deep-thinking#1f5ad51894bvi

posted @ 2025-06-18 23:56 paulwong

Running Docker as a non-root user

sudo useradd -m paul # create the user and its home directory
sudo passwd paul # set the user's password (entered twice to confirm)
sudo usermod -aG wheel paul # CentOS/RHEL
[root@dev69 ~]$ groupadd docker
[root@dev69 ~]$ usermod -aG docker paul
[root@dev69 ~]$ reboot
[paul@dev69 ~]$ docker run hello-world

# Or, as the target user, without rebooting:
sudo usermod -aG docker $USER
newgrp docker

posted @ 2025-06-13 16:47 paulwong

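Resources for creating datasets

Getting started with Distilabel, a framework for AI dataset generation and model fine-tuning: basic concepts, installation, and quick start

https://zhuanlan.zhihu.com/p/25766406373

Building fine-tuning data with Llama3 and distilabel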
https://huggingface.co/blog/dvilasuero/synthetic-data-with-llama3-distilabel

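posted @ 2025-05-18 08:01 paulwong

Reinforcement learning resources

The Mushroom Book, EasyRL
Professor Hung-yi Lee's "Deep Reinforcement Learning" (《深度强化学习》) is one of the classic Chinese-language video courses in reinforcement learning. His humorous teaching style makes obscure reinforcement learning theory easy to follow, and he explains it through many entertaining examples, often using Atari games to illustrate the algorithms. For completeness, the tutorial also draws on Professor Bolei Zhou's 《强化学习纲要》, Kejiao Li's 《世界冠军带你从零实践强化学习》, and several other classic reinforcement learning materials as supplements. It is highly recommended for anyone who wants to get started with reinforcement learning through Chinese-language explanations.

The tutorial is also known as the "Mushroom Book". The hope is that it energizes readers: after "eating" this mushroom they can explore reinforcement learning with real interest, grow stronger like Mario, and go on to find unexpected rewards in artificial intelligence.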

https://github.com/datawhalechina/easy-rl?tab=readme-ov-file

posted @ 2025-04-30 14:15 paulwong

Football data resources

Basic football data

https://www.nami.com/details/4nw10i0tela68lq#interface

Football statistics

https://www.nami.com/details/7xwk3iqtv3s9rk6#interface

Advanced football data

https://www.nami.com/details/g5wvvikteeixwzd#interface

Odds index data

https://www.nami.com/details/o6w9kipt4yi78k3#interface

Football reference database

https://www.nami.com/details/7j8gxi0to7inrql#interface

Marz 火星数据 (sports)

https://www.kancloud.cn/marz/marz-sport/3098904

posted @ 2025-04-24 14:56 paulwong

AI football prediction resources

Hands-on 2022 World Cup prediction with machine learning

https://www.showmeai.tech/article-detail/400

AI prediction tool for sports-lottery matches

https://www.mysports.ai/cn

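posted @ 2025-04-19 01:07 paulwong

Datasets for fine-tuning

When fine-tuning with the trl library, the dataset requirements are:

For multi-turn conversations:

A jsonl file that meets the following requirements:

1. Each line is a standalone JSON object;

2. Each object must contain an array under the key messages, and the array must not be empty;

3. Every element of messages must contain the two fields role and content;

4. role must be one of system, user, or assistant;

5. If there is a system message, it must come first in the array;

6. The first non-system message must have the user role;

7. user and assistant messages should alternate in pairs, with at least one pair;

For instruction fine-tuning:

A jsonl file that meets the following requirements:

1. Each line is a standalone JSON object;

2. Each object must contain exactly one key-value pair, with the key text, and the value must not be empty;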
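The multi-turn rules are mechanical enough to check before training starts. The validator below is a sketch that encodes the seven rules as listed in this post; it is not part of trl, and validate_messages_line is a hypothetical name.

```python
import json

VALID_ROLES = {"system", "user", "assistant"}


def validate_messages_line(line: str) -> list[str]:
    """Return a list of rule violations for one jsonl line (empty if valid)."""
    try:
        obj = json.loads(line)
    except json.JSONDecodeError:
        return ["line is not valid JSON"]

    messages = obj.get("messages") if isinstance(obj, dict) else None
    if not isinstance(messages, list) or not messages:
        return ["'messages' must be a non-empty array"]

    errors = []
    for i, msg in enumerate(messages):
        if not isinstance(msg, dict) or "role" not in msg or "content" not in msg:
            errors.append(f"message {i}: needs both 'role' and 'content'")
        elif msg["role"] not in VALID_ROLES:
            errors.append(f"message {i}: unknown role {msg['role']!r}")
        elif msg["role"] == "system" and i != 0:
            errors.append(f"message {i}: system message must come first")
    if errors:
        return errors

    # Rules 6 and 7: after any system message, turns go user, assistant, ...
    turns = [msg["role"] for msg in messages if msg["role"] != "system"]
    expected = ["user", "assistant"] * (len(turns) // 2)
    if len(turns) < 2 or len(turns) % 2 != 0 or turns != expected:
        errors.append("user/assistant messages must alternate in pairs, starting with user")
    return errors
```

Running it over every line of the file and printing the line numbers with non-empty results gives a quick report of which samples to fix.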

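posted @ 2025-03-21 21:52 paulwong

Stages of large-model training

After a large model is built, it generally goes through the following training stages:

Pre-training

Plain text only: {"text":"..."}

The model is trained to predict the following text starting from the first token, through to the end.

A model at this stage can only do text completion.

Supervised fine-tuning (SFT)

The goal is for the model to answer according to instructions rather than generate arbitrary continuations.

Training text: {"instruction":"...", "output":"..."}

Parameter-efficient fine-tuning (PEFT)

Only a small number of parameters are trained; LoRA is one concrete method.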
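A quick calculation shows why LoRA counts as parameter-efficient. Instead of updating a full d_out x d_in weight matrix, LoRA trains two low-rank matrices of shapes d_out x r and r x d_in. The layer size and rank below are example values chosen for illustration, not taken from any specific model.

```python
def full_finetune_params(d_out: int, d_in: int) -> int:
    """Trainable parameters when the whole weight matrix is updated."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters for the two LoRA matrices B (d_out x r) and A (r x d_in)."""
    return rank * (d_out + d_in)


d_out, d_in, rank = 4096, 4096, 8   # example layer size and rank (assumptions)
full = full_finetune_params(d_out, d_in)
lora = lora_params(d_out, d_in, rank)
ratio = lora / full
```

For this example layer the LoRA matrices hold well under one percent of the parameters of the full matrix, which is where the memory savings come from.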

Reference:

https://github.com/huggingface/smol-course/blob/main/1_instruction_tuning/notebooks/sft_finetuning_example.ipynb

posted @ 2025-03-18 13:14 paulwong

Python resources

python

https://www.w3schools.com/python/

https://www.runoob.com/python/python-basic-syntax.html

Liao Xuefeng's official website

https://liaoxuefeng.com/books/python/index.html

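posted @ 2025-03-16 20:54 paulwong

Evaluation metrics after fine-tuning a large model

Evaluation metrics are the key to measuring a fine-tuned model's performance and are usually chosen according to the task type and specific requirements. Some common metrics and where they apply:

1. Classification tasks

Accuracy: the proportion of all samples that are predicted correctly.
- Use when: the classes are balanced.
Precision: of the samples predicted positive, the proportion that are actually positive.
- Use when: reducing false positives matters.
Recall: of the samples that are actually positive, the proportion predicted positive.
- Use when: reducing false negatives matters.
F1 score: the harmonic mean of precision and recall.
- Use when: the classes are imbalanced or precision and recall must be balanced.
ROC-AUC: the area under the ROC curve, measuring how well the model separates positives from negatives.
- Use when: binary classification, especially with imbalanced classes.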
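The first four definitions translate directly into code. This is a from-scratch sketch for binary labels, meant to make the formulas concrete; in practice a library such as scikit-learn would normally be used.

```python
def accuracy(y_true: list[int], y_pred: list[int]) -> float:
    """Fraction of samples predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision_recall_f1(y_true: list[int], y_pred: list[int], positive: int = 1):
    """Return (precision, recall, f1) for the given positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
acc = accuracy(y_true, y_pred)
prec, rec, f1 = precision_recall_f1(y_true, y_pred)
```

In this toy example there are two true positives, one false positive, and one false negative, so precision and recall are both 2/3 while accuracy is 0.75.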

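2. Regression tasks

Mean squared error (MSE): the mean of the squared differences between predictions and true values.
- Use when: large errors should be penalized more heavily.
Root mean squared error (RMSE): the square root of MSE.
- Use when: as for MSE, but in the same units as the original data.
Mean absolute error (MAE): the mean of the absolute differences between predictions and true values.
- Use when: the metric should be less sensitive to outliers.
R² (coefficient of determination): the proportion of the target variable's variance explained by the model.
- Use when: assessing goodness of fit.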
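The regression metrics can be computed together from the per-sample errors. The function below is a small pure-Python sketch of the four definitions; the sample values are made up.

```python
import math


def regression_metrics(y_true: list[float], y_pred: list[float]) -> dict:
    """Compute MSE, RMSE, MAE, and R² for paired true/predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1 - ss_res / ss_tot if ss_tot else float("nan")
    return {"mse": mse, "rmse": math.sqrt(mse), "mae": mae, "r2": r2}


metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 6.0])
```

One prediction is off by 2, which yields MSE 1.0 but MAE only 0.5: the squared error weights that single large miss more heavily, as the list above describes.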

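3. Generation tasks

BLEU (Bilingual Evaluation Understudy): measures n-gram overlap between generated and reference text.
- Use when: machine translation, text generation.
ROUGE (Recall-Oriented Understudy for Gisting Evaluation): measures overlap between generated and reference text, with an emphasis on recall.
- Use when: summarization, generation.
METEOR: a metric that combines precision, recall, and word order.
- Use when: machine translation, text generation.
Perplexity: measures how uncertain the model's predicted probability distribution is.
- Use when: language model evaluation.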
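Of these, perplexity is the simplest to compute by hand: it is the exponential of the average negative log-probability the model assigned to each actual token. The sketch below takes those per-token probabilities as plain numbers (an assumption; in practice they come from the model's output distribution).

```python
import math


def perplexity(token_probs: list[float]) -> float:
    """exp of the mean negative log-probability of the observed tokens."""
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_nll)


uniform_guess = perplexity([0.25, 0.25, 0.25, 0.25])  # like choosing among 4 options
certain = perplexity([1.0, 1.0])                       # model is always sure and right
```

A perplexity of about 4 means the model was, on average, as uncertain as if it were picking uniformly among four tokens; lower is better.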

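4. Multi-label tasks

Hamming Loss: the fraction of labels predicted incorrectly.
- Use when: multi-label classification.
Jaccard Similarity: the ratio of the intersection to the union of the predicted and true label sets.
- Use when: multi-label classification.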
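Both multi-label metrics fit in a few lines. In this sketch, Hamming loss takes 0/1 indicator rows (one row per sample, one column per label) and Jaccard similarity takes label sets; the sample data is made up.

```python
def hamming_loss(y_true: list[list[int]], y_pred: list[list[int]]) -> float:
    """Fraction of individual label slots that are predicted wrongly."""
    total = sum(len(row) for row in y_true)
    wrong = sum(
        t != p
        for true_row, pred_row in zip(y_true, y_pred)
        for t, p in zip(true_row, pred_row)
    )
    return wrong / total


def jaccard_similarity(true_labels: set, pred_labels: set) -> float:
    """Size of the intersection divided by size of the union."""
    union = true_labels | pred_labels
    return len(true_labels & pred_labels) / len(union) if union else 1.0


loss = hamming_loss([[1, 0, 1], [0, 1, 0]], [[1, 1, 1], [0, 1, 1]])
jac = jaccard_similarity({"sports", "news"}, {"news", "finance"})
```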

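5. Ranking tasks

NDCG (Normalized Discounted Cumulative Gain): measures the relevance of a ranked result list.
- Use when: recommender systems, information retrieval.
MAP (Mean Average Precision): the mean of the average precision over queries.
- Use when: information retrieval, recommender systems.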
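NDCG compares the discounted gain of the ranking the system produced with that of the ideal ordering of the same items. The sketch below uses the linear-gain form (relevance divided by log2 of the position plus one); another common variant uses 2^relevance - 1 as the gain, so treat the exact formula as an assumption to match against your evaluation tool.

```python
import math
from typing import Optional


def dcg(relevances: list[float]) -> float:
    """Discounted cumulative gain: later positions count for less."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances))


def ndcg(relevances: list[float], k: Optional[int] = None) -> float:
    """DCG of the given order divided by DCG of the ideal order, at cutoff k."""
    k = len(relevances) if k is None else k
    ideal = dcg(sorted(relevances, reverse=True)[:k])
    return dcg(relevances[:k]) / ideal if ideal else 0.0


perfect = ndcg([3, 2, 1])    # most relevant items already on top
reversed_ = ndcg([1, 2, 3])  # same items, worst order
```

The inputs are the graded relevance labels of the results in the order the system returned them.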

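6. Other metrics

Training time: the time needed to fine-tune the model.
Inference speed: how fast the model produces results.
Resource consumption: the compute needed to run the model (e.g. GPU memory, CPU utilization).
Robustness: the model's resistance to noise, outliers, or adversarial examples.

7. Domain-specific metrics

Medicine: sensitivity, specificity, AUC-ROC.
Finance: return curve, Sharpe ratio.
Computer vision: mAP (mean Average Precision), IoU (Intersection over Union).

8. Human evaluation

Human rating: people assess the quality of the generated results (e.g. fluency, relevance, accuracy).
User satisfaction: user feedback is used to assess how well the model works in practice.

9. Model comparison

Baseline comparison: compare performance against the un-fine-tuned model or a baseline model.
Ablation studies: assess how different components of the fine-tuning process (e.g. data, hyperparameters) affect performance.

10. Overall evaluation

Combining metrics: combine several metrics according to what the task requires.
Task-specific metrics: design custom metrics for the specific task.

In practice, choosing suitable metrics means weighing the task goal, the characteristics of the data, and the business requirements, while keeping the limitations of any single metric in mind.

posted @ 2025-03-12 10:08 paulwong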

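A complete categorized list of full-stack LLM frameworks (pre-training + fine-tuning + toolchain)

https://blog.csdn.net/ViniJack/article/details/145789900

posted @ 2025-03-10 11:29 paulwong

Medical consultation system resources

Computer science graduation project: a Python + Neo4j knowledge-graph medical Q&A system with a large model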

https://baijiahao.baidu.com/s?id=1815574648931972744&wfr=spider&for=pc

QABasedOnMedicaKnowledgeGraph

https://github.com/liuhuanyong/QASystemOnMedicalKG/blob/master/README.md

非结构文字抽取实体与关系的大模型

底座, 百川 https://huggingface.co/baichuan-inc/Baichuan2-13B-Chat/tree/main

底座, llama2 https://huggingface.co/unsloth/llama-2-13b

微调->百川 https://huggingface.co/zjunlp/baichuan2-13b-iepile-lora

微调->llama2 https://huggingface.co/zjunlp/llama2-13b-iepile-lora

SiameseUniNLU通用自然语言理解模型

https://www.modelscope.cn/models/iic/nlp_structbert_siamese-uninlu_chinese-base/summary

数据集

https://huggingface.co/datasets/zjunlp/iepile

各种已经训练好的模型

https://www.modelscope.cn/models?name=zpeng1989&page=1

posted @ 2025-03-08 20:52 paulwong 阅读(65) | 评论 (0) | 编辑收藏

Using NLP to extract information from unstructured data

To extract information from structured data, SQL is enough: the information you want is in the fields of the SELECT.

For unstructured data such as plain text you need NLP: the text has to be understood before the relevant information can be extracted.

https://www.w3cschool.cn/article/99991254.html

A guide to structuring text with spaCy

https://zhuanlan.zhihu.com/p/556163162

https://zhuanlan.zhihu.com/p/557953165

https://zhuanlan.zhihu.com/p/563334531

https://zhuanlan.zhihu.com/p/573743734

Automatically building a medical knowledge graph with OpenSPG

https://blog.csdn.net/myboyliu2007/article/details/139654943

posted @ 2025-03-08 11:45 paulwong Views(68) | Comments (0)

AI case-study resources

Lessons and reflections on applying large models, drawn from real cases

https://mp.weixin.qq.com/s/hcD0-z9Y4PsrILUgHdqGcQ

LLaMA Factory: fine-tuning DeepSeek-R1-Distill-Qwen-7B into a news-headline classifier

https://gallery.pai-ml.com/#/preview/deepLearning/nlp/llama_factory_deepseek_r1_distill_7b

A real-world application of a fine-tuned DeepSeek R1 model (medical and legal, PatientSeek)

https://www.bilibili.com/video/BV17zAVevEtw/?spm_id_from=333.788.recommend_more_video.0&vd_source=35b81999db00535703a287d5c98652b1

ChatTTS, a text-to-speech model: an excellent experience, as smooth and fluent as a real person, and fairly flexible to customize

https://www.bilibili.com/video/BV1oJ4m1u7B8/?spm_id_from=333.1387.upload.video_card.click&vd_source=35b81999db00535703a287d5c98652b1

A roundup of benchmarks/competitions, datasets, papers, and pre-trained models for medical NLP.

https://github.com/FreedomIntelligence/Medical_NLP

posted @ 2025-02-26 16:01 paulwong Views(56) | Comments (0)

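Where to find the full-size DeepSeek R1 online

Official site

https://chat.deepseek.com

Tencent (requires downloading a client)

https://ima.qq.com

Alibaba (you need to build your own chat app; a web version is available)

https://tbox.alipay.com/

askmanyai

https://askmanyai.cn

360 Nano Search (no web version; you need to download the app yourself)

posted @ 2025-02-15 23:10 paulwong Views(135) | Comments (0)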

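Quantization resources

Comparing the GPTQ, GGUF, and AWQ quantization methods for large language models (repost)

https://caovan.com/gptqggufawq-dayuyanmoxinglianghuafangfaduibizhuanzai/.html

posted @ 2025-02-08 23:31 paulwong Views(91) | Comments (0)

The math behind DeepSeek: a deep dive into Group Relative Policy Optimization (GRPO)

Summary: This post digs into the math behind Group Relative Policy Optimization (GRPO), the core reinforcement learning algorithm behind DeepSeek's strong reasoning ability. We break down how GRPO works, its key components, and why it is a game changer for training advanced large language models (LLMs). GRPO basics. What is GRPO? Group Relative Policy Optimization (GRPO) is a reinforcement learning (RL) algorithm designed specifically to strengthen the reasoning ability of large language models (LLMs). Unlike traditional RL methods, which rely heavily on external eval...

posted @ 2025-02-08 00:13 paulwong Views(429) | Comments (0)

DeepSeek resources

Because DeepSeek's large models are trained with the GRPO algorithm, the GPU memory needed for training drops substantially.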
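
As a rough illustration of where the savings come from: GRPO scores each sampled answer relative to the other answers in its own group, so no separate value (critic) model has to be held in memory. The sketch below shows only that group-relative advantage step, under the assumption that scalar rewards for one prompt's sampled answers are already available; the function name, the epsilon, and the reward values are illustrative, and the rest of the algorithm (clipped policy ratio, KL penalty) is omitted.

```python
import statistics


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the mean and std of its own group."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # population std over the group
    # eps keeps the division defined when every reward in the group is equal.
    return [(r - mean) / (std + eps) for r in rewards]


# Hypothetical rewards for four answers sampled for the same prompt.
advantages = group_relative_advantages([1.0, 0.0, 0.5, 0.5])
print(advantages)
```

Answers that beat the group average get a positive advantage and are reinforced; below-average answers get a negative one.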

[DeepSeek] Reproducing DeepSeek R1? Have a look at this hands-on guide to the Open R1 project

https://blog.csdn.net/qq_38961840/article/details/145388142

!!! Hands-on LLM reinforcement learning with GRPO (the algorithm that made DeepSeek R1 famous)

https://blog.csdn.net/qq_38961840/article/details/145390704

[DeepSeek] GRPO explained in one article: why does it cut the resources needed to train large models?

https://blog.csdn.net/qq_38961840/article/details/145384852

DeepSeek R1 series

https://blog.csdn.net/qq_38961840/category_12885087.html

posted @ 2025-02-02 19:22 paulwong Views(119) | Comments (0)

Stop searching: the most complete bank of large-model interview questions

https://blog.csdn.net/m0_59596990/article/details/135200833

posted @ 2025-01-22 07:42 paulwong Views(50) | Comments (0)

Dataset resources

https://hyper.ai/cn/datasets

posted @ 2025-01-17 15:52 paulwong Views(46) | Comments (0)

vLLM resources

vLLM is a framework that can load large models, run inference, work with quantized models, and expose the service as an HTTP API.

https://docs.vllm.ai/en/latest/getting_started/examples/basic_with_model_default_sampling.html

posted @ 2025-01-17 13:01 paulwong Views(90) | Comments (0)

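AI application scenarios

Is AI hype or real, and does it have practical uses inside companies? Here are some real application scenarios:

Biotech company

Train qwen2:7b on a dataset from the cell-preparation domain. The goals are:
1. Predict the cell harvest yield.
2. Determine cell viability (alive/dead).
3. Predict whether the process will succeed.
4. Predict ahead of time whether cell quality will meet the standard, so adjustments can be made in time.
5. When large numbers of cells die during culture, use real-time data and historical experience to judge whether the likely cause is a loss of incubator temperature control, a wrong culture-medium composition, contamination, or something similar, and suggest what to check.

Culture and tourism

Smart tourism system: provides destination introductions, itinerary planning, hotel booking, attraction recommendations, and similar services.

Exam grading

Build a grading application on top of a large model that can grade subjective questions, such as reading comprehension and short-answer questions in history, geography, and politics.
Grading accuracy must not fall below that of human graders.
That is, for one exam with a class of 50 papers, no more than 5 questions may be graded wrongly. Throughput must match or exceed human grading.

Take past students' exam questions, answers, and scores and run OCR over them to produce data. Extract the content of every paper for one subject, so that the end result is one model per subject, with the extracted content stored as text, CSV, or JSON.
Fine-tune the "bert-base-chinese" model into a dedicated model,
so that the large model becomes a professional grader.

Exams

Build an agent with Coze (扣子) that tests what each learner has mastered, scores the results, and generates follow-up questions for a second round of testing.

posted @ 2025-01-17 11:23 paulwong Views(163) | Comments (0)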

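Setting up a LLaMA-Factory environment for fine-tuning, evaluation, testing, and quantization

0. Set environment variables

export HF_ENDPOINT=https://hf-mirror.com
export HF_HOME=/root/autodl-tmp/paul/tools/huggingface

1. Install Python 3.10 locally and set a package index (pick one mirror; the second command overrides the first)

pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple
pip config set global.index-url https://mirrors.huaweicloud.com/repository/pypi/simple

2. Install miniconda

https://juejin.cn/post/7078965942968909854

3. Create a new environment and activate it

conda create -n quantization python=3.12
conda activate quantization

4. Install PyTorch 2.5.1 + CUDA 12.4 (the second command pulls from the CUDA 12.4 wheel index)

pip3 install torch torchvision torchaudio

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124

5. Clone the LLaMA-Factory source

git clone https://github.com/hiyouga/LLaMA-Factory
cd LLaMA-Factory

6. Install LLaMA-Factory's dependencies locally (the second command adds the vllm and gptq extras)

pip install -e .

pip install -e ".[vllm,gptq]"

7. Start the web UI

llamafactory-cli webui

8. Fill in the relevant parameters on the page and run

posted @ 2025-01-16 16:54 paulwong Views(167) | Comments (0)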

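Tools for quantizing large models

Quantized inference with vLLM

https://llmc-zhcn.readthedocs.io/en/latest/backend/vllm.html#id1

Two packages must be installed before installing this tool:

sudo apt-get install cmake
sudo apt-get install pkg-config

Configure the Hugging Face mirror:

export HF_ENDPOINT=https://hf-mirror.com

Download the repository and install its Python dependencies

git clone https://github.com/ModelTC/llmc.git
cd llmc/
pip install -r requirements.txt

Find the config file for the quantization method and edit it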

base:
    seed: &seed 42
model:
    type: Llama
    path: /home/paul/.cache/huggingface/models/models--unsloth--llama-3-8b-Instruct-lawdata
    torch_dtype: auto
quant:
    method: RTN
    weight:
        bit: 8
        symmetric: True
        granularity: per_group
        group_size: 128
        need_pack: True
eval:
    eval_pos: [fake_quant]
    name: wikitext2
    download: True
    path: /home/paul/paulwong/work/workspaces/llmc/dataset
    bs: 1
    seq_len: 2048
    inference_per_block: False
save:
    save_vllm: True
    save_path: /home/paul/.cache/huggingface/models/models--unsloth--llama-3-8b-Instruct-lawdata-quantization
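
The quant block above selects RTN (round-to-nearest) with 8-bit, symmetric, per-group weights and a group size of 128. As a rough picture of what those settings mean, here is a minimal pure-Python sketch of symmetric per-group round-to-nearest quantization. It is not llmc's implementation; the function names are hypothetical and the example uses a group size of 4 only to keep the data small.

```python
def quantize_group(weights, bits=8):
    """Symmetric RTN for one group: one scale, integers in [-qmax, qmax]."""
    qmax = 2 ** (bits - 1) - 1                 # 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round each weight to the nearest integer step, then clamp to the range.
    quantized = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return quantized, scale


def rtn_quantize(weights, group_size=128, bits=8):
    """Split the weights into consecutive groups and quantize each one."""
    return [quantize_group(weights[i:i + group_size], bits)
            for i in range(0, len(weights), group_size)]


def dequantize(groups):
    """Map the stored integers back to approximate float weights."""
    return [q * scale for quantized, scale in groups for q in quantized]


example_weights = [0.5, -1.0, 0.25, 0.0, 2.0, -0.1, 0.3, 1.5]
example_groups = rtn_quantize(example_weights, group_size=4, bits=8)
example_restored = dequantize(example_groups)
print(example_groups)
print(example_restored)
```

Per-group scales are the reason for group_size: a single outlier weight only coarsens the resolution of its own group rather than the whole tensor.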

Find run_llmc.sh and edit it

#!/bin/bash

# export CUDA_VISIBLE_DEVICES=0,1

llmc=/home/paul/paulwong/work/workspaces/llmc
export PYTHONPATH=$llmc:$PYTHONPATH

# task_name=awq_w_only
# config=${llmc}/configs/quantization/methods/Awq/awq_w_only.yml
task_name=rtn_for_vllm
config=${llmc}/configs/quantization/backend/vllm/rtn_w8a16.yml

nnodes=1
nproc_per_node=1

find_unused_port() {
    while true; do
        port=$(shuf -i 10000-60000 -n 1)
        if ! ss -tuln | grep -q ":$port "; then
            echo "$port"
            return 0
        fi
    done
}
UNUSED_PORT=$(find_unused_port)

MASTER_ADDR=127.0.0.1
MASTER_PORT=$UNUSED_PORT
task_id=$UNUSED_PORT

nohup \
torchrun \
--nnodes $nnodes \
--nproc_per_node $nproc_per_node \
--rdzv_id $task_id \
--rdzv_backend c10d \
--rdzv_endpoint $MASTER_ADDR:$MASTER_PORT \
${llmc}/llmc/__main__.py --config $config --task_id $task_id \
> ${task_name}.log 2>&1 &

sleep 2
ps aux | grep '__main__.py' | grep $task_id | awk '{print $2}' > ${task_name}.pid

# You can kill this program by
# xargs kill -9 < xxx.pid
# xxx.pid is ${task_name}.pid file

Run the quantization

bash scripts/run_llmc.sh

posted @ 2025-01-15 18:00 paulwong Views(112) | Comments (0)

Fine-tuning resources

Fine-tune Llama 3.1 Ultra-Efficiently with Unsloth

https://huggingface.co/blog/mlabonne/sft-llama3

A beginners guide to fine tuning LLM using LoRA

https://zohaib.me/a-beginners-guide-to-fine-tuning-llm-using-lora/

[Day 23] Train your AI pet: use fine-tuning to make your LLM behave

https://ithelp.ithome.com.tw/articles/10346441

posted @ 2025-01-15 17:56 paulwong Views(85) | Comments (0)

Installing the NVIDIA Container Toolkit for Docker

https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html#installation

posted @ 2025-01-13 14:20 paulwong Views(64) | Comments (0)

Open-source mirror sites

Huawei:
https://mirrors.huaweicloud.com/home
https://mirrors.huaweicloud.com/artifactory/pypi-public/simple/torch/

Tsinghua:

https://mirrors.tuna.tsinghua.edu.cn
Click the question mark next to an entry for details

docker:
https://mirrors.huaweicloud.com/mirrorDetail/5ea14d84b58d16ef329c5c13?mirrorName=docker-ce&catalog=docker

posted @ 2025-01-13 10:32 paulwong Views(102) | Comments (0)

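Adding port-forwarding rules on Windows

Setting up port forwarding

On Windows, open PowerShell as administrator and run:

netsh interface portproxy add v4tov4 listenport=7860 listenaddress=0.0.0.0 connectport=7860 connectaddress=123.45.67.89

Port-forwarding rules created with netsh interface portproxy are persistent. They stay in effect after a reboot because they are stored in the Windows registry.

Deleting a port-forwarding rule

To delete a rule you set up earlier, use:

netsh interface portproxy delete v4tov4 listenport=7860 listenaddress=0.0.0.0

The listenport and listenaddress values must match the ones used when the rule was created.

Viewing the current port-forwarding rules

To list every port-forwarding rule on the system, run:

netsh interface portproxy show all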
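
After adding a rule it can be handy to confirm that the listening port actually accepts connections. The following is a small sketch using only Python's standard library; the helper name is hypothetical, and port 7860 is simply the listenport from the example rule above.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Run on the Windows host; swap in its LAN address to test from another machine.
    print(port_is_open("127.0.0.1", 7860))
```

A False result from another machine while the local check passes usually points at the firewall rather than at the portproxy rule.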

posted @ 2025-01-13 09:34 paulwong Views(166) | Comments (0)

Installing the axolotl AI fine-tuning framework

1. Install the NVIDIA driver and toolkit

https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&Distribution=WSL-Ubuntu&target_version=2.0&target_type=runfile_local

2. Install Python and Miniconda

Mostly a matter of downloading the installers and running them:
Python download: https://repo.huaweicloud.com/python/3.12.8/
Miniconda download: https://mirrors.tuna.tsinghua.edu.cn/anaconda/miniconda/
Tsinghua conda mirror help: https://mirrors.tuna.tsinghua.edu.cn/help/anaconda/

3. Create a new conda environment

conda create -n axolotl python=3.12

4. Install the CUDA build of PyTorch

https://download.pytorch.org/whl/cu124/torch-2.5.0%2Bcu124-cp311-cp311-linux_x86_64.whl#sha256=5e3f4a7ba812517c2c1659857b5195f287a288fbd050a5abf9311e03dbe1a28b

Note that this wheel is tagged cp311 (Python 3.11); for the Python 3.12 environment created in step 3, pick the matching cp312 wheel.

To install a different version, look it up here:

https://download.pytorch.org/whl/torch

5. git clone https://github.com/axolotl-ai-cloud/axolotl, cd into the repository root, and run

pip3 install --no-build-isolation axolotl[flash-attn,deepspeed]

posted @ 2025-01-12 16:37 paulwong Views(92) | Comments (0)

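Intranet tunneling tool

Publish a site on your internal network, for example one hosted on home Wi-Fi, to the public internet without needing a server.

https://i.cpolar.com/m/5jN0

reference:

https://www.cpolar.com/blog/cpolar-quick-start-tutorial-ubuntu-series

posted @ 2025-01-12 11:54 paulwong Views(160) | Comments (0)

Installing the CUDA build of PyTorch

First download the complete wheel file for the CUDA build of PyTorch:

https://download.pytorch.org/whl/cu124/torch-2.5.1%2Bcu124-cp312-cp312-linux_x86_64.whl#sha256=bf6484bfe5bc4f92a4a1a1bf553041505e19a911f717065330eb061afe0e14d7

https://mirrors.huaweicloud.com/artifactory/pypi-public/simple/torch/

pip install torch-2.5.1+cu124-cp312-cp312-linux_x86_64.whl

Verify:

# run inside the python interpreter
import torch
print(torch.__version__)          # the version string should end in +cu124
print(torch.cuda.is_available())  # True when the GPU and driver are visible

posted @ 2025-01-12 11:05 paulwong Views(92) | Comments (0)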

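Connecting VS Code on a Mac to WSL2 on Windows 11

1. Open a new port for SSH on Windows 11 (edit C:\ProgramData\ssh\sshd_config).

2. In the Windows 11 firewall, allow the port added in step 1.

3. In WSL2 on Windows 11, run ifconfig and note the IP address (the number after inet on the second line of the output).

4. In cmd on Windows 11, run:

netsh interface portproxy add v4tov4 listenaddress=127.0.0.1 listenport=<port opened in step 1> connectaddress=<IP address from step 3> connectport=22

5. Connect over SSH to the port from step 1 to reach WSL2. Notes: (1) a WSL window must be open on Windows 11 while you connect, otherwise the connection fails; (2) for the SSH login, use the WSL2 username and password, and the Windows 11 machine's IP address.

https://www.zhihu.com/question/618935377

posted @ 2025-01-11 09:59 paulwong Views(90) | Comments (0)

WSL resources

Who will rescue the disk space eaten by WSL
https://zhuanlan.zhihu.com/p/641436638

Remote development on a Mac with VS Code Tunnel
https://juejin.cn/post/7334167506319327283

Building a nicer front-end development environment on Windows with WSL2
https://www.bilibili.com/video/BV1BV4y1Z7v4/?vd_source=35b81999db00535703a287d5c98652b1

posted @ 2025-01-11 09:57 paulwong Views(45) | Comments (0)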

What to do when GitHub is unreachable

Open https://www.ipaddress.com/website/www.github.com/ in a browser, enter www.github.com, and note the IP address it returns. You can then clone locally using the IP, but to browse the site you need to edit the local hosts file:

# /etc/hosts
140.82.112.4 www.github.com

posted @ 2025-01-05 12:08 paulwong Views(93) | Comments (0)

Removing surplus old kernels on Linux

Every upgrade leaves old kernels behind. A one-line way to remove them on CentOS (this keeps the two newest kernels):

dnf remove $(dnf repoquery --installonly --latest-limit=-2)

posted @ 2025-01-05 12:01 paulwong Views(48) | Comments (0)

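Getting started with AI

Data analysis: sort a body of known data into categories and summarize it with statistics such as the maximum, minimum, mean, and sum.
It can only operate on known data and cannot predict the characteristics of new data, which is where machine learning comes in.

Machine learning: given a body of known data with feature columns and a result column, pick an algorithm such as linear regression or logistic regression (in effect, a formula) and train it. Training amounts to running a set of functions, comparing the outputs with the known results, and finding the pattern, that is, fixing the values of the formula's parameters. When new data comes in, the model can predict the desired result, which amounts to plugging the input into the formula and computing the answer.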
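
To make "training fixes the parameters of a formula" concrete, here is a minimal sketch of one-variable linear regression fitted by ordinary least squares. The formula is y = slope * x + intercept; the function names and the monthly sales figures are made up for illustration.

```python
def fit_line(xs, ys):
    """Learn slope and intercept of y = slope * x + intercept by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Plug a new input into the learned formula."""
    return slope * x + intercept


# "Training": known feature column (month) and result column (sales).
months = [1, 2, 3, 4]
sales = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(months, sales)

# "Prediction": substitute a month the model has never seen.
forecast = predict(5, slope, intercept)
print(slope, intercept, forecast)
```

Deep learning follows the same train-then-predict pattern; the formula is just a neural network with far more parameters.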

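Machine learning can only handle relatively simple tasks, such as forecasting next month's sales or judging whether a piece of text is positive or negative (classification). For complex tasks such as dialogue, which amounts to predicting a plausible output text (the answer) for an input text, deep learning comes in.

Deep learning: given a body of data that needs only two text columns, such as question and answer, pick an algorithm, in effect a type of neural network such as a convolutional neural network (CNN), a recurrent neural network (RNN), or a Transformer, and train it. As before, training amounts to running a set of functions, comparing the results, and finding the pattern, that is, fixing the values of the parameters.

posted @ 2024-10-19 22:37 paulwong Views(117) | Comments (0)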

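Installing Docker + the NVIDIA Container Toolkit on a mainland-China network

The operating system is CentOS 9.

Install the driver first

Find the matching driver at https://www.nvidia.cn/drivers/lookup/, download it, then run:

# switch to text mode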
sudo systemctl set-default multi-user.target
sudo reboot

sh NVIDIA-Linux-x86_64-550.107.02.run

# switch back to graphical mode
sudo systemctl set-default graphical.target
sudo reboot

Install Docker:

yum remove docker \
                  docker-client \
                  docker-client-latest \
                  docker-common \
                  docker-latest \
                  docker-latest-logrotate \
                  docker-logrotate \
                  docker-engine

yum install -y yum-utils
yum-config-manager --add-repo https://mirrors.tuna.tsinghua.edu.cn/docker-ce/linux/centos/docker-ce.repo
sed -i 's+https://download.docker.com+https://mirrors.tuna.tsinghua.edu.cn/docker-ce+' /etc/yum.repos.d/docker-ce.repo

yum install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

sudo nvidia-ctk runtime configure --runtime=docker

Change the registry mirror:

[paul@paul-pc ~]$ cat /etc/docker/daemon.json
{
    "registry-mirrors": [
        "http://xxx.xxx.xxx"
    ],
    "runtimes": {
        "nvidia": {
            "args": [],
            "path": "nvidia-container-runtime"
        }
    }
}

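Install the container toolkit:

Find the matching installer at https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&Distribution=Rocky&target_version=9&target_type=runfile_local, download it, then run it. (The runfile from this page is the CUDA Toolkit installer; the nvidia-ctk command used above ships with the NVIDIA Container Toolkit.)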

sh cuda_12.6.0_560.28.03_linux.run

Verify:

sudo docker run --rm -it --gpus all ubuntu nvidia-smi

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.107.02             Driver Version: 550.107.02     CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0 NVIDIA GeForce RTX 2080 Ti     Off |   00000000:01:00.0 On |                  N/A |
| 62%   36C    P8              4W / 260W |     256MiB / 22528MiB |      1%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1 NVIDIA GeForce RTX 2080 Ti     Off |   00000000:02:00.0 Off |                  N/A |
| 64%   35C    P8              5W / 260W |       9MiB / 22528MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
| GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A N/A      2657      G   /usr/libexec/Xorg                              99MiB |
|    0   N/A N/A      2735      G   /usr/bin/gnome-shell                           38MiB |
|    0   N/A N/A      3502      G   /usr/lib64/firefox/firefox                    111MiB |
|    1   N/A N/A      2657      G   /usr/libexec/Xorg                               4MiB |
+-----------------------------------------------------------------------------------------+

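Reference:

https://mirrors.tuna.tsinghua.edu.cn/help/docker-ce/

posted @ 2024-08-15 10:49 paulwong Views(224) | Comments (0)

Python UI library

A server-side Python script that generates the HTML for you, with no JS or CSS to write; a good fit for AI projects

https://cheat-sheet.streamlit.app

Code for displaying text: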

st.text('Fixed width text')
st.markdown('_Markdown_') # see #*
st.caption('Balloons. Hundreds of them...')
st.latex(r''' e^{i\pi} + 1 = 0 ''')
st.write('Most objects') # df, err, func, keras!
st.write(['st', 'is <', 3]) # see *
st.title('My title')
st.header('My header')
st.subheader('My sub')
st.code('for i in range(8): foo()')

# * optional kwarg unsafe_allow_html = True

Form widgets:

st.button('Hit me')
st.data_editor(data)
st.checkbox('Check me out')
st.radio('Pick one:', ['nose','ear'])
st.selectbox('Select', [1,2,3])
st.multiselect('Multiselect', [1,2,3])
st.slider('Slide me', min_value=0, max_value=10)
st.select_slider('Slide to select', options=[1,'2'])
st.text_input('Enter some text')
st.number_input('Enter a number')
st.text_area('Area for textual entry')
st.date_input('Date input')
st.time_input('Time entry')
st.file_uploader('File uploader')
st.download_button('On the dl', data)
st.camera_input("一二三,茄子!")
st.color_picker('Pick a color')

Displaying data in tables:

st.dataframe(my_dataframe)
st.table(data.iloc[0:10])
st.json({'foo':'bar','fu':'ba'})
st.metric(label="Temp", value="273 K", delta="1.2 K")

Showing progress and status:

# Show a spinner during a process
import time
with st.spinner(text='In progress'):
    time.sleep(3)
    st.success('Done')

# Show and update progress bar
bar = st.progress(50)
time.sleep(3)
bar.progress(100)

st.balloons()
st.snow()
st.toast('Mr Stay-Puft')
st.error('Error message')
st.warning('Warning message')
st.info('Info message')
st.success('Success message')
st.exception(e)

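posted @ 2024-08-12 15:19 paulwong Views(109) | Comments (0)

Pushing code to Git over SSH

I needed to push code to GitHub these past few days and found that the password method I used before has been retired; you now have to use an SSH key.

1. Generate an SSH key

ssh-keygen
# creates ~/.ssh/id_rsa and ~/.ssh/id_rsa.pub

# if id_rsa was copied from elsewhere, fix its permissions with chmod 400 ~/.ssh/id_rsa

2. Create a new repository on GitHub

https://github.com/paulwong888/python-ai

3. Import the public key into GitHub

Open your SSH public key file, usually ~/.ssh/id_rsa.pub. Copy its contents, log in to your GitHub account, go to Settings > SSH and GPG keys, click "New SSH key", paste the key, and click "Add SSH key".

4. Clone the repository

git config --global user.name "John Doe"
git config --global user.email johndoe@example.com

git clone git@github.com:paulwong888/python-ai

5. Import the project into Eclipse

The clone in the previous step already created a local repository. Use Import->Git->Project from Git->Existing local repository and select the python-ai/.git folder.

From there on, everything works the same as with password authentication.

For VS Code, see: https://juejin.cn/post/6993612656410099719

posted @ 2024-07-24 12:31 paulwong Views(150) | Comments (0)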
