﻿<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/"><channel><title>BlogJava-paulwong-随笔分类-AI-FINE-TUNNING</title><link>http://www.blogjava.net/paulwong/category/55398.html</link><description /><language>zh-cn</language><lastBuildDate>Thu, 11 Sep 2025 20:29:54 GMT</lastBuildDate><pubDate>Thu, 11 Sep 2025 20:29:54 GMT</pubDate><ttl>60</ttl><item><title>微调案例</title><link>http://www.blogjava.net/paulwong/archive/2025/09/10/451675.html</link><dc:creator>paulwong</dc:creator><author>paulwong</author><pubDate>Wed, 10 Sep 2025 05:25:00 GMT</pubDate><guid>http://www.blogjava.net/paulwong/archive/2025/09/10/451675.html</guid><wfw:comment>http://www.blogjava.net/paulwong/comments/451675.html</wfw:comment><comments>http://www.blogjava.net/paulwong/archive/2025/09/10/451675.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/paulwong/comments/commentRss/451675.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/paulwong/services/trackbacks/451675.html</trackback:ping><description><![CDATA[<div>我如何用 Prompt 工程将大模型调教成风控专家</div>
<div><a href="https://my.oschina.net/u/4090830/blog/18690981" target="_blank">https://my.oschina.net/u/4090830/blog/18690981</a><br />
</div>
<div><br />
</div>
<div><br />
</div>
<img src ="http://www.blogjava.net/paulwong/aggbug/451675.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/paulwong/" target="_blank">paulwong</a> 2025-09-10 13:25 <a href="http://www.blogjava.net/paulwong/archive/2025/09/10/451675.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>AI football prediction resources</title><link>http://www.blogjava.net/paulwong/archive/2025/04/19/451611.html</link><dc:creator>paulwong</dc:creator><author>paulwong</author><pubDate>Fri, 18 Apr 2025 17:07:00 GMT</pubDate><guid>http://www.blogjava.net/paulwong/archive/2025/04/19/451611.html</guid><wfw:comment>http://www.blogjava.net/paulwong/comments/451611.html</wfw:comment><comments>http://www.blogjava.net/paulwong/archive/2025/04/19/451611.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/paulwong/comments/commentRss/451611.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/paulwong/services/trackbacks/451611.html</trackback:ping><description><![CDATA[<div>Hands-on 2022 World Cup prediction with machine learning</div>
<div><a href="https://www.showmeai.tech/article-detail/400" target="_blank">https://www.showmeai.tech/article-detail/400</a><br />
</div>
<div><br />
</div>
<div>AI prediction tool for sports-lottery matches<br />
</div>
<div><a href="https://www.mysports.ai/cn" target="_blank">https://www.mysports.ai/cn</a><br />
</div><img src ="http://www.blogjava.net/paulwong/aggbug/451611.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/paulwong/" target="_blank">paulwong</a> 2025-04-19 01:07 <a href="http://www.blogjava.net/paulwong/archive/2025/04/19/451611.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>Datasets for fine-tuning</title><link>http://www.blogjava.net/paulwong/archive/2025/03/21/451602.html</link><dc:creator>paulwong</dc:creator><author>paulwong</author><pubDate>Fri, 21 Mar 2025 13:52:00 GMT</pubDate><guid>http://www.blogjava.net/paulwong/archive/2025/03/21/451602.html</guid><wfw:comment>http://www.blogjava.net/paulwong/comments/451602.html</wfw:comment><comments>http://www.blogjava.net/paulwong/archive/2025/03/21/451602.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/paulwong/comments/commentRss/451602.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/paulwong/services/trackbacks/451602.html</trackback:ping><description><![CDATA[<div>When fine-tuning with the trl library, the dataset must meet the following requirements:</div>
<div><br />
</div>
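<div>For multi-turn conversation scenarios:</div>
<div>Use a jsonl file that meets the following requirements:</div>
<div>1. Each line is a standalone JSON object;</div>
<div>2. Each object must contain an array under the key messages, and the array must not be empty;</div>
<div>3. Each element of messages must contain both a role and a content field;</div>
<div>4. role must be one of system, user, or assistant;</div>
<div>5. If there is a system message, it must come first in the array;</div>
<div>6. The first non-system message must have the user role;</div>
<div>7. user and assistant messages should alternate in pairs, with at least one pair.</div>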
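<div>The rules above can be checked mechanically before training. The following is a minimal sketch in plain Python, not part of trl: the function name validate_chat_record and its error messages are made up for illustration, and it encodes only the seven rules listed here.</div>

```python
import json

ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_chat_record(line: str) -> list[str]:
    """Return the rule violations for one JSONL line (an empty list means valid)."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(record, dict):
        return ["line is not a JSON object"]
    messages = record.get("messages")
    if not isinstance(messages, list) or not messages:
        return ["'messages' must be a non-empty array"]

    errors = []
    for i, msg in enumerate(messages):
        if not isinstance(msg, dict) or "role" not in msg or "content" not in msg:
            errors.append(f"message {i} needs both 'role' and 'content'")
        elif msg["role"] not in ALLOWED_ROLES:
            errors.append(f"message {i} has unknown role {msg['role']!r}")
        elif msg["role"] == "system" and i != 0:
            errors.append(f"system message at index {i} must be first")
    if errors:
        return errors

    # Ignore the optional leading system message, then check the alternation.
    turns = [m["role"] for m in messages if m["role"] != "system"]
    if len(turns) < 2 or len(turns) % 2 != 0:
        errors.append("user/assistant messages must come in pairs, at least one pair")
    for i, role in enumerate(turns):
        expected = "user" if i % 2 == 0 else "assistant"
        if role != expected:
            errors.append(f"turn {i} should be {expected}, got {role}")
            break
    return errors
```

<div>Running this over every line of the jsonl file and collecting the non-empty results gives a quick report of which records need fixing.</div>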
<div><br />
</div>
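<div>For instruction fine-tuning scenarios:</div>
<div>
<div>Use a jsonl file that meets the following requirements:</div>
<div>1. Each line is a standalone JSON object;</div>
<div>2. Each object must contain exactly one key-value pair, with the key text and a non-empty value.</div>
</div>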
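<div>A similar check works for the text format. This is only a sketch: the helper name is_valid_text_record is hypothetical, and it assumes the value must be a string that is non-empty after stripping whitespace.</div>

```python
import json


def is_valid_text_record(line: str) -> bool:
    """True if the line is a JSON object whose only key is a non-empty 'text' string."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        return False
    return (
        isinstance(record, dict)
        and set(record) == {"text"}            # exactly one key, named "text"
        and isinstance(record["text"], str)    # assumption: the value is a string
        and record["text"].strip() != ""
    )
```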
<div><br />
</div>
<img src ="http://www.blogjava.net/paulwong/aggbug/451602.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/paulwong/" target="_blank">paulwong</a> 2025-03-21 21:52 <a href="http://www.blogjava.net/paulwong/archive/2025/03/21/451602.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>Stages of large-model training</title><link>http://www.blogjava.net/paulwong/archive/2025/03/18/451600.html</link><dc:creator>paulwong</dc:creator><author>paulwong</author><pubDate>Tue, 18 Mar 2025 05:14:00 GMT</pubDate><guid>http://www.blogjava.net/paulwong/archive/2025/03/18/451600.html</guid><wfw:comment>http://www.blogjava.net/paulwong/comments/451600.html</wfw:comment><comments>http://www.blogjava.net/paulwong/archive/2025/03/18/451600.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/paulwong/comments/commentRss/451600.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/paulwong/services/trackbacks/451600.html</trackback:ping><description><![CDATA[After a large model is built, it typically goes through the following training stages:
<div><br />
</div>
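<h2>Pre-training</h2>
<div>Only raw text is provided: {"text":"..."}</div>
<div>Starting from the first token, the model is trained to predict the text that follows, until the end of the sample.</div>
<div>A model trained this way can only complete text.</div>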
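<div>As a rough illustration of that objective, the sketch below turns one text record into (context, next token) training pairs. Splitting on whitespace is a stand-in for a real tokenizer, and the function name next_token_pairs is invented for this example.</div>

```python
def next_token_pairs(tokens: list[str]) -> list[tuple[list[str], str]]:
    """Pair every prefix of the sequence with the token that follows it."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


# Whitespace splitting stands in for a real tokenizer (assumption for illustration).
record = {"text": "the cat sat down"}
for context, target in next_token_pairs(record["text"].split()):
    print(context, "->", target)
```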
<div><br />
</div>
<h2>Supervised Fine-Tuning (SFT)</h2>
<div>This teaches the model to answer according to an instruction instead of producing an arbitrary continuation.</div>
<div>Data provided: {"instruction":"...", "output":"..."}</div>
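<div>One common way to feed such a record to a trainer is to join it into a single training string. The sketch below shows the idea; the prompt template is hypothetical, and real projects normally use the chat template that ships with the base model.</div>

```python
# Hypothetical prompt layout, chosen only for illustration.
PROMPT_TEMPLATE = "### Instruction:\n{instruction}\n\n### Response:\n"


def to_training_text(record: dict) -> str:
    """Join an instruction/output record into one training string."""
    return PROMPT_TEMPLATE.format(instruction=record["instruction"]) + record["output"]


example = {"instruction": "Say hello.", "output": "Hello!"}
print(to_training_text(example))
```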
<div><br />
</div>
<h2>Parameter-Efficient Fine-Tuning (PEFT)</h2>
<div>Only a small number of parameters are trained; LoRA is one concrete method.</div>
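<div>To get a feel for the saving, the sketch below counts the trainable parameters when a d_out x d_in weight matrix is frozen and only two low-rank LoRA factors, B (d_out x r) and A (r x d_in), are trained. The layer size and rank are illustrative values, not taken from any particular model.</div>

```python
def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in the two low-rank factors B (d_out x rank) and A (rank x d_in)."""
    return d_out * rank + rank * d_in


d_out = d_in = 4096  # illustrative layer size
rank = 8             # illustrative LoRA rank
full = d_out * d_in  # parameters in the frozen full weight matrix
lora = lora_trainable_params(d_out, d_in, rank)
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.4%}")
```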
<div><br />
</div>
<div>Reference:</div>
<div><a href="https://github.com/huggingface/smol-course/blob/main/1_instruction_tuning/notebooks/sft_finetuning_example.ipynb" target="_blank">https://github.com/huggingface/smol-course/blob/main/1_instruction_tuning/notebooks/sft_finetuning_example.ipynb</a><br />
</div>
<img src ="http://www.blogjava.net/paulwong/aggbug/451600.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/paulwong/" target="_blank">paulwong</a> 2025-03-18 13:14 <a href="http://www.blogjava.net/paulwong/archive/2025/03/18/451600.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>Setting up a llamafactory environment for fine-tuning, evaluation, testing, and quantization</title><link>http://www.blogjava.net/paulwong/archive/2025/01/16/451558.html</link><dc:creator>paulwong</dc:creator><author>paulwong</author><pubDate>Thu, 16 Jan 2025 08:54:00 GMT</pubDate><guid>http://www.blogjava.net/paulwong/archive/2025/01/16/451558.html</guid><wfw:comment>http://www.blogjava.net/paulwong/comments/451558.html</wfw:comment><comments>http://www.blogjava.net/paulwong/archive/2025/01/16/451558.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/paulwong/comments/commentRss/451558.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/paulwong/services/trackbacks/451558.html</trackback:ping><description><![CDATA[<div>0. Configure environment variables</div>
<div>
<div style="background-color: #eeeeee; font-size: 13px; border-left-color: #cccccc; border-image: none; padding: 4px 5px 4px 4px; width: 98%; word-break: break-all;">HF_ENDPOINT=https://hf-mirror.com<br />
HF_HOME=/root/autodl-tmp/paul/tools/huggingface</div>
</div>
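<div>If the training script is started from Python rather than from a shell that already exports these variables, they can be set in code as well. A minimal sketch, using the same values as above; the assignments need to run before huggingface_hub or transformers is imported, because those libraries read the variables when they are imported.</div>

```python
import os

# Same values as in the step above. setdefault keeps any value that the
# surrounding shell has already exported.
os.environ.setdefault("HF_ENDPOINT", "https://hf-mirror.com")
os.environ.setdefault("HF_HOME", "/root/autodl-tmp/paul/tools/huggingface")

print(os.environ["HF_ENDPOINT"], os.environ["HF_HOME"])
```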
<div><br />
</div>
1. Install Python 3.10 locally and configure a package index mirror (both commands set the same key, so run only one of them; the second would overwrite the first)
<div>
<div style="background-color: #eeeeee; font-size: 13px; border-left-color: #cccccc; border-image: none; padding: 4px 5px 4px 4px; width: 98%; word-break: break-all;">pip&nbsp;config&nbsp;set&nbsp;global.index-url&nbsp;https://pypi.tuna.tsinghua.edu.cn/simple<br />
pip&nbsp;config&nbsp;set&nbsp;global.index-url&nbsp;https://mirrors.huaweicloud.com/repository/pypi/simple</div>
<div><br />
<div>2. Install Miniconda</div>
<div><a href="https://juejin.cn/post/7078965942968909854" target="_blank">https://juejin.cn/post/7078965942968909854</a><br />
</div>
<div><br />
</div>
<div>3. Create a new environment and activate it</div>
<div>
<div style="background-color: #eeeeee; font-size: 13px; border-left-color: #cccccc; border-image: none; padding: 4px 5px 4px 4px; width: 98%; word-break: break-all;">conda&nbsp;create&nbsp;-n&nbsp;quantization&nbsp;python=3.12<br />
conda&nbsp;activate&nbsp;quantization</div>
</div>
<div><br />
<div>4. Install PyTorch 2.5.1 + CUDA 12.4 locally (use either command below; the second selects the CUDA 12.4 wheel index explicitly)</div>
<div>
<div style="background-color: #eeeeee; font-size: 13px; border-left-color: #cccccc; border-image: none; padding: 4px 5px 4px 4px; width: 98%; word-break: break-all;">pip3&nbsp;install&nbsp;torch&nbsp;torchvision&nbsp;torchaudio</div>
<div style="background-color: #eeeeee; font-size: 13px; border-left-color: #cccccc; border-image: none; padding: 4px 5px 4px 4px; width: 98%; word-break: break-all;">pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124</div>
</div>
<div><br />
</div>
<div>5. Clone the LLaMA-Factory source</div>
<div>
<div style="background-color: #eeeeee; font-size: 13px; border-left-color: #cccccc; border-image: none; padding: 4px 5px 4px 4px; width: 98%; word-break: break-all;">git&nbsp;clone&nbsp;https://github.com/hiyouga/LLaMA-Factory</div>
</div>
<div><br />
</div>
<div>6. Install the LLaMA-Factory dependencies locally (run inside the cloned LLaMA-Factory directory)</div>
</div>
<div>
<div style="background-color: #eeeeee; font-size: 13px; border-left-color: #cccccc; border-image: none; padding: 4px 5px 4px 4px; width: 98%; word-break: break-all;">pip&nbsp;install&nbsp;-e&nbsp;.</div>
</div>
<div style="background-color: #eeeeee; font-size: 13px; border-left-color: #cccccc; border-image: none; padding: 4px 5px 4px 4px; width: 98%; word-break: break-all;">pip install -e ".[vllm,gptq]"<br />
</div>
<div><br />
</div>
<div>7. Start the web UI</div>
</div>
<div>
<div style="background-color: #eeeeee; font-size: 13px; border-left-color: #cccccc; border-image: none; padding: 4px 5px 4px 4px; width: 98%; word-break: break-all;">llamafactory-cli&nbsp;webui</div>
</div>
<div><br />
</div>
<div>8. Fill in the relevant parameters on the page and run the operations from there</div>
</div>
<img src ="http://www.blogjava.net/paulwong/aggbug/451558.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/paulwong/" target="_blank">paulwong</a> 2025-01-16 16:54 <a href="http://www.blogjava.net/paulwong/archive/2025/01/16/451558.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>Fine-tuning resources</title><link>http://www.blogjava.net/paulwong/archive/2025/01/15/451556.html</link><dc:creator>paulwong</dc:creator><author>paulwong</author><pubDate>Wed, 15 Jan 2025 09:56:00 GMT</pubDate><guid>http://www.blogjava.net/paulwong/archive/2025/01/15/451556.html</guid><wfw:comment>http://www.blogjava.net/paulwong/comments/451556.html</wfw:comment><comments>http://www.blogjava.net/paulwong/archive/2025/01/15/451556.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/paulwong/comments/commentRss/451556.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/paulwong/services/trackbacks/451556.html</trackback:ping><description><![CDATA[<div>Fine-tune Llama 3.1 Ultra-Efficiently with Unsloth</div>
<div><a href="https://huggingface.co/blog/mlabonne/sft-llama3" target="_blank">https://huggingface.co/blog/mlabonne/sft-llama3</a><br />
</div>
<div><br />
</div>
<div>A beginners guide to fine tuning LLM using LoRA</div>
<div><a href="https://zohaib.me/a-beginners-guide-to-fine-tuning-llm-using-lora/" target="_blank">https://zohaib.me/a-beginners-guide-to-fine-tuning-llm-using-lora/</a><br />
</div>
<div><br />
</div>
<div>[Day 23] Train your AI pet: use fine-tuning to make an LLM behave</div>
<div><a href="https://ithelp.ithome.com.tw/articles/10346441" target="_blank">https://ithelp.ithome.com.tw/articles/10346441</a><br />
</div>
<div><br />
</div>
<div><br />
</div>
<img src ="http://www.blogjava.net/paulwong/aggbug/451556.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/paulwong/" target="_blank">paulwong</a> 2025-01-15 17:56 <a href="http://www.blogjava.net/paulwong/archive/2025/01/15/451556.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>Installing the axolotl AI fine-tuning framework</title><link>http://www.blogjava.net/paulwong/archive/2025/01/12/451548.html</link><dc:creator>paulwong</dc:creator><author>paulwong</author><pubDate>Sun, 12 Jan 2025 08:37:00 GMT</pubDate><guid>http://www.blogjava.net/paulwong/archive/2025/01/12/451548.html</guid><wfw:comment>http://www.blogjava.net/paulwong/comments/451548.html</wfw:comment><comments>http://www.blogjava.net/paulwong/archive/2025/01/12/451548.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/paulwong/comments/commentRss/451548.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/paulwong/services/trackbacks/451548.html</trackback:ping><description><![CDATA[1. Install the NVIDIA driver and CUDA toolkit
<div><a href="https://developer.nvidia.com/cuda-downloads?target_os=Linux&amp;target_arch=x86_64&amp;Distribution=WSL-Ubuntu&amp;target_version=2.0&amp;target_type=runfile_local" target="_blank">https://developer.nvidia.com/cuda-downloads?target_os=Linux&amp;target_arch=x86_64&amp;Distribution=WSL-Ubuntu&amp;target_version=2.0&amp;target_type=runfile_local</a><br />
&nbsp;
<div>2. Install Python and Miniconda</div>
<div>Both are generally installed from downloaded installers.<br />
Python download: <a href="https://repo.huaweicloud.com/python/3.12.8/" target="_blank">https://repo.huaweicloud.com/python/3.12.8/</a><br />
Miniconda download: <a href="https://mirrors.tuna.tsinghua.edu.cn/anaconda/miniconda/" target="_blank">https://mirrors.tuna.tsinghua.edu.cn/anaconda/miniconda/</a><br />
Tsinghua conda mirror: <a href="https://mirrors.tuna.tsinghua.edu.cn/help/anaconda/" target="_blank">https://mirrors.tuna.tsinghua.edu.cn/help/anaconda/</a><br />
</div>
<div><br />
</div>
<div>3. Create a new conda environment</div>
<div>
<div style="background-color: #eeeeee; font-size: 13px; border-left-color: #cccccc; border-image: none; padding: 4px 5px 4px 4px; width: 98%; word-break: break-all;">conda&nbsp;create&nbsp;-n&nbsp;axolotl&nbsp;python=3.12</div>
</div>
<div><br />
</div>
<div>4. Install the CUDA build of PyTorch</div>
<div><a href="https://download.pytorch.org/whl/cu124/torch-2.5.0%2Bcu124-cp311-cp311-linux_x86_64.whl#sha256=5e3f4a7ba812517c2c1659857b5195f287a288fbd050a5abf9311e03dbe1a28b" target="_blank">https://download.pytorch.org/whl/cu124/torch-2.5.0%2Bcu124-cp311-cp311-linux_x86_64.whl#sha256=5e3f4a7ba812517c2c1659857b5195f287a288fbd050a5abf9311e03dbe1a28b</a><br />
</div>
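<div>The wheel linked above is a cp311 build (CPython 3.11), so it will not install into the Python 3.12 environment created in step 3. To pick a matching cp312 build or a different version, look it up at:</div>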
<div><a href="https://download.pytorch.org/whl/torch">https://download.pytorch.org/whl/torch</a><br />
</div>
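<div>Before downloading a wheel it helps to check that its Python tag matches the interpreter in the conda environment. The sketch below reads the tag from the file name; the helper names are invented for this example, and it deliberately ignores universal tags such as py3 or abi3.</div>

```python
import sys


def wheel_python_tag(wheel_name: str) -> str:
    """Return the Python tag (e.g. 'cp311') from a wheel file name or URL path."""
    stem = wheel_name.rsplit("/", 1)[-1].removesuffix(".whl")
    # Wheel names end with {python tag}-{abi tag}-{platform tag}.
    return stem.split("-")[-3]


def matches_interpreter(wheel_name: str, version=sys.version_info) -> bool:
    """True if the wheel's CPython tag matches the given (major, minor) version."""
    return wheel_python_tag(wheel_name) == f"cp{version[0]}{version[1]}"


wheel = "torch-2.5.0+cu124-cp311-cp311-linux_x86_64.whl"
print(wheel_python_tag(wheel), matches_interpreter(wheel, (3, 12)))
```

<div>Calling matches_interpreter(wheel) without a version compares against the interpreter that is running the script, which is the useful check inside an activated environment.</div>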
<div><br />
</div>
</div>
<div>5. git clone&nbsp;<a href="https://github.com/axolotl-ai-cloud/axolotl" target="_blank">https://github.com/axolotl-ai-cloud/axolotl</a>, cd into the repository root, and run</div>
<div>
<div style="background-color: #eeeeee; font-size: 13px; border-left-color: #cccccc; border-image: none; padding: 4px 5px 4px 4px; width: 98%; word-break: break-all;">pip3&nbsp;install&nbsp;--no-build-isolation&nbsp;axolotl[flash-attn,deepspeed]</div>
</div>
<img src ="http://www.blogjava.net/paulwong/aggbug/451548.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/paulwong/" target="_blank">paulwong</a> 2025-01-12 16:37 <a href="http://www.blogjava.net/paulwong/archive/2025/01/12/451548.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>Fine-tuning the llama3 model (2) - Building a chatbot with ollama</title><link>http://www.blogjava.net/paulwong/archive/2024/07/08/451464.html</link><dc:creator>paulwong</dc:creator><author>paulwong</author><pubDate>Mon, 08 Jul 2024 11:48:00 GMT</pubDate><guid>http://www.blogjava.net/paulwong/archive/2024/07/08/451464.html</guid><wfw:comment>http://www.blogjava.net/paulwong/comments/451464.html</wfw:comment><comments>http://www.blogjava.net/paulwong/archive/2024/07/08/451464.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/paulwong/comments/commentRss/451464.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/paulwong/services/trackbacks/451464.html</trackback:ping><description><![CDATA[<div> The previous post merged the fine-tuned model; this one sets up a chatbot so that the model can be used through a web UI.</div><div></div><div><h3>1. Set the environment variable for the ollama model storage path in /etc/profile</h3></div><div><div style="background-color:#eeeeee;font-size:13px;border:1px solid #CCCCCC;padding-right: 5px;padding-bottom: 4px;padding-left: 4px;padding-top: 4px;width: 98%;word-break:break-all"><!--<br /><br />Code highlighting produced by Actipro CodeHighlighter (freeware)<br />http://www.CodeHighlighter.com/<br /><br />--><span style="color: #000000; ">export&nbsp;OLLAMA_MODELS</span><span style="color: #000000; ">=</span><span style="color: #000000; ">/root/autodl-tmp/models/ollama</span></div></div><div></div><div></div><div><h3>2. Install ollama with the official install script</h3></div><div><div style="background-color:#eeeeee;font-size:13px;border:1px solid #CCCCCC;padding-right: 5px;padding-bottom: 4px;padding-left: 4px;padding-top: 4px;width: 
98%;word-break:break-all"><!--<br /><br />Code highlighting produced by Actipro CodeHighlighter (freeware)<br />http://www.CodeHighlighter.com/<br /><br />--><span style="color: #000000; ">curl&nbsp;-fsSL&nbsp;https://ollama.com/install.sh&nbsp;|&nbsp;sh</span></div></div><div></div><div></div><div><h3>3.启动ollama</h3></div><div><div style="background-color:#eeeeee;font-size:13px;border:1px solid #CCCCCC;padding-right: 5px;padding-bottom: 4px;padding-left: 4px;padding-top: 4px;width: 98%;word-break:break-all"><!--<br /><br />Code highlighting produced by Actipro CodeHighlighter (freeware)<br />http://www.CodeHighlighter.com/<br /><br />--><span style="color: #000000; ">ollama&nbsp;serve</span></div></div><div></div><div></div><div></div><div></div><div><h3>4.建立ollama镜像的配置文件，Modelfile</h3></div><div><div style="background-color:#eeeeee;font-size:13px;border:1px solid #CCCCCC;padding-right: 5px;padding-bottom: 4px;padding-left: 4px;padding-top: 4px;width: 98%;word-break:break-all"><!--<br /><br />Code highlighting produced by Actipro CodeHighlighter (freeware)<br />http://www.CodeHighlighter.com/<br /><br />--><span style="color: #000000; ">#&nbsp;set&nbsp;the&nbsp;base&nbsp;model<br />FROM&nbsp;/root/.ollama/llamafactory-export/saves/llama3-8b/lora/docker-commnad-nlp/export<br /><br />#&nbsp;set&nbsp;custom&nbsp;parameter&nbsp;values<br />PARAMETER&nbsp;temperature&nbsp;</span><span style="color: #000000; ">1</span><span style="color: #000000; "><br />PARAMETER&nbsp;num_keep&nbsp;</span><span style="color: #000000; ">24</span><span style="color: #000000; "><br />PARAMETER&nbsp;stop&nbsp;&lt;|start_header_id|&gt;<br />PARAMETER&nbsp;stop&nbsp;&lt;|end_header_id|&gt;<br />PARAMETER&nbsp;stop&nbsp;&lt;|eot_id|&gt;<br />PARAMETER&nbsp;stop&nbsp;&lt;|reserved_special_token<br /><br />#&nbsp;set&nbsp;the&nbsp;model&nbsp;template<br />TEMPLATE&nbsp;</span><span style="color: #000000; ">"""</span><span style="color: #000000; "><br 
/>{{&nbsp;if&nbsp;.System&nbsp;}}&lt;|start_header_id|&gt;system&lt;|end_header_id|&gt;<br />{{&nbsp;.System&nbsp;}}&lt;|eot_id|&gt;{{&nbsp;end&nbsp;}}{{&nbsp;if&nbsp;.Prompt&nbsp;}}&lt;|start_header_id|&gt;user&lt;|end_header_id|&gt;<br />{{&nbsp;.Prompt&nbsp;}}&lt;|eot_id|&gt;{{&nbsp;end&nbsp;}}&lt;|start_header_id|&gt;assistant&lt;|end_header_id|&gt;<br />{{&nbsp;.Response&nbsp;}}&lt;|eot_id|&gt;<br /></span><span style="color: #000000; ">"""</span><span style="color: #000000; "><br /><br />#&nbsp;set&nbsp;the&nbsp;system&nbsp;message<br />SYSTEM&nbsp;You&nbsp;are&nbsp;llama3&nbsp;from&nbsp;Meta</span><span style="color: #000000; ">,</span><span style="color: #000000; ">&nbsp;customized&nbsp;and&nbsp;hosted&nbsp;@&nbsp;Paul&nbsp;Wong&nbsp;(http://paulwong88.tpddns.cn).<br /><br />#&nbsp;set&nbsp;Chinese&nbsp;lora&nbsp;support<br />#ADAPTER&nbsp;/root/.ollama/models/lora/ggml-adapter-model.bin</span></div></div><div></div><div>建立镜像命令，create-ollama-image-docker-command-nlp.sh</div><div><div style="background-color:#eeeeee;font-size:13px;border:1px solid #CCCCCC;padding-right: 5px;padding-bottom: 4px;padding-left: 4px;padding-top: 4px;width: 98%;word-break:break-all"><!--<br /><br />Code highlighting produced by Actipro CodeHighlighter (freeware)<br />http://www.CodeHighlighter.com/<br /><br />--><span style="color: #000000; ">BIN_PATH</span><span style="color: #000000; ">=</span><span style="color: #000000; ">$(cd&nbsp;`dirname&nbsp;$</span><span style="color: #000000; ">0</span><span style="color: #000000; ">`</span><span style="color: #008000; ">;</span><span style="color: #008000; ">&nbsp;pwd)</span><span style="color: #008000; "><br /></span><span style="color: #000000; ">cd&nbsp;$BIN_PATH/<br />pwd<br />ollama&nbsp;create&nbsp;llama3-docker-commnad-nlp:paul&nbsp;-f&nbsp;Modelfile</span></div></div><div></div><div></div><div></div><div></div><div><h3>5.运行大模型</h3></div><div><div style="background-color:#eeeeee;font-size:13px;border:1px solid 
#CCCCCC;padding-right: 5px;padding-bottom: 4px;padding-left: 4px;padding-top: 4px;width: 98%;word-break:break-all"><!--<br /><br />Code highlighting produced by Actipro CodeHighlighter (freeware)<br />http://www.CodeHighlighter.com/<br /><br />--><span style="color: #000000; ">ollama&nbsp;run&nbsp;llama3-docker-commnad-nlp:paul</span></div></div><div></div><div></div><img src ="http://www.blogjava.net/paulwong/aggbug/451464.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/paulwong/" target="_blank">paulwong</a> 2024-07-08 19:48 <a href="http://www.blogjava.net/paulwong/archive/2024/07/08/451464.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>Fine-tuning the llama3 model (1) - Fine-tuning llama3 with Llama Factory</title><link>http://www.blogjava.net/paulwong/archive/2024/07/08/451463.html</link><dc:creator>paulwong</dc:creator><author>paulwong</author><pubDate>Mon, 08 Jul 2024 10:44:00 GMT</pubDate><guid>http://www.blogjava.net/paulwong/archive/2024/07/08/451463.html</guid><wfw:comment>http://www.blogjava.net/paulwong/comments/451463.html</wfw:comment><comments>http://www.blogjava.net/paulwong/archive/2024/07/08/451463.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/paulwong/comments/commentRss/451463.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/paulwong/services/trackbacks/451463.html</trackback:ping><description><![CDATA[<div> Open-source models such as Meta's llama3 are pre-trained on general-purpose data, so they may not suit a company that wants to use them: the model is unfamiliar with the company's own data. Fine-tuning addresses this.</div><div></div><div>By feeding the model a large amount of data, starting from about 10,000 samples, the model becomes familiar with the company's data and can then be used in various conversational scenarios.</div><div></div><div></div><div><h3>1. Clone and install the LLaMA Factory library, install-llamafactory.sh</h3></div><div><div style="background-color:#eeeeee;font-size:13px;border:1px solid #CCCCCC;padding-right: 5px;padding-bottom: 4px;padding-left: 4px;padding-top: 4px;width: 98%;word-break:break-all"><!--<br /><br />Code highlighting produced by Actipro 
CodeHighlighter (freeware)<br />http://www.CodeHighlighter.com/<br /><br />--><span style="color: #000000; ">BIN_PATH</span><span style="color: #000000; ">=</span><span style="color: #000000; ">$(cd&nbsp;`dirname&nbsp;$</span><span style="color: #000000; ">0</span><span style="color: #000000; ">`</span><span style="color: #008000; ">;</span><span style="color: #008000; ">&nbsp;pwd)</span><span style="color: #008000; "><br /></span><span style="color: #000000; ">cd&nbsp;$BIN_PATH/../<br />pwd<br />git&nbsp;clone&nbsp;--depth&nbsp;</span><span style="color: #000000; ">1</span><span style="color: #000000; ">&nbsp;https://github.com/hiyouga/LLaMA-Factory.git<br />cd&nbsp;LLaMA-Factory<br />pip&nbsp;install&nbsp;-e&nbsp;</span><span style="color: #000000; ">"</span><span style="color: #000000; ">.[torch,metrics,bitsandbytes,modelscope]</span><span style="color: #000000; ">"</span></div></div><div></div><div></div><div><h3>2. Set environment variables</h3></div><div><div style="background-color:#eeeeee;font-size:13px;border:1px solid #CCCCCC;padding-right: 5px;padding-bottom: 4px;padding-left: 4px;padding-top: 4px;width: 98%;word-break:break-all"><!--<br /><br />Code highlighting produced by Actipro CodeHighlighter (freeware)<br />http://www.CodeHighlighter.com/<br /><br />--><span style="color: #000000; ">export&nbsp;USE_MODELSCOPE_HUB</span><span style="color: #000000; ">=</span><span style="color: #000000; ">1</span><span style="color: #000000; ">&nbsp;#&nbsp;use&nbsp;the&nbsp;ModelScope&nbsp;hub&nbsp;instead&nbsp;of&nbsp;Hugging&nbsp;Face<br />export&nbsp;CUDA_VISIBLE_DEVICES</span><span style="color: #000000; ">=</span><span style="color: #000000; ">0</span><span style="color: #000000; ">&nbsp;#&nbsp;select&nbsp;which&nbsp;GPU&nbsp;to&nbsp;use<br />export&nbsp;HF_ENDPOINT</span><span style="color: #000000; ">=</span><span style="color: #000000; ">https://hf-mirror.com&nbsp;#&nbsp;mirror&nbsp;endpoint&nbsp;used&nbsp;in&nbsp;place&nbsp;of&nbsp;huggingface<br />export&nbsp;MODELSCOPE_CACHE</span><span style="color: #000000; ">=</span><span style="color: #000000; ">/root/autodl-tmp/models/modelscope&nbsp;#&nbsp;where&nbsp;ModelScope&nbsp;stores&nbsp;downloaded&nbsp;models<br />export 
LLAMAFACTORY_HOME=/root/autodl-tmp/LLaMA-Factory<br /></span></div></div><div></div><div></div><div><h3>3.准备数据</h3></div><div><div style="background-color:#eeeeee;font-size:13px;border:1px solid #CCCCCC;padding-right: 5px;padding-bottom: 4px;padding-left: 4px;padding-top: 4px;width: 98%;word-break:break-all"><!--<br /><br />Code highlighting produced by Actipro CodeHighlighter (freeware)<br />http://www.CodeHighlighter.com/<br /><br />--><span style="color: #000000; ">#在data/dataset_info.json中加入此数据<br /><br /></span><span style="color: #000000; ">"</span><span style="color: #000000; ">docker_command_NL</span><span style="color: #000000; ">"</span><span style="color: #000000; ">:&nbsp;{<br />&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #000000; ">"</span><span style="color: #000000; ">hf_hub_url</span><span style="color: #000000; ">"</span><span style="color: #000000; ">:&nbsp;</span><span style="color: #000000; ">"</span><span style="color: #000000; ">MattCoddity/dockerNLcommands</span><span style="color: #000000; ">"</span><span style="color: #000000; "><br />&nbsp;&nbsp;}</span><span style="color: #000000; ">,</span></div></div><div></div><div>在data目录中加入训练数据，MattCoddity/dockerNLcommands.json</div><div>数据格式为：</div><div><div style="background-color:#eeeeee;font-size:13px;border:1px solid #CCCCCC;padding-right: 5px;padding-bottom: 4px;padding-left: 4px;padding-top: 4px;width: 98%;word-break:break-all"><!--<br /><br />Code highlighting produced by Actipro CodeHighlighter (freeware)<br />http://www.CodeHighlighter.com/<br /><br />--><span style="color: #800000; font-weight: bold; ">[<br /></span><span style="color: #000000; ">&nbsp;&nbsp;{<br />&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #000000; ">"</span><span style="color: #000000; ">input</span><span style="color: #000000; ">"</span><span style="color: #000000; ">:&nbsp;</span><span style="color: #000000; ">"</span><span style="color: #000000; 
">Give&nbsp;me&nbsp;a&nbsp;list&nbsp;of&nbsp;containers&nbsp;that&nbsp;have&nbsp;the&nbsp;Ubuntu&nbsp;image&nbsp;as&nbsp;their&nbsp;ancestor.</span><span style="color: #000000; ">"</span><span style="color: #000000; ">,</span><span style="color: #000000; "><br />&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #000000; ">"</span><span style="color: #000000; ">instruction</span><span style="color: #000000; ">"</span><span style="color: #000000; ">:&nbsp;</span><span style="color: #000000; ">"</span><span style="color: #000000; ">translate&nbsp;this&nbsp;sentence&nbsp;in&nbsp;docker&nbsp;command</span><span style="color: #000000; ">"</span><span style="color: #000000; ">,</span><span style="color: #000000; "><br />&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #000000; ">"</span><span style="color: #000000; ">output</span><span style="color: #000000; ">"</span><span style="color: #000000; ">:&nbsp;</span><span style="color: #000000; ">"</span><span style="color: #000000; ">docker&nbsp;ps&nbsp;--filter&nbsp;'ancestor=ubuntu'</span><span style="color: #000000; ">"</span><span style="color: #000000; "><br />&nbsp;&nbsp;}</span><span style="color: #000000; ">,</span><span style="color: #000000; "><br /><img src="http://www.blogjava.net/Images/dot.gif" alt="" /><br />]</span></div></div><div></div><div></div><div></div><div><h3>4.训练大模型</h3></div><div>训练的参数文件：llama3_lora_sft_docker_command.yaml</div><div><div style="background-color:#eeeeee;font-size:13px;border:1px solid #CCCCCC;padding-right: 5px;padding-bottom: 4px;padding-left: 4px;padding-top: 4px;width: 98%;word-break:break-all"><!--<br /><br />Code highlighting produced by Actipro CodeHighlighter (freeware)<br />http://www.CodeHighlighter.com/<br /><br />--><span style="color: #000000; ">###&nbsp;model<br />#md&nbsp;model&nbsp;id<br />model_name_or_path:&nbsp;LLM-Research/Meta-Llama-</span><span style="color: #000000; ">3</span><span style="color: #000000; ">-8B-Instruct<br />#huggingface&nbsp;model&nbsp;id<br 
/>#model_name_or_path:&nbsp;meta-llama/Meta-Llama-</span><span style="color: #000000; ">3</span><span style="color: #000000; ">-8B-Instruct<br /><br />###&nbsp;method<br />stage:&nbsp;sft<br />do_train:&nbsp;true<br />finetuning_type:&nbsp;lora<br />lora_target:&nbsp;all<br /><br />###&nbsp;dataset<br />dataset:&nbsp;docker_command_NL<br />template:&nbsp;llama3<br />cutoff_len:&nbsp;</span><span style="color: #000000; ">1024</span><span style="color: #000000; "><br />max_samples:&nbsp;</span><span style="color: #000000; ">1000</span><span style="color: #000000; "><br />overwrite_cache:&nbsp;true<br />preprocessing_num_workers:&nbsp;</span><span style="color: #000000; ">16</span><span style="color: #000000; "><br /><br />###&nbsp;output<br />output_dir: /root/autodl-tmp/my-test/saves/llama3-8b/lora/docker-commnad-nlp/sft<br />logging_steps:&nbsp;</span><span style="color: #000000; ">10</span><span style="color: #000000; "><br />save_steps:&nbsp;</span><span style="color: #000000; ">500</span><span style="color: #000000; "><br />plot_loss:&nbsp;true<br />overwrite_output_dir:&nbsp;true<br /><br />###&nbsp;train<br />per_device_train_batch_size:&nbsp;</span><span style="color: #000000; ">4</span><span style="color: #000000; "><br />gradient_accumulation_steps:&nbsp;</span><span style="color: #000000; ">8</span><span style="color: #000000; "><br />learning_rate:&nbsp;</span><span style="color: #000000; ">1.0e-4</span><span style="color: #000000; "><br />num_train_epochs:&nbsp;</span><span style="color: #000000; ">3.0</span><span style="color: #000000; "><br />lr_scheduler_type:&nbsp;cosine<br />warmup_ratio:&nbsp;</span><span style="color: #000000; ">0.1</span><span style="color: #000000; "><br />bf16:&nbsp;true<br />ddp_timeout:&nbsp;</span><span style="color: #000000; ">180000000</span><span style="color: #000000; "><br /><br />###&nbsp;eval<br />val_size:&nbsp;</span><span style="color: #000000; ">0.1</span><span style="color: #000000; "><br 
/>per_device_eval_batch_size:&nbsp;</span><span style="color: #000000; ">1</span><span style="color: #000000; "><br />eval_strategy:&nbsp;steps<br />eval_steps:&nbsp;</span><span style="color: #000000; ">500</span></div></div><div></div><div>Training script: lora-train-docker-command.sh</div><div><div style="background-color:#eeeeee;font-size:13px;border:1px solid #CCCCCC;padding-right: 5px;padding-bottom: 4px;padding-left: 4px;padding-top: 4px;width: 98%;word-break:break-all"><span style="color: #000000; ">BIN_PATH</span><span style="color: #000000; ">=</span><span style="color: #000000; ">$(cd&nbsp;`dirname&nbsp;$</span><span style="color: #000000; ">0</span><span style="color: #000000; ">`</span><span style="color: #008000; ">;</span><span style="color: #008000; ">&nbsp;pwd)</span><span style="color: #008000; "><br /></span><span style="color: #000000; ">cd&nbsp;$BIN_PATH/<br />pwd<br />cd&nbsp;$LLAMAFACTORY_HOME<br />pwd<br />llamafactory-cli&nbsp;train&nbsp;$BIN_PATH/conf/llama3_lora_sft_docker_command.yaml<br /></span></div></div><div></div><div></div><div>Run this script to start training the model.</div><div></div><div><h3>5. Merge the model</h3></div><div>Merge (export) configuration file: llama3_lora_export_docker_command.yaml</div><div><div style="background-color:#eeeeee;font-size:13px;border:1px solid #CCCCCC;padding-right: 5px;padding-bottom: 4px;padding-left: 4px;padding-top: 4px;width: 98%;word-break:break-all"><span style="color: #000000; ">###&nbsp;model<br />#ModelScope&nbsp;model&nbsp;id<br />model_name_or_path:&nbsp;LLM-Research/Meta-Llama-</span><span style="color: #000000; ">3</span><span style="color: #000000; ">-8B-Instruct<br />#huggingface&nbsp;model&nbsp;id<br />#model_name_or_path:&nbsp;meta-llama/Meta-Llama-</span><span style="color: #000000; ">3</span><span 
style="color: #000000; ">-8B-Instruct<br /><br />adapter_name_or_path: /root/autodl-tmp/my-test/saves/llama3-8b/lora/docker-commnad-nlp/sft<br />template:&nbsp;llama3<br />export_dir: /root/autodl-tmp/my-test/saves/llama3-8b/lora/docker-commnad-nlp/export<br />finetuning_type:&nbsp;lora<br />export_size:&nbsp;</span><span style="color: #000000; ">2</span><span style="color: #000000; "><br />export_device:&nbsp;gpu<br />export_legacy_format:&nbsp;False</span></div></div><div></div><div>Merge script: lora-export-docker-command.sh</div><div><div style="background-color:#eeeeee;font-size:13px;border:1px solid #CCCCCC;padding-right: 5px;padding-bottom: 4px;padding-left: 4px;padding-top: 4px;width: 98%;word-break:break-all"><span style="color: #000000; ">BIN_PATH</span><span style="color: #000000; ">=</span><span style="color: #000000; ">$(cd&nbsp;`dirname&nbsp;$</span><span style="color: #000000; ">0</span><span style="color: #000000; ">`</span><span style="color: #008000; ">;</span><span style="color: #008000; ">&nbsp;pwd)</span><span style="color: #008000; "><br /></span><span style="color: #000000; ">cd&nbsp;$BIN_PATH/<br />pwd<br />llamafactory-cli&nbsp;export&nbsp;conf/llama3_lora_export_docker_command.yaml</span></div></div><div></div><div></div><div></div><div></div><div></div><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/paulwong/" target="_blank">paulwong</a> 2024-07-08 18:44 <a href="http://www.blogjava.net/paulwong/archive/2024/07/08/451463.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item></channel></rss>