﻿<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/"><channel><title>BlogJava-paulwong-Post Category-AI-DEEPSEEK</title><link>http://www.blogjava.net/paulwong/category/55403.html</link><description /><language>zh-cn</language><lastBuildDate>Sun, 16 Feb 2025 09:33:57 GMT</lastBuildDate><pubDate>Sun, 16 Feb 2025 09:33:57 GMT</pubDate><ttl>60</ttl><item><title>Full-Strength DeepSeek R1: Where to Find It Across the Web</title><link>http://www.blogjava.net/paulwong/archive/2025/02/15/451580.html</link><dc:creator>paulwong</dc:creator><author>paulwong</author><pubDate>Sat, 15 Feb 2025 15:10:00 GMT</pubDate><guid>http://www.blogjava.net/paulwong/archive/2025/02/15/451580.html</guid><wfw:comment>http://www.blogjava.net/paulwong/comments/451580.html</wfw:comment><comments>http://www.blogjava.net/paulwong/archive/2025/02/15/451580.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/paulwong/comments/commentRss/451580.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/paulwong/services/trackbacks/451580.html</trackback:ping><description><![CDATA[<div>Official site</div>
<div><a href="https://chat.deepseek.com" target="_blank">https://chat.deepseek.com</a><br />
</div>
<div><br />
</div>
<div>Tencent, client download required</div>
<div><a href="https://ima.qq.com" target="_blank">https://ima.qq.com</a><br />
</div>
<div><br />
</div>
<div>Alibaba, requires building your own chat app, web version available</div>
<div><a href="https://tbox.alipay.com/" target="_blank">https://tbox.alipay.com/</a><br />
</div>
<div><br />
</div>
<div>askmanyai</div>
<div><a href="https://askmanyai.cn" target="_blank">https://askmanyai.cn</a><br />
</div>
<div><br />
</div>
<div>360 Nano Search (360纳米搜索), no web version, the app must be downloaded separately</div>
<div><br />
</div>
<div><br />
</div>
<img src ="http://www.blogjava.net/paulwong/aggbug/451580.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/paulwong/" target="_blank">paulwong</a> 2025-02-15 23:10 <a href="http://www.blogjava.net/paulwong/archive/2025/02/15/451580.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>The Math Behind DeepSeek: A Deep Dive into Group Relative Policy Optimization (GRPO)</title><link>http://www.blogjava.net/paulwong/archive/2025/02/08/451576.html</link><dc:creator>paulwong</dc:creator><author>paulwong</author><pubDate>Fri, 07 Feb 2025 16:13:00 GMT</pubDate><guid>http://www.blogjava.net/paulwong/archive/2025/02/08/451576.html</guid><wfw:comment>http://www.blogjava.net/paulwong/comments/451576.html</wfw:comment><comments>http://www.blogjava.net/paulwong/archive/2025/02/08/451576.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/paulwong/comments/commentRss/451576.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/paulwong/services/trackbacks/451576.html</trackback:ping><description><![CDATA[&nbsp;&nbsp;&nbsp;&nbsp; Summary: This post takes a close look at the mathematics behind Group Relative Policy Optimization (GRPO), the core reinforcement learning algorithm driving DeepSeek's strong reasoning ability. We break down how GRPO works, its key components, and why it is a game changer for training advanced large language models (LLMs). GRPO fundamentals: what is GRPO? Group Relative Policy Optimization (GRPO) is a reinforcement learning (RL) algorithm designed specifically to improve the reasoning ability of large language models (LLMs). Unlike traditional RL methods, which rely heavily on external...&nbsp;&nbsp;<a href='http://www.blogjava.net/paulwong/archive/2025/02/08/451576.html'>Read the full post</a><img src ="http://www.blogjava.net/paulwong/aggbug/451576.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/paulwong/" target="_blank">paulwong</a> 2025-02-08 00:13 <a href="http://www.blogjava.net/paulwong/archive/2025/02/08/451576.html#Feedback" target="_blank" 
style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>DeepSeek Resources</title><link>http://www.blogjava.net/paulwong/archive/2025/02/02/451569.html</link><dc:creator>paulwong</dc:creator><author>paulwong</author><pubDate>Sun, 02 Feb 2025 11:22:00 GMT</pubDate><guid>http://www.blogjava.net/paulwong/archive/2025/02/02/451569.html</guid><wfw:comment>http://www.blogjava.net/paulwong/comments/451569.html</wfw:comment><comments>http://www.blogjava.net/paulwong/archive/2025/02/02/451569.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/paulwong/comments/commentRss/451569.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/paulwong/services/trackbacks/451569.html</trackback:ping><description><![CDATA[<div>Because DeepSeek's large models are trained with the GRPO algorithm, their GPU memory requirements are greatly reduced.</div>
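<div>A minimal Python sketch of where that saving is generally understood to come from: GRPO scores each sampled answer against the other answers drawn for the same prompt, so no separate value (critic) model has to be held in GPU memory. The function name, the sample rewards, and the use of the population standard deviation are illustrative assumptions, not DeepSeek's code.</div>

```python
import statistics


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the mean and std of its own group.

    `rewards` holds one scalar reward per answer sampled for the same prompt.
    The group statistics play the role of the baseline, so no learned critic
    is needed (assumption: population std; eps avoids division by zero).
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


# Hypothetical rewards for four answers to one prompt (1.0 = correct).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

<div>Answers that beat their group's average get a positive advantage and the rest get a negative one; these values then weight the policy update.</div>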
<div><br />
</div>
<div>[DeepSeek] Reproducing DeepSeek R1? Check out this hands-on guide to the Open R1 project</div>
<div><a href="https://blog.csdn.net/qq_38961840/article/details/145388142" target="_blank">https://blog.csdn.net/qq_38961840/article/details/145388142</a><br />
</div>
<div><br />
</div>
<div>!!! Hands-on LLM reinforcement learning: using GRPO (the algorithm that made DeepSeek R1 famous)</div>
<div><a href="https://blog.csdn.net/qq_38961840/article/details/145390704" target="_blank">https://blog.csdn.net/qq_38961840/article/details/145390704</a><br />
</div>
<div><br />
</div>
<div>[DeepSeek] The GRPO algorithm explained in one article: why does it reduce the resources needed to train large models?</div>
<div><a href="https://blog.csdn.net/qq_38961840/article/details/145384852" target="_blank">https://blog.csdn.net/qq_38961840/article/details/145384852</a><br />
</div>
<div><br />
</div>
<div>DeepSeek R1 series</div>
<div><a href="https://blog.csdn.net/qq_38961840/category_12885087.html" target="_blank">https://blog.csdn.net/qq_38961840/category_12885087.html</a><br />
</div>
<div><br />
</div>
<div><br />
</div>
<img src ="http://www.blogjava.net/paulwong/aggbug/451569.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/paulwong/" target="_blank">paulwong</a> 2025-02-02 19:22 <a href="http://www.blogjava.net/paulwong/archive/2025/02/02/451569.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item></channel></rss>