A really good and concise deep dive into RLHF in LLM post-training, Proximal Policy Optimization (PPO), and Group Relative Policy Optimization (GRPO)
https://yugeten.github.io/posts/2025/01/ppogrpo/
#llm
https://www.anthropic.com/research/tracing-thoughts-language-model
Anthropic's LLM interpretability research arrives at quite a few interesting conclusions. For a TL;DR, read this blog post; if you're interested, check out the two corresponding papers, which have more detail and nicely done interactive pages. #llm
https://transformer-circuits.pub/2025/attribution-graphs/biology.html
https://transformer-circuits.pub/2025/attribution-graphs/methods.html
Anthropic
Tracing the thoughts of a large language model
Anthropic's latest interpretability research: a new microscope to understand Claude's internal mechanisms
ysymyth.github.io
The Second Half
tldr: We’re at AI’s halftime.
Truly a thought-provoking piece, from the author of τ-bench.
https://ysymyth.github.io/The-Second-Half/ #ai
So what’s suddenly different now?
In three words: RL finally works. More precisely: RL finally generalizes. After several major detours and a culmination of milestones, we’ve landed on a working recipe to solve a wide range of RL tasks using language and reasoning.
The second half of AI — starting now — will shift focus from solving problems to defining problems. In this new era, evaluation becomes more important than training. Instead of just asking, “Can we train a model to solve X?”, we’re asking, “What should we be training AI to do, and how do we measure real progress?” To thrive in this second half, we’ll need a timely shift in mindset and skill set, ones perhaps closer to a product manager.
It turned out the most important part of RL might not even be the RL algorithm or environment, but the priors, which can be obtained in a way totally unrelated from RL (LLMs).
https://newsletter.pragmaticengineer.com/p/the-philosophy-of-software-design
John Ousterhout, author of A Philosophy of Software Design, joins The Pragmatic Engineer. #podcast #software_design
Pragmaticengineer
The Philosophy of Software Design – with John Ousterhout
Stanford professor John Ousterhout explains why thoughtful software design matters more than ever as AI tools transform coding practices and developer workflows.
https://arxiv.org/abs/2305.18290 #llm #ai
Did a deep dive into DPO today, and was once again struck by how much solid math fundamentals matter for AI/ML research…
The original RLHF pipeline uses pairwise human preference data (which of A and B is better) to train a reward model, then trains the main policy model with RL, where the objective is to minimize negative log likelihood plus regularization (PPO, for example, regularizes via the KL divergence between the new and old policies). The downside: RL is notoriously hard to get right, and you also need a critic model to predict rewards, making the whole system very complex.
DPO's idea starts from the observation that the RLHF objective is essentially minimizing a loss over a (latent) reward function. Through a reparameterization and some mathematical derivation, it redesigns this as a loss minimized directly over the policy, bypassing the intermediate reward model entirely. Gradient updates then directly increase the probability that the policy model generates the winner response and decrease that of the loser response, greatly simplifying the pipeline.
Further reading:
- KTO: goes a step further; no pairwise comparisons needed, preferences can be learned from upvotes/downvotes on individual examples alone.
- IPO: addresses DPO's tendency to overfit.
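The DPO loss described above fits in a few lines. Here is a minimal sketch in plain Python with scalar log-probabilities; the function name, argument names, and `beta=0.1` are my own illustrative choices, not taken from the paper's reference implementation:

```python
import math

def dpo_loss(policy_logp_w, policy_logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """DPO loss for one (winner, loser) preference pair.

    Each argument is the total log-probability of a full response under
    the current policy or the frozen reference model.
    """
    # Implicit reward of each response: beta * log(pi / pi_ref)
    reward_w = beta * (policy_logp_w - ref_logp_w)
    reward_l = beta * (policy_logp_l - ref_logp_l)
    # Minimizing -log(sigmoid(margin)) pushes the policy to raise
    # p(winner) and lower p(loser) relative to the reference model,
    # with no reward model or critic in the loop.
    margin = reward_w - reward_l
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# The policy already prefers the winner more than the reference does,
# so the loss falls below the indifference value -log(0.5) ≈ 0.693:
print(round(dpo_loss(-10.0, -14.0, -12.0, -12.0), 3))  # 0.513
```

Note how the KL regularization of RLHF survives implicitly: the log-ratios against the frozen reference model keep the policy from drifting too far from it.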
arXiv.org
Direct Preference Optimization: Your Language Model is Secretly a...
While large-scale unsupervised language models (LMs) learn broad world knowledge and some reasoning skills, achieving precise control of their behavior is difficult due to the completely...
https://www.youtube.com/watch?v=lcjdwSY2AzM
This episode's take on the principle of least action is quite original, and it also highlights the contributions of several scientists who rarely get mentioned 👍
YouTube
The Biggest Misconception in Physics
Why does energy disappear in General Relativity? 👉 Use code VERITASIUM to get 50% off your first monthly KiwiCo Crate! https://www.kiwico.com/VERITASIUM
https://store.steampowered.com/app/1569580/Blue_Prince/
Highly recommended: the most stunning game I've played so far in 2025. It combines puzzle-solving with roguelike mechanics, and the puzzles have many layers of depth. Fun and replayable 💯
#game
Steampowered
Save 25% on Blue Prince on Steam
Welcome to Mt. Holly, where every dawn unveils a new mystery. Navigate through shifting corridors and ever-changing chambers in this genre-defying strategy puzzle adventure. But will your unpredictable path lead you to the rumored Room 46?
Interesting opinion piece. I'm most impressed by the sheer number of links in this post 😅
https://www.latent.space/p/clippy-v-anton
www.latent.space
Please stop forcing Clippy on those who want Anton
ChatGPT-4o's glazing embarrassment lays open Clippy vs Anton: The two extremes of desires in AI post-training and product
Forwarded from 散步中
A friend asked me to be a guest on his podcast. I told him I'd never recorded one, and that I wasn't being modest, he should really find someone better. But he said he wanted to talk about the experience of moving to SF, so I said: that I can do:
https://www.xiaoyuzhoufm.com/episode/680eee0d7a449ae8581a3820
Xiaoyuzhoufm
04. Working in the South Bay, so why live in San Francisco?
Listen to Bay Area人文活动汇总 on Xiaoyuzhou: a comprehensive guide to local culture and life in the San Francisco Bay Area.
Weekly event details are posted on the WeChat official account: Bay Area人文活动汇总
You can also visit our event calendar: https://bay-area-human.vercel.app/calendar
koomen.dev
AI Horseless Carriages | koomen.dev
An essay about bad AI app design
https://koomen.dev/essays/horseless-carriages/
I do think drawing analogies between the Industrial Revolution and the AI era is a bit cliché by now, but this piece's central argument and examples are on point, and the interactive elements are a nice touch.
In most AI apps, System Prompts should be written and maintained by users, not software developers or even domain experts hired by developers.
julian.digital
The case against conversational interfaces
Conversational interfaces are a bit of a meme. Every couple of years a shiny new AI development emerges and people in tech go "This is it! The next computing paradigm is here! We'll only use natural language going forward!". But then nothing actually changes…
https://julian.digital/2025/03/27/the-case-against-conversational-interfaces/
Read this alongside the previous one. The title is a bit clickbaity (the author admits as much), but it's actually a thoughtful take on what kind of UX best unlocks AI's usefulness.
AI should function as an always-on command meta-layer that spans across all tools. Users should be able to trigger actions from anywhere with simple voice prompts without having to interrupt whatever they are currently doing with mouse and keyboard.
Productivity and collaboration shouldn’t be two separate workflows.
P.S. All of this blogger's posts are excellent, e.g. https://julian.digital/2023/07/06/multi-layered-calendars/ and https://julian.digital/2020/09/04/a-meta-layer-for-notes/
Forwarded from C’s Random Collection
New landing page design and live at https://deeptime.now 🎉 and deeptime is now in beta, all features are free! Sign up today! #DeeptimeNow
https://store.steampowered.com/app/1425350/Botany_Manor/
A bit short, and the puzzles lean easy, but I love it: gorgeous art, a fresh and relaxing experience, with a nice feminist narrative woven into the background.
Steampowered
Botany Manor on Steam
Welcome to Botany Manor, a stately home in 19th century England. You play as inhabitant Arabella Greene, a retired botanist. Explore your house and gardens, filled with research, to figure out the ideal habitat of forgotten flora. Grow each plant to discover…
Mary Meeker's first Trends report since 2019. 340 slides on the state of AI.
https://www.bondcap.com/reports/tai
https://store.steampowered.com/app/2008920/Lorelei_and_the_Laser_Eyes/
This year I seem to keep playing the same kind of game: puzzle games set in a manor or similar enclosed space, like Blue Prince and Botany Manor.
But today's recommendation, Lorelei and the Laser Eyes, is one of my favorite games of recent years.
The puzzles lean easy, so hardcore puzzle fans who don't click with it may find it a miss. Still, the experience of gradually piecing scattered clues into answers is wonderful.
The art style is superb, and halfway through, the narrative even starts reflecting on art history.
Anything from #Annapurna is guaranteed quality
Steampowered
Save 35% on Lorelei and the Laser Eyes on Steam
The stage is set. Imagine an old baroque manor, perhaps a hotel or a museum, somewhere in central Europe. A woman wanders in search of answers.
"Metacognitive sensitivity" 💯
https://x.com/fchollet/status/1932332984935625197?s=46
X (formerly Twitter)
François Chollet (@fchollet) on X
The rate at which you learn is to a great extent a function of your metacognitive sensitivity -- your propensity to introspect and critique your own mental models and learning processes
Naming is extremely important in Computer Science and, frankly, everything. Good naming is hard. Being able to pick a good name shows a lot of good taste.
Context engineering (a term promoted by Karpathy: https://vxtwitter.com/karpathy/status/1937902205765607626) is much better than:
- Prompt engineering: "Prompt" is just too overloaded.
- In-context learning: This is more of a research term, and it feels awkward as a description of the engineering required to build good LLM applications.
- RAG: Today, when done right, RAG is a very specific kind of context engineering. But too many people conflate it with "anything that puts stuff in the prompt".
Image credit: https://github.com/humanlayer/12-factor-agents/blob/main/content/factor-03-own-your-context-window.md
https://hbr.org/2025/04/how-people-are-really-using-gen-ai-in-2025
Scroll to the bottom for a quick scan of the hundred ways people are using AI today. Good for spotting your own blind spots and finding inspiration.
Harvard Business Review
How People Are Really Using Gen AI in 2025
Last year, HBR published a piece on how people are using gen AI. Much has happened over the past 12 months. We now have Custom GPTs—AI tailored for narrower sets of requirements. New kids are on the block, such as DeepSeek and Grok, providing more competition…