The AI Paradox

More automation, more humans, more work

AI 悖论:自动化越多,反而需要更多人,工作也更多

Lenny's Podcast · with Dan Shipper (CEO of Every)
双语精读 + 词汇笔记 + 原声节选跟读 · 中级英语学习版
时长 1 小时 34 分钟 · 10 个核心片段

How to Use This Guide · 使用说明

三遍学习法

第一遍|盲听原片:在 YouTube/Spotify 打开 Lenny & Dan 的对话原片,先不看字幕只听大意。

第二遍|对照精读:边听边读英文原文,遇生词查词汇表,再对照中文翻译核对理解。

第三遍|跟读输出:点击每段上方的 ▶ 播放原声节选(Lenny & Dan 真人录音),可切换 0.75× / 1.0× / 1.25× / 1.5× 倍速跟读;最后合上中文,用 3–5 句英文复述核心观点。

每周节奏推荐:一周吃透 2 个片段(约 30–40 分钟/天)。重点积累商务英语 + 科技口语 + 表达地道度三类词汇。

#01

Cold Open: The Big Predictions

冷开场:核心预测速览
⏱ 0:00 – 1:33 · 原片节选 (Lenny & Dan)
原声节选,可调速跟读
ENGLISH

LENNY: The last time you were on this podcast, you had this hot take that people were sleeping on Claude Code. You were so unbelievably right. The premise of this episode is we're going to go through what else you predict will happen.

DAN: The AI jobpocalypse is not really a thing. I am super, super bullish on PMs and full-stack designers.

LENNY: You guys are hiring, doubled in people in the past year, which is not what people would've expected from a company that is so AI-forward.

DAN: I'm simultaneously extremely AI-pilled and very bullish on humans. Automation is a lie. Every agent needs a human. We have so much automation, so much AI, and I also work way more.

LENNY: Creativity, it just feels like it's going to be more and more valuable to stand out from all the slop that people are shipping and launching constantly.

DAN: What models do in general is they make yesterday's human competence cheap. And so it becomes commoditized. It's not valuable anymore. What humans do is we go in there and we're like, 'Yeah, we have all this frozen human competence from yesterday. How do I use this to make something new and interesting?'

DAN: It's going to bifurcate in two main ways. One is everyone's going to have at least one agent that they talk to that they can offload work to. Second is that most of the work that you do is actually going to happen on your computer in an environment like Codex or Claude Cowork.

DAN: I think the SaaSpocalypse is dumb. I would buy SaaS stocks right now. What agents do is increase the number of users of SaaS, not get rid of it.

DAN: We speed-ran the CLI era. It was nice while it lasted, but I think CLIs are over.

中文翻译

Lenny:你上次来播客时有个「大胆观点」——大家都低估了 Claude Code。结果你说得太对了。这一期我们要听听你接下来还预测什么。

Dan:所谓「AI 失业大灾难」其实不存在。我极度看好产品经理和全栈设计师。

Lenny:你们公司还在招人,过去一年人数翻了一倍,这可不是大家对一家如此 AI-forward 的公司的预期。

Dan:我是 AI 重度信徒,但同时也极度看好人类。「自动化」是个谎言。每个 agent 都需要一个人。我们有这么多自动化、这么多 AI,可我反而工作得更多。

Lenny:在大家不停疯狂出货的 AI 垃圾内容里,「创造力」会越来越值钱,因为它能让你脱颖而出。

Dan:模型干的事,就是把「昨天人类的能力」变得很便宜。它就被大宗商品化了,不再值钱。而人类要做的,是拿着这些「冷冻的昨日能力」,去捏出新东西、有意思的东西。

Dan:未来工作会朝两个方向「分叉」:第一,每个人在公司里至少有一个可以对话、可以把工作委派出去的 agent;第二,你大部分工作其实会发生在你的电脑里,在 Codex 或 Claude Cowork 这样的环境里。

Dan:所谓「SaaS 末日论」很蠢。我现在就会买 SaaS 股票。Agent 做的事是增加 SaaS 的用户数,而不是干掉 SaaS。

Dan:我们以「速通」的方式过完了 CLI 时代。它在的时候挺美好,但我觉得 CLI 已经结束了。

Vocabulary & Expressions · 地道表达 & 高频词汇

Word / Phrase | POS | 释义 / 用法
hot take | n. | 大胆/反主流的观点
sleep on sth. | phr. | 忽视、低估某事物
premise | n. | 前提、(节目/论述的)主旨
jobpocalypse | n. | (造词)「就业大灾难」(job + apocalypse)
bullish on | phr. | 看好、看涨(金融用语,引申到职场)
AI-forward | adj. | 高度拥抱 AI 的(公司/做法)
AI-pilled | adj. | (俚语)AI 重度信徒(pilled 源自《黑客帝国》red pill 的网络梗)
simultaneously | adv. | 同时地
slop | n. | (口语)粗制滥造的内容;AI 垃圾输出
ship / launch | v. | 发布、上线(产品圈高频词)
competence | n. | 能力、胜任水平
commoditized | adj. | 被大宗商品化的、被去差异化的
bifurcate | v. | 分叉、一分为二
offload work to | phr. | 把工作转嫁/委派给
speed-run | v. | (游戏术语)以最快速度通关;引申「火速走完一个时代」
be over | phr. | 结束了、过气了
#02

How Every 'Lives in the Future'

Every 公司是怎样「活在未来」的
⏱ 4:29 – 7:48 · 原片节选 (Lenny & Dan)
ENGLISH

DAN: One of the things about predicting the future at Every is that what you don't want to do is prognosticate. What you want to do instead is just live in it together. So everybody at Every is an AI early adopter. We're almost 30 people now. When we did our interview, we were 15, so we've doubled in size in the last year.

We're all early adopters and we have engineers, designers, writers, editors, salespeople, customer service people — and everybody has a little bit of that, whatever that thing is where you're just like, 'Oh, I like to explore. I like to experiment. I'm very curious and I'm super all-in on AI.'

What that does is it creates this little pocket of the future where we're all living in it together and we get to be a little bit further ahead, because at any other company, there's a mix: there's early adopters, there's the middle of the pack people, and there's people who are very anti.

Another thing that happens, which is really cool, is because of our role reviewing models and being a little bit of a pacemaker in AI, we get access to stuff before it comes out. We get to beta test and alpha test and help steer the direction of where things are going a little bit.

So when I think about predicting the future, when you create an environment like that, it's actually just about noticing what's going on. A core part of it is writing about it. Articulating what you're noticing, articulating the future brings it about in a way that makes it real for you, your team, and anybody on the internet who's reading it.

One of the things we talk about internally is what I call 'the reach test' — when you wake up in the morning, do you reach for it organically?

LENNY: I love this combination of you are using the latest stuff, and you're good at being self-aware of here's what's weird and new and different and interesting. That's a really cool combination, partly because you have to write about it and you write about it. So that's the perfect recipe for someone having a sense of where things are going.

中文翻译

Dan:在 Every 我们预测未来的方法是——别去「占卜」,而是和未来「一起生活」。Every 全公司都是 AI 早期采用者。我们现在差不多 30 人,上次采访时是 15 人,一年翻倍。

我们这里有工程师、设计师、写作者、编辑、销售、客服,每个人身上都带着那种气质:「我爱探索、爱试,对 AI 全情投入。」

这样一来,就形成了一个「未来的小气泡」,大家一起活在里面,所以我们比一般公司略微领先一点。其他公司总是混合的:早期采用者、中间派、强烈反对派都有。

还有一个很酷的点:因为我们在帮模型公司做评测、扮演「节奏带动者」,我们能比别人更早接触到还没发布的东西,可以参与 alpha / beta 测试,对产品走向施加一点影响。

所以预测未来这件事,当你身处这种环境,其实就是「观察正在发生什么」。重要的一环是「把它写下来」。把你看到的东西讲清楚、把未来描述出来,这件事本身就会让它对你、对团队、对每一个网上的读者变得真实。

我们内部有个说法叫「reach test(伸手测试)」:你早上起床,会不会下意识就去伸手用它?

Lenny:我喜欢你这种组合:你既在用最新的东西,又有自我觉察能力——能发现「这玩意又怪又新又有意思」。这是非常厉害的组合,部分原因是你需要写、并且确实在写。这就是预判未来的完美配方。

Vocabulary & Expressions · 地道表达 & 高频词汇

Word / Phrase | POS | 释义 / 用法
prognosticate | v. | 预测、占卜(偏书面/讽刺色彩)
early adopter | n. | 早期采用者(科技产品圈高频词)
middle of the pack | phr. | 中游、中间派
pacemaker | n. | 节奏带动者(原义起搏器/领跑者)
steer the direction | phr. | 引导/掌舵方向
articulate | v. | 清晰表达(高级商务英语高频词)
bring sth. about | phr. | 促成、使之发生
organically | adv. | 自然而然地、有机地
self-aware | adj. | 有自我觉察力的
recipe for | phr. | 做……的配方/秘诀(隐喻)
sense of where things are going | phr. | 对趋势的判断力
#03

The Super-Agent Inside Slack

Slack 里的「超级 Agent」
⏱ 10:29 – 16:24 · 原片节选 (Lenny & Dan)
ENGLISH

DAN: One of my favorite questions. If you look at the benchmarks, AI is going to just take all of our jobs. METR has this really cool benchmark — it measures how long the newest models can do tasks autonomously. Mythos preview, the big Anthropic model, can do tasks of 17 hours at 50% accuracy. Holy shit, that's crazy.

And I think it is real. Model progress is going up exponentially. But my feeling is that we will look back in a year and say humans actually have a lot more work to do, even as models get better. There's a really interesting paradox there.

My big prediction is it's going to bifurcate in two ways. One: you're going to have at least one agent in your company that you talk to and offload work to. Two: most of the work you do will happen on your computer in an environment like Codex or Claude Cowork.

When OpenClaw first came out, I was very convinced that everyone has their own agent — a parallel shadow org chart. Did you ever read The Golden Compass? It's like having a little daemon on your shoulder, a little part of your soul. I really thought that was happening.

But I have completely flipped. Now I really think the model is going to be a super-agent — one agent for the entire company. Shopify famously has one. Ramp has one now. There's all this hype with personal agents like OpenClaw, then everyone realizes it's way too much work. This thing breaks all the time. I have to fumble around, SSH into my server, blah blah blah. Most people just don't want to spend that time.

The fundamental, underlying thing is: in order for an AI agent to be useful right now, it really needs a human who cares about it. It needs a human personal connection — someone watching what it does, making sure it's doing the right thing. The minute you sever that connection, the agent is not really useful anymore.

That's why it has shifted to a one-agent-per-company model. You set up a forward deployed engineer who's responsible for making sure that agent works for the whole company. As models get better at being independent, that will trickle down — we'll have more personal agents because we won't have to fuck around with all the internals. The mechanism is: agents need people who care about them.

LENNY: That is so interesting — you need to garden your agent because there's context you have to keep adding to it.

中文翻译

Dan:这是我最喜欢的问题之一。看基准测试的话,AI 似乎要把我们的工作全抢走。METR 有个很酷的 benchmark,衡量最新模型能自主完成多长时间的任务。Anthropic 的大模型 Mythos preview 能完成时长 17 小时的任务,准确率 50%。妈呀,太疯狂了。

这是真的。模型进步是指数级的。但我的感觉是——一年后回头看,人类反而有更多事要做,即便模型变得更强。这中间存在一个非常有意思的悖论。

我的大预测是「分叉」成两条路。第一:公司里你至少有一个 agent,可以跟它对话、把活儿交给它;第二:你大部分工作其实会发生在电脑里 Codex 或 Claude Cowork 这种环境中。

OpenClaw 刚出来时,我坚信「人手一个 agent」——一个平行的影子组织架构。你看过《黑暗物质三部曲(金罗盘)》吗?就像每个人肩膀上都有一只精灵,是你灵魂的一部分。我真的以为正在发生这件事。

但我现在彻底反转了。我认为现阶段会是「超级 agent」模式——整个公司共用一个。Shopify 有一个很出名的,Ramp 现在也有了。OpenClaw 那种个人 agent 一开始很火,但人们发现「太麻烦了」,老是坏,要 SSH 到服务器去鼓捣。大多数人不想花这个时间。

底层的本质是:现阶段一个 AI agent 想要真正有用,必须有一个真心在意它的人——盯着它干啥、确保它干对的事。一旦你切断这种连接,agent 就立刻不好用了。

这就是为什么它转向了「一公司一 agent」。你雇一个 forward deployed engineer(前线部署工程师),由他负责让这个 agent 在全公司跑得起来。等模型独立性更强,这种模式才会下沉,让我们不用再去折腾内部细节,每个人才能拥有自己的 agent。机制就是:agent 需要有人在乎它。

Lenny:太有意思了——你得「侍弄(garden)」你的 agent,因为得不停往里加上下文。

Vocabulary & Expressions · 地道表达 & 高频词汇

Word / Phrase | POS | 释义 / 用法
benchmark | n. | 基准测试(科技圈高频词)
autonomously | adv. | 自主地
paradox | n. | 悖论
bifurcate | v. | 分叉
shadow org chart | phr. | 影子组织架构
daemon | n. | (《金罗盘》设定中的)灵魂精灵;计算机里的「守护进程」
flip (one's view) | v. | (观点)彻底反转
hype | n. | 炒作、大肆宣传
fumble around | phr. | 笨手笨脚地摆弄
sever | v. | 切断(连接、关系)
forward deployed engineer | n. | 前线部署工程师(驻客户/驻业务的工程师)
trickle down | phr. | 向下渗透/流淌
garden (a system) | v. | (隐喻)像园艺一样持续侍弄、维护一个系统
#04

Codex/Claude Code as the New OS for Work

Codex/Claude Code 成为工作的新「操作系统」
⏱ 18:09 – 23:45 · 原片节选 (Lenny & Dan)
ENGLISH

DAN: I'm so excited about this one. Anthropic realized that with Claude Code, if you put an agent on your computer, it has access to everything you have access to. It uses the terminal, so it has superpowered access. These agents really understand how to use the terminal because there's so much content online about it — that created a super powerful coding paradigm.

OpenAI was, in my opinion, very behind on this, then has surpassed them recently. When people were still thinking about coding agents as pair programmers, Anthropic was among the first to say no, agents should run on your computer. There were people before, like Devin, who had the big cloud environment, but real adoption happened when you put it on your computer.

Once you have a coding agent on your computer that can build anything, it's actually really good for any kind of work. People started just hacking Claude Code to do all of their work. Anthropic then built Cowork — a nicer wrapping around Claude Code, fundamentally the same thing.

OpenAI's earlier Codex was very technical, super smart, but a little bit autistic. They didn't quite get what you meant — they got exactly what you said. Around the time they launched 5.3, they moved in this direction: 'Oh no, we get it. This model is fast, it's really good for general purpose knowledge work.' Then they launched the Codex desktop app.

Codex right now is my daily driver. I spend all my time in it. When I'm writing a document, Codex has an in-app browser. I open it, go to my doc (I usually do it in Proof, an online markdown editor I built), and Codex is running and watching me. I can see what Codex is doing. It's all in one place. I feel like I have this parallel work buddy that can respond, write, do research, use my computer.

I've been at inbox zero for 10 days straight, which if you know me is crazy. I literally have Codex gather all my emails with Cora, our email agent. It renders a little page, and I just monologue into it: 'Okay, go research this. Here's a question from our lawyers. Can you go collect all the documents from the last four years?' And it just does it. All the stuff I would procrastinate on, I don't really procrastinate on anymore.

For a long time, I thought the optimal experience of AI was to take AI and put it in a browser. The reverse is starting to happen and be really valuable: take the AI agent that you use all the time on your computer and put a browser in it so it can see everything you're doing. That's a magical combination.

中文翻译

Dan:这一段我超级兴奋。Anthropic 意识到——把 Claude Code 这种 agent 放到你电脑上跑,它能拿到你拥有的所有权限。它用 terminal,等于「超级权限访问」。

而且模型对 terminal 很在行,因为网上关于命令行的内容海量,模型训练数据足。这一下创造了一个超强的编码范式。

OpenAI 一段时间在这件事上落后很多,最近反超了。当大家还把编码 agent 想成「结对编程伙伴」时,Anthropic 是最早一批说「不,agent 要跑在你机器上」的。之前像 Devin 那种走云端大环境,但真正爆发还是放到你自己电脑上才发生。

一旦电脑上有了一个能盖任何东西的编码 agent,你会发现——它对任何工作都好用。人们开始疯狂「魔改」Claude Code 来做各种事。然后 Anthropic 做了 Cowork,本质就是 Claude Code 的更友好外壳。

OpenAI 早期版本的 Codex 非常技术、非常聪明,但有点「直男」——你说什么它就严格按字面做,听不出言外之意。大概 5.3 版本前后,他们转向:「我们懂了,这个模型快、适合通用知识工作。」然后推出了 Codex 桌面应用。

现在 Codex 是我的「日常主力」。我几乎全天泡在里面。写文档时,Codex 有一个内嵌浏览器,我打开它,跳到我自己做的在线 Markdown 编辑器 Proof,让 Codex 一边运行一边看着我。我看着它干啥,它看着我干啥,全部在同一个地方——这是 Claude Code 最初体验的延伸。我感觉自己有了一个「并行的工作搭档」。

我连续 10 天 inbox zero 了,认识我的人都知道这有多疯狂。我让 Codex 配合我们自家的邮件 agent Cora 把所有邮件拉出来,渲染成一张小页面,我就对着它「自言自语」:「去查一下这个。这是律师提的一个问题。你能把过去四年的所有文档都收集起来吗?」它就照做了。我以前会拖延的事,现在基本不拖了。

我一度以为 AI 的最佳形态是「把 AI 放到浏览器里」。结果反过来才有意思:把你电脑上常用的 AI agent 装一个浏览器进去,让它能看见你在干什么。这种组合非常神奇。

Vocabulary & Expressions · 地道表达 & 高频词汇

Word / Phrase | POS | 释义 / 用法
paradigm | n. | 范式(学界/科技圈高频词)
pair programmer | n. | 结对编程伙伴
daily driver | n. | (俚语)「主力工具/座驾」(原指最常开的车)
monologue into sth. | v. | 对着……自言自语
procrastinate | v. | 拖延
inbox zero | n. | 邮箱清零(一种工作目标)
autistic | adj. | (俚语化)字面理解、不解风情(此处口语,不严格指医学术语)
hack (a tool) | v. | 魔改、巧妙利用
wrapping | n. | 封装/外壳
parallel work buddy | phr. | 并行工作搭档
affordance | n. | (设计学)可供性、能让用户/agent 做什么的特性
magical combination | phr. | 神奇的组合
#05

SaaS Lives Inside the Agent

SaaS 将运行在 Agent 之内
⏱ 23:45 – 25:39 · 原片节选 (Lenny & Dan)
ENGLISH

LENNY: This is more profound than it may even sound. Instead of AI being baked into SaaS tools, you're predicting that SaaS tools will run within Codex or Claude Code?

DAN: That is one really important second-order effect. Yeah, I'm using Proof or PostHog or whatever inside of my agent, and the agent has access to the website. It has access to everything I have access to, and it has access to my whole computer. When I run the agent on that website, I'm using my tokens — I'm not using the vendor's tokens, I'm not using the app's tokens.

So it puts SaaS back in this place where, yeah, you want to make it friendly for an agent — everyone's got a CLI now, you want to make the HTML really usable, make sure anything that happens in the CLI shows up for the user immediately. There are a lot of issues to deal with. But once you do that, you actually don't really need to think about having an AI surface that's primarily going to be the thing that users use, in the sense that you don't need to build an agent necessarily into your product.

There's another really interesting bifurcation: having two agents is better than one. With Proof, anyone who uses it, I don't pay for tokens because they bring their AI to Proof. It changes what you build as a SaaS company — you build it now for both humans and agents to use at the same time, and it changes your margins back to: well, I don't really have to pay for tokens anymore because the user's going to bring AI.

中文翻译

Lenny:这比听上去要深刻得多。你的预测不是「AI 被嵌进 SaaS 工具」,而是反过来——SaaS 工具会运行在 Codex 或 Claude Code 之内?

Dan:这是个非常重要的「二阶效应」。我在 agent 里用 Proof、PostHog 之类的,agent 能访问那个网站,能拿到我所有的权限,还能访问我整台电脑。我在那个网站上跑 agent 时,烧的是「我自己的 token」,不是厂商或应用方的 token。

这就把 SaaS 重新放回一个位置:你要让产品对 agent 友好——大家现在都有 CLI 了,你要把 HTML 写得对 agent 可读,CLI 里发生的事要立刻反映在用户界面上。要处理的细节很多。但你一旦这样做,你就不必非得在产品里硬塞一个 AI 界面,让它成为用户主要的使用入口。

还有一个有意思的「分叉」:两个 agent 比一个好。比如用 Proof 的人,我不用付 token 钱,因为他们「把自己的 AI 带进来」。这件事改变了 SaaS 公司怎么造产品——你现在是为「人 + agent 同时使用」来设计;也改变了你的毛利结构——我其实不必再为 token 付费,用户会自带 AI。
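The margin argument in this segment can be made concrete with a small worked example. The sketch below is illustrative only: the `PlanEconomics` class and every dollar figure in it are hypothetical assumptions, not numbers from the episode or from Proof. It compares gross margin when the vendor pays for model tokens with the case Dan describes, where users bring their own AI.

```python
from dataclasses import dataclass


@dataclass
class PlanEconomics:
    """Per-user monthly economics for a SaaS plan (all figures hypothetical)."""

    price: float         # what the user pays per month
    hosting_cost: float  # non-AI cost to serve the user
    token_cost: float    # model tokens the vendor pays for on the user's behalf

    def gross_margin(self) -> float:
        """Return gross margin as a fraction of price."""
        return (self.price - self.hosting_cost - self.token_cost) / self.price


# Case 1: the vendor embeds its own agent and pays for the tokens.
vendor_pays = PlanEconomics(price=20.0, hosting_cost=2.0, token_cost=8.0)

# Case 2: the user runs their own agent against the product,
# so the vendor's token bill drops to zero.
user_brings_ai = PlanEconomics(price=20.0, hosting_cost=2.0, token_cost=0.0)

print(f"vendor pays tokens: {vendor_pays.gross_margin():.0%}")
print(f"user brings AI:     {user_brings_ai.gross_margin():.0%}")
```

With these made-up inputs the margin moves from 50% to 90%. The direction of the change is the point of the example; the size depends entirely on the assumed costs.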

Vocabulary & Expressions · 地道表达 & 高频词汇

Word / Phrase | POS | 释义 / 用法
profound | adj. | 深刻的
baked into | phr. | 嵌入到、内置在……里
second-order effect | n. | 二阶效应(先有一阶变化,再衍生出的间接影响)
vendor | n. | 供应商、厂商
token | n. | (AI 计费单位)一段文本片段
CLI (command-line interface) | n. | 命令行界面
surface | n. | (产品语境)「触达面/产品入口」
bifurcation | n. | 分叉、二分
margin | n. | 毛利率/利润率
#06

CLIs Are Over

CLI 时代已结束
⏱ 31:13 – 32:35 · 原片节选 (Lenny & Dan)
ENGLISH

LENNY: A lot of people are moving to CLI and trying to work from the terminal. Is part of this prediction that people shift away from that and back to actual UX with agents running alongside them?

DAN: Oh, yeah. CLIs are over. We speed-ran the CLI era. It was nice while it lasted, but it's pretty clear... Sorry. It's not that CLIs are going to completely go away — obviously they've been around for the last 30, 40, 50 years. They will continue to be around.

There was this moment when Claude Code was so popular, and people were like, 'The thing that's working is the fact that it's the CLI.' I don't think that's what it is. When you move into an actual UI for this, you start to realize we made GUIs for a reason. It's just nicer to be in a GUI, and you can get all the same benefits inside of a GUI, especially for non-programmer work.

The majority of the technical people inside Every are not using CLIs anymore as their main work surface. A lot of programmers are still flipping into it every once in a while, but more or less they're using Codex, Claude Code, Cursor — that kind of thing.

中文翻译

Lenny:现在很多人转向 CLI、试着从命令行工作。你的预测里是不是包含——大家又会从 CLI 回到「实际的 UI」,让 agent 在旁边跟着跑?

Dan:对,CLI 已经结束了。我们用极速通关的方式过了一遍 CLI 时代。它一度很美好,但已经很明显——更正一下,不是说 CLI 会彻底消失,它已经存在 30、40、50 年了,会继续存在。

Claude Code 大火的那个阶段,大家以为「让它好用的关键就是 CLI 本身」。我不这么认为。当你转到一个真正的 UI 里,你会重新意识到:我们当年发明 GUI 是有原因的。在 GUI 里更舒服,且你能获得 CLI 的所有好处,对非程序员来说尤其如此。

Every 内部的技术人员,绝大多数已经不再把 CLI 当作主工作面了。很多程序员偶尔还会切回去,但主要都在 Codex、Claude Code、Cursor 这些工具里。

Vocabulary & Expressions · 地道表达 & 高频词汇

Word / Phrase | POS | 释义 / 用法
speed-run (the era) | v. | (俚语)以极速「通关」一段时期
nice while it lasted | phr. | 持续期间挺美好(一种带轻叹的固定表达)
GUI / UI | n. | 图形用户界面 / 用户界面
flip into / out of | v. | 切换进入/退出(某种工作模式)
work surface | n. | 工作面(此处指主操作环境)
#07

Automation Is a Lie & The Senior Engineer Benchmark

「自动化」是个谎言 & 资深工程师基准测试
⏱ 39:15 – 45:52 · 原片节选 (Lenny & Dan)
ENGLISH

DAN: Automation is a lie, in the sense that every time you automate something, in order to make sure the automation is working well, you need a human on top of it making sure that it's working well.

I wrote this piece a couple years ago about the allocation economy — the idea that the way humans work with AI is going to be like being a manager. The thing you have to remember is managers actually spend a lot of time working. Most managers are not on the beach. They're checking in with their employees all the time, trying to figure out how to make the work better, how this person is doing.

There are differences between being a human manager and a model manager, but fundamentally it still requires a lot of time and attention. We miss that in the model discourse. One reason is benchmarks make it look like AI is more autonomous than it is.

I made my own benchmark — the senior engineer benchmark. I have this app, Proof. I vibe-coded it on the side while running the rest of Every. When we launched, it just kept going down. I had a lot of egg on my face. I'd say, 'Codex, fix it,' and Codex was like, 'I don't know what's going on,' or 'I do know — I fixed it.' And then it would cause four other errors, and I'd be going around in a circle. I wasn't sleeping. I vibe-coded so hard I got bursitis on my elbow. There's a life lesson in there.

LENNY: Vibe coder elbow.

DAN: I got two senior engineers to fix it independently — two different rewrites. Now when we get a new model, I just give it a prompt: 'This is vibe-coded slop. If you wanted to rewrite it from first principles, how would you?' All models until GPT-5.5 got 30 out of 100. A human senior engineer gets high 80s, low 90s. GPT-5.5 jumped to 62 — using an Opus 4.7 plan. Opus 4.7 plans are very good. GPT-5.5 is the only model with the sense of agency to rip out old code and rewrite from first principles. Other coding models paper over the edges — 'Oh, this is a big job, I'll just do a little patch.' And you're like, 'No, I specifically told you not to.'

It's clear that in a year or less it'll be senior engineer level. But here's the catch — I can change the benchmark to zero out the current model anytime. It took me a while to get to a prompt that didn't give away the answer but got the model to reveal what it's capable of. The original prompt I gave when production was going down was: 'We had four or five reported issues yesterday. Make a plan and resolve them all.' Every coding model on the market will take that seriously and try to fix the issues. What an actual senior engineer does is look at the code base and say, 'This is a piece of shit. We're going to have to actually rewrite a lot of this and it's going to be hard and risky.' Benchmarks rise on problems we've framed and can score, but a lot of human work can't be scored until you write it down — and the act of thinking to prompt it is something you can't measure. Even if benchmarks get saturated, it doesn't mean you replace all senior engineers.

中文翻译

Dan:「自动化」是个谎言。每次你把一件事自动化,为了确保它跑得好,你需要一个人在上面盯着、确保它跑得好。

我两年前写过一篇文章叫「分配经济(the allocation economy)」——人和 AI 协作的方式会像「做管理者」。你要记住,管理者其实工作量很大,大多数管理者不是在沙滩躺着,而是不停地和下属对齐、想怎么把活儿做得更好、这人怎么样。

管理人和管理模型有差别,但本质都需要大量的时间和注意力。我们在「模型话语」里常常忽略这点。原因之一是 benchmark 会让 AI 看起来比实际更自主。

我自己做了一个 benchmark,叫「资深工程师基准测试」。我有个 app 叫 Proof,是我抽空 vibe-code 出来的(一边运营 Every 一边写)。上线那天它一直挂,我脸都丢光了。我说「Codex,修一下」,Codex 一会儿说「不知道怎么回事」,一会儿说「我修好了」,然后修一个出四个错,绕圈子转。我没睡觉。我 vibe-code 太猛,搞得手肘患上滑囊炎,里面有教训。

Lenny:vibe code 肘。

Dan:我请两位资深工程师各自独立重写了一遍代码。每次新模型出来,我就把同一个 prompt 喂给它:「这是 vibe-code 出来的垃圾代码,如果让你从第一性原理重写,你会怎么写?」GPT-5.5 之前所有模型都是 30/100,人类资深工程师能拿到八十大几到九十出头。GPT-5.5 一下跳到 62,用的是 Opus 4.7 生成的计划(Opus 4.7 的「计划」非常厉害)。GPT-5.5 是唯一一个有足够能动性、敢直接把旧代码撕掉并从第一性原理重写的模型。其他编码模型只会糊边:「这工作量大,我打个小补丁就行。」你心想:「不!我明确告诉你别这样。」

一年之内它一定能达到资深工程师水平。但请注意:我随时可以把 benchmark 改一下,让现役模型重新归零。我花了好一阵子才造出一个「不剧透答案但能逼模型展现真本事」的 prompt。当年线上故障不断时,我给的原版 prompt 是:「昨天收到四五个问题反馈,做个计划,全部解决。」市面上所有编码模型都会「认真」照做,去修这些问题。可一个真正的资深工程师会看一眼代码库,然后说:「这玩意是坨屎,我们得重写很大一部分,会很难,也有风险。」模型不会主动这么说。Benchmark 的分数是在「我们已经框定、能打分」的题目上涨起来的,而很多人类工作得先「写下来」才能打分;「想到该怎么提问」这个思考过程本身没法测量。哪怕 benchmark 被刷爆,也不意味着资深工程师全被替代。

Vocabulary & Expressions · 地道表达 & 高频词汇

Word / Phrase | POS | 释义 / 用法
a lie | phr. | (修辞)「是个谎言」(夸张说法,表示不真实)
on top of sth. | phr. | 盯着、掌控某事
allocation economy | n. | 分配经济(Dan 创造的概念)
check in with sb. | phr. | 和某人对齐/同步进度
model discourse | n. | 关于(AI)模型的公共讨论
vibe-code | v. | (新造)凭感觉用 AI 写代码(业内梗)
egg on one's face | phr. | 出洋相、丢脸
bursitis | n. | 滑囊炎
from first principles | phr. | 从第一性原理(推导)
paper over (the edges) | phr. | 敷衍了事、用纸糊一糊
agency | n. | 能动性、主动判断的能力
rip out | phr. v. | 撕掉、彻底拆掉
saturated (benchmark) | adj. | 被刷爆/饱和的
frame a problem | phr. | 把问题框定/界定清楚
#08

The Forward Deployed Engineer Is the New Essential Role

前线部署工程师 (FDE) 是新关键岗位
⏱ 53:15 – 56:16 · 原片节选 (Lenny & Dan)
ENGLISH

DAN: There are definitely new job roles that are a thing. The Forward Deployed Engineer concept is for real, and it comes out of 'every agent needs a human.' Go to the big model companies — they have agents running internally, with teams of people running them. I don't think those teams are going away. Models will get more powerful, agents will get more powerful, the number of agents will grow — but people will still manage them.

That looks like a very specific kind of person. We have a couple of them internally, and they're the people in charge of making sure your agents are working and doing the right thing. We also do consulting and lend that out. It's another place where you go: hmm, automation was supposed to take away jobs, but it just created one — or many.

There's a specific type of engineer who really loves this. Nitesh, our AI engineer, fits this forward deployed category. He spends most of his time talking to one of our agents in Slack. We have an agent internally called Claudy that runs our whole consulting practice — he's in Slack with it constantly. There's code, and he uses Claude Code, but a lot of it is just talking: 'Why did you do this dumb thing? Let's fix that.'

Certain engineers love that — having their hands on the latest thing, and loving making this 'being' that's in a workspace. It looks a bit different than building more traditional software.

LENNY: Your sense is we're not near a place where these agents don't need a human?

DAN: Yes. I'm simultaneously extremely AI-pilled, extremely, and very bullish on humans and the role of humans in making sure AI is working well.

中文翻译

Dan:确实出现了一些新岗位。「前线部署工程师 (Forward Deployed Engineer, FDE)」这个概念是真的,它来自「每个 agent 都需要一个人」这条规律。看那些大模型公司,内部都跑着 agent,每一个 agent 背后都有一个团队在运营。这些团队不会消失。模型会更强、agent 会更强、agent 数量会更多,但仍然需要人去管它们。

这是一类很具体的人。我们内部就有几个,专职确保 agents 跑得对、做正确的事。我们还把这种人外派给客户做咨询。这是又一个让你忍不住感叹的地方:自动化本来「会消灭岗位」,结果创造了一种新岗位,甚至好几种。

有一类工程师特别喜欢这事。我们家的 AI 工程师 Nitesh 就属于 FDE。他大部分时间在 Slack 里跟我们的 agent 聊。我们有个内部 agent 叫 Claudy,它运营了我们整条咨询业务。Nitesh 在 Slack 里几乎一直跟它说话。当然也有代码——他用 Claude Code,但大量工作是「谈话」:「你为啥干了这件蠢事?修一下。」

某些工程师非常爱这种感觉——手放在最新的东西上,并且喜欢「养出」一个生活在工作空间里的「存在体(being)」。它跟造传统软件不太一样。

Lenny:你的感觉是——离「agent 不再需要人」还很远?

Dan:是的。我同时既极度 AI-pilled,又非常看好「人」和「人去保障 AI 跑得好」这个角色。

Vocabulary & Expressions · 地道表达 & 高频词汇

Word / Phrase | POS | 释义 / 用法
for real | phr. | (口语)千真万确、不是假的
lend (sth.) out | phr. v. | 把…借出去/外派
being | n. | (哲学化用法)「存在体」「生灵」
hands on the latest thing | phr. | 上手玩最新的东西
specific kind of person | phr. | 「特定类型的人」(描述某种气质)
AI-pilled | adj. | AI 信徒(同 #01)
#09

PMs & Full-Stack Designers Become Superheroes

PM 与全栈设计师,成为超级英雄
⏱ 1:08:40 – 1:13:09 · 原片节选 (Lenny & Dan)
ENGLISH

DAN: I am super, super bullish on PMs. My anecdotal case that has convinced me of this: we have a guy internally, Marcus, who runs Spiral, our writing app. Marcus is a PM by training. He previously ran Axios's writing product, was a PM, had a big team, got it to tens of millions in ARR. He took a year off, got super AI-pilled, and just learned how to use Cursor really well. Now he uses Claude Code.

I would call him lightly technical — knows what a database migration is. If he has to look at the code, I think he can understand it. We never could have hired him to do this job even a year ago, but coding models have gotten good enough that he can pair the technical knowledge he does have with his really spiky product sense — sense for writing, sense for users — and it's so dangerous. He ships faster than almost anyone on the team. He has an eye for every single user, every single conversation, what it means and how to collect it into a story about where to go next, what to fix.

He feels liberated because he doesn't have to organize a whole team to do that. He can just do it. It makes me very, very bullish on any PM who gets really AI-pilled.

LENNY: Music to my ears. The skills you need to build are the things — the building is done for you. What do you need to be good at? Figuring out what to build, figuring out if it's great, figuring out what problems to solve.

DAN: The other people who I think are going to be super power people are full-stack designers. If you're a designer using these tools all the time, you're so used to: I make this beautiful interaction, and the engineer just doesn't want to do it, or it doesn't happen the way I think it should. I see so many designers internally and externally now feel so empowered to build because they have all these ideas to make things look amazing — interesting interactions. That's exactly the thing that's hard to do with vibe coding because it all looks the same. They can make stuff that looks so different, and now they can actually build it. When we work with them internally, they're just making pull requests. They don't need to hand it off as much. The thing is built and that's it. There's a huge opportunity for those people to become entrepreneurs and start their own thing because they can make stuff now.

中文翻译

Dan:我极度看好 PM。让我相信这一点的是一个身边的例子:我们内部有个人叫 Marcus,负责我们的写作应用 Spiral。Marcus 是 PM 出身,之前负责 Axios 的写作产品,带过一个大团队,把产品做到了数千万美元 ARR。他休息了一年,彻底 AI-pilled,把 Cursor 用得非常熟,现在用 Claude Code。

我说他「轻度技术」——他知道 database migration 是啥,看代码他能看懂。一年前我们不可能招他干这件事,但模型变得足够好以后,他能用「不算多」的技术功底,搭配他「锐利的(spiky)」产品直觉、写作直觉、用户直觉,威力巨大。他出货比团队几乎所有人都快。他对每一个用户、每一段对话都有眼光,能把这些素材汇聚成一个关于「我们接下来去哪」的故事,并知道要修什么。

他觉得自己被「解放」了,因为不用再组织整支团队去落地,他自己就能干。这让我非常非常看好任何「真心 AI-pilled 的 PM」。

Lenny:太合我胃口了。需要打磨的技能恰好是那些——「建造」这件事系统替你做了。你要擅长什么?想清楚做什么、判断它好不好、找对要解决的问题。

Dan:另一类我觉得会变成「超级英雄」的,是全栈设计师。设计师天天在工具里待着,习惯了「我做了一个漂亮的交互,工程师不想做 / 或没按我想的做出来」。我现在看到大量内部和外部的设计师,因为「自己能造」而极度被赋能。他们脑子里有大量「让东西看起来很厉害、有有趣交互」的想法。而这恰恰是 vibe coding 难以做到的——vibe code 出来的东西长得都一样像「slop」。设计师能做出「长得跟别人完全不一样」的东西,而现在他们能真的把它写出来。我们和他们合作时,他们直接提 PR,不太需要交接。东西就这么造出来了。这对他们来说是巨大机会,可以变成创业者。

Vocabulary & Expressions · 地道表达 & 高频词汇

Word / Phrase | POS | 释义 / 用法
by training | phr. | 出身/受训于(某专业)
lightly technical | phr. | 轻度技术背景的
ARR (Annual Recurring Revenue) | n. | 年度经常性收入(SaaS 关键指标)
spiky (sense) | adj. | (夸张化)「尖锐的、突出的」(指能力突出某一面)
ship | v. | 出货、发布
liberated | adj. | 被解放的
music to one's ears | phr. | 听了简直心花怒放
empowered | adj. | 被赋予能力的
hand it off | phr. | 交接出去(给下游)
PR (pull request) | n. | 代码合并请求
entrepreneur | n. | 创业者
#10

How to Stay Employed: Ride the Models

保住饭碗的唯一方法:「Ride the Models」
⏱ 1:16:25 – 1:24:32 · 原片节选 (Lenny & Dan)
ENGLISH

DAN: The only thing you need to do is ride the models. That means use them for whatever it is that you do. Codex and Cowork are becoming the standard operating system for work. When new models come out, try them and figure out: now there are new powers, how can I use them? Instead of being like, 'I'm going to try to ignore it because it makes me afraid' — which is honestly a rational, reasonable response. If you ride on top of them, they extend your powers in a way that doesn't leave you behind. You're part of the future and part of the way work happens.

LENNY: I like this term, 'ride the model.' What should someone working at, say, Salesforce do to ride the model?

DAN: A lot of companies handicap their employees from even doing this — at Salesforce you may not be able to use the latest models. You may have to do it in your off time. The thing I really like to do with new models is play. There are certain things where I know it can't quite do it yet, but when a new model comes out, I always turn the rock over again to see: 'Can I do it now?'

The way to ride the models is not one specific thing because they're always changing. It is to be curious and playful — to apply the new model to whatever you care about, whether that's your job or something outside, and to keep turning over rocks. It may not work now, but it may work eventually. The way you use it matters.

People think the edge of AI is in San Francisco. I actually don't think that's where it is. The edge of AI is wherever AI meets a real human doing something. The people in San Francisco are making it, but they don't actually know everything about how to use it. They need to see how other people use it. Whenever a new model comes out, you get to be one of the first people in the world to discover what it might be useful for. It's like a new discovery.

DAN (closing): Ride the models. Try all of your workflows in Codex or Cowork. If your company doesn't let you, do it on your own time. Try out agent products like OpenClaw, Hermes, or for less technical people, Viktor. Get comfortable with both ways of working — and try to have fun. Too many people are doing this out of FOMO, fear of losing their job. The best way to figure out useful things to do with AI is to do something enjoyable.

中文翻译

Dan:你唯一要做的事,就是「Ride the Models(骑在模型背上)」——也就是无论你干什么,都把模型用起来。Codex 和 Cowork 正在成为工作的标准操作系统。新模型出来时,试一试,弄明白「现在多了什么新能力,我怎么用」。而不是「我害怕,所以躲着不看」——后者其实是一种很自然、很合理的反应。但你「骑」在模型上面,它就会延伸你的能力,不会把你落下。你就是「未来」的一部分,是新工作方式的一部分。

Lenny:「ride the model」这个说法我喜欢。比如一个在 Salesforce 上班的人,该怎么 ride the model?

Dan:很多公司其实把员工「绑住」了——比如 Salesforce 不一定让你用最新模型。那你可能就得用业余时间去玩。我对新模型最爱做的事就是「玩」。有些事我知道现在还做不到,但每次新模型出,我都把那块石头再翻一次:「现在能行了吗?」

Ride the model 没有「某个具体动作」可言,因为模型一直在变。它是一种态度:保持好奇与玩心,把新模型套在你真正在乎的事情上——工作内的或工作外的,不停翻石头。现在不行,迟早会行。怎么用它,很关键。

人们以为 AI 的「最前线」在旧金山。我不这么认为。AI 的最前线,是「AI 撞上一个真在干事的真人」的那个地方。旧金山的人在做模型,但他们其实并不全懂怎么用它,他们要看别人怎么用。所以每当新模型发布,你都有机会成为全世界最早发现「这玩意能干什么」的那批人之一——像一次新大陆的发现。

Dan(收尾建议):Ride the models。把你的所有工作流在 Codex 或 Cowork 里试一遍。公司不让你用?业余时间用。试试 agent 类产品——OpenClaw、Hermes,技术弱一点的人可以用 Viktor。对这两种工作方式都熟起来——并且玩得开心。太多人是因为 FOMO、怕失业才做这件事;找到「AI 能为你做什么」的最佳方式,是从让你开心的事下手。

Vocabulary & Expressions · 地道表达 & 高频词汇

Word / Phrase | POS | 释义 / 用法
ride the models | phr. | (创造表达)「骑在模型上」:随每次模型升级,把它纳入自己的工作流
operating system for work | phr. | 「工作的操作系统」(隐喻)
handicap (sb. from doing) | v. | 妨碍/阻止某人做某事
off time | n. | 下班时间、业余时间
turn over the rock | phr. | (隐喻)把石头翻一翻:重新检验某事是否成立
the edge of AI | phr. | AI 的最前沿
FOMO (fear of missing out) | n. | 错失恐惧症
get comfortable with | phr. | 熟悉、上手
workflow | n. | 工作流