AI as Coworker: What Ethan Mollick Gets Right, and What I’ve Learned Running It at Scale

Ethan Mollick’s vision of AI as a genuine coworker (not just a productivity tool, but an active collaborator that maintains context, remembers your preferences, and proactively contributes) is compelling and mostly right. I’ve been living it. I run a multi-agent AI system that handles significant portions of my executive workflow: prep briefs, research synthesis, pattern analysis across user data, first drafts of strategy documents. These agents know my preferences. They’ve learned my decision-making patterns. At the task level, the coworker framing is accurate.

But Mollick’s framing also glosses over the messiest part of AI collaboration at scale: the fact that your AI coworkers are only as good as the assumptions baked into how you built them, and auditing those assumptions is ongoing work that doesn’t get easier as you add more agents.

What the AI-as-Coworker Reality Looks Like at Scale

The part of Mollick’s thesis that holds up best: context persistence is genuinely transformative. When an AI agent knows that I prefer morning strategy calls, that I need prep briefs 24 hours before board meetings, and how I like each document type formatted, it stops being a tool and starts being something closer to a capable assistant who has done the job before. The cognitive overhead reduction is real.

What his framing underweights: AI coworkers amplify the perspective of whoever designed them and of whatever data they were trained on. My localization agent is excellent at translation mechanics and surface-level cultural adaptation, but it consistently over-indexes on the cultural frameworks most represented in its training data. It knows language but doesn’t always know context. The bias isn’t obvious; it shows up in subtle calibration errors that become visible only when you’re specifically looking for them, which most teams aren’t.

This isn’t a knock on the AI. It’s a structural reality: any agent is trained on a corpus that reflects some perspectives more than others. At scale, those systematic biases matter.

The Four Things Mollick’s AI Coworker Vision Gets Right

Context persistence is the unlock. The shift from “start every session from scratch” to “this agent knows my context” is more significant than any individual capability improvement. Persistent context is what makes AI feel like a coworker rather than a tool.

Proactive synthesis is genuinely useful. The best AI coworker behavior I’ve experienced isn’t answering questions — it’s surfacing patterns before I know to ask about them. An agent that watches your metrics and flags anomalies when you’re focused elsewhere is doing coworker-level work.

Specialization beats generalization. A general-purpose AI assistant is less useful than five purpose-built agents, each with specific context and constraints for its domain. Mollick’s research on AI collaboration points toward this, and it matches my operational experience.

The oversight burden is real and non-negotiable. Mollick is clear on this: AI coworkers require human judgment about when to trust and when to verify. You can’t abdicate that responsibility to the agent itself. This is right, and teams that try to fully automate judgment get burned.

The Three Things His Framing Misses

Cultural blind spots compound. An AI coworker trained primarily on majority-culture data will systematically underserve minority contexts. At the product level, this means recommendations that are technically sound but contextually wrong for segments of your user base. You need explicit cultural review processes, not the assumption that the AI will figure it out.

AI context goes stale, and the upkeep accumulates like technical debt. Just like code, what your agents know needs maintenance. An agent that learned your priorities six months ago may be operating on outdated mental models. I schedule regular “context audits”: reviewing what each agent remembers, what it should forget, and what new patterns it needs to understand. This isn’t automated; it requires human judgment about what constitutes useful institutional memory versus obsolete assumptions.
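
A context audit still needs a review queue, and building that queue is the one part that can be scripted. The sketch below is a minimal illustration, not a description of any particular agent framework: the `MemoryEntry` schema, the `last_confirmed` field, and the 180-day default are all assumptions made for the example. It only lists what a human should look at; it never deletes anything on its own.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class MemoryEntry:
    """One thing an agent 'remembers' (hypothetical schema for illustration)."""
    agent: str
    note: str
    last_confirmed: date  # last time a human confirmed this is still true


def entries_due_for_audit(entries, today, max_age_days=180):
    """Return entries not confirmed within max_age_days, oldest first.

    This only builds the review queue. A human still decides whether
    each entry is kept, updated, or forgotten.
    """
    cutoff = today - timedelta(days=max_age_days)
    stale = [e for e in entries if e.last_confirmed < cutoff]
    return sorted(stale, key=lambda e: e.last_confirmed)


if __name__ == "__main__":
    # Example data is invented for the sketch.
    entries = [
        MemoryEntry("prep-brief", "Briefs due 24h before board meetings", date(2024, 1, 10)),
        MemoryEntry("prep-brief", "Prefers morning strategy calls", date(2024, 8, 1)),
    ]
    for e in entries_due_for_audit(entries, today=date(2024, 9, 1)):
        print(f"{e.agent}: '{e.note}' (last confirmed {e.last_confirmed})")
```

The design choice worth noting is that staleness is measured from the last human confirmation, not from when the agent first learned something, so a preference you re-confirm regularly never shows up in the queue.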

Confidence calibration is a product problem, not just an individual one. The most dangerous AI coworker behavior isn’t being wrong; it’s being wrong confidently. I’ve built in explicit disagreement triggers: agents that surface alternative hypotheses when confidence intervals are wide, rather than defaulting to the most plausible-sounding answer. Training yourself and your team to expect those alternatives, and to actually weigh them, matters too.
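
One way to picture a disagreement trigger is as a rule that decides when the runner-up answers get shown alongside the top one. The sketch below is an assumption-heavy illustration: it takes for granted that each hypothesis already comes with a confidence score between 0 and 1 (producing well-calibrated scores is the hard part and is not covered here), and it uses a low top score or a narrow margin as a stand-in for “wide confidence intervals.” The function name and both thresholds are hypothetical.

```python
def answer_with_alternatives(hypotheses, min_confidence=0.7, min_margin=0.2):
    """Pick the top hypothesis and decide whether to surface alternatives.

    hypotheses: dict mapping hypothesis text -> confidence score in [0, 1]
                (scores are assumed to be supplied by the agent).
    Returns (top_hypothesis, alternatives). The alternatives list is empty
    when the top hypothesis is both confident and clearly ahead.
    """
    if not hypotheses:
        raise ValueError("need at least one hypothesis")

    ranked = sorted(hypotheses.items(), key=lambda kv: kv[1], reverse=True)
    top_text, top_score = ranked[0]
    runner_up_score = ranked[1][1] if len(ranked) > 1 else 0.0

    # Trigger: the best answer is weak, or it barely beats the next one.
    uncertain = top_score < min_confidence or (top_score - runner_up_score) < min_margin
    alternatives = [text for text, _ in ranked[1:]] if uncertain else []
    return top_text, alternatives


if __name__ == "__main__":
    # Invented example: two explanations for a churn spike are close in score,
    # so the agent surfaces both instead of asserting the first.
    top, alts = answer_with_alternatives(
        {"pricing change": 0.5, "onboarding bug": 0.4, "seasonality": 0.1}
    )
    print("Leading hypothesis:", top)
    print("Also consider:", alts)
```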

What I’d Add to Mollick’s Framework

The coworker metaphor is useful but should be extended: the best AI coworker is one you’ve onboarded intentionally, given a specific domain, provided with representative context for your user base, and equipped with explicit escalation paths. The same care you’d give a strong new hire (clear context, defined scope, regular calibration) applies to your AI agents.

For hybrid decisions (anything affecting significant segments of your user base differently), I’ve built explicit frameworks that require both AI analysis and human review before action. The AI coworker runs the analysis; a human makes the call. That’s not distrust — it’s appropriate division of labor based on where each is actually good.
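
The rule in the previous paragraph can be written down as a simple gate. This is a sketch under stated assumptions, not my production framework: the `Decision` class, the per-segment impact numbers, and the `spread_threshold` that defines “affecting segments differently” are all hypothetical choices made to keep the example small.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Decision:
    """A proposed action plus the AI's per-segment impact estimate (hypothetical)."""
    summary: str
    segment_impact: Dict[str, float]   # segment name -> estimated impact
    approved_by: Optional[str] = None  # name of the human who signed off, if any


def is_hybrid(decision, spread_threshold=0.1):
    """Treat a decision as hybrid when impact differs across segments
    by more than spread_threshold."""
    impacts = list(decision.segment_impact.values())
    return len(impacts) > 1 and (max(impacts) - min(impacts)) > spread_threshold


def can_execute(decision):
    """Uniform-impact decisions can proceed on AI analysis alone.
    Hybrid decisions are blocked until a named human has approved them."""
    return (not is_hybrid(decision)) or decision.approved_by is not None


if __name__ == "__main__":
    change = Decision(
        summary="Shorten onboarding flow",
        segment_impact={"new users": 0.3, "enterprise admins": -0.2},
    )
    print(can_execute(change))   # blocked: segments are affected differently
    change.approved_by = "product lead"
    print(can_execute(change))   # allowed once a human has made the call
```

Recording who approved, rather than a bare yes/no flag, keeps accountability attached to a person instead of to the pipeline.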

If you’re thinking about how AI changes the product manager’s job, this connects directly to the hiring question — because the PM skills that matter most in an AI-native team are exactly the ones that help humans and AI agents collaborate well rather than substituting one for the other.


Your Turn: Apply This Today

If you’re managing a team where AI is either underused or feared, here’s how to move the conversation forward:

  • Pick one high-friction workflow and run an AI sprint on it. Identify the task your team does every week that everyone quietly dreads — the status update, the competitive analysis, the draft brief. Spend one sprint running it with AI assistance and measure the time difference.
  • Set an “AI working agreement” with your team. Explicitly discuss: what outputs require human review before they go to stakeholders? What tasks can be AI-first? What should never be delegated to AI? Make it a team norm, not individual discretion.
  • Train on prompting, not just tools. The productivity gap between teams isn’t the tool — it’s prompting quality. Run a 30-minute session where team members share their most useful prompts. Document the best ones in a shared library.
  • Measure quality, not just speed. Track whether AI-assisted outputs require more or fewer rounds of revision than non-AI outputs. Speed gains that come with quality losses are not wins. Establish the baseline before you declare victory.
  • Interview your team about their actual AI use — not their reported use. Ask privately: “When did AI produce something you used directly? When did it produce something that misled you?” The honest answers will reshape your AI enablement strategy.
  • Protect the judgment-intensive work from AI defaulting. Identify the decisions in your product process that require nuanced human judgment — prioritization trade-offs, difficult stakeholder calls, ethical edge cases. Explicitly protect those from being AI-first. Delegation without boundaries creates accountability gaps.
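
For the “measure quality, not just speed” item above, the comparison can be as simple as a spreadsheet, but here is a minimal sketch of the logic. The record format (an AI-assisted flag, revision rounds, and hours per deliverable) is an assumption for illustration, and the sample numbers are invented.

```python
from statistics import mean


def revision_comparison(records):
    """Compare AI-assisted work against the baseline on speed AND quality.

    records: iterable of (ai_assisted: bool, revision_rounds: int, hours: float),
             one tuple per deliverable (hypothetical tracking format).
    """
    groups = {True: [], False: []}
    for ai_assisted, rounds, hours in records:
        groups[ai_assisted].append((rounds, hours))
    if not groups[True] or not groups[False]:
        raise ValueError("need both AI-assisted and baseline records")

    summary = {}
    for flag, label in ((True, "ai"), (False, "baseline")):
        summary[label] = {
            "mean_rounds": mean(r for r, _ in groups[flag]),
            "mean_hours": mean(h for _, h in groups[flag]),
        }
    # A speed gain only counts as a win if revisions did not go up.
    summary["faster"] = summary["ai"]["mean_hours"] < summary["baseline"]["mean_hours"]
    summary["no_quality_loss"] = (
        summary["ai"]["mean_rounds"] <= summary["baseline"]["mean_rounds"]
    )
    return summary


if __name__ == "__main__":
    sample = [
        (True, 3, 2.0), (True, 3, 2.5),    # AI-assisted briefs
        (False, 1, 5.0), (False, 2, 4.5),  # baseline briefs
    ]
    result = revision_comparison(sample)
    print(result["faster"], result["no_quality_loss"])
```

In the invented sample, the AI-assisted briefs are faster but need more revision rounds, which is exactly the case the bullet warns about.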

For the strategic infrastructure question that underlies all of this, Jensen Huang’s sovereign AI argument is worth reading alongside this piece, because how you build your AI stack determines what your AI coworkers can actually do.

Building AI-native workflows into your product team and running into the messy gaps between the vision and the reality? I consult with product leaders on AI system design, agent architecture, and the organizational changes that make AI collaboration actually work. Let’s talk.
