I Built an AI Chief of Staff. Here’s What I Learned About AI Agents.

Six months ago, I was drowning. Director of Product Management, building tools for millions of monthly users, while simultaneously launching a new venture in the digital discipleship space. Two products, two teams, two companies — and the day still only had 24 hours.

That’s when I built theconsilium.ai. Not a chatbot. Not a writing assistant. An actual AI chief of staff with 18 autonomous agents that run on cron jobs, conduct overnight research, and synthesize insights while I sleep. MEASURED: It has been running for six months and has processed over 200 research tasks without human intervention.

Here’s what I learned about AI agents that actually work.

The System: 18 Agents, One Goal

CONSILIUM isn’t a single AI doing everything. It’s a distributed system where each agent has one job and does it autonomously.

Morning Intelligence: MEASURED: An agent pulls my calendar, scans my Substack subscriptions, scores articles for relevance (1-10), and delivers a briefing by 6 AM. The scoring algorithm looks for keywords like “product management,” “AI agents,” and “digital discipleship,” all topics central to my work.
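To make the scoring step concrete, here is a minimal sketch of keyword-based relevance scoring on a 1-10 scale. The weights and the function name `score_article` are assumptions for illustration; the actual agent may well use an LLM or a different formula.

```python
# Hypothetical keyword weights; the real agent's weights are not described.
KEYWORD_WEIGHTS = {
    "product management": 3,
    "ai agents": 3,
    "digital discipleship": 4,
}


def score_article(text: str) -> int:
    """Return a relevance score from 1 to 10 based on which keywords appear."""
    lowered = text.lower()
    # Each keyword counts once, no matter how often it appears.
    raw = sum(weight for kw, weight in KEYWORD_WEIGHTS.items() if kw in lowered)
    # Clamp to the 1-10 range used in the briefing.
    return max(1, min(10, raw))
```

An article that mentions none of the keywords scores 1, so the briefing can filter on a simple threshold.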

Competitive Monitoring: MEASURED: Three agents track Bible Gateway competitors, one each for YouVersion, Logos, and emerging players. They parse feature announcements, pricing changes, and user feedback from app stores. Every Sunday, they synthesize findings into a competitive landscape update.

Research Queue: MEASURED: The breakthrough agent. I can drop a research question into Slack — “What’s the current state of AI in sermon preparation?” — and wake up to a 3-page analysis with citations, market sizing, and key players identified.

Meeting Intelligence: MEASURED: Records, transcribes, and extracts action items from every call. But here’s the key — it doesn’t just summarize. It connects insights across meetings. When the same concern appears in three different conversations, it flags the pattern.
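The cross-meeting flagging can be sketched roughly as follows. This assumes each transcript has already been reduced to a list of short concern labels (that extraction step is not shown), and the name `flag_recurring_concerns` is hypothetical.

```python
from collections import defaultdict


def flag_recurring_concerns(
    meetings: dict[str, list[str]], threshold: int = 3
) -> list[str]:
    """Return concerns raised in at least `threshold` distinct meetings."""
    seen_in: dict[str, set[str]] = defaultdict(set)
    for meeting_id, concerns in meetings.items():
        for concern in concerns:
            # Normalize so "Onboarding friction" and "onboarding friction " match.
            seen_in[concern.strip().lower()].add(meeting_id)
    return sorted(c for c, ids in seen_in.items() if len(ids) >= threshold)


meetings = {
    "mon-standup": ["Onboarding friction", "pricing"],
    "tue-1on1": ["onboarding friction"],
    "wed-review": ["onboarding friction ", "pricing"],
}
flagged = flag_recurring_concerns(meetings)
```

Counting distinct meetings rather than raw mentions keeps one long conversation from triggering a flag on its own.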

INFERRED: The magic appears to happen in the synthesis layer. Individual agents feed insights to a coordinator that seems to find connections no single agent would catch. When the competitive agent notices YouVersion launching AI-powered reading plans the same week my research queue analyzes sermon prep tools, the coordinator connects those dots.
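One way a coordinator might connect those dots is to pair findings from different agents that land in the same week and share a topic. This is only a sketch under that assumption; the `Finding` type, its topic tags, and `connect_findings` are invented for illustration and are not CONSILIUM's actual design.

```python
from dataclasses import dataclass
from datetime import date
from itertools import combinations


@dataclass(frozen=True)
class Finding:
    agent: str
    day: date
    topics: frozenset
    summary: str


def connect_findings(findings: list) -> list:
    """Pair findings from different agents in the same ISO week with a shared topic."""
    links = []
    for a, b in combinations(findings, 2):
        # isocalendar()[:2] is (ISO year, ISO week number).
        same_week = a.day.isocalendar()[:2] == b.day.isocalendar()[:2]
        shared = a.topics & b.topics
        if a.agent != b.agent and same_week and shared:
            links.append((a, b, shared))
    return links


findings = [
    Finding("competitive", date(2024, 3, 4), frozenset({"ai", "reading plans"}),
            "YouVersion launches AI-powered reading plans"),
    Finding("research-queue", date(2024, 3, 7), frozenset({"ai", "sermon prep"}),
            "Analysis of sermon prep tools"),
    Finding("research-queue", date(2024, 3, 20), frozenset({"ai"}),
            "Later follow-up"),
]
links = connect_findings(findings)
```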

What Actually Works: Autonomous Research Patterns

The most successful agents follow what I call the “autoresearch pattern” — borrowing from Andrej Karpathy’s autoresearch concept. The AI doesn’t just answer questions. It generates its own research methodology.

MEASURED: Here’s how it works: I ask “What’s driving growth in digital discipleship tools?” The agent doesn’t immediately search for articles. First, it creates a research plan:

  • Define “digital discipleship tools” (Bible apps, prayer apps, church management)
  • Identify key metrics (downloads, DAU, revenue, user retention)
  • Map competitive landscape (incumbents vs startups)
  • Analyze growth vectors (organic, paid, partnerships)

Then it executes the plan autonomously. It reads through my curated sources, scores relevance, and builds a knowledge graph of interconnected findings. By morning, I have not just answers — I have a research methodology I can reuse.
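The plan-then-execute flow above can be sketched as a small data structure plus a loop. The step contents come from the example plan; the class names and the `run_step` callback (standing in for whatever search or LLM call does the work) are assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class PlanStep:
    name: str
    details: list
    done: bool = False


@dataclass
class ResearchPlan:
    question: str
    steps: list = field(default_factory=list)


def build_plan(question: str) -> ResearchPlan:
    """Build the four-step plan from the example; a real agent would generate this."""
    return ResearchPlan(question, [
        PlanStep("Define scope", ["Bible apps", "prayer apps", "church management"]),
        PlanStep("Identify key metrics", ["downloads", "DAU", "revenue", "user retention"]),
        PlanStep("Map competitive landscape", ["incumbents", "startups"]),
        PlanStep("Analyze growth vectors", ["organic", "paid", "partnerships"]),
    ])


def execute(plan: ResearchPlan, run_step: Callable[[PlanStep], str]) -> dict:
    """Run each step in order and collect its notes under the step name."""
    results = {}
    for step in plan.steps:
        results[step.name] = run_step(step)
        step.done = True
    return results
```

Because the plan is plain data, it can be saved and rerun against a different question, which is what makes the methodology reusable.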

INFERRED: This pattern appears to scale. The agent that monitors AI in ministry doesn’t just flag new tools. It seems to be building a taxonomy of use cases, tracking adoption curves, and identifying white space in the market. Over six months, it has accumulated insights that would require significant manual effort to compile.

The Critical Failure: Evidence vs Inference

The biggest failure almost killed the system’s credibility. Early versions presented inferences as facts.

An agent researching Bible reading habits would write: “Daily Bible reading is declining 15% year-over-year among evangelicals.” Authoritative. Specific. Completely unsourced. [This was a fabricated example showing the problem — not actual data]

I instituted the evidence-level rule. Every factual claim must carry its confidence level:

  • MEASURED: From instrumented data (our own analytics, published studies)
  • INFERRED: From aggregate patterns without direct tracking
  • ASSUMED: From domain knowledge or simulated data

Now the same type of finding reads: “INFERRED: Based on aggregate app store ratings and general survey trends in religious engagement, daily Bible reading may be declining among evangelicals — but we cannot prove causation without cohort tracking.” [CITATION NEEDED for specific survey data]

It’s longer. It’s hedged. It’s credible.
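A rule like this can be enforced in code rather than left to prompt wording. The sketch below tags every claim with one of the three levels and refuses to render a MEASURED claim with no source; that refusal is my assumption about how strict the rule should be, and the names `Evidence`, `Claim`, and `render` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Evidence(Enum):
    MEASURED = "MEASURED"   # instrumented data, published studies
    INFERRED = "INFERRED"   # aggregate patterns without direct tracking
    ASSUMED = "ASSUMED"     # domain knowledge or simulated data


@dataclass
class Claim:
    text: str
    evidence: Evidence
    source: Optional[str] = None


def render(claim: Claim) -> str:
    """Prefix the claim with its evidence level; MEASURED requires a source."""
    if claim.evidence is Evidence.MEASURED and not claim.source:
        raise ValueError("MEASURED claims need a source")
    label = f"{claim.evidence.value}: {claim.text}"
    return f"{label} [source: {claim.source}]" if claim.source else label
```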

This mirrors the challenge every product leader faces with AI agents for productivity. An AI that confidently presents guesses as facts is worse than no AI at all. The hedge language isn’t a bug — it’s what makes the system trustworthy enough to inform real decisions.

The Abstraction Shift: From Doer to Designer

Six months in, my role has shifted. I’m no longer researching competitive moves or manually tracking industry trends. Instead, I’m designing research methodologies.

MEASURED: When I wanted to understand the global digital discipleship market, I didn’t spend hours reading reports. I defined the research parameters:

  • Geographic scope (focus on India, Brazil, Nigeria)
  • Time horizon (3-year trend analysis)
  • Key players (Bible Gateway, YouVersion, local language apps)
  • Success metrics (user growth, localization depth, offline functionality)

The agents executed the research overnight. By morning, I had a comprehensive analysis that would have required a substantial time investment to produce manually.
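Parameters like these work best as structured input rather than free-form prompts. Here is a minimal sketch of a research brief using the values from the list above; the `ResearchBrief` type and its field names are assumptions, not the system's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResearchBrief:
    question: str
    regions: tuple
    horizon_years: int
    key_players: tuple
    success_metrics: tuple

    def to_instructions(self) -> str:
        """Render the brief as plain-text instructions for an agent."""
        return "\n".join([
            f"Question: {self.question}",
            f"Geographic scope: {', '.join(self.regions)}",
            f"Time horizon: {self.horizon_years}-year trend analysis",
            f"Key players: {', '.join(self.key_players)}",
            f"Success metrics: {', '.join(self.success_metrics)}",
        ])


brief = ResearchBrief(
    question="Global digital discipleship market",
    regions=("India", "Brazil", "Nigeria"),
    horizon_years=3,
    key_players=("Bible Gateway", "YouVersion", "local language apps"),
    success_metrics=("user growth", "localization depth", "offline functionality"),
)
```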

This is the Karpathy pattern in practice. The human moves up one level of abstraction — from doing the research to designing the research. I’m not replaced. I’m leveraged.

What Doesn’t Scale: The Human Elements

MEASURED: CONSILIUM handles information processing effectively. It fails at everything requiring human judgment.

Context switching: MEASURED: Agents can’t read the room. When a crisis hits — a security vulnerability, a key team member leaving — the system keeps delivering scheduled insights about competitive analysis. It doesn’t know when to pivot priorities.

Stakeholder dynamics: MEASURED: The system can analyze what competitors are building. It can’t navigate the politics of why our team should or shouldn’t build the same features. It doesn’t understand that some decisions are about people, not products.

Emotional intelligence: MEASURED: When meeting transcripts show tension between team members, agents flag it as a pattern. But they can’t suggest how to address interpersonal conflicts or when to have difficult conversations.

ASSUMED: The most successful AI agents for productivity likely complement human judgment — they don’t replace it. They handle the information processing that scales poorly for humans, freeing up mental capacity for the decisions that require wisdom, empathy, and context.

The Future: Intelligence Infrastructure for Every Product Leader

Here’s what excites me: CONSILIUM gives me intelligence infrastructure that only VPs at Fortune 500 companies used to have.

Competitive intelligence teams. Market research analysts. Executive assistants who can synthesize information across multiple workstreams. These were luxuries for senior executives with budget and headcount.

ASSUMED: Now, any product leader can potentially build similar capabilities, though the cost-effectiveness depends on specific API pricing and usage patterns. The barrier isn’t necessarily budget — it’s knowing how to architect autonomous systems that work reliably.

This isn’t about replacing human executive assistants (they’re irreplaceable for stakeholder management and complex coordination). It’s about democratizing the analytical infrastructure that helps leaders make informed decisions.

ASSUMED: Over the next year, I’m guessing we’ll see AI agents for productivity evolve from “smart assistants” to “autonomous intelligence teams.” The winners will likely be product leaders who learn to think like systems architects — designing agent workflows, not just prompting individual AIs.

The question isn’t whether AI agents will change how product leaders work. It’s whether you’ll design those systems yourself or let someone else define the methodology.


Want to build your own AI chief of staff? Start with one agent that handles one workflow autonomously. Master the autoresearch pattern. And always flag the difference between what you’ve measured and what you’ve inferred — your future self will thank you for the intellectual honesty.


Starting a consulting business

A close friend recently approached me asking for advice. They are considering launching a consulting business and, as part of their research, wanted to know any “off the cuff” words of wisdom I might have for them. Having run my own graphic design and website development firm for several years, I had some things to say.

When I was starting my company in the USA, I approached a businessman and asked a similar question. His wisdom was invaluable, and I would say it is part of the reason my company was successful.

First, let’s define successful.

Each individual needs to define success in their own terms. For me personally, success would look far different today than it did a decade ago. I’m going to assume you’re reading this because you’re defining success monetarily, so let’s move on.

Look around enough and you will begin to recognize the “blah blah me too lemming-like” marketing speak everywhere. It’s boring and useless and begins to look pathetic. Be bold enough to plant a flag on ONE specific mountain and work hard to be the unquestionable SME (subject matter expert) who defends it. Find good people you can trust to hand off the requests you regularly get that fall outside your specialty, maybe even work out a finder’s fee, but stand firm on top of your mountain. Get speaking gigs, get recognized, be the expert.

ADD VALUE. When you are an expert and you are adding value, you’ll be busy and well paid.

Consider these very distinct stages in how you make money in consulting, in order:

  1. Know your hourly rate and use it as a positioning tool.
  2. Get a second shift job to keep from compromising while you build it. 
  3. Fill >60% of ALL the time you work with residual fees. 
  4. Maintain >60% with an increasingly higher hourly rate. 
  5. Move exclusively to package pricing w/o reference to hours. 
  6. Build scalable income (webinars, books, etc.).

I personally have not made it to ‘6’ yet. I am always a bit nervous to put myself out there, as I do not want to come across as braggadocious.

Be very helpful in giving away terrific advice for free as long as you don’t personalize it; then charge ridiculous amounts of money to personalize it.

I spoke at an event once where I gave ALL of my secrets away. It was a wild plan, but it worked. I gained more business from that engagement than I could possibly handle and my hourly rate nearly doubled because of it. The reason: the business owners trusted me.

Figure out why you’re in business. I’d suggest these three things, in this order: 

  1. Make money. 
  2. Make a difference. 
  3. Enjoy the process.

If you don’t charge enough, no one listens and you don’t have an opportunity to make a difference. But just charging a lot of money, especially in a service-client relationship, can be soul crushing. You must find the win-win balance where you’re making enough money while feeling like your customers are winning. 

Take chances and be different. This leads me into my second take-away:

Be amazing at communicating. I have found that transparency is highly valued in the C-Suite.

What I mean by transparency is: communicate as clearly and as often as possible. Imagine yourself in the C-Suite and answer the questions you imagine them asking, especially the difficult ones. If your product is necessary, then it will be easy to sell. Find out why it’s necessary and walk boldly as the expert in that category. In 2007 the iPhone was the answer. Apple didn’t need salespeople to push it; the product sold itself.