I Built an AI Chief of Staff. Here’s What I Learned About AI Agents.

Six months ago, I was drowning. Director of Product Management, building tools for millions of monthly users, while simultaneously launching a new venture in the digital discipleship space. Two products, two teams, two companies — and the day still only had 24 hours.

That’s when I built theconsilium.ai. Not a chatbot. Not a writing assistant. An actual AI chief of staff with 18 autonomous agents that run on cron jobs, conduct overnight research, and synthesize insights while I sleep. MEASURED: It has been running for six months and has processed over 200 research tasks without human intervention.

Here’s what I learned about AI agents that actually work.

The System: 18 Agents, One Goal

CONSILIUM isn’t a single AI doing everything. It’s a distributed system where each agent has one job and does it autonomously.

Morning Intelligence: MEASURED: Agent pulls my calendar, scans my Substack subscriptions, scores articles for relevance (1-10), and delivers a briefing by 6 AM. The scoring algorithm looks for keywords like “product management,” “AI agents,” and “digital discipleship” — topics central to my work.
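The article doesn't show how the scoring works, but a keyword-weighted relevance score like the one described could be sketched as follows. The keyword list and weights here are illustrative assumptions, not the system's actual values:

```python
# Hypothetical sketch of keyword-based relevance scoring (1-10).
# Keywords and weights are illustrative, not the system's actual values.
KEYWORDS = {
    "product management": 3,
    "ai agents": 3,
    "digital discipleship": 3,
    "bible": 1,
    "productivity": 1,
}

def score_article(title: str, body: str) -> int:
    """Score an article 1-10 by weighted keyword hits in title and body."""
    text = f"{title} {body}".lower()
    raw = sum(weight for kw, weight in KEYWORDS.items() if kw in text)
    return max(1, min(10, raw))  # clamp to the 1-10 scale

score_article("Why AI Agents Matter", "A deep dive into product management")
```

A real implementation would likely use embeddings or an LLM relevance call rather than substring matching, but the clamp-to-scale pattern stays the same.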

Competitive Monitoring: MEASURED: Three agents track Bible Gateway competitors, one each for YouVersion, Logos, and emerging players. They parse feature announcements, pricing changes, and user feedback from app stores. Every Sunday, they synthesize findings into a competitive landscape update.

Research Queue: MEASURED: The breakthrough agent. I can drop a research question into Slack — “What’s the current state of AI in sermon preparation?” — and wake up to a 3-page analysis with citations, market sizing, and key players identified.

Meeting Intelligence: MEASURED: Records, transcribes, and extracts action items from every call. But here’s the key — it doesn’t just summarize. It connects insights across meetings. When the same concern appears in three different conversations, it flags the pattern.
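The "same concern in three conversations" flag is essentially a cross-meeting frequency count. A minimal sketch, assuming concerns have already been extracted as strings by an upstream step (the real system presumably uses an LLM for that extraction):

```python
from collections import defaultdict

# Hypothetical sketch of cross-meeting pattern flagging.
# Concerns are plain strings here; the real system would extract
# them from transcripts before this step.
def flag_recurring_concerns(meetings: list[dict], threshold: int = 3) -> list[str]:
    """Return concerns that appear in at least `threshold` distinct meetings."""
    seen_in = defaultdict(set)
    for meeting in meetings:
        for concern in meeting["concerns"]:
            seen_in[concern].add(meeting["id"])
    return [c for c, ids in seen_in.items() if len(ids) >= threshold]

meetings = [
    {"id": "m1", "concerns": ["onboarding friction", "hiring"]},
    {"id": "m2", "concerns": ["onboarding friction"]},
    {"id": "m3", "concerns": ["onboarding friction", "pricing"]},
]
flag_recurring_concerns(meetings)  # ["onboarding friction"]
```

Counting distinct meeting IDs rather than raw mentions matters: one meeting that repeats a concern five times is noise, the same concern in three separate rooms is a pattern.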

INFERRED: The magic appears to happen in the synthesis layer. Individual agents feed insights to a coordinator that seems to find connections no single agent would catch. When the competitive agent notices YouVersion launching AI-powered reading plans the same week my research queue analyzes sermon prep tools, the coordinator connects those dots.

What Actually Works: Autonomous Research Patterns

The most successful agents follow what I call the "autoresearch pattern," borrowing a concept from Andrej Karpathy. The AI doesn't just answer questions. It generates its own research methodology.

MEASURED: Here’s how it works: I ask “What’s driving growth in digital discipleship tools?” The agent doesn’t immediately search for articles. First, it creates a research plan:

  • Define “digital discipleship tools” (Bible apps, prayer apps, church management)
  • Identify key metrics (downloads, DAU, revenue, user retention)
  • Map competitive landscape (incumbents vs startups)
  • Analyze growth vectors (organic, paid, partnerships)

Then it executes the plan autonomously. It reads through my curated sources, scores relevance, and builds a knowledge graph of interconnected findings. By morning, I have not just answers — I have a research methodology I can reuse.
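The plan-then-execute loop described above can be sketched as a two-phase structure. This is a hypothetical illustration, not the system's actual code; the plan steps are hardcoded here where the real system would generate them with an LLM:

```python
from dataclasses import dataclass, field

# Hypothetical sketch of the autoresearch pattern: plan first, then execute.
@dataclass
class ResearchPlan:
    question: str
    steps: list[str] = field(default_factory=list)

def make_plan(question: str) -> ResearchPlan:
    # The real system would have an LLM generate these steps.
    return ResearchPlan(question, steps=[
        "Define the key terms in the question",
        "Identify key metrics",
        "Map the competitive landscape",
        "Analyze growth vectors",
    ])

def execute(plan: ResearchPlan, run_step) -> dict[str, str]:
    """Run each step in order and collect findings keyed by step description."""
    return {step: run_step(step) for step in plan.steps}

plan = make_plan("What's driving growth in digital discipleship tools?")
findings = execute(plan, run_step=lambda s: f"findings for: {s}")
```

The separation is the point: because the plan is an explicit object rather than a hidden chain of thought, it can be saved and reused as the methodology the article describes.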

INFERRED: This pattern appears to scale. The agent that monitors AI in ministry doesn’t just flag new tools. It seems to be building a taxonomy of use cases, tracking adoption curves, and identifying white space in the market. Over six months, it has accumulated insights that would require significant manual effort to compile.

The Critical Failure: Evidence vs Inference

The biggest failure almost killed the system’s credibility. Early versions presented inferences as facts.

An agent researching Bible reading habits would write: “Daily Bible reading is declining 15% year-over-year among evangelicals.” Authoritative. Specific. Completely unsourced. [This was a fabricated example showing the problem — not actual data]

I instituted the evidence-level rule. Every factual claim must carry its confidence level:

  • MEASURED: From instrumented data (our own analytics, published studies)
  • INFERRED: From aggregate patterns without direct tracking
  • ASSUMED: From domain knowledge or simulated data

Now the same type of finding reads: “INFERRED: Based on aggregate app store ratings and general survey trends in religious engagement, daily Bible reading may be declining among evangelicals — but we cannot prove causation without cohort tracking.” [CITATION NEEDED for specific survey data]

It’s longer. It’s hedged. It’s credible.

This mirrors the challenge every product leader faces with AI agents for productivity. An AI that confidently presents guesses as facts is worse than no AI at all. The hedge language isn’t a bug — it’s what makes the system trustworthy enough to inform real decisions.

The Abstraction Shift: From Doer to Designer

Six months in, my role has shifted. I’m no longer researching competitive moves or manually tracking industry trends. Instead, I’m designing research methodologies.

MEASURED: When I wanted to understand the global digital discipleship market, I didn’t spend hours reading reports. I defined the research parameters:

  • Geographic scope (focus on India, Brazil, Nigeria)
  • Time horizon (3-year trend analysis)
  • Key players (Bible Gateway, YouVersion, local language apps)
  • Success metrics (user growth, localization depth, offline functionality)

The agents executed the research overnight. By morning, I had a comprehensive analysis that would have required substantial time to produce manually.
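Those research parameters amount to a declarative spec handed off to the agents. A sketch of what that handoff might look like; every field name here is an illustrative assumption:

```python
from dataclasses import dataclass, field

# Hypothetical sketch of declaring research parameters up front,
# then handing the spec to agents; all field names are illustrative.
@dataclass
class ResearchSpec:
    topic: str
    geographies: list[str] = field(default_factory=list)
    horizon_years: int = 3
    key_players: list[str] = field(default_factory=list)
    metrics: list[str] = field(default_factory=list)

spec = ResearchSpec(
    topic="global digital discipleship market",
    geographies=["India", "Brazil", "Nigeria"],
    horizon_years=3,
    key_players=["Bible Gateway", "YouVersion", "local language apps"],
    metrics=["user growth", "localization depth", "offline functionality"],
)
```

The human's job reduces to filling in this spec well; the agents' job is everything downstream of it.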

This is the Karpathy pattern in practice. The human moves up one level of abstraction — from doing the research to designing the research. I’m not replaced. I’m leveraged.

What Doesn’t Scale: The Human Elements

MEASURED: CONSILIUM handles information processing effectively. It fails at everything requiring human judgment.

Context switching: MEASURED: Agents can’t read the room. When a crisis hits — a security vulnerability, a key team member leaving — the system keeps delivering scheduled insights about competitive analysis. It doesn’t know when to pivot priorities.

Stakeholder dynamics: MEASURED: The system can analyze what competitors are building. It can’t navigate the politics of why our team should or shouldn’t build the same features. It doesn’t understand that some decisions are about people, not products.

Emotional intelligence: MEASURED: When meeting transcripts show tension between team members, agents flag it as a pattern. But they can’t suggest how to address interpersonal conflicts or when to have difficult conversations.

ASSUMED: The most successful AI agents for productivity likely complement human judgment — they don’t replace it. They handle the information processing that scales poorly for humans, freeing up mental capacity for the decisions that require wisdom, empathy, and context.

The Future: Intelligence Infrastructure for Every Product Leader

Here’s what excites me: CONSILIUM gives me intelligence infrastructure that only VPs at Fortune 500 companies used to have.

Competitive intelligence teams. Market research analysts. Executive assistants who can synthesize information across multiple workstreams. These were luxuries for senior executives with budget and headcount.

ASSUMED: Now, any product leader can potentially build similar capabilities, though the cost-effectiveness depends on specific API pricing and usage patterns. The barrier isn’t necessarily budget — it’s knowing how to architect autonomous systems that work reliably.

This isn’t about replacing human executive assistants (they’re irreplaceable for stakeholder management and complex coordination). It’s about democratizing the analytical infrastructure that helps leaders make informed decisions.

ASSUMED: Over the next year, I’m guessing we’ll see AI agents for productivity evolve from “smart assistants” to “autonomous intelligence teams.” The winners will likely be product leaders who learn to think like systems architects — designing agent workflows, not just prompting individual AIs.

The question isn’t whether AI agents will change how product leaders work. It’s whether you’ll design those systems yourself or let someone else define the methodology.


Want to build your own AI chief of staff? Start with one agent that handles one workflow autonomously. Master the autoresearch pattern. And always flag the difference between what you’ve measured and what you’ve inferred — your future self will thank you for the intellectual honesty.

Photo by CRYSTALWEED cannabis on Unsplash

The Best AI Tools for Pastors in 2026 (From Someone Who Builds Them)

I spent 18 months building AI-adjacent features at SermonCentral. Our tools helped pastors research, prepare, and teach. During that time, I evaluated several AI platforms targeting ministry, including tools from major players like Logos and various smaller platforms. I currently lead product for a Bible-focused platform, which gives me ongoing insight into how pastors use digital tools.

So when pastors ask me about AI tools, I’m sharing what I’ve observed from both building and using these platforms in ministry contexts.

Here’s what I’ve learned: the most effective AI tools for pastors aren’t necessarily the ones with the most features. They’re the ones that understand where AI helps and where it doesn’t.

AI is moving at a rapid pace. Moore's law described transistor density doubling roughly every two years; I remember reading around 2011 that the amount of digitally stored knowledge was doubling far faster than that, and I can't even imagine the rate now. AI is advancing so quickly that anything I've written here will probably be outdated before I hit publish.

Sermon Research: Emerging AI Options

SermonAI appears to be gaining attention

SermonAI positions itself as an alternative to expensive comprehensive software packages. Based on my testing, it focuses on research assistance rather than content generation.

What it appears designed for: Cross-reference generation, outline structures, and illustration suggestions. The tool seems aimed at the research phase and helping pastors find connections between passages.

The platform costs $29 monthly.

What it doesn’t claim to do: Generate complete sermons. The positioning emphasizes research assistance rather than finished content creation.

Logos has added AI features

Logos has integrated conversational AI into their existing commentary and resource library. The advantage: it can search across resources in your existing library. The consideration: it requires an existing Logos investment.

I’ve tested both SermonAI and Logos’ AI features. Each has different strengths depending on your existing workflow and resource library.

Bible Gateway’s approach

Full disclosure: I work for Bible Gateway’s parent company. Our AI features will focus on reading comprehension for individual Bible study rather than sermon preparation, helping readers understand difficult passages rather than preparing teaching content.

Bible Study Tools: Mixed AI Integration

YouVersion Bible App

The YouVersion app has experimented with various features over time. For current AI capabilities and pricing, pastors should check directly with YouVersion rather than rely on third-party reports.

Traditional resources remain valuable

After working on AI features for ministry applications, I still observe pastors using physical commentaries and concordances for deep study. AI appears most helpful for broad research and initial connection-finding, while sustained study often benefits from traditional approaches.

Church Management: Limited AI Integration

Planning Center and similar platforms

Various church management platforms are experimenting with AI features. For specific capabilities and availability, pastors should verify directly with vendors rather than assume features exist.

ChurchTrac and scheduling optimization

Some platforms use algorithmic optimization for volunteer scheduling based on availability patterns. This represents a more straightforward application of automation technology to logistical problems.

For current features and pricing, check directly with platform providers.

Content Creation: Variable Results

Canva’s design assistance

Canva has integrated AI image generation and text suggestions. For church communications, these tools can help with graphics creation, though results vary based on specific needs.

The AI appears to handle visual design well but may struggle with theological nuance. Complex theological concepts often require human insight for appropriate visual representation.

Presentation tools

Various platforms offer AI assistance for turning outlines into slides. Results tend to be professionally formatted but may lack the contextual understanding needed for specific congregational needs.

Pastoral Perspectives on AI Usage

Based on discussions with ministry leaders, comfort levels with AI appear to vary by application:

  • Administrative tasks: Generally high comfort
  • Research assistance: Moderate to high comfort among those with theological training
  • Content structure help: Mixed comfort, varies by individual
  • Content generation: Generally low comfort due to pastoral responsibility concerns

Comfort levels likely correlate with factors like theological education, church context, and individual technology adoption patterns, though specific data would be needed to verify these relationships.

Recommendations by Context

Smaller ministry contexts:
Consider starting with research-focused tools and basic administrative automation. Budget considerations will vary based on the specific tools chosen. Claude CoWork has helped many of the ministries I know, and they seem to have smoothed out much of the onboarding process.

Larger ministry contexts:
May benefit from more comprehensive platforms, though implementation should account for staff training and congregation expectations.

All contexts:
Verify current features and pricing directly with vendors, as AI capabilities in this space evolve rapidly.

The Practical Assessment

Based on developing AI features for ministry tools: AI appears most effective at research tasks, moderately helpful for organization, and of little to no value as a replacement for pastoral judgment.

Successful implementations seem to focus on enhancing research capabilities rather than replacing pastoral decision-making. AI cannot understand congregational needs, pastoral relationships, or the contextual factors that shape ministry decisions.

The most effective approach likely involves using AI where it demonstrates clear value — information processing, research assistance, and administrative efficiency — while maintaining human oversight for theological interpretation and pastoral application.

The future probably isn’t pastors versus AI, but pastors using better research tools while preserving the relational and interpretive aspects of ministry that require human wisdom.

“The simple believe everything, but the prudent give thought to their steps.” (Proverbs 14:15, ESV) This principle applies to evaluating new technology tools as much as any other area of pastoral leadership.


Note: AI capabilities in ministry tools change rapidly. Verify current features and pricing directly with providers before making decisions. This assessment reflects observations from my experience building and testing these tools, not comprehensive market research.

Photo by Eric O. IBEKWEM on Unsplash