Anthropic's 10 Claude Agent Templates for Financial Services

Anthropic shipped ten ready-to-run Claude agent templates targeting finance workflows—here's what they do and how to deploy them.

What Anthropic Released: 10 Templates, Two Deployment Modes

On May 5, 2026, Anthropic published ten ready-to-run agent templates for financial services, the company's most concentrated push into enterprise finance to date. The release was timed to a private event in New York attended by banks, asset managers, and hedge funds, and came one day after Anthropic disclosed a reported $1.5 billion joint venture with Goldman Sachs, Blackstone, and Hellman & Friedman. Each template is a packaged reference architecture rather than a proof of concept, intended to reduce the integration work that usually sits between a model demo and a governed enterprise workflow.

Quick Answer: Anthropic released 10 Claude agent templates for financial services on May 5, 2026, covering pitch building, KYC screening, GL reconciliation, and more. Each template ships as a plugin for Claude Cowork or Claude Code, or as a cookbook for Claude Managed Agents (public beta). Anthropic recommends Claude Opus 4.7, which scored 64.37% on the Vals AI Finance Agent benchmark.

Each template bundles three components: skills (domain-specific task logic and instructions), connectors (governed access to financial data sources), and subagents (additional Claude model instances that handle specialized subtasks). This three-component structure is Anthropic's architectural pattern for multi-step financial workflows: skills define what the agent does, connectors define what data it can reach, and subagents take on specialized subtasks so that no single context window has to hold the whole job.
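
One way to picture the skills, connectors, and subagents split is as a plain data structure. The sketch below is illustrative only: the class names, the `fetch` method, and the sample Pitch Builder contents are assumptions made for this example, not Anthropic's template format or API.

```python
from dataclasses import dataclass, field
from typing import Protocol


class Connector(Protocol):
    """Hypothetical interface: governed access to one data source."""
    name: str

    def fetch(self, query: str) -> dict: ...


@dataclass
class Skill:
    """Domain-specific task logic, expressed here as plain instructions."""
    name: str
    instructions: str


@dataclass
class Subagent:
    """A separate model instance that owns one specialized subtask."""
    name: str
    task: str


@dataclass
class AgentTemplate:
    """Bundle of the three component types described in the text."""
    name: str
    skills: list[Skill] = field(default_factory=list)
    connectors: list[Connector] = field(default_factory=list)
    subagents: list[Subagent] = field(default_factory=list)

    def describe(self) -> str:
        return (
            f"{self.name}: {len(self.skills)} skill(s), "
            f"{len(self.connectors)} connector(s), "
            f"{len(self.subagents)} subagent(s)"
        )


@dataclass
class StaticConnector:
    """Stand-in connector that returns canned data, for illustration only."""
    name: str
    data: dict

    def fetch(self, query: str) -> dict:
        return {"query": query, "result": self.data.get(query)}


# Invented sample contents; a real template would define its own components.
pitch_builder = AgentTemplate(
    name="Pitch Builder",
    skills=[Skill("run_comps", "Select comparable companies and compute multiples.")],
    connectors=[StaticConnector("demo_market_data", {"ACME": {"ev_ebitda": 9.5}})],
    subagents=[Subagent("deck_drafter", "Draft pitchbook slides from the comps table.")],
)
print(pitch_builder.describe())
```

Because `AgentTemplate` depends only on the `Connector` interface, an internal data source could stand in for a third-party one without touching the skill definitions, which is the kind of substitution the component model is meant to allow.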

Every template supports two deployment modes. Plugin mode drops the agent into an analyst's existing desktop workflow through Claude Cowork or Claude Code, and is available immediately on all paid plans with no infrastructure changes. Cookbook mode runs on Claude Managed Agents, part of Claude Platform and currently in public beta, and is the path for long-running autonomous tasks such as processing an entire book of deals end-to-end or running nightly reconciliation schedules. Managed Agent deployments add per-tool permissions management, managed credential vaults, and full audit logs in Claude Console. According to Anthropic's financial services webinar, the long-running mode is designed for book-of-deals processing and overnight scheduled runs that exceed interactive session durations.

For developers evaluating which mode fits their organization: plugin mode carries minimal operational surface area and is available today without compliance review of autonomous execution. Managed Agent mode exposes more control infrastructure — credential vaults, audit logs, per-tool permissions — but carries beta-status risk on API stability and SLA commitments before GA. The choice maps to workflow duration, compliance posture, and organizational tolerance for upstream API changes.

The 10 Templates: Capabilities and Use Cases

The ten templates divide into two functional groups. Research & Client Coverage templates automate the information-gathering and synthesis work that precedes client interaction: building pitchbooks, preparing pre-call briefs, updating financial models from earnings filings, and synthesizing broker research across sectors. Finance & Operations templates handle the reconciliation, compliance, and reporting workflows that run after transactions close, such as general ledger reconciliation, month-end close, and KYC screening. The split reflects where Anthropic found the highest-friction, most document-intensive tasks across its financial services customer base, and is why the connector ecosystem spans both market data providers and back-office data sources.

| Template | Domain | Primary Task | Key Output |
|---|---|---|---|
| Pitch Builder | Research & Client Coverage | Create target lists, run comps, draft pitchbooks | Pitchbook deck with comparable analysis |
| Meeting Preparer | Research & Client Coverage | Assemble pre-call research briefs | Structured meeting brief |
| Earnings Reviewer | Research & Client Coverage | Read transcripts and filings, update models, flag thesis changes | Model updates + thesis-change flags |
| Model Builder | Research & Client Coverage | Create and maintain financial models from filings and data feeds | Maintained financial model |
| Market Researcher | Research & Client Coverage | Synthesize news, filings, and broker research across sectors | Sector research synthesis |
| Valuation Reviewer | Finance & Operations | Review valuations for consistency and methodology | Valuation review report |
| GL Reconciler | Finance & Operations | Reconcile general ledger entries against source records | Reconciliation exception report |
| Month-End Closer | Finance & Operations | Handle NAV calculations and account reconciliations | Month-end close package |
| Statement Auditor | Finance & Operations | Review financial statements for internal consistency | Consistency flags and audit notes |
| KYC Screener | Finance & Operations | Package compliance materials, apply AML rules | Structured JSON: risk ratings, document requirements, escalation reasons |

Three templates warrant closer attention for developers evaluating integration scope. The KYC screener outputs structured JSON containing risk ratings, required document lists, and escalation reasons, designed for downstream system ingestion rather than human reading. This positions it as a component in a larger compliance pipeline rather than a standalone tool: the output is meant to feed into case management or onboarding platforms. That architectural assumption matters if your firm has a custom AML workflow. Validate the JSON schema against downstream system expectations before deploying, because schema drift between the template output and your ingestion layer is the most likely integration failure point.
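
A drift check at the ingestion boundary can be quite small. The following is a minimal sketch using only the Python standard library. The field names (`risk_rating`, `required_documents`, `escalation_reasons`) and the allowed rating values are assumptions chosen to mirror the three outputs named above; the template's actual schema should be taken from its documentation.

```python
import json

# Hypothetical contract: field names and allowed values are assumptions,
# not the template's published schema.
EXPECTED_FIELDS = {
    "risk_rating": str,
    "required_documents": list,
    "escalation_reasons": list,
}
ALLOWED_RATINGS = {"low", "medium", "high"}


def validate_kyc_payload(raw: str) -> list[str]:
    """Return a list of problems; an empty list means the payload conforms."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(payload, dict):
        return ["top-level value must be an object"]

    problems = []
    for name, expected_type in EXPECTED_FIELDS.items():
        if name not in payload:
            problems.append(f"missing field: {name}")
        elif not isinstance(payload[name], expected_type):
            problems.append(
                f"wrong type for {name}: expected {expected_type.__name__}"
            )
    # Unknown fields are a common sign of upstream schema drift.
    for name in sorted(set(payload) - set(EXPECTED_FIELDS)):
        problems.append(f"unexpected field: {name}")
    rating = payload.get("risk_rating")
    if isinstance(rating, str) and rating not in ALLOWED_RATINGS:
        problems.append(f"unknown risk_rating: {rating}")
    return problems
```

Running a check like this on every payload before it reaches case management turns a silent schema change into an explicit, loggable rejection.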

The month-end closer handles NAV calculations and account reconciliations, tasks that are procedurally well-defined but highly sensitive to data quality. The earnings reviewer reads transcripts and filings, updates financial models, and flags thesis changes. For research teams, this is the template most likely to overlap with existing Bloomberg or Refinitiv workflows. The practical question is how to resolve conflicts when the agent's model update disagrees with an analyst's existing assumptions, a workflow design decision that the template architecture leaves to the deploying team.

The cookbook structure means these templates are extensible. Each component — skills, connectors, subagents — can be replaced or extended. A team with internal data infrastructure can substitute a custom connector for a third-party one while keeping the skill logic intact. This is the design intent of the cookbook format on Claude Platform: deploy as-is in days, extend the component structure for firm-specific requirements afterward.

Deployment Architecture: Plugin vs. Managed Agent

Plugin mode and Managed Agent mode share the same template logic but differ substantially in where state lives, who manages permissions, and what the compliance surface area looks like. Plugin mode runs the agent inline with an analyst's desktop session in Claude Cowork or Claude Code, with no additional infrastructure, no credential management overhead, and no configuration beyond authentication on a paid plan. The agent exists within the conversation context. When the session ends, the agent run ends. This is the appropriate starting point for teams that want to evaluate template behavior before committing to a production deployment path.

Managed Agent mode changes the operational model significantly. Templates deployed as cookbooks on Claude Platform run autonomously: they can process a full book of deals end-to-end, execute on a nightly schedule, and chain multiple tool calls without human presence at each step. The platform provides per-tool permissions management, so each connector and subagent call requires an explicit permission grant in Claude Console rather than a blanket API key. Credential vaults handle secrets management. Full audit logs expose every tool call and decision in a format designed for compliance teams to inspect post-hoc, independent of the engineer who built the workflow.

"We want to close the gap between how fast the models are improving and how quickly financial services teams are actually able to put AI into production," — Jonathan Pelosi, Head of Financial Services at Anthropic

The audit log design reflects a deliberate architectural decision: compliance teams need to reconstruct exactly what data the agent accessed, which tools it called in which sequence, and what decisions it made — without relying on application-level logs that require custom tooling to parse. The logs are first-class objects in Claude Console, not a side effect of execution. This is the infrastructure that makes autonomous agent deployment viable in regulated financial environments, assuming the schema remains stable post-beta.
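
To make the reconstruction requirement concrete, here is a minimal sketch of rebuilding one run's tool-call sequence from exported log records. The record fields (`run_id`, `ts`, `tool`) and the sample entries are invented for illustration; the actual Claude Console log schema is not described in the source and may differ, particularly while the platform is in beta.

```python
from datetime import datetime

# Hypothetical log records; field names are assumptions, not the Console schema.
records = [
    {"run_id": "r1", "ts": "2026-05-06T02:00:05+00:00", "tool": "ledger.read"},
    {"run_id": "r2", "ts": "2026-05-06T02:00:01+00:00", "tool": "kyc.lookup"},
    {"run_id": "r1", "ts": "2026-05-06T02:00:01+00:00", "tool": "bank_feed.read"},
    {"run_id": "r1", "ts": "2026-05-06T02:00:09+00:00", "tool": "report.write"},
]


def tool_sequence(log: list, run_id: str) -> list:
    """Return the tools one run called, ordered by timestamp."""
    run = [r for r in log if r["run_id"] == run_id]
    run.sort(key=lambda r: datetime.fromisoformat(r["ts"]))
    return [r["tool"] for r in run]


print(tool_sequence(records, "r1"))
```

Keeping this kind of query independent of application-level logging is what lets a compliance reviewer answer the question of what a run touched, and in what order, without involving the workflow's author.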

Human-in-the-loop is a structural product requirement across both modes. According to Anthropic's announcement, Claude prepares and drafts at every stage, but nothing is sent to a client, filed with a regulator, or acted upon in a downstream system without explicit human sign-off. This is not a configuration toggle: it is built into the template architecture and ties directly to the benchmark accuracy discussed in the next section.
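
Teams wiring template output into their own downstream systems may want to mirror that constraint in code. The sketch below shows one possible shape for a sign-off gate. `Draft`, `release`, and `ApprovalRequired` are hypothetical names for this example and say nothing about how the templates enforce the rule internally.

```python
from dataclasses import dataclass
from typing import Optional


class ApprovalRequired(Exception):
    """Raised when a draft is released without a recorded sign-off."""


@dataclass
class Draft:
    content: str
    approved_by: Optional[str] = None  # reviewer identity, set on sign-off

    def approve(self, reviewer: str) -> None:
        if not reviewer.strip():
            raise ValueError("reviewer must be identified")
        self.approved_by = reviewer


def release(draft: Draft, sent: list) -> None:
    """Deliver a draft downstream only if a named human approved it."""
    if draft.approved_by is None:
        raise ApprovalRequired("draft has no human sign-off")
    sent.append((draft.content, draft.approved_by))
```

Recording the reviewer alongside the released content means the approval is part of the delivered record rather than something inferred afterwards.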

For developer teams, the practical deployment decision maps to organizational readiness. Plugin mode requires no new infrastructure approvals. Managed Agent mode requires security review of the credential vault model, validation of connector permission scopes against data-use agreements, and a compliance conversation about autonomous multi-step execution before any production sign-off. Start with plugin; escalate to Managed Agent when the use case genuinely requires autonomous long-running execution and the compliance review is complete.

Recommended Model and Benchmark Reality Check

Anthropic recommends Claude Opus 4.7 for financial agent workflows and cites a score of 64.37% on Vals AI's Finance Agent benchmark as evidence of industry-leading performance. The benchmark tests multi-step financial reasoning, including analysis of filings, model updates, and compliance screening. A 64% score means the model completes roughly two in three benchmark tasks correctly. Framed as a competitive result, that is Anthropic's strongest point. Framed as an operational failure rate, the picture is different.

"The corollary to a 64.37% score — roughly a 36% failure rate — would be disqualifying for a human professional performing equivalent tasks," — The Register, May 5, 2026

The ~36% task failure rate under controlled benchmark conditions is the primary reason the mandatory human-review layer is a product-level structural requirement rather than a configuration preference. Deploying these templates without human checkpoints for high-stakes outputs (client-facing documents, regulatory filings, AML decisions) would create unacceptable operational risk. The benchmark score makes the case for the product; the failure rate explains the architecture.

For developers, the accuracy number has three workflow-design implications. First, human-in-the-loop is not a soft guideline: it needs to be engineered into every production workflow, with defined owners for review gates and an understanding of the latency it adds to previously automated processes. Second, in high-volume workflows like nightly GL reconciliation or batch KYC screening, an exception rate anywhere near the benchmark's ~36% failure rate means review queues need to be dimensioned for volume, not edge-case handling. Third, benchmark scores measure performance on a defined task set; whether those tasks map to your firm's specific workflow inputs is an empirical question that only production calibration can answer.
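
The queue-sizing point is simple arithmetic, sketched below. All inputs are illustrative assumptions: the batch size, minutes per review, and reviewer time are invented, and the benchmark failure rate is used only as a stand-in until a firm has measured its own exception rate.

```python
import math


def review_capacity(items_per_run: int, exception_rate: float,
                    minutes_per_review: float, reviewer_minutes: float) -> dict:
    """Estimate expected exceptions and reviewers needed for one batch run."""
    if not 0.0 <= exception_rate <= 1.0:
        raise ValueError("exception_rate must be between 0 and 1")
    expected = items_per_run * exception_rate
    total_minutes = expected * minutes_per_review
    # Round up: a fractional reviewer still means one more person on the queue.
    reviewers = math.ceil(total_minutes / reviewer_minutes)
    return {
        "expected_exceptions": expected,
        "review_minutes": total_minutes,
        "reviewers_needed": reviewers,
    }


# Illustrative inputs: 500 items per night, 6 minutes per review,
# 240 review minutes available per reviewer, benchmark failure rate as proxy.
plan = review_capacity(500, 1 - 0.6437,
                       minutes_per_review=6, reviewer_minutes=240)
print(plan)
```

With these made-up inputs the estimate is about 178 exceptions and five reviewers per run, which shows how quickly review load scales once the exception rate is a double-digit percentage.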

Anthropic's recommendation of Opus 4.7 is the correct default starting point. Teams should plan for a calibration period — running the agent against historical data, measuring actual task completion rates on firm-specific inputs, and tuning prompts or workflow branching logic before treating any benchmark figure as a production accuracy guarantee.
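
During calibration, a raw completion rate on a few hundred historical cases carries real sampling uncertainty. One standard way to express it is a Wilson score interval, sketched below. The choice of this interval and the sample counts are assumptions for illustration, not something the source prescribes.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple:
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need trials > 0 and 0 <= successes <= trials")
    p = successes / trials
    denom = 1 + z * z / trials
    centre = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(
        p * (1 - p) / trials + z * z / (4 * trials * trials)
    ) / denom
    return (centre - half, centre + half)


# Hypothetical calibration run: 128 of 200 historical tasks judged correct.
low, high = wilson_interval(128, 200)
print(f"observed 0.64, plausible range {low:.2f} to {high:.2f}")
```

For this hypothetical sample the interval spans roughly 0.57 to 0.70, wide enough that a 200-case pilot cannot distinguish "matches the benchmark" from "noticeably worse", which is an argument for larger calibration sets on high-stakes workflows.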

Data Connectors: Existing and New

The template release expanded the connector ecosystem from 8 to 16 data providers, adding eight new integrations alongside the existing set. Connectors use a governed access pattern: agents receive scoped permissions per template rather than raw API keys. A pitch builder cannot call connectors outside its defined scope even if the underlying credential technically has broader access. This matters for firms with data licensing restrictions, because connector governance is part of the control model, not a convenience abstraction.
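
The scoping idea can be sketched as a deny-by-default lookup performed before any data call. The scope table, the template and connector identifiers, and `call_connector` are all hypothetical; which connectors each real template is granted is not stated in the source.

```python
class ScopeError(PermissionError):
    """Raised when a template calls a connector outside its granted scope."""


# Hypothetical scope table; the template-to-connector grants are invented.
TEMPLATE_SCOPES = {
    "pitch_builder": {"factset", "pitchbook"},
    "kyc_screener": {"dun_bradstreet"},
}


def call_connector(template: str, connector: str, query: str) -> str:
    """Check the template's scope before dispatching; deny by default."""
    allowed = TEMPLATE_SCOPES.get(template, set())
    if connector not in allowed:
        raise ScopeError(f"{template} is not scoped for {connector}")
    return f"{connector} <- {query}"  # placeholder for the real data call
```

The check is made against the template's grant rather than the credential's reach, so a broadly licensed credential does not widen what any one template can touch.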

| Connector | Status | Primary Data Category |
|---|---|---|
| FactSet | Existing | Market data, analytics, financial content |
| S&P Capital IQ | Existing | Company financials, M&A data, credit research |
| MSCI | Existing | Risk analytics, ESG data, indexes |
| PitchBook | Existing | Private market data, VC/PE deal flow |
| Morningstar | Existing | Fund data, equity research |
| Chronograph | Existing | Private equity portfolio monitoring |
| LSEG | Existing | Market data, news, financial analytics |
| Daloopa | Existing | AI-extracted financial model data |
| Dun & Bradstreet | New (2026-05) | Business credit, commercial risk |
| Fiscal AI | New (2026-05) | AI-native financial data |
| Financial Modeling Prep | New (2026-05) | Financial statements, ratios, market data |
| Guidepoint | New (2026-05) | Expert network, primary research |
| IBISWorld | New (2026-05) | Industry research, market sizing |
| SS&C Intralinks | New (2026-05) | Deal rooms, document exchange |
| Third Bridge | New (2026-05) | Expert network, sector intelligence |
| Verisk | New (2026-05) | Insurance analytics, risk data |

A new Moody's MCP app is part of the release, providing credit data on more than 600 million public and private companies. The Moody's integration is notable because it brings private company credit coverage at a scale that most other connectors in the set do not match; the existing ecosystem focuses predominantly on public markets or PE/VC datasets. For KYC and credit risk workflows, this is likely the most impactful single addition in the release.

The new connectors fill specific gaps by use case. Guidepoint and Third Bridge add expert network access for primary research and market researcher workflows. SS&C Intralinks brings deal-room document access, which maps directly to M&A due diligence scenarios. IBISWorld adds industry-level market sizing for pitch builder and market researcher context. Dun & Bradstreet's commercial credit data covers the business risk screening dimension that the KYC screener needs beyond individual AML checks.

For developers: governed access relocates credential management to Anthropic's platform, but it does not eliminate data licensing compliance obligations. FactSet, PitchBook, and Capital IQ licenses commonly restrict automated bulk extraction, redistribution, and use in AI pipelines. Validate specific license terms against the access patterns each template will generate before production deployment; this is a legal and compliance question that sits outside the platform's governance model. As Finextra noted, enterprise adoption will depend substantially on whether connector governance passes internal risk and compliance reviews at each firm.

Microsoft 365 Integration and Context Persistence

Claude add-ins for Microsoft Excel, PowerPoint, and Word are generally available as of this release; an Outlook add-in is described as coming soon. GA status for Excel, Word, and PowerPoint removes the caveats that typically come with add-in previews. These are positioned as production-ready integrations for analyst desktop workflows, covering the three core productivity apps where most financial modeling and document work actually happens.

The defining feature of the M365 integration is cross-application context persistence. Work started in an Excel model carries forward automatically into a PowerPoint deck, so the analyst does not need to re-explain the underlying financial data when switching applications. This removes the copy-paste and re-explanation overhead that typically creates friction between the modeling phase in Excel and the presentation phase in PowerPoint. The model built in Excel becomes the context the agent uses when drafting deck content.

From a developer integration standpoint, context persistence across M365 apps is a session-management capability: the Claude add-in maintains state across Office application boundaries. This has implications for multi-tool workflow sequences. A financial model review that starts in Excel (model audit), continues in Word (write-up draft), and finishes in PowerPoint (client summary) can now be one continuous Claude session rather than three separate conversations with manually re-introduced context at each transition step.

The Outlook gap matters for email-triggered workflows. Many financial processes are initiated by email: a client sends a term sheet request, an analyst receives an earnings release, a compliance officer gets a KYC request. Without Outlook at GA, email-driven workflow triggers still require manual handoff from Outlook into one of the other M365 apps or into Claude Cowork directly. Anthropic has flagged the Outlook add-in as coming soon, but the timeline is not specified in available documentation. For teams designing email-triggered workflows, treat Outlook integration as a dependency to track before finalizing workflow architecture.

Enterprise Adoption and Business Context

Anthropic named Citadel, FIS, BNY, Carlyle, Mizuho, and Travelers as existing customers using Claude in financial services workflows. Reuters and Bloomberg reporting added Goldman Sachs, Visa, Citi, and AIG to the customer or active evaluation set. According to Markets Media, financial institutions represent approximately 40% of Anthropic's top 50 customers, a concentration that explains why this release is a sustained product investment, not a vertical experiment.

"Finance is a great blueprint for the rest of knowledge work," — Nicholas Lin, Head of Product for Financial Services at Anthropic, as reported by Tech.co

The timing relative to the Goldman Sachs joint venture is not incidental. The reported $1.5 billion JV with Goldman Sachs, Blackstone, and Hellman & Friedman — disclosed one day before the template release — signals a go-to-market push backed by capital partners with direct influence over the target customer base. JV structures in enterprise software typically include co-selling arrangements, reference customer pipelines, and joint product roadmap input. The template release is partly a product launch and partly a commercial signal to the same banks and funds that were in the room at the New York event.

For developers, the enterprise adoption context predicts roadmap priorities. If 40% of Anthropic's top 50 customers are financial institutions, the connector ecosystem, compliance features, and audit capabilities will receive sustained engineering investment. The current template set reflects the most common use cases across that customer base — which means the templates are reasonably battle-tested at the design level, even if individual deployments require calibration. Named customers like Citadel and Carlyle have non-trivial data governance requirements, which provides implicit validation that the connector governance model has been reviewed by firms with serious security and compliance postures.

The reference architecture model also has a direct commercial implication for integration work. Instead of custom scoping discussions for every new financial deployment, implementation teams can start from a working template that already covers the target use case. That reduces specification risk and compresses the time from initial scoping to working prototype. The component structure is defined, the connector integrations are documented, and the human-review model is built in — what remains is firm-specific calibration and compliance review, not architecture from scratch.

What Developers Should Evaluate Before Adopting

Before deploying any of these templates in a financial services production environment, four areas require systematic evaluation. That is not because the templates are poorly designed, but because enterprise financial context introduces constraints that reference architectures cannot resolve by themselves. According to Markets Media, enterprise adoption will hinge on whether connector governance passes internal risk and compliance reviews, whether accuracy is measurable enough to satisfy audit requirements, and whether integrations fit existing data lineage and approval processes.

Managed Agents beta status. Claude Managed Agents is in public beta. The audit API surface, SLA commitments, and tool-call log schema definitions may change before GA. Compliance teams that need stable, auditable API contracts — common requirements for financial institutions — face version drift risk on beta infrastructure. Plugin mode, which runs in session context without the Managed Agents platform, is lower risk for initial adoption. Reserve the Managed Agent path for workflows where autonomous long-running execution is genuinely required and your organization can absorb upstream API changes before the platform reaches GA.

Connector data-use agreements. The connector governance model handles credential scoping, but it does not replace your firm's data licensing obligations. FactSet, PitchBook, and Capital IQ licenses commonly restrict automated bulk extraction and redistribution. Before connecting these sources to agent templates, validate the specific license terms against the access patterns the templates will generate. This is a legal and compliance question, not just a technical one, and it needs firm answers before production deployment. The fact that Anthropic's platform manages the credentials does not transfer the licensing compliance obligation.

The 64% benchmark score in production context. A 64.37% benchmark score means roughly one in three tasks fails under controlled test conditions. It is a measured result on a defined task set, not a guaranteed floor for firm-specific inputs. Design human-review checkpoints on the assumption that exceptions requiring human attention will be common, not exceptional. High-volume workflows such as nightly GL reconciliation and batch KYC screening need review queues dimensioned for significant exception volume from day one. Discovering this mid-production is a preventable design failure.

Plugin mode as the lower-risk first step. For most organizations, plugin mode in Claude Cowork or Claude Code is the correct initial deployment. Minimal infrastructure footprint, no new compliance review for autonomous execution, and direct analyst experience with template behavior before committing to a Managed Agent architecture. Use the plugin phase to measure actual accuracy on firm-specific data, identify which human-review checkpoints are genuinely necessary versus unnecessary friction, and build compliance team confidence before escalating to autonomous production deployment.

Frequently Asked Questions

What is the difference between the plugin mode and Managed Agent mode for Claude finance templates?

Plugin mode runs the Claude finance agent inline within an analyst's existing desktop session in Claude Cowork or Claude Code. It is available immediately on all paid Claude plans and requires no infrastructure changes; the agent operates within the conversation context and stops when the session ends. Managed Agent mode, available in public beta on Claude Platform, runs templates autonomously for long-running tasks such as processing a full book of deals or executing nightly scheduled reconciliations. Managed Agent deployments include per-tool permissions management, managed credential vaults, and full audit logs in Claude Console for post-hoc compliance inspection. Plugin mode is lower-risk for initial evaluation; Managed Agent mode adds autonomy and auditability but carries beta-stage API instability risk before general availability.

Which Claude model does Anthropic recommend for financial agent workflows?

Anthropic recommends Claude Opus 4.7 for financial services agent workflows. The model scored 64.37% on Vals AI's Finance Agent benchmark, which Anthropic describes as industry-leading among evaluated models. The practical implication of that score is a roughly 36% task failure rate under benchmark conditions, which is the primary reason human-in-the-loop review is a structural product requirement for every template, not an optional configuration. Teams should treat the benchmark figure as a starting reference, run calibration tests on their own firm-specific data, and design human-review checkpoints assuming exception volumes consistent with a ~36% error rate in high-throughput workflows.

What data connectors are available for Claude financial agents?

The May 2026 release expanded the connector ecosystem to 16 providers in total. The eight existing connectors are FactSet, S&P Capital IQ, MSCI, PitchBook, Morningstar, Chronograph, LSEG, and Daloopa. The eight new connectors added with this release are Dun & Bradstreet, Fiscal AI, Financial Modeling Prep, Guidepoint, IBISWorld, SS&C Intralinks, Third Bridge, and Verisk. A new Moody's MCP app covering more than 600 million public and private companies was also announced. All connectors use governed, scoped access patterns: agents receive per-template permissions rather than raw API keys. Data licensing compliance for sources such as FactSet, PitchBook, and Capital IQ remains the responsibility of the deploying organization.

Do Claude finance agents act autonomously or is human review required?

Human review is a structural requirement built into every template, not a configuration option. According to Anthropic's announcement, Claude prepares drafts and completes analytical tasks, but nothing is sent to a client, filed with a regulator, or acted upon in a downstream system without explicit human sign-off. This constraint is embedded in the template architecture itself. The rationale connects directly to model accuracy: at roughly 36% task failure in controlled benchmark conditions, autonomous final action without human review creates unacceptable operational and compliance risk for financial workflows. In Managed Agent mode, audit logs allow compliance teams to verify that human sign-off occurred and reconstruct the full decision chain post-hoc.

Can I build on these templates or are they fixed configurations?

The templates are reference architectures: starting points for rapid deployment, not sealed configurations. Each template is built from three separable components: skills (task logic and instructions), connectors (data access), and subagents (specialized Claude instances). The cookbook format on Claude Platform is explicitly designed for teams to extend and customize the component structure for firm-specific workflows. A team can replace a connector with an internal data source, extend the skills logic to incorporate firm-specific rules, or add subagents for tasks the base template does not cover. According to Anthropic's documentation, the design intent is to deploy as-is within days, with the component structure available for custom extension beyond that baseline.

What to Watch: Beta Risks, Accuracy, and the Connector Compliance Gap

The ten finance agent templates are a well-scoped, commercially motivated release that addresses real workflow friction in financial services. The connector ecosystem is broad enough to cover most research and operations use cases. The two deployment modes give organizations a sensible adoption ladder from low-risk plugin use to autonomous Managed Agent deployments. The M365 GA integrations reduce friction for analyst desktop workflows that already live in Excel and PowerPoint.

Three open questions will determine whether this release leads to widespread production deployments or stays in extended pilot territory. First, Managed Agents' public beta status means the compliance infrastructure — audit API contracts, SLA terms, log schema stability — may shift before GA. Organizations with formal change management requirements will need to make a deliberate decision about when to commit to production on beta-stage infrastructure. Second, connector governance handles credential management, but data licensing compliance for FactSet, PitchBook, and Capital IQ remains an open question that each firm must resolve independently. Third, the 64.37% Vals AI benchmark accuracy figure will need to hold up against firm-specific task distributions, not just controlled benchmark conditions — and the calibration work to validate that is the firm's responsibility, not Anthropic's.

For developers building on these templates: start with plugin mode, measure accuracy against real firm data, design review queues for meaningful exception volumes, and validate connector data agreements before committing to Managed Agent production deployment. The template architecture is sound and the component model is extensible. The operational unknowns are well-defined. The timeline for resolving them depends on Anthropic's GA roadmap for Managed Agents and each firm's internal compliance review cycle.

Last updated: 2026-05-26. Based on Anthropic's May 5, 2026 announcement and contemporaneous reporting from The Register, Markets Media, and Tech.co. Managed Agents public beta status and connector availability are subject to change as the platform moves toward general availability.
