AI Engineering

Building Agentic AI Workflows with Anthropic Claude

Arun Kataria
December 20, 2024
โ† Back to all posts

At Intuit, I was tasked with reducing the manual effort involved in tax filing workflows. The solution? Build an agentic AI system powered by Anthropic Claude that could autonomously handle multi-step reasoning: asking questions, interpreting documents, and making filing decisions.

Here's exactly how I built it, what worked, and what I'd do differently.

✅ What you'll learn: How to architect an agentic AI pipeline with Claude, design prompts for multi-step tax workflows, handle tool use and retries, and keep costs under control in production.

What is an Agentic AI Workflow?

A standard LLM interaction is stateless: you send a prompt, you get a response. An agentic workflow is different. The model is given tools, memory, and a goal, and it figures out the steps to achieve that goal autonomously.

In our case, the goal was: given a user's tax documents and financial situation, determine the optimal filing strategy and pre-fill their return.

System Architecture

The Three Layers

💡 Architecture tip: Keep your orchestration layer decoupled from the Claude API. This lets you swap models (Claude → GPT-4) without rewriting business logic. We used an interface pattern in Spring Boot for this.
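To make the interface pattern concrete, here's a minimal sketch of what model-agnostic orchestration could look like. The names (`LlmClient`, `ClaudeClient`) are illustrative, not Intuit's actual code; the point is that business logic depends only on the interface, so swapping providers is a Spring configuration change.

```java
// Hypothetical sketch: the orchestrator depends only on this interface,
// so swapping Claude for another model means adding one implementation,
// not rewriting business logic.
interface LlmClient {
    // Send the system prompt plus conversation history; return the raw reply.
    String complete(String systemPrompt, String conversationJson);
}

// One implementation per provider; each encapsulates its own API details.
class ClaudeClient implements LlmClient {
    @Override
    public String complete(String systemPrompt, String conversationJson) {
        // A real implementation would call the Anthropic Messages API here.
        return "claude-response"; // stub for illustration
    }
}
```

In Spring Boot you would register each implementation as a bean and inject `LlmClient` wherever the orchestrator needs a model call.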

Prompt Design for Multi-Step Reasoning

The most critical part of an agentic system is the system prompt. Here's a simplified version of ours:

You are a tax filing assistant for TurboTax. Your job is to help
users complete their federal tax return accurately.

You have access to the following tools:
- get_user_documents(userId): Fetch uploaded W2, 1099, receipts
- calculate_deduction(type, amount): Compute deduction eligibility
- validate_filing_rule(rule_id, context): Check IRS compliance
- submit_draft_return(data): Save progress

Rules:
1. Always verify document data before making calculations
2. Never guess: if data is missing, call get_user_documents first
3. Explain each decision to the user in plain language
4. If confidence < 80%, ask a clarifying question before proceeding
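The prose prompt above tells Claude when to use each tool; the tools themselves are declared to the Messages API as JSON Schema objects. Here's what a declaration for one of our tools might look like, written in Anthropic's documented tools format (the description text and field details are illustrative):

```json
{
  "name": "calculate_deduction",
  "description": "Compute eligibility and amount for a given deduction type.",
  "input_schema": {
    "type": "object",
    "properties": {
      "type":   { "type": "string", "description": "Deduction type, e.g. charitable" },
      "amount": { "type": "number", "description": "Claimed amount in USD" }
    },
    "required": ["type", "amount"]
  }
}
```

A precise `input_schema` matters: the tighter the schema, the less often the model invents malformed arguments your dispatcher has to reject.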

Tool Use with Claude

Claude's tool use feature was the core of the agentic loop. When Claude decides it needs to call a tool, it returns a structured JSON response that our orchestrator executes:

// Spring Boot tool dispatcher (simplified)
public String dispatchTool(String toolName, JsonNode input) {
  return switch (toolName) {
    case "get_user_documents" -> documentService.fetch(input.get("userId").asText());
    case "calculate_deduction" -> deductionEngine.calculate(input);
    case "validate_filing_rule" -> irsRuleValidator.check(input);
    default -> throw new UnknownToolException(toolName);
  };
}
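Tool calls hit downstream services that can fail transiently, so the dispatcher shouldn't give up on the first error. Here's a hedged sketch of a retry wrapper around a tool call, with bounded attempts and exponential backoff; the helper name and backoff values are assumptions, not our production settings:

```java
import java.util.function.Supplier;

// Hypothetical retry helper: retries a tool call a bounded number of times
// before surfacing the failure (which we would then report back to Claude
// as an error tool result rather than crashing the session).
class ToolRetry {
    static String withRetry(Supplier<String> call, int maxAttempts) {
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException e) {
                last = e;
                // Exponential backoff: 100ms, 200ms, 400ms, ...
                try { Thread.sleep(100L << (attempt - 1)); }
                catch (InterruptedException ie) { Thread.currentThread().interrupt(); }
            }
        }
        throw last;
    }
}
```

Usage from the dispatcher would look like `ToolRetry.withRetry(() -> dispatchTool(toolName, input), 3)`.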

Handling the Agentic Loop

The loop runs until Claude either completes the filing or asks the user a question it can't answer itself:

  1. Send user context + conversation history to Claude
  2. Claude responds โ€” either with tool calls or a user-facing message
  3. If tool calls: execute, append results, loop back to step 1
  4. If user message: stream to frontend, wait for user input
  5. Repeat until filing is complete or max iterations reached
โš ๏ธ Always set a max iteration limitWithout one, a poorly prompted agent can loop indefinitely โ€” burning tokens and money. We set max 12 iterations per session and logged anything that hit the limit for manual review.

Results

Metric                          Before    After     Improvement
Manual review time per return   18 min    4 min     78% reduction
Filing accuracy                 91%       97%       +6 points
User drop-off rate              34%       21%       38% reduction
Avg tokens per session          n/a       ~4,200    Baseline

Key Lessons

Note: Code examples are simplified for clarity and don't represent Intuit's actual implementation. Architecture patterns are based on general agentic AI design principles.
