How to Add Audit Trails to CrewAI Agents

CrewAI is built for multi-agent collaboration — you define agents with roles, assign tasks, and let them work together. That's powerful in production, but it also means decisions are distributed across multiple agents. When your compliance officer asks "which agent decided what, and in what order?", you need more than console output.

You need an audit trail that captures every agent's contribution, links them into a single session, and proves nothing was altered after the fact.

The Multi-Agent Audit Problem

Single-agent audit trails are relatively straightforward: one agent, one decision chain. CrewAI introduces complexity:

Multiple agents make decisions within the same crew execution
Task delegation means Agent A might trigger work in Agent B
Sequential and parallel workflows create branching decision trees
Agent-to-agent communication passes context that affects downstream decisions

Standard logging captures individual events but loses the relationships between them. You get a flat list of log lines with no structure showing that the Researcher agent's output became the Writer agent's input, which the Editor agent then revised.

A proper audit trail for CrewAI must capture the crew-level session, individual agent actions, task assignments and completions, and the data flow between agents — all in a single, verifiable chain.
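To make those requirements concrete, here is a minimal sketch of what one record in such a trail might carry. The `TraceRecord` shape, its field names, and the `makeTrace` helper are assumptions made for illustration; they are not the AgentTraceHQ schema.

```typescript
// Hypothetical record shape for illustration only; not the AgentTraceHQ schema.
type TraceKind =
  | "crew_start"
  | "task_start"
  | "tool_call"
  | "llm_call"
  | "task_complete"
  | "crew_complete";

interface TraceRecord {
  traceId: string;
  sessionId: string; // one per crew execution
  parentTraceId: string | null; // upstream trace whose output fed this one
  agentRole: string;
  kind: TraceKind;
  input: string | null;
  output: string | null;
  timestamp: string; // ISO 8601
}

// Fills in the id, session, and timestamp so callers only describe the event.
function makeTrace(
  sessionId: string,
  seq: number,
  fields: Omit<TraceRecord, "traceId" | "sessionId" | "timestamp">,
): TraceRecord {
  return {
    traceId: `${sessionId}-${seq}`,
    sessionId,
    timestamp: new Date().toISOString(),
    ...fields,
  };
}

const sessionId = "demo-session";

const research = makeTrace(sessionId, 1, {
  parentTraceId: null,
  agentRole: "Researcher",
  kind: "task_complete",
  input: null,
  output: "research report",
});

// The Writer's trace points back at the Researcher's trace and records its output as input.
const writing = makeTrace(sessionId, 2, {
  parentTraceId: research.traceId,
  agentRole: "Writer",
  kind: "task_complete",
  input: research.output,
  output: "draft article",
});
```

The `parentTraceId` field is what a flat log lacks: it records that the Writer's input came from the Researcher's output, so the relationship survives alongside the events themselves.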

Tutorial: Add AgentTraceHQ to a CrewAI Crew

Prerequisites

Node.js 18+ for the JavaScript examples below (CrewAI itself is Python-native and needs Python 3.10+, but the patterns are identical)
A CrewAI project with agents and tasks defined
An AgentTraceHQ account (free tier — 10K traces/month)

Install the SDK

npm install @agenttracehq/sdk crewai

Initialize and Configure

import { AgentTraceHQ, CrewAIHandler } from "@agenttracehq/sdk";

const athq = new AgentTraceHQ({
  apiKey: process.env.AGENTTRACEHQ_API_KEY,
  agentId: "research-crew",
});

const handler = new CrewAIHandler(athq, {
  captureIO: true,
  captureReasoning: true,
  traceTaskDelegation: true, // Capture when tasks are delegated between agents
});
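Because the API key comes from the environment, a missing variable would otherwise surface only when the first trace is sent. One option is to fail fast at startup. The `requireEnv` helper below is a hypothetical utility written for this sketch, not part of any SDK.

```typescript
// Hypothetical helper, not part of any SDK: return a required variable or throw early.
function requireEnv(
  name: string,
  env: Record<string, string | undefined>,
): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// In a real project you would pass process.env; a literal object keeps this sketch self-contained.
const demoEnv: Record<string, string | undefined> = {
  AGENTTRACEHQ_API_KEY: "demo-key",
};

const apiKey = requireEnv("AGENTTRACEHQ_API_KEY", demoEnv);
```

The returned `apiKey` is typed as `string` rather than `string | undefined`, so it can be passed to a constructor without a type assertion.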

Full Working Example: Financial Research Crew

Here's a complete CrewAI crew with three agents that researches a stock, writes an analysis, and reviews it — all traced through AgentTraceHQ:

import { Crew, Agent, Task, Process } from "crewai";
import { ChatOpenAI } from "@langchain/openai";
import { AgentTraceHQ, CrewAIHandler } from "@agenttracehq/sdk";

// 1. Initialize AgentTraceHQ
const athq = new AgentTraceHQ({
  apiKey: process.env.AGENTTRACEHQ_API_KEY,
  agentId: "financial-research-crew",
});

const handler = new CrewAIHandler(athq, {
  captureIO: true,
  captureReasoning: true,
  traceTaskDelegation: true,
});

// 2. Define agents
const researcher = new Agent({
  role: "Senior Financial Researcher",
  goal: "Gather and analyze financial data for investment decisions",
  backstory: `You are an experienced financial analyst with 15 years of
    experience in equity research. You specialize in technology stocks
    and have a track record of identifying undervalued companies.`,
  llm: new ChatOpenAI({ modelName: "gpt-4o", temperature: 0 }),
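  // stockDataTool, newsFeedTool, and secFilingsTool are assumed to be defined elsewhere in your project
  tools: [stockDataTool, newsFeedTool, secFilingsTool],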
});

const analyst = new Agent({
  role: "Investment Analyst",
  goal: "Synthesize research into actionable investment recommendations",
  backstory: `You are a CFA charterholder who translates complex financial
    data into clear buy/hold/sell recommendations with specific price
    targets and risk assessments.`,
  llm: new ChatOpenAI({ modelName: "gpt-4o", temperature: 0 }),
});

const complianceReviewer = new Agent({
  role: "Compliance Reviewer",
  goal: "Review investment recommendations for regulatory compliance",
  backstory: `You are a compliance officer who ensures all investment
    recommendations meet MiFID II suitability requirements and include
    proper risk disclosures.`,
  llm: new ChatOpenAI({ modelName: "gpt-4o", temperature: 0 }),
});

// 3. Define tasks
const researchTask = new Task({
  description: `Research {ticker} thoroughly:
    - Current price and key financial metrics (P/E, P/S, debt/equity)
    - Recent news and catalysts
    - Competitive landscape
    - Risk factors`,
  agent: researcher,
  expectedOutput: "Comprehensive research report with data and sources",
});

const analysisTask = new Task({
  description: `Based on the research, produce an investment recommendation:
    - Buy/Hold/Sell rating with conviction level
    - 12-month price target with bull/base/bear scenarios
    - Key risks ranked by likelihood and impact
    - Position sizing recommendation`,
  agent: analyst,
  expectedOutput: "Structured investment recommendation",
  context: [researchTask], // Uses researcher's output as input
});

const complianceTask = new Task({
  description: `Review the investment recommendation for compliance:
    - Verify risk disclosures are adequate
    - Check suitability assessment completeness
    - Flag any statements that need qualification
    - Approve or request revisions`,
  agent: complianceReviewer,
  expectedOutput: "Compliance review with approval status",
  context: [analysisTask],
});

// 4. Create and run the crew with tracing
const crew = new Crew({
  agents: [researcher, analyst, complianceReviewer],
  tasks: [researchTask, analysisTask, complianceTask],
  process: Process.sequential,
  verbose: false,
});

try {
  const result = await handler.traceCrew(crew, {
    inputs: { ticker: "NVDA" },
  });

  console.log(result);
} finally {
  // 5. Flush before exit, even if the crew throws, so buffered traces are not lost
  await athq.flush();
}

What Appears in the Dashboard

Run this crew, and the AgentTraceHQ dashboard shows a single session containing the entire crew execution. Here's what the trace timeline looks like:

Session: financial-research-crew-2026-03-07-a8f3b2

#	Agent	Action	Latency	Tokens
1	financial-research-crew	`crew_start`	-	-
2	Senior Financial Researcher	`task_start: research`	-	-
3	Senior Financial Researcher	`tool_call: stock_data`	340ms	-
4	Senior Financial Researcher	`tool_call: news_feed`	520ms	-
5	Senior Financial Researcher	`tool_call: sec_filings`	890ms	-
6	Senior Financial Researcher	`llm_call: synthesize`	3.2s	4,120
7	Senior Financial Researcher	`task_complete: research`	5.1s	4,120
8	Investment Analyst	`task_start: analysis`	-	-
9	Investment Analyst	`llm_call: recommendation`	4.8s	3,890
10	Investment Analyst	`task_complete: analysis`	4.9s	3,890
11	Compliance Reviewer	`task_start: review`	-	-
12	Compliance Reviewer	`llm_call: compliance_check`	2.1s	2,340
13	Compliance Reviewer	`task_complete: review`	2.2s	2,340
14	financial-research-crew	`crew_complete`	12.4s	10,350
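The crew-level totals in the last row can be checked against the per-task rows. The sketch below copies the `task_complete` and `crew_complete` figures from the timeline into a hypothetical `TimelineRow` shape (latencies converted to milliseconds) and reconciles them.

```typescript
// Hypothetical row shape; the numbers are copied from the timeline above.
interface TimelineRow {
  action: string;
  latencyMs: number;
  tokens: number;
}

const rows: TimelineRow[] = [
  { action: "task_complete: research", latencyMs: 5100, tokens: 4120 },
  { action: "task_complete: analysis", latencyMs: 4900, tokens: 3890 },
  { action: "task_complete: review", latencyMs: 2200, tokens: 2340 },
  { action: "crew_complete", latencyMs: 12400, tokens: 10350 },
];

const taskRows = rows.filter((r) => r.action.startsWith("task_complete"));
const crewRow = rows.find((r) => r.action === "crew_complete");
if (!crewRow) {
  throw new Error("timeline has no crew_complete row");
}

// Sum the per-task figures and compare them with the crew-level row.
const taskTokens = taskRows.reduce((sum, r) => sum + r.tokens, 0);
const taskLatencyMs = taskRows.reduce((sum, r) => sum + r.latencyMs, 0);
const tokensMatch = crewRow.tokens === taskTokens;
const overheadMs = crewRow.latencyMs - taskLatencyMs;

console.log({ taskTokens, tokensMatch, taskLatencyMs, overheadMs });
```

The token counts add up exactly (4,120 + 3,890 + 2,340 = 10,350). The task latencies sum to 12.2s against a crew latency of 12.4s, which leaves 0.2s spent outside the three tasks.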

Every trace is clickable. Click trace #9 (the analyst's recommendation) and you see:

Input: The full research report from the Senior Financial Researcher (trace #7's output)
Output: The structured buy/hold/sell recommendation with price targets
Reasoning: The chain-of-thought showing how the analyst weighed the data
Model: gpt-4o
Tokens: 3,890 (input: 2,640, output: 1,250)
Cost: $0.054

The key insight: you can see exactly how data flowed from one agent to the next. The researcher's output became the analyst's input. The analyst's output became the compliance reviewer's input. The entire decision chain is traceable.

Multi-Agent Session Linking

The CrewAIHandler automatically:

Creates a single sessionId for the entire crew execution
Tags each trace with the originating agent's role
Links task outputs to downstream task inputs via parentTraceId
Captures task delegation events when agents hand off work

This means you can click Session View and see the entire crew execution as a connected graph — not just a flat list. The compliance reviewer's approval (or rejection) is directly linked to the analyst's recommendation, which is directly linked to the researcher's data.
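The linking idea can be sketched independently of any SDK. Assuming each trace carries a `traceId` and a `parentTraceId` (the `LinkedTrace` shape and the ids `t7`, `t10`, `t13` below are hypothetical), you can walk from any trace back to the start of its decision chain:

```typescript
// Hypothetical trace shape for illustration; field names mirror the list above.
interface LinkedTrace {
  traceId: string;
  parentTraceId: string | null;
  agentRole: string;
  action: string;
}

// Walks parentTraceId links from a trace back to its root, returned oldest first.
function lineage(traces: LinkedTrace[], traceId: string): LinkedTrace[] {
  const byId = new Map<string, LinkedTrace>();
  traces.forEach((t) => byId.set(t.traceId, t));

  const chain: LinkedTrace[] = [];
  const seen = new Set<string>(); // guards against accidental cycles
  let current = byId.get(traceId);
  while (current && !seen.has(current.traceId)) {
    seen.add(current.traceId);
    chain.unshift(current);
    current = current.parentTraceId ? byId.get(current.parentTraceId) : undefined;
  }
  return chain;
}

const traces: LinkedTrace[] = [
  { traceId: "t7", parentTraceId: null, agentRole: "Senior Financial Researcher", action: "task_complete: research" },
  { traceId: "t10", parentTraceId: "t7", agentRole: "Investment Analyst", action: "task_complete: analysis" },
  { traceId: "t13", parentTraceId: "t10", agentRole: "Compliance Reviewer", action: "task_complete: review" },
];

const reviewLineage = lineage(traces, "t13").map((t) => t.agentRole);
console.log(reviewLineage.join(" -> "));
```

Starting from the compliance review, the walk recovers the analyst's recommendation and then the researcher's report, which is the relationship a flat log cannot express.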

Hash Chain Verification

All 14 traces in this session are part of the same organizational hash chain. Click Verify Chain and AgentTraceHQ confirms that every trace — across all three agents — is cryptographically intact.

If someone modified the analyst's recommendation after the compliance reviewer approved it, the hash chain would break at that exact point. You'd see precisely which trace was altered, because its stored hash would no longer match its contents.

This is particularly important for multi-agent systems. With a single agent, the decision chain is linear. With CrewAI, agents build on each other's outputs. If any link in that chain is altered, the entire crew's output is suspect. Hash chain verification lets you prove the integrity of the complete multi-agent decision process.
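To see why tampering is detectable, here is a minimal sketch of a generic SHA-256 hash chain using Node's built-in `crypto` module. It illustrates the general technique only. The `ChainedTrace` shape, the genesis value, and the way fields are combined before hashing are assumptions made for this sketch, not AgentTraceHQ's actual format.

```typescript
import { createHash } from "node:crypto";

// Hypothetical link shape: each entry stores the hash of the entry before it.
interface ChainedTrace {
  payload: string;
  prevHash: string;
  hash: string;
}

const GENESIS = "0".repeat(64); // assumed starting value for the first link

function hashLink(prevHash: string, payload: string): string {
  return createHash("sha256").update(prevHash).update("\n").update(payload).digest("hex");
}

function appendTrace(chain: ChainedTrace[], payload: string): ChainedTrace[] {
  const last = chain[chain.length - 1];
  const prevHash = last ? last.hash : GENESIS;
  return [...chain, { payload, prevHash, hash: hashLink(prevHash, payload) }];
}

// Returns the index of the first link that fails verification, or -1 if the chain is intact.
function firstBrokenLink(chain: ChainedTrace[]): number {
  let expectedPrev = GENESIS;
  return chain.findIndex((link) => {
    const ok =
      link.prevHash === expectedPrev &&
      link.hash === hashLink(link.prevHash, link.payload);
    expectedPrev = link.hash;
    return !ok;
  });
}

let chain: ChainedTrace[] = [];
chain = appendTrace(chain, "researcher: report");
chain = appendTrace(chain, "analyst: BUY");
chain = appendTrace(chain, "compliance: approved");

// Simulate someone editing the analyst's recommendation after approval.
const tampered = chain.map((link, i) =>
  i === 1 ? { ...link, payload: "analyst: SELL" } : link,
);

console.log(firstBrokenLink(chain)); // -1
console.log(firstBrokenLink(tampered)); // 1
```

Because each hash covers the previous hash, an attacker who also recomputes the edited link's hash only moves the break to the next link. Hiding the change would require rewriting every later entry as well.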

Why This Matters for CrewAI Specifically

CrewAI's value proposition is autonomous multi-agent collaboration. That is also what makes it hard to audit:

Distributed decisions: No single agent is responsible for the final output. You need to trace the entire pipeline.
Emergent behavior: Agents can produce unexpected results when their outputs combine. An audit trail lets you diagnose where unexpected behavior originated.
Delegation chains: In hierarchical crews, a manager agent delegates to worker agents. The delegation itself is a decision that needs to be audited.

Without a structured audit trail, debugging a misbehaving crew means reading through unstructured logs and piecing together what happened. With AgentTraceHQ, you click on the session and see the complete decision tree.

For teams in regulated industries like fintech, multi-agent crews processing financial data need the same audit rigor as any other automated financial process. The EU AI Act doesn't distinguish between single-agent and multi-agent systems: its record-keeping requirement (Article 12) applies to every high-risk AI system, which must support automatic logging of events.

Start Tracing Your CrewAI Agents

CrewAI makes it easy to build powerful multi-agent systems. AgentTraceHQ makes it easy to prove what those agents did and why — with cryptographic guarantees that the records are genuine.

Start free at agenttracehq.com — 10K traces/month, no credit card required. Your entire crew, one verifiable audit trail.