The current agentic AI hype cycle often pitches the idea of fully autonomous agents: give an LLM a goal and a set of tools, and let it figure out the rest. While that works for creative tasks, it can be dangerous for business processes that require strict logic.
For example: if you’re building an automated qualification system for high-ticket sales (like luxury travel, real estate, or insurance), “hallucination” isn’t just a quirky bug; it’s lost revenue. You can’t have an AI agent promising a custom itinerary to a lead with a $500 budget, or forgetting to collect an email address because it got distracted chatting about the weather.
In these high-value environments, we need probabilistic understanding (LLMs interpreting natural language) but deterministic routing (hard-coded logic controlling the flow).
This tutorial explores how to build a robust, state-machine-driven lead qualification system using n8n, a persistent data layer (n8n data tables), and an external CRM (GoHighLevel). We will move beyond stateless webhooks to create a system that “remembers” exactly where a user is in a complex application flow.
Most WhatsApp or SMS automation relies on stateless webhooks. When a user sends a message, your server receives a JSON payload. It doesn’t natively know if this is the user’s first “Hello” or if they are answering question #3 about their budget.
A naive approach involves sending the entire chat history to an OpenAI API and asking, “What should we do next?” This is expensive, slow, and prone to logic breaks (e.g., the user tricks the bot into skipping qualification).
A better architectural pattern is the Finite State Machine (FSM).
In this pattern, the AI is downgraded from “Decision Maker” to “Data Processor.” The application logic is handled by a router that checks a persistent database.
For this workflow, we use n8n’s internal data tables to act as our state store. If you are scaling to millions of rows, you would swap this for Supabase or Redis, but the logic remains identical.
We need a schema that tracks the user’s progress and the data collected so far.
- phone (primary key): The unique identifier
- state: The cursor tracking position (e.g., waiting_for_type)
- trip_type: The extracted intent (Custom, Resort, Flight)
- budget: The integer value of the user’s spending power
- is_qualified: Boolean flag based on budget threshold
- email: The user’s contact info

When a webhook arrives, we perform an Upsert operation. If the phone number doesn’t exist, we create a new row with state: null. If it exists, we return the current row. This ensures idempotency; sending “Hello” twice doesn’t break the flow.
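The upsert step can be sketched in plain JavaScript. Here a Map stands in for the n8n data table (in the real workflow this is the data table node keyed on phone); the field names mirror the schema above:

```javascript
// Stand-in for the n8n data table: phone -> lead row.
const leads = new Map();

// Upsert: create a fresh row on first contact, otherwise return the
// existing row untouched. Sending "Hello" twice is therefore a no-op.
function upsertLead(phone) {
  if (!leads.has(phone)) {
    leads.set(phone, {
      phone,
      state: null,        // cursor: conversation not yet started
      trip_type: null,
      budget: null,
      is_qualified: false,
      email: null,
    });
  }
  return leads.get(phone);
}

const first = upsertLead("+15550001111");
const second = upsertLead("+15550001111"); // same row: idempotent
```

Because both calls return the same row object, downstream nodes always see a single source of truth per phone number.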
This is the brain of the operation. Unlike an AI agent, which guesses the next step, the Router knows the next step.
In n8n, we implement this using a Switch Node connected to the output of our data table lookup. We define the following strict routes based on the state column:
- waiting_for_type: Route to Trip Classification
- waiting_for_budget: Route to Budget Extraction
- waiting_for_email: Route to Email Validation
- waiting_for_booking_link_sent: Route to Final Confirmation

This creates a linear dependency. A user cannot jump to the email step without passing through the budget check.
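Conceptually, the Switch node is nothing more than a switch statement over the persisted state column. A minimal sketch (the branch names are illustrative labels, not n8n identifiers):

```javascript
// Deterministic router: the persisted state, not the LLM, picks the branch.
function route(state) {
  switch (state) {
    case null:                            return "send_greeting";
    case "waiting_for_type":              return "trip_classification";
    case "waiting_for_budget":            return "budget_extraction";
    case "waiting_for_email":             return "email_validation";
    case "waiting_for_booking_link_sent": return "final_confirmation";
    default:                              return "fallback"; // unknown state: fail safe
  }
}
```

The default branch matters: if a row ever holds an unexpected state, the workflow degrades to a safe fallback instead of guessing.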
Now that we have routed the user to the correct step, we invoke the LLM. However, we strictly limit its scope. We don’t want a conversation; we want structured data.
waiting_for_type (Classification)

The user has just received the greeting. They might reply, “I want to plan a honeymoon in Bali,” or “Just looking for cheap flights.”
We use an AI Text Classifier node (LangChain integration) with a strict schema. We map vague natural language to our internal Enums:
- A: Description: “Full, custom-built itinerary”
- B: Description: “Luxury resort or cruise”
- C: Description: “Flight booking or simple hotel”

If the output is A (Custom), we update the data table:

- trip_type: A
- state: waiting_for_budget

We then send a hard-coded message: “Amazing! Custom itineraries are our specialty. What is the approximate budget per person you’ve set aside?”
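The post-classification transition can be sketched as a lookup table. Only the label-A reply comes from the workflow above; the B and C replies are illustrative placeholders:

```javascript
// Map each classifier label to a trip type plus a hard-coded reply.
const transitions = {
  A: { trip_type: "A", reply: "Amazing! Custom itineraries are our specialty. What is the approximate budget per person you’ve set aside?" },
  B: { trip_type: "B", reply: "Great choice! What is your approximate budget per person?" }, // placeholder copy
  C: { trip_type: "C", reply: "Got it. What is your approximate budget per person?" },       // placeholder copy
};

// Mutates the lead row, advances the state cursor, and returns the reply to send.
function applyClassification(row, label) {
  const t = transitions[label];
  if (!t) throw new Error(`Unexpected classifier label: ${label}`);
  row.trip_type = t.trip_type;
  row.state = "waiting_for_budget";
  return t.reply;
}
```

Note that the LLM only supplies the label; the state transition and the outgoing message are fixed in code.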
waiting_for_budget (Extraction)

The user is now in the waiting_for_budget state. They reply: “We are thinking around 10k, maybe 12.”
We use an AI Information Extractor node. Crucially, we use a system prompt to force JSON output and handle edge cases (like “k” notation).
You are a data extraction specialist. Input: User text. Task: Extract 'budget' as an integer and 'qualified' status. Qualification Rules: 1. If budget >= 10,000, qualified = true. 2. If budget < 10,000, qualified = false. 3. If user says "10k", output 10000.
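Even with this prompt, it is worth double-checking the model's arithmetic deterministically. A small sketch that normalizes “k” notation and applies the same threshold in plain code (regex and function names are ours, not part of the workflow):

```javascript
const QUALIFICATION_THRESHOLD = 10000;

// Pull the first money-like token out of free text: "10k", "$12,000", "9500".
function parseBudget(text) {
  const match = text.replace(/,/g, "").match(/\$?\s*(\d+(?:\.\d+)?)\s*(k)?/i);
  if (!match) return null;
  const multiplier = match[2] ? 1000 : 1; // "10k" -> 10000
  return Math.round(parseFloat(match[1]) * multiplier);
}

// Mirror the qualification rule from the system prompt.
function qualify(budget) {
  return budget !== null && budget >= QUALIFICATION_THRESHOLD;
}
```

Running the LLM output through a check like this catches the occasional extraction slip before it flips the is_qualified flag.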
Immediately after extraction, we update the data table. We then split the flow using an If Node:
- waiting_for_email: Ask for contact info to send the booking link.
- waiting_for_email_unqualified: We will still ask for an email, but we will send a “DIY Guide” instead of a meeting link.

This keeps the decision-making logic in our control, avoiding the risk of an AI agent “being nice” and booking a meeting for an unqualified lead.
The most fragile part of any bot is identifying specific strings like emails. Users make typos (e.g., alex@gmail,com). If we rely solely on AI extraction, it might hallucinate a valid email or pass the bad string to our CRM.
In the waiting_for_email state, we implement a validation loop.
After the AI extracts the email candidate, we pass it through a JavaScript Code node running a Regex check:
// n8n Code node (mode: Run Once for All Items)
const email = items[0].json.output.email;
// Practical regex for email validation (local part, domain labels, TLD)
const tester = /^[-!#$%&'*+\/0-9=?A-Z^_`a-z{|}~](\.?[-!#$%&'*+\/0-9=?A-Z^_`a-z{|}~])*@[a-zA-Z0-9](-*\.?[a-zA-Z0-9])*\.[a-zA-Z](-?[a-zA-Z0-9])+$/;
// Reject missing or overlong input (254 characters is the practical RFC limit)
if (!email || email.length > 254) {
  return [{ json: { isValid: false } }];
}
return [{ json: { isValid: tester.test(email), email } }];
We follow this with an If Node checking isValid:

- True path: save the email, advance the state, and continue.
- False path: send a correction prompt (e.g., “That email doesn’t look quite right — could you double-check it?”) and do NOT update the state.

Because the state is untouched in the False path, the user remains in waiting_for_email. Their next message will trigger the webhook, hit the Router, find waiting_for_email, and run the validation logic again. This creates a retry loop that repeats until valid data is provided.
<h2 id="5-crm-integration">Step 5: CRM integration &amp; OAuth scopes</h2>
Once the local state machine has collected all necessary data (trip_type, budget, email), we sync it to the external system—in this case, GoHighLevel (GHL).
While n8n has a built-in GHL node, advanced custom fields (like Trip Type) require careful API configuration.
When connecting n8n to GHL (or many modern CRMs), the default app permissions often only include standard contact read/write scopes. To sync the data collected by our state machine, you must explicitly request custom field scopes during the handshake.
Ensure your connected app includes:
- custom_fields.readonly
- custom_fields.write

Without these, your workflow will execute successfully (returning a 200 OK), but the custom fields in the CRM will remain empty, failing silently.
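The shape of the contact payload can be sketched as follows. This assumes the v2-style customFields array; the field ID constants below are placeholders — real IDs come from GHL's custom-fields API, which is exactly why the readonly scope is needed to look them up:

```javascript
// Placeholder IDs: fetch the real ones from the GHL custom-fields endpoint.
const TRIP_TYPE_FIELD_ID = "FIELD_ID_TRIP_TYPE";
const BUDGET_FIELD_ID = "FIELD_ID_BUDGET";

// Build the CRM payload from a fully collected data-table row.
function buildContactPayload(row) {
  return {
    phone: row.phone,
    email: row.email,
    tags: row.is_qualified ? ["Qualified"] : ["Unqualified"],
    customFields: [
      { id: TRIP_TYPE_FIELD_ID, value: row.trip_type },
      { id: BUDGET_FIELD_ID, value: row.budget },
    ],
  };
}
```

If the custom_fields scopes are missing, the contact, tags, and email still land in the CRM, which is what makes the empty custom fields so easy to miss.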
The final state, waiting_for_booking_link_sent, handles an asynchronous dependency.
The booking link is generated by the CRM automation, not inside n8n. After syncing the contact, we set the state to waiting_for_booking_link_sent, and we do not wait for the response immediately. Instead, we rely on a callback.
Inside GHL, an automation triggers when the Qualified tag is added. It generates the calendar link and fires a Webhook back to n8n.
n8n receives this new webhook, uses the phone number to find the user in our data table, verifies they are in the waiting_for_booking_link_sent state, and delivers the final WhatsApp message:
“Done! The booking link is on its way to your email, but you can also find it here: [Link]”
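The callback leg can be sketched as a guard around the final send. Here findLeadByPhone and sendWhatsApp are hypothetical stand-ins for the data-table lookup and the messaging node:

```javascript
// Handle the webhook GHL fires after generating the calendar link.
// Deliver the final message only if the lead is in the expected state.
function handleBookingCallback(payload, findLeadByPhone, sendWhatsApp) {
  const row = findLeadByPhone(payload.phone);
  if (!row || row.state !== "waiting_for_booking_link_sent") {
    return { delivered: false, reason: "unexpected_state" };
  }
  sendWhatsApp(
    row.phone,
    `Done! The booking link is on its way to your email, ` +
      `but you can also find it here: ${payload.booking_link}`
  );
  row.state = "completed"; // close out the flow
  return { delivered: true };
}
```

The state check doubles as replay protection: if the CRM fires the webhook twice, the second call finds the lead already completed and does nothing.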
By treating LLMs as functional components within a rigid architecture rather than autonomous agents, we gain:

- Determinism: the router and state machine, not the model, decide what happens next
- Efficiency: each LLM call handles one narrow extraction task instead of re-reading the entire chat history
- Resilience: invalid input loops back for a retry instead of derailing the qualification flow
This architecture turns a chatbot into a reliable engineering system, suitable for any high-ticket or regulated industry where precision matters more than conversation.
