The current agentic AI hype cycle often pitches the idea of fully autonomous agents: give an LLM a goal and a set of tools, and let it figure out the rest. While that works for creative tasks, it can be dangerous for business processes that require strict logic.
For example: if you’re building an automated qualification system for high-ticket sales (like luxury travel, real estate, or insurance), “hallucination” isn’t just a quirky bug; it’s lost revenue. You can’t have an AI agent promising a custom itinerary to a lead with a $500 budget, or forgetting to collect an email address because it got distracted chatting about the weather.
In these high-value environments, we need probabilistic understanding (LLMs interpreting natural language) but deterministic routing (hard-coded logic controlling the flow).
This tutorial explores how to build a robust, state-machine-driven lead qualification system using n8n, a persistent data layer (n8n data tables), and an external CRM (GoHighLevel). We will move beyond stateless webhooks to create a system that “remembers” exactly where a user is in a complex application flow.
Most WhatsApp or SMS automation relies on stateless webhooks. When a user sends a message, your server receives a JSON payload. It doesn’t natively know if this is the user’s first “Hello” or if they are answering question #3 about their budget.
A naive approach involves sending the entire chat history to an OpenAI API and asking, “What should we do next?” This is expensive, slow, and prone to logic breaks (e.g., the user tricks the bot into skipping qualification).
A better architectural pattern is the Finite State Machine (FSM).
In this pattern, the AI is downgraded from “Decision Maker” to “Data Processor.” The application logic is handled by a router that checks a persistent database.
For this workflow, we use n8n’s internal data tables to act as our state store. If you are scaling to millions of rows, you would swap this for Supabase or Redis, but the logic remains identical.
We need a schema that tracks the user’s progress and the data collected so far.
- phone (primary key): The unique identifier
- state: The cursor tracking position (e.g., waiting_for_type)
- trip_type: The extracted intent (Custom, Resort, Flight)
- budget: The integer value of the user’s spending power
- is_qualified: Boolean flag based on budget threshold
- email: The user’s contact info

When a webhook arrives, we perform an Upsert operation. If the phone number doesn’t exist, we create a new row with state: null. If it exists, we return the current row. This ensures idempotency; sending “Hello” twice doesn’t break the flow.
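The upsert step can be sketched in plain JavaScript. Here a Map stands in for the n8n data table (in the real workflow this is the data table node keyed on phone); the field names mirror the schema above:

```javascript
// Stand-in for the n8n data table: phone -> lead row.
const leads = new Map();

// Upsert: create a fresh row on first contact, otherwise return the
// existing row untouched. Sending "Hello" twice is therefore a no-op.
function upsertLead(phone) {
  if (!leads.has(phone)) {
    leads.set(phone, {
      phone,
      state: null,        // cursor: conversation not yet started
      trip_type: null,
      budget: null,
      is_qualified: false,
      email: null,
    });
  }
  return leads.get(phone);
}

const first = upsertLead("+15550001111");
const second = upsertLead("+15550001111"); // same row: idempotent
```

Because both calls return the same row object, downstream nodes always see a single source of truth per phone number.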
This is the brain of the operation. Unlike an AI agent, which guesses the next step, the Router knows the next step.
In n8n, we implement this using a Switch Node connected to the output of our data table lookup. We define the following strict routes based on the state column:
- waiting_for_type: Route to Trip Classification
- waiting_for_budget: Route to Budget Extraction
- waiting_for_email: Route to Email Validation
- waiting_for_booking_link_sent: Route to Final Confirmation

This creates a linear dependency. A user cannot jump to the email step without passing through the budget check.
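Conceptually, the Switch node is nothing more than a switch statement over the persisted state column. A minimal sketch (the branch names are illustrative labels, not n8n identifiers):

```javascript
// Deterministic router: the persisted state, not the LLM, picks the branch.
function route(state) {
  switch (state) {
    case null:                            return "send_greeting";
    case "waiting_for_type":              return "trip_classification";
    case "waiting_for_budget":            return "budget_extraction";
    case "waiting_for_email":             return "email_validation";
    case "waiting_for_booking_link_sent": return "final_confirmation";
    default:                              return "fallback"; // unknown state: fail safe
  }
}
```

The default branch matters: if a row ever holds an unexpected state, the workflow degrades to a safe fallback instead of guessing.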
Now that we have routed the user to the correct step, we invoke the LLM. However, we strictly limit its scope. We don’t want a conversation; we want structured data.
waiting_for_type (Classification)

The user has just received the greeting. They might reply, “I want to plan a honeymoon in Bali,” or “Just looking for cheap flights.”
We use an AI Text Classifier node (LangChain integration) with a strict schema. We map vague natural language to our internal Enums:
- A: Description: “Full, custom-built itinerary”
- B: Description: “Luxury resort or cruise”
- C: Description: “Flight booking or simple hotel”

If the output is A (Custom), we update the data table:

- trip_type: A
- state: waiting_for_budget

We then send a hard-coded message: “Amazing! Custom itineraries are our specialty. What is the approximate budget per person you’ve set aside?”
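The post-classification transition can be sketched as a lookup table. Only the label-A reply comes from the workflow above; the B and C replies are illustrative placeholders:

```javascript
// Map each classifier label to a trip type plus a hard-coded reply.
const transitions = {
  A: { trip_type: "A", reply: "Amazing! Custom itineraries are our specialty. What is the approximate budget per person you’ve set aside?" },
  B: { trip_type: "B", reply: "Great choice! What is your approximate budget per person?" }, // placeholder copy
  C: { trip_type: "C", reply: "Got it. What is your approximate budget per person?" },       // placeholder copy
};

// Mutates the lead row, advances the state cursor, and returns the reply to send.
function applyClassification(row, label) {
  const t = transitions[label];
  if (!t) throw new Error(`Unexpected classifier label: ${label}`);
  row.trip_type = t.trip_type;
  row.state = "waiting_for_budget";
  return t.reply;
}
```

Note that the LLM only supplies the label; the state transition and the outgoing message are fixed in code.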
waiting_for_budget (Extraction)

The user is now in the waiting_for_budget state. They reply: “We are thinking around 10k, maybe 12.”
We use an AI Information Extractor node. Crucially, we use a system prompt to force JSON output and handle edge cases (like “k” notation).
You are a data extraction specialist. Input: User text. Task: Extract 'budget' as an integer and 'qualified' status. Qualification Rules: 1. If budget >= 10,000, qualified = true. 2. If budget < 10,000, qualified = false. 3. If user says "10k", output 10000.
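Even with this prompt, it is worth double-checking the model's arithmetic deterministically. A small sketch that normalizes “k” notation and applies the same threshold in plain code (regex and function names are ours, not part of the workflow):

```javascript
const QUALIFICATION_THRESHOLD = 10000;

// Pull the first money-like token out of free text: "10k", "$12,000", "9500".
function parseBudget(text) {
  const match = text.replace(/,/g, "").match(/\$?\s*(\d+(?:\.\d+)?)\s*(k)?/i);
  if (!match) return null;
  const multiplier = match[2] ? 1000 : 1; // "10k" -> 10000
  return Math.round(parseFloat(match[1]) * multiplier);
}

// Mirror the qualification rule from the system prompt.
function qualify(budget) {
  return budget !== null && budget >= QUALIFICATION_THRESHOLD;
}
```

Running the LLM output through a check like this catches the occasional extraction slip before it flips the is_qualified flag.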
Immediately after extraction, we update the data table. We then split the flow using an If Node:
- waiting_for_email: Ask for contact info to send the booking link.
- waiting_for_email_unqualified: We will still ask for an email, but we will send a “DIY Guide” instead of a meeting link.

This keeps the decision-making logic in our control, avoiding the risk of an AI agent “being nice” and booking a meeting for an unqualified lead.
The most fragile part of any bot is identifying specific strings like emails. Users make typos (e.g., alex@gmail,com). If we rely solely on AI extraction, it might hallucinate a valid email or pass the bad string to our CRM.
In the waiting_for_email state, we implement a validation loop.
After the AI extracts the email candidate, we pass it through a JavaScript Code node running a Regex check:
// n8n Code node (mode: Run Once for All Items)
const email = items[0].json.output.email;
// Practical regex for email validation (local part, domain labels, TLD)
const tester = /^[-!#$%&'*+\/0-9=?A-Z^_`a-z{|}~](\.?[-!#$%&'*+\/0-9=?A-Z^_`a-z{|}~])*@[a-zA-Z0-9](-*\.?[a-zA-Z0-9])*\.[a-zA-Z](-?[a-zA-Z0-9])+$/;
// Reject missing or overlong input (254 characters is the practical RFC limit)
if (!email || email.length > 254) {
  return [{ json: { isValid: false } }];
}
return [{ json: { isValid: tester.test(email), email } }];
We follow this with an If Node checking isValid:

- True path: save the email, advance the state, and continue.
- False path: send a correction prompt (e.g., “That email doesn’t look quite right — could you double-check it?”) and do NOT update the state.

Because the state is untouched in the False path, the user remains in waiting_for_email. Their next message will trigger the webhook, hit the Router, find waiting_for_email, and run the validation logic again. This creates a retry loop that repeats until valid data is provided.
<h2 id="5-crm-integration">Step 5: CRM integration &amp; OAuth scopes</h2>
Once the local state machine has collected all necessary data (trip_type, budget, email), we sync it to the external system—in this case, GoHighLevel (GHL).
While n8n has a built-in GHL node, advanced custom fields (like Trip Type) require careful API configuration.
When connecting n8n to GHL (or many modern CRMs), the default app permissions often only include standard contact read/write scopes. To sync the data collected by our state machine, you must explicitly request custom field scopes during the handshake.
Ensure your connected app includes:
- custom_fields.readonly
- custom_fields.write

Without these, your workflow will execute successfully (returning a 200 OK), but the custom fields in the CRM will remain empty, failing silently.
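The shape of the contact payload can be sketched as follows. This assumes the v2-style customFields array; the field ID constants below are placeholders — real IDs come from GHL's custom-fields API, which is exactly why the readonly scope is needed to look them up:

```javascript
// Placeholder IDs: fetch the real ones from the GHL custom-fields endpoint.
const TRIP_TYPE_FIELD_ID = "FIELD_ID_TRIP_TYPE";
const BUDGET_FIELD_ID = "FIELD_ID_BUDGET";

// Build the CRM payload from a fully collected data-table row.
function buildContactPayload(row) {
  return {
    phone: row.phone,
    email: row.email,
    tags: row.is_qualified ? ["Qualified"] : ["Unqualified"],
    customFields: [
      { id: TRIP_TYPE_FIELD_ID, value: row.trip_type },
      { id: BUDGET_FIELD_ID, value: row.budget },
    ],
  };
}
```

If the custom_fields scopes are missing, the contact, tags, and email still land in the CRM, which is what makes the empty custom fields so easy to miss.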
The final state, waiting_for_booking_link_sent, handles an asynchronous dependency.
The booking link is generated by the CRM automation, not inside n8n. After syncing the contact, we set the state to waiting_for_booking_link_sent, and we do not wait for the response immediately. Instead, we rely on a callback.
Inside GHL, an automation triggers when the Qualified tag is added. It generates the calendar link and fires a Webhook back to n8n.
n8n receives this new webhook, uses the phone number to find the user in our data table, verifies they are in the waiting_for_booking_link_sent state, and delivers the final WhatsApp message:
“Done! The booking link is on its way to your email, but you can also find it here: [Link]”
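The callback leg can be sketched as a guard around the final send. Here findLeadByPhone and sendWhatsApp are hypothetical stand-ins for the data-table lookup and the messaging node:

```javascript
// Handle the webhook GHL fires after generating the calendar link.
// Deliver the final message only if the lead is in the expected state.
function handleBookingCallback(payload, findLeadByPhone, sendWhatsApp) {
  const row = findLeadByPhone(payload.phone);
  if (!row || row.state !== "waiting_for_booking_link_sent") {
    return { delivered: false, reason: "unexpected_state" };
  }
  sendWhatsApp(
    row.phone,
    `Done! The booking link is on its way to your email, ` +
      `but you can also find it here: ${payload.booking_link}`
  );
  row.state = "completed"; // close out the flow
  return { delivered: true };
}
```

The state check doubles as replay protection: if the CRM fires the webhook twice, the second call finds the lead already completed and does nothing.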
By treating LLMs as functional components within a rigid architecture rather than autonomous agents, we gain:

- Determinism: the router and state machine, not the model, decide what happens next
- Efficiency: each LLM call handles one narrow extraction task instead of re-reading the entire chat history
- Resilience: invalid input loops back for a retry instead of derailing the qualification flow
This architecture turns a chatbot into a reliable engineering system, suitable for any high-ticket or regulated industry where precision matters more than conversation.
