Turn any website into an AI support chatbot with OpenAI and Pinecone

Built by

Dinakar Selvakumar

Created on July 29, 2026

Description

Complete AI support system using website data (RAG pipeline)

This template provides a full end-to-end Retrieval-Augmented Generation (RAG) system using n8n. It includes two connected workflows:

A data ingestion pipeline that crawls a website and stores its content in a vector database.
A customer support chatbot that retrieves this knowledge and answers user queries in real time.

Together, these workflows allow you to turn any public website into an intelligent AI-powered support assistant grounded in real business data.

Use cases

AI customer support chatbot for your website
Internal company knowledge assistant
Product FAQ automation
Helpdesk or IT support bot
AI receptionist for services
Semantic search over company content

How it works

Ingestion workflow
Discover all URLs from a website sitemap.
Filter and normalize the URLs.
Fetch each page and extract readable text.
Clean HTML into plain text.
Split text into overlapping chunks.
Generate embeddings using OpenAI.
Store vectors in Pinecone with metadata.
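
The middle steps above (normalizing URLs, cleaning HTML, and splitting into overlapping chunks) can be sketched in plain Python. This is a minimal illustration, not the code inside the template's n8n nodes: the function names, the 1000/200 character defaults, and the choice to drop query strings are all assumptions made for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit, urlunsplit


def normalize_urls(urls, allowed_host):
    """Keep http(s) URLs on allowed_host; drop queries, fragments, trailing slashes, duplicates."""
    seen, result = set(), []
    for raw in urls:
        parts = urlsplit(raw.strip())
        host = parts.netloc.lower()
        if parts.scheme not in ("http", "https") or host != allowed_host:
            continue
        path = parts.path.rstrip("/") or "/"
        clean = urlunsplit((parts.scheme, host, path, "", ""))
        if clean not in seen:
            seen.add(clean)
            result.append(clean)
    return result


class _TextExtractor(HTMLParser):
    """Collects text nodes, skipping anything inside script/style/noscript."""
    SKIP = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)


def html_to_text(html):
    """Very basic HTML-to-text conversion with whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


def chunk_text(text, chunk_size=1000, overlap=200):
    """Split text into character windows; consecutive chunks share `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks
```

Each chunk would then be embedded and stored together with metadata such as its source URL, so the chatbot can trace an answer back to a page.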

Chatbot workflow
A user sends a message via the chat webhook.
The agent queries Pinecone for relevant knowledge.
Retrieved content is passed to OpenAI.
OpenAI generates a grounded response.
Short-term memory maintains conversation context.
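
As a rough illustration of the retrieval and memory steps above, the sketch below ranks stored chunks by cosine similarity and keeps a sliding window of recent turns. It uses an in-memory list as a stand-in for Pinecone and does not call OpenAI; the names `top_k`, `WindowMemory`, and `build_prompt`, the window size, and the prompt wording are all hypothetical.

```python
import math
from collections import deque


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, store, k=3):
    """store is a list of (text, vector) pairs; returns the k best (score, text) pairs."""
    scored = [(cosine(query_vec, vec), text) for text, vec in store]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


class WindowMemory:
    """Keeps only the most recent `window` exchanges, dropping older ones automatically."""

    def __init__(self, window=5):
        self.turns = deque(maxlen=window)

    def add(self, user, assistant):
        self.turns.append((user, assistant))


def build_prompt(question, matches, memory):
    """Combine retrieved chunks and recent history into one prompt string."""
    context = "\n".join(f"- {text}" for _, text in matches)
    history = "\n".join(f"User: {u}\nAssistant: {a}" for u, a in memory.turns)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"History:\n{history}\n"
        f"Question: {question}"
    )
```

In the template itself, the vector store, chat model, and memory are n8n nodes wired to the AI Agent, so none of this code needs to be written by hand.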

How to use

Step 1 – Run ingestion
Set your target website URL.
Add Firecrawl, OpenAI, and Pinecone credentials.
Create a Pinecone index whose dimension matches your OpenAI embedding model (for example, 1536 for text-embedding-3-small).
Execute the ingestion workflow.
Wait until all pages are indexed.

Step 2 – Run chatbot
Deploy the chatbot workflow.
Set the same Pinecone index and namespace that the ingestion workflow wrote to.
Copy the chat webhook URL.
Connect it to a website, chat widget, or WhatsApp bot.
Start chatting with your AI assistant.
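
A custom frontend could call the chat webhook along these lines. This is a sketch under stated assumptions: the JSON field names `sessionId` and `chatInput` and the shape of the response are guesses and must be matched to what your chat trigger actually expects, and the URL is whatever you copied from the workflow.

```python
import json
import urllib.request


def build_chat_request(webhook_url, session_id, message):
    """Build (but do not send) a JSON POST request for the chat webhook."""
    # Assumed field names; adjust them to your chat trigger's expected body.
    payload = {"sessionId": session_id, "chatInput": message}
    return urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_message(webhook_url, session_id, message, timeout=30):
    """Send the message and return the parsed JSON reply (assumes a JSON response)."""
    request = build_chat_request(webhook_url, session_id, message)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Reusing the same session ID across messages is what lets the short-term memory tie a conversation together.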

Requirements

Firecrawl account
OpenAI API key
Pinecone account and index
Public website to crawl
Optional: frontend chat interface

Good to know

The chatbot is set up to answer business questions only from retrieved content, not from the model's built-in knowledge.
All company knowledge comes from Pinecone.
If Pinecone returns nothing, the bot fails safely instead of guessing an answer.
HTML cleaning is basic and can be replaced with Mozilla Readability, Jina Reader, or Unstructured.
Chunk size and overlap affect retrieval quality.
Pinecone can be replaced with Qdrant, Weaviate, Supabase Vector, or Chroma.
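
One way to picture the "fails safely" behaviour is a guard that only calls the model when retrieval produced usable context. The sketch below is an assumption about how such a guard might look, not the template's actual logic: the score threshold, the fallback wording, and the `generate` callback are all hypothetical.

```python
FALLBACK = "I could not find that in the knowledge base. Please contact support."


def select_context(matches, min_score=0.75, max_chunks=4):
    """matches is a list of dicts with 'score' and 'text'; returns usable texts, best first."""
    usable = [m for m in matches if m["score"] >= min_score and m["text"].strip()]
    usable.sort(key=lambda m: m["score"], reverse=True)
    return [m["text"] for m in usable[:max_chunks]]


def answer(question, matches, generate):
    """Call `generate` only when there is retrieved context; otherwise return FALLBACK."""
    context = select_context(matches)
    if not context:
        return FALLBACK  # nothing relevant was retrieved, so do not let the model guess
    return generate(question, context)
```

In an agent-based workflow, the same effect is usually achieved through the system prompt rather than code, so treat the threshold as something to tune against your own data.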

Customising this workflow

You can extend this system by:
Adding PDF or document loaders
Scheduling ingestion daily or weekly
Connecting CRM or ticketing systems
Adding appointment booking tools
Switching to local or open-source models
Adding multilingual support
Storing raw content in a database
Adding feedback or logging

What this n8n template demonstrates

Real-world RAG architecture
Web crawling pipelines
Text chunking strategies
Vector database integration
AI agent orchestration
Memory-controlled conversations
Production-grade AI support systems
End-to-end AI infrastructure with n8n

Architecture overview

This template follows a modern AI system design:

Website → Ingestion → Embeddings → Pinecone → Retrieval → OpenAI → User

It separates:
Data preparation (offline)
Knowledge storage
Runtime inference

Because these stages are decoupled, each one can be updated or scaled on its own, which keeps the system maintainable in production.

Need a custom setup?

If you want a similar AI system built for your business (custom data sources, CRM integration, WhatsApp bots, booking systems, dashboards, or private deployments), feel free to reach out at [email protected].

I help companies design and deploy production-ready AI workflows.

Nodes Used (8)

AI Agent: @n8n/n8n-nodes-langchain.agent
Code: n8n-nodes-base.code
Default Data Loader: @n8n/n8n-nodes-langchain.documentDefaultDataLoader
Embeddings OpenAI: @n8n/n8n-nodes-langchain.embeddingsOpenAi
HTTP Request: n8n-nodes-base.httpRequest
OpenAI Chat Model: @n8n/n8n-nodes-langchain.lmChatOpenAi
Pinecone Vector Store: @n8n/n8n-nodes-langchain.vectorStorePinecone
Simple Memory: @n8n/n8n-nodes-langchain.memoryBufferWindow
