Basic RAG chat

Built by JustinLee
Created on June 06, 2026

Description

This workflow demonstrates a simple Retrieval-Augmented Generation (RAG) pipeline in n8n, split into two main sections:

🔹 Part 1: Load Data into Vector Store
Reads files from disk (or Google Drive).

Splits content into manageable chunks using a recursive text splitter.

Generates embeddings using the Cohere Embedding API.

Stores the vectors in an In-Memory Vector Store (chosen for simplicity; it can be replaced with Pinecone, Qdrant, etc.).
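The chunking step in Part 1 can be sketched outside n8n. The following is a simplified illustration of recursive character splitting, not the code the Recursive Character Text Splitter node runs: the function name `recursive_split`, the separator order, and the default `chunk_size` are assumptions made for this example, and chunk overlap is omitted for brevity.

```python
from typing import List, Sequence


def recursive_split(
    text: str,
    chunk_size: int = 200,
    separators: Sequence[str] = ("\n\n", "\n", " ", ""),
) -> List[str]:
    """Split text into chunks of at most chunk_size characters.

    Coarse separators (paragraphs) are tried first; a piece that is still
    too long is split again with the next, finer separator.
    """
    if len(text) <= chunk_size:
        return [text] if text.strip() else []
    if not separators or separators[0] == "":
        # Last resort: cut at a fixed width.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, finer = separators[0], separators[1:]
    chunks: List[str] = []
    current = ""
    for piece in text.split(sep):
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            current = candidate  # keep merging small pieces into one chunk
            continue
        if current.strip():
            chunks.append(current)
        if len(piece) <= chunk_size:
            current = piece
        else:
            # The piece alone is too long: recurse with finer separators.
            chunks.extend(recursive_split(piece, chunk_size, finer))
            current = ""
    if current.strip():
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    sample = "Alpha beta gamma.\n\nDelta epsilon zeta eta theta.\n\nIota."
    for chunk in recursive_split(sample, chunk_size=30):
        print(repr(chunk))
```

Splitting on paragraph boundaries first tends to keep related sentences in the same chunk, which usually helps retrieval quality more than fixed-width cuts.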

🔹 Part 2: Chat with the Vector Store
Takes user input from a chat UI or trigger node.

Embeds the query using the same Cohere embedding model.

Retrieves similar chunks from the vector store via similarity search.

Uses a Groq-hosted LLM to generate the final answer from the retrieved context.
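The retrieval half of Part 2 can be illustrated with a minimal in-memory store. This is a sketch under stated assumptions, not the workflow's implementation: `toy_embed` is a bag-of-words stand-in for the Cohere embedding model, `InMemoryVectorStore` and `build_prompt` are hypothetical names, and the final call to the Groq-hosted LLM is left out so the example runs without credentials.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple


def toy_embed(text: str) -> Dict[str, float]:
    """Stand-in for a real embedding model: a sparse word-count vector."""
    return dict(Counter(re.findall(r"[a-z0-9]+", text.lower())))


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryVectorStore:
    """Keeps (chunk, vector) pairs in a list; nothing is persisted."""

    def __init__(self) -> None:
        self._rows: List[Tuple[str, Dict[str, float]]] = []

    def add(self, chunks: List[str]) -> None:
        for chunk in chunks:
            self._rows.append((chunk, toy_embed(chunk)))

    def search(self, query: str, k: int = 2) -> List[str]:
        # The query must be embedded with the same model as the chunks.
        query_vec = toy_embed(query)
        ranked = sorted(
            self._rows, key=lambda row: cosine(query_vec, row[1]), reverse=True
        )
        return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question: str, context_chunks: List[str]) -> str:
    """Assemble the text that would be sent to the chat model."""
    context = "\n---\n".join(context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


if __name__ == "__main__":
    store = InMemoryVectorStore()
    store.add([
        "n8n is a workflow automation tool.",
        "Cohere provides embedding models.",
        "Groq serves LLMs with low latency.",
    ])
    question = "Which service provides embedding models?"
    print(build_prompt(question, store.search(question, k=1)))
```

Embedding the query and the documents with the same function is what makes the similarity scores comparable; mixing models between indexing and querying would make the ranking meaningless.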

🛠️ Technologies Used:
📦 Cohere Embedding API

⚡ Groq LLM for fast inference

🧠 n8n for orchestrating and visualizing the flow

🧲 In-Memory Vector Store (for prototyping)

🧪 Usage:
Upload or point to your source documents.

Embed them and populate the vector store.

Ask questions through the chat trigger node.

Receive context-aware responses based on retrieved content.

Nodes Used (7)

Default Data Loader
@n8n/n8n-nodes-langchain.documentDefaultDataLoader
Embeddings Cohere
@n8n/n8n-nodes-langchain.embeddingsCohere
Groq Chat Model
@n8n/n8n-nodes-langchain.lmChatGroq
Question and Answer Chain
@n8n/n8n-nodes-langchain.chainRetrievalQa
Recursive Character Text Splitter
@n8n/n8n-nodes-langchain.textSplitterRecursiveCharacterTextSplitter
Simple Vector Store
@n8n/n8n-nodes-langchain.vectorStoreInMemory
Vector Store Retriever
@n8n/n8n-nodes-langchain.retrieverVectorStore
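
One way to picture how the seven node types above fit together is a small dependency map. The wiring below is inferred from the description of Parts 1 and 2 and is an assumption, not the exported workflow JSON; `FEEDS_INTO` and `upstream_of` are hypothetical names used only for this illustration, and the `@n8n/n8n-nodes-langchain.` prefix is dropped for readability.

```python
from typing import Dict, List

# Assumed wiring: each key is a node type, each value is the node it plugs into.
FEEDS_INTO: Dict[str, str] = {
    "textSplitterRecursiveCharacterTextSplitter": "documentDefaultDataLoader",
    "documentDefaultDataLoader": "vectorStoreInMemory",
    "embeddingsCohere": "vectorStoreInMemory",
    "vectorStoreInMemory": "retrieverVectorStore",
    "retrieverVectorStore": "chainRetrievalQa",
    "lmChatGroq": "chainRetrievalQa",
}


def upstream_of(target: str, edges: Dict[str, str] = FEEDS_INTO) -> List[str]:
    """Return every node that feeds `target`, directly or transitively."""
    found = set()
    frontier = [target]
    while frontier:
        node = frontier.pop()
        for source, destination in edges.items():
            if destination == node and source not in found:
                found.add(source)
                frontier.append(source)
    return sorted(found)


if __name__ == "__main__":
    # Everything the Question and Answer Chain depends on.
    for name in upstream_of("chainRetrievalQa"):
        print(name)
```

Read this way, the Question and Answer Chain sits at the end of the graph, and swapping the in-memory store for Pinecone or Qdrant would change only one node in the middle.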