Parse Google Drive documents to RAG-ready Markdown with Landing.ai and Supabase cache

Built by Alok Kumar
Created on June 05, 2026

Description

Turn large, unstructured documents into LLM-ready Markdown using LandingAI Document Parsing.

This workflow automatically watches a Google Drive folder, submits new documents to Landing.ai for parsing, caches processed files in Supabase to avoid reprocessing, and polls for results with retry and timeout handling.
Use Cases
Automated document ingestion for RAG pipelines
Invoice, contract, or report parsing
AI-powered document analysis workflows
Knowledge base ingestion from Google Drive
Preventing duplicate document processing in ETL pipelines
External services:
Google Drive
Landing.ai
Supabase
Credentials Required

Required
Google Drive OAuth2
Landing.ai API (HTTP Bearer Token)
Supabase API
How it works

When a PDF lands in the watched Google Drive folder, the workflow triggers and converts it to LLM-ready Markdown, including PDFs of more than 200 pages.
Before parsing, it checks the database to see whether the document has already been parsed, which avoids unnecessary Landing.ai API calls.
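
The cache check described above can be sketched as follows. This is a minimal illustration, not the workflow's actual Code node: the in-memory dictionary stands in for the Supabase table, and the function name `should_parse` is hypothetical. It assumes a row with non-empty `markdown` means parsing is already done.

```python
from typing import Dict, Optional


def should_parse(file_id: str, cache: Dict[str, dict]) -> bool:
    """Return True when the file still needs to be sent to the parser.

    `cache` is a stand-in for the Supabase table, keyed by file_id.
    Assumption: a row with non-empty `markdown` means parsing is done.
    """
    row: Optional[dict] = cache.get(file_id)
    if row is None:
        # Never seen before: parse it.
        return True
    # Seen before: only re-parse if no markdown was stored.
    return not row.get("markdown")


# Example cache with one already-parsed document (illustrative values).
cache = {"file-001": {"file_id": "file-001", "markdown": "# Parsed title"}}
already_done = not should_parse("file-001", cache)
needs_parsing = should_parse("file-002", cache)
```

Here `already_done` and `needs_parsing` are both true: the first file exits early, and only the second would trigger a Landing.ai call.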

Setup Instructions

Step 1: Google Drive
Create or select a folder in Google Drive
Copy the folder ID
Update the Google Drive Trigger node with this folder ID
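
If you only have the folder's browser URL, the ID is the path segment after `/folders/`. The helper below is a small sketch for pulling it out. The function name `extract_folder_id` and the sample ID are hypothetical, and it assumes the usual `https://drive.google.com/drive/folders/<ID>` URL shape.

```python
import re

# Assumption: folder URLs contain "/folders/<ID>", where the ID uses
# letters, digits, underscores, and hyphens.
FOLDER_URL_PATTERN = re.compile(r"/folders/([A-Za-z0-9_-]+)")


def extract_folder_id(url_or_id: str) -> str:
    """Return the folder ID from a Drive folder URL, or the input if it is already an ID."""
    match = FOLDER_URL_PATTERN.search(url_or_id)
    return match.group(1) if match else url_or_id.strip()


# Illustrative URL; the ID is made up.
folder_id = extract_folder_id(
    "https://drive.google.com/drive/folders/1AbC_dEf-123?usp=sharing"
)
```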

Step 2: Landing.ai
Create a Landing.ai account
Generate an API key
Add it in n8n as an HTTP Bearer Auth credential
Update the organization-id header if required
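
For reference, the request headers implied by this step look roughly like the sketch below. In n8n the Bearer credential adds the `Authorization` header for you; the function name `build_headers` is hypothetical, and the `organization-id` header is included only when a value is supplied.

```python
from typing import Dict, Optional


def build_headers(api_key: str, organization_id: Optional[str] = None) -> Dict[str, str]:
    """Build HTTP headers for a Bearer-authenticated request."""
    headers = {"Authorization": f"Bearer {api_key}"}
    if organization_id:
        # Optional header, set only if your account requires it.
        headers["organization-id"] = organization_id
    return headers


# Placeholder values, not real credentials.
headers = build_headers("YOUR_API_KEY", organization_id="YOUR_ORG_ID")
```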

Step 3: Supabase
Create a Supabase project
Create a table named landing_parse_cache
Add fields such as:
file_id
document_name
mime_type
file_size_bytes
job_id
job_status
markdown
uploaded_at
workflow_run_id
Connect Supabase credentials in n8n
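
As a rough guide to the table's shape, the fields above could be modeled like this. The column names come from the list; the types, the optional defaults, and the class name `ParseCacheRow` are assumptions, so adjust them to match your own table definition.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class ParseCacheRow:
    """One row of landing_parse_cache (types are assumed, not prescribed)."""

    file_id: str
    document_name: str
    mime_type: str
    file_size_bytes: int
    job_id: Optional[str] = None
    job_status: Optional[str] = None
    markdown: Optional[str] = None
    uploaded_at: Optional[str] = None  # e.g. an ISO 8601 timestamp string
    workflow_run_id: Optional[str] = None


# Illustrative row for a newly uploaded file that has not been parsed yet.
row = ParseCacheRow(
    file_id="file-001",
    document_name="report.pdf",
    mime_type="application/pdf",
    file_size_bytes=1024,
)
payload = asdict(row)  # plain dict, ready to be sent as a row insert
```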

Expected Input
A document (PDF, DOCX, or another supported format) uploaded to the configured Google Drive folder

Expected Output
Parsed markdown content stored in Supabase
Metadata including:
File ID
File name
MIME type
File size
Job ID
Processing status
Early exit if the document already exists in cache

Error Handling & Edge Cases
Cache check to prevent duplicate processing
Retry-based polling for async job completion
Timeout detection for stuck jobs
Large file output URL handling
Detailed logging for debugging and audits
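
The retry-based polling and timeout detection listed above can be sketched as a bounded loop. This is an illustration under assumptions, not the workflow's exact logic: `fetch_status` is a hypothetical callable that returns the job's current state, and the status strings `"completed"` and `"failed"` are placeholders for whatever the parsing API actually reports.

```python
import time
from typing import Callable


class PollTimeout(Exception):
    """Raised when the job does not finish within the allowed attempts."""


def poll_until_done(
    fetch_status: Callable[[], dict],
    max_attempts: int = 10,
    delay_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_status until the job completes, fails, or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        result = fetch_status()
        status = result.get("status")  # assumed field name
        if status == "completed":
            return result
        if status == "failed":
            raise RuntimeError(f"Parsing job failed on attempt {attempt}")
        if attempt < max_attempts:
            # Wait before the next check; skipped after the final attempt.
            sleep(delay_seconds)
    raise PollTimeout(f"Job still not finished after {max_attempts} attempts")


# Simulated job that finishes on the third check (no real API involved).
responses = iter([
    {"status": "pending"},
    {"status": "processing"},
    {"status": "completed", "markdown": "# Doc"},
])
result = poll_until_done(lambda: next(responses), delay_seconds=0, sleep=lambda s: None)
```

Capping the number of attempts is what turns a stuck job into an explicit timeout error instead of an endless loop.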

Customization Ideas
Push parsed output to a vector database
Trigger Slack or email notifications
Store results in cloud storage (S3, GCS)
Extend into a RAG or AI agent pipeline
Categories
Document Processing
AI & LLM
Knowledge Management
Automation

Difficulty Level
Advanced


Happy Automating - from Alok

Nodes Used (4)

Code (n8n-nodes-base.code)
Google Drive (n8n-nodes-base.googleDrive)
HTTP Request (n8n-nodes-base.httpRequest)
Supabase (n8n-nodes-base.supabase)