AI-Powered Web Scraping with Jina, Google Sheets and OpenAI: the EASY way

Built by Derek Cheung
Created on June 05, 2026

Description

Purpose of workflow:
This workflow automates scraping a website, transforming the scraped content into a structured format, and loading the result directly into a Google Sheets spreadsheet.

How it works:

Web Scraping: Uses the Jina AI service to scrape website data and convert it into LLM-friendly text.
Information Extraction: Uses the Information Extractor node, backed by an OpenAI chat model, to extract specific book details (title, price, availability, image URL, product URL) from the scraped text.
Data Splitting: Splits the extracted information into individual book entries.
Google Sheets Integration: Automatically populates a Google Sheets spreadsheet with the structured book data.

Step by step setup:

Set up Jina AI service:
Sign up for a Jina AI account and obtain an API key.


Configure the HTTP Request node:
Enter the Jina AI URL with the target website.
Add the API key to the request headers for authentication.
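
As a rough illustration of what the HTTP Request node sends, the sketch below builds the equivalent request in Python without sending it. It assumes Jina's Reader endpoint (https://r.jina.ai/ followed by the target URL) and a Bearer token in the Authorization header; the target URL and API key shown are placeholders, and build_jina_request is a hypothetical helper name.

```python
from urllib.request import Request

# Assumption: Jina's Reader endpoint, which takes the target URL appended to its base.
JINA_READER_BASE = "https://r.jina.ai/"


def build_jina_request(target_url: str, api_key: str) -> Request:
    """Build (but do not send) the GET request the HTTP Request node would make."""
    return Request(
        JINA_READER_BASE + target_url,
        # Assumption: the API key is passed as a Bearer token.
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


# Placeholder values for illustration only.
request = build_jina_request("https://example.com/books", "YOUR_JINA_API_KEY")
print(request.full_url)
```

In the n8n node itself, the same two pieces go into the URL field and a header parameter named Authorization.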

Set up the Information Extractor node:
Use Claude AI to generate a JSON schema for data extraction.
Upload a screenshot of the target website to Claude AI.
Ask Claude AI to suggest a JSON schema for extracting required information.
Copy the generated schema into the Information Extractor node.
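
For orientation, a schema for the five book fields might look roughly like the one printed by the sketch below. This is only an assumed shape: the property names (books, image_url, product_url) are hypothetical, and the schema Claude AI suggests for your target site may differ.

```python
import json

# Hypothetical field names covering the five book details listed in this workflow.
BOOK_FIELDS = ["title", "price", "availability", "image_url", "product_url"]

# A JSON Schema describing an object that holds a list of book entries.
book_schema = {
    "type": "object",
    "properties": {
        "books": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {name: {"type": "string"} for name in BOOK_FIELDS},
                "required": BOOK_FIELDS,
            },
        }
    },
    "required": ["books"],
}

# Print the schema as JSON text, ready to paste into the Information Extractor node.
print(json.dumps(book_schema, indent=2))
```

Keeping every field as a string is a deliberate simplification here; prices can be converted to numbers later in the spreadsheet if needed.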

Configure the Split node:
Set it up to separate the extracted data into individual book entries.
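
Conceptually, the splitting step turns one item that holds a list of books into one item per book. The sketch below mimics that behavior in plain Python; the field name books and the sample data are assumptions for illustration, not the node's actual configuration or output.

```python
def split_out(item: dict, field: str = "books") -> list[dict]:
    """Return one dict per entry found in item[field] (empty list if the field is missing)."""
    entries = item.get(field, [])
    # Copy each entry so later changes do not affect the original item.
    return [dict(entry) for entry in entries]


# Made-up sample of what the extraction step might return.
extracted = {
    "books": [
        {"title": "Book A", "price": "10.00"},
        {"title": "Book B", "price": "12.50"},
    ]
}

for book in split_out(extracted):
    print(book["title"], book["price"])
```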

Set up the Google Sheets node:
Create a Google Sheets spreadsheet with columns for title, price, availability, image URL, and product URL.
Configure the node to map the extracted data to the appropriate columns.
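
The column mapping amounts to placing each extracted field under the matching spreadsheet header. A minimal sketch of that idea follows; the dictionary keys (image_url, product_url) and the sample book are hypothetical, and in n8n the mapping is done in the Google Sheets node's settings rather than in code.

```python
# Pairs of (spreadsheet header, key in the extracted book entry).
# The keys are hypothetical; use whatever names your schema defines.
COLUMN_MAP = [
    ("title", "title"),
    ("price", "price"),
    ("availability", "availability"),
    ("image URL", "image_url"),
    ("product URL", "product_url"),
]


def to_row(book: dict) -> list[str]:
    """Order a book's fields to match the spreadsheet columns, using "" for missing values."""
    return [str(book.get(key, "")) for _, key in COLUMN_MAP]


header_row = [header for header, _ in COLUMN_MAP]

# Made-up sample entry for illustration.
sample_book = {
    "title": "Book A",
    "price": "10.00",
    "availability": "In stock",
    "product_url": "https://example.com/books/a",
}

print(header_row)
print(to_row(sample_book))
```

Missing fields become empty cells instead of raising an error, so a partially extracted book still lands in the sheet.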

Nodes Used (4)

Google Sheets
n8n-nodes-base.googleSheets
HTTP Request
n8n-nodes-base.httpRequest
Information Extractor
@n8n/n8n-nodes-langchain.informationExtractor
OpenAI Chat Model
@n8n/n8n-nodes-langchain.lmChatOpenAi