Nano Banana/Gemini 2.5 Telegram Bot with Multi-modal Functionality

Built by

Denis

Created on July 31, 2026

Description

How it works
A multi-modal AI image generator built on Google's Nano Banana (Gemini 2.5 Flash Image) image generation model
Accepts text, images, voice messages, and PDFs via Telegram
Uses OpenAI GPT models for conversation and image analysis, then Nano Banana for image generation
Keeps conversation memory so images can be modified iteratively ("make it darker", "change to blue")
Handles each input type separately: analyzes uploaded images, transcribes voice messages, and extracts text from PDFs
Converts every input into a prompt tuned for Nano Banana
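
The input handling described above could be sketched roughly as follows. This is not the workflow's actual Code node; it is a minimal illustration that assumes the standard Telegram Bot API message fields (`text`, `photo`, `voice`, `document`). The function name `classify_message` and the branch labels are hypothetical.

```python
from typing import Any


def classify_message(message: dict[str, Any]) -> str:
    """Return the processing branch for a Telegram message (hypothetical labels)."""
    # Voice notes arrive under the "voice" key and would go to transcription.
    if "voice" in message:
        return "voice"
    # Photos arrive as a list of sizes under "photo" and would go to image analysis.
    if "photo" in message:
        return "image"
    # Documents carry a MIME type; only PDFs are handled in this sketch.
    document = message.get("document")
    if document is not None:
        if document.get("mime_type") == "application/pdf":
            return "pdf"
        return "unsupported"
    # Plain text is checked last because photos and documents carry a caption, not text.
    if "text" in message:
        return "text"
    return "unsupported"


if __name__ == "__main__":
    print(classify_message({"text": "a red fox in the snow"}))
    print(classify_message({"document": {"mime_type": "application/pdf"}}))
```

Each label would then map to one branch of the workflow: image analysis, voice transcription, PDF text extraction, or direct prompt building.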

Set up steps
Create a Telegram bot via @BotFather and copy its API token
Create a Google Gemini API key in Google AI Studio for Nano Banana image generation (~$0.04/image)
Configure an OpenAI API key for the GPT models (conversation, image analysis, voice transcription)
Import the workflow and configure all three API credentials in n8n
Replace the bot token in the HTTP Request nodes used for file downloads
Test with text prompts, image uploads, voice messages, and PDF documents
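
The bot token matters in the file-download step because Telegram embeds it in the URL path. The sketch below shows the two URLs involved, following the Telegram Bot API: a `getFile` call that resolves a `file_id` to a `file_path`, and the download URL built from that path. The helper names and the placeholder token are hypothetical, and the workflow's HTTP Request nodes may be configured differently.

```python
from urllib.parse import quote, urlencode

TELEGRAM_API = "https://api.telegram.org"


def get_file_url(bot_token: str, file_id: str) -> str:
    """URL of the getFile method, which returns JSON containing result.file_path."""
    query = urlencode({"file_id": file_id})
    return f"{TELEGRAM_API}/bot{bot_token}/getFile?{query}"


def download_url(bot_token: str, file_path: str) -> str:
    """URL for downloading the file content once file_path is known."""
    # quote() keeps "/" intact, so nested paths such as "voice/file_0.oga" survive.
    return f"{TELEGRAM_API}/file/bot{bot_token}/{quote(file_path)}"


if __name__ == "__main__":
    token = "123456:PLACEHOLDER"  # hypothetical token, replace with the real one
    print(get_file_url(token, "AgADexample"))
    print(download_url(token, "voice/file_0.oga"))
```

Because the token is part of the path in both URLs, any HTTP Request node that hard-codes it has to be updated after the workflow is imported.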

Nodes Used (8)

AI Agent

@n8n/n8n-nodes-langchain.agent

Code

n8n-nodes-base.code

Google Gemini Chat Model

@n8n/n8n-nodes-langchain.lmChatGoogleGemini

HTTP Request

n8n-nodes-base.httpRequest

OpenAI

@n8n/n8n-nodes-langchain.openAi

OpenAI Chat Model

@n8n/n8n-nodes-langchain.lmChatOpenAi

Simple Memory

@n8n/n8n-nodes-langchain.memoryBufferWindow

Telegram

n8n-nodes-base.telegram