Triage GitHub issues with OpenAI categorization and embedding-based duplicate detection
Who's it for
Open-source maintainers, product teams with public repositories, and any organization that receives a steady stream of GitHub Issues. Ideal for small teams that lose hours each week triaging duplicates and misrouted reports.
How it works
When a new Issue is opened, a GitHub webhook triggers this workflow. The workflow keeps only events whose action is "opened", then fetches the last 30 Issues from the repository. The texts of the new Issue and the past Issues are sent to OpenAI's embeddings API in a single batch call for efficiency, and the workflow computes the cosine similarity between the new Issue and every past Issue.
If the maximum similarity exceeds 0.85, the new Issue is auto-closed with a comment referencing the original. Otherwise, an OpenAI model classifies it into one of four categories: bug (adds a label and sends a Slack alert to the dev team), question (posts the FAQ link as a comment), feature (appends a row to the roadmap Google Sheet), or spam (adds a label and auto-closes the Issue). Generative AI is used only for classification: duplicate detection applies deterministic vector math to the embeddings, and every action is rule-based.
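The decision logic described above can be sketched in a few lines of Python. This is a minimal illustration, not the workflow's actual node code: the function names, the `ROUTES` table, the action names, and the `classify` callback are all hypothetical, and in the real workflow the vectors come from OpenAI's embeddings API rather than being written by hand.

```python
import math

DUPLICATE_THRESHOLD = 0.85  # the threshold quoted in the description


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def find_duplicate(new_vec, past, threshold=DUPLICATE_THRESHOLD):
    """Return (issue_number, score) of the closest past Issue if its
    score exceeds the threshold, otherwise None.

    `past` is a list of (issue_number, vector) pairs.
    """
    best = None
    for number, vec in past:
        score = cosine_similarity(new_vec, vec)
        if best is None or score > best[1]:
            best = (number, score)
    if best is not None and best[1] > threshold:
        return best
    return None


# Hypothetical action names; each maps to one branch of the workflow.
ROUTES = {
    "bug": ["add_label", "slack_alert"],
    "question": ["comment_faq_link"],
    "feature": ["append_to_sheet"],
    "spam": ["add_label", "close_issue"],
}


def triage(action, new_vec, past, classify):
    """Decide which actions to run for one webhook event.

    `classify` is a zero-argument callable standing in for the OpenAI
    classification step; it is only called when no duplicate is found.
    """
    if action != "opened":
        return ["ignore"]
    duplicate = find_duplicate(new_vec, past)
    if duplicate is not None:
        return [f"comment_duplicate_of_{duplicate[0]}", "close_issue"]
    return ROUTES.get(classify(), ["ignore"])
```

Passing the classifier as a callable keeps the sketch faithful to the ordering in the description: the classification call is skipped entirely when the duplicate check already closes the Issue.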
Set up steps
Generate a GitHub Personal Access Token with repo scope
Create a webhook on your repository pointing to this workflow's URL, subscribing to Issues events
Create a Google Sheet named feature_roadmap with columns: date_added, issue_number, title, author, url, status
Open Set Configuration and fill in the repo owner, repo name, Sheet ID, Slack channel, and FAQ URL
Register Header Auth credentials for GitHub and OpenAI, then connect your Google Sheets and Slack accounts
Activate the workflow
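The webhook in step 2 can be created in the repository's settings page. If you would rather script it, the sketch below builds the equivalent request for GitHub's REST API endpoint for creating a repository webhook. The function name and all argument values are placeholders chosen for this example, and the block only constructs the request; it does not send it.

```python
import json
import urllib.request


def build_webhook_request(owner, repo, token, workflow_url):
    """Build (but do not send) a POST request that subscribes
    `workflow_url` to Issues events on the given repository."""
    payload = {
        "name": "web",  # GitHub requires the literal value "web" here
        "active": True,
        "events": ["issues"],
        "config": {"url": workflow_url, "content_type": "json"},
    }
    return urllib.request.Request(
        url=f"https://api.github.com/repos/{owner}/{repo}/hooks",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values; substitute your own before sending.
request = build_webhook_request(
    owner="your-org",
    repo="your-repo",
    token="YOUR_PERSONAL_ACCESS_TOKEN",
    workflow_url="https://example.com/webhook/your-workflow-path",
)
```

To actually create the webhook, pass the request to `urllib.request.urlopen(request)` once the placeholder values have been replaced with the token from step 1 and the workflow's real webhook URL.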
How to customize
Raise duplicate_threshold for stricter matching or lower it for looser matching, change the embeddings model, or swap Google Sheets for Notion or Airtable.
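Changing the embeddings model should amount to changing one field in the batch request body. The sketch below shows the general shape of a request to OpenAI's embeddings endpoint (`POST https://api.openai.com/v1/embeddings`) and how the response can be split back into the new Issue's vector and the past Issues' vectors. The helper names are hypothetical, and `text-embedding-3-small` is only an assumed default, since the description does not say which model the workflow ships with.

```python
def build_embeddings_payload(new_issue_text, past_issue_texts,
                             model="text-embedding-3-small"):
    """Request body for one batch call. The new Issue goes first so
    that its vector ends up at index 0 of the response."""
    return {"model": model, "input": [new_issue_text, *past_issue_texts]}


def split_embeddings(response):
    """Return (new_issue_vector, past_issue_vectors) from a parsed
    embeddings response. Each item in response["data"] carries an
    "index" that matches its position in the request's input list."""
    items = sorted(response["data"], key=lambda item: item["index"])
    vectors = [item["embedding"] for item in items]
    return vectors[0], vectors[1:]
```

Keep in mind that similarity scores are not comparable across embeddings models, so a threshold tuned for one model will likely need to be re-tuned after a swap.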