# The AI Vendor Evaluation Prompt: Strip the Marketing from Any 'AI-Powered' Tool
Most 'AI-powered' tools are thin wrappers around an OpenAI API call. This prompt helps you figure out whether you're buying real AI capability or a $299/month import statement.
## Why This Prompt Exists
Most 'AI-powered' B2B tools do three things: 1) insert your data into a prompt template, 2) call the OpenAI API, 3) return the response. That's fine, but you should know what you're paying for. When a $50k/year incident management tool suggests you 'check your database connections' during a real outage, you've been had.
## How to Use It
Run this in Claude or GPT-4 after you've seen a vendor demo or read their marketing page. Paste in their feature descriptions when prompted.
---
## THE PROMPT
```
I'm evaluating an AI-powered [CATEGORY: e.g. 'incident management' / 'code review' / 'customer support'] vendor. Their marketing claims are below. Help me cut through it.
VENDOR CLAIMS:
[paste their feature list or marketing copy here]
For each claimed AI feature, give me:
1. WHAT IT LIKELY ACTUALLY IS — three options ranked by likelihood:
a) Standard LLM API call with their prompt template (lowest moat)
b) Fine-tuned or RAG-augmented model on their specific data (medium moat)
c) Custom model or architecture trained on proprietary data (highest moat)
2. THE SHARP QUESTION to ask in a demo to expose which it is. The question should be something a vendor rep cannot easily deflect.
3. THE FAILURE MODE — what does this feature do wrong when it matters most? (Under load, on novel/unseen inputs, at 3am during an incident)
4. THE BUILD-VS-BUY CALCULATION — how many hours to replicate this specific feature with a direct OpenAI/Claude API call and a well-crafted system prompt? Be concrete.
After analyzing all features, tell me:
- What the actual defensible value of this vendor is (if any)
- What I would lose by building the core functionality myself in a weekend
- What I would NOT be able to replicate easily
```
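If you evaluate vendors regularly, filling in the two bracketed placeholders can be scripted. The following is a minimal sketch, not part of the prompt itself: `fill_prompt` and its parameter names are hypothetical, and it assumes you have saved the prompt text above, placeholders intact, into a string.

```python
import re

# Matches the "[CATEGORY: ...]" placeholder regardless of the example text inside it.
CATEGORY_SLOT = re.compile(r"\[CATEGORY:[^\]]*\]")
# The claims placeholder, copied verbatim from the prompt.
CLAIMS_SLOT = "[paste their feature list or marketing copy here]"


def fill_prompt(template: str, category: str, claims: str) -> str:
    """Return the prompt with both placeholders replaced.

    Raises ValueError if either placeholder is missing, so a stale or
    hand-edited template fails loudly instead of sending a broken prompt.
    """
    if not CATEGORY_SLOT.search(template) or CLAIMS_SLOT not in template:
        raise ValueError("template is missing a placeholder")
    # A function replacement keeps backslashes in `category` literal.
    filled = CATEGORY_SLOT.sub(lambda _match: category, template, count=1)
    return filled.replace(CLAIMS_SLOT, claims.strip())
```

Call it as `fill_prompt(prompt_text, "incident management", vendor_copy)` and paste the result into Claude or GPT-4 as usual.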
## What To Do With the Output
Take the 'sharp questions' into your next vendor demo. If the sales rep can answer them cleanly with specifics — model architecture, training data provenance, accuracy benchmarks on your use case — that's a good sign. If they say 'I'll have to follow up on the technical details,' you have your answer.
## Real Example of What This Catches
An incident tool that claims 'AI root cause analysis' is almost certainly doing this: take your alert payload → insert it into a prompt template → call GPT-4 → return the response. You can replicate this in an afternoon with a Slack webhook and a pay-as-you-go OpenAI API key. The question isn't whether it uses AI; it's whether the vendor has built something on top of that which you can't replicate.
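To make the size of that pipeline concrete, here is a rough sketch of the alert-to-response flow described above. Everything in it is an assumption for illustration: the template wording, the function names, and the alert fields are invented, and the model call is passed in as a plain `call_llm` function rather than tied to any particular SDK. It does not show what any specific vendor actually runs.

```python
import json
from typing import Callable

# Hypothetical prompt template; a vendor's real one is unknown to us.
PROMPT_TEMPLATE = (
    "You are an on-call assistant. Given the alert below, suggest the most "
    "likely root cause and the first three things to check.\n\n"
    "ALERT PAYLOAD:\n{alert_json}"
)


def build_prompt(alert: dict) -> str:
    """Step 1: insert the alert payload into the prompt template."""
    alert_json = json.dumps(alert, indent=2, sort_keys=True)
    return PROMPT_TEMPLATE.format(alert_json=alert_json)


def analyze_alert(alert: dict, call_llm: Callable[[str], str]) -> str:
    """Steps 2 and 3: send the prompt to a model and return its reply unchanged.

    `call_llm` is any function that takes a prompt string and returns text,
    for example a thin wrapper around your provider's chat endpoint.
    """
    return call_llm(build_prompt(alert))


if __name__ == "__main__":
    # Stand-in for a real model call so the sketch runs offline.
    def stub_llm(prompt: str) -> str:
        return f"(stub) received a prompt of {len(prompt)} characters"

    example_alert = {"service": "checkout-db", "severity": "critical"}
    print(analyze_alert(example_alert, stub_llm))
```

Swapping `stub_llm` for a real API wrapper and posting the returned string to a Slack webhook completes the afternoon project. Whatever the vendor offers beyond these three steps is what the sharp questions are meant to surface.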