Stagehand uses Large Language Models (LLMs) to understand web pages, plan actions, and interact with complex interfaces. The choice of LLM significantly impacts your automation’s accuracy, speed, and cost.

Model Evaluation

Find more details about how to choose the right model on our Model Evaluation page.

Why LLM Choice Matters

  • Accuracy: Better models provide more reliable element detection and action planning
  • Speed: Faster models reduce automation latency
  • Cost: Different providers offer varying pricing structures
  • Reliability: Structured output support ensures consistent automation behavior
Small models on Ollama struggle with consistent structured outputs. While technically supported, we don’t recommend them for production Stagehand workflows.

Environment Variables Setup

Set up your API keys before configuring Stagehand:
```bash
# Choose one or more providers
OPENAI_API_KEY=your_openai_key_here
ANTHROPIC_API_KEY=your_anthropic_key_here
GOOGLE_API_KEY=your_google_key_here
GROQ_API_KEY=your_groq_key_here
```
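If you keep these keys in a local .env file, a loader such as the dotenv package (an assumption here, not a Stagehand requirement) makes them available on process.env before Stagehand reads them:

```typescript
// Assumes `dotenv` is installed (npm install dotenv).
// Loads variables from a local .env file into process.env.
import "dotenv/config";

// The keys are now visible to Stagehand and to your own code.
console.log(Boolean(process.env.OPENAI_API_KEY)); // true if the key is set
```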

Supported Providers

Stagehand supports major LLM providers with structured output capabilities:

Production-Ready Providers

| Provider | Best Models | Strengths | Use Case |
| --- | --- | --- | --- |
| OpenAI | gpt-4.1, gpt-4.1-mini | High accuracy, reliable | Production, complex sites |
| Anthropic | claude-3-7-sonnet-latest | Excellent reasoning | Complex automation tasks |
| Google | gemini-2.5-flash, gemini-2.5-pro | Fast, cost-effective | High-volume automation |

Additional Providers

Basic Configuration

Model Name Format

Stagehand uses the format provider/model-name for model specification. Examples:
  • OpenAI: openai/gpt-4.1
  • Anthropic: anthropic/claude-3-7-sonnet-latest
  • Google: google/gemini-2.5-flash (Recommended)

Quick Start Examples
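
As a minimal sketch of wiring a model into Stagehand (option names reflect the Stagehand constructor; the model and key shown are placeholders):

```typescript
import { Stagehand } from "@browserbasehq/stagehand";

// Pass the model in provider/model-name format; modelClientOptions
// carries the provider API key (assumed to be set in the environment).
const stagehand = new Stagehand({
  env: "LOCAL",
  modelName: "google/gemini-2.5-flash",
  modelClientOptions: {
    apiKey: process.env.GOOGLE_API_KEY,
  },
});

await stagehand.init();
```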

Custom LLM Integration

Custom LLMs are currently only supported in TypeScript.
Integrate any LLM with Stagehand using custom clients. The only requirement is structured output support for consistent automation behavior.

Vercel AI SDK

The Vercel AI SDK is a popular library for interacting with LLMs. You can use any provider the Vercel AI SDK supports to create a client for your model, as long as it supports structured outputs. The SDK provides packages for OpenAI, Anthropic, and Google, along with Amazon Bedrock and Azure OpenAI.

To get started, install the ai package and the provider you want to use. For example, to use Amazon Bedrock, install the @ai-sdk/amazon-bedrock package:

```bash
npm install ai @ai-sdk/amazon-bedrock
```

Then import the Vercel AI SDK external client, which is exposed as AISdkClient, to create a client for your model:
```typescript
// Install/import the provider you want to use.
// For example, to use OpenAI, import `openai` from @ai-sdk/openai.
import { bedrock } from "@ai-sdk/amazon-bedrock";
import { AISdkClient, Stagehand } from "@browserbasehq/stagehand";

const stagehand = new Stagehand({
  llmClient: new AISdkClient({
    model: bedrock("anthropic.claude-3-7-sonnet-20250219-v1:0"),
  }),
});
```
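
Once constructed, this is used like any other Stagehand instance. A brief usage sketch (the URL and instruction are placeholders):

```typescript
await stagehand.init();
await stagehand.page.goto("https://example.com");
await stagehand.page.act("click the sign-in button");
await stagehand.close();
```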

Troubleshooting

Common Issues

Error: Model does not support structured outputs

Solution: Use models that support function calling/structured outputs. The minimum requirements are:
  • Model must support JSON/structured outputs
  • Model must have strong reasoning capabilities
  • Model must be able to handle complex instructions

For each provider, use their latest models that meet these requirements. Some examples:
  • OpenAI: GPT-4 series or newer
  • Anthropic: Claude 3 series or newer
  • Google: Gemini 2 series or newer
  • Other providers: Latest models with structured output support

Note: Avoid base language models without structured output capabilities or fine-tuning for instruction following. When in doubt, check our Model Evaluation page for up-to-date recommendations.
Error: Invalid API key or Unauthorized

Solution:
  • Verify your environment variables are set correctly
  • Check API key permissions and quotas
  • Ensure you’re using the correct API key for the provider
  • For Anthropic, make sure you have access to the Claude API
Symptoms: Actions work sometimes but fail other times

Causes & Solutions:
  • Weak models: Use more capable models - check our Model Evaluation page for current recommendations
  • High temperature: Set temperature to 0 for deterministic outputs
  • Complex pages: Switch to models with higher accuracy scores on our Model Evaluation page
  • Rate limits: Implement retry logic with exponential backoff (see the sketch after this list)
  • Context limits: Reduce page complexity or use models with larger context windows
  • Prompt clarity: Ensure your automation instructions are clear and specific
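
To illustrate the retry suggestion above, here is a minimal, generic exponential-backoff wrapper; the helper name and delay values are our own, not part of Stagehand’s API:

```typescript
// Generic retry helper with exponential backoff (illustrative).
async function withRetry<T>(
  action: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await action();
    } catch (err) {
      lastError = err;
      // Wait 500ms, 1000ms, 2000ms, ... before the next attempt.
      const delay = baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}

// Usage: wrap a flaky action.
// await withRetry(() => stagehand.page.act("click the submit button"));
```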
Issue: Automation takes too long to respond

Solutions:
  • Use fast models: Choose models optimized for speed
    • Any model with < 1s response time
    • Models with “fast” or “flash” variants
  • Optimize settings (see the sketch after this list):
    • Use verbose: 0 to minimize token usage
    • Set temperature to 0 for fastest processing
    • Keep max tokens as low as possible
  • Consider local deployment: Local models can provide the lowest latency
  • Batch operations: Group multiple actions when possible
Issue: LLM usage costs are too high

Cost Optimization Strategies:
  1. Switch to cost-effective models:
    • Check our Model Evaluation page for current cost-performance benchmarks
    • Choose models with a lower cost per token that still meet accuracy requirements
    • Consider models optimized for speed to reduce total runtime costs
  2. Optimize token usage:
    • Set verbose: 0 to reduce logging overhead
    • Use concise prompts and limit response length
  3. Smart model selection: Start with cheaper models, and fall back to premium ones only when needed
  4. Cache responses: Implement LLM response caching for repeated automation patterns (see the sketch after this list)
  5. Monitor usage: Set up billing alerts and track costs per automation run
  6. Batch processing: Process multiple similar tasks together
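
As a simple illustration of response caching, here is a generic in-memory memoization wrapper. This is our own helper, not a Stagehand feature; production use would want persistent storage and invalidation when pages change:

```typescript
// Illustrative in-memory cache for repeated LLM-backed steps.
const responseCache = new Map<string, unknown>();

async function cached<T>(key: string, compute: () => Promise<T>): Promise<T> {
  if (responseCache.has(key)) {
    return responseCache.get(key) as T; // cache hit: no LLM call, no cost
  }
  const result = await compute(); // cache miss: pay for one LLM call
  responseCache.set(key, result);
  return result;
}

// Usage: key by URL + instruction so identical steps reuse one response.
// const prices = await cached(`${url}:extract prices`, () =>
//   stagehand.page.extract("extract the product prices"),
// );
```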

Next Steps
