
General Questions

MCP (Model Context Protocol) is an open protocol that enables AI assistants to securely connect to external tools and data sources. The Perplexity MCP Server implements this protocol to provide AI assistants with access to Perplexity’s web-grounded search, research, and reasoning capabilities.
Benefits of MCP:
  • Standardized way to integrate tools with AI assistants
  • Works across multiple clients (Cursor, Claude Desktop, VS Code, etc.)
  • Secure and controlled access to external APIs
  • Easy to install and configure
Learn more about MCP at modelcontextprotocol.io
The Perplexity MCP Server is the official integration that brings Perplexity’s AI-powered search and research capabilities to your favorite AI coding assistants and development tools.
What it provides:
  • Real-time web search with ranked results
  • AI-powered question answering with citations
  • Deep research with comprehensive multi-source analysis
  • Advanced reasoning for complex analytical tasks
All tools are read-only and query live web data, so your AI assistant works from current information.

Tool Selection

Choose the right tool based on your needs:

perplexity_search

Best for: Finding URLs, checking recent news, verifying facts, discovering sources
Returns: Ranked list of results with titles, URLs, snippets, and dates (no AI synthesis)
Speed: Fast (< 2 seconds)
Example use cases:
  • “Find official documentation for React hooks”
  • “Search for recent articles about TypeScript 5.0”
  • “Get URLs for Python asyncio tutorials”

perplexity_ask

Best for: Quick factual questions, summaries, explanations, general Q&A
Returns: AI-generated answer with numbered citations
Speed: Fast (< 3 seconds)
Model: sonar-pro
Example use cases:
  • “What are the main features of Next.js 14?”
  • “Explain how OAuth 2.0 works”
  • “Summarize recent changes to the AWS SDK”

perplexity_research

Best for: Literature reviews, comprehensive overviews, investigative queries needing many sources
Returns: Detailed multi-source analysis with numbered citations
Speed: Slow (30+ seconds)
Model: sonar-deep-research
Example use cases:
  • “Compare modern web frameworks for building SaaS applications”
  • “Research best practices for microservices architecture in 2026”
  • “Comprehensive analysis of GraphQL vs REST API design”

perplexity_reason

Best for: Math, logic, comparisons, complex arguments, chain-of-thought tasks
Returns: Step-by-step reasoning with numbered citations
Speed: Medium (3-10 seconds)
Model: sonar-reasoning-pro
Example use cases:
  • “Analyze the trade-offs between PostgreSQL and MongoDB”
  • “Calculate the complexity of this sorting algorithm”
  • “Compare the security implications of different authentication methods”
Start with perplexity_ask for most questions. Use perplexity_research only when you need comprehensive multi-source analysis. Use perplexity_reason when the task requires logical analysis or step-by-step thinking.
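The selection guidance above can be written as a small decision function. This is only an illustrative sketch: `choose_tool` and its three flags are hypothetical names, not part of the MCP server, and the order of the checks is one reasonable reading of the guidance.

```python
def choose_tool(urls_only: bool = False,
                deep_research: bool = False,
                step_by_step: bool = False) -> str:
    """Pick a tool name from the caller's needs (hypothetical helper)."""
    if urls_only:
        # Raw ranked results, no AI synthesis.
        return "perplexity_search"
    if deep_research:
        # Slow, comprehensive multi-source analysis.
        return "perplexity_research"
    if step_by_step:
        # Logical analysis or chain-of-thought tasks.
        return "perplexity_reason"
    # Default for most questions.
    return "perplexity_ask"
```

For example, `choose_tool()` falls through to `perplexity_ask`, which matches the advice to start there for most questions.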

Features

Most Perplexity tools return responses with numbered citations that reference the sources used to generate the answer.
Citation format:
Next.js 14 introduces several improvements including Turbopack...

Citations:
[1] https://nextjs.org/blog/next-14
[2] https://vercel.com/blog/turbopack
[3] https://github.com/vercel/next.js/releases
Which tools include citations:
  • perplexity_ask - Yes
  • perplexity_research - Yes
  • perplexity_reason - Yes
  • perplexity_search - No (returns raw search results instead)
Why citations matter:
  • Verify the accuracy of information
  • Explore sources for deeper understanding
  • Assess the credibility of the response
  • Reference sources in your own documentation
Citations are numbered in the order they appear in the response. The same source may be cited multiple times.
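If you want to work with the sources programmatically, the citation block can be pulled out of the response text. The sketch below assumes the response follows the format shown above exactly (a "Citations:" line followed by "[n] URL" lines); `parse_citations` is a hypothetical helper, not something the server provides.

```python
import re

# Matches lines such as "[1] https://nextjs.org/blog/next-14".
CITATION_LINE = re.compile(r"^\[(\d+)\]\s+(\S+)$")


def parse_citations(response_text: str) -> dict[int, str]:
    """Return {citation number: URL} for lines after the "Citations:" marker."""
    citations: dict[int, str] = {}
    in_block = False
    for line in response_text.splitlines():
        stripped = line.strip()
        if stripped == "Citations:":
            in_block = True
            continue
        if in_block:
            match = CITATION_LINE.match(stripped)
            if match:
                citations[int(match.group(1))] = match.group(2)
    return citations
```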
Yes! You can configure a custom base URL using the PERPLEXITY_BASE_URL environment variable. This is useful for:
  • Using a proxy or gateway
  • Testing against a staging environment
  • Routing through a custom endpoint
Configuration:
{
  "mcpServers": {
    "perplexity": {
      "command": "npx",
      "args": ["-y", "@perplexity-ai/mcp-server"],
      "env": {
        "PERPLEXITY_API_KEY": "your_key_here",
        "PERPLEXITY_BASE_URL": "https://your-custom-url.com"
      }
    }
  }
}
Default value: https://api.perplexity.ai
The custom URL should provide API-compatible endpoints at /chat/completions and /search.
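To check what a proxy or gateway needs to serve, it can help to spell out the URLs that result from a given base URL. The sketch below is an assumption about how the two paths combine with the base; `endpoint_urls` is a hypothetical helper, and the server's own URL handling may differ in details such as trailing slashes.

```python
import os

DEFAULT_BASE_URL = "https://api.perplexity.ai"


def endpoint_urls(env=None) -> dict[str, str]:
    """Build the two endpoint URLs from PERPLEXITY_BASE_URL or the default."""
    env = os.environ if env is None else env
    # An unset or empty variable falls back to the documented default.
    base = (env.get("PERPLEXITY_BASE_URL") or DEFAULT_BASE_URL).rstrip("/")
    return {
        "chat": f"{base}/chat/completions",
        "search": f"{base}/search",
    }
```

Passing a plain dictionary instead of the real environment makes the function easy to test against a staging URL.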
strip_thinking is an optional parameter for perplexity_reason and perplexity_research that removes internal reasoning tokens from the response.
Purpose:
  • Some models include <think>...</think> tags showing their internal reasoning process
  • These tags can consume significant context tokens in your AI assistant
  • Setting strip_thinking: true removes these tags while keeping the final answer
Usage:
{
  "tool": "perplexity_reason",
  "arguments": {
    "messages": [{"role": "user", "content": "..."}],
    "strip_thinking": true
  }
}
Default value: false (thinking tokens are included)
When to use it:
  • ✅ You only need the final answer, not the reasoning process
  • ✅ You’re running low on context tokens
  • ✅ The thinking tokens are cluttering your chat history
  • ❌ You want to understand how the model arrived at its conclusion
  • ❌ You’re debugging unexpected responses
Try leaving strip_thinking as false initially. Enable it if you notice context length issues or prefer cleaner responses.
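To see what stripping amounts to, here is a minimal sketch of removing `<think>...</think>` blocks from a response string. It illustrates the effect only; the function name and regular expression are assumptions, and the server's actual implementation may differ.

```python
import re

# Non-greedy match so several separate <think> blocks are each removed.
THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL)


def strip_thinking(text: str) -> str:
    """Remove <think>...</think> blocks and trim the surrounding whitespace."""
    return THINK_BLOCK.sub("", text).strip()
```

The same function can be applied client-side to a response you requested with strip_thinking left as false, if you decide afterwards that you only want the final answer.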
The default timeout is 5 minutes (300,000ms), which works for most queries. For longer research tasks, increase it using PERPLEXITY_TIMEOUT_MS.
Configuration:
{
  "mcpServers": {
    "perplexity": {
      "command": "npx",
      "args": ["-y", "@perplexity-ai/mcp-server"],
      "env": {
        "PERPLEXITY_API_KEY": "your_key_here",
        "PERPLEXITY_TIMEOUT_MS": "600000"
      }
    }
  }
}
Recommended timeouts:
  • perplexity_search: 60000ms (1 minute)
  • perplexity_ask: 180000ms (3 minutes)
  • perplexity_reason: 300000ms (5 minutes) - default
  • perplexity_research: 600000ms+ (10+ minutes)
Timeout behavior:
  • If the API doesn’t respond within the timeout period, the request is aborted
  • You’ll receive an error: “Request timeout: Perplexity API did not respond within Xms”
  • The timeout is checked on each request, so you can change it without restarting
The perplexity_research tool using the sonar-deep-research model typically takes 30+ seconds and may take several minutes for complex queries.
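The timeout settings above can be summarized in code. This sketch collects the recommended values in a table and shows one way to resolve the effective timeout from the environment. Falling back to the default on a missing or invalid value is an assumption, not documented server behavior, and the helper names are hypothetical.

```python
import os

DEFAULT_TIMEOUT_MS = 300_000  # 5 minutes

# Recommended values from the list above, in milliseconds.
RECOMMENDED_TIMEOUT_MS = {
    "perplexity_search": 60_000,
    "perplexity_ask": 180_000,
    "perplexity_reason": 300_000,
    "perplexity_research": 600_000,
}


def resolve_timeout_ms(env=None) -> int:
    """Read PERPLEXITY_TIMEOUT_MS, falling back to the default (assumed behavior)."""
    env = os.environ if env is None else env
    try:
        value = int(env.get("PERPLEXITY_TIMEOUT_MS", ""))
    except ValueError:
        return DEFAULT_TIMEOUT_MS
    return value if value > 0 else DEFAULT_TIMEOUT_MS


def timeout_message(timeout_ms: int) -> str:
    """Format the timeout error text quoted above."""
    return f"Request timeout: Perplexity API did not respond within {timeout_ms}ms"
```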
Rate limits are determined by your Perplexity API subscription plan, not by the MCP server.
Check your limits:
  1. Visit the Perplexity API Portal
  2. View your current plan and rate limits
  3. Monitor your usage and remaining quota
If you hit rate limits:
  • You’ll receive a 429 error from the API
  • Wait before making additional requests
  • Consider upgrading your API plan for higher limits
  • Optimize your queries to use fewer API calls
Rate limit best practices:
  • Use perplexity_search instead of perplexity_ask when you only need URLs
  • Use perplexity_ask instead of perplexity_research for simple questions
  • Cache responses when appropriate
  • Implement exponential backoff for retries
The MCP server does not implement client-side rate limiting or caching. All requests are sent directly to the Perplexity API.
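Because the server does not retry for you, any backoff has to live in your own client code. The following is a generic sketch of exponential backoff with jitter. `RateLimitError` is a hypothetical stand-in for however your client surfaces a 429 response, and the retry count and delays are illustrative values, not recommendations from Perplexity.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for an HTTP 429 response."""


def call_with_backoff(call, max_retries: int = 5, base_delay: float = 1.0,
                      sleep=time.sleep):
    """Run call(); on RateLimitError wait base_delay * 2**attempt plus jitter, then retry."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except RateLimitError:
            if attempt == max_retries:
                raise  # Out of retries: let the caller handle the error.
            # Delay doubles on each attempt; jitter avoids synchronized retries.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
```

The `sleep` parameter is injectable so the retry logic can be exercised in tests without actually waiting.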

Advanced Features

Yes! Several tools support filtering parameters:

Recency Filtering

Available for: perplexity_ask, perplexity_reason
{
  "search_recency_filter": "day"  // "hour", "day", "week", "month", "year"
}
Use cases:
  • "hour" - Breaking news and very recent updates
  • "day" - Today’s developments
  • "week" - Recent announcements and releases
  • "month" - Current trends and recent changes
  • "year" - Exclude older information

Domain Filtering

Available for: perplexity_ask, perplexity_reason
{
  "search_domain_filter": ["wikipedia.org", "arxiv.org"]
}
Include specific domains:
["github.com", "stackoverflow.com"]  // Only these domains
Exclude domains (use - prefix):
["-reddit.com", "-twitter.com"]  // Exclude these domains

Search Context Size

Available for: perplexity_ask, perplexity_reason
{
  "search_context_size": "high"  // "low", "medium", "high"
}
  • "low" (default) - Fastest, less context
  • "medium" - Balanced speed and comprehensiveness
  • "high" - Most comprehensive, slower
Combine filters for precise results: filter by recency for current information, by domain for trusted sources, and use higher context size for comprehensive answers.
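As an illustration of combining filters, the sketch below assembles a tool-arguments dictionary and rejects values outside the lists shown above. `build_filter_arguments` is a hypothetical helper. The parameter names come from this section, and the messages shape follows the usage examples elsewhere on this page.

```python
RECENCY_VALUES = {"hour", "day", "week", "month", "year"}
CONTEXT_SIZES = {"low", "medium", "high"}


def build_filter_arguments(question: str, recency=None, domains=None,
                           context_size=None) -> dict:
    """Build arguments for perplexity_ask or perplexity_reason with optional filters."""
    arguments = {"messages": [{"role": "user", "content": question}]}
    if recency is not None:
        if recency not in RECENCY_VALUES:
            raise ValueError(f"unsupported recency filter: {recency!r}")
        arguments["search_recency_filter"] = recency
    if domains:
        # Entries prefixed with "-" exclude a domain; the others restrict to it.
        arguments["search_domain_filter"] = list(domains)
    if context_size is not None:
        if context_size not in CONTEXT_SIZES:
            raise ValueError(f"unsupported context size: {context_size!r}")
        arguments["search_context_size"] = context_size
    return arguments
```

For instance, `build_filter_arguments("Latest asyncio changes", recency="week", domains=["github.com"], context_size="high")` produces a single arguments object that applies all three filters.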
reasoning_effort controls the depth of analysis for the perplexity_research tool using the sonar-deep-research model.
Available values:
  • "minimal" - Quick overview with basic analysis
  • "low" - Standard research with moderate depth
  • "medium" - Thorough analysis with multiple sources
  • "high" - Comprehensive deep dive with extensive investigation
Usage:
{
  "tool": "perplexity_research",
  "arguments": {
    "messages": [{"role": "user", "content": "..."}],
    "reasoning_effort": "high"
  }
}
Trade-offs:
  • Higher effort = More thorough analysis but slower response time
  • Lower effort = Faster results but less comprehensive coverage
When to use higher effort:
  • Complex topics requiring multiple perspectives
  • Critical decisions needing thorough research
  • Comprehensive comparisons and evaluations
When to use lower effort:
  • Simple overview or introduction to a topic
  • Time-sensitive queries
  • When you already have some context
The reasoning_effort parameter only applies to perplexity_research. Other tools use fixed reasoning strategies optimized for their use cases.
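A client can validate the effort level before sending a request. This is a minimal sketch: `research_call` is a hypothetical helper that produces the same payload shape as the usage example above and accepts only the four documented values.

```python
EFFORT_LEVELS = ("minimal", "low", "medium", "high")


def research_call(question: str, effort: str) -> dict:
    """Build a perplexity_research call with a validated reasoning_effort."""
    if effort not in EFFORT_LEVELS:
        raise ValueError(f"reasoning_effort must be one of {EFFORT_LEVELS}")
    return {
        "tool": "perplexity_research",
        "arguments": {
            "messages": [{"role": "user", "content": question}],
            "reasoning_effort": effort,
        },
    }
```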
