Deploy the Perplexity MCP Server to cloud platforms for scalable, production-ready environments with high availability and automated management.

Overview

Cloud deployment allows you to:
  • Scale horizontally with multiple instances
  • Leverage managed infrastructure
  • Implement automated deployments
  • Benefit from cloud provider reliability
  • Integrate with cloud-native monitoring

General Requirements

All cloud deployments require:

Environment Variables

  • PERPLEXITY_API_KEY (required)
  • PORT for HTTP server
  • ALLOWED_ORIGINS for CORS

Networking

  • HTTP server listening on configured port
  • Health check endpoint at /health
  • MCP endpoint at /mcp
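The requirements above can be checked before a deployment starts. The following is a minimal sketch of such a preflight check; the function name `validate_env` and the exact error messages are illustrative assumptions, not part of the server.

```python
from typing import Mapping


def validate_env(env: Mapping[str, str]) -> list[str]:
    """Return a list of problems found in the deployment environment."""
    problems = []

    # The API key is the only variable the document marks as required.
    if not env.get("PERPLEXITY_API_KEY"):
        problems.append("PERPLEXITY_API_KEY is required")

    # PORT must parse as a valid TCP port number.
    try:
        port = int(env.get("PORT", ""))
    except ValueError:
        port = 0
    if not 1 <= port <= 65535:
        problems.append("PORT must be an integer between 1 and 65535")

    # An empty ALLOWED_ORIGINS leaves the CORS policy undefined.
    if not env.get("ALLOWED_ORIGINS"):
        problems.append("ALLOWED_ORIGINS should list the origins allowed by CORS")

    return problems
```

Calling `validate_env(os.environ)` in a container entrypoint or CI step and failing on a non-empty result surfaces misconfiguration before the platform's health checks do.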

Platform Examples

AWS Elastic Container Service (ECS)

1. Build and push Docker image

Build your Docker image and push to Amazon ECR:
# Authenticate to ECR
aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin <account-id>.dkr.ecr.us-east-1.amazonaws.com

# Build and tag
docker build -t perplexity-mcp-server .
docker tag perplexity-mcp-server:latest <account-id>.dkr.ecr.us-east-1.amazonaws.com/perplexity-mcp-server:latest

# Push to ECR
docker push <account-id>.dkr.ecr.us-east-1.amazonaws.com/perplexity-mcp-server:latest

2. Create ECS task definition

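Define your container with environment variables. The task execution role must be allowed to read the secret referenced under secrets, and the container health check assumes curl is present in the image: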
task-definition.json
{
  "family": "perplexity-mcp-server",
  "containerDefinitions": [
    {
      "name": "perplexity-mcp",
      "image": "<account-id>.dkr.ecr.us-east-1.amazonaws.com/perplexity-mcp-server:latest",
      "portMappings": [
        {
          "containerPort": 8080,
          "protocol": "tcp"
        }
      ],
      "environment": [
        {"name": "PORT", "value": "8080"},
        {"name": "BIND_ADDRESS", "value": "0.0.0.0"},
        {"name": "ALLOWED_ORIGINS", "value": "https://your-app.com"}
      ],
      "secrets": [
        {
          "name": "PERPLEXITY_API_KEY",
          "valueFrom": "arn:aws:secretsmanager:us-east-1:<account-id>:secret:perplexity-api-key"
        }
      ],
      "healthCheck": {
        "command": ["CMD-SHELL", "curl -f http://localhost:8080/health || exit 1"],
        "interval": 30,
        "timeout": 5,
        "retries": 3
      }
    }
  ]
}
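A task definition like the one above is easy to get subtly wrong, for example by placing the API key under environment or by probing a different port than the one mapped. The sketch below lints those points before registration. The function names and messages are hypothetical; only the JSON field names come from the task definition shown.

```python
import json


def check_task_definition(task_def: dict) -> list[str]:
    """Return human-readable problems found in an ECS task definition dict."""
    problems = []
    for container in task_def.get("containerDefinitions", []):
        name = container.get("name", "<unnamed>")
        env = {e["name"]: e["value"] for e in container.get("environment", [])}
        secrets = {s["name"] for s in container.get("secrets", [])}
        ports = {str(m["containerPort"]) for m in container.get("portMappings", [])}

        # The API key should come from the secrets section, never plain env.
        if "PERPLEXITY_API_KEY" in env:
            problems.append(f"{name}: PERPLEXITY_API_KEY is a plain environment value")
        elif "PERPLEXITY_API_KEY" not in secrets:
            problems.append(f"{name}: PERPLEXITY_API_KEY is not listed under secrets")

        # PORT must match a mapped container port.
        if env.get("PORT") not in ports:
            problems.append(f"{name}: PORT={env.get('PORT')} has no matching containerPort")

        # If a health check is defined, it should probe /health on the same port.
        command = " ".join(container.get("healthCheck", {}).get("command", []))
        if command and f":{env.get('PORT')}/health" not in command:
            problems.append(f"{name}: health check does not probe /health on PORT")
    return problems


def check_task_definition_file(path: str) -> list[str]:
    """Load a task definition JSON file and lint it."""
    with open(path, encoding="utf-8") as handle:
        return check_task_definition(json.load(handle))
```

Running `check_task_definition_file("task-definition.json")` on the example above should return an empty list once the `<account-id>` placeholders are filled in, since the check does not inspect image names or ARNs.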

3. Deploy to ECS

Create an ECS service from the task definition and attach it to a load balancer so that traffic is spread across multiple tasks.

Google Cloud Run

1. Build and deploy

Deploy directly from source or Docker image:
# Deploy from source
gcloud run deploy perplexity-mcp-server \
  --source . \
  --region us-central1 \
  --port 8080 \
  --set-env-vars ALLOWED_ORIGINS=https://your-app.com \
  --set-secrets PERPLEXITY_API_KEY=perplexity-api-key:latest
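
2. Configure health checks

By default, Cloud Run only checks that the container is listening on its port (a TCP startup probe). Probing /health requires configuring an HTTP startup or liveness probe on the service. The command below keeps end-to-end HTTP/2 disabled so that requests reach the container over HTTP/1.1: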
gcloud run services update perplexity-mcp-server \
  --region us-central1 \
  --no-use-http2

Azure Container Instances

az container create \
  --resource-group myResourceGroup \
  --name perplexity-mcp-server \
  --image <registry>.azurecr.io/perplexity-mcp-server:latest \
  --dns-name-label perplexity-mcp \
  --ports 8080 \
  --environment-variables \
    PORT=8080 \
    BIND_ADDRESS=0.0.0.0 \
    ALLOWED_ORIGINS=https://your-app.com \
  --secure-environment-variables \
    PERPLEXITY_API_KEY=your_key_here

Heroku

1. Create Heroku app

heroku create perplexity-mcp-server

2. Set environment variables

heroku config:set PERPLEXITY_API_KEY=your_key_here
heroku config:set ALLOWED_ORIGINS=https://your-app.com

3. Deploy

git push heroku main

Render

Create a new Web Service with:
  • Build Command: npm install && npm run build
  • Start Command: npm run start:http
  • Environment Variables:
    • PERPLEXITY_API_KEY (secret)
    • PORT (auto-set by Render)
    • ALLOWED_ORIGINS (your domain)

Railway

# Install Railway CLI
npm install -g @railway/cli

# Initialize and deploy
railway init
railway up

# Set environment variables
railway variables set PERPLEXITY_API_KEY=your_key_here
railway variables set ALLOWED_ORIGINS=https://your-app.com

Health Check Configuration

Configure every platform's health check to call the /health endpoint. You can verify the endpoint manually:
curl http://your-deployment-url/health
Expected response:
{
  "status": "ok",
  "service": "perplexity-mcp-server"
}
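The same check can be scripted, for example as a post-deploy smoke test. This is a minimal sketch using only the Python standard library; it assumes a healthy server answers with HTTP 200 and a JSON body whose status field is "ok", as in the expected response above. The function names are illustrative.

```python
import http.client
import json
import urllib.request


def is_healthy(status_code: int, body: bytes) -> bool:
    """Decide health from an HTTP status code and raw response body."""
    if status_code != 200:
        return False
    try:
        payload = json.loads(body)
    except ValueError:  # covers invalid JSON and undecodable bytes
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"


def probe(url: str, timeout_s: float = 5.0) -> bool:
    """Fetch the health endpoint; any network or HTTP error counts as unhealthy."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as response:
            return is_healthy(response.status, response.read())
    except (OSError, http.client.HTTPException):
        return False
```

For example, `probe("http://your-deployment-url/health")` returns False rather than raising when the deployment is unreachable, which keeps the calling script simple.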

Health Check Settings

Setting              Recommended Value
Path                 /health
Interval             30 seconds
Timeout              5 seconds
Healthy threshold    2 consecutive successes
Unhealthy threshold  3 consecutive failures

Configure health checks to restart unhealthy containers automatically. This ensures high availability and automatic recovery from transient failures.
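To make the threshold settings concrete, the sketch below models how a typical health checker applies them: a state change only happens after enough consecutive results contradict the current state. This is an illustrative model, not the implementation of any particular cloud platform, and the class name `HealthTracker` is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class HealthTracker:
    healthy_threshold: int = 2      # consecutive successes needed to recover
    unhealthy_threshold: int = 3    # consecutive failures needed to fail
    healthy: bool = True
    # Count of consecutive results that contradict the current state.
    _streak: int = field(default=0, init=False, repr=False)

    def record(self, success: bool) -> bool:
        """Record one probe result and return the resulting health state."""
        if success == self.healthy:
            # The result agrees with the current state, so the streak resets.
            self._streak = 0
            return self.healthy
        self._streak += 1
        needed = self.healthy_threshold if success else self.unhealthy_threshold
        if self._streak >= needed:
            self.healthy = success
            self._streak = 0
        return self.healthy
```

With the recommended values and a 30-second interval, an instance is marked unhealthy after three consecutive failed probes, and a single failed probe between successes never changes its state.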

Security Best Practices

  • Never commit API keys to version control
  • Use cloud provider secret management (AWS Secrets Manager, Google Secret Manager, etc.)
  • Rotate API keys regularly
  • Use different keys for development and production
  • Restrict ALLOWED_ORIGINS to specific domains
  • Use HTTPS/TLS for all external connections
  • Implement rate limiting at the load balancer level
  • Use VPC/private networking where possible
  • Implement authentication/authorization if needed
  • Use cloud IAM roles for service-to-service communication
  • Enable audit logging
  • Monitor API usage and set up alerts

Always use HTTPS in production. The MCP server serves plain HTTP by default, so terminate TLS at your load balancer or reverse proxy.
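As an illustration of restricting ALLOWED_ORIGINS to specific domains, the sketch below performs an exact-match origin check. It assumes the variable holds a comma-separated list of origins; that format, and both function names, are assumptions for illustration and may differ from how the server parses the value.

```python
def parse_allowed_origins(raw: str) -> set[str]:
    """Split a comma-separated origin list into a normalized set."""
    return {
        origin.strip().rstrip("/").lower()
        for origin in raw.split(",")
        if origin.strip()
    }


def is_origin_allowed(origin: str, allowed: set[str]) -> bool:
    """Exact match only: no wildcards, prefixes, or suffix matching."""
    return origin.strip().rstrip("/").lower() in allowed
```

Exact matching is the point of the design: prefix or substring checks would accept look-alike hosts such as `https://your-app.com.evil.example`.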

Scaling Considerations

Horizontal Scaling

The Perplexity MCP Server is stateless and can be scaled horizontally:
  • Auto-scaling: Configure based on CPU/memory usage or request count
  • Load balancing: Distribute traffic across multiple instances
  • Session handling: Each request is independent (stateless)
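Because instances are stateless, an auto-scaler only needs a metric and a target to choose an instance count. The sketch below uses the proportional rule popularized by the Kubernetes Horizontal Pod Autoscaler (desired = ceil(current × metric / target)), clamped to a range. Cloud providers implement their own policies, so treat this as a model for reasoning about settings; the default bounds are assumptions.

```python
import math


def desired_instances(
    current: int,
    metric_value: float,
    target_value: float,
    min_instances: int = 2,
    max_instances: int = 10,
) -> int:
    """Proportional scaling: grow or shrink so the metric moves toward its target."""
    if current < 1 or target_value <= 0:
        raise ValueError("current must be >= 1 and target_value must be > 0")
    raw = math.ceil(current * metric_value / target_value)
    # Clamp so the service never scales below the HA floor or above the budget cap.
    return max(min_instances, min(max_instances, raw))
```

For example, two instances at 90% CPU with a 60% target yield three instances, while four instances at 15% CPU scale down only to the floor of two.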

Resource Allocation

Recommended resources per instance:
Environment   CPU         Memory   Instances
Development   0.25 vCPU   512 MB   1
Production    0.5-1 vCPU  1-2 GB   2+
High traffic  1-2 vCPU    2-4 GB   3+

Memory requirements may increase for long-running research queries. Monitor actual usage and adjust accordingly.

Timeout Configuration

For cloud deployments handling research queries, increase timeouts:
export PERPLEXITY_TIMEOUT_MS=600000  # 10 minutes
Ensure your cloud platform’s timeout settings accommodate long-running requests:
  • Load balancer timeout > PERPLEXITY_TIMEOUT_MS
  • Container platform timeout > load balancer timeout
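The ordering rule above is simple to verify mechanically once all three values are known. This sketch assumes the load balancer and platform timeouts are expressed in seconds, as most cloud consoles show them, while PERPLEXITY_TIMEOUT_MS is in milliseconds. The function name is hypothetical.

```python
def check_timeouts(
    perplexity_timeout_ms: int,
    load_balancer_timeout_s: float,
    platform_timeout_s: float,
) -> list[str]:
    """Return violations of: platform > load balancer > PERPLEXITY_TIMEOUT_MS."""
    problems = []
    # Convert seconds to milliseconds before comparing with the server timeout.
    if load_balancer_timeout_s * 1000 <= perplexity_timeout_ms:
        problems.append("load balancer timeout must exceed PERPLEXITY_TIMEOUT_MS")
    if platform_timeout_s <= load_balancer_timeout_s:
        problems.append("container platform timeout must exceed load balancer timeout")
    return problems
```

With PERPLEXITY_TIMEOUT_MS=600000, a 660-second load balancer timeout and a 900-second platform timeout pass, whereas a 600-second load balancer timeout fails because the two limits would expire at the same moment.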

Monitoring and Logging

Log Level Configuration

export PERPLEXITY_LOG_LEVEL=INFO  # DEBUG, INFO, WARN, ERROR

Metrics to Monitor

  • Request count and latency
  • Error rates (4xx, 5xx responses)
  • Health check success rate
  • API key usage and quotas
  • Container CPU and memory usage

Set up alerts for:
  • Health check failures
  • Error rate spikes
  • API quota approaching limits
  • Unusual traffic patterns
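As one example of an error-rate alert, the sketch below computes the share of 5xx responses in a window of status codes and fires only when the window is large enough to be meaningful. The 5% threshold and the minimum of 20 requests are illustrative assumptions, not recommendations from this guide.

```python
from collections.abc import Iterable


def server_error_rate(status_codes: Iterable[int]) -> float:
    """Fraction of responses in the window with a 5xx status code."""
    codes = list(status_codes)
    if not codes:
        return 0.0
    return sum(1 for code in codes if 500 <= code <= 599) / len(codes)


def should_alert(
    status_codes: Iterable[int],
    threshold: float = 0.05,
    min_requests: int = 20,
) -> bool:
    """Alert when the 5xx rate exceeds the threshold over a sufficiently large window."""
    codes = list(status_codes)
    return len(codes) >= min_requests and server_error_rate(codes) > threshold
```

The minimum request count avoids paging on a single failed request during quiet periods, where one error out of three requests would otherwise read as a 33% error rate.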

Example: Complete AWS Deployment

For a comprehensive HTTP server deployment guide with Node.js and process management, see the HTTP Server Deployment documentation. For Docker-based deployments, refer to the Docker Deployment guide.