Infrastructure & AI · 7 min read
Setting Up An API Subdomain - Solving CORS Issues and Production AI Reliability
How I fixed intermittent CORS failures and timeout issues in my AI tools by setting up a custom API subdomain and migrating from free to paid AI providers.

My AI demos were failing in production. Not always—just enough to be frustrating. Users would get CORS errors, 504 timeouts, or watch the loading spinner run for 35 seconds before giving up. The root causes were separate but equally annoying: API Gateway configuration issues and optimistic reliance on free-tier AI APIs.
Here’s how I fixed both.
The CORS Problem: API Gateway Request Validation
The startup generator and Ask Warren tools were intermittently failing with this error:
Access to fetch at 'https://0035cpjhwl.execute-api.us-east-1.amazonaws.com/prod/generate'
from origin 'https://mays.co' has been blocked by CORS policy:
No 'Access-Control-Allow-Origin' header is present on the requested resource.

The pattern was clear: valid requests worked fine, but any validation error (short resume, malformed JSON, missing auth) returned a CORS error.
Root Cause: Request Validator Running Before Lambda
I had configured an API Gateway Request Validator that checked incoming requests against a JSON schema. This seemed smart at the time—catch bad requests early, save Lambda execution costs.
The problem: when validation failed, API Gateway returned a 400 error without invoking the Lambda. Since my Lambda was responsible for adding CORS headers to all responses, these validation failures came back headerless, triggering browser CORS errors.
Flow before the fix:
- Valid request → passes validation → reaches Lambda → CORS headers added → Success
- Invalid request → fails validation → API Gateway returns 400 without CORS headers → CORS error in browser
Solution: Lambda Proxy Integration
I removed the request validator entirely and switched to Lambda Proxy Integration mode. Now all requests reach the Lambda, which handles validation and always returns proper CORS headers.
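Concretely, the Lambda now owns both jobs. Here's a minimal sketch of that shape (the validation rules and the respond helper are illustrative, not the exact production handler):

import type { APIGatewayProxyEvent, APIGatewayProxyResult } from 'aws-lambda';

// Every response, success or error, carries these headers.
const CORS_HEADERS = {
  'Access-Control-Allow-Origin': 'https://mays.co',
  'Access-Control-Allow-Headers': 'Content-Type',
  'Access-Control-Allow-Methods': 'POST,OPTIONS',
};

const respond = (statusCode: number, body: unknown): APIGatewayProxyResult => ({
  statusCode,
  headers: CORS_HEADERS,
  body: JSON.stringify(body),
});

export const handler = async (event: APIGatewayProxyEvent): Promise<APIGatewayProxyResult> => {
  let payload: { resume?: string };
  try {
    payload = JSON.parse(event.body ?? '');
  } catch {
    return respond(400, { error: 'Invalid JSON' }); // 400s now include CORS headers
  }
  if (!payload.resume || payload.resume.length < 100) { // illustrative threshold
    return respond(400, { error: 'Resume too short' });
  }
  // ... generate ideas, then:
  return respond(200, { ok: true });
};

Because validation failures return through the same respond helper as successes, the browser never sees a headerless response.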
infrastructure/cdk/startup-generator-stack.ts
// Before: Request validation at API Gateway level
const lambdaIntegration = new apigateway.LambdaIntegration(startupGeneratorFunction, {
  requestTemplates: { 'application/json': '{ "statusCode": "200" }' },
});

generateResource.addMethod('POST', lambdaIntegration, {
  authorizationType: apigateway.AuthorizationType.NONE,
  requestValidator: new apigateway.RequestValidator(...), // ❌ Validation without CORS
  requestModels: { ... },
});
// After: Lambda handles everything
const lambdaIntegration = new apigateway.LambdaIntegration(startupGeneratorFunction, {
  proxy: true, // ✅ Proxy mode
});

generateResource.addMethod('POST', lambdaIntegration, {
  authorizationType: apigateway.AuthorizationType.NONE,
  // ✅ No validator - Lambda handles validation and CORS
});

This fixed the CORS issues, but revealed a second problem: 504 Gateway Timeouts.
The Timeout Problem: Free Tier Rate Limits
With CORS fixed, I started seeing 504 errors. Checking CloudWatch logs:
Duration: 34978.73 ms (34.9 seconds) ❌ Timeout
Duration: 29607.99 ms (29.6 seconds) ✅ Success
Duration: 33207.36 ms (33.2 seconds) ❌ Timeout

API Gateway has a hard 30-second timeout. My Lambda was regularly exceeding it.
Why This Was Happening
I’d originally built both tools on the free tier of Google’s Gemini 2.0 Flash (2M tokens/day, effectively $0.00 per generation). This worked great during development, but in production I hit rate limits.
My “clever” solution had been to implement a fallback system:
// Try Gemini first
try {
  const response = await gemini.generateContent(...);
  // ...
} catch (geminiError) {
  console.log('Gemini failed, falling back to OpenAI');
  const response = await openai.chat.completions.create(...);
  // ...
}

This meant every rate-limited request made two API calls:
- Gemini API → 429 Rate Limit Error (~1-2 seconds)
- OpenAI API → Success (~8-10 seconds)
Add in Lambda cold starts, embedding generation, and response processing, and I was consistently hitting 30-35 seconds total.
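If you want to see where the time goes inside a handler like this, a bit of ad-hoc timing helps. A rough sketch (the timed helper and the step names are illustrative, not from the production code):

// Hypothetical helper: log how long each step of the handler takes.
async function timed<T>(label: string, fn: () => Promise<T>): Promise<T> {
  const start = Date.now();
  try {
    return await fn();
  } finally {
    console.log(`${label}: ${Date.now() - start} ms`);
  }
}

// Usage inside the handler (illustrative):
// const embedding = await timed('embedding', () => getEmbedding(query));
// const ideas = await timed('generation', () => generateIdeas(prompt));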
The Fix: Drop Gemini, Go Direct to OpenAI
I removed the fallback system entirely and switched both tools to use OpenAI directly:
infrastructure/lambda/startup-generator/index.ts
// Before: Try Gemini, fall back to OpenAI
let ideas: StartupIdea[];
let usedFallback = false;

try {
  const ai = new GoogleGenAI({ apiKey: apiKeys.gemini });
  const response = await ai.models.generateContent({
    model: "gemini-2.0-flash-lite",
    contents: prompt,
    config: { temperature: 0.7, maxOutputTokens: 4000 },
  });
  // ... parse response
} catch (geminiError) {
  usedFallback = true;
  const openai = new OpenAI({ apiKey: apiKeys.openai });
  // ... OpenAI call
}

// After: OpenAI directly
console.log('Using OpenAI directly (Gemini temporarily disabled)');
const openai = new OpenAI({ apiKey: apiKeys.openai });

const completion = await openai.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: [
    { role: 'system', content: 'You are a startup advisor...' },
    { role: 'user', content: prompt }
  ],
  temperature: 0.7,
  max_tokens: 3000, // Reduced from 4000 for faster response
});

Ask Warren got the same treatment:
// Embeddings: Gemini → OpenAI
const openai = new OpenAI({ apiKey: apiKeys.openai });
const embeddingResponse = await openai.embeddings.create({
  model: 'text-embedding-3-small',
  input: query,
  dimensions: 768, // Match original Gemini dimensions
});

// LLM generation: Gemini → OpenAI
const completion = await openai.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: [
    { role: 'system', content: 'You are a helpful assistant...' },
    { role: 'user', content: augmentedPrompt }
  ],
  temperature: 0.7,
  max_tokens: 1000,
});

Additional Performance Optimizations
With the fallback removed, I made a few more tweaks:
1. Increased Lambda Memory (More CPU Power)
// infrastructure/cdk/startup-generator-stack.ts
memorySize: 256, // Up from 128MB
timeout: cdk.Duration.seconds(29), // Must be < API Gateway's 30s limit

Lambda allocates CPU power proportionally to memory. Doubling memory gave me noticeably faster execution.
2. Reduced Token Generation
max_tokens: 3000, // Down from 4000 in startup generator
max_tokens: 1000, // Already optimized in Ask Warren

Fewer tokens = faster generation, with no quality loss for these use cases.
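A quick way to verify the budget is to log the usage object the Chat Completions API returns with each response:

// completion.usage reports actual token consumption per request
console.log(
  `prompt: ${completion.usage?.prompt_tokens}, ` +
  `completion: ${completion.usage?.completion_tokens}, ` +
  `total: ${completion.usage?.total_tokens}`
);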
3. Cleaned Up Dependencies
# Removed unused packages
npm uninstall dotenv @google/generative-ai

Slightly smaller bundle = faster cold starts.
Setting Up api.mays.co: A Better Developer Experience
With the backend stable, I tackled one last annoyance: the ugly API Gateway URL.
Frontend code before:
const API_ENDPOINT = 'https://0035cpjhwl.execute-api.us-east-1.amazonaws.com/prod/generate';

This is fine functionally, but:
- Hard to remember
- Looks unprofessional
- Region-locked in the URL
- No flexibility to move backends
I set up a custom subdomain using Route53 and ACM:
infrastructure/cdk/startup-generator-stack.ts
// Import existing certificate and hosted zone
const certificate = certificatemanager.Certificate.fromCertificateArn(
  this,
  'MaysCoCertificate',
  'arn:aws:acm:us-east-1:414424364221:certificate/...'
);

const hostedZone = route53.HostedZone.fromHostedZoneAttributes(this, 'MaysCoHostedZone', {
  hostedZoneId: 'Z0805591185SX9PE0SF0B',
  zoneName: 'mays.co',
});

// Create custom domain for API Gateway
const apiDomainName = new apigateway.DomainName(this, 'ApiDomainName', {
  domainName: 'api.mays.co',
  certificate,
  securityPolicy: apigateway.SecurityPolicy.TLS_1_2,
  endpointType: apigateway.EndpointType.EDGE,
});

// Map API to custom domain
new apigateway.BasePathMapping(this, 'ApiBasePathMapping', {
  domainName: apiDomainName,
  restApi: api,
  stage: api.deploymentStage,
});

// Create Route53 A record
new route53.ARecord(this, 'ApiAliasRecord', {
  zone: hostedZone,
  recordName: 'api',
  target: route53.RecordTarget.fromAlias(
    new route53targets.ApiGatewayDomain(apiDomainName)
  ),
});

Frontend code after:
const API_ENDPOINT = 'https://api.mays.co/generate';Much cleaner. And if I ever need to move to a different backend (different AWS account, different provider, whatever), I can update the DNS without touching frontend code.
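Since generations can still take 20+ seconds, it's also worth bounding the call on the frontend so users aren't left waiting past the point where the backend has already given up. A sketch of how the call site might look (the timeout value and error handling are illustrative):

const API_ENDPOINT = 'https://api.mays.co/generate';

async function generateIdeas(resume: string) {
  // Abort just before API Gateway's hard 30s limit would kick in anyway.
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), 29_000);
  try {
    const res = await fetch(API_ENDPOINT, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ resume }),
      signal: controller.signal,
    });
    if (!res.ok) throw new Error(`Request failed: ${res.status}`);
    return await res.json();
  } finally {
    clearTimeout(timer);
  }
}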
Why This Matters: CORS and Custom Domains
Here’s a bonus benefit I didn’t expect: custom domains make CORS simpler.
With the raw API Gateway URL, I needed to configure CORS for a cross-origin request:
- Origin: https://mays.co
- API: https://0035cpjhwl.execute-api.us-east-1.amazonaws.com

With the custom domain:

- Origin: https://mays.co
- API: https://api.mays.co
Both are subdomains of mays.co. The requests are still technically cross-origin (each subdomain is a distinct origin, so CORS headers are still required), but keeping everything under one domain makes debugging easier and reduces the mental overhead of reasoning about CORS policies.
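One detail worth calling out: with proxy integration the Lambda answers the actual requests, but the browser's OPTIONS preflight still needs an answer from somewhere. One way to handle that in CDK is to let API Gateway respond to preflights directly (a sketch; the construct name and exact options here are assumptions, not the stack's actual code):

const api = new apigateway.RestApi(this, 'StartupGeneratorApi', {
  // API Gateway answers OPTIONS preflights with these headers;
  // the Lambda still sets CORS headers on actual POST responses.
  defaultCorsPreflightOptions: {
    allowOrigins: ['https://mays.co'],
    allowMethods: ['POST', 'OPTIONS'],
    allowHeaders: ['Content-Type'],
  },
});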
Results
Before:
- CORS failures on validation errors
- 30-35 second response times
- ~40% success rate (timeouts when >30s)
- Fallback system masking rate limit issues
- Ugly API Gateway URLs
After:
- 100% CORS success (all responses include headers)
- 20-25 second response times
- ~100% success rate
- Clean, professional API endpoints
- Simpler codebase (no fallback logic)
Cost Impact:
- Before: $0.00/request (Gemini free tier)
- After: ~$0.01-0.02/request (OpenAI gpt-4o-mini)
For a demo/portfolio site, this is negligible. More importantly, it’s reliable. Users don’t care that I saved $0.01 if the tool doesn’t work.
Key Takeaways
Let Lambda handle CORS. API Gateway request validators reject requests before your Lambda runs, so their error responses come back without your CORS headers. Use Lambda Proxy Integration and handle validation in code.
Free tier AI is great for prototyping, terrible for production. Rate limits are real. Budget for paid APIs once you ship.
Fallback systems have hidden costs. My “robust” fallback doubled response time on every rate-limited request. Sometimes failing fast is better than failing slow.
Custom domains are worth it. Cleaner URLs, easier debugging, more flexibility. The CDK setup took 20 minutes.
Lambda memory = CPU power. Doubling from 128MB to 256MB meaningfully improved execution speed.
API Gateway’s 30s timeout is non-negotiable. If your Lambda might take longer, you need a different architecture (async processing, Step Functions, etc.).
The tools now work reliably. Response times are predictable. And when something does break, I can actually debug it instead of guessing whether it’s CORS, timeouts, or rate limits.
If you’re running serverless AI tools in production, skip the “clever” optimizations. Pay for reliability, use custom domains, and let your Lambda handle CORS. Your users will thank you.
Try the tools: