Cloudflare Workers AI Edge Functions Skill
Deploy AI models and serverless functions to Cloudflare's global edge network with sub-5ms cold starts and 40% edge computing market share. Access 50+ open-source AI models (Llama-2, Whisper, Stable Diffusion) with pay-per-use pricing.
Open the source and read safety notes before installing.
Prerequisites
- Cloudflare account with Workers AI enabled (available on Free and Paid plans)
- Wrangler CLI 3.0+, authenticated (wrangler login) for deployment access
- Node.js 18+
- @cloudflare/workers-types
Schema details
- Install type: package
- Reading time: 6 min
- Difficulty score: 100
- Troubleshooting: Yes
- Breaking changes: No
- Download URL: /downloads/skills/cloudflare-workers-ai-edge.zip
- Package verified: Yes
- SHA-256: 8cf522b452d3699ef4bc63ebfb8e326609b053e7c7234a44aef1b2b2adeee6d8
- Skill type: general
- Skill level: advanced
- Verification: draft
- Verified at: 2025-10-16
| Platform | Support | Install path |
|---|---|---|
| claude-code | Native | .claude/skills/<skill-name>/SKILL.md |
| codex | Native | .agents/skills/<skill-name>/SKILL.md |
| windsurf | Native | .windsurf/skills/<skill-name>/SKILL.md |
| gemini | Native | .gemini/skills/<skill-name>/SKILL.md or .agents/skills/<skill-name>/SKILL.md |
| cursor | Adapter | .cursor/rules/<skill-name>.mdc |
| cli | Manual | AGENTS.md or tool-specific context file |
Full copyable content
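// Workers AI binding, declared in wrangler.toml as [ai] with binding = "AI".
export interface Env {
  AI: any; // untyped here for brevity
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    if (request.method !== 'POST') {
      return new Response('Method not allowed', { status: 405 });
    }

    // Reject malformed bodies instead of letting request.json() throw.
    let messages: any[];
    try {
      ({ messages } = await request.json<{ messages: any[] }>());
    } catch {
      return new Response('Invalid JSON body', { status: 400 });
    }
    if (!Array.isArray(messages)) {
      return new Response('Expected a "messages" array', { status: 400 });
    }

    // With stream: true, the model output arrives as a stream of server-sent events.
    const response = await env.AI.run('@cf/meta/llama-2-7b-chat-int8', {
      messages: [
        { role: 'system', content: 'You are a helpful assistant.' },
        ...messages,
      ],
      stream: true,
    });

    return new Response(response, {
      headers: {
        'content-type': 'text/event-stream',
        'cache-control': 'no-cache',
      },
    });
  },
};

About this resource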
The skill also covers 10,000 free Neurons per day, integrated D1/R2/KV storage, and deployment to 275+ cities worldwide.
Content
Cloudflare Workers AI Edge Functions Skill
What This Skill Enables
Claude can build and deploy AI-powered serverless functions on Cloudflare's global edge network, spanning 275+ cities with sub-5ms cold start times (10-80x faster than AWS Lambda@Edge). With 40% edge computing market share and 4,000% year-over-year growth in AI inference requests, Cloudflare Workers AI brings machine learning models directly to users worldwide with minimal latency.
Compatibility
Native
- Claude Code / Claude: native skill usage via SKILL.md.
- Codex/OpenAI workflows: compatible with Agent Skills-style SKILL.md content as reusable workflow instructions.
Manual Adaptation
- Gemini CLI: native skill usage via .gemini/skills/<skill-name>/SKILL.md or .agents/skills/<skill-name>/SKILL.md where supported.
- Cursor: use the generated .cursor/rules/*.mdc adapter for project rules.
- OpenClaw and similar agents: use the same skill content as a reusable prompt/workflow file when native skill import is unavailable.
Prerequisites
Required:
- Claude Pro subscription or Claude Code CLI
- Cloudflare account (free tier available)
- Wrangler CLI installed (npm install -g wrangler)
- Basic understanding of JavaScript/TypeScript
What Claude handles automatically:
- Writing Workers code with TypeScript types
- Configuring wrangler.toml for deployments
- Implementing AI model bindings (Llama-2, Whisper, Stable Diffusion)
- Setting up D1 database and R2 storage integrations
- Managing environment variables and secrets
- Deploying to Cloudflare's edge network
- Optimizing for V8 isolate performance
How to Use This Skill
Deploy a Basic Edge Function
Prompt: "Create a Cloudflare Worker that responds to HTTP requests with JSON data and deploys to the edge."
Claude will:
- Generate a Worker with a proper fetch event handler
- Create wrangler.toml configuration
- Set up TypeScript types for Request/Response
- Add error handling and CORS headers
- Deploy with wrangler deploy (wrangler publish is deprecated as of Wrangler 3)
- Provide the deployed Worker URL
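As a rough illustration of what such a generated Worker might look like, here is a minimal sketch that returns JSON and handles errors. The /health route, the payload shape, and the json helper are assumptions made for the example, not part of the skill itself. CORS handling is left out to keep it short.

```typescript
// Minimal JSON Worker sketch. The /health route and the payload shape
// are illustrative assumptions, not something the skill prescribes.
const worker = {
  async fetch(request: Request): Promise<Response> {
    try {
      const url = new URL(request.url);
      if (request.method !== "GET") {
        return json({ error: "Method not allowed" }, 405);
      }
      if (url.pathname === "/health") {
        return json({ status: "ok" });
      }
      return json({ error: "Not found" }, 404);
    } catch (err) {
      // Last-resort handler so unexpected failures still produce JSON.
      const message = err instanceof Error ? err.message : "Unknown error";
      return json({ error: message }, 500);
    }
  },
};

// Hypothetical helper: serialize a value as a JSON response.
function json(body: unknown, status = 200): Response {
  return new Response(JSON.stringify(body), {
    status,
    headers: { "content-type": "application/json; charset=utf-8" },
  });
}

export default worker;
```

With a wrangler.toml that names this file as the entry point, running wrangler deploy should upload it and print the Worker URL.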
AI Model Integration (Llama-2 Chat)
Prompt: "Build a Cloudflare Worker that uses Llama-2 to generate chat responses. Accept POST requests with user messages and stream the AI responses back."
Claude will:
- Configure AI binding in wrangler.toml
- Implement streaming response with ReadableStream
- Add proper prompt formatting for Llama-2
- Set up rate limiting to control costs
- Include request validation and error handling
- Deploy with Workers AI binding enabled
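The request validation step in the list above could be sketched as a small pure function that runs before the model is called. The limits (MAX_MESSAGES, MAX_CONTENT_LENGTH) and the choice to reject client-supplied system messages are assumptions for illustration. Tune them to the application.

```typescript
// Sketch of request validation for a chat endpoint. The limits below are
// assumed values, not Workers AI requirements.
type Role = "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

type ParseResult =
  | { ok: true; messages: ChatMessage[] }
  | { ok: false; error: string };

const MAX_MESSAGES = 20; // assumed cap on conversation length
const MAX_CONTENT_LENGTH = 4000; // assumed cap on characters per message

function parseChatRequest(body: unknown): ParseResult {
  if (typeof body !== "object" || body === null) {
    return { ok: false, error: "Body must be a JSON object" };
  }
  const messages = (body as { messages?: unknown }).messages;
  if (!Array.isArray(messages) || messages.length === 0) {
    return { ok: false, error: "messages must be a non-empty array" };
  }
  if (messages.length > MAX_MESSAGES) {
    return { ok: false, error: "Too many messages" };
  }
  const parsed: ChatMessage[] = [];
  for (const item of messages) {
    if (typeof item !== "object" || item === null) {
      return { ok: false, error: "Each message must be an object" };
    }
    const { role, content } = item as { role?: unknown; content?: unknown };
    // Only user/assistant roles are accepted, so callers cannot override
    // the server-side system prompt.
    if (role !== "user" && role !== "assistant") {
      return { ok: false, error: "role must be 'user' or 'assistant'" };
    }
    if (typeof content !== "string" || content.length === 0 || content.length > MAX_CONTENT_LENGTH) {
      return { ok: false, error: "content must be a non-empty string within the length limit" };
    }
    parsed.push({ role: role as Role, content });
  }
  return { ok: true, messages: parsed };
}
```

A fetch handler would call parseChatRequest on the decoded JSON body and return a 400 response with the error string when ok is false.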
Image Generation with Stable Diffusion
Prompt: "Create an edge function that generates images using Stable Diffusion XL. Accept a text prompt via API and return the generated image URL stored in R2."
Claude will:
- Set up Workers AI binding for Stable Diffusion
- Configure R2 bucket for image storage
- Implement image generation with proper parameters
- Upload generated images to R2 with public URLs
- Add caching headers for CDN optimization
- Include usage analytics with D1 database
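A minimal sketch of the generate-and-store step might look like the following. The binding names (AI, IMAGES), the PUBLIC_BASE_URL variable, and the key layout are assumptions, and the small interfaces stand in for the real binding types from @cloudflare/workers-types. It also assumes the model returns PNG bytes as a stream, which is buffered first because R2 expects a body of known length.

```typescript
// Local stand-ins for the Workers AI and R2 bindings (assumed shapes).
interface ImageAi {
  run(model: string, inputs: { prompt: string }): Promise<ReadableStream>;
}

interface ImageBucket {
  put(
    key: string,
    value: ArrayBuffer,
    options?: { httpMetadata?: { contentType?: string } },
  ): Promise<unknown>;
}

interface ImageEnv {
  AI: ImageAi;
  IMAGES: ImageBucket; // hypothetical R2 bucket binding name
  PUBLIC_BASE_URL: string; // hypothetical public URL of the bucket
}

const SDXL_MODEL = "@cf/stabilityai/stable-diffusion-xl-base-1.0";

async function generateAndStore(env: ImageEnv, prompt: string, id: string): Promise<string> {
  const stream = await env.AI.run(SDXL_MODEL, { prompt });
  // Buffer the stream so the upload has a known length.
  const bytes = await new Response(stream).arrayBuffer();
  const key = `generated/${id}.png`; // assumed key layout
  await env.IMAGES.put(key, bytes, {
    httpMetadata: { contentType: "image/png" },
  });
  return `${env.PUBLIC_BASE_URL}/${key}`;
}
```

The id is supplied by the caller here. A real Worker would likely derive it from a UUID or a hash of the prompt.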
Real-Time Translation API
Prompt: "Build a translation API using Cloudflare Workers AI that detects the source language and translates to the target language. Support 50+ languages with edge caching."
Claude will:
- Use Workers AI translation models
- Implement language detection
- Set up KV namespace for translation caching
- Add rate limiting per IP address
- Configure CDN cache for common translations
- Include usage metrics and error logging
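The KV caching step can be sketched roughly as below. It assumes the @cf/meta/m2m100-1.2b translation model and its text/source_lang/target_lang inputs. The TRANSLATIONS namespace name, the one-day TTL, and the key format are hypothetical. Language detection is not covered, so the caller supplies the source language.

```typescript
// Local stand-ins for the KV and Workers AI bindings (assumed shapes).
interface KvLike {
  get(key: string): Promise<string | null>;
  put(key: string, value: string, options?: { expirationTtl?: number }): Promise<void>;
}

interface TranslateAi {
  run(
    model: string,
    inputs: { text: string; source_lang: string; target_lang: string },
  ): Promise<{ translated_text?: string }>;
}

interface TranslateEnv {
  AI: TranslateAi;
  TRANSLATIONS: KvLike; // hypothetical KV namespace binding
}

const TRANSLATION_MODEL = "@cf/meta/m2m100-1.2b";
const CACHE_TTL_SECONDS = 86400; // assumed: keep translations for one day
const MAX_KEY_BYTES = 512; // KV key size limit

async function translateCached(
  env: TranslateEnv,
  text: string,
  source: string,
  target: string,
): Promise<{ text: string; cached: boolean }> {
  const key = `tr:${source}:${target}:${text}`;
  // Long inputs would exceed the KV key limit, so they bypass the cache.
  const cacheable = new TextEncoder().encode(key).length <= MAX_KEY_BYTES;

  if (cacheable) {
    const hit = await env.TRANSLATIONS.get(key);
    if (hit !== null) return { text: hit, cached: true };
  }

  const result = await env.AI.run(TRANSLATION_MODEL, {
    text,
    source_lang: source,
    target_lang: target,
  });
  const translated = result.translated_text ?? "";

  if (cacheable && translated.length > 0) {
    await env.TRANSLATIONS.put(key, translated, { expirationTtl: CACHE_TTL_SECONDS });
  }
  return { text: translated, cached: false };
}
```

Hashing the text into the key would let long inputs be cached too. That is omitted here to keep the sketch small.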
Tips for Best Results
Leverage V8 Isolates: Workers use V8 isolates that start in <5ms and use 1/10th the memory of Node.js. Design stateless functions that take advantage of this architecture.
Use Durable Objects for State: For stateful operations (WebSockets, real-time collaboration), request Durable Objects implementation instead of external databases.
Model Selection: Choose appropriate AI models based on latency requirements. Smaller models like Llama-2-7B offer faster inference than larger variants.
Edge Caching: Implement Cache API or KV storage for frequently accessed data to reduce AI inference costs.
Cost Optimization: Workers AI is pay-per-use, with usage metered in Neurons rather than billed as a flat per-request fee. Use caching, rate limiting, and request batching to optimize costs.
Geographic Routing: Workers automatically route to the nearest data center. For AI models, consider pinning specific regions for data residency compliance.
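The rate limiting mentioned under cost optimization could be approximated with a fixed-window counter, sketched below against a generic store interface that a KV namespace would satisfy. The key format and the function name are assumptions. Because KV is eventually consistent, a counter like this is only approximate. A Durable Object would be the better fit where strict limits matter.

```typescript
// Generic counter store. A KV namespace exposes compatible get/put methods.
interface CounterStore {
  get(key: string): Promise<string | null>;
  put(key: string, value: string, options?: { expirationTtl?: number }): Promise<void>;
}

// Fixed-window limiter: allows up to `limit` requests per client in each
// window of `windowSeconds`. Returns true when the request may proceed.
async function allowRequest(
  store: CounterStore,
  clientId: string,
  limit: number,
  windowSeconds: number,
  nowMs: number = Date.now(),
): Promise<boolean> {
  const windowIndex = Math.floor(nowMs / (windowSeconds * 1000));
  const key = `rl:${clientId}:${windowIndex}`; // assumed key format
  const current = Number((await store.get(key)) ?? "0");
  if (current >= limit) return false;
  // Let old windows expire on their own; KV requires a TTL of at least 60 s.
  await store.put(key, String(current + 1), {
    expirationTtl: Math.max(60, windowSeconds * 2),
  });
  return true;
}
```

A Worker might call allowRequest with the client IP as clientId and return a 429 response when it yields false, before any model inference runs.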
Common Workflows
Full-Stack AI Application
"Create a complete AI-powered application on Cloudflare:
1. Workers AI for text generation (Llama-2)
2. D1 database for storing conversations
3. R2 for file uploads and generated content
4. KV for session management and caching
5. Pages for frontend deployment
6. Queue for background job processing
Include TypeScript types and deployment scripts."
Content Moderation API
"Build an edge API that:
1. Accepts text content via POST request
2. Uses Workers AI to detect harmful content
3. Classifies content as safe/unsafe with confidence scores
4. Logs results to D1 database
5. Returns moderation decision in <100ms
6. Handles 10,000 requests per minute"
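The classify-and-log core of this workflow could be sketched as follows. The classifier is injected as a hypothetical function that returns a harm probability between 0 and 1, since the prompt does not name a model. The moderation_log table, its columns, and the 0.8 threshold are likewise assumptions. The database calls follow the D1 prepare/bind/run pattern.

```typescript
// Local stand-ins for the D1 binding (assumed minimal shape).
interface D1Statement {
  bind(...values: unknown[]): D1Statement;
  run(): Promise<unknown>;
}

interface D1Like {
  prepare(query: string): D1Statement;
}

// Hypothetical classifier: returns the probability (0..1) that text is harmful.
type Classifier = (text: string) => Promise<number>;

interface ModerationResult {
  decision: "safe" | "unsafe";
  confidence: number;
}

async function moderate(
  db: D1Like,
  classify: Classifier,
  text: string,
  threshold = 0.8, // assumed cut-off
): Promise<ModerationResult> {
  const harmScore = await classify(text);
  const decision = harmScore >= threshold ? "unsafe" : "safe";
  // Confidence is reported for whichever class was chosen.
  const confidence = decision === "unsafe" ? harmScore : 1 - harmScore;

  // moderation_log is a hypothetical table with matching columns.
  await db
    .prepare("INSERT INTO moderation_log (content, decision, confidence) VALUES (?, ?, ?)")
    .bind(text, decision, confidence)
    .run();

  return { decision, confidence };
}
```

To stay within a tight latency budget, a Worker could hand the insert to ctx.waitUntil instead of awaiting it before responding. That variation is not shown here.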
Smart Image CDN
"Create a Cloudflare Worker that:
1. Intercepts image requests
2. Analyzes image with Workers AI (OCR, object detection)
3. Automatically optimizes images for device/bandwidth
4. Stores optimized versions in R2
5. Serves from edge cache on subsequent requests
6. Includes usage analytics and cost tracking"
Real-Time Sentiment Analysis
"Build a WebSocket-based sentiment analysis service:
1. Accept streaming text via WebSocket
2. Process chunks with Workers AI sentiment model
3. Return real-time sentiment scores
4. Store aggregate results in D1
5. Support 1000 concurrent connections
6. Deploy across all Cloudflare edge locations"
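Steps 2 and 3 of this workflow might be sketched as below, leaving out the WebSocket plumbing. The sketch assumes the @cf/huggingface/distilbert-sst-2-int8 classifier and assumes it returns a list of POSITIVE/NEGATIVE labels with scores. The signed score and the RunningSentiment helper are illustrative choices, not part of the skill.

```typescript
// Assumed output shape of the sentiment classifier.
interface SentimentLabel {
  label: string;
  score: number;
}

// Local stand-in for the Workers AI binding.
interface SentimentAi {
  run(model: string, inputs: { text: string }): Promise<SentimentLabel[]>;
}

const SENTIMENT_MODEL = "@cf/huggingface/distilbert-sst-2-int8";

// Score one chunk of text in [-1, 1]: positive minus negative probability.
async function scoreChunk(ai: SentimentAi, text: string): Promise<number> {
  const labels = await ai.run(SENTIMENT_MODEL, { text });
  const scoreOf = (name: string): number =>
    labels.find((l) => l.label.toUpperCase() === name)?.score ?? 0;
  return scoreOf("POSITIVE") - scoreOf("NEGATIVE");
}

// Hypothetical helper: running mean over the chunks seen so far.
class RunningSentiment {
  private total = 0;
  private count = 0;

  add(score: number): number {
    this.total += score;
    this.count += 1;
    return this.total / this.count;
  }
}
```

In a WebSocket handler, each incoming message would be passed to scoreChunk, and the running mean sent back to the client as the real-time score.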
Troubleshooting
Issue: Worker exceeds CPU time limits
Solution: CPU time per request is capped at 10ms on the free plan and defaults to 30s on paid plans. Optimize by using streaming responses, reducing synchronous processing, or upgrading to a paid plan for longer execution.
Issue: AI model inference too slow
Solution: Use smaller model variants (e.g., Llama-2-7B instead of 13B), implement request queuing with Workers Queue, or cache common responses in KV storage.
Issue: CORS errors when calling from frontend
Solution: Add proper CORS headers in Worker response. Ask Claude to include OPTIONS method handler and appropriate Access-Control-* headers.
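One way to add the OPTIONS handler and Access-Control-* headers is a small wrapper around an existing handler, sketched here. The withCors and corsHeaders names, the allowed methods, and the allowed headers are assumptions for the example. A production setup would usually restrict the origin to known frontends.

```typescript
type Handler = (request: Request) => Promise<Response>;

// Hypothetical helper: the CORS headers to attach for a given origin.
function corsHeaders(origin: string): Record<string, string> {
  return {
    "access-control-allow-origin": origin,
    "access-control-allow-methods": "GET, POST, OPTIONS",
    "access-control-allow-headers": "content-type, authorization",
  };
}

// Wrap a handler so preflight requests are answered directly and every
// other response carries the CORS headers.
function withCors(handler: Handler, allowedOrigin: string): Handler {
  return async (request: Request): Promise<Response> => {
    if (request.method === "OPTIONS") {
      return new Response(null, { status: 204, headers: corsHeaders(allowedOrigin) });
    }
    const response = await handler(request);
    // Copy the response so its headers can be modified.
    const headers = new Headers(response.headers);
    for (const [name, value] of Object.entries(corsHeaders(allowedOrigin))) {
      headers.set(name, value);
    }
    return new Response(response.body, {
      status: response.status,
      statusText: response.statusText,
      headers,
    });
  };
}
```

A Worker could then export { fetch: withCors(handleRequest, "https://app.example.com") }, where handleRequest and the origin are placeholders.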
Issue: Workers AI billing concerns
Solution: Implement rate limiting with Durable Objects or KV, cache responses aggressively, use smaller models for simpler tasks, and set up billing alerts in Cloudflare dashboard.
Issue: Cannot access environment variables
Solution: Ensure secrets are set with wrangler secret put and bindings are properly configured in wrangler.toml. Access via env.SECRET_NAME in Worker code.
Issue: Cold start latency for complex Workers
Solution: Minimize dependencies (Workers bundle size should be <1MB), use dynamic imports for optional features, and consider splitting into multiple Workers for different routes.
Learn More
- Cloudflare Workers AI Documentation
- Workers AI Models Catalog
- Wrangler CLI Guide
- Workers Platform Architecture
- Edge Computing Best Practices
- Durable Objects Guide
Features
- Sub-5ms cold starts with V8 isolates
- 50+ open-source AI models in catalog, including Llama-2, Whisper, and Stable Diffusion
- Deploy to 275+ cities globally
- Pay-per-use pricing with 10,000 free Neurons/day
- Integrated with D1, R2, KV, Queues, Durable Objects
- Real-time streaming responses with Server-Sent Events (SSE) support for AI model outputs, enabling progressive response delivery and improved user experience for long-running AI operations
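On the client side, the streamed output can be read incrementally. The sketch below assumes each server-sent event carries a JSON payload with a response field and that the stream ends with a [DONE] marker. The SseTokenParser class is a hypothetical helper that buffers partial lines between network chunks.

```typescript
// Incremental parser for a text/event-stream body. Assumes events look like
//   data: {"response":"..."}
// and that the stream ends with
//   data: [DONE]
class SseTokenParser {
  private buffer = "";
  done = false;

  // Feed one decoded chunk; returns the tokens completed by this chunk.
  push(chunk: string): string[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last element may be an incomplete line, so keep it for later.
    this.buffer = lines.pop() ?? "";

    const tokens: string[] = [];
    for (const rawLine of lines) {
      const line = rawLine.trim();
      if (!line.startsWith("data:")) continue;
      const payload = line.slice(5).trim();
      if (payload === "[DONE]") {
        this.done = true;
        continue;
      }
      try {
        const parsed = JSON.parse(payload) as { response?: unknown };
        if (typeof parsed.response === "string") tokens.push(parsed.response);
      } catch {
        // Skip payloads that are not valid JSON objects.
      }
    }
    return tokens;
  }
}
```

A browser client would read the response body with a reader, decode each chunk with TextDecoder, pass it to push, and append the returned tokens to the page until done becomes true.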
Use Cases
- Edge AI inference with minimal latency
- Serverless APIs with global distribution
- Real-time content moderation and analysis
- Content moderation APIs with real-time classification
- Multi-language translation services with edge caching
- AI-powered image generation and processing pipelines