
Cloudflare Workers AI Edge Functions Skill

by JSONbored · added 2025-10-16

Harness: Claude Code, Codex, Windsurf, Gemini, Cursor, CLI

Level: advanced · Type: general · Verified: draft

Review first — review before installing

Open the source and read safety notes before installing.

Prerequisites

Cloudflare account with Workers AI enabled (available on Free and Paid plans)
Wrangler CLI 3.0+, authenticated with wrangler login for deployment access
Node.js 18+
@cloudflare/workers-types

Schema details

Install type: package
Reading time: 6 min
Difficulty score: 100
Troubleshooting: Yes
Breaking changes: No

Package metadata

Download URL: /downloads/skills/cloudflare-workers-ai-edge.zip
Package verified: Yes
SHA-256: 8cf522b452d3699ef4bc63ebfb8e326609b053e7c7234a44aef1b2b2adeee6d8

Skill and platform metadata

Skill type: general
Skill level: advanced
Verification: draft
Verified at: 2025-10-16

Retrieval sources

https://developers.cloudflare.com/workers-ai/

Tested platforms

Claude, Codex, OpenClaw, Cursor, Windsurf, Gemini

Platform	Support	Install path
claude-code	Native	.claude/skills/<skill-name>/SKILL.md
codex	Native	.agents/skills/<skill-name>/SKILL.md
windsurf	Native	.windsurf/skills/<skill-name>/SKILL.md
gemini	Native	.gemini/skills/<skill-name>/SKILL.md or .agents/skills/<skill-name>/SKILL.md
cursor	Adapter	.cursor/rules/<skill-name>.mdc
cli	Manual	AGENTS.md or tool-specific context file

Full copyable content

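export interface Env {
  // Workers AI binding, declared in wrangler.toml as [ai] binding = "AI".
  // Typed loosely here; @cloudflare/workers-types offers a stricter Ai type.
  AI: any;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    if (request.method !== 'POST') {
      return new Response('Method not allowed', { status: 405 });
    }

    // Reject bodies that are not JSON or that lack a messages array.
    let messages: any[];
    try {
      ({ messages } = await request.json<{ messages: any[] }>());
    } catch {
      return new Response('Invalid JSON body', { status: 400 });
    }
    if (!Array.isArray(messages)) {
      return new Response('Expected a "messages" array', { status: 400 });
    }

    // With stream: true the binding returns a stream of server-sent events.
    const response = await env.AI.run('@cf/meta/llama-2-7b-chat-int8', {
      messages: [
        { role: 'system', content: 'You are a helpful assistant.' },
        ...messages,
      ],
      stream: true,
    });

    // Pass the stream straight through so tokens reach the client as they arrive.
    return new Response(response, {
      headers: {
        'content-type': 'text/event-stream',
        'cache-control': 'no-cache',
      },
    });
  },
};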
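A client has to decode the event stream this Worker returns. The sketch below shows one way to pull the text fragments out of a decoded chunk. It assumes each event is a line of the form data: {"response":"..."} and that the stream ends with data: [DONE]; that framing is an assumption to check against the current Workers AI documentation, and extractTokens is a hypothetical helper name, not part of any Cloudflare API.

```typescript
// Hypothetical helper: extract text fragments from one decoded SSE chunk.
// Assumes events look like `data: {"response":"..."}` and that the stream
// is terminated by `data: [DONE]`.
export function extractTokens(chunk: string): string[] {
  const tokens: string[] = [];
  for (const line of chunk.split('\n')) {
    const trimmed = line.trim();
    if (!trimmed.startsWith('data:')) continue; // skip blank lines and comments
    const payload = trimmed.slice('data:'.length).trim();
    if (payload === '' || payload === '[DONE]') continue;
    try {
      const parsed = JSON.parse(payload) as { response?: unknown };
      if (typeof parsed.response === 'string') tokens.push(parsed.response);
    } catch {
      // Malformed or partial JSON is ignored here; a production client
      // would buffer incomplete lines until the next chunk arrives.
    }
  }
  return tokens;
}
```

In a browser, each chunk read from response.body would be decoded with a TextDecoder and passed to this function, appending the returned fragments to the UI.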

About this resource

Deploy AI models and serverless functions to Cloudflare's global edge network with sub-5ms cold starts and 40% edge computing market share. Access 50+ open-source AI models (Llama-2, Whisper, Stable Diffusion) with pay-per-use pricing. Includes 10,000 free Neurons per day, integrated D1/R2/KV storage, and deployment to 275+ cities worldwide.

Content

Cloudflare Workers AI Edge Functions Skill

What This Skill Enables

Claude can build and deploy AI-powered serverless functions on Cloudflare's global edge network, spanning 275+ cities with sub-5ms cold start times (10-80x faster than AWS Lambda@Edge). With 40% edge computing market share and 4,000% year-over-year growth in AI inference requests, Cloudflare Workers AI brings machine learning models directly to users worldwide with minimal latency.

Compatibility

Native

Claude Code / Claude: native skill usage via SKILL.md.
Codex/OpenAI workflows: compatible with Agent Skills-style SKILL.md content as reusable workflow instructions.
Gemini CLI: native skill usage via .gemini/skills/<skill-name>/SKILL.md or .agents/skills/<skill-name>/SKILL.md where supported.

Manual Adaptation

Cursor: use the generated .cursor/rules/*.mdc adapter for project rules.
OpenClaw and similar agents: use the same skill content as a reusable prompt/workflow file when native skill import is unavailable.

Prerequisites

Required:

Claude Pro subscription or Claude Code CLI
Cloudflare account (free tier available)
Wrangler CLI installed (npm install -g wrangler)
Basic understanding of JavaScript/TypeScript

What Claude handles automatically:

Writing Workers code with TypeScript types
Configuring wrangler.toml for deployments
Implementing AI model bindings (Llama-2, Whisper, Stable Diffusion)
Setting up D1 database and R2 storage integrations
Managing environment variables and secrets
Deploying to Cloudflare's edge network
Optimizing for V8 isolate performance

How to Use This Skill

Deploy a Basic Edge Function

Prompt: "Create a Cloudflare Worker that responds to HTTP requests with JSON data and deploys to the edge."

Claude will:

Generate a Worker with proper fetch event handler
Create wrangler.toml configuration
Set up TypeScript types for Request/Response
Add error handling and CORS headers
Deploy with wrangler deploy (the older wrangler publish command is deprecated in Wrangler 3)
Provide the deployed Worker URL
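The steps above can be sketched as a small Worker that answers GET requests with JSON and handles CORS preflight. This is an illustration under stated assumptions, not the code Claude will necessarily generate: the wildcard origin, the allowed methods, and the response shape are all hypothetical choices, and handleRequest is a made-up helper name.

```typescript
// Hypothetical CORS policy for illustration; restrict the origin in production.
const CORS_HEADERS: Record<string, string> = {
  'Access-Control-Allow-Origin': '*',
  'Access-Control-Allow-Methods': 'GET, OPTIONS',
  'Access-Control-Allow-Headers': 'Content-Type',
};

const JSON_HEADERS: Record<string, string> = {
  ...CORS_HEADERS,
  'Content-Type': 'application/json',
};

// Synchronous core logic, kept separate so it can be tested without a runtime.
export function handleRequest(request: Request): Response {
  if (request.method === 'OPTIONS') {
    // Preflight: no body, only the CORS headers.
    return new Response(null, { status: 204, headers: CORS_HEADERS });
  }
  if (request.method !== 'GET') {
    return new Response(JSON.stringify({ error: 'Method not allowed' }), {
      status: 405,
      headers: JSON_HEADERS,
    });
  }
  const url = new URL(request.url);
  return new Response(JSON.stringify({ ok: true, path: url.pathname }), {
    status: 200,
    headers: JSON_HEADERS,
  });
}

// Module-syntax Worker entry point.
export default {
  async fetch(request: Request): Promise<Response> {
    return handleRequest(request);
  },
};
```

Keeping the logic in a plain function means it can be exercised with ordinary Request objects before anything is deployed.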

AI Model Integration (Llama-2 Chat)

Prompt: "Build a Cloudflare Worker that uses Llama-2 to generate chat responses. Accept POST requests with user messages and stream the AI responses back."

Claude will:

Configure AI binding in wrangler.toml
Implement streaming response with ReadableStream
Add proper prompt formatting for Llama-2
Set up rate limiting to control costs
Include request validation and error handling
Deploy with Workers AI binding enabled
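The request validation step might look like the sketch below. It is one possible approach, not the skill's prescribed one: the limits are hypothetical values picked for illustration, validateMessages is a made-up name, and rejecting the system role from client input is a design assumption intended to keep callers from overriding the Worker's own system prompt.

```typescript
export interface ChatMessage {
  role: 'user' | 'assistant';
  content: string;
}

// Hypothetical limits chosen for illustration only.
const MAX_MESSAGES = 20;
const MAX_CONTENT_LENGTH = 4000;

// Returns the validated messages, or a string describing why validation failed.
export function validateMessages(body: unknown): ChatMessage[] | string {
  if (typeof body !== 'object' || body === null) {
    return 'Body must be a JSON object';
  }
  const messages = (body as { messages?: unknown }).messages;
  if (!Array.isArray(messages) || messages.length === 0) {
    return 'messages must be a non-empty array';
  }
  if (messages.length > MAX_MESSAGES) {
    return `messages must contain at most ${MAX_MESSAGES} entries`;
  }
  const result: ChatMessage[] = [];
  for (const item of messages) {
    if (typeof item !== 'object' || item === null) {
      return 'Each message must be an object';
    }
    const { role, content } = item as { role?: unknown; content?: unknown };
    // Client-supplied system messages are rejected by design (assumption).
    if (role !== 'user' && role !== 'assistant') {
      return 'role must be "user" or "assistant"';
    }
    if (typeof content !== 'string' || content.length === 0) {
      return 'content must be a non-empty string';
    }
    if (content.length > MAX_CONTENT_LENGTH) {
      return `content must be at most ${MAX_CONTENT_LENGTH} characters`;
    }
    result.push({ role: role as ChatMessage['role'], content });
  }
  return result;
}
```

A Worker would call this on the parsed body and return a 400 response with the error string whenever the result is not an array.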

Image Generation with Stable Diffusion

Prompt: "Create an edge function that generates images using Stable Diffusion XL. Accept a text prompt via API and return the generated image URL stored in R2."

Claude will:

Set up Workers AI binding for Stable Diffusion
Configure R2 bucket for image storage
Implement image generation with proper parameters
Upload generated images to R2 with public URLs
Add caching headers for CDN optimization
Include usage analytics with D1 database
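A rough sketch of the generate-and-store step follows. The binding names AI and IMAGES, the key layout, and both helper names are hypothetical, and the structural interfaces stand in for the real types from @cloudflare/workers-types. The model identifier is assumed to be the Stable Diffusion XL entry in the Workers AI catalog. Whether the stored object is publicly reachable depends on how the R2 bucket is configured, so the sketch returns the object key rather than a URL.

```typescript
// Minimal structural types so the sketch is self-contained (assumptions,
// not the real binding types).
interface ImageEnv {
  AI: { run(model: string, inputs: { prompt: string }): Promise<unknown> };
  IMAGES: {
    put(
      key: string,
      value: unknown,
      options?: { httpMetadata?: { contentType?: string } },
    ): Promise<unknown>;
  };
}

// Deterministic object key derived from the prompt, using an FNV-1a style
// hash over UTF-16 code units. Identical prompts map to the same key.
export function imageKeyForPrompt(prompt: string): string {
  let hash = 0x811c9dc5; // 32-bit FNV offset basis
  for (let i = 0; i < prompt.length; i++) {
    hash ^= prompt.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // FNV prime, kept unsigned
  }
  return `generated/${hash.toString(16).padStart(8, '0')}.png`;
}

// Generates an image and stores it, returning the object key.
export async function generateAndStore(env: ImageEnv, prompt: string): Promise<string> {
  const image = await env.AI.run('@cf/stabilityai/stable-diffusion-xl-base-1.0', { prompt });
  const key = imageKeyForPrompt(prompt);
  // If the model returns a stream, R2 may need a known length; buffering the
  // bytes into an ArrayBuffer before calling put is one option.
  await env.IMAGES.put(key, image, { httpMetadata: { contentType: 'image/png' } });
  return key;
}
```

A 32-bit hash can collide, so a real deployment would likely prefer a longer digest for the key.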

Real-Time Translation API

Prompt: "Build a translation API using Cloudflare Workers AI that detects the source language and translates to the target language. Support 50+ languages with edge caching."

Claude will:

Use Workers AI translation models
Implement language detection
Set up KV namespace for translation caching
Add rate limiting per IP address
Configure CDN cache for common translations
Include usage metrics and error logging
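The caching step can be sketched as a read-through cache in front of the translation model. Several things here are assumptions: the binding names AI and TRANSLATIONS, the one-day TTL, the helper names, and the model identifier and input/output field names (source_lang, target_lang, translated_text), which should be checked against the Workers AI model catalog. The sketch also assumes the caller already knows the source language and leaves detection out.

```typescript
// Structural stand-ins for the AI and KV bindings (assumptions for the sketch).
interface TranslateEnv {
  AI: {
    run(
      model: string,
      inputs: { text: string; source_lang: string; target_lang: string },
    ): Promise<{ translated_text?: string }>;
  };
  TRANSLATIONS: {
    get(key: string): Promise<string | null>;
    put(key: string, value: string, options?: { expirationTtl?: number }): Promise<void>;
  };
}

const ONE_DAY_SECONDS = 86_400; // hypothetical cache lifetime

// KV keys have a size limit, so long inputs should be hashed instead.
export function translationCacheKey(text: string, source: string, target: string): string {
  return `${source}:${target}:${encodeURIComponent(text.trim())}`;
}

export async function translateWithCache(
  env: TranslateEnv,
  text: string,
  source: string,
  target: string,
): Promise<string> {
  const key = translationCacheKey(text, source, target);
  const cached = await env.TRANSLATIONS.get(key);
  if (cached !== null) return cached; // cache hit: no inference cost

  const result = await env.AI.run('@cf/meta/m2m100-1.2b', {
    text,
    source_lang: source,
    target_lang: target,
  });
  if (typeof result.translated_text !== 'string') {
    throw new Error('Translation model returned no text');
  }
  await env.TRANSLATIONS.put(key, result.translated_text, { expirationTtl: ONE_DAY_SECONDS });
  return result.translated_text;
}
```

Repeated requests for the same phrase and language pair are then served from KV without invoking the model again.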

Tips for Best Results

Leverage V8 Isolates: Workers use V8 isolates that start in <5ms and use 1/10th the memory of Node.js. Design stateless functions that take advantage of this architecture.
Use Durable Objects for State: For stateful operations (WebSockets, real-time collaboration), request Durable Objects implementation instead of external databases.
Model Selection: Choose appropriate AI models based on latency requirements. Smaller models like Llama-2-7B offer faster inference than larger variants.
Edge Caching: Implement Cache API or KV storage for frequently accessed data to reduce AI inference costs.
Cost Optimization: Workers AI usage is metered in Neurons (10,000 free per day) rather than billed as a flat per-request fee. Use caching, rate limiting, and request batching to reduce consumption.
Geographic Routing: Workers automatically route to the nearest data center. For AI models, consider pinning specific regions for data residency compliance.

Common Workflows

Full-Stack AI Application

"Create a complete AI-powered application on Cloudflare:
1. Workers AI for text generation (Llama-2)
2. D1 database for storing conversations
3. R2 for file uploads and generated content
4. KV for session management and caching
5. Pages for frontend deployment
6. Queue for background job processing
Include TypeScript types and deployment scripts."

Content Moderation API

"Build an edge API that:
1. Accepts text content via POST request
2. Uses Workers AI to detect harmful content
3. Classifies content as safe/unsafe with confidence scores
4. Logs results to D1 database
5. Returns moderation decision in <100ms
6. Handles 10,000 requests per minute"
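Step 3 of this workflow, turning model scores into a safe/unsafe decision, can be illustrated with a small pure function. It assumes the classifier returns a list of label and score pairs, which is a common shape but not guaranteed for every model. The label names, the 0.8 threshold, and the decide function are hypothetical.

```typescript
export interface LabelScore {
  label: string;
  score: number; // assumed to be a probability between 0 and 1
}

export interface ModerationDecision {
  safe: boolean;
  label: string;
  confidence: number;
}

// Flags content as unsafe when any label in unsafeLabels meets the threshold.
export function decide(
  scores: LabelScore[],
  unsafeLabels: ReadonlySet<string>,
  threshold = 0.8, // hypothetical default; tune against real traffic
): ModerationDecision {
  // Find the highest-scoring label that counts as unsafe.
  let worst: LabelScore | undefined;
  for (const entry of scores) {
    if (unsafeLabels.has(entry.label) && (!worst || entry.score > worst.score)) {
      worst = entry;
    }
  }
  if (worst && worst.score >= threshold) {
    return { safe: false, label: worst.label, confidence: worst.score };
  }
  // Confidence in "safe" is the complement of the strongest unsafe signal.
  return { safe: true, label: 'safe', confidence: 1 - (worst ? worst.score : 0) };
}
```

Because the function is synchronous and free of I/O, the threshold logic adds essentially nothing to the latency budget; the model call and the D1 write dominate.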

Smart Image CDN

"Create a Cloudflare Worker that:
1. Intercepts image requests
2. Analyzes image with Workers AI (OCR, object detection)
3. Automatically optimizes images for device/bandwidth
4. Stores optimized versions in R2
5. Serves from edge cache on subsequent requests
6. Includes usage analytics and cost tracking"

Real-Time Sentiment Analysis

"Build a WebSocket-based sentiment analysis service:
1. Accept streaming text via WebSocket
2. Process chunks with Workers AI sentiment model
3. Return real-time sentiment scores
4. Store aggregate results in D1
5. Support 1000 concurrent connections
6. Deploy across all Cloudflare edge locations"
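Steps 3 and 4 of this workflow need a running aggregate that can be updated per chunk and persisted later. A minimal sketch, assuming each chunk yields a single numeric sentiment score; the class name and the score range are illustrative assumptions.

```typescript
// Incremental mean of sentiment scores, updated one chunk at a time.
export class SentimentAggregate {
  private count = 0;
  private mean = 0;

  // Adds a score and returns the updated running mean.
  add(score: number): number {
    this.count += 1;
    // Incremental update avoids storing every score seen so far.
    this.mean += (score - this.mean) / this.count;
    return this.mean;
  }

  // Current totals, suitable for writing to a database row.
  snapshot(): { count: number; mean: number } {
    return { count: this.count, mean: this.mean };
  }
}
```

One instance per WebSocket connection would let the service send the running mean back after each chunk and store the snapshot when the connection closes.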

Troubleshooting

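Issue: Worker exceeds CPU time limits Solution: The free plan caps CPU time at roughly 10 ms per request, while paid plans allow far more (30 s by default). Reduce synchronous processing, stream responses instead of buffering them, or move to a paid plan for longer execution. The older Bundled and Unbound usage models have been replaced by Standard pricing.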

Issue: AI model inference too slow Solution: Use smaller model variants (e.g., Llama-2-7B instead of 13B), queue non-interactive requests with Cloudflare Queues, or cache common responses in KV storage.

Issue: CORS errors when calling from frontend Solution: Add proper CORS headers in Worker response. Ask Claude to include OPTIONS method handler and appropriate Access-Control-* headers.

Issue: Workers AI billing concerns Solution: Implement rate limiting with Durable Objects or KV, cache responses aggressively, use smaller models for simpler tasks, and set up billing alerts in Cloudflare dashboard.
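The rate-limiting idea can be sketched as a fixed-window counter whose state lives in whatever store is chosen. The function below is storage-agnostic and hypothetical in its names and shape; it assumes a limit of at least 1. Note that KV is eventually consistent, so counts kept there are approximate, whereas a Durable Object can keep them exact.

```typescript
export interface WindowState {
  windowStart: number; // timestamp in milliseconds when the window opened
  count: number; // requests seen in the current window
}

export interface RateLimitResult {
  allowed: boolean;
  state: WindowState; // new state to persist for this client
}

// Fixed-window limiter: at most `limit` requests per `windowMs` per client.
export function checkRateLimit(
  state: WindowState | undefined,
  now: number,
  limit: number,
  windowMs: number,
): RateLimitResult {
  // No prior state, or the window has expired: start a fresh window.
  if (!state || now - state.windowStart >= windowMs) {
    return { allowed: true, state: { windowStart: now, count: 1 } };
  }
  // Window is full: reject and leave the state unchanged.
  if (state.count >= limit) {
    return { allowed: false, state };
  }
  return { allowed: true, state: { ...state, count: state.count + 1 } };
}
```

A Worker would load the state for a client key (for example, the caller's IP address), call this function with Date.now(), persist the returned state, and respond with 429 when allowed is false.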

Issue: Cannot access environment variables Solution: Ensure secrets are set with wrangler secret put and bindings are properly configured in wrangler.toml. Access via env.SECRET_NAME in Worker code.
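A short sketch of reading a secret from env inside a Worker. The secret name API_TOKEN is hypothetical and would be created with wrangler secret put API_TOKEN; the bearer-token scheme and the isAuthorized helper are illustrative assumptions.

```typescript
interface Env {
  API_TOKEN: string; // hypothetical secret, set via `wrangler secret put API_TOKEN`
}

// Checks the Authorization header against the secret held in env.
// Plain string comparison is used for brevity; it is not constant-time.
export function isAuthorized(request: Request, env: Env): boolean {
  const header = request.headers.get('Authorization') ?? '';
  return env.API_TOKEN.length > 0 && header === `Bearer ${env.API_TOKEN}`;
}

export default {
  // Secrets and bindings arrive on the second argument, not on a global.
  async fetch(request: Request, env: Env): Promise<Response> {
    if (!isAuthorized(request, env)) {
      return new Response('Unauthorized', { status: 401 });
    }
    return new Response('ok');
  },
};
```

If env.API_TOKEN comes back undefined at runtime, the secret is usually missing from the deployed environment or was set for a different Worker name.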

Issue: Cold start latency for complex Workers Solution: Minimize dependencies (Workers bundle size should be <1MB), use dynamic imports for optional features, and consider splitting into multiple Workers for different routes.

Learn More

Features

Sub-5ms cold starts with V8 isolates
50+ open-source AI models in catalog, including Llama-2, Whisper, and Stable Diffusion
Deploy to 275+ cities globally
Pay-per-use pricing with 10,000 free Neurons/day
Integrated with D1, R2, KV, Queues, Durable Objects
Real-time streaming responses via Server-Sent Events (SSE), so long-running AI outputs can be delivered progressively

Use Cases

Edge AI inference with minimal latency
Serverless APIs with global distribution
Content moderation APIs with real-time classification and analysis
Multi-language translation services with edge caching
AI-powered image generation and processing pipelines

Content outline

Content
What This Skill Enables
Compatibility
Native
Manual Adaptation
Prerequisites
How to Use This Skill
Deploy a Basic Edge Function
AI Model Integration (Llama-2 Chat)
Image Generation with Stable Diffusion
Real-Time Translation API
Tips for Best Results
Common Workflows
Full-Stack AI Application
Content Moderation API
Smart Image CDN

#cloudflare #edge-computing #ai #serverless #workers


Related resources

Cloudflare Workers D1 KV R2 Capability Pack Skill

OpenNext Cloudflare Capability Pack Skill

GitHub Actions AI-Powered CI/CD Automation Skill

Playwright E2E Testing Automation Skill
