The promise of edge computing for AI workloads has evolved from experimental optimization to production-ready enterprise architecture. What began as our exploration of WASM efficiency gains has matured into a comprehensive platform strategy that leverages Cloudflare’s full spectrum of services: from Workers and AI inference to containers, durable execution, and Zero Trust security. This isn’t about forcing every workload through a WASM-shaped hole, but about orchestrating the right compute paradigm for each component while maintaining the efficiency gains that drew us to the edge in the first place.
A Pragmatic Approach
Our initial focus on pure WASM compilation through the Fidelity framework revealed both the tremendous potential and the practical limitations of edge-first development. Cloudflare has since introduced Containers, which run new types of workloads on its network with an experience that is simple, scalable, global, and deeply integrated with Workers. This container support fundamentally changes the architectural possibilities, allowing us to deploy traditional .NET F# workloads where they make sense while preserving WASM’s efficiency advantages where they matter most.
The architecture now encompasses three complementary execution models:
- WASM via Fidelity/Firefly: Core business logic compiled to efficient WebAssembly modules
- JavaScript via Fable: Service orchestration and platform API integration
- Containers for Fidelity & .NET F#: Complex workloads requiring specialized capabilities or a .NET feature that’s unique to customer requirements
This pragmatic approach acknowledges that modern systems are increasingly real-time and chatty: they hold open long-lived connections and perform many tasks in parallel, so different components genuinely require different compute strategies.
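To make the multi-target idea concrete, here is a minimal sketch of how a single F# module can serve all three execution models. `FABLE_COMPILER` is the real compile-time directive Fable defines; the `FIDELITY` define and the `Wasm.hostLog` import are hypothetical stand-ins for how the Fidelity/Firefly toolchain might isolate WASM-specific code.

```fsharp
// Shared domain logic, compiled to three targets.
// FABLE_COMPILER is defined by Fable; FIDELITY is a hypothetical
// define for the Fidelity/Firefly WASM toolchain.
module Shared.Scoring

type Document = { Id: string; Tokens: string[] }

// Pure logic compiles unchanged on every target.
let relevanceScore (query: string[]) (doc: Document) =
    let q = Set.ofArray query
    doc.Tokens
    |> Array.filter q.Contains
    |> Array.length
    |> fun hits -> float hits / float (max 1 doc.Tokens.Length)

// Platform-specific I/O is isolated behind compile-time directives.
let log (msg: string) =
#if FABLE_COMPILER
    Fable.Core.JS.console.log msg      // JavaScript Worker target
#elif FIDELITY
    Wasm.hostLog msg                   // hypothetical WASM host import
#else
    System.Console.WriteLine msg       // full .NET container target
#endif
```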
The Complete Cloudflare Mosaic
Our comprehensive architecture leverages Cloudflare’s extensive service portfolio to build secure, scalable AI systems:
```mermaid
graph TB
    subgraph "Zero Trust Security Layer"
        ZT[Zero Trust<br/>Identity Verification]
        CASB[Cloud Access Security<br/>Broker]
        DLP[Data Loss Prevention]
    end
    subgraph "Application Layer"
        subgraph "Workers Platform"
            WORKER[Workers<br/>WASM + JS]
            PAGES[Pages<br/>Static Assets]
            EMAIL[Email Workers]
        end
        subgraph "Container Platform"
            CONTAINER[.NET or Fidelity<br/>F# Containers<br/>Complex Workloads]
            SANDBOX[Code Interpreter<br/>Sandboxes]
        end
    end
    subgraph "AI & Compute Services"
        AI[Workers AI<br/>GPU Inference]
        VECTOR[Vectorize<br/>Vector Database]
        WORKFLOW[Workflows<br/>Durable Execution]
        QUEUE[Queues<br/>Event Processing]
    end
    subgraph "Data & Storage Layer"
        D1[D1<br/>SQLite Database]
        R2[R2<br/>Object Storage]
        KV[KV Store<br/>Session Data]
        DO[Durable Objects<br/>Stateful Coordination]
    end
    subgraph "Analytics & Monitoring"
        ANALYTICS[Analytics Engine<br/>Time-Series Data]
        LOGS[Logpush<br/>Audit Trail]
        TRACE[Trace<br/>Distributed Tracing]
    end

    %% Security flow
    ZT --> WORKER
    ZT --> CONTAINER
    CASB --> R2
    DLP --> D1

    %% Application connections
    WORKER --> AI
    WORKER --> VECTOR
    WORKER --> WORKFLOW
    CONTAINER --> AI

    %% Data flow
    WORKFLOW --> QUEUE
    QUEUE --> DO
    DO --> D1
    WORKER --> KV
    WORKER --> R2

    %% Analytics
    WORKER --> ANALYTICS
    CONTAINER --> ANALYTICS
    AI --> ANALYTICS

    style ZT fill:#e8f5e9,stroke:#2e7d32,stroke-width:3px
    style WORKER fill:#ff6d00,stroke:#e65100,stroke-width:2px,color:#ffffff
    style AI fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
```
Extending the Enterprise Security Perimeter
Zero Trust security is an IT security model that requires strict identity verification for every person and device trying to access resources on a private network, regardless of whether they are sitting within or outside of the network perimeter. Cloudflare’s Zero Trust services enable enterprises to extend their security boundary into the cloud while maintaining complete control over data access:
// F# Pulumi configuration for Zero Trust architecture
module SecurityInfrastructure =
open Pulumi.Cloudflare
let configureZeroTrust (config: EnterpriseConfig) =
// Access policies for AI services
let aiAccessPolicy =
AccessPolicy(
"ai-services-policy",
AccessPolicyArgs(
ApplicationId = config.AIApplicationId,
Precedence = 1,
Decision = "allow",
Include = [
AccessRuleArgs(
Groups = [config.DataScienceGroupId]
)
],
Require = [
AccessRuleArgs(
DevicePosture = ["compliant"]
)
],
SessionDuration = "24h"
)
)
// DLP rules for sensitive data
let dlpProfile =
DlpProfile(
"sensitive-data-protection",
DlpProfileArgs(
Type = "predefined",
Entries = [
DlpEntryArgs(Pattern = PatternArgs(Regex = @"\d{3}-\d{2}-\d{4}")) // SSN
DlpEntryArgs(Pattern = PatternArgs(Regex = @"\d{16}")) // Credit cards
],
AllowedMatchCount = 0
)
)
// Tunnel for secure connectivity
let secureTunnel =
Tunnel(
"enterprise-tunnel",
TunnelArgs(
AccountId = config.AccountId,
Name = "enterprise-ai-tunnel",
Secret = config.TunnelSecret,
ConfigSrc = "cloudflare"
)
)
{ AccessPolicy = aiAccessPolicy
DlpProfile = dlpProfile
Tunnel = secureTunnel }
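Wiring this module into a runnable Pulumi program follows the standard Pulumi.FSharp pattern. In the sketch below, `EnterpriseConfig.fromPulumiConfig` is a hypothetical helper for loading the account ID, group IDs, and tunnel secret from Pulumi config; the rest uses the real `Deployment.run` entry point.

```fsharp
module Program

open Pulumi.FSharp

let infra () =
    // Hypothetical helper that materializes EnterpriseConfig
    // (account id, group ids, tunnel secret) from Pulumi config.
    let config = EnterpriseConfig.fromPulumiConfig ()
    let security = SecurityInfrastructure.configureZeroTrust config

    // Export the tunnel name as a stack output.
    dict [ "tunnelName", security.Tunnel.Name :> obj ]

[<EntryPoint>]
let main _ = Deployment.run infra
```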
This Zero Trust architecture extends the rigorous controls typically applied to public-facing applications, which have far outpaced those on private networks, to cloud AI deployments, bringing enterprise-grade security to the edge.
Container Platform: The Missing Piece
Cloudflare Containers are now available in beta for all users on paid plans, making it possible to run new kinds of applications alongside Workers. This container support enables several crucial capabilities.
When Containers Make Sense
// Complex F# workload requiring full .NET BCL
module EnterpriseAIProcessor =
open Microsoft.ML
open FSharp.Data
open System.Data.SqlClient
open Microsoft.Data.Analysis // provides the DataFrame type used below
// This runs in a container with full .NET support
let processComplexMLPipeline (data: DataFrame) = async {
// Use ML.NET for complex preprocessing
let! preprocessed =
MLContext()
|> createPipeline data
|> executeWithFullBCL
// Leverage enterprise libraries
let! enriched =
enrichWithSqlServer preprocessed
|> integrateWithLegacySystems
// Complex numerical computations with MathNet
let! analyzed =
MathNet.Numerics.LinearAlgebra.Matrix<float>.Build.DenseOfArray enriched
|> performMatrixOperations
return analyzed
}
// Container configuration in wrangler.jsonc
let containerConfig = {|
containers = [|
{|
name = "ml-processor"
image = "registry.hub.docker.com/fidelity/ml-processor:latest"
port = 8080
cpu = 2
memory = "4GB"
gpu = "nvidia-t4" // GPU support for ML workloads
|}
|]
|}
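A Worker typically fronts such a container, keeping cheap requests at the edge and proxying heavy ones inward. The Fable-flavored sketch below is illustrative only: the `ContainerBinding` interface and `env.ML_PROCESSOR` stand in for whatever binding surface the containers beta ultimately exposes.

```fsharp
// Illustrative Fable Worker that fronts the ml-processor container.
// ContainerBinding and Env model assumed wrangler bindings; the
// containers beta's real API surface may differ.
module Worker.Router

open Fable.Core

type ContainerBinding =
    abstract fetch: request: obj -> JS.Promise<obj>

type Env =
    abstract ML_PROCESSOR: ContainerBinding

let handleRequest (url: string) (request: obj) (env: Env) = async {
    if url.Contains "/ml/" then
        // Heavy ML paths are proxied to the container binding.
        return! env.ML_PROCESSOR.fetch request |> Async.AwaitPromise
    else
        // Lightweight paths stay in the WASM/JS Worker itself.
        return box "handled at the edge"
}
```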
Durable Execution with Workflows
Cloudflare Workflows is now in open beta. It allows you to build reliable, repeatable, long-lived multi-step applications that automatically retry, persist state, and scale out. This enables sophisticated AI pipelines:
module AIWorkflows =
open System
open CloudflareWorkflows
type DocumentProcessor() =
inherit Workflow<DocumentInput, ProcessingResult>()
override this.run(input, step) = async {
// Extract text from document
let! extracted = step.do_("extract", fun () ->
R2.fetch input.documentUrl
|> extractText
)
// Generate embeddings with retry logic
let! embedded = step.do_("embed", fun () ->
WorkersAI.createEmbedding extracted
) |> step.withRetries 3
// Store in vector database
let! stored = step.do_("store", fun () ->
Vectorize.upsert {
id = input.documentId
vector = embedded
metadata = input.metadata
}
)
// Conditionally trigger analysis
if input.requiresAnalysis then
let! analysis = step.do_("analyze", fun () ->
// This could call a container for complex ML
Container.fetch "ml-analyzer" extracted
)
// Wait for human review if needed
if analysis.requiresReview then
let! approved = step.waitForEvent("approval", TimeSpan.FromHours(24))
if approved then
do! step.do_("finalize", fun () ->
D1.insert analysis.results
)
return ProcessingResult.Success stored.id
}
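The `waitForEvent` branch above needs a counterpart that actually delivers the approval. Cloudflare Workflows handles this through events sent to a running instance; the sketch below assumes Fable-style bindings (`WorkflowBinding`, `sendEvent`) that mirror the JavaScript Workflows API, so treat the shapes as illustrative.

```fsharp
// Hypothetical approval endpoint that delivers the "approval" event
// the DocumentProcessor instance is waiting on.
type WorkflowInstance =
    abstract sendEvent: event: obj -> Async<unit>

type WorkflowBinding =
    abstract get: id: string -> Async<WorkflowInstance>

type Env =
    abstract DOCUMENT_PROCESSOR: WorkflowBinding

let handleApproval (env: Env) (instanceId: string) (approved: bool) = async {
    let! instance = env.DOCUMENT_PROCESSOR.get instanceId
    // Mirrors the JS `instance.sendEvent({ type, payload })` call.
    do! instance.sendEvent (box {| ``type`` = "approval"; payload = approved |})
}
```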
Event-Driven Architecture
With R2 event notifications, you can configure changes to content in any R2 bucket to trigger sophisticated processing pipelines:
module EventDrivenAI =
// R2 event triggers workflow
let configureEventPipeline() =
R2EventNotification.create {
bucket = "user-uploads"
event_types = ["object-create", "object-update"]
destination = Queue "document-processor"
}
// Queue consumer processes events
let documentQueueConsumer =
QueueConsumer.create {
queue = "document-processor"
batch_size = 10
max_retries = 3
handler = fun messages -> async {
for msg in messages do
// Trigger workflow for each document
let! workflowId =
Workflow.create "DocumentProcessor" {
documentUrl = msg.body.url
documentId = msg.body.id
requiresAnalysis = msg.body.size > 1_000_000L
metadata = msg.body.metadata
}
// Track in Analytics Engine
do! Analytics.writeDataPoint {
dataset = "document_processing"
point = {
timestamp = DateTime.UtcNow
workflowId = workflowId
documentSize = msg.body.size
}
}
}
}
Infrastructure as Code with F# Pulumi
Managing this comprehensive architecture requires sophisticated infrastructure orchestration:
module CloudflareInfrastructure =
open Pulumi
open Pulumi.Cloudflare
let deployAIPlatform() =
// Configure Workers with AI bindings
let aiWorker =
WorkersScript(
"ai-orchestrator",
WorkersScriptArgs(
Content = File.ReadAllText("./dist/worker.js"),
Module = true,
Bindings = [
WorkersScriptPlainTextBindingArgs(
Name = "AI",
Text = "@cf/meta/llama-3.1-8b-instruct"
)
WorkersScriptServiceBindingArgs(
Name = "VECTORIZE",
Service = "vectorize-index"
)
WorkersScriptR2BucketBindingArgs(
Name = "STORAGE",
BucketName = "ai-documents"
)
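// `database` below is assumed to be a D1 database resource defined earlier in this program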
WorkersScriptD1DatabaseBindingArgs(
Name = "DATABASE",
DatabaseId = database.Id
)
]
)
)
// Deploy container for complex workloads
let mlContainer =
Container(
"ml-processor",
ContainerArgs(
Image = "fidelity/ml-processor:latest",
Secrets = [
Output.CreateSecret(config.GetSecret("ML_API_KEY"))
],
EnvironmentVariables = dict [
"PROCESSING_MODE", "production"
"GPU_ENABLED", "true"
],
ResourceRequirements = ContainerResourceRequirementsArgs(
Limits = ContainerResourceLimitsArgs(
Cpu = "4",
Memory = "8Gi",
Gpu = "1"
)
)
)
)
// Configure Zero Trust access
let accessApplication =
AccessApplication(
"ai-platform",
AccessApplicationArgs(
Domain = "ai.company.internal",
Type = "self_hosted",
SessionDuration = "24h",
AllowedIdps = ["azure_ad"],
AutoRedirectToIdentity = true
)
)
// Set up monitoring
let logpush =
LogpushJob(
"audit-logs",
LogpushJobArgs(
Dataset = "access_requests",
DestinationConf = "s3://audit-bucket/logs",
Ownership = LogpushOwnershipArgs(
DestinationConf = Output.CreateSecret(s3Config)
)
)
)
Output.Create({|
WorkerUrl = aiWorker.Id.Apply(fun id -> $"https://{id}.workers.dev")
ContainerEndpoint = mlContainer.Id
AccessPortal = accessApplication.Domain
|})
Performance and Cost Optimization
The multi-paradigm approach delivers measurable improvements across different workload types:
| Workload Type | Traditional Approach | Optimized Architecture | Improvement |
|---|---|---|---|
| Simple Inference | Container + .NET Runtime | WASM Worker | 95% latency reduction |
| Complex ML Pipeline | Kubernetes + GPUs | Container with GPU | 40% cost reduction |
| Vector Search | Self-hosted Qdrant | Vectorize | 80% operational overhead reduction |
| Batch Processing | Cron + Queue Service | Workflows + Queues | 60% reliability improvement |
| Session Management | Redis Cluster | KV Store | 90% latency reduction |
| Real-time Coordination | WebSocket Servers | Durable Objects | 70% infrastructure reduction |
Real-World Implementation Patterns
Pattern 1: Hybrid RAG System
module HybridRAG =
// Fast path for common queries
let cachedInference query = async {
match! KV.get (hashQuery query) with
| Some cached -> return cached
| None ->
let! embedding = WorkersAI.embed query
let! context = Vectorize.search embedding
let! response = WorkersAI.complete query context
do! KV.put (hashQuery query) response 3600 // cache TTL in seconds
return response
}
// Complex path for specialized queries
let containerInference query context = async {
let! container = Container.get "specialized-llm"
return! container.process {|
query = query
context = context
use_tools = true
max_iterations = 5
|}
}
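The `hashQuery` helper above is assumed rather than shown. On targets with the full .NET BCL, a minimal sketch is a normalized SHA-256 digest, which yields short, stable KV keys with no user-controlled characters; a Worker build would use Web Crypto instead.

```fsharp
// Minimal sketch of the assumed hashQuery helper (BCL targets):
// normalize the query, hash it, hex-encode the digest as a KV key.
let hashQuery (query: string) =
    let normalized = query.Trim().ToLowerInvariant()
    use sha = System.Security.Cryptography.SHA256.Create()
    normalized
    |> System.Text.Encoding.UTF8.GetBytes
    |> sha.ComputeHash
    |> Array.map (sprintf "%02x")
    |> String.concat ""
```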
Pattern 2: Secure Document Processing
module SecureDocumentPipeline =
let processWithCompliance document = workflow {
// DLP scanning before processing
let! dlpResult = step.do_ "dlp-scan" (fun () ->
DLP.scan document.content
)
if dlpResult.hasSensitiveData then
// Route through secure container
let! redacted = step.do_ "redact" (fun () ->
Container.call "pii-redactor" document
)
// Audit log the redaction
do! step.do_ "audit" (fun () ->
Analytics.log {|
event = "sensitive_data_redacted"
document_id = document.id
timestamp = DateTime.UtcNow
|}
)
return redacted
else
// Standard processing path
return! step.do_ "process" (fun () ->
Worker.process document
)
}
Pattern 3: Progressive Enhancement
module ProgressiveDeployment =
// Start with Workers, graduate to containers
let adaptiveCompute workload =
match workload.complexity with
| Simple ->
Worker.execute workload
| Medium ->
Worker.execute workload
|> withCache KV
|> withState DurableObjects
| Complex ->
Container.execute workload
|> withGPU true
|> withMemory "8GB"
| Dynamic ->
Workflow.orchestrate [
Worker.preprocess
Container.analyze
Worker.postprocess
]
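The `workload.complexity` match above presumes a classification step; the sketch below shows one hypothetical way to derive it from workload size and feature flags.

```fsharp
// Hypothetical classification feeding adaptiveCompute.
type Complexity = Simple | Medium | Complex | Dynamic

type Workload =
    { InputBytes: int64
      NeedsGpu: bool
      MultiStage: bool
      complexity: Complexity }

let classify (inputBytes: int64) (needsGpu: bool) (multiStage: bool) =
    let complexity =
        if multiStage then Dynamic      // orchestrate across paradigms
        elif needsGpu then Complex      // container with GPU
        elif inputBytes > 1_000_000L then Medium
        else Simple
    { InputBytes = inputBytes
      NeedsGpu = needsGpu
      MultiStage = multiStage
      complexity = complexity }
```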
The Economics of Edge AI
The comprehensive platform approach delivers compelling economics:
- Reduced Egress Costs: R2’s zero egress fees and generous free tiers eliminate a major cloud cost driver (see the back-of-envelope sketch after this list)
- Pay-per-use AI: No idle GPU costs with Workers AI
- Consolidated Services: Single vendor for compute, storage, security, and networking
- Operational Efficiency: Managed services reduce DevOps overhead by 70%
- Global Performance: 320+ locations without multi-region complexity
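As a back-of-envelope check on the egress point: at a typical hyperscaler rate of roughly $0.09/GB (an assumed figure; actual pricing varies by provider and tier), serving 10 TB per month costs about $900 in bandwidth alone, versus zero egress fees on R2.

```fsharp
// Illustrative egress arithmetic (rates are assumptions, not quotes).
let hyperscalerRatePerGB = 0.09      // ~typical hyperscaler egress, $/GB
let monthlyEgressGB = 10_000.0       // 10 TB/month of served data
let hyperscalerCost = hyperscalerRatePerGB * monthlyEgressGB   // = 900.0
let r2EgressCost = 0.0               // R2 charges no egress fees
printfn $"hyperscaler: ${hyperscalerCost}/mo vs R2 egress: ${r2EgressCost}/mo"
```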
Future Evolution: The Convergence Path
Looking ahead, several Cloudflare developments will further enhance this architecture:
WebGPU and Browser-Based Inference
Progressive enhancement toward client-side inference for maximum privacy and minimal network latency.
Analytics Engine: Toward a Distributed Data Platform
Analytics Engine, Cloudflare’s time-series and metrics database for unlimited-cardinality analytics at scale, continues to evolve toward a complete analytical platform.
Quantum-Safe Cryptography
Post-quantum security preparations ensuring long-term data protection.
Edge Databases with Global Consistency
D1’s evolution toward globally distributed SQL with strong consistency guarantees.
Conclusion: Beyond Optimization to Transformation
What began as an exploration of WASM efficiency has evolved into a comprehensive platform strategy that fundamentally reimagines how enterprise AI systems should be built. By embracing Cloudflare’s full service portfolio, from Workers to containers and from Zero Trust to durable execution, we’ve moved beyond mere optimization to architectural transformation.
The key insight isn’t that WASM is faster (though it is), or that edge computing reduces latency (though it does). It’s that by choosing the right compute paradigm for each component, orchestrating services through functional composition, and leveraging platform-native capabilities, we can build AI systems that are simultaneously more secure, more efficient, and more maintainable than traditional approaches.
The Fidelity framework remains central to this vision: not as a WASM-only solution, but as a compilation strategy that can target multiple runtimes. Whether generating WASM modules through MLIR, JavaScript through Fable, or deploying containerized .NET workloads, F#’s functional paradigm provides the conceptual coherence that makes this multi-paradigm architecture manageable.
For enterprises looking to deploy AI at scale, this architecture offers a progressive path: start with Workers for immediate wins, gradually adopt platform services for operational efficiency, leverage containers where complexity demands it, and wrap everything in Zero Trust security for enterprise-grade protection. The result is an AI platform that’s ready for production today while remaining flexible enough for tomorrow’s innovations.