The “AI industrial complex” in its current form is not sustainable.
While transformers have delivered remarkable capabilities, their energy consumption and computational demands reveal a fundamental inefficiency: we're fighting against nature's design principles. The human brain, at least as we currently understand it, operates on roughly 20 watts, processing massive volumes of information through sparse, event-driven spikes. Current AI systems consume thousands of watts to support narrow inference capabilities, forcing dense matrix operations through every computation. This disparity isn't just inefficient; it suggests we're missing something fundamental about intelligence itself.
Spiking Neural Networks (SNNs) offer a radically different path, one that neuromorphic processors have begun to realize in silicon. Yet despite decades of research and impressive hardware developments, SNNs remain frustratingly difficult to train and deploy. As with many algorithmic methods, efficient and accurate gradient calculation has been a constant challenge. For those who have worked in the field for decades, the core question surrounding SNNs is how to compute gradients through discrete, non-differentiable spike events.
This document explores an unexpected convergence of ideas that may finally reveal the astounding potential of neuromorphic computing. It's a story that connects several distinct, yet proven ideas: ternary number systems, a breakthrough in gradient computation, and new designs in "spiking" neural processing. Coupled with SpeakEZ's unique Fidelity framework design, these solutions promise to bring heterogeneous architectures into a manageable and coherent platform. The implications reach beyond any single hardware innovation to suggest a fundamental shift in how we can build intelligent systems.
The Ternary Revelation: Beyond Binary Thinking
Modern spiking neural network algorithms, as described in the Multi-Plasticity Synergy Learning (MPSL) framework1, operate on a binary principle despite running on far more capable hardware. The Leaky Integrate-and-Fire equation from the paper defines spike generation as:
\[ S^{t,l} = \Theta(U^{t,l} - V_{th}) = \begin{cases} 1, & U^{t,l} \geq V_{th} \\ 0, & U^{t,l} < V_{th} \end{cases} \]

This binary representation, spike (1) or silent (0), has been the algorithmic convention in neuromorphic computing, not because of hardware limitations, but due to historical precedent and the mathematical challenges of training. Some neuromorphic processors, such as Intel's Loihi 2, actually support graded spikes with up to 32-bit payloads, programmable neuron models, and thousands of states per neuron. These capabilities far outstrip the field's current theoretical conventions. Even the sophisticated MPSL framework, which innovatively combines multiple learning mechanisms (Spatio-Temporal Backpropagation or STBP for gradient-based learning, Hebbian plasticity for correlation-based local learning, and Self-Backpropagation or SBP for local feedback without explicit gradients), constrains itself to binary representations despite the hardware's richer capabilities.
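For concreteness, here is a minimal sketch of that binary convention in plain F# floats. The constants and names are illustrative, not the paper's implementation; it simply combines the Heaviside spike rule above with the Leaky Integrate-and-Fire update discussed later.

// Binary LIF step as conventionally formulated:
//   U^t = rho * (U^{t-1} - S^{t-1} * Vth) + I^t,   S^t = Theta(U^t - Vth)
let binaryLifStep (rho: float) (vth: float) (uPrev: float, sPrev: float) (input: float) =
    let u = rho * (uPrev - sPrev * vth) + input   // leak, reset-by-subtraction, integrate
    let s = if u >= vth then 1.0 else 0.0         // Heaviside: spike or silent, nothing in between
    (u, s)

// Example: binaryLifStep 0.9 1.0 (0.0, 0.0) 0.4  evaluates to  (0.4, 0.0)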
Lessons from Biology
Neuroscience tells us that biological neurons exhibit far richer dynamics, and that richness is a subtle but significant part of how they compute. Between rest and firing, neurons spend significant time in distinct computational regimes, actively processing information without generating spikes. This isn't merely noise or inefficiency; it's a computational feature that binary SNNs completely miss.
Consider what happens in the binary model: a neuron accumulating toward threshold carries critical temporal information about recent inputs, yet this information vanishes the moment we sample its state. If the membrane potential is at 0.9 × threshold, the binary representation sees only "0", identical to a neuron at rest. This discretization throws away precisely the information that makes temporal processing powerful.
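A tiny numeric illustration of that loss, using plain floats and assumed threshold values: a neuron at 0.9 × threshold and a neuron at rest are indistinguishable to a binary readout, while a ternary readout of the kind developed below keeps them apart.

let vth = 1.0            // firing threshold
let thetaActive = 0.3    // assumed activation threshold for the ternary readout

let binaryReadout u  = if u >= vth then 1 else 0
let ternaryReadout u =
    if u >= vth then 1             // spiking
    elif u >= thetaActive then -1  // actively integrating
    else 0                         // silent

[ 0.0; 0.9 ] |> List.map binaryReadout    // [0; 0]  - rest and near-threshold collapse together
[ 0.0; 0.9 ] |> List.map ternaryReadout   // [0; -1] - the integrating neuron is still visible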
The Computational Regime Model
The key idea is to distinguish between the continuous membrane potential dynamics and the discrete computational regimes that neurons occupy. Biological neurons don't just have voltage levels; they have distinct operational modes based on their membrane potential:
- Silent/Resting: Near the resting potential (typically -70mV), the neuron is minimally responsive, with leak currents dominating
- Active/Integrating: Depolarized but below firing threshold (between -55mV and -40mV), actively accumulating and processing inputs
- Spiking/Firing: Above threshold, generating output spikes
This biological reality maps naturally to a ternary encoding that captures computational regime, not just voltage:
\[ \text{TernaryState} = \begin{cases} 0 & \text{Silent (near resting potential)} \\ -1 & \text{Active (integrating, between thresholds)} \\ +1 & \text{Spiking (above firing threshold)} \end{cases} \]

The continuous membrane potential \(U\) maps to discrete states via two thresholds:
- \(\theta_{active}\): Transition from silent to active integration
- \(\theta_{fire}\): Spike generation threshold
This preserves critical information about neurons actively integrating inputs (\(U \in [\theta_{active}, \theta_{fire})\)) that binary representations discard.
Leveraging New Hardware
This observation leads to our key innovation: expanding the algorithmic state space to match hardware capabilities. While the MPSL paper advances the field through multiple learning mechanisms, it follows the algorithmic convention of binary spike representation, leaving hardware capabilities untapped. Our ternary encoding leverages the multi-level states these processors already support.
This isn’t a hardware modification; it’s simply using the hardware as it was designed. Intel’s Loihi 2 can represent 4096 states per neuron, SambaNova’s RDU can reconfigure for arbitrary word-level operations, and we’re finally going to use these capabilities. The active state (-1) captures neurons that are actively integrating inputs but haven’t yet reached firing threshold, preserving temporal context that binary algorithms discard.
// Clear separation of continuous dynamics and discrete states
type TernarySpikingNeuron = {
    Potential: Posit<16, 1>        // Continuous membrane potential
    RestingPotential: float32      // Baseline (e.g., -70mV)
    ActiveThreshold: float32       // Activation begins (e.g., -55mV)
    FiringThreshold: float32       // Spike generation (e.g., -40mV)
    State: TernaryState            // Discrete computational regime
}

// State mapping based on potential regions
let computeState (potential: Posit<16, 1>) (neuron: TernarySpikingNeuron) =
    match Posit.toFloat32 potential with
    | p when p >= neuron.FiringThreshold -> Spiking   // +1
    | p when p >= neuron.ActiveThreshold -> Active    // -1
    | _ -> Silent                                     //  0

// Accumulation happens in continuous domain
let updateNeuron (leakRate: float32) (neuron: TernarySpikingNeuron) (input: float32) =
    // Continuous dynamics (Leaky Integrate-and-Fire), carried out in posit arithmetic
    let resting = Posit.fromFloat32 neuron.RestingPotential
    let leak = (neuron.Potential - resting) * Posit.fromFloat32 leakRate
    let newPotential = neuron.Potential - leak + Posit.fromFloat32 input
    // Discrete state for communication
    let newState = computeState newPotential neuron
    // Reset only after spike
    match newState with
    | Spiking ->
        { neuron with
            State = Spiking
            Potential = Posit.fromFloat32 neuron.RestingPotential }
    | otherState ->
        { neuron with
            State = otherState
            Potential = newPotential }
Breaking the Backpropagation Dependency
The Surrogate Gradient Problem
The MPSL paper, like virtually all modern SNN training approaches, relies on surrogate gradients to handle the non-differentiable spike function. As the paper states in Equation 6:
\[ \frac{\partial S^{t,l}}{\partial U^{t,l}} \approx u'(U^{t,l}, V_{th}) \]

This approximation, replacing the undefined gradient with a smooth surrogate function, is a mathematical fiction that introduces instability and limits learning efficiency. Every major SNN training method (STBP, BPTT, even the innovative MPSL approach) depends on this workaround. We pretend the spike function is smooth when it fundamentally isn't.
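To make the fiction concrete, here is a sketch of a rectangular surrogate of the kind used in SNN training (the MPSL paper also uses a rectangular form); the window half-width a here is an assumed constant, not the paper's exact parameterization.

// Rectangular surrogate: pretend d(spike)/dU is a box of width 2a centred on the threshold.
// The true derivative is zero almost everywhere and undefined exactly at U = Vth.
let rectangularSurrogate (a: float) (vth: float) (u: float) =
    if abs (u - vth) < a then 1.0 / (2.0 * a) else 0.0

// Backpropagation then multiplies upstream gradients by this stand-in value,
// which is precisely the approximation that forward gradients avoid.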
The Forward Gradient Revolution
Recent breakthrough work by Baydin, Pearlmutter, Syme, Wood and Torr2 demonstrates a revolutionary alternative that eliminates this fiction entirely. Their forward gradient method computes unbiased gradient estimates using only forward-mode automatic differentiation:
\[ g(\theta) = (\nabla f(\theta) \cdot v) v \]

Where \(v\) is a random perturbation vector. This formula has profound implications for SNNs:
- No surrogate needed: The directional derivative \(\nabla f(\theta) \cdot v\) can be computed exactly even for discrete spike functions
- Single forward pass: Eliminates the entire backward propagation phase
- Proven unbiased: Mathematically guaranteed to converge to the true gradient in expectation
- 2x speedup: The paper demonstrates training neural networks up to twice as fast as backpropagation
Why This Changes Everything for SNNs
The forward gradient approach solves the exact problem that has plagued SNN training. Where the MPSL framework must resort to rectangular surrogate functions (Equation 7 in their paper), forward gradients handle discrete transitions naturally:
// Forward gradient captures state transition sensitivities
let computeStateGradient (potential: Posit<16, 1>) (neuron: TernarySpikingNeuron) =
    // Directional derivative exists at transition boundaries
    let perturbation = samplePerturbation ()   // assumed helper: small random posit
    let perturbedPotential = potential + perturbation
    // State change detection (no surrogate needed!)
    let originalState = computeState potential neuron
    let perturbedState = computeState perturbedPotential neuron
    // Exact sensitivity through the discrete transition
    if originalState <> perturbedState then
        perturbation    // Sensitivity at boundary
    else
        Posit.zero      // No transition
// Training with exact gradients through discrete states
let trainTernarySNN (network: SpikingNetwork) =
    // Sample random perturbation with posit precision
    let v = samplePerturbation<Posit<16, 1>> ()
    // Single forward pass computes output AND directional derivative,
    // even though the spike function is discrete
    let output, directional =
        Furnace.ForwardMode.evaluateWithDerivative network v
    // Unbiased gradient estimate: (grad f . v) * v
    let forwardGradient = directional * v
    // Update using local plasticity rules
    updateSynapticWeights forwardGradient
The takeaway: discreteness doesn’t break directional derivatives.
When a perturbation causes a state transition (Silent → Active, Active → Spiking), the derivative captures that sensitivity exactly. When it doesn’t, the derivative is zero. The expectation over random perturbations recovers complete gradient information without any approximation.
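That expectation claim can be checked numerically with nothing but the standard library. The target function, evaluation point, and sample count below are arbitrary illustrative choices: averaging (∇f · v) v over random Gaussian directions v converges to the true gradient.

// Monte Carlo check that E[(grad f . v) v] = grad f for v ~ N(0, I).
// f(x, y) = x*x + 3*y, so the true gradient at (1, 2) is (2, 3).
// The analytic gradient is used here only to form the directional derivative;
// in practice forward-mode AD produces (grad f . v) directly, without materializing grad f.
let rng = System.Random 42
let gaussian () =
    let u1, u2 = 1.0 - rng.NextDouble(), rng.NextDouble()
    sqrt (-2.0 * log u1) * cos (2.0 * System.Math.PI * u2)   // Box-Muller transform

let forwardGradientEstimate (x: float, y: float) =
    let vx, vy = gaussian (), gaussian ()
    let directional = 2.0 * x * vx + 3.0 * vy    // (grad f . v) at (x, y)
    (directional * vx, directional * vy)         // forward gradient g = (grad f . v) v

let estimates = List.init 100000 (fun _ -> forwardGradientEstimate (1.0, 2.0))
let meanX = estimates |> List.averageBy fst      // approaches 2.0
let meanY = estimates |> List.averageBy snd      // approaches 3.0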
Biological Plausibility Through Global Signals
The forward gradient paper notes something remarkable: this approach can be interpreted as “feedback of a single global scalar quantity that is identical for all computation nodes”2. This maps naturally to biological neuromodulatory systems:
- Dopamine for reward signaling
- Serotonin for mood regulation
- Acetylcholine for attention modulation
Combined with the MPSL framework's multiple plasticity mechanisms1, this creates a biologically plausible learning system, one that doesn't force the algorithm into a "backpropagation corner": global backpropagation is, by its nature, implausible in biological neural networks.
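A sketch of what that global-scalar picture looks like in code, with illustrative types and plain floats: the directional derivative is the one value broadcast everywhere, and each synapse combines it only with quantities it already holds locally, much like a diffuse neuromodulator.

// Three-factor flavour of the forward-gradient update: one global scalar, everything else local.
type PerturbedSynapse = {
    Weight: float          // current synaptic weight
    Perturbation: float    // this synapse's component of the random direction v
    Eligibility: float     // local plasticity factor, e.g. from ternary state transitions
}

let broadcastUpdate (learningRate: float) (globalScalar: float) (synapses: PerturbedSynapse list) =
    synapses
    |> List.map (fun s ->
        // global factor (identical at every synapse) x local factors (per-synapse)
        { s with Weight = s.Weight + learningRate * globalScalar * s.Perturbation * s.Eligibility })

Nothing in this sketch requires a weight-transposed backward pathway; the only signal that travels beyond a synapse is the scalar itself.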
Hebbian Plasticity Through State Transitions
The forward gradient approach naturally combines with local Hebbian rules based on our ternary state transitions:
\[ \Delta w_{ij} = \eta \cdot (\nabla f \cdot v) \cdot P(\text{State}_j \mid \text{State}_i) \]

Where weight updates depend on state transition probabilities:
- Silent → Active: Potentiation (strengthen connection)
- Active → Spiking: Hebbian reinforcement
- Spiking → Silent: Refractory adjustment
This directly enhances the MPSL framework’s multi-plasticity approach, where they already combine STBP, Hebbian, and SBP mechanisms. Our ternary states provide richer transition information for these learning rules to exploit:
// Forward gradient weight update - mathematically principled
// (learningRate, potentiationFactor, hebbianFactor and depressionFactor are assumed module-level constants)
let updateSynapticWeights
        (network: SpikingNetwork)
        (preState: TernaryState)
        (postState: TernaryState)
        (weight: Posit<16, 1>) =
    // Sample perturbation vector
    let v = sampleGaussian<Posit<16, 1>> ()
    // Compute directional derivative (exact, not surrogate!)
    let directional = computeDirectionalDerivative network v
    // Forward gradient is an unbiased estimate of the true gradient
    let gradient = directional * v
    // Combine with state transition probabilities
    match (preState, postState) with
    | (Silent, Active) ->
        weight + learningRate * gradient * potentiationFactor
    | (Active, Spiking) ->
        weight + learningRate * gradient * hebbianFactor
    | (Spiking, Silent) ->
        weight - learningRate * gradient * depressionFactor
    | _ -> weight
Posits: The Natural Language of Membrane Dynamics
The Leaky Integrate-and-Fire equation from the MPSL paper reveals why posit arithmetic is ideal for SNNs:
\[ U^{t,l} = \rho_m(U^{t-1,l} - S^{t-1,l}V_{th}) + I^{t,l} \]

This equation involves:
- Exponential decay (\(\rho_m\))
- Threshold comparisons
- Accumulation of many small inputs
Posit arithmetic’s variable precision naturally matches these requirements:
- High precision near threshold: Where spike/no-spike decisions are critical
- Lower precision for strongly polarized states: Where exact values matter less
- Exponential representation: Natural for the \(\rho_m\) decay factor
- Exact accumulation via quire: No rounding errors during integration
// Membrane potential dynamics with posit arithmetic
// (decayRate and activeSynapses are passed in; Synapse is assumed to expose posit Weight and Input fields)
let computeMembranePotential
        (current: Posit<32, 2>)
        (decayRate: Posit<32, 2>)
        (activeSynapses: Synapse seq) =
    let quire = Quire<32, 512>.Zero        // Exact accumulation
    // Decay current potential (the rho_m term)
    quire.AddProduct(current, decayRate)
    // Accumulate weighted inputs (no rounding errors!)
    for synapse in activeSynapses do
        quire.AddProduct(synapse.Weight, synapse.Input)
    quire.ToPosit()                        // Single rounding at the end
Integration with Furnace Auto-Differentiation
The Furnace library, which originated as 'DiffSharp' and shares authors with the forward gradient work (Baydin, Pearlmutter, Syme), provides the perfect foundation for implementing forward-mode SNNs:
module Furnace.Neuromorphic

// Leverage existing forward-mode AD infrastructure
let trainSpikingNetwork (network: TernarySpikingNetwork) (data: SensorData) =
    // Forward gradient computation in a single pass
    let forwardGradient = furnace {
        // Random perturbation for unbiased gradient estimation
        let! v = sampleStandardNormal network.ParameterShape
        // Forward pass computes both output and directional derivative
        // No backward pass needed!
        let! output, directional =
            ForwardMode.evaluateWithDirectional network data v
        // Forward gradient theorem: E[g] = ∇f
        return directional * v
    }
    // Update weights using local plasticity rules
    network.UpdateWeights forwardGradient
This approach achieves what the forward gradient paper demonstrated: training neural networks "without backpropagation" while being "computationally competitive"2, with up to a 2x speedup over traditional methods.
The Untapped Silicon Potential
Modern neuromorphic processors and Coarse-Grained Reconfigurable Architectures (CGRAs) already possess the capabilities needed for our approach; they’re just waiting for the right software to unlock their potential.
Intel’s Loihi 2, far from being limited to binary spikes, actually supports:
- Graded spikes with up to 32-bit integer payloads
- Programmable neuron models via microcode that can implement arbitrary dynamics
- Up to 4096 states per neuron, not just spike/no-spike
- Ternary weight matrices already demonstrated in recent implementations
IBM’s TrueNorth, BrainChip’s Akida, and other neuromorphic processors similarly offer programmable models and multi-bit communications. The limitation has never been the silicon; it’s been our algorithms.
CGRAs: The Perfect Platform for Adaptive Intelligence
Coarse-Grained Reconfigurable Architectures from companies like NextSilicon and SambaNova offer even more flexibility:
| Platform | Architecture | Key Advantage for SNNs |
|---|---|---|
| NextSilicon Maverick | Runtime reconfigurable dataflow | Automatically tunes to code patterns, no manual optimization needed |
| SambaNova RDU | Reconfigurable at each clock cycle | Can morph between neural and conventional processing dynamically |
| General CGRAs | Word-level reconfigurable arrays | Natural fit for ternary representations and posit arithmetic |
As SambaNova describes it, their RDU is “an array of compute and memory on chip” that can be reconfigured to match the exact computational pattern needed. This makes CGRAs ideal for:
- Ternary state machines that can be efficiently mapped to word-level operations
- Posit arithmetic implementations using the flexible compute units
- Dynamic network topologies that adapt during runtime
- Mixed conventional/neuromorphic workloads in the same chip
Hardware Capability Summary
| Feature | Neuromorphic (Loihi 2) | CGRAs (SambaNova/NextSilicon) | Fidelity Framework |
|---|---|---|---|
| State representation | Up to 4096 states/neuron | Arbitrary via reconfiguration | Ternary mapping |
| Arithmetic precision | 8-32 bit configurable | Word-level operations | Posit arithmetic |
| Learning capability | Programmable plasticity | Runtime adaptable | Forward gradients |
| Computation model | Event-driven spikes | Dataflow reconfigurable | Both paradigms |
| Programming model | Microcode/assembly | High-level dataflow | F# unified abstraction |
The Reality of Hybrid Compute
In practice, CGRAs and neuromorphic processors rarely operate as the sole component of a solution. They're deployed in heterogeneous systems as accelerators:
- On-die integration: Accelerators alongside conventional CPU/GPU cores
- CXL coherent memory: Shared memory spaces between neuromorphic and traditional processors
- PCIe accelerators: Accelerator cards working within host systems
- Edge hybrids: Low-power neuromorphic/CGRA units paired with DSPs or microcontrollers
The Fidelity framework's design, particularly the Firefly Hypergraph as a "control flow to data flow" transformer, would make it uniquely suited for these heterogeneous deployments:
// Platform-agnostic neuromorphic compilation
[<CompileToNeuromorphic>]
let neuromorphicCore (neurons: TernarySpikingNeuron array) =
    neuromorphic {
        // Configure for available neuromorphic target
        let! target = detectNeuromorphicPlatform ()
        match target with
        | Intel_Loihi2 config -> configureLoihi config
        | IBM_TrueNorth config -> configureTrueNorth config
        | BrainChip_Akida config -> configureAkida config
        | Infineon_Neuromorphic config -> configureInfineon config
        | FPGA_Emulation config -> configureFPGAEmulation config
        | CPU_Simulation fallback ->
            // Graceful degradation to CPU simulation
            configureCPUSimulation fallback
        // Common neuromorphic operations
        return! compileToDataFlow neurons
    }
Platform-Specific Implementation Strategies
The beauty of our approach is how naturally this future design will map onto numerous hardware architectures:
On Neuromorphic Processors (Loihi 2):
// Direct mapping to Loihi 2's programmable neurons
[<CompileToLoihi>]
let ternaryNeuronLoihi (state: int32) (input: int32) =
    // Loihi 2 supports up to 4096 states - we use just 3
    // Maps to microcode on neuromorphic cores
    match state with
    | -1 -> processActive input   // State 0-1365
    |  0 -> processSilent input   // State 1366-2730
    |  1 -> processSpike input    // State 2731-4095
    | _  -> processSilent input   // Out-of-range values treated as silent
On CGRAs (SambaNova RDU, NextSilicon Maverick):
// CGRA implementation leverages word-level reconfiguration
[<CompileToCGRA>]
let ternaryNeuronCGRA (neurons: TernaryNeuron array) =
    cgra {
        // Configure processing elements for ternary operations
        let! pe_array = allocatePEs neurons.Length
        // Runtime reconfiguration based on state distribution
        for pe in pe_array do
            pe.ConfigureForTernary()      // Word-level ternary ops
            pe.SetPositPrecision(16, 1)   // Native posit support
        // Dataflow automatically optimized by platform
        return dataflowProcess neurons
    }
CGRAs are particularly powerful here because they can:
- Reconfigure arithmetic units for posit operations dynamically
- Adapt dataflow patterns based on spike density
- Seamlessly transition between neural and conventional processing
- Implement the forward gradient computation in parallel across PEs, as sketched below
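A sketch of that last point, with assumed helper signatures and Array.Parallel standing in for the PE array: each processing element evaluates one random direction with a forward pass, and the per-PE estimates are averaged to reduce the variance of the unbiased estimator.

// Forward gradients parallelize naturally: one perturbation direction per processing element.
// sampleDirection and directionalDerivative are assumed callbacks; the latter would be a
// forward-mode evaluation of the network along direction v.
let parallelForwardGradient
        (numPEs: int)
        (sampleDirection: unit -> float[])
        (directionalDerivative: float[] -> float) =
    let perPE =
        Array.Parallel.init numPEs (fun _ ->
            let v = sampleDirection ()           // this PE's random direction
            let d = directionalDerivative v      // one forward pass on this PE
            Array.map (fun vi -> d * vi) v)      // local estimate (grad f . v) * v
    // Average the per-PE estimates; each is unbiased, so the mean is too
    let dim = perPE.[0].Length
    Array.init dim (fun i -> (perPE |> Array.sumBy (fun g -> g.[i])) / float numPEs)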
Each of the MPSL framework's learning mechanisms can then be mapped to the hardware best suited to it:
- STBP (Spatio-Temporal Backpropagation): Gradient-based learning that propagates errors through both space (layers) and time (timesteps)
- Hebbian Plasticity: Local learning based on the principle “neurons that fire together, wire together”
- SBP (Self-Backpropagation): Local feedback mechanism that approximates gradients without explicit error propagation
The ternary states provide richer information than binary for all three mechanisms. The MPSL paper’s choice to use binary states was algorithmic convention, not hardware necessity.
Each learning mechanism operates independently on parallel cores, then combines via learnable coefficients as described in the MPSL paper:
\[ W^l = \sum_{i=1}^{3} \lambda_i W_i^l \]

Where \(\lambda_i\) are adaptively learned mixing coefficients, optimized through local feedback using forward gradients, not global backpropagation.
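As a minimal sketch of that combination in plain floats (illustrative names; the clamping and renormalization of the coefficients is an assumed convention for the sketch, not taken from the paper):

// W^l = sum_i lambda_i * W_i^l, combining the three plasticity-specific weight matrices
let combineWeights (lambdas: float[]) (weightSets: float[][][]) =
    let rows, cols = weightSets.[0].Length, weightSets.[0].[0].Length
    Array.init rows (fun r ->
        Array.init cols (fun c ->
            lambdas |> Array.mapi (fun i lam -> lam * weightSets.[i].[r].[c]) |> Array.sum))

// The mixing coefficients can be nudged with the same forward-gradient signal and kept
// as a convex combination by clamping and renormalizing
let updateLambdas (eta: float) (lambdaGradients: float[]) (lambdas: float[]) =
    let updated = Array.map2 (fun lam g -> max 0.0 (lam - eta * g)) lambdas lambdaGradients
    let total = Array.sum updated
    if total > 0.0 then updated |> Array.map (fun x -> x / total) else lambdas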
Revolutionary Performance Projections
The convergence of ternary representations, forward gradient training, and advanced acceleration hardware promises unprecedented efficiency gains over both conventional approaches and existing binary SNNs:
| Metric | GPU (A100) | Binary SNN (Multi-Plasticity)1 | Ternary + Forward Gradient | Improvement vs GPU |
|---|---|---|---|---|
| Power (Inference) | 400W | 50W | 1-5W | 80-400x |
| Power (Training) | 400W | 100W | 2-10W | 40-200x |
| Latency (per spike) | 10μs | 1μs | 10-100ns | 100-10000x |
| Training passes | 2 (fwd+bwd) | 2 (fwd+bwd) | 1 (fwd only) | 2x |
| Gradient accuracy | N/A | Surrogate | Exact | Mathematically honest |
| Information preserved | N/A | Binary states | Ternary states | 50% more |
| Biological correspondence | None | Medium | High | Paradigm shift |
Note: Performance varies by neuromorphic processor and deployment configuration.
The forward gradient approach demonstrated up to a 2x speedup over backpropagation in conventional networks2. For SNNs, the advantage stands to be even greater since we eliminate the surrogate gradient approximation entirely.
Roadmap: From Vision to Silicon
Phase 1: Foundation
- Implement ternary SNN models in Fidelity framework
- Integrate forward gradient training via Furnace
- Develop neuromorphic backend for Firefly compiler
- Demonstrate MNIST/CIFAR-10 benchmarks
Phase 2: Hardware Integration
- Intel Loihi 2 support with ternary neuron models
- BAREWire integration for event streaming
- Posit arithmetic emulation on fixed-point units
- Heterogeneous CPU-neuromorphic demonstrations
Phase 3: Platform Expansion
- Support for IBM TrueNorth, BrainChip Akida
- FPGA-based neuromorphic emulation
- Cloud deployment with neuromorphic simulation
- Edge deployment on heterogeneous SoCs
Phase 4: Applications
- Real-time sensor fusion for robotics
- Ultra-low-power edge AI
- High-throughput inference systems
- Continuous learning systems
The Strategic Opportunity
The convergence of these technologies reveals an extraordinary opportunity: the hardware is already here, waiting for the right algorithms to unlock its potential. Current neuromorphic software treats advanced processors as if they were simple binary spike generators, using only a fraction of their capabilities. Similarly, CGRAs from NextSilicon and SambaNova are often programmed with conventional approaches that don’t leverage their reconfigurable nature. Our framework would change this by:
- Utilizing existing hardware features: Ternary states map naturally to the multi-bit spikes and programmable neurons already in silicon
- Eliminating algorithmic bottlenecks: Forward gradients remove the surrogate gradient fiction that has limited SNN training
- Providing unified abstractions: F# code that compiles efficiently to both neuromorphic and CGRA targets
Platform-Specific Advantages
For Neuromorphic Processors (Intel, IBM, BrainChip):
- Finally use the full state space (4096 states, not just 2)
- Leverage graded spikes for richer information encoding
- Implement true online learning without backpropagation
For CGRAs (NextSilicon, SambaNova):
- Natural word-level operations for ternary representations
- Runtime reconfiguration for adaptive neural topologies
- Seamless integration of neural and conventional processing
For Heterogeneous Systems:
- Neuromorphic cores for spiking dynamics
- CGRA/GPU for dense operations when needed
- CPU for orchestration and control flow
- All unified through BAREWire’s zero-copy communication
Why This Convergence Matters Now
The hardware ecosystem has reached a critical point where multiple platforms (neuromorphic processors, CGRAs, and heterogeneous systems) all have the capabilities needed for advanced SNNs. What's been missing is the software layer that can:
- Train these networks without mathematical compromises
- Deploy across diverse hardware without rewriting
- Utilize the full capabilities of modern silicon
The Fidelity framework with forward gradient training provides exactly this missing piece.
Unlocking Today’s & Tomorrow’s Silicon
The intelligent chip revolution isn’t waiting for new hardware; it’s waiting for software that can unleash the capabilities already available. Neuromorphic chips have multiple states per neuron, but we’ve been using just two. CGRA solutions can reconfigure every clock cycle, but we’ve been treating them like fixed architectures. The hardware industry has delivered remarkable capabilities; now it’s time for algorithms to catch up.
Our approach, converging ternary spiking neural networks with forward gradient training and integration with existing hardware, represents more than incremental progress; it's about finally using what we've already built. By embracing nature's organizing principles and matching them to silicon's actual capabilities, we can achieve the efficiency gains that neuromorphic computing has long promised.
The mathematical foundations are now clear:
- Ternary modeling leverages the multi-state capabilities already in neuromorphic processors and CGRAs, capturing distinct computational regimes of biological neurons
- Forward gradients provide exact training without the surrogate approximations that have limited the field
- Posit arithmetic maps naturally to the word-level operations of CGRAs and programmable precision of neuromorphic chips
- Existing hardware from Intel, IBM, BrainChip, NextSilicon, and SambaNova is ready today
The Fidelity framework bridges the gap between hardware capability and algorithmic reality. Our control-flow to data-flow compilation, forward gradient training powered by Furnace, and platform-agnostic approach create the software foundation that can finally free the potential of neuromorphic and reconfigurable hardware.
The future of neuromorphic computing isn’t just about building new silicon; it’s also about finding ways to fully employ this remarkable silicon to its fullest, world-changing potential.
Let’s unlock true intelligence with The Fidelity Framework.
Liu, Y., Deng, X., & Yu, Q. (2025). Multi-Plasticity Synergy with Adaptive Mechanism Assignment for Training Spiking Neural Networks. arXiv preprint arXiv:2508.13673.
Baydin, A. G., Pearlmutter, B. A., Syme, D., Wood, F., & Torr, P. (2022). Gradients without Backpropagation. arXiv preprint arXiv:2202.08587.