No Servers, Actually: What the Cloud Hides and What Quantum Reveals

A Naming Problem Worth Taking Seriously

Classical serverless computing did not eliminate servers. It made them someone else's operational problem. AWS Lambda, Google Cloud Functions, and Azure Functions hide provisioning, scaling, and billing complexity behind a function interface. You upload code, define a trigger, and the platform handles execution. The name that attached itself to this model was serverless. It stuck and became one of the defining architectural terms of the decade.

The name is technically inaccurate, and the industry knows it. When pressed, practitioners say serverless means "servers you do not have to manage yourself." That is a practical and useful abstraction. But it is worth separating the abstraction from the claim, because the confusion between the two becomes interesting once you consider a different kind of computation entirely.

The quantum gate model is different. It is not serverless because the server is hidden. It is serverless because the formal model contains no server at all: only states, gates, unitary evolution, and measurement. That distinction may sound semantic, but it becomes important as quantum computing moves through cloud APIs and into production hybrid workflows.

The practical consequence is architectural. Quantum circuits should not be treated like cloud functions running on exotic hardware. They should be treated as formal transformations whose surrounding state, scheduling, retries, and integration all belong to the classical layer. The rest of this post builds up to that claim.

What Serverless Actually Promises

The 2019 Berkeley view on serverless computing by Jonas et al. defines the model around three properties: automatic scaling, no explicit resource provisioning, and billing based on actual execution rather than reserved capacity.^[1] By that definition, serverless is a resource management and billing model. It says nothing about whether servers exist. It says only that their management is not your problem.

For event-driven workloads with variable traffic, this is real value, clearly delivered. But the abstraction sits on a physical layer, and that layer reveals exactly what classical serverless can and cannot promise.

Fig. 1 — Left: AWS Lambda exposes one interface to the developer but runs on two hidden layers: a language runtime and a Firecracker microVM on an EC2 host. Right: the quantum gate model contains only qubit state, unitary gates, and measurement. No server concept appears in the formalism at any layer.

The Cold Start Problem as Evidence

The most direct evidence that classical serverless is a management abstraction over conventional compute is the cold start problem. When a Lambda function has not been invoked recently, the first invocation incurs a latency penalty that can range from tens of milliseconds to several seconds depending on the runtime, the package size, and the memory configuration. This is the cold start.

Cold starts exist because Lambda functions execute inside Firecracker microVMs. Firecracker is a virtual machine monitor developed by AWS and described in detail in a 2020 NSDI paper by Agache et al.^[2] A Firecracker microVM needs to be created and booted, a Linux kernel needs to initialize, the language runtime needs to start, and the function package needs to be loaded before the first invocation can execute. That entire sequence is a server boot sequence. The fact that it happens in milliseconds rather than minutes does not change what it is.
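The boot sequence described above can be observed from inside a function. The following is a minimal sketch, assuming a Python runtime; the handler name and the returned field names are hypothetical. It relies only on the fact that module-level code runs once per execution environment, during initialization, while the handler body runs on every invocation.

```python
import time

# Module-level code runs once per execution environment, during the cold start.
# Later invocations served by the same warm environment reuse these values.
_ENV_STARTED_AT = time.monotonic()
_invocation_count = 0


def handler(event, context):
    """Hypothetical Lambda-style handler that reports whether it ran cold."""
    global _invocation_count
    _invocation_count += 1
    return {
        # True only for the first invocation this environment has served.
        "cold_start": _invocation_count == 1,
        "invocations_in_this_environment": _invocation_count,
        "environment_age_seconds": time.monotonic() - _ENV_STARTED_AT,
    }


if __name__ == "__main__":
    # Local stand-in: two calls in one process mimic one warm environment.
    print(handler({}, None)["cold_start"])  # True
    print(handler({}, None)["cold_start"])  # False
```

That a function can tell its first invocation from its second is itself the point: there is a process underneath with a lifetime longer than a single call.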

A cold start is one of the clearest reminders that serverless still depends on real infrastructure: runtimes, isolation boundaries, scheduling, and physical machines.

Firecracker is a remarkable piece of engineering. It achieves a per-microVM memory overhead of under 5 MB and boot times of as little as 125 milliseconds, enabling AWS to run millions of function instances across a shared fleet with strong isolation between customers.^[2] The engineering that makes Lambda fast and safe is infrastructure engineering built on top of hardware that is, without ambiguity, a server.

None of this is a criticism of the serverless model. It is a clarification of what the model is: an abstraction over server management, not an absence of servers. That distinction matters because it places a ceiling on certain properties the model can offer.

The Quantum Gate Model Has No Server in Its Formalism

Quantum computation in the gate model is defined as follows. A quantum system begins in a known initial state, usually all qubits in the ground state. A sequence of quantum gates, each a unitary matrix, is applied to the system. A measurement is taken at the end, collapsing the quantum state to a classical bitstring with probabilities determined by the final state vector. The computation is the circuit. This is the foundational description in Nielsen and Chuang, the standard reference for quantum computation.^[3]
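The prepare, evolve, and measure description above can be written out directly. The following is a minimal sketch in plain Python, not a production simulator: it builds a two-qubit Bell state from a Hadamard and a CNOT and computes the measurement probabilities. The function names and the bit-ordering convention (qubit 0 is the leftmost bit) are choices made for this illustration.

```python
import math


def apply_single_qubit_gate(state, gate, qubit, n_qubits):
    """Apply a 2x2 unitary to one qubit of an n-qubit statevector."""
    new_state = [0j] * len(state)
    shift = n_qubits - 1 - qubit  # bit position of this qubit in the index
    for index, amplitude in enumerate(state):
        bit = (index >> shift) & 1
        for new_bit in (0, 1):
            target = (index & ~(1 << shift)) | (new_bit << shift)
            new_state[target] += gate[new_bit][bit] * amplitude
    return new_state


def apply_cnot(state, control, target, n_qubits):
    """Flip the target qubit wherever the control qubit is 1."""
    new_state = [0j] * len(state)
    c_shift = n_qubits - 1 - control
    t_shift = n_qubits - 1 - target
    for index, amplitude in enumerate(state):
        if (index >> c_shift) & 1:
            new_state[index ^ (1 << t_shift)] += amplitude
        else:
            new_state[index] += amplitude
    return new_state


def measurement_probabilities(state):
    """Born rule: the probability of each bitstring is |amplitude|^2."""
    return [abs(amplitude) ** 2 for amplitude in state]


S = 1 / math.sqrt(2)
HADAMARD = [[S, S], [S, -S]]

state = [1 + 0j, 0j, 0j, 0j]                          # prepare |00>
state = apply_single_qubit_gate(state, HADAMARD, 0, 2)  # evolve: H on qubit 0
state = apply_cnot(state, 0, 1, 2)                      # evolve: CNOT 0 -> 1
probs = measurement_probabilities(state)                # measure

# Probabilities for bitstrings 00, 01, 10, 11.
print([round(p, 3) for p in probs])  # [0.5, 0.0, 0.0, 0.5]
```

Everything in the sketch is a list of amplitudes and a few linear maps. The Python process that runs it is an artifact of simulating the model classically, not a part of the model.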

Nothing in that description contains a process, a runtime, a file system, an operating system, a boot sequence, or a memory allocator. These concepts do not appear because they are not part of the model. The qubit is not allocated from a pool of available memory. The gate is not a function call dispatched by a scheduler. The measurement is not an I/O operation to a storage device. The entire framework sits outside the classical computing paradigm at the level of its formalism.

Classical Serverless (Lambda)

Function executes inside a Firecracker microVM
Linux kernel initializes on cold start
Language runtime boots before first invocation
State is discarded between invocations by design
Billed in 1 ms increments of execution time, priced by configured memory
Cold start latency is observable and measurable

Quantum Gate Model

Circuit is a sequence of unitary operators on qubit state
No runtime, no OS, no boot sequence in the formalism
Each shot is a fresh prepare-evolve-measure cycle in the circuit abstraction
State collapses to classical bits on measurement
Often billed per shot on cloud QPU access today (Amazon Braket adds a per-task fee)
No cold start concept exists in the model

This is not a claim that quantum computers are easier to use or more practical than classical serverless platforms. Quantum hardware is fragile, error-prone, and subject to decoherence. Access to real QPU time is limited and expensive. Classical Lambda handles a trillion invocations per year at production scale. The comparison is not about maturity or utility. It is about what the underlying computational model is, at the level of its definition.

The Meta-Irony: Quantum Circuits Run on Classical Serverless Today

Here is where the comparison becomes genuinely interesting. When you submit a quantum circuit to IBM Quantum, Amazon Braket, or Google Quantum AI today, the submission goes through a classical API endpoint. That endpoint is almost certainly served by infrastructure that includes classical serverless components. The circuit definition is serialized, queued, transmitted to a QPU controller, and executed, and the measurement results are returned in an HTTP response.

The model with no server in its formalism is reached only through the managed-server abstraction. The genuinely serverless model runs behind the nominally serverless one. This is less a contradiction than a consequence of the fact that QPU hardware is not yet accessible as a standalone computing substrate. All quantum access today is mediated by classical infrastructure.

This is worth noting not as irony for its own sake but because it shapes how hybrid quantum-classical architectures actually need to be designed. The quantum circuit is the stateless computation. The classical cloud infrastructure manages state, schedules QPU access, handles errors, queues retries, and returns results. The roles map naturally: quantum for the computation that benefits from superposition and entanglement, classical serverless for orchestration, state management, and integration with the rest of the system.
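The division of roles described above can be sketched in a few lines. This is an illustration under stated assumptions, not any provider's API: run_circuit is a stand-in that fakes Bell-state counts, and TransientBackendError, make_flaky_backend, and submit_with_retries are hypothetical names. The point is only where the retry logic and the job record live.

```python
import random
import time


class TransientBackendError(Exception):
    """Hypothetical error standing in for a queue timeout or similar fault."""


def run_circuit(circuit_description, shots, rng):
    """Stand-in for a QPU call: stateless, returns counts of bitstrings.

    It fakes a Bell-state circuit and ignores the description's contents.
    """
    counts = {"00": 0, "11": 0}
    for _ in range(shots):
        counts[rng.choice(("00", "11"))] += 1
    return counts


def make_flaky_backend(failures_before_success, rng):
    """Wrap run_circuit so that the first few calls fail, for demonstration."""
    remaining = {"failures": failures_before_success}

    def backend(circuit_description, shots):
        if remaining["failures"] > 0:
            remaining["failures"] -= 1
            raise TransientBackendError("simulated transient failure")
        return run_circuit(circuit_description, shots, rng)

    return backend


def submit_with_retries(run, circuit_description, shots,
                        max_attempts=3, backoff_seconds=0.0):
    """Classical layer: owns the job record, the retries, and the backoff."""
    job = {"circuit": circuit_description, "shots": shots,
           "attempts": 0, "status": "pending"}
    for attempt in range(1, max_attempts + 1):
        job["attempts"] = attempt
        try:
            job["counts"] = run(circuit_description, shots)
            job["status"] = "succeeded"
            return job
        except TransientBackendError:
            time.sleep(backoff_seconds * attempt)  # linear backoff
    job["status"] = "failed"
    return job


if __name__ == "__main__":
    backend = make_flaky_backend(2, random.Random(0))
    job = submit_with_retries(backend, "bell-pair", shots=100)
    print(job["status"], job["attempts"])  # succeeded 3
```

Nothing about the retry policy reaches the circuit. The circuit function receives a description and a shot count and returns counts; everything that remembers anything is in the job record.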

The Boundary Is the Architecture

In a hybrid quantum-cloud system, the clean boundary is not serverless versus quantum. The boundary is classical orchestration versus quantum transformation. Once that boundary is clear, the architecture follows from it.

Classical infrastructure handles authentication, job queuing, retries, error handling, state management, logging, and billing. The quantum circuit handles state evolution and measurement. Confusing those layers leads to bad architecture: circuits that try to carry state across shots, orchestration logic embedded inside gate sequences, or classical post-processing treated as part of the quantum computation when it is not.

In the standard gate-model circuit abstraction, each shot is treated as a fresh preparation, evolution, and measurement cycle. Persistent application state belongs outside the quantum circuit, in the classical control layer. Cloud QPU workflows may maintain classical session state across jobs, but that state lives in the classical infrastructure wrapping the circuit, not inside it. Keeping the boundary explicit in your architecture makes that distinction operationally visible.
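To make the fresh-cycle idea concrete, here is a small sketch that simulates shots classically. It assumes a one-qubit circuit, RY(theta) applied to |0>, for which the probability of measuring 1 is sin^2(theta/2). The names prepare_and_measure, run_job, and session are hypothetical.

```python
import math
import random
from collections import Counter


def prepare_and_measure(theta, rng):
    """One shot: prepare |0>, apply RY(theta), measure in the computational basis.

    Nothing from earlier shots is passed in, and only the outcome bit comes out.
    """
    probability_of_one = math.sin(theta / 2) ** 2
    return 1 if rng.random() < probability_of_one else 0


def run_job(theta, shots, rng):
    """A job is repeated independent shots; the tally is classical data."""
    return Counter(prepare_and_measure(theta, rng) for _ in range(shots))


# Classical session state lives out here, in the control layer.
session = {"jobs": []}
rng = random.Random(7)
for theta in (0.0, math.pi / 2, math.pi):
    session["jobs"].append({"theta": theta, "counts": run_job(theta, 1000, rng)})

for job in session["jobs"]:
    print(round(job["theta"], 3), dict(job["counts"]))
```

The session dictionary is the only thing that persists from one job to the next, and it never enters the shot function.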

Why the Distinction Matters for Hybrid Systems

Once you recognize that the quantum gate model is genuinely stateless at the formalism level and not stateless by convention, it becomes easier to reason about where the boundary between classical and quantum components should sit in a hybrid system.

State belongs in the classical layer. A quantum circuit cannot maintain state across shots. Measurement collapses the quantum state, and there is no mechanism in the gate model for a qubit to retain information between invocations. This is not a limitation that better engineering will remove. It is a consequence of quantum mechanics. Designing a hybrid system that tries to persist state in the quantum layer is working against the physics.

Orchestration belongs in the classical layer. Deciding which circuit to run, interpreting results, updating parameters in a variational algorithm, and routing between classical and quantum workloads are all classical problems. The quantum component answers one kind of question well: given this input state and this circuit, what is the probability distribution over measurement outcomes? Everything around that question is classical infrastructure work.
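A variational loop shows this split in miniature. The sketch below assumes a one-parameter circuit, RY(theta) on |0>, whose Z expectation value is cos(theta), and evaluates it exactly rather than by sampling. The gradient uses the parameter-shift rule, and the optimizer is plain gradient descent. The function names and hyperparameters are illustrative choices.

```python
import math


def expectation_z(theta):
    """Stand-in for the quantum step: <Z> after RY(theta) on |0> is cos(theta)."""
    return math.cos(theta)


def parameter_shift_gradient(evaluate, theta):
    """Gradient from two extra circuit evaluations at theta +/- pi/2."""
    return 0.5 * (evaluate(theta + math.pi / 2) - evaluate(theta - math.pi / 2))


def minimize(evaluate, theta, learning_rate=0.4, steps=100):
    """Classical step: choose the next parameter and record the result."""
    history = []
    for _ in range(steps):
        theta -= learning_rate * parameter_shift_gradient(evaluate, theta)
        history.append(evaluate(theta))
    return theta, history


final_theta, history = minimize(expectation_z, theta=0.5)
print(round(final_theta, 4), round(history[-1], 4))
```

The quantum side is the single call to evaluate. The loop, the learning rate, the history, and the decision about which parameter to try next all sit in classical code, so swapping expectation_z for a real backend call would leave the rest unchanged.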

The insight that follows from taking the serverless analogy seriously is that quantum and classical serverless are complementary rather than competing. Lambda handles the orchestration, state management, and integration work. The QPU handles the circuits. The per-shot billing model on Braket maps naturally onto the per-invocation billing model of Lambda. The architectural fit is cleaner than it might appear at first glance.

Quantum handles the circuit. Classical handles everything that surrounds it. The distinction in their computational models is not a gap to be bridged but a division of labor to be designed around.

What This Does Not Mean

This argument does not mean quantum computing will replace classical serverless, or that it is inherently superior, or that current quantum hardware is ready for production workloads. The argument is narrower: that the serverless label, when applied to Lambda and similar platforms, describes a management and billing abstraction over conventional compute, and that this is distinct from the quantum gate model, which has no server concept in its foundational formalism.

Recognizing that distinction helps clarify what we are actually building when we design hybrid systems. Quantum computation is not a faster way to run functions. It is a different computational model, useful for a different class of problems, accessed today through classical infrastructure that is serverless only in the managed-server sense. Treating it as exotic cloud compute leads to architectural choices that fight the model. Treating it as a formal transformation surrounded by classical orchestration leads to designs that work with it.

References

[1] Jonas, E., Schleier-Smith, J., Sreekanti, V., Tsai, C., Khandelwal, A., Pu, Q., Shankar, V., Carreira, J., Kraska, T., Alizadeh, M., Gonzalez, J., Hellerstein, J., Stoica, I., & Patterson, D. (2019). Cloud Programming Simplified: A Berkeley View on Serverless Computing. UC Berkeley Technical Report UCB/EECS-2019-3. Available at: arxiv.org/abs/1902.03383
[2] Agache, A., Brooker, M., Iordache, A., Liguori, A., Neugebauer, R., Piwonka, P., & Popa, D. (2020). Firecracker: Lightweight Virtualization for Serverless Applications. 17th USENIX Symposium on Networked Systems Design and Implementation (NSDI 2020). Available at: usenix.org/conference/nsdi20/presentation/agache
[3] Nielsen, M. A., & Chuang, I. L. (2010). Quantum Computation and Quantum Information (10th Anniversary Edition). Cambridge University Press. ISBN 978-1-107-00217-3. Standard reference for the quantum gate model and circuit formalism. Available via university libraries. For a free introduction to the same gate model concepts, see IBM Quantum Learning: learning.quantum.ibm.com

Written as a perspective on computational models and cloud infrastructure. Jaya Preethi Mohan, University of North Dakota.