Funding year 2024 / Scholarship Call #19 / Project ID: 7413 / Project: Optimizing Hybrid Workflows for Cloud-Based Quantum Computation
In the previous blog post, we announced our next steps: moving from two initial case studies to systematic profiling of quantum circuit compilation across different circuits, sizes, backends, and optimization settings. Our expanded study covers 42,240 compilation runs, varying 11 benchmark circuits, qubit counts from 10 to 120, four backend models, four optimization levels, and 20 compiler seeds to ensure that our results are reproducible. The circuits include standardized benchmarks from MQT Bench, and the backend models comprise restricted-connectivity IBM-like architectures as well as generic backends with all-to-all connectivity. This setup allows us to compare compilation under realistic connectivity constraints with compilation on targets that have no connectivity limitations.
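As a quick sanity check on the size of the study, the configuration grid can be enumerated as a Cartesian product. This is only a sketch: the circuit and backend labels are placeholders, and the qubit counts (steps of 10) and optimization levels (0 to 3) are assumptions chosen because they are consistent with the 42,240 runs stated above.

```python
from itertools import product

# Placeholder labels: only the counts come from the study description.
circuits = [f"circuit_{i}" for i in range(11)]   # 11 benchmark circuits
qubit_counts = list(range(10, 121, 10))          # assumed: 10, 20, ..., 120
backends = [f"backend_{i}" for i in range(4)]    # 4 backend models
optimization_levels = [0, 1, 2, 3]               # 4 levels (assumed to be 0-3)
seeds = list(range(20))                          # 20 compiler seeds

# One tuple per compilation run.
grid = list(product(circuits, qubit_counts, backends, optimization_levels, seeds))

print(len(grid))  # 42240
```

Under these assumptions, 11 x 12 x 4 x 4 x 20 gives exactly the number of runs reported.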
Qubit count alone does not explain compilation time
Our clearest finding is that qubit count alone cannot explain compilation time. On restricted-connectivity backends, increasing the number of qubits increases compilation time, but not always smoothly or uniformly across circuits. For some circuit families, such as GHZ, VQE RealAmplitudes, VQE SU2, and W-state, we observe sharp increases as the problem size reaches the limit of the target device.
This picture changes on the generic all-to-all-connected backend, where compilation time may even decrease for larger circuit sizes. This counterintuitive scaling behavior of compilation time is consistent with a topology- and size-dependent graph-embedding effect. On a fully connected backend, mapping a small logical circuit onto a much larger physical target can leave many valid embeddings to be evaluated, whereas with a circuit that nearly fills the target device, the embedding problem becomes more constrained, which can substantially reduce the search space. Thus, larger circuits can be cheaper to compile on fully connected backends.
This demonstrates that compilation cost is not simply a function of problem size but the result of the interaction between structural circuit properties and backend constraints. This is an important practical insight. If a cloud or HPC system wants to estimate compilation cost before execution, simple size-based rules are not sufficient. A useful estimate must consider structural circuit features and backend properties.
A few passes often consume most of the time
Another central result is that compilation time is frequently concentrated in a small number of passes. To quantify this effect, we use the top-k concentration measure, defined as the fraction of total compilation time spent on the k most time-consuming passes. In many configurations, the three most expensive passes account for a large share of the overall compilation time. On backends with restricted connectivity at optimization level 3, the top-3 concentration depends strongly on the circuit type. While some circuits show comparatively stable concentrations, others reach very high values, exceeding 80%. On the all-to-all connected backend, the picture differs again. There, the top-3 passes account for a substantial and nearly constant share of compilation time across all analyzed circuits. This result is encouraging from an engineering perspective, as our data indicate that targeting a few dominant passes may be a realistic path toward reducing classical preprocessing time.
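The top-k concentration measure defined above can be sketched in a few lines. This is a minimal illustration rather than our profiling code: the function name, the pass names, and the timings are made up for the example.

```python
def top_k_concentration(pass_times, k=3):
    """Return the fraction of total time spent in the k slowest passes.

    pass_times maps a pass name to its accumulated runtime in seconds.
    """
    total = sum(pass_times.values())
    if total <= 0:
        raise ValueError("total compilation time must be positive")
    slowest = sorted(pass_times.values(), reverse=True)[:k]
    return sum(slowest) / total


# Made-up timings (seconds) with hypothetical pass names.
example = {
    "layout": 4.0,
    "routing": 3.0,
    "synthesis": 1.0,
    "scheduling": 1.0,
    "cleanup": 1.0,
}

share = top_k_concentration(example, k=3)
print(f"top-3 concentration: {share:.0%}")  # top-3 concentration: 80%
```

A value close to 1 means that almost all compilation time sits in k passes, while a value close to k divided by the number of passes means the time is spread evenly.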
The dominant passes are workload-dependent
On fully connected backends, layout-related passes often drive compilation time across several circuits. On restricted-connectivity backends, however, we see a more differentiated picture: for strongly entangled circuits, layout- and routing-related passes become dominant.
From profiling to understanding
The next question is whether these bottlenecks can be anticipated before the full compilation is performed. To investigate this, we also analyzed structural circuit features, including gate counts, circuit depth, interaction-graph properties, gate-dependency information, and benchmark features. The results exhibit correlations between circuit structure and pass-level runtime behavior. For example, properties related to two-qubit interactions and graph structure correlate with the runtime and activation of specific compiler passes. This provides us with a first empirical basis for moving from profiling to modeling. Instead of treating the quantum compiler as a black box, we can begin to ask which circuit features make certain passes expensive and under which conditions a costly pass is likely to dominate.
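To make the notion of interaction-graph properties concrete, the sketch below derives a few simple features from a list of two-qubit gates. It is an illustration under assumptions: the function name, the chosen features, and the two toy circuits are hypothetical and do not reproduce the feature set used in our study.

```python
from itertools import combinations


def interaction_graph_features(num_qubits, two_qubit_gates):
    """Summarize which qubit pairs interact in a circuit.

    two_qubit_gates is a list of (qubit_a, qubit_b) index pairs, one per gate.
    """
    neighbors = {q: set() for q in range(num_qubits)}
    for a, b in two_qubit_gates:
        if a == b:
            raise ValueError("a two-qubit gate needs two distinct qubits")
        neighbors[a].add(b)
        neighbors[b].add(a)

    degrees = [len(neighbors[q]) for q in range(num_qubits)]
    edges = sum(degrees) // 2  # each edge is counted once per endpoint
    max_edges = num_qubits * (num_qubits - 1) // 2
    return {
        "two_qubit_gates": len(two_qubit_gates),
        "edges": edges,
        "max_degree": max(degrees),
        "mean_degree": sum(degrees) / num_qubits,
        "density": edges / max_edges if max_edges else 0.0,
    }


# Two toy interaction patterns on 5 qubits.
chain = interaction_graph_features(5, [(i, i + 1) for i in range(4)])
all_pairs = interaction_graph_features(5, list(combinations(range(5), 2)))

print(chain["density"], all_pairs["density"])  # 0.4 1.0
```

Features like these can then be tabulated next to per-pass runtimes to check, for instance, whether denser interaction graphs coincide with more expensive layout or routing passes.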
These results form the foundation for the next stage of our work: developing resource-aware compilation strategies that reduce classical overhead without compromising the quality of the compiled quantum circuit. In the next blog post, we will discuss how these findings motivate concrete follow-up work on prediction models, selective compiler optimization, and reuse mechanisms for cloud-based quantum computing workflows.