CUDA-Q Adapter v0.1
Compilation Flow

CUDA-Q (formerly CUDA Quantum), compiled with the nvq++ compiler, provides a programming model that integrates quantum kernels with classical C++ or Python code, enabling hybrid applications. Its compilation flow is built around LLVM, MLIR, and QIR (the Quantum Intermediate Representation).


1. Source Code

You write code using CUDA-Q's syntax:

  • C++ with quantum kernels, e.g.:

    __qpu__ void bell() {
      cudaq::qubit q0, q1;
      h(q0);
      cx(q0, q1);
    }
  • A Python front end is also supported; it builds on the same MLIR/LLVM infrastructure.

2. Front-End Parsing & AST Generation

  • C++ source is parsed using Clang, extended to recognize CUDA-Q annotations (e.g., __qpu__).
  • The compiler identifies quantum kernels and separates them from classical code.

3. MLIR Transformation

  • Quantum kernels are lowered into MLIR (Multi-Level Intermediate Representation).
  • Classical code remains in LLVM IR or C++.
  • CUDA-Q uses its own dialects within MLIR (e.g., quake, cc, etc.).

4. Quantum IR Generation (QIR)

  • The MLIR representation is converted to QIR — a standard LLVM-based intermediate representation for quantum programs.
  • QIR enables target-agnostic quantum code that can be interpreted or compiled for multiple backends.
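As an illustration, the Bell kernel from step 1 might lower to QIR along these lines. This is a hand-written sketch in the style of the QIR base profile, not actual compiler output; the qubit ids are assumed:

```llvm
; Sketch of QIR for the bell kernel (illustrative, not compiler output)
%Qubit = type opaque

declare void @__quantum__qis__h__body(%Qubit*)
declare void @__quantum__qis__cnot__body(%Qubit*, %Qubit*)

define void @bell() {
entry:
  ; In the base profile, qubits are referenced by integer ids cast to %Qubit*.
  call void @__quantum__qis__h__body(%Qubit* null)
  call void @__quantum__qis__cnot__body(%Qubit* null,
                                        %Qubit* inttoptr (i64 1 to %Qubit*))
  ret void
}
```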

5. Target-Specific Lowering

  • Depending on the selected backend (e.g., a local emulator, NVIDIA cuStateVec-based GPU simulators, or remote quantum hardware providers), the QIR is further lowered into:
    • Quantum assembly (for simulators or real quantum hardware).
    • CUDA kernels (if GPU-based simulation is used).
    • QASM or vendor-specific IRs.
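For instance, the Bell kernel expressed as OpenQASM 2.0 (with measurements added so a backend returns counts) looks like:

```qasm
OPENQASM 2.0;
include "qelib1.inc";
qreg q[2];
creg c[2];
h q[0];
cx q[0],q[1];
measure q -> c;
```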

6. Classical Compilation (LLVM)

  • The classical host code is compiled using Clang + LLVM.
  • It links with the runtime libraries that handle quantum kernel dispatching and memory management.

7. Linking & Execution

  • The final binary includes:
    • Host (classical) code.
    • Quantum kernels embedded as callable functions or LLVM modules.
    • Runtime dependencies (e.g., CUDA-Q runtime, simulators, drivers).
  • At runtime, the host code launches quantum kernels on the selected backend, collects results, and continues classical processing.
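Putting the pieces together, a minimal hybrid application might look like the sketch below (using the bell kernel from step 1 and cudaq::sample, CUDA-Q's standard sampling entry point; it requires the CUDA-Q toolchain to build):

```cpp
#include <cudaq.h>

// Quantum kernel: prepares and measures a Bell pair.
__qpu__ void bell() {
  cudaq::qubit q0, q1;
  h(q0);
  cx(q0, q1);
  mz(q0);
  mz(q1);
}

int main() {
  // The host launches the kernel on the selected backend, samples it,
  // and continues with classical post-processing of the counts.
  auto counts = cudaq::sample(1000, bell);
  counts.dump(); // expect roughly equal "00" and "11" counts
  return 0;
}
```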

Compiling using the MQSS CUDA-Q Adapter

  • To target MQSS via MQP (REST API)

    nvq++ --target mqssMQP your_application_source.cpp -o your_binary_application
  • To target MQSS via HPC (coming soon; not yet supported by MQSS v1)

    nvq++ --target mqssHPC your_application_source.cpp -o your_binary_application

Executing your Application with Configuration File and Credentials

Once the binary of your application has been generated successfully, you can execute it to submit quantum circuits to the MQSS, as follows:

MQSS_MQP_TOKEN=your/token/file/path CUDAQ_MQSS_CONFIGURATION=/your/configuration/file/path ./your_binary_application

The environment variable MQSS_MQP_TOKEN specifies the path of a text file containing a token to access the MQP. To generate a valid token, please visit the Munich Quantum Portal (MQP).

A configuration file is required to pass additional information to the MQSS at runtime.

The environment variable CUDAQ_MQSS_CONFIGURATION has to be set so that the configuration file is picked up at runtime.

The configuration file might contain the following information:

  • n_shots is the number of shots used to execute the quantum task. This is annotated by the MQSS CUDA-Q Adapter and does not need to be defined in the configuration file.
  • transpiler_flag is a flag indicating whether the MQSS has to perform transpilation: true indicates the flag is active, false otherwise.
  • submit_time is the submission time of the task. This is annotated by the MQSS CUDA-Q Adapter on each submitted task and does not need to be defined in the configuration file.
  • circuit_file_type is the type of circuit submitted to the MQSS. The currently supported file types are: QASM, Quake, and QIR.
  • preferred_qpu specifies the preferred QPU on which the submitted jobs will be executed.
  • restricted_resource_names specifies the QPUs that are restricted when submitting a job to the MQSS.
  • priority is an integer value that specifies the priority level of the submitted job.
  • user_identity specifies the identity of the user.
  • optimisation_level specifies the optimization level. Supported levels: 0, 1, 2, and 3.
  • via_hpc is a flag specifying whether the job is submitted via HPC or MQP: true indicates the job is submitted via HPC, false via MQP. This is annotated by the MQSS CUDA-Q Adapter on each submitted task and does not need to be defined in the configuration file.

The following is an example of a configuration file:

n_shots: 1000
transpiler_flag: true
circuit_file_type: qasm
preferred_qpu: QExa20
restricted_resource_names: AQT20
priority: 1
user_identity: YOUR IDENTITY
optimisation_level: 3