Concise guide to the local SPSA estimator and sampler gradient implementations for parameterized quantum circuits.
Resources
1Install
npx skillscat add unitarylab/quantum-skills/spsa Install via the SkillsCat registry.
SPSA Gradient Estimation
Purpose
Use this skill for the SPSA gradient implementations in this folder:
scripts/spsa_estimator_gradient.pyscripts/spsa_sampler_gradient.py
They estimate gradients of:
- expectation values via
SPSAEstimatorGradient - sampled output distributions via
SPSASamplerGradient
SPSA is useful when circuits have many parameters, because each batch needs only 2 * batch_size circuit evaluations, independent of parameter count.
One-Step Run Command
There is no standalone script entry point in this folder.
Use the classes as library components inside the full package.
Overview
For each random perturbation vector delta in {+1, -1}^d, the implementation evaluates:
f(theta + epsilon * delta)f(theta - epsilon * delta)
It then forms a central-difference estimate and averages across batch_size perturbations.
Prerequisites
- NumPy
- Qiskit parameterized circuits
- Estimator or Sampler primitives
- Basic finite-difference gradient intuition
Using the Provided Implementation
The real entry classes are:
SPSAEstimatorGradientSPSASamplerGradient
Public usage goes through inherited run(...); the algorithm itself is implemented in _run(...).
Minimal estimator example:
from qiskit.circuit import QuantumCircuit, ParameterVector
from qiskit.primitives import StatevectorEstimator
from qiskit.quantum_info import SparsePauliOp
from qiskit_algorithms.gradients import SPSAEstimatorGradient
theta = ParameterVector("theta", 2)
qc = QuantumCircuit(2)
qc.ry(theta[0], 0)
qc.ry(theta[1], 1)
qc.cx(0, 1)
estimator = StatevectorEstimator()
obs = SparsePauliOp("ZZ")
grad = SPSAEstimatorGradient(estimator=estimator, epsilon=0.01, batch_size=4, seed=123)
result = grad.run(
circuits=[qc],
observables=[obs],
parameter_values=[[0.3, -0.2]],
parameters=[[theta[0], theta[1]]],
precision=None,
).result()
print(result.gradients)Minimal sampler example:
from qiskit.circuit import QuantumCircuit, ParameterVector
from qiskit.primitives import StatevectorSampler
from qiskit_algorithms.gradients import SPSASamplerGradient
theta = ParameterVector("theta", 2)
qc = QuantumCircuit(2)
qc.ry(theta[0], 0)
qc.ry(theta[1], 1)
qc.cx(0, 1)
qc.measure_all()
sampler = StatevectorSampler()
grad = SPSASamplerGradient(sampler=sampler, epsilon=0.01, batch_size=4, seed=123)
result = grad.run(
circuits=[qc],
parameter_values=[[0.3, -0.2]],
parameters=[[theta[0], theta[1]]],
shots=None,
).result()
print(result.gradients)Core Parameters Explained
SPSAEstimatorGradient
estimator: expectation-value primitiveepsilon: perturbation size, must be positivebatch_size=1: number of SPSA samples to averageseed=None: RNG seedprecision=None: estimator precision overridetranspiler=None,transpiler_options=None: optional transpilation
Run inputs:
circuitsobservablesparameter_valuesparametersprecision
Return fields:
gradientsmetadataprecision
SPSASamplerGradient
sampler: sampling primitiveepsilon: perturbation size, must be positivebatch_size=1: number of SPSA samples to averageseed=None: RNG seedshots=None: sampler shot overridetranspiler=None,transpiler_options=None: optional transpilation
Run inputs:
circuitsparameter_valuesparametersshots
Return fields:
gradientsmetadatashots
Implementation Architecture
Estimator path
SPSAEstimatorGradient._run(...) does the following:
- Normalize
precision. - Optionally transpile circuits and remap observables with circuit layout.
- For each circuit, sample
batch_sizerandom sign vectors. - Build
theta + epsilon * deltaandtheta - epsilon * deltaparameter sets. - Submit one batched estimator job.
- Compute
$$
\frac{f_+ - f_-}{2\epsilon}
$$
and divide by the perturbation signs to recover per-parameter gradients. - Average across the batch and keep only requested parameters.
Simplified reconstruction:
for each circuit:
deltas = random_pm_one_vectors(batch_size)
plus = [theta + epsilon * d for d in deltas]
minus = [theta - epsilon * d for d in deltas]
pubs.append((circuit, observable, plus + minus, precision))
results = estimator.run(pubs).result()
gradient = mean(((f_plus - f_minus) / (2 * epsilon)) / delta over batch)Sampler path
SPSASamplerGradient._run(...) is similar, but works on probability distributions:
- Normalize
shots. - Optionally transpile circuits.
- Build plus/minus parameter sets from random sign vectors.
- Submit one batched sampler job.
- Convert counts to probabilities.
- Compute per-bitstring distribution differences.
- Reconstruct each requested parameter gradient and average over the batch.
Simplified reconstruction:
for each circuit:
deltas = random_pm_one_vectors(batch_size)
plus = [theta + epsilon * d for d in deltas]
minus = [theta - epsilon * d for d in deltas]
pubs.append((circuit, plus + minus, shots))
results = sampler.run(pubs).result()
for each target parameter j:
gradient_j = average_k(diff_distribution_k * deltas[k][j])Understanding the Key Quantum Components
- Parameterized circuit
U(theta) - Estimator objective: expectation value
f(theta) - Sampler objective: output distribution
p_theta(z) - Random perturbation vector
delta in {+1, -1}^d - Quantum evaluations at
theta +/- epsilon * delta - Classical post-processing for gradient recovery
Theory-to-Code Mapping
theta:parameter_valuesdelta: random+1/-1vectors built with NumPy RNGepsilon: constructor argumentepsilonf(theta + epsilon * delta),f(theta - epsilon * delta):plusandminusevaluations- requested parameters:
parameters - batch average: NumPy mean or dict-value averaging
Mathematical Deep Dive
For scalar objective f(theta):
$$
\hat g_i(\theta) = \frac{f(\theta + \epsilon \delta) - f(\theta - \epsilon \delta)}{2\epsilon,\delta_i}
$$
For batch_size = B:
$$
\hat g(\theta) = \frac{1}{B}\sum_{k=1}^B \frac{f(\theta + \epsilon \delta^{(k)}) - f(\theta - \epsilon \delta^{(k)})}{2\epsilon} \odot \delta^{(k)}
$$
Since delta_i in {+1, -1}, dividing by delta_i is equivalent to multiplying by delta_i.
For sampler gradients, the same idea is applied per output bitstring probability.
Hands-On Example
If you request only a parameter subset, the output keeps that order:
result = grad.run(
circuits=[qc],
observables=[obs],
parameter_values=[[0.1, -0.4, 0.7]],
parameters=[[theta[0], theta[2]]],
precision=None,
).result()
print(result.gradients[0])
print(result.metadata[0]["parameters"])Expected behavior:
- output length matches requested parameter count
- output order matches
parameters=[[...]]
Implementing Your Own Version
import numpy as np
def spsa_gradient(eval_fn, theta, epsilon, batch_size, rng):
grads = []
for _ in range(batch_size):
delta = (-1) ** rng.integers(0, 2, len(theta))
f_plus = eval_fn(theta + epsilon * delta)
f_minus = eval_fn(theta - epsilon * delta)
grads.append(((f_plus - f_minus) / (2 * epsilon)) / delta)
return np.mean(np.asarray(grads), axis=0)This matches the core logic of the estimator implementation; the sampler version replaces scalar outputs with sparse probability dictionaries.
Debugging Tips
epsilon <= 0raisesValueError.- In estimator mode, the input lists must have matching lengths.
- Requested parameters must exist in the circuit.
- Too-small
epsiloncan make noise dominate. - Larger
batch_sizereduces variance but increases runtime. - In estimator mode, transpilation changes layout, so observables are remapped.
- In sampler mode, gradients are built from normalized counts, so shot count matters.