Understanding Measurement

Apr 15, 2023   #Measurement  #Observable  #Observer 

I’ve never quite felt that I understood what “measurement” means in quantum mechanics, and my recent engagement with quantum computing surfaced this rather starkly. I’m not talking at the handwaving level of “wavefunction collapse” and such esoteric things. I just want an ordinary, close-up, hand-lens understanding of what measurement is, even if it is incomplete at some level. This post is an attempt at that. If any reader thinks I should read up on X/Y/Z to understand this, I’d much appreciate any links/references.

I’m taking a computational approach to this, treating the computer as a means to conducting virtual “experiments”. So this is essentially a Julia notebook in readable form, also available as a GitHub gist.

using Random, LinearAlgebra
rng = MersenneTwister(1234)


I was looking for a very simple measurement scenario and chose to examine what happens to a single qubit upon measurement.

The idea is to see how a qubit in a random superposition of its two possibilities evolves when “measured” by a random observer that can distinguish between the two qubit states. Towards this, we need the ability to construct a random state, a random observable and a random observer. In this exploration, we’re not initially concerned with the nature of the distribution when creating random states/observables and observers. We’ll get to that later on … if necessary.

Random states

We can make an N-dimensional random state vector by drawing complex components and normalizing them so that \(\sum_{i}|c_i|^2 = 1\), where \(c_i\) are the amplitudes of the basis states that make up the state vector. Note that the constraint implies \(|c_i|^2 \leq 1\) for all \(i\).

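function random_state(rng, N)
    @assert N >= 2
    z = zeros(ComplexF64, N)
    for i in 1:N
        # real and imaginary parts drawn uniformly from [-1, 1)
        a = 2.0 * (rand(rng, Float64) - 0.5)
        b = 2.0 * (rand(rng, Float64) - 0.5)
        z[i] = a + b*im
    end
    # normalize so that the squared magnitudes sum to 1
    return z / sqrt(sum(abs2, z))
end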

> random_state(rng, 4)
4-element Vector{ComplexF64}:
 0.13894291414503196 + 0.4080544353672953im
 0.10130724069115052 - 0.06104772848254118im
 0.44969950485897653 + 0.5416517697082908im
 -0.4579406055551145 - 0.30801068114907093im
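
As a quick sanity check on the normalization constraint, here is a minimal sketch. The helper name `sample_state` is my own stand-in for illustration (not part of the notebook), and it assumes the same uniform sampling described above.

```julia
using Random, LinearAlgebra

# Hypothetical helper: uniform real/imaginary parts in [-1, 1), then normalize.
function sample_state(rng, N)
    z = [2.0 * (rand(rng) - 0.5) + 2.0 * (rand(rng) - 0.5) * im for _ in 1:N]
    return z / norm(z)
end

rng = MersenneTwister(42)
s = sample_state(rng, 4)

total = sum(abs2, s)         # sum of |c_i|^2, should be 1 up to rounding
largest = maximum(abs2, s)   # no single |c_i|^2 can exceed 1
println((total, largest))
```

Since the squared magnitudes are non-negative and sum to 1, each of them is bounded by 1, which is all the note above is saying.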

Random observers

What is a random observer? Let’s focus on observers for single qubits. A single qubit can be either in the \(\ket{0}\) state or in the \(\ket{1}\) state or in general in a superposition of the form \(\ket{q} = q_0\ket{0}+q_1\ket{1}\) where \(|q_0|^2 + |q_1|^2 = 1\). Let’s consider an “ideal observer” represented by a unitary operator \(\Omega\), with an \(n\)-dimensional internal state \(\ket{\omega}\), which behaves like this -

$$ \Omega\ket{\omega}\ket{0} = \ket{\omega_0}\ket{0} $$ and $$ \Omega\ket{\omega}\ket{1} = \ket{\omega_1}\ket{1} $$

Therefore such an \(\Omega\) will transform an arbitrary \(\ket{q}\) according to –

$$ \Omega\ket{\omega}\ket{q} = q_0\ket{\omega_0}\ket{0} + q_1\ket{\omega_1}\ket{1} $$

Such an \(\Omega\) can be thought of as a combination of \(\Omega_0\) and \(\Omega_1\) where –

$$ \Omega_0\ket{\omega}\ket{q} = q_0\ket{\omega_0}\ket{0} + q_1\ket{\omega}\ket{1} $$ and $$ \Omega_1\ket{\omega}\ket{q} = q_0\ket{\omega}\ket{0} + q_1\ket{\omega_1}\ket{1} $$

and we can write -

$$ \Omega = \Omega_1\Omega_0 = \Omega_0\Omega_1 $$
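
One way to see why the order does not matter is to write both factors in block form, with the first block belonging to the qubit state \(\ket{0}\) and the second to \(\ket{1}\). The symbols \(U_0\) and \(U_1\) are my own notation here, assumed to be unitaries on the observer’s \(n\)-dimensional space with \(U_0\ket{\omega} = \ket{\omega_0}\) and \(U_1\ket{\omega} = \ket{\omega_1}\), and \(I\) is the \(n \times n\) identity:

$$ \Omega_0 = \begin{bmatrix} U_0 & \mathbb{0} \\ \mathbb{0} & I \end{bmatrix}, \quad \Omega_1 = \begin{bmatrix} I & \mathbb{0} \\ \mathbb{0} & U_1 \end{bmatrix}, \quad \Omega_1\Omega_0 = \Omega_0\Omega_1 = \begin{bmatrix} U_0 & \mathbb{0} \\ \mathbb{0} & U_1 \end{bmatrix} $$

Each factor only touches its own block, so the product is the same either way.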

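We can therefore pick \(\Omega_0\) and \(\Omega_1\) to be unitary transformations generated by random observables \(H_0\) and \(H_1\), each acting on the observer only when the qubit is in the corresponding state. Embedded in the joint space they act on separate blocks, which is why they commute there.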

The combined observable (a self adjoint matrix) can therefore be written as –

$$ H = \begin{bmatrix} H_0 & \mathbb{0} \\ \mathbb{0} & H_1 \end{bmatrix} $$

function random_observable(rng, N; minmax=(-1.0,1.0))
    # random complex matrix with real and imaginary parts in [-1, 1)
    m = 2.0 * (rand(rng, Float64, (N,N)) + im * rand(rng, Float64, (N,N)) .- (0.5 + im*0.5))
    m = 0.5*(m+m')   # make it self adjoint
    e = eigen(m)
    emin = minimum(real.(e.values))
    emax = maximum(real.(e.values))
    # rescale the spectrum linearly so that it spans minmax
    ev = minmax[1] .+ (e.values .- emin) * ((minmax[2] - minmax[1]) / (emax - emin))
    e.vectors * Diagonal(ev) * inv(e.vectors)
end

> eigen(random_observable(rng, 4))
Eigen{ComplexF64, ComplexF64, Matrix{ComplexF64}, Vector{ComplexF64}}
4-element Vector{ComplexF64}:
 -0.9999999999999992 + 2.865976966174497e-17im
 -0.7017467680014327 + 2.2852080636888333e-17im
 0.03676648290639895 + 3.66067959963041e-17im
  1.0000000000000004 + 3.2140735975966996e-17im
4×4 Matrix{ComplexF64}:
 0.645644+0.0im        -0.140488+0.620761im  …   -0.144119-0.257587im
 0.134682+0.435929im    0.666776+0.0im           -0.133027-0.275052im
 0.288964+0.0572762im   0.313974-0.137567im       0.830699+0.0im
 0.535022+0.0440478im  -0.144123-0.109784im     -0.0465203+0.356796im
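
The tiny imaginary parts in the eigenvalues above are rounding noise from rebuilding the matrix. As a hedged side check, the sketch below builds a random self adjoint matrix in a slightly different way (Gaussian entries wrapped in `Hermitian`, which is my assumption and not the construction used in this post) and confirms that its eigenvalues come out real and that one small time step of evolution is unitary.

```julia
using Random, LinearAlgebra

rng = MersenneTwister(7)
N = 4

# Assumed construction: symmetrize a complex Gaussian matrix.
A = randn(rng, ComplexF64, N, N)
H = Hermitian((A + A') / 2)

eigs = eigvals(H)                      # real-valued for a Hermitian matrix
U = exp(im * 2π * 0.01 * Matrix(H))    # evolution over a small time step
unitarity_error = opnorm(U' * U - I)   # should be at rounding level
println((eigs, unitarity_error))
```

Wrapping the matrix in `Hermitian` lets the eigen solver return real eigenvalues directly, which avoids the stray imaginary parts.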

function random_observer(rng, n; minmax=(-1.0,1.0))
    H0 = random_observable(rng, n; minmax)
    H1 = random_observable(rng, n; minmax)
    Z = zeros(Complex, (n,n))
    # block diagonal: H0 acts when the qubit is |0>, H1 when it is |1>
    H = vcat(hcat(H0, Z), hcat(Z, H1))
    return H
end

> random_observer(rng, 2)
4×4 Matrix{Complex}:
  0.623319+0.0im       -0.239081-0.744522im    …          0+0im
 -0.239081+0.744522im  -0.623319+6.1382e-18im             0+0im
         0+0im                 0+0im              -0.479245+0.812471im
         0+0im                 0+0im              -0.331987-6.7288e-17im


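# conjugate transpose of a state vector (assumed body; the original definition is missing here)
function bra(state)
    state'
end

function braket(state1, state2)
    bra(state1) * state2
end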

Evolution of state due to a random observable

We take a random observable \(H\) and evolve a random state \(\ket{r}\) over time using the unitary operation \(\ket{r(t)} = e^{i2\pi Ht}\ket{r}\). Then we compare the evolved state to the original random state as a function of time using the inner product of the two states, \(\braket{r(t)|r}\). The factor of \(2\pi\) is a convenience so that the eigenvalues of \(H\) can be read as frequencies, i.e. as a spectrum.

function random_evolution(rng, n, T, dt)
    H = random_observable(rng, n)
    r = random_state(rng, n)
    r0 = r
    evolve = exp(im * 2 * pi * dt * H)   # unitary for one time step dt
    result = ComplexF64[]
    for t in 0:dt:T
        push!(result, braket(r, r0))     # overlap <r(t)|r>
        r = evolve * r
    end
    return real.(result), imag.(result)
end

using Plots
plot(random_evolution(rng, 4, 100, 0.01)[1])

Figure: State evolution

That is kind of what you’d expect - a superposition of oscillations at a number of different frequencies. We’d get different frequencies depending on the outcome of the random number generation.

Using an observer

Now what happens to the states if we evolve a random state similarly using an observer? We’ll try to look at the observation process as a series of small unitary transformations and keep track of how the two states of the original measured qubit evolve when considered along with the n states of the observer.

function evolve_observer(rng, n, T, dt)
    H = random_observer(rng, n)
    w = random_state(rng, n)   # observer's initial internal state
    q = random_state(rng, 2)   # qubit state
    r = kron(q,w)              # joint state: [q[1]*w; q[2]*w]
    result = ComplexF64[]
    evolve = exp(im * 2 * pi * H * dt)
    for t in 0:dt:T
        # observer co-states attached to qubit |0> and |1>, each normalized
        costate_0 = r[1:n]
        costate_1 = r[n+1:end]
        costate_0 /= sqrt(sum(abs2, costate_0))
        costate_1 /= sqrt(sum(abs2, costate_1))
        push!(result, braket(costate_0, costate_1))
        r = evolve * r
    end
    return abs.(result)
end

evols = evolve_observer.(Ref(rng), 2 .^ (4:8), 10, 0.1)   # Ref so the RNG is treated as a scalar when broadcasting

Figure: Evolve observer

That looks interesting to me. While initially there is some correlation between the two parts of the state vector \(r_{1\ldots n}\) and \(r_{(n+1)\ldots 2n}\), it rapidly declines to near 0 and lingers there, more tightly the larger \(n\) gets. For small \(n\), we essentially have a small coherent superposition, but the superpositions get complex for observers with larger \(n\). Let’s split the plots into a series and see it more clearly.

tsteps = 0:0.1:100
evols = evolve_observer.(Ref(rng), 2 .^ (4:10), tsteps[end], tsteps[2]-tsteps[1])
layout = (length(evols),1)   # one row per observer size, computed from the new evols
plot(tsteps, [evols...], layout=layout, size=(500,1000))

Figure: Evolve observer stacked

Let’s now zoom in to the first few “seconds” to see what happens there.

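zoom = 1:30   # hypothetical index range for the first few time units; the original definition is not shown
plot(tsteps[zoom], [e[zoom] for e in evols], layout=layout, size=(500,1000))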

Figure: Decoherence process


The inner product of the two co-states drops to near zero after a time of about \(0.5\) for larger \(n\).

The two co-states represent the internal states taken on by the observer for each of the two states of the qubit - i.e. the two possibilities starting with \(\ket{\omega}\ket{0}\) and \(\ket{\omega}\ket{1}\) - and their inner product measures how much those internal states overlap.

I interpret the inner product of the co-states going to 0 as \(n\) becomes larger as the two branches evolving in their own “worlds” and not crossing each other. If they did cross, i.e. if the magnitude of the inner product returned to 1, it would mean that at some point in the future of the observer the qubit and the observer would become separable again.


Given the observer \(H = \begin{bmatrix} H_0 & 0 \\ 0 & H_1 \end{bmatrix}\), where \(H_0\) and \(H_1\) are both self adjoint (i.e. \(H_0 = H_0^\dagger\) and \(H_1 = H_1^\dagger\)), we have –

$$ e^{iHt} = \begin{bmatrix} e^{iH_0t} & \mathbb{0} \\ \mathbb{0} & e^{iH_1t} \end{bmatrix} $$
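
The block structure of the exponential is easy to check numerically. This is a minimal sketch under my own assumptions: `randherm` is a hypothetical helper that symmetrizes a Gaussian matrix, and the values of `n` and `t` are arbitrary.

```julia
using Random, LinearAlgebra

rng = MersenneTwister(3)
n = 3

# Hypothetical helper: a random self adjoint n×n matrix.
randherm(rng, n) = (A = randn(rng, ComplexF64, n, n); (A + A') / 2)

H0 = randherm(rng, n)
H1 = randherm(rng, n)
Z = zeros(ComplexF64, n, n)
H = [H0 Z; Z H1]                                  # block diagonal observer

t = 0.37
lhs = exp(im * t * H)                             # exponential of the whole matrix
rhs = [exp(im * t * H0) Z; Z exp(im * t * H1)]    # exponentials of the blocks
block_error = opnorm(lhs - rhs)
println(block_error)
```

This works because every power of a block diagonal matrix is block diagonal, so the power series of the exponential never mixes the two blocks.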

If we notate the combined state vector of the observer+qubit system as \(\ket{\Psi} = \ket{q} \otimes \ket{\psi} = \begin{bmatrix}q_0\ket{\psi} \\ q_1\ket{\psi}\end{bmatrix}\) where \(\ket{q} = q_0\ket{0} + q_1\ket{1}\) (with \(|q_0|^2 + |q_1|^2 = 1\)) and \(\ket{\psi}\) is the observer’s initial internal state (called \(\ket{\omega}\) earlier), the evolution of the joint state can be written –

$$ e^{iHt}\ket{\Psi} = \begin{bmatrix}q_0e^{iH_0t}\ket{\psi} \\
q_1e^{iH_1t}\ket{\psi}\end{bmatrix} = \begin{bmatrix} q_0\ket{\psi_0(t)} \\
q_1\ket{\psi_1(t)}\end{bmatrix} $$

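So if we’re examining the inner product of the two co-state vectors, we get the following, where \(\Delta H = H_0 - H_1\). Note that merging the two exponentials is exact only when \(H_0\) and \(H_1\) commute; for general random \(H_0\) and \(H_1\) it holds to first order in \(t\), so what follows is a heuristic picture -

$$ \alpha = \braket{\psi_1(t)|\psi_0(t)} = \braket{\psi\left|e^{-iH_1t}e^{iH_0t}\right|\psi} \approx \braket{\psi\left|e^{i(H_0-H_1)t}\right|\psi} = \braket{\psi\left|e^{i\Delta Ht}\right|\psi} $$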

If \(H_0\) and \(H_1\) are very close to each other, then \(\Delta H\) ends up being small – i.e. its eigenvalues end up small relative to those of \(H_0\) and \(H_1\) – and the inner product stays close to \(1.0\). Therefore, for \(H\) to actually represent an observer, it must be able to differentiate between the two qubit states, and so \(\Delta H\) itself must have eigenvalues comparable in magnitude to those of \(H_0\) and \(H_1\).

If we then consider \(\ket{\psi}\) in an eigenbasis \(\ket{\phi_k}\) of \(\Delta H\) with \(\braket{\phi_i|\phi_j} = \delta_{ij}\) and \(\Delta H \ket{\phi_k} = 2\pi f_k\ket{\phi_k}\), then taking \(\ket{\psi} = \sum_k{\beta_k\ket{\phi_k}}\), we see that -

$$ \alpha = \sum_k{|\beta_k|^2e^{i(2\pi f_kt)}} $$
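
The spectral sum can be compared against the matrix exponential directly. The sketch below is only an illustration under my own assumptions: \(\Delta H\) is a random self adjoint matrix built from Gaussian entries, and the names `alpha_spectral` and `alpha_direct` are made up for this check.

```julia
using Random, LinearAlgebra

rng = MersenneTwister(11)
n = 64

# Assumed stand-in for ΔH: a random self adjoint matrix.
A = randn(rng, ComplexF64, n, n)
ΔH = Hermitian((A + A') / 2)
ψ = normalize(randn(rng, ComplexF64, n))

F = eigen(ΔH)
f = F.values ./ (2π)        # "frequencies" f_k, from ΔH|φ_k> = 2π f_k |φ_k>
β = F.vectors' * ψ          # amplitudes β_k = <φ_k|ψ>

# α(t) as a weighted sum of phases, and as <ψ| exp(iΔHt) |ψ>
alpha_spectral(t) = sum(abs2.(β) .* cis.(2π .* f .* t))
alpha_direct(t) = dot(ψ, exp(im * t * Matrix(ΔH)) * ψ)

println((alpha_spectral(0.8), alpha_direct(0.8)))
```

The two functions should agree to rounding error at any `t`, and both equal 1 at `t = 0` because the weights \(|\beta_k|^2\) sum to 1.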

So if a) \(\ket{\psi}\) is spread over enough of that basis and b) the “frequencies” \(f_k\) are different enough, then for \(t \gtrapprox 0\) we have \(\alpha \approx 1\), and as \(t\) increases the phases of the different frequencies get out of step and \(|\alpha|\) falls towards 0, more closely so for larger dimensions \(n \gg 1\). So the behaviour of the observer relative to the qubit states must also not have degenerate eigenvalues, or must at least have a sufficient number of different eigenvalues (“frequencies”). If the observer starts out in a single eigenstate of \(\Delta H\), though, we get \(|\alpha| = 1\) for all \(t\). The interval during which \(|\alpha| \sim 1\) (say \(|\alpha| > 0.5\)) can then be called the “coherence interval” of the observer, beyond which you get “decoherence”.
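
To get a feel for how the residual overlap shrinks with \(n\), here is a rough sketch that skips the matrices altogether and samples the sum of phases directly. Everything in it is an assumption made for illustration: the frequencies are drawn uniformly from \([-1, 1]\), the weights \(|\beta_k|^2\) are random and normalized, and `mean_overlap` is a made-up name.

```julia
using Random, Statistics

# Average of |α(t)| over late times, for n random frequencies and weights.
function mean_overlap(rng, n; times = 5.0:0.1:50.0)
    f = 2 .* rand(rng, n) .- 1           # assumed: f_k uniform in [-1, 1]
    w = rand(rng, n)
    w ./= sum(w)                         # weights |β_k|^2, summing to 1
    α(t) = sum(w .* cis.(2π .* f .* t))  # α(t) = Σ_k |β_k|^2 exp(i 2π f_k t)
    return mean(abs(α(t)) for t in times)
end

rng = MersenneTwister(2023)
small = mean_overlap(rng, 16)
large = mean_overlap(rng, 1024)
println((small, large))
```

With many terms the phases behave like a random walk in the complex plane, so I’d expect the late-time average to fall off roughly like \(1/\sqrt{n}\), though this sketch does not prove that.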

So this is what we’re seeing in the trend plot in the previous section using a random observer starting out at a random state.

All these points correlate well with the mental model of “observer” that I started with, so I’m wondering whether all this should’ve been “obvious” from the get go :) I’m still irked by not being able to see why the Born rule has to hold based on this model of an observer, but that problem remains unsolved for now, and I also get the feeling that I’ll be adding to this post over time.