Two transducers and two pins are all that you need to measure ultrasonic range profiles on the cheap!

I recently wanted to revisit an old project idea - an ultrasonic windspeed monitor. The original project idea started about 10 years ago. However, I was stuck with slow AVR microcontrollers and didn’t have the knowledge to work around that. Thus, the project never went anywhere.

Now - 10 years later and armed with better hardware and a better understanding of signal processing - I decided to revisit the project.

If you want to skip straight to the HOW, click here. The code is available here.

State of the Art

In the maker community, everybody is familiar with the ubiquitous HC-SR04 ultrasonic rangefinder. Triggered by an external signal, the HC-SR04 emits a short pulse of ultrasound - 8 cycles at 40 kHz to be exact. The return of the echo is signalled on another pin.

Looking at the signal levels, 8 cycles at a ±3.3 V drive level at the transmitter yield only ±25 mV at the receiver when the pulse bounces off a wall 62 cm away. We could certainly amplify both the transmit and receive signal, but let’s see how far we can get without extra hardware.

There are two main reasons why the received signal is so weak. First, the ultrasonic transducers are only specified for a very narrow band (39–41 kHz). Because of this, they behave a bit like a resonant circuit: they need time to “ring up” to full amplitude. Eight cycles at 40 kHz last only about 200 µs, which is much shorter than the ~500 µs it takes for the transducers to reach their peak output. Second, with only eight cycles we simply aren’t putting much acoustic energy into the environment, so the echo that comes back is correspondingly faint.

Both problems can be fixed by turning on the transmitter for longer, but then we’d risk masking the return signal with the transmit pulse.

In many ways, ultrasonic ranging is very similar to RADAR - a signal is transmitted and the signal’s reflections are received. The time delay between sending and receiving is proportional to the distance of the reflector, as the signal travels with a finite speed c (299792458 m/s for RADAR, roughly 340 m/s for ultrasound in air). We can measure this time delay and figure out the distance of a target. With this in mind, I decided to test how much we can improve on the HC-SR04 using RADAR signal processing.
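The delay-to-distance relationship is easy to sanity-check in a few lines. This is only a sketch: it assumes the 340 m/s figure quoted above and a round trip (out and back), so the one-way distance is half the path. The function names are made up for illustration.

```python
SPEED_OF_SOUND = 340.0  # m/s, the figure used in the text (varies with temperature)


def echo_delay(distance_m, c=SPEED_OF_SOUND):
    """Round-trip delay in seconds for a reflector at distance_m."""
    return 2.0 * distance_m / c


def echo_distance(delay_s, c=SPEED_OF_SOUND):
    """One-way distance in metres for a measured round-trip delay."""
    return c * delay_s / 2.0


delay = echo_delay(0.62)          # the wall from the HC-SR04 measurement
print(f"{delay * 1e3:.2f} ms")    # about 3.65 ms
```

For the wall at 62 cm, the echo arrives a few milliseconds after transmission, which is long compared to the 200 µs burst but short enough that a much longer pulse would overlap it.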

FMCW

Frequency-Modulated Continuous Wave (FMCW) is a technique widely used in modern radar systems. Instead of sending a short pulse and waiting, FMCW changes the frequency of the transmitted signal over time—typically in a sawtooth-shaped sweep (a “chirp”). Because the frequency is always changing, any echo that returns is slightly shifted in frequency depending on how long it took to come back. This means we can transmit and receive at the same time without the echo being drowned out by the outgoing signal. By comparing the transmit and receive frequencies, we can calculate the delay, and therefore the distance to the target.

Above, you can see the block diagram of a basic FMCW radar. A chirp is generated by a VCO or synthesizer and transmitted. In the receive path, the incoming signal is mixed with the transmit signal. A mixer multiplies two signals with one another, usually using some sort of nonlinear element. By the product-to-sum identity, cos(a)·cos(b) = ½[cos(a − b) + cos(a + b)], this generates two signals with frequencies equal to the sum and difference of the two input frequencies. The difference signal - also called IF (intermediate frequency) - is then isolated with a low-pass filter and processed. The frequency of this IF signal depends on the ramp rate (Hz/s) of the chirp and the delay introduced by signal propagation. With a simple FFT, the frequency content of the IF signal is separated. Each frequency bin of the FFT then corresponds to a specific distance.
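To put numbers on the relationship between ramp rate, delay and IF frequency, here is a small sketch. The IF frequency is the ramp rate multiplied by the round-trip delay. The 2 kHz bandwidth is taken from the 39–41 kHz transducer band mentioned earlier, while the 50 ms chirp duration is purely an assumed example value, not a figure from this project.

```python
def beat_frequency(distance_m, bandwidth_hz, chirp_s, c=340.0):
    """IF (beat) frequency in Hz for a linear chirp and a static reflector."""
    ramp_rate = bandwidth_hz / chirp_s      # Hz per second
    delay = 2.0 * distance_m / c            # round-trip time in seconds
    return ramp_rate * delay


def range_resolution(bandwidth_hz, c=340.0):
    """Spacing between FFT range bins when the FFT spans one full chirp."""
    return c / (2.0 * bandwidth_hz)


# Assumed example values: 2 kHz sweep (39 to 41 kHz) over a 50 ms chirp.
print(beat_frequency(1.5, 2000.0, 0.05))   # IF in Hz for a target at 1.5 m
print(range_resolution(2000.0))            # metres per FFT bin
```

Note that the bin spacing depends only on the swept bandwidth, not on the chirp duration: a longer chirp lowers the IF frequency and narrows the FFT bins by the same factor.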

Building a mixer for 40 kHz is fairly easy. A simple analog switch, chopping up the signal, could serve this purpose. We can filter out any resulting harmonic components. However, this would require external circuitry. Inspired by Charles’ LoLRa project, I wanted to see if I could get away with aliasing instead.

Aliasing occurs when a signal is sampled at a rate that’s not high enough to capture its actual frequency content. Frequencies above half the sampling rate get “folded” back into lower frequencies. This effect is useful for us: the folding behaves similarly to what an analog mixer does - it creates a difference frequency between the transmitted and received signals. Instead of building a physical mixer, we can let the sampling process create that difference frequency for us. The trade-off is that the sampling frequency now changes with the chirp, but because our chirp bandwidth is small (about 5%) we can treat it as approximately constant during processing.
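The folding effect can be demonstrated numerically. The sketch below uses a fixed sample rate and made-up example frequencies (a 40.5 kHz tone sampled at 40 kHz). In the real setup the sample clock follows the chirp, so treat this as an illustration of the principle rather than a model of the firmware.

```python
import math


def alias_frequency(f_signal, f_sample):
    """Apparent frequency (Hz) of a real tone after sampling at f_sample."""
    return abs(f_signal - f_sample * round(f_signal / f_sample))


f_signal, f_sample = 40_500.0, 40_000.0        # assumed example values
f_alias = alias_frequency(f_signal, f_sample)  # 500.0

# The sampled points of the fast tone and of its alias are identical.
fast = [math.cos(2 * math.pi * f_signal * n / f_sample) for n in range(80)]
slow = [math.cos(2 * math.pi * f_alias * n / f_sample) for n in range(80)]
max_diff = max(abs(a - b) for a, b in zip(fast, slow))
print(f_alias, max_diff < 1e-9)
```

After sampling, the 40.5 kHz tone is indistinguishable from a 500 Hz one, which is exactly the difference frequency a mixer fed with 40 kHz would have produced.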

At only 40 kHz, a modern microcontroller could also simply sample the whole signal to memory. Even the RP2040’s ADC has a maximum sample rate of 500 kHz, well above the 80 kHz required for this project. However, the amount of data created is much smaller if we only have to sample the intermediate frequency instead of the carrier frequency.

Implementation

Signal Generation

The chirp is generated by one of the RP2040’s PIOs (I can’t overstate how much I love the PIOs! I’ve used them in both personal and professional projects and they never cease to amaze me).

More precisely, a single PIO state machine produces the waveform. A Python script precomputes an array of time intervals that describe how long each output level should last as the frequency ramps upward. A DMA channel streams this list into the PIO so the state machine can toggle the output pins without CPU involvement. Each interval is repeated 16 times to reduce the size of the lookup table while still giving the chirp a smooth shape.
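A rough sketch of how such a table could be precomputed is shown below. It is not the project's actual script. The 125 MHz clock, the 128-entry table, the linear 39 to 41 kHz ramp and the choice of storing half-period durations in clock ticks are all assumptions for illustration. A real PIO program would also need to subtract its own loop overhead from each value.

```python
def chirp_intervals(f_start, f_stop, n_entries, f_clk=125_000_000):
    """Half-period durations in clock ticks for a stepped linear up-chirp.

    Each entry is meant to be reused for several output toggles, so the
    table holds n_entries frequency steps rather than one value per edge.
    """
    table = []
    for k in range(n_entries):
        # Linearly interpolate the frequency of step k.
        f_k = f_start + (f_stop - f_start) * k / (n_entries - 1)
        table.append(round(f_clk / (2.0 * f_k)))
    return table


table = chirp_intervals(39_000.0, 41_000.0, n_entries=128)
print(table[0], table[-1])   # ticks per half period at 39 kHz and 41 kHz
```

Because each step lasts a fixed number of toggles rather than a fixed time, the ramp is only approximately linear in time. Over a 5% sweep that deviation is small.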

The PIO also triggers an interrupt at every transition and halfway between transitions. This gives us four evenly spaced ADC samples per cycle - enough to reconstruct both I and Q components and cancel out DC offsets later in processing.

Receiver

Unfortunately, the RP2040 does not offer direct peripheral-to-peripheral trigger connections, as many STM32s do. This means we have to involve the CPU to start the ADC conversions in the PIO interrupt routine. Involving CPUs in time-sensitive tasks always risks introducing jitter, but at 40 kHz and with the second core fully dedicated to the ISR I’m not too worried about that. If jitter were an issue, it should also be possible to use a DMA channel between the PIO’s RX FIFO and the ADC’s CS register, but I didn’t test that.

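#include <stdbool.h>
#include <stdint.h>
#include "hardware/adc.h"
#include "hardware/pio.h"

// CHIRP_LENGTH and pio_chirp are defined elsewhere in the project.
#define SAMPLES_PER_BIN 32  // 8 carrier cycles x 4 samples per cycle

static int16_t adc_buffer_i[CHIRP_LENGTH];
static int16_t adc_buffer_q[CHIRP_LENGTH];

static volatile uint32_t pio_count = 0;

static void pio_irq_handler(void) {
    pio_interrupt_clear(pio_chirp, 0);

    // Blocking 12-bit conversion; tolerable only because this core does nothing else.
    int16_t value = (int16_t)adc_read();

    // Sample sequence within each carrier cycle: +I +Q -I -Q
    bool is_q = (pio_count & 1) != 0;
    int16_t *buffer = is_q ? adc_buffer_q : adc_buffer_i;

    bool is_negative = (pio_count & 2) != 0;
    value = is_negative ? -value : value;

    // Ignore samples past the end of the buffers instead of overrunning them.
    uint32_t bin = pio_count / SAMPLES_PER_BIN;
    if (bin < CHIRP_LENGTH) {
        buffer[bin] += value;
    }

    pio_count += 1;
}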

The interrupt routine reads a single ADC sample every time the PIO requests one. The samples arrive in a fixed pattern: +I, +Q, -I, -Q, repeating. By flipping the sign of the negative samples, we effectively double the signal while removing any constant offset.
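To see why the sign flipping removes the offset, the ISR's bookkeeping can be mirrored on the host. The following is a sketch with synthetic input. The signal amplitude, phase and DC offset are made-up test values, and `accumulate_iq` is a hypothetical helper, not part of the project code.

```python
import math


def accumulate_iq(samples, samples_per_bin=32):
    """Mirror of the ISR: fold a +I +Q -I -Q sample stream into I/Q bins."""
    n_bins = len(samples) // samples_per_bin
    i_buf = [0.0] * n_bins
    q_buf = [0.0] * n_bins
    for count, value in enumerate(samples[: n_bins * samples_per_bin]):
        if count & 2:                 # third and fourth sample of a cycle
            value = -value
        target = q_buf if count & 1 else i_buf
        target[count // samples_per_bin] += value
    return i_buf, q_buf


# Synthetic input: unit-amplitude carrier, 30 degree phase, large DC offset.
phase, dc = math.radians(30.0), 2048.0
samples = [math.cos(math.pi / 2 * n + phase) + dc for n in range(64)]
i_buf, q_buf = accumulate_iq(samples)
print(i_buf[0], q_buf[0])   # roughly 16*cos(30 deg) and -16*sin(30 deg)
```

The offset of 2048 disappears completely, and each bin carries 16 times the carrier amplitude, split between I and Q according to the phase.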

It is bad practice to use blocking functions such as adc_read() in an interrupt service routine, but since I can dedicate an entire core to this task, I can get away with it. Before this solution, I tried only starting the ADC (hw_set_bits(&adc_hw->cs, ADC_CS_START_ONCE_BITS);) and reading the samples with the DMA, but that turned out to be less stable and occasionally missed samples.

Since each carrier cycle gives us two I and two Q samples and we accumulate over eight complete cycles, every entry in the I and Q buffers sums 16 samples. While we do end up sampling the signal fast enough to fully reconstruct it, the concept would also work with far fewer samples by skipping cycles entirely. Using aliasing to fold down our signal of interest close to DC, we can simply average multiple ADC samples together instead of performing resource-intensive FFTs on-chip.

Finally, the raw samples are sent out using the Pico’s stdio_usb.

In my experiments, the transmitter transducer was wired between two PIO output pins so it could be driven differentially, which gives a bit more output power. However, the transmitter can be connected between one pin and ground as well.

The receiver transducer can be biased at mid-supply using a simple resistor divider so that the ADC can read both the positive and negative swing around VCC/2. However, since I observed no significant improvement in performance, I’ll stick with connecting one lead to ground and truly have no external components besides the transducers.

Processing

A small Python script receives the raw I/Q samples over USB and converts them into complex values. Looking at the raw I and Q data, we can see a fairly clean signal reflected from the wall 25 cm away. The transducers’ limited bandwidth already takes care of windowing, so we can feed the waveform straight through an FFT - neat! Taking the complex Fourier transform and adjusting the axes accordingly reveals a very clear peak at 0.25 m, along with weaker frequency components that are not visible in the raw data.
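The processing chain can be sketched in plain Python as follows. This is not the project's script. It uses a naive DFT from the standard library instead of an FFT and assumes a 2 kHz swept bandwidth and that the transform spans exactly one chirp, in which case bin k maps to a distance of k·c/(2B). Depending on the I/Q sign convention, real targets may show up in the upper half of the spectrum instead.

```python
import cmath


def dft(x):
    """Plain O(N^2) discrete Fourier transform; fine for short chirps."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def range_profile(i_buf, q_buf, bandwidth_hz=2000.0, c=340.0):
    """Return (distance_m, magnitude) pairs, one per positive-frequency bin."""
    iq = [complex(i, q) for i, q in zip(i_buf, q_buf)]
    spectrum = dft(iq)
    bin_m = c / (2.0 * bandwidth_hz)        # metres per bin
    return [(k * bin_m, abs(spectrum[k])) for k in range(len(iq) // 2)]


# Synthetic check: a single tone in bin 3 stands in for one reflector.
n = 64
tone = [cmath.exp(2j * cmath.pi * 3 * t / n) for t in range(n)]
profile = range_profile([z.real for z in tone], [z.imag for z in tone])
peak_m, _ = max(profile, key=lambda p: p[1])
print(f"{peak_m:.3f} m")   # bin 3 * 0.085 m per bin
```

With these assumed numbers each bin covers 8.5 cm, so a reflector at about 25 cm lands in the third bin.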

For a more complex target, I move the setup back and place some 3D printed acoustic corner reflectors in the scene. The raw IQ data looks fairly complex, but in the FFT we can clearly see three distinct targets.

As a final example, I’m aiming the setup from the desk into my room. The raw signal amplitudes are very small, but in the FFT we can still clearly see some targets. First, there is a small DC component from crosstalk between TX and RX - either electrical or acoustic in nature. The next target at 1.5 m is my bed with a bit of clutter. Finally, the target at 3.5 m is the wall on the opposite side of the room.

Final thoughts

Comparing this solution with the HC-SR04, we’re using far fewer parts and getting a full range plot instead of a single echo.

I’m pretty impressed that even at amplitudes of less than 10 mV, we’re still seeing targets clearly. With some additional amplification, I’m sure this concept can be extended to work well beyond 4 m.

One more aspect to try in the future is chirp sequences. By sending many chirps in succession and performing some additional processing (basically just throwing another FFT at the problem), we can get full range-doppler plots, resolving targets in both distance and relative speed.
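As a rough idea of what that extra FFT looks like, here is a sketch of range-Doppler processing on synthetic data. Nothing here has been run against the hardware. The array sizes and the simulated target are invented, and a naive DFT stands in for a proper FFT.

```python
import cmath


def dft(x):
    """Plain O(N^2) discrete Fourier transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def range_doppler(chirps):
    """chirps[m][t] is complex IF sample t of chirp m; returns map[doppler][range]."""
    range_rows = [dft(row) for row in chirps]        # along each chirp: range
    n_range = len(range_rows[0])
    # Across chirps, one transform per range bin: Doppler.
    columns = [dft([row[r] for row in range_rows]) for r in range(n_range)]
    # Transpose so the result is indexed [doppler_bin][range_bin].
    return [[columns[r][d] for r in range(n_range)] for d in range(len(chirps))]


# Synthetic target: range bin 2, phase advancing by one Doppler bin over the sequence.
n_chirps, n_samples = 8, 16
chirps = [[cmath.exp(2j * cmath.pi * (2 * t / n_samples + 1 * m / n_chirps))
           for t in range(n_samples)] for m in range(n_chirps)]
rd = range_doppler(chirps)
peak = max(((d, r) for d in range(n_chirps) for r in range(n_samples)),
           key=lambda idx: abs(rd[idx[0]][idx[1]]))
print(peak)   # (1, 2)
```

The first transform sorts echoes by distance, and the second picks up the chirp-to-chirp phase rotation caused by a moving target.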

However, this is as far as I’ll take the project for now. I’ll instead concentrate on the original goal - measuring wind speed with ultrasound.