Introduction
I first learned about shaped tone burst testing from Siegfried Linkwitz’s website 20+ years ago: https://www.linkwitzlab.com/mid_dist.htm
After studying burst testing I took it one step further with stepped sweep burst testing. Time analysis of speaker tone burst signals reveals stored energy that is released after the input signal has stopped. This stored energy can be caused by the driver, cabinet, or room.
Here is a program I created to test tone bursts and stepped sweep tone bursts. You can read about some of its functions below; it will be expanded soon. If you decide to check it out, please remember that this program is not digitally signed, so your browser will throw a warning when you try to download it. I don't want to pay to get it signed, since the program is free to use.
1. PRN Sync Signal
The sweep file uses a PRN (Pseudorandom Noise) signal based on a Maximum Length Sequence (MLS) for robust synchronization, replacing the older two-pulse approach.
Sweep Generation (-sweepwav)
- 100 ms MLS — a deterministic PRN from a 31-bit LFSR with taps at bits 31 and 28
- PRN windowed with Hanning fade in/out (10 ms each)
- Amplitude of −3 dB to leave headroom for tone burst peaks
- Broadband energy sweep that works with any speaker type
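The generation steps above can be sketched in a few lines of Python. This is only an illustration, not the program's actual code: the function name `mls_prn`, the seed value, the 96 kHz sample rate, and the choice of a Fibonacci-style shift register are all assumptions. Only the register length, tap positions, duration, fade time, and level come from the list above.

```python
import math

def mls_prn(sample_rate=96_000, duration_s=0.100, fade_s=0.010,
            level_db=-3.0, seed=0x2AAAAAAA):
    """Return a faded, level-scaled +/-1 PRN burst as a list of floats.

    The seed is an arbitrary non-zero value chosen for this sketch.
    """
    state = seed & 0x7FFFFFFF              # 31-bit shift register
    if state == 0:
        raise ValueError("LFSR seed must be non-zero")
    n = int(round(sample_rate * duration_s))
    n_fade = int(round(sample_rate * fade_s))
    gain = 10.0 ** (level_db / 20.0)       # -3 dB is roughly 0.708
    out = []
    for i in range(n):
        # Feedback is bit 31 XOR bit 28 (1-indexed), as listed above.
        bit = ((state >> 30) ^ (state >> 27)) & 1
        state = ((state << 1) | bit) & 0x7FFFFFFF
        # Raised-cosine (Hann) fade in and fade out, flat in between.
        edge = min(i, n - 1 - i)
        if edge < n_fade:
            w = 0.5 * (1.0 - math.cos(math.pi * edge / n_fade))
        else:
            w = 1.0
        out.append(gain * w * (1.0 if bit else -1.0))
    return out

prn = mls_prn()   # 9,600 samples at the assumed 96 kHz rate
```

Because the register is seeded with a fixed value, calling `mls_prn()` twice gives the same sequence, which is what lets the analysis side regenerate an identical reference.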
DUT Analysis (-dutsweepwav)
- Regenerates the identical PRN MLS reference sequence
- Uses cross-correlation to find the sync position
- Two-pass search: coarse (every 10 samples) then fine refinement
- Detects sync error if correlation exceeds 10% of expected value
- Works even with band-limited playback (tweeters, etc.)
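Here is a rough sketch of a two-pass cross-correlation search. It is not the program's code. In particular, "every 10 samples" is interpreted here as a coarse pass that evaluates each lag using only every 10th reference sample, followed by a full-resolution pass around the best coarse lag. The program may decimate differently. The names `correlate_at` and `find_sync` and the synthetic test signal are made up for the example.

```python
import random

def correlate_at(recording, reference, lag, step=1):
    """Dot product of the reference with the recording at `lag`,
    using every `step`-th reference sample."""
    return sum(reference[i] * recording[lag + i]
               for i in range(0, len(reference), step))

def find_sync(recording, reference, coarse_step=10):
    """Return (best_lag, correlation_at_best_lag)."""
    max_lag = len(recording) - len(reference)
    # Pass 1 (coarse): cheap estimate at every lag from 1 in `coarse_step` samples.
    coarse = max(range(max_lag + 1),
                 key=lambda lag: correlate_at(recording, reference, lag, coarse_step))
    # Pass 2 (fine): full correlation in a small neighbourhood of the coarse hit.
    lo = max(0, coarse - coarse_step)
    hi = min(max_lag, coarse + coarse_step)
    best = max(range(lo, hi + 1),
               key=lambda lag: correlate_at(recording, reference, lag))
    return best, correlate_at(recording, reference, best)

# Synthetic check: bury a +/-1 reference at a known offset in low-level noise.
rng = random.Random(0)
reference = [rng.choice((-1.0, 1.0)) for _ in range(1000)]
recording = [rng.uniform(-0.1, 0.1) for _ in range(3000)]
true_offset = 737
for i, r in enumerate(reference):
    recording[true_offset + i] += 0.5 * r

sync, peak = find_sync(recording, reference)
```

The returned peak value can then be compared against the value expected for a clean recording to decide whether the sync is trustworthy.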
The PRN has several advantages. Its broadband energy is spread across all frequencies, so it works with tweeters, woofers, or full-range drivers. It is robust to filtering because cross-correlation still works after EQ. Its low crest factor prevents clipping, and the deterministic sequence is identical every run.
ETC Output Layout
2. Tone Burst Bandwidth
The bandwidth of a tone burst is determined by its duration, not its frequency. For a 4-cycle burst the duration is T = 4 / f and the approximate −3 dB bandwidth is BW ≈ f / 4 — roughly 25% of the center frequency (±12.5%).
The Blackman window reduces spectral leakage (sidelobes about 58 dB down) but slightly widens the main lobe, so the actual −3 dB bandwidth is closer to f / 3.5, or about 29% of center frequency.
| Frequency | Duration | Bandwidth (−3 dB) |
|---|---|---|
| 100 Hz | 40 ms | ~25 Hz |
| 1 kHz | 4 ms | ~250 Hz |
| 10 kHz | 0.4 ms | ~2.5 kHz |
| 20 kHz | 0.2 ms | ~5 kHz |
This is why tone bursts are useful for transient / time-domain analysis — they are short enough to measure time-domain behavior but have enough bandwidth to excite the system meaningfully. The trade-off is that you cannot isolate a single frequency the way you would with a long steady-state sine wave.
For speaker measurements, this bandwidth is actually desirable since it shows how the speaker responds to transient signals across a range of frequencies simultaneously, which is more representative of real audio content (music, speech) than pure tones.
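To make the duration and bandwidth relationships concrete, here is a small sketch that builds a 4-cycle Blackman-windowed burst and reports the rule-of-thumb numbers from the table. The function names and the 96 kHz sample rate are assumptions for the example. The window is the standard Blackman formula, which may differ in detail from what the program uses.

```python
import math

def blackman_burst(freq_hz, sample_rate=96_000, cycles=4, amplitude=1.0):
    """Return a Blackman-windowed sine burst of `cycles` periods."""
    n = max(2, int(round(sample_rate * cycles / freq_hz)))   # T = cycles / f
    burst = []
    for i in range(n):
        phase = 2.0 * math.pi * i / (n - 1)
        # Standard Blackman window coefficients.
        window = 0.42 - 0.5 * math.cos(phase) + 0.08 * math.cos(2.0 * phase)
        tone = math.sin(2.0 * math.pi * freq_hz * i / sample_rate)
        burst.append(amplitude * window * tone)
    return burst

def burst_stats(freq_hz, cycles=4):
    """Return (duration_s, bandwidth_f_over_4, bandwidth_f_over_3_5)."""
    duration_s = cycles / freq_hz
    return duration_s, freq_hz / 4.0, freq_hz / 3.5

burst = blackman_burst(1000.0)            # 4 ms burst at 1 kHz
duration, bw_simple, bw_windowed = burst_stats(1000.0)
```

At 1 kHz this gives a 4 ms burst with an estimated bandwidth of 250 Hz from the f / 4 rule, or roughly 286 Hz using the f / 3.5 figure for the windowed burst.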
4-Cycle Blackman-Windowed Reference Burst
3. Optimal Octave Spacing
For adjacent burst spectra to "just touch" at their −3 dB points, we need the upper −3 dB edge of one burst to meet the lower −3 dB edge of the next.
Bandwidth ≈ 29% of center frequency (±14.5%). Upper −3 dB edge: f × 1.145. Lower −3 dB edge: f × 0.855.
Derivation
f₂ / f₁ = 1.145 / 0.855 = 1.339
2^(1/n) = 1.339 → 1/n = log₂(1.339) = 0.421 → n ≈ 2.37
| Setting | Step Ratio | Coverage |
|---|---|---|
| -sweepoct 2 (½ octave) | 1.414 | Slight gaps between bursts |
| Optimal ≈ 1/2.4 octave | 1.339 | Spectra just touch |
| -sweepoct 3 (⅓ octave) | 1.260 | Slight overlap (~6%) |
Since -sweepoct takes integers, use -sweepoct 3 or higher for continuous frequency coverage without gaps.
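The derivation is easy to check numerically. The sketch below recomputes the "just touching" step ratio and the matching fractional-octave value from the 29% bandwidth figure, then classifies a given -sweepoct setting. The helper names are made up for this example.

```python
import math

def touching_step(bandwidth_fraction=0.29):
    """Return (step_ratio, n) where adjacent -3 dB edges just meet
    and the step is 1/n octave."""
    half = bandwidth_fraction / 2.0
    ratio = (1.0 + half) / (1.0 - half)     # upper edge / lower edge
    n = 1.0 / math.log2(ratio)              # from 2^(1/n) = ratio
    return ratio, n

def coverage(sweepoct, bandwidth_fraction=0.29):
    """Classify a 1/sweepoct-octave step as leaving gaps or overlapping."""
    ratio, _ = touching_step(bandwidth_fraction)
    step = 2.0 ** (1.0 / sweepoct)
    return "gap" if step > ratio else "overlap"

ratio, n = touching_step()
half_octave = coverage(2)
third_octave = coverage(3)
```

With the 29% figure this gives a ratio of about 1.339 and n of about 2.37, with ½ octave steps leaving gaps and ⅓ octave steps overlapping, in line with the table.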
4. Parabolic Interpolation for Sub-Sample PRN Detection
Test Signal Structure
The Problem: Integer Sample Resolution
Cross-correlation finds where the PRN best matches, but only at integer sample positions. The true peak of the correlation function almost always falls between samples. A 1-sample error in a 770,400-sample span produces a 1.3 PPM apparent drift.
The Solution: Fit a Parabola
A parabola closely approximates the correlation peak. We use the three samples surrounding the integer maximum — y0, y1 (the peak), and y2 — to calculate where the true peak lies.
The Math
For a parabola passing through points at x = −1, 0, +1 the vertex (peak) occurs at:
δ = (y0 − y2) / (2 × (y0 − 2·y1 + y2))
Real Example (End PRN Detection)
y0 (lag−1) : 527,410,999
y1 (lag peak): 529,429,464 ← highest correlation
y2 (lag+1) : 527,361,859
y0 − y2 = 49,140
y0 − 2·y1 + y2 = −4,086,070
δ = 49,140 / (2 × −4,086,070) = −0.006013
Integer position : 866,400
Sub-sample offset : −0.006013
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Precise position : 866,399.993987
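The worked example above can be reproduced with a very small function. This is a sketch of the standard three-point parabolic peak fit. The function name is made up, and the guard for a flat (zero-curvature) triple is an addition for safety rather than something described on this page.

```python
def parabolic_offset(y0, y1, y2):
    """Sub-sample offset of the parabola vertex through
    (-1, y0), (0, y1), (+1, y2). Result is in samples, relative to x = 0."""
    denom = y0 - 2.0 * y1 + y2
    if denom == 0:
        return 0.0              # flat triple: no curvature to fit
    return (y0 - y2) / (2.0 * denom)

# Correlation values from the End PRN example above.
delta = parabolic_offset(527_410_999, 529_429_464, 527_361_859)
precise_position = 866_400 + delta
```

The offset comes out slightly negative because y0 is a little larger than y2, so the true peak sits just before the integer maximum.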
Precision Improvement
Integer detection only:
Start PRN: 96,000
End PRN: 866,400
Span: 770,400 samples
Error: ~1 sample
Drift: ~1.3 PPM
With parabolic interpolation:
Start PRN: 96,000.00513
End PRN: 866,399.993987
Span: 770,399.988857 samples
Error: 0.011 samples
Drift: ~0.014 PPM
| Detection Method | Resolution | Min Detectable Drift |
|---|---|---|
| Integer only | 1 sample | ~1.3 PPM |
| Parabolic interpolation | ~0.01 sample | ~0.013 PPM |
Typical consumer audio drift is 20–100+ PPM; professional gear is 1–20 PPM. With parabolic interpolation the measurement resolution is fine enough to quantify clock drift even in professional-grade gear.
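For reference, the drift numbers in this section come from simple arithmetic on the two PRN positions. A minimal sketch follows. The function names are made up, and the sign convention (negative meaning the measured span is shorter than expected) is an assumption.

```python
def drift_ppm(start_pos, end_pos, expected_span):
    """Clock drift in parts per million from measured PRN positions (in samples)."""
    measured_span = end_pos - start_pos
    return (measured_span - expected_span) / expected_span * 1e6

def resolution_ppm(position_error_samples, span):
    """Smallest drift distinguishable for a given position uncertainty."""
    return position_error_samples / span * 1e6

integer_resolution = resolution_ppm(1.0, 770_400)
interpolated_drift = drift_ppm(96_000.00513, 866_399.993987, 770_400)
```

Here `integer_resolution` is about 1.3 PPM, and the interpolated positions give a drift magnitude of about 0.014 PPM, matching the figures above.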
5. |Diff| % Metric
The |Diff| % metric compares the Energy Time Curve (ETC) of the Device Under Test (DUT) to an ideal 4-cycle Blackman-windowed reference burst. It measures how much the burst "spreads" or "rings" compared to the ideal.
Step 1 — Record the Signals
The reference is the ideal 4-cycle Blackman burst. The DUT signal is recorded through the speaker and room, and may show additional ringing after the main burst.
Step 2 — Compute ETC (Envelope via Hilbert Transform)
The Hilbert Transform extracts the amplitude envelope, which is then converted to dB scale (0 dB = peak).
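As an illustration of this step, the sketch below builds the analytic signal by zeroing the negative-frequency half of the spectrum, takes its magnitude as the envelope, and converts to dB relative to the peak. To stay self-contained it uses a plain O(n²) DFT from the standard library, which is only practical for short signals. Real code would normally use an FFT (for example `scipy.signal.hilbert`). The function names and the −120 dB display floor are assumptions.

```python
import cmath
import math

def analytic_signal(x):
    """Analytic signal of a real sequence via a naive DFT (O(n^2))."""
    n = len(x)
    spectrum = [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
                for k in range(n)]
    for k in range(n):
        if k == 0 or (n % 2 == 0 and k == n // 2):
            gain = 1.0          # DC and Nyquist bins are kept as-is
        elif k < (n + 1) // 2:
            gain = 2.0          # positive frequencies are doubled
        else:
            gain = 0.0          # negative frequencies are removed
        spectrum[k] *= gain
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def envelope(x):
    """Linear amplitude envelope (magnitude of the analytic signal)."""
    return [abs(z) for z in analytic_signal(x)]

def etc_db(x, floor_db=-120.0):
    """Envelope in dB with 0 dB at the peak, clamped at `floor_db`."""
    env = envelope(x)
    peak = max(env)
    if peak == 0:
        return [floor_db] * len(env)
    return [max(floor_db, 20.0 * math.log10(e / peak)) if e > 0 else floor_db
            for e in env]

# A steady sine with a whole number of cycles should have a flat envelope.
test_tone = [math.sin(2.0 * math.pi * 8 * t / 128) for t in range(128)]
tone_envelope = envelope(test_tone)
tone_etc = etc_db(test_tone)
```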
Step 3 — Calculate Area Above −40 dB Threshold
We integrate the linear envelope (not dB) above the −40 dB threshold. The shaded regions below represent the areas being compared.
Step 4 — Calculate |Diff| %
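|Diff| % = |DUT area − Reference area| / Reference area × 100%
Example, with a DUT area of 19.5 and a reference area of 17.0:
= |19.5 − 17.0| / 17.0 × 100% = 2.5 / 17.0 × 100% = 14.7%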
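Steps 3 and 4 can be sketched as follows. Several details here are guesses rather than the program's confirmed behavior: the threshold is taken relative to each envelope's own peak, the "area" is the sum of linear envelope samples at or above the threshold, and the difference is normalized by the reference area. The function names are made up.

```python
def area_above_threshold(env, threshold_db=-40.0, dt=1.0):
    """Sum the linear envelope samples at or above `threshold_db` re: peak."""
    peak = max(env)
    threshold = peak * 10.0 ** (threshold_db / 20.0)
    return sum(e for e in env if e >= threshold) * dt

def diff_from_areas(dut_area, ref_area):
    """|Diff| % from two precomputed areas."""
    if ref_area == 0:
        raise ValueError("reference area must be non-zero")
    return abs(dut_area - ref_area) / ref_area * 100.0

def diff_percent(dut_env, ref_env, threshold_db=-40.0):
    """|Diff| % from two linear envelopes."""
    return diff_from_areas(area_above_threshold(dut_env, threshold_db),
                           area_above_threshold(ref_env, threshold_db))

example = diff_from_areas(19.5, 17.0)   # the numbers from the example above

# Toy envelopes: the DUT has an extra decaying tail above -40 dB.
ref_env = [0.0, 1.0, 0.5, 0.001]
dut_env = [0.0, 1.0, 0.5, 0.25, 0.001]
toy = diff_percent(dut_env, ref_env)
```

In the toy case the 0.001 samples fall below the −40 dB threshold (0.01 of peak) and are ignored, so only the extra 0.25 tail sample contributes to the difference.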
Visual Summary
Interpretation Guide
| \|Diff\| % | Rating | Interpretation |
|---|---|---|
| 0–5% | Excellent | Minimal ringing or resonance |
| 5–15% | Good | Some coloration |
| 15–30% | Fair | Noticeable resonance |
| 30–50% | Poor | Significant ringing |
| >50% | Bad | Severe resonance issues |
Common contributors to a high |Diff| % are speaker resonance (the cone continues moving after the signal stops), room reflections (sound bouncing back to the microphone), cabinet vibrations (enclosure ringing), and port turbulence (in ported speakers).
6. Noise Floor Measurement
The noise floor is measured from a dedicated 500 ms silence window located just before the End PRN marker.
Measurement Window Location
Step 1 — Locate the Window
noiseFloorSilence = SampleRate × 0.500 # 48,000 samples at 96 kHz
noiseFloorEndSample = endPrnPosition
noiseFloorStartSample = endPrnPosition − noiseFloorSilence
Step 2 — Calculate RMS
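RMS (Root Mean Square) is the square root of the mean of the squared sample values in the noise window, which reflects the average power of the signal there: RMS = √( (1/N) × Σ x[n]² ).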
Step 3 — Convert to dBFS
Convert the RMS value to decibels relative to full scale (maximum possible digital value).
For 24-bit audio: Full Scale Max = 8,388,607 (2²³ − 1)
Example: RMS = 3,340
dBFS = 20 × log₁₀( 3,340 / 8,388,607 )
= 20 × log₁₀( 0.000398 )
= 20 × (−3.4)
= −68 dBFS
Step 4 — Per-Burst Relative Noise Floor
For each burst's mini ETC graph, the noise floor is calculated relative to that burst's peak amplitude, giving the signal-to-noise ratio for that specific frequency.
Example: 20 × log₁₀( 3,340 / 6,700,000 ) = 20 × log₁₀( 0.000499 ) = −66 dB
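Steps 1 through 4 fit in a short sketch. The function names are made up, samples are assumed to be signed integers in 24-bit range, and returning negative infinity for a perfectly silent window is a choice made for this example.

```python
import math

FULL_SCALE_24BIT = 2 ** 23 - 1      # 8,388,607

def rms(samples):
    """Root mean square of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def to_db(value, reference):
    """20*log10(value / reference); -inf for a zero value."""
    if value == 0:
        return float("-inf")
    return 20.0 * math.log10(value / reference)

def noise_floor_dbfs(samples, end_prn_position, sample_rate=96_000,
                     window_s=0.500, full_scale=FULL_SCALE_24BIT):
    """RMS of the silence window just before the End PRN, in dBFS."""
    n = int(sample_rate * window_s)
    start = end_prn_position - n
    if start < 0:
        raise ValueError("silence window starts before the beginning of the file")
    return to_db(rms(samples[start:end_prn_position]), full_scale)

absolute_db = to_db(3_340, FULL_SCALE_24BIT)   # Step 3 example
relative_db = to_db(3_340, 6_700_000)          # Step 4 example

# Tiny synthetic check: a constant-level "noise" window of 50 samples.
fake = [0] * 50 + [3_340] * 50
window_db = noise_floor_dbfs(fake, end_prn_position=100, sample_rate=100)
```

The same `to_db` helper serves both the absolute figure (reference is full scale) and the per-burst figure (reference is that burst's peak amplitude).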
Two Different Noise Floor Values
Absolute noise floor: "Noise Floor: −58.1 dBFS (measured from silence window)"
Measured relative to full scale (0 dBFS). Same value for the entire file. Tells you the actual noise level.
Relative noise floor: orange dashed line at −XX dB on each mini ETC graph.
Measured relative to each burst's peak amplitude. Different per burst (lower freq = higher amplitude). Shows SNR for that burst.
The silence window captures system noise without signal interference. It is located after all bursts so that no burst energy leaks in, and before the End PRN so that the PRN doesn't corrupt the measurement. 500 ms provides enough samples for a stable RMS measurement.
Noise sources captured: ADC quantization noise, preamp/mic noise, environmental noise (room, HVAC, etc.), and electromagnetic interference.