This is a very influential video. I often see it referenced when digital audio gets explained.
It is also very insidiously misleading, in a way that is hard to fault it for.
That "band limited signal" that uniquely satisfies the Nyquist theorem? That is an infinite, periodic signal.
No finite, aperiodic signal (e.g. a song) can be band limited. That includes any signal with transients.
Well, how big is the difference? How much overhead/error/lookahead is needed to approach the Nyquist result? It is never mentioned by people who refer to this theorem when talking about audio signal sampling!
And I wish it was mentioned and explained.
steve1977 19 hours ago [-]
I think the constraint of using a band limited signal is the big misunderstanding many people have in regards to digital audio.
Yes, you can perfectly reproduce a band limited signal as long as the highest frequency is below fs/2.
But getting a band limited signal from a "real life" signal without any artifacts can be trickier than one might think, especially when the Nyquist frequency is near the limit of human hearing.
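A quick numpy sketch shows what goes wrong when a "real life" signal with content above fs/2 is sampled without band-limiting first (the 25 kHz tone and one-second length are my own arbitrary choices for illustration):

```python
import numpy as np

fs = 44100                            # CD sample rate
t = np.arange(fs) / fs                # one second of sample instants
f_in = 25000                          # ultrasonic tone, above fs/2 = 22050 Hz
x = np.sin(2 * np.pi * f_in * t)      # sampled with NO anti-aliasing filter
freqs = np.fft.rfftfreq(len(x), 1 / fs)
# the inaudible 25 kHz tone folds down to fs - f_in = 19100 Hz
alias = freqs[np.argmax(np.abs(np.fft.rfft(x)))]
```

The ultrasonic input reappears as an audible 19.1 kHz artifact, which is why the analog low-pass filter has to run before the ADC, not after.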
And this is the one big argument in favor of Hi Res audio - moving those filter frequencies high above the hearing threshold.
Kirby64 16 hours ago [-]
The only thing really in favor of hi-res audio is that it allows rather lazy circuit design. You can have a super lazy, cheap anti-aliasing filter with a transition band from 20 kHz all the way to 48 kHz... or you could just properly make a nice, sharp 20-22k filter and stop wasting all that bandwidth on worthless doubled-sample-rate audio.
In reality, there is absolutely zero use for digital audio sampled above 16-bit/48 kHz for listening. (44.1 kHz is fine too, I guess, but the sample rate is annoying for compatibility with a lot of modern systems.) Higher rates and depths have uses in music production, but that's about it. The final mix should be 16-bit/48k.
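For the bit-depth half of that claim, a small numpy experiment measures the noise added by 16-bit quantization (the 997 Hz full-scale test tone is an arbitrary choice):

```python
import numpy as np

fs = 48000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 997 * t)        # full-scale test tone
q = np.round(x * 32767) / 32767        # round-trip through 16-bit sample values
noise = q - x
# signal-to-noise ratio of the quantized version, in dB
snr_db = 10 * np.log10(np.mean(x ** 2) / np.mean(noise ** 2))
```

The measured SNR lands near the textbook 6.02*16 + 1.76 ≈ 98 dB, i.e. the 16-bit noise floor is already far below anything a playback chain or room will reveal.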
That's unrelated. The difference you describe is the inter-aural or inter-channel difference, and 16/44.1k can capture that to much greater precision than microseconds.
Some math [1]
44.1k file containing pulses with sub-sample delays [2]
Something similar, but square wave, and nicely showing that timing precision actually depends on bit-depth and not the sampling rate [3]
Some practical experiments with capturing the playback of such files and verifying that the delay is preserved: pulse [4] and square [5]
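The sub-sample timing claim is easy to check numerically. This sketch (the pulse position and the 10.37-sample delay are my own arbitrary choices) delays a band-limited pulse by a fraction of a sample and recovers that delay from the phase slope of the cross-spectrum:

```python
import numpy as np

N = 4096
n = np.arange(N)
d_true = 10.37                        # delay in samples: deliberately non-integer
x = np.sinc(n - 2000)                 # band-limited impulse
y = np.sinc(n - 2000 - d_true)        # same impulse, delayed by a sub-sample amount
X, Y = np.fft.rfft(x), np.fft.rfft(y)
# a pure delay d gives a linear phase of -2*pi*k*d/N across the bins
phase = np.unwrap(np.angle(np.conj(X) * Y))
k = np.arange(len(X))
slope = np.polyfit(k[1:-1], phase[1:-1], 1)[0]   # skip the DC and Nyquist bins
d_est = -slope * N / (2 * np.pi)
```

The delay comes back to well under a hundredth of a sample; at 44.1 kHz that is timing precision far below a microsecond, with no higher sample rate needed.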
>And this is the one big argument in favor of Hi Res audio
It's really not. For Red Book, fs/2 is 22.05 kHz, and while human hearing tops out around 20 kHz, that's not a hard cutoff: combine our low sensitivity to high frequencies (cf. ISO 226), the average listener's hearing not reaching much past 18 kHz in practice, and frequency masking by other simultaneous notes, and the small aliasing/imaging issues near fs/2 aren't a real problem.
But the really important factor, rarely mentioned, is the material: the amount of music with meaningful content at a meaningful volume at those frequencies is statistically negligible.
Hi Res audio is snake oil designed to sell the same thing multiple times, period.
PrismCrystal 12 hours ago [-]
"Hi Res audio is snake oil designed to sell the same thing multiple times, period."
The "period" is unwarranted, because there are two important caveats. Firstly, it has been extremely common for albums to be sold with very compressed dynamic range, on the assumption that the average consumer will be listening in noisy environments, etc. However, the mastering supplied to Hi Res shops sometimes lacks that compression, so that is where you can hear the album with room to breathe.
Secondly, the SACD format allowed adding a layer for 5.1 surround sound. In classical music, this is especially important for works where the performers are spread out around the hall, not just all on stage in front of the listeners.
So, with Hi Res the higher frequencies and 24 bit depth are snake oil, but the ancillary benefits are audible to anyone with a good listening environment.
kalleboo 6 hours ago [-]
> Firstly, it has been extremely common for albums to be sold with very compressed dynamic range, assuming the average consumer will be listening in noisy environments etc. However, the mastering supplied to Hi Res shops sometimes lacks that compression, so that is where you can hear the album with room to breathe
I had a friend who was extolling the virtues of Hi Res for the pop music he was buying, so I asked him to send me a track. It had the same brickwall compression as the standard iTunes version and sounded just as flat. (I was hoping that even if it was compressed the same, the extra resolution would let you recover the detail, but there was no audible improvement.)
If that's what they want to sell, they need to create an actual term for that, like the audio version of "Director's Cut", not just sneak it into some random Hi Res releases and hope you find "the good ones" while the rest are snake oil.
BoingBoomTschak 6 hours ago [-]
> However, the mastering supplied to Hi Res shops sometimes lacks that compression, so that is where you can hear the album with room to breathe.
I've almost never heard of Hi Res with a totally new master that wasn't previously available on CD, to be honest. This isn't common, right?
> the SACD format allowed adding a layer for 5.1 surround sound
Well, yeah. Too bad I don't have a PS3 to rip the SACD layer =)
kalleboo 6 hours ago [-]
> Hi Res audio is snake oil designed to sell the same thing multiple times, period
There's a karaoke bar I go to with "Hi Res" logos on the speakers. These are basically MIDI files played in a loud bar atmosphere; who is going to hear the difference, haha.
shmerl 15 hours ago [-]
And for higher price too.
amavect 13 hours ago [-]
I'd like to answer your questions.
While no finite audio clip qualifies as bandlimited, the Nyquist theorem cheats by assuming that the audio clip repeats indefinitely. Doing so results in sharp frequency lines separated by gaps of zero. Each frequency line lies at an integer multiple of the fundamental frequency, which is the reciprocal of the audio clip's length.
Equivalently, every finite audio clip has a discrete Fourier transform: periodic in time, therefore discrete in frequency.
Mathematically, an audio clip of length T seconds at a sample rate of S Hz has DFT coefficients separated by 1/T Hz, with a maximum frequency below S/2 Hz. For example, a 1 second clip at 48000 Hz has DFT coefficients at every 1 Hz in [0, 24000). Increasing the length of the audio clip increases the frequency resolution.
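That example can be checked directly with numpy's FFT helpers (the 440 Hz tone is an arbitrary choice that sits exactly on a bin):

```python
import numpy as np

fs, T = 48000, 1.0                        # 1 second at 48 kHz, as in the example
t = np.arange(int(fs * T)) / fs
x = np.sin(2 * np.pi * 440.0 * t)         # 440 full cycles fit the clip exactly
freqs = np.fft.rfftfreq(len(x), 1 / fs)
spacing = freqs[1] - freqs[0]             # coefficients every 1/T = 1 Hz
peak = freqs[np.argmax(np.abs(np.fft.rfft(x)))]   # all energy in the 440 Hz bin
```

Doubling T to 2 seconds halves the spacing to 0.5 Hz, which is the "longer clip, finer frequency resolution" trade-off stated above.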
In asking for the error, you are asking for the values between the discrete Fourier coefficients. What happens outside the audio clip determines what happens between the coefficients. If the signal repeats, interpolate zeros between coefficients. If the signal instead goes to zero outside the clip (and is therefore not exactly bandlimited), interpolate a summation of sinc functions between coefficients (this comes from the transform of the rectangular/boxcar window).
How much overhead/error/lookahead is needed to approach the Nyquist result? Theoretically, none. But in practice, perfect filters don't exist.
In practice, how big is the difference? To properly record a real waveform, the signal must pass through a physical low-pass filter first, or else risk unbounded aliasing, so the answer depends on the filter specification. I pulled up a Realtek ALC892 datasheet: when sampling at 44100 Hz, it specifies a -1 dB passband edge at 20158 Hz and a -80 dB ADC stopband starting at 24916 Hz. Yep, that lets some aliasing through, yet it remains somewhat passable. No surprises from a cheap chip. Hence the importance of oversampling during recording or reconstruction in better chips. The audio files themselves don't need it, because the error comes from imperfect filters.
In practice it doesn't really matter. Most good converters have pretty sharp cutoffs on their anti-aliasing filters, so any aliasing is severely attenuated. Sure, an aperiodic signal needs infinite bandwidth to 'accurately' represent it, but you're quickly below the noise floor.
112233 16 hours ago [-]
Sure. And having that 2+ kHz of headroom surely helps. Still, it is a bit jarring to almost never see the reconstruction filter mentioned (except for sigma-delta, where you simply cannot ignore it), and to always see appeals to the Nyquist theorem as mathematical proof, even though it does not apply.
Taking the opportunity: what is the math called that actually does apply to this case?
Kirby64 16 hours ago [-]
Every ADC/DAC used in audio in existence has an anti-aliasing filter, though. The Xiph video even talks about band-limiting signals with the 21 kHz cutoff filter he shows as the example.
It's not talked about, largely, because modern circuit designs already have great anti-aliasing filters that have quite sharp rejection curves.
shmerl 1 day ago [-]
A classic. I wish Xiph developed Daala into a fully functional video codec.
[1] https://troll-audio.com/articles/time-resolution-of-digital-...
[2] https://www.audiosciencereview.com/forum/index.php?threads/t...
[3] https://audiophilestyle.com/forums/topic/58511-time-resoluti...
[4] https://www.head-fi.org/threads/can-you-hear-upscaling.97295...
[5] https://www.head-fi.org/threads/can-you-hear-upscaling.97295...
https://www.alldatasheet.com/html-pdf/1137676/REALTEK/ALC892...
Hope this helps.