DSP Stuff | Pre-Emphasis | Pinking

Math and Physics Programming

Started by L. Spiro December 21, 2021 06:08 PM

5 comments, last by hplus0603 2 years, 6 months ago

L. Spiro

25,818

Author

December 21, 2021 06:08 PM

I restore Nintendo 64 music into HD, and in my attempt to take it to the next level I wrote an offline synthesizer. Being offline, it can take as much time as necessary to render the final result, and I am fully favoring quality over performance.

I am using this paper as a guide for sample interpolation: http://yehar.com/blog/wp-content/uploads/2009/08/deip-original.pdf

It mentions pinking and pre-emphasis a lot.

I am probably just missing a small part of the puzzle.

#1: For pre-emphasis, it provides a table at the bottom, titled “Polynomial approximations of interpolator passband responses.”

Let’s use the very bottom one as an example.

The paper says this table gives me an approximation of the passband response. Let’s say I am using “6po 5o 32: 1 + x^2*-0.00026762900966793.”

It says: “Design of the oversampling and pre-emphasis filter(s) has been left for the reader. To make this job easier, minmax-error polynomial approximations of the passband magnitude frequency responses of the interpolators are given in the following table, with a maximum error of ±0.001dB. The x variable is the frequency in radians. x = 0 corresponds to 0Hz and x = π to the passband edge frequency, i.e. the Nyquist frequency of the original sample data before upsampling by N.”
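Read literally, the quoted passage ties x to the input sample rate rather than the output rate: x = π·f / (fs_in / 2), so for 22050 Hz source data x = π lands at 11025 Hz. Below is a minimal Python sketch of evaluating the “6po 5o 32” polynomial that way. The function names are made up for illustration, and treating the reciprocal of the response as the gain a pre-emphasis filter should provide is an assumption about how the table is meant to be used, not something the quoted text spells out.

```python
import math

# x^2 coefficient of the "6po 5o 32" table entry quoted in the post.
C2 = -0.00026762900966793

def passband_response(freq_hz, fs_in):
    """Approximate interpolator magnitude (linear gain) at freq_hz.

    x runs from 0 at 0 Hz to pi at the Nyquist frequency of the
    ORIGINAL sample data (fs_in / 2), per the quoted passage.
    """
    x = math.pi * freq_hz / (fs_in / 2.0)
    return 1.0 + C2 * x * x

def preemphasis_gain_db(freq_hz, fs_in):
    """Boost (in dB) that would cancel the interpolator's droop (assumed usage)."""
    return -20.0 * math.log10(passband_response(freq_hz, fs_in))

for f in (0.0, 5512.5, 11025.0):
    print(f, passband_response(f, 22050.0), preemphasis_gain_db(f, 22050.0))
```

For this particular entry the droop at 11025 Hz works out to only about 0.02 dB, so the correction it asks for is very small.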

In my example, my original samples are 22050.0 Hz, so Nyquist will be 11025.

Okay…what am I supposed to do with 1 + x^2*-0.00026762900966793? How do I choose x? x will be a normalized frequency (frequency converted to radians), but what frequency? The samples will be 22050Hz, but my final result will be (in this example) 48000Hz. What am I supposed to do to apply pre-emphasis?

I thought I needed to apply a pre-emphasis filter to the 22050 samples before using the “6po 5o 32” interpolator across them for synthesis.

I looked up pre-emphasis: https://dsp.stackexchange.com/questions/7771/how-should-one-select-this-filter-coefficient-for-a-pre-emphasis-filter

This says that pre-emphasis is just a first-order FIR filter.

So I looked up a pre-emphasis filter and found this first-order implementation: https://librosa.org/doc/main/generated/librosa.effects.preemphasis.html

But since I have no idea how I am supposed to use 1 + x^2*-0.00026762900966793 I have no idea what coefficient I am supposed to plug into that. The things I tried are audibly wrong.

What am I supposed to do to handle pre-emphasis?

#2: For pinking, what to do?

The paper says:

“Pink means that the spectrum decreases 3dB per an octave increase in frequency. To take this into account in interpolator quality evaluation, we filter the spectral images with a pinking filter, whose magnitude is proportional to √(1/ω), where ω is the angular frequency of the passband frequency that creates the image. We shall call this process pinking.”

…

“Pinking emphasizes the importance of stopband attenuation near frequencies a multiple of 2π. This has proven to be important as some interpolators may have OK-looking frequency responses, but sound really bad when there are typical amounts of low frequencies in the input, compared to testing with white noise. Because pinking would be infinitely strong near 0Hz, we choose to keep increasing the pinking gain only down to the frequency corresponding to 5Hz in a 44100Hz sampling frequency (before oversampling) input signal, and keep the pinking gain at the same level from that point to 0Hz.”
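Note that the quoted passage describes pinking as a weighting used when evaluating interpolator quality, not as a filter in the audio path. As a rough Python sketch of that weighting curve: the gain follows √(1/ω) and is held constant below the 5 Hz floor. The function names, the choice of radians per input sample as the unit for ω, and the absence of any normalization constant are assumptions, since the quote only says “proportional to”.

```python
import math

# Below this angular frequency the gain is held constant:
# 5 Hz in a 44100 Hz signal, expressed in radians per sample.
OMEGA_FLOOR = 2.0 * math.pi * 5.0 / 44100.0

def pinking_gain(omega):
    """Relative weight sqrt(1/omega), flat below OMEGA_FLOOR."""
    return math.sqrt(1.0 / max(abs(omega), OMEGA_FLOOR))

def to_db(gain):
    return 20.0 * math.log10(gain)

# Doubling the frequency (one octave) should cost about 3 dB.
octave_step = to_db(pinking_gain(0.2)) - to_db(pinking_gain(0.1))
print(round(octave_step, 2))
```

The printed step is close to -3.01 dB, which matches the “3dB per octave” description of a pink spectrum.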

I’ve implemented the biquad filters from here: https://webaudio.github.io/Audio-EQ-Cookbook/audio-eq-cookbook.html

I don’t see a good way to use that to create this pinking filter. It would be kind of like an LPF, but I need to control the slope, and with these all taking a Q factor…

Is there a filter type that would be best suited for pinking? Is it appropriate to jury-rig an LPF into doing this (by refactoring out Q and using S and dBGain instead), or…?

L. Spiro

I restore Nintendo 64 video-game OST’s into HD! https://www.youtube.com/channel/UCCtX_wedtZ5BoyQBXEhnVZw/playlists?view=1&sort=lad&flow=grid

Aressera

3,159

December 21, 2021 11:38 PM

I would stay away from those polynomial interpolators and use a windowed polyphase sinc function FIR filter instead. This will give the theoretical best quality if your window is long. The implementation is pretty simple and easy to speed up with SIMD because it's just a bunch of dot products.

Precompute polyphase filters:
1. Allocate a 2D array of size N*M: N filters at sub-sample offsets spaced 1/N apart, each M samples long. Longer filters produce better results but use more CPU. Bigger N uses more memory but gives better results. I never found the need to go beyond N=64, even for irrational resampling ratios, and got >120dB SNR.
2. Compute N Hann-windowed sinc filters at different sub-sample offsets (i.e. at 0, 1/N, 2/N, 3/N, …, (N-1)/N), and store the results in the array.

Something like this for each output sample:
1. Determine the sub-sample offset in the wave table, in the range [0,1).
2. Use this offset to look up the nearest 2 filter phases from the precomputed filters.
3. Compute the dot product of each filter phase with the M wave samples surrounding the interpolation position. You need to reverse either the filter coefficients or the wave samples in advance to get the FIR to work correctly.
4. Use linear interpolation to interpolate between the outputs of these dot products, according to the sub-sample position. This is the output sample value. Care must be taken to do wrapping interpolation (i.e. between phase (N-1)/N and phase 0).
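The two lists above can be sketched in plain Python roughly as follows. This is an illustration under stated assumptions, not the poster's actual code: the function names, the 32-tap default, and the tap alignment are made up here. Each filter row is stored already indexed by sample offset, so no separate reversal step is needed, and the caller is assumed to keep the read position far enough from both ends of the buffer.

```python
import math

def build_polyphase_bank(num_phases=64, taps=32):
    """num_phases Hann-windowed sinc filters at offsets 0, 1/N, ..., (N-1)/N."""
    half = taps // 2
    bank = []
    for p in range(num_phases):
        frac = p / num_phases
        row = []
        for k in range(taps):
            # Signed distance (in samples) from the interpolation point to tap k.
            t = (k - half + 1) - frac
            sinc = 1.0 if t == 0.0 else math.sin(math.pi * t) / (math.pi * t)
            window = 0.5 + 0.5 * math.cos(math.pi * t / half)  # Hann over |t| <= half
            row.append(sinc * window)
        bank.append(row)
    return bank

def dot(filt, samples, start):
    return sum(c * samples[start + k] for k, c in enumerate(filt))

def interpolate(samples, pos, bank):
    """Value of `samples` at fractional index `pos` (needs taps/2 samples of margin)."""
    num_phases = len(bank)
    half = len(bank[0]) // 2
    base = math.floor(pos)
    phase = (pos - base) * num_phases
    p0 = min(int(phase), num_phases - 1)
    blend = phase - p0
    start = base - half + 1
    y0 = dot(bank[p0], samples, start)
    if p0 + 1 < num_phases:
        y1 = dot(bank[p0 + 1], samples, start)
    else:
        # Wrap: the phase after (N-1)/N is phase 0, one input sample later.
        y1 = dot(bank[0], samples, start + 1)
    return (1.0 - blend) * y0 + blend * y1

bank = build_polyphase_bank()
wave = [math.sin(2.0 * math.pi * 0.05 * n) for n in range(200)]
print(interpolate(wave, 100.37, bank), math.sin(2.0 * math.pi * 0.05 * 100.37))
```

The two printed values should agree closely for a tone this far below Nyquist; longer filters tighten the match near the band edge at the cost of more multiplies per output sample.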

If this is implemented correctly, you should be able to sweep a sine wave from 20Hz to beyond Nyquist without any visible aliasing in the spectrogram, and produce results as good as the best on this page.

For speed, I'd recommend using a pre-filtered wave table. This works similarly to texture mip maps. Compute the base wave at a high resolution (i.e. at least 2048 samples). Then generate a mip pyramid with different low-passed versions of the base wave (I used at least 4 mips per octave). Filter the mips using an FFT to make sure they wrap properly. I got the best quality by keeping all mips the same size as the base wave, instead of trying to downsample them. By using this mip map approach, you can use a shorter filter during the interpolation (linear is almost good enough). During interpolation, you interpolate from the two nearest mips according to the current frequency using sinc FIR interpolation, and then interpolate between the mips linearly.
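A rough Python sketch of building such a mip pyramid for a single-cycle wave is shown below. It is an illustration under assumptions: the function names are hypothetical, a naive DFT stands in for a real FFT library to keep the example dependency-free (far too slow for a 2048-sample table in practice), and the Nyquist bin is simply dropped. Working in the frequency domain treats the table as periodic, which is what makes the filtered mips wrap cleanly, and every mip keeps the base table's length.

```python
import cmath
import math

def harmonics(table):
    """Naive DFT: complex amplitude of harmonics 0..N/2 of one wave cycle."""
    n = len(table)
    return [
        sum(table[i] * cmath.exp(-2j * math.pi * h * i / n) for i in range(n)) / n
        for h in range(n // 2 + 1)
    ]

def band_limited(coeffs, n, max_harmonic):
    """Rebuild an n-sample cycle keeping DC and harmonics 1..max_harmonic."""
    top = min(max_harmonic, n // 2 - 1)  # the Nyquist bin is always dropped
    out = []
    for i in range(n):
        v = coeffs[0].real
        for h in range(1, top + 1):
            v += 2.0 * (coeffs[h] * cmath.exp(2j * math.pi * h * i / n)).real
        out.append(v)
    return out

def build_mips(table, mips_per_octave=4):
    """List of (max_harmonic, table) pairs, each table as long as the base wave."""
    n = len(table)
    coeffs = harmonics(table)
    mips = []
    level = 0
    while True:
        max_h = int((n // 2 - 1) / 2.0 ** (level / mips_per_octave))
        if max_h < 1:
            break
        mips.append((max_h, band_limited(coeffs, n, max_h)))
        level += 1
    return mips

saw = [2.0 * i / 64 - 1.0 for i in range(64)]  # small table so the demo is quick
pyramid = build_mips(saw)
print(len(pyramid), [max_h for max_h, _ in pyramid][:6])
```

At playback time one would pick the two mips whose highest harmonic stays below Nyquist for the current pitch and blend between them, as described above; that selection logic is left out of this sketch.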

hplus0603

11,917

December 22, 2021 04:36 AM

Aressera said:
I would stay away from those polynomial interpolators and use a windowed polyphase sinc function FIR filter instead.

I was going to say the same thing!

There really are only four levels of interpolation that make sense in the computation/quality spectrum:

nearest neighbor – 1 point
linear interpolation – 2 points
cubic (Hermite) interpolation – 4 points
windowed sinc convolution – as many points as you can muster; ideally 4096 or more

Last I checked, cubic Hermite's worst-case artifacts were at -60 dB of full scale, so to do meaningfully better, you need to get down to -100 dB, and you need the fancy sinc to do that.

Note that there are many ways of implementing the windowed sinc; polyphase is great for certain approaches; a simple FFT based convolver is good for other cases. Also, modern CPUs can do hundreds of those 4096-point FFT filters per core in real time, assuming you use a good FFT / IFFT library. (djb-fft is still alright if you're on x86, but there are better ones these days.)
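For reference, the FFT-based route amounts to zero-padding the signal and the filter kernel to a common power-of-two length, multiplying their spectra, and transforming back. The Python sketch below uses a small recursive radix-2 FFT so it runs without third-party packages; in real use one of the FFT libraries mentioned above would replace it. The function names are made up for illustration, and a long signal would normally be processed in blocks (overlap-add), which is not shown.

```python
import cmath

def fft(x, inverse=False):
    """Recursive radix-2 FFT; len(x) must be a power of two. Unscaled."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out

def fft_convolve(signal, kernel):
    """Linear convolution of two real sequences via the frequency domain."""
    out_len = len(signal) + len(kernel) - 1
    size = 1
    while size < out_len:
        size *= 2  # pad so circular convolution equals linear convolution
    a = fft([complex(v) for v in signal] + [0j] * (size - len(signal)))
    b = fft([complex(v) for v in kernel] + [0j] * (size - len(kernel)))
    y = fft([p * q for p, q in zip(a, b)], inverse=True)
    return [v.real / size for v in y[:out_len]]

print(fft_convolve([1.0, 2.0, 3.0], [0.0, 1.0, 0.5]))
```

The result should match a direct time-domain convolution up to rounding error; the FFT form only pays off once the kernel is long, as with the multi-thousand-point filters discussed here.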

enum Bool { True, False, FileNotFound };

L. Spiro

25,818

Author

December 23, 2021 02:43 AM

I am sure I will have enough knowledge on the topic to decipher your replies soon, but for now the Learning Train is still stuck on the questions I raised in my first post!
I’m going to be upsampling/filtering either way, but learning all of it only works if I learn all of it. I will get to polyphase and windows when I am done exploring this area.

I still need to know about pre-emphasis. And the best way to LPF. I don’t want a Q factor. I want a sudden fall-off.
Is Butterworth the best for over-sampling filtering? I see every site explaining that you want to insert 0’s and then LPF it, but nothing ever gives an LPF slope. -6dB-per-octave? -3? -12?

L. Spiro

L. Spiro

25,818

Author

December 30, 2021 11:03 AM

Took me a while but I think I have finally caught up enough to understand your advice and why you suggested it!
Thank you!

L. Spiro

hplus0603

11,917

December 30, 2021 05:38 PM

L. Spiro said:
And the best way to LPF

The best way to LPF with a steep cut-off and phase correct response is to convolve with a windowed sinc function, typically using a raised-cosine window. There really is no simpler way to express this! And, once you work out the actual implementation, it's likely you'll think “oh, so that's all that is – not so bad after all!” Every concept needs a name, and these are the names these concepts have…
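To make that concrete, here is a small Python sketch of designing such a filter: an ideal sinc low-pass truncated by a Hann (raised-cosine) window. The function names and the 101-tap, 0.25 cycles/sample example are assumptions chosen for illustration. With this kind of filter the steepness is not a fixed dB-per-octave figure as with a biquad; the transition band narrows as the number of taps grows, and the symmetric coefficients give linear phase.

```python
import math

def windowed_sinc_lowpass(cutoff, taps):
    """Low-pass FIR coefficients.

    cutoff: cutoff frequency in cycles per sample (0 < cutoff < 0.5).
    taps:   filter length; odd values give a centre tap.
    """
    mid = (taps - 1) / 2.0
    h = []
    for k in range(taps):
        t = k - mid
        if t == 0.0:
            ideal = 2.0 * cutoff
        else:
            ideal = math.sin(2.0 * math.pi * cutoff * t) / (math.pi * t)
        window = 0.5 - 0.5 * math.cos(2.0 * math.pi * k / (taps - 1))  # Hann
        h.append(ideal * window)
    total = sum(h)
    return [c / total for c in h]  # normalize to unity gain at 0 Hz

def magnitude_db(h, freq):
    """Filter magnitude in dB at freq (cycles per sample)."""
    re = sum(c * math.cos(2.0 * math.pi * freq * k) for k, c in enumerate(h))
    im = sum(c * math.sin(2.0 * math.pi * freq * k) for k, c in enumerate(h))
    return 20.0 * math.log10(max(math.hypot(re, im), 1e-12))

lpf = windowed_sinc_lowpass(0.25, 101)
for f in (0.05, 0.20, 0.25, 0.30, 0.40):
    print(f, round(magnitude_db(lpf, f), 1))
```

For the zero-insertion upsampling asked about earlier in the thread, the usual arrangement is to set the cutoff near 0.5/N cycles per output sample for an upsampling factor of N and to scale the coefficients by N to restore the level lost to the inserted zeros; that step is not shown here.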

Anyway, sounds like you're making good progress, so, yay!