What do you need to know to understand this topic?


What is Aliasing?

Aliasing is an interesting (and usually unwanted) phenomenon that happens when a signal is sampled at less than twice the highest frequency contained in the signal. Related to this concept are the Nyquist-Shannon sampling theorem, the Nyquist frequency, frequency folding, the anti-aliasing filter and others. These concepts are so interconnected that it is difficult to choose which one to introduce first. So, let us start with an example: a signal with a single frequency $f$ that is sampled at a frequency $f_s$. The following plot shows the signal in the time domain and the sampling points. You can control the sampling frequency with the slider and see what the sampled signal looks like.

The slider below controls the sampling frequency as a function of the signal frequency. When $f_s \lt 2f$, the sampled signal appears to have a different frequency than the original.

$f_s$ = 3$f$

As you can see, if you lower the sampling frequency too much, the sampled signal appears to have a different frequency. This effect is called aliasing, and it also happens when you watch wheels spinning so fast that they appear to be turning slowly in the other direction (the wagon-wheel effect). For example, sampling frequencies $0.9f$ and $1.1f$ lead to the same sampled signal frequency $0.1f$. In general, a signal that appears to have frequency $f_i$ after sampling could have been obtained by sampling any signal whose frequency $f$ satisfies $$f_i = |f - N \cdot f_s|$$ where $N$ is any integer. Then, with $N = 1$, the alias or image with frequency $0.1 f$ comes from sampling at frequencies $0.9f$ ($f_i = |f - 0.9 \cdot f| = |0.1 \cdot f|$) or $1.1f$ ($f_i = |f - 1.1 \cdot f| = |0.1 \cdot f|$). Likewise, the alias with zero frequency comes from the sampling cases in which the signal frequency is an integer multiple of the sampling frequency, $f = N \cdot f_s$ (for example, $f_s = f$ or $f_s = f/2$).
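The alias relation can also be checked numerically. The following is a minimal sketch in plain Python; the helper names `alias_frequency` and `sample_cosine` are illustrative and not part of the experiment above. It picks the integer $N$ closest to $f/f_s$, which gives the alias of lowest frequency, and then compares the samples of a cosine at $f$ with those of a cosine at $f_i$.

```python
import math

def alias_frequency(f, fs):
    """Lowest-frequency alias |f - N*fs|, using the integer N nearest to f/fs."""
    n = round(f / fs)
    return abs(f - n * fs)

def sample_cosine(f, fs, count):
    """Return `count` samples of cos(2*pi*f*t) taken at t = k/fs."""
    return [math.cos(2 * math.pi * f * k / fs) for k in range(count)]

f = 1.0  # signal frequency in arbitrary units (assumed value for the demo)
for fs in (0.9 * f, 1.1 * f):
    fi = alias_frequency(f, fs)
    original = sample_cosine(f, fs, 20)
    alias = sample_cosine(fi, fs, 20)
    # Largest difference between the two sample sequences
    worst = max(abs(a - b) for a, b in zip(original, alias))
    print(f"fs = {fs:.1f}f -> alias at {fi:.3f}f, largest sample mismatch {worst:.1e}")
```

The two sample sequences agree up to floating-point rounding, which is exactly why the sampled signal cannot tell the two frequencies apart.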

Keep in mind that a signal is normally composed of many frequencies, and this effect can be analyzed individually for each of them. The last experiment shows that the sampling process, when suffering from aliasing, is changing the spectrum of the signal. To understand what is really happening, we should know how sampling affects the spectrum of the original signal.

How does sampling affect the spectrum of the signal?

Although it is easy to understand what happens to a signal in the time domain after sampling, it is not so clear what happens in the frequency domain. Sampling can be seen as multiplying a signal $x(t)$ by a train of Dirac impulses equally spaced by $t_s=1/f_s$ (a Dirac comb): $$x_s(t) = x(t)\sum_{n=-\infty}^{\infty} \delta(t-n\cdot t_s)$$

The Fourier transform of $x(t)$ is $X(f)$ and the transform of the train of impulses is: $$\mathcal{F}\left\{\sum_{n=-\infty}^{\infty} \delta(t-n\cdot t_s) \right\} = \frac{1}{t_s} \sum_{n=-\infty}^{\infty} \delta(f-n \cdot f_s)$$ i.e., it is also a train of impulses. Recalling that multiplication in the time domain corresponds to convolution in the frequency domain, we have: $$X_s(f) = X(f)\star \left[\frac{1}{t_s} \sum_{n=-\infty}^{\infty} \delta(f-n \cdot f_s)\right] = \int_{-\infty}^{\infty}X(s) \frac{1}{t_s}\sum_{n=-\infty}^{\infty} \delta(f-s-n \cdot f_s)\textrm{d}s$$ $$=\frac{1}{t_s} \sum_{n=-\infty}^{\infty} \int_{-\infty}^{\infty}X(s) \delta(f-s-n \cdot f_s)\textrm{d}s$$ $$=\frac{1}{t_s} \sum_{n=-\infty}^{\infty} X(f-n\cdot f_s)$$ where the first equality comes from the definition of convolution and the last from the sifting property of the Dirac impulse (convolving a function with a shifted impulse shifts the function). This expression tells us that sampling a signal with spectrum $X(f)$ replicates that spectrum, scaled by $1/t_s$, at every integer multiple of $f_s$.
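As a worked instance of this result (a sketch, taking $x(t) = \cos(2\pi f_0 t)$, where the symbol $f_0$ is introduced here only for illustration), the spectrum of the cosine is a pair of impulses, $$X(f) = \frac{1}{2}\left[\delta(f-f_0) + \delta(f+f_0)\right]$$ and substituting it into the expression for $X_s(f)$ gives $$X_s(f) = \frac{1}{2t_s} \sum_{n=-\infty}^{\infty} \left[\delta(f-f_0-n\cdot f_s) + \delta(f+f_0-n\cdot f_s)\right]$$ The sampled cosine therefore contains components at $\pm f_0 + n\cdot f_s$ for every integer $n$. Taking absolute values, these are the frequencies $|f_0 - N \cdot f_s|$ that appeared in the alias relation of the first experiment.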

The next experiment lets you add copies of the spectrum of a cosine function at multiples of the sampling frequency, so you can see how the resulting signal starts to approach the sampled signal.

The slider below controls the number of copies that are added to the original signal. In this case, the sampling frequency is 3x the signal frequency. To see the original signal, set the number of copies to zero. As more copies are added, the waveform approaches the sampled signal (compare to the first experiment). The sampling process would produce an infinite number of these copies, resulting in a waveform with Dirac impulses at the sampling points.
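One way to see why the waveform sharpens into impulses is sketched below, under the assumption that each copy is added as a pair at $\pm n \cdot f_s$ and ignoring the $1/t_s$ scale factor. Adding $K$ such pairs multiplies the cosine by $1 + 2\sum_{n=1}^{K}\cos(2\pi n f_s t)$. This factor equals $2K+1$ at every sampling instant, whereas midway between two sampling instants its magnitude is exactly 1. The function names are illustrative.

```python
import math

def copies_factor(t, fs, k):
    """Time-domain factor produced by the original spectrum plus k pairs of copies."""
    return 1.0 + 2.0 * sum(math.cos(2 * math.pi * n * fs * t) for n in range(1, k + 1))

def signal_with_copies(t, f, fs, k):
    """Cosine of frequency f after adding k pairs of spectral copies at +/- n*fs."""
    return math.cos(2 * math.pi * f * t) * copies_factor(t, fs, k)

f, fs = 1.0, 3.0   # sampling at 3x the signal frequency, as in the experiment
ts = 1.0 / fs
for k in (0, 1, 5, 20):
    at_sample = signal_with_copies(ts, f, fs, k)      # on a sampling instant
    midway = signal_with_copies(ts / 2, f, fs, k)     # halfway between two instants
    print(f"copies = {k:2d}: value at t_s = {at_sample:8.3f}, value at t_s/2 = {midway:6.3f}")
```

The value on the sampling instant grows in proportion to the number of copies while the value in between does not, which is the behaviour expected of a waveform that tends to a train of weighted impulses.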

Number of copies = 0

From here, it will be very simple to figure out the minimum sampling frequency that avoids aliasing. This is stated in the Nyquist-Shannon sampling theorem.

Nyquist-Shannon sampling theorem

From what you've learned so far, it is clear that aliasing is a destructive effect: once it has occurred, you can no longer recover the original signal from the samples. The Nyquist-Shannon sampling theorem states what you should do to avoid aliasing or, in other words, at what frequency you should sample a signal in order to reconstruct it perfectly. In the words of Shannon:

If a function x(t) contains no frequencies higher than B hertz, it is completely determined by giving its ordinates at a series of points spaced 1/(2B) seconds apart.
Therefore, if you sample a signal with bandwidth $B$ at a frequency higher than $2B$, you do not have aliasing. This condition is known as the Nyquist criterion. To see why, recall from the previous section that sampling replicates the spectrum of the original signal at multiples of the sampling frequency. The criterion stated in the Nyquist-Shannon sampling theorem is sufficient to keep these copies from overlapping, which is necessary if we hope to ever recover the original signal.
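To make the non-overlap condition explicit (a short sketch in the notation used so far, with $n$ indexing the copies): if $X(f)$ is zero for $|f| > B$, the copy centred at $n \cdot f_s$ occupies $$n\cdot f_s - B \le f \le n\cdot f_s + B$$ The baseband copy ($n=0$) ends at $B$ and the next copy ($n=1$) starts at $f_s - B$, so the two stay apart when $$B < f_s - B \quad\Longleftrightarrow\quad f_s > 2B$$ By symmetry, the same condition separates every other pair of adjacent copies.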

Signal reconstruction

You might be wondering how the original signal can be reconstructed from the sampled signal. The sampling process makes infinitely many copies of the spectrum of the signal, centred on multiples of the sampling frequency. If the sampling frequency respects the Nyquist criterion, none of these copies overlap. Hence, the sampled signal can be low-pass filtered to remove all frequencies outside the baseband, and the original signal is obtained. On the other hand, if the Nyquist criterion is not respected, the copies will overlap and a certain band of frequencies in the baseband will be contaminated with components of adjacent copies, thereby ruining the possibility of exact recovery.
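In the time domain, an ideal low-pass filter with cut-off $f_s/2$ corresponds to interpolating the samples with sinc functions (the Whittaker-Shannon interpolation formula). The following is a minimal sketch of that idea in plain Python. It truncates the infinite sum to a finite number of terms, so the result is only approximate, and the names `reconstruct` and `cosine_1hz` as well as the chosen rates are assumptions made for the demo.

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi*x)/(pi*x), with sinc(0) = 1."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)

def reconstruct(signal, fs, t, n_terms=2000):
    """Approximate x(t) from the samples x(n/fs), n = -n_terms..n_terms."""
    return sum(signal(n / fs) * sinc(fs * t - n) for n in range(-n_terms, n_terms + 1))

def cosine_1hz(t):
    """Test signal: a 1 Hz cosine."""
    return math.cos(2 * math.pi * 1.0 * t)

t = 0.3  # an instant that does not fall on a sampling point
print("true value:           ", cosine_1hz(t))
print("from fs = 8 Hz:       ", reconstruct(cosine_1hz, 8.0, t))   # criterion respected
print("from fs = 1.5 Hz:     ", reconstruct(cosine_1hz, 1.5, t))   # criterion violated
print("0.5 Hz cosine at t:   ", math.cos(2 * math.pi * 0.5 * t))   # the alias |1 - 1.5|
```

With the criterion respected the interpolated value is close to the true one. With $f_s = 1.5$ Hz the same procedure instead lands on the 0.5 Hz alias, which illustrates why exact recovery is lost.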

Frequency folding

Now that you know more about the sampling process, let's revisit the aliasing effect from a different point of view. In practice, it makes sense to look at a signal as a group of sinusoidal components, instead of just one. Let's fix the sampling frequency and look at what happens to each frequency component of the original signal upon sampling. Frequencies between 0 and $f_s/2$ do not suffer from aliasing. Frequencies between $f_s/2$ and $f_s$ enter the baseband from the negative frequencies of the first replicas, and frequencies between $f_s$ and $3f_s/2$ enter the baseband from the positive frequencies of the first replicas. And so on and so forth... With increasing frequencies, their image goes back and forth between 0 and $f_s/2$, which looks like folding.

The slider below controls the frequency component $f$ of a signal, given a fixed sampling rate $f_s$. As the frequency of the component increases, the alias frequency $f_i$ goes back and forth between zero and the Nyquist frequency $f_s/2$, due to the copies produced by sampling.

$f$ = 0$f_s$

Since all frequencies above the Nyquist frequency "fold" into the band between zero and the Nyquist frequency, this effect is called frequency folding. Because folding is just another name for the aliasing effect, the Nyquist frequency is also called the folding frequency.
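The back-and-forth movement of the alias can be written as a triangle-shaped function of $f$. The following is a minimal sketch; `folded_frequency` is an illustrative name, and the function assumes $f \ge 0$.

```python
def folded_frequency(f, fs):
    """Alias of a component at f >= 0, folded into the band [0, fs/2]."""
    r = f % fs              # position of f inside the current span of width fs
    return min(r, fs - r)   # reflect the upper half of the span back below fs/2

fs = 1.0  # work in units of the sampling frequency
for step in range(9):
    f = step * 0.25 * fs
    print(f"f = {f:.2f} fs -> f_i = {folded_frequency(f, fs):.2f} fs")
```

Sweeping $f$ from 0 to $2f_s$ makes the printed alias rise to $f_s/2$, fall back to zero and rise again, mirroring what the slider shows.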

Anti-aliasing filter

Usually, our signal of interest has a limited bandwidth, but it is contaminated with components of higher frequencies (such as noise, interference or other signals that we want to ignore). If the sampling were done at twice the bandwidth of the signal of interest, the unwanted components would get folded (or aliased) into the baseband and contaminate the sampled signal. To avoid that, the signal is low-pass filtered before sampling, with a cut-off frequency matching the bandwidth of the signal of interest; this is the anti-aliasing filter. Higher frequencies are eliminated or greatly reduced and their aliasing becomes negligible.
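The effect of filtering before sampling can be illustrated with a small simulation. This is only a sketch under several assumptions: the frequencies are made up for the demo, continuous time is approximated by a fine grid, and a moving average stands in for the low-pass filter (a crude choice, picked here because its window can be sized to cancel the unwanted component exactly). It is not how a practical anti-aliasing filter would be designed.

```python
import math

F_S = 10.0          # sampling rate in Hz
F_SIGNAL = 1.0      # component of interest, below F_S / 2
F_UNWANTED = 9.0    # interference above F_S / 2; it folds onto 1 Hz
DENSE_RATE = 900.0  # fine time grid standing in for continuous time
WINDOW = 100        # 100 / 900 s: one full period of the 9 Hz component

def analog(t):
    """Signal of interest plus unwanted high-frequency interference."""
    return math.cos(2 * math.pi * F_SIGNAL * t) + math.cos(2 * math.pi * F_UNWANTED * t)

def prefiltered(t):
    """Moving average of the analog signal around t (crude low-pass filter)."""
    half = WINDOW // 2
    return sum(analog(t + k / DENSE_RATE) for k in range(-half, half)) / WINDOW

times = [n / F_S for n in range(10)]
wanted = [math.cos(2 * math.pi * F_SIGNAL * t) for t in times]
raw = [analog(t) for t in times]         # sampled without filtering
clean = [prefiltered(t) for t in times]  # filtered, then sampled

raw_error = max(abs(r - w) for r, w in zip(raw, wanted))
clean_error = max(abs(c - w) for c, w in zip(clean, wanted))
print(f"largest error without filter: {raw_error:.3f}")
print(f"largest error with filter:    {clean_error:.3f}")
```

Without the filter, the 9 Hz component aliases onto 1 Hz and adds directly to the wanted samples. With the filter in place, the remaining error comes only from the slight attenuation and delay that the moving average applies to the 1 Hz component.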

The paradox of the Nyquist rate

We started with the example of a cosine. Had we started with the sine function instead, a sampling rate of $2f$ would lead to no signal at all. How come we can reconstruct a sampled cosine, but not a sampled sine? I will leave it to you to solve this paradox. Hint: note that the only difference between sine and cosine is the instants at which sampling occurs: the sine is sampled at its zero crossings, but the cosine is sampled at its peaks.
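The observation in the hint can be reproduced numerically without giving away the answer. A minimal sketch, with the frequency value chosen arbitrarily:

```python
import math

f = 1.0          # signal frequency (assumed value for the demo)
fs = 2.0 * f     # sampling exactly at twice the signal frequency

# Samples taken at t = n / fs
sine_samples = [math.sin(2 * math.pi * f * n / fs) for n in range(8)]
cosine_samples = [math.cos(2 * math.pi * f * n / fs) for n in range(8)]

print("sine:  ", [round(s, 6) for s in sine_samples])    # every sample lands on a zero crossing
print("cosine:", [round(c, 6) for c in cosine_samples])  # every sample lands on a peak
```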