To FFT or not to FFT

Our ear has thousands of hair cells of various sizes in an organ called the cochlea. Each hair cell responds to specific frequencies, as determined by its size, and is connected to a nerve that signals the brain whenever the hair vibrates. Thus the brain can tell which frequencies are present in a sound by looking at which hair cells vibrate. This is different from how most microphones work. A microphone has a single sheet (the diaphragm) that vibrates in response to sound, and its wires carry a signal representing how far the diaphragm is from its resting position*. If a computer wants to know which frequencies are present in the recorded sound, it can perform a mathematical operation known as the ‘Fourier transform’. Similarly, if the brain needs to know how far the eardrum was displaced, it can perform the inverse operation#.
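To make the microphone side of this comparison concrete, here is a minimal sketch of how a computer could recover the frequencies from a recorded displacement signal. The sample rate, the two test tones (440 Hz and 1000 Hz) and the function names are all assumptions chosen for illustration, and a naive discrete Fourier transform is used in place of a real FFT library to keep the example self-contained.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Naive O(n^2) discrete Fourier transform.

    Returns the magnitude of each frequency bin from 0 up to n/2.
    """
    n = len(samples)
    magnitudes = []
    for k in range(n // 2 + 1):
        acc = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        magnitudes.append(abs(acc))
    return magnitudes


def dominant_frequencies(samples, sample_rate, count):
    """Return the `count` strongest frequencies (in Hz), sorted ascending."""
    magnitudes = dft_magnitudes(samples)
    n = len(samples)
    strongest = sorted(
        range(len(magnitudes)), key=lambda k: magnitudes[k], reverse=True
    )[:count]
    return sorted(k * sample_rate / n for k in strongest)


# Hypothetical recording: 50 ms of diaphragm displacement sampled at 8 kHz,
# containing a 440 Hz tone and a quieter 1000 Hz tone.
SAMPLE_RATE = 8000
NUM_SAMPLES = 400
displacement = [
    math.sin(2 * math.pi * 440 * t / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE)
    for t in range(NUM_SAMPLES)
]

print(dominant_frequencies(displacement, SAMPLE_RATE, 2))
```

In this analogy, each output bin plays roughly the role of one group of hair cells: it reports how strongly one frequency is present.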

The techniques employed by the ear and microphone are almost equally good, so a question that naturally arises is: Why did evolution choose one method over the other? Is the decision arbitrary, or is there a rationale behind it?

The answer is rather simple (though it eluded me for a long time). Nerves are too slow for a single nerve to carry all that information. A neuron can only fire once every few milliseconds[1], so even an optimistic firing rate is on the order of 1 kHz. By the Nyquist theorem, a signal sampled at that rate cannot represent frequencies above 0.5 kHz, so a single neuron could not convey anything higher. Humans are sensitive to sounds up to 20 kHz. Hence we need to employ the alternative system, where the Fourier transform is done physically rather than electrically.
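The Nyquist limit can be illustrated with a short sketch. It assumes a hypothetical neuron that samples the sound at exactly 1 kHz (the rate implied by the 0.5 kHz bound) and shows that a 3100 Hz tone then produces exactly the same samples as a 100 Hz tone, so the two would be indistinguishable. The tone frequencies and function names are illustrative choices.

```python
import math


def alias_frequency(tone_hz, sample_rate_hz):
    """Frequency (0 .. sample_rate/2) that a sampled tone appears to have."""
    remainder = tone_hz % sample_rate_hz
    return min(remainder, sample_rate_hz - remainder)


def sample_tone(tone_hz, sample_rate_hz, count):
    """Sample a unit-amplitude sine tone `count` times at the given rate."""
    return [
        math.sin(2 * math.pi * tone_hz * n / sample_rate_hz)
        for n in range(count)
    ]


# Assumption: one firing per millisecond, i.e. a 1 kHz sampling rate.
FIRING_RATE_HZ = 1000

high_tone = sample_tone(3100, FIRING_RATE_HZ, 50)
low_tone = sample_tone(100, FIRING_RATE_HZ, 50)

# Largest difference between the two sample sequences.
max_diff = max(abs(a - b) for a, b in zip(high_tone, low_tone))

print(alias_frequency(3100, FIRING_RATE_HZ))  # apparent frequency in Hz
print(max_diff < 1e-9)  # the two sample sequences coincide
```

Anything above half the sampling rate folds back into the 0 to 0.5 kHz band in this way, which is why a faster sampler, or a different scheme altogether, is needed to cover the range up to 20 kHz.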

Even our 0.5 kHz bound is a gross overestimate. Neurons typically encode information in trains of rapid firings, with the rate of firing indicating the amplitude. Hence each firing encodes less than 1 bit of information. While there is some evidence that the precise timing of firings can also encode information, the temporal resolution is still insufficient.

[*] Actually they carry how fast the sheet is moving, but that is just a matter of phase difference.
[#] It probably can’t since most phase information is lost. My experience with synthesizing sound corroborates this, since phase didn’t seem to change human perception.

2 thoughts on “To FFT or not to FFT”

  1. What if there are a number of neurons which sample the displacement of the diaphragm at different, uniformly spaced points in time? The brain could place these signals one by one in order and then compute the Fourier transform to get the frequencies. Why didn’t evolution choose this approach?

    • That is a good point! I did not give this possibility a lot of thought. It is not clear to me how such a system could be implemented, though. In the straightforward method, one would have to synchronize the neurons’ sampling to very fine timescales, i.e. they would have to sample at a very constant frequency, with a constant phase difference between each other. I do not know of anything similar happening elsewhere in our body. One would presumably use an oscillator of some sort. The fastest pacemakers in our heart (in the sino-atrial node) do about 100-200 beats per minute, which is only a few Hz. I am not sure whether frequencies thousands of times higher are possible.

      Once we have a signal, finding the frequencies should be straightforward. It is achieving high temporal precision with neurons that is the challenge.

      An alternative method would be for the neurons to sample without synchronizing with each other. I will have to think about whether this would preserve enough information. This is an entertaining train of thought. Thanks for pointing it out.
