How the cochlea computes (2024)

izhak | 439 points

If you want to get really deep into this, Richard Lyon has spent decades developing the CARFAC model of human hearing: Cascade of Asymmetric Resonators with Fast-Acting Compression. As far as I know it's the most accurate digital model of human hearing.

He has a PDF of his book about human hearing on his website: https://dicklyon.com/hmh/Lyon_Hearing_book_01jan2018_smaller...

antognini | 16 hours ago

The thesis about human speech occupying less crowded spectrum is well aligned with a book called "The Great Animal Orchestra" (https://www.amazon.com/Great-Animal-Orchestra-Finding-Origin...).

The author details how the "dawn chorus" is composed of a vast number of species all making noise, yet each is able to pick out mating calls and other signals because their vocalizations have evolved into unique sonic niches.

It's quite interesting but also a bit depressing as he documents the decline in intensity of this phenomenon with habitat destruction etc.

shermantanktop | 18 hours ago

> A Fourier transform has no explicit temporal precision, and resembles something closer to the waveforms on the right; this is not what the filters in the cochlea look like.

Perhaps the ear does something more vaguely analogous to a discrete Fourier transform on samples of data, which is what we do in a lot of signal processing.

In signal processing, we take windowed samples, and do discrete transforms on these. These do give us some temporal precision.

There is a trade-off there between frequency and temporal precision, analogous to the Heisenberg uncertainty principle in quantum mechanics. The better we know a frequency, the less precisely we know the timing. Only an infinite, periodic signal has a single precise frequency (or precise set of harmonics), which shows up as infinitely narrow blips in the frequency domain.
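To make that trade-off concrete, here is a minimal sketch in plain Python. The sample rate, the two tone frequencies, and the helper names (`windowed_spectrum`, `tones_resolved`) are all assumptions picked for illustration. It takes a Hann-windowed block of a two-tone signal, does a naive DFT, and checks whether any bin between the tones shows a dip.

```python
import cmath
import math

FS = 1000.0            # sample rate in Hz (assumed for illustration)
F1, F2 = 100.0, 110.0  # two tones 10 Hz apart (assumed)

def windowed_spectrum(n):
    """Magnitude of the naive DFT of n Hann-windowed samples of the two-tone signal."""
    x = [
        (math.sin(2 * math.pi * F1 * t / FS) + math.sin(2 * math.pi * F2 * t / FS))
        * (0.5 - 0.5 * math.cos(2 * math.pi * t / n))  # periodic Hann window
        for t in range(n)
    ]
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2)
    ]

def tones_resolved(n):
    """True if some DFT bin lying strictly between the tones shows a clear dip."""
    mag = windowed_spectrum(n)
    spacing = FS / n  # frequency distance between adjacent bins
    k1, k2 = round(F1 / spacing), round(F2 / spacing)
    between = [k for k in range(len(mag)) if F1 < k * spacing < F2]
    return any(mag[k] < 0.5 * min(mag[k1], mag[k2]) for k in between)

# Short window: 50 ms of signal, bins 20 Hz apart (good timing, poor frequency detail).
# Long window: 400 ms of signal, bins 2.5 Hz apart (poor timing, good frequency detail).
short_ok = tones_resolved(50)
long_ok = tones_resolved(400)
```

With these assumed numbers the 50-sample window has no bin between 100 Hz and 110 Hz at all, so it cannot separate the tones, while the 400-sample window can, at the cost of smearing everything over 400 ms.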

The continuous Fourier transform deals with periodic signals only. We transform an entire function like sin(x) over the entire domain. If that domain is interpreted as time, we are including all of eternity, so to speak, from negative infinite time to positive.

kazinator | 18 hours ago

Nit: It’s an unfortunate confusion of naming conventions, but Fourier Transform in the strictest sense implies an infinite “sampling” period, while the finite “sample” period counterpart would correspond to Fourier Series even though we colloquially refer to them interchangeably.

(I had put “sampling” in quotes as it’s actually an “integration period” in this context of continuous time integration, though that would be less immediately evocative of the concept people are colloquially familiar with. If we further impose a constraint of finite temporal resolution so that it is honest-to-god “sampling”, then it becomes the Discrete Fourier Transform, of which the Fast Fourier Transform is one implementation.)
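As a small illustration of the DFT versus FFT distinction, here is a sketch in plain Python: a direct O(N²) evaluation of the DFT definition next to a textbook recursive radix-2 FFT. The function names are mine, and the radix-2 version is only one of many FFT algorithms and only handles power-of-two lengths.

```python
import cmath

def dft(x):
    """Discrete Fourier Transform straight from the definition, O(N^2)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT: same transform, O(N log N)."""
    n = len(x)
    if n < 1 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return list(x)
    even = fft(x[0::2])  # transform of even-indexed samples
    odd = fft(x[1::2])   # transform of odd-indexed samples
    twiddled = [cmath.exp(-2j * cmath.pi * k / n) * odd[k] for k in range(n // 2)]
    return (
        [even[k] + twiddled[k] for k in range(n // 2)]
        + [even[k] - twiddled[k] for k in range(n // 2)]
    )

samples = [1.0, 2.0, 3.0, 4.0, 0.0, -1.0, -2.0, -3.0]
max_difference = max(abs(a - b) for a, b in zip(dft(samples), fft(samples)))
```

Both functions compute the same mathematical object; the FFT is just a faster way to get there, so `max_difference` should be at the level of floating-point rounding.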

It is this strict definition that the article title is rebutting, but it’s not quite what the colloquial usage loosely evokes in most people’s minds when we usually say Fourier Transform as an analysis tool.

So this article should have been comparing to Fourier Series analysis rather than Fourier Transform in the pedantic sense, albeit that’ll be a bit less provocative.

Regardless, it doesn’t at all take away from the salient points of this excellent article, which are a really interesting reframing of the concepts: what the ear does mechanistically is apply a temporal “weighting function” (filter), so it’s somewhere between Fourier series and Fourier transform. This article hits the nail on the head in presenting the sliding scale of conjugate-domain trade-offs (think: Heisenberg).

xeonmc | 19 hours ago

To summarize: the ear does not do a Fourier transform, but it does do a time-localized frequency-domain transform akin to wavelets (specifically, intermediate between wavelet and Gabor transforms). It does this because the sounds processed by the ear are often localized in time.
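A rough way to picture "between wavelet and Gabor" is to look at how the analysis window's duration depends on center frequency. The sketch below assumes Gaussian envelopes and a made-up exponent `alpha`: 0 gives a fixed window (Gabor), 1 gives a window with a fixed number of cycles (wavelet), and anything in between is the intermediate case. The reference values and the choice `alpha = 0.5` are purely illustrative, not measured cochlear parameters.

```python
import math

def envelope_sigma_t(freq_hz, alpha, ref_hz=1000.0, ref_sigma_t=0.002):
    """Temporal width (s) of a Gaussian envelope at a given center frequency.

    alpha = 0.0 -> same duration at every frequency (Gabor-like)
    alpha = 1.0 -> duration shrinks as 1/f, constant cycle count (wavelet-like)
    ref_hz and ref_sigma_t are hypothetical reference values.
    """
    return ref_sigma_t * (ref_hz / freq_hz) ** alpha

def bandwidth_sigma_f(sigma_t):
    """Spectral width (Hz) of a Gaussian envelope exp(-t^2 / (2 sigma_t^2))."""
    return 1.0 / (2.0 * math.pi * sigma_t)

for name, alpha in (("Gabor", 0.0), ("intermediate", 0.5), ("wavelet", 1.0)):
    for f in (250.0, 1000.0, 4000.0):
        st = envelope_sigma_t(f, alpha)
        print(f"{name:12s} f={f:6.0f} Hz  sigma_t={st * 1e3:6.3f} ms  "
              f"sigma_f={bandwidth_sigma_f(st):7.1f} Hz")
```

In every row the product of `sigma_t` and `sigma_f` is the same constant; the three schemes only differ in how they spend that budget across frequencies.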

The article also describes a theory that human speech evolved to occupy an unoccupied space in frequency vs. envelope duration space. It makes no explicit connection between that fact and the type of transform the ear does—but one would suspect that the specific characteristics of the human cochlea might be tuned to human speech while still being able to process environmental and animal sounds sufficiently well.

A more complicated hypothesis off the top of my head: the location of human speech in frequency/envelope is a tradeoff between (1) occupying an unfilled niche in sound space; (2) optimal information density taking brain processing speed into account; and (3) evolutionary constraints on physiology of sound production and hearing.

edbaskerville | 18 hours ago

Wow, this discussion about how our ears work is mind-blowing! It's amazing how complex sound processing is, and the comparison to signal processing concepts is really illuminating.

rattan12138 | 4 hours ago

Nice to see a video for the tip links and ion channels.

I spent a while reading up on that stuff because I was trying to figure out what causes my tinnitus. My best guess is that if the hairs overbend, that stuff can break and an ion channel gets stuck open, causing the cell to fire continually.

Another fun ear fact is that they incorporate active amplification. You can hook an electrical signal to the loudspeaker-type cell to make it vibrate around: https://youtu.be/pij8a8aNpWQ

tim333 | 14 hours ago

Just a warning that the video ends with a loud, high pitched tone that will make you want to rip your headphones off.

Ironic for a video about hearing.

Cadwhisker | 13 hours ago

This subject has bothered me for a long time. My question to guys into acoustics was always: if the cochlea performs some kind of Fourier transform, what are the chances that it uses sine waves as a basis for the vector space? If it did anything like that, it could just as well use any slightly different waveforms as a basis for the transformation. Stiffness and non-linearity will for sure ensure that any ideal rubber model in physics will in reality differ from the perfect sine.

adornKey | 18 hours ago

I've always thought the basilar membrane was a fascinating piece of biological engineering. Whether or not the difference between its behavior and FT really matters depends on the context. For audio processing on a computer, FFT is often great. For trying to understand or model human sound perception, particularly in relation to time, FFT has weaknesses.

shannifin | 8 hours ago

As the auditory association cortex in the temporal lobe discriminates frequencies, there must be some time-frequency transform between the ear and the brain. This must be discrete (as neurons fire in bursts and there is a finite frequency resolution capacity) and finite in time.

The poor man's conversion of finite to equivalent infinite time is to assume an infinite signal where the initial finite one is repeated infinitely into the past and the future.
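A quick numerical check of that periodic-extension idea, as a sketch in plain Python with hypothetical helper names: repeating a finite block adds no new frequency content. The DFT of the block tiled `repeats` times is zero everywhere except at every `repeats`-th bin, where it equals the original DFT scaled by `repeats`.

```python
import cmath

def dft(x):
    """Discrete Fourier Transform straight from the definition."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def dft_of_repeated(x, repeats):
    """DFT of the block x tiled end to end `repeats` times."""
    return dft(list(x) * repeats)

block = [1.0, 2.0, 0.0, -1.0]   # arbitrary finite signal (assumed)
single = dft(block)
tiled = dft_of_repeated(block, 3)

for m, value in enumerate(tiled):
    if m % 3 == 0:
        # Lines up with bin m // 3 of the single block, scaled by the repeat count.
        print(m, abs(value - 3 * single[m // 3]) < 1e-9)
    else:
        # Bins in between carry nothing new.
        print(m, abs(value) < 1e-9)
```

Pushing the number of repeats toward infinity just keeps sharpening those same spectral lines, which is the Fourier series picture of a periodic signal.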

tsoukase | 12 hours ago

Somewhere here must lie the cure to tinnitus.

hbarka | 3 hours ago

This is fascinating.

I know of vocoders in military hardware that encode voices to resemble something simpler for compression (a low-tone male voice), giving smaller packets that take less bandwidth. The ear must also have evolved together with our vocal cords and mouth to occupy available frequencies for transmission and reception for optimal communication.

The parallels with waveforms don't end there. Waveforms are also optimized for different terrains (urban, jungle).

Are languages organic waveforms optimized to ethnicity and terrain?

Cool article indeed.

javier_e06 | 16 hours ago

man I need to finally learn what a Fourier transform is

tryauuum | 19 hours ago

"It appears that human speech occupies a distinct time-frequency space. Some speculate that speech evolved to fill a time-frequency space that wasn’t yet occupied by other existing sounds."

I found this quite interesting, as I have noticed that I can detect voices in high-noise environments. E.g. HF Radio where noise is almost a constant if you don't use a digital mode.

fennec-posix | 15 hours ago

supplemental:

Neuroanatomy, Auditory Pathway

https://www.ncbi.nlm.nih.gov/books/NBK532311/

Cochlear nerve and central auditory pathways

https://www.britannica.com/science/ear/Cochlear-nerve-and-ce...

Molecular Aspects of the Development and Function of Auditory Neurons

https://pmc.ncbi.nlm.nih.gov/articles/PMC7796308/

rolph | 16 hours ago

FT is frequency domain representation.

neural signaling by action potential, is also a representation of intensity by frequency.

the cochlea is where you can begin to talk about bio-FT phenomenon.

however the format "changes" along the signal path, whenever a synapse occurs.

rolph | 19 hours ago

Tbh I used to think that it does. For example, when playing higher notes, it's harder to hear the out-of-tune frequencies than on the lower notes.

p0w3n3d | 19 hours ago
[deleted]
| 15 hours ago

What does the continuous tingling of a hair cell sound like to the subject?

amelius | 15 hours ago

Many versions of this article could be written:

The computer does not do a Fourier transform (FFT computes the discrete Fourier transform)

Spectroscopes don't do a Fourier transform (it's actually the short-time FT)

The only thing that actually does Fourier transform is a mathematician, with a pen and some paper.

xmcqdpt2 | 15 hours ago

Why is there no box diagram for cochlea "between wavelet and Gabor" ?

gowld | 18 hours ago

Fourear transform

debo_ | 16 hours ago

Spoiler: yes it does, but the author isn't familiar with how the term Fourier Transform is used in signal processing.

dboreham | 11 hours ago

Man, I've been spreading disinformation for years.

bloppe | 19 hours ago
[deleted]
| 14 hours ago

OT: Does anyone here believe in Intelligent Design?

brcmthrowaway | 17 hours ago

The title seems a little click-baity and basically wrong. Gabor transforms, wavelet transforms, etc. are all generalizations of the Fourier transform, which give you a spectrum analysis at each point in time.

The content is generally good but I'd argue that the ear is indeed doing very Fourier-y things.

superb-owl | 17 hours ago

[dead]

hamonrye | 6 hours ago

[flagged]

lala_ | 18 hours ago