The Linux audio stack demystified

ruffyx64 | 134 points

I wrote this blog article because I needed a better understanding of the audio stack on Linux (esp. PipeWire, PulseAudio, ALSA, etc.). It turned out to be a lengthy, in-depth explanation of how audio works, how digital audio works, and what sound servers on Linux actually do. I tried to write it so that it is accessible and understandable for beginners but also enlightening for experienced users. Hope it's helpful to HN.

ruffyx64 | 2 months ago

I can explain it much more simply:

"At first Linus created /dev/dsp, and the user did smile upon him, and the user did see that it was good, and the user did see that it was simple, and people did use their sound, and people did pipe in and out sound as they did please, and Ken Thompson Shined upon them for following the way"

"Then the fiends got in on it and ruined it all, with needless complexities and configurations and situationships, with servers and daemons, and server and daemon wrappers to wrap the servers and daemons, and wrappers for those server wrappers, and then came security permissions for the server wrapper wrapper wrappers, why doesn't my sound work anymore, and then the server wrapper server wrapper wrapper server did need to be managed for massive added complexity, so initd was replaced by systemd, which solves the server wrapper wrapper server server wrapper through a highly complicated system of servers and services and wrappers"

RIP /dev/dsp you will be missed

- Kernighan 3:16

amy-petrik-214 | a month ago

Thanks for the nice write-up. Do you have any insight into why Bluetooth audio is so clunky on Linux? I'm using a pair of Sony XM4s and have never had any problems on my 4 Windows machines. But on Ubuntu (both 22.04 and 24.04), I have had to jump through many hoops, from editing a bunch of config files and changing kernel flags to disabling and enabling a bunch of things I don't understand (mostly from reading the Arch Wiki), just to get it working some of the time. Some days it outright refuses to connect, sometimes it connects but doesn't play anything (switching the audio device to it generates some indecipherable error logs), and (probably worst) sometimes it connects very quickly but stays locked in low-fidelity mode instead of the A2DP sink. I'm so fed up that I now just switch to wired headphones every time I use Ubuntu.

anvuong | a month ago

I miss the simplicity of OSS :\

epx | a month ago

An informative article for the Linux parts, I skipped the basics/intro.

I’d like to see some more detail on the rating chart, particularly on the axes where PipeWire doesn’t surpass JACK/PulseAudio.

As an embedded software engineer who deals with processing at hundreds of kilohertz, it is funny hearing anything running Linux called “real time”.

If it’s not carefully coded on bare metal for well-understood hardware, it’s not real time, it’s just low latency. No true Scotsman though (looking over my shoulder for the FPGA programmers).

Zamiel_Snawley | a month ago

So far the audio section is a great intro to audio and digitization, and applies to any A-to-D process at some level. Looking forward to plowing through the rest.

The problem with audio is it's realtime (isochronous), which means good audio processing requires a guarantee of sorts. To get that guarantee requires a path through the system that's clear, which can be difficult to construct.
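
To put rough numbers on that guarantee: each buffer has to be filled before the hardware finishes playing the previous one, so the time budget is simply frames divided by sample rate. A minimal sketch, assuming a 48 kHz device and some illustrative buffer sizes (the function name and the sizes are made up for this example, not taken from the article):

```python
def period_deadline_ms(frames: int, sample_rate_hz: int) -> float:
    """Time available to produce one buffer before playback underruns."""
    return frames / sample_rate_hz * 1000.0


# Illustrative buffer sizes (assumed); smaller buffers mean tighter deadlines.
for frames in (64, 256, 1024):
    deadline = period_deadline_ms(frames, 48_000)
    print(f"{frames:>5} frames @ 48000 Hz -> {deadline:.2f} ms per buffer")
```

Anything on the path that can stall longer than that budget (scheduling, page faults, a slow client) shows up as a dropout, which is why the clear path is hard to construct.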

mannyv | a month ago

"Professional audio will typically utilize 24-bit. Everything higher than that is usually bogus. Bogus where only audiophiles will hear a difference." Does he mean internal DAW bit depths like 32/64-bit float are bogus, or am I reading it wrong?
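
For context, the usual rule of thumb is that the theoretical dynamic range of N-bit integer PCM is 20·log10(2^N), roughly 6 dB per bit. A quick sketch of that arithmetic (the function name is just for illustration):

```python
import math


def dynamic_range_db(bits: int) -> float:
    """Theoretical dynamic range of N-bit integer PCM: 20 * log10(2^N)."""
    return 20 * math.log10(2 ** bits)


for bits in (16, 24):
    print(f"{bits}-bit integer PCM: ~{dynamic_range_db(bits):.1f} dB")
```

One possible reading of the quote is that it is about capture and delivery formats. A 32-bit float sample has a 24-bit significand, so its precision is comparable to 24-bit integer; the exponent mainly buys headroom for intermediate mixing math, which is a different question from whether anyone can hear a difference.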

ladzoppelin | a month ago

Very nice article, I love posts that go right from the basics and build up to answer the question. And I certainly have a better understanding of DACs as a bonus!

Voklen | a month ago

Dupe from three days ago by the same author

g15jv2dp | a month ago

No mention of AoIP. I make heavy use of Netjack2 in my production / streaming studio. Great way to move 25/30 channels of audio between 5 PCs in real-time.

Beats the pants off DANTE.

Venn1 | a month ago

Well, the most confusing part of Linux is definitely the audio stack. Thanks for the write-up.

lofaszvanitt | a month ago