I forgot the most important part in the original, the kernel!
(I hope the ASCII art doesn't get mangled)
I have been working a while trying to get a big picture of how Linux handles sound processing and after much work I have put together this little representation of what I have learned.
Please send me any additional comments or components that I may have missed.
I hope it helps somebody who is struggling to get sound working and just needs some idea of how all the parts fit together.
There really needs to be some concise, up-to-date Wiki on Linux sound processing and how the different applications work and work together (or not).
Linux Sound Architecture

XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
X            Linux Sound Applications            X
X                                                X
X                        XXXXXXXXXXXXXXXXXXXXXXXXX
X                        X     Sound Servers     X
X                        X  ESD/Arts/NASD/Pulse  X
X                        X                       X
X    XXXXXXXXXXXXXXXXXXXXX                       X
X    X Third Party APIs  X                       X
X    X     GStreamer     X                       X
X    X                   X                       X
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
X                    X   OSS Compatibility API   X
X   ALSA Sound API   XXXXXXXXXXXXXXXXXXXXXXXXXXXXX
X                                                X
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
X                  Linux Kernel                  X
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
X                 Sound Hardware                 X
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Ross S. W. Walker
Information Systems Manager
Medallion Financial, Corp.
437 Madison Avenue
38th Floor
New York, NY 10022
Tel: (212) 328-2165
Fax: (212) 328-2125
WWW: http://www.medallion.com
______________________________________________________________________ This e-mail, and any attachments thereto, is intended only for use by the addressee(s) named herein and may contain legally privileged and/or confidential information. If you are not the intended recipient of this e-mail, you are hereby notified that any dissemination, distribution or copying of this e-mail, and any attachments thereto, is strictly prohibited. If you have received this e-mail in error, please immediately notify the sender and permanently delete the original and any copy or printout thereof.
On Tue, 2008-03-11 at 11:59 -0400, Ross S. W. Walker wrote:
> I have been working a while trying to get a big picture of how Linux handles sound processing and after much work I have put together this little representation of what I have learned.
> Please send me any additional comments or components that I may have missed.
Some corrections (PulseAudio contains an ALSA module that can redirect audio back into PA):
Linux Sound Architecture
[Ignacio's ASCII diagram did not survive archiving. The labels that can still be read, top to bottom: Linux Sound Applications; Third-Party APIs (GStreamer/Phonon/xine-lib) side by side with Sound Servers, with esd/aRts/NAS/JACK listed beneath; aoss and the OSS Compatibility API; the alsa-lib API, with a PA box and an arrow running from alsa-lib back up the diagram; Linux Kernel (ALSA driver); Sound Hardware.]
Yes, audio on Linux is a mess.
Ignacio Vazquez-Abrams wrote:
> On Tue, 2008-03-11 at 11:59 -0400, Ross S. W. Walker wrote:
> > I have been working a while trying to get a big picture of how Linux handles sound processing and after much work I have put together this little representation of what I have learned.
> > Please send me any additional comments or components that I may have missed.
> Some corrections (PulseAudio contains an ALSA module that can redirect audio back into PA):
ALSA provides an ALSA driver in its plugins to send audio to a PulseAudio server, so that part is pure ALSA. I mean, sure, it uses PulseAudio's protocol to send over the network, but as far as ALSA is concerned it's just another ALSA kernel driver for communicating with sound hardware. The PulseAudio server by itself is of course a pure sound server.
Having said that, I don't believe that the ALSA driver for PulseAudio counts as yet another interface.
AOSS is merely a shim for the built-in OSS Compatibility API that forces older OSS apps to use the API properly. Because of that, I count AOSS as part of the OSS Compatibility API.
Also, sound servers can and do use third-party API products such as GStreamer. Often GStreamer provides those plugins on behalf of the sound server ('cause no one else wants to), but the plugin is still part of the sound server, and as far as the sound server is concerned it is just sending audio directly to the hardware API. GStreamer/Phonon also have plugins for communicating with sound servers as well as HW APIs such as ALSA or OSS. When diagramming these third-party APIs, things can get ugly pretty darn fast.
Thanks for the additional examples, but I still stand by my original diagram. Maybe someone can take each part of the diagram, zoom in on it and show which apps/apis/modules from which project interface between each other and in which direction.
Linux Sound Architecture
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
X        Linux Sound Applications         X
X                                         X
X                    XXXXXXXXXXXXXXXXXXXXXX
X                     X   Sound Servers   X
X                     X ESD/aRts/NAS/JACK X
X                     X                   X
X XXXXXXXXXXXXXXXXXXXXX                   X
X X Third-Party APIs  X                   X
X X GStreamer/Phonon/ X                   X
X X     xine-lib      X                   X
X X                   X                   X
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
X                    X   OSS Compat API   X
X      ALSA API      XXXXXXXXXXXXXXXXXXXXXX
X                                         X
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
X       Linux Kernel (ALSA driver)        X
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
X             Sound Hardware              X
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
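For anyone who prefers data to ASCII art, the stacking in the diagram above can also be written down as a short list of layers. This is only a rough sketch in Python, not anything ALSA or the sound servers provide: the layer names are copied from the diagram, while the names `TOP`, `OPTIONAL`, `REQUIRED` and `routes` are made up here, and the rule that the two middle layers may be skipped while the bottom three are always traversed is an assumption of the sketch.

```python
from itertools import combinations

# Layer names copied from the diagram; the grouping below is an assumption.
TOP = "Linux Sound Applications"
OPTIONAL = ["Sound Servers", "Third-Party APIs"]  # an application may skip these
REQUIRED = [
    "OSS Compat API or ALSA API",
    "Linux Kernel (ALSA driver)",
    "Sound Hardware",
]


def routes():
    """Return every top-to-bottom route, taking or skipping each optional layer."""
    result = []
    for size in range(len(OPTIONAL) + 1):
        # combinations() keeps the original order, so layers never swap places.
        for middle in combinations(OPTIONAL, size):
            result.append([TOP, *middle, *REQUIRED])
    return result


for route in routes():
    print(" -> ".join(route))
```

Each printed line is one way audio could travel from an application down to the hardware under those assumptions. The sketch deliberately keeps sound servers above third-party APIs, exactly as drawn.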
> Yes, audio on Linux is a mess.
This I do agree with though!
-Ross
Ross S. W. Walker wrote:
Linux Sound Architecture
[ASCII diagram flattened in the archive. Recoverable content, top to bottom: Linux Sound Applications; a Third-Party APIs box (GStreamer/Phonon/xine-lib) with the Sound Servers box (ESD/aRts/NAS/JACK) now drawn directly beneath it; OSS Compat API and ALSA API; Linux Kernel (ALSA driver); Sound Hardware.]
One thing I didn't think of initially, which Ignacio's diagram reminded me of, is the yin/yang relationship between the third-party sound APIs and the sound servers. What I mean is that sound servers have plugins for communicating with hardware APIs, third-party APIs and other sound servers, while third-party APIs have plugins for communicating with hardware APIs, sound servers and other third-party APIs.
I modified the image to see if it can be graphically portrayed. It's better, but if a picture is worth a thousand words, then one needs ten thousand words to properly explain this one.
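Since a picture only goes so far here, the same two-way relationship can be sketched as a small graph instead. The Python below is a minimal illustration under stated assumptions: the four node names and the `PLUGS_INTO` table are invented for this sketch, each category is collapsed into a single node (so server-to-server and API-to-API hops are left out), and the edges simply restate the plugin relationships described above.

```python
# Which kind of component can hand audio to which (assumed from the text above).
PLUGS_INTO = {
    "application": ["sound server", "third-party API", "hardware API"],
    "sound server": ["third-party API", "hardware API"],
    "third-party API": ["sound server", "hardware API"],
    "hardware API": [],
}


def simple_paths(node, goal, seen=()):
    """Yield every path from node to goal that never revisits a component."""
    seen = seen + (node,)
    if node == goal:
        yield list(seen)
        return
    for nxt in PLUGS_INTO[node]:
        if nxt not in seen:  # the cycle between servers and APIs is cut here
            yield from simple_paths(nxt, goal, seen)


paths = list(simple_paths("application", "hardware API"))
for path in paths:
    print(" -> ".join(path))
```

Because sound servers and third-party APIs point at each other, the listing contains both "sound server -> third-party API" and "third-party API -> sound server" routes, which is exactly what a strictly layered drawing cannot show.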
-Ross
On Tue, 2008-03-11 at 16:14 -0400, Ross S. W. Walker wrote:
> Ignacio Vazquez-Abrams wrote:
> > On Tue, 2008-03-11 at 11:59 -0400, Ross S. W. Walker wrote:
> > > I have been working a while trying to get a big picture of how Linux handles sound processing and after much work I have put together this little representation of what I have learned.
> > > Please send me any additional comments or components that I may have missed.
> > Some corrections (PulseAudio contains an ALSA module that can redirect audio back into PA):
> ALSA provides an ALSA driver in its plugins to send audio to a PulseAudio server, so that part is pure ALSA. I mean, sure, it uses PulseAudio's protocol to send over the network, but as far as ALSA is concerned it's just another ALSA kernel driver for communicating with sound hardware. The PulseAudio server by itself is of course a pure sound server.
> Having said that, I don't believe that the ALSA driver for PulseAudio counts as yet another interface.
No, but it does place part of ALSA above (in front of?) part of PA. (JACK and OSS also have similar ALSA plugins, although I don't see the point of the OSS module)
> Also sound servers can and do use third party API products such as GStreamer.
"Can" far, far more than "do". The only concrete evidence of that I was able to find were the PA GStreamer and JACK plugins. I was unable to find any evidence that ESD or aRts use third-party APIs, only ALSA and OSS. (Incidentally, I found that NAS can only use OSS. Yet another reason for it to die.)
> Often GStreamer provides those plugins on behalf of the sound server ('cause no one else wants to), but the plugin is still part of the sound server and as far as the sound server is concerned it is just sending audio directly to the hardware API.
It's just sending audio, period. It's not at all concerned with where it ends up, just that it moves to the next stage.
> GStreamer/Phonon also have plugins for communicating with sound servers as well as HW APIs such as ALSA or OSS. When diagramming these third party APIs things can get ugly pretty darn fast.
Indeed. And I also think that an ASCII diagram is no longer sufficient for showing the details.
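One way past the limits of ASCII art would be to keep the relationships as data and let a graph tool draw them. The following is a hedged sketch in Python that emits Graphviz DOT text; the `EDGES` list and the `to_dot` helper are hypothetical names, and the edges cover only the sound-server back ends reported earlier in this message (ESD and aRts using ALSA and OSS, NAS using only OSS), not the full picture.

```python
# (source, destination) pairs taken from the findings reported in this message.
EDGES = [
    ("ESD", "ALSA"),
    ("ESD", "OSS"),
    ("aRts", "ALSA"),
    ("aRts", "OSS"),
    ("NAS", "OSS"),
]


def to_dot(edges, name="linux_sound"):
    """Render a list of (source, destination) pairs as a Graphviz digraph."""
    lines = [f"digraph {name} {{"]
    for src, dst in edges:
        # Quote node names so labels with dashes or spaces stay valid DOT.
        lines.append(f'    "{src}" -> "{dst}";')
    lines.append("}")
    return "\n".join(lines)


print(to_dot(EDGES))
```

Saving the output to a file such as `sound.dot` and running `dot -Tpng sound.dot -o sound.png` should produce a drawing, assuming Graphviz is installed. Further edges can be appended to the list as they are confirmed.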