The Windows Phone SDK includes a Windows.Phone.Media.Devices.AudioRoutingManager API which I had occasion to use.
The API allows apps that have communication audio streams (e.g., Voice over IP calls) to control whether the audio goes out over the earpiece, over the speakerphone, or over a Bluetooth headset. An app might make this choice automatically, or it might use the API to power "Speakerphone" and "Bluetooth" buttons in its UI.
The starting point is a GetDefault() method which gives you the singleton AudioRoutingManager object.
There are three ways to get information out of this object:
- A read-only AvailableAudioEndpoints property tells you the list of currently available audio outputs.
- A GetAudioEndpoint method tells you what the current audio output is.
- An AudioEndpointChanged callback tells you when either of the previous two things changes.
You can also tell the object to change something:
- SetAudioEndpoint(…) lets you tell Windows Phone where audio should come out, subject to some restrictions.
There are two enumerated types used by these methods:
- AvailableAudioRoutingEndpoints, which is the type of the AvailableAudioEndpoints property. This is a "flags"-style (multi-valued) enum with the following values:
  - None
  - Earpiece
  - Speakerphone
  - Bluetooth
- AudioRoutingEndpoint, which is returned by GetAudioEndpoint and is the sole argument for SetAudioEndpoint. This is a single-valued enum with the following values:
  - Default
  - Earpiece
  - Speakerphone
  - Bluetooth
  - WiredHeadset
  - WiredHeadsetSpeakerOnly
  - BluetoothWithNoiseAndEchoCancellation
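To make the read side concrete, here is a minimal sketch of querying the manager and decoding the flags-style property. It only compiles inside a Windows Phone project, and the AudioRoutingProbe class with its Has, Describe, and Start helpers is a hypothetical name introduced for illustration, not part of the SDK.

```csharp
using System.Diagnostics;
using Windows.Phone.Media.Devices;

// Hypothetical helper class; only the AudioRoutingManager members come from the SDK.
public static class AudioRoutingProbe
{
    // AvailableAudioRoutingEndpoints is a flags enum, so test membership bitwise.
    public static bool Has(AvailableAudioRoutingEndpoints available,
                           AvailableAudioRoutingEndpoints flag)
    {
        return (available & flag) != 0;
    }

    // Summarizes the current output and the list of available outputs.
    public static string Describe(AudioRoutingManager manager)
    {
        AvailableAudioRoutingEndpoints available = manager.AvailableAudioEndpoints;
        return string.Format(
            "current={0}, earpiece={1}, speakerphone={2}, bluetooth={3}",
            manager.GetAudioEndpoint(),
            Has(available, AvailableAudioRoutingEndpoints.Earpiece),
            Has(available, AvailableAudioRoutingEndpoints.Speakerphone),
            Has(available, AvailableAudioRoutingEndpoints.Bluetooth));
    }

    // Logs the state once, then again every time the callback fires.
    public static void Start()
    {
        AudioRoutingManager manager = AudioRoutingManager.GetDefault();
        Debug.WriteLine(Describe(manager));
        manager.AudioEndpointChanged += (sender, args) => Debug.WriteLine(Describe(sender));
    }
}
```

Calling AudioRoutingProbe.Start() once (for example, when the call page loads) should be enough to watch the values change as headsets are plugged in or connected.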
At first I found this very confusing. SetAudioEndpoint takes an AudioRoutingEndpoint type, but what do I pass to it? And why does GetAudioEndpoint always tell me "Speakerphone"?
After experimenting and chatting with the folks who own the API I was able to construct an internal mental model which made more sense to me.
- While communications audio is playing, the Phone has an audio routing policy. Imagine an AudioRoutingPolicy write-only property with the following values:
  - Normal: play to the first available of { Bluetooth device, wired headset, earpiece }
  - No Bluetooth: play to the first available of { wired headset, earpiece }
  - Speakerphone: play to the built-in speaker
- If you want to change this policy, the app needs to have either the ID_CAP_VOIP or ID_CAP_VOICEMAIL capability. The documentation refers to an ID_CAP_AUDIOROUTING capability, but this does not exist. First get the manager:
  var audioRoutingManager = Windows.Phone.Media.Devices.AudioRoutingManager.GetDefault();
  and then:
  - audioRoutingManager.SetAudioEndpoint(AudioRoutingEndpoint.Bluetooth); sets the policy to Normal
  - audioRoutingManager.SetAudioEndpoint(AudioRoutingEndpoint.Earpiece); sets the policy to No Bluetooth
  - audioRoutingManager.SetAudioEndpoint(AudioRoutingEndpoint.Speakerphone); sets the policy to Speakerphone
- There is no direct way to tell what the current value of the AudioRoutingPolicy is. You can sometimes guess, though, based on the value of GetAudioEndpoint and/or AvailableAudioEndpoints.
  - If GetAudioEndpoint is "Speakerphone", then the current policy is definitely "Speakerphone".
  - If GetAudioEndpoint is "Earpiece" or "WiredHeadset" or "WiredHeadsetSpeakerOnly", and AvailableAudioEndpoints has "Bluetooth", then the current policy must be "No Bluetooth".
  - If GetAudioEndpoint is "Bluetooth" or "BluetoothWithNoiseAndEchoCancellation", then AvailableAudioEndpoints must have "Bluetooth", and the current policy must be "Normal".
- When there are no audio communications streams, the policy is undefined.
- When the number of audio communications streams goes from zero to one, the policy is defaulted to "Normal". This means you shouldn't bother setting a policy until after your phone call starts playing audio.
- When a Bluetooth device is connected, the policy is reset to "Normal". This is usually what you want. But if you want to enforce a "No Bluetooth" or "Speakerphone" policy even if a Bluetooth device connects during the call, you need to listen for AudioEndpointChanged callbacks and make sure you enforce the policy you want at that time.
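The mental model above can be sketched in code. AudioRoutingPolicy, AudioRoutingPolicies, and AudioRoutingPolicyEnforcer are hypothetical names invented for this sketch (the SDK has no such types); only the AudioRoutingManager calls are real. It assumes a communications stream is already playing, since the policy is undefined otherwise, and it does not deal with whatever thread the callback arrives on.

```csharp
using Windows.Phone.Media.Devices;

// Hypothetical enum mirroring the imaginary write-only AudioRoutingPolicy property.
public enum AudioRoutingPolicy { Normal, NoBluetooth, Speakerphone }

public static class AudioRoutingPolicies
{
    // Maps a policy to the SetAudioEndpoint argument that selects it.
    public static AudioRoutingEndpoint ToEndpoint(AudioRoutingPolicy policy)
    {
        switch (policy)
        {
            case AudioRoutingPolicy.NoBluetooth: return AudioRoutingEndpoint.Earpiece;
            case AudioRoutingPolicy.Speakerphone: return AudioRoutingEndpoint.Speakerphone;
            default: return AudioRoutingEndpoint.Bluetooth; // Normal
        }
    }

    // Guesses the policy from observable state; null means "cannot tell".
    public static AudioRoutingPolicy? Guess(AudioRoutingEndpoint current,
                                            AvailableAudioRoutingEndpoints available)
    {
        switch (current)
        {
            case AudioRoutingEndpoint.Speakerphone:
                return AudioRoutingPolicy.Speakerphone;
            case AudioRoutingEndpoint.Bluetooth:
            case AudioRoutingEndpoint.BluetoothWithNoiseAndEchoCancellation:
                return AudioRoutingPolicy.Normal;
            case AudioRoutingEndpoint.Earpiece:
            case AudioRoutingEndpoint.WiredHeadset:
            case AudioRoutingEndpoint.WiredHeadsetSpeakerOnly:
                // Without Bluetooth available, Normal and No Bluetooth look the same.
                bool bluetooth = (available & AvailableAudioRoutingEndpoints.Bluetooth) != 0;
                return bluetooth ? AudioRoutingPolicy.NoBluetooth : (AudioRoutingPolicy?)null;
            default:
                return null;
        }
    }
}

// Re-applies the desired policy when the phone resets it (e.g., Bluetooth connects).
public sealed class AudioRoutingPolicyEnforcer
{
    private readonly AudioRoutingManager manager = AudioRoutingManager.GetDefault();
    private AudioRoutingPolicy desired = AudioRoutingPolicy.Normal;

    public AudioRoutingPolicyEnforcer()
    {
        manager.AudioEndpointChanged += OnAudioEndpointChanged;
    }

    // Call this only after the call has started playing audio.
    public void Apply(AudioRoutingPolicy policy)
    {
        desired = policy;
        manager.SetAudioEndpoint(AudioRoutingPolicies.ToEndpoint(policy));
    }

    private void OnAudioEndpointChanged(AudioRoutingManager sender, object args)
    {
        AudioRoutingPolicy? guess = AudioRoutingPolicies.Guess(
            sender.GetAudioEndpoint(), sender.AvailableAudioEndpoints);
        // Only act when the observed state contradicts the desired policy.
        if (guess.HasValue && guess.Value != desired)
        {
            sender.SetAudioEndpoint(AudioRoutingPolicies.ToEndpoint(desired));
        }
    }
}
```

A "Speakerphone" button handler would then reduce to something like enforcer.Apply(AudioRoutingPolicy.Speakerphone), with the enforcer created once per call.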
Here's a chart I put together on how the different states and policies interact:
| Audio routing policy | How the policy is invoked | Bluetooth Hands-Free (HF) device connected (AvailableAudioEndpoints = Speakerphone, Earpiece, Bluetooth) | Bluetooth HF device not connected (AvailableAudioEndpoints = Speakerphone, Earpiece) |
| --- | --- | --- | --- |
| Normal | Automatically when a call starts or Bluetooth connects; manually with SetAudioEndpoint(Bluetooth) | Bluetooth or BluetoothWith..., depending on the capabilities of the device | WiredHeadset or WiredHeadsetSpeakerOnly or Earpiece, depending on what is plugged in |
| No Bluetooth | Manually with SetAudioEndpoint(Earpiece) or SetAudioEndpoint(Default) | WiredHeadset or WiredHeadsetSpeakerOnly or Earpiece, depending on what is plugged in | WiredHeadset or WiredHeadsetSpeakerOnly or Earpiece, depending on what is plugged in |
| Speakerphone | Manually with SetAudioEndpoint(Speakerphone) | Speakerphone | Speakerphone |
| Invalid | The following calls are all errors: SetAudioEndpoint(WiredHeadset), SetAudioEndpoint(WiredHeadsetSpeakerOnly), SetAudioEndpoint(BluetoothWith...) | N/A; SetAudioEndpoint throws an exception | N/A; SetAudioEndpoint throws an exception |
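One way to stay out of the "invalid" row is to screen the argument before calling SetAudioEndpoint. The sketch below is only an illustration: AudioRoutingGuard, IsSettable, and TrySetAudioEndpoint are hypothetical names, and because the exception type is not something I have pinned down, the wrapper catches Exception broadly.

```csharp
using System;
using Windows.Phone.Media.Devices;

// Hypothetical guard around SetAudioEndpoint; not part of the SDK.
public static class AudioRoutingGuard
{
    // True only for the values the chart lists as valid SetAudioEndpoint arguments.
    public static bool IsSettable(AudioRoutingEndpoint endpoint)
    {
        switch (endpoint)
        {
            case AudioRoutingEndpoint.Default:
            case AudioRoutingEndpoint.Earpiece:
            case AudioRoutingEndpoint.Speakerphone:
            case AudioRoutingEndpoint.Bluetooth:
                return true;
            default:
                // WiredHeadset, WiredHeadsetSpeakerOnly, and
                // BluetoothWithNoiseAndEchoCancellation are report-only values.
                return false;
        }
    }

    // Returns false instead of throwing when the request cannot be honored.
    public static bool TrySetAudioEndpoint(AudioRoutingEndpoint endpoint)
    {
        if (!IsSettable(endpoint))
        {
            return false;
        }
        try
        {
            AudioRoutingManager.GetDefault().SetAudioEndpoint(endpoint);
            return true;
        }
        catch (Exception)
        {
            // Assumption: any failure here is treated as "routing not changed".
            return false;
        }
    }
}
```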
Note that if a wired headset is plugged in, the app has no way to make audio come out of the earpiece. This is true regardless of whether Bluetooth is connected.
It seems like much of my confusion resulted from a single enumerated type (AudioRoutingEndpoint) serving three purposes:
- Tell the app where audio is coming out (WiredHeadset vs. Earpiece)
- Tell the app what the capabilities of the current output are (Bluetooth vs. BluetoothWithNoiseAndEchoCancellation)
- Allow the app to control the audio routing policy (Default vs. Speakerphone)
I think it would have been clearer to make the audio routing policy a different enumerated type from the current audio output or the available audio outputs. But with the "audio routing policy" mental model, it's not too bad.