Currently with APL, as soon as you interact with the screen via touch or swipe, the voice interface stops. Arguably this is not multi-modal at all; it is single-modal, one interface at a time. To get back to voice, you have to say the wake word again, which users do not understand, and touching the screen stops voice once more.
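To make the limitation concrete, here is a minimal APL document sketch (the "buttonTapped" argument and the text are just placeholders). The tap reaches the skill as an Alexa.Presentation.APL.UserEvent, but the touch itself interrupts any speech in progress and closes the microphone, so the user has to fall back to the wake word:

```json
{
  "type": "APL",
  "version": "1.8",
  "mainTemplate": {
    "items": [
      {
        "type": "TouchWrapper",
        "onPress": [
          {
            "type": "SendEvent",
            "arguments": ["buttonTapped"]
          }
        ],
        "item": {
          "type": "Text",
          "text": "Tap me"
        }
      }
    ]
  }
}
```

The skill can respond to the UserEvent with speech, but there is no way to keep the voice session flowing across touch interactions, which is exactly the gap described above.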
There are multiple UserVoice enhancement requests on this topic (a few below), but it has been a long time with no action. With this limitation in place, the current 'multimodal' experience is so poor that I don't see third parties investing in building voice experiences for screens. Am I missing something? I would love to hear others' opinions.