Process speaker identification immediately for short audio samples
First off, this is an awesome API that I would love to use in my app. The big problem I have, though, is that it's not really usable for real-time, low-latency identification from short samples because:
1. The asynchronous callback method requires me to poll the operation result endpoint repeatedly, which takes (by my measurement) about 1200 ms in the ideal case, whereas I would really prefer results within 400-500 ms.
2. Each poll of the operation status counts against my QPS limit, which triggers throttling if I poll too often.
I would propose the following change to the speaker identification API:
For long inputs (audio longer than 10 seconds), return HTTP 202 with an operation result URL, as it does today.
Otherwise, process the audio immediately and return HTTP 200 with the identification results. This should reduce both the latency and the polling required to retrieve the results.
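
To illustrate what this would look like from the client side, here is a rough sketch in Python. Everything named here is my own assumption and not the actual API: the `identify` helper, the `Operation-Location` header, the `status`/`result` fields, and the `post_audio`/`get_operation` callables that stand in for the real HTTP calls.

```python
import time
from typing import Callable, Dict, Tuple

# Hypothetical response shape: (status_code, headers, json_body).
Response = Tuple[int, Dict[str, str], dict]


def identify(
    post_audio: Callable[[bytes], Response],
    get_operation: Callable[[str], Response],
    audio: bytes,
    poll_interval_s: float = 0.2,
    max_polls: int = 20,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Return identification results, polling only when the service asks for it."""
    status, headers, body = post_audio(audio)

    # Proposed fast path: short audio is processed immediately.
    if status == 200:
        return body

    # Existing path: long audio returns an operation URL to poll.
    if status == 202:
        # Assumed header name; the real service may use a different one.
        operation_url = headers["Operation-Location"]
        for _ in range(max_polls):
            sleep(poll_interval_s)
            _, _, operation = get_operation(operation_url)
            # Assumed status values: "succeeded" and "failed".
            if operation.get("status") == "succeeded":
                return operation["result"]
            if operation.get("status") == "failed":
                raise RuntimeError("identification failed")
        raise TimeoutError("operation did not finish in time")

    raise RuntimeError(f"unexpected HTTP status {status}")
```

With this shape, a short sample costs exactly one request and no polls, so it would not count against the QPS limit beyond the initial call.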
