Microsoft

Speaker Recognition API

Welcome to the Speaker Recognition API Forum

Categories

API – Any ideas or feedback pertaining to features or enhancements to Speaker Recognition API.

Documentation – Any ideas or suggestions for the API Reference or Documentation.

Language Support – Submit a request to have a particular language supported.

Samples & SDK Request – Let us know if you would like to see a code sample or SDK provided.

  1. Support Spanish language on Speaker Recognition API

    Can you please specify when Spanish language support is expected to be released for the Verification profile feature?

    1 vote
    0 comments  ·  Language Support
  2. Return Confidence Score

    Currently, with the GET Operation Status API call, confidence is returned as "High", "Normal", or "Low".

    Having the actual confidence score (a real number between 0 and 1) returned would be much more useful than an arbitrary label.

    GitHub Issue connected with this: https://github.com/MicrosoftDocs/azure-docs/issues/30221

    2 votes
    0 comments  ·  API
  3. speaker verification demo

    The demo for speaker verification, https://azure.microsoft.com/en-us/services/cognitive-services/speaker-recognition/, is great! Love the web-based aspect. The demo has a link that says 'want to build this', and that link takes you to the SDK docs, with no real info about how to build your sample app. I want to see the code for your demo! So a web-based client communicating audio collected from a browser and sent to a server that calls the speaker verification SDK. All your samples on GitHub seem to be WPF based. I need a web-based client talking to a (C#) server that calls your C#…

    3 votes
    0 comments  ·  Samples & SDK Requests
  4. Wrong recognition

    Hi, do you recommend a particular way of speaking or recording the audio? We tried testing the speaker verification API, and it is accepting another person's voice (acceptance rate: normal), which is not supposed to happen. It is also very hard to get an acceptance rate of high. Are there plans to express the confidence level as a percentage? Thank you.

    2 votes
    1 comment  ·  Speaker Verification
  5. Provide iPhone application to use Speaker Recognition API

    It would be helpful if you also provided a sample application for iPhone.

    0 votes
    Under Review  ·  0 comments  ·  Samples & SDK Requests
  6. 1 vote
    0 comments  ·  API
  7. Korean language support for Speaker recognition

    Please support Korean speaker recognition. I would like to support Korean.

    1 vote
    0 comments  ·  Language Support
  8. Add support for Speaker Diarization for untrained speakers.

    Distinguish between multiple speakers in a conversation without training the system first. IBM Watson currently supports this: https://www.ibm.com/blogs/bluemix/2017/05/whos-speaking-speaker-diarization-watson-speech-text-api/

    Given an audio recording of a conversation, the minimum I'm looking for is:
    Speaker 1 (0:01-0:03): Hi Ted, how are you today?
    Speaker 2 (0:04-0:05): I'm doing well, how about you?
    Speaker 1 (0:05-0:10): Good thanks. So the reason I called you today was to discuss your recent sales performance.

    Ideally each word would be timestamped so we could highlight the spoken word when displaying the transcription next to the playing audio. Also it would be nice if each word had a…
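
    The requested output can be modelled as a list of speaker-labelled segments. The following is only a sketch of that data shape in Python: `Segment`, `fmt_time`, and `render` are hypothetical names invented for illustration and are not part of the Speaker Recognition API or of any diarization service.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    speaker: int   # 1-based label assigned by a (hypothetical) diarizer
    start_s: int   # segment start, in whole seconds
    end_s: int     # segment end, in whole seconds
    text: str      # transcribed words for this segment


def fmt_time(seconds: int) -> str:
    """Format whole seconds as M:SS."""
    return f"{seconds // 60}:{seconds % 60:02d}"


def render(segments: List[Segment]) -> str:
    """Render segments in the 'Speaker N (start-end): text' layout from the post."""
    return "\n".join(
        f"Speaker {s.speaker} ({fmt_time(s.start_s)}-{fmt_time(s.end_s)}): {s.text}"
        for s in segments
    )


segments = [
    Segment(1, 1, 3, "Hi Ted, how are you today?"),
    Segment(2, 4, 5, "I'm doing well, how about you?"),
]
print(render(segments))
```

    Per-word timestamps, as asked for above, would fit the same shape by storing a list of timed words in each segment instead of a single string.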

    12 votes
    2 comments  ·  Speaker Identification
  9. Process speaker identification immediately for short audio samples

    First off, this is an awesome API that I would love to use in my app. The big problem I have, though, is that it's not really usable for real-time, low latency identification from short samples because:
    1. The asynchronous callback method requires me to make constant polls to the operation result endpoint, which takes (from my measurement) about 1200ms in the ideal case, whereas I would really prefer results within 400-500 ms.

    2. Each poll on the operation status costs me QPS, which triggers throttling if I poll too often.

    I would propose the following change to the speaker…
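
    Until something like this is supported, the client-side polling described in points 1 and 2 might be sketched as below. This is a workaround under stated assumptions, not the proposed change: `fetch_status` is a hypothetical caller-supplied function that performs one GET on the operation status URL and returns the decoded JSON, and the status strings other than "failed" are assumed rather than taken from this thread.

```python
import time
from typing import Callable, Dict


def poll_until_done(
    fetch_status: Callable[[], Dict],
    initial_delay_s: float = 0.2,
    max_delay_s: float = 2.0,
    timeout_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Poll fetch_status() with exponential backoff until a terminal status."""
    delay = initial_delay_s
    waited = 0.0
    while True:
        result = fetch_status()
        # "succeeded" and "failed" are assumed to be the terminal states.
        if result.get("status") in ("succeeded", "failed"):
            return result
        if waited + delay > timeout_s:
            raise TimeoutError("operation did not finish in time")
        sleep(delay)
        waited += delay
        delay = min(delay * 2, max_delay_s)  # back off to spend fewer requests


# Usage with a fake status source that finishes on the third poll.
responses = iter([
    {"status": "notstarted"},
    {"status": "running"},
    {"status": "succeeded"},
])
delays = []
final = poll_until_done(lambda: next(responses), sleep=delays.append)
print(final["status"], delays)
```

    Doubling the delay trades latency for fewer status requests, so it eases the throttling problem but does nothing for the 400-500 ms target; that would still need a synchronous response from the service.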

    2 votes
    0 comments  ·  Speaker Identification
  10. How to reduce response time for identification requests?

    When testing in Python, each identification request takes eight to nine seconds to get a response. Is this due to network latency, or does the identification model itself take that long to process? And is there any way to get a response faster? Thank you.

    2 votes
    0 comments  ·  Speaker Identification
  11. API support to know who spoke what

    I am trying to build a system that should be able to recognize all the speakers and what each speaker has said.
    I was trying to build the solution using the “Speaker Recognition API”. I am passing the voice/audio stream to the identification API and can tell who the speakers are, but I didn’t find a way to know who spoke what.
    Is there any way to know who spoke what using the “Speaker Recognition API”? It is required for my solution.
    A reference to any other APIs Microsoft is building would be really helpful.

    5 votes
    0 comments  ·  API
  12. Speaker Identification APIs

    The operation status API always returns status "failed" with the message "SpeakerInvalid". Please provide a solution to this problem. The audio is recorded exactly as specified in the documentation.

    {"status":"failed","createdDateTime":"2018-05-25T09:07:19.4685571Z","lastActionDateTime":"2018-05-25T09:07:20.3782489Z","message":"SpeakerInvalid"}

    2 votes
    Under Review  ·  1 comment  ·  Samples & SDK Requests
  13. 0 votes
    0 comments  ·  API
  14. Solution to many of your APIs

    Rather than offer the APIs at Microsoft, then send the user to GitHub, then hope the user can follow the various installation processes/steps, simply allow the user to download directly from Microsoft and include the newly generated key in the downloadable source code. This way, you control the entire process and don't have to worry about unzipping, npm installs, key issues etc.

    2 votes
    Under Review  ·  0 comments  ·  Samples & SDK Requests
  15. Add support for Italian in the Speaker Recognition API

    Please add the Italian language. Thanks for your support.

    2 votes
    0 comments  ·  Language Support
  16. Please add support for Danish language

    I know Denmark is a small country, but there is still a need for Danish support.

    1 vote
    0 comments  ·  API
  17. Please add "How To" in the documentation

    The "How To" aspect of Speaker Recognition API documentation is missing. The "How To" documentation for Face API is well documented in a step by step fashion. However its missing here for the Speaker Recognition API.
    Second part of my question is - can I send the *.wav file directly to the endpoint URL when using the API or should it be converted to multipart-form data or application octet-stream data?
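
    To make the second option concrete, here is a hedged Python sketch of a request that carries raw WAV bytes as an application/octet-stream body. It only builds the request and does not send it. The endpoint URL and key are hypothetical placeholders, and whether the service accepts this content type is an assumption to check against the API reference, not something confirmed in this thread.

```python
import urllib.request

# Hypothetical placeholders: substitute the real endpoint URL and key.
ENDPOINT = "https://example.invalid/speaker/verify"
SUBSCRIPTION_KEY = "<your-subscription-key>"


def build_audio_request(wav_bytes: bytes) -> urllib.request.Request:
    """Build (but do not send) a POST carrying raw WAV bytes as the body."""
    return urllib.request.Request(
        ENDPOINT,
        data=wav_bytes,
        method="POST",
        headers={
            "Content-Type": "application/octet-stream",
            # Assumed key header; confirm the name in the API reference.
            "Ocp-Apim-Subscription-Key": SUBSCRIPTION_KEY,
        },
    )


# In practice the bytes would come from open("sample.wav", "rb").read().
req = build_audio_request(b"RIFF....WAVEfmt ")
print(req.get_method(), req.get_header("Content-type"))
```

    With this approach the file needs no conversion; the bytes read from disk are the request body as-is.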

    7 votes
    0 comments  ·  Speaker Verification
  18. More than 10 speakers to identify

    Are there any plans to later support more than 10 profiles in requests to identify speaker voices?

    I am curious about the technical limitations of the identification service. Are there plans to make this a reliable service to build voice signature profiles on?

    7 votes
    2 comments  ·  API
  19. Is there any option to verify audio without a verificationProfileId?

    I want to create login functionality with speaker recognition for multiple users on a single device. This means a user can log in anywhere using a web portal, similar to face recognition (findSimilar(persistent id)).

    1 vote
    0 comments  ·  Speaker Verification
  20. When will speaker recognition be geo-available?

    It takes too long to recognize the speaker because the Azure service is on a different coast. Is this still an issue? Or do we now have servers in more locations?

    1 vote
    0 comments  ·  API
    Planned  ·  Luke Bayler responded

    Hello,

    This is still an issue, but we are working on making the service geo-available.

    Thanks,
    Luke
