Microsoft

How can we improve Speech service?

Confidence value per word or per speech fragment

I am building a proof of concept (POC) for speech recognition of long speeches.
https://docs.microsoft.com/de-de/azure/cognitive-services/speech/concepts#recognition-modes

The recognition mode "conversation" with format "detailed" delivers message responses of type "SpeechPhrase", which include a confidence value.

The recognition mode "dictation" with format "detailed" delivers message responses of type "SpeechFragment" and "SpeechPhrase". Only the "SpeechPhrase" messages include a confidence value; the fragments carry no confidence information.
With the C# service library in recognition mode "dictation", you get partial results with a confidence value (an enum). This is not the solution we want, because that confidence value appears to belong to the whole phrase ("Confidence: Indicates the level of confidence of a recognized phrase.", https://cdn.rawgit.com/Microsoft/Cognitive-Speech-STT-ServiceLibrary/master/docs/html/9d706b3a-8d1f-ba71-d628-fff00928c72d.htm).
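
For illustration, here is a minimal Python sketch of reading the phrase-level confidence from a detailed "SpeechPhrase" message. The payload shape (an "NBest" list whose entries carry "Confidence" and "Display" fields) and the sample values are assumptions made for this sketch, not output captured from the service. The helper name best_hypothesis is hypothetical as well.

```python
import json

# Sample message body. Field names and values are assumptions for illustration.
sample_phrase = """{
  "RecognitionStatus": "Success",
  "NBest": [
    {"Confidence": 0.91, "Display": "This is a long speech."},
    {"Confidence": 0.42, "Display": "This is along speech."}
  ]
}"""


def best_hypothesis(message_body):
    """Return (display_text, confidence) of the top hypothesis, or None if there is none."""
    payload = json.loads(message_body)
    candidates = payload.get("NBest", [])
    if not candidates:
        return None
    # Pick the hypothesis with the highest phrase-level confidence.
    best = max(candidates, key=lambda c: c.get("Confidence", 0.0))
    return best.get("Display", ""), best.get("Confidence")


print(best_hypothesis(sample_phrase))
```

Note that the confidence here describes the whole phrase, so it cannot tell which individual word is doubtful.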

The recognition mode "interactive" is not optimized for long speeches.

A numeric confidence value per word or per speech fragment would be very valuable for us. With such a value, the Microsoft speech service could give a self-assessment of whether it recognized each word or fragment correctly. Unfortunately, I have not found any way to get this.
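
To make the request concrete, this is roughly how we would use such values if they existed. The per-word structure below (a list of entries with "Word" and "Confidence" fields) is purely hypothetical, as are the function name and the threshold. As far as I can tell, the service does not return anything like it today.

```python
def low_confidence_words(words, threshold=0.6):
    """Return the words whose confidence is below the threshold."""
    return [w["Word"] for w in words if w["Confidence"] < threshold]


# Hypothetical per-word result. This shape is the feature being requested.
hypothetical_words = [
    {"Word": "this", "Confidence": 0.97},
    {"Word": "is", "Confidence": 0.95},
    {"Word": "along", "Confidence": 0.38},
    {"Word": "speech", "Confidence": 0.88},
]

print(low_confidence_words(hypothetical_words))
```

Words flagged this way could be marked for manual review in a long transcript.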

Thank you for a short answer on whether the Microsoft speech service supports any solution for this.

1 vote

Admin Luke Bayler (Community Manager, Microsoft Cognitive Services) shared this idea

    0 comments
