Microsoft

Speech Service

  1. Mark Labels for TTS Speech API

    Azure Speech API should offer JSON mark labels for Text to Speech audio. This would let developers combine the audio file with the JSON mark labels to build text that tracks the audio in an app. The competitor has a similar solution. I found Azure TTS to be superior, but I am forced to use the competitor's solution because Azure lacks JSON mark labels. Speech mark labels should be in JSON format and available for all languages. They should provide information such as the begin and end timestamps that map each sound to the corresponding text, phrase, and sentence.
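
    To illustrate the request, here is a minimal sketch of how an app might use such mark labels to find the text being spoken at a given playback position. The JSON shape is purely hypothetical: the field names `type`, `text`, `begin_ms`, and `end_ms`, and the helpers `load_marks` and `mark_at`, are assumptions made for this example and not an existing Azure format or API.

```python
import json
from bisect import bisect_right

# Hypothetical mark-label payload. The field names ("type", "text",
# "begin_ms", "end_ms") are assumptions for illustration only.
MARKS_JSON = """
[
  {"type": "sentence", "text": "Hello world.", "begin_ms": 0,   "end_ms": 900},
  {"type": "word",     "text": "Hello",        "begin_ms": 0,   "end_ms": 400},
  {"type": "word",     "text": "world",        "begin_ms": 450, "end_ms": 900}
]
"""


def load_marks(raw, mark_type="word"):
    """Parse the payload and keep marks of one type, sorted by start time."""
    marks = [m for m in json.loads(raw) if m["type"] == mark_type]
    return sorted(marks, key=lambda m: m["begin_ms"])


def mark_at(marks, position_ms):
    """Return the mark covering position_ms, or None during a gap."""
    starts = [m["begin_ms"] for m in marks]
    # Index of the last mark that starts at or before the position.
    i = bisect_right(starts, position_ms) - 1
    if i >= 0 and position_ms < marks[i]["end_ms"]:
        return marks[i]
    return None


words = load_marks(MARKS_JSON)
current = mark_at(words, 500)
print(current["text"] if current else None)
```

    An audio player could call `mark_at` on every time update and highlight the returned text. The same lookup works for phrases or sentences by passing a different `mark_type`.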

    2 votes

    Under Review  ·  0 comments  ·  Text to Speech
  2. Confidence value per word or per speech fragment

    I am doing a POC with speech recognition for long speeches.
    https://docs.microsoft.com/de-de/azure/cognitive-services/speech/concepts#recognition-modes

    The recognition mode "conversation" with format "detailed" returns response messages of type "SpeechPhrase", which include a confidence value.

    The recognition mode "dictation" with format "detailed" returns response messages of type "SpeechFragment" and "SpeechPhrase" (the latter including a confidence value). But the fragments do not contain any confidence information.
    With the C# service library and the recognition mode "dictation", you get partial results with a confidence value (enum). But this is not the solution we want, because the confidence value seems to belong to the whole phrase (Confidence: Indicates the level of confidence…
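
    For context, a rough sketch of reading the phrase-level confidence from a "detailed" SpeechPhrase message. The payload below is made up, and its field names (`RecognitionStatus`, `NBest`, `Confidence`, `Display`) reflect the detailed format as understood here, so treat them as assumptions. The helpers `best_hypothesis` and `needs_review` are hypothetical names for this example.

```python
import json

# Illustrative "detailed" SpeechPhrase payload. Values are invented and the
# field names are an assumption about the detailed output format.
PHRASE_JSON = """
{
  "RecognitionStatus": "Success",
  "Offset": 1200000,
  "Duration": 15300000,
  "NBest": [
    {"Confidence": 0.87, "Lexical": "hello world", "Display": "Hello world."},
    {"Confidence": 0.42, "Lexical": "hello word",  "Display": "Hello word."}
  ]
}
"""


def best_hypothesis(raw):
    """Return the highest-confidence NBest entry, or None if there is none."""
    phrase = json.loads(raw)
    if phrase.get("RecognitionStatus") != "Success":
        return None
    candidates = phrase.get("NBest", [])
    if not candidates:
        return None
    return max(candidates, key=lambda c: c["Confidence"])


def needs_review(raw, threshold=0.6):
    """Flag a phrase whose best hypothesis falls below the threshold."""
    best = best_hypothesis(raw)
    return best is None or best["Confidence"] < threshold


best = best_hypothesis(PHRASE_JSON)
print(best["Display"], best["Confidence"])
```

    Note that the confidence here applies to the whole phrase only, which is exactly the limitation described above: there is no equivalent value per word or per fragment to apply the same check to.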

    1 vote

    Under Review  ·  1 comment  ·  Speech to Text