Microsoft

Speech Service

  1. LUIS reference grammar ID fails in West Europe when included

    In the West Europe region, the service returns "Specified grammar type is not supported!" when we pass in a LUIS reference grammar ID (a.k.a. IntentRecognizer).

    This causes the speech service to fail with a "WebSocket is already in CLOSING or CLOSED state." error when the LUIS reference grammar ID is passed in. If it is not included, the service works correctly.

    May be related to this issue in the Cognitive Services Speech SDK repo: https://github.com/Azure-Samples/cognitive-services-speech-sdk/issues/127

    3 votes
    1 comment  ·  Speech to Text
  2. Automatic determination of English locales

    At present, features such as Speech to Text require us to specify the exact locale of the input language, for example en-US or en-AU.
    Users may not know which one to choose, so the service would be easier to use if it recognized the locale automatically from the user's voice.

    Please let me know if you have any plans for this in the future.

    1 vote
    1 comment  ·  Speech to Text
  3. Norwegian language needs improvement in grammar

    Norwegian needs a grammatical rethink with regard to compound words.

    In Norwegian, spaces separate individual words, and a compound word is written as a single word. So, for instance, when I say "console window" ("konsollvindu" in Norwegian) to the speech-to-text service, it outputs "konsoll vindu" with a space between the words.

    This.

    Must.

    Absolutely.

    Be.

    Fixed.

    I am a linguist by degree. Please hire me to fix this if you need help, because you really do.

    2 votes
    5 comments  ·  Speech to Text
  4. Confidence score on word level

    The lack of a word-level confidence score is a showstopper for my company's project. It would be extremely useful for us to have the confidence score included in the "Words" list, which consists of words and their timestamps.

    According to this answer: https://social.msdn.microsoft.com/Forums/en-US/4979ca92-aa0f-4d09-b010-fc2eeb1bde80/speech-results-confidence-score-on-word-level?forum=AzureCognitiveService#8ae67445-4e23-49ea-b694-a8d877dc2dd0
    the feature is not public and we suspect that it could be provided quickly.

    I'd be grateful for each vote for this idea!

    15 votes
    5 comments  ·  Speech to Text
  5. Please support spelling words

    Let's say I'm building an application where I want to know a user's name or address. Whereas I can find the address online, the name might be something unique.

    In this case, I'd like to let the user spell his name to the speech service. However, the results are not very good currently.

    I'd love it if there were an option to tell Cognitive Services that I'm spelling something or that I'm only sending letters to it.

    Adding a LUIS model or custom intents to the recognizer didn't improve the results either. Very clear names always lead to some characters…

    4 votes
    0 comments  ·  Speech to Text
  6. audio+transcription not available

    For training a model, I can only use "related text", not audio+transcription.

    2 votes
    0 comments  ·  Speech to Text
  7. Add support for KWS on iOS

    Add support for wake word detection on the iOS platform.

    2 votes
    1 comment  ·  Speech to Text
  8. Add support for other audio formats and bitrates

    Add support for other audio formats and bitrates

    4 votes
    2 comments  ·  Speech to Text
  9. Gaeilge (Irish)

    Please add Gaeilge, the Irish language.

    1 vote
    0 comments  ·  Speech to Text
  10. Is there a way to stream audio via WebSocket and get Speech to Text results AND get a copy of the recording on Azure Storage?

    We are currently using Bing Speech with LUIS, but are looking to migrate to the Speech service.

    Right now we have multiple recorders that operate in the browser: Flash, WebRTC, and HTML5.

    Each of these has to connect to Bing Speech to Text to get realtime translation and LUIS results to drive actions in the application. Additionally, we are currently streaming the audio to Amazon S3. Ideally, we would like to stream the audio only once, have it picked up by Microsoft for Speech to Text, AND be able to retrieve a URL to the recording for later use.

    Having to maintain two streams has…

    3 votes
    3 comments  ·  Speech to Text
  11. Please continue to support the Korean language supported by the Bing Speech API

    I would like to use the new Speech SDK. I am currently using the Bing Speech API, but if Korean is added, I will switch to the Speech SDK.

    1 vote
    1 comment  ·  Speech to Text
  12. Speech Service very high response time

    We are using https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/quickstart-js-browser
    to fill in user data on screen based on voice input (recognizeOnceAsync), but I am facing several problems:
    1. There is no way to stop the recognizer immediately after I speak a few words. It takes 4-5 seconds to stop processing the microphone input.
    2. Suppose I want to enter the user's first name by voice. It takes 5-6 seconds to complete the process, which is not acceptable. I need your input on how to improve this.
    3. Every result returned from the Speech service has an extra period (.) appended at the end. I'm not sure why this is happening.
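
    Until point 3 is addressed on the service side, one possible client-side stopgap is to strip the appended period before filling in the form field. This is only a minimal sketch: the function name `strip_trailing_period` is hypothetical and not part of the Speech SDK, and Python is used purely for illustration (the same logic ports directly to the JavaScript used in the quickstart).

```python
def strip_trailing_period(text: str) -> str:
    """Remove a single trailing period from a recognition result.

    Hypothetical helper, not part of any SDK. An ellipsis ("...") is
    left untouched on the assumption that it was spoken intentionally.
    """
    stripped = text.rstrip()
    if stripped.endswith(".") and not stripped.endswith("..."):
        return stripped[:-1]
    return stripped
```

    For a form field such as a first name, the recognized text would be passed through this helper before being displayed.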

    1 vote
    1 comment  ·  Speech to Text
  13. Xamarin Android & Xamarin iOS SDK

    There are no Xamarin Android or Xamarin iOS SDKs for Microsoft's latest Custom Speech service.

    We hope we can use the SDK ASAP, since we think Xamarin is one of Microsoft's core products and is widely used.

    5 votes
    2 comments  ·  Speech to Text
  14. Need AMR to be supported in Speech to Text services

    I saw that currently only the WAV audio format is supported.
    https://docs.microsoft.com/en-us/azure/cognitive-services/speech/getstarted/getstartedrest?tabs=Powershell

    However, my customer is using a Cordova app, and the default format in Android Cordova is AMR audio.
    We hope Azure Speech to Text can accept AMR as well.

    These cloud services accept various audio formats, including AMR, so I think it is technically feasible for Azure.
    http://www.folio3.com/speech-to-text-services/

    1 vote
    0 comments  ·  Speech to Text
  15. Samples

    I haven't been able to find samples of the TTS anywhere.

    2 votes
    3 comments  ·  Speech to Text
  16. Connector for PowerApps

    Please provide a connector which can be used in PowerApps.

    I have tried to use the service from the demo in an Azure Function, but this currently fails with an unknown error :(

    2 votes
    1 comment  ·  Speech to Text
  17. Improve the recognition of numbers

    Words like "tenthousend" are recognized as "10 1.000" but should be "10000". This is a problem if you want to allow input of numbers bigger than 1000.
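
    As a possible client-side workaround in the meantime, the split output could be merged after recognition. The sketch below is an illustration only: `merge_split_numbers` and `parse_number` are hypothetical names, it assumes "." is used as a thousands separator (as in the "1.000" example above), and it only handles a number directly followed by 1000 or 1000000. Tokens with attached punctuation are left unchanged.

```python
import re

# Digits with optional "." thousands separators (assumed from the "1.000" example).
_NUMBER = re.compile(r"^(\d{1,3}(\.\d{3})+|\d+)$")
_MULTIPLIERS = {1_000, 1_000_000}


def parse_number(token):
    """Return the integer value of a numeric token, or None if it is not one."""
    if not _NUMBER.match(token):
        return None
    return int(token.replace(".", ""))


def merge_split_numbers(text):
    """Merge pairs such as "10 1.000" into "10000" (hypothetical helper)."""
    out = []
    for token in text.split():
        value = parse_number(token)
        prev = parse_number(out[-1]) if out else None
        if value in _MULTIPLIERS and prev is not None and prev < value:
            # A smaller number followed by a multiplier: combine them.
            out[-1] = str(prev * value)
        else:
            out.append(token if value is None else str(value))
    return " ".join(out)
```

    Calling `merge_split_numbers("10 1.000")` yields "10000". Note that a standalone "1.000" is also normalized to "1000", which may or may not be desirable depending on the locale.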

    1 vote
    1 comment  ·  Speech to Text
  18. Capitalization

    This service is not doing a good job of capitalizing the first word in sentences. I say "I ate lunch period it was good" and I get

    I ate lunch period it was good. This later becomes

    I ate lunch. it was good

    For some reason, recognizing a spoken "period" as punctuation doesn't capitalize the first word of the next sentence.
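
    Until the service handles this, the missing capitalization could be patched on the client after the final result arrives. The following is a rough sketch, not an SDK feature: `capitalize_sentences` is a hypothetical helper, and it naively treats every ".", "!" or "?" followed by whitespace as a sentence boundary, so abbreviations such as "e.g." would be capitalized incorrectly.

```python
import re

# Start of text, or sentence-ending punctuation plus whitespace,
# followed by a lowercase ASCII letter.
_SENTENCE_START = re.compile(r"(^|[.!?]\s+)([a-z])")


def capitalize_sentences(text):
    """Capitalize the first letter of each sentence (hypothetical helper)."""
    return _SENTENCE_START.sub(
        lambda m: m.group(1) + m.group(2).upper(), text
    )
```

    Applied to the output above, `capitalize_sentences("I ate lunch. it was good")` gives "I ate lunch. It was good".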

    1 vote
    1 comment  ·  Speech to Text
  19. confidence number value per word or per speech fragment

    I am doing a POC with speech recognition for long speeches.
    https://docs.microsoft.com/de-de/azure/cognitive-services/speech/concepts#recognition-modes

    The recognition mode "conversation" with format "detailed" delivers message responses of type "SpeechPhrase" including confidence value.

    The recognition mode "dictation" with format "detailed" delivers message responses of type "SpeechFragment" and "SpeechPhrase" (including confidence value). But the fragments do not contain any information about confidence value.
    With the C# service library and the recognition mode "dictation" you'll get partial results with a confidence value (enum). But this is not our desired solution, because the confidence value seems to belong to the whole phrase (Confidence: Indicates the level of confidence…

    1 vote
    Under Review  ·  1 comment  ·  Speech to Text
Feedback and Knowledge Base