Microsoft

Bing Speech

Welcome to the Bing Speech Forum

The Cognitive Services Speech Service is replacing Bing Speech. Please refer to its forum for speech product feedback.


Categories

Documentation – Any ideas or suggestions for the API Reference or Documentation.

Language Support – Submit a request to have a particular language supported.

Samples & SDK Requests – Let us know if you would like to see a tutorial or sample provided.

Speech to Text – API & SDK – Ideas and feature requests to Speech Recognition and Speech to Text (STT).

Text to Speech – Ideas and feature requests for Text to Speech (TTS) – API only


Attention!

We have moved our Customer Feedback & Ideas for Azure Cognitive Services portal to the Azure Feedback Forum.

Please go to the link below to access our new Feedback and Ideas Page.


  1. Request for Lao TTS: we are looking for Lao support in Microsoft TTS and can cooperate on linguistic support. Any feedback from the developers?

    I am a director for the DSC and a university in Laos. We can provide Lao linguistic support if Microsoft is planning to add the Lao language. We are a small country, but please help us gain more access to the world through assistive technology.

    2 votes
    0 comments  ·  Text to Speech - API Only
  2. Is it correct to pronounce on air radio??

    Develop a system for aeronautical use of radio text-to-speech.

    1 vote
    0 comments  ·  Text to Speech - API Only
  3. The es-MX Voice Raul, Apollo has a bug reading some tildes.

    The text "Paciente P2, Posición 6" spoken by the Raul, Apollo voice ignores the tilde. If I write the word "Posición" alone, the voice reads it correctly.

    1 vote
    0 comments  ·  Text to Speech - API Only
  4. Text to Speech API - Bug found for Chinese text

    Hi,

    Could you please check the text "给我来点咖啡,谢谢。我们久仰贵公司的大名,那么它具体是何时成立的呢?". There is a distortion on the last 谢 character.

    Thanks
    Davis

    1 vote
    1 comment  ·  Text to Speech - API Only
  5. I need to use the Text to Speech API with my own voice. Is it possible to register a person's voice and use it with the Text to Speech API?

    I need to use the Text to Speech API with my own voice. Is it possible to register a person's voice and use it with the Text to Speech API?
    I have to register my voice.

    1 vote
    Under Review  ·  0 comments  ·  Text to Speech - API Only
  6. Confused about 3 parameters in request header

    I'm new to Azure and am trying to use Text to Speech programmatically.

    According to https://docs.microsoft.com/en-us/azure/cognitive-services/Speech/api-reference-rest/bingvoiceoutput#VoiceSynthesisRequest, the request needs X-Search-AppId, X-Search-ClientID and User-Agent headers. How do I get the values for them? And what does "application" mean in the description? Should I get those values from the Azure Portal, or just generate random ones?

    Thanks in advance!

    2 votes
    Under Review  ·  1 comment  ·  Text to Speech - API Only
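
For the header question above, here is a minimal sketch. It assumes the linked reference describes X-Search-AppId and X-Search-ClientID as GUIDs written as hex without dashes and generated by the client, and User-Agent as the name of the calling application. That reading is an assumption and should be checked against the reference. The helper name `build_tts_headers` is hypothetical and not part of any SDK.

```python
import uuid


def build_tts_headers(app_name, app_id=None, client_id=None):
    """Build the three request headers asked about in this idea.

    Assumption: both IDs are client-generated GUIDs in 32-character hex
    form (no dashes), and User-Agent is simply the application's name.
    """
    return {
        # uuid4().hex is a random GUID rendered as 32 hex digits, no dashes.
        "X-Search-AppId": app_id or uuid.uuid4().hex,
        "X-Search-ClientID": client_id or uuid.uuid4().hex,
        "User-Agent": app_name,
    }


headers = build_tts_headers("MyTtsClient")
```

If the IDs are meant to identify one application or installation, it probably makes sense to generate them once, store them, and pass them back in through `app_id` and `client_id` rather than creating new values on every call.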
  7. Better pronunciation

    We have a client that is upset because they received a call about one of their children whose name was pronounced wrong. The child's name was "Nicarri", but Bing pronounced it "Nicker", and the parent who received the call thought it was a racial slur.
    I attached a recording of a test call (not the actual call).

    1 vote
    Completed  ·  2 comments  ·  Text to Speech - API Only
  8. How can we implement this speech API in a J2EE web app? I am using servlets, JSP, Hibernate and Eclipse EE for development.

    How can we implement this speech API in a J2EE web app? I am using servlets, JSP, Hibernate and Eclipse EE for development.
    On the front end I am using HTML, CSS, JS and jQuery.

    I want to fill in the form using text input, and also have actions take place, such as navigating to a particular page.

    1 vote
    Under Review  ·  0 comments  ·  Text to Speech - API Only
  9. Improved English GB Male voice

    At https://www.microsoft.com/cognitive-services/en-us/Speech-api/documentation/API-Reference-REST/BingVoiceOutput#SupLocales there are some new voices suffixed by RUS.
    I cannot find what RUS stands for, but they are significantly better quality. There are now two British Female voices, one acceptable, one excellent, while the British Male voice remains very low quality.

    1 vote
    Under Review  ·  0 comments  ·  Text to Speech - API Only
  10. Dutch language support in Bing text-to-speech API

    The Bing text-to-speech API supports 10 languages, but of course, there are many more. Dutch is not yet supported.

    https://www.microsoft.com/cognitive-services/en-us/speech-api

    I have a Cognitive Services account for Bing Speech and I have software working to provide TTS in the supported languages. But the main language that is of interest to me is Dutch.

    I would very much appreciate it if Dutch could be added.

    5 votes
    Under Review  ·  4 comments  ·  Text to Speech - API Only
  11. Maximum request length

    There's no clear documentation on the maximum request length that the Text-to-Speech API can support. My plan was to chunk my text according to this maximum request length. Since my application is aggressive/greedy (it tries to max out every call), I get a 413 "RequestEntityTooLarge" error most of the time.

    I found this on Microsoft's web site: "The maximum amount of audio returned for a given request must not exceed 15 seconds." Which is quite useless because the length of the audio cannot be known from the client side at the time the request is generated. And I found that…

    7 votes
    Under Review  ·  0 comments  ·  Text to Speech - API Only
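
The chunking plan described in this idea could be sketched roughly as follows. The function `chunk_text` and its `max_chars` parameter are hypothetical. The post does not state a character limit, and the documented limit is on audio duration (15 seconds), so a safe `max_chars` would have to be found by experiment. The sketch packs whole sentences greedily and only splits mid-sentence when a single sentence is longer than the limit.

```python
import re


def chunk_text(text, max_chars):
    """Greedily pack whole sentences into chunks of at most max_chars."""
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # A single over-long sentence is hard-split at max_chars.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # Adding this sentence would overflow: close the current chunk.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


parts = chunk_text("One. Two. Three.", 9)  # ["One. Two.", "Three."]
```

Because the real constraint is audio length, choosing `max_chars` well below whatever value first triggers a 413 leaves headroom for text that is slow to speak, such as numbers and abbreviations.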
  12. Start and Duration for RecognizedPhrase or RecognitionResult

    The built in Windows Speech Recognition APIs allow us to tie recorded text to the corresponding portion of the audio. Could such ability be introduced to Bing Speech?

    3 votes
    Under Review  ·  0 comments  ·  Text to Speech - API Only
  13. Equivalent propositions generator

    Especially using text-to-speech I feel the lack of an engine (and the related APIs) to produce equivalent phrases. Given a sentence, and some parameters, we should be able to receive a collection of equivalent propositions and in the same language of the source sentence. Of course the service could evolve from one language to another.

    1 vote
    1 comment  ·  Text to Speech - API Only
  14. Text to Phoneme

    Provide text to phoneme capability to the API, on top of only the speech output.

    2 votes
    Under Review  ·  0 comments  ·  Text to Speech - API Only
  15. Control/reduce the amount of silence at end of text-to-speech clip

    Based on my testing, the current text-to-speech outputs appear to have anywhere between 659ms and 672ms of silence at the end. The start of the audio has between 62ms and 86ms of silence.

    When using multiple generated phrases in sequence, the long gap at the end makes the flow sound unnatural. At present, I have to post-process the generated audio using NAudio to clip the amount of silence at the end. I have found that leaving about 100ms at the end allows the clips to be played in sequence and still sound natural.

    It would be good if the amount of…

    10 votes
    Under Review  ·  0 comments  ·  Text to Speech - API Only
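
The post-processing workaround described in this idea might look something like the sketch below. It is not the poster's NAudio code. It is a plain Python illustration that works on already-decoded signed PCM samples, and the function name, the `threshold` value and the default `keep_ms` of 100 (taken from the figure in the post) are all assumptions.

```python
def trim_trailing_silence(samples, sample_rate, keep_ms=100, threshold=200):
    """Shorten trailing silence so at most keep_ms of it remains.

    samples:     sequence of signed PCM sample values (mono)
    sample_rate: samples per second
    threshold:   magnitudes at or below this count as silence (hypothetical value)
    """
    # Walk back from the end to just past the last non-silent sample.
    end = len(samples)
    while end > 0 and abs(samples[end - 1]) <= threshold:
        end -= 1
    # Keep a short tail of silence so consecutive clips do not run together.
    keep = int(sample_rate * keep_ms / 1000)
    return samples[: min(len(samples), end + keep)]


# One second of "speech" followed by 660 ms of silence at 16 kHz.
clip = [1000] * 16000 + [0] * 10560
trimmed = trim_trailing_silence(clip, 16000)  # 16000 + 1600 samples remain
```

A fixed amplitude threshold is the simplest possible silence detector. Real recordings with background noise may need a higher threshold or an energy measure over a short window.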
  16. Support SSML for customizing pronunciation

    The text-to-speech HTML interface accepts the input as an SSML document. I was trying to use the following features of SSML without any luck:

    <say-as interpret-as="ordinal">3</say-as> - should say "third"
    <phoneme> - to render speech by its phonetic pronunciation. For example, I get the wrong pronunciation of 'record'. I want to force it to use the correct pronunciation (i.e. record player vs record a song).
    Use SSML to Control Synthesized Speech

    Most of the things I tried were from the old Microsoft Speech SDK documentation. Is there any guidance on what is supported, what is not supported and any plans…

    5 votes
    Under Review  ·  0 comments  ·  Text to Speech - API Only
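
For reference, the two SSML features mentioned in this idea can be written as a well-formed document like the one built below. This only illustrates the markup as defined by the W3C SSML specification. Whether the Bing text-to-speech service honours `say-as` or `phoneme` is exactly the open question in the post, and nothing here confirms it. The sample sentence, the IPA string and the function name `build_ssml` are illustrative assumptions, and the SSML namespace declaration is left out for brevity.

```python
import xml.etree.ElementTree as ET

# Namespace URI of the reserved "xml" prefix, so the attribute serializes as xml:lang.
XML_NS = "http://www.w3.org/XML/1998/namespace"


def build_ssml(lang="en-US"):
    """Return an SSML string using <say-as> and <phoneme>."""
    speak = ET.Element("speak", {"version": "1.0", f"{{{XML_NS}}}lang": lang})
    speak.text = "You finished "

    # Ask for "3" to be read as an ordinal ("third").
    say_as = ET.SubElement(speak, "say-as", {"interpret-as": "ordinal"})
    say_as.text = "3"
    say_as.tail = ". Put the "

    # Force the noun pronunciation of "record" with an IPA transcription.
    phoneme = ET.SubElement(speak, "phoneme", {"alphabet": "ipa", "ph": "ˈrɛkərd"})
    phoneme.text = "record"
    phoneme.tail = " on the player."

    return ET.tostring(speak, encoding="unicode")


ssml = build_ssml()
```

Building the document with an XML library rather than string concatenation keeps the markup well-formed and escapes any special characters in the spoken text.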
