Microsoft

Bing Speech API

Welcome to the Bing Speech API Forum

Categories

Documentation – Any ideas or suggestions for the API Reference or Documentation.

Language Support – Submit a request to have a particular language supported.

Samples & SDK Requests – Let us know if you would like to see a tutorial or sample provided.

Speech to Text – API & SDK – Ideas and feature requests to Speech Recognition and Speech to Text (STT).

Text to Speech – Ideas and feature requests for Text to Speech (TTS) – API only

  1. Support low-latency Opus audio for speech recognition

    I am super happy to hear that Opus audio can be used for uploading speech to the speech-to-text API. However, I have a concern: because of the way Ogg pages and framing work, Opus packets are buffered for several seconds before being sent on the stream. This makes OggOpus useless for real-time speech transcription (though for the REST API it is fine).
    For real-time transcription over WebSocket I would appreciate an Opus protocol that works around the Ogg buffering issue, for example by using RTP headers or a custom size prefix scheme that frames the raw Opus packets. I have…
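To illustrate the "custom size prefix scheme" idea, here is a minimal sketch, assuming a hypothetical framing in which every raw Opus packet is preceded by a 2-byte big-endian length. This is not a format the Bing Speech API is known to accept; `frame_packet` and `Deframer` are illustrative names only. The point is that each packet can be sent the moment the encoder produces it, with no page-level buffering.

```python
import struct
from typing import List

# Hypothetical wire format: [2-byte big-endian length][raw Opus packet] ...
# An assumption for illustration, not a documented Bing Speech protocol.
_LEN = struct.Struct(">H")


def frame_packet(packet: bytes) -> bytes:
    """Prefix one raw Opus packet with its length so it can be sent immediately."""
    if len(packet) > 0xFFFF:
        raise ValueError("packet too large for a 16-bit length prefix")
    return _LEN.pack(len(packet)) + packet


class Deframer:
    """Recover whole packets from a byte stream that arrives in arbitrary chunks."""

    def __init__(self) -> None:
        self._buf = bytearray()

    def feed(self, data: bytes) -> List[bytes]:
        """Append received bytes and return every packet that is now complete."""
        self._buf.extend(data)
        packets: List[bytes] = []
        while len(self._buf) >= _LEN.size:
            (size,) = _LEN.unpack_from(self._buf, 0)
            end = _LEN.size + size
            if len(self._buf) < end:
                break  # wait for the rest of this packet
            packets.append(bytes(self._buf[_LEN.size:end]))
            del self._buf[:end]
        return packets
```

A sender would call `frame_packet` on each encoder output and write the result straight to the socket; the receiver feeds whatever bytes arrive into `Deframer.feed` and decodes each returned packet.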

    1 vote
    0 comments  ·  Speech to Text - API & SDK
  2. Add word timings, IPA syllables, confusion network to speech reco response

    I know you have this information available in the speech decoder; can you please expose it via the public API?
    - The list of phrase elements (words) and their timestamps within the audio stream
    - IPA phonemes for each phrase element
    - Confusion network output from the lattice

    Right now I am forced to reconstruct / approximate this data after the fact and it would be 1000x easier if the API could just give it to me.

    1 vote
    0 comments  ·  Speech to Text - API & SDK
  3. Exclude words in Bing Spell Check

    We have certain words we would like the spell checker to ignore. For instance, 'ltos' is a meaningful abbreviation to our customers, but gets corrected to 'lots.'

    4 votes
    0 comments  ·  Language Support
  4. 1 vote
    1 comment  ·  Speech to Text - API & SDK
  5. 1 vote
    Under Review  ·  0 comments  ·  Documentation
  6. Request for Lao TTS: in Laos we are looking for Lao support in Microsoft TTS, and we can cooperate on linguistic support. Any feedback from the developers?

    I am director for the DSC and a university in Laos. We can provide Lao linguistic support if Microsoft is planning to add the Lao language. We are a small country, but please help us gain more access to the world through assistive technology.

    2 votes
    0 comments  ·  Text to Speech - API Only
  7. speech recognition

    Speech Recognition should transcribe spelling...

    Suppose I spell a word aloud; I would like the response to be the letters of that word. For example, if I say 'double you A tee ee or', it should return 'w a t e r'.

    Is this already available in Bing Speech?
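In case it helps describe the behaviour I mean, here is a rough client-side sketch of the mapping, assuming the recognizer returns the letter names as ordinary words. The `LETTER_NAMES` table is hypothetical and deliberately partial (it only covers the example above), and nothing here is part of Bing Speech.

```python
from typing import Dict, List, Tuple

# Hypothetical, partial table of spoken letter names -> letters.
# Only the entries needed for the 'water' example are listed.
LETTER_NAMES: Dict[Tuple[str, ...], str] = {
    ("double", "you"): "w",
    ("a",): "a",
    ("tee",): "t",
    ("ee",): "e",
    ("or",): "r",
}
MAX_NAME_LEN = max(len(name) for name in LETTER_NAMES)


def spelled_to_letters(transcript: str) -> str:
    """Replace spoken letter names with letters, trying the longest name first."""
    tokens = transcript.lower().split()
    letters: List[str] = []
    i = 0
    while i < len(tokens):
        for n in range(MAX_NAME_LEN, 0, -1):
            key = tuple(tokens[i:i + n])
            if key in LETTER_NAMES:
                letters.append(LETTER_NAMES[key])
                i += len(key)
                break
        else:
            # Unknown word: keep it unchanged rather than guess.
            letters.append(tokens[i])
            i += 1
    return " ".join(letters)


print(spelled_to_letters("double you A tee ee or"))  # prints: w a t e r
```

A real table would need every letter plus the variants a recognizer might produce, which is why having this inside the service would be preferable.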

    1 vote
    Under Review  ·  0 comments  ·  Speech to Text - API & SDK
  8. WebSocket refuses handshake

    The WebSocket handshake is refused with close code = -1.
    Can you help me solve this problem?
    Thanks.

    1 vote
    1 comment  ·  Speech to Text - API & SDK
  9. Is there a way to stream audio via WebSocket and get Speech to Text results AND get a copy of the recording on Azure Storage?

    Right now we have multiple recorders that operate in the browser: Flash, WebRTC, and HTML5.

    Each of these has to connect to Bing Speech to Text to get real-time translation and LUIS results that drive actions in the application. Additionally, we currently stream the audio to Amazon S3. Ideally we would like to stream the audio only once, have Microsoft pick it up for Speech to Text, AND be able to retrieve a URL for later use.

    Having to maintain two streams has led to a number of problems where one isn't a consistent recording, and the transcription…

    2 votes
    1 comment  ·  Speech to Text - API & SDK
  10. Text to Speech - Request to improve or replace Korean voices

    Text to Speech: Korean - KR / HeamiRUS

    Currently Korean TTS is provided by the HeamiRUS voice, but it sounds too mechanical.
    I tried the Read aloud feature in Microsoft Edge, and the Korean voice is too awkward, while English is quite natural.
    Please improve this voice or replace it with a new one.

    Also, the only Korean voices are female.
    Please add a male voice later.

    I used Google Translate.

    25 votes
    22 comments  ·  Language Support
  11. Internal Server Error occurs.

    The API returned results when I ran this a few days ago, but today I get an Internal Server Error and no results are returned. Please confirm.

    Bing Speech REST API

    internal class BingSpeechHelper
    {
        private const string INTERACTIVE = "interactive";
        private const string CONVERSATION = "conversation";
        private const string DICTATION = "dictation";

        private const string LANGUAGE = "en-US";
        private readonly string _requestUri;

        public BingSpeechHelper()
        {
            //&format=detailed
            _requestUri =
                $@"https://speech.platform.bing.com/speech/recognition/{INTERACTIVE}/cognitiveservices/v1?language={LANGUAGE}";
        }

        public async Task<string> GetTextFromAudioAsync(string recordedFilename)
        {
            var file = await ApplicationData.Current.LocalFolder.GetFileAsync(recordedFilename);

            using (var fileStream = new FileStream(file.Path, FileMode.Open, FileAccess.Read))
            {
                using (var client = new…

    2 votes
    1 comment  ·  Speech to Text - API & SDK
  12. Is the pronunciation correct for on-air radio?

    Develop a system for aeronautical use of radio text-to-speech.

    1 vote
    0 comments  ·  Text to Speech - API Only
  13. The es-MX Voice Raul, Apollo has a bug reading some tildes.

    The text "Paciente P2, Posición 6" spoken by the Raul, Apollo voice ignores the tilde. If I write the word "Posición" alone, the voice reads it correctly.

    1 vote
    0 comments  ·  Text to Speech - API Only
  14. I am beginning to understand a fair amount in this early phase, but if you are patient, later on I will be able to ask questions and perhaps implement concepts n

    The combination of multiple frequencies sampled in pitch against a master marker A, compared with a master marker B,
    changes the vibratory space at a given instant into a new summed form that takes on a different tonality, namely the quantized timbre, a likely value to use for generating a real-form response... OK? Marker B represents the amplitude and is the effective, defined event time, the distinctive moment that can be encoded... OK?

    1 vote
    0 comments  ·  Language Support
  15. I am beginning to understand a fair amount in this early phase, but if you are patient, later on I will be able to ask questions and perhaps implement concepts n

    The combination of multiple frequencies sampled in pitch against a master marker A, compared with a master marker B,
    changes the vibratory space at a given instant into a new summed form that takes on a different tonality, namely the quantized timbre, a likely value to use for generating a real-form response... OK?

    1 vote
    0 comments  ·  Language Support
  16. Make a Microsoft Flow / PowerApps connector

    Allow support for all of the requests (speech to text, text to speech, etc.). Integrate with Flow!

    3 votes
    0 comments  ·  Speech to Text - API & SDK
  17. Bond.IO DLL error on latest version of Bing.Speech assembly

    Version 2.0.2 of this package installs Bond assemblies version 7.0.1; however, when using "RecognizeAsync" it looks for version 1.0.0.0 of the Bond.IO.Dll, which obviously doesn't exist. This is easily reproducible by taking the SpeechClientSample and updating the Microsoft.Speech.Bing NuGet package to the latest 'Stable' build.

    2 votes
    1 comment  ·  Speech to Text - API & SDK
  18. How can I disable punctuation in the recognized result when I use Bing speech recognition?

    How can I disable punctuation in the recognized result when I use Bing speech recognition? Can I turn this feature off?

    1 vote
    Under Review  ·  1 comment  ·  Speech to Text - API & SDK
  19. Support for different PCM sample rates (8kHz) in CreateSpeechRecognizerWithStream

    If you create a recognizer with CreateSpeechRecognizerWithStream, you need to provide an AudioInputStreamFormat. This class only supports 16000 for SamplesPerSec. I'd like to have support for 8000 so I can hook directly into a VoIP call. I was able to get it working with NAudio, but it needs an inline transcode, which can be expensive.
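For reference, the kind of inline transcode I mean looks roughly like the sketch below, assuming 16-bit mono PCM in the machine's native byte order. It doubles the rate by linear interpolation and is only an illustration: `upsample_8k_to_16k` is a made-up name, this is not what NAudio does internally, and a production resampler would apply proper filtering.

```python
from array import array


def upsample_8k_to_16k(pcm: bytes) -> bytes:
    """Double the sample rate of 16-bit mono PCM by linear interpolation.

    Assumes native byte order (little-endian on typical hardware, which
    matches WAV data). Raises ValueError if the input is not a whole
    number of 16-bit samples.
    """
    src = array("h")
    src.frombytes(pcm)
    out = array("h")
    for i, sample in enumerate(src):
        # Use the next sample for interpolation; repeat the last one at the end.
        nxt = src[i + 1] if i + 1 < len(src) else sample
        out.append(sample)
        out.append((sample + nxt) // 2)
    return out.tobytes()
```

Every 20 ms VoIP frame of 160 samples becomes 320 samples, so this has to run on each frame before the audio is written to the recognizer stream.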

    1 vote
    0 comments  ·  Speech to Text - API & SDK
  20. Support for Xamarin

    Hi,
    I wanted to know if you could add a Speech Client Library for Xamarin with features such as intermediate results during recognition.
    Thanks!

    2 votes
    2 comments  ·  Speech to Text - API & SDK