Speech Service
Attention!
We have moved our Customer Feedback & Ideas for Azure Cognitive Services portal to the Azure Feedback Forum.
-
Display of text recognized by SpeechToText (Speech SDK)
The results delivered by the SpeechRecognizer's Recognized event are not split at punctuation marks, and sentences run together even when the speaker changes.
As a result, it is not possible to tell when the speaker changes.
Please improve this so that an event is raised at each punctuation mark.
38 votes -
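Until such an event exists, one client-side workaround is to split each recognized result at sentence-ending punctuation. The following is only a minimal sketch: `split_at_punctuation` and `on_recognized` are hypothetical helper names, not part of the Speech SDK, and the text of a result is assumed to arrive as a plain string.

```python
import re

# Sentence-ending marks to split on (assumption: extend for other languages).
_SEGMENT = re.compile(r"[^.!?。！？]+[.!?。！？]*")


def split_at_punctuation(text):
    """Return the punctuation-delimited segments of one recognized result."""
    return [m.strip() for m in _SEGMENT.findall(text) if m.strip()]


def on_recognized(text, handler):
    """Hypothetical adapter: call `handler` once per segment instead of once per result."""
    for segment in split_at_punctuation(text):
        handler(segment)
```

This only splits the text. It cannot detect a change of speaker, and it will also split at periods inside numbers or abbreviations.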
Confidence score on word level
The lack of a word-level confidence score is a show stopper for my company's project. It would be extremely useful for us to have the confidence score included in the "Words" list, which consists of words and their timestamps.
According to this answer: https://social.msdn.microsoft.com/Forums/en-US/4979ca92-aa0f-4d09-b010-fc2eeb1bde80/speech-results-confidence-score-on-word-level?forum=AzureCognitiveService#8ae67445-4e23-49ea-b694-a8d877dc2dd0
the feature is not public and we suspect that it could be provided quickly.
I'd be grateful for each vote for this idea!
19 votes -
Neural voices in Northern Europe
Make neural voices available in the Northern Europe region.
We want to use the Norwegian neural voice in our product (based in Norway, naturally) but are for some reason locked out of this feature.
6 votes -
Option to speak English with different foreign accents
A German, French, Spanish, or Italian person speaking English each has their own accent. This would be perfect for applications such as Air Traffic Control.
6 votes -
Add support for other audio formats and bitrates
6 votes -
I need Tamil NLP
Tamil NLP
6 votes -
Neural TTS in French
Hi,
The new neural text-to-speech feature looks amazing, but one language is missing: French :)
I don't know if it is something on the list, or if it is coming soon, but I'm waiting for this feature to switch from Google Wavenet.
I'm pretty sure that the French voice generated by this new neural TTS from MS Cognitive Services will be a game changer. The French language is complex (the emphasis, the punctuation, etc.), but if MS can provide the same awesome quality as they have in English, we will be able to build something incredible.
5 votes -
Please support spelling words
Let's say I'm building an application where I want to know a user's name or address. Whereas I can find the address online, the name might be something unique.
In this case, I'd like to let the user spell their name to the speech service. However, the results are currently not very good.
I'd love it if there were an option to tell Cognitive Services that I'm spelling something or that I'm only sending letters to it.
Adding a LUIS model or custom intents to the recognizer didn't improve the results either. Very clear names always lead to some characters…
5 votes -
Xamarin Android & Xamarin iOS SDK
There is no Xamarin Android or Xamarin iOS SDK for the latest custom speech used by Microsoft.
We hope we can use the SDK ASAP, since we think Xamarin is one of Microsoft's core products and is widely used.
5 votes -
REST API support for custom phrase lists
4 votes -
Mark Labels for TTS Speech API
Azure Speech API should offer JSON mark labels for Text to Speech audio. This would allow developers to use the audio file together with the JSON mark labels to create text that tracks the audio in an app. The competitor has a similar solution. I found Azure TTS to be superior but am forced to use the competitor's solution because of the lack of JSON mark labels. Speech mark labels should be in JSON format and available for all languages. They should provide information such as the begin and end timestamps of each sound, mapped to the text, phrases, and sentences.
4 votes -
Move Irish text-to-phonetic-IPA conversion upstream.
Irish (Gaeilge) TTS works if correct IPA syntax is used when sending the TTS request with the IPA option included.
Example:
Ba chuid mhór den togra seo teacht ar ábhar a bheadh chomh maith nó níos fearr ná an fhuinseog agus an t-ábhar a bheith níos inmharthana.
translated to IPA:
"bˠɑː xɪdʲ woːˈr dʲəɴʲ tʲɔɡˈrə ʃoː tʃæxt eˈr ɑːwəˈr ɑː vʲəh xɔv mˠɑːh ɴˠoː ɴʲiːsˠ fʲæˈrr ɴˠɑː en ɪɴʲʃoːɡ əɡʊsˠ en tʲɑːwəˈr ɑː vʲeɪh ɴʲiːsˠ ɪɴʲwəˈrhɑːɴˠɑː"
The IPA (Irish) text can be read and spoken accurately by a neural en-GB or en-US voice.
I have attached a file that converts…
3 votes -
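As an illustration of the IPA request described above, the sketch below builds an SSML document that wraps text in a standard `<phoneme alphabet="ipa">` element. It is only a sketch under assumptions: `build_ipa_ssml` is a hypothetical helper, the voice name `en-GB-ExampleNeural` is a placeholder rather than a real voice, and the Irish-to-IPA conversion itself (the part this idea asks to move upstream) is not shown.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ipa_ssml(text, ipa, voice="en-GB-ExampleNeural", lang="en-GB"):
    """Wrap `text` in an SSML phoneme element carrying its IPA transcription.

    `voice` is a placeholder name (assumption); substitute a voice that
    exists in your region.
    """
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        f"xml:lang={quoteattr(lang)}>"
        f"<voice name={quoteattr(voice)}>"
        f'<phoneme alphabet="ipa" ph={quoteattr(ipa)}>{escape(text)}</phoneme>'
        "</voice></speak>"
    )


# One word from the example sentence and its IPA form as given in the post.
ssml = build_ipa_ssml("togra", "tʲɔɡˈrə")
```

The resulting string would be sent as the body of the synthesis request in place of plain text.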
Have scripts ready for voice input when creating a custom voice.
To create a new voice, have scripts ready to be read by the user. As the computer recognizes the user's voice, the sound is recorded and synthesized into a custom voice for the user.
3 votes -
If it is possible to identify sounds into characters, then it is good for other language developers to map it to the corresponding words.
3 votes -
Retrain a previously trained model on custom speech portal
I previously trained a custom speech model and am trying to retrain it, but I do not see an option to do so. The portal only gives me the option to train the baseline model.
3 votes -
Language Support for Greek
Is Greek on the roadmap? Please let me know when it is planned. If not, please add it.
3 votes -
LUIS reference grammar ID fails in West Europe when included
In the West Europe region, the service returns "Specified grammar type is not supported!" when we pass in a LUIS reference grammar ID (a.k.a. IntentRecognizer).
The speech service then fails with a "WebSocket is already in CLOSING or CLOSED state." error. If the LUIS reference grammar ID is not included, the service works correctly.
May be related to this issue in the Cognitive Services Speech SDK repo: https://github.com/Azure-Samples/cognitive-services-speech-sdk/issues/127
3 votes -
Stop TTS synthesizer
Once the synthesizer starts synthesizing audio, it can't be stopped. It would be nice to have some method that would stop/interrupt currently running audio synthesis.
3 votes -
Support more languages/derivatives for Custom Voice Fonts
Support more languages/derivatives for Custom Voice Fonts. In particular, I'm looking to create a custom voice for en-GB, and building it using en-US doesn't quite work. I would also be looking to create custom voices for other flavours of English, en-AU being a priority.
3 votes -
Is there a way to stream audio via WebSocket and get Speech to Text results AND get a copy of the recording on Azure Storage?
We are currently using Bing Speech with LUIS, but looking to convert to Speech service.
Right now we have multiple recorders that operate in the browser, Flash, WebRTC, HTML5.
Each of these has to connect to Bing Speech to Text to get realtime translation and LUIS results to drive actions in the application. Additionally, we are currently streaming the audio to Amazon S3. Ideally we would like to stream the audio only once, have it picked up by Microsoft for Speech to Text, AND be able to retrieve a URL to the recording for later use.
Having to maintain two streams has…
3 votes