Custom Speech Service
Welcome to the Custom Speech Service API Forum
Categories
API – Any ideas or feedback pertaining to features or enhancements to the Custom Speech Service API.
Documentation – Any ideas or suggestions for the API Reference or Documentation.
Language Support – Submit a request to have a particular language supported.
Samples & SDK Request – Let us know if you would like to see a code sample or SDK provided.
Attention!
We have moved our Customer Feedback & Ideas for Azure Cognitive Services portal to the Azure Feedback Forum.
-
Language Portuguese - PT-BR
We have some customers in Brazil who want to use CRIS with pt-BR.
34 votes -
India English Language Support - en-in Base models
Are there any plans or tentative dates for a Microsoft conversational base model for en-IN (Indian English)?
With en-US (US English) we are not getting the best results, and the accuracy is too low to make use of it.
9 votes -
Allow CRIS APIs to run locally without Internet connections
If I want to build an app using the CRIS API, but an Internet connection cannot be guaranteed while the app is in use, is this currently possible?
9 votes -
Language Japanese - ja-jp
We have some customers in Japan who want to use Japanese voice data.
8 votes -
Facilitating Audio Labeling in Speech Studio
- If you want to test model performance, you need to label audio data. Ideally, Speech Studio would facilitate this:
• Select uploaded audio data to label.
• Select a model to help the labeler.
• Allow playing the audio, show a suggested model transcription, and allow editing it to produce a gold transcription dataset.
- Fixing test data transcriptions in the test screen
• When reviewing a test for a given model, you frequently sort by type of error and listen to / review the highest-error-rate items.
• While listening to error-prone messages it is frequent…
6 votes -
Train existing model with additional data
After having created and trained a model with a dataset, whenever I get additional data for training, I would like to:
- create a new dataset with just the additional data (possible currently);
- re-train an existing model with this new additional dataset.
So all models created by me need to be added as baseline models, so that I can specify my own model as the baseline model in the API and provide the ID of the new dataset.
5 votes -
Confidence / probability value per recognized word or fragment
A confidence value per recognized word or per speech fragment would be very useful, because it would let the Microsoft speech service assess for itself whether it recognized the word or fragment correctly. Unfortunately, I haven't found such a possibility.
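The service's detailed output format can carry a confidence score per recognition hypothesis, and per-word entries when word-level timestamps are requested. The sketch below parses such a result payload; the JSON shape used here ("NBest", "Words", "Confidence") is an assumption modeled on the detailed format, so verify it against an actual response before relying on it.

```python
import json

# Hypothetical detailed-format recognition result; treat the exact field
# names as an assumption, not a guarantee of the live service's schema.
sample_result = json.dumps({
    "DisplayText": "hello world",
    "NBest": [{
        "Confidence": 0.94,
        "Lexical": "hello world",
        "Words": [
            {"Word": "hello", "Confidence": 0.97},
            {"Word": "world", "Confidence": 0.91},
        ],
    }],
})

def word_confidences(result_json: str) -> dict:
    """Map each recognized word to its confidence in the top hypothesis."""
    best = json.loads(result_json)["NBest"][0]
    return {w["Word"]: w["Confidence"] for w in best.get("Words", [])}

print(word_confidences(sample_result))  # -> {'hello': 0.97, 'world': 0.91}
```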
5 votes -
Custom model training with multiple datasets
I'm trying to train a custom speech model using about 5 GB of data. I realize the upload limit is 2 GB per archive, so I've split my data into multiple bundles and uploaded them separately. However, in the UI I don't see how to train a model using multiple archives. In particular, the "select training data" flow uses radio buttons, so I am only able to select a single dataset. Is it possible to train a model with multiple archives of data?
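The Custom Speech REST API, unlike the portal's radio buttons, accepts a list of datasets when creating a model. Below is a sketch of building such a request body; the endpoint shape, the `datasets`/`self` field names, and the placeholder URLs are assumptions based on the v3.0 API, so check the current API reference before sending it.

```python
import json

# Hypothetical dataset resource URLs as returned by earlier dataset uploads;
# the region and IDs here are placeholders.
dataset_urls = [
    "https://myregion.api.cognitive.microsoft.com/speechtotext/v3.0/datasets/aaa",
    "https://myregion.api.cognitive.microsoft.com/speechtotext/v3.0/datasets/bbb",
    "https://myregion.api.cognitive.microsoft.com/speechtotext/v3.0/datasets/ccc",
]

# Request body for POST .../speechtotext/v3.0/models referencing all three
# archives at once (field names assumed from the v3.0 API).
body = {
    "displayName": "model-from-multiple-archives",
    "locale": "en-US",
    "datasets": [{"self": url} for url in dataset_urls],
}

print(json.dumps(body, indent=2))
```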
3 votes -
Xamarin-supported SDK
I want to use the Cognitive speech service (real-time continuous speech-to-text with interim results) in a Xamarin app. Is there any SDK or plugin available? Since the REST API has some limitations (no interim results), I am unable to use it.
3 votes -
Usage of custom language and acoustic models
Get more precision with my own language & acoustic models.
3 votes -
Configuring Custom Speech to recognize selected English phonemes
Can the Custom Speech service be configured to recognize selected English phonemes? Our university lab needs this capability to enhance our English training program for slum-dwelling disadvantaged youth.
3 votes -
Train Custom Speech Model with data from Tenant Model
The new Tenant Model, which is currently in preview, allows using organizational data to create a speech model. I would like to add this data to an existing Custom Speech model, allowing me to deploy an endpoint that can detect not only organization-specific terms but also terms from my custom speech data.
2 votes -
Add WordBoundary to all the Speech SDK's
WordBoundary is an easy way to get the timestamps of the words. It should be included in all the Speech SDKs. At the moment it is not included in the JavaScript version, but it can be found in the C# version.
2 votes -
Create Model API doesn't seem to be working
I tried to create a new model from an existing dataset through the REST API. I get a 202 response, but the UI doesn't reflect the new model. Am I doing something wrong?
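A 202 response usually means the creation was only accepted and runs asynchronously; the response typically carries a URL to poll (e.g. in a header such as Operation-Location, or as a "self" link), and the model only appears once the job succeeds. The sketch below shows a generic polling loop; the status function is injected as a stub so the example runs without real credentials, and the status names are assumptions.

```python
import time

def wait_for_model(get_status, poll_interval=0.0, max_polls=100):
    """Poll an asynchronous creation job until it reaches a terminal status.

    `get_status` stands in for an HTTP GET against the URL returned with
    the 202 response; injecting it keeps this sketch runnable offline.
    """
    for _ in range(max_polls):
        status = get_status()
        if status in ("Succeeded", "Failed"):
            return status
        time.sleep(poll_interval)
    raise TimeoutError("model creation did not finish in time")

# Stub that succeeds on the third poll, simulating a slow server-side job.
responses = iter(["NotStarted", "Running", "Succeeded"])
print(wait_for_model(lambda: next(responses)))  # -> Succeeded
```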
2 votes -
Language model training stuck on running
I have been trying to train a new language model with custom adaptation data and the status is stuck perpetually on "Running".
2 votes -
Acoustic model support for languages "de-DE" and "fr-FR"
The standard speech service isn't specific enough. Hence, we want to train the Microsoft Custom Speech Service (CRIS) for dynamic conditions in a non-English environment, at first for locale "de-DE" and later for "fr-FR". But there is currently no acoustic model support for these two languages.
For creating language models, on the other hand, the locales "de-DE" and "fr-FR" are available.
2 votes -
Add the ability to version the same endpoint
Add the ability to deploy new versions of the same endpoint and have updatable stage/prod endpoints (like LUIS), so that we can easily swap models under an endpoint without needing to adapt the application that uses it.
2 votes -
Doesn't it support Chinese and English mixing?
I need to create a custom language model, which is based on Chinese. But there may be some English words in the sentence.
E.g. "如何使用Veeva?" ("How do I use Veeva?")
2 votes -
Add the ability to provide a custom dictionary
Right now, when we simply want to provide a custom set of words that are not commonly used in the English language (e.g. medical terms), our only option is to upload the dictionary as if it were a set of transcriptions, which creates a bias since it does not contain full sentences. Creating full sentences from these terms can be a long and painful process (there are around 100k words in the dictionary).
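Until a dedicated dictionary upload exists, one low-effort variant of the workaround above is to skip sentence-writing entirely: language adaptation ("related text") data is plain UTF-8 text with one utterance per line, and a bare term is a valid utterance. The sketch below turns a raw term list into such a file; whether one-term-per-line data adapts as well as full sentences is an open question, so treat this as a pragmatic starting point, not a recommendation from the service docs.

```python
# Workaround sketch: convert a raw term list into a plain-text
# language-adaptation file, one term per line, instead of hand-writing
# full sentences around each of ~100k terms.
medical_terms = ["tachycardia", " Metoprolol", "tachycardia", "echocardiogram"]

def terms_to_adaptation_text(terms):
    """One utterance per line, stripped and case-insensitively deduplicated,
    matching the plain UTF-8 text the language dataset upload expects."""
    seen = set()
    lines = []
    for term in terms:
        t = term.strip()
        if t and t.lower() not in seen:
            seen.add(t.lower())
            lines.append(t)
    return "\n".join(lines) + "\n"

with open("related_text.txt", "w", encoding="utf-8") as f:
    f.write(terms_to_adaptation_text(medical_terms))
```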
2 votes -
Create a C# SDK for the WebSocket with the Speech Protocol
With the C# SDKs and endpoints now deprecated, there is no supported SDK for C#. It would be great to have support for the Speech Protocol in C#.
2 votes