Polyglotism: via the API/portal, please consider supporting multiple languages within one source video or audio input. E.g., within one video file, recognizable text and speech are in English and Spanish; in another, French and Italian. Thanks. Today I run the video twice and semi-automatically merge the results, discarding the mistakes. (2 votes)
We started a private preview program for this; please reach out to visupport[at]microsoft[dot]com if interested.
Add technical video metadata to the breakdown, e.g. frame rate, format, codecs, per-track information, exact duration, resolution, aspect ratio, bitrate, start timecode, etc. (3 votes)
A way to train the face recognition with "known data" (e.g. images of people, like in FACE API).
This could be something like a corporate image database that can be referenced or the images uploaded.
Also, a way to export the face training data would be nice. (16 votes)
Done! See https://azure.microsoft.com/en-us/blog/people-recognition-enhancements-video-indexer/ for details.
Hi, please add a callbackUrl parameter, like in the POST upload API, to the Create Linguistic Model API so that re-indexing a breakdown is easier. Since waitUntilReady is just a true/false flag, the state is otherwise stuck waiting; if a callback is provided, we can easily re-index breakdowns. (7 votes)
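A minimal sketch of what the request above is asking for: passing a callbackUrl query parameter when triggering a re-index, mirroring how the upload API accepts one. The endpoint shape follows the public Video Indexer v2 REST API; the location, IDs, and token are placeholders, and the helper only builds the URL (no network call).

```python
from urllib.parse import urlencode

def build_reindex_url(location: str, account_id: str, video_id: str,
                      access_token: str, callback_url: str) -> str:
    """Build a Re-Index request URL carrying a callbackUrl query parameter,
    so Video Indexer can notify the caller when processing completes
    instead of the caller polling for state."""
    base = (f"https://api.videoindexer.ai/{location}/Accounts/{account_id}"
            f"/Videos/{video_id}/ReIndex")
    query = urlencode({"accessToken": access_token,
                       "callbackUrl": callback_url})
    return f"{base}?{query}"
```

The resulting URL would be used with an HTTP PUT; the service would then POST to the callback address when re-indexing finishes.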
Speech starts in the middle of the video and the earlier transcript is missing. Is this a known issue? (1 vote)
No – please contact visupport[at]microsoft[dot]com for support
In Insights, you're able to add names to people that have been identified in videos.
You should also be able to associate people with speakers within the transcript instead of displaying "Speaker #1" or "Speaker #2".
This should also work for audio-only content. (4 votes)
Hi, does the Video Indexer result include face emotion recognition details, such as Happy/Surprise, etc.? I do not see any such attributes in the JSON result, but the documentation says emotion details are analyzed. Please clarify. (1 vote)
Ability to export the transcript to a Word doc or plain-text format without the timestamps. (1 vote)
Export to plain-text and CSV formats is now supported.
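For anyone still on the older download path, a small sketch of stripping timestamps from a downloaded WebVTT transcript to get plain text. It assumes standard VTT formatting (a `WEBVTT` header, optional numeric cue identifiers, and `-->` cue timing lines); adjust the pattern if your file differs.

```python
import re

# Matches a VTT cue timing line, e.g. "00:00:01.000 --> 00:00:03.000"
CUE_RE = re.compile(
    r"^\d{2}:\d{2}:\d{2}[.,]\d{3} --> \d{2}:\d{2}:\d{2}[.,]\d{3}")

def vtt_to_plain_text(vtt: str) -> str:
    """Drop the WEBVTT header, cue numbers, and timing lines,
    keeping only the spoken-text lines."""
    kept = []
    for line in vtt.splitlines():
        line = line.strip()
        if not line or line == "WEBVTT" or line.isdigit() or CUE_RE.match(line):
            continue
        kept.append(line)
    return "\n".join(kept)
```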
When I alter the transcript, does this improve the learning ability on future uploads? At the moment the projects I'm working on are complex and the transcript needs a lot of alteration. Or do I need to add a lot of detail to the content model? (0 votes)
The ability to learn from edits was recently added; see https://azure.microsoft.com/en-us/blog/azure-media-services-the-latest-video-indexer-updates-from-nab-show-2019/ for details.
When more than one user is accessing the account, there currently doesn't seem to be a way for the second user to save an edited transcript, even when working on another video source (the save button doesn't exist, only reload source). (0 votes)
Currently, we get a breakdown from Video Indexer with transcript blocks that contain whole sentences or phrases 5-10 seconds long. We need more granular speech-to-text, with the time each word is spoken. Media Analytics in Media Services already does that; it would be useful to have it in the Video Indexer API as well. (1 vote)
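Until word-level timings are exposed, one rough workaround is to spread a block's duration evenly across its words. This is only an approximation (real word boundaries need the word-level output the poster asks for), but it can be good enough for coarse karaoke-style highlighting:

```python
def approximate_word_times(text: str, start_sec: float, end_sec: float):
    """Return (word, start, end) tuples, dividing the transcript block's
    time span evenly across its words. Purely an approximation."""
    words = text.split()
    if not words:
        return []
    step = (end_sec - start_sec) / len(words)
    return [(w, start_sec + i * step, start_sec + (i + 1) * step)
            for i, w in enumerate(words)]
```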
Hi, I have been trying to upload videos to the indexer, but the upload keeps hanging at 95% and/or restarting at 0%; not one of my list of at least 9 videos could be uploaded.
How should I go about it?
*The videos are about 2 hours long. (1 vote)
The ability to train the API to detect scenes and objects (faces, logos) according to my needs. (5 votes)
I'd like to be able to link to a timecode position in a video,
where 123 is the timecode. (1 vote)
It is supported, exactly as in your idea.
If you experience an issue with it, please contact our support: visupport[at]microsoft[dot]com
Add the ability to get the X/Y coordinates of tracked people. (1 vote)
Bounding boxes for faces are provided as an artifact.
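A hedged sketch of reading face bounding boxes out of a downloaded faces artifact. The exact JSON schema is not shown in this thread, so the field names below (`faces`, `boundingBox`, `x`/`y`/`width`/`height`) are assumptions to illustrate the idea; inspect your own artifact to confirm them.

```python
import json

def extract_boxes(artifact_json: str):
    """Collect (x, y, width, height) tuples for each face entry in the
    artifact. Field names are assumed, not confirmed against the schema."""
    data = json.loads(artifact_json)
    boxes = []
    for face in data.get("faces", []):
        bb = face.get("boundingBox", {})
        boxes.append((bb.get("x"), bb.get("y"),
                      bb.get("width"), bb.get("height")))
    return boxes
```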
Provide the ability to index a particular location. For example, will the search index respect geotagging?3 votes
I uploaded a FLAC file and downloaded the transcript. The timecode is wrong after every 5 minutes. It jumped from
00:04:56.200 --> 00:04:59.640
to
00:11:51.484 --> 00:11:59.818
Then it jumped from
00:16:49.214 --> 00:16:50.964
to
00:23:42.688 --> 00:23:47.494
(1 vote)
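To quantify the jump reported above, the `HH:MM:SS.mmm` timecodes can be parsed and the gap between the end of one cue and the start of the next computed (a plain-Python sketch):

```python
def tc_to_seconds(tc: str) -> float:
    """Convert an HH:MM:SS.mmm timecode string to seconds."""
    h, m, s = tc.split(":")
    return int(h) * 3600 + int(m) * 60 + float(s)

# Gap between the first cue's end and the next cue's start:
gap = tc_to_seconds("00:11:51.484") - tc_to_seconds("00:04:59.640")
# Roughly 412 seconds, i.e. just under a 7-minute discontinuity.
```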
I uploaded my private audio-only media (MP3). The generated transcript is completely different from the original.
I understand this service is a preview, but I'd like to improve the quality if there's anything to do on my side.
Is there any guideline for improving transcript quality?
Also, it would be good to generate an error/warning message when the AI recognizes that transcript quality is bad. (1 vote)
After posting a video via the API, I get the results, but I don't see the videos in the portal, nor can I play them there. (1 vote)
You must use the same email address and OAuth provider for the website and the API to point at the same account. Per your description, it seems you are using a different combination and hence have two separate accounts.