Computer Vision
Welcome to the Computer Vision API Forum
Categories
API – Any ideas or feedback pertaining to features or enhancements to Computer Vision API.
Documentation – Any ideas or suggestions for the API Reference or Documentation.
Language Support – Submit a request to have a particular language supported.
Samples & SDK Request – Let us know if you would like to see a code sample or SDK provided.
Custom/Sample Images – Have an image you’ve tested that isn’t getting the results you expect? Upload the image and describe the information or tags you would like included.
Attention!
We have moved our Customer Feedback & Ideas for Azure Cognitive Services portal to the Azure Feedback Forum.
-
Set a language field as a read result
I need a detected-language field in the Read result, version 3.0.
Your documentation describes this field, but it does not appear in the response:
"readResults":[{"page":1,"angle":0,"width":282,"height":216,"unit":"pixel","lines":[{"boundingBox"...
1 vote -
Can the OCR API support identifying paragraphs from PDFs?
Currently the tool can identify text line by line, but it cannot recognize which lines belong to the same paragraph.
1 vote -
The tutorial
The C# OCR and text-recognition tutorials no longer work; they are out of date and throw an error on the Uri:
An invalid request URI was provided. The request URI must either be an absolute URI or BaseAddress must be set.
1 vote -
Disable Celebrity Recognition in Vision Analysis Captions
My application does not need to know whether a face in an image is recognised as a celebrity (correctly or as a false positive), but I see no option to remove or disable celebrity names in captions.
e.g. I just want to know that the image is "A person talking on a cell phone"
"captions": [
{
"text": "Leonardo DiCaprio talking on a cell phone",
"confidence": 0.53837191468154666
}
]
3 votes -
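Until such an option exists, one possible client-side workaround is to call Analyze with details=Celebrities and strip any recognized celebrity names out of the caption yourself. A minimal sketch (the function name and the generic replacement phrase are illustrative; the response shape follows the Analyze Image schema):

```python
# Hypothetical client-side workaround: replace any recognized celebrity
# name in the caption with a generic phrase, using the "celebrities"
# detail returned when Analyze is called with details=Celebrities.

def anonymize_caption(analysis):
    """Return the top caption with celebrity names replaced by 'a person'."""
    caption = analysis["description"]["captions"][0]["text"]
    for category in analysis.get("categories", []):
        for celeb in category.get("detail", {}).get("celebrities", []):
            caption = caption.replace(celeb["name"], "a person")
    return caption

# Response fragment shaped like the one above (values illustrative):
sample = {
    "description": {"captions": [{"text": "Leonardo DiCaprio talking on a cell phone",
                                  "confidence": 0.538}]},
    "categories": [{"name": "people_", "score": 0.9,
                    "detail": {"celebrities": [{"name": "Leonardo DiCaprio",
                                                "confidence": 0.98}]}}],
}
print(anonymize_caption(sample))  # a person talking on a cell phone
```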
Is there a way to get page height and width in ms vision ocr api?
As in the Read API, is there a way to extract page width and height information in the response of the MS Vision OCR API?
1 vote
We have decided not to add this feature to the OCR API.
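Since the feature will not land in the OCR endpoint, the Read API is the way to get this: each entry in "readResults" carries page, width, height, and unit. A small extraction sketch (the function name is illustrative; the response fragment mirrors the one quoted elsewhere on this forum):

```python
# The Read API (unlike the older OCR endpoint) reports page dimensions in
# each entry of "readResults". Minimal extraction sketch:

def page_dimensions(read_result):
    """Yield (page, width, height, unit) for each page in a Read result."""
    for page in read_result["readResults"]:
        yield page["page"], page["width"], page["height"], page["unit"]

sample = {"readResults": [{"page": 1, "angle": 0, "width": 282,
                           "height": 216, "unit": "pixel", "lines": []}]}
print(list(page_dimensions(sample)))  # [(1, 282, 216, 'pixel')]
```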
-
It is found that the subscriptionKey is placed in the request header during API use, which is unsafe
After looking at the documentation, you'll notice that many API requests put the subscriptionKey in the request header. Anyone who can intercept the request can obtain the user's key, which allows illegal use of the service.
1 vote -
Can handwriting (Japanese) be supported?
Can handwriting (Japanese) be supported?
2 votes -
Get bounding box and confidence level for every individual character of a word
Greetings,
partially adding to https://cognitive.uservoice.com/forums/430309-computer-vision-api/suggestions/38584192-add-ocr-confidence-to-v1-0-ocr-api and https://cognitive.uservoice.com/forums/430309-computer-vision-api/suggestions/15634182-single-character-recognition :
I am using the Cognitive Services Computer Vision API v2.0 (through the Python SDK, with the Read Batch File / Get Read Operation Result combination), and it would be valuable to me if I could extract the bounding box (and optionally even the confidence level) for every single character of a word. I already tried cropping the words out of the images and submitting them to the API individually, but that still returns only the whole word.
Thank you for your time.
Best regards
Christian Römer
1 vote -
The Analyze Image result is not as expected
I called the Computer Vision API with this image URL:
https://lh3.googleusercontent.com/p/AF1QipMPhnsunVu5SWvwWTSUfelS8zRvMnznCiNlUXd8=s1600-w1600
There is no "people" entry in the categories and no "faces" field in the result.
Result detail:
{
"categories": [{"name": "outdoor_",
"score": 0.046875,
"detail": {
"landmarks": []
}}],
"adult": {"isAdultContent": false,
"isRacyContent": false,
"adultScore": 0.12861922383308411,
"racyScore": 0.14124451577663422},
"color": {"dominantColorForeground": "Grey",
"dominantColorBackground": "Grey",
"dominantColors": ["Grey"],
"accentColor": "844D47",
"isBwImg": false,
"isBWImg": false},
"imageType": {"clipArtType": 0,
"lineDrawingType": 0},
"tags": [{"name": "outdoor",
"confidence": 0.99072730541229248}, {
"name": "seafood",
"confidence": 0.98059272766113281}, {
"name": "fishing",
"confidence": 0.9629974365234375}, {
"name": "person",
"confidence": 0.94020164012908936}, {
…"name":
1 vote -
Add OCR confidence to v1.0/ocr API
Other OCR platforms provide OCR confidence sometimes per character and sometimes per word.
Confidence indicates how likely the result is to match the input image. For very poorly scanned documents where noise is a problem, the current API frequently returns incorrect text with no programmatic way to detect whether the result should be trusted or sent to a user for verification.
Having this as an optional query parameter on the API would be helpful, perhaps confidencePerWord=true and confidencePerCharacter=true.
16 votes
Confidence per word is now available in the Read API. After calling BatchReadFileAsync(), call GetReadOperation(); from its result you get Lines → Words → Confidence.
https://docs.microsoft.com/en-us/dotnet/api/microsoft.azure.cognitiveservices.vision.computervision.models.word?view=azure-dotnet -
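The same lines → words → confidence nesting appears in the JSON payload, so the original use case (flagging untrustworthy words for human review) can be done with a small walk over the response. A sketch, assuming a v3-style Read result with "readResults" at the top level (the function name and threshold are illustrative):

```python
# Walk the Read result's lines -> words -> confidence nesting and collect
# words below a trust threshold for human verification.

def low_confidence_words(read_result, threshold=0.8):
    flagged = []
    for page in read_result["readResults"]:
        for line in page["lines"]:
            for word in line["words"]:
                if word["confidence"] < threshold:
                    flagged.append(word["text"])
    return flagged

sample = {"readResults": [{"page": 1, "lines": [
    {"text": "INVOICE 1234",
     "words": [{"text": "INVOICE", "confidence": 0.98},
               {"text": "1234", "confidence": 0.41}]}]}]}
print(low_confidence_words(sample))  # ['1234']
```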
Add Read API docker container
Only Recognize Text is currently containerised. As this API is deprecated, could we please get a container version of the newer Read API?
1 vote -
Using opencv with Kinect for azure
k4a_image_t depth_image = k4a_capture_get_depth_image(capture);
color_frame = cv::Mat(k4a_image_get_height_pixels(color_image), k4a_image_get_width_pixels(color_image), CV_8UC3, k4a_image_get_buffer(color_image));
cv::imshow("color", color_frame);
The imshow call here causes a segmentation fault even though the matrix contains proper values. I checked that this is not an OpenCV fault. Could someone please help me visualize color and depth data from a custom application?
1 vote -
Bug: color JSON from REST API contains both "isBwImg" and "isBWImg" (Typo)
There is a little bug in the JSON response from the REST API:
In the JSON response, the color object contains two most likely identical fields with different names – "isBWImg" and "isBwImg" (lower-case "w" vs. capital "W").
Ex:
"color": {
"dominantColorForeground": "White",
"isBwImg": false,
"isBWImg": false,
"accentColor": "228AAA",
"dominantColorBackground": "White",
"dominantColors": ["White"]}
API version: 2.0
endpoint: https://westeurope.api.cognitive.microsoft.com/vision/v2.0/analyze
Region: West Europe
1 vote -
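Until the duplicate key is removed, clients may want a defensive accessor that tolerates either casing. A minimal sketch (the function name is illustrative):

```python
# Defensive accessor: read the B&W flag from a 'color' object regardless
# of whether the service sends "isBWImg", "isBwImg", or both.

def is_black_and_white(color):
    for key in ("isBWImg", "isBwImg"):
        if key in color:
            return color[key]
    raise KeyError("no isBWImg/isBwImg field in color object")

sample = {"dominantColorForeground": "White", "isBwImg": False,
          "isBWImg": False, "accentColor": "228AAA",
          "dominantColorBackground": "White", "dominantColors": ["White"]}
print(is_black_and_white(sample))  # False
```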
Capturing PO number Issue
Hello Azure Team,
Using the Azure online tool, the attached image generates the correct JSON response (all image text), but when we use it via the Computer Vision / Cognitive Services API on the web or in mobile Android, it returns a smaller, incorrect JSON response that is invalid, and we cannot pick the number from it.
1 vote -
12 votes
Hello,
Please see our product roadmap here: https://azure.microsoft.com/en-us/updates/.
Thanks,
Luke -
Computer vision algorithms applied to a video or stream
Could it be possible to use these algorithms with a video stream instead of single images?
3 votes -
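Since the image APIs accept single frames, a common pattern in the meantime is to sample a video at a fixed interval and submit only those frames. A sketch of the sampling arithmetic alone, independent of any SDK (the function name is illustrative):

```python
# Decide which frame indices of a video to submit when analyzing one
# frame every interval_s seconds.

def frames_to_analyze(fps, duration_s, interval_s):
    """Frame indices to submit when analyzing every interval_s seconds."""
    step = max(1, round(fps * interval_s))
    total = int(fps * duration_s)
    return list(range(0, total, step))

print(frames_to_analyze(fps=30, duration_s=2, interval_s=0.5))
# [0, 15, 30, 45]
```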
Support data URIs in image APIs (vision/emotion/face)
For JavaScript clients, there are many circumstances in which you want the image represented in a Canvas to be uploaded to one of the MCS APIs that accept images. The problem right now is that Canvas returns its content in the form of a data URI (RFC 2397), and callers need to convert this to a blob before uploading. It would be convenient if the APIs simply accepted the data URI directly, although the network payload would be larger than necessary (because of the base64 encoding).
8 votes
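Until the APIs accept data URIs natively, the conversion can live in a small client-side shim. A sketch of decoding an RFC 2397 data URI (as produced by canvas.toDataURL()) into raw bytes for an octet-stream upload, shown here in Python for illustration (the function name is ours):

```python
import base64
from urllib.parse import unquote_to_bytes

# Decode an RFC 2397 data URI into raw bytes suitable for an
# application/octet-stream upload.

def data_uri_to_bytes(uri):
    header, _, payload = uri.partition(",")
    if not header.startswith("data:"):
        raise ValueError("not a data URI")
    if header.endswith(";base64"):
        return base64.b64decode(payload)
    return unquote_to_bytes(payload)  # percent-encoded variant

print(data_uri_to_bytes("data:text/plain;base64,aGVsbG8="))  # b'hello'
```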