Computer Vision
Welcome to the Computer Vision API Forum
Categories
API – Any ideas or feedback pertaining to features or enhancements to Computer Vision API.
Documentation – Any ideas or suggestions for the API Reference or Documentation.
Language Support – Submit a request to have a particular language supported.
Samples & SDK Request – Let us know if you would like to see a Code sample or SDK provided.
Custom/Sample Images – Have an image you’ve tested that isn’t giving the results you are seeking? Upload the image and describe the information or tags you would like to be included.
Attention!
We have moved our Customer Feedback & Ideas for Azure Cognitive Services portal to the Azure Feedback Forum.
-
Improve Documentation for Responses, esp Rate Limits
We've encountered failure responses on both the Read and Results endpoints that we can find no documentation for...
- OK: "Running" What do we do with this? Is it costing us?
- Forbidden with text "Quota exceeded" Includes a recommended try again time in units of milliseconds?
- TooManyRequests: Includes a recommended try again time in units of seconds?
0 votes
We recently updated the Read documentation to better explain the API operations and the questions raised here. We will continue to improve our documentation, so thanks for the feedback! Please keep it coming.
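One way a client might handle the responses listed in this post is sketched below. This is a minimal sketch under stated assumptions, not documented service behavior: `poll_read_result` and the injected `fetch` callable are hypothetical names, the status values `running`, `succeeded`, and `failed` are assumed from the "Running" response mentioned above, and the wait time is read from the standard HTTP `Retry-After` header interpreted as seconds, which the service may or may not send.

```python
import time


def retry_delay_seconds(headers, default=1.0):
    """Return the wait suggested by a Retry-After header, in seconds.

    Assumption: the header, if present, holds a number of seconds
    (the delta-seconds form of the standard HTTP Retry-After header).
    """
    value = headers.get("Retry-After")
    try:
        return max(0.0, float(value))
    except (TypeError, ValueError):
        return default


def poll_read_result(fetch, max_attempts=10, sleep=time.sleep):
    """Poll until the operation finishes or the attempts run out.

    `fetch` is a hypothetical callable returning (status_code, headers, body),
    where body is the parsed JSON dict from the results endpoint.
    """
    for _ in range(max_attempts):
        status_code, headers, body = fetch()
        if status_code == 429:
            # Throttled: wait as long as the service suggests, then retry.
            sleep(retry_delay_seconds(headers))
            continue
        if status_code != 200:
            # Anything else (for example 403 "Quota exceeded") is surfaced.
            raise RuntimeError(f"unexpected HTTP status {status_code}")
        status = str(body.get("status", "")).lower()
        if status in ("succeeded", "failed"):
            return body
        # 200 with a "running" status is not an error: poll again.
        sleep(retry_delay_seconds(headers))
    raise TimeoutError("operation did not finish in time")
```

Passing `sleep` as a parameter keeps the sketch easy to exercise without real waiting. A quota error is raised rather than retried here, on the assumption that it will not clear within a short polling loop.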
-
OCR processing of numerical text should not be impaired when detected language is Turkish
When testing with Turkish text that also includes numerical strings, I have isolated the following anomaly:
Consider two versions of an otherwise identical image (for informative purposes, the image is a graphic depicting the total number of deaths in Turkey from COVID-19, on a particular date):
Turkish version:
TÜRKİYE’DE ÖLÜMLER 1.368
English version:
DEATHS IN TURKEY 1.368
When the image is OCR-processed with the text in English (i.e. with the words: “DEATHS IN TURKEY”), the numeric string is returned correctly as “1.368”.
However, when the image is OCR-processed with the text in Turkish (i.e. with the words: “TÜRKİYE’DE ÖLÜMLER”), the…
1 vote
The latest Read 3.0 API of Computer Vision correctly handles mixed languages, mixed content, and printed and handwritten (English only) content. Please see the attached sample outputs from the test images in the post.
https://docs.microsoft.com/en-us/azure/cognitive-services/computer-vision/concept-recognizing-text
-
OCR returns wrong text with very low confidence score
OCR misread the letter "l" as "i" and returned 0.355 as the confidence score. Is this a bug?
The OCR service returned the email part as "Email: vi@bimco.org" with a confidence score of 0.355, while the rest was parsed correctly with confidence scores always higher than 0.9. I ran the same file through different OCR providers, and only your service returned the wrong value.
I tested the service with a different PDF file, and sometimes it misreads "f" and "r" as well.
1 vote
-
Spanish and French language support
The Read API supports only English, and Spanish is in Preview. Any idea when Spanish will be fully supported and when other languages like French or German will be supported as well?
1 vote
The new Read 3.0 API is now GA and supports Spanish and French, among other languages.
-
1 vote
The Read API now supports Spanish, among other languages, and is in GA. https://docs.microsoft.com/en-us/azure/cognitive-services/computer-vision/concept-recognizing-text#printed-text
-
Spanish
Spanish support for batch processing would be appreciated.
1 vote
-
Add OCR confidence to v1.0/ocr API
Other OCR platforms provide OCR confidence sometimes per character and sometimes per word.
Confidence here means how likely the result is to match the input image. For very poorly scanned documents where noise is a problem, the current API frequently returns incorrect text, with no programmatic way to detect whether the result should be trusted or sent to a user for verification.
Having this as an optional query parameter on the API would be helpful, perhaps confidencePerWord=true and confidencePerCharacter=true
16 votes
The confidence per word is now available in the Read API call. After calling BatchReadFileAsync(), you would then call GetReadOperation(), and from its result you’d get Lines -> Words -> Confidence.
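To illustrate how per-word confidence could be used to route doubtful results to a human reviewer, here is a minimal sketch in Python. It assumes the result has already been parsed into a dict shaped like the Read JSON response (`analyzeResult`, then `readResults`, then `lines`, then `words`, each word carrying `text` and `confidence`). That shape, the function name `low_confidence_words`, and the 0.8 threshold are assumptions for illustration, not part of the reply above.

```python
def low_confidence_words(result, threshold=0.8):
    """Return (text, confidence) pairs for words scored below the threshold.

    Assumption: `result` follows the layout
    analyzeResult -> readResults -> lines -> words, and a word with no
    confidence value is treated as 0.0 so that it gets reviewed.
    """
    flagged = []
    pages = result.get("analyzeResult", {}).get("readResults", [])
    for page in pages:
        for line in page.get("lines", []):
            for word in line.get("words", []):
                confidence = word.get("confidence", 0.0)
                if confidence < threshold:
                    flagged.append((word.get("text", ""), confidence))
    return flagged


# Hand-written sample, not real service output.
sample = {
    "analyzeResult": {
        "readResults": [
            {
                "lines": [
                    {
                        "text": "Email: vi@bimco.org",
                        "words": [
                            {"text": "Email:", "confidence": 0.98},
                            {"text": "vi@bimco.org", "confidence": 0.355},
                        ],
                    }
                ]
            }
        ]
    }
}

needs_review = low_confidence_words(sample)
```

With the sample above, only the word scored 0.355 is flagged, so an application could send that one field to a user for verification instead of trusting it.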
https://docs.microsoft.com/en-us/dotnet/api/microsoft.azure.cognitiveservices.vision.computervision.models.word?view=azure-dotnet
-
Add Read API docker container
Only Recognize Text is currently containerised. As this API is deprecated, could we please get a container version of the newer Read API?
1 vote
-
RecognizeText(Printed) is not recognizing the pound symbol (£)
I have many cases of pictures of texts where one can find a pound sign (£) but the sign is NEVER correctly recognized by Azure Cognitive Services RecognizeText API, as far as I tested. Other symbols, like the dollar sign ($) for example, are identified without problems.
I made tests with print screens of texts containing £, since these should be easy for the OCR tool to convert, and again the pound sign is not correctly identified (it becomes an f, a 2, a 1, a $ etc).
I am suspecting that the pound sign is not included in the…
18 votes
-
Extract table data from image with table structure
The main purpose of this idea is to take the table structure out of the image when the user selects a part of the image.
16 votes
See the Form Recognizer product from the OCR team – learn more at https://azure.microsoft.com/en-us/services/cognitive-services/form-recognizer/
More specifically, check out the Layout API quickstart at https://docs.microsoft.com/en-us/azure/cognitive-services/form-recognizer/quickstarts/python-layout?tabs=v2-0
Layout API reference at: https://westcentralus.dev.cognitive.microsoft.com/docs/services/form-recognizer-api-v2-1-preview-1/operations/AnalyzeLayoutAsync
-
Computer Vision API doesn't recognize clear text surrounded by @
Computer Vision API doesn't recognize very clear and obvious text surrounded by @ character
1 vote
Please use the latest Read 3.0 API at https://docs.microsoft.com/en-us/azure/cognitive-services/computer-vision/concept-recognizing-text
The results were correct for your sample image.
-
Make detecting hollow text better
Trying to find text on an image with hollow text doesn't return the right text results from the image and sometimes returns none at all.
1 vote
Please see the latest Read 3.0 API at https://docs.microsoft.com/en-us/azure/cognitive-services/computer-vision/concept-recognizing-text
-
Add ability to detect checkboxes or radio buttons in OCR and handwritten text
I've successfully been able to use vision to extract handwritten text out of fixed forms by knowing the coordinates of each form field. However many of my forms have checkboxes and/or radio buttons that the users will be filling in with pen. It doesn't seem that vision has a way to detect this type of content.
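The coordinate-based approach described above could look roughly like the following sketch. It assumes each recognized word comes with a `boundingBox` of eight numbers (the x and y of its four corners) alongside its `text`; the field rectangles, the `words_in_field` helper, and the center-point test are illustrative assumptions rather than anything the service provides.

```python
def box_center(bounding_box):
    """Center of a quadrilateral given as [x1, y1, x2, y2, x3, y3, x4, y4]."""
    xs = bounding_box[0::2]
    ys = bounding_box[1::2]
    return sum(xs) / len(xs), sum(ys) / len(ys)


def words_in_field(words, field):
    """Join the text of words whose center lies inside the field rectangle.

    `field` is (left, top, right, bottom) in the same units as the boxes.
    """
    left, top, right, bottom = field
    picked = []
    for word in words:
        cx, cy = box_center(word["boundingBox"])
        if left <= cx <= right and top <= cy <= bottom:
            picked.append(word["text"])
    return " ".join(picked)
```

Testing the center point rather than requiring the whole box to fit tolerates handwriting that spills slightly past a field's border. This sketch only maps text to fields and does not detect whether a checkbox or radio button has been marked.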
15 votes
Tables, checkboxes, and radio buttons are available in the Form Recognizer product from the OCR team – learn more at https://azure.microsoft.com/en-us/services/cognitive-services/form-recognizer/
More specifically, check out the Layout API quickstart at https://docs.microsoft.com/en-us/azure/cognitive-services/form-recognizer/quickstarts/python-layout?tabs=v2-0
Layout API reference at: https://westcentralus.dev.cognitive.microsoft.com/docs/services/form-recognizer-api-v2-1-preview-1/operations/AnalyzeLayoutAsync
-
Support for Offline OCR like Firebase ML Kit
Firebase started offering offline on-device image recognition.
https://firebase.google.com/products/ml-kit/
Are there any plans to allow us to do something like OCR offline, as Google does?
6 votes
Computer Vision now features a Read 2.0 (preview) container for on-premises deployment, with Read 3.0 containers coming soon. For more on the new Read 3.0 API, refer to https://docs.microsoft.com/en-us/azure/cognitive-services/computer-vision/concept-recognizing-text
-
Symbols or individual characters not recognized.
MS Handwriting recognition is best-in-class at reading poor handwriting, but it seems the algorithm chooses not to recognize individual symbols or characters by themselves. Not all text is written in full words.
The following examples never process the same symbol correctly on successive lines. Symbols, or even "0" or "o" are ignored when left by themselves.
5 votes
While we are constantly improving the support for symbols, the latest Read API extracts the symbols in the original images posted here. Learn more at https://docs.microsoft.com/en-us/azure/cognitive-services/computer-vision/concept-recognizing-text
-
School students' handwriting not recognized
Hi, I am from Bangalore. I tried a few students' answer sheets using the Computer Vision API. The API was not able to recognize the handwriting on most of the answer sheets. Can you improve the algorithm or let us know how to train it with our data? Thank you.
1 vote
Please try the new Read 3.0 API, which has handwritten (English only) support. Your sample document produces the text output. https://docs.microsoft.com/en-us/azure/cognitive-services/computer-vision/concept-recognizing-text#handwritten-text
-
Support PDF input in OCR function
As stated in the title, I'd like to see the OCR function support PDF files. The only supported formats right now are images like JPG, PNG, etc.
44 votes
The Read operation, which features our newest, best OCR models, allows for PDF input.
https://docs.microsoft.com/en-us/azure/cognitive-services/computer-vision/concept-recognizing-text
-
Handwriting almost never recognized
I tried it with several handwritten pieces, none of them worked.
1 vote
We have just implemented this capability into Computer Vision. Please continue to share your experience and feedback with us.
Check out:
• The new handwriting OCR demo: https://www.microsoft.com/cognitive-services/en-us/computer-vision-api (scroll down to handwriting)
• Documentation: https://docs.microsoft.com/en-us/azure/cognitive-services/computer-vision/home#RecognizeText
• SDKs: Python, Windows, Android
• API reference:
o https://westus.dev.cognitive.microsoft.com/docs/services/56f91f2d778daf23d8ec6739/operations/587f2c6a154055056008f200
o https://westus.dev.cognitive.microsoft.com/docs/services/56f91f2d778daf23d8ec6739/operations/587f2cf1154055056008f201
-
Increase width/height limits on OCR operation to handle Letter/A4
The Image dimension limitation is a problem.
A4 is 210 x 297 mm. At 300 dpi that’s 2480 x 3508 Pixels and 8.7 megapixels. In grayscale as a JPEG of text the page is around 1-1.5MB.
Letter is 8.5 x 11 in. At 300 dpi that’s 2550 x 3300 Pixels and 8.42 megapixels. As a grayscale JPEG the size is slightly smaller than A4 (obviously).
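The page-size arithmetic above can be checked with a short Python sketch. The helper name `page_pixels` is hypothetical, and the only assumption is the usual conversion of 25.4 mm per inch, with pixel counts rounded to the nearest whole pixel.

```python
MM_PER_INCH = 25.4


def page_pixels(width_in, height_in, dpi=300):
    """Return (width_px, height_px, megapixels) for a page scanned at `dpi`."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px, height_px, width_px * height_px / 1_000_000


# A4 is specified in millimetres, so convert to inches first.
a4 = page_pixels(210 / MM_PER_INCH, 297 / MM_PER_INCH)
letter = page_pixels(8.5, 11)

print(a4)      # (2480, 3508, 8.69984)
print(letter)  # (2550, 3300, 8.415)
```

Both results agree with the figures quoted in the post, roughly 8.7 megapixels for A4 and just over 8.4 for Letter.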
The image dimension limit is a bit…silly? Unless there is some overarching technical reason you should increase the maximum dimensions to 3510 x 3510 – so scanned 300 DPI grayscale A4 pages can be…
6 votes
The new Read 3.0+ API supports large PDF documents (up to 2000 pages long) as well as large images and page sizes.
-
2 votes
Please see the pre-built receipts feature of the OCR team’s other product, Form Recognizer, at https://docs.microsoft.com/en-us/azure/cognitive-services/form-recognizer/concept-receipts