Microsoft

Computer Vision

Welcome to the Computer Vision API Forum

Categories

API – Any ideas or feedback pertaining to features or enhancements to Computer Vision API.

Documentation – Any ideas or suggestions for the API Reference or Documentation.

Language Support – Submit a request to have a particular language supported.

Samples & SDK Request – Let us know if you would like to see a Code sample or SDK provided.

Custom/Sample Images – Have an image you’ve tested that isn’t giving the results you are seeking? Upload the image and describe the information or tags you would like to be included.


Attention!

We have moved our Customer Feedback & Ideas for Azure Cognitive Services portal to the Azure Feedback Forum.

Please go to the link below to access our new Feedback and Ideas Page.

  1. Add "regions" to Read Results API

    Could you please add "regions" (an array of "lines") to the data returned by the Read API, similar to the OCR API data (e.g. regions > lines > words)?

    It would be great to also see the following addition to the Read API results:
    - a "text" property in the "region" object

    2 votes
    0 comments  ·  Text Recognition
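
Until the service returns something like this, the requested shape can be approximated on the client. The sketch below is only an illustration: it assumes each Read API line is a dict with a "text" key, and it places every line of a page into a single region, which is a simplification. `lines_to_region` and `page_to_regions` are hypothetical helper names, not part of any SDK.

```python
def lines_to_region(lines):
    """Wrap a list of Read-style line dicts into one region dict.

    Assumption: every line dict carries a "text" key.
    """
    return {
        "lines": lines,
        # The requested "text" property: all line texts joined by newlines.
        "text": "\n".join(line["text"] for line in lines),
    }


def page_to_regions(read_result):
    """Return a copy of one readResults page with a hypothetical "regions" array."""
    page = dict(read_result)
    # Simplification: treat the whole page as a single region.
    page["regions"] = [lines_to_region(read_result.get("lines", []))]
    return page
```

A real grouping would need layout information (for example, bounding boxes) to decide which lines belong together.
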
  2. OCR does not recognize isolated numbers

    Whenever I send an image like the attached one, I only get the word REV back but not the number 1 that's under it. More distinctive number shapes like 2, 3, 4, etc. are usually recognized, but the number 1 is rarely recognized.

    2 votes
    Started  ·  1 comment  ·  Text Recognition
  3. Japanese language support for the READ API

    The READ API does not support Japanese. The OCR API supports Japanese, but does not support PDF files. There is great demand among Japanese customers for reading Japanese PDF files. Is there any estimate of when Japanese will be fully supported?

    2 votes
    0 comments  ·  Language Support
  4. show OCR alternatives (second guess)

    Capital I, lowercase L, and the digit 1 all look the same in some fonts, and OCR may return any of them depending on context. In my application, I often know what to expect. My code corrects for that case, but not when a less obvious error occurs, such as an "A" being mistaken for a "4".
    If OCR returned second and third (and …?) possible interpretations, my code could perform the final analysis step and my clients would find the software usable.
    This is make or break for my application.
    (This alternatives option would let the service-consuming programmer deal with regular expressions.)

    2 votes
    0 comments  ·  Text Recognition
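
Without alternatives from the service, the "final analysis step" described above can be roughly approximated on the client by generating look-alike variants of the returned text and keeping those that match an expected pattern. This is only a sketch: the confusable groups are assumptions chosen for illustration, and `candidate_readings` is a hypothetical helper, not something the OCR API provides.

```python
import itertools
import re

# Assumed groups of characters that OCR may confuse; extend as needed.
CONFUSABLE_GROUPS = ["Il1", "O0", "A4", "S5", "B8"]

# Map every character to the full group it belongs to.
ALTERNATIVES = {ch: group for group in CONFUSABLE_GROUPS for ch in group}


def candidate_readings(text, pattern, limit=10000):
    """Return look-alike variants of `text` that fully match `pattern`.

    Each character is replaced by every member of its confusable group;
    `limit` caps how many variants are examined.
    """
    options = [ALTERNATIVES.get(ch, ch) for ch in text]
    regex = re.compile(pattern)
    matches = []
    for combo in itertools.islice(itertools.product(*options), limit):
        candidate = "".join(combo)
        if regex.fullmatch(candidate):
            matches.append(candidate)
    return matches


# OCR returned "4B1", but the field is known to be two capitals and a digit.
print(candidate_readings("4B1", r"[A-Z]{2}\d"))  # → ['AB1']
```

The number of variants grows quickly with the length of the text, so this only suits short fields with a known format.
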
  5. Please create Computer Vision SDK for iOS

    This is based on a GitHub issue:
    https://github.com/Azure-Samples/cognitive-services-quickstart-code/issues/103

    I want to use OCR on iOS through the Computer Vision SDK, but the documentation doesn't contain Objective-C/Swift samples for iOS.

    1 vote
    0 comments  ·  Samples & SDK Request
  6. Allow multiple images/documents to be analyzed with one call

    I use the SDK for processing images through OCR. It would be nice to be able to supply the Read API with multiple images in a single request instead of looping over the images and calling it for one at a time. My current project takes one original image that is split into multiple parts and resized to increase accuracy. It would be nice to push all of these partial images into one call.

    1 vote
    0 comments  ·  Image Analysis
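
As long as each request accepts a single image, the loop can at least be run concurrently on the client. A minimal sketch, assuming `read_one` is your own function that wraps one Read call for one image (it is hypothetical here, as is `read_many`); the service's rate limits still apply, so `max_workers` should stay small.

```python
from concurrent.futures import ThreadPoolExecutor


def read_many(images, read_one, max_workers=4):
    """Run `read_one` on every image concurrently.

    `read_one` is a hypothetical callable wrapping a single-image Read call.
    Results are returned in the same order as `images`.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(read_one, images))
```

For example, `read_many(tiles, my_read_function)` would return one result per tile, in tile order, so the partial results can be stitched back together by position.
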
  7. Set a language field as a read result

    I need a detected-language field in the Read result, version 3.0.
    Your documentation describes this field, but it isn't in the response.

    "readResults":[{"page":1,"angle":0,"width":282,"height":216,"unit":"pixel","lines":[{"boundingBox"...

    1 vote
    0 comments  ·  API
  8. Analysis of Video, Face Detection, Sample Python Code

    Hi,

    I'd like to analyze a video file with the Face Detection API. What I want to get are emotion labels.

    I searched the documentation about analysis of video. I found the following article:

    https://docs.microsoft.com/en-us/azure/cognitive-services/computer-vision/vision-api-how-to-topics/howtoanalyzevideo_vision

    I would like to ask if there is any Python sample code for video analysis. In particular, I would like to control the fps (this is important because it affects the cost).

    Also, I didn't understand whether the video file should be stored on a local PC or in the cloud. Could you please guide me?

    2 votes
    1 comment  ·  Face Detection
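
On the fps question, one way to control cost is to choose which frames to send before calling any API. The sketch below only computes frame indices; it does not decode video or call the Face Detection API, and `sample_frame_indices` is a hypothetical helper. It assumes you know the video's frame count and native frame rate from whatever video library you use.

```python
def sample_frame_indices(total_frames, native_fps, target_fps):
    """Return the frame indices to analyze at roughly `target_fps`.

    The number of returned indices is the number of API calls you would make.
    """
    if native_fps <= 0 or target_fps <= 0:
        raise ValueError("frame rates must be positive")
    # Never sample faster than the video's own frame rate.
    step = max(native_fps / target_fps, 1.0)
    indices = []
    count = 0
    while True:
        index = int(count * step)
        if index >= total_frames:
            break
        indices.append(index)
        count += 1
    return indices


# A 3-second clip at 30 fps, analyzed at 1 frame per second.
print(sample_frame_indices(90, 30, 1))  # → [0, 30, 60]
```

Halving `target_fps` roughly halves the number of frames sent, and therefore the number of billable calls.
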
  9. Audio content for a listening application

    Please provide an option for audio for a listening application

    1 vote
    0 comments  ·  Language Support
  10. Disable Celebrity Recognition in Vision Analysis Captions

    My application does not wish to know if a face in an image is recognised as a celebrity (correctly or false-positive), but I see no option to remove/disable celebrity names from captions.

    e.g. I just want to know that the image is "A person talking on a cell phone"

    "captions": [
      {
        "text": "Leonardo DiCaprio talking on a cell phone",
        "confidence": 0.53837191468154666
      }
    ]

    2 votes
    0 comments  ·  API
  11. 2 votes
    1 comment  ·  Text Recognition
  12. Can the OCR API support identifying paragraphs from PDFs?

    Currently the tool can identify text line by line; however, it cannot recognize which paragraph the text belongs to.

    1 vote
    Planned  ·  0 comments  ·  API
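
While this is planned, a rough client-side heuristic is to start a new paragraph wherever the vertical gap between consecutive lines is large relative to the line height. This is a sketch under stated assumptions: each line is a dict with "text" and a "boundingBox" of eight numbers (x1, y1, ..., x4, y4), lines arrive in reading order, and the page has a single column. `group_into_paragraphs` and `gap_factor` are illustrative names, not part of the API.

```python
def group_into_paragraphs(lines, gap_factor=0.6):
    """Group line texts into paragraphs using vertical gaps.

    Assumption: line["boundingBox"] is [x1, y1, x2, y2, x3, y3, x4, y4].
    A gap larger than gap_factor * line height starts a new paragraph.
    """
    paragraphs = []
    current = []
    previous_bottom = None
    for line in lines:
        ys = line["boundingBox"][1::2]  # every second value is a y coordinate
        top, bottom = min(ys), max(ys)
        height = bottom - top
        if current and top - previous_bottom > gap_factor * height:
            paragraphs.append(" ".join(current))
            current = []
        current.append(line["text"])
        previous_bottom = bottom
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs
```

Multi-column layouts, tables, and rotated pages would need more than this simple gap test.
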
  13. Limit per file is too small

    I would really like to use the Computer Vision API to tag images uploaded to SharePoint. The problem is that the 4 MB per-file limit is too small and causes the Flow to fail in 50% of cases. This is holding me back; if there is any way to increase that limit, please let me know as soon as possible.

    1 vote
    0 comments  ·  Image Analysis
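
Until the limit changes, failures can at least be avoided by checking file sizes before sending, so oversized images can be resized or skipped rather than failing the whole run. A minimal sketch, assuming the 4 MB figure from the post means 4 * 1024 * 1024 bytes and that files must be strictly smaller than that; `split_by_size` is a hypothetical helper.

```python
import os

# Assumption: "4 MB" is interpreted as 4 * 1024 * 1024 bytes.
MAX_BYTES = 4 * 1024 * 1024


def split_by_size(paths, max_bytes=MAX_BYTES):
    """Partition file paths into (sendable, too_large) by size on disk."""
    sendable = []
    too_large = []
    for path in paths:
        # Conservative choice: treat a file exactly at the limit as too large.
        if os.path.getsize(path) < max_bytes:
            sendable.append(path)
        else:
            too_large.append(path)
    return sendable, too_large
```

The paths in `too_large` could then be downscaled with an image library of your choice before retrying.
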
  14. The tutorial

    The C# tutorials for OCR and text recognition no longer work; they are out of date and raise an error on the Uri:
    An invalid request URI was provided. The request URI must either be an absolute URI or BaseAddress must be set.

    1 vote
    0 comments  ·  API
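
That error message indicates the request URI was not absolute, which typically happens when the endpoint value is empty or missing its scheme and host. The check below illustrates the idea in Python rather than C#; `build_request_uri` is a hypothetical helper, and the endpoint and path values in the usage note are placeholders, not real service addresses.

```python
from urllib.parse import urlparse


def build_request_uri(endpoint, path):
    """Join an endpoint and a path, insisting that the endpoint is absolute."""
    parsed = urlparse(endpoint)
    if not (parsed.scheme and parsed.netloc):
        raise ValueError(
            "endpoint must be an absolute URI such as https://host, got %r" % endpoint
        )
    # Normalize slashes so exactly one separates the endpoint and the path.
    return endpoint.rstrip("/") + "/" + path.lstrip("/")
```

For instance, `build_request_uri("https://example.com/", "/vision/ocr")` yields `https://example.com/vision/ocr`, while an empty endpoint raises `ValueError` before any request is sent.
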
  15. OCR processing of numerical text should not be impaired when detected language is Turkish

    When testing with Turkish text that also includes numerical strings, I have isolated the following anomaly:

    Consider two versions of an otherwise identical image (for informative purposes, the image is a graphic depicting the total number of deaths in Turkey from COVID-19, on a particular date):

    Turkish version:
    TÜRKİYE’DE ÖLÜMLER 1.368

    English version:
    DEATHS IN TURKEY 1.368

    When the image is OCR-processed with the text in English (i.e. with the words: “DEATHS IN TURKEY”), the numeric string is returned correctly as “1.368”.

    However, when the image is OCR-processed with the text in Turkish (i.e. with the words: “TÜRKİYE’DE ÖLÜMLER”), the…

    1 vote
    1 comment  ·  Text Recognition
  16. Improve single-character recognition

    Single characters are recognized very poorly, even under optimal conditions for OCR (shown in the attached PDF). Also attached is a text view of how the content gets OCR-ed.

    I saw that this is a recognized problem in a few ideas from a couple of years ago, but I was wondering if any progress has been made in fixing this known problem.

    In practice, this causes problems when OCR-ing tables that have a "Quantity" or any other amount less than 10.

    5 votes
    Started  ·  4 comments  ·  Custom/Sample Images
  17. Receiving exception in preview container

    We're running the container preview in AKS and our containers are filled with the following exception:

    fail: Microsoft.CloudAI.Containers.Diagnostics.ExceptionMiddleware[0]

      Could not load type 'System.Net.Http.HttpStatusCodeExtensions' from assembly 'Microsoft.CloudAI.Containers.Common, Version=1.1.1201.1, Culture=neutral, PublicKeyToken=31bf3856ad364e35'. SubscriptionId='' RequestId='d1845cb5-99d0-46e3-b3e0-5ebf81ac51d9' Timestamp=''
    

    System.TypeLoadException: Could not load type 'System.Net.Http.HttpStatusCodeExtensions' from assembly 'Microsoft.CloudAI.Containers.Common, Version=1.1.1201.1, Culture=neutral, PublicKeyToken=31bf3856ad364e35'.
    at Microsoft.CloudAI.Containers.Controllers.VisionControllerBase.CreateErrorResponse(HttpStatusCode statusCode, String code, String message)
    at Microsoft.CloudAI.Containers.OneOcr.OcrControllerBase.GetTextOperationResultWorker(Guid id) in /source/src/OneOcr.Common/Controllers/OcrControllerBase.cs:line 92
    at Microsoft.CloudAI.Containers.OneOcr.OcrController.GetTextOperationResult(Guid id) in /source/src/OneOcr.2.0/Controllers/OcrController.cs:line 108
    at Microsoft.AspNetCore.Mvc.Internal.ActionMethodExecutor.TaskOfIActionResultExecutor.Execute(IActionResultTypeMapper mapper, ObjectMethodExecutor executor, Object controller, Object[] arguments)
    at Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker.InvokeActionMethodAsync()
    at Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker.InvokeNextActionFilterAsync()
    at Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker.Rethrow(ActionExecutedContext context)
    at Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker.Next(State& next, Scope& scope, Object& state, Boolean& isCompleted)
    at Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker.InvokeInnerFilterAsync()
    at Microsoft.AspNetCore.Mvc.Internal.ResourceInvoker.InvokeNextResourceFilter()
    at Microsoft.AspNetCore.Mvc.Internal.ResourceInvoker.Rethrow(ResourceExecutedContext context)
    at Microsoft.AspNetCore.Mvc.Internal.ResourceInvoker.Next(State& next, Scope& scope,…

    1 vote
    0 comments  ·  On-Premises Solution
  18. Ensure that all pages are returned, even ones that Computer Vision is unable to extract text from.

    If a document contains pages with a 'good' font size and also pages (think Terms and Conditions) with a lot of small text that Computer Vision can't interpret, those pages are excluded from the response.

    The page numbers are returned, which is useful. But when a 'complicated' page is the last page, it is not returned, so the number of pages you believe you have is one less than the number of pages you actually have. Including (at the very least) a JSON segment with:
    {…

    2 votes
    1 comment  ·  Text Recognition
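
If the true page count is known from another source (for example, a PDF library), the gaps can be filled on the client in the meantime. A minimal sketch, assuming each entry in readResults is a dict with a "page" number; the placeholder shape `{"page": n, "lines": []}` is an assumption made here, not the segment proposed in the post, and `fill_missing_pages` is a hypothetical helper.

```python
def fill_missing_pages(read_results, expected_page_count):
    """Return one entry per page, inserting empty placeholders for gaps.

    `expected_page_count` must come from outside the OCR response,
    since a missing last page cannot be detected from the response alone.
    """
    by_page = {result["page"]: result for result in read_results}
    filled = []
    for number in range(1, expected_page_count + 1):
        # Assumed placeholder shape for pages the service did not return.
        filled.append(by_page.get(number, {"page": number, "lines": []}))
    return filled
```
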
  19. Add OCR confidence to v1.0/ocr API

    Other OCR platforms provide OCR confidence sometimes per character and sometimes per word.

    Confidence here means how likely the result is to match the input image. For very poorly scanned documents where noise is a problem, the current API frequently returns incorrect text, with no programmatic way to detect whether the result should be trusted or sent to a user for verification.

    Having this as an optional query parameter on the API would be helpful, perhaps confidencePerWord=true and confidencePerCharacter=true.

    16 votes
    0 comments  ·  API
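
To illustrate how such a value would be used, the sketch below routes low-confidence words to manual verification. It assumes a hypothetical per-word "confidence" field between 0 and 1, which is exactly what this idea requests and is not claimed to exist in v1.0/ocr; `words_needing_review` is likewise an illustrative name.

```python
def words_needing_review(words, threshold=0.8):
    """Return the text of words whose confidence is below `threshold`.

    Assumption: each word is a dict with "text" and a hypothetical
    "confidence" value in [0, 1]. Words without a confidence value are
    treated as untrusted and flagged for review.
    """
    return [
        word["text"]
        for word in words
        if word.get("confidence", 0.0) < threshold
    ]
```

The threshold is arbitrary here; in practice it would be tuned against documents whose correct text is known.
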
  20. Is there a way to get page height and width in the MS Vision OCR API?

    As in the Read API, is there a way to extract page width and height information from the response of the MS Vision OCR API?

    1 vote
    0 comments  ·  API