Microsoft

Form Recognizer

  1. Auto-recognise which custom model to use

    Hi,
    Would it be possible to auto-detect which custom model to use? In a normal invoicing flow you would have one big pile of different invoices and send them all to be recognized.
    It would then sort them into two piles: one where a matching model was found and one without.
    Do you understand what I mean by that?
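
The two-pile sorting described above can be approximated on the client side by scoring each document against every trained model and keeping the best match. This is only a minimal sketch, not a Form Recognizer feature: `score_fn`, the 0.8 threshold, and the assumption that one confidence score per model is available are all hypothetical.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

# Hypothetical scorer: given (model_id, document bytes), return a confidence
# in [0, 1]. In practice this would wrap a call to the analyze API.
ScoreFn = Callable[[str, bytes], float]


def pick_model(document: bytes, model_ids: Iterable[str],
               score_fn: ScoreFn, threshold: float = 0.8) -> Optional[str]:
    """Return the best-scoring model ID, or None if no score reaches threshold."""
    best_id, best_score = None, -1.0
    for model_id in model_ids:
        score = score_fn(model_id, document)
        if score > best_score:
            best_id, best_score = model_id, score
    return best_id if best_score >= threshold else None


def sort_into_piles(documents: Dict[str, bytes], model_ids: List[str],
                    score_fn: ScoreFn, threshold: float = 0.8
                    ) -> Tuple[Dict[str, str], List[str]]:
    """Split documents into (name -> model ID) matches and unmatched names."""
    matched: Dict[str, str] = {}
    unmatched: List[str] = []
    for name, content in documents.items():
        model_id = pick_model(content, model_ids, score_fn, threshold)
        if model_id is None:
            unmatched.append(name)
        else:
            matched[name] = model_id
    return matched, unmatched
```

Calling every model for every invoice costs one request per model, so this only scales to a small number of models.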

    7 votes
    2 comments
  2. Analyze results: missing labels in the result (label-value pairs)

    I have trained the model using labels.
    In the result, I don't get anything like

    label 1, value
    label 2, value

    Is there any way to get these label and value pairs?

    1 vote
    1 comment
  3. Data Type for labels

    Ability to specify a data type for a label, and have the recognizer look for that data type's formatting when extracting text for that field.
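
A similar effect can be approximated with post-processing: check each candidate text for a field against the format of its declared type. The sketch below only illustrates that idea. The type names and regular expressions are assumptions, not Form Recognizer features.

```python
import re
from typing import Iterable, Optional

# Hypothetical data types and the text format each one expects.
TYPE_PATTERNS = {
    "date": re.compile(r"\d{4}-\d{2}-\d{2}"),    # e.g. 2020-03-31
    "amount": re.compile(r"\d+(?:[.,]\d{2})?"),  # e.g. 199,00 or 42
    "integer": re.compile(r"\d+"),
}


def first_value_of_type(candidates: Iterable[str], data_type: str) -> Optional[str]:
    """Return the first candidate whose whole text matches the type's format."""
    pattern = TYPE_PATTERNS[data_type]
    for text in candidates:
        cleaned = text.strip()
        if pattern.fullmatch(cleaned):
            return cleaned
    return None
```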

    1 vote
    1 comment
  4. Recognizing form regions

    At the moment we don't have features like recognizing checkboxes, radio buttons, signatures, etc.

    Can Form Recognizer recognize a region of the form, so that we can cut that part out and pass it on for further processing?

    For example, I would like to know where on the document the radio button questions are, so that I know which area needs custom processing.

    6 votes
    2 comments
  5. Separate handwritten text from printed text

    Occasionally, OCR puts together printed and handwritten text as a single tag. I would like them to be separate.

    For example:

    Company Handwritten company name

    Printed text is usually the label while the handwritten text is what we are looking for.

    4 votes
    Planned  ·  0 comments
  6. Ignore dotted lines

    When analyzing our user feedback form, I realized that the dotted line is occasionally recognized either as "........" or "**********".

    Can we train it to ignore those?

    Here is an example of an empty form: https://www.ssw.com.au/ssw/standards/forms/SSWEvaluationSurvey.pdf
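
Until the service can be trained to skip them, such values can be filtered out after recognition. The following is a minimal sketch of that workaround, assuming the recognized values are available as plain strings. The function names and the `min_marks` cutoff are hypothetical.

```python
import re
from typing import Iterable, List

# Text made up only of dots, asterisks and whitespace is treated as a leader line.
_LEADER_CHARS = re.compile(r"[.*\s]+")


def is_dotted_line(text: str, min_marks: int = 4) -> bool:
    """True if text is only dots/asterisks (plus spaces) with at least min_marks marks."""
    if not _LEADER_CHARS.fullmatch(text):
        return False
    marks = sum(1 for ch in text if ch in ".*")
    return marks >= min_marks


def drop_dotted_lines(values: Iterable[str]) -> List[str]:
    """Remove recognized values that are just dotted-line artifacts."""
    return [v for v in values if not is_dotted_line(v)]
```

The `min_marks` cutoff keeps short legitimate values such as an ellipsis from being discarded.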

    1 vote
    0 comments
  7. Reinforcement learning

    Whenever something is wrong, I want the model to learn from it.

    6 votes
    1 comment
  8. Predefined invoice Models

    At the moment I have no working prototype.
    I see "training a model" in the documentation.
    But why should I train a model for invoices? If language and location are detected, invoices follow specific rules: tax, tax ID, and so on. My expectation is: this seems to be a German invoice, so use the fitting model to extract the data.

    1 vote
    0 comments
  9. Test Form in Azure Portal

    The Azure Portal should have a testing UI for uploading a PDF and starting the request.

    2 votes
    1 comment
  10. Allow different formats to be trained in the same model

    Allow different formats to be trained in the same model. Currently, for each format, a new model is created.

    3 votes
    0 comments
  11. Analyzing forms with Fiddler

    This may be obvious to those with more experience, but it took me half a day to figure out. I'm using Fiddler instead of cURL, and when I tried to call the analyze API with a file in the Request Body I got a 415 Unsupported Media Type. The API does not support the multipart/form-data content type, which is what Fiddler defaults to for binary files in the Request Body. The workaround, after selecting a file with "Upload file..." in the Request Body, is to remove the multipart tags from the request body and change the content-type…
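
The same idea, sending the raw file bytes with an explicit content type instead of a multipart body, can be sketched outside Fiddler. The example assumes the document is a PDF. The v2.0-style URL path, the `Ocp-Apim-Subscription-Key` header, and the function name are assumptions to check against the API version in use.

```python
import urllib.request


def build_analyze_request(endpoint: str, model_id: str, api_key: str,
                          document: bytes,
                          content_type: str = "application/pdf") -> urllib.request.Request:
    """Build (but do not send) a POST whose body is the raw file bytes."""
    # Assumed v2.0-style URL; adjust to the API version you actually use.
    url = f"{endpoint.rstrip('/')}/formrecognizer/v2.0/custom/models/{model_id}/analyze"
    headers = {
        "Content-Type": content_type,  # a single media type, not multipart/form-data
        "Ocp-Apim-Subscription-Key": api_key,
    }
    return urllib.request.Request(url, data=document, headers=headers, method="POST")
```

Passing the returned object to `urllib.request.urlopen` would send the request. Building it separately makes the headers easy to inspect first.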

    1 vote
    0 comments
  12. Model Id is not maintained.

    Train the data using the sample labeling tool.
    List all custom models using the .NET SDK method GetCustomModelsAsync().
    The trained model ID is not available in the result.

    3 votes
    0 comments
  13. 2 votes
    0 comments
  14. Structure of table data

    First off, I love this tool. The key-value pair extraction is very powerful and really solves a business need.

    Having the ability to label the table elements would take this offering to the next level. If you were able to tag columns, e.g. Column 1, Column 2, the tool could then return each row of the table as an object...
    {"fields": [
        {"Address": {...}, "Name": {...}, "Size": {...}},
        {"Address": {...}, "Name": {...}, "Size": {...}},
        {"Address": {...}, "Name": {...}, "Size": {...}}
    ]}

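
The requested row-as-object output can be approximated on the client side by regrouping table cells. This is a minimal sketch under stated assumptions: each cell is a dict with `rowIndex`, `columnIndex` and `text` keys, and the column names (`Address`, `Name`, `Size`, taken from the example above) are supplied by the caller rather than by the labeling tool.

```python
from typing import Dict, Iterable, List


def rows_as_objects(cells: Iterable[dict], column_names: List[str]) -> List[Dict[str, str]]:
    """Group table cells into one dict per row, keyed by the tagged column name."""
    rows: Dict[int, Dict[str, str]] = {}
    for cell in cells:
        col = cell["columnIndex"]
        if col >= len(column_names):
            continue  # cell falls outside the tagged columns
        rows.setdefault(cell["rowIndex"], {})[column_names[col]] = cell["text"]
    # Emit rows in table order, regardless of the order the cells arrived in.
    return [rows[i] for i in sorted(rows)]
```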
    1 vote
    0 comments
  15. Release date for V2 including local container support?

    For planning purposes we need to know when V2 is planned to be released. We also need to use the local container version so we need to know if the release of that will be different than V2 GA. Ideally we would like to test the local V2 container prior to GA.

    5 votes
    0 comments
  16. Label tool and form processing

    We did some testing with the examples. We want to start with recognizing invoices, so we only need the payment information, amount, … We started with the .NET SDK example and the custom Form Recognizer model, but we are receiving too much unnecessary information. Instead we are using the sample labeling tool, which helps us tag the necessary fields. Is there a way to upload the training data from the labeling tool in the same way as the custom Form Recognizer does?

    1 vote
    0 comments
  17. Cognitive Service Container with V2.0 Support

    The containers provided use the V1.0 API with all its limitations (no Layout API, 4 MB dataset for training, ...).
    When will the containers be updated for the V2.0 API?

    Thx

    5 votes
    2 comments
  18. Roadmap

    Are you able to disclose what the roadmap is for the Form Recognizer product? I'm particularly interested in using the Custom Model and Labelling Tool.

    6 votes
    2 comments
  19. Fix .Net SDK error for Analyze method

    I'm getting the following error when using the analyze method of the SDK:

    AnalyzeResult result = await _formRecognizerClient.AnalyzeWithCustomModelAsync(new Guid(modelId), stream, contentType: "application/pdf");

    In case of HTML form data, the multipart request must contain a document with a media type of - 'application/pdf', 'image/jpeg' or 'image/png'.

    Error code: UnsupportedMediaType
    Status code: 415

    The data in the stream is correct, it contains a small pdf file.

    More code leading up to the error:

    var file = HttpContext.Current.Request.Files.Count > 0 ? HttpContext.Current.Request.Files[0] : null;
    if (file != null && file.ContentLength > 0)
    {
        using (var stream = file.InputStream)
        {
            AnalyzeResult result = await _formRecognizerClient.AnalyzeWithCustomModelAsync(new…

    2 votes
    0 comments
  20. Make it possible to use a download of the pageResult property of the GetAnalyzeFormResult API as an input for the TrainCustomModel API

    Make it possible to use the same structure of the pageResult property of the GetAnalyzeFormResult API, as an input for the TrainCustomModel API label files data structure. This would make it easier to improve the model key-value associations based on the results of the model.
    I would like to update the content of the pageResult structure and re-use it in a next TrainCustomModel API call.
    In the current version these are 2 different structures with similar data.

    1 vote
    0 comments