Microsoft

How can we improve Computer Vision API?

The PDF coordinate system needs some work

The Read api (https://westus.dev.cognitive.microsoft.com/docs/services/5adf991815e1060e6355ad44/operations/2afb498089f74080d7ef85eb) does not handle PDF word coordinates correctly.

Did not see a way to report the bug on the API page, so I am reporting it here.

I am sending it a 8.5x11 page and getting coordinates in inches that are outside the box.

The width is marked as 10.9833 and height as 8.4567, however it is getting words outside that box:

12.3419
0.4349
13.2572
0.4294
13.2715
0.501
12.3562
0.5065

When I print the coordinates to the page some documents are spot on the the x coordinate, but roughly half as far down the page as they should be on the y coordinate.

For now I am going to revert to converting everything to images, as that works for me, but I just wanted to log an issue because being able to upload a multi-page PDF would be more efficient for us.

4 votes
Sign in
(thinking…)
Sign in with: Facebook Google
Signed in as (Sign out)

We’ll send you updates on this idea

Adrian Moore shared this idea  ·   ·  Flag idea as inappropriate…  ·  Admin →

0 comments

Sign in
(thinking…)
Sign in with: Facebook Google
Signed in as (Sign out)
Submitting...

Feedback and Knowledge Base