In addition to the bounding box, include area, height, and width in the Analyze Layout result
The Analyze Layout API call currently returns a bounding box for each line of text. This information is useful, but the consumer always has to perform additional processing before it can be put to use.
Consider directly providing the key calculations as part of the service result to reduce redundancy on the consuming side.
Provide the following values:
- Area of the bounding box
- Height of the bounding box
- Width of the bounding box
At a minimum, consider providing the height and the width, since the area can easily be calculated from them.
The height of the bounding box can be used to infer the font size, which in turn can indicate the importance of the text in the document. Generally, we would assume taller text (a larger font) to be more significant than shorter text.
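For reference, here is a rough sketch of the processing each consumer currently has to write. It assumes the bounding box arrives as a flat list of eight numbers (x1, y1, ..., x4, y4) ordered top-left, top-right, bottom-right, bottom-left; that layout and the helper name `box_metrics` are assumptions for illustration, not something confirmed in this thread.

```python
from math import hypot


def box_metrics(bounding_box):
    """Return (width, height, area) for a four-corner bounding box.

    Assumes a flat list [x1, y1, x2, y2, x3, y3, x4, y4] with corners
    ordered top-left, top-right, bottom-right, bottom-left.
    """
    if len(bounding_box) != 8:
        raise ValueError("expected 8 numbers: x1, y1, ..., x4, y4")

    # Pair the flat list into (x, y) corner points.
    pts = list(zip(bounding_box[0::2], bounding_box[1::2]))
    (x1, y1), (x2, y2), _, (x4, y4) = pts

    # Width is the length of the top edge, height the length of the left edge.
    width = hypot(x2 - x1, y2 - y1)
    height = hypot(x4 - x1, y4 - y1)

    # The shoelace formula still gives the right area if the box is rotated.
    twice_area = sum(
        pts[i][0] * pts[(i + 1) % 4][1] - pts[(i + 1) % 4][0] * pts[i][1]
        for i in range(4)
    )
    return width, height, abs(twice_area) / 2


# Example: an axis-aligned box 100 units wide and 30 units tall.
print(box_metrics([10, 20, 110, 20, 110, 50, 10, 50]))  # (100.0, 30.0, 3000.0)
```

For an axis-aligned box this reduces to two subtractions and a multiplication, but every consumer still has to know the corner ordering to get it right.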
Charles Chen commented
Sean, I think the downside to saving a few bytes is that every consumer has to implement this logic to make use of the bounding box coordinates. While this is not difficult, I think there's a design question of whether it's worth saving a few bytes versus making an easier-to-use API layer that reduces redundant code for the consumer.
As an example, it would be even more efficient to include a tokens dictionary, as there are sure to be many repetitive tokens ("the", "I", "he", "she", "a", etc.), and instead of referring to the full string, refer to an index in the token table. However, the savings in byte count with this approach result in a less usable API because every consumer will now need to perform a token lookup to make use of the results.
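To make the analogy concrete, here is a minimal sketch of what such a token table might look like. It is purely hypothetical (the function names `encode` and `decode` are made up for illustration and nothing like this exists in the API); the point is only that the smaller payload pushes a lookup step onto every consumer.

```python
def encode(words):
    """Replace each word with an index into a table of unique tokens."""
    table = []   # unique tokens, in order of first appearance
    index = {}   # token -> position in table
    refs = []    # one table index per original word
    for word in words:
        if word not in index:
            index[word] = len(table)
            table.append(word)
        refs.append(index[word])
    return table, refs


def decode(table, refs):
    """The lookup every consumer would have to perform to recover the text."""
    return [table[i] for i in refs]


words = ["the", "cat", "saw", "the", "dog"]
table, refs = encode(words)
print(table)  # ['the', 'cat', 'saw', 'dog']
print(refs)   # [0, 1, 2, 0, 3]
assert decode(table, refs) == words
```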
Sean Flynn commented
It's just a coordinate system, so all you have to do is calculate the deltas for the width and height and multiply them to get the area. I would not want to increase the size of the response for such a simple calculation.