Describe the bug
The bounding boxes returned by the HI_RES strategy are wrong for PDFs.
To Reproduce
from unstructured_client.models import operations, shared

# `client` is an already-configured unstructured_client.UnstructuredClient instance.
filename = "example.pdf"
with open(filename, "rb") as f:
    data = f.read()

req = operations.PartitionRequest(
    partition_parameters=shared.PartitionParameters(
        files=shared.Files(
            content=data,
            file_name=filename,
        ),
        strategy=shared.Strategy.HI_RES,
        coordinates=True,
        languages=["de"],
    ),
)

try:
    res = client.general.partition(request=req)
    print(res.elements[0])
except Exception as e:
    print(e)
Expected behavior
I would expect the bounding boxes to be correctly placed around each of the elements returned by the unstructured API.
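To check placement, the returned points can be mapped onto the PDF page before drawing them. The following is only a minimal sketch, not a confirmed fix. It assumes each element is a dict whose `metadata.coordinates` carries `points`, `layout_width` and `layout_height`, and that the boxes are expressed in the layout's own space (for example, pixels of a rendered page image) rather than in PDF points. The helper name `rescale_points`, the sample element and the 612 x 792 page size are hypothetical values for illustration.

```python
def rescale_points(points, layout_width, layout_height, page_width, page_height):
    """Map (x, y) points from the layout's coordinate space onto the page's space."""
    return [
        (x * page_width / layout_width, y * page_height / layout_height)
        for x, y in points
    ]


# Hypothetical element, shaped like a dict returned when coordinates=True (assumption).
element = {
    "type": "Title",
    "text": "Beispiel",
    "metadata": {
        "coordinates": {
            "points": [[100.0, 200.0], [100.0, 300.0], [500.0, 300.0], [500.0, 200.0]],
            "system": "PixelSpace",
            "layout_width": 1700,
            "layout_height": 2200,
        }
    },
}

coords = element["metadata"]["coordinates"]
# 612 x 792 is only a placeholder; substitute the real page size of the PDF in points.
scaled = rescale_points(
    coords["points"],
    coords["layout_width"],
    coords["layout_height"],
    page_width=612,
    page_height=792,
)
print(scaled)
```

If the rescaled boxes line up with the elements, the mismatch is a coordinate-space difference. If they still do not, the boxes themselves are off.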
Screenshots of Actual (Wrong) Behavior
Additional context
This problem was already discussed in a previous issue (#3100). Back then, the default strategy still returned bounding boxes. This no longer seems to be the case: all strategies except hi_res return no coordinates (and hence no bounding boxes). Is there currently no way to retrieve proper bounding boxes for PDFs?
Does anyone know a way to retrieve correct bounding boxes?