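Bug
When downloading models and pointing to them locally, we commonly use pipeline_options.artifacts_path. For example: pipeline_options.artifacts_path = "models"
The layout and tableformer models are then imported from the following paths:
models/model_artifacts/layout/...
models/model_artifacts/tableformer/...
However, when importing the new picture classifier from a local folder, it requires that all model files are in the root folder, i.e., models: models/config.json, models/model.safetensors
I find this very counterintuitive and not very well structured.
EDIT: This is even more annoying when using both the picture classifier and the CodeFormula models and importing them from local folders, since both accept the same artifacts_path.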
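To make the mismatch concrete, here is a minimal sketch that recreates the folder layout this report describes in a temporary directory and lists the files left loose in the root. It does not call docling at all; the helper name root_level_files is hypothetical, and the directory and file names are the ones quoted in this report (the contents of the layout and tableformer folders are elided in the report, so only the folders are created).

```python
from pathlib import Path
import tempfile


def root_level_files(artifacts_path: Path) -> list[str]:
    """Return the names of files sitting directly in artifacts_path (hypothetical helper)."""
    return sorted(p.name for p in artifacts_path.iterdir() if p.is_file())


with tempfile.TemporaryDirectory() as tmp:
    models = Path(tmp) / "models"

    # Layout and tableformer each get their own subfolder under model_artifacts/.
    (models / "model_artifacts" / "layout").mkdir(parents=True)
    (models / "model_artifacts" / "tableformer").mkdir(parents=True)

    # The picture classifier, as reported, expects its files directly in the root.
    (models / "config.json").touch()
    (models / "model.safetensors").touch()

    loose = root_level_files(models)
    print(loose)
```

Any second model that is also loaded from the root of the same artifacts_path would have to share this one folder, which is the situation the EDIT note complains about.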
Proposal
Two options that come to mind:
a. When initializing the DocumentFigureClassifierPredictor, simply point to models as artifacts_path and allow the same folder structure as on Hugging Face. For example:
DocumentFigureClassifierPredictor(
    artifacts_path="models",  # model files are in models/ds4sd/DocumentFigureClassifier
    device="cpu",
    num_threads=4,
)
b. On Hugging Face, use the model_artifacts/<model_type> structure for the DocumentFigureClassifier repo, similar to the main docling repository, and allow import by simply stating the root folder. For example:
DocumentFigureClassifierPredictor(
    artifacts_path="models",  # model files are in models/model_artifacts/DocumentFigureClassifier/...
    device="cpu",
    num_threads=4,
)
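Either option boils down to resolving a model-specific subfolder under one shared root. The following is a rough sketch of how such a lookup could work, not how docling or DocumentFigureClassifierPredictor actually behaves: resolve_model_dir, CANDIDATE_SUBDIRS, and the use of config.json as a marker file are all assumptions made for illustration. The two subfolder names are taken from the examples in options a and b.

```python
from pathlib import Path
from typing import Optional, Sequence
import tempfile

# Hypothetical search order, based on the two options proposed in this issue.
CANDIDATE_SUBDIRS = (
    "ds4sd/DocumentFigureClassifier",            # option a: Hugging Face repo layout
    "model_artifacts/DocumentFigureClassifier",  # option b: docling-style layout
    ".",                                         # current behavior: files in the root
)


def resolve_model_dir(
    artifacts_path: str,
    candidates: Sequence[str] = CANDIDATE_SUBDIRS,
    marker: str = "config.json",  # assumed marker file for "a model lives here"
) -> Optional[Path]:
    """Return the first candidate folder under artifacts_path that contains marker."""
    root = Path(artifacts_path)
    for sub in candidates:
        candidate = root / sub
        if (candidate / marker).is_file():
            return candidate
    return None


# Demo: place the model in the Hugging Face style folder and resolve it from the root.
with tempfile.TemporaryDirectory() as tmp:
    models = Path(tmp) / "models"
    target = models / "ds4sd" / "DocumentFigureClassifier"
    target.mkdir(parents=True)
    (target / "config.json").touch()

    resolved = resolve_model_dir(str(models))
    relative = resolved.relative_to(models).as_posix()
    missing = resolve_model_dir(str(models), candidates=("model_artifacts/CodeFormula",))
    print(relative)
```

Keeping the root folder as the last candidate would preserve the current behavior for anyone who already stores the files there, while letting several models share a single artifacts_path.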
Docling version
2.16.0