In a recent PR, a question + response comment touches on this question: #105 (comment).
Two interrelated questions emerge for me:
1. Should item identifiers in DSC ever contain filename extensions (e.g. ".pdf")?
2. If they do include them, do these directly become identifiers anywhere in DSpace?
I understand that stakeholders, at least in the past, have included filename extensions in some kind of Item Identifier column in CSVs used for uploads. If the answer to #2 above is "No", then it probably doesn't matter; it sounds like it's a 100% internal identifier that is gone after DSC and DSS ingest the item.
Another question, though: how does this scale for items that may have multiple files? Suppose it's a SimpleCSV-style workflow and there are 3 PDFs. Is this when we get into juggling _01 suffixes in the filenames? And is that suffix somehow stripped from the item identifier?
Ahhh, I get your concern now. The answer to question 2 is no: the item identifier is used purely by DSC and DSS, and it is never written to the metadata.
create_dspace_metadata even has this line to ensure it never goes in:
    for field_name, field_mapping in self.metadata_mapping.items():
        if field_name not in ["item_identifier"]:
            field_value = item_metadata.get(field_mapping["source_field_name"])
The item identifier in the metadata CSV can be D123.pdf, but it is just as often D123 when multiple files are expected (e.g. D123_01.pdf, D123_verso.pdf). Checking that the item identifier is "in" the filename just happens to work in both cases.
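To make the "in" check concrete, here is a minimal sketch. The helper name `files_for_item` and the filenames are hypothetical and only illustrate the substring test; this is not the actual DSC code.

```python
def files_for_item(item_identifier: str, filenames: list[str]) -> list[str]:
    """Return the filenames that contain the item identifier as a substring.

    Hypothetical helper for illustration only; not taken from DSC.
    """
    return [name for name in filenames if item_identifier in name]


# Made-up filenames, as they might sit next to a metadata CSV.
filenames = ["D123.pdf", "D123_01.pdf", "D123_verso.pdf", "D456.pdf"]

# Identifier with an extension: matches the single file of that exact name.
single = files_for_item("D123.pdf", filenames)

# Identifier without an extension: matches every file sharing the prefix.
multiple = files_for_item("D123", filenames)

print(single)
print(multiple)
```

One thing to keep in mind with a plain substring test: an identifier such as D12 would also match D123.pdf, so it relies on identifiers not being substrings of one another.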