Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Encode database label file names using URL encoding (optionally) #130

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

likelion
Copy link
Contributor

@likelion likelion commented Mar 28, 2023

Often database label (the name of a graph) needs to be a URI. However, this conflicts with the file system when directory store is used. The proposed solution is to encode the label file name using URL encoding, it is also backwards compatible.

@likelion likelion changed the title encode database labels using URL encoding encode database label file names using URL encoding Mar 28, 2023
@matko
Copy link
Member

matko commented Mar 28, 2023

I don't think this is actually backwards compatible though, as various label names that are possible now would be transformed by this change. In particular, any label with a % in it would stop working, as url-encoding them will turn that into %25. The same goes for the pipe character (|) which turns into %7C.

Both are pretty relevant for us because in TerminusDB we actually use url encoding and pipes (as separator between organization and data product) for our label names. So we currently have a whole bunch of label files with % and | characters in them that would stop working with this change.

I wonder if there's not some other strategy we could take here. For example, you could define another label store which acts as a name-transforming proxy. Or we could provide an option in the constructor. Or maybe a feature flag. But as-is, I don't think this is a good idea.

@likelion
Copy link
Contributor Author

Fair enough. I will look into making this more configurable. Would be of course nice to get both backwards compatibility and proper handling of names with other special characters, but maybe it's too late now.

@likelion
Copy link
Contributor Author

@matko Please, have a look at the modified solution. I made label file name encoding pluggable and provided two default implementations - identity and URL encoder. Do you think this may be something worth merging?

@likelion likelion changed the title encode database label file names using URL encoding Encode database label file names using URL encoding (optionally) Jun 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants