Create transformers to support subextractors #35
Comments
I wanted to comment on this, but I can only say: I agree :) We have to make sure the extractor gives the right feedback when it is unable to read the file (for example, when the field contains dirty values).
OK, after giving it some thought, there is more. A problem arises in the mapping, because we need to link from resources created in the parent job to the child job, so this relation needs to be embedded in the mapping file. Since each subfile is linked with a row in the original file, I suggest an execution strategy like this: Then extend the mapping language to refer to child fields, e.g.: <#Event> a :Resource; Implementation-wise, you will need to pass the URI of the object (what comes out of <#Event>) and the field of the child (<#OpeningHours>) to the mapping of the child job. This info is then added to the parsed mapping file and the triples are generated automatically.
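A rough sketch of that hand-off, written in Python purely for illustration and not taken from the tdt/input code base. The function name `link_child_rows`, its parameters, the `http://example.org/` base URI and the way child URIs are built are all assumptions; the real mapping would be driven by the mapping file.

```python
from typing import Dict, Iterable, Iterator, Tuple

Triple = Tuple[str, str, str]


def link_child_rows(
    parent_uri: str,
    child_field: str,
    child_rows: Iterable[Dict[str, str]],
    base_uri: str = "http://example.org/",  # hypothetical vocabulary base
) -> Iterator[Triple]:
    """Yield triples that attach every child row to the parent resource.

    parent_uri  : URI produced for the parent row (e.g. what <#Event> yields)
    child_field : name of the child field in the mapping (e.g. OpeningHours)
    """
    for index, row in enumerate(child_rows):
        # Hypothetical URI scheme: the child lives "under" its parent.
        child_uri = f"{parent_uri}/{child_field}/{index}"
        # The link from the parent job's resource to the child job's resource.
        yield (parent_uri, f"{base_uri}{child_field}", child_uri)
        # The child's own fields, emitted as plain literals.
        for key, value in row.items():
            yield (child_uri, f"{base_uri}{key}", value)
```

The point of passing `parent_uri` and `child_field` explicitly is that the child job never has to know how the parent URI was minted; it only receives the two values mentioned above.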
Started creating transformers in branch https://github.com/tdt/input/tree/transformers. A problem occurs when the file inside this thing is huge. The conclusion is that we need to stream these chunks as well. Thus, for every new row inside this subjob, we need a new vertere execution. The only question now is how we will tell the mapping file that this chunk should be mapped according to the subjob of clause X.
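To make the streaming idea concrete, here is a minimal Python sketch, assuming the subfile is CSV and that the mapper can be called once per row. `stream_subjob` and `map_row` are hypothetical names, not part of tdt/input or vertere, and the open question of how the mapping file selects the right subjob clause is left out.

```python
import csv
from typing import Callable, Dict, Iterator, List, TextIO, Tuple

Row = Dict[str, str]
Triple = Tuple[str, str, str]


def stream_subjob(
    child_file: TextIO,
    parent_uri: str,
    map_row: Callable[[str, Row], List[Triple]],
) -> Iterator[Triple]:
    """Map a child file row by row instead of loading it into memory.

    map_row stands in for one mapping execution; it receives the parent
    URI so the triples it returns can link back to the parent resource.
    """
    # csv.DictReader reads lazily, so only one row is held at a time.
    for row in csv.DictReader(child_file):
        yield from map_row(parent_uri, row)
```

Because the function is a generator, the loader can consume triples as they are produced, so a huge subfile costs no more memory than a small one.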
Can this be closed? @pietercolpaert
No. But it's no priority atm as @andimou is putting effort in RML
This is within the scope of and natively solved by RML. Dontfix?
Maybe someone else out there wants to fix it and pull request the fix?
It's not really a fix now is it, more of a very large enhancement/feature?
Problem
Sometimes the data we get in tdt/input is already linked. A URI pointing to another resource, or a URL to a file giving more explanation, can be handy to include in our harvester as well. Some of these referenced resources are structured according to another extractor, for instance a CSV file referencing an ICAL file.
Solution
We should add our first transformer to our ETML workflow. Now the configuration might look like this:
The ICAL will then be put inside the hierarchy and can be mapped as well.
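As an illustration only, a transformer of this kind could be sketched in Python as follows. Everything here is an assumption: `transform` and `parse_vevents` are hypothetical names, the field is assumed to already contain the fetched ICAL text rather than a URL, and the parser is a deliberately simplified reader that ignores line folding, property parameters and nested components.

```python
from typing import Callable, Dict, List, Optional

Record = Dict[str, object]


def parse_vevents(text: str) -> List[Dict[str, str]]:
    """Simplified sub-extractor: collect NAME:VALUE lines per VEVENT block."""
    events: List[Dict[str, str]] = []
    current: Optional[Dict[str, str]] = None
    for line in text.splitlines():
        line = line.strip()
        if line == "BEGIN:VEVENT":
            current = {}
        elif line == "END:VEVENT":
            if current is not None:
                events.append(current)
            current = None
        elif current is not None and ":" in line:
            # Split on the first colon only, values may contain colons.
            name, value = line.split(":", 1)
            current[name] = value
    return events


def transform(
    row: Record,
    field: str,
    sub_extract: Callable[[str], List[Dict[str, str]]],
) -> Record:
    """Return a copy of row where row[field] is replaced by extracted sub-rows."""
    raw = row.get(field)
    if not isinstance(raw, str) or not raw.strip():
        # Give clear feedback instead of silently producing nothing.
        raise ValueError(f"field {field!r} is empty or not text; cannot run sub-extractor")
    out = dict(row)
    out[field] = sub_extract(raw)
    return out
```

The extracted events end up nested under the original field, so the parent row keeps its shape and the mapping can address the child fields through that hierarchy.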