
Feature/csckan-311 - add custom flags for full and label imports #316

Closed
wants to merge 1 commit into from

Conversation


@D-GopalKrishna D-GopalKrishna commented Oct 10, 2024

Jira link - SCKAN-311

Task description:

  1. Add flags for full and label imports.
  2. Update the neurondm script so that ingest_statement uses log_error() to log errors in the terminal.
    a. The logic also falls back to default configurations if flags are not provided manually.
  3. Add a missing migration.
  4. Update the docs accordingly.
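A minimal sketch of points 1 and 2a, using stdlib `argparse` (which Django's `add_arguments` wraps under the hood). The flag names and the default lists are assumptions drawn from the README excerpt discussed in this PR, not the PR's exact code:

```python
import argparse

# Assumed defaults, based on the README excerpt; not necessarily the PR's exact values.
DEFAULT_FULL_IMPORTS = ["apinat-partial-orders", "apinat-pops-more", "sparc-nlp"]
DEFAULT_LABEL_IMPORTS = ["apinatomy-neuron-populations", "../../npo"]

def parse_import_flags(argv):
    """Parse the custom import flags, falling back to defaults when absent."""
    parser = argparse.ArgumentParser()
    # nargs="*" lets the caller pass zero or more import names after each flag
    parser.add_argument("--full_imports", nargs="*", default=None)
    parser.add_argument("--label_imports", nargs="*", default=None)
    opts = parser.parse_args(argv)
    # Default configurations are used when the flags are not provided manually
    full_imports = opts.full_imports if opts.full_imports is not None else DEFAULT_FULL_IMPORTS
    label_imports = opts.label_imports if opts.label_imports is not None else DEFAULT_LABEL_IMPORTS
    return full_imports, label_imports

full, label = parse_import_flags(["--full_imports", "apinat-partial-orders"])
print(full)   # ['apinat-partial-orders']
print(label)  # falls back to DEFAULT_LABEL_IMPORTS
```

In a Django management command the same `parser.add_argument` calls would live in `add_arguments`, with `handle` reading the values out of `options`.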

Confirmed that the tests in test_neurondm_processing.py are not affected by the new changes.

@ddelpiano - do we need to update the test?

@D-GopalKrishna
Contributor Author

Upon inspection, I found the following behavior. Please let me know @ddelpiano if I should look into this. thanks!

2024-10-10.20-38-31.mp4


```
def handle(self, *args, **options):
    update_upstream = options['update_upstream']
    update_anatomical_entities = options['update_anatomical_entities']
    should_overwrite = options['overwrite']
```
Member

issue: I assume this is a leftover from another task. We should probably remove it from this task since it's making the command fail.

Contributor Author

thanks @afonsobspinto !

README.md Outdated
Here, `apinat-partial-orders` and `apinat-pops-more sparc-nlp` are the full imports, and `apinatomy-neuron-populations` and `../../npo` are the label imports.

NOTE: full imports are done from `https://raw.githubusercontent.com/SciCrunch/NIF-Ontology/neurons/ttl/generated/neurons/*.ttl` and label imports are done from `https://raw.githubusercontent.com/SciCrunch/NIF-Ontology/neurons/ttl/generated/neurons/*.ttl`; if you want a different path, you can specify it relatively, e.g. `../../npo`.
Member

@D-GopalKrishna please remove this; it is not our responsibility how the content gets passed, and this will change soon once neurondm pulls data from Interlex. I understand we dug into how the tool works, but we are not maintaining it, and this can be misleading in the future. Thanks.

Member

@afonsobspinto afonsobspinto Oct 11, 2024

I don't fully agree with the above. We are not maintaining the data but we are the ones that define where we are fetching the data from. If in the future this changes, we will need to change our script.

Now it's true that the part I highlighted was originally provided by them, but at this moment it's code we deliver and maintain

Contributor Author

I agree with @afonsobspinto; the intent behind having this in the README was also to have a form of documentation for ourselves for the future.

Member

@afonsobspinto are we defining where we fetch the data from, or which labels we have to use for the imports? IMHO these are two different things. The fact that neurondm downloads the ttls internally is neurondm behaviour, and I don't understand why we need to be aware of it; if this code is refactored later on and data are imported and filtered by label in a different way, as long as our code works that is not our problem.

Manually downloading files and uploading them to the same folders (one of @D-GopalKrishna's previous proposals, if I understood correctly) is hijacking the internals of a library which we do not own, and we should not worry about what it does.

Member

@afonsobspinto afonsobspinto Oct 15, 2024

Hey @ddelpiano, I’ve linked the code that shows how we define the data source for neurondm to use:
> we are the ones that define where we are fetching the data from

Could you take a look and let me know if that clarifies things?

```diff
@@ -14,8 +14,8 @@
 logger_service = LoggerService()


-def ingest_statements(update_upstream=False, update_anatomical_entities=False):
-    statements_list = get_statements_from_neurondm(logger_service_param=logger_service)
+def ingest_statements(update_upstream=False, update_anatomical_entities=False, should_overwrite=True, full_imports=[], label_imports=[]):
```
Member

issue: We should not use mutable objects as default arguments.

It's better to do:

```
def ingest_statements(update_upstream=False, update_anatomical_entities=False, should_overwrite=True, full_imports=None, label_imports=None):
    if full_imports is None:
        full_imports = []
    if label_imports is None:
        label_imports = []
```

Here's why it's a problem:

```
def append_to_list(value, my_list=[]):
    my_list.append(value)
    return my_list

# First call - works as expected
print(append_to_list(1))  # Output: [1]

# Second call - unexpected behavior
print(append_to_list(2))  # Output: [1, 2]

# Third call - still using the same list!
print(append_to_list(3))  # Output: [1, 2, 3]
```

@afonsobspinto
Member

> Upon inspection, I found the following behavior. Please let me know @ddelpiano if I should look into this. thanks!
>
> 2024-10-10.20-38-31.mp4

I don't fully understand the very first problem. Is it only happening with a specific .ttl file?
As for the exception raised, it sounds correct to me: if the original content is not complete (doesn't have a partial order), we can't create our statements. It shouldn't crash the ingestion; it should log the problem and continue without creating that specific statement.
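The "log and skip" behaviour described here can be sketched as below. The names (population dicts with a `partial_order` key, a `log_error` callable) are illustrative assumptions, not the actual API of the ingestion script:

```python
def ingest_statements_safely(populations, log_error):
    """Ingest populations, logging and skipping any that lack a partial order."""
    ingested = []
    for pop in populations:
        if not pop.get("partial_order"):
            # Log the problem and continue instead of crashing the whole ingestion
            log_error(f"Skipping {pop['id']}: missing partial order")
            continue
        ingested.append(pop["id"])
    return ingested

errors = []
result = ingest_statements_safely(
    [{"id": "pop-1", "partial_order": ["A", "B"]}, {"id": "pop-2"}],
    errors.append,
)
print(result)  # ['pop-1']
print(errors)  # ['Skipping pop-2: missing partial order']
```

The key design point is that a malformed population affects only itself: the error is recorded and the loop moves on to the next population.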

@D-GopalKrishna
Contributor Author

D-GopalKrishna commented Oct 14, 2024

> Upon inspection, I found the following behavior. Please let me know @ddelpiano if I should look into this. thanks!
> 2024-10-10.20-38-31.mp4

> I don't fully understand the very first problem. Is it only happening with a specific .ttl file? For the exception raised it sounds correct to me, if the original content is not complete (doesn't have a partial order) we can't create our statements. It shouldn't crash the ingestion; it should log the problem and continue without creating that specific statement.

I am not fully aware of the concept of partial order. But it does behave that way: for invalid inputs or when things don't work, it logs the problem and continues without creating that specific statement, rather than crashing the ingestion.

As in the video, it doesn't ingest (or doesn't recognize) the statements when I use "apinat-partial-orders" alone, without any other full imports. But when I use it together with others like "apinat-simple-sheet", it does import (114 statements).

@afonsobspinto
Member

> I am not fully aware of the concept of partial order. But it does do - It shouldn't crash the ingestion; it should log the problem and continue without creating that specific statement for invalid inputs or when things don't work.

Partial order is a property that we expect/require the neurondm neuronal populations to have so that we can understand the path of the connectivity statements (what connects to what). If a neuronal population doesn't have partial order, we skip it. That's the expected behaviour.

> As in the video - it doesn't ingest (or doesn't recognizes) the statements for when I use - "apinat-partial-orders" alone without any other full-imports. But when I use it with others like - "apinat-simple-sheet", it does import (114 statements).

Sounds to me like a problem with their ttl files; it might be worth letting them know, but I don't think it's a blocker for this PR (i.e. the problem is likely on their side).

@ddelpiano
Member

> @ddelpiano - do we need to update the test?

@D-GopalKrishna yes please, thanks

@ddelpiano ddelpiano changed the base branch from develop to feature/SCKAN-312 October 23, 2024 09:52
@ddelpiano ddelpiano closed this Oct 23, 2024