missing taxonomy python notebook to script #245

park2454 · 2023-02-06T19:35:23Z

missing_taxonomy.py creates a CSV file listing objects with missing labels, based on Brian (@bfhealy)'s Jupyter notebook. It requires a classification feature parquet file and a dataset mapper JSON file as input, and stores the output in "golden_missing_labels.csv".

bfhealy · 2023-02-06T20:24:10Z

Thanks @park2454! It looks like there are some lint errors related to unused imports in the code. Did you run pre-commit on your changes? If not, try:

pre-commit install
pre-commit run --files tools/missing_taxonomy.py

Then fix the errors, and commit/push again.

park2454 · 2023-02-06T22:47:15Z

Hi Brian,

Thank you for the suggestions! I will try that and recommit once they are fixed.

Thank you,
Sungmin


bfhealy

Thanks for helping to make this notebook become part of the codebase! Please see the comments below for recommended changes.

bfhealy · 2023-02-06T20:01:57Z

tools/missing_taxonomy.py

+    parser.add_argument(
+        "-merge_features",
+        type=bool,
+        nargs='?',
+        const=True,
+        default=False,
+        help="merge downloaded results with features from Kowalski",
+    )
+    parser.add_argument(
+        "-features_catalog",
+        type=str,
+        default='ZTF_source_features_DR5',
+        help="catalog of features on Kowalski",
+    )


These arguments are not used by the code above, so they can be removed.

bfhealy · 2023-02-06T20:15:31Z

tools/missing_taxonomy.py

+    # Read in golden dataset (downloaded from Fritz), mapper
+    parquet_path = os.path.join(os.path.dirname(__file__), parquet)
+    mapper_path = os.path.join(os.path.dirname(__file__), mapper)
+    output_path = os.path.join(os.path.dirname(__file__), "golden_missing_labels.csv")


It would be best to allow the user to customize the name of this output file using another argument.
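
One way this suggestion could look is sketched below. It is only an illustration: the flag name `--output` and the helper names `get_parser` and `resolve_output_path` are hypothetical and not taken from the PR. Only the default filename and the join onto the script's directory follow the diff above.

```python
import argparse
import os


def get_parser():
    """Build a parser exposing the output filename (flag name is hypothetical)."""
    parser = argparse.ArgumentParser(description="Find objects with missing labels")
    parser.add_argument(
        "--output",  # hypothetical flag name, not from the PR
        type=str,
        default="golden_missing_labels.csv",  # default matches the hard-coded name in the diff
        help="name of the output csv file",
    )
    return parser


def resolve_output_path(output, base_dir):
    """Join the chosen filename onto a base directory, as the diff does with __file__."""
    return os.path.join(base_dir, output)


# Passing an explicit list keeps the example independent of the real command line.
args = get_parser().parse_args(["--output", "my_missing_labels.csv"])
output_path = resolve_output_path(args.output, "tools")
```

Keeping the current filename as the default means existing invocations would behave the same, while users who want a different name can pass it explicitly.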

bfhealy · 2023-02-06T20:16:49Z

tools/missing_taxonomy.py

+    gold_map = gold_map.reset_index(drop=False).set_index('fritz_label')
+    gold_dict = gold_map.transpose().to_dict()
+
+    labels_gold = gold_new.set_index('obj_id')[gold_new.columns[1:54]]


The columns corresponding to the labels may not always be 1:54. Perhaps we could use the mapper's keys or the config file to generate a list of classifications?
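
A minimal sketch of the mapper-based option follows. It assumes the mapper JSON's top-level keys are the label column names in the parquet file, which is not confirmed by the diff. The helper `label_columns` and the toy names `label_a` and `label_b` are hypothetical.

```python
import json


def label_columns(columns, mapper):
    """Return mapper keys that also appear in `columns`, in mapper order."""
    available = set(columns)
    return [key for key in mapper if key in available]


# Toy stand-in for the mapper JSON; the real file's layout is an assumption here.
mapper = json.loads(
    '{"label_a": {"fritz_label": "A"}, "label_b": {"fritz_label": "B"}}'
)

# Toy stand-in for the golden dataset's column names (e.g. list(gold_new.columns)).
columns = ["obj_id", "label_a", "label_b", "period"]

selected = label_columns(columns, mapper)
```

In the script, `gold_new.columns[1:54]` could then become `label_columns(gold_new.columns, mapper)`, so the selection no longer depends on column positions.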

bfhealy · 2023-02-06T20:19:32Z

tools/missing_taxonomy.py

+import matplotlib.pyplot as plt
+from collections import Counter


We can remove these imports - while the notebook used them, this code does not.

bfhealy · 2023-02-06T23:43:58Z

tools/missing_taxonomy.py

@@ -0,0 +1,113 @@
+import argparse


Above the first import, add this commented line: #!/usr/bin/env python

This allows users to run the script using ./missing_taxonomy.py in addition to python missing_taxonomy.py.
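
For illustration, the top of the file might then look like the sketch below. The `main` function and the parser description are hypothetical placeholders and do not reproduce the script's real arguments.

```python
#!/usr/bin/env python
# The shebang has to be the very first line of the file for the OS to honor it.
import argparse


def main(argv=None):
    """Hypothetical entry point; the real script's arguments are not reproduced here."""
    parser = argparse.ArgumentParser(description="Find objects with missing labels")
    return parser.parse_args(argv)


if __name__ == "__main__":
    main()
```

On Unix-like systems the file also needs its executable bit set (for example with `chmod +x tools/missing_taxonomy.py`) before `./missing_taxonomy.py` will run.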

missing taxonomy python notebook to script

22ccd3b

mcoughlin requested a review from bfhealy February 6, 2023 19:50

bfhealy requested changes Feb 7, 2023

removed unnecessary packages

09c3a21
