Postprocessing Troubleshooting

Uploading results to AWS manually from Kestrel

Sometimes an upload fails, or you didn't originally specify that you wanted your results uploaded. As an alternative to rerunning with the --uploadonly flag, you can upload your results to S3 manually.
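If your result set is small enough, you can also copy it straight to S3 with boto3 rather than Globus. This is only a minimal sketch, not the documented workflow; the bucket, prefix, and local path below are placeholders, and it assumes you already have working AWS credentials configured wherever you run it.

```python
# Minimal sketch: copy a local directory of parquet results to S3 with boto3.
# Bucket, prefix, and local path are placeholders -- use your project's
# actual locations. For large result sets, use the Globus route below.
from pathlib import Path

import boto3  # assumes AWS credentials are already configured locally

s3 = boto3.client("s3")
local_results = Path("project/results/parquet")  # hypothetical local path
bucket = "my-results-bucket"                      # hypothetical bucket
prefix = "myanalysis/parquet"                     # hypothetical key prefix

for path in local_results.rglob("*.parquet"):
    key = f"{prefix}/{path.relative_to(local_results).as_posix()}"
    s3.upload_file(str(path), bucket, key)
    print(f"uploaded s3://{bucket}/{key}")
```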

Upload files through Globus

Follow the instructions for uploading to S3 using Globus.

A few notes about Globus

You will probably want the nrel#kglobus_projects endpoint rather than the eglobus endpoint mentioned in the tutorial.

When it asks for the AWS credentials, ssh into Kestrel and run the following command:

cat /kfs2/shared-projects/buildstock/aws_credentials.sh

Copy those credentials into the correct location.

If you get an error about the keys not being valid (usually because we have to cycle those keys every few months), you can update the keys stored on the Globus endpoint:

Enter the new credentials, and hit "Continue".
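If you want to check whether a set of keys is still valid before pasting them into Globus, a quick STS call will tell you. This is just a convenience sketch; the key values shown are placeholders for the ones from aws_credentials.sh, and it assumes boto3 is available wherever you run it.

```python
# Quick check that a set of AWS keys is still valid before entering them in
# Globus. The key values below are placeholders for the ones from
# aws_credentials.sh; this is only a convenience check, not part of the flow.
import boto3
from botocore.exceptions import ClientError

session = boto3.Session(
    aws_access_key_id="AKIA...",   # placeholder
    aws_secret_access_key="...",   # placeholder
)
try:
    identity = session.client("sts").get_caller_identity()
    print("Keys are valid for account:", identity["Account"])
except ClientError as err:
    print("Keys appear to be invalid or expired:", err)
```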

Create and Run the AWS Glue Crawler

Once the results are fully uploaded to AWS, you still need to create the Glue/Athena tables manually. You can do that by clicking through the console steps below, or script it as sketched after the list.

  1. Log into the AWS console.
  2. Make sure you're in the region you want to be (probably Oregon).
  3. Go to "Services" > "AWS Glue"
  4. "Add Tables" > "add tables using a crawler"
  5. Give the crawler a name
  6. Crawler source type: "Data stores"
  7. Add the data store:
    • Data store: s3
    • Crawl data in: Specified path in my account
    • Include path: The s3 path to the parquet data you just uploaded. You can navigate to it using the little folder icon.
  8. Don't add another data store
  9. Choose IAM role:
    • Choose existing IAM role: AWSGlueServiceRole-default
  10. Schedule: Run on demand
  11. Crawler output:
    • Database: Select the database you want
    • Prefix: Choose your prefix, e.g. myanalysis_
    • Leave other options alone.
  12. Review and Finish.
  13. At the top of the list of crawlers there will be a banner asking if you want to run the crawler you just created. Select "yes".
  14. Wait for the crawler to complete.
  15. Delete the crawler.
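
If you prefer to script this rather than click through the console, the same crawler setup can be sketched with boto3. The role, database, prefix, and S3 path below are placeholders mirroring the console steps above; adjust them to match your analysis.

```python
# Sketch of the same crawler setup using boto3 instead of the console. The
# role, database, prefix, and S3 path are placeholders mirroring the console
# steps above; adjust them to match your analysis.
import time

import boto3

glue = boto3.client("glue", region_name="us-west-2")  # Oregon

crawler_name = "myanalysis_crawler"  # hypothetical crawler name
glue.create_crawler(
    Name=crawler_name,
    Role="AWSGlueServiceRole-default",
    DatabaseName="my_database",     # the database you want the tables in
    TablePrefix="myanalysis_",      # your table prefix
    Targets={"S3Targets": [{"Path": "s3://my-bucket/path/to/parquet/"}]},
)
glue.start_crawler(Name=crawler_name)

# Wait for the crawler to finish (it moves through RUNNING/STOPPING and back
# to READY), then clean it up.
time.sleep(30)
while glue.get_crawler(Name=crawler_name)["Crawler"]["State"] != "READY":
    time.sleep(30)
glue.delete_crawler(Name=crawler_name)
```

Once the crawler finishes, the new tables should show up in the Glue database (and in Athena) under the prefix you chose.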

Common Errors

If you are getting an error in postprocessing like TypeError: not a path-like object, pull the latest buildstockbatch/master and recreate your environment. That should clear up the issue. If you're using the buildstock-0.18 or later environment on Eagle, you should be okay.