
NO2 vs Temperature #44

Open · wants to merge 4 commits into main

Conversation

Pratichhya
Contributor

This is a follow-up to the "NO2 emission and COVID Lockdown effects" notebook, in which we explored NO2 concentrations during and after the COVID lockdowns. During that analysis, we noticed a distinct pattern of rise and fall in NO2 concentration with changes in temperature.

Hence, in this notebook we want to explore the correlation between NO2 concentration and changes in temperature. For this use case we use the NO2 layer of Sentinel-5P and the Land Surface Temperature (LST) layer of Sentinel-3 SLSTR.
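
For readers who want to see the data loading at a glance, below is a minimal sketch with the openEO Python client; the backend URL, temporal extent and band names are illustrative assumptions, not the exact values used in the notebook.

```python
import openeo

# Connect to an openEO backend (URL is illustrative).
connection = openeo.connect("https://openeo.cloud").authenticate_oidc()

# NO2 from Sentinel-5P; the band name "NO2" is an assumption.
no2 = connection.load_collection(
    "SENTINEL_5P_L2",
    temporal_extent=["2020-01-01", "2021-12-31"],
    bands=["NO2"],
)

# Land Surface Temperature from Sentinel-3 SLSTR; the band name "LST" is an assumption.
lst = connection.load_collection(
    "SENTINEL3_SLSTR_L2_LST",
    temporal_extent=["2020-01-01", "2021-12-31"],
    bands=["LST"],
)
```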

@Pratichhya requested a review from soxofaan on January 22, 2024 13:19
@jdries
Contributor

jdries commented Jan 22, 2024

@Pratichhya this collection SENTINEL3_SLSTR_L2_LST is not yet live, can you let us know where you found it?

@Pratichhya
Contributor Author

Pratichhya commented Jan 22, 2024

@jdries, ahh, I didn't realise that; I followed your comments in Teams. I usually start with the staging endpoint and, once everything works, migrate to the main one. I tested it this morning in production and it was fine too, so I went ahead with this.

Should I put the use cases that include LST on hold?

@jdries
Contributor

jdries commented Jan 22, 2024

We indeed won't be able to release this one before we make the collection official, which can still take a few weeks!

@Pratichhya
Contributor Author

Ahh, sure, in that case I can keep this and another use case using LST on hold. 😄

@Pratichhya marked this pull request as draft on January 22, 2024 13:39
@soxofaan
Contributor

Not sure if it's related, but a user on the openEO Platform forum was also asking about SENTINEL3_SLSTR_L2_LST: https://discuss.eodc.eu/t/mosaic-artifacts-for-sentinel-3-composites/676

@Pratichhya
Contributor Author

Oh haha, no, it's not related. And now I get why Jeroen asked where I got it from.

@Pratichhya marked this pull request as ready for review on March 22, 2024 14:55
@Pratichhya
Contributor Author

Sorry, I lost track of this PR. I think this is also ready for review.

@soxofaan
Contributor

Some notes:

  • 1 Prepare a NO2 datacube
    • you define aoi as a GeoJSON-style Polygon, but you only use it as spatial_extent in load_collection. I would just use a classic bounding box dict, which is less verbose and a lot easier to adapt and play with
  • 2 Prepare temperature datacube
    • # CREATE MASK DATACUBE these kinds of comments don't really add anything, I think. In general, as a rule of thumb, code comments in notebooks are often an anti-pattern (not always, of course): if you want to explain something, use a markdown cell, which is a richer and more user-friendly medium.
    • mask = mask >= 16384 I think that 16384 needs some explanation (what does it mean, where does it come from, ...)
    • Typically, data is stored in a 16-bit format to prevent it from occupying large space by 32-bit floating points. Therefore, we convert our LST value saved as kelvin to celsius. This is a very confusing statement: what does Kelvin-to-Celsius conversion have to do with data bit depth? And what is the goal of the operation here: avoiding 32-bit floating point or forcing it? I guess the point is that by subtracting a float 273.15 you convert from 16-bit int to (32-bit?) float. But that is far from clear to the inexperienced reader, I'm afraid.
    • about the resampling: I think it is not necessary to explicitly resample if you do a merge_cubes anyway (which will implicitly resample), unless the bilinear method is important
  • 4 Download the merged cube
    • it's noteworthy that the Incurred Costs amount to only seven openEO Credits is it necessary to mention a concrete cost here? I think it heavily depends on the spatiotemporal extent, and costs might change when we optimize things in the longer term. I would not mention a concrete cost, but just explain how to find it if that is necessary.
  • 6 Let's do the plotting and further analysis
    • The plot above suggests a potential... and The curve further suggests a potential relationsh.. why are these paragraphs in another font size and bold?
    • ax2.set_ylabel("LST(K)") I think the plotted temperatures are in Celsius, not Kelvin
    • dataset.NO2.mean(dim="x", skipna=True).mean(dim="y", skipna=True) mathematically speaking, taking the mean of means is not the same as the mean over x and y combined because of skipna=True, I think you can take the mean over multiple dimensions at the same time in xarray
    • plt.subplots(2, 2, dpi=200, figsize=(8, 6)) try to stick to around dpi=90 for notebooks that are just meant to be consumed on computer screens, otherwise font sizes (and file sizes) increase unnecessarily
    • about the plot titles "DJF", "MAM", ...: why not use self-describing titles like "Dec-Jan-Feb (DJF)", "Mar-Apr-May (MAM)", ...?

@jdries
Contributor

jdries commented Mar 28, 2024

@soxofaan I asked to start including costs in samples, because this is a question that many people have. People want to develop a sense of cost without having to run multiple workflows themselves first.
It indeed depends on the area, so the statement can perhaps be revised to better reflect that.
Same for variability: it can be mentioned, but it doesn't matter that much, in the sense that even a cost of 14 credits would still be pretty low. It's about giving a sense of magnitude rather than exact numbers.

@soxofaan
Contributor

Thanks for that background.
I think it's better to do that programmatically in the notebook, instead of hardcoding something like that in markdown text.

For example:

You can inspect the incurred costs of this job within the web editor ... or like this:

job.describe()["costs"]

Note that just rendering the job object in a notebook (so a code cell with just job in it) also shows things like incurred costs, without needing to point to the web editor. But I think this rendering is only visible when actively running the notebook yourself; it is lost in read-only viewers like the one on GitHub. Using job.describe()["costs"] is probably the most robust option.
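
As a concrete usage example (assuming job is the finished batch job object):

```python
# Fetch the job metadata once and pull out the reported costs.
metadata = job.describe()
print(f"Incurred costs: {metadata.get('costs')} credits")
```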

@soxofaan
Contributor

if this is an important aspect for demos and sample notebooks: it should not be too hard to provide a method like job.cost_overview() to get a short listing of costs (automatically with the correct currency) and other usage stats like memory/CPU
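
A rough client-side sketch of what such a helper could do today, built only on job.describe(); note that the "usage" metrics (cpu, memory, duration, ...) are optional job metadata, so whether they show up depends on the backend:

```python
def cost_overview(job):
    """Print a short cost/usage summary for a batch job (sketch, not an official API)."""
    metadata = job.describe()
    print(f"costs: {metadata.get('costs')}")
    # "usage" is optional job metadata; each metric is typically {"value": ..., "unit": ...}.
    for name, metric in (metadata.get("usage") or {}).items():
        print(f"{name}: {metric.get('value')} {metric.get('unit', '')}")
```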

@Pratichhya
Contributor Author

  • 1 Prepare a NO2 datacube

    • you define aoi as a GeoJSON-style Polygon, but you only use it as spatial_extent in load_collection. I would just use a classic bounding box dict, which is less verbose and a lot easier to adapt and play with

Updated to use a bbox dict.
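
i.e. roughly like this (coordinates are placeholders; connection as in the loading sketch above):

```python
# A plain bounding-box dict is easier to read and tweak than a GeoJSON polygon.
aoi = {"west": 4.0, "south": 50.5, "east": 5.5, "north": 51.5}

no2 = connection.load_collection(
    "SENTINEL_5P_L2",
    spatial_extent=aoi,
    temporal_extent=["2020-01-01", "2021-12-31"],
    bands=["NO2"],
)
```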

  • mask = mask >= 16384 I think that 16384 need some explanation (what does it mean, where does it come from, ..)

At this moment, I know that it is a kind of cloud code or flag code, but I'm not sure whether there is an in-depth explanation for it. I will cross-check with Caroline for more information once she is back.
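
For reference, a sketch of the masking step with the openEO Python client; the flag band name "confidence_in" is only a guess here, and the exact meaning of the 16384 threshold still needs to be confirmed from the product documentation:

```python
# Load the quality/flag band of the LST collection (band name is a guess).
flags = connection.load_collection(
    "SENTINEL3_SLSTR_L2_LST",
    spatial_extent=aoi,
    temporal_extent=["2020-01-01", "2021-12-31"],
    bands=["confidence_in"],
)

# Treat flag values >= 16384 as flagged (e.g. cloudy) pixels and replace them
# with nodata in the LST cube.
mask = flags >= 16384
lst_masked = lst.mask(mask)
```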

  • Typically, data is stored in a 16-bit format to prevent it from occupying large space by 32-bit floating points. Therefore, we convert our LST value saved as kelvin to celsius. This is a very confusing statement: what has Kelvin-to-Celsius to do with data bit depth? And what is the goal of the operation here: avoid 32b floating point or force it? I guess the point is that by subtracting a float 273.15 you convert from 16bit int to (32b?) float. But that is far from clear to the inexperienced reader I'm afraid.

Updated
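
The updated cell boils down to something like this (sketch; the point is just that subtracting the float constant 273.15 converts the Kelvin values to Celsius and promotes them to floating point):

```python
# LST comes in Kelvin; subtracting 273.15 converts it to Celsius and, as a
# side effect, promotes the stored integer values to float.
lst_celsius = lst_masked.apply(lambda x: x - 273.15)
```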

  • about the resampling: I think it is not necessary to explicitly resample if you do a merge_cubes anyway (which will implicitly resamlpe). Unless the bilinear method is important

No, bilinear is not important in this case. Thank you for the suggestion; it seems much easier.
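
So the explicit resample step is simply dropped and the merge takes care of alignment, roughly:

```python
# merge_cubes aligns/resamples the two cubes implicitly, so no explicit
# resample_cube_spatial call is needed here.
merged = no2.merge_cubes(lst_celsius)
```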

  • 4 Download the merged cube

    • it's noteworthy that the Incurred Costs amount to only seven openEO Credits is it necessary to mention a concrete cost here? I think it heavily depends on the spatiotemporal extent, and costs might change when we optimize things in the longer term. I would not mention a concrete cost, but just explain how to find it if that is necessary.

Updated the sentence and also included the code to retrieve the cost, as a better option.

  • ax2.set_ylabel("LST(K)") I think the plotted temperatures are in Celcius, not Kelvin

updated

  • dataset.NO2.mean(dim="x", skipna=True).mean(dim="y", skipna=True) mathematically speaking, taking the mean of means is not the same as the mean over x and y combined because of skipna=True, I think you can take the mean over multiple dimensions at the same time in xarray
  • plt.subplots(2, 2, dpi=200, figsize=(8, 6)) try to stick around dpi=90 for notebooks that are just meant to be consumed on computer screens, otherwise font sizes (and file sizes) increase unnecessarily
  • about using plot titles "DJF", "MAM", ... why not use self-describing titles like "Dec-Jan-Feb (DJF)", "Mar-Apr-May (MAM"), ...?

updated
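
For completeness, the three adjustments in one sketch (dataset is the xarray Dataset loaded in the notebook; names follow the snippets quoted above):

```python
import matplotlib.pyplot as plt

# Mean over both spatial dimensions in a single call (not a mean of means).
no2_mean = dataset.NO2.mean(dim=["x", "y"], skipna=True)

# dpi=90 keeps font and file sizes reasonable for on-screen notebooks.
fig, axes = plt.subplots(2, 2, dpi=90, figsize=(8, 6))

# Self-describing season titles instead of bare abbreviations.
season_titles = ["Dec-Jan-Feb (DJF)", "Mar-Apr-May (MAM)",
                 "Jun-Jul-Aug (JJA)", "Sep-Oct-Nov (SON)"]
for ax, title in zip(axes.ravel(), season_titles):
    ax.set_title(title)
```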
