Post workshop notes
These are notes taken by @ljwolf during and after the workshop on things that can be improved.
Some should be filed as issues, some might need more general assessment.
What, exactly, do we need from conda-forge & how can we avoid it? Maybe in the future, we aim for only conda & pip until the frequently-encountered "solving environment..." hang goes away.
Finding a consistent way to install rasterio across all platforms proved difficult, with many users having distinct errors across Windows & OSX
assume that users may have more than one kernel, so that our setup works by default for someone with no kernels (i.e. ours is the only installed kernel) and so that ours shows up in the kernel selector. This matters especially when we're giving a workshop alongside other workshops.
because it is not unprojected WGS84.
after the next major release, we need to use pysal proper in this teaching material.
change all instances of language about "neighborhoods" to "residential districts" or "districts" for short
We tend to use the "neighborhood of an observation" language, and this is confusing when an observation is itself a "neighborhood." This change was made in the regression notebook, but not everywhere, and the shape data is still berlin-neighborhoods.geojson. I thought I changed this, but I need to double check git history.
some polite ribbing in the lightning talk suggested we stop importing using aliases.
PySAL devs also might want to revisit the libpysal.api stuff, if this is a prevailing sentiment.
Right now, the maps use 'Rd', which is a single color ramp. But, in some cases (such as the box-and-whisker map), we need to use a divergent scheme with a zero point at the median. This might require changes to geopandas, like @slumnitz's centered choropleth stuff. Not sure.
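One possible way to get a divergent scheme centered on the median, sketched here with matplotlib only. The sample values and the choice of the RdBu_r colormap are assumptions for illustration, not what the notebooks currently use, and whether this plugs cleanly into the geopandas plotting calls would still need checking.

```python
import numpy as np
import matplotlib
from matplotlib.colors import TwoSlopeNorm

# Hypothetical, skewed sample values standing in for a mapped variable.
values = np.array([1.0, 2.0, 3.0, 4.0, 50.0])
median = float(np.median(values))

# TwoSlopeNorm maps vmin -> 0, vcenter -> 0.5, vmax -> 1, so the middle of a
# diverging colormap lands on the median even when the data are skewed.
norm = TwoSlopeNorm(vmin=values.min(), vcenter=median, vmax=values.max())

# RdBu_r is an assumed choice of diverging colormap.
cmap = matplotlib.colormaps["RdBu_r"]
colors = cmap(norm(values))  # one RGBA row per observation
```

The same `norm` object can be handed to any matplotlib artist that accepts a `norm` argument.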
I think this could just be done with y[np.random.permutation(W.n)] or something, but we'll have to see.
Also, maybe, a map with the same I/join count but different pattern would help.
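A minimal sketch of the permutation idea, using NumPy alone. Here `n` stands in for `W.n`, and `y` is a made-up variable with an obvious left-to-right trend; both are assumptions for illustration rather than the workshop data.

```python
import numpy as np

# Seeded generator so the shuffled map is reproducible between runs.
rng = np.random.default_rng(0)

n = 25                        # stand-in for W.n, the number of observations
y = np.arange(n, dtype=float)  # hypothetical values with a strong trend

# Reassign the observed values to random locations: the set of values is
# unchanged, but any spatial arrangement among them is destroyed.
y_shuffled = y[rng.permutation(n)]
```

Mapping `y` next to `y_shuffled` shows the same distribution of values with and without spatial structure.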
eventually, replace this with the gds book spatial regression chapter, which is excellent on this front
again, made redundant by the forthcoming book chapter on the topic.
this will show that cluster "certainty" in spatially-constrained clustering often reveals some latent within-zone heterogeneity that the unconstrained clustering picks up. It helps to build intuition, I think.
for a clustering (maybe all or more than one) plot the distributions of silhouette scores to show how they work within groups, and their relationship to the map-average.
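A rough sketch of what this could look like in plain sklearn. The synthetic blobs, the use of KMeans, and the choice of three clusters are all assumptions for illustration; the notebook would use its own data and clustering.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_samples, silhouette_score

# Hypothetical data: three Gaussian blobs in two dimensions.
rng = np.random.default_rng(0)
X = np.vstack(
    [rng.normal(loc=c, scale=0.5, size=(30, 2)) for c in (0.0, 3.0, 6.0)]
)

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# One silhouette value per observation, plus the single map-average score.
scores = silhouette_samples(X, labels)
overall = silhouette_score(X, labels)

# Summarize the within-cluster distributions against the overall average.
for k in np.unique(labels):
    s = scores[labels == k]
    print(f"cluster {k}: mean={s.mean():.3f}, min={s.min():.3f}, "
          f"max={s.max():.3f} (overall={overall:.3f})")
```

The per-observation `scores` can be passed to a histogram or box plot grouped by `labels` to show the distributions directly.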
this is just pure sklearn, but we can't assume people have too much sklearn knowledge.