Preserving w order in local Geary values #195

jeffcsauer · 2021-11-24T14:17:04Z

Addressing: #192

Explanation of issue: As noted by @Michel-Anton, the current use of .groupby in geary_local.py and geary_local_mv.py can lose the order of the weight IDs when those IDs follow a custom order, because .groupby returns its groups sorted by key. The resulting local G and p-values then come back out of order, so they can be mapped or interpreted against the wrong observations.

Explanation of fix: As suggested by @Michel-Anton and @ljwolf in the issue ticket, the temporary object that stores the values should be reordered according to the weight IDs. No dynamic code should be needed, as weights ordered 0-n will be unchanged when reordered.

To fix the issue I have added a couple of small lines. This is not the most elegant solution (adj_list_gs = adj_list_gs.iloc[w.id_order] should have worked as a one-liner, but for some reason it only worked outside of a function), but it appears to work.

Old code:

adj_list_gs.columns = ["gs", "ID"]
adj_list_gs = adj_list_gs.groupby(by="ID").sum()
localG = adj_list_gs.gs.values
return localG

New code:

adj_list_gs.columns = ["gs", "ID"]
adj_list_gs = adj_list_gs.groupby(by="ID").sum()
# Rearrange data based on w id order
adj_list_gs['w_order'] = w.id_order
adj_list_gs.sort_values(by='w_order', inplace=True)
localG = adj_list_gs.gs.values
return localG
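
The ordering behaviour behind this change can be reproduced on a toy frame. The sketch below is not the code merged in this PR: `id_order` is a hypothetical stand-in for `w.id_order`, the numbers are made up, and `.reindex` is shown only as one label-based way to put the summed values back into a custom ID order.

```python
import pandas as pd

# Hypothetical custom ID order, standing in for w.id_order
id_order = [2, 0, 3, 1]

# Toy per-pair contributions: one row per (focal ID, neighbor) pair
adj_list_gs = pd.DataFrame(
    {"gs": [0.5, 0.25, 1.0, 2.0, 0.75, 0.5], "ID": [2, 2, 0, 3, 1, 1]}
)

# groupby sorts the group keys by default, so the custom order is lost here
summed = adj_list_gs.groupby(by="ID").sum()
sorted_ids = summed.index.tolist()

# Label-based reindexing puts the rows back in the order given by id_order
restored = summed.reindex(id_order)
local_values = restored["gs"].to_numpy()
```

Because `.reindex` matches on index labels rather than positions, this approach does not depend on the IDs being the integers 0-n.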

Demonstration of new code handling weights ordered 0-n and shuffled:

This uses the Guerry data, which cover 85 departments across France. The first analysis uses standard queen contiguity weights (ordered 0-n).

import libpysal as lp
import geopandas as gpd
from esda.geary_local import Geary_Local
# load the data - you will need to change the filepath
guerry_ds = gpd.read_file("C:/Users/jeffe/Dropbox/LOSH_Letter_2020/Data/Guerry/Guerry.shp")
# generate queen contiguity weights (IDs ordered 0-n)
w = lp.weights.Queen.from_dataframe(guerry_ds)
print("w weight ordering:", w.id_order[0:10])
# isolate y
y = guerry_ds['Donatns']
# run statistic
lG = Geary_Local(connectivity=w).fit(y)
print("local G values:", lG.localG[0:5])
print("local p values:", lG.p_sim[0:5])

w weight ordering: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
local G values: [0.18208704 0.56001403 0.97529461 0.21590694 0.61737256]
local p values: [0.193 0.053 0.066 0.169 0.448]

Now we randomly reshuffle the weight IDs and check that the new G and p-values differ:

import random
# one ID per department (85 in the Guerry data)
neworder = list(range(85))
# shuffle with a fixed seed so the new order is reproducible
random.Random(12345).shuffle(neworder)
w_remap = lp.weights.remap_ids(w, neworder)
print("new weight order:", w_remap.id_order[0:10])

new weight order: [39, 19, 30, 59, 43, 50, 28, 9, 56, 70]

This should impact our local G calculations...

lG = Geary_Local(connectivity=w_remap).fit(y)
print("local G values:",lG.localG[0:5])
print("local p values:",lG.p_sim[0:5])

local G values: [0.99314034 1.28169257 0.96518863 1.19722984 0.3317755 ]
local p values: [0.374 0.228 0.055 0.493 0.175]

If needed, next steps would be a quick validation by hand and/or comparison to the ongoing discussion at r-spatial/spdep#68
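
A hand check along these lines could start from the usual local Geary definition, c_i = sum_j w_ij (z_i - z_j)^2. The sketch below is only an illustration on made-up data: `local_geary_by_hand` is a hypothetical helper, and the row-standardized weights and z-scored input (population standard deviation) are assumptions about the conventions rather than a copy of the esda implementation. It also confirms that permuting the observations and the weights together permutes the results in the same way.

```python
import numpy as np

def local_geary_by_hand(x, W):
    """Hypothetical helper: c_i = sum_j w_ij * (z_i - z_j)**2."""
    x = np.asarray(x, dtype=float)
    W = np.asarray(W, dtype=float)
    # Assumption: z-score the variable with the population standard deviation
    z = (x - x.mean()) / x.std()
    # Assumption: row-standardize the weights so each row sums to 1
    W = W / W.sum(axis=1, keepdims=True)
    # Squared differences between every pair of observations
    diff_sq = (z[:, None] - z[None, :]) ** 2
    return (W * diff_sq).sum(axis=1)

# Toy data: four observations on a line, each neighboring the adjacent ones
x = np.array([1.0, 2.0, 4.0, 8.0])
W = np.array(
    [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]],
    dtype=float,
)
c = local_geary_by_hand(x, W)

# Reorder observations and weights consistently; results should follow along
perm = np.array([2, 0, 3, 1])
c_perm = local_geary_by_hand(x[perm], W[np.ix_(perm, perm)])
```

If the reordered run did not match `c[perm]`, that would point to an ordering problem of the kind this PR addresses.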

codecov · 2021-11-24T14:19:42Z

Codecov Report

Merging #195 (5a7e4ff) into master (9dac8bd) will increase coverage by 0.20%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #195      +/-   ##
==========================================
+ Coverage   77.32%   77.52%   +0.20%     
==========================================
  Files          45       45              
  Lines        4927     4931       +4     
==========================================
+ Hits         3810     3823      +13     
+ Misses       1117     1108       -9

| Impacted Files | Coverage Δ | |
|---|---|---|
| esda/geary_local.py | `88.13% <100.00%> (+0.41%)` | ⬆️ |
| esda/geary_local_mv.py | `100.00% <100.00%> (ø)` | |
| esda/_version.py | `40.65% <0.00%> (+2.67%)` | ⬆️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 9dac8bd...5a7e4ff.

Preserving w order in local Geary values

5a7e4ff

jeffcsauer requested a review from ljwolf November 24, 2021 14:17

jeffcsauer added the bug label Nov 24, 2021

ljwolf merged commit cd15d8c into pysal:master Dec 10, 2021

ljwolf mentioned this pull request Dec 10, 2021

Local Geary loses order #192

Closed
