Make sure NetworkX can properly read/write GEXF 1.3 #16
Comments
@mbastian looks like NetworkX can't handle lists of floats as attributes, which were added in 1.3.
@mbastian to be specific about my use case: in reading the code, it looks like Gephi's format assumes that lists of properties are time series. My use case is that I want to store a 384-dimension embedding, produced by a paraphrase embedding model from a citation graph's node properties, on the nodes, do analysis in NetworkX, and then also use this GEXF file in Deep Graph Library (DGL) and PyG, aka PyTorch Geometric. Dataset: https://snap.stanford.edu/data/cit-HepTh.html

The example code below JSONizes the embedding list of floats to make things go, but I'd like to be able to store the list itself. @mbastian Can you make GEXF support embeddings in the next version?

import json
import os

import numpy as np

# G, all_abstracts, embed_paper_info, file_paper_ids and file_to_networkx_ids
# are defined elsewhere in the notebook (not shown here).

# Embed the abstracts for GNN features. Embedding is a generic approach for retrieval as well.
# Note: NetworkX can't save lists in GEXF format, so we'll JSONize the list & save the embeddings separately.
embedded_abstracts: np.ndarray = None
if os.path.exists("data/embedded_abstracts.npy"):
    # Reuse the cached embeddings if they were already computed
    embedded_abstracts = np.load("data/embedded_abstracts.npy")
else:
    embedded_abstracts = embed_paper_info(all_abstracts, convert_to_tensor=False)
    np.save("data/embedded_abstracts.npy", embedded_abstracts)

for paper_id, emb in zip(file_paper_ids, embedded_abstracts):
    assert emb.shape == (384,)
    # Gephi assumes a list of floats is a time series, so we need to convert to a string
    G.nodes[file_to_networkx_ids[paper_id]]["Embedding-JSON"] = json.dumps(emb.tolist())

Example document:
Its embedding: [-0.5083363652229309, -0.35725411772727966, 0.1389939785003662, -0.1347253918647766, -0.1535784900188446, 0.43154388666152954, 0.15374013781547546, -0.008106844499707222, -0.1662866771221161, -0.15766437351703644, 0.35521116852760315, 0.15607962012290955, 0.6218618750572205, 0.07288412749767303, -0.08790934085845947, -0.145784392952919, 0.14549043774604797, -0.03458674997091293, -0.741215705871582, 0.019919676706194878, -0.2773298919200897, -0.16332964599132538, -0.42131808400154114, 0.06080969050526619, 0.55726158618927, 0.18690286576747894, -0.19952552020549774, 0.23189248144626617, 0.39608946442604065, 0.031538791954517365, 0.4129146337509155, 0.37623560428619385, 0.16398969292640686, 0.09904278814792633, 0.5887687802314758, 0.19061870872974396, -0.020812658593058586, 0.6324356198310852, 0.005971217527985573, 0.2787822186946869, 0.20738601684570312, -1.136680006980896, 0.4140499532222748, 0.7376874685287476, 0.26450657844543457, 0.08141785860061646, -0.529627799987793, -0.07897279411554337, 0.302225261926651, 0.26963791251182556, -0.5572066307067871, 0.022079501301050186, -0.41076093912124634, -0.16617120802402496, -0.014963116496801376, 0.2403220683336258, 0.03146751970052719, -0.514580488204956, 0.02357768639922142, -0.19823256134986877, -0.1633021980524063, 0.14651842415332794, -0.5526030659675598, 0.5041884183883667, 0.20464496314525604, 0.16364993155002594, -0.0379401370882988, -0.16234970092773438, 0.273735910654068, 0.4701267182826996, 0.38202783465385437, 0.6249184608459473, -0.6957732439041138, -0.4264785051345825, 0.06444322317838669, 0.6805640459060669, -0.3116794228553772, 0.009198327548801899, -0.18131123483181, -0.4511978328227997, 0.2052099108695984, -0.7076764106750488, -0.2577372193336487, -0.11397387087345123, 0.004945039749145508, 0.29662612080574036, 0.48335978388786316, 0.16308338940143585, 0.02071310393512249, -0.06133018806576729, 0.3547375500202179, -0.015222515910863876, -0.3296150863170624, 0.27946799993515015, 
0.10797177255153656, 0.5158742070198059, 0.3182218670845032, -0.1535983383655548, 0.6189644932746887, 0.16411934792995453, -0.20841538906097412, -0.09344162046909332, -0.5550981760025024, -0.0629420131444931, -0.5624946355819702, -0.6402942538261414, -0.201442688703537, 0.18017089366912842, 0.27435120940208435, 0.18869590759277344, 0.04372529685497284, -0.3697742521762848, -0.06247770041227341, 0.14726705849170685, -0.5059475302696228, 0.17057615518569946, 0.49116864800453186, 0.303863525390625, 0.7109688520431519, -0.08683305978775024, 0.4489392042160034, 0.8849781155586243, 0.2691556513309479, 0.054163508117198944, 0.20481964945793152, -0.047171857208013535, 0.49669820070266724, 0.3995380997657776, -0.2686813771724701, -0.1840616762638092, -0.03536504507064819, -0.6438066959381104, 0.0884658545255661, -0.049895793199539185, 0.1340586543083191, 0.008303023874759674, 0.12762904167175293, 0.19640912115573883, 0.09768808633089066, -0.17605964839458466, 0.03801923617720604, 0.22554127871990204, -0.0682666227221489, -0.21554642915725708, 0.34073975682258606, -0.1460971236228943, -0.6941462755203247, 0.20569857954978943, 0.5059947967529297, -0.3478425145149231, -0.13772228360176086, -0.06816817820072174, -0.5381731390953064, 0.05074828490614891, 0.06547494232654572, -0.29076358675956726, -0.15378691256046295, 0.2487240433692932, 0.3956683874130249, 0.28119516372680664, -0.36075934767723083, -0.13970033824443817, 0.3972870111465454, 0.24897192418575287, 0.39377814531326294, 0.28017812967300415, 0.5327494740486145, -0.4372592270374298, -0.33479222655296326, 0.06613282114267349, 0.4145204424858093, -0.09375417977571487, 0.006537675857543945, 0.44525378942489624, 0.03501797467470169, -0.2608524560928345, -0.006014466285705566, -0.036333389580249786, -0.537621796131134, 0.18642160296440125, 0.07950431853532791, -0.2662293016910553, -0.24478109180927277, -0.5388363003730774, 0.0674142986536026, 0.006562564522027969, 0.13258269429206848, 0.43928781151771545, 
0.14479145407676697, -0.6222834587097168, -0.33258986473083496, -0.6179389357566833, -0.2406272441148758, 0.014090614393353462, -0.3714263439178467, -0.412462443113327, 0.27592408657073975, 0.0349738746881485, -0.2271711528301239, 0.5821718573570251, -0.36073049902915955, -0.2708200216293335, 0.20686064660549164, -0.23197627067565918, 0.042743708938360214, 0.14470048248767853, -0.024556558579206467, -0.6748477816581726, -0.16571849584579468, 0.20108835399150848, -0.07298190146684647, -0.5514233112335205, -0.06006268784403801, -0.04524163901805878, 0.012701082974672318, 0.41854313015937805, -0.23032033443450928, -0.7118092179298401, -0.3731357455253601, -0.038922086358070374, 0.11315789818763733, -0.19573336839675903, 0.5248740911483765, -0.8068038821220398, -0.3490540087223053, 0.6316984295845032, -0.24007821083068848, 0.19816532731056213, 0.02993026375770569, -0.09062369167804718, 0.32186055183410645, 0.41794851422309875, 0.504360556602478, 0.1191108375787735, 0.3482481837272644, 0.15071724355220795, 0.05511059984564781, -0.14041967689990997, 0.18092676997184753, 0.02112441509962082, 0.1610906720161438, 0.03389054536819458, -0.15241602063179016, -0.1575293093919754, -0.12149085104465485, 0.5990638136863708, -0.7717245817184448, -0.04483901336789131, 0.19884341955184937, 0.10792878270149231, 0.10256698727607727, -0.5565033555030823, 0.029021425172686577, 0.16152621805667877, 0.3552182912826538, -0.19814762473106384, 0.19467827677726746, -0.1417803019285202, -0.4221956431865692, 0.29962822794914246, 0.6577330827713013, 0.17069461941719055, 0.28435853123664856, 0.21476049721240997, 0.8059138059616089, -0.048171523958444595, -0.16125980019569397, -0.07039059698581696, -0.09816092252731323, -0.1514281928539276, 0.24609962105751038, -0.0849226862192154, 0.09835521876811981, 0.32943952083587646, -0.25816798210144043, -0.06863641738891602, 0.049438249319791794, 0.025209199637174606, 0.08355040848255157, 0.21580441296100616, -0.41988956928253174, 0.07675647735595703, 
-0.14934852719306946, -0.4311261475086212, -0.3233030140399933, -0.19432544708251953, 0.09847439080476761, -0.24860693514347076, 0.1917468160390854, -0.04119320958852768, 0.036722056567668915, -0.21387654542922974, -0.0030690915882587433, -0.13641610741615295, 0.012929495424032211, 0.3078806400299072, -0.34233883023262024, 0.045709915459156036, 0.11729196459054947, 0.13548825681209564, -0.3334689736366272, 0.29789718985557556, 0.12125445902347565, 0.13667646050453186, -0.6150417327880859, 0.0011353977024555206, -0.012479695491492748, 0.2989681363105774, 0.3227967321872711, -0.052288718521595, 0.3666779100894928, -0.2939664423465729, 0.12823599576950073, -0.10072129964828491, -0.176337331533432, 0.2739074230194092, -0.26633912324905396, 0.43988385796546936, -0.09746330976486206, -0.2637675702571869, 0.02734220400452614, -0.20562905073165894, -0.6480699777603149, 0.1781962364912033, 0.17634740471839905, -0.07000317424535751, 0.3828813135623932, -0.6547756195068359, 0.15146368741989136, 0.03579747676849365, -0.007166197523474693, 0.15733617544174194, 0.046128399670124054, -0.7098756432533264, 0.22380834817886353, 0.3733425438404083, -0.7145859003067017, 0.18655464053153992, -0.4990553557872772, -0.2336399257183075, -0.3922877907752991, -0.12291472405195236, 0.3854149878025055, -0.3202831447124481, -0.0007252912037074566, 0.34592050313949585, -0.07235311716794968, 0.5941299796104431, -0.04594670981168747, -0.10191763192415237, 0.15881231427192688, 0.38152000308036804, 0.4613525867462158, 0.07394368201494217, -0.031655725091695786, -0.1491849571466446, -0.4769206941127777, 0.11919506639242172, 0.52707439661026, 0.12066393345594406, -0.3855656683444977, 0.0897144302725792, -0.015513844788074493, 0.8330134153366089, 0.44915086030960083, 0.07939314842224121, -0.387637197971344, 0.21580561995506287, 0.18721160292625427, -0.3700406849384308, -0.1043381541967392, 0.19310817122459412, 0.116238072514534, -0.40746667981147766, 0.7291035056114197, -0.43795716762542725, 
0.22398078441619873, -0.24590949714183807, -0.06679191440343857, -0.5940830111503601, -0.018695345148444176, -0.33444738388061523, -0.09381847828626633, 0.18644794821739197]
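For readers who want to try the JSON workaround end to end, here is a minimal sketch of the round trip. It assumes NetworkX's `write_gexf` and `read_gexf` with their default settings; the node names and the three-element vectors are hypothetical stand-ins for the real 384-dimension embeddings.

```python
import io
import json

import networkx as nx

# Hypothetical two-node graph standing in for the citation graph.
G = nx.Graph()
G.add_edge("paper-a", "paper-b")

# Short stand-in vectors; in the thread these would come from emb.tolist().
embeddings = {
    "paper-a": [-0.5083363652229309, -0.35725411772727966, 0.1389939785003662],
    "paper-b": [0.0, 0.5, -1.0],
}

# Store each vector as a JSON string so the GEXF writer sees a plain string attribute.
for node, vector in embeddings.items():
    G.nodes[node]["Embedding-JSON"] = json.dumps(vector)

# Write to an in-memory buffer instead of a file on disk, then read it back.
buf = io.BytesIO()
nx.write_gexf(G, buf)
buf.seek(0)
H = nx.read_gexf(buf)

# Decode the JSON strings back into lists of floats.
decoded = {node: json.loads(data["Embedding-JSON"]) for node, data in H.nodes(data=True)}
```

The cost of this approach is that other tools reading the file see an opaque string rather than a numeric list, so each consumer has to know to call `json.loads` on that attribute.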
Oh uh, supporting embeddings in Gephi is going to be essential to keeping it relevant as graph AI and visualization merge and computing becomes more GPU-centric.
Thanks @rjurney, I would be happy to chat about what would make it easier to handle embeddings in Gephi. The GEXF format supports float lists, so if they are not properly imported in Gephi it must be a bug. Compatibility with NetworkX is surely also important. Let me investigate; I bet we don't have many unit tests around list import, as it hasn't been very popular in the past.
Cool, I will share my notebook with you so you can see. It isn't open source at this point but I trust you ;) Another issue is that integer node IDs become strings. I have had to cast them back to integers. A lot of Python tools around
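On the node ID point, a minimal sketch of one way to get integers back without casting by hand, assuming the `node_type` parameter of NetworkX's `read_gexf`; the three-node path graph is a hypothetical example.

```python
import io

import networkx as nx

# Hypothetical graph with integer node IDs 0, 1, 2.
G = nx.path_graph(3)

# Write to an in-memory buffer; GEXF stores node IDs as XML attribute text.
buf = io.BytesIO()
nx.write_gexf(G, buf)

# Default read: IDs come back as strings.
buf.seek(0)
as_strings = nx.read_gexf(buf)

# Passing node_type=int asks the reader to convert node IDs while parsing.
buf.seek(0)
as_ints = nx.read_gexf(buf, node_type=int)
```

This only helps when every ID in the file is a valid integer literal; mixed ID types would still need custom handling.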
What was the result of this discussion?
Also, does the pickle file format have the same issue or not?
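A quick way to check the pickle question is a small sketch like the one below, which uses Python's standard `pickle` module directly on a NetworkX graph; the node IDs and the `embedding` attribute name are hypothetical. Pickle serializes Python objects as they are, so list attributes and integer IDs should survive, at the cost of the file being readable only from Python.

```python
import pickle

import networkx as nx

# Hypothetical graph with integer IDs and a list-of-floats node attribute.
G = nx.Graph()
G.add_node(7, embedding=[0.1, -0.2, 0.3])
G.add_edge(7, 8)

# Round-trip through pickle in memory.
restored = pickle.loads(pickle.dumps(G))

# Inspect what came back.
restored_embedding = restored.nodes[7]["embedding"]
restored_ids = sorted(restored.nodes)
```

Pickle files should only be loaded from trusted sources, and they are not a substitute for GEXF when the graph has to be opened in Gephi.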
NetworkX is the go-to Python library for graph manipulation and already has GEXF import/export.
Definition of done