When the same step permits any of multiple transformations that rely on different parameter sets, problems of intuition and implementation arise. Maybe the latter could be creatively resolved, but the former indicate that this should not be required. Therefore, different steps step_phom_*() should be implemented for PH transformations that rely on different parameter sets.
The steps could correspond to packages/engines, to filtrations, to data format, or to something else. Here are some relevant considerations:
packages/engines, e.g., step_phom_ripserr(): For most users, the engine is the last consideration. Moreover, parameter sets for the same mathematical transformation differ only syntactically (and maybe by availability) between packages/engines. Separating steps by engine would therefore not be appropriate.
filtrations, e.g., step_phom_vietoris_rips(): This is the most mathematically descriptive and prescriptive option. However, it fails to take advantage of the fact that most or all parameters are shared between different filtrations on the same data, e.g., Vietoris–Rips, Čech, and alpha.
data format, e.g., step_phom_dist(): This would discourage the bad practice of having, say, a list-column of point clouds in different formats (distance matrix, coordinate matrix, data frame). However, like the package/engine, this is not most users' first consideration. Indeed, the data format might change within or between workflows depending on up- or down-stream factors, and it would be a hassle to have to replace the step in response to such changes. It would also cause confusion when different mathematical objects that require different filtrations are encoded in the same format, e.g., is a numeric matrix the coordinate matrix of a point cloud or the intensity matrix of a 2D image?
I plan to separate the steps by the mathematical object that the data encode, e.g., step_phom_point_cloud(), step_phom_image(), and step_phom_function(). This is, I expect, most users' first consideration, and it also corresponds closely to distinct parameter sets. Indeed, engine deployment for the simplicial complex layers in {ggtda} is exclusively for point clouds, and parameters like diameter_max are suitable there that would not be for, say, cubical complexes. (Though deployment for the computation of persistent homology on list-columns of data sets should be extended to other data types, as it implicitly is for time series.)
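As a rough illustration of why one step per mathematical object can absorb several data formats, here is a minimal base R sketch. The helper as_point_cloud_dist() is hypothetical and not part of any package; it covers only the dist, coordinate matrix, and data.frame formats and leaves time series aside.

```r
# Hypothetical helper (an assumption for illustration, not an existing API):
# coerce one point cloud, in any of several formats, to a `dist` object.
as_point_cloud_dist <- function(x) {
  if (inherits(x, "dist")) {
    return(x)
  }
  if (is.data.frame(x)) {
    # Assumes every column is a numeric coordinate.
    x <- as.matrix(x)
  }
  if (is.matrix(x) && is.numeric(x)) {
    # Under a point cloud step, a numeric matrix is read as coordinates
    # (rows are points), never as image intensities.
    return(stats::dist(x))
  }
  stop("Unsupported point cloud format: ", paste(class(x), collapse = "/"))
}

# The same three points in three formats.
xy <- cbind(x = c(0, 3, 0), y = c(0, 0, 4))
clouds <- list(xy, as.data.frame(xy), stats::dist(xy))

# Each element is normalized on the way in, whichever format it arrives in.
dists <- lapply(clouds, as_point_cloud_dist)
vapply(dists, max, numeric(1))
```

Because the step itself fixes the interpretation (point cloud), the format can change up- or down-stream without the step having to be replaced.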
Feedback is welcome.
EDIT: Here is a preliminary scope checklist:
point_cloud (dist, coordinate matrix or data.frame, time series); {ripserr}, {TDA}
lattice (array, image matrix; could be extended to raster classes); {ripserr}
function (named or anonymous function, formula; use rlang::as_function())
reeb_graph or just reeb (igraph with numeric height attribute, dendrogram)
filtration (Rcpp_SimplexTree); {simplextree}, {TDA}
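For the function scope, a small sketch of how rlang::as_function() could put the three input styles on equal footing. The names sq_named, specs, and fs are illustrative only, and how a step would then evaluate the function is left open.

```r
# Three ways a user might specify the same function.
sq_named <- function(x) x^2
specs <- list(
  named     = sq_named,
  anonymous = function(x) x^2,
  formula   = ~ .x^2
)

# as_function() returns functions unchanged and converts one-sided
# formulas into functions whose first argument is available as `.x`.
fs <- lapply(specs, rlang::as_function)

# All three can now be called the same way.
vapply(fs, function(f) f(3), numeric(1))
```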
Numeric arrays encode more than images via light intensity; for example, the volcano data set encodes topography via elevation. I don't know of a better general term for this data type than "grid".
EDIT: As of 6a2f786, this scope has been renamed to lattice.
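To make the format ambiguity concrete, here is a minimal base R sketch showing that volcano and a coordinate matrix cannot be told apart by class alone, which is why the scope (lattice versus point_cloud) has to be chosen through the step rather than inferred from the format. The object names elev and pts are illustrative.

```r
# `volcano` ships with base R (package datasets): a matrix of elevations.
elev <- datasets::volcano
stopifnot(is.matrix(elev), is.numeric(elev))
dim(elev)

# A coordinate matrix of a point cloud is also a plain numeric matrix.
set.seed(1)
pts <- cbind(x = runif(20), y = runif(20))
stopifnot(is.matrix(pts), is.numeric(pts))

# Nothing in the class distinguishes a lattice from a point cloud.
identical(class(elev), class(pts))
```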