Add option to arc.write to specify the OBJECTID/unique identifier field. #58
Comments
@jacpete thank you for the detailed feature request. Regarding the need for this enhancement: are there any workflows that require you to have specific IDs in the OBJECTID field? We currently create these from scratch to make sure that display, navigation, and geoprocessing functionality works seamlessly on the data. Allowing a hard-coded Object ID exposes the output feature class to corruption and may make ArcGIS Pro analysis on it impossible unless you recreate the Object IDs. Here are our specs for ObjectIDs and their functions.
I read through the linked documentation and I think the functionality I would like to implement is the highlighted bullet in the screenshot below:
To do this we would add a fourth check, which I forgot in my original post, to ensure that the field has no NULL/NA values. R's native integer type is already 32-bit, so coercion to the integer class would handle that requirement.
My main issue is that arc.write() creates a new field when it's not needed and renames the original unique identifier field, so if a user like me loaded the data back in and wanted to do a join on OBJECTID, they would get incorrect values because they should be joining on OBJECTID_1. I know the arcpy/arcgis packages in Python handle this better and won't create a new OBJECTID field in the same circumstance (example Python code below). The Python code handles the designated OID field correctly the entire time.
Example
I am including a reproducible example using both Python and R. Running the code will save the data to a new geodatabase called testTrees.gdb at the root of your C drive. Feel free to change the path if needed.
Python
import os
import re
import arcpy
import arcpy.management
import arcgis.features
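def getFeatureSet(url, where = '1=1', fields = "*", objectIDs = None):
    # Query the feature layer and return the matching features with geometry
    fl = arcgis.features.FeatureLayer(url)
    # A list of object IDs is passed to the query as a comma-separated string
    if isinstance(objectIDs, list):
        objectIDs = ','.join([str(ID) for ID in objectIDs])
    fs = fl.query(where=where, out_fields=fields, return_geometry=True, object_ids=objectIDs)
    return fs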
def checkGdbExits(output):
    # Create the file geodatabase named in the output path if it does not exist yet
    output = os.path.normpath(output)
    if re.search(r'\.gdb', output) is not None:
        pathParts = output.split(os.sep)
        # Index of the first path component that names a .gdb
        gdbID = list(map(lambda x: re.search(r'\.gdb', x) is not None, pathParts)).index(True)
        gdbPath = os.sep.join(pathParts[:gdbID+1])
        if not os.path.exists(gdbPath):
            arcpy.management.CreateFileGDB(out_folder_path=os.path.split(gdbPath)[0], out_name=os.path.split(gdbPath)[1])
    return output
def saveFeatureSet(fs, output):
    # Make sure the output .gdb exists if the output path contains one
    output = checkGdbExits(output)
    # Save the feature set
    fs.save(save_location=os.path.split(output)[0], out_name=os.path.split(output)[1])
def scrapeESRIServiceLayer(url, output, where = "1=1", fields = "*", objectIDs = None):
    # Retrieve the feature set
    fs = getFeatureSet(url = url, where = where, fields = fields, objectIDs=objectIDs)
    # Save the feature set
    saveFeatureSet(fs, output)
scrapeESRIServiceLayer(
    url = "https://services.arcgis.com/V6ZHFr6zdgNZuVG0/arcgis/rest/services/Landscape_Trees/FeatureServer/0",
    output = "C:\\testTrees.gdb\\trees_arcpy"
)
R
And then run the R code, which also shows the difference between the two files:
library(dplyr)
library(arcgisbinding)
arcgisbinding::arc.check_product()
## product: ArcGIS Pro (12.8.0.29751)
## license: Advanced
## version: 1.0.1.243
trees <- arcgisbinding::arc.select(arcgisbinding::arc.open("https://services.arcgis.com/V6ZHFr6zdgNZuVG0/arcgis/rest/services/Landscape_Trees/FeatureServer/0"))
arcgisbinding::arc.write("C:/testTrees.gdb/trees_arcgisbindings", data = trees, validate = TRUE, overwrite = TRUE)
#Load data in to look at the difference
trees_arcgisbindings <- arc.data2sf(arc.select(arc.open("C:/testTrees.gdb/trees_arcgisbindings")))
trees_arcpy <- arc.data2sf(arc.select(arc.open("C:/testTrees.gdb/trees_arcpy")))
#What columns are different
names(trees_arcgisbindings)[!names(trees_arcgisbindings) %in% names(trees_arcpy)]
## [1] "OBJECTID"
names(trees_arcpy)[!names(trees_arcpy) %in% names(trees_arcgisbindings)]
## character(0)
#Print the first few columns
dplyr::select(trees_arcgisbindings, 1:5)
## Simple feature collection with 1148 features and 5 fields
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: -9177809 ymin: 4247005 xmax: -9176814 ymax: 4247759
## Projected CRS: WGS 84 / Pseudo-Mercator
## First 10 features:
## OBJECTID FID Tree_ID Collected Crew geom
## 1 1 1 102 2012-10-04 19:00:00 Linden+ Forrest+ Johnny POINT (-9177312 4247151)
## 2 2 2 103 2012-10-04 19:00:00 Linden+ Forrest+ Johnny POINT (-9177303 4247155)
## 3 3 3 104 2012-10-04 19:00:00 Linden+ Forrest+ Johnny POINT (-9177382 4247204)
## 4 4 4 105 2012-10-04 19:00:00 Linden+ Forrest+ Johnny POINT (-9177390 4247219)
## 5 5 5 107 2012-10-07 19:00:00 Linden+ Adele+ Ed POINT (-9177392 4247235)
## 6 6 6 108 2012-10-07 19:00:00 Linden+ Adele+ Ed POINT (-9177406 4247253)
## 7 7 7 109 2012-10-07 19:00:00 Linden+ Adele+ Ed POINT (-9177411 4247257)
## 8 8 8 110 2012-10-07 19:00:00 Linden+ Adele+ Ed POINT (-9177415 4247255)
## 9 9 9 111 2012-10-07 19:00:00 Linden+ Adele+ Ed POINT (-9177401 4247271)
## 10 10 10 112 2012-10-09 19:00:00 Linden+ Joe+ Casey POINT (-9177416 4247277)
dplyr::select(trees_arcpy, 1:5)
## Simple feature collection with 1148 features and 5 fields
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: -9177809 ymin: 4247005 xmax: -9176814 ymax: 4247759
## Projected CRS: WGS 84 / Pseudo-Mercator
## First 10 features:
## FID Tree_ID Collected Crew Status geom
## 1 1 102 2012-10-04 19:00:00 Linden+ Forrest+ Johnny P POINT (-9177312 4247151)
## 2 2 103 2012-10-04 19:00:00 Linden+ Forrest+ Johnny P POINT (-9177303 4247155)
## 3 3 104 2012-10-04 19:00:00 Linden+ Forrest+ Johnny P POINT (-9177382 4247204)
## 4 4 105 2012-10-04 19:00:00 Linden+ Forrest+ Johnny P POINT (-9177390 4247219)
## 5 5 107 2012-10-07 19:00:00 Linden+ Adele+ Ed P POINT (-9177392 4247235)
## 6 6 108 2012-10-07 19:00:00 Linden+ Adele+ Ed U POINT (-9177406 4247253)
## 7 7 109 2012-10-07 19:00:00 Linden+ Adele+ Ed U POINT (-9177411 4247257)
## 8 8 110 2012-10-07 19:00:00 Linden+ Adele+ Ed P POINT (-9177415 4247255)
## 9 9 111 2012-10-07 19:00:00 Linden+ Adele+ Ed I POINT (-9177401 4247271)
## 10 10 112 2012-10-09 19:00:00 Linden+ Joe+ Casey P POINT (-9177416 4247277)
The Python version correctly identifies the FID column as the ObjectID field and does not create a new field called OBJECTID, while the R version does. While my suggested functionality wouldn't automatically prevent the creation of the new OBJECTID field, it would allow a user to explicitly request that FID become the ObjectID field in the geodatabase with a command like:
trees <- arcgisbinding::arc.select(arcgisbinding::arc.open("https://services.arcgis.com/V6ZHFr6zdgNZuVG0/arcgis/rest/services/Landscape_Trees/FeatureServer/0"))
arcgisbinding::arc.write("C:/testTrees.gdb/trees_arcgisbindings", data = trees, validate = TRUE, overwrite = TRUE, object_id = "FID")
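To make the join problem from this comment concrete, here is a minimal plain-Python sketch. The records, the `inspections` lookup, and the `join` helper are all made up for illustration; the "write" step only imitates the renumbering behaviour described above and does not call arcgisbinding or arcpy.

```python
# Hypothetical source records keyed by their original identifier.
source = [
    {"OBJECTID": 3, "species": "oak"},
    {"OBJECTID": 1, "species": "elm"},
    {"OBJECTID": 2, "species": "ash"},
]

# Imitate a write that regenerates OBJECTID from 1 and keeps the
# original values under the new name OBJECTID_1.
written = [
    {"OBJECTID": i, "OBJECTID_1": row["OBJECTID"], "species": row["species"]}
    for i, row in enumerate(source, start=1)
]

# A separate (hypothetical) table that references the ORIGINAL identifiers.
inspections = {1: "healthy", 2: "pruned", 3: "removed"}

def join(rows, key):
    """Look up each row's inspection status using the given key field."""
    return {row["species"]: inspections[row[key]] for row in rows}

wrong = join(written, "OBJECTID")    # silently matches the regenerated IDs
right = join(written, "OBJECTID_1")  # matches the original IDs

print(wrong)
print(right)
```

The join on OBJECTID raises no error, it just pairs rows with the wrong records, which is why the rename is easy to miss after reading the data back in.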
Has there been any progress on this? I recently had an issue with OBJECTIDs and a legacy database, and if this had been an option, I would have been able to complete the workflow in R with the arcgisbinding package.
Description
This is a feature request to allow the specification of the OBJECTID field when using arc.write(). Currently a new field called OBJECTID is added every time you save a new feature. If you already have an OBJECTID field in the dataset, it is renamed to OBJECTID_1 and a new OBJECTID field is still created.
I would hope for an additional parameter in arc.write(), named something like object_id, where you could specify a field name in your dataset to be used as the unique row identifier. The argument would take a field name as a string and would then check that the specified name exists in data, that the data type of the column is integer or can be coerced to integer (if it is numeric), and that each row has a unique value. If any of these checks fail, it could give a warning and default to creating the OBJECTID field (as it does currently), or alternatively error out with a message about which check failed. The default for this new parameter could be NULL, which would mimic the current behavior for backwards compatibility. This would allow you to specify an existing field name like fid or OBJECTID as the unique identifier without forcing the creation of a new column in the dataset.
Example of Current Action
Example of how new parameter would function
This is a theoretical example and will not run.
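Separately, the validation checks proposed in this request (field exists, no NULL/NA values, integer or coercible to integer, within the 32-bit range, unique per row) could be sketched roughly as below. This is plain Python over a list of dicts, not arcgisbinding code; `check_object_id_field` and its messages are hypothetical names used only to illustrate the logic.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

def check_object_id_field(rows, field):
    """Return descriptions of failed checks; an empty list means `field` qualifies.

    `rows` is a list of dicts, one per feature (a stand-in for the data).
    """
    if any(field not in row for row in rows):
        return [f"field '{field}' does not exist in the data"]
    values = [row[field] for row in rows]
    problems = []
    if any(v is None for v in values):
        problems.append("contains NULL/NA values")
    present = [v for v in values if v is not None]
    # bool is a subclass of int in Python, so it is excluded explicitly
    if any(isinstance(v, bool) or not isinstance(v, (int, float)) for v in present):
        problems.append("is not numeric")
        return problems
    # Numeric values must be whole numbers to be coercible to integer
    if any(not float(v).is_integer() for v in present):
        problems.append("has non-integer values")
        return problems
    if any(not (INT32_MIN <= v <= INT32_MAX) for v in present):
        problems.append("outside the 32-bit integer range")
    if len(set(present)) != len(present):
        problems.append("values are not unique")
    return problems

rows = [
    {"FID": 1, "dup": 1, "flt": 1.0, "na": 1, "txt": "a"},
    {"FID": 2, "dup": 1, "flt": 2.5, "na": None, "txt": "b"},
]
for name in ["FID", "dup", "flt", "na", "txt", "missing"]:
    print(name, check_object_id_field(rows, name))
```

Whether a failed check should warn and fall back to generating OBJECTID, or raise an error, is left open here, as it is in the request.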