Issue with tibble format in sits #1224

safalabolo · 2024-10-15T15:30:33Z

I would like to use sits starting from my raster data and ground truth, but I am encountering some issues. In particular, it seems that the tibble I created (tibble_result), although very similar to the one provided in the documentation (cerrado_2classes), is not in a format that sits can process.
Below is the code.
Thank you for your feedback.

Load necessary packages

library(sf) # For handling spatial data (shapefiles)
library(terra) # For handling raster data
library(tidyverse) # For data manipulation and visualization

1. Load the shapefile containing points

shapefile_path <- "D:/2023_agropontino/03_EO/class/input/shape_sub/gt_erbaceo_distretti_maggio_agosto_rev2.shp"
points <- st_read(shapefile_path) # Read the shapefile into a spatial dataframe
Reading layer gt_erbaceo_distretti_maggio_agosto_rev2' from data source D:\2023_agropontino\03_EO\class\input\shape_sub\gt_erbaceo_distretti_maggio_agosto_rev2.shp' using driver `ESRI Shapefile'
Simple feature collection with 1987 features and 10 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 319366.8 ymin: 4569939 xmax: 353301.1 ymax: 4604548
Projected CRS: WGS 84 / UTM zone 33N

2. Create a data frame for correspondence mapping

corrispondenze <- data.frame(

class2 = c(0, 2, 10, 11, 12, 13, 14, 15, 16, 17, 22),
nome = c("nudo", "kiwi", "vernino", "mais1", "ortive1", "mais2", "medica", "doppia", "ortive2", "asparago", "erbai")
)

3. Join the correspondence dataframe with the shapefile

points <- points %>%

left_join(corrispondenze, by = "class2") # Merge based on the 'class2' column

4. Verify the addition of the 'nome' column

print(head(points)) # Display the first few rows to confirm the merge
Simple feature collection with 6 features and 11 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 330199.6 ymin: 4573826 xmax: 352157.7 ymax: 4595130
Projected CRS: WGS 84 / UTM zone 33N
Source gid id name crop crop2 date note class class2 nome geometry
1 gt_20230810 107 1300 WPT 2157 residui grano residui grano 20230810 10 10 vernino POINT (330456.3 4594962)
2 gt_20230810 183 1376 WPT 2233 residui grano residui grano 20230810 0 0 nudo POINT (330199.6 4595130)
3 gt_20230810 231 1424 WPT 1143 giardino in irrigazione giardino irriguo 20230810 x 5 22 erbai POINT (351926.8 4573826)
4 gt_20230810 234 1427 WPT 1146 residui grano residui grano 20230810 0 0 nudo POINT (351995 4574030)
5 gt_20230810 237 1430 WPT 1149 zucchine zucchine 20230810 15 15 doppia POINT (352157.7 4574082)
6 gt_20230810 238 1431 WPT 1150 ortaggi ortaggi 20230810 17 17 asparago POINT (352142 4573974)

4. (Optional) Save the new shapefile with the added column

#output_shapefile_path <- "D:/2023_agropontino/03_EO/class/input/shape_sub/gt_erbaceo_distretti_maggio_agosto_rev2_modificato.shp"
#st_write(points, output_shapefile_path) # Save the modified points with the new column

5. Define the input directory and load NDVI raster files

indir <- 'D:/2023_agropontino/03_EO/class/input/raster/ndvi_ws_asIn_conf/'
inFile <- list.files(path = indir, pattern = glob2rx("*.tif$"), full.names = TRUE, recursive = FALSE) # List all .tif files

3. Extract dates from the .tif filenames

dates_ndvi <- as.Date(substring(basename(inFile), 1, 8), "%Y%m%d") # Convert the extracted substrings to Date objects

6. Use terra::rast to create a raster stack (SpatRaster)

ndvi_stack <- rast(inFile) # Load raster files into a SpatRaster object

7. Ensure the projection of points matches that of the raster

points <- st_transform(points, crs = crs(ndvi_stack)) # Transform the CRS of points to match the raster's

8. Precise extraction of NDVI values from points for each date

ndvi_values <- terra::extract(ndvi_stack, vect(points), fun = mean, na.rm = TRUE) # Extract mean NDVI values at point locations

7. Create a tibble for each point

tibble_result <- tibble(

latitude = st_coordinates(points)[, 2], # Get latitude from coordinates
longitude = st_coordinates(points)[, 1], # Get longitude from coordinates
start_date = min(dates_ndvi), # Start date (first observation)
end_date = max(dates_ndvi), # End date (last observation)
label = as.character(points$nome), # Labels from points
cube = "S2" # Dataset name
)

9. Add extracted NDVI values to the tibble

Convert ndvi_values to a dataframe and remove the ID column

ndvi_df <- as.data.frame(ndvi_values)
colnames(ndvi_df) <- c("ID", as.character(dates_ndvi)) # Rename columns with dates
ndvi_df <- ndvi_df[, -1] # Remove the ID column

Create a new tibble to combine NDVI information

tibble_ndvi <- tibble(

Index = rep(dates_ndvi, each = nrow(ndvi_df)), # Repeat dates for each point
NDVI = as.vector(t(ndvi_df)) # Transpose NDVI values into a vector
)

Add NDVI values to the resulting tibble

tibble_result <- tibble_result %>%

mutate(time_series = map(1:nrow(ndvi_df), ~ tibble_ndvi[seq(.x, nrow(tibble_ndvi), by = nrow(points)), ])) # Create time series for each point

print(tibble_result) # Print the result tibble

A tibble: 1,987 × 7

latitude longitude start_date end_date label cube time_series

1 4594962. 330456. 2023-01-05 2023-12-21 vernino S2 <tibble [42 × 2]>
2 4595130. 330200. 2023-01-05 2023-12-21 nudo S2 <tibble [42 × 2]>
3 4573826. 351927. 2023-01-05 2023-12-21 erbai S2 <tibble [42 × 2]>
4 4574030. 351995. 2023-01-05 2023-12-21 nudo S2 <tibble [42 × 2]>
5 4574082. 352158. 2023-01-05 2023-12-21 doppia S2 <tibble [42 × 2]>
6 4573974. 352142. 2023-01-05 2023-12-21 asparago S2 <tibble [42 × 2]>
7 4573924. 352174. 2023-01-05 2023-12-21 doppia S2 <tibble [42 × 2]>
8 4573893. 352402. 2023-01-05 2023-12-21 nudo S2 <tibble [42 × 2]>
9 4574184. 352539. 2023-01-05 2023-12-21 doppia S2 <tibble [42 × 2]>
10 4574203. 352513. 2023-01-05 2023-12-21 asparago S2 <tibble [42 × 2]>

ℹ 1,977 more rows

ℹ Use `print(n = ...)` to see more rows

tibble_result[[7]][[1]] # Access specific entry in the tibble

A tibble: 42 × 2

Index NDVI

1 2023-01-05 0.221
2 2023-01-30 0.185
3 2023-02-04 0.807
4 2023-02-14 0.960
5 2023-03-16 0.309
6 2023-03-21 0.00304
7 2023-03-26 0.768
8 2023-04-10 0.799
9 2023-04-20 0.346
10 2023-04-25 0.120

ℹ 32 more rows

ℹ Use `print(n = ...)` to see more rows

Load additional package for time series analysis

library(sits)

Check data types of the tibble

sapply(tibble_result, class) # Check classes of columns in the result tibble
latitude longitude start_date end_date label cube time_series
"numeric" "numeric" "Date" "Date" "character" "character" "list"
sapply(cerrado_2classes, class) # Check classes in another dataset
longitude latitude start_date end_date label cube time_series
"numeric" "numeric" "Date" "Date" "character" "character" "list"

Show the first few rows of the result tibble

head(tibble_result) # Display the first rows of the tibble

A tibble: 6 × 7

latitude longitude start_date end_date label cube time_series

1 4594962. 330456. 2023-01-05 2023-12-21 vernino S2 <tibble [42 × 2]>
2 4595130. 330200. 2023-01-05 2023-12-21 nudo S2 <tibble [42 × 2]>
3 4573826. 351927. 2023-01-05 2023-12-21 erbai S2 <tibble [42 × 2]>
4 4574030. 351995. 2023-01-05 2023-12-21 nudo S2 <tibble [42 × 2]>
5 4574082. 352158. 2023-01-05 2023-12-21 doppia S2 <tibble [42 × 2]>
6 4573974. 352142. 2023-01-05 2023-12-21 asparago S2 <tibble [42 × 2]>

head(cerrado_2classes) # Display the first rows of the other dataset

A tibble: 6 × 7

longitude latitude start_date end_date label cube time_series

1 -54.2 -14.0 2000-09-13 2001-08-29 Cerrado MOD13Q1 <tibble [23 × 3]>
2 -54.2 -14.0 2001-09-14 2002-08-29 Cerrado MOD13Q1 <tibble [23 × 3]>
3 -54.2 -14.0 2002-09-14 2003-08-29 Cerrado MOD13Q1 <tibble [23 × 3]>
4 -54.2 -14.0 2003-09-14 2004-08-28 Cerrado MOD13Q1 <tibble [23 × 3]>
5 -54.2 -14.0 2004-09-13 2005-08-29 Cerrado MOD13Q1 <tibble [23 × 3]>
6 -54.2 -14.0 2005-09-14 2006-08-29 Cerrado MOD13Q1 <tibble [23 × 3]>

Perform k-fold validation using Random Forest on the results

rfor_validate <- sits_kfold_validate(

samples = tibble_result, # Use the result tibble as samples
folds = 5, # Number of folds for validation
ml_method = sits_rfor(), # Specify the machine learning method (Random Forest)
multicores = 5 # Number of cores for parallel processing
)
Error in purrr::map():
ℹ In index: 1.
Caused by error in UseMethod():
! su un oggetto di classe ".predictors" è stato usato un metodo non applicabile per "c('tbl_df', 'tbl', 'data.frame')"
Run rlang::last_trace() to see where the error occurred.

The text was updated successfully, but these errors were encountered:

gilbertocamara assigned OldLipe Oct 21, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Issue with tibble format in sits #1224

Issue with tibble format in sits #1224

safalabolo commented Oct 15, 2024

Load necessary packages

1. Load the shapefile containing points

2. Create a data frame for correspondence mapping

3. Join the correspondence dataframe with the shapefile

4. Verify the addition of the 'nome' column

4. (Optional) Save the new shapefile with the added column

5. Define the input directory and load NDVI raster files

3. Extract dates from the .tif filenames

6. Use terra::rast to create a raster stack (SpatRaster)

7. Ensure the projection of points matches that of the raster

8. Precise extraction of NDVI values from points for each date

7. Create a tibble for each point

9. Add extracted NDVI values to the tibble

Convert ndvi_values to a dataframe and remove the ID column

Create a new tibble to combine NDVI information

Add NDVI values to the resulting tibble

Load additional package for time series analysis

Check data types of the tibble

Show the first few rows of the result tibble

Perform k-fold validation using Random Forest on the results

Issue with tibble format in sits #1224

Issue with tibble format in sits #1224

Comments

safalabolo commented Oct 15, 2024

Load necessary packages

1. Load the shapefile containing points

2. Create a data frame for correspondence mapping

3. Join the correspondence dataframe with the shapefile

4. Verify the addition of the 'nome' column

4. (Optional) Save the new shapefile with the added column

5. Define the input directory and load NDVI raster files

3. Extract dates from the .tif filenames

6. Use terra::rast to create a raster stack (SpatRaster)

7. Ensure the projection of points matches that of the raster

8. Precise extraction of NDVI values from points for each date

7. Create a tibble for each point

9. Add extracted NDVI values to the tibble

Convert ndvi_values to a dataframe and remove the ID column

Create a new tibble to combine NDVI information

Add NDVI values to the resulting tibble

A tibble: 1,987 × 7

ℹ 1,977 more rows

ℹ Use print(n = ...) to see more rows

A tibble: 42 × 2

ℹ 32 more rows

ℹ Use print(n = ...) to see more rows

Load additional package for time series analysis

Check data types of the tibble

Show the first few rows of the result tibble

A tibble: 6 × 7

A tibble: 6 × 7

Perform k-fold validation using Random Forest on the results

ℹ Use `print(n = ...)` to see more rows

ℹ Use `print(n = ...)` to see more rows