Create `mica` from example dataset in Camtrap DP #220

peterdesmet · 2023-06-20T09:11:03Z

Rather than maintaining our own example dataset in inst/extdata, I suggest creating mica from the example dataset in Camtrap DP (also derived from the mica project), which lives at https://github.com/tdwg/camtrap-dp/tree/main/example. It is versioned and valid against the format, so https://github.com/tdwg/camtrap-dp/tree/1.0-rc.1/example follows the 1.0-rc.1 specs. It is also complete and covers a number of use cases.

- Delete https://github.com/inbo/camtraptor/tree/main/inst/extdata/mica: use Camtrap DP dataset instead
  - Update read_camtrap_dp() tests to make use of a temp download: 10 fails
- Delete https://github.com/inbo/camtraptor/tree/main/inst/extdata/mica_parsing_issues: this is only used to showcase problems() in read_camtrap_dp(), but as discussed with @damianooldoni it's not worth it for that. problems() could be explained with plain text in the example
  - Create temp for read_camtrap_dp() parsing issues test: 1 fail
- Decide what to do with https://github.com/inbo/camtraptor/tree/main/inst/extdata/mica_zenodo_5590881: this larger dataset is annoying to update and adds to the package size. Is it needed?
- Update https://github.com/inbo/camtraptor/blob/main/R/data.R to document how the mica dataset was obtained.

damianooldoni · 2023-09-05T09:42:12Z

In general I agree with @peterdesmet 👍
Small downsides about testing:

- To avoid downloading files over and over from URLs in the tests of read_camtrap_dp(), I would download them once and save them in a temporary directory. That way we can still test that the read function works with local paths. After discussion with @PietrH: the nicest option is to use setup.R as described in the testthat documentation, which is worth checking. That is probably overkill, though, as the temporary files are needed only by test-read_camtrap_dp.R, so we can probably set them up in test-read_camtrap_dp.R itself.
- We can modify the temporary files to create artificial parsing issues. That way we can still check the correctness of the parsing issues returned by camtraptor.
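
The two ideas above (download once and reuse the local copy, then corrupt a copy to provoke parsing issues) can be sketched roughly as follows. This is only an illustration in Python, not camtraptor code: in the package this would live in R/testthat. The helpers `cached_file()` and `with_parsing_issue()`, the `fake_fetch()` stand-in for a real download, and the latitude value are all hypothetical.

```python
import shutil
import tempfile
from pathlib import Path

def cached_file(name, cache_dir, fetch):
    """Return a local path for `name`, calling `fetch` only on the first request."""
    path = Path(cache_dir) / name
    if not path.exists():
        path.write_text(fetch(name), encoding="utf-8")
    return path

def with_parsing_issue(source, target):
    """Copy `source` to `target` and corrupt one value to provoke a parsing issue."""
    shutil.copy(source, target)
    text = Path(target).read_text(encoding="utf-8")
    # Replace the first numeric latitude with a string that cannot be parsed.
    Path(target).write_text(text.replace("51.18", "not_a_number", 1), encoding="utf-8")
    return Path(target)

# Stand-in for a real download so the sketch runs offline; it records its calls.
calls = []
def fake_fetch(name):
    calls.append(name)
    return "deploymentID,latitude\n00a2c20d,51.18\n"  # made-up content

cache_dir = tempfile.mkdtemp()
first = cached_file("deployments.csv", cache_dir, fake_fetch)
second = cached_file("deployments.csv", cache_dir, fake_fetch)  # served from disk
broken = with_parsing_issue(first, Path(cache_dir) / "deployments_broken.csv")

print(len(calls))  # 1: the second request did not trigger a download
print("not_a_number" in broken.read_text(encoding="utf-8"))  # True
print("not_a_number" in first.read_text(encoding="utf-8"))   # False: original intact
```

Corrupting a copy rather than the cached file keeps the clean download available for the tests that read from local paths.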

mica_zenodo_5590881 is not needed and is indeed too big. I used it while writing some functions, as I needed bigger and more complex data packages, but this doesn't justify its presence in inst/extdata.

peterdesmet · 2023-09-06T10:01:49Z

peterdesmet · 2023-09-06T10:38:16Z

For reference, the deployments were changed as follows from old to new:

| old | new |
| --- | --- |
| - | 00a2c20d |
| 29b7d356-4bb4-4ec4-b792-2af5cc32efa8 | 29b7d356 |
| 577b543a-2cf1-4b23-b6d2-cda7e2eac372 | 577b543a |
| 62c200a9-0e03-4495-bcd8-032944f6f5a1 | 62c200a9 |
| 7ca633fa-64f8-4cfc-a628-6b0c419056d7 | - |

peterdesmet · 2023-09-06T15:46:45Z

@PietrH @damianooldoni I have reviewed all examples and tests

- @damianooldoni see above for a couple of tests that still fail. It might have something to do with sex = NULL. Please pull the mica branch before debugging.
- @PietrH can you have a look at the failing write_eml test?
- @damianooldoni should we store something in inst/extdata now?

damianooldoni · 2023-09-07T08:25:52Z

Thanks @peterdesmet. I will have a look at the failing tests as soon as possible. As for storing something in inst/extdata: following your first comment, we don't need to. My answer above confirms that choice, as its downsides can easily be solved.

damianooldoni · 2023-09-26T20:42:50Z

About the failure in test-get_n_obs: the problem seems to be that I count the number of distinct sequenceID values per species and deployment in get_n_obs(), see https://github.com/inbo/camtraptor/blob/main/R/get_n_obs.R#L137.
@peterdesmet: is this still the right way to count observations?
In the test, I was just counting the number of rows (after filtering), as the sequence IDs were always unique.
So, if the answer to my question is yes, I have to make the test more robust; if the answer is no, I have to improve get_n_obs().

peterdesmet · 2023-09-27T07:07:29Z

I don't know what meaning users ascribe to "number of observations". See this example, which has 1 Ardea cinerea, 1 female Anas platyrhynchos, and 1 male Anas platyrhynchos. If you group by deployment and scientificName, you will get (for this sequence alone):

2 observations (1 for Ardea, 1 for Anas) if you count the unique sequenceID/eventID (which is the same for all)
3 observations (1 for Ardea, 2 for Anas) if you count the unique observationID (which is different for all)
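
A minimal sketch of these two counting strategies for the event-based case, written in Python for brevity (camtraptor itself is R). The identifiers `obs1`, `seq1`, and `dep1` and the helper `count_unique()` are made up for illustration; only the species, sexes, and resulting counts come from the example above.

```python
from collections import defaultdict

# Hypothetical event-based observations for one sequence, mirroring the
# example above. All IDs are invented for illustration.
observations = [
    {"observationID": "obs1", "sequenceID": "seq1", "deploymentID": "dep1",
     "scientificName": "Ardea cinerea", "sex": None},
    {"observationID": "obs2", "sequenceID": "seq1", "deploymentID": "dep1",
     "scientificName": "Anas platyrhynchos", "sex": "female"},
    {"observationID": "obs3", "sequenceID": "seq1", "deploymentID": "dep1",
     "scientificName": "Anas platyrhynchos", "sex": "male"},
]

def count_unique(rows, id_field):
    """Count distinct values of id_field per (deploymentID, scientificName)."""
    ids = defaultdict(set)
    for row in rows:
        ids[(row["deploymentID"], row["scientificName"])].add(row[id_field])
    return {group: len(values) for group, values in ids.items()}

by_sequence = count_unique(observations, "sequenceID")
by_observation = count_unique(observations, "observationID")

print(sum(by_sequence.values()))     # 2: one per species, same sequence
print(sum(by_observation.values()))  # 3: one per row
```

The two Anas rows share a sequenceID, so they collapse into one when counting sequences but stay separate when counting observationID.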

Note that this assumes you only use event-based observations. If you count media-based observations, you will get (for this sequence alone):

60 observations (1 * 30 for Ardea, 1 * 30 for Anas) if you count the unique sequenceID/eventID (which is the same for all)
90 observations (1 * 30 for Ardea, 2 * 30 for Anas) if you count the unique observationID (which is different for all)

In my opinion:

I would count the number of unique observationID values (which should be the same as the number of rows). Counting by sequenceID/eventID would rather be a get_n_events().
I'm surprised the number of unique sequenceID values was ever the same as the number of rows; it is not intended to be.

peterdesmet · 2024-05-31T14:23:29Z

camtrapdp now uses the Camtrap DP example dataset via example_dataset(). Up to @damianooldoni to decide what is done for camtraptor. Work in #285 was abandoned.

damianooldoni · 2024-07-03T12:53:44Z

See #312. We are closing this issue as it is no longer relevant.

peterdesmet added this to the v1.0 milestone Jun 20, 2023

damianooldoni self-assigned this Sep 4, 2023

peterdesmet assigned peterdesmet and unassigned damianooldoni Sep 6, 2023

damianooldoni mentioned this issue Sep 6, 2023

Remove media arg #272

Merged

damianooldoni added a commit that referenced this issue Sep 26, 2023

Solve error in test-get_n_individuals

c587b2f

Related to #220

damianooldoni added a commit that referenced this issue Sep 26, 2023

Update test for obs of unknown species

7fc616b

This fixes failure in test-get_n_species, related to #220

PietrH linked a pull request Nov 3, 2023 that will close this issue

220 Create mica from example dataset in Camtrap DP #285

Closed

PietrH mentioned this issue Nov 3, 2023

220 Create mica from example dataset in Camtrap DP #285

Closed

peterdesmet added the camtrapdp/camtraptor label (To be decided if this is related to camtrapdp or camtraptor) Mar 6, 2024

peterdesmet removed the camtrapdp/camtraptor label (To be decided if this is related to camtrapdp or camtraptor) May 31, 2024

damianooldoni closed this as completed Jul 3, 2024
