Add spam dataloader #37

Prateek0xeo · 2025-01-05T14:01:37Z

This PR refactors the spam_dataloader function and introduces robust unit tests to ensure it meets expected functionality.

Results:

All test cases pass, confirming the expected behavior of the dataloader under various scenarios.
The tests also confirm compatibility with the Spam dataset structure and torch requirements.

dfalbel

Thanks for the PR @Prateek0xeo !
I added some review comments.

R/spam-dataloader.R

Prateek0xeo · 2025-01-15T06:01:06Z

@dfalbel Thank you for the feedback!

As requested, I have removed library(torch) from the PR.
The dataset definition is now kept separate from dataloader creation, so users decide how they want to create the dataloader.
Because spam_dataset() returns a torch::dataset object, the user has full control over:

Batch size.
Whether to shuffle the data.
Number of parallel workers.

ds <- spam_dataset(download = TRUE)
trying URL 'https://hastie.su.domains/ElemStatLearn/datasets/spam.data'
Content type 'unknown' length 698341 bytes (681 KB)
downloaded 681 KB

loader <- torch::dataloader(
  dataset = ds,
  batch_size = 32,
  shuffle = TRUE,
  num_workers = 4
)
batch <- dataloader_make_iter(loader) %>% dataloader_next()
dim(batch$x)
[1] 32 57
length(batch$y)
[1] 32

If additional modifications are needed, please let me know!

dfalbel

Hi @Prateek0xeo

Thanks for updating the PR! I added a couple more comments.
Can you also add the dataset to the Readme table?

dfalbel · 2025-01-15T15:56:10Z

tests/testthat/test-spam-dataloader.R

@@ -0,0 +1,16 @@
+if (requireNamespace("testthat", quietly = TRUE)) {


You don't need this requireNamespace() check here. devtools::test() will make sure testthat is loaded.

dfalbel · 2025-01-15T15:56:43Z

tests/testthat/test-spam-dataloader.R

+
+  test_that("spam_dataloader works as expected", {
+
+    loader <- spam_dataloader(download = TRUE)


You probably need to update the test cases, as spam_dataloader is now called spam_dataset.

dfalbel · 2025-01-15T15:59:54Z

R/spam-dataloader.R

+      x = torch_tensor(x, dtype = torch_float()),
+      y = torch_tensor(y, dtype = torch_long())


This is likely to work in most scenarios because torch would already be loaded, but in any case we should prefix the calls with torch:::

Suggested change

-      x = torch_tensor(x, dtype = torch_float()),
-      y = torch_tensor(y, dtype = torch_long())
+      x = torch::torch_tensor(x, dtype = torch_float()),
+      y = torch::torch_tensor(y, dtype = torch_long())
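The suggestion above qualifies torch_tensor, but torch_float() and torch_long() are exported by torch as well, so package code that never attaches torch would presumably need the prefix on those calls too. A minimal sketch of a fully qualified version follows. The helper name spam_to_tensors and its arguments are hypothetical and are not part of the PR; it assumes the torch package is installed.

```r
# Hypothetical helper (not the PR's code): converts one batch of features and
# labels to tensors without relying on library(torch) being attached.
spam_to_tensors <- function(x, y) {
  list(
    # Every torch symbol is namespace-qualified, including the dtype helpers.
    x = torch::torch_tensor(x, dtype = torch::torch_float()),
    y = torch::torch_tensor(y, dtype = torch::torch_long())
  )
}

# Usage with made-up data: 2 observations with 57 features each.
batch <- spam_to_tensors(matrix(0, nrow = 2, ncol = 57), c(0L, 1L))
dim(batch$x)
length(batch$y)
```

Qualifying every call this way keeps the function independent of what the user has attached, which is the concern raised in the review comment.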

Prateek0xeo · 2025-01-15T18:50:18Z

@dfalbel I have updated the PR in accordance with the new requests.

Prateek0xeo added 4 commits January 3, 2025 19:23

changes

8477e84

All_testcases_passed

4b55235

minor changes

07b7bf6

All changes done

1f7829b

dfalbel requested changes Jan 14, 2025

change request

e371741

dfalbel requested changes Jan 15, 2025

dfalbel reviewed Jan 15, 2025

Prateek0xeo added 2 commits January 16, 2025 00:12

new request excecuted

8a09c25

README.md Updated

a0c973c
