Merge pull request #44 from datapages/irwpkg
Irwpkg
ben-domingue authored Jan 31, 2025
2 parents 9463867 + b9e0e2c commit c21f779
Showing 4 changed files with 68 additions and 44 deletions.
10 changes: 5 additions & 5 deletions _quarto.yml
@@ -9,16 +9,16 @@ website:
left:
- href: index.qmd
text: Home
- href: analysis.qmd
text: Getting Started
- href: data.qmd
text: Data Browser
- href: standard.qmd
text: Data Standard
- href: data.qmd
text: Data
- href: docs.qmd
text: Documentation
- href: analysis.qmd
text: Analysis
- href: research.qmd
text: Research
text: Research Examples
- href: about.qmd
text: About
- href: contact.qmd
63 changes: 60 additions & 3 deletions analysis.qmd
@@ -1,8 +1,66 @@
---
title: "Data analysis"
title: "Getting Started"
---

Here we provide example code in R and Python for loading multiple datasets in IRW and performing some summary computations over them. This should be a useful starting point for conducting your own analyses of the data. You may also be interested in the functionality embedded in the `irw` [package](https://github.com/ben-domingue/irw/tree/main/irw_pkg).

There are several ways of working with the IRW data. Below we will first describe how to get data from the IRW and then offer some suggestions for how to analyze it.

## Getting the data

There are several ways of getting data from the IRW.

* You can use the [Data Browser](data.qmd) to investigate individual datasets and then download them directly via [Redivis](https://redivis.com/datasets/as2e-cv7jb41fd/tables).

* You can also access IRW data programmatically. There are several ways of doing this that we describe below.

+ You can use a Redivis notebook. Consider some example workflows [here](https://redivis.com/workspace/studies/1812/workflows).

+ You can use the Redivis API for [R](https://apidocs.redivis.com/client-libraries/redivis-r) or [Python](https://apidocs.redivis.com/client-libraries/redivis-python) (**note** that you will first need to [generate and set an API token](https://apidocs.redivis.com/client-libraries/redivis-r/getting-started)). Given that we anticipate this being a popular means of using the IRW, we elaborate on how this can be done in the next section.

## Accessing IRW data from your machine

### Basic access

To illustrate a basic approach to using the IRW, the code below downloads a single IRW table to your machine.

::: {.panel-tabset}
## R

```{r}
#| eval: false
#| echo: true
# first install redivis package: devtools::install_github("redivis/redivis-r", ref="main")
# individual dataset
dataset <- redivis::user("datapages")$dataset("item_response_warehouse")
df <- dataset$table("4thgrade_math_sirt")$to_tibble()
```

## Python

```{python}
#| eval: false
#| echo: true
#| python.reticulate: false
import redivis
# individual dataset
dataset = redivis.user('datapages').dataset('item_response_warehouse')
df = dataset.table('4thgrade_math_sirt').to_pandas_dataframe()
```
:::

### Flexible access

More complex workflows will benefit from using the custom R package `irwpkg`. We describe how this package can be used to flexibly download customized lists of data and do other useful things (e.g., generate lists of citations for data) [here](https://hansorlee.github.io/irwpkg/).


## Analyzing the data

Here we provide a first example of how you can work with IRW data: code in R and Python for loading multiple datasets in IRW and performing some summary computations over them. This should be a useful starting point for conducting your own analyses of the data.

::: {.panel-tabset}

@@ -83,7 +141,6 @@ summaries_list = [get_data_summary(name) for name in dataset_names]
summaries = pd.concat(summaries_list, ignore_index=True)
print(summaries)
```

:::
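
To make the kind of summary computation described above concrete, here is a minimal, dependency-free sketch. It is not the code from this page: the function name `summarize_responses` and the toy rows are hypothetical, and it assumes long-format data with `id`, `item`, and `resp` fields, which is our reading of the IRW data standard.

```python
def summarize_responses(rows):
    """Summarize long-format response rows.

    Each row is assumed to be a dict with keys 'id' (respondent),
    'item' (item identifier), and 'resp' (numeric response).
    """
    ids = set()
    items = set()
    total = 0.0
    n = 0
    for row in rows:
        ids.add(row["id"])
        items.add(row["item"])
        total += float(row["resp"])
        n += 1
    if n == 0:
        raise ValueError("no responses to summarize")
    return {
        "n_respondents": len(ids),
        "n_items": len(items),
        "n_responses": n,
        "mean_resp": total / n,
        # share of the respondent-by-item grid that is actually observed
        "density": n / (len(ids) * len(items)),
    }


# Hypothetical toy data: three respondents, two items, one missing cell.
toy_rows = [
    {"id": 1, "item": "q1", "resp": 1},
    {"id": 1, "item": "q2", "resp": 0},
    {"id": 2, "item": "q1", "resp": 1},
    {"id": 2, "item": "q2", "resp": 1},
    {"id": 3, "item": "q1", "resp": 0},
]
summary = summarize_responses(toy_rows)
print(summary)
```

With a real table, the rows could come from the downloaded data frame (for example via `df.to_dict("records")` in pandas), provided the column names match the assumption above.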

Here is a slightly more complex example showing how to compute the InterModel Vigorish contrasting predictions from the 2PL with predictions from the 1PL for an example dataset (using cross-validation across 4 folds).
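
As background for that example, the following is a minimal sketch of the InterModel Vigorish (IMV) for binary outcomes, written from our understanding of its published definition rather than taken from this page's code. Each model's mean log-likelihood is mapped to the weight `w` of an equivalent weighted coin by solving `w*log(w) + (1-w)*log(1-w) = mean log-likelihood`, and the IMV is `(w1 - w0) / w0`. The function names and toy inputs are hypothetical, and the 1PL/2PL fitting and cross-validation steps are not shown.

```python
import math


def mean_log_likelihood(y, p):
    """Mean Bernoulli log-likelihood of binary outcomes y under predictions p."""
    eps = 1e-12  # clip predictions to avoid log(0)
    total = 0.0
    for yi, pi in zip(y, p):
        pi = min(max(pi, eps), 1 - eps)
        total += yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
    return total / len(y)


def _coin_ll(w):
    """Expected log-likelihood of a coin with weight w under perfect calibration."""
    return w * math.log(w) + (1 - w) * math.log(1 - w)


def coin_weight(mean_ll):
    """Solve _coin_ll(w) = mean_ll for w in [0.5, 1) by bisection."""
    lo, hi = 0.5, 1 - 1e-12
    if mean_ll <= _coin_ll(lo):
        return lo  # no better than a fair coin
    for _ in range(200):
        mid = (lo + hi) / 2
        if _coin_ll(mid) < mean_ll:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def imv(y, p_baseline, p_enhanced):
    """IMV of enhanced predictions relative to baseline predictions."""
    w0 = coin_weight(mean_log_likelihood(y, p_baseline))
    w1 = coin_weight(mean_log_likelihood(y, p_enhanced))
    return (w1 - w0) / w0


# Hypothetical toy check: 7 successes out of 10, predicted with 0.5 vs 0.7.
y = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]
baseline = [0.5] * 10
enhanced = [0.7] * 10
print(imv(y, baseline, enhanced))  # approximately 0.4
```

In the setting described above, `p_baseline` would hold out-of-fold predictions from the 1PL and `p_enhanced` those from the 2PL.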
37 changes: 2 additions & 35 deletions data.qmd
@@ -1,10 +1,10 @@
---
title: "Data"
title: "Data Browser"
---

{{< include _load-data.qmd >}}

Below we show metadata for the entire IRW, an example dataset, and illustrations of how to access the data programmatically. You can also explore the data [here](https://redivis.com/datasets/as2e-cv7jb41fd/tables).
Below we show metadata for the entire IRW, which can be used to help identify datasets of potential interest. Individual datasets can then be further explored via the selection menu below. You can also explore the data [here](https://redivis.com/datasets/as2e-cv7jb41fd/tables).

## Metadata

@@ -52,36 +52,3 @@ dataset_url = `https://redivis.com/embed/tables/datapages.item_response_warehous
html`<iframe id="myIframe" width="800" height="500" allowfullscreen style="border:0;" src = "${dataset_url}"></iframe>`
```

## Programmatic access

You can also access IRW data programmatically using the Redivis API for [R](https://apidocs.redivis.com/client-libraries/redivis-r) or [Python](https://apidocs.redivis.com/client-libraries/redivis-python) (**note** that you will first need to [generate and set an API token](https://apidocs.redivis.com/client-libraries/redivis-r/getting-started)). For example:

::: {.panel-tabset}
## R

```{r}
#| eval: false
#| echo: true
# first install redivis package: devtools::install_github("redivis/redivis-r", ref="main")
# individual dataset
dataset <- redivis::user("datapages")$dataset("item_response_warehouse")
df <- dataset$table("4thgrade_math_sirt")$to_tibble()
```

## Python

```{python}
#| eval: false
#| echo: true
#| python.reticulate: false
import redivis
# individual dataset
dataset = redivis.user('datapages').dataset('item_response_warehouse')
df = dataset.table('4thgrade_math_sirt').to_pandas_dataframe()
```
:::
2 changes: 1 addition & 1 deletion standard.qmd
@@ -64,7 +64,7 @@ Below are critical instructions for formatting data for the IRW.

4. If there are multiple scales available, the responses need to be split into multiple files (one per scale). If multiple groups are assessed via the same scale, these data can be put into a single file (if desired, a column indicating group membership can be added).

While we have tried to offer generic guidance on formatting data to the IRW standard, there are innumerable idiosyncrasies that may merit additional conversation. To discuss specific issues associated with formatting your data to the IRW standard, please feel free to reach out to us at `[email protected]`. We would be happy to talk more!
While we have tried to offer generic guidance on formatting data to the IRW standard, there are innumerable idiosyncrasies that may merit additional conversation. To discuss specific issues associated with formatting your data to the IRW standard, please feel free to reach out to us at `[email protected]`. We would be happy to talk more! If you like, you can also use the [IRW Dataset Builder](https://irw-dataset-builder.streamlit.app/), which helps port data to the IRW data standard.

## Adding data to the IRW
