Datapasta provides RStudio addins and functions that give you complete freedom to copy-paste data to and from your source editor, formatted for immediate use. Note: repeated use has been known to cause titillation and giddiness.


Places I’ve found this power useful:

  • Copying tables from Excel, Jupyter, and websites, where the source file cannot be easily read.
  • Embedding small-ish amounts of raw data from .csv into Rmarkdown files. The file thus contains code, documentation, and data, attaining the holy trinity of reproducibility.
  • Quickly pasting vector output from other queries into dplyr::filter( .. %in% ..).
  • Adding datasets to readily reproducible examples for posting to StackOverflow, Slack channels etc.
  • Creating c() expressions with a LOT less typing and fiddling.

Typical usage takes full advantage of addins within RStudio; however, datapasta can be used with any R editor, even just the terminal. The typical RStudio case is described in full detail below, followed by the fallback behaviour.


+Typical Usage with RStudio


+Pasting a table as a formatted tibble definition with tribble_paste() +


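You can copy this HTML table of Brisbane weather forecasts:

X                               Location          Min  Max
Partly cloudy.                  Brisbane          19   29
Partly cloudy.                  Brisbane Airport  18   27
Possible shower.                Beaudesert        15   30
Partly cloudy.                  Chermside         17   29
Shower or two. Possible storm.  Gatton            15   32
Possible shower.                Ipswich           15   30
Partly cloudy.                  Logan Central     18   29
Mostly sunny.                   Manly             20   26
Partly cloudy.                  Mount Gravatt     17   28
Possible shower.                Oxley             17   30
Partly cloudy.                  Redcliffe         19   27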

tibble::tribble() or ‘transposed tibble’ is a really neat function that allows a tibble to be written in human readable format (Thanks be to Hadley).


To paste data as a tribble() call, just copy the table header and data rows, then paste into the source editor using the addin Paste as tribble. For best results, assign the addin to a memorable keyboard shortcut, e.g. ctrl + shift + t. See Customizing Keyboard Shortcuts.


tribble_paste() is a flexible function that guesses the separator and types of the data it pulls from the clipboard. Mostly this seems to work well. Occasionally it epic-fails. The supported separators are | (pipe), \t (tab), , (comma) and ; (semicolon). Most data copied from the internet or spreadsheets will be tab delimited. It will also attempt to recognise the lack of a header row and create a default for you, although this is not always possible.
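
To make the idea of separator guessing concrete, here is a minimal base R sketch. It is not datapasta's actual algorithm: guess_separator() is a hypothetical helper, and the rule it uses (pick the first candidate that splits every line into the same number of fields, more than one) is an assumption chosen for illustration.

```r
# Hypothetical helper, NOT datapasta's implementation: choose the first
# candidate separator that splits every line into the same number (> 1)
# of fields.
guess_separator <- function(lines, candidates = c("|", "\t", ",", ";")) {
  for (sep in candidates) {
    n_fields <- lengths(strsplit(lines, sep, fixed = TRUE))
    if (n_fields[1] > 1 && all(n_fields == n_fields[1])) {
      return(sep)
    }
  }
  NA_character_
}

# Stand-in for tab-delimited text copied from a spreadsheet or web table.
clipboard_lines <- c(
  "Location\tMin\tMax",
  "Brisbane\t19\t29",
  "Gatton\t15\t32"
)

sep <- guess_separator(clipboard_lines)
fields <- strsplit(clipboard_lines, sep, fixed = TRUE)
```

A rule this simple is easily fooled (for example by free text that happens to contain commas), which is one way to picture the occasional epic-fail mentioned above.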


+Pasting a list as a horizontal vector with vector_paste() +


A list could be a row or column of a spreadsheet or intermediate output. With the Paste as vector addin you can go from something like:

Mint    Fedora  Debian  Ubuntu  OpenSUSE

or:

Mint, Fedora, Debian, Ubuntu, OpenSUSE

to:

c("Mint", "Fedora", "Debian", "Ubuntu", "OpenSUSE")

The result is pasted into the source editor at the current cursor position.


Just like tribble_paste(), vector_paste() has a flexible parser that can guess the type and separator of the data. The supported separators are | (pipe), \t (tab), , (comma), ; (semicolon) and end of line. The recommended keyboard shortcut is ctrl + alt + shift + v.
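
The transformation from a delimited list to a c() expression can be sketched in a few lines of base R. This is an illustration only: to_vector_call() is a hypothetical helper, and it splits on every supported separator at once rather than guessing one, which is a simplification and not how datapasta itself is written.

```r
# Hypothetical helper, NOT part of datapasta: turn a delimited string into
# the text of a c() call, quoting elements only when they are not numeric
# or logical.
to_vector_call <- function(text) {
  # Split on pipe, tab, comma, semicolon or newline, then trim whitespace.
  parts <- trimws(strsplit(text, "[|\t,;\n]")[[1]])
  # Let base R guess the type of the elements.
  values <- type.convert(parts, as.is = TRUE)
  elements <- if (is.character(values)) {
    encodeString(values, quote = "\"")
  } else {
    as.character(values)
  }
  paste0("c(", paste(elements, collapse = ", "), ")")
}

distros <- to_vector_call("Mint\tFedora\tDebian\tUbuntu\tOpenSUSE")
numbers <- to_vector_call("1;2;3")
cat(distros, numbers, sep = "\n")
```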


+Pasting a list as a vertical vector with vector_paste_vertical() +


Given the same types of list inputs as above, the Paste as vector (vertical) addin pastes the output with each element on its own line, e.g.:

c("Mint",
  "Fedora",
  "Debian",
  "Ubuntu",
  "OpenSUSE")

This is much nicer for long lists. I have found this is actually the version I use more often. I recommend using ctrl + shift + v as the keyboard shortcut.


+Pasting as a data.frame with df_paste() +


The parser here is identical to tribble_paste() and has all the same type and separator guessing goodness. The difference is the output will be a formatted call to base::data.frame(). Some sensible line wrapping rules etc. are implemented. Useful for purists and educators alike. Special thanks to Jonathan Carroll for contributing this feature.


So the Brisbane weather table from above becomes:

data.frame(
           X = c("Partly cloudy.", "Partly cloudy.", "Possible shower.",
                 "Partly cloudy.", "Shower or two. Possible storm.",
                 "Possible shower.", "Partly cloudy.", "Mostly sunny.", "Partly cloudy.",
                 "Possible shower.", "Partly cloudy."),
    Location = c("Brisbane", "Brisbane Airport", "Beaudesert", "Chermside",
                 "Gatton", "Ipswich", "Logan Central", "Manly",
                 "Mount Gravatt", "Oxley", "Redcliffe"),
         Min = c(19, 18, 15, 17, 15, 15, 18, 20, 17, 17, 19),
         Max = c(29, 27, 30, 29, 32, 30, 29, 26, 28, 30, 27)
)

For a shortcut you could try ctrl + shift + d.
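
To see what a pasted data.frame() definition buys you, the base R sketch below parses a small tab-delimited excerpt of the weather table and compares it with the equivalent literal definition. The variable names (table_text, parsed, pasted) are made up for the example, and read.delim() stands in for the clipboard parsing; it is not what df_paste() calls internally.

```r
# Stand-in for tab-delimited text on the clipboard (three rows of the
# Brisbane weather table, without the forecast column).
table_text <- "Location\tMin\tMax\nBrisbane\t19\t29\nGatton\t15\t32\nManly\t20\t26"

# Parse it the way you would read a file from disk.
parsed <- read.delim(text = table_text, stringsAsFactors = FALSE)

# The same data written out as a literal definition, which is the kind of
# self-contained code that a pasted data.frame() call gives you.
pasted <- data.frame(
  Location = c("Brisbane", "Gatton", "Manly"),
       Min = c(19, 15, 20),
       Max = c(29, 32, 26),
  stringsAsFactors = FALSE
)

# Compare the two representations column by column.
same_values <- all(parsed$Location == pasted$Location) &&
  all(parsed$Min == pasted$Min) &&
  all(parsed$Max == pasted$Max)
```

The literal version needs no external file, which is what makes it handy for reproducible examples.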


+Outputting data from your R environment


+Output to R with dpasta() +


All of the above addin functions can be called directly with an R object argument. When run, this will result in the object being output at the current cursor, usually on the next line. To make things more magical, there is a single function, dpasta(), that will match the argument with the appropriate _paste() function based on its class. This means:

dpasta(head(iris))

results in:

data.frame(
      Sepal.Length = c(5.1, 4.9, 4.7, 4.6, 5, 5.4),
       Sepal.Width = c(3.5, 3, 3.2, 3.1, 3.6, 3.9),
      Petal.Length = c(1.4, 1.4, 1.3, 1.5, 1.4, 1.7),
       Petal.Width = c(0.2, 0.2, 0.2, 0.2, 0.2, 0.4),
           Species = as.factor(c("setosa", "setosa", "setosa", "setosa", "setosa",
                                 "setosa"))
)


Passing a tibble or a vector instead produces output in the matching tribble() or c() form.

+Avoiding fiddly data formatting


There are two addins that operate on RStudio cursor selections to make your life easier:


+Fiddle Selections until they’re better


Fiddle Selection is intended to remove some fiddly tasks from your workflow. It can turn raw data like 1 2 3 into c(1,2,3), then pivot from that to:

c(1,
  2,
  3)

and back again to c(1,2,3). The parser here is really flexible too. It will accept data delimited by any combination of spaces, commas, and newlines.
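
The pivot between horizontal and vertical layout is simple enough to sketch in base R. The helpers below (vector_elements(), to_vertical(), to_horizontal()) are hypothetical names invented for this illustration, not functions exported by datapasta, and they naively split on every comma, so quoted elements that contain commas would break them.

```r
# Hypothetical helpers, NOT datapasta functions: pivot the text of a c()
# expression between horizontal and vertical layout.
vector_elements <- function(expr) {
  # Strip the leading "c(" and trailing ")" and split on commas.
  inner <- sub("^[[:space:]]*c\\(", "", sub("\\)[[:space:]]*$", "", expr))
  trimws(strsplit(inner, ",")[[1]])
}

to_vertical <- function(expr) {
  paste0("c(", paste(vector_elements(expr), collapse = ",\n  "), ")")
}

to_horizontal <- function(expr) {
  paste0("c(", paste(vector_elements(expr), collapse = ", "), ")")
}

vertical <- to_vertical("c(1,2,3)")
horizontal <- to_horizontal(vertical)
cat(vertical, horizontal, sep = "\n")
```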


Fiddle Selection can also reflow messy tribble() and data.frame() expressions into neatly aligned ones, say after hand editing.


+Toggle Quotes


Toggle Vector Quotes will convert a selected expression like c(a,b,c) to a quoted version, i.e. c("a","b","c"). If it’s already quoted it will convert the other way to a bare version. All elements will be quoted if there’s a mixture. It also works with vertically aligned expressions.
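
The toggling rule described above (bare becomes quoted, fully quoted becomes bare, a mixture becomes fully quoted) can be written down as a short base R sketch. toggle_quotes() is a hypothetical helper for illustration only; unlike the addin it always returns a horizontal expression and it does not handle commas inside quoted elements.

```r
# Hypothetical helper, NOT the addin's implementation: toggle the elements
# of a c() expression between quoted and bare form.
toggle_quotes <- function(expr) {
  inner <- sub("^[[:space:]]*c\\(", "", sub("\\)[[:space:]]*$", "", expr))
  elements <- trimws(strsplit(inner, ",")[[1]])
  quoted <- grepl('^".*"$', elements)
  if (all(quoted)) {
    # Everything is quoted already: strip the surrounding quotes.
    elements <- gsub('^"|"$', "", elements)
  } else {
    # Bare or mixed: quote whatever is not quoted yet.
    elements[!quoted] <- paste0('"', elements[!quoted], '"')
  }
  paste0("c(", paste(elements, collapse = ","), ")")
}

quoted_version <- toggle_quotes("c(a,b,c)")
bare_version <- toggle_quotes(quoted_version)
mixed_version <- toggle_quotes('c(a,"b",c)')
```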


With the combination of these two you can get really lazy, e.g. go from:

some stuff I typed

to:

c("some",
  "stuff",
  "I",
  "typed") # mostly

in a couple of keystrokes!


Try assigning these addins to ctrl + shift + f and ctrl + shift + q respectively.


+Output to clipboard with dmdclip() +


dmdclip() can help you take the data to somewhere that uses markdown format, for example a Stack Overflow question or GitHub issue. This function will copy the resulting formatted data object call to the clipboard, inserting 4 spaces at the head of each line, which is markdown syntax for a pre-formatted block.



dmdclip(head(iris))

This will place the following on the clipboard:

    data.frame(
       Sepal.Length = c(5.1, 4.9, 4.7, 4.6, 5, 5.4),
        Sepal.Width = c(3.5, 3, 3.2, 3.1, 3.6, 3.9),
       Petal.Length = c(1.4, 1.4, 1.3, 1.5, 1.4, 1.7),
        Petal.Width = c(0.2, 0.2, 0.2, 0.2, 0.2, 0.4),
            Species = as.factor(c("setosa", "setosa", "setosa", "setosa", "setosa",
                                  "setosa"))
    )
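
The markdown step itself is just a four-space prefix on every line. A minimal base R sketch of that step follows; as_markdown_block() is a hypothetical helper written for this example, and the clipboard write is left out so the snippet runs anywhere.

```r
# Hypothetical helper, NOT dmdclip() itself: prefix every line of some code
# text with four spaces so markdown renders it as a pre-formatted block.
as_markdown_block <- function(code_text) {
  lines <- strsplit(code_text, "\n", fixed = TRUE)[[1]]
  paste0("    ", lines, collapse = "\n")
}

call_text <- "c(\"setosa\",\n  \"versicolor\")"
md_block <- as_markdown_block(call_text)
cat(md_block, "\n")
```

On a machine with clipboard access, the result could then be handed to clipr::write_clip().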

+Usage without RStudio


The rstudioapi package enables the calling of addins and output to the cursor. If the API is not detected, all the _paste() functions, and dpasta will output their text to the console, ready for copying and pasting to an editor window.


In this scenario you may wish to avoid installation of the rstudioapi package dependency. Use install.packages("datapasta", dependencies = "Depends") to avoid API installation, but be sure to follow up with install.packages(c("readr","clipr")).


Note: The dpasta() function can be used without clipr installed, but you’re missing out on a fair amount of awesomeness if you limit yourself to that.


+Custom Behaviour for Your Unique Snowflake Setup


Custom behaviour can be created by taking advantage of the _construct() variants of the _paste() functions, as these return their output as an R object which can then be written to an appropriate buffer or clipboard.


For example, if you copied the Brisbane weather forecast from above to the clipboard and then called:

trib_call <- tribble_construct()

trib_call now contains the tribble call as a character vector. You could then write this with:

write(trib_call, file = ..your desired location..)
clipr::write_clip(trib_call) #Send it back to the clipboard.
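
As a runnable variation on the idea of writing the constructed call somewhere, the sketch below uses a hand-written string as a stand-in for what a _construct() function returns (so no clipboard is needed) and writes it to a temporary file. The contents of trib_call here and the temporary file location are assumptions made for the example.

```r
# Stand-in for the character vector returned by a _construct() function.
trib_call <- "tibble::tribble(\n  ~Location, ~Min, ~Max,\n  \"Brisbane\", 19, 29\n)"

# Write the call to a temporary .R file, then read it back.
out_file <- tempfile(fileext = ".R")
writeLines(trib_call, out_file)
round_trip <- paste(readLines(out_file), collapse = "\n")

# The text survives the round trip and still parses as a single R expression.
n_expressions <- length(parse(text = round_trip))
```

The same pattern works for any destination that accepts text, such as an editor buffer exposed by another IDE.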

+Configurable Options


+Upping the row guard


For your protection, datapasta will initially refuse to output R objects of 200 or more rows. Up the row limit for your specific scenario with dp_set_max_rows(n). Large numbers of rows could take a long time to format. In extreme cases you could crash your R/RStudio session.


+Dealing with “,” decimal marks


Use dp_set_decimal_mark(",") to handle numbers like 3,14.
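
To illustrate what a "," decimal mark means for parsing, here is a small base R sketch. It does not show how datapasta handles the option internally; it simply swaps the decimal comma for a point before converting, and the sample values are invented.

```r
# Numbers written with a comma as the decimal mark, as text.
raw_values <- c("3,14", "2,71", "10")

# Replace the decimal comma with a point, then convert to numeric.
parsed <- as.numeric(sub(",", ".", raw_values, fixed = TRUE))
```

Note the ambiguity this creates: in a comma separated list, 3,14 could equally be read as the two numbers 3 and 14, which is why the decimal mark is an explicit setting rather than something guessed.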



+The Goods




+Introducing datapasta


datapasta is about reducing resistance associated with copying and pasting data to and from R. It is a response to the realisation that I often found myself using intermediate programs like Sublime to munge text into suitable formats. Addins and functions in datapasta support a wide variety of input and output situations, so it (probably) “just works”. Hopefully tools in this package will remove such intermediate steps and associated frustrations from our data slinging workflows.



  • Linux users will need to install either xsel or xclip. These applications provide an interface to X selections (clipboard-like).
    • For example: sudo apt-get install xsel - it’s 72kb…
  • Windows and macOS have nothing extra to do.


  1. Get the package: install.packages("datapasta")
  2. Set the keyboard shortcuts using Tools -> Addins -> Browse Addins, then click Keyboard Shortcuts…



+Use with RStudio


+Getting data into source


At the moment this package contains these RStudio addins that paste data to the cursor:

  • tribble_paste, which pastes a table as a nicely formatted call to tibble::tribble()
    • Recommend Ctrl + Shift + t as shortcut.
    • Table can be delimited with tab, comma, pipe or semicolon.
  • vector_paste, which will paste delimited data as a vector definition, e.g. c("a", "b") etc.
    • Recommend Ctrl + Alt + Shift + v as shortcut.
  • vector_paste_vertical, which will paste delimited data as a vertically formatted vector definition.
    • Recommend Ctrl + Shift + v as shortcut.
    • Example output:
      c("Mint",
        "Fedora",
        "Debian",
        "Ubuntu",
        "OpenSUSE")
  • df_paste, which pastes a table on the clipboard as a standard data.frame definition rather than a tribble call. This has certain advantages in the context of reproducible examples and educational posts. Many thanks to Jonathan Carroll for getting this rolling and coding the bulk of the feature.
    • Recommend Ctrl + Alt + Shift + d as shortcut.
  • dt_paste, which is the same as df_paste, but for data.table.

+Massaging data in source


There are two addins that can help with creating and aligning data in your editor:

  • Fiddle Selection will perform magic on a selection. It can be used to:
    • Turn raw data delimited by any combination of commas, spaces, and newlines into a c() expression.
    • Pivot a c() expr between horizontal and vertical layout.
    • Reflow messy tribble() and data.frame() exprs.
    • Recommend Ctrl + Shift + f as shortcut.
  • Toggle Vector Quotes will toggle a c() expr between all elements wrapped in "" and all bare unquoted form. Handy in combination with above to save mucho keystrokes.
    • Recommend Ctrl + Shift + q as shortcut.

+Getting Data out of an R session


There are two R functions available that accept R objects and output formatted text for pasting to a reprex or other application:

  • dpasta accepts tibbles, data.frames, and vectors. Data is output in a format that matches the input class. Formatted text is pasted at the cursor.
  • dmdclip accepts the same inputs as dpasta but inserts the formatted text onto the clipboard, preceded by 4 spaces so that it can be pasted as a preformatted block to GitHub, Stack Overflow etc.

+Use with other editors


The only hard dependency of datapasta is readr for type guessing. All the above _paste functions can be called directly instead of as an addin, and will fall back to console output if the rstudioapi is not available.


On systems without access to the clipboard (or without clipr installed) datapasta can still be used to output R objects from an R session. dpasta is probably the only function you care about in this scenario.


+Custom Installation


datapasta imports clipr and rstudioapi so as to make installation smooth and easy for most users. If you wish to avoid installing an rstudioapi you will never use, you can use:

install.packages("datapasta", dependencies = "Depends")


  • tribble_paste works well with CSVs, Excel files, and HTML tables, but is currently brittle with respect to irregular table structures like merged cells or multi-line column headings. For some reason Wikipedia seems chock full of these. :(
  • Quoted CSV data, where the quotes contain commas, will not be parsed correctly.
  • Nested list columns have limited support with tribble_paste()/dpasta(). Nested lists of length 1 fail unless all are length 1 - it’s complicated. You still get some output so it might be viable to fix and reflow with Fiddle Selection. Tread with caution.
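
The quoted-comma caveat is easy to picture with a base R sketch. The naive split below is only an illustration of the class of problem, not datapasta's parser, and the sample line is invented. A quote-aware reader such as read.csv() handles the same text correctly.

```r
# A CSV line whose first field contains a comma inside quotes.
csv_line <- '"Smith, John",42'

# A naive split on commas cuts the quoted name in two.
naive_fields <- strsplit(csv_line, ",", fixed = TRUE)[[1]]

# A quote-aware parser keeps the field together.
parsed <- read.csv(text = csv_line, header = FALSE, stringsAsFactors = FALSE)
```

One possible workaround, given that dpasta() accepts data.frames, is to read such data with a quote-aware reader first and then pass the resulting object to dpasta().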

+Prior art


This package is made possible by mdlincoln’s clipr, and Hadley’s packages tibble and readr (for data-type guessing). I especially appreciate clipr's thoughtful approach to the clipboard on Linux, which pretty much every other R clipboard package just nope’d out on.


+Future developments


I am interested in expanding the types of objects supported by the output function dpasta. I would also like to eventually have Fiddle Selection pivot function calls and named vectors. Feel free to contribute your ideas to the open issues.




0 to datapasta in 64 seconds via a video vignette:


Datapasta in 64 seconds


+datapasta 3.1.0 ‘Leave to Simmer’ Unreleased +

  • Exported _format functions
  • Adds dt_paste function for pasting as data.table (Thanks @jonocarroll, #72, closes #70)
  • Row names are kept in data.frames and data.tables (Thanks @sowla)
  • Column names that are invalid are now handled with backticks (Thanks @sharlagelfand)
  • Fixes issue with commas inside character vectors getting wrapped on
  • data.frame (and data.table) print is much prettier and robust with all args and cols aligned on ‘=’
  • zero row tibbles are supported with a fall-back to a tibble::tibble() call
  • all _construct functions now return input visibly
  • Fallback behaviour added to allow usage with remote R sessions like RStudio Server/Cloud, or ssh command line. (Thanks @gadenbuie, @jonthegeek)

+datapasta 3.0.0 ‘Colander Helmet’ 2018-01-24 +

  • When pasting from clipboard it now attempts to guess if there is no header row, in the case where the clipboard is all data. If you’re lucky it will create a default header for you when pasting (V1, V2, V3 etc.).
  • dpasta() will now handle tribbles with R classes that cannot be represented in tribble form. It falls back to their character representation. This works well for things like dates.
  • New addin: ‘Fiddle Selection’. This is a kind of magic wand that can be waved over RStudio editor selections to: reflow messy tribble and data.frame definitions, create c() expressions from raw data, and pivot c() exprs between vertical and horizontal format.
  • New addin: ‘Toggle Vector Quotes’. Given a horizontal or vertical c() expr, it will toggle all elements between quoted and bare format.
  • Complies with new CRAN policy on clipboard use. You cannot write to the clipboard in non-interactive sessions with dmdclip() - Why would you? Tests containing clipboard use are skipped on CI and CRAN.

+datapasta 2.0.1 Unreleased +

  • Added a trailing newline after all pastes, this works much nicer for console output.
  • Fixed handling of backslashes. Relying on built-in function deparse() for escaping chars that need it.

+datapasta 2.0.0 ‘Fusilli Jerry’ 2017-03-26 +

  • Added the ability to parse objects from R and output as neatly formatted tibbles, dataframes and vectors with dpasta. The clipboard is not involved.
  • Added the ability to send these same types of objects to the clipboard formatted for markdown output with dmdclip.
  • Package can now operate in a close to fully featured way in editors other than RStudio. Output goes to console rather than cursor.
  • Added hooks for output customisation with _construct() functions that return the formatted output as an R character vector.
  • The decimal mark can be set for numeric data with dp_set_decimal_mark.
  • User can now paste natural looking comma separated lists as vectors, with automatic comma-splitting and whitespace trimming.

+datapasta 1.1.0 ‘CopyPesto’ 2017-01-09 +

  • Added df_paste() which pastes a table from the clipboard using a nicely formatted call to data.frame() rather than tribble()
  • Better handling for empty lines that get accidentally copied onto clipboard with table. Gracefully ignored.

+datapasta 1.0.0 2016-11-29 +

  • Added new addin ‘Paste as vector (vertical)’ to provide nicer formatting for long lists.
  • All addins now guess data types and format correctly in the source editor.
  • Empty rows in tables and empty cells in lists are formatted as NA’s when pasting instead of being ignored.
  • Added vignette, automated tests etc in prep for CRAN submission.

+datapasta 0.2 Unreleased +

  • Added graceful error handling on failed parse of text on clipboard to table.
  • tribble_paste() and vector_paste() now paste NA’s as unquoted, so R will parse as proper NA.
  • tribble_paste() can parse and paste table text copied from raw delimited file e.g. csv, tsv, pipe delimited, semi-colon delimited.
  • vector_paste() uses a space between elements.

+datapasta 0.1.1 Unreleased +

  • Added a NEWS.md file to track changes to the package.
  • Fixed the handling of NAs in tab delimited files which resulted in phantom NA columns sometimes appearing with tribble_paste()
