Skip to contents

R + Spreadsheets



The package is somewhere between dev and beta.

Testing and feedback would be helpful


Overview:

Spreadsheets are a great tool for presenting 2-dimensional data in a reader-friendly format and are still the workhorse of the business community, but spreadsheets don’t have the data processing power of R, so its common for data professionals to work in both spreadsheets and R. However, working in both tools can be frustrating. It’s time expensive to read data back forth between the two, and typically, I want to do my data processing in R and then send out a series of decent-looking tables to a workbook. xlr is an effort to reduce the friction that comes from needing to work in both spreadsheets and R.


Functions:

  • xl() – output a list of datasets as named spreadsheets in a workbook
  • paste_from_xl() – format for R your OS file/folder names or spreadsheet data that you copied into your clipboard memory
  • enscript() – produce the code to re-create an R object
  • repeated() – flag all repeated observations in a vector
  • sys_open() – request your OS to open files, folders, or URLs
  • enlist() – auto-name elements and catch lists without additional list nesting
  • copy_for_xl() – copy an R dataset to clipboard memory and format for pasting to a spreadsheet
  • list_iron() – flatten lists of embedded lists to a single-file list of named objects
  • entibble() – create tibbles with convenient default behavior for spreadsheet presentation


Details and Examples:

# Install
remotes::install_github('eauleaf/xlr', build_vignettes = TRUE)
library(xlr)


xl()

xl() is the primary function of the package, and many of the other functions are around to help it. The function is somewhat similar to View() from the utils package in that xl() is for auto-naming and quickly viewing datasets.

It’s easiest to see what xl() does by passing it some example data; try sending it a couple pre-loaded datasets, like:

xl(mtcars, iris, letters, AirPassengers, Titanic)

If you type the above code to the R console, you should get a workbook that pops open with those datasets presented in standardized, formatted spreadsheets. Once you’re done with the workbook, just close it; your spreadsheets are temporary. Close them, and they’re gone.

The video below is of xl() creating a temporary workbook where each data object becomes a labeled spreadsheet in a workbook.

xl(iris, mtcars, 1:100, dplyr::starwars)

To use xl(), pass a dataframe or a list of dataframes. It does not matter if you pass in lists with nested lists where your structured data objects are buried somewhere. xl() recursively flattens the list and auto-names your data objects to become spreadsheets in a workbook. In fact, nested lists can be useful in the naming of groups within spreadsheet tabs if you provide a sheet-naming specification to xl() like .tabname_spec = list(name_spec = "{outer}|{inner}").

Try running the below examples in your console.

your_datasets <- enlist(flowers = iris, mtcars, datasets::USArrests)

xl(your_datasets) 

xl(your_datasets, .tabname_spec = list(name_spec = "{outer}|{inner}")) 

iris |> splitter(Species) |> xl()

iris |> splitter(Species) |> xl(your_datasets)

xl() writes your workbooks in a temp directory and, to keep workbooks from building up, uses later::later() to remove the temp savefile 5 minutes after write time. Temp directories also delete themselves when you close R. If, however, you want to save a workbook, specify a path or just a save name as the xl() input parameter .path. Specifying a savename signals to xl() not to mark the file for cleanup. For example, xl(iris, .path = 'flower_data') will save your data under flower_data.xlsx within your working directory. (.path uses here::here() internally.)

xl() creates tibbles of all datasets you pass in, and if specified by the user parameter .return, optionally returns (always invisibly) the openxlsx workbook object (default), or the data objects in a list, or the workbook savepath as a string, or all of them in a list.


paste_from_xl()

paste_from_xl() pastes data from your clipboard memory to a tibble object in R. You can pass spreadsheet data to R by copying data from your spreadsheet, then type the following into your console:

my_data <- paste_from_xl()

The function guesses whether you have field names present, but if the function guessed wrong, you can specify with the parameter has_fieldnames, like below:

my_data <- paste_from_xl(has_fieldnames = TRUE)

The paste_from_xl() key-chord:

However, a better method for pasting is to use quick keys ctrl + alt + shift+ v. To use this key-chord, run the function set_xlr_key_chords(). (To set up quick keys, you only have to run this function once in RStudio, but you have to close RStudio and re-open it for the new quick keys to be active.)

Then:

  1. Copy some data from a spreadsheet.

  2. Click over to the RStudio console or to an RStudio editor. (The console and editor have different paste_from_xl() functionality.)

  3. Press ctrl + alt + shift + v all together to paste. (Just like pasting with ctrl + v, but using all 3 control keys, plus the v.)

You should see a tibble of your spreadsheet data assigned to a variable and the dataset echoed to the console. However, if you pressed ctrl + alt + shift + v, and you were over an editor window instead of the console, you also get formatted code written into the editor window that recreates the data object from script. The script is a formatted version of R’s deparsed internal data representation.

The video below is an example of copying spreadsheet data and then using key-chord ctrl + alt + shift + v in RStudio.

ctrl + alt + shift + v

Copy paths using paste_from_xl():

In addition to spreadsheet data, you can copy file paths with paste_from_xl(). Open a file navigation window in your operating system, highlight a few files or folders, and press ctrl + c; then go back to RStudio, click on an editor window, and press the key-chord ctrl + alt + shift + v, you get an enscripted vector of file paths pasted into your text editor.


The video below is an example of copying file paths from a file window and then using the key-chord ctrl + alt + shift + v in RStudio.

ctrl + alt + shift + v


enscript()

I often manually process data down to particular information that’s too small to deal with saving and retrieving from a serialized file or database. Instead, I just want to save my processed dataset as text within the script, but I hate manually recreating the object.

If you have an object in a variable, and want that variable’s data to be deparsed into a formatted script, pass that variable into enscript(), as in:

enscript(head(mtcars))

enscript() copies formatted code to your clipboard that will reproduce that variable’s internal R object.

Below is an example video of enscript():

Try running something like:

iris |> splitter(Species) |> purrr::map(head, 2) |> enscript()

You can pass in an expression or a single variable by piping ‘enscript()’ to the end of anything, or by using the quick keys. Enscript(), and its key-chord, is extremely useful for creating unit tests for package test files.

The enscript() key-chord:

It’s convenient to use quick keys with enscript(), which are ctrl + alt + shift + n. (I remember the key-chord as all three control keys, plus the n, which sounds short for ‘enscript’.)

To use the enscript() quick keys, Run the quick-key setup with set_xlr_key_chords() if you haven’t already. Then:

  1. Type a variable name like mtcars in your RStudio editor window (not in the console).

  2. Use your cursor to highlight the word mtcars.

  3. Press ctrl + alt + shift + n all together.

You should see the success message, ✔ Script copied to clipboard:, written to your console with the deparsed object echoed below it. You can now paste a formatted script into an editor window or the console with ctrl + v.

Note that the enscript() quick keys only work for a highlighted variable or highlighted expression in the RStudio editor window; the quick keys do not interpret anything highlighted in the console.


repeated()

repeated() helps filter a dataframe for identical observations. It should probably be in dplyr, but it’s in xlr because it’s not in dplyr. I use it to understand duplication errors. Use it like this:

mtcars |> dplyr::filter(repeated(disp))
#>                    mpg cyl  disp  hp drat    wt  qsec vs am gear carb
#> Mazda RX4         21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
#> Mazda RX4 Wag     21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
#> Hornet Sportabout 18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
#> Duster 360        14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
#> Merc 280          19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
#> Merc 280C         17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4
#> Merc 450SE        16.4   8 275.8 180 3.07 4.070 17.40  0  0    3    3
#> Merc 450SL        17.3   8 275.8 180 3.07 3.730 17.60  0  0    3    3
#> Merc 450SLC       15.2   8 275.8 180 3.07 3.780 18.00  0  0    3    3

repeated() is just x %in% x[duplicated(x)] with assertion checks, but way easier to remember than x %in% x[duplicated(x)]. Use it in and and or statements like this:

# AND statements
mtcars |> dplyr::filter(repeated(hp), repeated(disp))
# OR statements
mtcars |> dplyr::filter(repeated(hp) | repeated(disp))


sys_open()

sys_open() opens the .xlsx files, but also opens any file, folder, or URL since this function requests the operating system to open the provided input locations via your operating system’s command-line interface. This function passes a path string (or many, if you like) to your OS, asking the OS to open up the location using its default application. The reason I use it instead of other functions is because I’ve had better luck getting files to open when requesting that the operating system make the decisions about how to open up the file.

Try running sys_open() without any inputs. It should pop open a navigation pane in your working directory.

If you aren’t able to open a file with sys_open(), then, using your operating system’s navigation window, right-click on the file to make sure your operating system has a default application assigned that file type; after that, double-click on the file to make sure it opens in your specified default application.

The video below is an example of opening up some PDF files by piping them from paste_from_xl() to sys_open().

sys_open(paste_from_xl())


enlist()

enlist() is for quick list-naming. This function produces named lists, without list-nesting when passed a single bare list. That is, unlike list(), enlist() removes its own embedded list layer if the user passes in a single list.

For instance, compare the structure and naming created by enlist:

enlist(enlist(enlist('hello'))) |> str()
#> List of 1
#>  $ hello: chr "hello"

To the structure and naming created by list:

list(list(list('hello'))) |> str()
#> List of 1
#>  $ :List of 1
#>   ..$ :List of 1
#>   .. ..$ : chr "hello"

The enlist() example is one list deep, whereas the `list() three lists deep. However, if you send two list objects into enlist, as shown in the example below, enlist has no choice but to wrap them in a new list.

enlist(list(letters), list(letters)) |> str()
#> List of 2
#>  $ list(letters):List of 1
#>   ..$ : chr [1:26] "a" "b" "c" "d" ...
#>  $ list(letters):List of 1
#>   ..$ : chr [1:26] "a" "b" "c" "d" ...

A few additional notes about enlist():

  • Naming is permissive, you can have duplicate names in a list, so be careful about indexing by name since R returns only the first equivalently named observation from the list.

  • enlist() has input parameter .label, that takes a naming function. Internally, .label is passed to nm in rlang::set_names() and so takes the same inputs. For example: enlist('hi','hello', .label = ~paste0(.,'_',1:2)).

  • enlist() supports non-standard evaluation, e.g. enlist(!!!letters).

  • Ignores empty inputs, e.g. enlist(,,'hi',,).

Some clipboard prep notes:

The next 3 xlr functions, use the clipboard memory. To use the clipboard, install the clipr package and then check that the package works by running clipr::dr_clipr(). Also, if using Linux, like me, make sure to install a clipboard tool via your terminal, e.g. sudo apt-get install xclip.


copy_for_xl()

copy_for_xl() copies data to your clipboard memory that’s formatted for pasting to cells in a spreadsheet. You can pass a data object to it, like:

copy_for_xl(mtcars)

The copy_for_xl() key-chord:

However, a better copying method is to use the quick keys ctrl + alt + shift + c. If you haven’t already done so, run the function, set_xlr_key_chords() to use these quick keys.

Then:

  1. Type mtcars in an RStudio text editor window (not in the console window)

  2. Use your cursor to highlight the word mtcars, and

  3. Press ctrl + alt + shift + c all together. (It’s just like copying with ctrl + c, but instead use all 3 control keys, plus the c.)

You should see ✔ Table copied to clipboard: written in the console. Open a spreadsheet and paste the copied mtcars data to it.

The below video example uses copy_for_xl(), via the console, to copy some starwars data.


list_iron()

list_iron() takes any nested list of lists and irons the input out to get a single-file list of objects.

Because an ‘.xlsx’ workbook can only make spreadsheets out of a simple, flat list of data objects, list_iron() recursively smooshes any additional list structure out of the input. list_iron() also has a few convenience features for naming the list objects and applying a function to each. For example, consider the ridiculously embedded list structure below and list_iron’s simplification of it:

ridiculous <- list(list(car_data = mtcars, list(list(flower_data = iris))))

# Original list depth
ridiculous |> str()
#> List of 1
#>  $ :List of 2
#>   ..$ car_data:'data.frame': 32 obs. of  11 variables:
#>   .. ..$ mpg : num [1:32] 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
#>   .. ..$ cyl : num [1:32] 6 6 4 6 8 6 8 4 4 6 ...
#>   .. ..$ disp: num [1:32] 160 160 108 258 360 ...
#>   .. ..$ hp  : num [1:32] 110 110 93 110 175 105 245 62 95 123 ...
#>   .. ..$ drat: num [1:32] 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
#>   .. ..$ wt  : num [1:32] 2.62 2.88 2.32 3.21 3.44 ...
#>   .. ..$ qsec: num [1:32] 16.5 17 18.6 19.4 17 ...
#>   .. ..$ vs  : num [1:32] 0 0 1 1 0 1 0 1 1 1 ...
#>   .. ..$ am  : num [1:32] 1 1 1 0 0 0 0 0 0 0 ...
#>   .. ..$ gear: num [1:32] 4 4 4 3 3 3 3 4 4 4 ...
#>   .. ..$ carb: num [1:32] 4 4 1 1 2 1 4 2 2 4 ...
#>   ..$         :List of 1
#>   .. ..$ :List of 1
#>   .. .. ..$ flower_data:'data.frame':    150 obs. of  5 variables:
#>   .. .. .. ..$ Sepal.Length: num [1:150] 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
#>   .. .. .. ..$ Sepal.Width : num [1:150] 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
#>   .. .. .. ..$ Petal.Length: num [1:150] 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
#>   .. .. .. ..$ Petal.Width : num [1:150] 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
#>   .. .. .. ..$ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...

# Ironed list depth
ridiculous |> list_iron() |> str()
#> List of 2
#>  $ ridiculous|car_data   :'data.frame':  32 obs. of  11 variables:
#>   ..$ mpg : num [1:32] 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
#>   ..$ cyl : num [1:32] 6 6 4 6 8 6 8 4 4 6 ...
#>   ..$ disp: num [1:32] 160 160 108 258 360 ...
#>   ..$ hp  : num [1:32] 110 110 93 110 175 105 245 62 95 123 ...
#>   ..$ drat: num [1:32] 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
#>   ..$ wt  : num [1:32] 2.62 2.88 2.32 3.21 3.44 ...
#>   ..$ qsec: num [1:32] 16.5 17 18.6 19.4 17 ...
#>   ..$ vs  : num [1:32] 0 0 1 1 0 1 0 1 1 1 ...
#>   ..$ am  : num [1:32] 1 1 1 0 0 0 0 0 0 0 ...
#>   ..$ gear: num [1:32] 4 4 4 3 3 3 3 4 4 4 ...
#>   ..$ carb: num [1:32] 4 4 1 1 2 1 4 2 2 4 ...
#>  $ ridiculous|flower_data:'data.frame':  150 obs. of  5 variables:
#>   ..$ Sepal.Length: num [1:150] 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
#>   ..$ Sepal.Width : num [1:150] 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
#>   ..$ Petal.Length: num [1:150] 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
#>   ..$ Petal.Width : num [1:150] 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
#>   ..$ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...

Notice the auto-naming of the list items above, i.e. “ridiculous|car_data”, “ridiculous|flower_data”. Control the naming with name_spec and name_repair. You can also apply a function, with parameter .f, while flattening out all of the leaves in the lists , as shown below:

list_iron(ridiculous, .f = ~tail(.,2))
#> $`ridiculous|car_data`
#>                mpg cyl disp  hp drat   wt qsec vs am gear carb
#> Maserati Bora 15.0   8  301 335 3.54 3.57 14.6  0  1    5    8
#> Volvo 142E    21.4   4  121 109 4.11 2.78 18.6  1  1    4    2
#> 
#> $`ridiculous|flower_data`
#>     Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
#> 149          6.2         3.4          5.4         2.3 virginica
#> 150          5.9         3.0          5.1         1.8 virginica


entibble()

entibble() makes a tibble with defaults useful for spreadsheets. Defaults include:

  • automatically presenting rownames, if they exist, as the first column of the tibble
  • not repeating input elements to obtain common lengths when producing a dataframe
  • treating a list of data inputs equivalent to comma-separated data inputs

For instance, compare the tibbled and entibbled data below.

alpha <- rlang::set_names(letters, LETTERS)

# rownames are forced in so they show up in spreadsheets
entibble(alpha) |> head(2)
#> # A tibble: 2 × 2
#>   rowname alpha
#>   <chr>   <chr>
#> 1 A       a    
#> 2 B       b
tibble::tibble(alpha) |> head(2)
#> # A tibble: 2 × 1
#>   alpha
#>   <chr>
#> 1 a    
#> 2 b

# lists passed to entibble are dumped and their objects joined like comma-separated 
# inputs if the listed objects have compatible row dimensions
entibble(list(alpha, letters)) |> tail(2)
#> # A tibble: 2 × 3
#>   rowname `1`   `2`  
#>   <chr>   <chr> <chr>
#> 1 Y       y     y    
#> 2 Z       z     z

# compare the above to `tibble()`
tibble::tibble(list(alpha, letters)) |> tail(2)
#> # A tibble: 2 × 1
#>   `list(alpha, letters)`
#>   <list>                
#> 1 <chr [26]>            
#> 2 <chr [26]>