Quirks and Oddities using R in Displayr

This page describes some aspects of how R works in Displayr that differ from how it works in other programs.

Only variable names and R Output names that appear in code are accessible within code

When R code is run, both the code and any associated data are sent to the R server, which then returns the result. The associated data are identified by scanning the code for the names of Variables, Variable Sets, and R Outputs. Data whose names do not appear in the code are not uploaded.

As an example, consider two R Outputs. The first contains myVariable <- 10. This causes an R Output to be created with a Name of myVariable and a value of 10. The second contains the following code:

myVariable * eval(parse(text = "myVariable")) * eval(parse(text = paste0("my","Variable")))

This will return a value of 1,000, as it retrieves the value 10 three times, and multiplies them together. This is identical to how R normally works.

By contrast, if the second R Output contains:

eval(parse(text = "myVariable")) * eval(parse(text = paste0("my","Variable")))

we get the error object 'myVariable' not found. This is because the R code never contains the Variable Name itself. Instead, it contains strings that are converted into a variable name only when the code runs, so the scan for names does not detect them.
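The effect can be illustrated in plain R. The sketch below is not the actual scanning mechanism, which is not described on this page. It assumes, for illustration only, that a scan looks at the symbols in the parsed code, and it uses base R's all.vars() to list them. The helper name names_in_code is hypothetical.

```r
# Hypothetical helper: list the variable names that appear as symbols in a
# piece of R code. This only approximates the kind of scan described above.
names_in_code <- function(code) {
    all.vars(parse(text = code))
}

# The name appears literally in the code, so it is found.
with_name <- 'myVariable * eval(parse(text = "myVariable"))'
found_with_name <- names_in_code(with_name)

# The name appears only inside strings, so no variable names are found.
without_name <- 'eval(parse(text = "myVariable")) * eval(parse(text = paste0("my", "Variable")))'
found_without_name <- names_in_code(without_name)

print(found_with_name)
print(found_without_name)
```

Under this assumption, only the first snippet exposes myVariable to the scan, which matches the behaviour described above.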

In situations where it is necessary to use code of this form, a workaround is to include an otherwise unnecessary reference to the Variable Names at the beginning of the code block, which will cause the data to be uploaded. For example, the following code will return the correct result of 100:

tempVar = myVariable
eval(parse(text = "myVariable")) * eval(parse(text = paste0("my","Variable")))

References to public, external or other non-local variables in functions

Consider the following R Output:

f <- 1
b <- function()
{
        print(f)
}

Where an R Output contains a function, or ends with a function, that function can be used in other places that run R code (i.e., a different R Output, an R Variable, or when creating an R Data Set). However, if the function refers to a variable that is defined neither within the function nor in its signature, such as f in this example, an error will typically occur (the exception is described below). For example, an R Output containing b() will return an error of object 'f' not found.

A similar error will occur if, for example, f <- 1 appears in a separate R Output with a name of f. In this case, a longer error message is provided, for example: "An R Output called 'f' has been used in the R code. Although this does exist in this document, it is not available where it is referred to. A common cause of this is where referring to objects within a function. A solution is to change the function to accept 'f' (e.g., function(..., f))."

There are two ways to work around this:

  1. As described in the longer error message, pass the variables into the function as arguments (e.g., b <- function(f)).
  2. When calling the function, refer to the variable elsewhere in the same R Output, as in the example below. This works because any variables referred to within an R Output, other than within a function, are automatically loaded before the R Output is computed.

z = f
b()
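The first workaround can be sketched in plain R as follows. This is a minimal illustration rather than a prescribed pattern: f is defined locally only so that the block runs on its own, whereas in practice it would typically be a separate R Output, and the invisible() return value is an addition made here so the result can be checked.

```r
# f is defined here only so that the sketch runs on its own.
f <- 1

# f is now part of the signature, so the function no longer depends on a
# variable defined outside it.
b <- function(f) {
    print(f)
    invisible(f)
}

# The caller names f explicitly, outside any function body.
result <- b(f)
```

Because the call b(f) mentions f outside of a function body, this form also satisfies the condition described in the second workaround.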

Recursive functions

Permitted recursions

In Displayr, it is OK to write recursive R functions. For example:

fib = function(i) {
    if (i < 2)
        i
    else
        fib(i-1) + fib(i-2)
}

It is also OK to write mutually recursive R functions. So, for example, the above function could have been written as:

fibsum = function(k, j) {
    fib(k) + fib(j)
}

fib = function(i) {
    if (i < 2)
        i
    else
        fibsum(i-1, i-2)
}
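As a quick check in plain R, the two formulations can be compared side by side. In this sketch the directly recursive version is renamed fib_direct (a name chosen here for illustration) so that both can live in the same block.

```r
# Directly recursive version.
fib_direct <- function(i) {
    if (i < 2) i else fib_direct(i - 1) + fib_direct(i - 2)
}

# Mutually recursive version: fib calls fibsum, and fibsum calls fib.
fibsum <- function(k, j) {
    fib(k) + fib(j)
}
fib <- function(i) {
    if (i < 2) i else fibsum(i - 1, i - 2)
}

# Evaluate both versions on the same inputs.
direct_values <- sapply(0:10, fib_direct)
mutual_values <- sapply(0:10, fib)
print(mutual_values)
```

Both versions produce the same sequence, because fibsum only adds together two calls to fib.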

Problematic recursion

However, Displayr requires these two functions to be defined within the same R item. That is, it is not OK to create two R items:

fibsum = function(k, j) {
    fib(k) + fib(j)
}

and

fib = function(i) {
    if (i < 2)
        i
    else
        fibsum(i-1, i-2)
}

as each requires the other to be defined before it. If two (or more) functions are defined in terms of each other, then Displayr will complain of mutual recursion, and suggest that you solve this by either rewriting the code to eliminate the recursion, or by defining all associated functions within the one R item.
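One way to eliminate the recursion altogether is to compute the sequence with a loop. The following is a minimal sketch of that option, using the hypothetical name fib_iter. It is one possible rewrite, not the only one.

```r
# Iterative version: no function calls itself or any other user-defined
# function, so there is nothing that needs to be defined beforehand.
fib_iter <- function(i) {
    if (i < 2) {
        return(i)
    }
    previous <- 0
    current <- 1
    for (step in 2:i) {
        following <- previous + current
        previous <- current
        current <- following
    }
    current
}

iter_values <- sapply(0:10, fib_iter)
print(iter_values)
```

Because fib_iter refers to no other function defined by the user, it can sit in an R item of its own.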

Recursive functions and implications for S3 classes

Displayr achieves S3 class functionality by making available to R not only the objects that are explicitly referenced in the code, but also objects whose names begin with the name of an object referred to in the code. That is, if Displayr sees the code

plot(mydata)

and if there are functions which have been defined with the "plot." prefix, such as "plot.myclass", then these functions are made available for the R server to use when evaluating "plot(mydata)". If "mydata" is of class "myclass", then "plot.myclass" will be called, giving the functionality of S3 classes. On the other hand, if "mydata" is not of class "myclass", then "plot.myclass" will be ignored when evaluating "plot(mydata)".
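The dispatch itself is standard R behaviour and can be seen in a small plain R sketch. The method below is hypothetical and returns a message instead of drawing anything, purely so that the dispatch is easy to observe.

```r
# Hypothetical method for objects of class "myclass". It returns a message
# instead of drawing so that the dispatch is easy to see.
plot.myclass <- function(x, ...) {
    "plot.myclass was called"
}

# A small object tagged with the class "myclass".
mydata <- structure(list(values = c(1, 2, 3)), class = "myclass")

# plot() is a generic function, so R looks for plot.myclass and calls it.
result <- plot(mydata)
print(result)
```

The call contains only the name plot, yet plot.myclass is the function that runs, which is why functions with the "plot." prefix need to be made available as well.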

This has an implication when it comes to recursion, because mutual recursion (above) may be present although not necessarily apparent. For example, two objects defined as

auxiliary_plotting = function(x) {plot(x)}

and

plot.myclass = function(x) {auxiliary_plotting(x)}

exhibit mutual recursion. As the definition of "auxiliary_plotting" refers to "plot", which may mean "plot.myclass", there is an implicit reference to "plot.myclass". For this reason, "auxiliary_plotting" implicitly requires "plot.myclass" and "plot.myclass" requires "auxiliary_plotting", resulting in mutual recursion.