This guide first briefly describes the size processing system in NIMBLE. It then describes the process of adding a new size processor and lists some commonly used size processors. Note that NIMBLE’s size processing functionality can largely be found in genCpp_sizeProcessing.R.
The size processing step in NIMBLE proceeds by traveling recursively through the syntax tree, annotating each node with dimensionality, type, size expressions, and eigenizability (“yes”, “no”, or “maybe” for conversion to C++ Eigen package code). It also generates run-time size checks, inserts intermediate variables, and populates the local symbol table with locally created objects.
Nodes of the syntax tree are represented as exprClass objects. Each node of the syntax tree (as given by the code argument to a size processor) should exit the size processor with the following fields set:
- code$nDim: the dimension of the object. Must be an integer.
- code$type: the type of the object, e.g. “double”.
- code$sizeExprs: usually, a list of length nDim, where entries are R expressions (not exprClass objects) for the size of the object in that dimension. List entries can be constants, e.g. sizeExprs[[1]] = 5 if it is known that the first dimension of this object will have length 5. Alternatively, a sizeExprs list entry could be a generic expression, e.g. sizeExprs[[1]] <- quote(dim(x)[1]), ensuring that the first dimension of this object will be the same size as the first dimension of the x object. In some instances sizeExprs should be set using one of two functions:
  - The productSizeExprs() function should be used if the sizeExprs for the current node should be a single dimension, the size of which is the product of the dimensions of the sizeExprs of another node.
  - The makeSizeExpressions() function (located in genCpp_initSizes.R) should be used when sizeExprs for a node are a combination of constants and expressions.

  For a nimbleList or a nimbleFunction, code$sizeExprs will contain the symbolTable entry of the corresponding nimbleList or nimbleFunction. This allows information from these objects (e.g. the elements of a nimbleList) to be easily accessed in other parts of size processing.
- code$toEigenize: either “yes”, “no”, or “maybe”. Indicates whether to convert the node to a type from the C++ Eigen package.
- code$name: the name of the node being size processed. For a function call, say foo(x), code$name will be "foo". Typically the name is already set and does not need to be modified, but sometimes a size processor will split out different cases for C++ by changing the name. E.g. values can be changed to setValues.

Two useful fields that should only be modified with care are:
- code$caller: the exprClass of the call in which the current call is nested.
- code$callerArgID: the integer index of which argument the current call is to its caller. It should be that identical(code, code$caller$args[[ code$callerArgID ]]) is TRUE.

Note that some or all of these fields may be set by helper functions called within a size processor (e.g. makeSizeExpressions()), so the size processor itself does not always need to set them explicitly.
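As a rough illustration of the fields listed above, the following sketch builds a toy syntax tree for foo(bar(x), 42) in base R. It does not use NIMBLE's exprClass; the newNode() constructor and the hand-filled field values are assumptions made only for this example.

```r
# Toy stand-in for an exprClass node: an environment, so nodes have
# reference semantics. This is NOT NIMBLE's exprClass definition.
newNode <- function(name, args = list()) {
  node <- new.env()
  node$name <- name
  node$args <- args
  # Wire up caller / callerArgID for each child that is itself a node.
  for (i in seq_along(args)) {
    if (is.environment(args[[i]])) {
      args[[i]]$caller <- node
      args[[i]]$callerArgID <- i
    }
  }
  node
}

x   <- newNode("x")
bar <- newNode("bar", list(x))
foo <- newNode("foo", list(bar, 42))   # second argument is a plain constant

# Fields a size processor would be expected to leave on the node for bar(x),
# filled in by hand here for a length-5 double vector.
bar$nDim       <- 1L
bar$type       <- "double"
bar$sizeExprs  <- list(5)              # could also be list(quote(dim(x)[1]))
bar$toEigenize <- "maybe"

# The invariant linking a node to its caller.
invariantHolds <- identical(bar, bar$caller$args[[bar$callerArgID]])
```

Here invariantHolds is TRUE because environments are compared by reference, which mirrors the idea that code and code$caller$args[[ code$callerArgID ]] are the same object.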
Typically the first step in a size processor is to recurse on its arguments (see below), which means that the arguments can be counted on to have the above fields set. For example, code$args[[1]]$nDim will be the number of dimensions of the first argument (or of whatever it returns).
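The recurse-first pattern can be sketched with a toy model in base R. Every name below (toyNode, toyRecurse, toySetSizes, toySizeBinaryCwise) is hypothetical and only mimics the roles of the real functions; the size rule used for the component-wise operation is a simplification chosen for the example.

```r
# Toy node constructor (hypothetical; not NIMBLE's exprClass).
toyNode <- function(name, args = list(), nDim = NULL, sizeExprs = NULL) {
  node <- new.env()
  node$name <- name
  node$args <- args
  node$nDim <- nDim
  node$sizeExprs <- sizeExprs
  node
}

# Toy recursion helper: annotate every argument that is a node and
# collect whatever asserts the nested processors return.
toyRecurse <- function(code) {
  asserts <- list()
  for (arg in code$args) {
    if (is.environment(arg)) asserts <- c(asserts, toySetSizes(arg))
  }
  asserts
}

# Toy entry point: leaves are assumed to be annotated already; calls go
# to the component-wise processor below.
toySetSizes <- function(code) {
  if (length(code$args) == 0) return(list())
  toySizeBinaryCwise(code)
}

# Toy processor for a component-wise binary operation such as A + B.
toySizeBinaryCwise <- function(code) {
  asserts <- toyRecurse(code)          # step 1: recurse on the arguments
  a <- code$args[[1]]                  # a$nDim and a$sizeExprs are now set
  b <- code$args[[2]]
  code$nDim <- max(a$nDim, b$nDim)
  code$sizeExprs <- if (a$nDim >= b$nDim) a$sizeExprs else b$sizeExprs
  code$type <- "double"
  asserts
}

A <- toyNode("A", nDim = 2L, sizeExprs = list(quote(dim(A)[1]), quote(dim(A)[2])))
B <- toyNode("B", nDim = 2L, sizeExprs = list(quote(dim(B)[1]), quote(dim(B)[2])))
inner <- toyNode("+", list(A, B))
outer <- toyNode("+", list(inner, A))  # (A + B) + A
asserts <- toySetSizes(outer)
```

Because the processor for outer recurses before reading inner$nDim, the inner node is already annotated by the time its fields are needed.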
Size processors collect additional expressions, called asserts, that are later inserted into the syntax tree and become lines of code before or after the line being processed. asserts may be collected for each line of code that is size processed, and the lines of code generated by the asserts can be inserted either before (by default) or after (if wrapped in after()) the expression they are assert-ed from. asserts are frequently run-time size checks, but can also be used to create intermediate variables, among other uses. For an example of generating asserts that go both before and after a line of code, see the sizeasDoublePtr() size processor.
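The before/after placement of asserts can be illustrated with plain R language objects. This is a minimal sketch, not NIMBLE's insertion code: spliceAsserts() is a hypothetical helper, and the size check and cleanup() call are invented for the example.

```r
# Toy splice: place assert expressions around one line of code.
# Asserts wrapped in after(...) go after the line; all others go before.
spliceAsserts <- function(line, asserts) {
  isAfter <- vapply(
    asserts,
    function(a) is.call(a) && identical(a[[1]], as.name("after")),
    logical(1)
  )
  before <- asserts[!isAfter]
  after  <- lapply(asserts[isAfter], function(a) a[[2]])  # unwrap after(...)
  c(before, list(line), after)
}

line <- quote(y <- foo(Interm1))
asserts <- list(
  quote(Interm1 <- bar(x)),                     # intermediate variable, goes before
  quote(if (dim(x)[1] != 5) stop("bad size")),  # hypothetical run-time size check
  quote(after(cleanup(Interm1)))                # hypothetical call, goes after
)
result <- spliceAsserts(line, asserts)
```

The result is a list of four expressions: the two unwrapped asserts, the original line, and then the expression that was wrapped in after().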
A size processor takes three arguments:

- code: the expression class object representing the node of the syntax tree.
- symTab: the symbol table for the nimbleFunction method that is currently being size processed.
- typeEnv: an environment originally designed for size expressions for objects known from initialization. It is now also used to store flags that are set in one step and need to be seen in another. See additional information at the top of genCpp_initSizes.R.

A size processor should set the code fields described in Size Processing Overview.

If another size processor is called from within the new size processor (e.g. sizeInsertIntermediate() is commonly called from within a size processor), be sure to collect the asserts returned by that size processor. The asserts are a list, so they can just be concatenated, e.g., asserts <- c(asserts, recursiveCallOfSomeKind()). Size processing functions can also create their own asserts expressions. Any asserts either created or collected within a size processor should be returned from that size processor. If it is known that no asserts exist at the end of a size processor, an empty list() should be returned.
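A skeleton of the collect-and-return convention might look like the following. It is only a sketch: toyLiftHelper() stands in for a nested size processor, and sizeToyFoo(), sizeToyNoop(), and the size check are hypothetical.

```r
# Hypothetical stand-in for a nested helper that returns a list of asserts.
toyLiftHelper <- function(code) {
  list(quote(Interm1 <- bar(x)))
}

# Skeleton of a size processor: collect, create, and return asserts.
sizeToyFoo <- function(code, symTab, typeEnv) {
  asserts <- list()
  # Asserts returned by a nested call are concatenated onto the list.
  asserts <- c(asserts, toyLiftHelper(code))
  # An assert created by this processor; wrapped in list() so that the
  # call is appended as a single element.
  check <- quote(if (length(Interm1) != 5) stop("size mismatch"))
  asserts <- c(asserts, list(check))
  asserts
}

# A processor with nothing to add still returns an empty list.
sizeToyNoop <- function(code, symTab, typeEnv) list()

collected <- sizeToyFoo(NULL, NULL, NULL)
```

Returning a list in every case means callers can always write asserts <- c(asserts, sizeToyFoo(...)) without checking for NULL.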
Any new size processing function must be added to the sizeCalls list, located at the top of genCpp_sizeProcessing.R. The entry should be of the form fxnName = 'sizeProcessorName', where fxnName is the name of the DSL function to be size processed, and 'sizeProcessorName' is a character string naming the new size processing function.
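The name-to-processor lookup can be pictured with a small dispatch table. The sketch below is a toy: toySizeCalls, toyDispatch(), and both processors are hypothetical and do not reproduce how NIMBLE consults sizeCalls internally.

```r
# Toy registry mapping DSL function names to size processor names.
# Both entries are hypothetical illustrations.
toySizeCalls <- list(
  foo = 'sizeFoo',
  '+' = 'sizeBinaryCwiseToy'
)

sizeFoo <- function(code, symTab, typeEnv) {
  code$handledBy <- "sizeFoo"
  list()
}
sizeBinaryCwiseToy <- function(code, symTab, typeEnv) {
  code$handledBy <- "sizeBinaryCwiseToy"
  list()
}

# Toy dispatcher: look up the processor by the node's name and call it.
toyDispatch <- function(code, symTab = NULL, typeEnv = NULL) {
  processorName <- toySizeCalls[[code$name]]
  if (is.null(processorName)) return(list())   # no processor registered
  do.call(processorName, list(code, symTab, typeEnv))
}

fooNode <- new.env(); fooNode$name <- "foo"
plusNode <- new.env(); plusNode$name <- "+"
otherNode <- new.env(); otherNode$name <- "bar"
invisible(toyDispatch(fooNode))
invisible(toyDispatch(plusNode))
invisible(toyDispatch(otherNode))
```

Storing the processor as a character string, as in the fxnName = 'sizeProcessorName' form, lets the table be defined before the processor functions themselves.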
A size processor for a special-case function foo would typically be called sizeFoo. Some size processors are used for groups of functions. E.g., sizeBinaryCwise is for component-wise binary operations, such as A + B.
- exprClasses_setSizes(): This is the entry point for any size processing. In some cases it is called explicitly for recursion, but usually recursion is done via a call to recurseSetSizes. See the note above the exprClasses_setSizes() function in genCpp_sizeProcessing.R for more information.
- sizeInsertIntermediate(): Used to lift an expression, creating an intermediate variable. Especially useful in situations where a part of a line of code needs to be eigenized. E.g., if we have foo(bar(x), z) and the size processor determines that bar(x) needs to be evaluated outside of the foo expression, using asserts <- c(asserts, sizeInsertIntermediate(code, 1, symTab, typeEnv)) will result in Interm32 <- bar(x) as an assertion (where 32 is an arbitrary unique integer) and foo(Interm32, z) as the current code. Note that bar(x) should already have been processed (by recursion) prior to lifting it to an intermediate.
- sizeAssignAfterRecursing(): The main size processor for assignment operations. Called after the right hand side of an assignment operation has been annotated. This processor considers all valid combinations of left hand and right hand side types for assignment operations, and as such is rather long.
- recurseSetSizes(code, symTab, typeEnv): Recursively annotates arguments of a node in the syntax tree. Should generally be called at the beginning of a size processor if that node’s arguments may need processing. If not all arguments should be recursed into, the optional fourth argument useArgs (logical vector corresponding to arguments) can be given.

Do not assume that code$args[[2]]$nDim is valid, because code$args[[2]] could be a constant (like 42). So typically we’d use something like:

    arg2nDim <- if(inherits(code$args[[2]], 'exprClass')) code$args[[2]]$nDim else 0
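The guard above can be wrapped in a small helper so it is not repeated for every argument. This is a sketch only: getArgNDim() is a hypothetical name, and the toy objects below merely carry the exprClass class attribute rather than being real exprClass objects.

```r
# Return the number of dimensions of the i-th argument, treating
# non-exprClass arguments (e.g. numeric constants) as scalars.
getArgNDim <- function(code, i) {
  arg <- code$args[[i]]
  if (inherits(arg, 'exprClass')) arg$nDim else 0
}

# Toy objects that only mimic the class attribute of an exprClass.
toyArg  <- structure(list(nDim = 2), class = 'exprClass')
toyCode <- list(args = list(toyArg, 42))

nDimFirst  <- getArgNDim(toyCode, 1)   # argument is a node
nDimSecond <- getArgNDim(toyCode, 2)   # argument is the constant 42
```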
Suppose the size processor for foo(x) knows that foo(x) can never appear in any expression except assignment. E.g. y <- foo(x) is ok but w <- foo(x)$q + 3 is not ok. The size processor for foo can check whether code$caller$name %in% assignmentOperators and, if not, it can use sizeInsertIntermediate to lift itself out of its caller expression. An example is in generalFunSizeHandler.

What about an RCfunction or a member function of another nimbleFunction? Near the end of exprClasses_setSizes, RCfunctions not in sizeCalls are found in the user environment and added to neededRCfuns. Then the generalFunSizeHandler is called. This does not check types, but it lifts any argument expressions and sometimes lifts itself out of other expressions.

In the syntax tree, nf$a appears as NFvar(nf, 'a') and is handled by sizeNFvar. Similarly, nf$foo(x) becomes nfMethod(nf, 'foo')(x), which is handled by sizeChainedCall.
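The idea of lifting an expression into an intermediate variable, as described for sizeInsertIntermediate, can be sketched on plain R calls. The code below is a toy under stated assumptions: toyLiftArg() and toyCounter() are hypothetical, operate on R call objects rather than exprClass nodes, and use a simple counter for the intermediate name.

```r
# Toy counter for unique intermediate names (hypothetical naming scheme).
toyCounter <- local({
  n <- 0
  function() {
    n <<- n + 1
    n
  }
})

# Lift argument argID of a call into an intermediate variable.
# Returns the rewritten call plus the assert that creates the intermediate.
toyLiftArg <- function(expr, argID) {
  intermName <- as.name(paste0("Interm", toyCounter()))
  # Element 1 of a call is the function name, so argument argID is at argID + 1.
  assert <- call("<-", intermName, expr[[argID + 1]])
  expr[[argID + 1]] <- intermName
  list(code = expr, asserts = list(assert))
}

lifted <- toyLiftArg(quote(foo(bar(x), z)), 1)
```

After the call, lifted$code is foo(Interm1, z) and lifted$asserts holds Interm1 <- bar(x), which matches the shape of the foo(Interm32, z) example in the text apart from the arbitrary integer.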