Functions in-depth

From Goodblox Wiki

Introduction

Functions are first-class values in Lua. That means that functions can be stored in variables, passed as arguments to other functions, and returned as results. Such facilities give great flexibility to the language: a program may redefine a function to add new functionality, or simply erase a function to create a secure environment when running a piece of untrusted code (such as code received through a network). Moreover, Lua offers good support for functional programming, including nested functions with proper lexical scoping, as later sections show. Finally, first-class functions play a key role in Lua's object-oriented facilities.

Lua can call functions written in Lua and functions written in C. All the standard library in Lua is written in C. It comprises functions for string manipulation, table manipulation, I/O, access to basic operating system facilities, mathematical functions, and debugging. Application programs may define other functions in C.

Functions are the main mechanism for abstraction of statements and expressions in Lua. Functions can either carry out a specific task (what is sometimes called a procedure or subroutine in other languages) or compute and return values. In the first case, we use a function call as a statement; in the second case, we use it as an expression:

    print(8*9, 9/8)
    a = math.sin(3) + math.cos(10)

Will result in:

    72      1.125

In both cases, we write a list of arguments enclosed in parentheses. If the function call has no arguments, we must write an empty list () to indicate the call. There is a special case to this rule: If the function has one single argument and this argument is either a literal string or a table constructor, then the parentheses are optional:

    print "Hello World"     <-->     print("Hello World")
    print [[a multi-line    <-->     print([[a multi-line
     message]]                        message]])
    f{x=10, y=20}           <-->     f({x=10, y=20})
    type{}                  <-->     type({})

Lua also offers a special syntax for object-oriented calls, the colon operator. An expression like o:foo(x) is just another way to write o.foo(o, x), that is, to call o.foo adding o as a first extra argument.
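The equivalence between the colon call and the dot call can be checked with a small sketch. The account table and its deposit function are hypothetical names introduced only for this illustration.

```lua
-- account and deposit are hypothetical, used only to illustrate the colon syntax.
account = { balance = 100 }

function account.deposit(self, amount)
  self.balance = self.balance + amount
  return self.balance
end

-- Both calls pass `account` as the first argument.
print(account.deposit(account, 50))   --> 150
print(account:deposit(25))            --> 175
```

The colon form is only a shorter spelling: the function itself is an ordinary function whose first parameter happens to receive the table.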

Functions used by a Lua program can be defined both in Lua and in C (or in any other language used by the host application). For instance, all library functions are written in C; but this fact has no relevance to Lua programmers. When calling a function, there is no difference between functions defined in Lua and functions defined in C.

As we have seen in other examples, a function definition has a conventional syntax; for instance

    -- add all elements of array `a'
    function add (a)
      local sum = 0
      for i,v in ipairs(a) do
        sum = sum + v
      end
      return sum
    end

In that syntax, a function definition has a name (add, in the previous example), a list of parameters, and a body, which is a list of statements. Parameters work exactly as local variables, initialized with the actual arguments given in the function call. You can call a function with a number of arguments different from its number of parameters. Lua adjusts the number of arguments to the number of parameters, as it does in a multiple assignment: Extra arguments are thrown away; extra parameters get nil. For instance, if we have a function like

   function f(a, b) return a or b end

we will have the following mapping from arguments to parameters:

    CALL             PARAMETERS
       
    f(3)             a=3, b=nil
    f(3, 4)          a=3, b=4
    f(3, 4, 5)       a=3, b=4   (5 is discarded)
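The mapping in the table above can be observed at run time. The following sketch uses a hypothetical helper, describe, which only reports what each parameter received.

```lua
-- describe is a hypothetical helper, not part of the original text.
function describe(a, b)
  return "a=" .. tostring(a) .. ", b=" .. tostring(b)
end

print(describe(3))         --> a=3, b=nil
print(describe(3, 4))      --> a=3, b=4
print(describe(3, 4, 5))   --> a=3, b=4   (5 is discarded)
```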

Although this behavior can lead to programming errors (easily spotted at run time), it is also useful, especially for default arguments. For instance, consider the following function, to increment a global counter.

    function incCount (n)
      n = n or 1
      count = count + n
    end

This function has 1 as its default argument; that is, the call incCount(), without arguments, increments count by one. When you call incCount(), Lua first initializes n with nil; the or results in its second operand; and as a result Lua assigns a default 1 to n.
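A runnable version of this idea might look like the sketch below. It assumes that the global count is initialized before the first call, which the fragment above leaves implicit; without that, count + n would fail on a nil value.

```lua
count = 0            -- assumption: the global must exist before the first call

function incCount(n)
  n = n or 1         -- a missing argument arrives as nil and becomes 1
  count = count + n
end

incCount()           -- uses the default
incCount(5)
print(count)         --> 6
```

Note that the `n or 1` idiom also replaces an explicit false with the default, so it suits parameters for which false is not a meaningful value.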


Multiple Results

An unconventional, but quite convenient feature of Lua is that functions may return multiple results. Several predefined functions in Lua return multiple values. An example is the string.find function, which locates a pattern in a string. It returns two indices: the index of the character where the pattern match starts and the one where it ends (or nil if it cannot find the pattern). A multiple assignment allows the program to get both results:

    s, e = string.find("hello Lua users", "Lua")
    
    print(s, e)   -->  7      9

Functions written in Lua also can return multiple results, by listing them all after the return keyword. For instance, a function to find the maximum element in an array can return both the maximum value and its location:

    function maximum (a)
      local mi = 1          -- maximum index
      local m = a[mi]       -- maximum value
      for i,val in ipairs(a) do
        if val > m then
          mi = i
          m = val
        end
      end
      return m, mi
    end
    
    print(maximum({8,10,23,12,5}))

Will result in:

    23   3


Lua always adjusts the number of results from a function to the circumstances of the call. When we call a function as a statement, Lua discards all of its results. When we use a call as an expression, Lua keeps only the first result. We get all results only when the call is the last (or the only) expression in a list of expressions. These lists appear in four constructions in Lua: multiple assignment, arguments to function calls, table constructors, and return statements. To illustrate all these uses, we will assume the following definitions for the next examples:

    function foo0 () end                  -- returns no results
    function foo1 () return 'a' end       -- returns 1 result
    function foo2 () return 'a','b' end   -- returns 2 results

In a multiple assignment, a function call as the last (or only) expression produces as many results as needed to match the variables:

    x,y = foo2()        -- x='a', y='b'
    x = foo2()          -- x='a', 'b' is discarded
    x,y,z = 10,foo2()   -- x=10, y='a', z='b'

If a function has no results, or not as many results as we need, Lua produces nils:

    x,y = foo0()      -- x=nil, y=nil
    x,y = foo1()      -- x='a', y=nil
    x,y,z = foo2()    -- x='a', y='b', z=nil

A function call that is not the last element in the list always produces one result:

    x,y = foo2(), 20      -- x='a', y=20
    x,y = foo0(), 20, 30  -- x=nil, y=20, 30 is discarded

When a function call is the last (or the only) argument to another call, all results from the first call go as arguments. We have seen examples of this construction already, with print:

    print(foo0())          -->
    print(foo1())          -->  a
    print(foo2())          -->  a   b
    print(foo2(), 1)       -->  a   1
    print(foo2() .. "x")   -->  ax         (see below)

When the call to foo2 appears inside an expression, Lua adjusts the number of results to one; so, in the last line, only the "a" is used in the concatenation. The print function may receive a variable number of arguments. (In the next section we will see how to write functions with variable number of arguments.) If we write f(g()) and f has a fixed number of arguments, Lua adjusts the number of results of g to the number of parameters of f, as we saw previously.

A constructor also collects all results from a call, without any adjustments:

    a = {foo0()}         -- a = {}  (an empty table)
    a = {foo1()}         -- a = {'a'}
    a = {foo2()}         -- a = {'a', 'b'}

As always, this behavior happens only when the call is the last in the list; otherwise, any call produces exactly one result:

    a = {foo0(), foo2(), 4}   -- a[1] = nil, a[2] = 'a', a[3] = 4

Finally, a statement like return f() returns all values returned by f:

    function foo (i)
      if i == 0 then return foo0()
      elseif i == 1 then return foo1()
      elseif i == 2 then return foo2()
      end
    end
    
    print(foo(1))     --> a
    print(foo(2))     --> a  b
    print(foo(0))     -- (no results)
    print(foo(3))     -- (no results)

You can force a call to return exactly one result by enclosing it in an extra pair of parentheses:

    print((foo0()))        --> nil
    print((foo1()))        --> a
    print((foo2()))        --> a

Beware that a return statement does not need parentheses around the returned value, so any pair of parentheses placed there counts as an extra pair. That is, a statement like return (f()) always returns one single value, no matter how many values f returns. Maybe this is what you want, maybe not.

A special function with multiple returns is unpack. It receives an array and returns as results all elements from the array, starting from index 1:

    print(unpack{10,20,30})    --> 10   20   30
    a,b = unpack{10,20,30}     -- a=10, b=20, 30 is discarded

An important use for unpack is in a generic call mechanism. A generic call mechanism allows you to call any function, with any arguments, dynamically. In ANSI C, for instance, there is no way to do that. You can declare a function that receives a variable number of arguments (with stdarg.h) and you can call a variable function, using pointers to functions. However, you cannot call a function with a variable number of arguments: Each call you write in C has a fixed number of arguments and each argument has a fixed type. In Lua, if you want to call a variable function f with variable arguments in an array a, you simply write

    f(unpack(a))

The call to unpack returns all values in a, which become the arguments to f. For instance, if we execute

    f = string.find
    a = {"hello", "ll"}

then the call f(unpack(a)) returns 3 and 4, exactly the same as the static call string.find("hello", "ll"). Although the predefined unpack is written in C, we could write it also in Lua, using recursion:

    function unpack (t, i)
      i = i or 1
      if t[i] ~= nil then
        return t[i], unpack(t, i + 1)
      end
    end

The first time we call it, with a single argument, i gets 1. Then the function returns t[1] followed by all results from unpack(t, 2), which in turn returns t[2] followed by all results from unpack(t, 3), and so on, until the last non-nil element.
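As a hedged sketch of the generic call mechanism: the global unpack is available in Lua 5.0 and 5.1, while Lua 5.2 and later provide it as table.unpack, so the code below picks whichever one exists. The helper name call is hypothetical.

```lua
-- Use whichever unpack this Lua version provides (assumption: one of them exists).
local unpack = table.unpack or unpack

-- call is a hypothetical helper: it invokes fn with the arguments stored in the array args.
function call(fn, args)
  return fn(unpack(args))
end

print(call(string.find, {"hello", "ll"}))   --> 3   4
print(call(math.max, {4, 9, 2}))            --> 9
```

Because the call to unpack is the last argument in the call to fn, all of its results are passed on, as described in the rules for multiple results.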

Variable Number of Arguments

Some functions in Lua receive a variable number of arguments. For instance, we have already called print with one, two, and more arguments.

Suppose now that we want to redefine print in Lua: Perhaps our system does not have a stdout and so, instead of printing its arguments, print stores them in a global variable, for later use. We can write this new function in Lua as follows:

    printResult = ""
    
    function print (...)
      for i,v in ipairs(arg) do
        printResult = printResult .. tostring(v) .. "\t"
      end
      printResult = printResult .. "\n"
    end

The three dots (...) in the parameter list indicate that the function has a variable number of arguments. When this function is called, all its arguments are collected in a single table, which the function accesses as a hidden parameter named arg. Besides those arguments, the arg table has an extra field, n, with the actual number of arguments collected.

Sometimes, a function has some fixed parameters plus a variable number of parameters. Let us see an example. When we use a call to a function that returns multiple values inside an expression, only its first result is used. However, sometimes we want another result. A typical solution is to use dummy variables; for instance, if we want only the second result from string.find, we may write the following code:

    local _, x = string.find(s, p)
    -- now use `x'
    ...

An alternative solution is to define a select function, which selects a specific return from a function:

    print(string.find("hello hello", " hel"))         --> 6  9
    print(select(1, string.find("hello hello", " hel"))) --> 6
    print(select(2, string.find("hello hello", " hel"))) --> 9

Notice that a call to select always has one fixed argument, the selector, plus a variable number of extra arguments (the returns of a function). To accommodate this fixed argument, a function may have regular parameters before the dots. Then, Lua assigns the first arguments to those parameters and only the extra arguments (if any) go to arg. To better illustrate this point, assume a definition like

   function g (a, b, ...) end

Then, we have the following mapping from arguments to parameters:

   CALL            PARAMETERS
      
   g(3)             a=3, b=nil, arg={n=0}
   g(3, 4)          a=3, b=4, arg={n=0}
   g(3, 4, 5, 8)    a=3, b=4, arg={5, 8; n=2}

Using those regular parameters, the definition of select is straightforward:

    function select (n, ...)
      return arg[n]
    end
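The definitions above rely on the implicit arg table. In Lua 5.1 and later, `...` can be used directly as an expression, and from Lua 5.2 on the implicit arg table is no longer created. Those versions also ship a built-in select, which returns all arguments from position n onward rather than a single one, and select('#', ...) gives the argument count. The sketch below is one way to write the same idea in that style; pick and countArgs are hypothetical names, chosen so that the built-in select is not overwritten.

```lua
-- pick is a hypothetical name; it returns only the n-th extra argument.
function pick(n, ...)
  local extra = {...}          -- collect the extra arguments into a table
  return extra[n]
end

-- countArgs is also hypothetical; select('#', ...) counts arguments, nils included.
function countArgs(...)
  return select('#', ...)
end

print(pick(2, string.find("hello hello", " hel")))   --> 9
print(countArgs(3, nil, 5))                          --> 3
print(countArgs())                                   --> 0
```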


Named Arguments

The parameter passing mechanism in Lua is positional: When we call a function, arguments match parameters by their positions. The first argument gives the value to the first parameter, and so on. Sometimes, however, it is useful to specify the arguments by name. To illustrate this point, let us consider the function rename (from the os library), which renames a file. Quite often, we forget which name comes first, the new or the old; therefore, we may want to redefine this function to receive its two arguments by name:

   -- invalid code
   rename(old="temp.lua", new="temp1.lua")

Lua has no direct support for that syntax, but we can have the same final effect, with a small syntax change. The idea here is to pack all arguments into a table and use that table as the only argument to the function. The special syntax that Lua provides for function calls, with just one table constructor as argument, helps the trick:

   rename{old="temp.lua", new="temp1.lua"}

Accordingly, we define rename with only one parameter and get the actual arguments from this parameter:

    function rename (arg)
      return os.rename(arg.old, arg.new)
    end

This style of parameter passing is especially helpful when the function has many parameters, and most of them are optional. For instance, a function that creates a new window in a GUI library may have dozens of arguments, most of them optional, which are best specified by names:

    w = Window{ x=0, y=0, width=300, height=200,
                title = "Lua", background="blue",
                border = true
              }

The Window function then has the freedom to check for mandatory arguments, add default values, and the like. Assuming a primitive _Window function that actually creates the new window (and that needs all arguments), we could define Window as follows:

    function Window (options)
      -- check mandatory options
      if type(options.title) ~= "string" then
        error("no title")
      elseif type(options.width) ~= "number" then
        error("no width")
      elseif type(options.height) ~= "number" then
        error("no height")
      end
    
      -- everything else is optional
      _Window(options.title,
              options.x or 0,    -- default value
              options.y or 0,    -- default value
              options.width, options.height,
              options.background or "white",   -- default
              options.border      -- default is false (nil)
             )
    end
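Since _Window is only assumed to exist, the same pattern can be tried with a smaller, self-contained sketch. The function newLabel and its option names are hypothetical; it checks one mandatory option, fills in defaults, and returns a plain table.

```lua
-- newLabel is a hypothetical function using the same options-table style.
function newLabel(options)
  -- check the mandatory option
  if type(options.text) ~= "string" then
    error("no text")
  end
  -- everything else is optional
  return {
    text = options.text,
    size = options.size or 12,          -- default value
    color = options.color or "black",   -- default value
  }
end

local label = newLabel{text = "Lua", size = 20}
print(label.text, label.size, label.color)   --> Lua   20   black

-- A missing mandatory option raises an error, which pcall reports as false plus a message.
print(pcall(newLabel, {size = 8}))
```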


More about Functions

Functions in Lua are first-class values with proper lexical scoping.

What does it mean for functions to be "first-class values"? It means that, in Lua, a function is a value with the same rights as conventional values like numbers and strings. Functions can be stored in variables (both global and local) and in tables, can be passed as arguments, and can be returned by other functions.

What does it mean for functions to have "lexical scoping"? It means that functions can access variables of their enclosing functions. (It also means that Lua contains the lambda calculus properly.) As we will see in this chapter, this apparently innocuous property brings great power to the language, because it allows us to apply in Lua many powerful programming techniques from the functional-language world. Even if you have no interest at all in functional programming, it is worth learning a little about how to explore those techniques, because they can make your programs smaller and simpler.

A somewhat difficult notion in Lua is that functions, like all other values, are anonymous; they do not have names. When we talk about a function name, say print, we are actually talking about a variable that holds that function. Like any other variable holding any other value, we can manipulate such variables in many ways. The following example, although a little silly, shows the point:

    a = {p = print}
    a.p("Hello World") --> Hello World
    print = math.sin  -- `print' now refers to the sine function
    a.p(print(1))     --> 0.841470
    sin = a.p         -- `sin' now refers to the print function
    sin(10, 20)       --> 10      20

Later we will see more useful applications for this facility.

If functions are values, are there any expressions that create functions? Yes. In fact, the usual way to write a function in Lua, like

   function foo (x) return 2*x end

is just an instance of what we call syntactic sugar; in other words, it is just a pretty way to write

   foo = function (x) return 2*x end

That is, a function definition is in fact a statement (an assignment, more specifically) that assigns a value of type "function" to a variable. We can see the expression function (x) ... end as a function constructor, just as {} is a table constructor. We call the result of such function constructors an anonymous function. Although we usually assign functions to global names, giving them something like a name, there are several occasions when functions remain anonymous. Let us see some examples.

The table library provides a function table.sort, which receives a table and sorts its elements. Such a function must allow unlimited variations in the sort order: ascending or descending, numeric or alphabetical, tables sorted by a key, and so on. Instead of trying to provide all kinds of options, sort provides a single optional parameter, which is the order function: a function that receives two elements and returns whether the first must come before the second in the sort. For instance, suppose we have a table of records such as

     network = {
       {name = "grauna",  IP = "210.26.30.34"},
       {name = "arraial", IP = "210.26.30.23"},
       {name = "lua",     IP = "210.26.23.12"},
       {name = "derain",  IP = "210.26.23.20"},
     }

If we want to sort the table by the field name, in reverse alphabetical order, we just write

   table.sort(network, function (a,b)
     return (a.name > b.name)
   end)

See how handy the anonymous function is in that statement. A function that gets another function as an argument, such as sort, is what we call a higher-order function. Higher-order functions are a powerful programming mechanism and the use of anonymous functions to create their function arguments is a great source of flexibility. But remember that higher-order functions have no special rights; they are a simple consequence of the ability of Lua to handle functions as first-class values.
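To make the effect of different order functions visible, the following sketch sorts the same records twice. The helper sortedNames is hypothetical; it sorts the list in place with the given order function and returns the names joined in their new order.

```lua
network = {
  {name = "grauna",  IP = "210.26.30.34"},
  {name = "arraial", IP = "210.26.30.23"},
  {name = "lua",     IP = "210.26.23.12"},
  {name = "derain",  IP = "210.26.23.20"},
}

-- sortedNames is a hypothetical helper, not part of the table library.
function sortedNames(list, order)
  table.sort(list, order)
  local names = {}
  for i, entry in ipairs(list) do
    names[i] = entry.name
  end
  return table.concat(names, " ")
end

-- Reverse alphabetical order by name.
print(sortedNames(network, function (a, b) return a.name > b.name end))
--> lua grauna derain arraial

-- Ascending order by IP, compared as plain strings.
print(sortedNames(network, function (a, b) return a.IP < b.IP end))
--> lua derain arraial grauna
```

Only the anonymous order function changes between the two calls; the sorting code itself stays the same.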

Closures

When a function is written enclosed in another function, it has full access to local variables from the enclosing function; this feature is called lexical scoping. Although that may sound obvious, it is not. Lexical scoping, plus first-class functions, is a powerful concept in a programming language, but few languages support that concept.

Let us start with a simple example. Suppose you have a list of student names and a table that associates names to grades; you want to sort the list of names, according to their grades (higher grades first). You can do this task as follows:

    names = {"Peter", "Paul", "Mary"}
    grades = {Mary = 10, Paul = 7, Peter = 8}
    table.sort(names, function (n1, n2)
      return grades[n1] > grades[n2]    -- compare the grades
    end)

Now, suppose you want to create a function to do this task:

    function sortbygrade (names, grades)
      table.sort(names, function (n1, n2)
        return grades[n1] > grades[n2]    -- compare the grades
      end)
    end

The interesting point in the example is that the anonymous function given to sort accesses the parameter grades, which is local to the enclosing function sortbygrade. Inside this anonymous function, grades is neither a global variable nor a local variable. We call it an external local variable, or an upvalue. (The term "upvalue" is a little misleading, because grades is a variable, not a value. However, this term has historical roots in Lua and it is shorter than "external local variable".) Why is that so interesting? Because functions are first-class values. Consider the following code:

    function newCounter ()
      local i = 0
      return function ()   -- anonymous function
               i = i + 1
               return i
             end
    end
    
    c1 = newCounter()
    print(c1())  --> 1
    print(c1())  --> 2

Now, the anonymous function uses an upvalue, i, to keep its counter. However, by the time we call the anonymous function, i is already out of scope, because the function that created that variable (newCounter) has returned. Nevertheless, Lua handles that situation correctly, using the concept of closure. Simply put, a closure is a function plus all it needs to access its upvalues correctly. If we call newCounter again, it will create a new local variable i, so we will get a new closure, acting over that new variable:

    c2 = newCounter()
    print(c2())  --> 1
    print(c1())  --> 3
    print(c2())  --> 2

So, c1 and c2 are different closures over the same function and each acts upon an independent instantiation of the local variable i. Technically speaking, what is a value in Lua is the closure, not the function. The function itself is just a prototype for closures. Nevertheless, we will continue to use the term "function" to refer to a closure whenever there is no possibility of confusion.

Closures provide a valuable tool in many contexts. As we have seen, they are useful as arguments to higher-order functions such as sort. Closures are valuable for functions that build other functions too, like our newCounter example; this mechanism allows Lua programs to incorporate fancy programming techniques from the functional world.

Closures are useful for callback functions, too. The typical example here occurs when you create buttons in a typical GUI toolkit. Each button has a callback function to be called when the user presses the button; you want different buttons to do slightly different things when pressed. For instance, a digital calculator needs ten similar buttons, one for each digit. You can create each of them with a function like the next one:

    function digitButton (digit)
      return Button{ label = digit,
                     action = function ()
                                add_to_display(digit)
                              end
                   }
    end

In this example, we assume that Button is a toolkit function that creates new buttons; label is the button label; and action is the callback function to be called when the button is pressed. (It is actually a closure, because it accesses the upvalue digit.) The callback function can be called a long time after digitButton did its task and after the local variable digit went out of scope, but it can still access that variable.

Closures are valuable also in a quite different context. Because functions are stored in regular variables, we can easily redefine functions in Lua, even predefined functions. This facility is one of the reasons Lua is so flexible. Frequently, however, when you redefine a function you need the original function in the new implementation. For instance, suppose you want to redefine the function sin to operate in degrees instead of radians. This new function must convert its argument, and then call the original sin function to do the real work. Your code could look like

   oldSin = math.sin
   math.sin = function (x)
     return oldSin(x*math.pi/180)
   end

A cleaner way to do that is as follows:

   do
     local oldSin = math.sin
     local k = math.pi/180
     math.sin = function (x)
       return oldSin(x*k)
     end
   end
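The second version keeps the original function in a variable that is local to the do block, so the only way to reach it is through the new math.sin. A short sketch of how this behaves when run, assuming no global named oldSin has been defined elsewhere:

```lua
do
  local oldSin = math.sin      -- private: visible only inside this block
  local k = math.pi / 180
  math.sin = function (x)
    return oldSin(x * k)       -- the closure still reaches oldSin and k
  end
end

print(oldSin)         --> nil (the original is not reachable by name out here)
print(math.sin(90))   -- approximately 1, since the argument is now in degrees
```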


Non-Global Functions

An obvious consequence of first-class functions is that we can store functions not only in global variables, but also in table fields and in local variables.

We have already seen several examples of functions in table fields: Most Lua libraries use this mechanism (e.g., io.read, math.sin). To create such functions in Lua, we only have to put together the regular syntax for functions and for tables:

   Lib = {}
   Lib.foo = function (x,y) return x + y end
   Lib.goo = function (x,y) return x - y end

Of course, we can also use constructors:

   Lib = {
     foo = function (x,y) return x + y end,
     goo = function (x,y) return x - y end
   }

Moreover, Lua offers yet another syntax to define such functions:

   Lib = {}
   function Lib.foo (x,y)
     return x + y
   end
   function Lib.goo (x,y)
     return x - y
   end

This last fragment is exactly equivalent to the first example.

When we store a function into a local variable we get a local function, that is, a function that is restricted to a given scope. Such definitions are particularly useful for packages: Because Lua handles each chunk as a function, a chunk may declare local functions, which are visible only inside the chunk. Lexical scoping ensures that other functions in the package can use these local functions:

   local f = function (...)
     ...
   end
   
   local g = function (...)
     ...
     f()   -- external local `f' is visible here
     ...
   end

Lua supports such uses of local functions with a syntactic sugar for them:

   local function f (...)
     ...
   end

A subtle point arises in the definition of recursive local functions. The naive approach does not work here:

   local fact = function (n)
     if n == 0 then return 1
     else return n*fact(n-1)   -- buggy
     end
   end

When Lua compiles the call fact(n-1), in the function body, the local fact is not yet defined. Therefore, that expression calls a global fact, not the local one. To solve that problem, we must first define the local variable and then define the function:

   local fact
   fact = function (n)
     if n == 0 then return 1
     else return n*fact(n-1)
     end
   end

Now the fact inside the function refers to the local variable. Its value when the function is defined does not matter; by the time the function executes, fact already has the right value. That is the way Lua expands its syntactic sugar for local functions, so you can use it for recursive functions without worrying:

   local function fact (n)
     if n == 0 then return 1
     else return n*fact(n-1)
     end
   end

Of course, this trick does not work if you have indirect recursive functions. In such cases, you must use the equivalent of an explicit forward declaration:

   local f, g    -- `forward' declarations
   
   function g ()
     ...  f() ...
   end
   
   function f ()
     ...  g() ...
   end
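A concrete instance of this pattern is a pair of mutually recursive parity tests. The names isEven, isOdd, and parity are hypothetical, and the sketch assumes n is a non-negative integer. Because the two locals are declared first, the later `function isEven` and `function isOdd` definitions assign to those locals rather than creating globals.

```lua
local isEven, isOdd    -- forward declarations

function isEven(n)     -- assigns to the local declared above
  if n == 0 then return true end
  return isOdd(n - 1)
end

function isOdd(n)
  if n == 0 then return false end
  return isEven(n - 1)
end

-- parity is a hypothetical global wrapper that uses the local functions.
function parity(n)
  if isEven(n) then return "even" else return "odd" end
end

print(parity(10), parity(7))   --> even    odd
```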

Proper Tail Calls

Another interesting feature of functions in Lua is that they do proper tail calls. (Several authors use the term proper tail recursion, although the concept does not involve recursion directly.)

A tail call is a kind of goto dressed as a call. A tail call happens when a function calls another as its last action, so it has nothing else to do. For instance, in the following code, the call to g is a tail call:

   function f (x)
     return g(x)
   end

After f calls g, it has nothing else to do. In such situations, the program does not need to return to the calling function when the called function ends. Therefore, after the tail call, the program does not need to keep any information about the calling function in the stack. Some language implementations, such as the Lua interpreter, take advantage of this fact and actually do not use any extra stack space when doing a tail call. We say that those implementations support proper tail calls. Because a proper tail call uses no stack space, there is no limit on the number of "nested" tail calls that a program can make. For instance, we can call the following function with any number as argument; it will never overflow the stack:

   function foo (n)
     if n > 0 then return foo(n - 1) end
   end
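The same property allows a loop to be written as a chain of tail calls that carries its running state in the arguments. The function sumTo below is a hypothetical example that adds the integers from 1 to n using an accumulator; it is a sketch of the idea and assumes n is a non-negative integer.

```lua
-- sumTo is a hypothetical example: adds 1..n using an accumulator.
function sumTo(n, acc)
  acc = acc or 0
  if n == 0 then return acc end
  return sumTo(n - 1, acc + n)   -- tail call: nothing is left to do afterwards
end

print(sumTo(10))       --> 55
print(sumTo(100000))   --> 5000050000
```

The second call goes one hundred thousand calls deep, which works because each tail call reuses the caller's stack space.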

A subtle point when we use proper tail calls is what counts as a tail call. Some obvious candidates fail the criterion that the calling function has nothing to do after the call. For instance, in the following code, the call to g is not a tail call:

   function f (x)
     g(x)
     return
   end

The problem in that example is that, after calling g, f still has to discard any results from g before returning. Similarly, all the following calls fail the criterion:

   return g(x) + 1     -- must do the addition
   return x or g(x)    -- must adjust to 1 result
   return (g(x))       -- must adjust to 1 result

In Lua, only a call in the format return g(...) is a tail call. However, both g and its arguments can be complex expressions, because Lua evaluates them before the call. For instance, the next call is a tail call:

     return x[i].foo(x[j] + a*b, i + j)

A tail call is a kind of goto. As such, a quite useful application of proper tail calls in Lua is for programming state machines. Such applications can represent each state by a function; to change state is to go to (or to call) a specific function.
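As an illustration of the state-machine idea, the sketch below uses two states, each written as a function, to decide whether a string of "0" and "1" characters contains an even number of "1"s. All names here (evenState, oddState, hasEvenOnes) are hypothetical, and characters other than "1" are simply treated as leaving the state unchanged.

```lua
local evenState, oddState   -- forward declarations, one function per state

function evenState(input, pos)
  local c = string.sub(input, pos, pos)
  if c == "" then return true end                       -- end of input
  if c == "1" then return oddState(input, pos + 1) end  -- change state (tail call)
  return evenState(input, pos + 1)                      -- stay in this state (tail call)
end

function oddState(input, pos)
  local c = string.sub(input, pos, pos)
  if c == "" then return false end
  if c == "1" then return evenState(input, pos + 1) end
  return oddState(input, pos + 1)
end

-- Entry point: the machine starts in the even state at the first character.
function hasEvenOnes(input)
  return evenState(input, 1)
end

print(hasEvenOnes("1011"))   --> false
print(hasEvenOnes("1001"))   --> true
```

Every transition is written as return followed by a call, so the machine can process input of any length without growing the stack.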

See Also

Functions

Multiple Results