Closure Study Notes


2010-05-27, Thursday, sunny

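I have had some free time lately, so I set aside some of it to study closures. I had studied them for a while before, but only knew how to use them without understanding why they work. Now I can finally say that I really understand what a closure is.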

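For studying closures, the best resource I found online is "Javascript Closures" (FAQ > FAQ Notes). Although it focuses only on Javascript closures, other languages use much the same mechanism.
Below, we too go straight to the Javascript notion of a closure.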

Key concepts

1. The Resolution of Property Names on Objects

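Conclusion: given a property name, it is first looked up on the object itself. If it is not there, the prototype chain is searched, and if it still cannot be found the result is undefined.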
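The lookup order described above can be seen in a small sketch. The names Base and obj are made up for illustration and do not come from the FAQ notes.

```javascript
// Hypothetical constructor and object, used only to illustrate the lookup order.
function Base() {}
Base.prototype.inherited = "from prototype";

var obj = new Base();
obj.own = "from object";

var ownValue = obj.own;             // found on the object itself
var inheritedValue = obj.inherited; // not on obj, so found on Base.prototype
var missingValue = obj.missing;     // not found anywhere on the chain

console.log(ownValue);        // "from object"
console.log(inheritedValue);  // "from prototype"
console.log(missingValue);    // undefined
```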

2. Identifier Resolution, Execution Contexts and scope chains

The Execution Context

All javascript code is executed in an execution context. Global code (code executed inline, normally as a JS file, or HTML page, loads) gets executed in global execution context, and each invocation of a function (possibly as a constructor) has an associated execution context. Code executed with the eval function also gets a distinct execution context but as eval is never normally used by javascript programmers it will not be discussed here. The specified details of execution contexts are to be found in section 10.2 of ECMA 262 (3rd edition). 

When a javascript function is called it enters an execution context. If another function is called (or the same function recursively), a new execution context is created and execution enters that context for the duration of the function call, returning to the original execution context when the called function returns. Thus running javascript code forms a stack of execution contexts.

 When an execution context is created a number of things happen in a defined order. First, in the execution context of a function, an "Activation" object is created. The activation object is another specification mechanism. It can be considered as an object because it ends up having accessible named properties, but it is not a normal object as it has no prototype (at least not a defined prototype) and it cannot be directly referenced by javascript code.

The next step in the creation of the execution context for a function call is the creation of an arguments object, which is an array-like object with integer indexed members corresponding with the arguments passed to the function call, in order. It also has length and callee properties (which are not relevant to this discussion, see the spec for details). A property of the Activation object is created with the name "arguments" and a reference to the arguments object is assigned to that property.
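As a rough illustration of the arguments object, the following sketch passes more arguments than there are formal parameters. The function name countArgs is hypothetical.

```javascript
// Hypothetical function: two formal parameters, called with three arguments.
function countArgs(a, b) {
    // arguments is array-like: integer-indexed members plus a length property
    return {
        length: arguments.length,
        first: arguments[0],
        third: arguments[2]   // has no matching formal parameter, still reachable
    };
}

var argInfo = countArgs("x", "y", "z");
console.log(argInfo.length); // 3
console.log(argInfo.first);  // "x"
console.log(argInfo.third);  // "z"
```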

Next the execution context is assigned a scope. A scope consists of a list (or chain) of objects. Each function object has an internal [[scope]] property (which we will go into more detail about shortly) that also consists of a list (or chain) of objects. The scope that is assigned to the execution context of a function call consists of the list referred to by the [[scope]] property of the corresponding function object with the Activation object added at the front of the chain (or the top of the list).

Then the process of "variable instantiation" takes place using an object that ECMA 262 refers to as the "Variable" object. However, the Activation object is used as the Variable object (note this, it is important: they are the same object). Named properties of the Variable object are created for each of the function's formal parameters, and if arguments to the function call correspond with those parameters the values of those arguments are assigned to the properties (otherwise the assigned value is undefined). Inner function definitions are used to create function objects which are assigned to properties of the Variable object with names that correspond to the function name used in the function declaration. The last stage of variable instantiation is to create named properties of the Variable object that correspond with all the local variables declared within the function.

The properties created on the Variable object that correspond with declared local variables are initially assigned undefined values during variable instantiation, the actual initialisation of local variables does not happen until the evaluation of the corresponding assignment expressions during the execution of the function body code.
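A minimal sketch of what variable instantiation means in practice: inside the function, the local variable already exists (with the value undefined) and the inner function is already usable before the lines that declare them are reached. The names here are illustrative only.

```javascript
function instantiationDemo() {
    // localVar already exists as a property of the Variable object, but holds undefined
    var before = localVar;
    // inner function declarations are turned into function objects up front
    var innerType = typeof inner;

    var localVar = 8;        // the actual assignment happens only here
    function inner() {}

    return [before, innerType, localVar];
}

var demoResult = instantiationDemo();
console.log(demoResult[0]); // undefined
console.log(demoResult[1]); // "function"
console.log(demoResult[2]); // 8
```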

It is the fact that the Activation object, with its arguments property, and the Variable object, with named properties corresponding with function local variables, are the same object, that allows the identifier arguments to be treated as if it was a function local variable.

Finally a value is assigned for use with the this keyword. If the value assigned refers to an object then property accessors prefixed with the this keyword reference properties of that object. If the value assigned (internally) is null then the this keyword will refer to the global object.
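The two cases for this can be sketched as follows. The object clickCounter is made up, and the snippet assumes an engine that provides globalThis. Note that the quoted text describes ECMA 262 3rd edition; in strict mode (added later) a plain call no longer receives the global object.

```javascript
// Hypothetical object used only for illustration.
var clickCounter = {
    count: 0,
    increment: function () {
        // called as clickCounter.increment(), so this refers to clickCounter
        this.count += 1;
        return this.count;
    }
};
var countAfterCall = clickCounter.increment();

// A function built with the Function constructor is non-strict (unless its body opts in),
// so calling it without an object gives it the global object as this.
var plainCallThis = new Function("return this;")();
var isGlobal = (plainCallThis === globalThis);

console.log(countAfterCall); // 1
console.log(isGlobal);       // true
```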

The global execution context gets some slightly different handling as it does not have arguments so it does not need a defined Activation object to refer to them. The global execution context does need a scope and its scope chain consists of exactly one object, the global object. The global execution context does go through variable instantiation, its inner functions are the normal top level function declarations that make up the bulk of javascript code. The global object is used as the Variable object, which is why globally declared functions become properties of the global object. As do globally declared variables.

The global execution context also uses a reference to the global object for the this object. 

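Note: each Activation/Variable object holds the data that belongs to its own function execution context (the parameters and local variables). Through the [[scope]] property of a function object, the Activation/Variable objects of the enclosing functions can be reached, and with them the data of those functions.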

scope chains and [[scope]]

The scope chain of the execution context for a function call is constructed by adding the execution context's Activation/Variable object to the front of the scope chain held in the function object's [[scope]] property.

 Function objects created with the Function constructor always have a [[scope]] property referring to a scope chain that only contains the global object.

Function objects created with function declarations or function expressions have the scope chain of the execution context in which they are created assigned to their internal [[scope]] property. 
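The difference between the two kinds of [[scope]] can be sketched like this. The global property scopeLabel and the function makeBoth are hypothetical, and the snippet assumes an engine that provides globalThis.

```javascript
// Hypothetical global, set explicitly so the demo also works inside a module wrapper.
globalThis.scopeLabel = "global";

function makeBoth() {
    var scopeLabel = "local";
    return {
        // function expression: [[scope]] includes makeBoth's Activation object
        viaExpression: function () { return scopeLabel; },
        // Function constructor: [[scope]] contains only the global object
        viaConstructor: new Function("return scopeLabel;")
    };
}

var both = makeBoth();
var fromExpression = both.viaExpression();
var fromConstructor = both.viaConstructor();
console.log(fromExpression);  // "local"
console.log(fromConstructor); // "global"
```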


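Note: an Activation/Variable object may hold function objects (these are very often the closures), and such a function object can reach the data of its outer functions through the scope chain.

Identifier Resolution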

Identifiers are resolved against the scope chain.

Identifier resolution starts with the first object in the scope chain. It is checked to see if it has a property with a name that corresponds with the identifier. Because the scope chain is a chain of objects this checking encompasses the prototype chain of that object (if it has one). If no corresponding value can be found on the first object in the scope chain the search progresses to the next object. And so on until one of the objects in the chain (or one of its prototypes) has a property with a name that corresponds with the identifier or the scope chain is exhausted. 

As execution contexts associated with function calls will have the Activation/Variable object at the front of the chain, identifiers used in function bodies are effectively first checked to see whether they correspond with formal parameters, inner function declaration names or local variables. Those would be resolved as named properties of the Activation/Variable object.
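A short sketch of resolution order, using invented names: the parameter of the inner function shadows the outer variable of the same name, while a name that is not found on the inner Activation object is resolved one step further along the chain.

```javascript
var resolveDemo = (function () {
    var shadowed = "outer";
    var onlyOuter = "outer only";

    function inner(shadowed) {
        // shadowed is found on inner's own Activation object (the formal parameter);
        // onlyOuter is not there, so the search moves to the next object in the chain
        return shadowed + " / " + onlyOuter;
    }
    return inner;
})();

var resolved = resolveDemo("inner");
console.log(resolved); // "inner / outer only"
```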

3. Closures

Automatic Garbage Collection

ECMAScript uses automatic garbage collection. The specification does not define the details, leaving that to the implementers to sort out, and some implementations are known to give a very low priority to their garbage collection operations. But the general idea is that if an object becomes un-referable (by having no remaining references to it left accessible to executing code) it becomes available for garbage collection and will at some future point be destroyed and any resources it is consuming freed and returned to the system for re-use.

This would normally be the case upon exiting an execution context. The scope chain structure, the Activation/Variable object and any objects created within the execution context, including function objects, would no longer be accessible and so would become available for garbage collection. 

Forming Closures

A closure is formed by returning a function object that was created within an execution context of a function call from that function call and assigning a reference to that inner function to a property of another object. Or by directly assigning a reference to such a function object to, for example, a global variable, a property of a globally accessible object or an object passed by reference as an argument to the outer function call. e.g:-

function exampleClosureForm(arg1, arg2){
    var localVar = 8;
    function exampleReturned(innerArg){
        return ((arg1 + arg2)/(innerArg + localVar));
    }
    /* return a reference to the inner function defined as -
       exampleReturned -:-
    */
    return exampleReturned;
}

var globalVar = exampleClosureForm(2, 4);

Now the function object created within the execution context of the call to exampleClosureForm cannot be garbage collected because it is referred to by a global variable and is still accessible, it can even be executed with globalVar(n).

But something a little more complicated has happened because the function object now referred to by globalVar was created with a [[scope]] property referring to a scope chain containing the Activation/Variable object belonging to the execution context in which it was created (and the global object). Now the Activation/Variable object cannot be garbage collected either as the execution of the function object referred to by globalVar will need to add the whole scope chain from its [[scope]] property to the scope of the execution context created for each call to it.

A closure is formed. The inner function object has the free variables and the Activation/Variable object on the function's scope chain is the environment that binds them.

The Activation/Variable object is trapped by being referred to in the scope chain assigned to the internal [[scope]] property of the function object now referred to by the globalVar variable. The Activation/Variable object is preserved along with its state; the values of its properties. Scope resolution in the execution context of calls to the inner function will resolve identifiers that correspond with named properties of that Activation/Variable object as properties of that object. The value of those properties can still be read and set even though the execution context for which it was created has exited.
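That the preserved properties can still be read and set after the outer call has returned can be sketched with a counter. The function makeCounter is a made-up example, not part of the FAQ notes.

```javascript
function makeCounter() {
    var count = 0;               // lives on makeCounter's Activation/Variable object
    return function () {
        count += 1;              // written after makeCounter has already returned
        return count;
    };
}

var next = makeCounter();
var firstCall = next();
var secondCall = next();
console.log(firstCall);  // 1
console.log(secondCall); // 2
```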

In the example above that Activation/Variable object has a state that represents the values of formal parameters, inner function definitions and local variables, at the time when the outer function returned (exited its execution context). The arg1 property has the value 2, the arg2 property the value 4, localVar the value 8 and an exampleReturned property that is a reference to the inner function object that was returned from the outer function. (We will be referring to this Activation/Variable object as "ActOuter1" in later discussion, for convenience.)

If the exampleClosureForm function was called again as:-

var secondGlobalVar = exampleClosureForm(12, 3);

- a new execution context would be created, along with a new Activation object. And a new function object would be returned, with its own distinct [[scope]] property referring to a scope chain containing the Activation object from this second execution context, with arg1 being 12 and arg2 being 3. (We will be referring to this Activation/Variable object as "ActOuter2" in later discussion, for convenience.)

A second and distinct closure has been formed by the second execution of exampleClosureForm.

The two function objects created by the execution of exampleClosureForm to which references have been assigned to the global variable globalVar and secondGlobalVar respectively, return the expression ((arg1 + arg2)/(innerArg + localVar)). Which applies various operators to four identifiers. How these identifiers are resolved is critical to the use and value of closures.

Consider the execution of the function object referred to by globalVar, as globalVar(2). A new execution context is created, along with an Activation object (we will call it "ActInner1"), which is added to the head of the scope chain referred to by the [[scope]] property of the executed function object. ActInner1 is given a property named innerArg, after its formal parameter, and the argument value 2 is assigned to it. The scope chain for this new execution context is: ActInner1-> ActOuter1-> global object.

Identifier resolution is done against the scope chain so in order to return the value of the expression ((arg1 + arg2)/(innerArg + localVar)) the values of the identifiers will be determined by looking for properties, with names corresponding with the identifiers, on each object in the scope chain in turn.

The first object in the chain is ActInner1 and it has a property named innerArg with the value 2. All of the other 3 identifiers correspond with named properties of ActOuter1; arg1 is 2, arg2 is 4 and localVar is 8. The function call returns ((2 + 4)/(2 + 8)).

Compare that with the execution of the otherwise identical function object referred to by secondGlobalVar, as secondGlobalVar(5). Calling the Activation object for this new execution context "ActInner2", the scope chain becomes: ActInner2-> ActOuter2-> global object. ActInner2 returns innerArg as 5 and ActOuter2 returns arg1, arg2 and localVar as 12, 3 and 8 respectively. The value returned is ((12 + 3)/(5 + 8)).
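The two calls walked through above can be run directly. This sketch repeats the exampleClosureForm function from the text; the variable names firstResult and secondResult are added here only to hold the results.

```javascript
function exampleClosureForm(arg1, arg2) {
    var localVar = 8;
    function exampleReturned(innerArg) {
        return ((arg1 + arg2) / (innerArg + localVar));
    }
    return exampleReturned;
}

var globalVar = exampleClosureForm(2, 4);         // closes over ActOuter1
var secondGlobalVar = exampleClosureForm(12, 3);  // closes over ActOuter2

var firstResult = globalVar(2);         // (2 + 4) / (2 + 8)
var secondResult = secondGlobalVar(5);  // (12 + 3) / (5 + 8)

console.log(firstResult === 6 / 10);   // true
console.log(secondResult === 15 / 13); // true
```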

Execute secondGlobalVar again and a new Activation object will appear at the front of the scope chain but ActOuter2 will still be next object in the chain and the value of its named properties will again be used in the resolution of the identifiers arg1, arg2 and localVar.

This is how ECMAScript inner functions gain, and maintain, access to the formal parameters, declared inner functions and local variables of the execution context in which they were created. And it is how the forming of a closure allows such a function object to keep referring to those values, reading and writing to them, for as long as it continues to exist. The Activation/Variable object from the execution context in which the inner function was created remains on the scope chain referred to by the function object's [[scope]] property, until all references to the inner function are freed and the function object is made available for garbage collection (along with any now unneeded objects on its scope chain).

Inner functions may themselves have inner functions, and the inner functions returned from the execution of functions to form closures may themselves return inner functions and form closures of their own. With each nesting the scope chain gains extra Activation objects originating with the execution contexts in which the inner function objects were created. The ECMAScript specification requires a scope chain to be finite, but imposes no limits on their length. Implementations probably do impose some practical limitation but no specific magnitude has yet been reported. The potential for nesting inner functions seems so far to have exceeded anyone's desire to code them.
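A sketch of such nesting, with hypothetical function names: each level returns an inner function, so the innermost call resolves identifiers through three Activation objects before reaching the global object.

```javascript
function level1(a) {
    return function level2(b) {
        return function level3(c) {
            // scope chain here: level3's Activation -> level2's -> level1's -> global object
            return a + b + c;
        };
    };
}

var nestedSum = level1(1)(2)(3);
console.log(nestedSum); // 6
```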

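By the end I got lazy and simply pasted the documentation. To sum up in one sentence: if a class is data that encapsulates behavior, then a closure is behavior that encapsulates data. Creating a closure (usually by calling a function that returns an inner function object) creates a function object bound to the execution context in which the closure was created. Because it is an object, it usually carries instance state.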

Classic example:
<html>
<head>
<script>
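function closure(){
    // Each call to handler creates a new Activation object holding its own i,
    // so every button keeps the index it was wired up with.
    function handler(i){
        return (function(){
            alert(i);
        });
    }
    // var keeps the loop counter local instead of leaking an implicit global.
    for(var i = 0; i < 10; i++){
        var btn = document.getElementById("button" + i);
        btn.onclick = handler(i);
    }
}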
</script>

</head>


<body onLoad="closure()">
<input id="button0" type="button" value="0" />
<input id="button1" type="button" value="1" />
<input id="button2" type="button" value="2" />
<input id="button3" type="button" value="3" />
<input id="button4" type="button" value="4" />
<input id="button5" type="button" value="5" />
<input id="button6" type="button" value="6" />
<input id="button7" type="button" value="7" />
<input id="button8" type="button" value="8" />
<input id="button9" type="button" value="9" />
</body>

</html>

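Notes:
When we define the following function:
function handler(i){
   return (function(){
              alert(i);
           });
}
its scope chain is determined. This is lexical scope: the scope chain is fixed when the function is defined. At this point, however, the Activation/Variable objects on that chain hold no values yet (handler's own Activation object is only created when handler is called), so there is no state.
When we execute the outer function handler, its execution context is created and entered:
btn.onclick = handler(i);
As described earlier ("When an execution context is created a number of things happen in a defined order..."), the complete execution context of handler is set up. Once the call finishes it would normally be garbage collected, but because handler returns an inner function, the context is not collected and instead stays bound to that inner function, thus forming a closure (behavior that encapsulates state; isn't that just an object? ^_^):
Closure: A "closure" is an expression (typically a function) that can have free variables together with an environment that binds those variables (that "closes" the expression).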
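Why the extra handler call matters can be sketched without the DOM. In this illustrative snippet (all names are made up), functions created directly in the loop all share the single loop variable, while functions created through makeHandler each get their own Activation object.

```javascript
function makeHandler(i) {
    return function () { return i; };   // i belongs to this call of makeHandler
}

function buildHandlers() {
    var shared = [];
    var separate = [];
    for (var k = 0; k < 3; k++) {
        shared.push(function () { return k; });   // all three close over the same k
        separate.push(makeHandler(k));            // each call captures the current value
    }
    return { shared: shared, separate: separate };
}

var handlers = buildHandlers();
var sharedValues = [handlers.shared[0](), handlers.shared[1](), handlers.shared[2]()];
var separateValues = [handlers.separate[0](), handlers.separate[1](), handlers.separate[2]()];
console.log(sharedValues.join(","));   // "3,3,3"
console.log(separateValues.join(",")); // "0,1,2"
```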


Closures in Groovy

Closure semantics

Closures appear to be a convenient mechanism for defining something like an inner class, but the semantics are in fact more powerful and subtle than what an inner class offers. In particular, the properties of closures can be summarized in this manner:

  1. They have one implicit method (which is never specified in a closure definition) called doCall()
  2. A closure may be invoked via the call() method, or with a special syntax of an unnamed () invocation. Either invocation will be translated by Groovy into a call to the Closure's doCall() method.
  3. Closures may have 1...N arguments, which may be statically typed or untyped. The first parameter is available via an implicit untyped argument named it if no explicit arguments are named. If the caller does not specify any arguments, the first parameter (and, by extension, it) will be null.
  4. The developer does not have to use it for the first parameter. If they wish to use a different name, they may specify it in the parameter list.
  5. Closures always return a value. This may occur via either an explicit return statement, or as the value of the last statement in the closure body (i.e. an explicit return statement is optional).
  6. A closure may reference any variables defined within its enclosing lexical scope. Any such variable is said to be bound to the closure.
  7. Any variables bound to a closure are available to the closure even when the closure is returned outside of the enclosing scope.
  8. Closures are first class objects in Groovy, and are always derived from the class Closure. Code which uses closures may reference them via untyped variables or variables typed as Closure.
  9. The body of a closure is not executed until it is explicitly invoked, i.e. a closure is not invoked at its definition time.
  10. A closure may be curried so that a copy of the closure is made with one or more of its parameters fixed to a constant value.
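To sum up in one sentence: if a class is data that encapsulates behavior, then a closure is behavior that encapsulates data!
A closure is in effect a class, except that its state can only be determined at run time. Much of its data comes from outside (the lexical scope) and it only keeps references to that data, so several closures can share the same state (or part of it). Because it is an object, it can be passed around freely as an argument or a return value, and we can trigger its doCall() method via () or call().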
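The point that several closures can share the same state is not specific to Groovy. As a hedged sketch in JavaScript (the language used in the first half of these notes; makeAccount and its members are invented names), two functions created in the same call hold references to the same variable:

```javascript
function makeAccount(balance) {
    return {
        // both closures refer to the same balance variable, not to copies of it
        deposit: function (amount) { balance += amount; return balance; },
        getBalance: function () { return balance; }
    };
}

var account = makeAccount(10);
var afterDeposit = account.deposit(5);
var seenByOther = account.getBalance();
console.log(afterDeposit); // 15
console.log(seenByOther);  // 15
```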

Example 1: variable references. A Closure binds references to variables, not copies of their values.
package alibaba.b2b.forrest;

import groovy.lang.Delegate;

class A {
    private String member = "private member";
    private String method(){
        return "this is a private method";
    }
    def publicMethod(String name){
        def localVar = "localVar";
        def clos = {
            println "before: ${member} ${name} ${localVar} ${method()}"
            member = "member changed in closure";
            localVar = "localVar changed in closure";
            println "after: ${member} ${name} ${localVar} ${method()}"
        };
        return clos;
    }
}


A sample = new A();
def closureVar = sample.publicMethod("Forrest");
closureVar();
println();
def closureVar2 = sample.publicMethod("Gump");
closureVar2();

/////////////////output//////////////////////////
before: private member Forrest localVar this is a private method
after: member changed in closure Forrest localVar changed in closure this is a private method

before: member changed in closure Gump localVar this is a private method
after: member changed in closure Gump localVar changed in closure this is a private method

Example 2: this, owner and delegate

this, owner, and delegate

this : as in Java, this refers to the instance of the enclosing class where a Closure is defined
owner : the enclosing object (this or a surrounding Closure)
delegate : by default the same as owner, but changeable for example in a builder or ExpandoMetaClass

 

Example:

class Class1 {
  def closure = {
    println this.class.name
    println delegate.class.name
    def nestedClos = {
      println owner.class.name
    }
    nestedClos()
  }
}

def clos = new Class1().closure
clos.delegate = this
clos()

/*  prints:
 Class1
 Script1
 Class1$_closure1  */

 

Because a closure is really an object, it can be, and often is, passed as an argument. This is somewhat similar to function objects in C++.

def list = ['a','b','c','d']
def newList = []

list.collect( newList ) {
   it.toUpperCase()
}

println newList           //  [A, B, C, D]

In the above example, the collect method accepts a List and a Closure argument. The same could be accomplished like so (although it is more verbose):

def list = ['a','b','c','d']
def newList = []
def clos = { it.toUpperCase() }
list.collect( newList, clos )
assert newList == ["A", "B", "C", "D"]

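In Groovy method calls a Closure can be, and often is, passed as an argument, and for convenience the Closure is usually written as the last argument of the call. This feature is at the core of how Groovy Builders construct DSLs (though the real core is method dispatch through invokeMethod). Building DSLs with Groovy Builders will be covered in a separate article.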

