Safe Bool idiom

Reposted from: http://visnuhoshimi.spaces.live.com/blog/cns!35416b2a4dc1b31b!2040.entry

Its reference (in English): http://www.artima.com/cppsource/safeboolP.html

 

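While reading the Boost source code I came across this term. I had never noticed it before, nor paid it any attention when using Boost, yet it reflects a good deal of careful thought on the authors' part.
The safe bool idiom originates from usage such as if( obj ) { ... }. When obj is of a built-in type this naturally works, but for a class or struct it does not always. One can of course write if( obj.is_valid() ){} instead, but the problems are obvious: first, it is verbose; second, it is not general enough, since the function might just as well be called obj.Valid() or obj.Empty(), which is a particular problem in combination with templates. We would therefore like a solution that lets us write if(obj) directly.
At first glance this looks easy: just overload operator bool, or perhaps operator void*? Both approaches do indeed let us write if(obj), but on closer inspection they bring a number of problems.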
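//---------------------------
// Approach: overloading operator bool
class Test
{
bool ok_;
public:
Test() : ok_(true) {}
operator bool()const{return ok_;}
};
But if we write the following statements, can you tell what they do?
Test test;
int tmp = test;
I think you will have spotted behaviour strange enough to leave anyone scratching their head: the code compiles, and tmp silently receives 0 or 1.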

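//----------------------------
// Approach: overloading operator void*
class Test
{
bool ok_;
public:
Test() : ok_(true) {}
// const_cast is needed because this is a const Test* inside a const member
operator void*()const{ return ok_?const_cast<Test*>(this):0;}
};
Now here is another example of misuse:
Test test;
delete test;
See what it does.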
//----------------------------
// Approach: overloading operator!
class Test
{
bool ok_;
public:
Test() : ok_(true) {}
bool operator !()const{return !ok_;}
};
Now we can test it like this:
Test t;
if(!!t){...}
A bit cumbersome, isn't it?

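OK, now let's look at how Boost does it.
class Test
{
private:
   struct dummy   {void nonnull() {}};
   typedef void (dummy::*safe_bool)();
public:
   // empty() is assumed to be a member of Test that reports whether the object is unusable
   operator safe_bool () const{ return (this->empty())? 0 : &dummy::nonnull; }
};
OK, excellent! We can no longer write delete t; nor can we write int tmp = t directly (unless an implicit conversion to int exists); nor do we need !!t, a trick from C (it turns any non-zero number into 1). It looks almost perfect.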

However, there may be one remaining problem: it seems you can still write statements like the following, which are legal yet meaningless.
Test t1,t2;
if(t1==t2){....}

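Below is another clever approach, but it suffers from the same problem.
//----------------------------
// A smarter approach, similar to operator void*
class Test
{
private:
bool ok_;
class _nested_class{ ~_nested_class(); }; // destructor is private and never defined
public:
Test() : ok_(true) {}
operator const _nested_class*()const{ return ok_?reinterpret_cast<const _nested_class*>(this):0;}
};
Likewise, you can no longer write statements such as delete test, because the destructor of _nested_class is private.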

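Most of this post reflects what I learned from reading this article: http://www.artima.com/cppsource/safeboolP.html. At its end the article presents a better way to implement the safe bool idiom, though it appears that the Boost library does not use it.
My notes:
The Safe Bool idiom mainly relies on two features of C++: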
1)a pointer to member cannot be converted to a void*.
2)An rvalue of pointer to member type can be converted to an rvalue of type bool.
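The two rules above can be checked at compile time. The following is only a minimal sketch, assuming a C++11 compiler (for static_assert and <type_traits>); the struct S, the typedef names, and the helpers is_set and demo are hypothetical and exist purely for illustration.

```cpp
#include <cassert>
#include <type_traits>

struct S { void f() {} int d; };   // hypothetical type, for illustration only

typedef void (S::*mem_fn_ptr)();   // pointer to member function
typedef int S::*mem_data_ptr;      // pointer to data member

// 1) A pointer to member cannot be converted to void* (nor to int).
static_assert(!std::is_convertible<mem_fn_ptr, void*>::value, "no void* conversion");
static_assert(!std::is_convertible<mem_data_ptr, void*>::value, "no void* conversion");
static_assert(!std::is_convertible<mem_fn_ptr, int>::value, "no int conversion");

// 2) A pointer to member can be converted to bool.
static_assert(std::is_convertible<mem_fn_ptr, bool>::value, "converts to bool");
static_assert(std::is_convertible<mem_data_ptr, bool>::value, "converts to bool");

// A null member pointer tests false, any other value tests true.
inline bool is_set(mem_fn_ptr p) { return p ? true : false; }

inline void demo() {
  assert(is_set(&S::f));
  assert(!is_set(mem_fn_ptr()));   // value-initialized, i.e. null
}
```

This is why a pointer to member works as the "magic" return type: it is usable in a condition but in almost nothing else.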
See this post:
Reference:
The Safe Bool Idiom
by Bjorn Karlsson
July 31, 2004
Summary
Learn how to validate objects in a boolean context without the usual harmful side effects.

In C++, there are a number of ways to provide Boolean tests for classes. Such support is either provided to make usage intuitive, to support generic programming, or both. We shall examine four popular ways of adding support for the popular and idiomatic if (object) {} construct. To conclude, we will discuss a new solution, without the pitfalls and dangers of the other four. Let the games begin.

The Goal

Some types, for example pointers, allow us to test their validity in Boolean contexts. Any rvalue of arithmetic, enumeration, pointer, or pointer to member type, can be implicitly converted to an rvalue of type bool. We frequently use this property to select a branch of code to execute, for example when acquiring a resource:

  if (some_type* p=get_some_type()) {
    // p is valid, use it
  }
  else {
    // p is not valid, take proper action
  }

Of course, such usage is not only useful for built-in types; any type with an unambiguous meaning of validity could greatly benefit from such a Boolean conversion. The alternative is to use a member function for testing. As an example, consider testing a smart pointer (without an implicit conversion to the contained pointer) for validity:

  smart_ptr<some_type> p(get_some_type());
  if (p.is_valid()) {
    // p is valid, use it
  }
  else {
    // p is not valid, take proper action
  }

Besides being more verbose, this version differs from the previous in that the name p needs to be declared outside of the scope in which it is used. This is bad from a maintenance perspective. Also, the name is_valid will probably differ depending on the type of smart pointer in use; it can just as well be is_empty, Empty, Valid, or any other name a creative designer might have thought of when creating it. Finally, even when disregarding the naming issue and the problem with declaration scope, for smart pointers there's the very real requirement to support pointer-like use. It should typically be possible to convert existing code to make use of smart pointers rather than raw pointers, with a minimum of change to the code base, e.g., code like this should work regardless of pointer smartness:

  template <typename T> void some_func(const T& t) {
    if (t)    
      t->print();
  }

Without some conversion to a Boolean testable type, the above if-statement won't compile for smart pointers. The goal that we set out to accomplish in this article is making that conversion safe. As we shall see, that's a bit harder than one would imagine at first glance.

The Obvious Approach Is operator bool

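This classical approach has a straightforward implementation. I'll use the same class (Testable) throughout this article. The following code collects the four variants discussed in the sections below; they are alternatives, so a program would contain only one of these definitions of Testable:

  // operator bool version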
  class Testable {
    bool ok_;
  public:
    explicit Testable(bool b=true):ok_(b) {}

    operator bool() const {
      return ok_;
    }
  };

  // operator! version
  class Testable {
    bool not_ok_;
  public:
    explicit Testable(bool b=true):not_ok_(!b) {}

    bool operator!() const {
      return not_ok_;
    }
  };

  // operator void* version
  class Testable {
    bool ok_;
  public:
    explicit Testable(bool b=true):ok_(b) {}

    operator void*() const {
      // this is a const Testable* here, so the const must be cast away
      return ok_==true ? const_cast<Testable*>(this) : 0;
    }
  };

  // nested class version
  class Testable {
    bool ok_;
  public:
    explicit Testable(bool b=true):ok_(b) {}

    class nested_class {};

    operator const nested_class*() const {
      return ok_ ? reinterpret_cast<const nested_class*>(this) : 0;
    }
  };

Note the implementation for the conversion function:

  operator bool() const {
    return ok_;
  }

Now, we can use instances of the class in expressions like this:

  Testable test;
  if (test) 
    std::cout << "Yes, test is working!\n";
  else 
    std::cout << "No, test is not working!\n";

That's fine, but there's a nasty caveat to this as the conversion function has just told the compiler that it's free to do things behind our backs (lesson 0: never trust a compiler to do your job for you; at least not to do it properly);

  test << 1;
  int i=test;

These are both nonsense operations, but yet allowed and legal C++ (we also have the issue of overloading to consider, which makes things even worse). So, operator bool is not a very good approach. We're also able to compare any types that utilize this technique with each other, although that rarely makes sense:
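To make the two nonsense operations concrete, here is a small sketch of what the compiler actually computes. It reuses the operator bool version of Testable; the helper names shift_left_by_one, to_int, and demo are hypothetical and not part of the article.

```cpp
#include <cassert>

// The operator bool version of Testable from the article.
class Testable {
  bool ok_;
public:
  explicit Testable(bool b = true) : ok_(b) {}
  operator bool() const { return ok_; }
};

// Hypothetical helpers that make the implicit conversions visible.
inline int shift_left_by_one(const Testable& t) {
  return t << 1;   // t -> bool -> int, then an ordinary integer shift
}

inline int to_int(const Testable& t) {
  int i = t;       // t -> bool -> int: 1 for a valid object, 0 otherwise
  return i;
}

inline void demo() {
  assert(shift_left_by_one(Testable(true)) == 2);    // 1 << 1
  assert(shift_left_by_one(Testable(false)) == 0);   // 0 << 1
  assert(to_int(Testable(true)) == 1);
  assert(to_int(Testable(false)) == 0);
}
```

Neither function mentions bool, yet both compile without complaint, which is exactly the hazard described above.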

  Testable a;
  AnotherTestable b;

  if (a==b) {
  }

  if (a<b) {
  }

What else can we do? Well, one improvement is to add another (private) conversion function to an integral type, and thereby disallow the nonsensical operations, even those for equality and ordering. Simply declaring a private conversion function to int does the trick. However, some drawbacks remain, making the solution less than satisfactory. The error messages when a user invokes the ambiguity aren't consistent, or readable. Also, these conversion functions may interfere with perfectly valid conversions and overloads. So we must look elsewhere for a clean solution to this problem.

Not Exactly Obvious, operator!

It's time to move on to safer ground, through operator!. Programmers are already accustomed to using this unary logical negation operator in Boolean contexts, which is a desirable property for intuitive usage. Still, some users might not be ready for what some people call the double-bang trick (see below), which is a requirement for checking the "good state" of such an object. The implementation is trivial:

  bool operator!() const {
    return !ok_;
  }

This is a much better approach—no more implicit conversion or overloading issues to worry about, and two idiomatic ways of testing Testable:

  Testable test;
  Testable test2(false);
  if (!!test) 
    std::cout << "Yes, test is working!\n";
  if (!test2)
    std::cout << "No, test2 is not working!\n";

The first version utilizes a useful trick: if (!!test). It's sometimes called the double-bang trick [1], but alas, it is not nearly as elegant or straightforward as if (test). [Editor's note: This is an old C trick used to map non-zero values to the number 1 so you can have numeric integer values map into a binary-valued index (0 or 1) for use with an array of size two] This is a pity, because if people don't understand how something works it really doesn't matter whether it's safe or not. It's still a very useful technique, but it will typically be used in library code, where "ordinary" users never see it. Of course, it's still possible to compare different types, just as was the case with the first approach (although the obscure syntax should make it obvious that it rarely makes sense to do so). Are there better ways than this?
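The array-indexing use of the double-bang trick mentioned in the editor's note can be sketched as follows. The function describe, its label strings, and demo are hypothetical and serve only as an illustration.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: pick one of two labels from an arbitrary integer.
inline const char* describe(int value) {
  static const char* const labels[2] = { "zero", "non-zero" };
  // The first ! yields a bool, the second flips it back, so the index
  // is 0 for zero and 1 for every non-zero value.
  return labels[!!value];
}

inline void demo() {
  assert(std::strcmp(describe(0), "zero") == 0);
  assert(std::strcmp(describe(42), "non-zero") == 0);
  assert(std::strcmp(describe(-7), "non-zero") == 0);
}
```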

A Seemingly Innocent Approach: operator void*

Here's a clever idea—using a conversion function to void*. It's clever because there aren't that many things you can do with a void* except test it in Boolean contexts. Here's how it works:

  operator void*() const {
    // this is a const Testable* here, so the const must be cast away
    return ok_==true ? const_cast<Testable*>(this) : 0;
  }

Another trivial implementation! Don't worry, from here on nothing's trivial... As you might have guessed, this solution is flawed, too. The problem is that it is now possible to do this:

  Testable test;
  delete test; 

Ouch! If you think that this situation can be saved with a little const trickery, think again: The C++ Standard explicitly allows delete expressions with pointers to const types [2]. Perhaps the best-known use of this technique comes from the C++ Standard: the conversion that allows the state of iostreams to be queried uses it. However, while the intention is this;

  if (std::cout) { // Is the stream ok?
  }

it is also quite possible to do this;

  std::cout << std::cin << std::cout;

Also, using a conversion like this means that it is possible to test instances of different types in Boolean contexts (all types that utilize this flawed idiom). So, it's time to get radical, and travel deeper into C++ territory.

Almost Getting There with a Nested Class

In 1996, Don Box wrote about a very clever technique in his C++ Report column (a technique originally created to support testing for nullness) that almost does what we came here for. It involves a conversion function to a nested type (that doesn't even need to be defined), like so:
  class Testable {
    bool ok_;
  public:
    explicit Testable(bool b=true):ok_(b) {}

    class nested_class;

    operator const nested_class*() const {
      return ok_ ? reinterpret_cast<const nested_class*>(this) : 0;
    }
  };

Now, this version supports Boolean tests, but alas, too much so. We're now able to write erroneous things like this:

  Testable b1,b2;

  if (b1==b2) {
  }

  if (b1<b2) {
  }

We could poison all operators that have been enabled by the conversion to make the algorithm a better fit for our purposes, but there's an even better way.

The Safe Bool Idiom

It's time to make these tests safe. Remember that we need to avoid unsafe conversions that allow for erroneous usage. We must also avoid overloading issues, and we definitely shouldn't allow deletion through the conversion. So, what do we do? Without further ado, let me give you the solution in code.
  class Testable {
    bool ok_;
    typedef void (Testable::*bool_type)() const;
    void this_type_does_not_support_comparisons() const {}
  public:
    explicit Testable(bool b=true):ok_(b) {}

    operator bool_type() const {
      return ok_==true ? 
        &Testable::this_type_does_not_support_comparisons : 0;
    }
  };

Simple, eh? Let's examine what's going on here. First, we typedef bool_type to be a pointer to a const member function of Testable, taking zero arguments and returning void. This is our magic type that allows for testing in Boolean contexts, without taking part in overloading contexts. Next, we define a conversion function to bool_type, just as we did with bool and void* earlier. Finally, we return "true" using a pointer to a member function (this_type_does_not_support_comparisons), which fits the bool_type, and a null value for "false". It's now possible to safely test instances of Testable in Boolean contexts. The strange name does have a purpose; read on to find out what it is!
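The claims in the paragraph above can be spot-checked with type traits. This is only a sketch, assuming a C++11 compiler for static_assert and <type_traits>; the Testable class repeats the article's listing, while is_ok and demo are hypothetical helpers.

```cpp
#include <cassert>
#include <type_traits>

class Testable {
  bool ok_;
  typedef void (Testable::*bool_type)() const;
  void this_type_does_not_support_comparisons() const {}
public:
  explicit Testable(bool b = true) : ok_(b) {}
  operator bool_type() const {
    return ok_ ? &Testable::this_type_does_not_support_comparisons : 0;
  }
};

// The conversions that caused trouble earlier are no longer available.
static_assert(!std::is_convertible<Testable, int>::value, "no conversion to int");
static_assert(!std::is_convertible<Testable, void*>::value, "no conversion to void*");

// Hypothetical helper: Boolean tests still work as intended.
inline bool is_ok(const Testable& t) {
  if (t) return true;
  return false;
}

inline void demo() {
  assert(is_ok(Testable(true)));
  assert(!is_ok(Testable(false)));
}
```

With this version, statements such as int i = test; or delete test; are rejected at compile time, while if (test) keeps working.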

Compared to a conversion function to bool, we have avoided the unfortunate overloading issues, and the effects of returning an integral type (making some nonsense constructs legal). We have added the usability that was obscured by operator!, and disabled the potential delete issue with operator void*. Quite impressive! There's one additional twist that makes the solution complete, and that is to disable comparisons between distinct instances of Testable. With our current implementation, you can write code like this:

  Testable test;
  Testable test2;
  if (test==test2) {}
  if (test!=test2) {}

Comparisons like the above are not only meaningless; they're dangerous, because they imply an equivalence relationship that can never exist between different instances of Testable. We need to find a way to disable such nonsensical comparisons.

  template <typename T> 
    bool operator!=(const Testable& lhs,const T& rhs) {
      lhs.this_type_does_not_support_comparisons();
      return false;
    } 
  template <typename T>
    bool operator==(const Testable& lhs,const T& rhs) {
      lhs.this_type_does_not_support_comparisons();
      return false;
    }

Of course! Defining operator== and operator!= as non-members, and having them attempt to call a non-public member (this_type_does_not_support_comparisons) on the Testable argument, results in a compile error, thus disallowing tests that don't make sense [3]. (The obvious versions taking Testable as the second argument are omitted here for brevity.) Using parameterized implementations, we ensure that an error is only emitted if and when a comparison function is instantiated. The long (and fairly descriptive) name of the called member function will definitely be part of the compiler error message if a comparison function is instantiated, making it easy to locate and correct the error. For tests that you do want to allow, simply define the comparison operators as usual.

This, my friends, is the safe bool idiom. When people started using this idiom, it was discovered that there was an efficiency penalty on some compilers — the member function pointer caused a compiler headache resulting in slower execution when the address was fetched. Although the difference is marginal, the current practice is typically to use a member data pointer instead of a member function pointer. [4]
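The article does not show the member data pointer variant, so the following is only one possible sketch of it. The nested struct safe_bool_tag, its member nonnull, and the helpers count_valid and demo are hypothetical names; the comparison-disabling operators are left out for brevity.

```cpp
#include <cassert>

class Testable {
  bool ok_;
  // Hypothetical helper type; only the address of its data member is used.
  struct safe_bool_tag { int nonnull; };
  typedef int safe_bool_tag::*bool_type;   // pointer to data member
public:
  explicit Testable(bool b = true) : ok_(b) {}
  operator bool_type() const {
    return ok_ ? &safe_bool_tag::nonnull : 0;
  }
};

// Hypothetical helper: count how many of the two objects test true.
inline int count_valid(const Testable& a, const Testable& b) {
  int n = 0;
  if (a) ++n;
  if (b) ++n;
  return n;
}

inline void demo() {
  assert(count_valid(Testable(true), Testable(false)) == 1);
  assert(count_valid(Testable(true), Testable(true)) == 2);
  assert(count_valid(Testable(false), Testable(false)) == 0);
}
```

The usage is identical to the member function pointer version; only the type behind bool_type changes.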

A Reusable Solution

If you're like me, you don't want to follow the aforementioned steps every time you need to make a class "Boolean testable". You want something reusable, and you deserve no less! There are two plausible solutions: Using a base class with a virtual function for the actual logic, or a base class that knows which function to call on the derived class. As virtual functions come at a cost (especially if the class you're augmenting with Boolean tests doesn't contain any other virtual functions), I add support for both versions below:
  class safe_bool_base {
  protected:
    typedef void (safe_bool_base::*bool_type)() const;
    void this_type_does_not_support_comparisons() const {}

    safe_bool_base() {}
    safe_bool_base(const safe_bool_base&) {}
    safe_bool_base& operator=(const safe_bool_base&) {return *this;}
    ~safe_bool_base() {}
  };

  template <typename T=void> class safe_bool : public safe_bool_base {
  public:
    operator bool_type() const {
      // Name the member through safe_bool, not safe_bool_base: a pointer to a
      // protected member must be formed via the derived class.
      return (static_cast<const T*>(this))->boolean_test()
        ? &safe_bool::this_type_does_not_support_comparisons : 0;
    }
  protected:
    ~safe_bool() {}
  };

  template<> class safe_bool<void> : public safe_bool_base {
  public:
    operator bool_type() const {
      return boolean_test()==true ? 
        &safe_bool::this_type_does_not_support_comparisons : 0;
    }
  protected:
    virtual bool boolean_test() const=0;
    virtual ~safe_bool() {}
  };

  template <typename T, typename U> 
    bool operator==(const safe_bool<T>& lhs,const safe_bool<U>& rhs) {
      lhs.this_type_does_not_support_comparisons();
      return false;
    }

  template <typename T,typename U> 
    bool operator!=(const safe_bool<T>& lhs,const safe_bool<U>& rhs) {
      lhs.this_type_does_not_support_comparisons();
      return false;
    }

Here's how to use safe_bool:

  class Testable_with_virtual : public safe_bool<> {
  protected:
    bool boolean_test() const {
      // Perform Boolean logic here
      return true;
    }
  };

  class Testable_without_virtual : 
    public safe_bool <Testable_without_virtual> {
  public:
    bool boolean_test() const {
      // Perform Boolean logic here
      return true;
    }
  };

The first class, Testable_with_virtual, derives publicly from safe_bool, and implements a virtual function boolean_test—this function is called whenever an instance is tested (as in if (obj){}, or if (!obj){}). The second class, Testable_without_virtual, also derives publicly from safe_bool, and in addition, it passes itself as a template parameter to its base class. This little trick—known as the Curiously Recurring Template Pattern— enables the base class to downcast (to the derived class) using static_cast and call boolean_test with no extra runtime overhead and no virtual function calls. Some people may feel that this is a slight misuse of inheritance; while it might be argued that an instance of a derived class is-a safe_bool of sorts, this is certainly not the intent of this code. However, there is little reason to believe that even neophyte programmers will fall into the trap of misunderstanding this relationship. The destructors of the safe_bool classes are protected to minimize the potential for misuse. But there's still hope for the (in my opinion, overly) conscientious object-oriented purist; use private inheritance, and make the conversion function public by reintroducing it in the correct scope:
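As a self-contained illustration of the non-virtual (Curiously Recurring Template Pattern) variant, here is a condensed sketch. It keeps only the parts of safe_bool_base and safe_bool needed for the example and forms the member pointer through the derived class; the client class Handle, its fd_ member, and the helpers is_open and demo are hypothetical.

```cpp
#include <cassert>

class safe_bool_base {
protected:
  typedef void (safe_bool_base::*bool_type)() const;
  void this_type_does_not_support_comparisons() const {}
  safe_bool_base() {}
  ~safe_bool_base() {}
};

template <typename T>
class safe_bool : public safe_bool_base {
public:
  operator bool_type() const {
    // The static_cast downcast resolves boolean_test at compile time,
    // so no virtual call is involved.
    return static_cast<const T*>(this)->boolean_test()
      ? &safe_bool::this_type_does_not_support_comparisons : 0;
  }
protected:
  ~safe_bool() {}
};

// Hypothetical client class: valid while it holds a non-negative descriptor.
class Handle : public safe_bool<Handle> {
  int fd_;
public:
  explicit Handle(int fd) : fd_(fd) {}
  bool boolean_test() const { return fd_ >= 0; }
};

inline bool is_open(const Handle& h) { return h ? true : false; }

inline void demo() {
  assert(is_open(Handle(3)));
  assert(!is_open(Handle(-1)));
}
```

The derived class only supplies boolean_test; the conversion function and its return type stay in the base.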

  class Testable_without_virtual : 
    private safe_bool <Testable_without_virtual> {
  public:
    using safe_bool<Testable_without_virtual>::operator bool_type;

    bool boolean_test() const {
      return true; // Logic goes here!
    }
  };

Matthew Wilson [5] pointed out that the inheritance strategy (using safe_bool as a base class) may lead to size penalties on some compilers, specifically, those that do not implement EBO (Empty Base Optimization) properly. Although most modern compilers do when it comes to single inheritance, there may be a size penalty with multiple inheritance.

Knowing When to Say No

Yes, this is a cool idiom, and you're probably eager to try it out on some of your own classes, right? Before you go ahead, please consider that it's imperative to understand that this idiom should only be used where there is a reasonably unambiguous notion of validity for objects of a class. Consider the void* conversion in iostreams. Do you know which state flags are considered in that test? Most people probably think they do. I looked it up and realized that at least I was wrong—the eofbit flag is ignored. This goes to show that if programmers' expectations of the semantics may differ, providing a member function with a descriptive name is much better. In the case of iostreams, it's reasonable to test for errors using fail(), and failure of input using !good(). For a container class, having a conversion function would be absolutely disastrous, because the possible interpretations of its meaning are so plentiful. For most classes, the proper way to design tests for validity is to provide member functions with clever names, not clever conversion functions. There you go.
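The observation about eofbit can be reproduced with a string stream. This is a minimal sketch; the helper names tests_true and demo are hypothetical, and the input "42" is chosen so that the extraction consumes the entire buffer.

```cpp
#include <cassert>
#include <sstream>

// Hypothetical helper: report how the stream behaves in a Boolean context.
inline bool tests_true(std::istream& in) {
  return in ? true : false;
}

inline void demo() {
  std::istringstream in("42");
  int value = 0;
  in >> value;              // consumes all input, so the read hits end-of-file
  assert(value == 42);
  assert(in.eof());         // eofbit is set
  assert(!in.fail());       // neither failbit nor badbit is set
  assert(!in.good());       // good() does take eofbit into account
  assert(tests_true(in));   // the Boolean test ignores eofbit
}
```

A reader who assumes that a stream at end-of-file tests false would be surprised here, which is the kind of ambiguity the section warns about.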

Prior Art

New discoveries typically build on previous findings, and the safe bool idiom is no different; in addition to existing protocols with similar properties, related topics have been thoroughly treated in books and articles. The following is by no means an exhaustive listing, but rather a small collection of important contributions that I've come across [6].
  • Scott Meyers discusses the pitfalls of conversion functions in his classic book, More Effective C++ [Addison-Wesley, 1995] (Item 5). He also endorses the protocol discovered by Don Box on the errata page for the aforementioned book.
  • Don Box demonstrates the nested class technique in C++ Report (published March 1996).
  • Stephen C. Dewhurst talks about the misuse(s) of conversion functions in his great book, C++ Gotchas [Addison-Wesley, 2002].
  • Angelika Langer and Klaus Kreft demonstrate how the streams' void* conversion works in their IOStreams tome, Standard C++ IOStreams and Locales.

Without the work of these people, and others who've come to the same conclusions, chances are that the safe bool idiom would not have seen the light of day.

Summary

As C++ programmers, we accept that there are a number of different ways to reach a programming goal, and different approaches typically involve different tradeoffs. Carefully balancing usability and safety is hard, and depending on which side of the fence you're standing on, you'll either be screaming "safety", or "usability" first. It's therefore especially satisfying to be able to present a solution to a common problem that appeals to both sides. As usual, I would have liked to say that this was my idea; but, also as usual, that is not the case. The man behind this ingenious idea is Peter Dimov [7,8], and it's to him we should send our thanks, as users and as library developers.

Acknowledgements

I'd like to thank the following people for their gracious help and suggestions with this article:
  • David Abrahams, for taking part in the discussion of when and how the safe bool idiom should be used, and in particular for making the point that the support for generic programming is important for certain types.
  • Chuck Allison, for his careful editing of the article (twice).
  • Stephen C. Dewhurst, for reviewing this article, and for making sure that I didn't forget to mention that there are plenty of times when the idiom should not be used. Or, as Steve said it, "sometimes it's reasonable to just say no".
  • Peter Dimov, for inventing the idiom, and for reviewing an early draft of this article.
  • Kevlin Henney, for clarifying what does, and what does not, really constitute an idiom.
  • Howard Hinnant, for taking part in the discussion of when and how the safe bool idiom should be used (and how to implement it properly).
  • Scott Meyers, for reviewing this article, and pointing out the (many) places where improvement was needed.
  • Matthew Wilson, for reviewing this article.

References

[1] Some people also refer to it as "the Boolean conversion operator".
[2] ISO/IEC 14882:98, §5.3.5/2.
[3] See Vandevoorde and Josuttis, C++ Templates, Addison-Wesley, 2003, pp. 392-393.
[4] If the tests are indeed valid, you can always enable them by defining the operators for your class.
[5] Don't miss Matthew's forthcoming book, Imperfect C++: Practical Solutions For Real-life Programming [Addison-Wesley 2004], where this topic and many more are covered in great detail!
[6] If you're aware of other important contributions, please let me know!
[7] More about Peter at http://www.boost.org/people/peter_dimov.htm.
[8] The twist of declaring (but not defining) operator== and operator!= was invented by Douglas Gregor, and the technique is sometimes referred to as "poisoning operators". I've added an (parameterized) implementation that fails to compile (when instantiated) rather than just defining the operators and waiting until link time for the error to appear.

About the Author

Bjorn Karlsson is proud to be a C++ designer, programmer, teacher, preacher, and student. He has finally learned enough about C++ to realize how little he knows. When not reading or writing articles, books, or code, he has the privilege to be a part of the Boost community, and a member of The C++ Source Advisory Board. He appreciates it when people send him interesting emails at [email protected].