Binary search and lower_bound, upper_bound

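A conventional binary search answers one question: does the target exist in a sorted array? The return value is a boolean (true or false). Sometimes we need more than that. If the target exists, what are the indices of its first (leftmost) and last occurrences in the array? If it does not exist, at which position should it be inserted so that the array stays sorted?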

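So it is worth looking at the binary search algorithm in more depth. The C++ standard library provides several related functions: binary_search(), lower_bound() and upper_bound().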


binary_search()

Test if value exists in sorted sequence
Returns true if any element in the range [first,last) is equivalent to val, and false otherwise.
The elements are compared using operator< for the first version, and comp for the second. Two elements, a and b are considered equivalent if (!(a< b) && !(b< a)) or if (!comp(a,b) && !comp(b,a)).
The elements in the range shall already be sorted according to this same criterion (operator< or comp), or at least partitioned with respect to val.
The function optimizes the number of comparisons performed by comparing non-consecutive elements of the sorted range, which is specially efficient for random-access iterators.

The behavior of this function template is equivalent to:

template <class ForwardIterator, class T>
  bool binary_search (ForwardIterator first, ForwardIterator last, const T& val)
{
  first = std::lower_bound(first,last,val);
  return (first!=last && !(val<*first));
}
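As a minimal usage sketch of the two versions described above, the helpers below call binary_search once with operator< and once with comp. The helper names containsAscending and containsDescending are made up for illustration; the point is that the comparator passed to binary_search must be the one the range was sorted with.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical helper: membership test on a vector sorted in ascending
// order, using the first version (operator<).
bool containsAscending(const std::vector<int>& sorted, int val) {
  assert(std::is_sorted(sorted.begin(), sorted.end()));
  return std::binary_search(sorted.begin(), sorted.end(), val);
}

// Hypothetical helper: membership test on a vector sorted in descending
// order, using the second version with comp = std::greater<int>.
bool containsDescending(const std::vector<int>& sorted, int val) {
  assert(std::is_sorted(sorted.begin(), sorted.end(), std::greater<int>()));
  return std::binary_search(sorted.begin(), sorted.end(), val,
                            std::greater<int>());
}
```

Calling the first version on a descending range would break the sortedness precondition, so the result could not be relied on.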

lower_bound() and upper_bound()

lower_bound()

Return iterator to lower bound
Returns an iterator pointing to the first element in the range [first,last) which does not compare less than val.
The elements are compared using operator< for the first version, and comp for the second. The elements in the range shall already be sorted according to this same criterion (operator< or comp), or at least partitioned with respect to val.
The function optimizes the number of comparisons performed by comparing non-consecutive elements of the sorted range, which is specially efficient for random-access iterators.
Unlike upper_bound, the value pointed by the iterator returned by this function may also be equivalent to val, and not only greater.

upper_bound()

Return iterator to upper bound
Returns an iterator pointing to the first element in the range [first,last) which compares greater than val.
The elements are compared using operator< for the first version, and comp for the second. The elements in the range shall already be sorted according to this same criterion (operator< or comp), or at least partitioned with respect to val.
The function optimizes the number of comparisons performed by comparing non-consecutive elements of the sorted range, which is specially efficient for random-access iterators.
Unlike lower_bound, the value pointed by the iterator returned by this function cannot be equivalent to val, only greater.

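  vector<int> arr{ 2, 4, 6, 6, 8, 10 };
  // first element that is not less than 6, i.e. the first 6 (index 2)
  auto iter_low = lower_bound( arr.begin(), arr.end(), 6 );
  // dereferencing iter_low and prev(iter_low) is safe here only because 6 is present and is not the first element
  cout << "lower_bound is : " << *iter_low << " the one before is :" << *prev(iter_low) << endl;

  // first element that is greater than 6, i.e. 8 (index 4)
  auto iter_upper = upper_bound( arr.begin(), arr.end(), 6 );
  cout << "upper_bound is : " << *iter_upper << endl;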

The output is:

lower_bound is : 6 the one before is :4
upper_bound is : 8

Searching the same array for 3, which it does not contain:

  auto iter_low = lower_bound( arr.begin(), arr.end(), 3 );
  cout << "lower_bound is : " << *iter_low << " the one before is :" << *prev(iter_low) << endl;

  auto iter_upper = upper_bound( arr.begin(), arr.end(), 3 );
  cout << "upper_bound is : " << *iter_upper << endl;

The output is:

lower_bound is : 4 the one before is :2
upper_bound is : 4

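As the examples show, lower_bound and upper_bound give us exactly what we wanted. If the value exists, lower_bound returns an iterator to its first occurrence and upper_bound returns an iterator to the position just past its last occurrence. If the value does not exist, both return an iterator to the position where it should be inserted.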
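One consequence of this is that the distance between the two iterators is the number of occurrences of the value. The sketch below illustrates that under the assumption of an ascending vector of int; the IndexRange struct and the findRange function are hypothetical names introduced only for this example. The assert cross-checks the result against std::equal_range, which returns the same pair of iterators in a single call.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical result type: the half-open index range [first, past_last)
// occupied by the value. When the value is absent the range is empty and
// `first` is the index at which the value could be inserted.
struct IndexRange {
  std::ptrdiff_t first;
  std::ptrdiff_t past_last;
  std::ptrdiff_t count() const { return past_last - first; }
};

IndexRange findRange(const std::vector<int>& sorted, int val) {
  auto low = std::lower_bound(sorted.begin(), sorted.end(), val);
  auto up = std::upper_bound(sorted.begin(), sorted.end(), val);
  // equal_range yields the pair (lower_bound, upper_bound).
  assert(std::equal_range(sorted.begin(), sorted.end(), val) ==
         std::make_pair(low, up));
  return {low - sorted.begin(), up - sorted.begin()};
}
```

For arr = { 2, 4, 6, 6, 8, 10 }, findRange(arr, 6) gives the range [2, 4) with a count of 2, while findRange(arr, 3) gives the empty range [1, 1).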

In some situations we need the index of that position rather than an iterator. For a vector, subtracting begin() from the returned iterator gives the index.

  vector<int> arr{ 2, 4, 6, 6, 8, 10 };
  auto iter_low = lower_bound( arr.begin(), arr.end(), 6 );
  if ( iter_low != arr.end() ) {
    auto idx = iter_low - arr.begin();
    cout << "index of lower bound : " << idx << endl;
    cout << "lower_bound is : " << arr[idx];
    // arr[idx-1] exists only when the lower bound is not the first element
    if ( idx > 0 ) {
      cout << " the one before is :" << arr[idx-1];
    }
    cout << endl;
  }
  else {
    cout << "lower_bound is beyond the range of array" << endl;
  }

  auto iter_upper = upper_bound( arr.begin(), arr.end(), 6 );
  // upper_bound returns end() when no element is greater than the value
  if ( iter_upper != arr.end() ) {
    cout << "upper_bound is : " << *iter_upper << endl;
  }

Output:

index of lower bound : 2
lower_bound is : 6 the one before is :4
upper_bound is : 8
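The index is also what answers the insertion question from the introduction. As a small sketch (insertSorted is a hypothetical helper, not a library function), inserting at the lower bound keeps an ascending vector sorted and reports where the new element landed:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: insert val into an ascending vector at its lower
// bound and return the index it was inserted at. Using upper_bound instead
// would place val after any equal elements rather than before them.
std::size_t insertSorted(std::vector<int>& sorted, int val) {
  auto pos = std::lower_bound(sorted.begin(), sorted.end(), val);
  // Compute the index before insert(), which may invalidate iterators.
  auto idx = static_cast<std::size_t>(pos - sorted.begin());
  sorted.insert(pos, val);
  assert(std::is_sorted(sorted.begin(), sorted.end()));
  return idx;
}
```

With arr = { 2, 4, 6, 6, 8, 10 }, insertSorted(arr, 3) returns 1 and arr becomes { 2, 3, 4, 6, 6, 8, 10 }.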

Implementation of binary_search, upper_bound and lower_bound

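// [beg, end] is a closed interval: end is the index of the last element to search
int find_position( const vector<int> &arr, int beg, int end, int target )
{
  const int last = end;  // remember the original right boundary
  int mid = 0;
  // binary_search
  while ( beg <= end ) {
    // target > arr[i] for all i < beg
    // target <= arr[i] for all i > end
    mid = beg + (( end - beg) >> 1);
    if ( arr[mid] >= target ) {
      end = mid - 1;
    }
    else {
      beg = mid + 1;
    }
  }
  // beg is now the lower bound; it equals last + 1 when every element is less than target,
  // so check the range before reading arr[beg]
  return beg <= last && arr[beg] == target;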

  // lower_bound
  /*while ( beg <= end ) {
    // target > arr[i] for all i < beg
    // target <= arr[i] for all i > end
    mid = beg + (( end - beg) >> 1);
    if ( arr[mid] >= target ) {
      end = mid - 1;
    }
    else {
      beg = mid + 1;
    }
  }
  return beg;*/

  // upper bound
  /*while ( beg <= end ) {
    // target >= arr[i] for all i < beg
    // target < arr[i] for all i > end
    mid = beg + (( end - beg) >> 1);
    if ( arr[mid] > target ) {
      end = mid - 1;
    }
    else {
      beg = mid + 1;
    }
  }
  return beg;*/
}
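To try the lower_bound and upper_bound variants without commenting code in and out, they can be split into separate functions. The following is a sketch under the same closed-interval convention (end is the last valid index); the names lowerBoundClosed, upperBoundClosed and matchesStd are made up for this example, and matchesStd simply compares the results with the library functions over the whole vector.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// First index in [beg, end + 1] whose element is >= target;
// returns end + 1 when every element is less than target.
int lowerBoundClosed(const std::vector<int>& arr, int beg, int end, int target) {
  while (beg <= end) {
    int mid = beg + ((end - beg) >> 1);
    if (arr[mid] >= target) {
      end = mid - 1;
    } else {
      beg = mid + 1;
    }
  }
  return beg;
}

// First index in [beg, end + 1] whose element is > target;
// returns end + 1 when no element is greater than target.
int upperBoundClosed(const std::vector<int>& arr, int beg, int end, int target) {
  while (beg <= end) {
    int mid = beg + ((end - beg) >> 1);
    if (arr[mid] > target) {
      end = mid - 1;
    } else {
      beg = mid + 1;
    }
  }
  return beg;
}

// Compare both variants with the standard library on the whole vector.
bool matchesStd(const std::vector<int>& arr, int target) {
  assert(std::is_sorted(arr.begin(), arr.end()));
  int last = static_cast<int>(arr.size()) - 1;  // -1 for an empty vector
  auto lo = std::lower_bound(arr.begin(), arr.end(), target) - arr.begin();
  auto up = std::upper_bound(arr.begin(), arr.end(), target) - arr.begin();
  return lowerBoundClosed(arr, 0, last, target) == lo &&
         upperBoundClosed(arr, 0, last, target) == up;
}
```

The only difference between the two loops is the comparison (>= versus >), which decides on which side of a run of equal elements the search ends.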

Version 2

bool binarySearch(const vector<int>& arr, int target) {
  if (arr.empty()) return false;  // otherwise size() - 1 wraps around and arr[0] is read below
  int beg = 0, end = arr.size() - 1;  // end is the last index
  int mid = 0;
  // binary_search
  while (beg < end) {
    // target > arr[i] for all i < beg
    // target <= arr[i] for all i > end
    mid = beg + ((end - beg) >> 1);
    if (arr[mid] == target) {
      return true;
    } else if (arr[mid] > target) {
      end = mid;
    } else {
      beg = mid + 1;
    }
  }
  return arr[beg] == target;
}

int lowerBound(const vector<int>& arr, int target) {
  int beg = 0, end = arr.size();  // end is the size of array
  int mid = 0;
  // lower_bound on the half-open range [beg, end)
  while (beg < end) {
    // target > arr[i] for all i < beg
    // target <= arr[i] for all i >= end
    mid = beg + (( end - beg) >> 1);
    if (arr[mid] >= target) {
      end = mid;
    } else {
      beg = mid + 1;
    }
  }

  //cout << "lower bound = " << lower_bound(arr.begin(), arr.end(), target) - arr.begin() << " beg = " << beg << endl; 
  return beg;
}


int upperBound(const vector<int>& arr, int target) {
  int beg = 0, end = arr.size();  // end is the size of array
  int mid = 0;
  // upper_bound on the half-open range [beg, end)
  while (beg < end) {
    // target >= arr[i] for all i < beg
    // target < arr[i] for all i >= end
    mid = beg + (( end - beg) >> 1);
    if (arr[mid] > target) {
      end = mid;
    } else {
      beg = mid + 1;
    }
  }

  //cout << "upper bound = " << upper_bound(arr.begin(), arr.end(), target) - arr.begin() << " beg = " << beg << endl; 
  return beg;
}
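The index-based loops above assume a vector. As a rough sketch of how the same half-open search could be written over iterators, closer in spirit to the library signatures quoted earlier, the template below uses std::distance and std::advance so that it also works on containers without operator[]. The names lowerBoundIter and lowerBoundIndex are made up here, and this is not claimed to be the library's actual source.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>

// Hypothetical iterator-based lower bound over the half-open range
// [first, last), compared with operator<.
template <class ForwardIt, class T>
ForwardIt lowerBoundIter(ForwardIt first, ForwardIt last, const T& val) {
  auto count = std::distance(first, last);  // elements still in the range
  while (count > 0) {
    auto step = count / 2;
    ForwardIt mid = first;
    std::advance(mid, step);
    if (*mid < val) {
      // Everything up to and including mid is less than val: skip it.
      first = ++mid;
      count -= step + 1;
    } else {
      // *mid may be the answer: keep it inside the range.
      count = step;
    }
  }
  return first;
}

// Index of the lower bound in a std::list, which has no operator[].
std::ptrdiff_t lowerBoundIndex(const std::list<int>& sorted, int val) {
  assert(std::is_sorted(sorted.begin(), sorted.end()));
  return std::distance(sorted.begin(),
                       lowerBoundIter(sorted.begin(), sorted.end(), val));
}
```

On a list the number of comparisons is still logarithmic, but std::advance has to walk the nodes one by one, which is why the quoted description singles out random-access iterators as the efficient case.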