Nearest neighbor search with a k-d tree: a demonstration and step-by-step breakdown

In this tutorial we go over how to use a KdTree to find the K nearest neighbors of a specific point or location, and then how to find all neighbors within a radius specified by the user (in this case, chosen at random).
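To make the two queries concrete before any tree is involved, here is a minimal brute-force sketch in Python. The function names `k_nearest` and `within_radius` and the sample points are assumptions chosen for illustration, not part of the KdTree API referenced in this post. A k-d tree answers the same two questions without scanning every point.

```python
import heapq
import math

def k_nearest(points, query, k):
    """Return the k points closest to `query` by Euclidean distance (linear scan)."""
    return heapq.nsmallest(k, points, key=lambda p: math.dist(p, query))

def within_radius(points, query, radius):
    """Return every point whose distance to `query` is at most `radius`."""
    return [p for p in points if math.dist(p, query) <= radius]

# Hypothetical sample data, used only for illustration.
points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
query = (9, 2)

print(k_nearest(points, query, 2))
print(within_radius(points, query, 3.0))
```

Both functions cost time proportional to the number of points per query, which is the baseline a k-d tree is meant to improve on.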

Organization of a k-d tree

A k-d tree, or k-dimensional tree, is a data structure used in computer science for organizing some number of points in a space with k dimensions. It is a binary search tree with additional constraints imposed on it. K-d trees are very useful for range searches and nearest neighbor searches. Each level of a k-d tree splits all children along a specific dimension, using a hyperplane that is perpendicular to the corresponding axis. At the root of the tree all children are split on the first dimension (i.e., a point whose first-dimension coordinate is less than the root's goes into the left sub-tree, and a point whose coordinate is greater goes into the right sub-tree). Each level down in the tree divides on the next dimension, returning to the first dimension once all others have been exhausted. The most efficient way to build a k-d tree is to use a partition method like the one Quick Sort uses: place the median point at the root, everything with a smaller value in the splitting dimension to the left, and everything with a larger value to the right. You then repeat this procedure on both the left and right sub-trees until the last trees to be partitioned consist of a single element.

From Wikipedia [1]

Figure 1: k-d tree organization and space partitioning

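Figure 1 shows a two-dimensional k-d tree that partitions the point set $\{(2,3),(5,4),(9,6),(4,7),(8,1),(7,2)\}$. Partitioning starts at node A, $(7,2)$, with a hyperplane perpendicular to the X axis (in a two-dimensional coordinate system a hyperplane is a line): points whose x-coordinate is smaller than A's go to the left subtree, and points whose x-coordinate is larger go to the right subtree. The remaining nodes are partitioned in the same way.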
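The median-split construction can be sketched in Python on this six-point set. This is an illustrative sketch under stated assumptions: the `Node` class and `build_kdtree` function are hypothetical names, and the sketch sorts at every level for readability instead of using the Quick Sort style partition mentioned above, which would be faster on large inputs.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

Point = Tuple[float, ...]

@dataclass
class Node:
    point: Point                    # the point stored at this node
    axis: int                       # dimension this node splits on
    left: Optional["Node"] = None   # points with a smaller value on `axis`
    right: Optional["Node"] = None  # points with a larger value on `axis`

def build_kdtree(points: Sequence[Point], depth: int = 0) -> Optional[Node]:
    """Recursively build a k-d tree by splitting on the median point."""
    if not points:
        return None
    axis = depth % len(points[0])   # cycle through the dimensions level by level
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2         # median index (upper median for even sizes)
    return Node(
        point=ordered[mid],
        axis=axis,
        left=build_kdtree(ordered[:mid], depth + 1),
        right=build_kdtree(ordered[mid + 1:], depth + 1),
    )

points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(points)
print(tree.point)                          # root, split on x
print(tree.left.point, tree.right.point)   # second level, split on y
```

With this choice of median index the root is $(7,2)$, matching node A in the description of Figure 1. A different tie-breaking or median rule could produce a different but equally valid tree.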

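Nearest neighbor search with a k-d tree

The search process is shown in the animation in Figure 2:

Figure 2: Animated demonstration of k-d tree search

The animation in Figure 2 breaks down into the following steps: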

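(1) For the given k-d tree, the root node is A, which splits the two-dimensional space with a line perpendicular to the X axis: points whose x-coordinate is smaller than A's go to the left subtree, and points whose x-coordinate is larger go to the right subtree. The second-level nodes B and C split the space with lines perpendicular to the Y axis.

(2) Given a query point (the cross marked on the figure), how do we use the k-d tree to find its nearest neighbor (NN) and the distance to it? At this point neither is known.

(3) Start from the root of the k-d tree (node A) and perform a depth-first search, maintaining a stack to hold parent nodes. Take the distance from the query point to the root as the initial best estimate, then traverse A's left subtree.

(4) Compute the distance from the query point to the root's left child (node B) and compare it with the current best estimate. B is closer, so update the best estimate, then continue the traversal, left subtree first and right subtree second.

(5) B's children D and E are both farther from the query point than B, so within B's sub-branch B remains the best estimate. Traversal of the left subtree is now complete, and following depth-first order the search returns to traverse the right subtree.

(6) The right subtree is traversed in the same way. Once all of A's descendants have been searched, B is the best estimate for the whole tree, i.e., B is the nearest neighbor of the query point, and the distance from the query point to its nearest neighbor follows directly.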
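The steps above can be sketched in Python as a depth-first search that carries a best estimate. This is a sketch under stated assumptions, not the implementation behind the animation: `Node`, `build_kdtree`, and `nearest` are hypothetical names, recursion stands in for the explicit stack, and the sample points and query are chosen for illustration. The sketch also differs from the walkthrough in two ways that are common refinements: it descends first into the side of the split that contains the query, and it skips the other side when the splitting line is farther away than the current best estimate.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

Point = Tuple[float, ...]

@dataclass
class Node:
    point: Point
    axis: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def build_kdtree(points: Sequence[Point], depth: int = 0) -> Optional[Node]:
    """Median-split construction (sorting at each level for simplicity)."""
    if not points:
        return None
    axis = depth % len(points[0])
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return Node(ordered[mid], axis,
                build_kdtree(ordered[:mid], depth + 1),
                build_kdtree(ordered[mid + 1:], depth + 1))

def nearest(node: Optional[Node], query: Point,
            best: Optional[Tuple[Point, float]] = None):
    """Return (nearest_point, distance) for `query`, or None for an empty tree."""
    if node is None:
        return best
    # Compare this node with the current best estimate and update if closer.
    d = math.dist(node.point, query)
    if best is None or d < best[1]:
        best = (node.point, d)
    # Signed distance from the query to this node's splitting line.
    diff = query[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    # Depth-first: search the side containing the query first.
    best = nearest(near, query, best)
    # The far side can only hold a closer point if the splitting line
    # is closer to the query than the best estimate found so far.
    if abs(diff) < best[1]:
        best = nearest(far, query, best)
    return best

points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(points)
point, distance = nearest(tree, (9, 2))
print(point, distance)
```

Removing the `abs(diff) < best[1]` check turns this into the exhaustive traversal described in steps (3) to (6); the result is the same, only more nodes are visited.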

References

[1] k-d tree - Wikipedia
[2] For a code implementation, see http://pointclouds.org/documentation/tutorials/kdtree_search.php
