ES Study Notes 9: Geolocation

geolocation


Elasticsearch offers two ways of representing geolocations: latitude-longitude points using the geo_point field type, and complex shapes defined in GeoJSON, using the geo_shape field type.

PUT /attractions
{
  "mappings": {
    "restaurant": {
      "properties": {
        "name": {
          "type": "string"
        },
        "location": {
          "type": "geo_point"
        }
      }
    }
  }
}
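Geo-points cannot be detected automatically by dynamic mapping, so the field must be declared as geo_point explicitly in the mapping.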

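A location can be supplied in three formats. As a string, the order is latitude, longitude:

PUT /attractions/restaurant/1
{
  "name":     "Chipotle Mexican Grill",
  "location": "40.715, -74.011"
}

As an object with explicit lat and lon keys (recommended, because the order cannot be mixed up):

PUT /attractions/restaurant/2
{
  "name":     "Pala Pizza",
  "location": {
    "lat":     40.722,
    "lon":    -73.989
  }
}

As an array, the order is longitude, latitude:

PUT /attractions/restaurant/3
{
  "name":     "Mini Munchies Pizza",
  "location": [ -73.983, 40.719 ]
}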
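Since the string and array forms use opposite coordinate orders, client code may want to normalize them before indexing. The following is a minimal Python sketch under that assumption; parse_geo_point is a hypothetical helper, not an Elasticsearch API, and it handles only the three formats shown above.

```python
def parse_geo_point(value):
    """Normalize the three geo_point input formats to a (lat, lon) tuple.

    Hypothetical helper for illustration; not part of Elasticsearch.
    """
    if isinstance(value, str):
        # String form: "lat,lon"
        lat, lon = (float(part) for part in value.split(","))
    elif isinstance(value, dict):
        # Object form: explicit keys, so there is no order to remember
        lat, lon = float(value["lat"]), float(value["lon"])
    elif isinstance(value, (list, tuple)) and len(value) == 2:
        # Array form: [lon, lat]
        lon, lat = (float(v) for v in value)
    else:
        raise ValueError(f"unsupported geo_point value: {value!r}")
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        raise ValueError(f"coordinates out of range: lat={lat}, lon={lon}")
    return lat, lon
```

All three restaurant documents above pass through this helper and come out in the same (lat, lon) order, which makes a swapped pair easier to spot.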

filtering by geo-point

Four geo-point filters can be used to include or exclude documents by geolocation:

  • geo_bounding_box: Find geo-points that fall within the specified rectangle.
  • geo_distance: Find geo-points within the specified distance of a central point, that is, inside a circle of that radius centered on the point.
  • geo_distance_range: Find geo-points within a specified minimum and maximum distance from a central point.
  • geo_polygon: Find geo-points that fall within the specified polygon. This filter is very expensive. If you find yourself wanting to use it, you should be looking at geo-shapes instead.

Geo-filters are expensive, so they should be used on as few documents as possible. First remove as many documents as you can with cheaper filters, like term or range filters, and apply the geo-filters last.
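The effect of ordering filters from cheap to expensive can be illustrated outside Elasticsearch. This is only a conceptual Python sketch: filter_cheap_first, the cuisine field, and the two predicates are invented for illustration and say nothing about how Elasticsearch executes filters internally.

```python
def filter_cheap_first(docs, cheap_filter, geo_filter):
    """Apply the cheap filter first; run the geo check only on survivors.

    Returns the matching docs and the number of geo checks performed.
    """
    geo_checks = 0
    matches = []
    for doc in docs:
        if not cheap_filter(doc):
            continue  # rejected cheaply, so the geo check is skipped
        geo_checks += 1
        if geo_filter(doc):
            matches.append(doc)
    return matches, geo_checks


# Hypothetical documents; "cuisine" is an invented field for this sketch.
docs = [
    {"name": "Chipotle Mexican Grill", "cuisine": "mexican", "lat": 40.715, "lon": -74.011},
    {"name": "Pala Pizza", "cuisine": "pizza", "lat": 40.722, "lon": -73.989},
    {"name": "Mini Munchies Pizza", "cuisine": "pizza", "lat": 40.719, "lon": -73.983},
]


def is_pizza(doc):
    """Stands in for a cheap term filter."""
    return doc["cuisine"] == "pizza"


def north_of_40_72(doc):
    """Stands in for an expensive geo-filter."""
    return doc["lat"] >= 40.72


matches, geo_checks = filter_cheap_first(docs, is_pizza, north_of_40_72)
```

Here the geo check runs on two documents instead of three, because the cheap filter has already discarded one.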

geo_bounding_box filter

GET /attractions/restaurant/_search
{
  "query": {
    "filtered": {
      "filter": {
        "geo_bounding_box": {
          "location": {
            "top_left": {
              "lat":  40.8,
              "lon": -74.0
            },
            "bottom_right": {
              "lat":  40.7,
              "lon": -73.0
            }
          }
        }
      }
    }
  }
}
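To see which of the three sample restaurants this bounding box would match, the rectangle test can be reproduced in a few lines of Python. This is a simplified sketch: in_bounding_box is a hypothetical helper and it assumes the box does not cross the 180° meridian.

```python
def in_bounding_box(lat, lon, top_left, bottom_right):
    """Return True if (lat, lon) lies inside the rectangle.

    Simplification: assumes the box does not cross the date line.
    """
    return (bottom_right["lat"] <= lat <= top_left["lat"]
            and top_left["lon"] <= lon <= bottom_right["lon"])


# The three documents indexed earlier in these notes.
restaurants = [
    ("Chipotle Mexican Grill", 40.715, -74.011),
    ("Pala Pizza", 40.722, -73.989),
    ("Mini Munchies Pizza", 40.719, -73.983),
]

top_left = {"lat": 40.8, "lon": -74.0}
bottom_right = {"lat": 40.7, "lon": -73.0}

inside = [name for name, lat, lon in restaurants
          if in_bounding_box(lat, lon, top_left, bottom_right)]
```

Chipotle Mexican Grill sits at longitude -74.011, just west of the box's left edge at -74.0, so only the two pizza places fall inside.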

optimizing bounding boxes

PUT /attractions
{
  "mappings": {
    "restaurant": {
      "properties": {
        "name": {
          "type": "string"
        },
        "location": {
          "type":    "geo_point",
          "lat_lon": true
        }
      }
    }
  }
}

The location.lat and location.lon fields will be indexed separately. These fields can be used for searching, but their values cannot be retrieved.

GET /attractions/restaurant/_search
{
  "query": {
    "filtered": {
      "filter": {
        "geo_bounding_box": {
          "type":    "indexed",
          "location": {
            "top_left": {
              "lat":  40.8,
              "lon": -74.0
            },
            "bottom_right": {
              "lat":  40.7,
              "lon": -73.0
            }
          }
        }
      }
    }
  }
}

Setting the type parameter to indexed (instead of the default memory) tells Elasticsearch to use the inverted index for this filter.

While a geo_point field can contain multiple geo-points, the lat_lon optimization can be used only on fields that contain a single geo-point.
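One way to picture the indexed bounding box is as a pair of range checks on the separately indexed location.lat and location.lon fields. The Python sketch below builds that pair as plain dictionaries; bounding_box_as_ranges is a hypothetical helper and the output is a conceptual equivalent, not a claim about Elasticsearch's internal implementation.

```python
def bounding_box_as_ranges(field, top_left, bottom_right):
    """Express a bounding box as two range clauses on <field>.lat / <field>.lon.

    Conceptual illustration only; assumes the box does not cross the date line.
    """
    return [
        {"range": {field + ".lat": {"gte": bottom_right["lat"],
                                    "lte": top_left["lat"]}}},
        {"range": {field + ".lon": {"gte": top_left["lon"],
                                    "lte": bottom_right["lon"]}}},
    ]


clauses = bounding_box_as_ranges(
    "location",
    top_left={"lat": 40.8, "lon": -74.0},
    bottom_right={"lat": 40.7, "lon": -73.0},
)
```

This also suggests why the optimization applies only to single-valued fields: with several points per document, a latitude from one point and a longitude from another could each satisfy their range without any single point being inside the box.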

geo_distance filter

GET /attractions/restaurant/_search
{
  "query": {
    "filtered": {
      "filter": {
        "geo_distance": {
          "distance": "1km",
          "location": {
            "lat":  40.715,
            "lon": -73.988
          }
        }
      }
    }
  }
}
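The circle test behind geo_distance can be approximated with the haversine great-circle formula. This Python sketch is for intuition only: haversine_km and within_distance are hypothetical helpers, the Earth radius is an assumed mean value, and Elasticsearch's own distance calculation may differ slightly.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_distance(point, center, distance_km):
    """True if point lies within distance_km of center; both are (lat, lon)."""
    return haversine_km(point[0], point[1], center[0], center[1]) <= distance_km


center = (40.715, -73.988)
pala_pizza = (40.722, -73.989)
chipotle = (40.715, -74.011)
```

With the 1km radius from the query, within_distance(pala_pizza, center, 1.0) holds, while Chipotle Mexican Grill, almost 2km to the west, falls outside the circle.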

geo_distance_range filter

The only difference between the geo_distance and geo_distance_range filters is that the latter has a doughnut shape and excludes documents within the central hole.

Instead of specifying a single distance from the center, you specify a minimum distance (with gt or gte) and maximum distance (with lt or lte), just like a range filter:

GET /attractions/restaurant/_search
{
  "query": {
    "filtered": {
      "filter": {
        "geo_distance_range": {
          "gte":    "1km",
          "lt":     "2km",
          "location": {
            "lat":  40.715,
            "lon": -73.988
          }
        }
      }
    }
  }
}

Matches locations that are at least 1km from the center, and less than 2km from the center.
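The boundary semantics of gt, gte, lt and lte can be made explicit with a small Python sketch. in_distance_range is a hypothetical helper that takes an already computed distance in kilometres; it mirrors the range-filter operators and nothing more.

```python
def in_distance_range(distance_km, gt=None, gte=None, lt=None, lte=None):
    """Check a distance against optional lower and upper bounds.

    gt/lt are exclusive bounds; gte/lte are inclusive bounds.
    """
    if gt is not None and not distance_km > gt:
        return False
    if gte is not None and not distance_km >= gte:
        return False
    if lt is not None and not distance_km < lt:
        return False
    if lte is not None and not distance_km <= lte:
        return False
    return True
```

With gte=1.0 and lt=2.0 as in the query above, a document exactly 1km away is included, while one exactly 2km away is excluded.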

caching geo-filters

The results of geo-filters are not cached by default, for two reasons:

  • Geo-filters are usually used to find entities that are near to a user’s current location. The problem is that users move, and no two users are in exactly the same location. A cached filter would have little chance of being reused.
  • Filters are cached as bitsets that represent all documents in a segment. Imagine that our query excludes all documents but one in a particular segment. An uncached geo-filter just needs to check the one remaining document, but a cached geo-filter would need to check all of the documents in the segment.

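Separately from caching, the memory used by geo-point fielddata can be reduced by storing it in the compressed format with a lower precision (1km in the example below). This setting can be changed on a live index with the update-mapping API: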

POST /attractions/_mapping/restaurant
{
  "location": {
    "type": "geo_point",
    "fielddata": {
      "format":    "compressed",
      "precision": "1km"
    }
  }
}

sorting by distance

GET /attractions/restaurant/_search
{
  "query": {
    "filtered": {
      "filter": {
        "geo_bounding_box": {
          "type":       "indexed",
          "location": {
            "top_left": {
              "lat":  40.8,
              "lon": -74.0
            },
            "bottom_right": {
              "lat":  40.4,
              "lon": -73.0
            }
          }
        }
      }
    }
  },
  "sort": [
    {
      "_geo_distance": {
        "location": {
          "lat":  40.715,
          "lon": -73.998
        },
        "order":         "asc",
        "unit":          "km",
        "distance_type": "plane"
      }
    }
  ]
}
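The plane distance type treats the area as flat, which is cheap and usually adequate over short distances. The Python sketch below uses the equirectangular approximation to sort the three sample restaurants from the same origin. It is an illustration under stated assumptions: plane_distance_km is a hypothetical helper, the Earth radius is an assumed mean value, the bounding-box filter from the query is left out, and Elasticsearch's exact formula may differ.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def plane_distance_km(lat1, lon1, lat2, lon2):
    """Flat-earth (equirectangular) distance approximation, in kilometres."""
    mean_lat = math.radians((lat1 + lat2) / 2)
    x = math.radians(lon2 - lon1) * math.cos(mean_lat)  # east-west component
    y = math.radians(lat2 - lat1)                       # north-south component
    return EARTH_RADIUS_KM * math.hypot(x, y)


origin = (40.715, -73.998)  # the sort origin used in the query above

restaurants = [
    {"name": "Chipotle Mexican Grill", "lat": 40.715, "lon": -74.011},
    {"name": "Pala Pizza", "lat": 40.722, "lon": -73.989},
    {"name": "Mini Munchies Pizza", "lat": 40.719, "lon": -73.983},
]

# Ascending order, like "order": "asc" in the _geo_distance sort.
by_distance = sorted(
    restaurants,
    key=lambda r: plane_distance_km(origin[0], origin[1], r["lat"], r["lon"]),
)
names = [r["name"] for r in by_distance]
```

In the real query the bounding-box filter runs first, so Chipotle Mexican Grill (west of the box) would not appear in the sorted results at all.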
