Quickly loading pandas data into Elasticsearch

Original post

2019-04-09 22:23

Without further ado, suppose your data looks roughly like this:

In [25]: df[:1]
Out[25]: 
                               timestamp    open    high     low   close  \
timestamp                                                                  
2019-01-01 08:00:00  2019-01-01 08:00:00  3703.5  3703.5  3703.0  3703.0   

                       volume  e_open  e_high  e_low  e_close  e_volume  \
timestamp                                                                 
2019-01-01 08:00:00  524924.0  133.35  133.35  133.3   133.35   77507.0   

                          rate  change change_str  cum_change cum_change_str  
timestamp                                                                     
2019-01-01 08:00:00  27.769029    -0.0     -0.00%        -0.0         -0.00% 

In [26]: df.columns
Out[26]: 
Index(['timestamp', 'open', 'high', 'low', 'close', 'volume', 'e_open',
       'e_high', 'e_low', 'e_close', 'e_volume', 'rate', 'change',
       'change_str', 'cum_change', 'cum_change_str'],
      dtype='object')

Before loading the data into Elasticsearch, create the index first, mainly to set the format of the timestamp field:

PUT btc_eth_data
{
  "mappings": {
    "doc": {
      "properties": {
        "timestamp": {
          "type": "date",
          "format": "yyyy-MM-dd HH:mm:ss"
        }
      }
    }
  }
}
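
Because the mapping fixes the date format, it can be worth checking the timestamp values before sending anything. One thing to watch for: if the timestamp column has been parsed into pandas datetimes, `DataFrame.to_json` writes it as epoch milliseconds by default, which would not match "yyyy-MM-dd HH:mm:ss"; a column that is still a plain string such as "2019-01-01 08:00:00" is written unchanged. The following is only a rough sketch of such a check. `PY_FORMAT`, `matches_mapping_format` and `bad_timestamps` are hypothetical names introduced here, and the check uses Python's `strptime` as a stand-in for the parsing that Elasticsearch actually performs.

```python
from datetime import datetime

# Assumed Python strptime equivalent of the mapping's "yyyy-MM-dd HH:mm:ss".
# Elasticsearch does the real parsing; this is only a local sanity check.
PY_FORMAT = "%Y-%m-%d %H:%M:%S"


def matches_mapping_format(value) -> bool:
    """Return True if value is a string that parses with PY_FORMAT."""
    if not isinstance(value, str):
        return False
    try:
        datetime.strptime(value, PY_FORMAT)
    except ValueError:
        return False
    return True


def bad_timestamps(records, field="timestamp"):
    """Return the records whose timestamp does not fit the expected format."""
    return [r for r in records if not matches_mapping_format(r.get(field))]


# Illustrative records, shaped like the rows of the DataFrame above.
records = [
    {"timestamp": "2019-01-01 08:00:00", "open": 3703.5},   # fits the mapping
    {"timestamp": 1546329600000, "open": 3703.5},           # epoch milliseconds
    {"timestamp": "2019-01-01T08:00:00", "open": 3703.5},   # ISO 8601 with "T"
]

print(len(bad_timestamps(records)))
```

If the check reports mismatches for a datetime column, one option is to convert it to strings first, for example with `df["timestamp"].dt.strftime("%Y-%m-%d %H:%M:%S")`, before calling `to_json`.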

Here is the code:

pandas2es.py

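import json

import pandas as pd
from elasticsearch import Elasticsearch

es = Elasticsearch()

# Suppose you have a batch of data: load it into a DataFrame and do any necessary processing
df = pd.read_csv("/Users/lex/Code/bitmex_arbitrage/data.csv")

#
# Data processing goes here,
# then prepare the data for Elasticsearch
#

# One JSON document per line
df_as_json = df.to_json(orient='records', lines=True)
bulk_data = []

for json_document in df_as_json.split('\n'):
    if not json_document:
        continue  # skip blank lines, e.g. after a trailing newline
    # Each document is preceded by its action line
    bulk_data.append({"index": {
        '_index': "btc_eth_data",
        '_type': "doc",
    }})
    bulk_data.append(json.loads(json_document))
    # Each bulk request carries 1000 documents (two list entries per document)
    if len(bulk_data) >= 2000:
        es.bulk(body=bulk_data)
        bulk_data = []

# Send whatever is left over
if bulk_data:
    es.bulk(body=bulk_data)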
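
The batching logic can also be pulled out of the script so that it can be tried without a running cluster. The sketch below is one possible way to do that, not the original author's code: `iter_bulk_bodies` and `failed_items` are hypothetical helper names, and the batch size of 1000 documents simply follows the comment in the script. `failed_items` relies on the Bulk API response carrying a top-level `errors` flag and a per-document `items` list.

```python
import json
from typing import Iterator, List


def iter_bulk_bodies(json_lines: str, index: str, doc_type: str = "doc",
                     docs_per_request: int = 1000) -> Iterator[List[dict]]:
    """Yield bulk bodies holding at most docs_per_request documents each."""
    body: List[dict] = []
    for line in json_lines.split("\n"):
        if not line.strip():
            continue  # ignore blank lines such as a trailing newline
        # Action line first, then the document itself.
        body.append({"index": {"_index": index, "_type": doc_type}})
        body.append(json.loads(line))
        if len(body) >= 2 * docs_per_request:
            yield body
            body = []
    if body:
        yield body  # final, partially filled batch


def failed_items(response: dict) -> List[dict]:
    """Return the per-document results that carry an "error" entry."""
    if not response.get("errors"):
        return []
    return [item for item in response.get("items", [])
            if any("error" in result for result in item.values())]


# Hypothetical usage with the client and DataFrame from the script above:
#
#   lines = df.to_json(orient="records", lines=True)
#   for body in iter_bulk_bodies(lines, "btc_eth_data"):
#       response = es.bulk(body=body)
#       for item in failed_items(response):
#           print(item)
```

A bulk request can succeed as a whole while individual documents are rejected, for instance when a timestamp does not match the mapping, so inspecting the response is usually worthwhile. The `elasticsearch.helpers.bulk` function in the Python client offers similar chunking out of the box.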
