cassandra clustering key 的查詢原理

Suppose your clustering keys are

k1 t1, k2 t2, ..., kn tn

where ki is the ith key name and ti is the ith key type. Then the order data is stored in is lexicographic ordering where each dimension is compared using the comparator for that type.

So (a1, a2, ..., an) < (b1, b2, ..., bn) if a1 < b1 using t1 comparator, or a1=b1 and a2 < b2 using t2 comparator, or (a1=b1 and a2=b2) and a3 < b3 using t3 comparator, etc..

This means that it is efficient to find all rows with a certain k1=a, since the data is stored together. But it is inefficient to find all rows with ki=x for i > 1. In fact, such a query isn't allowed - the only clustering key constraints that are allowed specify zero or more clustering keys, starting from the first with none missing.

For example, consider the schema

create table clustering (
    x text,
    k1 text,
    k2 int,
    k3 timestamp,
    y text,
    primary key (x, k1, k2, k3)
);

If you did the following inserts:

insert into clustering (x, k1, k2, k3, y) values ('x', 'a', 1, '2013-09-10 14:00+0000', '1');
insert into clustering (x, k1, k2, k3, y) values ('x', 'b', 1, '2013-09-10 13:00+0000', '1');
insert into clustering (x, k1, k2, k3, y) values ('x', 'a', 2, '2013-09-10 13:00+0000', '1');
insert into clustering (x, k1, k2, k3, y) values ('x', 'b', 1, '2013-09-10 14:00+0000', '1');

then they are stored in this order on disk (the order select * from clustering where x = 'x'returns):

 x | k1 | k2 | k3                       | y
---+----+----+--------------------------+---
 x |  a |  1 | 2013-09-10 14:00:00+0000 | 1
 x |  a |  2 | 2013-09-10 13:00:00+0000 | 1
 x |  b |  1 | 2013-09-10 13:00:00+0000 | 1
 x |  b |  1 | 2013-09-10 14:00:00+0000 | 1

k1 ordering dominates, then k2, then k3.

primary key決定了在哪個node上，cluster key 決定的是存儲的順序，而且是按照cluster key1, cluster key2, cluster key3 的順序來存儲的，所以上例子中，：

select * from clustering where x='x' and k1='a', 很容易查，但是select * from clustering where x='x' and k2='b',這個時候得先把k1=＊查出來，然後再找k2='b'的，所以沒有意義了。

發表評論

所有評論

還沒有人評論，想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.

cassandra clustering key 的查詢原理

CQL IN 語法的應用方法

一句話講明白 pthread_key_t和pthread_key_create()

如何區分cassandra裏primary key,partition key,cluster key,clustering key

TCP 和UDP 的區別

cassandra ALLOW FILTER 的工作原理

https://yachay.unat.edu.pe/blog/index.php?comment_area=format_blog&comment_component=blog&comment_co

linux以太網驅動總結