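- Left join: the left table cannot be broadcast (each segment would emit the unmatched left rows again)
- Right join: the right table cannot be broadcast (same reason, mirrored)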
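The two restrictions above can be illustrated with a small simulation. This is only a sketch: the two "segments" are plain Python lists, and the helper names `left_join` and `simulate_broadcast_left` are hypothetical, not part of any database API. It shows that broadcasting the left table of a left join yields extra NULL-extended rows.

```python
def left_join(left_rows, right_rows):
    """Local left join on the first element of each tuple."""
    out = []
    for l in left_rows:
        matches = [r for r in right_rows if r[0] == l[0]]
        if matches:
            out.extend((l, r) for r in matches)
        else:
            # Unmatched left row is preserved with a NULL right side.
            out.append((l, None))
    return out


def simulate_broadcast_left(left_rows, right_by_segment):
    """Copy the whole left table to every segment, join locally, gather."""
    result = []
    for right_rows in right_by_segment:
        result.extend(left_join(left_rows, right_rows))
    return result


left = [(1,), (2,), (3,)]
right_by_segment = [[(1,)], [(2,)]]  # right table spread over two toy segments

# Reference answer: join against the complete right table.
correct = left_join(left, [r for seg in right_by_segment for r in seg])
# What a (wrong) plan that broadcasts the left table would return.
broadcast = simulate_broadcast_left(left, right_by_segment)

print(len(correct), len(broadcast))
```

Each segment sees only part of the right table, so it treats left rows whose match lives elsewhere as unmatched. That is why the preserved side of an outer join has to stay partitioned.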
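Inner join
- Case 1: select * from test_table_5000 as t1, test_table_10000 as t2 where t1.id=t2.id (both join keys are distribution keys, so no redistribution is involved)
- Case 2: select * from test_table_5000 as t1, test_table_10000 as t2 where t1.id=t2.id2
  - Option 1: broadcast t1 to every segment; data moved: N*6 (N = rows in t1, 6 = number of segments)
  - Option 2: redistribute t2 by id2; data moved: M (M = rows in t2)
- Case 3: select * from test_table_5000 as t1, test_table_10000 as t2 where t1.id2=t2.id2
  - Option 1: redistribute both tables by id2; data moved: M+N
  - Option 2: broadcast the smaller table; data moved: N*(number of segments)
- Left distribution key joined to a right non-distribution key
- Join in which neither key is a distribution key: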
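The inner-join cases above amount to comparing how many rows each strategy moves. The sketch below restates that arithmetic; the function name `inner_join_options` and the idea of picking the minimum are illustrative assumptions. A real planner uses its own cost estimates, not just row counts.

```python
def inner_join_options(n_rows, m_rows, segments, left_on_dist_key, right_on_dist_key):
    """Return {strategy: rows moved} for an inner join of t1 (n_rows) and t2 (m_rows).

    left_on_dist_key / right_on_dist_key say whether each table's join column
    is its distribution key.
    """
    if left_on_dist_key and right_on_dist_key:
        return {"no motion": 0}  # Case 1: rows are already co-located
    options = {}
    if left_on_dist_key:
        options["redistribute t2"] = m_rows            # Case 2, option 2
    elif right_on_dist_key:
        options["redistribute t1"] = n_rows
    else:
        options["redistribute both"] = n_rows + m_rows  # Case 3, option 1
    # Either side of an inner join may be broadcast: one full copy per segment.
    options["broadcast t1"] = n_rows * segments
    options["broadcast t2"] = m_rows * segments
    return options


def cheapest(options):
    """Pick the strategy that moves the fewest rows."""
    return min(options, key=options.get)


case2 = inner_join_options(5000, 10000, 6, True, False)
case3 = inner_join_options(5000, 10000, 6, False, False)
tiny_left = inner_join_options(100, 10000, 6, False, False)
print(cheapest(case2), cheapest(case3), cheapest(tiny_left))
```

With N=5000, M=10000 and 6 segments, redistribution moves fewer rows in both cases. Broadcasting only wins when one table is much smaller than the other, as in the `tiny_left` example.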
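Left join
- Case 1: select * from test_table_5000 as t1 left join test_table_10000 as t2 on t1.id=t2.id (both join keys are distribution keys, so neither broadcast nor redistribution is involved)
- Case 2: select * from test_table_5000 as t1 left join test_table_10000 as t2 on t1.id=t2.id2 (t2 is redistributed; data moved: M)
- Case 3: select * from test_table_5000 as t1 left join test_table_10000 as t2 on t1.id2=t2.id
  - Option 1: redistribute the left table by id2; data moved: N
  - Option 2: broadcast the right table; data moved: M*6
- Case 4: select * from test_table_5000 as t1 left join test_table_10000 as t2 on t1.id2=t2.id2
  - Option 1: redistribute both tables by id2; data moved: N+M
  - Option 2: broadcast the right table; data moved: M*6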
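The left-join cases can be written as the same kind of row-count comparison, with broadcasting the left table ruled out. This is a sketch under the notes' own assumptions (N rows in t1, M rows in t2); `left_join_options` is a hypothetical helper, not a planner API.

```python
def left_join_options(n_rows, m_rows, segments, left_on_dist_key, right_on_dist_key):
    """Return {strategy: rows moved} for t1 LEFT JOIN t2.

    The left (preserved) table t1 is never broadcast.
    """
    if left_on_dist_key and right_on_dist_key:
        return {"no motion": 0}                     # Case 1
    if left_on_dist_key:
        # Case 2: broadcasting t2 would move m_rows * segments >= m_rows,
        # so only redistribution is listed.
        return {"redistribute t2": m_rows}
    options = {"broadcast t2": m_rows * segments}   # Option 2 of cases 3 and 4
    if right_on_dist_key:
        options["redistribute t1"] = n_rows         # Case 3, option 1
    else:
        options["redistribute both"] = n_rows + m_rows  # Case 4, option 1
    return options


case3 = left_join_options(5000, 10000, 6, False, True)
case4 = left_join_options(5000, 10000, 6, False, False)
small_right = left_join_options(1_000_000, 100, 6, False, False)
print(case3, case4, min(small_right, key=small_right.get))
```

Broadcasting the right table pays off only when M*(number of segments) is smaller than the redistribution cost, as in the `small_right` example with a 100-row right table.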
Full join
- Case 1: select * from test_table_5000 as t1 full outer join test_table_10000 as t2 on t1.id=t2.id; (both join keys are distribution keys, so neither redistribution nor broadcast is involved)
- Case 2: select * from test_table_5000 as t1 full outer join test_table_10000 as t2 on t1.id=t2.id2 (however large t2 is, the only option is to redistribute t2)
- Case 3: select * from test_table_5000 as t1 full outer join test_table_10000 as t2 on t1.id2=t2.id2 (both tables are redistributed)
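For a full join both sides are preserved, so broadcast is not available for either table. The only decision left is which tables need redistributing. A minimal sketch, with the hypothetical helper `full_join_motions` and the same N/M row counts as above:

```python
def full_join_motions(n_rows, m_rows, left_on_dist_key, right_on_dist_key):
    """Return {motion: rows moved} for t1 FULL OUTER JOIN t2.

    Any table whose join column is not its distribution key is redistributed;
    nothing is ever broadcast.
    """
    motions = {}
    if not left_on_dist_key:
        motions["redistribute t1"] = n_rows
    if not right_on_dist_key:
        motions["redistribute t2"] = m_rows
    return motions


print(full_join_motions(5000, 10000, True, True))    # case 1
print(full_join_motions(5000, 10000, True, False))   # case 2
print(full_join_motions(5000, 10000, False, False))  # case 3
```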