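Greenplum Table Management in Practice, Part 1
This article introduces and practices how to create, alter, and drop tables, including temporary tables. It also covers table constraints (NOT NULL, UNIQUE, primary and foreign keys, defaults) and walks through simple experiments that insert, update, and delete data.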
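First, note that a table in Greenplum Database is similar to a table in any relational database, except that its rows are distributed across the different segments in the system. When you create a table, you specify its distribution policy.
Table distribution policies, special storage attributes, and partitioning of large tables will be summarized in Part 2 of this series.
When creating a table in Greenplum, you define:
- The columns of the table and their data types. See Choosing Column Data Types.
- Any table or column constraints that limit the data a column or table can contain. See Setting Table and Column Constraints.
- The distribution policy of the table, which determines how Greenplum Database divides data across the segments. See Choosing the Table Distribution Policy.
- The way the table is stored on disk. See Choosing the Table Storage Model.
- The table partitioning strategy for large tables. See Creating and Managing Databases.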
1 Creating a table
create table department(deptid int not null,
deptname varchar(20),
createtime timestamp)
DISTRIBUTED BY(deptid);
archdata=# create table department(deptid int not null,
archdata(# deptname varchar(20),
archdata(# createtime timestamp)
archdata-# DISTRIBUTED BY(deptid);
CREATE TABLE
archdata=#
archdata=#
archdata=# \dt department
List of relations
Schema | Name | Type | Owner | Storage
--------+------------+-------+---------+---------
public | department | table | gpadmin | heap
(1 row)
archdata=# \dt+ department
List of relations
Schema | Name | Type | Owner | Storage | Description
--------+------------+-------+---------+---------+-------------
public | department | table | gpadmin | heap |
(1 row)
Check the table size
select pg_size_pretty(pg_relation_size('department'));
archdata=# select pg_size_pretty(pg_relation_size('department'));
pg_size_pretty
----------------
0 bytes
(1 row)
archdata=#
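pg_relation_size reports only the table's main data. As a small sketch, the standard function pg_total_relation_size, which also counts indexes and TOAST data, can be placed next to it for comparison. The column aliases are illustrative, and whether the two values differ depends on the table.

```sql
-- Compare the heap-only size with the total size (heap + indexes + TOAST).
SELECT pg_size_pretty(pg_relation_size('department'))       AS table_size,
       pg_size_pretty(pg_total_relation_size('department')) AS total_size;
```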
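2 Creating temporary tables
PostgreSQL supports two kinds of temporary tables: session-level and transaction-level. In a session-level temporary table, the data persists for the lifetime of the session. In a transaction-level temporary table, the data exists only for the lifetime of the transaction.
In PostgreSQL, both transaction-level and session-level temporary tables disappear when the session ends. This differs from Oracle, where only the data in the temporary table disappears and the table itself remains.
A PostgreSQL temporary table is created in a special schema named "pg_temp_n", where n is a number that differs from session to session.
A temporary table created by one session cannot be accessed by other sessions.
By default, a temporary table is session-level. To create a transaction-level temporary table, add the "on commit delete rows" clause. (Note: the "on commit" clause has three forms: "on commit preserve rows", the default, session-level; "on commit delete rows", transaction-level, the data is deleted when the transaction ends; "on commit drop", transaction-level, the temporary table is dropped when the transaction ends.)
The keyword "temporary" can be abbreviated to "temp" when creating a temporary table.
For compatibility with the temporary-table syntax of other databases, PostgreSQL also accepts the "GLOBAL" and "LOCAL" keywords, but they have no effect.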
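The third form, "on commit drop", is not exercised elsewhere in this article, so here is a minimal sketch of it. The table name temp_drop_demo is hypothetical and was chosen only for illustration.

```sql
BEGIN;
-- Hypothetical table; it is dropped automatically when the transaction commits.
CREATE TEMP TABLE temp_drop_demo (id int, note text)
    ON COMMIT DROP
    DISTRIBUTED BY (id);
INSERT INTO temp_drop_demo VALUES (1, 'a');
SELECT * FROM temp_drop_demo;   -- the row is visible inside the transaction
COMMIT;
-- The table no longer exists after COMMIT, so this statement is expected to fail.
SELECT * FROM temp_drop_demo;
```

ON COMMIT DROP only makes sense inside an explicit transaction block; with autocommit the table would be dropped as soon as the CREATE statement finishes.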
2.1 Session-level temporary tables
Create a temporary table
create temporary table temp_t as select * from pg_class;
archdata=# create temporary table temp_t as select * from pg_class;
NOTICE: Table doesn't have 'DISTRIBUTED BY' clause -- Using column(s) named 'relname' as the Greenplum Database data distribution key for this table.
HINT: The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
SELECT 411
archdata=# \dt pg_class;
List of relations
Schema | Name | Type | Owner | Storage
------------+----------+-------+---------+---------
pg_catalog | pg_class | table | gpadmin | heap
(1 row)
archdata=# select count(*) from pg_class;
count
-------
411
(1 row)
archdata=#
The table is visible in the session that created it
archdata-# \dt temp_t
List of relations
Schema | Name | Type | Owner | Storage
--------------+--------+-------+---------+---------
pg_temp_1428 | temp_t | table | gpadmin | heap
(1 row)
archdata-#
Look for the temporary table from another session
archdata=# select pg_backend_pid();
pg_backend_pid
----------------
4334
(1 row)
\dt temp_t
archdata=# \dt temp_t
No matching relations found.
archdata=#
A plain \dt does not find the temporary table from this session.
Qualify the table with its schema name instead
\dt pg_temp_1428.temp_t
archdata-# \dt pg_temp_1428.temp_t
List of relations
Schema | Name | Type | Owner | Storage
--------------+--------+-------+---------+---------
pg_temp_1428 | temp_t | table | gpadmin | heap
(1 row)
archdata-#
archdata=# select * from pg_temp_1428.temp_t;
relname | relnamespace | reltype | relowner |
relam | relfilenode | reltablespace | relpages | reltuples | reltoastrelid | reltoastidxid | relhasi
ndex | relisshared | relkind | relstorage | relnatts | relchecks | reltriggers | relukeys | relfkeys
| relrefs | relhasoids | relhaspkey | relhasrules | relhassubclass | relfrozenxid | re
lacl | reloptions
----------------------------------------------------------------+--------------+---------+----------+
-------+-------------+---------------+----------+-----------+---------------+---------------+--------
-----+-------------+---------+------------+----------+-----------+-------------+----------+----------
+---------+------------+------------+-------------+--------------
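The schema name pg_temp_1428 above was read off the \dt output of the creating session. As a sketch of another way to find it, the system catalogs can be queried directly; this assumes the temporary table is still named temp_t.

```sql
-- Find which pg_temp_n schema holds the temporary table temp_t.
SELECT n.nspname AS schema_name,
       c.relname AS table_name
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relname = 'temp_t';
```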
2.2 Transaction-level temporary tables
archdata=# create temporary table temp_t2(id int,note text) on commit delete rows;
NOTICE: Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 'id' as the Greenplum Database data distribution key for this table.
HINT: The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
CREATE TABLE
archdata=#
archdata=# insert into temp_t2 values(1,'a');
INSERT 0 1
archdata=#
archdata=# select * from temp_t2;
id | note
----+------
(0 rows) --- the row is gone: the INSERT autocommitted, and ON COMMIT DELETE ROWS cleared the table
archdata=#
Using explicit transaction control
postgres=# begin;
BEGIN
postgres=# insert into temp_t2 values(1,'a');
INSERT 0 1
postgres=# insert into temp_t2 values(1,'b');
INSERT 0 1
postgres=# select * from temp_t2;
id | note
----+------
1 | a
1 | b
(2 rows)
postgres=# end;
COMMIT
postgres=# select * from temp_t2;
id | note
----+------
(0 rows)
Once the transaction ends, the data in the table is gone. The temporary table is also not visible from other sessions.
3 Constraints
3.1 Column-level CHECK
archdata=# CREATE TABLE department1 (
archdata(# deptid int NOT NULL,
archdata(# deptname varchar(20),
archdata(# createtime timestamp CONSTRAINT create_check CHECK (createtime > '1970-01-01')
archdata(# ) DISTRIBUTED BY(deptid);
CREATE TABLE
archdata=#
archdata=#
archdata=# \dt department1
List of relations
Schema | Name | Type | Owner | Storage
--------+-------------+-------+---------+---------
public | department1 | table | gpadmin | heap
(1 row)
archdata=# \d department1
Table "public.department1"
Column | Type | Modifiers
------------+-----------------------------+-----------
deptid | integer | not null
deptname | character varying(20) |
createtime | timestamp without time zone |
Check constraints:
"create_check" CHECK (createtime > '1970-01-01 00:00:00'::timestamp without time zone)
Distributed by: (deptid)
archdata=#
3.2 Table-level CHECK
archdata=# CREATE TABLE department2 (
archdata(# deptid int NOT NULL,
archdata(# deptname varchar(20) not null,
archdata(# createtime timestamp,
archdata(# parentcreatetime timestamp,
archdata(# check(parentcreatetime>createtime)
archdata(# ) DISTRIBUTED BY(deptid);
CREATE TABLE
archdata=#
archdata=#
archdata=# \d department2
Table "public.department2"
Column | Type | Modifiers
------------------+-----------------------------+-----------
deptid | integer | not null
deptname | character varying(20) | not null
createtime | timestamp without time zone |
parentcreatetime | timestamp without time zone |
Check constraints:
"department2_check" CHECK (parentcreatetime > createtime)
Distributed by: (deptid)
archdata=#
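To see the table-level check in action, the following sketch inserts one row that satisfies department2_check and one that violates it. The sample values are made up for illustration.

```sql
-- Satisfies department2_check: parentcreatetime is later than createtime.
INSERT INTO department2 VALUES (1, 'sales', '2019-01-01', '2019-06-01');

-- Violates department2_check: parentcreatetime is earlier than createtime,
-- so this INSERT is expected to be rejected.
INSERT INTO department2 VALUES (2, 'hr', '2019-06-01', '2019-01-01');
```

Note that a CHECK constraint passes when its expression evaluates to NULL, so a row with a NULL in either timestamp column would still be accepted.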
3.3 NOT NULL constraint
Keyword: NOT NULL
archdata=# CREATE TABLE department3 (
archdata(# deptid int NOT NULL check (deptid>0),
archdata(# deptname varchar(20) not null,
archdata(# createtime timestamp,
archdata(# parentcreatetime timestamp
archdata(# ) DISTRIBUTED BY(deptid);
CREATE TABLE
archdata=# \d department3
Table "public.department3"
Column | Type | Modifiers
------------------+-----------------------------+-----------
deptid | integer | not null
deptname | character varying(20) | not null
createtime | timestamp without time zone |
parentcreatetime | timestamp without time zone |
Check constraints:
"department3_deptid_check" CHECK (deptid > 0)
Distributed by: (deptid)
archdata=#
3.4 UNIQUE constraint
Keyword: UNIQUE
archdata=# CREATE TABLE department4 (
archdata(# deptid int unique,
archdata(# deptname varchar(20) not null,
archdata(# createtime timestamp,
archdata(# parentcreatetime timestamp
archdata(# ) DISTRIBUTED BY(deptid);
NOTICE: CREATE TABLE / UNIQUE will create implicit index "department4_deptid_key" for table "department4"
CREATE TABLE
archdata=# \d department4
Table "public.department4"
Column | Type | Modifiers
------------------+-----------------------------+-----------
deptid | integer |
deptname | character varying(20) | not null
createtime | timestamp without time zone |
parentcreatetime | timestamp without time zone |
Indexes:
"department4_deptid_key" UNIQUE, btree (deptid)
Distributed by: (deptid)
archdata=# \di department4
No matching relations found.
archdata=# \di department4_deptid_key
List of relations
Schema | Name | Type | Owner | Storage | Table
--------+------------------------+-------+---------+---------+-------------
public | department4_deptid_key | index | gpadmin | heap | department4
(1 row)
archdata=#
3.5 PRIMARY KEY constraint
The keyword is PRIMARY KEY, which means unique and not null. A primary key can combine multiple columns.
CREATE TABLE department5 (
deptid int primary key,
deptname varchar(20) not null,
createtime timestamp,
parentcreatetime timestamp
) DISTRIBUTED BY(deptid);
archdata=# CREATE TABLE department5 (
archdata(# deptid int primary key,
archdata(# deptname varchar(20) not null,
archdata(# createtime timestamp,
archdata(# parentcreatetime timestamp
archdata(# ) DISTRIBUTED BY(deptid);
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "department5_pkey" for table "department5"
CREATE TABLE
archdata=#
archdata=# \d department5
Table "public.department5"
Column | Type | Modifiers
------------------+-----------------------------+-----------
deptid | integer | not null
deptname | character varying(20) | not null
createtime | timestamp without time zone |
parentcreatetime | timestamp without time zone |
Indexes:
"department5_pkey" PRIMARY KEY, btree (deptid)
Distributed by: (deptid)
archdata=#
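The section mentions that a primary key can span several columns but only shows a single-column key. A minimal sketch of a composite key follows; the table dept_member and its columns are hypothetical. In Greenplum the primary key has to include the distribution key columns, which is why deptid is used for both here.

```sql
-- Hypothetical table with a two-column primary key.
-- The distribution key (deptid) is part of the primary key.
CREATE TABLE dept_member (
    deptid   int,
    empid    int,
    jointime timestamp,
    PRIMARY KEY (deptid, empid)
) DISTRIBUTED BY (deptid);
```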
3.6 FOREIGN KEY constraint
Foreign keys are not supported. The syntax is accepted, but the constraint has no effect.
archdata=# create table emp(
archdata(# empid int not null,
archdata(# empname varchar(20),
archdata(# deptid int not null references department5(deptid),
archdata(# constraint pk_emp primary key(empid))
archdata-# DISTRIBUTED BY(empid);
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "emp_pkey" for table "emp"
WARNING: Referential integrity (FOREIGN KEY) constraints are not supported in Greenplum Database, will not be enforced.
CREATE TABLE
archdata=#
As the warning shows, Greenplum accepts the foreign key declaration but does not enforce it.
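Because the constraint is not enforced, a row that references a non-existent department should be accepted. The sketch below assumes department5 contains no row with deptid = 999; the sample values are made up.

```sql
-- Assumption: department5 has no row with deptid = 999.
-- The insert is expected to succeed because the foreign key is not enforced.
INSERT INTO emp (empid, empname, deptid) VALUES (1, 'alice', 999);

-- Orphan rows have to be detected by hand, for example with an anti-join.
SELECT e.empid, e.deptid
FROM emp e
LEFT JOIN department5 d ON d.deptid = e.deptid
WHERE d.deptid IS NULL;
```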
3.7 Default values
archdata=# CREATE TABLE department6 (
archdata(# deptid int primary key,
archdata(# deptname varchar(20) not null,
archdata(# createtime timestamp,
archdata(# parentcreatetime timestamp default '2019-01-01 00:00:00'
archdata(# ) DISTRIBUTED BY(deptid);
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "department6_pkey" for table "department6"
CREATE TABLE
archdata=# \d department6
Table "public.department6"
Column | Type | Modifiers
------------------+-----------------------------+------------------------------------------------------------
deptid | integer | not null
deptname | character varying(20) | not null
createtime | timestamp without time zone |
parentcreatetime | timestamp without time zone | default '2019-01-01 00:00:00'::timestamp without time zone
Indexes:
"department6_pkey" PRIMARY KEY, btree (deptid)
Distributed by: (deptid)
archdata=#
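A short sketch of the default taking effect: the insert below leaves out parentcreatetime, so the column should receive the declared default. The sample values are made up.

```sql
-- parentcreatetime is omitted, so the column default is applied.
INSERT INTO department6 (deptid, deptname) VALUES (1, 'sales');

SELECT deptid, deptname, parentcreatetime
FROM department6
WHERE deptid = 1;
```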
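3.8 Auto-increment
Method 1: create a sequence first, then use it as the column default.
create sequence seq_test start with 1 increment by 1 no minvalue no maxvalue cache 1;
A sequence is the usual way to implement an auto-increment column. A single-node PostgreSQL instance only needs to maintain one counter. In Greenplum's MPP architecture, however, if every node maintained its own sequence, the values would collide. Greenplum avoids this by having the master maintain the sequence centrally.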
archdata=# create sequence seq_test start with 1 increment by 1 no minvalue no maxvalue cache 1
archdata-# ;
CREATE SEQUENCE
create table tb_test(a int not null default nextval('seq_test'));
Method 2: use the serial data type
create table tb_test1(a serial not null,b text);
archdata=# \d tb_test1
Table "public.tb_test1"
Column | Type | Modifiers
--------+---------+------------------------------------------------------
a | integer | not null default nextval('tb_test1_a_seq'::regclass)
b | text |
Distributed by: (a)
archdata=#
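A brief usage sketch for the serial column: the inserts below omit column a, so its value comes from tb_test1_a_seq. The sample text values are made up. The generated numbers are unique, but it is safer not to assume they are consecutive.

```sql
-- Column a is omitted, so nextval('tb_test1_a_seq') supplies it.
INSERT INTO tb_test1 (b) VALUES ('first');
INSERT INTO tb_test1 (b) VALUES ('second');

SELECT a, b FROM tb_test1 ORDER BY a;
```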
4 Altering tables
4.1 Renaming a table
alter table tb_test1 rename to tb_test_new;
4.2 Renaming a column
alter table department rename column createtime to deptcreatetime;
4.3 Adding a column
alter table department add column pid int not null;
4.4 Dropping a column
alter table department drop column pid;
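The ADD COLUMN statement in 4.3 declares the new column NOT NULL without a default. That works on an empty table, but on a table that already holds rows it is rejected, because the existing rows would have NULL in the new column. A sketch of the usual workaround is to supply a default; the value 0 is an arbitrary choice for illustration.

```sql
-- Existing rows receive the default, so the NOT NULL constraint can be satisfied.
ALTER TABLE department ADD COLUMN pid int NOT NULL DEFAULT 0;
```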
4.5 Dropping a foreign key
First, find the name of the foreign key constraint on the table:
\d [tablename]
....
"table_name_id_fkey" FOREIGN KEY (id) REFERENCES other_table(id) ....
Then drop the foreign key with the following command:
ALTER TABLE [tablename] DROP CONSTRAINT table_name_id_fkey;
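5 Dropping tables
PostgreSQL uses the DROP TABLE statement to remove a table, including its data, rules, triggers, and so on. Drop tables with care: once a table is dropped, all of that information is gone.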
drop table table_name;
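To drop a table whose primary key is referenced by other tables, add CASCADE. This drops the referenced table together with the foreign key constraints on the child tables; nothing else is removed.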
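As a sketch using the tables from section 3.6, where emp declares a foreign key to department5, the CASCADE form looks like this. IF EXISTS is added only so the statement does not fail when the table has already been dropped.

```sql
-- Drops department5 and the foreign key constraint that emp declares on it;
-- the emp table and its rows are kept.
DROP TABLE IF EXISTS department5 CASCADE;
```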
6 Inserting, updating, and deleting data
6.1 Insert
For example, to specify the column names and the values to insert:
INSERT INTO products (name, price, product_no) VALUES ('Cheese', 9.99, 1);
To specify only the values to insert:
INSERT INTO products VALUES (1, 'Cheese', 9.99);
Usually the data values are constants, but scalar expressions can also be used. For example:
INSERT INTO films SELECT * FROM tmp_films WHERE date_prod <
'2016-05-07';
Multiple rows can be inserted in a single command. For example:
INSERT INTO products (product_no, name, price) VALUES
(1, 'Cheese', 9.99),
(2, 'Bread', 1.99),
(3, 'Milk', 2.99);
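6.2 Update
The UPDATE command updates rows in a table. You can update all rows in a table, a subset of the rows, or a single row. Each column can be updated individually without affecting the other columns.
To perform an update, you need:
•The name of the table and the columns to update
•The new values of those columns
•One or more conditions specifying the rows to update.
For example, the following command updates all products that have a price of 5 to have a price of 10: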
UPDATE products SET price = 10 WHERE price = 5;
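Using UPDATE in Greenplum Database has the following restrictions:
•GPORCA supports updates to Greenplum distribution key columns; the Postgres planner does not.
•If mirrors are enabled, you cannot use STABLE or VOLATILE functions in an UPDATE statement.
•Greenplum Database partitioning columns cannot be updated.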
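6.3 Delete
The DELETE command deletes rows from a table. Specify a WHERE clause to delete the rows that match certain criteria. Without a WHERE clause, all rows in the table are deleted; the result is a valid but empty table. For example, to remove all rows with a price of 10 from the products table: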
DELETE FROM products WHERE price = 10;
To delete all rows from a table:
DELETE FROM products;
Using DELETE in Greenplum Database has restrictions similar to those of UPDATE:
•If mirrors are enabled, you cannot use STABLE or VOLATILE functions in an UPDATE statement.