Configuring a SQL Server 2019 AG on Azure CentOS VMs (Part 1)


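Preface
This article assumes basic knowledge of Azure and SQL Server HA.
It also assumes basic knowledge of the Azure CLI.
The goal is to create an availability group with three replicas on Azure Linux VMs, and to configure a listener and fencing.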
Environment
SQL Server 2019 Developer on Linux
Azure VM fencing agent
Azure CLI, used for part of the configuration
Three CentOS 7.7 Azure VMs named SQL19N1, SQL19N2, and SQL19N3, all in the same VNet
Steps
Create a resource group and an availability set for the VMs

Create the resource group in China East 2

az group create --name SQL-DEMO-RG --location chinaeast2

Create the availability set for the VMs, with 2 fault domains and 2 update domains

az vm availability-set create \
--resource-group SQL-DEMO-RG \
--name AGLinux-AvailabilitySet \
--platform-fault-domain-count 2 \
--platform-update-domain-count 2

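Deploy the 3 VMs from a template
The first time you create a VM, a template is generated. Download and save it; after changing the parameter values, you can easily create more VMs with a similar configuration. The main VM settings are:

Use the availability set created above
Use the same subnet
Use the Standard SKU for the public IP
Configure an SSH public key
The template and parameter files are too long to show here. You can get them from the Azure Portal yourself.

Below is the deployment for SQL19N2. After editing the parameter file, the same approach can be used directly to create SQL19N3.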

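templateFile="./templateFile"
paramFile="./vmParams-sql19n2.json"
# validate the template and parameters first
az deployment group validate --name sql19n2vm \
 -g SQL-DEMO-RG --template-file "$templateFile" --parameters "$paramFile"
# validate only checks the input; create runs the actual deployment
az deployment group create --name sql19n2vm \
 -g SQL-DEMO-RG --template-file "$templateFile" --parameters "$paramFile"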

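Give the VMs a static private IP and a public DNS label
All three VMs need this change; the following shows the configuration for just one of them.

Find the NIC and IP information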

az network nic list -g SQL-DEMO-RG --query "[].{nicName:name,configuration:ipConfigurations[].{ipName:name,ip:privateIpAddress,method:privateIpAllocationMethod}}" -o yaml

Change privateIpAllocationMethod to Static

az network nic ip-config update -g SQL-DEMO-RG --nic-name sql19n1152 --name ipconfig1 --set privateIpAllocationMethod=Static

Find the public IP name

az network public-ip list -g SQL-DEMO-RG --query "[].name" -o tsv

Set the DNS name of the public IP. The label may contain only lowercase letters, digits, and hyphens

az network public-ip update -g SQL-DEMO-RG -n SQL19N1ip851 --dns-name sql19n1
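
Before calling az network public-ip update, a candidate label can be checked locally. The following is only a minimal sketch: the function name is_valid_dns_label is hypothetical, and the pattern is an assumption about the rule Azure enforces (3 to 63 characters, starting with a lowercase letter and ending with a lowercase letter or digit).

```shell
#!/usr/bin/env bash
# Sketch: check a DNS label locally before sending it to Azure.
# Assumption: the accepted form is the pattern in `re` below.
is_valid_dns_label() {
    # Use the C locale so that [a-z] never matches uppercase letters
    local LC_ALL=C
    local re='^[a-z][a-z0-9-]{1,61}[a-z0-9]$'
    [[ "$1" =~ $re ]]
}

for label in sql19n1 SQL19N1 sql_19n1; do
    if is_valid_dns_label "$label"; then
        echo "$label: ok"
    else
        echo "$label: rejected"
    fi
done
```

A label that passes this check can still be refused by Azure, for example when the name is already taken in the region.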
Install the HA packages
It is best to update the system packages first, then install the HA packages.

yum update -y
yum install -y pacemaker pcs fence-agents-all resource-agents fence-agents-azure-arm
reboot
Open firewall ports for the cluster and SQL Server

Pacemaker and Corosync ports

TCP: ports 2224, 3121, 21064, 5405

UDP: port 5405

firewall-cmd --add-port=2224/tcp --permanent
firewall-cmd --add-port=3121/tcp --permanent
firewall-cmd --add-port=21064/tcp --permanent
firewall-cmd --add-port=5405/tcp --permanent
firewall-cmd --add-port=5405/udp --permanent

SQL Server port and AG mirroring endpoint port

TCP: 1433,5022

firewall-cmd --add-port=1433/tcp --permanent
firewall-cmd --add-port=5022/tcp --permanent
firewall-cmd --reload
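
Because the same ports have to be opened on all three VMs, the individual commands can be generated from a single list. This is a minimal sketch, not the author's script: the names PORTS, firewall_commands, and RUN are hypothetical, and by default it only prints the commands instead of running them.

```shell
#!/usr/bin/env bash
# Sketch: generate the firewall-cmd calls for every port listed above.
# Prints the commands by default; set RUN=1 to execute them
# (requires root and a running firewalld).
PORTS=(2224/tcp 3121/tcp 21064/tcp 5405/tcp 5405/udp 1433/tcp 5022/tcp)

firewall_commands() {
    local port
    for port in "${PORTS[@]}"; do
        echo "firewall-cmd --add-port=${port} --permanent"
    done
    # Permanent rules only become active after a reload
    echo "firewall-cmd --reload"
}

if [ "${RUN:-0}" = "1" ]; then
    firewall_commands | bash -e
else
    firewall_commands
fi
```

Keeping the ports in one array makes it harder to open one port twice and forget another.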
Add hosts entries
vi /etc/hosts
172.17.2.8 SQL19N1
172.17.2.9 SQL19N2
172.17.2.10 SQL19N3
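
Instead of editing the file by hand on each VM, the entries can be appended by a script that is safe to run more than once. A minimal sketch under stated assumptions: HOSTS_FILE and add_host_entry are hypothetical names, and the script writes to a temporary file unless HOSTS_FILE is set to /etc/hosts.

```shell
#!/usr/bin/env bash
# Sketch: append host entries only when the host name is not present yet.
# HOSTS_FILE defaults to a scratch file so the sketch can be tried safely;
# run as root with HOSTS_FILE=/etc/hosts to apply it for real.
HOSTS_FILE="${HOSTS_FILE:-$(mktemp)}"

add_host_entry() {
    local ip="$1" name="$2"
    touch "$HOSTS_FILE"
    # -w matches whole words, so SQL19N1 does not match SQL19N10
    if ! grep -qwF -- "$name" "$HOSTS_FILE"; then
        printf '%s %s\n' "$ip" "$name" >> "$HOSTS_FILE"
    fi
}

add_host_entry 172.17.2.8  SQL19N1
add_host_entry 172.17.2.9  SQL19N2
add_host_entry 172.17.2.10 SQL19N3
```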
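Create the Pacemaker cluster

Set the password for hacluster, Pacemaker's default user, on all three VMs

passwd hacluster

Enable pcsd and pacemaker to start at boot, on all three VMs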

systemctl enable pcsd
systemctl start pcsd
systemctl enable pacemaker

Create the cluster, on the master node

sudo pcs cluster auth SQL19N1 SQL19N2 SQL19N3 -u hacluster
sudo pcs cluster setup --name agcluster SQL19N1 SQL19N2 SQL19N3 --token 30000 --force
sudo pcs cluster start --all
sudo pcs cluster enable --all

Check the cluster status

pcs status

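Set the quorum expected-votes to 3 on the three nodes; a three-node cluster already defaults to 3

expected-votes is the total number of votes the cluster expects, and quorum requires a majority of them (2 of 3 here). This change affects only the currently running cluster and is not saved as permanent cluster configuration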

pcs quorum expected-votes 3
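
The relationship between expected-votes and quorum can be sketched in a few lines of shell. This assumes the usual majority rule (more than half of the expected votes) and ignores special corosync options such as two_node; quorum_votes is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: votes needed for quorum = floor(expected_votes / 2) + 1
quorum_votes() {
    local expected="$1"
    echo $(( expected / 2 + 1 ))
}

for n in 2 3 5; do
    echo "expected-votes=$n -> quorum needs $(quorum_votes "$n") votes"
done
```

With three nodes the cluster therefore keeps quorum after losing one node, but not after losing two.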
Configure a service principal for the fencing agent on Azure

1. Create the AAD app and record its appID once it succeeds

az ad app create --display-name sqldemorg-app --identifier-uris http://localhost \
--password "1qaz@WSX3edc" --end-date '2030-04-27' --credential-description "sql19 ag secret"

2. Create the service principal for the AAD app

az ad sp create --id

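3. Assign the service principal to a role that can manage the VM; run this for every VM

I assigned the Owner role here, which is not a secure practice. You should define a custom role that grants only the minimum permissions

Custom roles require the Azure subscription to be at the PP1 or PP2 level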

az role assignment create --assignee --role owner \
--scope /subscriptions//resourceGroups//providers/Microsoft.Compute/virtualMachines/SQL19N1
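
Since the assignment has to be repeated for every VM, the scope string can be built in a loop. This is a sketch only: SUBSCRIPTION_ID and APP_ID are placeholders that you must replace with your own values, vm_scope is a hypothetical helper, and the resource group name is assumed to be the SQL-DEMO-RG group created earlier. The loop prints the commands rather than running them.

```shell
#!/usr/bin/env bash
# Sketch: print one role assignment command per VM.
# Assumptions: SUBSCRIPTION_ID and APP_ID are placeholders, not real values.
SUBSCRIPTION_ID="${SUBSCRIPTION_ID:-<subscription-id>}"
APP_ID="${APP_ID:-<appId>}"
RG="SQL-DEMO-RG"

vm_scope() {
    # Resource ID of a single VM, used as the scope of the role assignment
    echo "/subscriptions/${SUBSCRIPTION_ID}/resourceGroups/${RG}/providers/Microsoft.Compute/virtualMachines/$1"
}

for vm in SQL19N1 SQL19N2 SQL19N3; do
    echo az role assignment create --assignee "$APP_ID" --role owner --scope "$(vm_scope "$vm")"
done
```

Scoping each assignment to a single VM keeps the service principal from gaining rights over the rest of the resource group.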
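Create the Azure STONITH device
I am using Azure China, so cloud=china must be specified; on global Azure this parameter is not needed.
Run fence_azure_arm -h for more help on this agent. The empty values in the command below have to be filled in with your own settings.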

pcs property set stonith-timeout=900
pcs stonith create rsc_st_azure fence_azure_arm login="" passwd="" resourceGroup="" tenantId="" subscriptionId="" power_timeout=240 pcmk_reboot_timeout=900 cloud=china
Install SQL Server 2019 and tools

Install SQL Server 2019 and the HA resource agent

sudo curl -o /etc/yum.repos.d/mssql-server.repo https://packages.microsoft.com/config/rhel/7/mssql-server-2019.repo
sudo yum install -y mssql-server
sudo /opt/mssql/bin/mssql-conf setup
sudo yum install mssql-server-ha

Install mssql-tools

sudo curl -o /etc/yum.repos.d/msprod.repo https://packages.microsoft.com/config/rhel/7/prod.repo
sudo yum install -y mssql-tools unixODBC-devel

Add the mssql-tools directory to PATH for convenience

echo 'export PATH="$PATH:/opt/mssql-tools/bin"' >> ~/.bash_profile
echo 'export PATH="$PATH:/opt/mssql-tools/bin"' >> ~/.bashrc
source ~/.bashrc

Install mssql-cli

sudo rpm --import https://packages.microsoft.com/keys/microsoft.asc
sudo curl -o /etc/yum.repos.d/mssql-cli.repo https://packages.microsoft.com/config/rhel/7/prod.repo
sudo yum install mssql-cli

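Check the SQL Server status

systemctl status mssql-server
If you are familiar with the SQL Server PowerShell cmdlets, I recommend installing PowerShell as well, along with the SQLServer module. Configuring SQL Server is much more convenient with PowerShell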

yum install powershell -y
pwsh
Install-Module SQLServer

List the SQL Server commands

Get-Command -Module SQLServer
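Configure the AG
Create a PowerShell function to make running T-SQL easier later

Open the PowerShell profile file; create it if it does not exist

vi /root/.config/powershell/Microsoft.PowerShell_profile.ps1

Add the following function to the profile file so that it can be called directly every time pwsh starts

The function takes two parameters. $sql is the T-SQL to run; a here-string is best, to avoid character-escaping problems

$servers is the array of target instances. It defaults to the three instances in this environment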

function run-sql ($sql,$servers=("SQL19N1","SQL19N2","SQL19N3"))
{

    $secpasswd = "1qaz@WSX"|ConvertTo-SecureString -AsPlainText -Force
    $cred=New-Object System.Management.Automation.PSCredential -ArgumentList 'sa', $secpasswd
    $sql
    "---------"
    foreach($svr in $servers) {"Running T-SQL on $svr..."; Invoke-Sqlcmd -ServerInstance $svr -Credential $cred -Query $sql}

}
Enable the hadr feature, on every instance
sudo /opt/mssql/bin/mssql-conf set hadr.hadrenabled 1
sudo systemctl restart mssql-server
Start the AG extended event session

T-SQL, on every instance

ALTER EVENT SESSION AlwaysOn_health ON SERVER WITH (STARTUP_STATE=ON);
GO
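Create a certificate on the primary replica instance. The certificate is used to authenticate communication between the mirroring endpoints. Copy the certificate and private key to the same directory on the other nodes, and grant the mssql user access to them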
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '1qaz@WSX';
GO
CREATE CERTIFICATE dbm_certificate WITH SUBJECT = 'dbm';
GO
BACKUP CERTIFICATE dbm_certificate
TO FILE = '/var/opt/mssql/data/dbm_certificate.cer'
WITH PRIVATE KEY (

       FILE = '/var/opt/mssql/data/dbm_certificate.pvk',
       ENCRYPTION BY PASSWORD = '1qaz@WSX'
   );

Copy the certificate and private key to the secondary replica hosts SQL19N2 and SQL19N3

cd /var/opt/mssql/data
scp dbm_certificate.* root@SQL19N2:/var/opt/mssql/data/
scp dbm_certificate.* root@SQL19N3:/var/opt/mssql/data/

Change the file ownership on the secondary replica nodes

cd /var/opt/mssql/data
chown mssql:mssql dbm_certificate.*
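
The copy and ownership steps can be combined into one loop run from the primary host. A minimal sketch, assuming root SSH access to the secondaries (as the scp commands above already do); SECONDARIES, CERT_DIR, and distribute_commands are hypothetical names, and the script only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch: print the copy and chown commands for every secondary host.
# Assumption: root can reach the secondaries over SSH.
SECONDARIES=(SQL19N2 SQL19N3)
CERT_DIR="/var/opt/mssql/data"

distribute_commands() {
    local host
    for host in "${SECONDARIES[@]}"; do
        # Copy the certificate and its private key to the same directory
        echo "scp ${CERT_DIR}/dbm_certificate.cer ${CERT_DIR}/dbm_certificate.pvk root@${host}:${CERT_DIR}/"
        # Let the mssql service account read both files
        echo "ssh root@${host} chown mssql:mssql ${CERT_DIR}/dbm_certificate.cer ${CERT_DIR}/dbm_certificate.pvk"
    done
}

distribute_commands
```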
Create the master key and import the certificate on the secondary replica instances
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '1qaz@WSX';
GO
CREATE CERTIFICATE dbm_certificate

FROM FILE = '/var/opt/mssql/data/dbm_certificate.cer'
WITH PRIVATE KEY (
FILE = '/var/opt/mssql/data/dbm_certificate.pvk',
DECRYPTION BY PASSWORD = '1qaz@WSX'
        );

Create the AG mirroring endpoint on every instance. Remember to allow the port in both the firewall and the NSG
CREATE ENDPOINT [Hadr_endpoint]

AS TCP (LISTENER_PORT = 5022)
FOR DATABASE_MIRRORING (
    ROLE = ALL,
    AUTHENTICATION = CERTIFICATE dbm_certificate,
    ENCRYPTION = REQUIRED ALGORITHM AES
    );

GO
ALTER ENDPOINT [Hadr_endpoint] STATE = STARTED;
Create the AG with three synchronous-commit replicas; run on the primary replica instance
CREATE AVAILABILITY GROUP [ag1]

 WITH (DB_FAILOVER = ON, CLUSTER_TYPE = EXTERNAL)
 FOR REPLICA ON
     N'SQL19N1' 
           WITH (
         ENDPOINT_URL = N'tcp://SQL19N1:5022',
         AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
         FAILOVER_MODE = EXTERNAL,
         SEEDING_MODE = AUTOMATIC,
         SECONDARY_ROLE(ALLOW_CONNECTIONS = ALL)
         ),
     N'SQL19N2' 
      WITH ( 
         ENDPOINT_URL = N'tcp://SQL19N2:5022', 
         AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
         FAILOVER_MODE = EXTERNAL,
         SEEDING_MODE = AUTOMATIC,
         SECONDARY_ROLE(ALLOW_CONNECTIONS = ALL)
         ),
     N'SQL19N3'
     WITH( 
        ENDPOINT_URL = N'tcp://SQL19N3:5022', 
        AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
        FAILOVER_MODE = EXTERNAL,
        SEEDING_MODE = AUTOMATIC,
        SECONDARY_ROLE(ALLOW_CONNECTIONS = ALL)
        );

GO
ALTER AVAILABILITY GROUP [ag1] GRANT CREATE ANY DATABASE;
GO
Create a SQL login for Pacemaker and grant it permissions, on every instance
USE [master]
GO
CREATE LOGIN [pacemakerLogin] with PASSWORD= N'1qaz@WSX'
go
ALTER SERVER ROLE [sysadmin] ADD MEMBER [pacemakerLogin];
GO
Save the Pacemaker login credentials to a local file
echo "pacemakerLogin" >> /var/opt/mssql/secrets/passwd
echo "1qaz@WSX" >> /var/opt/mssql/secrets/passwd

Allow only root to read it

chown root:root /var/opt/mssql/secrets/passwd
chmod 400 /var/opt/mssql/secrets/passwd
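
Because the echo commands above append to the file, running them twice leaves duplicate lines behind. The sketch below writes the file in one step and restricts its permissions from the start. It is only an illustration: SECRET_FILE, PACEMAKER_PASSWORD, and write_pacemaker_secret are hypothetical names, and the script writes to a temporary file unless SECRET_FILE is set to /var/opt/mssql/secrets/passwd.

```shell
#!/usr/bin/env bash
# Sketch: write the two-line credentials file and lock it down.
# SECRET_FILE defaults to a scratch file so the sketch can be tried safely.
SECRET_FILE="${SECRET_FILE:-$(mktemp)}"

write_pacemaker_secret() {
    local login="$1" password="$2"
    # Remove any previous copy; a mode-400 file cannot be overwritten in place
    rm -f "$SECRET_FILE"
    # Create the file with owner-only permissions from the beginning
    ( umask 077; printf '%s\n%s\n' "$login" "$password" > "$SECRET_FILE" )
    chmod 400 "$SECRET_FILE"
    # Only root can change ownership; skip this when trying the sketch as a normal user
    if [ "$(id -u)" -eq 0 ]; then
        chown root:root "$SECRET_FILE"
    fi
}

write_pacemaker_secret "pacemakerLogin" "${PACEMAKER_PASSWORD:-change-me}"
```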

Join the secondary replicas to the AG; run on the secondary replicas
ALTER AVAILABILITY GROUP [ag1] JOIN WITH (CLUSTER_TYPE = EXTERNAL);
GO

Permission required by automatic seeding

ALTER AVAILABILITY GROUP [ag1] GRANT CREATE ANY DATABASE;
GO
If you do not want pacemakerLogin to have sysadmin permissions, you can remove it from sysadmin and grant the following permissions instead. On every instance
ALTER SERVER ROLE [sysadmin] DROP MEMBER [pacemakerLogin]
GO
GRANT ALTER, CONTROL, VIEW DEFINITION ON AVAILABILITY GROUP::ag1 TO pacemakerLogin;
GO
GRANT VIEW SERVER STATE TO pacemakerLogin;
GO
Add a database to the AG; run on the primary replica
CREATE DATABASE [db1];
GO
ALTER DATABASE [db1] SET RECOVERY FULL;
GO
BACKUP DATABASE [db1]
TO DISK = N'nul';
GO
ALTER AVAILABILITY GROUP [ag1] ADD DATABASE [db1];
GO
Availability database status
SELECT * FROM sys.databases WHERE name = 'db1';
GO
SELECT DB_NAME(database_id) AS 'database', synchronization_state_desc FROM sys.dm_hadr_database_replica_states;
Configure the AG in the Pacemaker cluster
Create the AG resource; ag_name must be the name of the AG created earlier
pcs resource create agcluster ocf:mssql:ag ag_name=ag1 meta failure-timeout=30s master notify=true
Create the virtual IP resource

Disable fencing

pcs property set stonith-enabled=false

Create the VIP

pcs resource create virtualip ocf:heartbeat:IPaddr2 ip=172.17.2.7

Create a colocation constraint: the VIP and the master must run on the same node
pcs constraint colocation add virtualip agcluster-master INFINITY with-rsc-role=Master
Create an ordering constraint: the AG resource is promoted to master first, then the VIP starts
pcs constraint order promote agcluster-master then start virtualip

View the current constraints

pcs constraint show --full
Re-enable STONITH and check the cluster status
pcs property set stonith-enabled=true
pcs status

Status output in my environment


Cluster name: agcluster
Stack: corosync
Current DC: SQL19N3 (version 1.1.20-5.el7_7.2-3c4c782f70) - partition with quorum
Last updated: Wed Apr 29 04:24:50 2020
Last change: Wed Apr 29 04:24:45 2020 by root via cibadmin on SQL19N1

3 nodes configured
5 resources configured

Online: [ SQL19N1 SQL19N2 SQL19N3 ]

Full list of resources:

rsc_st_azure (stonith:fence_azure_arm): Started SQL19N1
Master/Slave Set: agcluster-master [agcluster]

 Masters: [ SQL19N1 ]
 Slaves: [ SQL19N2 SQL19N3 ]

virtualip (ocf::heartbeat:IPaddr2): Started SQL19N1

Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
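
When testing failover it helps to pull the current master out of this output in a script. A minimal sketch, assuming the Masters line keeps the form shown in the output above; current_master is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: print the node(s) listed on the "Masters:" line of `pcs status`.
# Assumption: the line looks like " Masters: [ SQL19N1 ]".
current_master() {
    # Fields are: Masters: [ node... ] so the nodes sit between field 2 and the last field
    awk '$1 == "Masters:" { for (i = 3; i < NF; i++) print $i }'
}

# Usage on a cluster node:
#   pcs status | current_master
# Demonstration with the lines from the status output above:
printf ' Masters: [ SQL19N1 ]\n Slaves: [ SQL19N2 SQL19N3 ]\n' | current_master
```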
Test failover and fencing

Manual failover

pcs resource move agcluster-master SQL19N2 --master
pcs status

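A manual failover creates a location constraint that keeps the AG resource from moving back to the original node

If you want the AG to be able to fail over back later, remove the constraint manually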

pcs constraint show --full
pcs constraint remove cli-prefer-agcluster-master
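
The constraint id does not have to be read off by eye. The sketch below filters it out of the constraint listing; it assumes that pcs constraint show --full prints ids in the form (id:cli-prefer-...), and both the sample line and the name leftover_move_constraints are illustrative assumptions rather than output captured from this cluster.

```shell
#!/usr/bin/env bash
# Sketch: list the ids of constraints left behind by `pcs resource move`.
# Assumption: ids appear as "(id:cli-prefer-<resource>)" in the listing.
leftover_move_constraints() {
    grep -o 'id:cli-prefer-[^)]*' | cut -d: -f2
}

# Usage on a cluster node:
#   pcs constraint show --full | leftover_move_constraints
# Demonstration with an assumed sample line:
echo '    Enabled on: SQL19N2 (score:INFINITY) (role: Master) (id:cli-prefer-agcluster-master)' \
    | leftover_move_constraints
```

Each id printed this way can then be passed to pcs constraint remove.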

Try fencing the cluster nodes, testing each node in turn

The fence command below only reboots the node; to power the node off instead, use the --off option

pcs stonith fence SQL19N3 --debug

Author: Joe.TJ

Original article: https://www.cnblogs.com/Joe-T/p/12803084.html
