Design and Implementation of an Ansible-Based One-Click Deployment Package for the Distributed MPP Database Greenplum

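1. A Few Words Up Front

At work I have recently been building a one-click deployment package for the distributed MPP database Greenplum. While browsing the Greenplum website I came across its Ansible-based documentation, so I dug into Ansible and ansible-playbook and successfully built a one-click Greenplum deployment package.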

https://gpdb.docs.pivotal.io/6-7/install_guide/ansible-example.html

First, a short introduction to Ansible.

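2. Ansible Overview

Ansible is an open-source configuration management tool that can automate tasks, deploy applications, and provision IT infrastructure. It can automate everyday work such as initial server configuration, security baseline configuration, system updates and patching, and package installation. Its architecture is relatively simple: it only needs to connect to the managed hosts over SSH to run tasks.

An Ansible project typically contains an inventory file, one or more playbooks, variable files, and templates; the playbooks in this article, for example, reference vars/gpdb.yml, template/*.j2, and files under gpnodes/.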

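Its advantages can be summarized as follows:

  • No agent required

Unlike Chef, Puppet, and SaltStack (which now also supports an agentless mode, salt-ssh), Ansible has no client-side agent, so tasks can run without installing or configuring any program on the managed hosts. Because Ansible installs no software and runs no listening daemon on the managed hosts, much of the management overhead disappears: we can start managing servers with Ansible right away, and upgrading Ansible does not affect any managed host.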

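  • Communication over SSH

By default, Ansible uses the SSH protocol for communication between the control node and the managed hosts, and it can use SFTP for secure file transfer to them.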

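  • Parallel execution

Ansible talks to the managed hosts in parallel, which makes automation tasks run faster. The default forks value is 5; it can be increased as needed in the configuration file.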
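
As a minimal sketch of raising the fork count, the snippet below writes a project-local ansible.cfg with forks = 20 and reads the setting back. The value 20, the scratch directory, and the playbook name site.yml are illustrative assumptions, not part of the original project.

```shell
#!/bin/sh
# Sketch: raise Ansible's parallelism above the default of 5 forks.
# The scratch directory and the value 20 are illustrative assumptions.
workdir=$(mktemp -d)

# Ansible picks up an ansible.cfg found in the current working directory.
cat > "$workdir/ansible.cfg" <<'EOF'
[defaults]
forks = 20
EOF

# Read the value back to confirm what a run started from $workdir would use.
forks=$(awk -F' *= *' '$1 == "forks" {print $2}' "$workdir/ansible.cfg")
echo "forks=$forks"

# Per-run alternatives (site.yml is a hypothetical playbook name):
#   ANSIBLE_FORKS=20 ansible-playbook site.yml
#   ansible-playbook -f 20 site.yml
```

A higher value mainly helps when the inventory has more hosts than forks; for the four-node cluster used later in this article the default of 5 already covers every host in one batch.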

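3. Building the Ansible-Based Distributed Greenplum Deployment Package

Greenplum's official installation guide lays out the steps required on each machine. Documentation URL:

https://gpdb.docs.pivotal.io/6-7/install_guide/install_guide.html

The analysis shows that most steps are essentially identical on every machine; only a few specific steps need to run on the master node. Roughly: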

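  • [All nodes] Configure the environment on every Greenplum host
  • [All nodes] Create the Greenplum OS account on each server
  • [All nodes] Install the Greenplum database RPM package
  • [All nodes] Configure the Greenplum data directories
  • [All nodes] Set up passwordless SSH login between the accounts
  • [Master node] Initialize and start the Greenplum database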
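
Both playbooks in this section load vars/gpdb.yml and expect a per-host hostname variable. The sketch below writes a sample variable file and inventory into a scratch directory. The variable names are the ones the playbooks reference; every value (the password, the package file name, the install directory, and the host names other than smdw) is an assumption for illustration.

```shell
#!/bin/sh
# Sketch: sample inputs for the playbooks. Variable names come from the
# playbooks; all values are placeholders, not the project's real settings.
demo=$(mktemp -d)
mkdir -p "$demo/vars"

cat > "$demo/vars/gpdb.yml" <<'EOF'
greenplum_admin_user: gpadmin
greenplum_admin_password: changeme            # placeholder
package_path: files/greenplum-db-6.6.0.rpm    # hypothetical file name
greenplum_install_directory: /usr/local       # assumed RPM install prefix
greenplum_data_directory: /home/gpadmin/data
version: 6.6.0
EOF

# INI inventory: each host carries the `hostname` variable used by tasks 08/09.
cat > "$demo/hosts.ini" <<'EOF'
10.101.1.10 hostname=mdw
10.101.1.11 hostname=smdw
10.101.1.12 hostname=sdw1
10.101.1.13 hostname=sdw2
EOF

host_count=$(grep -c 'hostname=' "$demo/hosts.ini")
echo "inventory hosts: $host_count"
```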

Step 1: installation and configuration on [all nodes]:

#!/usr/bin/env ansible-playbook
---

- hosts: all
  vars_files:
    - vars/gpdb.yml
  remote_user: root
  become: yes
  become_method: sudo
  connection: ssh
  gather_facts: yes
  tasks:
    - name: 01. stop and disable firewall service
      shell: '{{ item }}'
      with_items:
        - 'systemctl unmask firewalld'
        - 'systemctl start firewalld.service'
        - 'systemctl stop firewalld.service'
        - 'systemctl disable firewalld.service'
    - name: 02. be sure expect is installed
      yum: name=expect state=installed
    - name: 03. close selinux temporary
      shell: setenforce 0
      failed_when: false
    - name: 04. close selinux forever
      when: ansible_distribution == "CentOS" and ansible_distribution_major_version == "7"
      lineinfile:
        dest: /etc/selinux/config
        regexp: '^SELINUX='
        line: 'SELINUX=disabled'
    - name: 05. be sure ntp is installed
      yum: name=ntp state=installed
      tags: ntp
    - name: 06. configure sync time using aliyun server
      when: ansible_distribution == "CentOS" and ansible_distribution_major_version == "7"
      cron: name="sync time" minute='*/5' hour=* day=* month=* weekday=* job="/usr/sbin/ntpdate -u ntp1.aliyun.com >/dev/null 2>&1"
      ignore_errors: true
    - name: 07. update configure file for etc-hosts
      copy: src=gpnodes/hosts dest=/etc/hosts
    - name: 08. change host name to etc-hostname
      raw: 'echo {{hostname|quote}} > /etc/hostname'
    - name: 09. change host name by command hostname
      shell: hostname {{hostname|quote}}
    - name: 10. create greenplum admin user
      user:
        name: '{{ greenplum_admin_user }}'
        password: "{{ greenplum_admin_password | password_hash('sha512') }}"
    - name: 11. copy greenplum rpm package to host
      copy:
        src: '{{ package_path }}'
        dest: /tmp
    - name: 12. backing up sysctl
      copy:
        src: /etc/sysctl.conf
        remote_src: yes
        dest: /tmp/sysctl.conf.bak
        backup: yes
    - name: 13. get shmall 
      shell: echo $(expr $(getconf _PHYS_PAGES) / 2) 
      register: shmall
    - name: 14. get shmmax
      shell: echo $(expr $(getconf _PHYS_PAGES) / 2 \* $(getconf PAGE_SIZE))
      register: shmmax
    - name: 15. get min_free_kbytes
      shell: awk 'BEGIN {OFMT = "%.0f";} /MemTotal/ {print $2 * .03;}' /proc/meminfo
      register: min_free_kbytes
    - name: 16. set shmall
      sysctl:
        name: kernel.shmall
        value: '{{ shmall.stdout }}'
        reload: yes
    - name: 17. set shmmax
      sysctl:
        name: kernel.shmmax
        value: '{{ shmmax.stdout }}'
        reload: yes
    - name: 18. set min_free_kbytes
      sysctl:
        name: vm.min_free_kbytes
        value: '{{ min_free_kbytes.stdout }}'
        reload: yes
    - name: 19. configure sysctl
      sysctl:
        name: '{{ item.key }}'
        value: '{{ item.value }}'
        sysctl_set: yes
        state: present
        reload: yes
        ignoreerrors: yes
      with_dict:
        kernel.shmmni: 4096
        vm.overcommit_memory: 2
        vm.overcommit_ratio: 95
        net.ipv4.ip_local_port_range: 10000 65535
        kernel.sem: 500 2048000 200 40960
        kernel.sysrq: 1
        kernel.core_uses_pid: 1
        kernel.msgmnb: 65536
        kernel.msgmax: 65536
        kernel.msgmni: 2048
        net.ipv4.tcp_syncookies: 1
        net.ipv4.conf.default.accept_source_route: 0
        net.ipv4.tcp_max_syn_backlog: 4096
        net.ipv4.conf.all.arp_filter: 1
        net.core.netdev_max_backlog: 10000
        net.core.rmem_max: 2097152
        net.core.wmem_max: 2097152
        vm.swappiness: 0
        vm.zone_reclaim_mode: 0
        vm.dirty_expire_centisecs: 10
        vm.dirty_writeback_centisecs: 3
        vm.dirty_background_ratio: 10
        vm.dirty_ratio: 20
        vm.dirty_background_bytes: 0
        vm.dirty_bytes: 0 
    - name: 20. set PAM limits
      pam_limits:
        domain: '*'
        limit_type: '-'
        limit_item: '{{ item.key }}'
        value: '{{ item.value }}'
      with_dict:
        nofile: 655360
        nproc: 655360
        memlock: unlimited
        core: unlimited
    - name: 21. install package
      yum:
        name: '/tmp/{{ package_path | basename }}'
        # installroot: '{{ greenplum_install_directory }}'
        state: present
    - name: 22. cleanup package file from host
      file:
        path: '/tmp/{{ package_path | basename }}'
        state: absent
    - name: 23. find install directory
      find:
        paths: '{{ greenplum_install_directory }}'
        patterns: 'greenplum-db*'
        file_type: directory
      register: installed_dir
    - name: 24. change install directory ownership
      file:
        path: '{{ item.path }}'
        owner: '{{ greenplum_admin_user }}'
        group: '{{ greenplum_admin_user }}'
        recurse: yes
      with_items: '{{ installed_dir.files }}'
    - name: 25. update pam_limits
      pam_limits:
        domain: '{{ greenplum_admin_user }}'
        limit_type: '-'
        limit_item: '{{ item.key }}'
        value: '{{ item.value }}'
      with_dict:
        nofile: 524288
        nproc: 131072
    - name: 26. find installed greenplum version
      shell: . '{{ greenplum_install_directory }}'/greenplum-db/greenplum_path.sh && '{{ greenplum_install_directory }}'/greenplum-db/bin/postgres --gp-version
      register: postgres_gp_version
    - name: 27. fail if the correct greenplum version is not installed
      fail:
        msg: "Expected greenplum version {{ version }}, but found '{{ postgres_gp_version.stdout }}'"
      when: "version is not defined or version not in postgres_gp_version.stdout"
    - name: 28. Create data directory if it does not exist
      file:
        path: '{{ item }}'
        state: directory
        mode: '0755'
      with_items:
        - '{{ greenplum_data_directory }}/'
        - '{{ greenplum_data_directory }}/master/'
        - '{{ greenplum_data_directory }}/primary/'
        - '{{ greenplum_data_directory }}/mirror/'
    - name: 29. copy greenplum node temporary files
      copy:
        src: '{{ item }}'
        dest: '/home/{{ greenplum_admin_user }}/'
        remote_src: no
      with_items:
        - gpnodes/all_hosts
        - gpnodes/all_ips
        - gpnodes/master_hosts
        - gpnodes/standby_hosts
        - gpnodes/segment_hosts
    - name: 30. change data directory ownership
      file:
        path: '{{ greenplum_data_directory }}'
        owner: '{{ greenplum_admin_user }}'
        group: '{{ greenplum_admin_user }}'
        recurse: yes
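
Tasks 13 to 18 derive three kernel parameters from the host's memory size. To preview the values before running the playbook, the same arithmetic can be run by hand. This is a sketch for a Linux host that mirrors the expressions in the tasks above; the shell variable names are only for illustration.

```shell
#!/bin/sh
# Sketch: reproduce the arithmetic of tasks 13-15 on a Linux host.
pages=$(getconf _PHYS_PAGES)      # number of physical memory pages
page_size=$(getconf PAGE_SIZE)    # bytes per page

shmall=$((pages / 2))             # kernel.shmall: half of RAM, in pages
shmmax=$((shmall * page_size))    # kernel.shmmax: half of RAM, in bytes

# vm.min_free_kbytes: 3% of MemTotal (/proc/meminfo reports MemTotal in kB)
min_free_kbytes=$(awk '/^MemTotal/ {printf "%.0f", $2 * 0.03}' /proc/meminfo)

echo "kernel.shmall = $shmall"
echo "kernel.shmmax = $shmmax"
echo "vm.min_free_kbytes = $min_free_kbytes"
```

After the playbook has run, comparing this output with `sysctl kernel.shmall kernel.shmmax vm.min_free_kbytes` is a quick way to check that the settings were applied.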

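Step 2: configuration and database initialization on the [master node]. Note that the play below still declares hosts: all, so it has to be run against an inventory (or with --limit) that contains only the master: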

#!/usr/bin/env ansible-playbook
---

- hosts: all
  vars_files:
    - vars/gpdb.yml
  remote_user: root
  become: yes
  become_method: sudo
  connection: ssh
  gather_facts: yes
  tasks:
    - name: 31. copy files for initialize greenplum master
      copy:
        src: '{{ item }}'
        dest: '/home/{{ greenplum_admin_user }}/'
        remote_src: no
      with_items:
        - gpnodes/gpadmin_hosts
        - template/gpadmin_auto_ssh.sh
        - template/initdb_gpdb.sql
    - name: 32. replace greenplum admin user environment bash file
      template: src=template/gpadmin_bashrc.j2 dest=/home/{{ greenplum_admin_user }}/.bashrc
    - name: 33. copy and configure gpinitsystem config file
      template: src=template/gpinitsystem_config.j2 dest=/home/{{ greenplum_admin_user }}/gpinitsystem_config
    - name: 34. change home directory ownership
      file:
        path: '/home/{{ greenplum_admin_user }}/'
        owner: '{{ greenplum_admin_user }}'
        group: '{{ greenplum_admin_user }}'
        recurse: yes
    - name: 35. configure greenplum admin user auto login
      command: sh /home/{{ greenplum_admin_user }}/gpadmin_auto_ssh.sh /home/{{ greenplum_admin_user }}/gpadmin_hosts
      become: yes
      become_user: '{{ greenplum_admin_user }}'
    - name: 36. initialize greenplum master database
      shell: '{{ item }}'
      become: yes
      become_method: su
      become_flags: '-'
      become_user: '{{ greenplum_admin_user }}'
      with_items:
        - "gpinitsystem -a -c /home/{{ greenplum_admin_user }}/gpinitsystem_config -h /home/{{ greenplum_admin_user }}/segment_hosts -s smdw"
        - "psql -d postgres -U gpadmin -f /home/{{ greenplum_admin_user }}/initdb_gpdb.sql"
        - "echo \"host  all  all  0.0.0.0/0  password\" >> /home/{{ greenplum_admin_user }}/data/master/gpseg-1/pg_hba.conf"
        - "gpstop -u"
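
Task 33 renders template/gpinitsystem_config.j2, which is not shown here. As a rough idea of what the rendered file might contain, the sketch below writes a sample configuration using parameter names from the Greenplum 6 gpinitsystem documentation. The host name mdw, the port numbers, and the single primary and mirror directory per host are assumptions, not values taken from the project.

```shell
#!/bin/sh
# Sketch: a plausible rendered gpinitsystem_config (all values are assumptions).
out=$(mktemp)

cat > "$out" <<'EOF'
ARRAY_NAME="Greenplum Data Platform"
SEG_PREFIX=gpseg
PORT_BASE=6000
declare -a DATA_DIRECTORY=(/home/gpadmin/data/primary)
MASTER_HOSTNAME=mdw
MASTER_DIRECTORY=/home/gpadmin/data/master
MASTER_PORT=5432
TRUSTED_SHELL=ssh
CHECK_POINT_SEGMENTS=8
ENCODING=UNICODE
MIRROR_PORT_BASE=7000
declare -a MIRROR_DATA_DIRECTORY=(/home/gpadmin/data/mirror)
EOF

# With SEG_PREFIX=gpseg the master instance lives in $MASTER_DIRECTORY/gpseg-1,
# which matches the pg_hba.conf path edited in task 36.
master_dir=$(sed -n 's/^MASTER_DIRECTORY=//p' "$out")
echo "master data dir: $master_dir/gpseg-1"
```

The directories in this sample line up with the master/, primary/, and mirror/ subdirectories that task 28 creates under the data directory.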

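Once these two steps are complete, a distributed Greenplum MPP database is up and running. With only a few small helper shell scripts, this automates work that would take roughly half a day by hand.

The complete source code of the installer project is available here:

Project: https://gitee.com/inrgihc/greenplum_installer

Installation docs: https://gitee.com/inrgihc/greenplum_installer/wikis

Using it is fairly simple: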

git clone https://gitee.com/inrgihc/greenplum_installer.git
cd greenplum_installer/
make all
cd bin/
[root@localhost bin]# tree .
.
├── account.txt
└── greenplum6-centos7-release.bin

These commands produce a distributed-installation .bin package for Greenplum 6.6.0 on CentOS 7. To use it, upload greenplum6-centos7-release.bin to any one of the servers 10.101.1.10 through 10.101.1.13, then run it as root as follows:

[root@localhost root]# cat account.txt 
10.101.1.10 root 123321
10.101.1.11 root 123321
10.101.1.12 root 123321
10.101.1.13 root 123321
[root@localhost bin]# tree .
.
├── account.txt
└── greenplum6-centos7-release.bin

0 directories, 2 files
[root@localhost bin]# sh ./greenplum6-centos7-release.bin ./account.txt  install

4. Summary

There is a lot more to Ansible, and so far I have only covered the most basic usage. Further reading:

  • (1) https://blog.csdn.net/workwithwebis3w/article/details/94617764
  • (2) http://www.ansible.com.cn/index.html
  • (3) https://docs.ansible.com/