Don't Assume You Really Understand OpenStack: The 50 Steps and 100 Knowledge Points of Creating a VM (2)

2. nova-api


Step 3: nova-api receives the request

nova-api does not simply accept whatever comes in; rate limits need to be set. The default implementation lives in the ratelimit middleware.

Sometimes, however, we want distributed rate limiting, and for that Turnstile is a good choice.

https://github.com/klmitch/turnstile
http://pypi.python.org/pypi/turnstile
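
With the ratelimit middleware enabled, the limits that actually apply to your user can be inspected from the nova CLI; a quick check (assuming a python-novaclient release from that era that still ships these subcommands):

$ nova rate-limits          # per-verb/URI limits enforced by the ratelimit middleware
$ nova absolute-limits      # absolute limits, largely mirroring the quotas of step 6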

Step 4: Token validation

Step 5: Policy check

These two steps were already covered when we studied Keystone.

Step 6: Check quotas

Nova, Neutron and Cinder each have their own quotas, and they can all be managed from the command line:

# nova -h | grep quota
    quota-class-show    List the quotas for a quota class.
    quota-class-update  Update the quotas for a quota class.
    quota-defaults      List the default quotas for a tenant.
    quota-delete        Delete quota for a tenant/user so their quota will
    quota-show          List the quotas for a tenant/user.
    quota-update        Update the quotas for a tenant/user.

# nova quota-show
+-----------------------------+-------+
| Quota                       | Limit |
+-----------------------------+-------+
| instances                   | 10    |
| cores                       | 20    |
| ram                         | 51200 |
| floating_ips                | 10    |
| fixed_ips                   | -1    |
| metadata_items              | 128   |
| injected_files              | 5     |
| injected_file_content_bytes | 10240 |
| injected_file_path_bytes    | 255   |
| key_pairs                   | 100   |
| security_groups             | 10    |
| security_group_rules        | 20    |
+-----------------------------+-------+

# cinder -h | grep quota
    quota-class-show    List the quotas for a quota class.
    quota-class-update  Update the quotas for a quota class.
    quota-defaults      List the default quotas for a tenant.
    quota-show          List the quotas for a tenant.
    quota-update        Update the quotas for a tenant.
    quota-usage         List the quota usage for a tenant.

# cinder quota-show 1779b3bc725b44b98726fb0cbdc617b1
+-----------+-------+
|  Property | Value |
+-----------+-------+
| gigabytes |  1000 |
| snapshots |   10  |
|  volumes  |   10  |
+-----------+-------+

# neutron -h | grep quota
  quota-delete                   Delete defined quotas of a given tenant.
  quota-list                     List quotas of all tenants who have non-default quota values.
  quota-show                     Show quotas of a given tenant
  quota-update                   Define tenant's quotas not to use defaults.

# neutron quota-show 1779b3bc725b44b98726fb0cbdc617b1
+---------------------+-------+
| Field               | Value |
+---------------------+-------+
| floatingip          | 50    |
| network             | 10    |
| port                | 50    |
| router              | 10    |
| security_group      | 10    |
| security_group_rule | 100   |
| subnet              | 10    |
+---------------------+-------+
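
To change these limits, each of these services also has a quota-update command. A rough sketch, reusing the tenant ID from above (exact flag names can vary slightly between releases):

$ nova quota-update --instances 20 --cores 40 1779b3bc725b44b98726fb0cbdc617b1
$ cinder quota-update --volumes 20 --gigabytes 2000 1779b3bc725b44b98726fb0cbdc617b1
$ neutron quota-update --tenant-id 1779b3bc725b44b98726fb0cbdc617b1 --port 100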

The following articles are recommended:

OpenStack Nova basics: Quota (quota management)

http://www.sebastien-han.fr/blog/2012/09/19/openstack-play-with-quota/

Step 7: Create the instance record in the database

For Nova's database schema, refer to the following article:

http://www.prestonlee.com/2012/05/03/openstack-nova-essex-mysql-database-schema-diagram-and-sql/
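
After this step the new instance exists as a row in the instances table of the nova database, typically in the building state with no host assigned yet. A quick way to peek at it, as a sketch (column names taken from the schema linked above):

$ mysql -u root -p nova -e "SELECT uuid, vm_state, task_state, host FROM instances WHERE deleted = 0;"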

MySQL is one of the most important components in OpenStack, so in a production environment High Availability is a must.

There are several ways to provide HA for MySQL:

http://dev.mysql.com/doc/mysql-ha-scalability/en/index.html

Availability

| Requirement | MySQL Replication | MySQL with DRBD, Corosync and Pacemaker | MySQL Cluster |
| --- | --- | --- | --- |
| Platform Support | All Supported by MySQL Server | Linux | All Supported by MySQL Cluster |
| Automated IP Failover | No | Yes | Depends on Connector and Configuration |
| Automated Database Failover | No | Yes | Yes |
| Automatic Data Resynchronization | No | Yes | Yes |
| Typical Failover Time | User / Script Dependent | Configuration Dependent, 60 Seconds and Above | 1 Second and Less |
| Synchronous Replication | No, Asynchronous and Semisynchronous | Yes | Yes |
| Shared Storage | No, Distributed | No, Distributed | No, Distributed |
| Geographic Redundancy Support | Yes | Yes, via MySQL Replication | Yes, via MySQL Replication |
| Update Schema On-Line | No | No | Yes |

Scalability

| Requirement | MySQL Replication | MySQL with DRBD, Corosync and Pacemaker | MySQL Cluster |
| --- | --- | --- | --- |
| Number of Nodes | One Master, Multiple Slaves | One Active (primary), One Passive (secondary) Node | 255 |
| Built-in Load Balancing | Reads, via MySQL Replication | Reads, via MySQL Replication | Yes, Reads and Writes |
| Supports Read-Intensive Workloads | Yes | Yes | Yes |
| Supports Write-Intensive Workloads | Yes, via Application-Level Sharding | Yes, via Application-Level Sharding to Multiple Active/Passive Pairs | Yes, via Auto-Sharding |
| Scale On-Line (add nodes, repartition, etc.) | No | No | Yes |

To learn MySQL replication systematically, the following book is recommended:

《MySQL High Availability Tools for Building Robust Data Centers》

Another approach is MySQL + Galera, which lets you build an Active/Active MySQL setup.

MySQL fork MySQL/Galera 1.0 released

Refer to the following two articles:

http://www.sebastien-han.fr/blog/2012/04/08/mysql-galera-cluster-with-haproxy/

http://www.sebastien-han.fr/blog/2012/04/01/mysql-multi-master-replication-with-galera/
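
Once the Galera nodes have joined, whether the cluster has really formed can be verified from the wsrep status variables, for example:

$ mysql -u root -p -e "SHOW STATUS LIKE 'wsrep_cluster_size';"
$ mysql -u root -p -e "SHOW STATUS LIKE 'wsrep_ready';"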

Another common HA technology is Pacemaker.


The bottom layer is the messaging layer, corosync/openais, which handles communication between the nodes in the cluster.

The layer above that is the Resource Allocation Layer, which contains the following components:

CRM (Cluster Resource Manager)

This is the overall manager: every operation on a resource goes through it. There is a CRM on every machine.

CIB (Cluster Information Base)

The CIB is managed by the CRM. It is an in-memory XML database holding the cluster's configuration and state: nodes, resources, constraints and their relationships. Any configuration we query comes out of the CIB.

DC (Designated Coordinator)

Every node runs a CRM, and one of them is elected as the DC, the brain of the whole cluster. The CIB controlled by the DC is the master CIB; all the others are replicas.

PE (Policy Engine)

When the DC needs to make a cluster-wide change, the PE first computes the target state from the current state and configuration, and then generates the series of actions that take the cluster from its initial state to that target state. The PE runs only on the DC.

LRM (Local Resource Manager)

Manages resources locally: it invokes the resource agents to start, stop and monitor resources, and returns the results to the CRM.

The top layer is the Resource Layer.

It contains the resource agents. A resource agent is usually a shell script that starts, stops and monitors the state of a resource.
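
As a concrete example, defining and inspecting a resource with the crm shell looks roughly like this (a sketch using the standard IPaddr2 resource agent for a floating virtual IP; the address is a placeholder):

# crm configure primitive p_vip ocf:heartbeat:IPaddr2 \
      params ip=192.168.1.100 cidr_netmask=24 \
      op monitor interval=10s
# crm configure show    # dump the configuration stored in the CIB
# crm_mon -1            # one-shot view of nodes and resource status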

To really understand Pacemaker, reading the SUSE High Availability Guide is recommended:

https://www.suse.com/documentation/sle_ha/singlehtml/book_sleha/book_sleha.html

I have made some notes and done some experiments; see:

High Availability Handbook (1): Environment

High Availability Handbook (2): Architecture

High Availability Handbook (3): Configuration

Step 8: Create filter_properties for the nova scheduler

Step 9: Send an RPC to nova-conductor

An article about nova-conductor:

http://cloudystuffhappens.blogspot.com/2013/04/understanding-nova-conductor-in.html

In OpenStack, RPC messages are sent through RabbitMQ.

RabbitMQ can be made highly available with Pacemaker, or of course you can build a RabbitMQ cluster of its own.
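
Joining a node into a RabbitMQ cluster is done with rabbitmqctl, roughly as follows on the second node (a sketch, assuming RabbitMQ 3.x and a first node named rabbit@node1):

$ rabbitmqctl stop_app
$ rabbitmqctl join_cluster rabbit@node1
$ rabbitmqctl start_app
$ rabbitmqctl cluster_status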

For learning RabbitMQ, the first recommendation is of course RabbitMQ in Action.

I have also taken some notes:

RabbitMQ in Action (1): Understanding messaging

RabbitMQ in Action (2): Running and administering Rabbit

RabbitMQ in Action(5): Clustering and dealing with failure

I have not finished the whole book yet; please bear with me.

As for how OpenStack uses RabbitMQ, a very good article is:

NOVA source code analysis: RabbitMQ in NOVA

I have also done a code walkthrough of the RPC call path:

RabbitMQ RPC code analysis in OpenStack
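
One way to watch this RPC traffic is to look at the queues nova declares on the broker. Queue names vary a bit between releases, but you will typically find per-service topics such as conductor, scheduler and compute.<hostname>; a quick look:

$ rabbitmqctl list_queues name messages consumers | grep -E 'conductor|scheduler|compute'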

Step 10: nova-conductor creates the request_spec for the scheduler

Step 11: nova-conductor sends an RPC to nova-scheduler

3. nova-scheduler

 


Choosing the physical host on which to create the virtual machine is what we call the scheduling process.

A classic picture of the nova scheduler is shown below:

[Figure: filteringWorkflow1.png, the filter-then-weight scheduling workflow]

Hosts are first filtered and then weighted. In fact, the scheduling process gets involved much earlier: the inputs it needs were prepared in the previous steps.

Step 13: Filter the hosts

Filtering relies mainly on two inputs, request_spec and filter_properties, both of which were prepared in the earlier steps.

The individual filters simply take this information, combine it with the HostState statistics collected by the HostManager, and select the hosts that match.

The first piece of information in request_spec is the image properties. This matters especially if you want to support more than one hypervisor: Xen images, KVM images and Hyper-V images are all different, so how do you make sure an image runs on the right hypervisor? The hypervisor_type property on the image is essential for this.

Please read the following article:

http://www.cloudbase.it/filtering-glance-images-for-hyper-v/

Image properties also include min_ram and min_disk; the memory and disk have to be at least that large.
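
These properties are set on the image itself. A sketch with the glance CLI of that era (the image ID is a placeholder):

$ glance image-update --property hypervisor_type=kvm \
    --min-ram 1024 --min-disk 10 827d564a-e636-4fc4-a376-d36f7ebe1747

The ImagePropertiesFilter, when enabled, is what compares hypervisor_type and similar image properties against what each host reports.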

A flavor can define extra_specs, a set of key-value pairs (represented by the instance_type variable in the data structures). Besides the resource requirements, any other parameters of the flavor can be put there and then used during filtering.

Host aggregates divide all the hosts into groups. Groups can be defined by whatever metadata attributes you like, for example a high-performance group and a low-performance group.

This example from the OpenStack documentation shows nicely how host aggregates and flavor extra_specs work together:

http://docs.openstack.org/trunk/config-reference/content/section_compute-scheduler.html

Example: Specify compute hosts with SSDs

This example configures the Compute service to enable users to request nodes that have solid-state drives (SSDs). You create a fast-io host aggregate in the nova availability zone and you add the ssd=true key-value pair to the aggregate. Then, you add the node1 and node2 compute nodes to it.

$ nova aggregate-create fast-io nova
+----+---------+-------------------+-------+----------+
| Id | Name    | Availability Zone | Hosts | Metadata |
+----+---------+-------------------+-------+----------+
| 1  | fast-io | nova              |       |          |
+----+---------+-------------------+-------+----------+

$ nova aggregate-set-metadata 1 ssd=true
+----+---------+-------------------+-------+-------------------+
| Id | Name    | Availability Zone | Hosts | Metadata          |
+----+---------+-------------------+-------+-------------------+
| 1  | fast-io | nova              | []    | {u'ssd': u'true'} |
+----+---------+-------------------+-------+-------------------+

$ nova aggregate-add-host 1 node1
+----+---------+-------------------+------------+-------------------+
| Id | Name    | Availability Zone | Hosts      | Metadata          |
+----+---------+-------------------+------------+-------------------+
| 1  | fast-io | nova              | [u'node1'] | {u'ssd': u'true'} |
+----+---------+-------------------+------------+-------------------+

$ nova aggregate-add-host 1 node2
+----+---------+-------------------+----------------------+-------------------+
| Id | Name    | Availability Zone | Hosts                | Metadata          |
+----+---------+-------------------+----------------------+-------------------+
| 1  | fast-io | nova              | [u'node1', u'node2'] | {u'ssd': u'true'} |
+----+---------+-------------------+----------------------+-------------------+

Use the nova flavor-create command to create the ssd.large flavor with an ID of 6, 8 GB of RAM, an 80 GB root disk, and four vCPUs.

$ nova flavor-create ssd.large 6 8192 80 4
+----+-----------+-----------+------+-----------+------+-------+-------------+-----------+-------------+
| ID | Name      | Memory_MB | Disk | Ephemeral | Swap | VCPUs | RXTX_Factor | Is_Public | extra_specs |
+----+-----------+-----------+------+-----------+------+-------+-------------+-----------+-------------+
| 6  | ssd.large | 8192      | 80   | 0         |      | 4     | 1           | True      | {}          |
+----+-----------+-----------+------+-----------+------+-------+-------------+-----------+-------------+

Once the flavor is created, specify one or more key-value pairs that match the key-value pairs on the host aggregates. In this case, that is the ssd=true key-value pair. Setting a key-value pair on a flavor is done using the nova flavor-key command.

$ nova flavor-key ssd.large set ssd=true

Once it is set, you should see the extra_specs property of the ssd.large flavor populated with a key of ssd and a corresponding value of true.

$ nova flavor-show ssd.large
+----------------------------+-------------------+
| Property                   | Value             |
+----------------------------+-------------------+
| OS-FLV-DISABLED:disabled   | False             |
| OS-FLV-EXT-DATA:ephemeral  | 0                 |
| disk                       | 80                |
| extra_specs                | {u'ssd': u'true'} |
| id                         | 6                 |
| name                       | ssd.large         |
| os-flavor-access:is_public | True              |
| ram                        | 8192              |
| rxtx_factor                | 1.0               |
| swap                       |                   |
| vcpus                      | 4                 |
+----------------------------+-------------------+

Now, when a user requests an instance with the ssd.large flavor, the scheduler only considers hosts with the ssd=true key-value pair. In this example, these are node1 and node2.

Another use is to keep the Xen pool and the KVM pool separate, which also helps with live migration on Xen.

Yet another use is to keep the Windows pool and the Linux pool separate. Windows requires licenses while most Linux distributions do not, and Windows is licensed per physical machine rather than per virtual machine, so it pays to pack Windows VMs onto as few physical machines as possible.
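
This works with exactly the same mechanism as the SSD example above, typically matched by the AggregateInstanceExtraSpecsFilter when it is enabled. A sketch for the Windows case (the aggregate ID, host name and flavor name are placeholders):

$ nova aggregate-create windows-pool nova
$ nova aggregate-set-metadata 2 os=windows
$ nova aggregate-add-host 2 node3
$ nova flavor-key win.large set os=windows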

Inside filter_properties, scheduler_hints is a JSON document; arbitrary values can be put there for filters to use.

For example, the JsonFilter:

The JsonFilter allows a user to construct a custom filter by passing a scheduler hint in JSON format. The following operators are supported:

  • =

  • <

  • >

  • in

  • <=

  • >=

  • not

  • or

  • and

The filter supports the following variables:

  • $free_ram_mb

  • $free_disk_mb

  • $total_usable_ram_mb

  • $vcpus_total

  • $vcpus_used

Using the nova command-line tool, use the --hint flag:

$ nova boot --image 827d564a-e636-4fc4-a376-d36f7ebe1747 --flavor 1 --hint query='[">=","$free_ram_mb",1024]' server1

With the API, use the os:scheduler_hints key:


{
    "server": {
        "name": "server-1",
        "imageRef": "cedef40a-ed67-4d10-800e-17455edce175",
        "flavorRef": "1"
    },
    "os:scheduler_hints": {
        "query": "[>=,$free_ram_mb,1024]"
    }
}

We can also pin the instance to a specific physical host using --availability-zone <zone-name>:<host-name>.
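
For example (a sketch reusing the image and flavor from the hint example above; forcing a specific host this way normally requires admin rights):

$ nova boot --image 827d564a-e636-4fc4-a376-d36f7ebe1747 --flavor 1 \
    --availability-zone nova:node1 server2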

Step 14: Weight and sort the matching hosts

Once the hosts have been filtered, the next operation is weighting.

Weighting can take many variables into account. Generally, memory and disk are the first things that must be satisfied, while CPU and network I/O are secondary. For cheaper flavors it is usually enough to satisfy memory and disk; for more expensive flavors, CPU and network I/O also have to be guaranteed.

Step 15: nova-scheduler sends an RPC to the selected host

