Azure Storage Note


Windows Azure Storage

-Availability

-Durability

-Scalability


Basic Knowledge

-PartitionKey

                                -Blob: container name + blob name

                                -Message: queue name

                                -Entity: table name + partition key

                -Throughput

                                -Single queue or single table partition

                                                -Up to 500 transactions per second

                                -Single blob partition

                                                -Reads/writes up to 60 MB/s

                                -Single storage account

                                                -Up to 5000 transactions per second

                                                -Up to 3 gigabits per second of read/write bandwidth
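The partition key rules above can be illustrated with a small Python sketch. The function name `partition_key` and the `/` separator are assumptions made for illustration; the notes only say which names are combined, not how the service encodes them.

```python
def partition_key(kind, **names):
    """Return the value that decides which partition an object lands in."""
    if kind == "blob":
        # Every blob is its own partition: container name + blob name.
        return f"{names['container']}/{names['blob']}"
    if kind == "message":
        # All messages of one queue share a single partition.
        return names["queue"]
    if kind == "entity":
        # Entities with the same table name and PartitionKey share a partition.
        return f"{names['table']}/{names['partition_key']}"
    raise ValueError(f"unknown kind: {kind}")


blob_key = partition_key("blob", container="photos", blob="cat.jpg")
queue_key = partition_key("message", queue="orders")
entity_a = partition_key("entity", table="Movies", partition_key="Action")
entity_b = partition_key("entity", table="Movies", partition_key="Action")
```

Here `entity_a` and `entity_b` are equal, so both entities compete for the same per-partition throughput limit listed above.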


REST API and client library are both supported
-With the client library, you create a client from the credentials + the REST endpoint URI

Blob Storage

                -$root container
                -Operations
                                -GetBlob (the whole blob or a specific byte range)
                                -PutBlob
                                -DeleteBlob
                                -CopyBlob
                                                -Creates a new copy
                                                -Copy + Delete: renames a blob
                                -SnapshotBlob
                                                -Read-only version of the blob
                                                -Restore (promote a snapshot to be the current version of the blob)
                                                -List snapshots
                                -LeaseBlob (exclusive update)
                                                -Acquire, Renew, Release, Break
                                                -Use case: master election process
                                -Metadata (can be read and set separately from the blob content)

                -Sharing scenarios
                                -Container ACLs (access control lists)
                                                **Delete permission on a container does not allow deleting the container itself, but it does allow deleting any blob in the container
                                -Shared Access Signatures
                                                -Signed identifier
                                                                -Shorter URI
                                                                -Start/end time and permissions can be changed dynamically
                -Retry
                -Timeout
                -AccessCondition
                -Custom domain name

                -Block blob (streaming workloads)
                                -Two-phase commit
                                                -Benefits: individual blocks can be retried, and an interrupted upload can be continued efficiently
                                -Committed list
                                                -PutBlockList(u=blockId1, c=blockId2, blockId3, ...)
                                                -GetBlockList
                                                                -Can get the committed list
                                                                                -MD5 check when you download the content
                                                                -Can get the uncommitted list
                                                                                -Find out which parts of the upload failed
                                -Uncommitted list
                                                -PutBlock(blockId1)
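The two-phase commit of a block blob can be sketched as an in-memory simulation. This is not the storage client library; the class `BlockBlobSketch` and its method names are hypothetical and only mirror the PutBlock / PutBlockList / GetBlockList operations named above.

```python
class BlockBlobSketch:
    """In-memory model of a block blob (hypothetical, for illustration only)."""

    def __init__(self):
        self.uncommitted = {}  # block id -> bytes staged by put_block
        self.committed = []    # ordered (block id, bytes) pairs: the readable blob

    def put_block(self, block_id, data):
        # Phase 1: stage a block. Readers do not see it yet.
        self.uncommitted[block_id] = data

    def put_block_list(self, block_ids):
        # Phase 2: commit an ordered list of staged or already committed blocks.
        current = dict(self.committed)
        new_list = []
        for block_id in block_ids:
            if block_id in self.uncommitted:
                new_list.append((block_id, self.uncommitted[block_id]))
            elif block_id in current:
                new_list.append((block_id, current[block_id]))
            else:
                raise KeyError(f"block {block_id!r} was never uploaded")
        self.committed = new_list
        self.uncommitted = {}  # staged blocks left out of the list are dropped

    def get_block_list(self):
        # Returns (committed ids in order, uncommitted ids).
        return [b for b, _ in self.committed], sorted(self.uncommitted)

    def content(self):
        return b"".join(data for _, data in self.committed)


blob = BlockBlobSketch()
blob.put_block("b1", b"hello ")
blob.put_block("b2", b"world")
before_commit = blob.content()       # still empty: nothing is committed
staged = blob.get_block_list()[1]    # shows which blocks arrived so far
blob.put_block_list(["b1", "b2"])
after_commit = blob.content()
```

After a failed upload, a client can compare `staged` with the blocks it intended to send and re-upload only the missing ones before committing.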

                -Page blob (random-access workloads)
                                -One-phase commit (a write takes effect immediately)
                                -PutPage[512, 2048)
                                                -Writes must be aligned to 512-byte boundaries
                                -PutPage
                                -ClearPage
                                -GetPageRanges
                                                -Returns the valid page ranges in the blob
                                -GetBlob[1000, 2048)
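A minimal Python model of the page blob operations above, assuming half-open byte ranges as in `GetBlob[1000, 2048)`. The class `PageBlobSketch` is hypothetical; it only illustrates the 512-byte alignment rule for writes and what GetPageRanges reports.

```python
PAGE = 512


class PageBlobSketch:
    """In-memory model of a page blob (hypothetical, for illustration only)."""

    def __init__(self, size):
        if size % PAGE:
            raise ValueError("blob size must be a multiple of 512")
        self.data = bytearray(size)
        self.valid = set()  # indexes of pages that currently hold data

    def _check(self, start, end):
        aligned = start % PAGE == 0 and end % PAGE == 0
        if not aligned or not 0 <= start < end <= len(self.data):
            raise ValueError("range must be 512-byte aligned and inside the blob")

    def put_page(self, start, data):
        end = start + len(data)
        self._check(start, end)
        self.data[start:end] = data  # one-phase: visible immediately
        self.valid.update(range(start // PAGE, end // PAGE))

    def clear_page(self, start, end):
        self._check(start, end)
        self.data[start:end] = bytes(end - start)
        self.valid.difference_update(range(start // PAGE, end // PAGE))

    def get_page_ranges(self):
        # Merge adjacent valid pages into half-open (start, end) byte ranges.
        ranges = []
        for page in sorted(self.valid):
            start = page * PAGE
            if ranges and ranges[-1][1] == start:
                ranges[-1] = (ranges[-1][0], start + PAGE)
            else:
                ranges.append((start, start + PAGE))
        return ranges

    def get_blob(self, start, end):
        # Reads, unlike writes, do not need to be aligned.
        return bytes(self.data[start:end])


blob = PageBlobSketch(4096)
blob.put_page(512, b"x" * 1536)          # writes bytes [512, 2048)
ranges_after_put = blob.get_page_ranges()
blob.clear_page(1024, 1536)
ranges_after_clear = blob.get_page_ranges()
tail = blob.get_blob(1000, 2048)
```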



BlobTips

                -High throughput
                                -Mind the default connection limit
                                -Upload/download multiple files in parallel
                                -ParallelOperationThreadCount
                                                -Applies when uploading a single blob larger than 32 MB
                                -BlobRequestOptions
                                                -Timeout
                                -If you program against the REST protocol directly, implement retries with exponential backoff for timeouts and "server busy" errors
                                -CDN
                -Block blob
                                -Streaming + commit-based writes
                -Page blob
                                -Random reads/writes
                -Set the Timeout value on the BlobClient or in BlobRequestOptions
                                -The client library uses a default of 90 seconds
                -Use Shared Access Signatures
                                -A container-level access policy allows revoking permissions
                                -Provide only the appropriate permissions
                                -Use HTTPS, since these are pre-authenticated URLs
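The advice to retry with exponential backoff on timeouts or "server busy" responses could look roughly like the following. `ServerBusy` and `with_backoff` are hypothetical names, and the delay values are arbitrary assumptions rather than anything the service prescribes.

```python
import random
import time


class ServerBusy(Exception):
    """Hypothetical stand-in for a timeout or a 'server busy' response."""


def with_backoff(operation, max_attempts=5, base_delay=0.5, max_delay=30.0,
                 sleep=time.sleep):
    """Call operation(), retrying with exponentially growing, jittered delays."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ServerBusy:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter keeps many clients from retrying at the same instant.
            sleep(delay * random.uniform(0.5, 1.0))


attempts = []


def flaky_upload():
    # Fails twice, then succeeds, to exercise the retry path.
    attempts.append(1)
    if len(attempts) < 3:
        raise ServerBusy()
    return "uploaded"


waits = []
result = with_backoff(flaky_upload, sleep=waits.append)  # record instead of sleeping
```

Passing `sleep` as a parameter is only a testing convenience; real code would leave the default.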

Drive

                -NTFS API
                -Page blob
                -Use Disk Management
                                -Create a VHD (*.vhd)
                                -Upload it to a blob
                -InitializeCache
                -Create a cloud drive based on the blob
                -Mount drive
                                -Essentially acquires a lease on the page blob
                -Unmount
                                -Essentially releases the lease on the page blob
                -Snapshot drive
                                -Lets multiple VMs mount the drive read-only
                -Mounted by one VM at a time for read/write
                -A VM can dynamically mount up to 16 drives

Table

                -WCF (ADO.NET) Data Services
                -PartitionKey
                                -Entity locality
                                -Entity group transactions
                                -Table scalability
                -Table
                -Entity
                                -Insert
                                -Update
                                                -Merge
                                                -Replace
                                -Delete
                                -Query
                                -Entity group transactions

                -Operations

                                -LinqQuery.AsTableServiceQuery<Movie>()

                                                -ContinuationToken (at most 1000 entities per response)

                                -SaveChangesWithRetries()

                                                -SaveChangesOptions

                                                                -Batch
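What AsTableServiceQuery does with continuation tokens can be pictured with a small Python simulation. `query_page` plays the role of the server and `query_all` the role of the client loop; both names, and the use of a list index as the token, are assumptions made for illustration.

```python
def query_page(table, predicate, token=None, page_size=1000):
    """Simulated server call: up to page_size matches plus a resume token."""
    index = token or 0
    page = []
    while index < len(table) and len(page) < page_size:
        if predicate(table[index]):
            page.append(table[index])
        index += 1
    # A token is returned whenever part of the table is still unscanned.
    next_token = index if index < len(table) else None
    return page, next_token


def query_all(table, predicate, page_size=1000):
    """Client loop: keep following the token until the server returns none."""
    token = None
    while True:
        page, token = query_page(table, predicate, token, page_size)
        yield from page
        if token is None:
            break


table = list(range(2500))
evens = list(query_all(table, lambda n: n % 2 == 0))
first_page, first_token = query_page(table, lambda n: n % 2 == 0)
```

A client that stops after the first response would silently see only `first_page`, which is why the loop matters.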

TableTips

                -The default .NET HTTP connection limit is 2
                -If you handle retries in code, use
                                -SaveChangesWithRetries
                                -AsTableServiceQuery (handles continuation tokens)
                -**Handle conflicts caused by retries
                                -A previous attempt may have succeeded on the server even though a network error kept the response from reaching the client
                -Avoid "append only" patterns on the partition key
                                -Good to spread inserts across the table

                -Selecting a partition key
                                -Consider scalability, query efficiency and speed, and entity group transactions, as below
                -Scalability
                                -The partition key allows load balancing across servers
                                -Good to have partition keys that balance the load
                                                -Avoid a single partition key: reads do not scale
                                -Good to have partition keys that distribute the load in case of throttling
                                                -Avoid append-only and prepend-only patterns
                                                                -Only one server is busy at any time, so writes do not scale
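One common way to avoid an append-only partition key is to prefix the natural key with a hash bucket, so consecutive inserts land in different partitions. This is a general technique, not something the notes prescribe; the function name, the bucket count of 16, and the use of MD5 are all assumptions.

```python
import hashlib


def bucketed_partition_key(natural_key, buckets=16):
    """Prefix a key with a stable hash bucket to spread sequential writes."""
    digest = hashlib.md5(natural_key.encode("utf-8")).digest()
    bucket = digest[0] % buckets
    return f"{bucket:02d}-{natural_key}"


# Timestamps are a typical append-only key: each new one sorts after the last.
timestamps = [f"2011-09-14T10:00:{s:02d}" for s in range(60)]
keys = [bucketed_partition_key(t) for t in timestamps]
buckets_used = {k[:2] for k in keys}
```

The trade-off is that a time-range query now has to be issued once per bucket, ideally in parallel.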

                -Query efficiency and speed
                                -Avoid frequent scans
                                -Parallel queries
                                -Single entity
                                                -Best to specify both partition key and row key
                                -Table scan query
                                                -Avoid serial scans that page through continuation tokens
                                                                -Where Rating > 5
                                                -Use range queries and run them in parallel
                                                                -Where PartitionKey >= 'A' and PartitionKey < 'D' and Rating > 5
                                                                -Where PartitionKey >= 'D' and Rating > 5
                                                -Avoid using "OR"
                                                -Expect a continuation token for every query except one that retrieves a single entity
                                                                -If count > 1000
                                                                -If execution time > 5 s
                                                                -At the end of a partition range boundary
                                -Large scan
                                                -Split into ranges and query in parallel
                                                -Use another table
                                -"OR"
                                                -Issue individual queries and run them in parallel
                                -User interaction
                                                -Cache
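Splitting a large scan into partition key ranges and running them in parallel can be sketched as follows. `range_query` stands in for one server round trip and works on an in-memory list; both function names and the sample data are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def range_query(table, low, high, predicate):
    """Simulated query: low <= PartitionKey < high (no upper bound if None)."""
    return [
        e for e in table
        if low <= e["PartitionKey"]
        and (high is None or e["PartitionKey"] < high)
        and predicate(e)
    ]


def parallel_scan(table, boundaries, predicate):
    """Run one range query per [boundary, next boundary) slice in parallel."""
    ranges = list(zip(boundaries, boundaries[1:] + [None]))
    with ThreadPoolExecutor(max_workers=len(ranges)) as pool:
        parts = list(pool.map(
            lambda r: range_query(table, r[0], r[1], predicate), ranges))
    # pool.map keeps the order of the ranges, so results stay sorted by range.
    return [e for part in parts for e in part]


movies = [
    {"PartitionKey": "Alien", "Rating": 8},
    {"PartitionKey": "Brazil", "Rating": 7},
    {"PartitionKey": "Dune", "Rating": 6},
    {"PartitionKey": "Eraserhead", "Rating": 4},
    {"PartitionKey": "Zodiac", "Rating": 9},
]
# "" sorts before every other string, so the first range has no lower gap.
hits = parallel_scan(movies, ["", "D"], lambda e: e["Rating"] > 5)
titles = [e["PartitionKey"] for e in hits]
```

The same fan-out shape works for an "OR" filter: issue one query per alternative and merge the results on the client.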

                -Entity group transactions
                                -Reduce round trips
                                -At most 100 commands and a payload under 4 MB
                                -Account ID as partition key
                                                -Instead of separate user and rental tables
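Because an entity group transaction covers a single partition and at most 100 commands, a client has to group and chunk its pending changes before sending them. A minimal sketch, assuming entities are plain dictionaries; `plan_batches` is a hypothetical helper and it does not check the 4 MB payload limit.

```python
from itertools import groupby

MAX_BATCH = 100  # command limit per entity group transaction, from the notes


def plan_batches(entities):
    """Group entities by PartitionKey, then cut each group into <=100 chunks."""
    # sorted() is stable, so the original order inside a partition is kept.
    keyed = sorted(entities, key=lambda e: e["PartitionKey"])
    batches = []
    for _, group in groupby(keyed, key=lambda e: e["PartitionKey"]):
        group = list(group)
        for i in range(0, len(group), MAX_BATCH):
            batches.append(group[i:i + MAX_BATCH])
    return batches


pending = (
    [{"PartitionKey": "account-a", "RowKey": str(i)} for i in range(250)]
    + [{"PartitionKey": "account-b", "RowKey": str(i)} for i in range(30)]
)
batches = plan_batches(pending)
batch_sizes = [len(b) for b in batches]
```

Using the account ID as the partition key, as suggested above, is what lets a user row and that user's rental rows travel in the same batch.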

                -WCF Data Services
                                -Use a new context for each logical operation
                                                -The context tracks every entity it touches, so updating 1 million entities through one context means tracking all of them
                                -AddObject/AttachTo can throw an exception if the entity is already being tracked
                                -A point query throws an exception if the resource does not exist; set IgnoreResourceNotFoundException to avoid this
                                -Point queries use the table's clustered index


Queue

                -Loosely coupled workflow with queues
                -Guaranteed delivery/processing of messages: a two-step process
                                -Dequeue: the message becomes invisible
                                -Delete the message when done; if the worker crashes, the message becomes visible again
                -FetchAttributes
                                -Get the message count and decide whether to add or remove workers
                -Make message processing idempotent
                -Do not rely on message order
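The two-step dequeue/delete protocol can be modelled in a few lines of Python. `QueueSketch` is a hypothetical in-memory stand-in; the clock is passed in explicitly so the behaviour is deterministic, and the real service's pop receipt is left out for brevity.

```python
class QueueSketch:
    """In-memory queue with a visibility timeout (hypothetical model)."""

    def __init__(self):
        self.messages = []
        self.next_id = 0

    def put(self, body):
        self.messages.append(
            {"id": self.next_id, "body": body,
             "visible_at": 0.0, "dequeue_count": 0})
        self.next_id += 1

    def get(self, now, visibility_timeout=30.0):
        # Step 1: hand out a visible message and hide it for a while.
        for message in self.messages:
            if message["visible_at"] <= now:
                message["visible_at"] = now + visibility_timeout
                message["dequeue_count"] += 1
                return message
        return None

    def delete(self, message_id):
        # Step 2: the worker confirms success by deleting the message.
        self.messages = [m for m in self.messages if m["id"] != message_id]


queue = QueueSketch()
queue.put("resize image 42")
first = queue.get(now=0.0)            # worker A takes the message, then crashes
hidden = queue.get(now=10.0)          # still invisible to other workers
again = queue.get(now=31.0)           # timeout expired: visible again
count_on_retry = again["dequeue_count"]
queue.delete(again["id"])
empty = queue.get(now=100.0)
```

Since the message was delivered twice here, the work it describes has to be idempotent, which is the point of the last two bullets above.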


QueueTips

                -A message can be up to 64 KB
                -A message may be processed more than once
                -Messages can be processed in any order
                -For higher throughput
                                -Batch multiple work items into a single message
                                -Use multiple queues
                -Use DequeueCount to remove poison messages
                -Monitor the message count to dynamically add or remove worker roles
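Two of the tips above, removing poison messages via DequeueCount and scaling workers from the message count, could be sketched like this. The threshold of 3, the figure of 100 messages per worker, and both function names are assumptions chosen for illustration.

```python
MAX_DEQUEUE_COUNT = 3  # hypothetical threshold for calling a message poison


def handle(message, process, poison_store):
    """Return True when the caller should delete the message from the queue."""
    if message["dequeue_count"] > MAX_DEQUEUE_COUNT:
        # Park the message for inspection instead of retrying it forever.
        poison_store.append(message)
        return True
    try:
        process(message["body"])
    except Exception:
        return False  # keep it; it becomes visible again after the timeout
    return True


def desired_workers(message_count, messages_per_worker=100,
                    min_workers=1, max_workers=10):
    """Pick a worker count from the queue length, within fixed bounds."""
    needed = -(-message_count // messages_per_worker)  # ceiling division
    return max(min_workers, min(max_workers, needed))


def always_fails(body):
    raise RuntimeError("cannot parse " + body)


poison = []
retry_later = handle({"body": "bad", "dequeue_count": 1}, always_fails, poison)
parked = handle({"body": "bad", "dequeue_count": 4}, always_fails, poison)
```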


Others

Loosely Coupled Workers with Queues

                -Case study
                                -Continuation for long-running work items
                                                -Record progress
                                -Scale queue throughput
                                                -Batch work items into a blob and put a reference to the blob in the queue
                                                -Or use multiple queues

 

Lifecycle management (upgrade and versioning)                                            

In-place Rolling Upgrade

                -Remember that the old version runs side by side with the new version
                -Protocol changes with a rolling upgrade
                                -Two-step process
                                                -Version 1.5
                                                -Version 2
                -Windows Azure Table schema changes
                                -Types of change
                                                -Adding non-key properties
                                                -Removing non-key properties
                                                -Changing the partition key or row key
                                -Two-step process
                                                -V1 client: IgnoreMissingProperties



What is New (September 2011 "Build" event)?

            -Blob

                        -Efficient resume for browsers and streaming media players

            -Table

                        -Query projection ($select)
                                    -Project only selected columns
                                    e.g.:
                                    var query = (from entity in context.CreateQuery<CustomerSubSetTable>("Customers")
                                                 select new CustomerSubSetTable
                                                 {
                                                     PartitionKey = entity.PartitionKey,
                                                     RowKey = entity.RowKey,
                                                     TotalPurchase = entity.TotalPurchase
                                                 }).AsTableServiceQuery<CustomerSubSetTable>();

                                    // Each result carries only the projected properties.
                                    foreach (CustomerSubSetTable customer in query)
                                    {
                                    }

                        -Upsert entity (do not send an ETag)
                                    -InsertOrReplace
                                    -InsertOrMerge
            -Queue
                        -Allows a worker to extend the invisibility timeout
                        -Allows a worker to update the content of a queue message
                                    -Enables efficient continuation after a worker failure
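The difference between the two upsert flavours listed under Table above can be shown on plain dictionaries. The functions below are hypothetical models of the semantics, not client library calls: replace overwrites the whole entity, while merge keeps stored properties that the new entity omits.

```python
def insert_or_replace(table, entity):
    """Store the entity, discarding any properties of an older version."""
    key = (entity["PartitionKey"], entity["RowKey"])
    table[key] = dict(entity)


def insert_or_merge(table, entity):
    """Store the entity, keeping older properties that it does not mention."""
    key = (entity["PartitionKey"], entity["RowKey"])
    merged = dict(table.get(key, {}))
    merged.update(entity)
    table[key] = merged


stored = {"PartitionKey": "acct-1", "RowKey": "cust-7",
          "Name": "Ada", "TotalPurchase": 10}
update = {"PartitionKey": "acct-1", "RowKey": "cust-7", "TotalPurchase": 25}

merged_table = {("acct-1", "cust-7"): dict(stored)}
insert_or_merge(merged_table, update)
merged_entity = merged_table[("acct-1", "cust-7")]      # Name survives

replaced_table = {("acct-1", "cust-7"): dict(stored)}
insert_or_replace(replaced_table, update)
replaced_entity = replaced_table[("acct-1", "cust-7")]  # Name is gone
```

Neither call fails when the entity is missing, which is why no ETag is involved.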

                                   

                                   

Storage Analytics

            -Logs (stored in Windows Azure Blob storage; a request typically appears in the log within 15 minutes)
                        -Trace all transactions for blobs, tables, and queues
                                    -How long the request took
                                    -The client IP
                                    -The request ID
                                    -Which blob or container was accessed
            -Metrics (stored in Windows Azure Table storage)
                        -Hourly summary of key statistics about the traffic to blobs, tables, and queues
                                    -Total transactions
                                    -Storage server latency
                                    -Application end-to-end (E2E) latency
                                                -Time for the input to be transferred to the storage service
                                                -Time for storage to process the request and compute the result (the storage server latency)
                                                -Time for the application to retrieve the result
            **Retention policy, in days, on both logs and metrics

           

           

Windows Azure Storage Internals - Storage Stamps (how Azure Storage achieves availability, durability, and scalability)

            -One storage account is assigned to one storage stamp (storage tenant)

            -One storage stamp comprises 10-20 racks of storage (2-30 PB)

            -Three layers
                        -Front-end layer
                                    -Authentication
                                    -Authorization
                                    -Logging, routing
                                    -Holds the partition map (index)
                        -Partition layer
                                    -Understands table, blob, and queue objects
                                    -Makes table, blob, and queue objects strongly consistent
                                    -Scalable object (table, blob, queue) index
                                                -Spreads the index across hundreds of servers
                                                -Dynamic load balancing
                        -DFS layer
                                    -Makes files durable: replicated 3 times across fault domains and upgrade domains
                                    -Computes checksums
                                    -Load balancing
                                                -Reads: any replica can serve a read
                                                -Writes: use a journal drive for low latency
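How a front end might use its partition map to route a request can be sketched with a sorted list of split points. The notes only say that the front-end layer holds a partition map; the range-based layout, the class `PartitionMap`, and the server names below are assumptions for illustration.

```python
import bisect


class PartitionMap:
    """Maps a partition key to the server that owns its key range."""

    def __init__(self, split_points, servers):
        # split_points[i] is the first key owned by servers[i + 1].
        if len(servers) != len(split_points) + 1:
            raise ValueError("need exactly one more server than split points")
        self.split_points = list(split_points)
        self.servers = list(servers)

    def lookup(self, key):
        # bisect_right counts the split points that are <= key.
        return self.servers[bisect.bisect_right(self.split_points, key)]


routing = PartitionMap(["g", "p"], ["ps-1", "ps-2", "ps-3"])
owner_apple = routing.lookup("apple")
owner_g = routing.lookup("g")
owner_zebra = routing.lookup("zebra")
```

Dynamic load balancing then amounts to moving a split point or reassigning a range to another server, without touching the data stored in the DFS layer.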



