Azure Storage Note

商宝
2023-12-01
Availability
Durability
Scalability




10-20 racks of storage nodes per storage stamp


Windows Azure Storage Stamps


Stream Layer (Durable)
-Files, Blob
Partition Layer
-Blob
-Queue
-Entity

Partition Master
-Range Partition
-Index Range Partition
-Partition Map
-Partition Server

Blob Stream
Data Stream 
Log Stream


Stream layer (append-only) concepts
-availability
-consistency


Block
Extent (replicated 3 times)
unit of replication
contains a sequence of blocks
Streams
ordered list of pointers to extents

EN (extent node)
write to the primary, which replicates to the secondaries
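
The block / extent / stream relationship above can be sketched as a small in-memory model. This is only an illustration of the concepts in these notes, not the real stream layer: the class names, the `max_blocks_per_extent` limit, and the rule for sealing a full extent are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Extent:
    """Unit of replication: an append-only sequence of blocks."""
    blocks: list = field(default_factory=list)
    sealed: bool = False


class Stream:
    """Ordered list of pointers to extents; only the last extent accepts appends."""

    def __init__(self, replicas=3, max_blocks_per_extent=2):
        self.replicas = replicas
        self.max_blocks = max_blocks_per_extent  # assumed limit, for illustration
        # Each entry is one extent as a list of copies: primary first, then secondaries.
        self.extents = [self._new_extent()]

    def _new_extent(self):
        return [Extent() for _ in range(self.replicas)]

    def append(self, block: bytes):
        # When the last extent is full, seal every copy and start a new extent.
        if len(self.extents[-1][0].blocks) >= self.max_blocks:
            for copy in self.extents[-1]:
                copy.sealed = True
            self.extents.append(self._new_extent())
        primary, *secondaries = self.extents[-1]
        primary.blocks.append(block)      # write to the primary
        for copy in secondaries:          # primary replicates to the secondaries
            copy.blocks.append(block)

    def read_all(self):
        # Read every block in order from the primary copy of each extent.
        return [b for copies in self.extents for b in copies[0].blocks]
```

Appending three blocks with the default settings fills one extent, seals it, and opens a second one, while all three copies of each extent stay identical.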


Blob Storage
- getblob (get whole blob or a specific range)
- putblob
- delete blob
- copyblob
- snapshotblob
- leaseblob
- metadata (can be read and set separately from the blob)

REST API and Client Library
-The client library needs to create a client from credentials + the REST URI


BlobRequestOptions
-Retry
-Timeout


Blob Tips
-high throughput
-DefaultConnectionLimit
-upload/download multiple files in parallel
-ParallelOperationThreadCount
-single blob upload > 32MB
-BlobRequestOptions
-Timeout
-if programming against the REST protocol directly, retry with exponential backoff on timeout or server busy
-CDN
-Block Blob
-streaming + commit-based writes
-Page Blob 
- random write/read
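
The retry-with-exponential-backoff tip can be sketched as follows. This is a minimal example, not the client library's retry policy: `TransientError` is a hypothetical stand-in for a timeout or server-busy response, and the default delays and attempt count are assumptions.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical stand-in for a timeout or server-busy response."""


def with_backoff(operation, max_attempts=4, base_delay=0.5, max_delay=30.0,
                 sleep=time.sleep):
    """Call operation(); on TransientError wait base_delay * 2**attempt and retry."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Small random jitter so many clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay / 10))
```

The `sleep` parameter exists only so the waiting can be replaced in tests; in normal use the default `time.sleep` applies.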


Drive
-NTFS API
-Page blob
- use Disk Management 
-Create VHD (*.vhd)
-Upload to blob
-InitializeCache
-Create Cloud Drive based on blob
-Mount Drive
-Unmount/Snapshot Drive


Table
-WCF (ADO.NET) Data Services
-PartitionKey
-Entity Locality
-Entity Group Transactions
-Table scalability
-Table
-Entity
-Insert
-Update
-Merge
-Replace
-Delete
-Query
-Entity Group Transactions
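
Entity group transactions apply to entities that share a partition key, so a client has to group its pending operations before sending them. A rough sketch of that grouping, assuming entities are plain dictionaries with a `PartitionKey` field and using the 100-command limit these notes give under Table Tips (the function name is hypothetical, and the payload size limit is not checked here):

```python
from itertools import groupby


def group_into_batches(entities, max_ops=100):
    """Split entities into batches that share one PartitionKey, max_ops per batch."""
    by_pk = lambda e: e["PartitionKey"]
    batches = []
    # groupby needs its input sorted by the same key it groups on.
    for _, group in groupby(sorted(entities, key=by_pk), key=by_pk):
        group = list(group)
        for i in range(0, len(group), max_ops):
            batches.append(group[i:i + max_ops])
    return batches
```

Each returned batch could then be submitted as one transaction, which reduces round trips compared with one request per entity.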


-Operations
-LinqQuery.AsTableServiceQuery<Movie>()
-Continuation Token (1000 each time)
-SaveChangesWithRetries()
-SaveChangesOptions
-Batch
Table Tips
-Default .NET HTTP connection limit is 2
-Retries must be implemented explicitly; use
-SaveChangesWithRetries
-AsTableServiceQuery
-**Handle conflicts caused by retries
- with retries, an earlier attempt may have succeeded on the server while a network error kept the response from reaching the client
-Avoid "append only" patterns on the partition key
- better to spread inserts across tables
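
The retry conflict described above (the first insert succeeds, the response is lost, the retry then fails because the entity already exists) can be reproduced with a small in-memory stand-in. `FakeTable`, `Conflict`, and `insert_with_retry` are hypothetical names for this sketch; a real client would see the conflict as an error response from the service.

```python
class Conflict(Exception):
    """Hypothetical stand-in for an 'entity already exists' response."""


class FakeTable:
    """In-memory table that can lose the response of one successful insert."""

    def __init__(self):
        self.rows = {}
        self.drop_next_response = False

    def insert(self, pk, rk, props):
        if (pk, rk) in self.rows:
            raise Conflict()
        self.rows[(pk, rk)] = dict(props)
        if self.drop_next_response:
            # The write succeeded, but the client never hears about it.
            self.drop_next_response = False
            raise ConnectionError("response lost")

    def get(self, pk, rk):
        return self.rows.get((pk, rk))


def insert_with_retry(table, pk, rk, props, attempts=3):
    retried = False
    for _ in range(attempts):
        try:
            table.insert(pk, rk, props)
            return "inserted"
        except ConnectionError:
            retried = True
        except Conflict:
            # Only after a retry can the conflict come from our own earlier write.
            if retried and table.get(pk, rk) == props:
                return "inserted-by-earlier-attempt"
            raise
    raise ConnectionError("gave up after retries")
```

A conflict on the very first attempt is still raised, because in that case another writer created the entity.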

Queue
-Loosely Coupled workflow with queues
-Guaranteed delivery/processing of messages
-A dequeued message becomes invisible
-Delete the message when done; if the worker crashes, the message becomes visible again
Queue Tips
-Message can be up to 64KB
-A message may be processed more than once
-Messages can be processed in any order
-For higher throughput
-Batch multiple work items into a single message
-Use multiple queues
-use DequeueCount to remove poison messages
-Monitor message count to dynamically increase/reduce worker role instances
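
The dequeue / invisible / delete cycle and the DequeueCount check for poison messages can be sketched with an in-memory queue. `FakeQueue` and `process_one` are hypothetical, time is a plain counter advanced by hand, and the `max_dequeue` threshold of 3 is an assumption.

```python
import itertools


class FakeQueue:
    """In-memory queue with a visibility timeout and a per-message dequeue count."""

    def __init__(self, visibility_timeout=30):
        self.timeout = visibility_timeout
        self.now = 0                      # simulated clock, advanced by the caller
        self.messages = {}                # message id -> message state
        self._ids = itertools.count(1)

    def put(self, body):
        self.messages[next(self._ids)] = {
            "body": body, "visible_at": 0, "dequeue_count": 0}

    def get(self):
        for mid, m in self.messages.items():
            if m["visible_at"] <= self.now:
                m["visible_at"] = self.now + self.timeout   # hide the message
                m["dequeue_count"] += 1
                return mid, m["body"], m["dequeue_count"]
        return None

    def delete(self, mid):
        self.messages.pop(mid, None)


def process_one(queue, handler, poison, max_dequeue=3):
    """Process one message; return False when nothing is visible."""
    got = queue.get()
    if got is None:
        return False
    mid, body, count = got
    if count > max_dequeue:
        poison.append(body)       # set the poison message aside and remove it
        queue.delete(mid)
        return True
    try:
        handler(body)
    except Exception:
        return True               # not deleted: it reappears after the timeout
    queue.delete(mid)             # delete only after successful processing
    return True
```

A handler that always fails is retried three times here and moved to the poison list on the fourth dequeue.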


Blob
-Copy Blob
-new copy
-Copy + Delete
-Snapshot
-Read only version
-Restore (promote a snapshot to the current version of the blob)
-List snapshots
-Lease (exclusive update)
-Acquire, Renew, Release, Break
-$root container
-Custom domain name
-Shared Access Signatures
-start/end time
-grant access at container level or blob level
-types of permission
-Signed identifier
-short URI
-supports changing start/end time and permissions dynamically

-Block Blob (Streaming Workload)
-Committed List
-PutBlockList(u=blockId1,c=blockId2,blockId3..)
-GetBlockList
-can get the committed list
-MD5 check when you download the content
-can get the uncommitted list
-figure out which parts failed to upload
-Uncommitted List
-PutBlock(BlockId1)
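
The two-phase block blob write (PutBlock into the uncommitted list, then PutBlockList to commit) can be modelled in memory as below. This is a simplified sketch, not the service's behavior in full: the `BlockBlob` class is hypothetical, and `put_block_list` always prefers an uncommitted block over a committed one with the same id, whereas the notes show the real call naming the list (`u=` / `c=`) per block.

```python
class BlockBlob:
    """In-memory model of uncommitted and committed block lists."""

    def __init__(self):
        self.uncommitted = {}        # block id -> data, staged by put_block
        self.committed = {}          # block id -> data, visible to readers
        self.committed_order = []    # order of committed blocks in the blob

    def put_block(self, block_id, data):
        self.uncommitted[block_id] = data

    def put_block_list(self, block_ids):
        new = {}
        for bid in block_ids:
            if bid in self.uncommitted:
                new[bid] = self.uncommitted[bid]
            elif bid in self.committed:
                new[bid] = self.committed[bid]
            else:
                raise KeyError(f"block {bid!r} was never uploaded")
        self.committed, self.committed_order = new, list(block_ids)
        self.uncommitted = {}        # staged blocks not listed are dropped

    def get_block_list(self):
        """Return (committed ids in order, sorted uncommitted ids)."""
        return list(self.committed_order), sorted(self.uncommitted)

    def read(self):
        return b"".join(self.committed[b] for b in self.committed_order)
```

Comparing the planned block ids with the uncommitted list from `get_block_list` shows which parts still need to be uploaded before committing.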

-Page Blob (Random-Access Workload)
-PutPage[512,2048)
-must be 512-byte aligned
-PutPage
-ClearPage
-GetPageRanges
-Get Valid page ranges in the blob
-GetBlob[1000,2048)
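
The page blob operations and the 512-byte alignment rule can be illustrated with a small in-memory model. The `PageBlob` class is hypothetical; ranges are half-open `[start, end)` as in the GetBlob line of these notes, and unwritten pages are assumed to read as zeros.

```python
PAGE = 512


class PageBlob:
    """In-memory page blob: writes and clears must be 512-byte aligned."""

    def __init__(self, size):
        if size % PAGE:
            raise ValueError("blob size must be a multiple of 512")
        self.size = size
        self.pages = {}   # page index -> 512 bytes of data

    def _check(self, start, end):
        if start % PAGE or end % PAGE or not 0 <= start < end <= self.size:
            raise ValueError("range must be 512-byte aligned and inside the blob")

    def put_page(self, start, data):
        end = start + len(data)
        self._check(start, end)
        for i in range(start // PAGE, end // PAGE):
            off = i * PAGE - start
            self.pages[i] = data[off:off + PAGE]

    def clear_page(self, start, end):
        self._check(start, end)
        for i in range(start // PAGE, end // PAGE):
            self.pages.pop(i, None)

    def get_page_ranges(self):
        """Return the written ranges as merged (start, end) tuples."""
        ranges = []
        for i in sorted(self.pages):
            s = i * PAGE
            if ranges and ranges[-1][1] == s:
                ranges[-1][1] = s + PAGE
            else:
                ranges.append([s, s + PAGE])
        return [tuple(r) for r in ranges]

    def get_blob(self, start, end):
        """Read any byte range; reads do not need to be aligned."""
        first, last = start // PAGE, (end - 1) // PAGE
        data = b"".join(self.pages.get(i, bytes(PAGE)) for i in range(first, last + 1))
        return data[start - first * PAGE:end - first * PAGE]
```

Writing `[512, 2048)` and then reading `[1000, 2048)` returns 1048 bytes, which matches the pattern in the notes: aligned writes, arbitrary reads.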

XDrive
-A VM can dynamically mount up to 8 drives
-A Page Blob can only be mounted by one VM at a time
-Mount Drive
-Acquire a lease on the blob
-Unmount Drive
-Release the lease on the blob
-Mount drive snapshots read-only to share a drive across multiple VMs

Table Tips
-Scalability
-Partition key lets load be balanced across servers
-choose partition keys that spread the load
-avoid a single partition key
-spread load across partition keys in case of throttling
-avoid append-only patterns
-Query Efficiency & Speed
-Avoid frequent scans
-Parallel query
-Single Entity
-Good to have partition key and row key
-Table Scan Query
-Avoid Continuation Tokens
-Where Rating>5
-Use Range Queries & Parallelism
-Where PartitionKey>='A' and PartitionKey<'D' and Rating>5
-Where PartitionKey>='D' and Rating>5
-Avoid using "OR"
-Expect a continuation token for all queries except single-entity lookups
-if count>1000
-if execution time >5s
-if at the end of a partition range boundary
-Large Scan
-Split into ranges and run in parallel
-Use another table
-"OR"
-Run individual queries in parallel
-User Interaction
-Cache
-Entity Group Transaction
-Reduce round trip
-<=100 commands and payload <4MB
-Account ID as partition key
-instead of user table and rental table
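
The "split into ranges and run in parallel" tip, together with the continuation-token loop, can be sketched against an in-memory list of entities. `query_page` is a hypothetical stand-in for one round trip to the table service: it returns at most `page_size` scanned rows plus a token, and a page may contain few or no matches while still carrying a token.

```python
from concurrent.futures import ThreadPoolExecutor


def query_page(rows, lo, hi, predicate, token=0, page_size=1000):
    """One round trip: scan up to page_size rows with lo <= PartitionKey < hi."""
    in_range = [r for r in rows if lo <= r["PartitionKey"] < hi]
    page = in_range[token:token + page_size]
    next_token = token + page_size if token + page_size < len(in_range) else None
    return [r for r in page if predicate(r)], next_token


def query_range(rows, lo, hi, predicate, page_size=1000):
    """Follow continuation tokens until the range is exhausted."""
    results, token = [], 0
    while token is not None:
        page, token = query_page(rows, lo, hi, predicate, token, page_size)
        results.extend(page)
    return results


def parallel_query(rows, ranges, predicate, page_size=1000):
    """Run one range query per (lo, hi) pair in parallel and merge the results."""
    with ThreadPoolExecutor(max_workers=len(ranges)) as pool:
        parts = pool.map(
            lambda r: query_range(rows, r[0], r[1], predicate, page_size), ranges)
        return [row for part in parts for row in part]
```

For example, the range `PartitionKey>='A' and PartitionKey<'D'` could be passed as `[("A", "B"), ("B", "C"), ("C", "D")]` with a predicate for `Rating>5`. The same pattern covers the "OR" case: run each branch as its own query and merge.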

Queue
-FetchAttributes
-Get the message count and decide whether to add/remove workers
-make message processing idempotent
-do not rely on order
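
Because a message may be delivered more than once, the handler should tolerate repeats. One simple way to get that, sketched here with a hypothetical `make_idempotent` wrapper, is to remember which message ids have already been handled. The in-memory set is an assumption for the example; a real worker would need a durable record, for instance a table entity keyed by message id.

```python
def make_idempotent(handler, processed_ids):
    """Wrap handler so a message id is processed at most once."""
    def handle(message_id, body):
        if message_id in processed_ids:
            return False          # duplicate delivery: skip the work
        handler(body)
        processed_ids.add(message_id)
        return True
    return handle
```

If the handler raises, the id is not recorded, so a later redelivery is processed again rather than silently dropped.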


Storage Performance
-Throughput
-up to a few hundred megabytes per sec
-Transactions
-up to a few hundred requests per sec


Storage
-Partition Key
-Blob - Blob Name
-Messages - Queue Name
-Entity - Partition key + Row Key
-Throughput
-Queue and table
-Up to 500 transactions per sec
-Blob
-Small reads/writes up to 30MB/s
-Large reads/writes up to 60MB/s


-Service Model
-Type of roles
-How they connect
-Configuration of the service
-how many instances
-update domain


Loosely Coupled Worker with Queue
-case study
-Continuation for long running Work items
-Record Progress
-Scale Queue Throughput
-Batch work items into a blob and put a reference to the blob on the queue
-Or use multiple queue
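
Two of the ideas above, batching work items into a blob referenced from a queue message and recording progress for long-running work, can be combined in one sketch. Plain dictionaries and a list stand in for blob storage, the progress store, and the queue; all function names are hypothetical.

```python
import json


def enqueue_batch(blobs, queue, name, items):
    """Store the work items in a blob and enqueue only the blob's name."""
    blobs[name] = json.dumps(items).encode()
    queue.append(name)


def process_batch(blobs, progress, name, handler):
    """Process a batch, resuming after the last item recorded in progress."""
    items = json.loads(blobs[name])
    for i in range(progress.get(name, 0), len(items)):
        handler(items[i])
        progress[name] = i + 1    # record progress after each finished item
```

If a worker crashes mid-batch, the next worker to pick up the same message continues from the recorded position instead of starting over. An item whose handler ran but whose progress was not yet recorded is processed again, so the handler should still be idempotent.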


Life cycle management (upgrade and versioning)
In-place Rolling Upgrade
-remember that the old version runs side by side with the new version
-Protocol change with rolling upgrade
-2-step process
- version 1.5
- version 2
-Windows Azure Table schema change
-type of change
-Adding non-key properties
-Removing non-key properties
-changing partition key or row key
-2-step process
-V1 Client: IgnoreMissingProperties
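
The reason the V1 client has to ignore missing properties is that, during a rolling upgrade, old readers see entities written by the new version. A rough sketch of such a tolerant reader, assuming entities are dictionaries and using hypothetical field names (`Title`, `Rating`); the `ignore_missing_properties` flag only mirrors the idea of the client setting and is not its implementation:

```python
V1_FIELDS = ("PartitionKey", "RowKey", "Title")   # hypothetical V1 schema


def read_v1(entity, ignore_missing_properties=True):
    """Map a stored entity onto the V1 schema.

    With ignore_missing_properties=True, properties the V1 schema does not
    know about are skipped instead of causing an error.
    """
    unknown = set(entity) - set(V1_FIELDS)
    if unknown and not ignore_missing_properties:
        raise ValueError(f"unknown properties: {sorted(unknown)}")
    return {k: entity.get(k) for k in V1_FIELDS}
```

Read this way, the two steps are: first roll out readers that tolerate the extra property, then roll out the version that starts writing it.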