Clustered PHP - DC PHP 2009

Why cluster?

serve more users, serve them faster, increase reliability, get rich

Objectives

linear capacity increase, linear cost increase, exponential reliability increase

Common topics in clustering PHP

Load Balancing

Database Scaling

Replicated Storage

Backups

Data Caches

Distributed Sessions

Staging Strategies

Debugging

Background Services

Load Balancing

Your load balancer may or may not...

Remove bad nodes from the pool

Balance by performance

Balance by weight

Route by geolocation

Support sticky sessions

Have 1 million other features
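
As a rough illustration of two features from the list above (removing bad nodes and balancing by weight), here is a minimal sketch in Python. The `Pool` class, the node names and the weights are made up for this example; a real balancer such as nginx or Perlbal handles this in its own configuration.

```python
import random


class Pool:
    """Toy backend pool: weighted selection that skips unhealthy nodes."""

    def __init__(self, weights):
        # weights: hypothetical mapping of node name -> relative weight
        self.weights = dict(weights)
        self.healthy = set(self.weights)

    def mark_down(self, node):
        # Take a bad node out of the pool, e.g. after a failed health check.
        self.healthy.discard(node)

    def mark_up(self, node):
        # Put a known node back into the pool.
        if node in self.weights:
            self.healthy.add(node)

    def pick(self, rng=random):
        # Choose one healthy node, proportionally to its weight.
        nodes = sorted(self.healthy)
        if not nodes:
            raise RuntimeError("no healthy backends")
        node_weights = [self.weights[n] for n in nodes]
        return rng.choices(nodes, weights=node_weights, k=1)[0]
```

With weights `{"web1": 3, "web2": 1}`, `web1` should receive roughly three out of four requests until it is marked down.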

Load Balancing Tools

a few among thousands

DNS Servers

Big IP

Perlbal

nginx

Varnish

Database Scaling

common things you can do

Partitioning

Replication

Sharding

Database Partitioning

Every user is assigned to a database server

Users don't share data with each other (or across servers)

When you need more capacity, add another database server.

Works for some apps, doesn't work for others

implementation example: invoice and timesheet management app
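
The assignment step could be sketched like this. It assumes a small directory that remembers which server each user lives on and places new users on the least-loaded server; the `UserDirectory` class and the server names are hypothetical, and a real application would keep the mapping in a durable store.

```python
class UserDirectory:
    """Toy directory: every user is pinned to exactly one database server."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.assignment = {}  # user_id -> server name

    def add_server(self, server):
        # More capacity: new users start landing on the new server.
        self.servers.append(server)

    def server_for(self, user_id):
        # Existing users keep their server; new users go to the emptiest one.
        if user_id not in self.assignment:
            counts = {s: 0 for s in self.servers}
            for server in self.assignment.values():
                counts[server] += 1
            self.assignment[user_id] = min(self.servers, key=lambda s: counts[s])
        return self.assignment[user_id]
```

Because users never share data, no query ever has to touch more than one server, which is what makes this approach simple for apps that fit it.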

Database Replication (MySQL)

master-master

master-slave

master-many slaves

master-master:

server1 replicates (as master) to server2 (acting as slave)

server2 replicates (as master) to server1 (acting as slave)

works well to a point

complete nightmare when replication gets desynchronized

doesn't actually improve write performance

good for basic high availability

master-slave:

server1 replicates (as master) to server2 (acting as slave)

good first step

makes you rewrite your application to send read queries to the slave

doesn't increase write performance

de-synchronization is relatively painless

replication lag

master-many slaves:

server1 replicates (as master) to many servers (acting as slaves)

thundering read performance

makes you rewrite your application to send read queries to the slaves

doesn't increase write performance

de-synchronization is relatively painless

replication lag
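
A minimal sketch of what "considering slave queries" might look like in application code: writes go to the master, reads rotate across the slaves. To soften replication lag, this example also sends reads to the master for a short window after a write. That pinning window is an assumption of this sketch, not something from the talk, and the `ReplicatedRouter` name and `pin_seconds` parameter are hypothetical.

```python
import itertools
import time


class ReplicatedRouter:
    """Toy query router for a master with zero or more read slaves."""

    def __init__(self, master, slaves, pin_seconds=2.0, clock=time.monotonic):
        self.master = master
        self._slaves = itertools.cycle(slaves) if slaves else None
        self.pin_seconds = pin_seconds  # assumed upper bound on replication lag
        self.clock = clock
        self._last_write = None

    def route(self, sql):
        # Naive classification: anything that is not a SELECT counts as a write.
        # (A locking read such as SELECT ... FOR UPDATE would need the master too.)
        is_read = sql.lstrip().lower().startswith("select")
        now = self.clock()
        if not is_read:
            self._last_write = now
            return self.master
        recently_wrote = (
            self._last_write is not None
            and now - self._last_write < self.pin_seconds
        )
        if self._slaves is None or recently_wrote:
            return self.master
        return next(self._slaves)
```

The pinning trades some read capacity for read-your-own-writes behaviour; whether that is worth it depends on how stale a read the application can tolerate.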

Database Sharding

data is split between multiple database servers

a logical index tracks which data lives where (for example, a mathematical index or a lookup chart)

you have to grab, parse and correlate data across servers

theoretically limitless scalability

complicated

implementation example: Digg, Facebook, etc.
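
The "mathematical index" variant can be as small as a stable hash of the key modulo the number of shards. The sketch below assumes string keys and three hypothetical shard names; `group_by_shard` shows the first half of the cross-server work, planning which keys to fetch from which server before correlating the results.

```python
import zlib

SHARDS = ["shard0", "shard1", "shard2"]  # hypothetical server names


def shard_for(key, shards=SHARDS):
    # crc32 is stable across processes, unlike Python's built-in hash() for str.
    return shards[zlib.crc32(key.encode("utf-8")) % len(shards)]


def group_by_shard(keys, shards=SHARDS):
    # Plan a multi-key fetch: one list of keys per shard that must be queried.
    plan = {}
    for key in keys:
        plan.setdefault(shard_for(key, shards), []).append(key)
    return plan
```

One known cost of a plain modulo index is that changing the number of shards remaps most keys. A lookup chart avoids this at the price of an extra lookup per request.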

Replicated Storage

common things you can do:

replicated file system

lookup tables

storage services

huge NAS arrays

Replicated file system

very affordable

various replication modes

nothing to keep track of in your app

easy to implement

can cause massive failures if poorly configured

Lookup tables

very affordable

limitless modes; entirely up to you

entirely dependent on your application logic

can cause massive failures if poorly configured
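
As one possible reading of the lookup-table approach, the application records which storage nodes hold each file and picks a live copy at read time. Everything here (the `FileLookup` class, the round-robin placement, two replicas per file) is an assumption made for illustration; the point is only that placement and failover live in application logic.

```python
class FileLookup:
    """Toy lookup table: file id -> storage nodes holding a copy."""

    def __init__(self, nodes, replicas=2):
        if replicas > len(nodes):
            raise ValueError("need at least as many nodes as replicas")
        self.nodes = list(nodes)
        self.replicas = replicas
        self.table = {}  # file_id -> list of node names
        self._next = 0

    def store(self, file_id):
        # Place copies on consecutive nodes, rotating the starting node.
        count = len(self.nodes)
        chosen = [self.nodes[(self._next + i) % count] for i in range(self.replicas)]
        self._next = (self._next + 1) % count
        self.table[file_id] = chosen
        return chosen

    def locate(self, file_id, down=()):
        # Return the first node that holds the file and is not known to be down.
        for node in self.table[file_id]:
            if node not in down:
                return node
        raise LookupError("no live replica for %r" % (file_id,))
```

If the table is wrong or lost, files become unreachable even though the bytes still exist, which is the "massive failure" mode to plan for.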

Storage services

very expensive

theoretically limitless capacity

easy to use

data must be pulled back first if used locally

costs and bandwidth usage can be mitigated (for example, by putting a proxy in front of it)

Huge NAS arrays

insanely expensive

bullet-proof fault tolerance... at a price

easy to use... for a price

Backups

common methods:

all-RAID (doesn't work)

snapshots

copying from slaves

all-RAID doesn't work

why?

RAID won't keep your application from deleting data everywhere (a delete is faithfully mirrored to every disk)

Snapshots

use a mechanism to make a snapshot of the partition, e.g. LVM snapshots

works really well

easy if you do it from the beginning

requires some planning

should be used with RAID drives

copying from slaves

take a slave out of rotation and copy from it, e.g. MySQL databases

works really well

easy if you do it from the beginning

requires some planning

backups can be out of date

Data Caches

PHP doesn't have cross-request persistence, so someone added it: memcached

in-memory

fast

scalable

proven

use it

Got configuration data? Small, high-TTL data sets? Use APC.

Large, high-TTL data sets? Use files.

Mind the race condition.
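
For the file-based case, one common way to avoid the race (a reader seeing a half-written file) is to write to a temporary file and rename it into place. The sketch below assumes JSON-serializable values; `cache_put` and `cache_get` are names invented for this example.

```python
import json
import os
import tempfile
import time


def cache_put(path, value, ttl):
    # Write to a temp file in the same directory, then rename over the target.
    # os.replace is atomic on POSIX when both paths are on the same filesystem,
    # so readers see either the old file or the new one, never a partial write.
    payload = {"expires": time.time() + ttl, "value": value}
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as fh:
        json.dump(payload, fh)
    os.replace(tmp, path)


def cache_get(path):
    # Return the cached value, or None if it is missing, unreadable or expired.
    try:
        with open(path) as fh:
            payload = json.load(fh)
    except (OSError, ValueError):
        return None
    if payload["expires"] < time.time():
        return None
    return payload["value"]
```

This protects readers from partial files. It does not stop several processes from regenerating the same expired entry at once; that needs a lock or a similar guard.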

Replicated Sessions

pick your poison:

memcached with redundancy

database

shared file system (don't actually do this)

Staging Strategies

if you value your free time:

Staging Strategies (dev)

do use source control systems (Subversion, etc.)

do profile your code to look for obvious performance issues

do use phpdoc tags

do make your dev environment as similar to live as practical (e.g., don't develop on Windows and run live on UNIX)

do document all your changes

do use TDD (test-driven development)

Staging Strategies (test)

do make test functionally identical to live, except for data

do create data fixtures that are representative of real-life data

do create functional tests for the user interface (Selenium)

do not push anything to stage that did not pass unit tests

Staging Strategies (stage)

do make stage identical to a live node

do connect to the live database

do have test 'users' to perform destructive operations against

do have a mechanism to automate pushing stage to live

Staging Strategies (live)

do not ever make changes by hand on live

do automate pushing updates

do take nodes out of rotation when you push updates

do not allow SSH access to live except when really needed

Debugging

do use xdebug on dev, test, and stage

do prepare an automated action that can turn xdebug and profiling on/off on one of the live nodes. You can and will run into errors that only exist on live.

do write a test case to replicate the bug and then fix the bug, whenever possible

do first check whether bugs are explainable by platform differences between development and production systems (e.g., don't develop on Windows and deploy on UNIX)

do go to my talk at ZendCon in October, "It Works on Dev"

Background Services

do avoid launching background processes from the web app

PHP doesn't have a native message queue, so (many) people wrote some, for example gearmand. Do use a message queue.

do check for memory leaks in background tasks! Many PHP libraries, and many PHP versions themselves, still leak memory. Try to write the loop for a background task in bash rather than in PHP. Recycle the process often.

do plan your message format carefully

do persist important messages
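
Two of the points above, a carefully planned message format and frequent process recycling, can be sketched together. This example assumes a versioned JSON envelope and uses an in-memory `deque` as a stand-in for a real queue such as Gearman; `encode_job`, `decode_job` and `run_worker` are hypothetical names.

```python
import json
from collections import deque

MESSAGE_VERSION = 1  # bump when the envelope changes shape


def encode_job(task, payload):
    # Versioned envelope so old and new workers can reject what they don't know.
    return json.dumps({"v": MESSAGE_VERSION, "task": task, "payload": payload})


def decode_job(raw):
    msg = json.loads(raw)
    if msg.get("v") != MESSAGE_VERSION:
        raise ValueError("unsupported message version: %r" % (msg.get("v"),))
    return msg["task"], msg["payload"]


def run_worker(queue, handlers, max_jobs=100):
    # Process at most max_jobs messages, then return so the process can exit
    # and be restarted by an outer loop or supervisor (the recycling step).
    done = 0
    while queue and done < max_jobs:
        task, payload = decode_job(queue.popleft())
        handlers[task](payload)
        done += 1
    return done
```

Usage: `run_worker(deque([encode_job("resize", {"id": 7})]), {"resize": print})` handles one job and returns 1. The in-memory queue does not persist anything, so important messages would still need a durable queue behind them.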
