A Brief Look at How Sessions Are Implemented in a Cluster
===========================================================

1. A brief history of clusters
"Cluster" is an old topic. Searches on yahoo.cn, baidu.com, and g.cn turned up nothing on when the concept was first proposed, but I later found an English article, http://www.domaingurus.com/faqs/what-is-a-server-cluster.html, which is worth a read if you want the details.

2. Why cluster servers?
Normally our applications do not need a cluster. But as traffic grows, a single server can no longer keep up or respond quickly, and under that load it may even collapse. In that situation we need to add servers and distribute requests among them, while making sure that, from the client's point of view, it still looks like a single server rather than many. The following excerpt from an interview with 袁紅崗 (Yuan Honggang) makes the case well:

"Clusters are useful in two situations: 1. Hosts running under heavy concurrent load. A site like Google receives enormous traffic, so it uses clustering to spread client requests and improve overall responsiveness. Most of the J2EE applications we encounter carry fairly light loads; an application handling fewer than 500 requests per second really has no need for clustering. 2. Failover. This, in my view, is where clusters are truly useful: a low-cost machine stands by as a backup for the primary and takes over promptly when the primary fails, guaranteeing uninterrupted 7x24 service. In short, analyze your specific environment carefully before adopting a cluster, to avoid unnecessary waste."

3. The hard part of server clustering
For any web application, supporting sessions is a basic requirement. As we know, when a user accesses a server, the server generates a unique session for that user. If the user's next request is dispatched to a different server in the cluster, that server has to create a brand-new session, and the user's earlier session state is lost. A key difficulty of clustering is therefore ensuring that every server in the cluster works with the same session for a given user. Common approaches include:

  1. Replicate session state among all the servers in the cluster
  2. Store session state on a single server that the cluster servers read from
  3. Keep session state in a client-side cookie, saving the cost of session replication

As you would expect, options 1 and 2 both carry significant overhead, while 3 is an attractive choice. It does, however, have one major limitation: it cannot hold arbitrary state across requests, only a small amount of basic, frequently used information.
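As a rough illustration of option 3, the sketch below (class and names are hypothetical, not from any particular framework) keeps a small piece of session data in a cookie value signed with HMAC-SHA256, so that any node in the cluster can verify it without replication or a shared session store. A real deployment would also encrypt the payload, add an expiry timestamp, and load the secret from configuration rather than hard-coding it.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Minimal signed-cookie session sketch: every server in the cluster shares
// the same secret, so any of them can validate a cookie issued by any other.
public class CookieSession {
    private static final String SECRET = "change-me"; // assumed shared across nodes

    // Encode data such as "userId=42&role=member" plus a signature.
    public static String encode(String data) throws Exception {
        String payload = Base64.getUrlEncoder().withoutPadding()
                .encodeToString(data.getBytes(StandardCharsets.UTF_8));
        return payload + "." + sign(payload);
    }

    // Return the original data if the signature checks out, or null if tampered.
    public static String decode(String cookieValue) throws Exception {
        int dot = cookieValue.lastIndexOf('.');
        if (dot < 0) return null;
        String payload = cookieValue.substring(0, dot);
        String sig = cookieValue.substring(dot + 1);
        if (!sign(payload).equals(sig)) return null; // signature mismatch
        return new String(Base64.getUrlDecoder().decode(payload), StandardCharsets.UTF_8);
    }

    private static String sign(String payload) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(SECRET.getBytes(StandardCharsets.UTF_8), "HmacSHA256"));
        return Base64.getUrlEncoder().withoutPadding()
                .encodeToString(mac.doFinal(payload.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws Exception {
        String cookie = encode("userId=42&role=member");
        System.out.println(decode(cookie));       // round-trips the original data
        System.out.println(decode(cookie + "x")); // a tampered cookie decodes to null
    }
}
```

Note the trade-off this makes explicit: the server no longer stores anything per user, but the cookie must stay small (browsers cap cookies at about 4 KB) and nothing secret can go in it unless it is also encrypted.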

Server Cluster Definition

Original article: http://www.domaingurus.com/faqs/what-is-a-server-cluster.html

What is a Cluster?

A cluster is the aggregation of multiple stand-alone computers linked together by software and networking technologies to create a unified system.

Clusters are typically categorized into 2 general types:

  • High Performance Computing (HPC), made up of markets traditionally serviced by supercomputers for applications requiring greater computational power than a single computer can provide; or
  • Enterprise or High Availability (HA) with automatic failover, load balancing, redundancy, and other features that provide high reliability for the data center. Many HPC clusters also incorporate some of the features of HA clusters.

Application requirements vary between and within each of these system types. For this reason it’s imperative that you choose a cluster partner that understands the intricacies of cluster design and can help you avoid the pitfalls of cluster deployment.

High-availability (HA) clusters

High-availability clusters are implemented primarily for the purpose of improving the availability of services which the cluster provides. They operate by having redundant nodes, which are then used to provide service when system components fail. The most common size for an HA cluster is two nodes, which is the minimum requirement to provide redundancy. HA cluster implementations attempt to manage the redundancy inherent in a cluster to eliminate single points of failure. There are many commercial implementations of high-availability clusters for many operating systems. The Linux-HA project is one commonly used free software HA package for the Linux OS.

Load-balancing clusters

Load-balancing clusters operate by having all workload come through one or more load-balancing front ends, which then distribute it to a collection of back end servers. Although they are primarily implemented for improved performance, they commonly include high-availability features as well. Such a cluster of computers is sometimes referred to as a server farm.

High Performance Computing (HPC) Clusters

Linux clusters are democratizing supercomputing for engineers, scientists, and researchers whose work demands the highest levels of computational analysis, modeling, and simulations.

HPC clusters are optimized for workloads which require jobs or processes happening on the separate cluster computer nodes to communicate actively during the computation. These include computations where intermediate results from one node's calculations will affect future calculations on other nodes.

Cluster history

The history of cluster computing is best captured by a footnote in Greg Pfister's In Search of Clusters: "Virtually every press release from DEC mentioning clusters says 'DEC, who invented clusters...'. IBM didn't invent them either. Customers invented clusters, as soon as they couldn't fit all their work on one computer, or needed a backup. The date of the first is unknown, but I'd be surprised if it wasn't in the 1960's, or even late 1950's."

The formal engineering basis of cluster computing as a means of doing parallel work of any sort was arguably invented by Gene Amdahl of IBM, who in 1967 published what has come to be regarded as the seminal paper on parallel processing: Amdahl's Law. Amdahl's Law describes mathematically the speedup one can expect from parallelizing any given, otherwise serially performed, task on a parallel architecture. This article defined the engineering basis for both multiprocessor computing and cluster computing, where the primary differentiator is whether the interprocessor communications are supported "inside" the computer (for example, on a customized internal communications bus or network) or "outside" the computer on a commodity network.
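The law mentioned above can be stated compactly. If a fraction p of a task can be parallelized and that fraction is sped up by a factor s, the overall speedup is:

```latex
S(s) = \frac{1}{(1 - p) + \frac{p}{s}}
```

So even as s grows without bound, the speedup is capped at 1/(1 - p): the serial fraction dominates.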

Consequently the history of early computer clusters is more or less directly tied into the history of early networks, as one of the primary motivations for the development of a network was to link computing resources, creating a de facto computer cluster. Packet switching networks were conceptually invented by the RAND Corporation in 1962. Using the concept of a packet switched network, the ARPANET project succeeded in creating in 1969 what was arguably the world's first commodity-network based computer cluster by linking four different computer centers (each of which was something of a "cluster" in its own right, but probably not a commodity cluster). The ARPANET project grew into the Internet -- which can be thought of as "the mother of all computer clusters" (as the union of nearly all of the compute resources, including clusters, that happen to be connected). It also established the paradigm in use by all computer clusters in the world today -- the use of packet-switched networks to perform interprocessor communications between processor (sets) located in otherwise disconnected frames.

The development of customer-built and research clusters proceeded hand in hand with that of both networks and the Unix operating system from the early 1970s, as both TCP/IP and the Xerox PARC project created and formalized protocols for network-based communications. The Hydra operating system was built for a cluster of DEC PDP-11 minicomputers called C.mmp at C-MU in 1971. However, it wasn't until circa 1983 that the protocols and tools for easily doing remote job distribution and file sharing were defined (largely within the context of BSD Unix, as implemented by Sun Microsystems) and hence became generally available commercially, along with a shared filesystem.

The first commercial clustering product was ARCnet, developed by Datapoint in 1977. ARCnet wasn't a commercial success and clustering per se didn't really take off until DEC released their VAXcluster product in 1984 for the VAX/VMS operating system. The ARCnet and VAXcluster products not only supported parallel computing, but also shared file systems and peripheral devices. They were supposed to give you the advantage of parallel processing, while maintaining data reliability and uniqueness. VAXcluster, now VMScluster, is still available on OpenVMS systems from HP running on Alpha and Itanium systems.

Two other noteworthy early commercial clusters were the Tandem Himalaya (a circa 1994 high-availability product) and the IBM S/390 Parallel Sysplex (also circa 1994, primarily for business use).

No history of commodity compute clusters would be complete without noting the pivotal role played by the development of Parallel Virtual Machine (PVM) software in 1989. This open source software based on TCP/IP communications enabled the instant creation of a virtual supercomputer -- a high performance compute cluster -- made out of any TCP/IP connected systems. Free-form heterogeneous clusters built on top of this model rapidly achieved total throughput in FLOPS that greatly exceeded that available even with the most expensive "big iron" supercomputers. PVM and the advent of inexpensive networked PCs led, in 1993, to a NASA project to build supercomputers out of commodity clusters. 1995 saw the invention of the "Beowulf"-style cluster -- a compute cluster built on top of a commodity network for the specific purpose of "being a supercomputer" capable of performing tightly coupled parallel HPC computations. This in turn spurred the independent development of Grid computing as a named entity, although Grid-style clustering had been around at least as long as the Unix operating system and the ARPANET, whether or not it, or the clusters that used it, were named. (Source: Wikipedia)
