《精通數據倉庫設計》中英對照_第三章

第二部分 模型開發

數據倉庫應該體現企業對數據的整體視角,這一視角從主題域模型和業務數據模型開始。我們將在第3章使用一個虛構的公司,一步一步地開發這兩個模型。然後在第4章以這個業務數據模型爲起點,使用八個順序轉換步驟開發數據倉庫數據模型。後面4章鑽研數據倉庫數據模型的各個具體方面,並包含演示這些原則的案例研究。這些案例研究主要使用兩個公司場景。第一個是無所不在購物天堂(GOSH)。GOSH是一個全國性連鎖百貨公司,計劃向國際擴張。第二個是美食公司(DFC)。DFC是一個大型消費類包裝食品生產商,生產多種類型的食品,從粉類和罐裝食品,到乳製品、速凍餐和冰激凌等。

數據倉庫集成多個數據源的數據,並長時間保存。在源系統中的鍵可能指向本系統唯一的記錄,但是到了數據倉庫後可能不再這樣。在第5章,我們討論操作型系統鍵結構導致的問題,及在數據倉庫模型中如何處理這種問題。數據倉庫區別於其他系統的一個屬性是歷史性,在第6章,我們介紹數據倉庫日曆建模的重要性及在數據模型裏維護歷史數據的不同方法。這些方法幫助我們掌握數據倉庫不同尋常的屬性,也就是,用於抓住數據在某個時間的快照。

第7章和第8章鑽研數據倉庫常見的兩種數據類型的建模:層次和事務。數據倉庫的設計體現了一種折中,既要承認源系統(一般爲關係型)的結構,也要承認流行的多維數據集市的結構。對層次與事務的處理提供了達到恰當平衡的技術。本部分的最後,也就是第9章,我們討論確保數據倉庫性能良好的步驟,描述如何優化物理數據倉庫模式。

 

PART 2 Model Development

The data warehouse should represent the enterprise perspective of the data, and that perspective starts with the subject area and business data models. Using a fictitious company, we provide a step-by-step process to develop these two models in Chapter 3. Then, using the business data model as the starting point, Chapter 4 develops the data warehouse data model using eight sequential transformation steps. The following four chapters delve into specific aspects of the data warehouse data model and include case studies demonstrating the principles. These case studies primarily use two company scenarios to develop the business case. The first is the General Omnificent Shopping Haven (GOSH). GOSH is a national department store chain with designs to expand internationally. The second is The Delicious Food Company (DFC). DFC is a large consumer packaged goods manufacturer that produces a wide range of food products, from powders and canned goods to dairy products, frozen dinners, and ice cream.

The data warehouse integrates data from multiple sources and stores the integrated form of that data for a long period of time. The keys that are used in each source system may have uniquely identified records within each system, but they may not be appropriate for the data warehouse. In Chapter 5, we review the problems posed by key structures in the operational systems and how these should be addressed in the data warehouse data model. One of the distinguishing characteristics of the data warehouse is its historical perspective. In Chapter 6, we explain the importance of modeling the calendar in the data warehouse and different approaches for maintaining the historical perspective in this data model. These approaches help us deal with the unusual nature of the data warehouse, that is, it is used to capture snapshots of the data over time.

Chapters 7 and 8 delve into modeling two types of data frequently stored in the data warehouse – hierarchies and transactions. The design of the data warehouse reflects a compromise. It recognizes both the structure of the source systems (typically relational) and the structure of the popular dimensional data marts. The treatment of the hierarchies and transactions provides techniques for striking the right balance. We close this part of the book with a chapter on the steps needed to ensure that the data warehouse performs well. In Chapter 9, we describe what is needed to optimize the physical data warehouse schema.

 

第三章 理解業務模型

所有的應用系統和數據倉庫一樣,都包含基於公司所用數據的信息。業務數據模型表示這些數據,是所有系統模型(包括數據倉庫模型)的基礎。在第2章,我們描述了第三範式模型如何提供數據一致性並限制數據冗餘。我們在本章將講到,業務模型是一個第三範式模型。因爲數據倉庫的目標之一是給企業提供一個關於事實和數字的一致視圖,從滿足這些標準的模型開始非常重要;因此,業務模型是數據倉庫模型的基礎。創建數據倉庫模型就是使用八個定義良好的步驟轉換業務數據模型,這會在第4章介紹。

3.1  業務場景

3.2  主題域模型

3.2.1  關於特定行業的考慮

3.2.2  主題域模型開發過程

3.2.3  Zenith汽車公司的主題域模型

3.3  業務數據模型

3.4  小結

 

Chapter 3 Understanding the Business Model

All application systems, as well as the data warehouse, contain information based on the data used by the company. The business data model represents that data and is the foundation for all systems’ models, including the data warehouse model. In Chapter 2, we described how a third normal form model provides data consistency and restricts data redundancy. As we will present in this chapter, the business model is a third normal form model. Since one of the objectives of the data warehouse is to provide a consistent view of the facts and figures for the enterprise, it is important to start with a model that meets those criteria; therefore, the business model is used as the foundation for the data warehouse model. Building the data warehouse model consists of transforming the business data model using eight well-defined steps, and this is covered in Chapter 4.

一個完全開發的業務數據模型可能包含上百個實體。定義主要信息分組的主題域模型是管理這些實體的一個好方法,它提供一個邏輯方法把這些實體分組。本章以描述主題域模型開始,重點介紹它怎樣幫助確保數據倉庫模型的一致性和管理冗餘。然後,我們列出業務數據模型,它與主題域模型的關係,及其開發步驟。關於業務數據模型的常見抱怨是太深奧並且實踐價值有限,這一節消除這些觀念,並且演示這個模型是一個用速記符號(矩形和線條)描述業務的方式,它對應用系統後續開發帶來很大便利。它對建立主題域模型和業務數據模型提供一個高層級的描繪。我們已經在本書的“推薦閱讀”章節包含了關於這個話題幾本喜愛的圖書。

這是一本關於“怎樣進行”數據倉庫建模的書,縱觀本書第2和第3部分,建模的概念會使用實際的場景來演示。我們使用一個業務場景來演示建模活動,這在本章的開始部分介紹。

A fully developed business data model may contain hundreds of entities. A subject area model, which defines the major groupings of information, is a good way to manage these entities by providing a logical approach for grouping the entities. This chapter begins by describing the subject area model, with particular emphasis on how it helps ensure consistency and manage redundancy in the data warehouse model. We then outline the business data model, its relationship to the subject area model, and the steps required to develop it. A common complaint about the business data model is that it is esoteric and of limited practical value. The section on the business data model dispels these concerns and demonstrates that this model is a means of describing the business in a shorthand notation (that is, rectangles and lines) that facilitates the subsequent development of supporting application systems. It provides a high-level description of the process for building the subject area model and business data model. Complete books have been written on this subject alone, and they should be consulted for additional information. We’ve included some of our favorite books on this topic in the “Recommended Reading” section of this book.

This is a “how to” book on data warehouse modeling. Throughout Parts Two and Three of the book, the modeling concepts will be demonstrated using practical scenarios. We use a business scenario to demonstrate the modeling activities, and it is described at the beginning of this chapter.

1.   業務場景

我們使用一個汽車製造商的場景,在本章用來開發主題域模型和業務數據模型,在第4章用來開發數據倉庫數據模型。描述完業務場景之後,我們將深入主題域模型。

我們的汽車製造公司名叫傑力士汽車公司(Zenith Automobile Company,ZAC)。ZAC創建於1935年,生產兩個款式的汽車:傑力士(Zenith),及更高檔豪華的途克多(Tuxedo)。每個款式都有描述汽車類型的型號,每個型號有3個系列可選。型號在表3.1描述,系列在表3.2描述。

Business Scenario

We use a business scenario of an automobile manufacturer to develop the subject area model and business data model in this chapter and the data warehouse data model in Chapter 4. Following the description of the business scenario, we will dive into the subject area model.

Our automotive manufacturing firm is named Zenith Automobile Company (ZAC). ZAC was founded in 1935 and manufactures two makes of automobile: Zeniths and the higher-end luxury Tuxedos. Each of these makes has models that describe the type of car, and each model has three series available. The models are described in Table 3.1, and the series are described in Table 3.2.

Table 3.1 Car Models  汽車型號

MAKE 款式 | MODEL NAME 樣式名稱 | TARGET GROUP 目標羣體 | DESCRIPTION 描述
Zenith | Zipster | The young at heart (and age) | The Zipster is a sporty, subcompact-class car with a small price tag, excellent gas mileage, and limited options. This is the low-end offering in the Zenith line of cars.
Zenith | Zombie | Older retired drivers with a limited income | The Zombie is a compact sized, four-door automobile, noted for its economical upkeep and good gas mileage.
Zenith | Zoo | Families with small children | The Zoo is a four-door, mid-size car. The car is moderately priced and has good gas mileage.
Zenith | Zoom | Sports car enthusiast of modest means seeking excitement | The Zoom is a moderately expensive, big-engine performance car that offers quick response, agile handling, and fast acceleration.
Zenith | Zeppelin | Luxury minded individual | The Zeppelin is the top-of-the-line Zenith car offering unsurpassed quality and features. It is a four door, full sized model.
Tuxedo | Topsail | Young professionals | The Topsail is a mid-sized, two-door sedan equipped with a full complement of luxury features, including leather seats, an eight-way power-adjustable seat, a tilt steering wheel, and a high-tech car alarm.
Tuxedo | Tiara | The truly discriminating sophisticated driver | The Tiara is a full-sized, four-door sedan that is the top of the line Tuxedo automobile and is priced accordingly. It has many of the same features found in the Topsail but offers every conceivable luxury, including seat and outside mirror heaters.
Tuxedo | Thunderbolt | Wealthy sports car enthusiasts | The Thunderbolt marks an acknowledged milestone in sports cars. It combines all the breathtaking performance of a thoroughbred with the ease of operation, comfort, and reliability of a passenger car.

 

所有ZAC汽車都通過全美各地的代理商銷售。代理商是獨立的實體,但爲了保留作爲ZAC代理商的資格,必須遵守ZAC的制度,制度之一就是要求他們每月提交財務報表。代理商位於銷售片區內,片區組成銷售地區,銷售地區再組成銷售大區。配額分配在銷售片區一級進行,激勵程序由ZAC公司總部開發。

All of ZAC’s cars are sold through dealers throughout the United States. Dealers are independent entities, but to retain their right to serve as ZAC dealers, they are governed by ZAC’s rules. One of those rules requires them to submit monthly financial statements. The dealers are located within sales areas, which are grouped into sales territories, which are grouped into sales regions. Allocations are made at the sales area level, and incentive programs are developed by ZAC corporate.
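The dealer hierarchy described above (dealer, sales area, sales territory, sales region) can be sketched as a simple roll-up. This is only an illustration under stated assumptions: the `Dealer` class, the area, territory, and region names, and the unit counts are all hypothetical and do not come from the book.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Dealer:
    dealer_id: str
    sales_area: str  # each dealer is located within exactly one sales area


# Hypothetical hierarchy: each sales area belongs to one sales territory,
# and each sales territory belongs to one sales region.
AREA_TO_TERRITORY = {
    "Area-1": "Territory-A",
    "Area-2": "Territory-A",
    "Area-3": "Territory-B",
}
TERRITORY_TO_REGION = {
    "Territory-A": "Region-East",
    "Territory-B": "Region-West",
}


def roll_up(dealers, units_sold):
    """Sum dealer-level unit sales at the area, territory, and region levels."""
    totals = {
        "area": defaultdict(int),
        "territory": defaultdict(int),
        "region": defaultdict(int),
    }
    for dealer in dealers:
        units = units_sold.get(dealer.dealer_id, 0)
        territory = AREA_TO_TERRITORY[dealer.sales_area]
        region = TERRITORY_TO_REGION[territory]
        totals["area"][dealer.sales_area] += units
        totals["territory"][territory] += units
        totals["region"][region] += units
    return {level: dict(values) for level, values in totals.items()}


# Made-up dealers and monthly unit sales.
dealers = [Dealer("D1", "Area-1"), Dealer("D2", "Area-2"), Dealer("D3", "Area-3")]
units_sold = {"D1": 10, "D2": 5, "D3": 7}
result = roll_up(dealers, units_sold)
print(result["territory"])
```

Because each level maps to exactly one parent, a total at any level is just the sum of the levels beneath it, which is what the business questions later in this section rely on when they ask for figures by sales area, territory, and region.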

 

 

Table 3.2 Car Series  汽車系列

MAKE 款式 | SERIES NAME 系列名稱 | ACRONYM 縮寫 | DESCRIPTION 描述
Zenith | No Frills 無裝飾 | NF | This is the base level containing no upgrades. Base level consists of vinyl seats, low-end carpeting, smaller engines, manual transmissions, and three paint colors.
Zenith | Some Frills 少許裝飾 | SF | This is the next level and comes with upgraded fabric for the interior seats, moderately upgraded carpet, automatic transmission, larger engines, tinted windows, radio, five paint colors including metallic colors, and so on.
Zenith | Executive Frills 高級裝飾 | EF | The cars in this series come with leather interior, high-quality carpet, automatic transmission, larger engines, air conditioning, tinted windows, cruise control, power windows and locks, radio/tape player, eight paint colors including metallic colors, and so on. This series is not available for the Zipster or the Zombie.
Tuxedo | Pricey Frills 超值裝飾 | PF | Cars in this series come with leather interior, radio/tape deck, air conditioning, optional automatic transmission, cruise control, power windows and door lock, and keyless entry system.
Tuxedo | Decadent Frills 豪華裝飾 | DF | Cars in this series come with all the features for the CF Series plus tinted windows, antitheft car alarm, moon roof, and radio/tape player/CD player with eight speakers.
Tuxedo | Truly Decadent 超豪華裝飾 | TDF | Cars in this series have all the Frills features listed for the PF Series plus power-operated moon roof, advanced sound system and insulation, automatic climate control system, dual illuminated vanity mirrors, and heated front seats.
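As a rough illustration of how the make, model, and series rules in Tables 3.1 and 3.2 constrain one another, the following sketch checks whether a model and series combination is offered. The dictionary and function names are hypothetical, and the only exclusion encoded is the one the table states explicitly (Executive Frills is not available for the Zipster or the Zombie).

```python
# Series acronyms offered per make, taken from Table 3.2.
SERIES_BY_MAKE = {
    "Zenith": {"NF", "SF", "EF"},
    "Tuxedo": {"PF", "DF", "TDF"},
}

# Make of each model, taken from Table 3.1.
MAKE_BY_MODEL = {
    "Zipster": "Zenith",
    "Zombie": "Zenith",
    "Zoo": "Zenith",
    "Zoom": "Zenith",
    "Zeppelin": "Zenith",
    "Topsail": "Tuxedo",
    "Tiara": "Tuxedo",
    "Thunderbolt": "Tuxedo",
}

# Table 3.2 states that Executive Frills is not available for these models.
EXCLUDED = {("Zipster", "EF"), ("Zombie", "EF")}


def is_valid_combination(model, series):
    """Return True if the series is offered for the model's make and not excluded."""
    make = MAKE_BY_MODEL.get(model)
    if make is None:
        return False  # unknown model
    return series in SERIES_BY_MAKE[make] and (model, series) not in EXCLUDED


print(is_valid_combination("Zoom", "EF"))
print(is_valid_combination("Zipster", "EF"))
```

A rule of this kind would normally live in the business data model as a relationship between model and series rather than in application code; the sketch only makes the constraint concrete.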

 

多年來,ZAC在主機、小型機甚至PC機上開發了大量系統,還自建和(或)收購了其他汽車製造工廠,導致出現更多互不相容的系統和數據庫。現在,它擁有IBM 3090、DEC VAX、Tandem、Sun、HP,還有PC機和蘋果機等。數據分佈在DB2、VSAM和Enscribe文件、Non-stop SQL、RDB、Oracle、Sybase和Informix中。終端用戶使用Paradox、Rbase、Microsoft Access和Lotus Notes之類的工具。不用說,數據分佈在公司上百個分離的數據庫裏,很多格式無法訪問。

ZAC剛開始通過信息工程來再造業務。再造工作認定的第一個對公司生存至關重要的項目,是一個包含代理商汽車銷售信息的數據倉庫。爲這個數據倉庫確定的主題域是汽車和代理商,其次是激勵程序和銷售組織。

這個數據倉庫的推動力是這些主題域的數據今天不容易得到,導致機會丟失、金錢浪費、高管對公司的運營情況、方向及銷售把握不清。經過公司關鍵人物的會晤,ZAC決定開發一個數據倉庫及一系列數據集市,以解決如下問題:

 

Over the years, ZAC has developed a myriad of systems on mainframes, minicomputers, and even PCs. It built and/or bought other automobile manufacturing facilities, which resulted in even more disparate systems and databases. Currently, it has IBM 3090s, DEC VAXs, Tandems, Suns, and HPs, plus PCs and Macintoshes. Their data is spread out in DB2, VSAM and Enscribe files, Non-stop SQL, RDB, Oracle, Sybase, and Informix. End users have tools such as Paradox, Rbase, Microsoft Access, and Lotus Notes. Needless to say, the data is spread out in hundreds of disparate databases throughout the company, with many in inaccessible formats.

ZAC is just beginning an information engineering effort to reengineer its business. The first project the reengineering effort highlighted as critical to the survival of the company is a data warehouse containing information about its dealers’ car sales. The subject areas it has identified for this warehouse are Automobiles and Dealers, with less emphasis on Incentive Programs and Sales Organizations.

The impetus for the data warehouse is the fact that the data from these subject areas is not easily obtained today, causing opportunities to be lost, money to be wasted, and high-level executives to be uneasy about the direction and health of their company and their automotive sales. Based on interviews with key stakeholders, ZAC decided to undertake development of a data warehouse and a set of data marts that could answer the following questions:

■■每月銷售趨勢如何,即每個代理商、銷售片區、地區、大區、州和大都市圈(MSA)銷售的各款式、樣式、系列、顏色(MMSC)的數量和金額。

■■每月的庫存情況如何,即每個代理商、銷售片區、地區、大區、大都市圈(MSA),每個MMSC的庫存數量。

■■具有特定排放類型的已售汽車,其每月銷售數量和金額(按MMSC,並按代理商、工廠、銷售片區、地區、大區)與去年同期、前年同期相比如何。

■■每月實際銷售情況(數量與金額)的趨勢如何,即每個MMSC,每個代理商、銷售片區、地區、大區的實際銷售與計劃目標的比較。用戶需要這些信息,包括每月彙總數及年累計數(YTD)。

■■每月的歷史情況(兩年比較)如何,即零售代理商與批發代理商按MMSC的銷售數量及相應金額的對比。

■■每個MMSC每月累計銷售情況與去年累計的比較如何。

■■每月銷售趨勢是什麼,即激勵措施導致每個MMSC,每個代理商、銷售片區、地區、大區的銷售數量和金額。

■■每月平均時間趨勢如何,即每個代理商收到新車型後的銷售速度,每個MMSC,每個代理商、銷售片區、地區、大區的銷售數量和金額。

■■每月平均銷售價格如何。

■■每個代理商本月及全年被信用凍結(credit hold)多少天。此外,過去兩年中該代理商被信用凍結的總月數是多少。

■■車型改變前後的銷售情況對比。

 

■■ What is the monthly sales trend in terms of quantity and dollar amounts sold of each make, model, series, and color (MMSC) for a specific dealer, by each sales area, sales territory, and sales region, for each state and for each metropolitan statistical area (MSA)?

■■ What is the pattern in the monthly quantity of inventory by MMSC for each dealer, by each sales area, sales territory, sales region, and MSA?

■■ How do the monthly quantity and dollars of sold automobiles by MMSC having a particular emissions type, by Dealer, Factory, Sales Area, Sales Territory, and Sales Region, compare with the same time frame last year and the year before?

■■ What is the trend in monthly actual sales (dollars and quantities) of MMSC for each dealer, sales area, sales territory, and sales region compared to their objectives? Users require this information both by monthly totals and cumulative year to date (YTD).

■■ What is the history (two-year comparisons) of the monthly quantity of units sold by MMSC and associated dollar amounts by retail versus wholesale dealers?

■■ What are the monthly dollar sales and quantities by MMSC this year to date as compared to the same time last year for each dealer?

■■ What is the monthly trend in sales dollars and quantities by MMSC for particular types of incentive programs, by dealer, sales area, sales territory, sales region, and MSA?

■■ What is the monthly trend in the average time it takes a dealer to sell a particular MMSC (called velocity and equal to the number of days from when a dealer receives the car to the date it is sold) by sales area, sales territory, sales region, and MSA?

■■ What was the monthly average selling price of an MMSC for each dealer, sales area, sales territory, sales region, and MSA?

■■ How many days was a dealer placed on credit hold for this month only and for the entire year? In addition, what was the total number of months in the past two years that the dealer was put on credit hold?

■■ How do monthly sales dollars and quantities for the last body style (body style is make + model) compare to those for the current body style for each sales region? Body styles change every four years.
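Two of the measures in the questions above are defined precisely enough to sketch: velocity (the number of days from when a dealer receives a car to the date it is sold) and the monthly average selling price by MMSC. The sketch below is a minimal illustration, not a warehouse design; the record layout, dealer identifiers, dates, and prices are all made up.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical sale records:
# (dealer, MMSC as (make, model, series, color), date received, date sold, selling price)
sales = [
    ("D1", ("Zenith", "Zoom", "SF", "Red"), date(2003, 1, 5), date(2003, 1, 25), 21000.0),
    ("D1", ("Zenith", "Zoom", "SF", "Red"), date(2003, 1, 10), date(2003, 1, 20), 23000.0),
    ("D2", ("Tuxedo", "Tiara", "DF", "Black"), date(2002, 12, 15), date(2003, 1, 14), 48000.0),
]


def monthly_measures(records):
    """Group sales by (year sold, month sold, MMSC) and compute the unit count,
    the average velocity in days, and the average selling price."""
    groups = defaultdict(list)
    for _dealer, mmsc, received, sold, price in records:
        velocity_days = (sold - received).days  # days from receipt to sale
        groups[(sold.year, sold.month, mmsc)].append((velocity_days, price))
    return {
        key: {
            "units": len(rows),
            "avg_velocity_days": mean(days for days, _ in rows),
            "avg_price": mean(price for _, price in rows),
        }
        for key, rows in groups.items()
    }


measures = monthly_measures(sales)
for key, values in measures.items():
    print(key, values)
```

Adding the dealer, or the sales area, territory, and region derived from it, to the grouping key would give the same measures at the other levels the questions ask for.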

 

2.   主題域模型

數據倉庫按主題域組織,所以數據倉庫數據模型的方法論從主題域模型開始是非常自然的。數據倉庫的面向主題性把它和傳統的應用系統區分開。在傳統的操作型系統裏,雖然數據模型應該從主題域模型開始,但這一步常常被省略。因爲操作型系統面向特定的業務功能與流程,設計的重點在於高效地處理有關事務。因此,它的模型也調整爲以事務處理能力爲重點,使用它的流程大大影響數據的組織。在數據倉庫裏,面向主題性保留在物理數據庫設計的核心。核心的業務流程體現在源操作型系統中,以及人們用來從數據倉庫獲取數據的數據集市中,而核心的數據倉庫設計仍然是面向主題的。

就如我們在第2章指出的那樣,主題域是企業感興趣的物理項、概念、人員、地點、事件等的主要分組。我們也指出主題域模型可以快速開發。首次開發主題域模型的企業可以借鑑其他企業已經完成的工作,而不必從零開始。有很多主題域在各行業普遍適用,幾乎所有的組織都有客戶、供應商、產品和設施,這些都是候選的主題域。我們在本章後面將會講到,從一個通用的模型開始是一個好的起點,例如表3.3所示。

 

Subject Area Model

A data warehouse is organized by subject area, so it is only natural that the methodology for a data warehouse data model should begin with the subject area model. The subject-orientation of the data warehouse distinguishes it from a traditional application system. In the traditional operational system, although the data model should begin with a subject area model, this step is often omitted. Since the operational system is oriented toward specific business functions and processes, its design needs to emphasize the efficiency with which it can process the related transactions. Its model, therefore, is adjusted to emphasize the transaction-processing capabilities, with the processes that use it greatly influencing the data’s organization. With the data warehouse, the subject orientation remains at the core of the physical database design. The core business processes are depicted in the source operational systems and with the data marts that people use to obtain data from the data warehouse, but the core data warehouse design remains subject oriented.

As we indicated in Chapter 2, subject areas are major groupings of physical items, concepts, people, places, and events of interest to the enterprise. We also indicated that the subject area model can be developed very quickly. An organization developing its first subject area model can benefit from work performed by others so that it doesn’t need to start from scratch. There are many subject areas that are common across industries; virtually all organizations have customers, suppliers, products, and facilities. These are candidates for subject areas. A good point at which to start, as explained later in this chapter, is a generic model, such as the one shown in Table 3.3.

Table 3.3 Common Subject Areas   公共主題域

SUBJECT AREA 主題域 | DEFINITION 定義 | EXAMPLES 示例 | REMARKS 注意
Business Environment 業務環境 | Conditions, external to the company, which affect its business activities / 影響公司業務活動的外部條件 | •Regulation 規章制度 •Competition 競爭 •License 執照 | These are often not implemented in a data warehouse. / 這些一般不在數據倉庫裏實現。
Communications 溝通 | Messages and the media used to transmit the messages / 消息及用於傳達消息的媒體 | •Advertisement 廣告 •Audience 受衆 •Web Site Content 網站內容 | These often pertain to marketing activities, though they can apply to internal and other communications. / 這些通常與市場活動有關,但也適用於內部及其他溝通。
Customers¹ 顧客 | People and organizations who acquire and/or use the company’s products / 獲得並且/或者使用公司產品的人與組織 | •Customer 顧客 •Prospect 潛在顧客 •Consumer 消費者 | The definition provides for capturing potential customers (prospects) and for distinguishing between parties who buy the product and those who use it. / 這個定義既涵蓋潛在顧客,也區分購買產品者與使用產品者。
External Organizations 外部組織 | Organizations, except Customers and Suppliers, external to the company / 公司外部的組織,顧客和供應商除外 | •Competitor 競爭對手 •Partner 合作伙伴 •Regulator 監管機構 | The exclusion of Customers and Suppliers is consistent with the subject areas’ being mutually exclusive. / 排除顧客和供應商,與主題域之間互斥的原則一致。
Equipment 設備 | Movable machinery, devices, and tools and their integrated components / 可移動的機械、設備、工具及其集成部件 | •Computer 計算機 •Vehicle 車輛 •Crane 起重機 | Software that is integral to equipment is included within this subject area; other software is included within the Information subject area. / 作爲設備組成部分的軟件也包括在這個主題域裏;其他軟件包含在信息主題域裏。
Facilities 設施 | Real estate and structures and their integrated components / 不動產與建築物及其集成部件 | •Real Estate 不動產 •Building 建築物 •Mountain 山脈 | Integrated components (for example, an alarm system within a building) are often included as part of the facility unless a company is specifically interested in those components. / 集成部件(如一個建築物裏的警報系統)常常作爲設施的一部分,除非公司對這些部件特別關注。
Financials 財務 | Information about money that is received, retained, expended, or tracked by the company / 公司收取、留存、支出或跟蹤的資金的有關信息 | •Money 現金 •Receivable 應收 •Payable 應付 |
Human Resources¹ 人力資源 | Individuals who perform work for the company and the formal and informal organizations to which they belong / 爲公司工作的個人及他們所屬的正式、非正式的組織 | •Employee 職員 •Contractor 承包人 •Position 職位 | Includes prospective (for example, applicants) and former (for example, retirees) employees. Some companies prefer to establish the organizational structure within a separate subject area. / 包括未來的(例如應聘者)與曾經的(例如退休人員)職員。有些公司更喜歡在一個單獨的主題域裏建立組織結構。
Information 信息 | Facts and the information about facts and mechanisms that manage them / 事實與關於事實的信息及管理它們的機制 | •Application System 應用系統 •Database 數據庫 •Meta Data 元數據 | This includes the information about the company’s computing environment, and also includes non-electronic information. / 包括公司計算環境的信息,還包括非電子信息。
Locations 位置 | Geographical points or areas / 地理位置或區域 | •Geopolitical Boundary 地緣政治邊界 •Country 國家 •Address 地址 | This can be expanded to include electronic locations such as email addresses and phone numbers. / 這可以拓展到電子位置,例如email地址、電話號碼等。
Materials 原材料 | Goods and services that are used or consumed by the company or that are included in a piece of equipment, facility, or product / 公司使用或消耗的商品和服務,或者包含在設備、設施、產品中的商品和服務 | •Chemical 化工產品 •Fuel 燃料 •Supply 供應品 | Sometimes, a product is used as a component of another product. When this is the case, a relationship between the relevant entities will be indicated in the business data model. / 有時,一個產品是另一個產品的部件。這種情況下,相關實體之間的關係會在業務數據模型中指明。
Products 產品 | Goods and related services that the company or its competitors provide or make available to Customers / 公司或者其競爭對手向顧客提供的商品與相關服務 | •Product 產品 •Service 服務 •Advice 建議 | Competitor items that the company does not provide are often included to facilitate monitoring these and to support future decisions. / 公司不提供的競爭對手項目常常包含進來,以方便監控及支持將來的決策。
Sales 銷售 | Transactions that shift the ownership or control of a product from the Company to a Customer / 把產品的所有權或控制權從公司交給顧客的交易 | •Sales Transaction 銷售交易 •Sales Transaction Detail 銷售交易明細 •Credit Memo 貸項通知單 | Sales is actually an associative subject area in that it is the intersection of the Customer, Store, Product, and so on. Some companies may choose to include the entities related to sales within one of those subject areas instead. / 銷售事實上是一個關聯主題域,它是顧客、商店、產品等的交集。有些公司可能選擇把與銷售有關的實體放在這些主題域之一里。
Suppliers¹ 供應商 | Legal entities that provide the company with goods and services / 提供給公司商品和服務的法人實體 | •Broker 經紀人 •Manufacturer 製造廠家 •Supplier 供應商 | In the case of a contractor, the person doing the work is included in Human Resources, and the company that provides that person is included in Suppliers. / 對於承包人,執行工作的人在人力資源主題域裏,提供該人員的公司在供應商主題域裏。

 

 

1 另一個方法是用“夥伴”(Parties)主題域代替顧客、外部組織、人力資源和供應商。雖然夥伴是一個有用的概念,可以避免物理實現上的重複,但區分主要的夥伴(如顧客、外部組織、人力資源、供應商)能使主題域模型更易於理解和使用。

1 Another approach is to create “Parties” as a subject area in lieu of Customers, External Organizations, Human Resources, and Suppliers. While Parties may be a useful concept to avoid duplication in a physical implementation, distinguishing among the major parties (for example, Customers, External Organizations, Human Resources, and Suppliers) improves comprehension and usage of the subject area model.

 

進一步的指導,我們建議你考慮你所處的行業特徵,下一節描述開發主題域模型時要考慮的具體的行業特徵。

As a further aid, we recommend that you consider characteristics specific to your industry. The next section describes considerations for organizations in specific industries embarking on development of a subject area model.

 

2.1. 對具體行業的考慮

每個行業都有普遍適用於行業內各公司的特徵。理解這些差異,創建主題域模型的過程可以進一步簡化。一些示例如下:

Considerations for Specific Industries

Each industry has characteristics that are common to companies within that industry. By understanding these distinctions, the process of creating the subject area model can be further simplified. Some examples follow.

 

零售行業

建立零售行業的主題域模型特別要考慮以下問題:

■■在零售行業,重點是常常按層次劃分銷售組織。因此,這個行業的公司傾向於把表3.3中的人力資源主題域分成兩個主題域:人力資源和內部組織。

■■設施當然也是零售商感興趣的,而一種特殊的設施,商店,常常最受關注。因此,商店有時分出來作爲一個單獨的主題域。

■■零售商一般不創造產品,常常把所售的東西稱爲銷售項(Items)。這會替代產品主題域,並對定義做相應的調整。

Retail Industry Considerations

Special considerations for building the subject area model in the retail industry are:

■■ Within the retail industry, major emphasis is often placed on the sales organization hierarchy. Companies in this industry would, therefore, tend to separate the Human Resources subject area as described in Table 3.3 into two subject areas: Human Resources and Internal Organizations.

■■ While facilities are certainly of interest to retailers, one particular facility, the Store, is often of major interest. As a result, stores are sometimes distinguished as a separate subject area.

■■ Retailers typically don’t create products and often refer to what they sell as Items. This would replace the Products subject area, with the definition adjusted accordingly.

 

製造業:

建立製造業的主題域模型特別要考慮以下問題:

■■在製造業,製造設施受到特別關注,因此常常單獨作爲一個主題域。

■■製造過程常常產生廢物,而且有法律管理廢物。廢物有時作爲獨立的主題域。

Manufacturing Industry Considerations

Special considerations for building the subject area model in the manufacturing industry are:

■■ Within the manufacturing industry, the manufacturing facilities are of particular interest, so these are often distinguished within a separate subject area.

■■ Waste is often produced as part of the manufacturing process, and there are laws that govern the waste. Waste is sometimes isolated as a separate subject area.

 

公用事業

建立公用事業的主題域模型特別要考慮以下問題:

■■在公用事業行業,對發電設施(例如,發電廠)特別感興趣,這些可以分成獨立的主題域。

■■電力網絡或者輸氣管道包含物理與邏輯部件。物理部件由實際的電線、開關、管道、閥門等組成,邏輯部件由負載容量、網絡拓撲結構等組成。有時會把這些分成兩個獨立的主題域:設備處理物理部件,網絡處理邏輯部件。

Utility Industry Considerations

Special considerations for building the subject area model in the utility industry are:

■■ Within the utility industry, power-producing facilities (for example, power plants) are of particular interest, and these may be distinguished into separate subject areas.

■■ The electrical network or gas pipeline consists of both physical and logical components. The physical components consist of the actual wires, switches, pipes, valves, and so on; the logical components consist of the load-carrying capacity, network topology, and so forth. These are sometimes split into two subject areas with Equipment addressing the physical components and Networks addressing the logical components.

 

財產及意外傷害保險行業

建立財產及意外傷害保險行業的主題域模型特別要考慮以下問題:

■■財產及意外傷害保險行業常常處理保費、保單和索賠,每一種都被當作一個獨立的主題域。

■■在財務主題域,這些公司也需要處理準備金。因爲準備金的重要性,可以把它當作一個獨立的主題域。

■■顧客的定義需要調整,以包含擁有保單的一方及可能從索賠中受益的一方。在某些方面,這類似於購買產品的顧客和使用產品的消費者。

Property and Casualty Insurance Industry Considerations

Special considerations for building the subject area model in the property and casualty insurance industry are:

■■ The property and casualty insurance industry typically deals with premiums, policies, and claims. Each of these is usually treated as a separate subject area.

■■ In the Financials subject area, these companies also need to deal with reserves, and due to the importance of the reserves, they could be treated in a separate subject area.

■■ The definition of customer needs to be adjusted to incorporate the concept of the party that owns an insurance policy and the party that may benefit from a claim. In some respects, this is similar to the concept of the customer who buys a product and the consumer who uses it.

 

石油行業

建立石油行業的主題域模型特別要考慮的是:油井和煉油廠會描述爲設施,因爲他們在這個行業的重要性,每一個都要有一個單獨的主題域。

 

Petroleum Industry Considerations

A special consideration for building the subject area model in the petroleum industry is that wells and refineries could be described as facilities, but due to their significance within this industry, each deserves to be its own subject area.

 

醫療衛生行業

建立醫療衛生行業的主題域模型特別要考慮以下問題:

■■在醫療衛生行業,有幾種類型的供應商,包括保健設施、內科醫師,藥劑師等等。對每一類都要考慮他們在主題域模型的位置。

■■在有些醫療衛生行業的公司,唯一感興趣的顧客是患者,這樣,顧客主題域要改名爲患者。

Health Industry Considerations

Special considerations for building the subject area model in the health industry are:

■■ There are several types of suppliers in the health industry, including the healthcare facility, the physician, the pharmacist, and so on. Consideration needs to be given to each of these to determine their positioning in the subject area model.

■■ In some companies within the health industry, the only customer of interest is the patient, and the Customers subject area would then be named Patients.

2.2. 主題域模型開發過程

在本章的前面我們已經提出,主題域模型可以在幾天內完成開發。有三種主要開發主題域模型的方法:

■■封閉房間

■■面談

■■小型會議

每一種方法,你都可以從零開始,也可以用一個通用的模型開始,兩種方法都有效,選擇基於具體的愛好和背景。這三種主要的方法總結在表3.4,我們推薦使用第三種方法——小型會議——如果方便的話。在後面章節我們會解釋理由。

Subject Area Model Development Process

As stated earlier in this chapter, the subject area model can be developed in a matter of days. There are three major ways of developing the subject area model:

■■ Closed room

■■ Interviews

■■ Facilitated sessions

In each of the methods, you have the option of either starting from a clean slate or using a generic model as the starting point. Both approaches are valid, and the selection depends on the participants’ preferences and background. The three major methods are summarized in Table 3.4. We recommend that the third approach—the use of facilitated sessions—be used if feasible. We explain why in the sections that follow.

 

表3.4 主題域模型開發方法比較

方法 | 描述 | 好處 | 缺點
封閉房間 | 數據建模員基於已有的信息在隔離環境下開發主題域模型,然後提交審批。 | •建模員理解流程。 •模型可以快速開發。 | •建模員可能沒有足夠的業務知識。 •業務人員沒有主人翁意識。
面談 | 與關鍵業務代表逐一面談,建模員使用這些信息創建模型,然後提交審批。 | •每個人都有機會參與模型開發。 •參與者擁有業務知識。 •獲得一些業務所有權。 | •逐一面談花費更多的時間。 •雖然得到了業務知識,但是沒有取得一致意見。
小型會議 | 由一位主持人帶領一羣業務代表一起開發主題域模型。 | •參與者擁有業務知識。 •得到業務所有權。 •通過交流取得一致意見。 | •安排所需參與者的日程可能比較困難。

 

 

Table 3.4 Subject Area Model Development Options

METHOD | DESCRIPTION | ADVANTAGES | DISADVANTAGES
Closed Room | Data modeler(s) develop the subject area model in a vacuum, based on information they have, and then submit it for approval. | •Modelers understand the process. •A model can be developed quickly. | •Modelers may not possess sufficient business knowledge. •The business has no sense of ownership.
Interviews | Key business representatives are interviewed individually, and the modelers use this information to create the model. The result is then submitted for approval. | •Each person has the opportunity to contribute to the model. •Contributors possess the business knowledge. •Some business ownership is obtained. | •Individual interviews take more time. •While business knowledge is obtained, consensus isn’t built.
Facilitated Sessions | A facilitator leads a group of business representatives in the development of the subject area model. | •Contributors possess the business knowledge. •Business ownership is generated. •Consensus is developed through the interaction. | •Scheduling the required participants may be difficult.

 

 

 

2.2.1.    封閉開發

封閉開發指建模員獨自工作,很少或者根本沒有業務代表參與。它基於這樣一種理念:建模專長是開發主題域模型所需要的最重要的技能,且進一步假設建模員理解業務。當使用這種方法時,建模員基於他自己對業務的理解來開發模型。建模員通常嘗試把企業信息歸納成15~20個主要的組,每一個組作爲一個主題域。一旦完成,建模員爲每個主題域創建定義,並確保所有的定義互相排斥。

一般不推薦這種開發方法。建模員很少(如果有的話)對整個業務瞭解得足以開發一個持久的主題域模型。模型的某些方面更像藝術,而不是科學。例如,建模員需要決定是把人力資源作爲一個單獨的主題域,還是爲職員、承包人、應聘者等創建人力資源主題域,再加上一個獨立的內部組織主題域用於處理職位、組織層次、工作分類等。從建模來看,只要定義反映主題域的範圍,兩種方法都是正確的。這個決定常常基於人們的偏好,重要的是不應由建模員來做這個決定。使用這種方法開發的模型提交評審時,業務代表傾向於把它當作又一個信息技術活動,使它很難獲得支持。

有些情況下這個方法是必要的。如果建模員不能得到足夠的業務支持來創建模型,那麼只能在使用這種方法和沒有模型之間選擇。在這種情況下,有一個大致接近的模型總比沒有模型好。建模員應當充分預料到模型需要調整,並不斷爭取業務部門對模型提出建設性的批評。雖然在只有少許業務支持的情況下,主題域模型的工作可以繼續,但是如果開始開發業務數據模型時業務支持仍未到位,就要認真考慮終止項目了。

 

Closed Room Development

Closed room development entails the modelers working on their own with little or no involvement by business representatives. It is in keeping with a philosophy that the modeling expertise is the most important skill needed in developing the subject area model. It further presumes that the modeler understands the business. When this approach is used, the modeler develops the subject area model based on his or her perceptions of the business. The process that the modeler typically uses consists of trying to group the enterprise’s information into 15–20 major groupings, each of which would be a subject area. Once this is done, the modeler would create a definition for each one and would ensure that all of the definitions are mutually exclusive.
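One mechanical aid for the mutual-exclusivity requirement is to check that no entity has been assigned to more than one subject area. The sketch below is an assumption-laden illustration: it tests entity assignments rather than the wording of the definitions themselves, and the `SUBJECT_AREAS` mapping (loosely based on Table 3.3) and the `find_overlaps` function are hypothetical.

```python
from collections import defaultdict

# Hypothetical assignment of entities to subject areas, loosely following Table 3.3.
SUBJECT_AREAS = {
    "Customers": {"Customer", "Prospect", "Consumer"},
    "Suppliers": {"Broker", "Manufacturer", "Supplier"},
    "External Organizations": {"Competitor", "Partner", "Regulator"},
}


def find_overlaps(subject_areas):
    """Return each entity claimed by more than one subject area,
    mapped to the sorted list of areas that claim it."""
    owners = defaultdict(list)
    for area, entities in subject_areas.items():
        for entity in entities:
            owners[entity].append(area)
    return {
        entity: sorted(areas)
        for entity, areas in owners.items()
        if len(areas) > 1
    }


print(find_overlaps(SUBJECT_AREAS))

# A draft that also places Partner under Human Resources breaks exclusivity.
draft = dict(SUBJECT_AREAS)
draft["Human Resources"] = {"Employee", "Partner"}
print(find_overlaps(draft))
```

An empty result does not prove the definitions are well written; it only shows that the current assignments do not contradict one another.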

This approach is generally not recommended. The modeler rarely, if ever, fully understands the entire business sufficiently to create a durable subject area model. There are some aspects of the model that are more art than science. For example, the modeler needs to decide whether to keep Human Resources as a single subject area or to create a Human Resources Subject Area for the employees, contractors, applicants, and so on, and a separate Internal Organizations Subject Area for the positions, organizational hierarchy, job classifications, and so on. Either approach is correct from a modeling perspective, as long as the definitions reflect the scope of the subject area. The decision is often based on people’s preferences and it is important that the modeler not be the one to make this decision. When a model developed using this approach is subsequently presented for review, the business representatives are prone to treat this as another information technology exercise, thus making it difficult to garner support for it.

There are circumstances under which this approach is necessary. If the modeler cannot get sufficient business support to create the model, then the choice becomes whether to use this approach or to have no model. When this situation exists, it is better to have a model that is likely to be close than to have no model at all. The modeler should fully expect that adjustments will be needed and should continuously try to gain constructive business criticism of the model. While work on the subject area model can proceed with minimal business support, if the business support is not forthcoming when work on the business data model begins, serious consideration should be given to halting the project.

2.2.2.    面談開發

面談是從各個業務代表處獲得信息的優秀方法。安排面談的第一個挑戰是決定需要誰參與。因爲主題域模型代表整個企業，從組織結構圖開始是一個好方法。建模員應該會見能代表企業內主要部門的人員，無論是憑他們現在的職位還是以前的職位。

會見10至15人應該就能合理地代表整個企業，每個人都要描述他/她所在領域的高層次工作流。使用這些信息，面談者應當嘗試確定每個人感興趣的主要信息組及它們之間的相互作用。“與銷售經理面談”補充欄給出了一個面談示例。

Development through Interviews

Interviews provide an excellent means of obtaining information from individual business representatives. The first challenge in setting up the interviews is determining who needs to participate. Since the subject area model represents the entire enterprise, a good place to start is the organizational chart. The modeler should interview people who represent the major departments in the enterprise either by their current position or by virtue of their previous positions.

A reasonable representation of the enterprise should be available by interviewing 10–15 people. Each of these people should be asked to describe the high-level workflow in his or her area. Using this information, the interviewer should try to identify the major groupings of information of interest to each person and the interactions among them. A sample interview is provided in the “Interview with the Sales Executive” sidebar.

與銷售經理面談

以下是示例會談的開始:

採訪者:早上好,Jim(銷售副總)。非常感謝你從百忙之中抽出時間與我談談。

(採訪者應當簡要介紹談話的目的及從JIM那裏得到信息的重要性)

採訪者:請你概要的說說銷售過程。

銷售副總:客戶來到我們商店四處看看,選擇他們想要的東西,把它們放進購物車。在收款臺,這些物品通過電子掃描,終端會提醒銷售員介紹促銷產品,然後銷售員詢問客戶是否對這些感興趣,並嘗試獲得客戶的電話號碼。如果數據庫中已經存在,銷售員會確認客戶姓名和地址;如果是一個新客戶,銷售員嘗試獲得客戶的姓名和地址並輸入數據庫。我們已經成功地獲取了70%的客戶信息。然後,客戶離開商店。

採訪者:基於我們的討論,我定義了幾個主要感興趣的事情:客戶、商店、銷售人員、銷售事務和商品,對不對?

銷售副總：我們的銷售人員可以獲得促銷信息，這給我們帶來很大價值。我想這也很重要。

採訪者:謝謝,我遺漏了這點。讓我們談談這些事情之間的關係。客戶來到這個商店——客戶可以從其他地方購買嗎?

銷售副總:現在沒有,但是我們正考慮建立一個電子商務平臺。

採訪者:是所有的客戶都是個人客戶,還是有一些作爲組織的代表?

銷售副總:我們的客戶可能是消費者,也可能是企業代表。

採訪者:對待這兩類客戶有什麼不同嗎?

(面談繼續,採訪者基於從回答者獲得的信息提出更深入的問題)

Interview with the Sales Executive

Following is the beginning of a sample interview:

Interviewer: Good morning, Jim (vice president of sales). I appreciate your taking time from your busy schedule to speak with me this morning.

 (The interviewer would then briefly explain the purpose of the interview and the importance of getting information from Jim’s perspective.)

Interviewer: Please describe the sales process to me at a high level.

Sales VP: Our customers come into our store and look around. They select items that they would like and then place them in a cart. At the checkout counter, the items they’ve selected are electronically scanned. The terminal alerts the salesperson to promotional items, then the salesperson asks the customer about his or her interest in these. The salesperson also tries to obtain the customer’s phone number. If it is already in our database, the salesperson confirms the customer’s name and address; if it is a new one, the salesperson tries to get the customer’s name and address and enters them into our database. We’ve been successful in obtaining information to identify about 70 percent of our customers. The customer then leaves the store.

Interviewer: Based on our discussion, I’ve identified the following major things of interest: customers, stores, salespeople, sales transactions, and items. Is that correct?

Sales VP: We gain a lot of value from having the promotional information available to our salespeople. I think that’s important, too.

Interviewer: Thanks, I missed that one. Let’s take a look at the relationships among these things. The customer comes into the store—can customers buy the items elsewhere?

Sales VP: Not at this time, but we’re considering establishing an electronic commerce facility.

Interviewer: Are all the customers individual consumers, or are some considered representatives of organizations?

Sales VP: Our customers may be either consumers or representatives of businesses.

Interviewer: Is there any difference in the treatment of the two types of customers?

(Interview continues with the interviewer delving further into items based on the answers received.)

 

面談的主要產物應該是一系列主題域及其定義(從受訪者的觀點)。得到的這些信息用於幫助創建主題域模型,也爲業務模型提供信息。通過深入面談,我們較好的利用了業務代表的時間。之後創建業務模型時,我們可以以面談得到的這些信息作爲開始,把重點放在確認和提煉工作上。

One of the major products of the interview should be a set of subject areas and definitions from that person’s perspective. The information obtained will help create the subject area model and will also provide information for the business data model. By delving further within this interview, we make better use of the business representatives’ time. When we subsequently work on the business data model, we can start with the information we obtained from these interviews and then focus on confirmation and refinement.

技巧

在面談前準備好一些問題，但不要期望全部用到。這些問題提供一個好的檢查單，以保證覆蓋所有關鍵點；然而，一個好的採訪者會根據所提供的信息及其提供方式調整面談。面談全部結束後，建模員要整合這些信息。建模員可能會收到一些矛盾的信息，這些矛盾需要解決。有時，解決方法可能是採用最一般化的情況，但有時需要通過討論來澄清這些分歧。最後的主題域模型要提供給每一個受訪者確認。根據個人的職位與技術傾向，確認過程可以通過一次簡短的討論進行，而不是提交模型供其評審。

 

TIP

Go to an interview prepared with a set of questions, but don’t expect to use them all. The questions provide a good checklist for ensuring that key points are covered; however, a good interviewer adjusts the interview to reflect the information being provided and the way that it is provided. Once the interviews are completed, the modeler needs to consolidate the information. It is possible that the modeler will receive conflicting information, and these conflicts need to be resolved. Sometimes, the resolution may be one of using the most generalized case, but at other times, a discussion to clarify the differences may be needed. The resultant subject area model should be provided to each of the interviewees for verification. Depending upon the person’s position and technical disposition, the verification may be conducted through a brief discussion rather than through submission of the model for review.

 

 

 

2.2.3.    通過小型會議開發

作者發現使用小型會議是最快速有效的方法。會議的參與者包括各個業務領域的代表，這與面談一樣。最大的不同是，代表之間互相交流，而不是單獨提供意見。雖然有時把這些人召集到一起很困難，但一旦做到，模型可以很快完成，並反映業務代表們認同的折中方案。主要的步驟是：準備、一到兩次小型會議、會議之間的工作，以及後續工作。如果開發組從頭開始，需要兩次會議；如果開發組使用一個現成的模型，可能一次會議就可以完成工作。

Development through Facilitated Sessions

The approach that the authors have found to be the most effective and efficient is the use of facilitated sessions. These sessions involve representatives of the various business areas, just as the interviews do. The significant difference is that the people are interacting with each other instead of providing individual contributions. While it is sometimes difficult to get the people together, when this is accomplished, the product is completed very quickly and reflects compromises with which the business representatives agree. The major steps in the process are preparation, one or two facilitated sessions, work between the facilitated sessions, and follow-on work. If the group is starting from a clean slate, two facilitated sessions will be needed; if the group is using a starter model, it may be possible to complete the effort in one session.

 

準備

準備工作包括選擇和邀請參與者及後勤安排。至少要在會議前一到兩週開始準備。會議成功的一個關鍵點是要讓參與者理解會議的目的、過程及他們的角色。這要在邀請函中描述清楚。

Preparation

Preparation consists of selecting and inviting the participants and making the logistical arrangements. The preparation should be performed at least one to two weeks before the session. One of the keys to a successful session is to ensure that the participants understand the purpose, the process, and their role. These should be described in the invitation letter.

第一次會議

第一次會議的議程應包括以下幾項:

介紹:參與者作自我介紹,討論會議目標。

培訓:對有關概念和過程進行培訓。

頭腦風暴:頭腦風暴用於開發一個潛在主題域清單。

提煉:總結與提煉主題域清單,得到主題域。

結論:回顧會議結果,安排模型創建。

這個議程假設開發組從頭開始,如果開發組以一個普通的或行業模型開始,可以使用下列議程:

介紹:參與者作自我介紹,討論會議目標。

培訓:對有關概念和過程,及開始模型進行培訓。

討論及提煉主題域:討論開始模型裏的主題域,推導出一系列主題域,對這些主題域的定義進行討論和提煉。

提煉:總結與提煉主題域清單,得到主題域。

第一次會議議程中非常重要的一部份是培訓。在會議的培訓部份,演示者解釋什麼是主題域,如何區分及定義他們,它爲何對後續的模型有益。會議過程(例如頭腦風暴)與規則一起描述。

First Facilitated Session

The agenda for the first session should include the following items:

Introductions. The participants introduce themselves, and the session objectives are reviewed.

Education. Education is provided on the relevant concepts and on the process.

Brainstorming. Brainstorming is used to develop a list of potential subject areas.

Refinement. The list of potential subject areas is reviewed and refined to arrive at the set of subject areas.

Conclusion. The session results are reviewed, and assignments for definition creation are made.

 

This agenda presumes that the group will be starting with a clean slate. If the group starts with a generic or industry model, the following agenda would apply:

Introductions. The participants introduce themselves, and the session objectives are reviewed.

Education. Education is provided on the relevant concepts, on the process, and on the starter model.

Review and refinement of subject areas. The subject areas in the starter model are reviewed, and a set of subject areas is derived. Definitions for those subject areas are then reviewed and refined.

Refinement. The list of potential subject areas is reviewed and refined to arrive at the set of subject areas.

A critical part of the agenda for the first session is education. During the educational portion of the meeting, the facilitator explains what a subject area is, how it should be identified and defined, and why the resultant model is beneficial. The processes (for example, brainstorming) to be employed are also described along with the rules for the facilitated session.

技巧

如果組內有些成員瞭解這些概念，而有些不瞭解，可以在真正的會議之前組織一次培訓會議。這爲參與者提供了選擇，而不必強迫已經懂這個議題的人參加多餘的培訓。

 

TIP

If some members of the group understand the concepts and others don’t, consider having an educational session before the actual facilitated session. This provides the attendees with a choice and does not force people who know the topic to attend redundant education.

本節的剩餘部分假設開發組不以一個現成模型開始。在培訓會議之後,開發組投入一個頭腦風暴會議提出潛在的主題域。在頭腦風暴會議裏,所有的發言都被記錄下來,不經過任何討論。因此,對人們定義主題域,及報表、過程、功能、實體、屬性、組織等等來說不是罕見的。圖3.1顯示一個頭腦風暴會議的潛在結果,事例爲Zenith 汽車公司這樣的製造廠商。如果你仔細地審視這些圖表,你會看到大多數第二頁及部分第三頁指出了太細的細節。當這種情況發生時,指導者應該提醒開發組 定義主題域。

The remainder of this section presumes that the group is not beginning with a starter model. Following the educational session, the group engages in a brainstorming session to identify potential subject areas. In a brainstorming session, all contributions are recorded, without any discussion. It is, therefore, not uncommon for people to identify reports, processes, functions, entities, attributes, organizations, and so on, in addition to real subject areas. Figure 3.1 shows the potential result of such a brainstorming session for an automobile manufacturer such as the Zenith Automobile Company. If you look closely at the flip charts, you’ll see that most of the second sheet and part of the third sheet deviated into too great a level of detail. When this happens, the facilitator should remind the group of the definition of a subject area.

 

 

下一步是檢查這些項目並排除不是潛在主題域的項目。每個項目都要討論，如果它不符合潛在主題域的定義，就被移除，並可能被能表達該概念且符合主題域定義的項目代替。這個過程完畢後，清單上的主題域會減少，如圖3.2所示。一些轉換活動如下：

The next step in the process is to examine the contributed items and exclude items that are not potential subject areas. Each item is discussed and, if it does not conform to the definition of a potential subject area, it is removed and possibly replaced by something that conveys the concept and could conform to the definition of a subject area. When this process is over, there will be fewer subject areas on the list, as shown in Figure 3.2. Some of the transformation actions that took place follow:

■■物品和產品用來指同一樣事情,選用“汽車”術語,因爲所有的物品和產品都是由汽車而來。而且,這些包含轎車、油漆、豪華轎車、部件、組件、發動機、二手車。

■■客戶和消費者用於指同一樣事情,選用“客戶”術語。潛在客戶也被吸收到這個領域。

■■變遷報表和銷售分析報表用“報表”表示,並不再使用。

■■市場用“功能”表示,且不再使用,在討論過程裏,增加了廣告和促銷主題。

■■信用卡和貸款組合成付款方式。

■■職員和承包商組合成人力資源。

■■代理權和代理商認爲是相同的,代理商選作主題域名稱。

 

■■ ITEMS and PRODUCTS were determined to be the same thing, and AUTOMOBILES was selected as the term to be used since all the products and items were driven by the automobiles. Further, these were found to encompass CARS, PAINT, LUXURY CAR, PARTS, PACKAGES, MOTORS, and USED CARS.

■■ CUSTOMER and CONSUMER were determined to be the same thing and CUSTOMERS was selected as the term to be used. PROSPECTS was absorbed into this area.

■■ VARIANCE REPORT and SALES ANALYSIS REPORT were determined to be reports and eliminated.

■■ MARKETING was determined to be a function and was eliminated. During the discussion, ADVERTISEMENTS and PROMOTIONS were added.

■■ CREDIT CARD and LOAN were grouped into PAYMENT METHODS.

■■ EMPLOYEES and CONTRACTOR were combined into HUMAN RESOURCES.

■■ DEALERSHIPS and DEALERS were deemed to be the same, and DEALERS was chosen as the subject area.
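The transformation actions above amount to three operations on the brainstormed list: eliminate items that are not data groupings, merge synonyms and finer-grained items under one agreed name, and add concepts that surface during discussion. The following Python sketch illustrates that bookkeeping under stated assumptions: the `brainstormed` list is only an excerpt taken from the bullets above (the full flip charts of Figure 3.1 are not reproduced here), and the names `eliminated`, `merged_into`, `added`, and `refine` are hypothetical, not part of the book's method.

```python
# Excerpt of brainstormed items, taken from the bullets above (not the full Figure 3.1).
brainstormed = [
    "ITEMS", "PRODUCTS", "CARS", "CUSTOMER", "CONSUMER", "PROSPECTS",
    "VARIANCE REPORT", "SALES ANALYSIS REPORT", "MARKETING",
    "CREDIT CARD", "LOAN", "EMPLOYEES", "CONTRACTOR",
    "DEALERSHIPS", "DEALERS",
]

# Items the group judged to be reports or functions rather than subject areas.
eliminated = {"VARIANCE REPORT", "SALES ANALYSIS REPORT", "MARKETING"}

# Synonyms and finer-grained items mapped to the agreed subject area name.
merged_into = {
    "ITEMS": "AUTOMOBILES", "PRODUCTS": "AUTOMOBILES", "CARS": "AUTOMOBILES",
    "CUSTOMER": "CUSTOMERS", "CONSUMER": "CUSTOMERS", "PROSPECTS": "CUSTOMERS",
    "CREDIT CARD": "PAYMENT METHODS", "LOAN": "PAYMENT METHODS",
    "EMPLOYEES": "HUMAN RESOURCES", "CONTRACTOR": "HUMAN RESOURCES",
    "DEALERSHIPS": "DEALERS",
}

# Concepts added while MARKETING was being discussed.
added = ["ADVERTISEMENTS", "PROMOTIONS"]


def refine(items):
    """Drop eliminated items, apply merges, and keep first-seen order without duplicates."""
    result = []
    for item in items + added:
        if item in eliminated:
            continue
        name = merged_into.get(item, item)
        if name not in result:
            result.append(name)
    return result


refined = refine(brainstormed)
print(refined)
```

Keeping the eliminations and merges as explicit data, rather than editing the list by hand, leaves a record of why each brainstormed item disappeared, which can help when a participant later asks where a contribution went.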

這樣整理之後，清單應只包含數據分組，但是其中有些比其他的更重要。下一步，要求開發組再次檢查列表並嘗試把這些項目分組。例如，圖3.2中列出了倉庫、分發中心和工廠，倉庫和分發中心可以分組爲一個潛在的主題域“設施”，工廠也作爲一個主題域。這個過程完成後，最有可能的候選主題域已經確定，如圖3.3所示。

The resultant list should consist solely of data groupings, but some may be more significant than others. Next, the group is asked to look at the list and try to group items together. For example, WAREHOUSES, DISTRIBUTION CENTERS, and FACTORIES are shown in Figure 3.2. WAREHOUSES and DISTRIBUTION CENTERS could be grouped into a potential subject area of FACILITIES, with FACTORIES also established as a subject area. When this process is over, the most likely candidates for the subject areas will have been identified, as shown in Figure 3.3.

這樣第一次小型會議實際上就完成了。在準備第二次會議的過程中，每一個主題域要分給兩個人，每個人寫出主題域的定義草稿，並確定該主題域至少包含的三個實體。（有些人可能不只負責一個主題域。）這個工作應該在會議之後很快完成並且提交給指導者。應當告知小組成員，在兩次會議之間的那一天，指導者會使用這些信息及主題域模型模版信息（如果有）來爲第二次會議提供起點。

This virtually completes the first facilitated session. In preparation for the next session, each subject area should be assigned to two people. Each of these people should draft a definition for the subject area and should identify at least three entities that would be included within it. (Some people may be responsible for more than one subject area.) The work should be completed shortly following the meeting and submitted to the facilitator. The group should be advised that on the intervening day, the facilitator uses this information and information from subject area model templates (if available) to provide a starting point for the second session.

 

鞏固並準備第二次小型會議

在兩次會議之間（可能只有一天時間），指導者回顧定義與示例實體，並使用這些創建一個已定義的主題域列表，用於第二次會議。指導者應當創建一個文檔，顯示所收到的意見及一項建議。例如，對於主題域“客戶”，可能收到以下意見：

意見1：“客戶是那些購買或者考慮購買我們產品的人。”示例實體有客戶、批發商、潛在客戶。

意見2：“客戶是那些獲取我們產品用於內部消費的組織。”示例實體有客戶、客戶子公司、採購代理。

主題域模版信息（前面表3.3所示）提供了客戶的定義“獲得和/或使用公司產品的個人與組織”，並且提供客戶、潛在客戶、消費者作爲示例實體。使用這些信息，指導者可以包含如表3.5所示的“客戶”信息。每一個主題域都要提供類似的信息。

 

Consolidation and Preparation for Second Facilitated Session

During the period (potentially as little as one day) between the two facilitated sessions, the facilitator reviews the definitions and sample entities and uses these to create the defined list of subject areas that will be used in the second facilitated session. The facilitator should create a document that shows the contributions provided, along with a recommendation. For example, for the subject area of Customers, the following contributions could have been made:

Contribution 1. “Customers are people who buy or are considering buying our items.” Sample entities are Customer, Wholesaler, and Prospect.

Contribution 2. “Customers are organizations that acquire our items for their internal consumption.” Sample entities are Customer, Customer Subsidiary, and Purchasing Agent.

 

The subject area template information (previously shown in Table 3.3) provides a definition of Customers as “People and organizations who acquire and/or use the company’s products,” and provides Customer, Prospect, and Consumer as sample entities. Using this information, the facilitator could include the information for CUSTOMERS shown in Table 3.5. Similar information would be provided for each of the subject areas.
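One way a facilitator might line up the two contributions and the template before drafting a recommendation is to tally which sample entities every source agrees on and which appear in only some of them. The sketch below uses the sample entities quoted above; the `sources` structure and the `agreed` and `disputed` names are illustrative assumptions, and the actual recommendation (Table 3.5) remains a judgment call rather than a computed result.

```python
from collections import Counter

# Sample entities quoted in the text for the Customers subject area.
sources = {
    "Contribution 1": ["Customer", "Wholesaler", "Prospect"],
    "Contribution 2": ["Customer", "Customer Subsidiary", "Purchasing Agent"],
    "Template (Table 3.3)": ["Customer", "Prospect", "Consumer"],
}

# Count how many sources mention each sample entity.
tally = Counter(entity for entities in sources.values() for entity in entities)

# Entities named by every source are uncontroversial starting points.
agreed = [entity for entity, count in tally.items() if count == len(sources)]

# Entities named by only some sources are worth raising in the second session.
disputed = sorted(entity for entity, count in tally.items() if count < len(sources))

print("Agreed:", agreed)
print("To discuss:", disputed)
```

The disputed entities point at the real difference between the two contributions: one treats Customers as people, the other as organizations, which is exactly the kind of scope question the definition has to settle.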

 

第二次會議

第二次會議的議程包括以下幾項:

回顧:回顧第一次會議的成果及以後所作的工作。

提煉:回顧和提煉主題域及他們的定義。

關係：創建每對主題域之間的主要關係。

結論:複習模型,討論沒有解決的問題,定義下一步的工作。

Second Facilitated Session

The agenda for the second session should include the following items:

Review. The results of the first session and the work performed since then are reviewed.

Refinement. The subject areas and their definitions are reviewed and refined.

Relationships. Major relationships between pairs of subject areas are created.

Conclusion. The model is reviewed, unresolved issues are discussed, and follow-up actions are defined.

第二次會議的成功高度依賴於每個參與者是否及時完成分派的任務及指導者編輯的文檔。在每個主題域的討論時間,會指出他們的限制。如果主題域沒有在分配的結束時間完成,完成剩下工作的責任會分給小組其他成員。常常,剩下的工作會包含提煉定義的用詞(而不是意義)。

在所有的主題域討論完成之後,定義主題域之間的主要關係,並繪製主題域草圖。這個步驟是最不嚴格的,因爲主題域的關係可以自然地從業務數據模型得出。第二次會議一個嚴格的最後步驟是開發問題列表及下一步的計劃。

The success of the second session is highly dependent on each of the participants completing his or her assignment on time and on the facilitator compiling a document that reflects the input received and best practices. A limit should be placed on the discussion time for each subject area. If the subject area is not resolved by the end of the allotted time, the responsibility to complete the remaining work should be assigned to a member of the team. Often, the remaining work will consist of refining the wording (but not the meaning) of the definition.

After all of the subject areas have been discussed, the major relationships among the subject areas are identified and the resultant subject area diagram is drawn. This step is the least critical one in the process because the subject area relationships can be derived naturally from the business data model as it is developed. A critical final step of the second facilitated session is the development of the issues list and action plan.

下一步工作

問題列表和工作計劃是第二次會議非常重要的產品,因爲它提供了保證下一步工作完成的方法。問題列表包含會議中提出的需要解決的問題。每一項都應包含負責人的名稱和期限。工作計劃大概列出開發主題域模型剩下的工作步驟。常常,會議的產品能很快應用於支持業務數據模型的開發,在提煉工作完成以後。

 

Follow-on Work

The issues list and action plan are important products of the second facilitated session, since they provide a means of ensuring that the follow-on work is completed. The issues list contains questions that were raised during the session that need to be resolved. Each item should include the name of the person responsible and the due date. The action plan summarizes the remaining steps for the subject area model. Often, the product of the session can be applied immediately to support development of the business data model, with refinements being completed over time based on their priority.

2.2.4.    主題域模型的好處

不管主題域模型能開發可以多快,也只有在能得到好處的情況下才值得付出努力。在第二章已經列出了三個主要的好處:

主題域模型指導業務模型的開發。

它影響數據倉庫項目選擇。

它指導數據倉庫開發項目。

主題域模型是一個幫助建模員組織工作及幫助爲數據庫倉庫的工作的多個項目小組之間瞭解領域之間的重疊部分。工具條顯示主題域模型如何用於輔助數據倉庫項目的定義和選擇。

Subject Area Model Benefits

Regardless of how quickly the subject area model can be developed, the effort should be undertaken only if there are benefits to be gained. Three major benefits were cited in Chapter 2:

■■ The subject area model guides the business data model development.

■■ It influences data warehouse project selection.

■■ It guides data warehouse development projects.

The subject area model is a tool that helps the modeler organize his or her work and helps multiple teams working on data warehouse projects recognize areas of overlap. The sidebar shows how the subject area model can be used to assist in data warehouse project definition and selection.

 

2.3. Zenith汽車公司的主題域模型

Zenith 汽車公司潛在的主題域模型如圖3.5。只顯示出了需要回答業務問題和客戶的主題域。

數據倉庫項目定義與選擇

圖3.4顯示需要回答Zenith汽車公司業務問題的主要主題域。

Subject Area Model for Zenith Automobile Company

A potential subject area model for the Zenith Automobile Company is provided in Figure 3.5. Only the subject areas needed to answer the business questions and Customers are shown.

Data Warehouse Project Definition and Selection

Figure 3.4 shows the primary subject areas that are needed to answer the business questions for the Zenith Automobile Company.

使用圖3.4的信息，一個合乎邏輯的實現順序是：首先開發汽車、代理商、銷售組織主題域，因爲實際上所有的問題都依賴於它們。工廠或激勵程序可以在下一步開發，緊接着開發其中剩下的另一個。對於提出的業務問題，不需要關於客戶和供應商的任何信息。即使業務認爲問題3或問題7是最重要的，也不應該首先處理它們。得出這個結論的理由是，爲了回答這些問題，你仍然需要其他三個主題域的信息。

這是一個迭代開發方法的示例,這樣數據倉庫增量創建,一直盯着最終的目標。

Using the information in Figure 3.4, a logical implementation sequence would be to develop the Automobiles, Dealers, and Sales Organizations subject areas first since virtually all the questions are dependent on them. Factories or Incentive Programs could be developed next, followed by the remaining one of those two. For the business questions posed, no information about Customers and Suppliers is needed. Even if the business considered question 3 or 7 to be the most significant, they should not be addressed first. The reason for this conclusion is that in order to answer those questions, you still need information for the other three subject areas.

This is an example of the iterative development approach whereby the data warehouse is built in increments, with an eye toward the final deliverable.
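The sequencing argument can be made concrete by counting how many business questions depend on each subject area and building the most widely needed areas first. The sketch below is a minimal illustration in Python. The `needs` matrix is hypothetical: it stands in for Figure 3.4, which is not reproduced here, and the assignment of Factories to question 3 and Incentive Programs to question 7 is an assumption made only so the example runs.

```python
from collections import Counter

core = {"Automobiles", "Dealers", "Sales Organizations"}

# Hypothetical stand-in for Figure 3.4: subject areas needed by each business question.
needs = {
    1: set(core),
    2: set(core),
    3: core | {"Factories"},            # assumption: question 3 needs Factories
    4: set(core),
    5: set(core),
    6: set(core),
    7: core | {"Incentive Programs"},   # assumption: question 7 needs Incentive Programs
}

# Count the questions that depend on each subject area.
counts = Counter(area for areas in needs.values() for area in areas)

# Most widely needed subject areas first; ties are broken alphabetically.
sequence = sorted(counts, key=lambda area: (-counts[area], area))

# Questions answerable once only the first three subject areas are in place.
first_increment = set(sequence[:3])
answerable = sorted(q for q, areas in needs.items() if areas <= first_increment)

print(sequence)
print(answerable)
```

Under these assumptions the first increment already answers every question except 3 and 7, and neither of those can be answered without the same three subject areas, which is why they should not be scheduled first even if the business ranks them highest.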

每一個主題域定義如下:

■■汽車是一種交通裝置,相關的部件由Zenith汽車公司製造並由代理商銷售。

■■客戶是從代理商處獲得汽車及其部件的夥伴。

■■代理商是授權出售Zenith汽車公司製造的汽車和部件的經銷人。

■■工廠是Zenith汽車公司製造汽車及其部件的設施。

■■激勵程序是鼓勵汽車銷售的一些考量。

■■銷售組織是代理商按照利潤的信息的分組。

圖3.6提供了一個零售公司的潛在主題域模型。這個模型作爲第5章到第8章的學習案例參考。

 

Definitions for each subject area follow:

■■ Automobiles are the vehicles and associated parts manufactured by Zenith Automobile Company and sold through its dealers.

■■ Customers are the parties that acquire automobiles and associated parts from Dealers.

■■ Dealers are agencies authorized to sell Zenith Automobile Company automobiles and associated parts.

■■ Factories are the facilities in which Zenith Automobile Company manufactures its automobiles and parts.

■■ Incentive Programs are financial considerations designed to foster the sale of automobiles.

■■ Sales Organizations are the groupings of Dealers for which information is of interest.

Figure 3.6 provides a potential subject area model for a retail company. This model is provided as a reference point for some of the case studies used in Chapters 5–8.

 

每一個主題域的定義示例如下:

■■溝通指通訊及用於通訊的媒體。

■■客戶是取得或使用公司商品的人或組織。

■■設備是可移動的機械、裝置、工具及其集成組件。

■■人力資源爲公司完成工作的個人。

■■財務是關於公司收取、保有、期望或跟蹤金錢等的信息。

■■內部組織是人力資源所屬的正式或非正式的分組。

■■物品是公司或其競爭對手提供的商品或服務。

■■位置是地理點或區域。

■■其他設施是實體資產或其他構造,及其集成組件,商店除外。

■■銷售是把物品所有權或控制權從公司交給客戶的過程。

■■商店是銷售發生的地方,包括小貨攤。

■■賣主是製造或提供物品給公司的法人實體。

 

Sample definitions for each of the subject areas follow.

■■ Communications are messages and the media used to transmit the messages.

■■ Customers are people and organizations who acquire and/or use the company’s items.

■■ Equipment is movable machinery, devices, and tools and their integrated components.

■■ Human Resources are individuals who perform work for the company.

■■ Financials is information about money that is received, retained, expended, or tracked by the company.

■■ Internal Organizations are formal and informal groups to which Human Resources belong.

■■ Items are goods and services that the company or its competitors provide or make available to Customers.

■■ Locations are geographic points and areas.

■■ Other Facilities are real estate and other structures and their integrated components, except stores.

■■ Sales are transactions that shift the ownership or control of an item from the Company to a Customer.

■■ Stores are places, including kiosks, at which Sales take place.

■■ Vendors are legal entities that manufacture or provide the company with items.

3.   業務數據模型

我們在第2章已經說明,模型是事物的抽象和表示,表示或者演示原事物的部分或全部。業務模型是一種模型,它是數據在一個指定業務環境的抽象和表示,它幫助人們直觀的瞭解業務信息之間的關係(“各部分如何組合起來”) 。應用業務數據模型的產品包括應用系統,數據倉庫,數據集市。而且,模型提供數據庫的元數據(即關於數據的信息),幫助人們理解和使用最終的數據。主題域模型提供業務數據模型的基礎,而且這個模型減少應用系統正確反映業務環境的開發風險。

 

Business Data Model

As we explained in Chapter 2, a model is an abstraction or representation of a subject that looks or behaves like all or part of the original. The business data model is one type of model, and it is an abstraction or representation of the data in a given business environment. It helps people envision how the information in the business relates to other information in the business (“how the parts fit together”). Products that apply the business data model include application systems, the data warehouse, and data mart databases. In addition, the model provides the meta data (or information about the data) for these databases to help people understand how to use or apply the final product. The subject area model provides the foundation for the business data model, and that model reduces the development risk by ensuring that the application system correctly reflects the business environment.

 

3.1. 業務數據開發過程

如果象本節描述的,業務數據模型還不存在,那麼在着手數據倉庫數據模型之前應該先進行這部分開發。開發業務數據模型的過程不能缺少第一步:定義參與者。在理想情況下,數據管理者與建模員聯合開發業務數據模型。大多數公司沒有正式的數據管理人員,而且業務中心( 有時是信息技術中心)可能也看不到開發業務數據模型的價值。總而言之,它耽誤了寫代碼!業務數據模型的好處已經在第二章羅列,但是在缺乏正式的數據管理人員的情況下,數據建模員需要指定關鍵業務代表,他們必須具有必要的知識和權利去做出有關數據定義和關係的決定。這些常常叫做“主題專家”,簡稱SME。一旦這些被確定,建模員需要得到他們的委託開始建模活動。這不是小事,常常需要做出折衷。例如,SME可能更願意回答問題及評審進度,但是不願意參與建模進程。在建模員理解了這些參與層次後,他/她應該評估減少SME參與完成模型的風險。

 

 

Business Data Development Process

If a business data model does not exist, as is assumed in this section, then a portion of it should be developed prior to embarking on the data warehouse data model development. The process for developing the business data model cannot be described without first defining the participants. In the ideal world, the data stewards and the data modelers develop the business data model jointly.

Most companies do not have formal data stewardship programs, and the business community (and sometimes the information technology community) may not see any value in developing the business data model. After all, it delays producing the code! The benefits of the business data model were presented in Chapter 2, but in the absence of formal data stewards, the data modeler needs to identify the key business representatives with the necessary knowledge and the authority to make decisions concerning the data definitions and relationships. These are often called “subject matter experts” or SMEs (pronounced “smeeze”). Once these people are identified, the modeler needs to obtain their commitment to participate in the modeling activities. This is no small chore, and often compromises need to be made. For example, the SMEs may be willing to answer questions and review progress, but may not be willing to participate in the modeling sessions. After the modeler understands the level of participation, he or she should evaluate the risk of the reduced SME involvement to the accuracy and completeness of the model.

 

然後,建模員應該調整他/她的精力與時間,如果在SME委託的任務與計劃任務之間存在顯著的分歧時。開發一個完整的業務數據模型可能需要6——12個月,在這段時間內,沒有切實的業務產出。然而,這可能是一個理論上正確的方法,實際上很少這樣做。我們建議使用以下方法:

1、  定義主題域,用於項目迭代需要的數據。

2、  定義主題域內感興趣的實體,並建立標識符。

3、  指定實體之間的關係。

4、  增加屬性。

5、  確認模型結構。

6、  確認模型內容。

本節剩餘部分描述這6個活動。

Then, the modeler should adjust his or her effort estimate and schedule if there is a significant difference between the SMEs’ committed level of involvement and the level that was assumed when the plan was created. Development of a complete business data model can take 6 to 12 months, with no tangible business deliverable being provided during that timeframe. While this may be the theoretically correct approach, it is rarely a practical one. We recommend using the following approach:

1. Identify the subject area(s) from which data is needed for the project iteration.

2. Identify the entities of interest within the affected subject area(s) and establish the identifiers.

3. Determine the relationships between pairs of entities.

4. Add attributes.

5. Confirm the model’s structure.

6. Confirm the model’s content.

The remainder of this section describes these six activities.

 

3.1.1.    定義有關的主題域

在這個案例中,主題域要回答的問題如圖3.5所示,他們是:汽車代理商,工廠,激勵機制,銷售組織。在這個主題域模型中,還有其他的主題域,但是這些在首次迭代數據倉庫時不需要。主題域模型的第一個應用給我們提供一個快速限定工作範圍的方法。如果我們僅僅着眼於裏面的幾個問題,我們能進一步減少範圍。例如,假設我們第一次迭代不回答問題3——7,那我們就不需要工廠和激勵機制的信息,也不需要任何關於客戶的信息。能夠排除這些主題域是非常重要的。例如,客戶數據,事實上是最難取得的數據之一。如果在開始時業務只關注與銷售統計,那麼關於具體客戶的數據能夠排除在第一次迭代的範圍之外。這樣,避免了對客戶進行普遍的定義及解決多個客戶文件的集成問題(在汽車業,關於客戶的信息需要從代理商處獲取)。請記住,排除一個主題域,並不是降低它的重要性,僅僅是降低定義業務規則的緊迫性,讓數據倉庫的後續提交物能夠早點實現,如圖3.7 。類似地,在開發主題域的細節時,應該把焦點放在本次跌代需要用到的實體上。如圖3.7指出了使用主題域模型限定範圍的好處。首先,這個項目可以分爲多次迭代,每一次都比整個項目短。第二,迭代是可以重疊的(如果資源允許),這樣進一步縮短了整了項目的時間。例如,第一次迭代的分析和建模一旦完成,就可以進行第二次迭代的分析建模,同時進行首次迭代的開發。有些工作需要其他工作已經完成,但是這可以通過很好的計劃避免。能快速提供一個業務提交品,往往值得這個風險。

 

 

Identify Relevant Subject Areas

The subject areas with information needed to answer the questions posed in the Zenith Automobile Company scenario are shown in Figure 3.5. These are: Automobiles, Dealers, Factories, Incentive Programs, and Sales Organizations.

There are other subject areas in the subject area model, but these do not appear to be needed for the first few iterations of the data warehouse. This first application of the subject area model provides us with a quick way of limiting the scope of our work. We could further reduce our scope if we want to address only a few of the questions. For example, let’s assume that the first iteration doesn’t answer questions 3 and 7. To answer the remaining questions, we don’t need any information from Factories and Incentive Programs, nor do we need information about Customers for any of the questions.

Being able to exclude these subject areas is extremely important. Customer data, for example, is one of the most difficult to obtain accurately. If the business is initially interested in sales statistics, then information about the specific customers can be excluded from the scope of the first iteration of the model. This avoids the need to gain a common definition of “customer” and to solve the integration issues that often exist with multiple customer files. (In the automotive industry, information about Customers requires cooperation from the Dealers.) It is important to remember that excluding a subject area has no bearing on its importance—it only has a bearing on the urgency of defining the business rules governing that area and hence the speed with which the next business deliverable of the data warehouse can be created, as shown in Figure 3.7. Similarly, in developing the details for the other subject areas, the focus should remain on the entities needed for the iteration being developed. Figure 3.7 points out several benefits of using the subject areas to limit scope.

First, the project can be subdivided into independent iterations, each of which is shorter than the full project. Second, the iterations can often overlap (if resources are available) to further shorten the elapsed time for completing the entire effort. For example, once the analysis and modeling are completed for the first iteration, these steps can begin for the second iteration, while the development for the first iteration proceeds. Some rework may be needed as additional iterations are pursued, but this can often be avoided through reasonable planning. The value of providing the business deliverables more quickly is usually worth the risk.
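The effect of overlapping iterations on elapsed time can be checked with simple arithmetic. The following Python sketch compares a strictly sequential schedule with one in which analysis and modeling for the next iteration start as soon as the previous iteration's analysis ends. The durations (4 weeks of analysis and modeling, 8 weeks of development, three iterations) are hypothetical, the function `elapsed_weeks` is illustrative only, and the model ignores the rework the text mentions.

```python
def elapsed_weeks(iterations, overlap):
    """Return total elapsed weeks for a list of (analysis, development) durations."""
    analysis_free = 0  # week when the analysis/modeling team is next available
    dev_free = 0       # week when the development team is next available
    for analysis, development in iterations:
        # Without overlap, analysis waits for the previous iteration's development.
        start = analysis_free if overlap else max(analysis_free, dev_free)
        analysis_free = start + analysis
        # Development needs this iteration's analysis and a free development team.
        dev_free = max(analysis_free, dev_free) + development
    return dev_free


# Hypothetical plan: three iterations, each 4 weeks analysis/modeling + 8 weeks development.
iterations = [(4, 8), (4, 8), (4, 8)]

sequential = elapsed_weeks(iterations, overlap=False)
overlapped = elapsed_weeks(iterations, overlap=True)
print(sequential, overlapped)
```

With these assumed durations the sequential plan takes 36 weeks and the overlapped plan 28 weeks; the first business deliverable arrives after 12 weeks in both cases, well before the full effort ends.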

 

3.1.2.    定義主要的實體並建立標識符

一個實體是公司感興趣的一個人,地點,事情,事件,或者一個概念,且公司有能力和意願獲取這些信息。常常通過聽一個用戶描述業務,或者閱讀一個地區的描述文檔,或者與主題域專家面談的過程可以獲得實體。我們得出結論,汽車、代理商、銷售組織是回答前三個問題需要的信息。讓我們檢查:銷售。

潛在實體應當通過頭腦風暴會議、面談、分析獲得。不期望初始列表很完善。當模型開發出來後,實體要增加到列表中去,而原來的一些項目可能會刪除,根據數據倉庫首次迭代的具體情況而定。要定義每一個實體,但是在花太多時間在實體上前,建模員應該快速決定這個實體是否在本次迭代範圍內。這樣做的好處是顯然的,定義一個實體需要花費時間,如果在意見不統一的時候還需要相當的討論。在等到需要這個實體時,不只度過了時間,SME們也更加傾注於定義的工作,因爲他們他們工作的重要性。

最後,模型會轉換成物理數據庫,每個表需要一個主鍵唯一定義一個實例。我們因此應該給每一個實體設計一個標識符。既然這是業務建模,我們不需要考慮這個標識符的物理屬性,因此,我們可以對每一個實體簡單的創建一個主鍵屬性,叫“[實體名]ID”或者“[實體名]編碼”。ID與編碼的不同在“實體與屬性建模習俗”裏說明。大多數的建模工具產生需要的外鍵,我們的模型會包含層疊的外鍵。“實體與屬性建模習俗”欄列出了一些我們建模時常用的實體和屬性命名方法。表3.6是這個活動的產出物。

 

Identify Major Entities and Establish Identifiers

An entity is a person, place, thing, event, or concept of interest to a company and for which the company has the capability and willingness to capture information. Entities can often be uncovered by listening to a user describe the business, by reviewing descriptive documents for an area, and by interviewing subject matter experts. We concluded that information from three subject areas—Automobiles, Dealers, and Sales Organizations—is needed to address the first three questions. Let’s examine Sales.

Potential entities should be developed through a brainstorming session, interviews, or analysis. The initial list should not be expected to be complete. As the model is developed, entities will be added to the list and some items initially inserted in the list may be eliminated, particularly for the first iteration of the data warehouse. Each of the entities needs to be defined, but before spending too much time on an entity, the modeler should quickly determine whether or not the entity is within the scope of the data warehouse iteration being pursued. The reason for this screening is obvious—defining an entity takes time and may involve a significant amount of discussion if there is any controversy. By waiting until an entity is needed, not only is time better spent, but the SMEs are also more inclined to work on the definition since they understand the importance of doing so.

Eventually, the model will be transformed into a physical database with each table in that database requiring a key to uniquely identify each instance. We therefore should designate an identifier for each entity that we will be modeling. Since this is a business model, we need not be concerned with the physical characteristics of the identifier; therefore, we can simply create a primary key attribute of “[Entity Name] Identifier” or “[Entity Name] Code” for each entity. The difference between Identifier and Code is described in the “Entity- and Attribute-Modeling Conventions” sidebar, which shows the entity-modeling conventions we’ve adopted. Most modeling tools generate foreign keys when the relationships dictate the need and, by including the identifier, our model will include the cascaded foreign keys. The “Entity- and Attribute-Modeling Conventions” sidebar summarizes the conventions we used to name and define entities and attributes. Table 3.6 presents the results of this activity for the entities of interest for the business questions that need to be answered.
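The identifier convention can be sketched in a few lines. This is only an illustration: the function name and the entity list are assumptions, with the entities taken loosely from the automobile example and not from Table 3.6.

```python
def identifier_attribute(entity_name, kind="Identifier"):
    """Return the business-model key attribute name for an entity.

    kind is either "Identifier" or "Code", following the
    "[Entity Name] Identifier" / "[Entity Name] Code" pattern.
    """
    if kind not in ("Identifier", "Code"):
        raise ValueError("kind must be 'Identifier' or 'Code'")
    return f"{entity_name} {kind}"


# Hypothetical entity list for illustration only.
entities = ["Automobile", "Dealer", "Factory", "Option Package"]
keys = {name: identifier_attribute(name) for name in entities}

for entity, key in keys.items():
    print(f"{entity}: {key}")
```

Physical details such as data type or length are deliberately left out, since the business model is not concerned with them at this stage.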

 

實體與屬性建模習俗

每個企業都應該建立給實體與屬性命名、定義的規則。實體與屬性表示面向業務的視圖,命名習俗不僅侷限於物理約束。一些要考慮的習俗如下。

實體命名習俗包括:

每一個實體應當有一個唯一的名字。

實體名應當首字母大寫(介詞和連詞除外)

實體名應當由面向業務的術語組成。

使用不縮寫的全詞。

在單詞之間使用空格。

使用單數名詞。

避免冠詞,介詞和連詞。

名字的長度沒有限制(一個好的名字應該是Bill to Customer,一個不好的名字是BTC,或者Bill-to-Cust)。

 

Entity- and Attribute-Modeling Conventions

The rules for naming and defining entities and attributes should be established within each enterprise. Entities and attributes represent business-oriented views, and the naming conventions are not limited by physical constraints. Some of the conventions to consider are as follows.

Entity naming conventions include:

Each entity should have a unique name.

The entity name should be in title case (that is, all words except for prepositions and conjunctions are capitalized).

Entity names should be composed of business-oriented terms:

Use full, unabbreviated words.

Use spaces between words.

Use singular nouns.

Avoid articles, prepositions, and conjunctions.

The length of the name is not limited. (A good entity name would be Bill to Customer; a poor one would be BTC or Bill-to-Cust.)
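Some of these entity naming conventions can be checked mechanically. The sketch below is a minimal, assumed implementation: the function name and the list of small words are hypothetical, and it covers only title case, spacing, and all-capital abbreviations. Detecting a shortened word such as "Cust" or a plural noun would need a dictionary, which is omitted here.

```python
import re

# Small words that title case leaves in lowercase (illustrative list).
SMALL_WORDS = {"to", "of", "for", "by", "in", "on", "and", "or"}


def entity_name_problems(name):
    """Return a list of convention violations found in an entity name."""
    problems = []
    if re.search(r"[-_]", name):
        problems.append("use spaces between words")
    for word in re.split(r"[\s\-_]+", name.strip()):
        if word.lower() in SMALL_WORDS:
            if word != word.lower():
                problems.append(f"'{word}' should be lowercase")
        elif len(word) > 1 and word.isupper():
            problems.append(f"'{word}' looks like an abbreviation")
        elif not word[:1].isupper():
            problems.append(f"'{word}' should be capitalized")
    return problems


for candidate in ["Bill to Customer", "BTC", "Bill-to-Cust"]:
    print(candidate, "->", entity_name_problems(candidate) or "ok")
```

The sidebar's good example passes, and both poor examples are flagged, although "Bill-to-Cust" is caught only because of its hyphens.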

 

屬性命名習俗包括:

屬性名應包含一到多個主單詞,零到多個修飾符,一個類型單詞。

主單詞描述項目,它常常屬性所在的實體同名。

限制符進一步描述項目。

類型詞(如數量,名稱)是項目類型的描述。

每個屬性應有一個在實體內唯一的名字。如果同一個屬性,除了主詞以外(如過期日期、狀態)用於多個實體,它應永遠有同樣的定義。

屬性名應首字母大寫。

每個屬性名應由面向業務的術語組成。

使用不縮寫的全詞,名字的長度沒有限制。

在單詞之間使用空格。

使用單數名詞。

避免冠詞,介詞,連接,如then,and等。

 

Attribute naming conventions include:

Attribute names should contain one or more prime words, zero or more modifiers, and one class word.

The prime word describes the item. It is often the same as the name of the entity within which the attribute belongs.

The modifier (qualifier) is a further description of the item.

The class word (for example, amount, name) is a description of the type of item.

Each attribute should have a unique name within an entity. If the same attribute, except for the prime word (for example, expiration date, status) is used in several entities, it should always have the same definition.

The attribute name should be in title case.

Each attribute name should be composed of business-oriented terms:

Use full, unabbreviated words. The length of the name is not limited.

Use spaces between words.

Use singular nouns.

Avoid articles, prepositions, and conjunctions such as “the” and “and.”
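The prime word, modifier, and class word structure lends itself to a small helper. The following is a sketch under stated assumptions: the word order (prime words, then modifiers, then the class word), the function name, and the list of class words are all illustrative choices, not rules given in the sidebar.

```python
# Illustrative class words; an enterprise would maintain its own list.
CLASS_WORDS = {"Amount", "Name", "Date", "Code", "Identifier", "Quantity", "Description"}


def attribute_name(prime_words, class_word, modifiers=()):
    """Compose an attribute name from prime words, optional modifiers, and a class word."""
    if not prime_words:
        raise ValueError("at least one prime word is required")
    if class_word not in CLASS_WORDS:
        raise ValueError(f"unknown class word: {class_word!r}")
    return " ".join([*prime_words, *modifiers, class_word])


print(attribute_name(["Store"], "Date", ["Inception"]))  # Store Inception Date
print(attribute_name(["Dealer"], "Name"))                # Dealer Name
```

Restricting class words to an agreed list is one way to keep attributes with the same class word consistently defined across entities.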

 

實體和屬性定義習俗包括:

定義要使用一致的形式。

定義應自滿足的。

定義應清晰、簡潔。

定義不應嵌套遞歸,不要用同樣的單詞來定義自己。

定義應面向業務。

定義應互斥。

定義應獨立於物理系統約束。

 

Entity and attribute definition conventions include:

Definitions should use consistent formats.

Definitions should be self-sufficient.

Definitions should be clear and concise.

Definitions should not be recursive. A word should not be used to define itself.

Definitions should be business-oriented.

Definitions should be mutually exclusive.

Definitions should be independent of physical system constraints.

 

在業務模型裏,我們可提供一個屬性用於描述(同時避免連接到一個實體把代碼翻譯成描述) .僅僅當我們把模型遷移到數據倉庫時需要代碼,用於保證使用有效的代碼( 域約束可以實現) 或者節省存貯空間.當創建數據模型時,我們使用代碼-描述實體.

 

In the business model, we can provide an attribute for the description (and avoid having a reference entity for translating the code into the description). The code is needed only when we migrate to the data warehouse, where it is used either to ensure that only valid codes are used (domain constraints can also accomplish this) or to reduce the storage requirements. We create code-description entities when we build the data warehouse model.
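The difference between the two treatments can be sketched as follows. The color codes, attribute names, and function names are hypothetical; the point is only that the business model carries the description directly, while the warehouse model carries a code that is validated and translated through a code-description entity.

```python
# Hypothetical code-description reference entity for automobile colors.
COLOR_DESCRIPTIONS = {"RD": "Red", "BL": "Blue", "SL": "Silver"}


def validate_color_code(code):
    """Domain-style check: accept only codes present in the reference entity."""
    if code not in COLOR_DESCRIPTIONS:
        raise ValueError(f"unknown color code: {code!r}")
    return code


def describe_color(code):
    """Translate a valid code into its description."""
    return COLOR_DESCRIPTIONS[validate_color_code(code)]


# Business model: the description is an attribute of the entity itself.
business_row = {"Automobile Identifier": 1001, "Color Description": "Red"}

# Data warehouse model: the row stores the shorter code instead.
warehouse_row = {"Automobile Identifier": 1001, "Color Code": "RD"}

print(describe_color(warehouse_row["Color Code"]))  # Red
```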

3.1.3.    定義關係

在維護所有數據模型時,有必要使用建模工具.下面列出了市面上的一些常用工具.每一種工具都有各自的優缺點,但是每一種都只要提供基本的建模功能.每一種工具版本之間的不同在本書不討論.

常用的數據建模工具包括;

■■Erwin, Computer Associates 公司

■■ ER Studio , Embarcadero 公司

■■ Oracle Designer,  Oracle公司

■■ Silverrun , Magna Solutions公司

■■ System Architect,  Popkin公司

■■ Visio , Microsoft 公司

■■ Warehouse Designer ,  Sybase 公司.

 

Define Relationships

A modeling tool is essential for developing and maintaining all data models. Some of the common tools on the market follow. There are advantages and disadvantages to each of the tools, but every one of them performs at least the basic modeling functions. The differences among the tools change with each release and hence are not described in this book.

 

Common data modeling tools include:

■■ ERwin by Computer Associates

■■ ER Studio by Embarcadero

■■ Oracle Designer by Oracle

■■ Silverrun by Magna Solutions

■■ System Architect by Popkin

■■ Visio by Microsoft

■■ Warehouse Designer by Sybase

 

關係圖形化的描述業務規則.下面列出在業務數據模型需要反應的部分業務規則:

■■ 汽車按製造廠家、款式、系列、顏色分類。

■■汽車在工廠製造。

■■ 一個選項包包含各種選項,每個選項都可以包含在多個選項包內。

■■ 汽車包含零到多個選項包。

■■ 汽車分配給代理商。

■■ 汽車由代理商銷售。

這個規則可以通過和有關的主題域專家討論獲得。下一步是定義每對實體間的關係。圖3.8 顯示支持這些問題的實體。

 

The relationships diagrammatically portray the business rules. Following is a partial set of business rules that need to be reflected in the business data model.

■■ An automobile is classified by make, model, series, and color.

■■ An automobile is manufactured in a factory.

■■ An option package contains one or more options, each of which may be included in several option packages.

■■ An automobile contains zero, one, or more option packages.

■■ An automobile is allocated to a dealer.

■■ An automobile is sold by a dealer.

These rules would be uncovered through discussions with appropriate subject matter experts. The next step in the process is to define the relationships between pairs of entities. Figure 3.8 shows the entities needed in the model to support these questions.
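Before drawing the diagram, it can help to write the business rules down as structured relationships between pairs of entities. The sketch below is one assumed way to do that. The class, its fields, and the minimum cardinalities not stated in the rules (for example, whether a dealer must have at least one automobile) are illustrative guesses that a subject matter expert would need to confirm.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Relationship:
    parent: str
    verb: str
    child: str
    min_children: int
    max_children: Optional[int]  # None means "many"

    def sentence(self):
        upper = "many" if self.max_children is None else str(self.max_children)
        return (f"Each {self.parent} {self.verb} "
                f"{self.min_children} to {upper} {self.child}(s).")


# Cardinalities marked "assumed" are not given in the business rules.
rules = [
    Relationship("Factory", "manufactures", "Automobile", 0, None),        # assumed
    Relationship("Option Package", "contains", "Option", 1, None),
    Relationship("Option", "is included in", "Option Package", 0, None),   # assumed minimum
    Relationship("Automobile", "contains", "Option Package", 0, None),
    Relationship("Dealer", "is allocated", "Automobile", 0, None),         # assumed
    Relationship("Dealer", "sells", "Automobile", 0, None),                # assumed
]


def entities_in(relationships):
    """Return the distinct entity names that appear in the relationships."""
    return sorted({r.parent for r in relationships} | {r.child for r in relationships})


for rule in rules:
    print(rule.sentence())
```

Reading each generated sentence back to a business representative is a simple way to confirm or correct the rule.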

 

小貼士:

業務數據模型的另一個信息來源是已經存在的其他系統。當使用已有系統系統時,建模員需要認識物理數據庫的技術約束,以及設計者的一些假設(往往缺乏文檔)。因此,不僅在業務模型裏要考慮,在某些情況下,可能是模型的輸入。發現的與已經存在系統的任何不同都需要記錄好文檔,用於轉換規則。

 

TIP

Another source of information for the business data model is the database of an existing system. While this is a source, the modeler needs to recognize that the physical database used by a system reflects technical constraints and other (frequently undocumented) assumptions made by the person who designed it as well as erroneous or outdated business rules. It should, therefore, not be considered to be the business model, but it certainly can be used as input to the model. Any differences discovered in using a database from an existing system should be documented. These will be used when the transformation rules are developed.

3.1.4.      增加屬性

屬性是實體的事實或離散信息片。這樣的一個屬性已經包含在圖裏——標示符。其他屬性要回答其他的業務問題。例如,關於庫齡的問題。基於這個需求,存貯初始日期需要作爲一個屬性。

小貼士:

在業務模型裏,隨着時間改變的信息應儘可能貼上日曆標籤。例如,與其存貯庫齡,不如記錄倉庫開放日期及最後維修日期。在數據倉庫模型裏,我們可以選擇只存貯出生日期,或者既存貯出生日期,又存貯年齡。如果我們要做年齡分佈分析,我們選擇在集市裏存貯年齡分佈。(如果我們這樣做,我們需要包含更新年齡分佈的邏輯,否則,集市需要在每個裝入週期重建)。

 

Add Attributes

An attribute is a fact or discrete piece of information pertaining to an entity. One such attribute has already been included in the diagram—the identifier. At this point, additional attributes needed to answer the business questions of interest are added. For example, the questions involving the Store requested information on the store’s age. Based on that requirement, the store inception date should be added as an attribute.

 

TIP

In the business model, information that changes with time should be tied to calendar dates whenever possible. For example, instead of store age, the date the store was opened or last renovated should be shown. In the data warehouse model, we have options on whether to store just the date of birth or both the date of birth and the age. If we’re doing analysis based on a range of ages, we may choose to store the age range in the mart. (If we choose this option, we will need to include logic for updating the age range unless the mart is rebuilt with each load cycle.)
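A short sketch shows why the date is the better thing to store. The age and the age range are derived from the inception date and a reference date, so a stored age range goes stale unless it is recalculated. The function names, the five-year range width, and the sample dates are assumptions made for the example.

```python
from datetime import date


def age_in_years(start, as_of):
    """Whole years elapsed between start and as_of."""
    years = as_of.year - start.year
    # Subtract one year if the anniversary has not yet occurred in as_of's year.
    if (as_of.month, as_of.day) < (start.month, start.day):
        years -= 1
    return years


def age_range(years, width=5):
    """Label the range of the given width that contains the age, e.g. '5-9'."""
    low = (years // width) * width
    return f"{low}-{low + width - 1}"


inception = date(1998, 6, 15)  # hypothetical store inception date

for load_date in (date(2003, 6, 14), date(2003, 6, 15)):
    years = age_in_years(inception, load_date)
    print(load_date, years, age_range(years))
```

One day apart, the same store moves from the 0-4 range to the 5-9 range, which is the update logic a mart would need if it stored the range and was not rebuilt on each load.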

 

數據倉庫模型設計的一個難題是預見業務用戶最終想要的屬性。既然業務數據模型主要用於支持數據倉庫,那麼問題清單也要在此列出。導致這個困難的部分原因是因爲業務用戶不知道自己真正需要什麼,直到他們使用這個系統的時候纔會發現。找出這些潛在需求的來源有目前的報表、查詢、源系統。這個問題在第4章更深入的討論,那是建立數據倉庫模型的第一步。

“實體屬性建模習俗”欄列出了一些我們常用於定義實體和屬性名稱的習慣。圖3.9顯示了擴展的模型,包含了屬性。針對實體,我們期望增加、刪除、改變模型。

 

The difficulty with a data warehouse data model is anticipating the attributes that business users will eventually want. Since the business data model is being built primarily to support the warehouse, that problem manifests itself at this point. Part of the reason for the difficulty is that the business users truly do not know everything they need. They will discover some of their needs as they use the environment. Some sources to consider in identifying the potential elements are existing reports, queries, and source system databases. This area is discussed more thoroughly in Chapter 4 as part of the first step of creating the data warehouse data model.

The “Entity- and Attribute-Modeling Conventions” sidebar summarizes the conventions we used to name and define attributes. Figure 3.9 shows the expanded model, with the attributes included. As was the case with the entities, we should expect additions, deletions, and changes as the model continues to evolve.

 

3.1.5.    評審模型結構

業務數據模型應滿足第三範式(在第二章有說明)。簡言之,在第三範式裏,每一個屬性都依賴於實體的鍵,且是全部鍵,且只有主鍵。

記住,業務模型不需要提供好的性能,它永遠不實現,那時數據倉庫、操作型系統、數據集市等的後續模型需要的。在這個階段,第三範式提供最大的靈活性、穩定性,一致性。

 

Confirm Model Structure

The business data model should be presented in what is known as “third normal form.” The third normal form was described in Chapter 2. By way of summary, in the third normal form, each attribute is dependent on the key of the entity in which it appears, on the whole key, and on nothing but the key.

Remember that the business model does not need to provide good performance. It is never implemented. It is the basis of subsequent models that may be used for a data warehouse, an operational system, or data marts. For that usage, the third normal form provides the greatest degree of flexibility and stability and ensures the greatest degree of consistency.
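One informal way to spot a third normal form problem is to look for a non-key attribute that is determined by another non-key attribute. The sketch below checks this against sample rows. The function name and the data are hypothetical, and sample data can only refute a dependency or suggest one; it cannot prove a business rule.

```python
def determines(rows, determinant, dependent):
    """Return True if every value of determinant maps to one value of dependent."""
    seen = {}
    for row in rows:
        key = row[determinant]
        # setdefault records the first dependent value seen for this key.
        if seen.setdefault(key, row[dependent]) != row[dependent]:
            return False
    return True


# Hypothetical Automobile rows that also carry the dealer's name.
rows = [
    {"Automobile Identifier": 1, "Dealer Identifier": 10, "Dealer Name": "North Motors"},
    {"Automobile Identifier": 2, "Dealer Identifier": 10, "Dealer Name": "North Motors"},
    {"Automobile Identifier": 3, "Dealer Identifier": 20, "Dealer Name": "Lakeside Autos"},
]

print(determines(rows, "Dealer Identifier", "Dealer Name"))      # True
print(determines(rows, "Dealer Name", "Automobile Identifier"))  # False
```

Here Dealer Name appears to depend on Dealer Identifier and not on the Automobile key, which suggests it belongs in a Dealer entity.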

 

小貼士

業務數據模型一個純粹的視圖就是它是一個第三範式模型,只考慮邏輯視圖(不考慮物理視圖)。

好幾個數據建模工具存貯物理屬性,即實體對應表,屬性對應字段。理論家僅僅在模型使用到具體應用(如數據倉庫)時才增加這些。

一些實際的方法是業務數據模型裏包含物理模型的信息。因爲,好幾個應用會使用這個業務模型,而且都是從拷貝這個模型的有關部分開始。如果不只一個應用需要同一個實體,那每一個應用都要花力氣建立每個字段的數據類型等物理特性。這種複製工作減少了潛在的不一致性。更好的方法是使用建模工具創建部分信息。

熟練的建模員使用域定義來減少工作量,並提供靈活性。建模工具的域特性用於定義有效值、數據類型、非空屬性等等。域的一個應用是這些特性的唯一組合,而不是定義每一列的物理特性,而只賦予域。這還提供未來變化的靈活性,進一步減少工作量。

TIP

A purist view of the business data model is that it is a third normal form model that is concerned only with the logical (and not physical) view. Several of the data modeling tools store information about the physical characteristics of the table for each entity and about the physical characteristics of the column for each attribute. The theoretician would address these only when the model is applied for an application such as the data warehouse.

A more practical approach is to include some information pertaining to the physical model in the business model. The reason for this is that several applications will use the business model, and they start by copying the relevant section of the model. If more than one application needs the same entity, then each is forced to establish the physical characteristics such as datatype for the resultant table and its columns. This creates duplicate effort and introduces a potential for inconsistency. A better approach is to create some of this information within the business model in the modeling tool.

The use of domain definitions is another technique that experienced modelers use to minimize work and provide flexibility. The domain feature of the modeling tool can be used to define valid values, data types, nullability, and so on. One application of domains is to establish one for each unique combination of these, then instead of defining each of the physical characteristics of a column, it is merely assigned to a domain. In addition to reducing the workload, this provides the flexibility to accommodate future changes.
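The domain technique can be sketched outside any particular modeling tool. In the example below, the class, the domain names, the data types, and the column assignments are all hypothetical. Each column is assigned to a domain and does not carry its own physical characteristics, so changing the domain changes every column that uses it.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Domain:
    name: str
    data_type: str
    nullable: bool
    valid_values: Optional[FrozenSet[str]] = None  # None means any value is allowed

    def accepts(self, value):
        """Check a value against the domain's nullability and valid values."""
        if value is None:
            return self.nullable
        return self.valid_values is None or value in self.valid_values


# One domain per unique combination of characteristics (illustrative).
DOMAINS = {
    "Name": Domain("Name", "VARCHAR(60)", nullable=False),
    "Optional Date": Domain("Optional Date", "DATE", nullable=True),
    "Yes No Flag": Domain("Yes No Flag", "CHAR(1)", nullable=False,
                          valid_values=frozenset({"Y", "N"})),
}

# Columns are assigned to a domain and are not defined individually.
COLUMN_DOMAINS = {
    "Dealer Name": "Name",
    "Factory Name": "Name",
    "Store Renovation Date": "Optional Date",
}


def column_accepts(column, value):
    """Validate a value using the domain assigned to the column."""
    return DOMAINS[COLUMN_DOMAINS[column]].accepts(value)


print(column_accepts("Dealer Name", None))            # False
print(column_accepts("Store Renovation Date", None))  # True
```

Lengthening every name column then means editing the single "Name" domain, not each column.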

 

3.1.6.    評審模型內容

 

開發業務模型的最後一步,也可能是最重要的一步是評審模型內容,這通過與業務代表討論完成,使用的技術很多。在與業務用戶會談時,建模員必須提醒,模型是最終產品的表示,它是一種描述業務的技術,這種方式方便系統和數據倉庫的開發。一些業務代碼可能既願意也有能力評審模型,而另一些,可能需要建模員使用平常的語言提出各種問題,來驗證業務規則和定義。例如,建模員可能需要引導一個面談,通過向業務代表提問的方式來確認每一個業務規則。

Confirm Model Content

The last, and possibly most important, step in developing the business data model is to verify its content. This is accomplished through a discussion with business representatives. The techniques used vary. In meeting with the business users, the modeler must remember that the model is a means to an end. It is a technique for describing the business in a way that facilitates the development of systems and data warehouses. Some business representatives may be both willing and able to review the actual model. With others, the modeler may need to ask questions in plain English that verify the business rules and definitions. For example, the modeler may need to conduct an interview in which he or she confirms the relationships by asking the business representative if each of the business rules that the relationships represents is valid.

 

4.   小結

主題域模型與生俱來就是數據倉庫的基礎,因爲數據倉庫就是“面向主題的”。主題域模型提供一個組織業務數據模型的好方法。主題域模型爲企業定義15-25個主要的組,每一個組與其它組互斥。主題域模型能在幾天內創建出來,使用簡便的會議。兩個簡便會議中的第一個包括有關概念的培訓,頭腦風暴出潛在主題域清單,並提煉清單。在第二次會議前初步定義這些主題,然後在會上對這些進行評審,確認主題域及其定義並進行提煉,給模型增加主要的關係,並評審模型,並且記錄未解決的問題及確定下一步行動。

 

Summary

The subject area model is inherent in the foundation of the data warehouse, since the warehouse itself is “subject oriented.” The subject area model provides a good way of organizing the business data model. The subject area model identifies the 15–25 major groupings of significance to the company, with each one mutually exclusive of the others. The subject area model can be created in a few days, using facilitated sessions. The first of two facilitated sessions includes education on the relevant concepts, brainstorming a list of potential subject areas, and refinement of the list. Preliminary definitions are developed prior to the second meeting, at which the results of the first session and the work performed since then are reviewed, the subject areas and their definitions are reviewed and refined, major relationships are added to the model, and the model is reviewed. Unresolved issues and follow-up actions may also be identified.

業務數據模型是後面所有事情的基礎。顯著的錯誤會導致連鎖效果,所以驗證模型的結構與內容非常重要。業務數據模型描述了一個企業重要的信息,及這些信息如何聯繫起來。它完全獨立於任何組織、功能、技術等等。它爲任何應用系統的數據庫設計提供堅實的基礎,包括數據倉庫。一個完整的業務數據模型很複雜,可能需要一年時間來完成。與其開發一個 完整的業務數據模型,建模員應創建一個現有業務問題需求的模型。

 

This business data model is the foundation of everything that follows. Significant errors can have a cascading effect, so it is very important to verify both the structure and the content of the model. The business data model describes the information of importance to an enterprise and how pieces of information are related to each other. It is completely independent of any organizational, functional, or technological considerations. It therefore provides a solid foundation for designing the database for any application system, including a data warehouse. A complete business data model is complex and can easily require a year to complete. Instead of developing a complete business data model, the data warehouse modeler should create only those portions of the model that are needed to support the business questions being asked. Within the scope of the business questions being asked, the business data model is developed by identifying the subject areas from which data is needed, identifying and defining the major entities, establishing the relationships between pairs of entities, adding attributes, conforming to the third normal form, and confirming the content of the model.

在業務問題的範圍內,業務數據模型用於定義主題域,制定及定義主要的實體,建立每實體之間的關係,增加屬性,使其滿足第三範式,並且確認模型內容。

 

 
