Better, Faster, Cheaper: Summer ’09 Data Warehousing Roundup


Netezza and Teradata news follows Sybase and IBM announcements. Vendors promise higher performance, lower cost and greater deployment flexibility.


By Doug Henschen
August 3, 2009

 

The better-faster-cheaper trend in data warehousing continues. Four leading vendors are introducing new software, new hardware and optimized integrations of the two that promise better performance, higher scalability and lower cost. The latest announcements are being made at this week's TDWI World Conference in San Diego. Netezza and IBM are arguably making the biggest news, while Teradata and Sybase are focusing on software upgrades that support in-database processing.

The news from Netezza, which broke last week in the blogosphere ahead of tomorrow's formal announcement, is that it is moving off proprietary hardware onto industry-standard blade-server hardware. The move takes advantage of the low cost and steady performance improvements available on commodity hardware, yet the architecture continues to use the field-programmable gate arrays (FPGAs) that have been Netezza's performance differentiator.

"Rather than having a proprietary intelligent storage node, where the disk, the FPGA and the CPU are all on the same card, we'll use standard storage arrays connected to commodity blades amended with our FPGA accelerator cards," says Phil Francisco, Netezza's vice president of product management and product marketing.

Netezza says current customers will be able to use existing code and queries on the new platform. Getting off proprietary hardware is an important strategic step that is likely to free up research and development funds while also improving margins. "A proprietary, monolithic architecture will not survive in the long run in data warehousing because it's inherently more expensive," says Forrester Analyst Jim Kobielus. "If you can use cheap, off-the-shelf components wherever possible, you can build a cheaper appliance, and cheap is everything in a commodity market."

The Netezza TwinFin appliance, the first product to be released on this new blade-server-based platform, has been beta tested by several customers and is available immediately. The TwinFin scales from several hundred gigabytes to more than a petabyte, and it is said to deliver three to five times the performance of Netezza's current platform. The vendor also emphasized that the new platform drops Netezza's prices below $20,000 per terabyte. Whether that "resets the bar on price-performance in the industry," as the vendor claims, is a matter of interpretation. Vendors including Greenplum and Kickfire are already below that threshold.

"Netezza has a lot of street credibility for delivering scalability and performance, but they've been at the middle of the market in terms of pricing," Kobielus says. "Now they can position themselves as a performance leader and as a price-performance leader. If what they are saying holds true, then they'll have a strong claim on both fronts."

Netezza also announced today that entry-level, high-capacity and memory-intensive appliances will be released on the new commodity-based platform. The entry-level product will target deployments ranging from tens to hundreds of gigabytes, but it will ultimately scale to more than a terabyte, Francisco says. The high-capacity model is aimed at archiving and record-retention applications where scalability trumps query speed -- a niche currently well served by both Greenplum and Teradata's Extreme Data Appliance. The memory-intensive model is designed for environments with complex queries, thousands to tens of thousands of users or both -- the high-end enterprise data warehouse (EDW) market commanded by Teradata and coveted by HP with its Neoview offering.

Netezza's exclusive storage and blade server hardware partner is IBM. The FPGA accelerator cards are attached on standard "side cars" available on IBM blades, but Francisco says the new architecture could be built on just about any industry-standard Intel-powered blade server. Netezza will continue to sell its current platform "for some time," Francisco says. But the long-term plan is to move entirely to the commodity-hardware-based platform.

Teradata Revs Its Database

Teradata's news this week is the general release of Teradata Database 13 and Teradata Tools and Utilities 13. The upgrades pack 93 new features and capabilities said to improve system performance by up to 30 percent. Highlights include a faster-running and more efficient query optimizer and a doubling of extract, load and transform (ELT) speeds while supporting simultaneous data analysis. But the key theme is in-database processing, with geospatial data and certain online analytical processing (OLAP) functions now supported while high-performance in-database data mining with partner SAS has been enhanced.

"The whole move to in-database processing is certainly a key theme that we've had for a long time," says Teradata development chief Scott Gnau. "The geospatial piece will let customers analyze not only what happened and who it happened to but also where it happened, and we can also do in-database predictive modeling of 'where will it happen, to whom and why?'"

Many databases support geospatial analysis, but Gnau says Teradata has integrated it into the parallel execution engine of the database. "Rather than just being a bolt-on solution -- which is how we implemented geospatial analysis previously and how it's generally implemented by competitors -- we've enabled those geospatial operators to execute alongside other operators in our massively parallel environment. That means performance, performance, performance."

Teradata has continued work with SAS to extend the depth and breadth of data mining functions that can be automatically handled within the database. Teradata has also increased the amount of memory available for such analyses for improved scalability, Gnau says.

Asked whether Netezza's less-than-$20,000-per-terabyte claim would draw attention from Teradata customers, Gnau countered, "our pricing and our product family are extremely competitive, and we provide a lot of choice... Customers don't have to be locked into a specific appliance and do a forklift [hardware] upgrade when they want to move to a different class of machine."

Gnau specifically touts Teradata's support for "coexistence," which is the capability to combine old and new hardware in a single system rather than replacing systems outright to scale up. That capability is unique to Teradata, says Gartner Analyst Donald Feinberg, but it's not without compromise. "If I have a two-node Teradata system, I can add nodes on new hardware. The catch is that they will run as slow as the slowest node," he explains. "All nodes act as a single Teradata system, and you can't have parts of a query operating at different speeds."

Joining the In-Database Parade

Sybase, too, is championing in-database processing with the Sybase IQ 15.1 release, announced in mid-July (see "Sybase IQ Upgrade Adds Support for In-Database Analytics"). With the release, Sybase IQ becomes the first column-oriented database to support in-database processing for data mining, predictive analysis, OLAP and other compute-intensive analytic functions. The release extends the product's built-in library of statistical and data mining analytic functions while also adding standard-SQL OLAP extensions for analysis of large data sets.

Sybase has also launched a partner program to enable third-party providers to plug their analytics into the Sybase IQ database. Only one vendor, Fuzzy Logix, has joined to date. Teradata has worked exclusively with analytics leader SAS, and the companies' two-year-old partnership yielded production-ready in-database offerings this spring. Netezza has a broader community of more than 100 in-database partners. Many are little-known developers of analytic applications and services, but SAS, SPSS, Catalina Marketing and other influential firms are also on the list.

Completing the Package

In contrast to Netezza's hardware-focused announcement and the Teradata and Sybase software announcements, IBM last week trumpeted the integration of hardware and software with optional business intelligence and analytic application modules. The company bills the IBM Smart Analytic System, to be released in September, as a single-vendor, single-order offering with all components pre-integrated and optimized to deliver maximum performance.

Some critics derided IBM's sketchy announcement as "markitecture" and a broader bundling of the IBM Balanced Warehouse, but Forrester's Kobielus says he's impressed: "IBM is alone in the market in taking things to this extent... It offers many more options [than competitors] for the packaging of its appliances for various size classes and functional profiles... And now they are adding verticalized and horizontalized solutions that are aligned with the delivery of consulting and professional services from IBM Global Business Services."

The short take on recent announcements is that IBM is stressing its breadth, Teradata and Sybase are accentuating their depth, and Netezza is lowering its cost while looking for broader deployments.

"What's really happening is that vendors are enhancing their software, and if they have hardware they are starting to take advantage of the faster chips, bigger disk drives and better interconnections," observes Gartner's Donald Feinberg. "If they're not doing so already, vendors are also moving to handle mixed workloads and generic data warehousing needs because they don't want their appliances to be pigeonholed as being just for data marts."
