Netezza TwinFin: A step towards a potential acquisition?

Ever since Netezza became a public company, every once in a while someone tries to start a rumor that Netezza is on the verge of being acquired (likely started by people who want a quick return on their Netezza stock). These rumors usually involve a company like Oracle buying Netezza, which never made a lot of sense to me, since Oracle has its own DBMS product and has very little reason to buy a much smaller competitor like Netezza and maintain two lines of code that target the same market. This is why it wasn’t surprising that Microsoft chose to acquire DATAllegro instead of Netezza, even though Netezza was much farther along than DATAllegro and had a larger customer base. DATAllegro essentially left the DBMS engine intact, with its key technological assets sitting on top of the DBMS, turning many single-node Ingres instances into a large, shared-nothing, MPP DBMS. Since DATAllegro used a nice, modular architecture, Microsoft was able to replace Ingres with SQL Server, and use DATAllegro’s technology to turn SQL Server into an MPP DBMS without significant modifications to the core SQL Server DBMS engine (see Microsoft’s Project Madison).
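To make the "layer on top of unmodified single-node engines" idea concrete, here is a minimal sketch. It is not DATAllegro's actual design: in-memory SQLite databases stand in for the single-node DBMS instances, and the Coordinator class, the sales table, and the hash-partitioning rule are all assumptions made for illustration.

```python
import sqlite3


class Coordinator:
    """Toy MPP layer sitting on top of unmodified single-node engines."""

    def __init__(self, num_nodes):
        # Each in-memory SQLite database stands in for one single-node DBMS.
        self.nodes = [sqlite3.connect(":memory:") for _ in range(num_nodes)]
        for node in self.nodes:
            node.execute(
                "CREATE TABLE sales (order_id INTEGER, region TEXT, amount REAL)"
            )

    def insert(self, order_id, region, amount):
        # Hash-partition rows on order_id so every node owns a disjoint slice.
        node = self.nodes[order_id % len(self.nodes)]
        node.execute("INSERT INTO sales VALUES (?, ?, ?)", (order_id, region, amount))

    def total_by_region(self):
        # Scatter: every node runs the same local aggregate over its own slice.
        # Gather: the coordinator merges the partial sums.
        totals = {}
        for node in self.nodes:
            rows = node.execute("SELECT region, SUM(amount) FROM sales GROUP BY region")
            for region, partial in rows:
                totals[region] = totals.get(region, 0.0) + partial
        return totals


cluster = Coordinator(num_nodes=3)
sample = [("east", 10.0), ("west", 5.0), ("east", 2.5), ("west", 1.0), ("east", 4.0)]
for order_id, (region, amount) in enumerate(sample):
    cluster.insert(order_id, region, amount)
print(cluster.total_by_region())
```

The point of the sketch is that the per-node engine is never modified. Swapping SQLite for another engine only changes how the nodes are created, which is roughly why a modular design makes the underlying DBMS replaceable.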

But two events now make me wonder if Netezza might actually end up being acquired by a vendor that currently sells a competing DBMS product (likely either IBM with DB2 or HP with NeoView).

First, there was the release of the Oracle Database Machine. Oracle openly admits that the Oracle Database Machine frequently gets a performance improvement of between 10x and 70x relative to previous Oracle offerings (i.e., Oracle RAC) on scan-heavy analytical workloads. But the center of the Oracle Database Machine is ... Oracle RAC! So how does it get the order of magnitude performance improvement relative to RAC? By connecting RAC (using Infiniband) to a shared-nothing storage layer (Exadata) that can perform database scans at extremely high speeds and do some basic database operations like tuple selection and projection. Since scan-oriented queries are limited by the speed with which the scan can occur, simply connecting RAC to a storage layer that can do scans really well yields significant improvement.
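A rough sketch of why pushing selection and projection into the storage layer helps. This is only an illustration of the data-reduction effect, not of how Exadata is implemented: the table contents, the function names, and the byte-counting proxy are all invented for the example, and it ignores the parallelism across storage cells.

```python
# Hypothetical table held by one storage cell: (order_id, region, amount, notes)
ROWS = [(i, "east" if i % 4 == 0 else "west", float(i), "x" * 100) for i in range(1000)]


def dumb_storage_scan(rows):
    """Conventional storage: ship every full row to the database server."""
    return list(rows)


def smart_storage_scan(rows, predicate, columns):
    """Smart storage: apply selection and projection before shipping anything."""
    return [tuple(row[c] for c in columns) for row in rows if predicate(row)]


def shipped_bytes(rows):
    # Rough stand-in for interconnect traffic: total length of the rows' text form.
    return sum(len(repr(row)) for row in rows)


def is_east(row):
    return row[1] == "east"


# Query: SELECT amount FROM orders WHERE region = 'east'
full = dumb_storage_scan(ROWS)
answer_without_pushdown = [(row[2],) for row in full if is_east(row)]
pushed = smart_storage_scan(ROWS, is_east, columns=[2])

assert pushed == answer_without_pushdown  # same answer either way
print(shipped_bytes(full), "bytes shipped without pushdown")
print(shipped_bytes(pushed), "bytes shipped with pushdown")
```

Both paths return the same answer, but with pushdown the database server only receives the rows and columns the query needs.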

Perhaps Netezza’s greatest asset is its ability to achieve high performance on table scans. By using FPGAs to perform decompression, selection, and projection as data is read off of disk, Netezza is able to perform scans faster than what competitors (at least row-store competitors) can do on commodity hardware. If the Oracle Database Machine is successful (Larry Ellison said at a recent earnings call that it “is shaping up to be our most exciting and successful new product introduction in Oracle’s 30 year history”), I would expect its competitors to follow suit — and connect their DBMS engines to a high performance storage layer the way Oracle did with Exadata.

Second, Netezza’s recent move to re-architect their appliance via TwinFin (announced a few weeks ago) is a clear embrace of commodity hardware components. Before this redesign, Netezza was a monolithic appliance. As detailed by ComputerWeekly, if you wanted to upgrade storage or processing capacity, you had to wait for the next Netezza release and replace the whole appliance with Netezza’s next generation. Now, the core part of the Netezza technology can be placed in the “sidecar” expansion slot in the standard IBM BladeServer family of servers. This allows customers to upgrade the IBM blades independently of the Netezza technology.

Looking at it a different way: the technology behind Netezza’s stellar scan performance can now be found in a nice modular component, the “DB Accelerator” card, that can be placed in standard expansion slots in blade servers. The move towards a more modular architecture is reminiscent of the DATAllegro architecture that allowed Microsoft to replace Linux with Windows and Ingres with SQL Server and keep the majority of the rest of the DATAllegro technology. DATAllegro was sold for $275 million to Microsoft when it only had 3-4 customers.

Netezza’s market cap is currently $550 million and it has orders of magnitude more customers than DATAllegro did (and is currently profitable). Hence it seems like a prime candidate for an acquisition. Its recent architectural redesign allows it to be acquired even by a company with a competing data warehouse product, since its core technology can be used in the storage layer as a drop-in accelerator for table scans, much the way Oracle uses Exadata. IBM seems like a natural fit given their close partnership on TwinFin. Otherwise HP seems like an option, since NeoView seems to be having trouble getting off the ground. Time will tell, but I will no longer ignore Netezza acquisition rumors the way I once did.

CATEGORIES: Database Management

European Thoughts on RDA

Some European libraries are asking the question: “Should we adopt RDA as our cataloging code?” The discussion is happening through the European RDA Interest Group (EURIG). Members of EURIG are preparing reports on what they see as the possibilities that RDA could become a truly international cataloging code. With the increased sharing of just about everything between Europe’s countries — currency, labor force, media, etc. — the vision of Europe’s libraries as a cooperating unit seems to be a no-brainer.

There are interesting comments in the presentations available from the EURIG meeting. For example:

Spain has done comparisons with current cataloging and some testing using MARC21. They conclude: “Our decision will probably depend on the flexibility to get the different lists, vocabularies, specific rules… that we need.” In other words, it all depends on being able to customize RDA to local practice.

Germany sees RDA as having the potential to be an international set of rules for data sharing (much like ISBD today), with national profiles for internal use. Germany has started translating the RDA vocabulary terms in the Open Metadata Registry, but notes that translation of the text must be negotiated with the co-publishers of RDA, that is, the American Library Association, the Canadian Library Association, and CILIP.

The most detail, though, comes from a report by the French libraries. (The French are totally winning my heart as a smart and outspoken people. Their response to the Google Books Settlement was wonderful.) This report brings up some key issues about RDA from outside the JSC.

First, it is said in this report, and also in some of the EURIG presentations from their meeting, that it is RDA’s implementation of FRBR that makes it a candidate for an international cataloging code. FRBR is seen as the model that will allow library metadata to have a presence on the Web, and many in the library profession see getting a library presence on the Web as an essential element of being part of the modern information environment. One irony of this, though, is that Italy already has a cataloging code based on FRBR, REICAT, but that has gotten little attention. (A large segment of it is available in English if you are curious about their approach.)

The French interest in FRBR is specifically about Scenario 1 as defined in RDA: a model with defined entities and links between them. An implementation of Scenario 2, which links authority records to bibliographic records, would be a mere replication of what already exists in France’s catalogs. In other words, they have already progressed to Scenario 2 while U.S. libraries are still stuck in Scenario 3, the flat data model.
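For readers who think in data structures, here is a loose sketch of the difference between the linked-entity model and the flat one. The class and field names are my own simplifications, not RDA element names, and the sample record is invented.

```python
from dataclasses import dataclass


# Linked-entity style: each entity is its own record, joined by identifiers.
@dataclass
class Person:
    id: str
    name: str


@dataclass
class Work:
    id: str
    title: str
    creator_id: str  # link to a Person


@dataclass
class Expression:
    id: str
    work_id: str  # link to a Work
    language: str


@dataclass
class Manifestation:
    id: str
    expression_id: str  # link to an Expression
    publisher: str
    year: int


def flatten(m, expressions, works, persons):
    """Collapse the linked entities into one flat record with no links."""
    e = expressions[m.expression_id]
    w = works[e.work_id]
    p = persons[w.creator_id]
    return {"author": p.name, "title": w.title, "language": e.language,
            "publisher": m.publisher, "year": m.year}


# Invented sample data, for illustration only.
persons = {"p1": Person("p1", "A. Author")}
works = {"w1": Work("w1", "An Example Novel", "p1")}
expressions = {"e1": Expression("e1", "w1", "eng")}
edition = Manifestation("m1", "e1", "Example Press", 2009)
print(flatten(edition, expressions, works, persons))
```

Going from linked entities to a flat record is a simple lookup. Going the other way means deciding which fields belong to which entity, and that is where differing interpretations of FRBR come into play.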

Although the French libraries see an advantage to using RDA, they also have some fairly severe criticisms. Key ones are:

  • it ignores ISO standards, and does not follow IFLA standards, such as Names of Persons or Anonymous Classics*
  • it is a follow-on to, and makes concessions to, AACR(1 and 2), which is not used by the French libraries
  • it proposes one particular interpretation of FRBR, not allowing for others, and defines each element as being exclusively for use with a single FRBR entity

They recommend considering the possibility of creating a European profile of RDA Scenario 1. This would give the European libraries a cataloging code based on RDA but under their control. They do ask, however, what the impact on global sharing will be if different library communities use different interpretations of FRBR. (My answer: define your data elements and exchange data elements; implement FRBR inside systems, but make it possible to share data apart from any particular FRBR structuring.)

* There is a strong adherence to ISO and IFLA standards outside of the U.S. I don’t know why we in the U.S. feel less need to pay attention to those international standards bodies, but it does separate us from the greater library community.

(Thanks to John Hostage of Harvard for pointing out the recent EURIG activity on the RDA-L list.)

CATEGORIES: Database Management

FAQs for SQL Server Database Corruption

Question: Does a missing data file (.mdf) put a database into suspect mode?
Answer: NO. It may have been true in SQL Server 2000 and earlier versions, but it is not true in current versions. If a data file is missing, your database may be inaccessible, but it will not be in suspect mode.

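Question: How many tools are available to check database integrity?
Answer: One: DBCC CHECKDB is the built-in command for checking the integrity of a whole database. It bundles the checks that DBCC CHECKALLOC, DBCC CHECKTABLE, and DBCC CHECKCATALOG perform individually.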

Question: How do you bring a database online from suspect mode?
Answer: If your database has gone into suspect mode, you cannot perform any operations on it. To bring it back online, first put it into emergency mode with ALTER DATABASE <name> SET EMERGENCY. Once the database is in emergency mode, run CheckDB with repair_allow_data_loss on it (repair options require single-user mode), and then bring it back online.
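The sequence can be sketched as a small helper that only builds the T-SQL text and executes nothing. The function names and the database name "Sales" are hypothetical. The single-user step and the WITH NO_INFOMSGS, ALL_ERRORMSGS options are my additions, since repair options need single-user mode. Treat this as a sketch to adapt, not a script to paste into production.

```python
def quote_identifier(name):
    # T-SQL bracket quoting; a literal ] inside the name is doubled.
    return "[" + name.replace("]", "]]") + "]"


def quote_literal(value):
    # T-SQL Unicode string literal; a literal ' inside the value is doubled.
    return "N'" + value.replace("'", "''") + "'"


def emergency_repair_script(db_name):
    """Return the T-SQL statements for the emergency-mode repair sequence."""
    ident = quote_identifier(db_name)
    return [
        f"ALTER DATABASE {ident} SET EMERGENCY;",
        # Repair options require the database to be in single-user mode.
        f"ALTER DATABASE {ident} SET SINGLE_USER WITH ROLLBACK IMMEDIATE;",
        f"DBCC CHECKDB ({quote_literal(db_name)}, REPAIR_ALLOW_DATA_LOSS) "
        "WITH NO_INFOMSGS, ALL_ERRORMSGS;",
        # Hand the database back to normal use once the repair has finished.
        f"ALTER DATABASE {ident} SET MULTI_USER;",
    ]


# "Sales" is a hypothetical database name.
print("\n".join(emergency_repair_script("Sales")))
```

Because REPAIR_ALLOW_DATA_LOSS can discard data, it is worth copying the damaged files somewhere safe before running anything like this.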

Question: Do you lose data after running CheckDB with repair_allow_data_loss?
Answer: The short answer is YES. As the name suggests, you should expect to lose some data after running CheckDB with repair_allow_data_loss.

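Question: What repair options are available in CheckDB?
Answer: There are two useful repair options in CheckDB: Repair_Rebuild, which makes repairs that carry no risk of data loss, and Repair_Allow_Data_Loss. (A third option, Repair_Fast, is kept only for backward compatibility and performs no repairs.)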

Question: What are the reasons for database corruption?
Answer: There are many reasons for database corruption. Common ones include hardware failure, virus attacks, power failure, and metadata structure corruption.

Note: 99% of SQL Server database corruption is due to faulty hardware.

Question: What should you do after database corruption?
Answer: Don’t panic; if possible, take a backup of the corrupt database.

Question: How do you recover a corrupt database?
Answer: This is a very broad question, but the answer can be broken into three simple steps.

  1. If a backup is available, restore the database from it.
  2. Run CheckDB on the database, check the SQL Server error log, and then run CheckDB with repair_allow_data_loss.
  3. Try SQL recovery software from a third-party Microsoft Gold Partner.
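The order of preference in the list above can be written down as a tiny decision helper. The function name, the argument names, and the idea of passing in the minimum repair level from a plain CheckDB run are my own framing for illustration.

```python
def next_recovery_step(has_usable_backup, min_repair_level=None):
    """Pick the least destructive option, in the order of the list above.

    min_repair_level is the level a plain CheckDB run names as the minimum
    needed: "repair_rebuild", "repair_allow_data_loss", or None when no
    repair level is known.
    """
    if has_usable_backup:
        return "restore from backup"
    if min_repair_level == "repair_rebuild":
        return "run CheckDB with repair_rebuild"
    if min_repair_level == "repair_allow_data_loss":
        return "run CheckDB with repair_allow_data_loss"
    return "try third-party recovery software"


print(next_recovery_step(has_usable_backup=False, min_repair_level="repair_rebuild"))
```

The helper always prefers a backup, and it only reaches for repair_allow_data_loss when the lossless repair level is not enough.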

Question: Which is the best third-party SQL recovery software?
Answer: The short answer is Stellar Phoenix SQL recovery software. It recovers and repairs the .mdf and .ndf files of SQL Server. It supports SQL Server 2008 R2, 2008, 2005, 2000, and 7.0.

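Question: Does the simple recovery model store log files?
Answer: Not for long. The simple recovery model still uses a transaction log file, but inactive log records are truncated automatically (typically at each checkpoint) and cannot be backed up. The full recovery model retains all log records until a log backup is taken.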

CATEGORIES: Database System