Friday, January 4, 2013

Musings of a NetApp Insight 2012 Attendee

As has come to be expected, NetApp Insight consisted of 3 days packed with all things NetApp. This year the breakout sessions focused on areas such as Cloud, FlexPod, Big Data and, of course, Cluster-mode, just to name a few.
I attended an interesting discussion on Big Data. NetApp has its own issues with Big Data: it receives somewhere around 800,000 AutoSupports per week. The AutoSupport data has to land, then be parsed and extracted, before being pushed into a data warehouse. Apparently, the Oracle database solution couldn't handle the growth; their AutoSupport data load doubles every 16 months. Various applications have to access the AutoSupport data in varying forms. When we go to MyAutosupport and look up a customer system, the MyAutosupport application is accessing the data in one of their data marts. Many tools access this data on behalf of end users, support personnel and design engineering, and the data must be available quickly for these organizations to be effective. NetApp found it increasingly difficult to meet its SLAs around this solution.
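The land, parse/extract, and load steps described above can be sketched as a tiny pipeline. This is purely illustrative; the payload format, field names (`system_id`, `ontap_version`) and the dictionary "warehouse" are my own assumptions, not NetApp's actual AutoSupport schema or tooling.

```python
# Hypothetical sketch of a land -> parse/extract -> load flow.
# Payload format and field names are illustrative assumptions only.

def parse_autosupport(raw: str) -> dict:
    """Extract key=value fields from a raw (landed) payload."""
    record = {}
    for line in raw.strip().splitlines():
        if "=" in line:
            key, _, value = line.partition("=")
            record[key.strip()] = value.strip()
    return record

def load_into_warehouse(warehouse: dict, record: dict) -> None:
    """Append the parsed record to a per-system table (data-mart style)."""
    warehouse.setdefault(record.get("system_id", "unknown"), []).append(record)

warehouse = {}
raw_payload = """
system_id = FAS3210-001
ontap_version = 8.1
aggr_used_pct = 72
"""
load_into_warehouse(warehouse, parse_autosupport(raw_payload))
print(warehouse["FAS3210-001"][0]["ontap_version"])  # -> 8.1
```

At real scale, the point of the Hadoop replacement is that each of these stages runs in parallel across many payloads at once rather than serially into one relational database.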

So NetApp decided to implement a replacement solution based on Hadoop. Hadoop is an open-source framework designed to process large amounts of data in parallel, and the solution encompasses both compute and storage. To manage the storage, Hadoop includes its own file system, designed to store petabytes of data and provide high-speed access. Since Hadoop owns the file system, it knows the location of all the chunks of data and can intelligently allocate compute resources according to data locality. In other words, the data doesn't have to be moved to a compute node; compute nodes are chosen because they have direct access to particular data chunks. NetApp chose to utilize its E-series (acquired from Engenio) storage for the Hadoop Distributed File System (HDFS). In addition, NetApp utilized the low-cost FAS2040 to store metadata for the HDFS NameNodes. The solution consists of 32 DataNodes with 7 E2600 storage arrays, and each DataNode accesses its own LUNs on the storage. The overall AutoSupport solution is quite complex, but the Hadoop storage portion is a little less complicated. Interestingly, the DataNodes are connected to the storage via eSAS rather than Fibre Channel or Ethernet. More info on Open Hadoop on E-series can be found here:
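The data-locality idea above can be shown with a minimal sketch: given which nodes hold replicas of each block, prefer a free node that already has the data. The node and block names are hypothetical, and real Hadoop scheduling is far more involved; this just illustrates the principle that compute moves to the data.

```python
# Illustrative sketch of HDFS-style data-locality scheduling.
# Node/block names are made up; real Hadoop schedulers consider
# racks, replica counts, and task queues, among much else.

def schedule_task(block_locations, free_nodes):
    """Pick a node per block, preferring nodes that hold a replica."""
    assignments = {}
    for block, replicas in block_locations.items():
        local = [n for n in replicas if n in free_nodes]
        # Fall back to any free node (a non-local read) if no replica is free.
        assignments[block] = local[0] if local else next(iter(free_nodes))
    return assignments

# Example: 3 blocks replicated across 4 data nodes, 2 nodes free.
locations = {
    "blk_1": ["node1", "node2"],
    "blk_2": ["node3", "node4"],
    "blk_3": ["node2", "node3"],
}
print(schedule_task(locations, {"node2", "node3"}))
# -> {'blk_1': 'node2', 'blk_2': 'node3', 'blk_3': 'node2'}
```

Each block lands on a node that already stores a replica, so no data crosses the network before processing starts.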

FlexPod is certainly getting a lot of attention. FlexPod is a joint solution based on Cisco Validated Designs and NetApp Verified Architectures. Multiple FlexPod solutions are available for virtual infrastructures, multi-tenancy, VDI, etc. Of great benefit is that the designs are jointly supported by Cisco, NetApp and VMware.

The original FlexPod was designed and sized to accommodate 1,500 users running VDI, MS Exchange, SharePoint and MS SQL. It encompassed three Cisco UCS B-Series blade chassis, dual fabric interconnects, fabric extenders and Nexus switches, plus the NetApp FAS3210. That's a pretty high entry-level cost for smaller companies.

Since then, NetApp and Cisco have announced an entry-level FlexPod. It comprises the less expensive FAS2240 storage and as few as one C-Series rackmount server, utilizing iSCSI for boot. The entry-level FlexPod does require the same caliber of network switches, however, so it is still quite expensive for small businesses.

Recently, NetApp and Cisco have introduced the ExpressPod, a scaled-down, entry-level version of FlexPod aimed at smaller businesses. The ExpressPod consists of either FAS2220 or FAS2240 storage (no FAS32xx), two 1Gb Nexus 3048 switches and C-Series rackmount servers. Since the design omits fabric interconnects, UCS Manager is not included. The entry price is much lower, and the cooperative support model is still in force.

Well, last but not least, NetApp had a lot of discussion around "Cluster-mode." Not sure how much everyone knows about "7-mode" vs. "Cluster-mode." 7-mode is basically a high-availability pair where the surviving controller takes over the personality of the failing controller to provide its services. Cluster-mode, on the other hand, is horizontally scalable by adding more controllers to a cluster. A service runs from a virtual entity called a vserver, which can (theoretically) run on any physical node. It's similar to a VMware cluster.

A little history here. NetApp had Ontap (their operating system) at version 7.x for many years; some of our customers are still running 7.3.x. NetApp purchased Spinnaker Networks in 2003, primarily to acquire their clustering technology. By 2006, NetApp had integrated the Spinnaker technology into its Ontap GX product, a separate code base of Ontap focused primarily on High Performance Computing solutions. NetApp's goal has been to merge the two code bases into one, which was realized with Ontap 8.x. So with Ontap 8.x, a customer has a choice of which MODE to run: 7 or cluster. Unfortunately, it's not quite that simple. Cluster-mode does have some additional requirements above and beyond a traditional 7-mode implementation. NetApp would like all greenfield implementations to go Cluster-mode. Currently, NetApp doesn't allow/support partners doing migrations from 7-mode to Cluster-mode. Traditional tools such as SnapMirror are not 100% compatible between the two modes due to differences in volume structure. In the future, partners will be able to participate in migration services once NetApp produces end-user migration tools.

Another barrier to entry for Cluster-mode is the need for external 10Gb switches, along with corresponding adapters in each controller. This hardware is solely for Ontap's cluster communication; the infrastructure is not to be used for customer data. So NetApp currently sells these additional switches for $1. I don't know when this "promo" will end, but it certainly helps level the playing field between the two modes.

In the not-too-distant future, NetApp will not develop any more 7-mode releases. Customers will have to make the switch eventually, but will it be as soon as NetApp would like? I don't know, but it's doubtful.
