Normally this would be an internal note to colleagues about what went on at the recently held annual SQL Server user group conference. However since I personally signed up to be one of the Platon bloggers, this will be a public document instead, as some of the things here might be of general interest.
To those unfamiliar to PASS, it’s an organization or community consisting solely of very dedicated people working with Microsoft SQL Server. The summit is an annual event, held in Seattle USA, gathering this year approximately 5000 people, but summits are also held in Europe and finally in Scandinavia this November under the name SQL Rally. However the true summit with keynotes from prominent Microsoft people is in Seattle, so to me that’s the place to be. This year there were 170 sessions presented during the three conference days and a number of full day sessions were delivered during the two pre-confernece days. All exclusively on SQL Server and no repeting sessions, so naturally you have to choose what you want to see.
Introducing……SQL Server 2012
So what did I bring home of new things? first of all the awaited next version of SQL Server code named Denali was finally given its release name and we are now awaiting SQL Server 2012 which is to be released Q1 2012. For now CTP 3 of Denali is available for download, but rumors of a RC0 version being available have struck my ears.
Working with BI both for customers and our own BI for IT solution, I was of course mostly interested in sessions around this topic. Two major new things being part of the 2012 release are, column based index and the new BISM (Power pivot) for analysis services.
Column based indexing have been expected with great anticipation. Used correctly it can speed up queries in an order of magnitudes. However the trees don’t grow into heaven. The index type is read only and other restrictions apply making it a little tricky to utilize, but combined with portioning, it’s possible to get around these and actually make something useful that performs out of this world.
Analysis services now comes in two flavors The old and well known OLAP version with the lovely query language MDX is still there and is almost untouched since 2008R2, but there’s the new kid on the block, which in essence is Power Pivot for servers utilizing the Excel like query language DAX. The latter edition of analysis service is based on the column store index to provide optimized query performance in a relational model. So now, like it or not, you could have two different instances of Analysis services running instead of one. I don’t want to be a party spoiler, but my guess is that OLAP type analysis is about to become part of the SQL Server engine by melting the two worlds together and Analysis services as we know it is about to be deprecated. Not in this version or maybe the next two, but it makes sense to have only one technology to maintain rather than two and I think software history will acknowledge this thesis. A lot of people will be upset about it, since the learning curve for MDX is rather steep and lot of time and effort have been spent in this area and now you might just have to learn yet another language.
Integration Services has also had a makeover, primarily in the form of better and more solid deployment, especially the project deployment as opposed to the current package deployment is interesting. Other than that, performance as well as other minor changes has been made but to my knowledge, nothing to pop the champagne over.
New TSQL features
In the relational engine, new features regarding the windowing functions have been provided and performance is generally improved. Now we can expect Lead(), Lag(), FIRST() and LAST() functions as well as other functions including a great alternative to TOP(), eliminating the need for an ETL tool to do some of the processing that was previously very hard or resource consuming to do in TSQL.
In a session held by Itzik Ben-Gan this was covered in more detail, along with a technique of using the not so well know “Cross Apply” syntax in TSQL. This was a moment of great enlightenment to me, however I couldn’t help feeling slightly embarrassed not having pursued this earlier, but well that’s why you should participate in these types of sessions. Check out this site, if you’re interested in having a look at some TSQL code from Itzik, it’s well worth your time if you’re a SQL developer. http://tsql.solidq.com/resources.htm.
Know the path to your storage
Being part of an Infrastructure company, means that I also take certain interest in Infrastructure related topics, such as those components responsible for the performance we can expect in the end. To the trained database people it’s no surprise that the disks eventually storing the data is an ever returning bottleneck when it comes to making those full scan queries run as fast as possible. When SANs were introduced in organizations, as the general platform for storage, the issues of getting enough throughputs for our database systems grew. This was a specific topic covered in at least a couple of sessions. You need to get in close contact with the storage guys in IT ops and make sure they understand that your database requires more than just capacity, they also require throughput (IOs pr. Sec) and low latency. So get in contact with the storage people and/or vendors if you’re not already.
In this context a thing came to my attention that I hadn’t thought about. The new black in storage is of course SSD (Solid State Disks). These disks can speed up IO tremendously and as prices are constantly being lowered I suppose this will be the standard for storing “Online” data in a couple of years if not earlier. However one has to be aware that SSD is in fact two different things. Solid State Disks are merely hard drives with FAST IO and no moving parts, but they act as regular hard drives, meaning that the path to the disk from the database is the same as to a regular disk, either if this is on a SAN or it is directly attached. Now since most controllers were built for regular hard drives and the throughput they’re able to produce, these controllers actually become the new bottlenecks. The other SSD, the Solid State Device, i.e. the ones from Fusion IO are on the other hand specific expansion cards added to a server and can be presumed as a sort of flash memory only they are presented as logical disk drives. These devices are not limited by the usual limitations of disk controllers, but rather by the PCI slot they’re inserted into, which in itself is a different story. The Microsoft CAT team had an interesting presentation covering this amongst other things. So when someone is talking SSD, you might go ahead and ask them what they mean. 95% chance you’ve got the first type, given the price tag of Solid state devices, and also you have the option to add Solid State Disks to your SAN giving it a performance boost, if you got some of all the other configuration stuff right in the attempt.
No IT summit without the cloud and Microsoft is all in when it comes to storing data in the cloud. SQL Azure now provides databases up to 150 GB in size. Although better than the previous 10 GB limit, for data warehouse, we are still a long way to having something useful. Also since Analysis Services isn’t available in the sky, I tend to lose a little interest in this space. However something that did catch my outmost attention was a demo where local data, through a nice and friendly interface was enriched with data from an external source which had address data exposed in the Azure marketplace. This enriching of local data with data from external vendors isn’t exactly new, but the user friendly way it was performed was excellent and having a market place for data is great and makes it much more transparent than what it is today. When I’m done being a consultant, I want to earn a living of selling cleansed and intelligent data to companies, could turn out to be just as good a deal as App development, who knows, anyway I don’t carry the skills for App development.
The PASS community is growing every year and the level of many of the sessions delivered great information and provided food for thought. I didn’t attend the black belt session held by the renowned Bob Ward from Microsoft CSS, who spent 3 hours in the TempDB with the audience, but maybe next year I’ll join in. Also I didn’t leave any room in this blog for Hadoop, Hive, ODBC drivers for Linux, PDW, MDS, DQS, Consolidation Appliance, Always On etc. You need to find this elsewhere and I expect SQLServer Central to be a good source for this or a simple Google search.
I was focused on what would provide value to me in the roles I play in my every work life and another participant would probably have described the summit completely different, however this was my take on PASS 2011. Feel free to leave comments and ask questions.