Damaged Index on SSAS Cube After Running ProcessAdd

Since we upgraded to SQL Server 2012 about 12 months ago, we have intermittently hit a particular error when running a ProcessAdd against one of our dimensions. The error in question was:

Microsoft.AnalysisServices.AdomdClient.AdomdErrorResponseException: File system error: The index data is damaged. Physical file: \\?\F:\OLAP\Data\XYZ.0.db\Member.0.dim\16.Id.Aggregate Member.fact.map.hdr. Logical file: .

The dimension in question was our largest by some margin. We have a cube per client, and each of our clients' cubes can vary in size, from a few MB to over 1 TB. The dimension in question, however, never exceeded the 4 GB recommended maximum size for a dimension. This corruption was causing us a major issue: the cube was unusable, and the resolution was to run a ProcessUpdate with Process Affected Objects, effectively meaning we had to rebuild our aggregation designs. On the larger cubes this took hours and halted any importing.

I searched for a fix online and, sure enough, Microsoft's support site lists a fix for our exact issue, released in 2012 RTM CU2 and packaged up with SP1.

2673997 FIX: “The index data is damaged” error message when you run an MDX query against a dimension or partition in SSAS 2008, in SSAS 2008 R2 or in SSAS 2012

http://support.microsoft.com/kb/2673997/EN-US

Seeing as we had already applied SP1 (and CU2), this fix was not pertinent to us.

This chronic issue was really harming us, so eventually I made a support call to Microsoft: as a Gold Partner we get technical support, and this seemed like the time to use it. I got through to a member of the dev team and we worked our way through the scenario. It turns out a trifecta of settings had caused our issues.

The first of these was that our anti-virus software was scanning one particular file in the $\OLAP directory, which was causing the dimension to become corrupt. I've blogged before about anti-virus scanning of the SSAS folders:

Anti Virus scanning

Swansong Musings

TL;DR: go here for a mixture of personal/professional stuff, or here for more professional stuff.

Going meta again this week, as I blog about blogging in other places and a certain mistake I made not two weeks ago.

It Must Be True, It’s Written On The Internet

One of the reasons I write a blog is that I find I'm more likely to remember something if I write it down, so a blog is a useful place for that. Usually I'm very meticulous in my research and provide examples in order to confirm that what I believe to be the case actually is the case. But on occasion I will get it wrong. And this week I got it wrong. Big time. And it is to do with defining the files when creating a database. Whilst I was right that you cannot override the system variables that determine where files will be written by default, I overlooked the fact that you can define the locations yourself. Look at that. I got it so wrong I'm using the trifecta of bold, underline and italics to make the point.

Below is a screenshot of an mdf and ldf file being created in a different location:

[screenshot]

So a pre-model script or a deployment contributor is not really necessary; you just need to define the location in your publish.xml file and parameterise the file path yourself.
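For illustration, here is a minimal T-SQL sketch of the same idea at the SQL level: a database created with explicitly defined data and log file locations rather than the instance defaults. The database name, paths and sizes are placeholders I've made up, not the ones from the screenshot.

-- Minimal sketch: create a database with explicit data and log file locations.
-- Names, paths and sizes are illustrative placeholders.
CREATE DATABASE MyDatabase
ON PRIMARY
(
    NAME = MyDatabase_Data,
    FILENAME = 'E:\SQLData\MyDatabase.mdf',
    SIZE = 512MB,
    FILEGROWTH = 256MB
)
LOG ON
(
    NAME = MyDatabase_Log,
    FILENAME = 'F:\SQLLogs\MyDatabase.ldf',
    SIZE = 256MB,
    FILEGROWTH = 128MB
);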

Sup Dawg, I Heard You Like Reading my Blogs

So, earlier this year I started working at Sabin.IO, the consultancy run by Simon Sabin, he of SQLBits, amongst other distinctions. You may have noticed that around the turn of the year I started posting fewer and fewer DBA-type posts and focused more on Database DevOps, which is exactly what I used to do not so long ago. But invariably I've come back to Database DevOps for a few reasons:

  1. You can genuinely make a difference helping people improve their database deployment story.
  2. I was burnt by (and burnt out) too many chaotic databases, where the definition of “gold code” was whatever was in live, but there was no control as to who or what got deployed there.

But to quote a friend of mine, “there’s nothing negative in pointing out problems. It’s only negative in pointing out problems and doing nothing about it.” And very true that is. So I’m helping my former DBA colleagues by advocating the CI/CD database deployment process.

Anyway, there are a few blog posts over on Sabin IO written by yours truly about some exciting new features in SQLPackage, and how to leverage those same features through the DacFx API. So why not head over there and give them a read, why don't you?!

And so as One Door Closes, Another Opens….

Empty platitudes aside, this will be my last blog post here. 4 years, 325 posts, over 100,000 views, not bad going. But I haven’t stopped blogging: I will in fact be posting at Sabin IO. Because I’ve not only been working at Sabin IO, I’ve actually gone permanent as of the end of October. And seeing as:

  • I like to blog
  • I like a wide audience to read it

It makes sense for me to post at Sabin.IO.

This site will hang around for a while, but since the name is a company name it will revert to a ".wordpress.com" address, and some of the content will appear on the SabinIO blog.

See ya pals!

Remove the Limitations Of LocalDB With This One Simple Trick**

**(not a trick)

If you are developing your databases in SSDT, then chances are you are familiar with LocalDB. LocalDB is essentially a stripped-down version of the database engine that runs within Visual Studio, and whenever you hit F5 the database is deployed to LocalDB… sometimes.

You see, LocalDB is in fact pretty much SQL Server Express, the free version of SQL Server. Not that there's anything wrong with that, except that SQL Server Express does not support all features, like, say, full-text search. But what else is missing? There are no performance counters, but is that too much of an issue?

In fact, none of the Enterprise features are available in LocalDB, but you can fix this problem with one simple trick: download SQL Server Developer Edition for free! Yep, this is a new Microsoft, one which will give you the full-fat features at zero cost. The announcement was made some months ago, but it does surprise me how many people are still not aware of it.
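If you want to confirm what you are actually developing against, a quick sanity check is to ask the engine itself (a hedged illustration only; LocalDB will typically report itself as an Express edition):

-- Check which edition, version and feature set the current instance really has.
SELECT
    SERVERPROPERTY('Edition')             AS Edition,
    SERVERPROPERTY('ProductVersion')      AS ProductVersion,
    SERVERPROPERTY('IsFullTextInstalled') AS IsFullTextInstalled;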

SQL Server Compression 101

The data compression feature in SQL Server can help reduce the size of the database as well as improve the performance of I/O-intensive workloads, especially on data warehouses. This performance boost in I/O is offset against the extra CPU resource required to compress and decompress the data as it is exchanged with the application. With data warehouses there is typically spare CPU capacity, whilst data storage and memory capacity are at a premium, so Microsoft's recommendation is to compress all objects to the highest level. However, if there are no compressed objects at all, it is better to take the cautious approach and evaluate each database object individually, along with its effect on the workload, particularly if the CPU headroom is limited; a short example of this cautious approach is below. Continue reading “SQL Server Compression 101”
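As a rough illustration of that cautious approach, you can estimate the savings for a single object before committing to compressing it. The schema and table names below are placeholders:

-- Estimate what PAGE compression would save for one (placeholder) table.
EXEC sp_estimate_data_compression_savings
    @schema_name      = 'dbo',
    @object_name      = 'FactSales',
    @index_id         = NULL,
    @partition_number = NULL,
    @data_compression = 'PAGE';

-- If the estimate looks worthwhile and there is CPU headroom, compress the table.
ALTER TABLE dbo.FactSales REBUILD WITH (DATA_COMPRESSION = PAGE);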

SSDT And Pre-Model Scripts Part 2: Execute Pre-Model Scripts Using Octopus Deploy

Hello!

Part one of this post concerned the notion of SSDT and Pre-Model Scripts. This post relates to an actual implementation of executing a SQL script using Octopus Deploy. My pull request on Github was approved, so my step template can be exported from the Octopus Library.

The only part that I feel requires some explanation is the SQL scripts and the option to select a previous step. If you've read part one of this post you'll know that you keep the pre-model scripts in a folder in the database solution. These are not compiled (i.e. not included in the build) but are included as part of the output and packaged up in the NuGet package. When the NuGet package is unpacked, it's then a case of locating the scripts and executing them.

First, the order: extract the NuGet package that includes the dacpac and the pre-model scripts, and then add the execute SQL step:

[screenshot: step order]

In the step template for executing SQL there are Octopus parameters. With these we define the step in which the scripts were unpacked, and the final path to the scripts themselves:

[screenshot: step options]

And these are the values of the parameters:

[screenshot: variables]

So now when we execute the step, we can see which scripts were executed:

[screenshot: execution output]

When To Use Octopus Deploy Script Modules

Hello!

Lately I've been thinking a lot about script modules in Octopus Deploy. Script modules, for those of you not ITK (in the know), are a collection of PowerShell cmdlets or functions that exist outside of a script step and can be used across multiple steps in different projects/environments. They don't have to contain functions exclusively; they can just contain code that will get executed upon the initial import of the module. The issue here is that these script modules are imported for every step in a process. So unless the code you write is idempotent, you'll probably cause an error in the process. This is by design.

So at first I thought “put nothing in script modules”, because they're pretty pointless, right? Anything you write will pertain to a step exclusively, so why would you need to share code across steps? Turns out quite a bit of code is shared across steps, particularly code for checking and validating parameters.

Then, as step templates got more sophisticated they got harder to read, so then I thought “put everything in script modules!” It aided the legibility of the step templates and allowed them to focus mainly on setting up a process, using the script modules to execute the functions.

There is, however, an issue with this: script modules are the “magic” behind a step template. If a step template requires a script module, how do you communicate that implicit dependency? You could mention in the step template that a script module is required, but this is still a manual process that relies on everyone remembering it, or even buying into it.

There is also another issue: I had to make a change to a step template that required a change to the script module. The step template in question was in heavy use, and so careful planning was required to make sure that the changes to the step template and the script module did not amount to a breaking change. Basically, whereas a step template has to be explicitly updated for a given step in a deployment pipeline, the same cannot be said for a script module: whatever the latest version of the script module is, that is what a user gets. And if you make a few changes to the step template in an iterative manner, you have to be sure that if anyone updates the step template then all the versions match one another. Continue reading “When To Use Octopus Deploy Script Modules”

The Alternative to Pre-Model Scripts

Hello!

Two of my most recent posts have concerned pre-model scripts: those scripts that need to be run before the dacpac is compared against the production database. These are sometimes necessary, usually because SSDT may not produce a script that is optimal. One such example is index creation: no index is created “ONLINE”, and this can be a problem if the database being deployed to is in use during the operation. It can be even worse if the table is particularly large.

With respect to the SSDT team, I can see why this is the case: some editions of SQL Server have the online index feature, some don't. So one solution may be to write a pre-model script that creates the index with the online option included (a sketch of which is below). And while there's nothing wrong with this, there is an alternative: deployment contributors. Continue reading “The Alternative to Pre-Model Scripts”
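By way of a hedged sketch, such a pre-model script might look something like the following; the table, column and index names are made up for illustration, and the script is guarded so it is re-runnable:

-- Create the index ONLINE (an Enterprise feature) so the table stays available,
-- but only if it does not already exist.
IF NOT EXISTS (SELECT 1 FROM sys.indexes
               WHERE name = 'IX_Orders_CustomerId'
                 AND object_id = OBJECT_ID('dbo.Orders'))
BEGIN
    CREATE NONCLUSTERED INDEX IX_Orders_CustomerId
        ON dbo.Orders (CustomerId)
        WITH (ONLINE = ON);
END;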

SSDT and Pre-Model Scripts

Hello!

Earlier this week I posted about the need to create the database as a pre-model task: that is, creating the database before we run sqlpackage/DacFx against the server, so that the generated script itself does not have to create the database.

Going a step further than this, pre-deploy scripts in SSDT are executed before the main diff script is executed, but not before the diff script is generated. This is an important fact to understand. If you want scripts to be executed before the diff is generated, then you need to execute pre-model scripts.

How you go about this is up to you, but there are a few rules that must be followed:

Idempotent: a big word, borrowed from maths, but don't let that put you off. What it means in this context is that a script can be re-run and the same result happens. So, in the context of altering a primary key, the pseudo-code would look something like this:

if database exists then
    if table exists then
        if primary key name eq “oldname” then
            drop primary key then
            add primary key with “newname”

and then you add primary key “newname” to the solution. That way the drop-and-add part of the script will only ever be run once, or again only against an older version of the database that still has the old primary key. The “newname” of the primary key guarantees that this will be the case.
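As a rough T-SQL sketch of that pseudo-code (the table, column and constraint names are placeholders, and the “database exists” check is assumed to be handled by connecting to the right database):

-- Re-runnable: only drops and re-adds the primary key if the old one still exists.
IF EXISTS (SELECT 1
           FROM sys.key_constraints
           WHERE name = 'PK_MyTable_OldName'
             AND parent_object_id = OBJECT_ID('dbo.MyTable'))
BEGIN
    ALTER TABLE dbo.MyTable DROP CONSTRAINT PK_MyTable_OldName;

    ALTER TABLE dbo.MyTable ADD CONSTRAINT PK_MyTable_NewName
        PRIMARY KEY CLUSTERED (Id);
END;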

Stored in the Solution: within database solutions you can add scripts that are part of the solution but are not executed:

[screenshot]

If you copy the file to the output, you'll be able to add the required files to your nuspec file and get them deployed as part of the NuGet package. Neat!

[screenshot: the nuspec file including the pre-model scripts]

So if you have the scripts in the solution and they are re-runnable, you can be confident that they will run against older versions of the database. It also keeps a level of audit and history of the database.

Now, how to run the scripts. As with the create database script, SMO is probably the easiest way to do this. You could use SQLCMD, or Invoke-SqlCmd, but it’s entirely up to you.

Now if you want to use SMO, and you are deploying using Octopus Deploy, then you're in (some) luck: there are scripts available in the Octopus Library that do something like this, but you may have to alter them to execute a folder full of SQL scripts. Fortunately, I have already completed this, and I will add my script to the GitHub solution so that it is available on the Library. When it is available, assuming it meets the quality control, I'll blog about how to use it.

edit: it is now available on the Octopus Library. There is also a part two on how to use the step template.

Dude, Where’s My Gold Code?

Let me tell you about a conversation I had yesterday with some developers in a team. They needed help with their database CI/CD process, and so I wanted to see the database solutions they were using.

Dev1: We don’t have solutions at the moment.

Me: OK, how come?

Dev1: Because we are making so many changes right now, there seemed to be little point in creating solutions, so we’re just using scripts.

[At this point, I bite my tongue and choose not to reply with ‘If you’re continuously making changes, wouldn’t it make sense to continuously integrate and continuously deploy the changes? I mean, you’d hardly be taxing the definition of the word “continuously” now, are you…’]

Me: So how are you making changes?

Dev1: Oh well, we’re just using scripts to deploy the changes now.

Me: And where are the scripts?

Dev1: They’re in the wiki.

Dev2: No they’re not, because we don’t put anything in the wiki that changes regularly.

Dev1: Oh. So where are they?

[Dev2 stares blankly].

Dev1: So why do you want solutions?

Me: To help you build a release pipeline.

Dev2: But we’re making lots of changes regularly.

[Again, me, keeping cool in spite of the blindingly obvious contradiction presented to me.]

Me: But let’s get something built to help you make those changes regularly.

Dev1: OK, I’ll get something together by the end of the week. In the meantime we’re too busy making changes to get this together…

Sound familiar? I’m sure many of you have been in a situation like this before. It’s not the first time I’ve had a conversation like this either.

Now I could go on about how the business doesn’t buy into maintenance, or Continuous Integration, or Continuous Delivery. Or I could talk about how CI/CD needs to be pushed by the devs, and how the associated tasks require items in the sprint to make sure that they are completed and audited. And I could talk about how the “build once, deploy many” process reduces the number of bugs and speeds up the deployment process. And you know, it’s just great that, wearing my DBA hat, I have had to write wacky scripts because dev teams weren’t deploying to environments in a controlled manner, when I’d really rather be focusing on infrastructure. All these points are well documented.

But these points are consequences of implementing (or not) Continuous Delivery and Continuous Deployment. The most fundamental part of adopting CI/CD is getting your code into some sort of source control. It all starts there: getting the code into source control, writing pre- and post-deployment scripts that are re-runnable, developers checking in changes, building the code, and taking the resulting artefact (in SSDT terms, a dacpac) and deploying that dacpac to your database. Then taking the same dacpac and deploying it again and again, all the way up to production. This process can be achieved in an hour. Actually, in 55 minutes! And AdventureWorks can be tough to automate! OK, it may take longer than that, but getting a repository of your code that can be audited, rolled back, tagged, branched, deployed, merged etc. is the first step towards achieving Database DevOps.