November 2008 Entries
A cautionary tale of the MSBuild Task

A colleague ran into a very interesting issue whilst working on his teams release build script. He was trying to get the custom built version number setup before a clean of the old source had taken place (so the version number could become the build number). To achieve this he was calling a sub script that (for use by a different target) setup a list of all the AssemblyInfo.cs files that needed to be updated with the version number. Later on in the build (after the clean / get) he reran this same script (for a different target) to write the version numbers into the AssembyInfo file. At this point the script would fail as one of the files wasn't there (a remnant of a previous packaging process). Much head scratching ensued as we tried to figure out how the heck an item list in a different script, invoked with MSBuild was populated from the first time it ran...

A theory was constructed that something was being cached, or data was moving in a way we didn't expect. Either the MSBuild task held onto scripts, or the item list was passed back to the calling script. The later theory was quickly discounted, which just left us with the MSBuild task caching script instances.

After a bit of digging around in the Microsoft.Build.Tasks assembly I managed to reach down into where it was actually executing MSBuild... not (as I'd thought) by spinning up a seperate process, but by calling back into it's own build engine. Fair enough, saves having to start a new process (although it explained some unusual file locking we'd seen as well). Further digging into the build engine found the following:

 

   1:  Project project = null;
   2:  Project existingProject = null;
   3:  Project project3 = (Project) this.projectsLoadedByHost[info.FullName];
   4:  if (project3 != null)
   5:  {
   6:     existingProject = project3;
   7:     if (project3.GlobalProperties.IsEquivalent(globalProperties))
   8:     {
   9:        project = project3;
  10:     }
  11:  }

Scripts are held firstly by the name of the script, and secondly BY THE PROPERTIES THEY ARE PASSED...

So recalling the same script with a different target but the same properties doesn't invoke a new instance of the script. No, the old one is reused and since Items are populated only when the script is first invoked the list of AssemblyInfo.cs files was as it was BEFORE the clean / get, not (as we wanted) afterwards.

In this instance the simple fix was to add an extra property to each of the two calls to ensure that the property lists were different ... and voila! Problem solved.

The dangers of unprotected PEX - catching an STD

And that STD is a Secret Testing Defect.

If you've not seen PEX yet it's a tool from Microsoft Research that automates white box testing. That is to say you write your code, point PEX at it and away it goes, generating test values for your method that will exercise your code to the fullest. These test values are then written out as full unit tests and can be included in your regular testing regime. But how does PEX find these values? Extispicy? Dousing rods? No, it runs your code... fair enough really, if you've not used a formal specification language then the only definition of your code available to PEX is the code itself. Reread that, the only definition of your code available to PEX is the code itself. So what if your code is wrong (and trust me it will be)? PEX is writing tests based on what your code does, or appears to do. So if there's a mistake in your code, there'll be a mistake in what PEX expects. Even worse, PEX might make assumptions about your code on the basis of how it understands it.

Take the following single line method:

   1:          public static decimal ConvertCelsiusToFahrenheit(int celsius)
   2:          {
   3:              return Round(((9 / 5) * celsius) + 32);
   4:          }

If run PEX explorations against this it generates us the following set of values (I say set, I mean value):

PEX_1

Fair enough really, it's one line of code, and a linear function, and it's right. Zero degrees Celsius does convert to 32 degrees Fahrenheit. That set of inputs doesn't break our code. But there is an input that does, well more than one input really, in fact any value of celsius other than 0 gives an incorrect calculation. PEX has picked the first value it can use, made an assumption about how our code works and in doing so missed the one error that's crept into the code.

Were I writing the tests I'd have written 3 tests with the following values of Celsius -1, 0 and 1. That would've spotted the mistake (have you spotted it yet? I'll post it at the end) and given me much more certainty about the correctness of my method.

I can see some of the value in using tools like PEX, but I'd much rather that people wrote their own damned tests. You know how it's supposed to work (you wrote it after all), so you should be able to test it at a unit level, and if not you're doing something wrong.

The mistake in the method was in the (9 / 5), replacing it with (9m / 5m) solves our problem.