In a previous post I discussed Halstead Metrics as a good way to gain insight into the complexity of a program. Vocabulary and Length in particular get to the heart of what’s important in a program, but there are other calculations based on the operator and operand counts.

Here are other Halstead metrics with their commonly used descriptions:

  • Computed length (N^): A prediction of program length, computed as N^ = n1 × log2(n1) + n2 × log2(n2).
  • Volume: A measure of the size of a piece of code, computed as V = N × log2(n), compared to the potential volume V* = n* × log2(n*), where n* is the size of the potential vocabulary.
  • Level: The ratio of potential volume to actual volume, L = V* / V, a measure of how abstractly the program is written.
  • Intelligence content: The total content of your program, computed by multiplying program level and volume, I = L × V.
  • Difficulty: Computed as D = (n1 / 2) × (N2 / n2). The difficulty measure is related to how hard the program is to write or understand, such as when doing a code review.
  • Time required to program: Computed as T = E / 18, where E is the effort value described under Maintenance effort. This is an estimate of how long it would take to code the program, in seconds.
  • Number of delivered bugs: B = V / 3000 is the most commonly used formula.
  • Maintenance effort: Computed as E = D × V (Difficulty times Volume) or, equivalently, as the ratio of Volume to Level, since Difficulty is the inverse of Level. It is a measure of the effort required to maintain the program, based on program clarity. The lower the number, the easier the program will be to maintain. This is a relative number that can be compared to other programs to help you determine which would require more effort to maintain.
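As a rough illustration, the formulas above can be sketched in a few lines of Python. This is a minimal sketch, not a reference implementation: it assumes the usual Halstead definitions (n1 and n2 are the distinct operator and operand counts, N1 and N2 the total occurrences), it assumes Level is estimated as 1/D because potential volume cannot be derived from the counts alone, and the names `HalsteadCounts` and `halstead_metrics` are made up for this example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class HalsteadCounts:
    n1: int  # distinct operators
    n2: int  # distinct operands
    N1: int  # total operator occurrences
    N2: int  # total operand occurrences


def halstead_metrics(c: HalsteadCounts) -> dict:
    """Derive the metrics listed above from the four basic counts."""
    if c.n1 <= 0 or c.n2 <= 0:
        raise ValueError("n1 and n2 must be positive")

    vocabulary = c.n1 + c.n2                      # n
    length = c.N1 + c.N2                          # N
    computed_length = c.n1 * math.log2(c.n1) + c.n2 * math.log2(c.n2)
    volume = length * math.log2(vocabulary)       # V = N * log2(n)
    difficulty = (c.n1 / 2) * (c.N2 / c.n2)       # D
    level = 1 / difficulty                        # assumption: estimated level
    effort = difficulty * volume                  # E = D * V

    return {
        "vocabulary": vocabulary,
        "length": length,
        "computed_length": computed_length,
        "volume": volume,
        "level": level,
        "intelligence_content": level * volume,   # I = L * V
        "difficulty": difficulty,
        "effort": effort,
        "time_seconds": effort / 18,              # T = E / 18
        "delivered_bugs": volume / 3000,          # B = V / 3000
    }


# Hypothetical counts, chosen only to exercise the formulas.
metrics = halstead_metrics(HalsteadCounts(n1=10, n2=20, N1=50, N2=70))
for name, value in metrics.items():
    print(f"{name:22s} {value:12.2f}")
```

Every value here comes from the same four counts, which is why all of these metrics move together as a program grows.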

I feel the first five of these are for studying programs in the abstract. I’ve struggled to find a real use for them. They might be useful, but they are so abstract that most people never put them into practice.

The “Difficulty” metric is appealing. So is “Time required to program” and especially “Number of delivered bugs.” But I’ve never found anything in my own experience or research to show that these are reliable. The concept is that code will contain bugs, and the formula assumes a certain number of bugs for a given Volume. So, as the size of the program grows, so does the number of potential bugs. But that’s just it: it’s an assumption of potential bugs in new code. Number of delivered bugs makes no allowance for the skill of the developer or the type of logic involved. Even assuming that the number is correct, how would you use it? Would you compare that count against the number of bugs you found and keep testing until they matched, i.e. until you found them all? But if you did that, fixed the bugs, and were able to keep the Volume similar, the number of delivered bugs would remain the same.
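A small sketch makes that last point concrete. The Volume figures below are invented for illustration, and `delivered_bugs` is just a hypothetical helper wrapping the Volume/3000 formula:

```python
def delivered_bugs(volume: float) -> float:
    """Halstead's delivered-bugs estimate: Volume / 3000."""
    return volume / 3000


# Hypothetical Volume values: the second assumes the bug fixes
# added only a little code, so Volume barely moved.
volume_at_release = 45_000.0
volume_after_fixes = 45_300.0

estimate_at_release = delivered_bugs(volume_at_release)
estimate_after_fixes = delivered_bugs(volume_after_fixes)

print(f"estimate at release:  {estimate_at_release:.1f}")
print(f"estimate after fixes: {estimate_after_fixes:.1f}")
```

Because the formula only sees Volume, the estimate stays at roughly 15 no matter how many bugs were actually found and fixed.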

I feel these metrics are best used as rough indicators for new code, which may help guide your test plans. But since they are all related to size, other metrics would serve that purpose too, without promising something like a definitive number of bugs, which could cause confusion. This leads me to a point I’d like to make: before you pursue any metric, study it and determine how you would really use it. Ask yourself, does the metric really provide what I need? Will everyone understand what it really shows so it isn’t misinterpreted?

As I stated in my earlier post, I have found that while there is a strong correlation between the Halstead metrics and Source Lines of Code, there are some nuances captured in the Vocabulary and Length that can help you understand programs better. The same can be said for the “Maintenance Effort.”

The Maintenance Effort proves to be the most useful of these other Halstead metrics because it provides a level of granularity that helps you spread out the programs in your portfolio based on how difficult or easy they will be to maintain. I’ve found that a value over 4.5 million marks very large programs, while most programs fall in the range of 1.5 to 4.5 million. Smaller programs fall below 1.5 million. I’ve worked with QA teams that used Maintenance Effort to guide testing efforts. Programs in the highest range are targeted for more testing than those in the lowest.

Besides using Vocabulary to know “the number of things you need to know,” you can supplement it with the Maintenance Effort to help rank the programs in your portfolio for change estimation and testing efforts. Both are great for comparing two programs. But remember that these metrics are based on size, with the assumption that an increase in size brings an increase in complexity. They do not account for how the logic is actually coded. For example, they have no insight into the number of code paths. You could find two programs with very close Maintenance Effort numbers whose internal structures are very different. To gain that insight you will need different metrics, which I will begin to explore in my next post.

For more information on the Halstead Metrics: Halstead, Maurice H. (1977). Elements of Software Science. Amsterdam: Elsevier North-Holland, Inc. ISBN 0-444-00205-7.