A great year 2013 to all!
When we present our PACBASE-migration solution, people readily understand how restructuring the generated code is essentially independent from how this code is generated in the first place. Our technology can be (and actually is) applied to other COBOL code generators, such as MetaCOBOL, DeltaCOBOL or COOL:Gen, or even to manually written and maintained COBOL code. At the end of the day, we recognize COBOL constructs and replace them with equivalent but better structured COBOL constructs. How these constructs are generated does not come into the picture.
This is true in general, but there are cases where more tailored approaches must be taken. This note describes one of these PACBASE-specific transformations in detail, explaining how it can make a serious difference in maintainability and structure of the resulting code.
When dealing with breaks (I assume this is how the French term ruptures is translated in the English PACBASE documentation, but I don’t know this for a fact. Ruptures denote the points in a report, or état, where a grouping value changes. If any English-speaking reader of this blog could enlighten me on this minor terminology issue, I would be infinitely grateful), PACBASE generates code such as:
    F22HA.
        MOVE ZERO TO HA-DE.
        IF HA-FI = '1' GO TO F22HA-1.
        IF HA00-CPROD NOT = 1-HA00-CPROD GO TO F22HA-1.
        IF HA00-NARTI NOT = 1-HA00-NARTI GO TO F22HA-2.
        IF HA00-CTCOD NOT = 1-HA00-CTCOD GO TO F22HA-3.
        GO TO F22HA-FN.
    F22HA-1.
        MOVE 1 TO HA-DE1.
    F22HA-2.
        MOVE 1 TO HA-DE2.
    F22HA-3.
        MOVE 1 TO HA-DE3.
    F22HA-FN.
        EXIT.
where a scaffolding of conditions sets one or more flags to 1, depending on whether a given field in the current record (HA00-*) differs from the same field in the previous one (1-HA00-*).
From the code depicted above, there is very little one can improve directly, as it violates virtually all the rules of structured programming (this should not be understood as criticism: it is generated code, and whether it obeys the rules of maintainability is beside the point). The worst offenders are the fall-throughs from F22HA-1 to F22HA-2, then to F22HA-3, which resist virtually all attempts at restructuring.
An additional brick to the transformation engine
It is tempting to write a transformation pattern that addresses this situation globally, producing equivalent but more structured code in a single step. However, experience has shown over and over again that it is simpler, easier, safer and in general more useful to add a minimal transformation that turns this construct into something another set of transformations can further improve. It is a classic case of the divide-and-conquer approach. Such minimal transformations are easier to write, easier to debug, and have fewer bugs than larger ones.
In this case, an apparently mundane transformation does the trick. It is merely a matter of moving a MOVE statement into the two branches of a test when a given combination of statements has been recognized:
    IF cond1
        (1)
        IF cond2 THEN
            GO TO next-label
        END-IF
    ELSE
        (2)
    END-IF.
    MOVE 1 TO VAR1.
    next-label.
        (3)

becomes:

    IF cond1
        (1)
        IF cond2 THEN
            GO TO next-label
        END-IF
        MOVE 1 TO VAR1
    ELSE
        (2)
        MOVE 1 TO VAR1
    END-IF.
    next-label.
        (3)
It is a very counter-intuitive transformation.
First, it is not even clear how it actually addresses the restructuring of the break constructs as generated by PACBASE. I’m afraid you are going to have to trust me on that one, as the complete story is involved and does not make for an exciting read. Let’s just say that it does address this issue by being applied after a number of transformations that have already altered the code’s structure significantly.
It is also counter-intuitive in the sense that instead of improving quality, it actually makes things worse, by duplicating the MOVE 1 TO VAR1 statement! It is one of the (few) cases in our restructuring facility where we allow ourselves to do something we know is sub-optimal, hoping it will be more than compensated for by further transformations.
Stated more formally, we allow ourselves to move away from a local optimum because we know that it allows us to get closer to a more global one. Metaphorically, it is just like climbing down from a modest hill to climb a higher one and get a better view of the valley.
In any case, the worsening of the quality is minor. This pattern is very peculiar and will hardly ever match outside of PACBASE-generated code, which further minimizes the negative impact. And just to be on the safe side, we deactivate it altogether when processing non-PACBASE code, as we can’t then be sure that this gamble (temporary worsening to allow for later improvement) will pay off.
Applying this pattern a number of times ends up yielding something such as:
    IF HA-FI = '1' OR HA00-CPROD NOT = 1-HA00-CPROD THEN
        MOVE 1 TO HA-DE1 HA-DE2 HA-DE3
    ELSE
        IF HA00-NARTI NOT = 1-HA00-NARTI THEN
            MOVE 1 TO HA-DE2 HA-DE3
        ELSE
            IF HA00-CTCOD NOT = 1-HA00-CTCOD THEN
                MOVE 1 TO HA-DE3
            END-IF
        END-IF
    END-IF
which can then be turned into an EVALUATE statement, flattening the nested conditions and making the true nature of these breaks very clear:
    EVALUATE TRUE
        WHEN HA-FI = '1' OR HA00-CPROD NOT = 1-HA00-CPROD
            MOVE 1 TO HA-DE1 HA-DE2 HA-DE3
        WHEN HA00-NARTI NOT = 1-HA00-NARTI
            MOVE 1 TO HA-DE2 HA-DE3
        WHEN HA00-CTCOD NOT = 1-HA00-CTCOD
            MOVE 1 TO HA-DE3
    END-EVALUATE
This example shows how a less than optimal transformation ends up enabling far more powerful ones, resulting in code that is arguably better structured and easier to understand than the original.
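As a sanity check on the claim of equivalence, the two forms can be run side by side. What follows is a minimal sketch, not part of our tooling. The program name, the WS-* items, the paragraphs CHECK-ONE-CASE and BREAKS-RESTRUCTURED and all PIC clauses are assumptions made for the example, as is the layout of HA-DE as a group holding the three flags. The EVALUATE branch sets the flags the way the fall-throughs in F22HA do: a break at one level also raises every level below it.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. BRKCHECK.
      * Hypothetical harness: runs the GO TO version and the
      * EVALUATE version on every input combination and compares
      * the resulting flags.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  HA-FI               PIC X.
      * Assumed layout: HA-DE groups the three break flags.
       01  HA-DE.
           05  HA-DE1          PIC 9.
           05  HA-DE2          PIC 9.
           05  HA-DE3          PIC 9.
      * Current record: each field takes the values 0 and 1.
       01  HA00.
           05  HA00-CPROD      PIC 9.
           05  HA00-NARTI      PIC 9.
           05  HA00-CTCOD      PIC 9.
      * Previous record: fixed at 0, so 1 means "field changed".
       01  1-HA00.
           05  1-HA00-CPROD    PIC 9 VALUE 0.
           05  1-HA00-NARTI    PIC 9 VALUE 0.
           05  1-HA00-CTCOD    PIC 9 VALUE 0.
       01  WS-FI-IDX           PIC 9.
       01  WS-SAVED-DE         PIC X(3).
       01  WS-MISMATCHES       PIC 99 VALUE 0.
       PROCEDURE DIVISION.
       MAIN-PARA.
           PERFORM CHECK-ONE-CASE
               VARYING WS-FI-IDX  FROM 0 BY 1 UNTIL WS-FI-IDX  > 1
               AFTER   HA00-CPROD FROM 0 BY 1 UNTIL HA00-CPROD > 1
               AFTER   HA00-NARTI FROM 0 BY 1 UNTIL HA00-NARTI > 1
               AFTER   HA00-CTCOD FROM 0 BY 1 UNTIL HA00-CTCOD > 1
           IF WS-MISMATCHES = 0
               DISPLAY "NO MISMATCH FOUND"
           ELSE
               DISPLAY "MISMATCHES: " WS-MISMATCHES
           END-IF
           STOP RUN.
       CHECK-ONE-CASE.
           MOVE WS-FI-IDX TO HA-FI
      * Original form first; keep its flags for the comparison.
           PERFORM F22HA THRU F22HA-FN
           MOVE HA-DE TO WS-SAVED-DE
           PERFORM BREAKS-RESTRUCTURED
           IF HA-DE NOT = WS-SAVED-DE
               ADD 1 TO WS-MISMATCHES
               DISPLAY "MISMATCH FI=" HA-FI
                   " OLD=" WS-SAVED-DE " NEW=" HA-DE
           END-IF.
      * The paragraphs as generated by PACBASE.
       F22HA.
           MOVE ZERO TO HA-DE.
           IF HA-FI = '1' GO TO F22HA-1.
           IF HA00-CPROD NOT = 1-HA00-CPROD GO TO F22HA-1.
           IF HA00-NARTI NOT = 1-HA00-NARTI GO TO F22HA-2.
           IF HA00-CTCOD NOT = 1-HA00-CTCOD GO TO F22HA-3.
           GO TO F22HA-FN.
       F22HA-1.
           MOVE 1 TO HA-DE1.
       F22HA-2.
           MOVE 1 TO HA-DE2.
       F22HA-3.
           MOVE 1 TO HA-DE3.
       F22HA-FN.
           EXIT.
      * The restructured form.
       BREAKS-RESTRUCTURED.
           MOVE ZERO TO HA-DE
           EVALUATE TRUE
               WHEN HA-FI = '1' OR HA00-CPROD NOT = 1-HA00-CPROD
                   MOVE 1 TO HA-DE1 HA-DE2 HA-DE3
               WHEN HA00-NARTI NOT = 1-HA00-NARTI
                   MOVE 1 TO HA-DE2 HA-DE3
               WHEN HA00-CTCOD NOT = 1-HA00-CTCOD
                   MOVE 1 TO HA-DE3
           END-EVALUATE.
```

The harness is written in fixed format and walks through the sixteen combinations of end-of-file flag and field changes. Exhaustive checking is only practical here because the paragraph depends on four yes/no conditions.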
What’s in a name?
Structure is not everything, though. Admittedly, the resulting code shown above is much better than the original: the intention is much clearer, and the code is shorter and technically sound. But this does not mean it cannot be improved any further. For instance, the HA-DE* variables are generated by PACBASE, and hold little if any meaning for the COBOL developer responsible for maintaining the restructured code after migration.
Another transformation, even more PACBASE-specific than the one presented above, aims at renaming these variables into something more useful. It recognizes a pattern such as
    IF HA00-NARTI NOT = 1-HA00-NARTI THEN
        MOVE 1 TO HA-DE2 HA-DE3
and recognizes it as comparing a field named NARTI in HA00, in the specific context of a report break. If HA-DE2 has been declared as such control variables commonly are in PACBASE, it is renamed to HA-NARTI. This emphasizes the fact that this control variable is attached to the NARTI field in the file at hand.
This renaming operation yields code such as:
    EVALUATE TRUE
        WHEN HA-FI = '1' OR HA00-CPROD NOT = 1-HA00-CPROD
            MOVE 1 TO HA-CPROD HA-NARTI HA-CTCOD
        WHEN HA00-NARTI NOT = 1-HA00-NARTI
            MOVE 1 TO HA-NARTI HA-CTCOD
        WHEN HA00-CTCOD NOT = 1-HA00-CTCOD
            MOVE 1 TO HA-CTCOD
    END-EVALUATE
but the true added value of this facility lies in the way the
HA-DE* variables are replaced everywhere in the program, not just in the fragment shown above. The intention gets clearer when one tests for the level of a given break.
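To give an idea of what such a test looks like after renaming, here is a small hypothetical sketch. Only the HA-* flag names come from the code above. The program name, the PIC clauses, the initial values, the two paragraphs and their messages are invented for the illustration, and the reading of NARTI as an article number and CPROD as a product code is an assumption.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. BRKNAMES.
      * Hypothetical sketch: shows how a test on a break level
      * reads once the HA-DE* flags carry the field names.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Flags preset as if the article had changed but the
      * product had not.
       01  HA-CPROD            PIC 9 VALUE 0.
       01  HA-NARTI            PIC 9 VALUE 1.
       01  HA-CTCOD            PIC 9 VALUE 1.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Before renaming, this test would read IF HA-DE2 = 1.
           IF HA-NARTI = 1
               PERFORM ARTICLE-BREAK
           END-IF
      * Before renaming, this test would read IF HA-DE1 = 1.
           IF HA-CPROD = 1
               PERFORM PRODUCT-BREAK
           END-IF
           STOP RUN.
       ARTICLE-BREAK.
           DISPLAY "ARTICLE CHANGED".
       PRODUCT-BREAK.
           DISPLAY "PRODUCT CHANGED".
```

With the renamed flags, the reader no longer has to remember which break level the digit in HA-DE2 stands for.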
Such a renaming transformation is technically much simpler than the restructuring described earlier in this post, but it is the extra mile that makes the resulting code as close as possible to what a human programmer would have written in the first place, allowing for efficient maintenance in the future.
Well, I guess that’s it for now.
Have a great day!