Chapter 31 Generating Code

Introduction
Choosing the Input File
Parsing Records
Writing the C Header Files
Writing the Encoder Source File
Writing the Decoder
Putting It Together
Summary

This chapter introduces you to using Perl to solve real-world problems. The ideas you'll gain from this chapter pertain to applying Perl to solve coding problems. The sample problem chosen is complicated enough to be encountered during your programming endeavors. Remember to concentrate on how the problem is solved, not on for what the final solution is being used.

Introduction

The main concepts introduced in this chapter have nothing to do with navigation and seismic fields. However, the problem that I address in this chapter is related to writing a seismic navigation record parser. This parser was crucial in getting a delayed project up and running in the field. Rather than spend days writing a parser in C, we were up in a few hours of coding effort. Plus the concepts gained during this experience helped me write parsers for other data formats for the same project with just as little effort.

Basically, I'll be covering ways to use Perl to generate C code for a parser. Instead of writing another phone book manager or a database for music or home inventory records, it's probably better that we deal with a real-world example. Perhaps after reading this chapter and seeing how this problem was tackled, you can draw parallel solutions for your current problems.

The sample project involves writing a parser to read FORTRAN-based records for a seismic survey. Most seismic navigation data, believe it or not, is based on an archaic standard based on the use of FORTRAN programs to read it. The standard is known as the UKOOA P2/86 standard. Sometimes this format is simply referred to as P286. This format was developed for old FORTRAN programs, and a huge amount of data still exists in this format. Now with the wide acceptance of C for most of the code being developed for graphical interfaces and numeric processing, it's only natural to look for ways to read these files without requiring the FORTRAN executable.

Of course, the decode and encode functions for P286 had to be done yesterday. Given the options of getting another job or writing 100+ functions and structure declarations, I tried tackling this problem with the lazy programmer's approach: Let Perl do the grunt work.

Here is what we were dealing with. Each record of data in a P286 data file is exactly 80 characters long with a trailing newline or null character. Fields within a record are based on character column positions and lengths in the record. Columns are numbered from 1 up. Because white spaces and commas can be part of the data, there are no "field separators" as such.

The first few characters of a record identify the type of data in the record. For example, records begin as H0001, H0002, and so on, with the rest of the characters as fields within the record. There are 52 such records in all. Records in a file are not in sequential order. The first identifying characters are between three and five characters long. The only guarantees are that there will always be only one record per line and that there will be no blank lines.

By reading the specification I discovered these things that would make the coding process programmable:

The only variables that could be defined in all records were either integers, doubles, or character strings. This implied a closed set of variables to check for.
The specification of all the headers was explicit about where each data item began in a line and where it ended. The lack of delimiters between fields was no longer a problem since a line could be chopped into substrings quite easily.
These were text files instead of packed binary files. This made debugging a little easier because no special routines were required to display the data.

Voilà! The two important criteria (consistency and working with a closed set) made it possible to do the encoding and decoding functions manually. By using a Perl program to do the coding for me, I reduced the possibility of errors. Any one error would be propagated to all functions and would be easy to catch and fix.

The most obvious question was, Why not use Perl to do all the decoding and encoding? The encoding and decoding routines were to be incorporated into a C program running on different platforms running DOS or another lower operating system. Embedding Perl within the C program would involve installing and maintaining Perl on these platforms. Most of the platforms the final code would run on did not support Perl.

Choosing the Input File

After reading the specification, I extracted all the header declarations into one file called P286hdrs. The specification listed the contents of headers in plain text in between two keywords, RECORD and END. The specifications were not consistent enough to be extracted using a program. Actually, I extracted most of the lines defining the format using an awk script:

/RECORD/,/^ENDREC$/ { print ; }

This script removes all the lines between the lines containing RECORD and END. I still had to do some editing after extracting all these records to get the correct input format.

The case presented in this example had a text file with the following format:

RECORD TypeOfRecord StringToUse variableName varType startCol endCol [format] variableName varType startCol endCol [format] variableName varType startCol endCol [format] [REPEAT count st1 st2 ... ] variableName varType startCol endCol [format] variableName varType startCol endCol [format] variableName varType startCol endCol [format] ENDREC

The variableName would be the name of a variable in a structure; the VarType would be int, double, or char. The startCol and endCol values defined the locations in the string where the data could be picked up. The first column was still numbered 1 instead of 0. It's easier to increment by 1 in a program than to change so many declarations.

Some blocks of variables in some records were repeated. These were defined after the option REPEAT keyword. The syntax for the REPEAT keyword was this:

REPEAT count st1 st2 ... stN

The st1 to stN are the starting offsets for all the fields that follow the REPEAT word. The count specified the number of times to repeat these blocks.

For example, the following record is interpreted as "Record H0001, with one variable starting at column 29 up to column 80."

RECORD H0001 SurveyType char 29 80 ENDREC

Another example of a record using more than one field is shown here:

RECORD H011 datumId int 5 5 spheroidName char 6 23 datumName char 24 41 semimajorAxis double 42 53 12.3 conversionFactor double 66 77 12.8 inverseFlattening double 66 77 12.7 ENDREC

Note how only the first four characters are relevant in identifying the record. Also, the integers and characters don't have a floating point string, whereas the numbers defining the double type do have a floating point specification of the form: length.decimals. The length is the total number of columns in the number including the decimal point, the decimals portion is the number of digits to the right of the decimal point. For example, 12.8 will occupy 12 character spaces in one column, 3 digits to the left of the decimal point and 8 digits to the right of the decimal point. The format for the floating point number is the same as that for a printf() statement in C.

Another example is a record using REPEAT fields, as shown here:

RECORD E3100 velprop double 6 12 7.2 REPEAT 5 13 26 39 52 65 srcNdx int 13 15 dstNdx int 16 18 slant double 19 25 7.2 ENDREC

In this record type, the block {srcNdx,dstNdx,slant} is repeated at columns 13, 26, 39, 52, and 65. This implies that each of these variables can be interpreted as arrays of five elements each.

The entire file for parsing these records is about 520 lines long. A shorter sample file is shown in Listing 31.1. Note how comment lines in that listing are inserted in this input file with the use of the # character. Actually, any lines could be used for comments as long as the comments are outside the confines of RECORD and ENDREC statements. The reason to use the hash is to maintain some consistency with Perl.

Listing 31.1. The input file.

1 # 2 # Comment lines are permitted in the usual Perl style. 3 # 4 RECORD H0001 5 SurveyType char 29 80 6 ENDREC 7 8 RECORD H0010 9 numPatterns int 6 7 10 sblInUse int 8 8 11 sattInUse int 9 9 12 numVessels int 10 10 13 numDatum int 11 11 14 offsetMode int 12 12 15 ENDREC 16 17 RECORD H011 18 datumId int 5 5 19 spheroidName char 6 23 20 datumName char 24 41 21 semimajorAxis double 42 53 12.3 22 conversionFactor double 66 77 12.8 23 inverseFlattening double 66 77 12.7 24 ENDREC 25 26 RECORD E3100 27 velprop double 6 12 7.2 28 REPEAT 5 13 26 39 52 65 29 srcNdx int 13 15 30 dstNdx int 16 18 31 slant double 19 25 7.2 32 ENDREC

Parsing Records

Now that we have an input file, let's tackle parsing the records within this file. The most likely way to tackle this problem is to generate three files from the input file: one header, one file with all source code for the decoder, and one source file for the encoder.

The pseudocode looks something like this:

open file for input open files for output while (more records) if recognized start of record start structure definitions start encoder function preamble start decoder function preamble if recognized end of record terminate structure definitions terminate encoder function preamble terminate decoder function preamble if within record generate structure variable definitions generate encoder function parsing for variable generate decoder function parsing for variable close all files

Perl has the capability to have more than one file open at once. By running the input file through one parser, you can generate three files simultaneously. All you do is send the output to its respective file. In this case, three files are opened for output: HDRS for the header declarations, EncD for the encoding output, and DECD for the decoding output. The SAFE handle is used to read in the input records from the P286hdrs file. The lines to open the file are

open (SAFE, "P286hdrs") || die "Cannot open Input file $!\n"; open (HDRS, ">P286.h") || die "Cannot open Header $!\n"; open (EncD, ">P286enc.c") || die "Cannot open Encoder $!\n"; open (DECD, ">P286dec.c") || die "Cannot open Decoder $!\n";

After the files are opened, some preamble stuff is required for each source and header file. The calls to the these functions provide the initialization. The contents of each file are destroyed when the files are opened, so you have to initialize each file:

&startHeaderFile(); &startEncoderFile(); &startDecoderFile();

Then, a while loop simply reads in all the input, one record at a time.

After chopping off the terminating newline, the incoming line is examined to see whether there are any comments or if it's a blank line. If either case is true, the line is discarded. Look at the following code:

while (<SAFE>) { chop($_); if (/^#/) { next; } if (/^\s*$/) { next; }

If the line appears to be non-empty, it's split into the @names array and examined for the tokens RECORD, ENDREC, and REPEAT. The default case is to process variable types and generate either structure variable declarations or code for encoding and decoding their values.

If it's a RECORD token, a new declaration is started for a structure. The name of the record is in $rname, with the $rtype as the type of record. Note how the $_ is used twice when splitting the record. The value of $_ is not modified with a call to the split() function nor is any function called that will modify the value of $_. The three functions, &startHeaderRecord($rtype), &startEncoderFunction($rtype), and &startDecoderFunction($rtype) take the P286 record type and generate a header declaration, an encoder function preamble, and a decoder function preamble. We also mark the fact the we are starting a new record by setting a flag $inRecord to 1. Further processing of the incoming line is halted with a call to the next() function. The $repeat flag is set to 0 to start a new record and to stop any previous declarations for any previous records. The fragment of code to start each type of data is shown here:

if (/^RECORD/) { ($rname,$rtype,@rest) = split(' ',$_); &startHeaderRecord($rtype); &startEncoderFunction($rtype); &startDecoderFunction($rtype); $inRecord = 1; $repeat = 0; next; }

The code in while also has to check when the record has ended with the receipt of an ENDREC token. When ENDREC is seen, three functions are called to close up the structure and function declarations started in the RECORD structure. Because we are no longer within a record, the value of $inRecord is set to 0 and the next function is called to skip further processing of this record. The fragment of code to do this cleanup is shown as this:

if (/^ENDREC/) { &stopHeaderRecord($rtype); &closeEncoderFunction(); &closeDecoderFunction(); $inRecord = 0; next; }

The REPEAT block is hit if the word REPEAT is the first word on a new line. Note that in RECORD and ENDREC token recognition, we looked at the start of a new line, whereas with the REPEAT keyword, we look for the REPEAT token after some white spaces from the start of a new line. The offsets are derived in two stages. The first stage gets the number of offsets to work within $count with the split call. The first stage puts the REPEAT line's variables into an array called @allOffsets. The next stage calls the splice() function to extract the subset of items starting from item number 2 in @allOffsets. The @offsets array then has the offsets in a record where the rest of the variables will be repeated in blocks. The next function is called to proceed to the next line of the input file.

The fragment of code to do this is shown here:

if (/^[\s]*REPEAT/) { $repeat = 1; @allOffsets = split(' ',$_); $index = $allOffsets[0]; $count = $allOffsets[1]; print "INDEX = $index, COUNT= $count"; @offsets = splice(@allOffsets,2); next; }

Finally, the default processing begins for a line. The first thing to do before attempting to parse a line is to see whether we are in the middle of a record. Because the input file may contain free-form text, too, in the future, this is a bit of insurance to help prevent any variables from being accidentally declared.

The incoming line is parsed to extract five values into an array. The input string $_ is split on white spaces. The call for this is

($vname,$vtype,$from,$to,$fmt) = split(' ',$_);

There are two ways to process a variable in our case. One is when a variable is by itself and another is when the variable is in a block being repeated. If a block is being repeated, it's easier to simply declare an array and parse into it. Here's the code to handle this part:

if ($inRecord) { ($vname,$vtype,$from,$to,$fmt) = split(' ',$_); if ($repeat == 0) { &makeHeaderItem($vname,$vtype,$from,$to); &encodeVariable($vname,$vtype,$from,$to,$fmt); &decodeVariable($vname,$vtype,$from,$to,$fmt); } else { &makeArrayedItem($vname,$vtype,$count,$from,$to); $offsetLen = $to - $from + 1; $offsetCount = 0; foreach $x (@offsets) { $offsetFrom = $x; $offsetTo = $offsetLen + $x; $offsetName = sprintf "%s[%d]", $vname,$offsetCount; print "Name = $offsetName, COUNT= $count\n"; &encodeVariable($offsetName,$vtype, $offsetFrom,$offsetTo,$fmt); &decodeVariable($offsetName,$vtype, $offsetFrom,$offsetTo,$fmt); $offsetCount++; } } ## end of else clause. } } # of while loop.

The while loop continues to process each line in the input file until all the record definitions have been completed. After the while loop ends, any terminal processing that be must done is completed and all open files are closed:

close (SAFE); &closeHeaderFile(); close (HDRS); close (EncD); close (DECD);

When the program terminates, you should have three files in the directory: P286.h, P286enc.c, and P286dec.c. The headers are declared in the P286.h file, encoding functions are declared in the P286enc.c file, and the decoding functions are declared in P286dec.c. The acid test really is to see if the code compiles. Try these commands-you should see no errors:

gcc -c P286enc.c gcc -c P286dec.c

Alas, we still have to write the code to use these functions. But that's really beyond the scope of this book. What's important to see is that in a few hours or so, we have created the boring part of the application and are now ready to proceed with using these tools.

During this discussion I have glazed over the details of how the header and source files are created in the subroutine calls we made. Let's take a look at the details of how these functions work.

Writing the C Header Files

The first task for generating the header file is to create the preamble to the include file being created in the file that is pointed to by the HDRS file handle. The code to do this is as follows:

sub startHeaderFile() { print ( HDRS "#ifndef P286_HDRS\n", "#define P286_HDRS 1\n", "#define STRPTR char *\n"); }

When the file is closed, you'll want to put in an #endif statement to allow multiple inclusions of the header file. This is done with a call to the subroutine sHeaderFile():

sub closeHeaderFile() { printf HDRS "\n#endif\n"; }

The empty pair of parentheses in the function declaration is used in Perl 5.002 or later to define a function prototype that allows for no input parameters. When a RECORD header is received, it generates two items: a #define token for a header number and the preamble for the structure to use. The token is helpful if you want to create a parser that does a switch() statement on a type of structure. To make sure that each token has a unique value, a counter is kept in $recordCounter for use in assigning a record type a unique value. The code to do this is shown here:

sub startHeaderRecord { my ($name) = @_; printf HDRS "\n#define P286_%s_RECTYPE %d ", $name, $recordCounter++; printf HDRS "\n\ntypedef struct P286_%s_type {", $name; }

By assigning the @_ array to my($name) we are actually permitting more than one argument into the function even though only the first argument is used. The contents of the @_ array are not altered in this case. Using a command like my $name = shift @_ would achieve the same purpose but would also alter the contents of the @_ array.

Each structure definition being created has to be stopped. Two variable types are constructed: P286_HEADERNAME_TYPE and a pointer type to the structure *P286_HEADERNAME_PTR. The code to do this is shown here:

sub stopHeaderRecord($name) { my $name = shift; my $ntype = "P286_" . $name . "_TYPE"; my $nptr = "*P286_" . $name . "_PTR;"; print HDRS "\n}" . uc($ntype) . "," . uc($nptr) ." \n"; }

Note the syntax used for the function prototype shown previously. The $name variable declaration in the argument list is only valid in Perl 5.002 or later. Formal parameter lists to subroutines in Perl are not completely supported in Perl as we go to print. It does not hurt to be prepared for the future by including formal parameter lists if they do not affect the underlying code in the subroutine itself. If you want to force Perl to take only one parameter into this subroutine, you can also declare this as

sub stopHeaderRecord($) { … }

The dollar sign in the parentheses will be used by the Perl interpreter as an indicator that only one parameter is allowed into the subroutine.

For a non-arrayed item, three variable types are created: a char string of fixed length, an int, and a double. The type of variable to generate a declaration for is passed in as a parameter to the makeHeaderItem function. Comments are also generated in the header file pointed to by the HDRS file handle to indicate what columns the data points to. These comments serve as a cross- reference for when you are debugging the generated code:

sub makeHeaderItem { my($vname,$vtype,$from,$to) = @_; my $len; if ($vtype eq 'char') { $len = $to - $from + 2 ; printf HDRS "\n %s %s[%d]\; \/* %d %d *\/ ", $vtype, $vname, $len, $from, $to; } else { printf HDRS "\n %s %s\; \/* %d %d *\/", $vtype, $vname, $from, $to; } }

If the variable being generated is in a REPEAT block, it's declared as an array with a call to the MakeArrayedItem() function. The call to generate this arrayed item is

sub makeArrayedItem { my ($vname,$vtype,$count,$from,$to) = @_; printf HDRS "\n %s %s[%d]; \/* from %d %d *\/ ", $vtype, $vname, $count, $from, $to; }

This code generates the P286.h headers file. Now let's look at the files that will contain the encode and decode functions.

Writing the Encoder Source File

To create the C source file to create encoder for the data file, use the EncD file handle. The subroutine startEncoderFile() starts the preamble for the file, which includes two items. The first is the call to include the header file, which is also being generated. The second is to write out the code to a function that pads spaces to the right of an incoming string to make the length equal to 80 characters plus a null character. The code to perform this preamble is shown here:

sub startEncoderFile { print (EncD "\/*** C source file to encode records.", "\nDon't edit this file\n ", "*/\n", '#include "p286.h" ', "\n", "\n/* The incoming buffer must be 81 chars! */", "\nvoid padTo80(STRPTR buffer)\n{\n", "int i,ln;\n", "ln = strlen(buffer);\n", "for(i=ln;i<80;i++) buffer[i] = ' '; \n ", "buffer[81] = 0; /* NULL terminate the string*/", "\n} /* end of padding function */\n", "\n"); }

As each new RECORD type is encountered in the input file, its corresponding encoding function header is created in the output file. A function header is created that takes a string to put an encoded record in and a pointer to a structure to unpack. Because the name of the record being parsed is passed into the code generation function, it's easy to derive the pointer name for it: P286_HEADERNAME_PTR. The code to accomplish this is shown here:

sub startEncoderFunction { my($vname) = @_; # Pick up name of record. print (EncD "\n/*: ", "\n** Generated by Perl script -- Avoid editing\n*/\n"); printf EncD "void encode_%s_type(STRPTR buffer,P286_%s_PTR sp)\n{", $vname, $vname; print (EncD "\nSTRPTR ncp; \n", "\nSTRPTR cp; \n", "register int i; \n", "char tempbuffer[80];\n"); }

When an ENDREC line is read, the function has to be closed. Unmatched RECORD and ENDREC lines in the input file will cause bad, uncompilable code to be generated. The cleanup at the end of the encode function is done by adding a call the padding function, padTo80, and printing out the ending curly brace:

sub closeEncoderFunction { print (EncD "\npadTo80(buffer);", "\n} /* End of encoding function */ \n"); }

As with structure declarations, two types of variables have to be parsed. One is a non-arrayed element and the other is an arrayed element. However, within the original loop the $vname being passed into this function is already set up as a variable or a member of an array so that no further processing is necessary.

The three types of variables used by the parsing encoder are used to generate the code. If the variable is a string, it's simply cut and pasted into its columns in the outgoing buffer. If it's an integer, the value of the integer is printed in the columns in the output buffer for the integer. For a double, the $fmt string contains the format string to explicitly place the decimal point at the right location in the columns for the output buffer. The code to perform this parsing is shown here:

sub encodeVariable { my($vname,$vtype,$from,$to,$fmt) = @_; my $len = $to - $from + 1; printf EncD "\n\n"; printf EncD "/* Encode:$vtype,$vname,$from,$to,$len,$fmt */"; if ($vtype =~ /char/) { printf EncD "\ncp = (STRPTR )&(buffer[%d]); ",$from-1; printf EncD "\nfor (i=0; i< %d;i++)",$len; printf EncD "\n buffer[%d + i] = sp->%s[i]; ",$from-1,$vname; } if ($vtype =~ /double/) { if (length($fmt) > 0) { printf EncD "\nsprintf(tempbuffer,\"%%%sf\",(sp->%s));", $fmt,$vname; } else { printf EncD "\nsprintf(tempbuffer,\"%%%sf\",(sp->%s));", $len,$vname; } printf EncD "\ntempbuffer[%d]= 0; ",$len; printf EncD "\nstrncat((STRPTR )(&buffer[%d]),tempbuffer,%d);", $from-1,$len; } if ($vtype =~ /int/) { { if ($len == 1) { printf EncD "\nsp->%s %%= 10;", $vname; printf EncD "\ntempbuffer[%d] = (char)('0' + sp->%s);", $from-1,$vname; printf EncD "\ntempbuffer[%d] = 0; ",$from; } else { printf EncD "\nsprintf(tempbuffer,\"%%d%d\",(sp->%s));",$len,$vname; printf EncD "\ntempbuffer[%d]= 0; ",$len; } printf EncD "\nstrncat((STRPTR )(&buffer[%d]),tempbuffer,%d);", $from-1,$len; } } } # end of subroutine.

Writing the Decoder

The decoder source file is created in almost exactly the same way that the encoder file is created. The preamble for the decoder file includes the #include statement for the P286.h header file, which is also being created. It also makes a reference to the padTo80 function in the encoder file (should this extra function be required).

The decoder file also generates code to extract a substring from another string given the offset and length to extract. The function may be used on systems that have a broken strncpy (just like my old, in-house version of the C compiler for a Motorola 6809E microprocessor).

Here's the code to generate the decoder source file preamble:

sub startDecoderFile { print (DECD "\/**\n* C source file to decode records.", " \nDon't edit this file\n ", "*/\n", '#include "p286.h" ', "\n", "\n/* The outgoing buffer must be 81 chars! */", "\nextern void padTo80(STRPTR buffer); \n", "\n", "\n/* The outgoing buffer must also be 81 chars! */", "\nvoid substr(STRPTR buffer,STRPTR cut,int offset, int len)\n{\n", "int i,j; \nj = offset;\n", "for(i=len;i<80;i++) cut[i] = buffer[j]; \n ", "cut[len] = 0; /* NULL terminate the string*/", "\n} /* end of padding function */\n", "\n"); }

Each RECORD and ENDREC pair in the input file causes a call to the Perl subroutines startDecoderFunction() and closeDecoderFunction(), respectively. The first subroutine creates the preamble to decode a string into a structure. The second subroutine prints the closing brace for a function. The code for the two functions is shown here:

sub startDecoderFunction { my($vname) = @_; # Pick up name of record. print DECD "\n/*: "; print DECD "\n** Generated by Perl script -- Avoid editing "; print DECD "\n** The outgoing buffer must also be 81 chars!"; print DECD "\n*/\n"; printf DECD "void decode_%s_type(P286_%s_PTR sp,STRPTR buffer)\n{", $vname, $vname; print DECD "\nSTRPTR cp; \n"; print DECD "register int i; \n"; print DECD "char tempbuffer[80];\n"; } sub closeDecoderFunction { print DECD "\n} /* End of decoding function */ \n"; }

Finally, the decodeVariable function is called when a variable is encountered. The type of variable dictates how to read the value from a substring in the incoming buffer. The name of the variable is set to be a scalar or the member of an array before the call is made. Therefore, the Perl subroutine simply uses the value of $vname verbatim:

sub decodeVariable { my ($vname,$vtype,$from,$to,$fmt) = @_; my $len = $to - $from + 1; printf DECD "\n"; if ($vtype =~ /char/) { printf DECD "\nstrncpy(sp->%s,(STRPTR )(&buffer[%d]),%d);", $vname,$from-1,$len; printf DECD "\nsp->%s[%d]= 0; ",$vname,$len; } if ($vtype =~ /double/) { printf DECD "\nstrncpy(tempbuffer,(STRPTR )(&buffer[%d]),%d);", $from-1,$len; printf DECD "\ntempbuffer[%d]= 0; ",$len; printf DECD "\nsscanf(tempbuffer,\"%%%sf\",&(sp->%s));", $fmt,$vname; } if ($vtype =~ /int/) { printf DECD "\nstrncpy(tempbuffer,(STRPTR )(&buffer[%d]),%d);", $from-1,$len; printf DECD "\ntempbuffer[%d]= 0; ",$len; printf DECD "\nsscanf(tempbuffer,\"%%%dd\",&(sp->%s));",$len,$vname; } }

Putting It Together

Here are the input and output files generated from the code fragments shown previously. The input file shown here has only five records in it. The actual file had 52 record types, and I applied the same methodology to three types of similar formats. Here is a breakdown of the listings:

Listing 31.2. The header file generated from Perl.
Listing 31.3. The encoder file P286enc.c.
Listing 31.4. The encoder file P286dec.c.
Listing 31.5. The full listing for the code generator.

Listing 31.2. The P286.h header file.

1 #ifndef P286_HDRS 2 #define P286_HDRS 1 3 #define STRPTR char * 4 5 #define P286_H0001_RECTYPE 0 6 7 typedef struct p286_H0001_type { 8 char SurveyType[53]; /* 29 80 */ 9 }P286_H0001_TYPE,*P286_H0001_PTR; 10 11 #define P286_H0010_RECTYPE 1 12 13 typedef struct p286_H0010_type { 14 int numPatterns; /* 6 7 */ 15 int sblInUse; /* 8 8 */ 16 int sattInUse; /* 9 9 */ 17 int numVessels; /* 10 10 */ 18 int numDatum; /* 11 11 */ 19 int offsetMode; /* 12 12 */ 20 }P286_H0010_TYPE,*P286_H0010_PTR; 21 22 #define P286_H011_RECTYPE 2 23 24 typedef struct p286_H011_type { 25 int datumId; /* 5 5 */ 26 char spheroidName[19]; /* 6 23 */ 27 char datumName[19]; /* 24 41 */ 28 double semimajorAxis; /* 42 53 */ 29 double conversionFactor; /* 66 77 */ 30 double inverseFlattening; /* 66 77 */ 31 }P286_H011_TYPE,*P286_H011_PTR; 32 33 #define P286_E3100_RECTYPE 3 34 35 typedef struct p286_E3100_type { 36 double velprop; /* 6 12 */ 37 int srcNdx[5]; /* from 13 15 */ 38 int dstNdx[5]; /* from 16 18 */ 39 double slant[5]; /* from 19 25 */ 40 }P286_E3100_TYPE,*P286_E3100_PTR; 41 42 #endif

Listing 31.3. The encoder file P286enc.c.

1 /*** C source file to encode records. 2 Don't edit this file 3 */ 4 #include "p286.h" 5 6 /* The incoming buffer must be 81 chars! */ 7 void padTo80(STRPTR buffer) 8 { 9 int i,ln; 10 ln = strlen(buffer); 11 for(i=ln;i<80;i++) buffer[i] = ' '; 12 buffer[81] = 0; /* NULL terminate the string*/ 13 } /* end of padding function */ 14 15 16 /*: 17 ** Generated by Perl script -- Avoid editing 18 */ 19 void encode_H0001_type(STRPTR buffer,P286_H0001_PTR sp) 20 { 21 STRPTR ncp; 22 23 STRPTR cp; 24 register int i; 25 char tempbuffer[80]; 26 27 28 /* Encode:char,SurveyType,29,80,52, */ 29 cp = (STRPTR )&(buffer[28]); 30 for (i=0; i< 52;i++) 31 buffer[28 + i] = sp->SurveyType[i]; 32 padTo80(buffer); 33 } /* End of encoding function */ 34 35 /*: 36 ** Generated by Perl script -- Avoid editing 37 */ 38 void encode_H0010_type(STRPTR buffer,P286_H0010_PTR sp) 39 { 40 STRPTR ncp; 41 42 STRPTR cp; 43 register int i; 44 char tempbuffer[80]; 45 46 47 /* Encode:int,numPatterns,6,7,2, */ 48 sprintf(tempbuffer,"%d2",(sp->numPatterns)); 49 tempbuffer[2]= 0; 50 strncat((STRPTR )(&buffer[5]),tempbuffer,2); 51 52 /* Encode:int,sblInUse,8,8,1, */ 53 sp->sblInUse %= 10; 54 tempbuffer[7] = (char)('0' + sp->sblInUse); 55 tempbuffer[8] = 0; 56 strncat((STRPTR )(&buffer[7]),tempbuffer,1); 57 58 /* Encode:int,sattInUse,9,9,1, */ 59 sp->sattInUse %= 10; 60 tempbuffer[8] = (char)('0' + sp->sattInUse); 61 tempbuffer[9] = 0; 62 strncat((STRPTR )(&buffer[8]),tempbuffer,1); 63 64 /* Encode:int,numVessels,10,10,1, */ 65 sp->numVessels %= 10; 66 tempbuffer[9] = (char)('0' + sp->numVessels); 67 tempbuffer[10] = 0; 68 strncat((STRPTR )(&buffer[9]),tempbuffer,1); 69 70 /* Encode:int,numDatum,11,11,1, */ 71 sp->numDatum %= 10; 72 tempbuffer[10] = (char)('0' + sp->numDatum); 73 tempbuffer[11] = 0; 74 strncat((STRPTR )(&buffer[10]),tempbuffer,1); 75 76 /* Encode:int,offsetMode,12,12,1, */ 77 sp->offsetMode %= 10; 78 tempbuffer[11] = (char)('0' + sp->offsetMode); 79 tempbuffer[12] = 0; 80 strncat((STRPTR )(&buffer[11]),tempbuffer,1); 81 padTo80(buffer); 82 } /* End of encoding function */ 83 84 /*: 85 ** Generated by Perl script -- Avoid editing 86 */ 87 void encode_H011_type(STRPTR buffer,P286_H011_PTR sp) 88 { 89 STRPTR ncp; 90 91 STRPTR cp; 92 register int i; 93 char tempbuffer[80]; 94 95 96 /* Encode:int,datumId,5,5,1, */ 97 sp->datumId %= 10; 98 tempbuffer[4] = (char)('0' + sp->datumId); 99 tempbuffer[5] = 0; 100 strncat((STRPTR )(&buffer[4]),tempbuffer,1); 101 102 /* Encode:char,spheroidName,6,23,18, */ 103 cp = (STRPTR )&(buffer[5]); 104 for (i=0; i< 18;i++) 105 buffer[5 + i] = sp->spheroidName[i]; 106 107 /* Encode:char,datumName,24,41,18, */ 108 cp = (STRPTR )&(buffer[23]); 109 for (i=0; i< 18;i++) 110 buffer[23 + i] = sp->datumName[i]; 111 112 /* Encode:double,semimajorAxis,42,53,12,12.3 */ 113 sprintf(tempbuffer,"%12.3f",(sp->semimajorAxis)); 114 tempbuffer[12]= 0; 115 strncat((STRPTR )(&buffer[41]),tempbuffer,12); 116 117 /* Encode:double,conversionFactor,66,77,12,12.8 */ 118 sprintf(tempbuffer,"%12.8f",(sp->conversionFactor)); 119 tempbuffer[12]= 0; 120 strncat((STRPTR )(&buffer[65]),tempbuffer,12); 121 122 /* Encode:double,inverseFlattening,66,77,12,12.7 */ 123 sprintf(tempbuffer,"%12.7f",(sp->inverseFlattening)); 124 tempbuffer[12]= 0; 125 strncat((STRPTR )(&buffer[65]),tempbuffer,12); 126 padTo80(buffer); 127 } /* End of encoding function */ 128 129 /*: 130 ** Generated by Perl script -- Avoid editing 131 */ 132 void encode_E3100_type(STRPTR buffer,P286_E3100_PTR sp) 133 { 134 STRPTR ncp; 135 136 STRPTR cp; 137 register int i; 138 char tempbuffer[80]; 139 140 141 /* Encode:double,velprop,6,12,7,7.2 */ 142 sprintf(tempbuffer,"%7.2f",(sp->velprop)); 143 tempbuffer[7]= 0; 144 strncat((STRPTR )(&buffer[5]),tempbuffer,7); 145 146 /* Encode:int,srcNdx[0],13,16,4, */ 147 sprintf(tempbuffer,"%d4",(sp->srcNdx[0])); 148 tempbuffer[4]= 0; 149 strncat((STRPTR )(&buffer[12]),tempbuffer,4); 150 151 /* Encode:int,srcNdx[1],26,29,4, */ 152 sprintf(tempbuffer,"%d4",(sp->srcNdx[1])); 153 tempbuffer[4]= 0; 154 strncat((STRPTR )(&buffer[25]),tempbuffer,4); 155 156 /* Encode:int,srcNdx[2],39,42,4, */ 157 sprintf(tempbuffer,"%d4",(sp->srcNdx[2])); 158 tempbuffer[4]= 0; 159 strncat((STRPTR )(&buffer[38]),tempbuffer,4); 160 161 /* Encode:int,srcNdx[3],52,55,4, */ 162 sprintf(tempbuffer,"%d4",(sp->srcNdx[3])); 163 tempbuffer[4]= 0; 164 strncat((STRPTR )(&buffer[51]),tempbuffer,4); 165 166 /* Encode:int,srcNdx[4],65,68,4, */ 167 sprintf(tempbuffer,"%d4",(sp->srcNdx[4])); 168 tempbuffer[4]= 0; 169 strncat((STRPTR )(&buffer[64]),tempbuffer,4); 170 171 /* Encode:int,dstNdx[0],13,16,4, */ 172 sprintf(tempbuffer,"%d4",(sp->dstNdx[0])); 173 tempbuffer[4]= 0; 174 strncat((STRPTR )(&buffer[12]),tempbuffer,4); 175 176 /* Encode:int,dstNdx[1],26,29,4, */ 177 sprintf(tempbuffer,"%d4",(sp->dstNdx[1])); 178 tempbuffer[4]= 0; 179 strncat((STRPTR )(&buffer[25]),tempbuffer,4); 180 181 /* Encode:int,dstNdx[2],39,42,4, */ 182 sprintf(tempbuffer,"%d4",(sp->dstNdx[2])); 183 tempbuffer[4]= 0; 184 strncat((STRPTR )(&buffer[38]),tempbuffer,4); 185 186 /* Encode:int,dstNdx[3],52,55,4, */ 187 sprintf(tempbuffer,"%d4",(sp->dstNdx[3])); 188 tempbuffer[4]= 0; 189 strncat((STRPTR )(&buffer[51]),tempbuffer,4); 190 191 /* Encode:int,dstNdx[4],65,68,4, */ 192 sprintf(tempbuffer,"%d4",(sp->dstNdx[4])); 193 tempbuffer[4]= 0; 194 strncat((STRPTR )(&buffer[64]),tempbuffer,4); 195 196 /* Encode:double,slant[0],13,20,8,7.2 */ 197 sprintf(tempbuffer,"%7.2f",(sp->slant[0])); 198 tempbuffer[8]= 0; 199 strncat((STRPTR )(&buffer[12]),tempbuffer,8); 200 201 /* Encode:double,slant[1],26,33,8,7.2 */ 202 sprintf(tempbuffer,"%7.2f",(sp->slant[1])); 203 tempbuffer[8]= 0; 204 strncat((STRPTR )(&buffer[25]),tempbuffer,8); 205 206 /* Encode:double,slant[2],39,46,8,7.2 */ 207 sprintf(tempbuffer,"%7.2f",(sp->slant[2])); 208 tempbuffer[8]= 0; 209 strncat((STRPTR )(&buffer[38]),tempbuffer,8); 210 211 /* Encode:double,slant[3],52,59,8,7.2 */ 212 sprintf(tempbuffer,"%7.2f",(sp->slant[3])); 213 tempbuffer[8]= 0; 214 strncat((STRPTR )(&buffer[51]),tempbuffer,8); 215 216 /* Encode:double,slant[4],65,72,8,7.2 */ 217 sprintf(tempbuffer,"%7.2f",(sp->slant[4])); 218 tempbuffer[8]= 0; 219 strncat((STRPTR )(&buffer[64]),tempbuffer,8); 220 padTo80(buffer); 221 } /* End of encoding function */

Listing 31.4. The decoder file P286dec.c.

1 /** 2 * C source file to decode records. 3 Don't edit this file 4 */ 5 #include "p286.h" 6 7 /* The outgoing buffer must be 81 chars! */ 8 extern void padTo80(STRPTR buffer); 9 10 11 /* The outgoing buffer must also be 81 chars! */ 12 void substr(STRPTR buffer,STRPTR cut,int offset, int len) 13 { 14 int i,j; 15 j = offset; 16 for(i=len;i<80;i++) cut[i] = buffer[j]; 17 cut[len] = 0; /* NULL terminate the string*/ 18 } /* end of padding function */ 19 20 21 /*: 22 ** Generated by Perl script -- Avoid editing 23 ** The outgoing buffer must also be 81 chars! 24 */ 25 void decode_H0001_type(P286_H0001_PTR sp,STRPTR buffer) 26 { 27 STRPTR cp; 28 register int i; 29 char tempbuffer[80]; 30 31 32 strncpy(sp->SurveyType,(STRPTR )(&buffer[28]),52); 33 sp->SurveyType[52]= 0; 34 } /* End of decoding function */ 35 36 /*: 37 ** Generated by Perl script -- Avoid editing 38 ** The outgoing buffer must also be 81 chars! 39 */ 40 void decode_H0010_type(P286_H0010_PTR sp,STRPTR buffer) 41 { 42 STRPTR cp; 43 register int i; 44 char tempbuffer[80]; 45 46 47 strncpy(tempbuffer,(STRPTR )(&buffer[5]),2); 48 tempbuffer[2]= 0; 49 sscanf(tempbuffer,"%2d",&(sp->numPatterns)); 50 51 strncpy(tempbuffer,(STRPTR )(&buffer[7]),1); 52 tempbuffer[1]= 0; 53 sscanf(tempbuffer,"%1d",&(sp->sblInUse)); 54 55 strncpy(tempbuffer,(STRPTR )(&buffer[8]),1); 56 tempbuffer[1]= 0; 57 sscanf(tempbuffer,"%1d",&(sp->sattInUse)); 58 59 strncpy(tempbuffer,(STRPTR )(&buffer[9]),1); 60 tempbuffer[1]= 0; 61 sscanf(tempbuffer,"%1d",&(sp->numVessels)); 62 63 strncpy(tempbuffer,(STRPTR )(&buffer[10]),1); 64 tempbuffer[1]= 0; 65 sscanf(tempbuffer,"%1d",&(sp->numDatum)); 66 67 strncpy(tempbuffer,(STRPTR )(&buffer[11]),1); 68 tempbuffer[1]= 0; 69 sscanf(tempbuffer,"%1d",&(sp->offsetMode)); 70 } /* End of decoding function */ 71 72 /*: 73 ** Generated by Perl script -- Avoid editing 74 ** The outgoing buffer must also be 81 chars! 75 */ 76 void decode_H011_type(P286_H011_PTR sp,STRPTR buffer) 77 { 78 STRPTR cp; 79 register int i; 80 char tempbuffer[80]; 81 82 83 strncpy(tempbuffer,(STRPTR )(&buffer[4]),1); 84 tempbuffer[1]= 0; 85 sscanf(tempbuffer,"%1d",&(sp->datumId)); 86 87 strncpy(sp->spheroidName,(STRPTR )(&buffer[5]),18); 88 sp->spheroidName[18]= 0; 89 90 strncpy(sp->datumName,(STRPTR )(&buffer[23]),18); 91 sp->datumName[18]= 0; 92 93 strncpy(tempbuffer,(STRPTR )(&buffer[41]),12); 94 tempbuffer[12]= 0; 95 sscanf(tempbuffer,"%12.3f",&(sp->semimajorAxis)); 96 97 strncpy(tempbuffer,(STRPTR )(&buffer[65]),12); 98 tempbuffer[12]= 0; 99 sscanf(tempbuffer,"%12.8f",&(sp->conversionFactor)); 100 101 strncpy(tempbuffer,(STRPTR )(&buffer[65]),12); 102 tempbuffer[12]= 0; 103 sscanf(tempbuffer,"%12.7f",&(sp->inverseFlattening)); 104 } /* End of decoding function */ 105 106 /*: 107 ** Generated by Perl script -- Avoid editing 108 ** The outgoing buffer must also be 81 chars! 109 */ 110 void decode_E3100_type(P286_E3100_PTR sp,STRPTR buffer) 111 { 112 STRPTR cp; 113 register int i; 114 char tempbuffer[80]; 115 116 117 strncpy(tempbuffer,(STRPTR )(&buffer[5]),7); 118 tempbuffer[7]= 0; 119 sscanf(tempbuffer,"%7.2f",&(sp->velprop)); 120 121 strncpy(tempbuffer,(STRPTR )(&buffer[12]),4); 122 tempbuffer[4]= 0; 123 sscanf(tempbuffer,"%4d",&(sp->srcNdx[0])); 124 125 strncpy(tempbuffer,(STRPTR )(&buffer[25]),4); 126 tempbuffer[4]= 0; 127 sscanf(tempbuffer,"%4d",&(sp->srcNdx[1])); 128 129 strncpy(tempbuffer,(STRPTR )(&buffer[38]),4); 130 tempbuffer[4]= 0; 131 sscanf(tempbuffer,"%4d",&(sp->srcNdx[2])); 132 133 strncpy(tempbuffer,(STRPTR )(&buffer[51]),4); 134 tempbuffer[4]= 0; 135 sscanf(tempbuffer,"%4d",&(sp->srcNdx[3])); 136 137 strncpy(tempbuffer,(STRPTR )(&buffer[64]),4); 138 tempbuffer[4]= 0; 139 sscanf(tempbuffer,"%4d",&(sp->srcNdx[4])); 140 141 strncpy(tempbuffer,(STRPTR )(&buffer[12]),4); 142 tempbuffer[4]= 0; 143 sscanf(tempbuffer,"%4d",&(sp->dstNdx[0])); 144 145 strncpy(tempbuffer,(STRPTR )(&buffer[25]),4); 146 tempbuffer[4]= 0; 147 sscanf(tempbuffer,"%4d",&(sp->dstNdx[1])); 148 149 strncpy(tempbuffer,(STRPTR )(&buffer[38]),4); 150 tempbuffer[4]= 0; 151 sscanf(tempbuffer,"%4d",&(sp->dstNdx[2])); 152 153 strncpy(tempbuffer,(STRPTR )(&buffer[51]),4); 154 tempbuffer[4]= 0; 155 sscanf(tempbuffer,"%4d",&(sp->dstNdx[3])); 156 157 strncpy(tempbuffer,(STRPTR )(&buffer[64]),4); 158 tempbuffer[4]= 0; 159 sscanf(tempbuffer,"%4d",&(sp->dstNdx[4])); 160 161 strncpy(tempbuffer,(STRPTR )(&buffer[12]),8); 162 tempbuffer[8]= 0; 163 sscanf(tempbuffer,"%7.2f",&(sp->slant[0])); 164 165 strncpy(tempbuffer,(STRPTR )(&buffer[25]),8); 166 tempbuffer[8]= 0; 167 sscanf(tempbuffer,"%7.2f",&(sp->slant[1])); 168 169 strncpy(tempbuffer,(STRPTR )(&buffer[38]),8); 170 tempbuffer[8]= 0; 171 sscanf(tempbuffer,"%7.2f",&(sp->slant[2])); 172 173 strncpy(tempbuffer,(STRPTR )(&buffer[51]),8); 174 tempbuffer[8]= 0; 175 sscanf(tempbuffer,"%7.2f",&(sp->slant[3])); 176 177 strncpy(tempbuffer,(STRPTR )(&buffer[64]),8); 178 tempbuffer[8]= 0; 179 sscanf(tempbuffer,"%7.2f",&(sp->slant[4])); 180 } /* End of decoding function */

Listing 31.5. The complete source file for generating parser code.

1 #!/usr/bin/perl 2 # 3 # Copyright (c) Kamran Husain. 4 # The following code is hereby placed in the public domain. 5 # NO WARRANTIES OF ANY TYPE ARE IMPLIED. 6 # For entertainment use only. Contents of package may shift 7 # during shipment. 8 # You may copy this code freely as long as you name me as 9 # the original author. 10 # 11 # This program parses input records of this form: 12 # 13 # RECORD TypeOfRecord StringToUse 14 # variableName VarType startCol EndCol 15 # variableName VarType startCol EndCol 16 # variableName VarType startCol EndCol 17 # [REPEAT count st1 st2 ... ] 18 # variableName VarType startCol EndCol 19 # variableName VarType startCol EndCol 20 # variableName VarType startCol EndCol 21 # ENDREC 22 # 23 require 5.002; 24 open (SAFE, "safehdrs") || die "Cannot open Input file $!\n"; 25 open (HDRS, ">p286.h") || die "Cannot open Header $!\n"; 26 open (EncD, ">p286enc.c") || die "Cannot open Encoder $!\n"; 27 open (DECD, ">p286dec.c") || die "Cannot open Decoder $!\n"; 28 29 $inRecord = 0; 30 $repeat = 0; 31 $recordCounter = 0; 32 33 &startHeaderFile(); 34 &startEncoderFile(); 35 &startDecoderFile(); 36 37 while (<SAFE>) { 38 chop($_); 39 if (/^#/) { next; } 40 if (/^\s*$/) { next; } 41 @names = split(' ',$_); 42 # 43 # If not inside a record, start one 44 # 45 if (/^RECORD/) { 46 ($rname,$rtype,@rest) = split(' ',$_); 47 &startHeaderRecord($rtype); 48 &startEncoderFunction($rtype); 49 &startDecoderFunction($rtype); 50 $inRecord = 1; 51 $repeat = 0; 52 next; 53 } 54 if (/^ENDREC/) { 55 &stopHeaderRecord($rtype); 56 &closeEncoderFunction(); 57 &closeDecoderFunction(); 58 $inRecord = 0; 59 } 60 if (/^[\s]*REPEAT/) { 61 $repeat = 1; 62 @allOffsets = split(' ',$_); 63 $index = $allOffsets[0]; 64 $count = $allOffsets[1]; 65 print "INDEX = $index, COUNT= $count"; 66 @offsets = splice(@allOffsets,2); 67 # foreach $x (@offsets) { print "offset = $x\n"; } 68 next; 69 } 70 if ($inRecord) 71 { 72 ($vname,$vtype,$from,$to,$fmt) = split(' ',$_); 73 if ($repeat == 0) 74 { 75 &makeHeaderItem($vname,$vtype,$from,$to); 76 &encodeVariable($vname,$vtype,$from,$to,$fmt); 77 &decodeVariable($vname,$vtype,$from,$to,$fmt); 78 } 79 else 80 { 81 &makeArrayedItem($vname,$vtype,$count,$from,$to); 82 $offsetLen = $to - $from + 1; 83 $offsetCount = 0; 84 foreach $x (@offsets) { 85 $offsetFrom = $x; 86 $offsetTo = $offsetLen + $x; 87 $offsetName = sprintf "%s[%d]", $vname,$offsetCount; 88 print "Name = $offsetName, COUNT= $count\n"; 89 &encodeVariable($offsetName,$vtype,$offsetFrom,$offsetTo,$fmt); 90 &decodeVariable($offsetName,$vtype,$offsetFrom,$offsetTo,$fmt); 91 $offsetCount++; 92 } 93 } ## end of else clause. 94 } ## end of outer if ($inRecord) clause. 95 } 96 close (SAFE); 97 98 &closeHeaderFile(); 99 close (HDRS); 100 close (EncD); 101 close (DECD); 102 103 exit(0); 104 105 106 # -------------------------------------------------------------------- 107 # To create the C headers file from the data file. 108 # -------------------------------------------------------------------- 109 sub startHeaderRecord { 110 my ($name) = @_; 111 printf HDRS "\n#define P286_%s_RECTYPE %d ", $name, $recordCounter++; 112 printf HDRS "\n\ntypedef struct p286_%s_type {", $name; 113 } 114 # -------------------------------------------------------------------- 115 # Note the formal input parameter $name in this function. 116 # -------------------------------------------------------------------- 117 sub stopHeaderRecord($name) { 118 my $name = shift; 119 my $ntype = "P286_" . $name . "_TYPE"; 120 my $nptr = "*P286_" . $name . "_PTR;"; 121 print HDRS "\n}" . uc($ntype) . "," . uc($nptr) ." \n"; 122 } 123 # -------------------------------------------------------------------- 124 # Create the definitions for the header file. 125 # -------------------------------------------------------------------- 126 sub makeHeaderItem { 127 my($vname,$vtype,$from,$to) = @_; 128 my $len; 129 if ($vtype =~ /char/) { 130 $len = $to - $from + 2 ; 131 printf HDRS "\n %s %s[%d]\; \/* %d %d *\/ ", 132 $vtype, $vname, $len, $from, $to; 133 } 134 else 135 { 136 printf HDRS "\n %s %s\; \/* %d %d *\/", 137 $vtype, $vname, $from, $to; 138 } 139 } 140 # -------------------------------------------------------------------- 141 # Create the preamble for the header file. 142 # -------------------------------------------------------------------- 143 sub startHeaderFile() { 144 printf (HDRS "#ifndef P286_HDRS\n", 145 "#define P286_HDRS1\n", 146 "#define STRPTR char *\n"); 147 } 148 149 # -------------------------------------------------------------------- 150 # Close the preprocessor definitions for the header file. 151 # -------------------------------------------------------------------- 152 sub closeHeaderFile() { 153 printf HDRS "\n#endif\n"; 154 } 155 # -------------------------------------------------------------------- 156 # Create an array item for a structure 157 # -------------------------------------------------------------------- 158 sub makeArrayedItem { 159 my ($vname,$vtype,$count,$from,$to) = @_; 160 printf HDRS "\n %s %s[%d]; \/* from %d %d *\/ ", 161 $vtype, $vname, $count, $from, $to; 162 } 163 164 # -------------------------------------------------------------------- 165 # Create the C source file to create encoder for the data file. 166 # -------------------------------------------------------------------- 167 sub startEncoderFile { 168 print (EncD "\/*** C source file to encode records.", 169 "\nDon't edit this file\n ", 170 "*/\n", 171 '#include "p286.h" ', 172 "\n", 173 "\n/* The incoming buffer must be 81 chars! */", 174 "\nvoid padTo80(STRPTR buffer)\n{\n", 175 "int i,ln;\n", 176 "ln = strlen(buffer);\n", 177 "for(i=ln;i<80;i++) buffer[i] = ' '; \n ", 178 "buffer[81] = 0; /* NULL terminate the string*/", 179 "\n} /* end of padding function */\n", 180 "\n"); 181 } 182 183 # -------------------------------------------------------------------- 184 # Create the preamble for an encoder function. 185 # -------------------------------------------------------------------- 186 sub startEncoderFunction { 187 my($vname) = @_; # Pick up name of record. 188 print (EncD "\n/*: ", 189 "\n** Generated by Perl script -- Avoid editing\n*/\n"); 190 printf EncD 191 "void encode_%s_type(STRPTR buffer,P286_%s_PTR sp)\n{", 192 $vname, $vname; 193 print (EncD "\nSTRPTR ncp; \n", 194 "\nSTRPTR cp; \n", 195 "register int i; \n", 196 "char tempbuffer[80];\n"); 197 } 198 199 # -------------------------------------------------------------------- 200 # Create the closing brackets for an encoder function 201 # -------------------------------------------------------------------- 202 sub closeEncoderFunction { 203 print EncD "\npadTo80(buffer);"; 204 print EncD "\n} /* End of encoding function */ \n"; 205 } 206 207 # -------------------------------------------------------------------- 208 # Add code for encoding a record. 209 # -------------------------------------------------------------------- 210 sub encodeVariable { 211 my($vname,$vtype,$from,$to,$fmt) = @_; 212 my $len = $to - $from + 1; 213 printf EncD "\n\n"; 214 printf EncD "/* Encode:$vtype,$vname,$from,$to,$len,$fmt */"; 215 if ($vtype eq 'char') { 216 printf EncD "\ncp = (STRPTR )&(buffer[%d]); ",$from-1; 217 printf EncD "\nfor (i=0; i< %d;i++)",$len; 218 printf EncD "\n buffer[%d + i] = sp->%s[i]; ",$from-1,$vname; 219 } 220 if ($vtype eq 'double') { 221 if (length($fmt) > 0) 222 { 223 printf EncD "\nsprintf(tempbuffer,\"%%%sf\",(sp->%s));", $fmt,$vname; 224 } 225 else 226 { 227 printf EncD "\nsprintf(tempbuffer,\"%%%sf\",(sp->%s));", 228 $len,$vname; 229 } 230 printf EncD "\ntempbuffer[%d]= 0; ",$len; 231 printf EncD "\nstrncat((STRPTR )(&buffer[%d]),tempbuffer,%d);", 232 $from-1,$len; 233 } 234 if ($vtype eq 'int') { 235 { 236 if ($len == 1) 237 { 238 printf EncD "\nsp->%s %%= 10;", $vname; 239 printf EncD "\ntempbuffer[%d] = (char)('0' + sp->%s);", 240 $from-1,$vname; 241 printf EncD "\ntempbuffer[%d] = 0; ",$from; 242 } 243 else 244 { 245 printf EncD "\nsprintf(tempbuffer,\"%%d%d\",(sp->%s));",$len,$vname; 246 printf EncD "\ntempbuffer[%d]= 0; ",$len; 247 } 248 printf EncD "\nstrncat((STRPTR )(&buffer[%d]),tempbuffer,%d);", 249 $from-1,$len; 250 } 251 } 252 } # end of subroutine. 253 254 255 # -------------------------------------------------------------------- 256 # To create the C source file to create decoder for the data file. 257 # -------------------------------------------------------------------- 258 sub startDecoderFile { 259 print (DECD "\/**\n* C source file to decode records.", 260 " \nDon't edit this file\n ", 261 "*/\n", 262 '#include "p286.h" ', 263 "\n", 264 "\n/* The outgoing buffer must be 81 chars! */", 265 "\nextern void padTo80(STRPTR buffer); \n", 266 "\n", 267 "\n/* The outgoing buffer must also be 81 chars! */", 268 "\nvoid substr(STRPTR buffer,STRPTR cut,int offset, int len)\n{\n", 269 "int i,j; \nj = offset;\n", 270 "for(i=len;i<80;i++) cut[i] = buffer[j]; \n ", 271 "cut[len] = 0; /* NULL terminate the string*/", 272 "\n} /* end of padding function */\n", 273 "\n"); 274 } 275 276 # -------------------------------------------------------------------- 277 # Create the preamble for a decoder function 278 # -------------------------------------------------------------------- 279 sub startDecoderFunction { 280 my($vname) = @_; # Pick up name of record. 281 print (DECD "\n/*: ", 282 "\n** Generated by Perl script -- Avoid editing ", 283 "\n** The outgoing buffer must also be 81 chars!", 284 "\n*/\n"; 285 printf (DECD 286 "void decode_%s_type(P286_%s_PTR sp,STRPTR buffer)\n{", 287 $vname, $vname); 288 print (DECD "\nSTRPTR cp; \n", 289 "register int i; \n", 290 "char tempbuffer[80];\n"); 291 } 292 293 # -------------------------------------------------------------------- 294 # Create the closing braces for a decoder function 295 # -------------------------------------------------------------------- 296 sub closeDecoderFunction { 297 print DECD "\n} /* End of decoding function */ \n"; 298 } 299 300 # -------------------------------------------------------------------- 301 # Create the source code for decoding a variable 302 # -------------------------------------------------------------------- 303 sub decodeVariable { 304 my ($vname,$vtype,$from,$to,$fmt) = @_; 305 my $len = $to - $from + 1; 306 printf DECD "\n"; 307 if ($vtype =~ /char/) 308 { 309 printf DECD "\nstrncpy(sp->%s,(STRPTR )(&buffer[%d]),%d);", 310 $vname,$from-1,$len; 311 printf DECD "\nsp->%s[%d]= 0; ",$vname,$len; 312 } 313 if ($vtype =~ /double/) 314 { 315 printf DECD "\nstrncpy(tempbuffer,(STRPTR )(&buffer[%d]),%d);", 316 $from-1,$len; 317 printf DECD "\ntempbuffer[%d]= 0; ",$len; 318 printf DECD "\nsscanf(tempbuffer,\"%%%sf\",&(sp->%s));", $fmt,$vname; 319 } 320 if ($vtype =~ /int/) 321 { 322 printf DECD "\nstrncpy(tempbuffer,(STRPTR )(&buffer[%d]),%d);", 323 $from-1,$len; 324 printf DECD "\ntempbuffer[%d]= 0; ",$len; 325 printf DECD "\nsscanf(tempbuffer,\"%%%dd\",&(sp->%s));",$len,$vname; 326 } 327 }

Summary

Although the example given in this chapter is for a very specialized parse code generator, it provides examples of several ways to apply Perl to solving problems. Here are the important points to remember:

The code being generated by the Perl program was being used to parse records with a similar format. The format of each record was general enough to have similar functions decode it. There were no hidden surprises in terms of special cases that would make code generation anything less than general.
Using Perl would have been a good choice to parse the data but the underlying platforms did not support Perl.
The number of variables to look for was fixed. Only a few types of variables (char, int, and double) were used. Introducing user-defined types or several other variables would make the code generation more complicated.
The code being generated out of a Perl program was used with C code in other programs. Once generated, it could be optimized manually for speed.
Because you are in control of the code being generated, do not forget to generate comments in the native language of the code being created. Such comments will serve as a great cross-reference in the future or the generated code is being debugged.
By taking the grunt work out of the programming cycle, a lot of typing errors were removed from the final code. If there was one error, it would be propagated to all the generated code dependent on that erroneous code.
Similarly, any changes in the format, though not very likely given the age of the format, could be accommodated by simply modifying the Perl script and regenerating the source code. It sure beats going through pages of code that looks the same, doesn't it?
The original specification was inconsistent in defining the structures for use with the format. Had the original specification been more consistent, Perl could have been used to actually extract the data from the specification itself and generated the P286hdrs file! Unfortunately, this was not possible; however, your situation might be quite different. Keep your eyes open for such developing patterns.
Finally, the input files in the real case were a lot larger, with at least four files-each more than 500 lines long. Had this only been a few structures, it would have been more practical to just code the final version without resorting to Perl.

Chapter 31

Generating Code

CONTENTS