## 3. Example of a SURVO 84C module

The idea and practice of making SURVO 84C modules is first illustrated by an example. To save space and to highlight the main principles, we shall describe coding of a simple module for calculating weighted means from statistical data.
Usually it is good to start by making a synopsis from the user's point of view and to imagine how things should look if we already had the new operation. In this case we could type the following text in the edit field:

``````13  1 SURVO 84C EDITOR Wed Feb 15 11:46:19 1989         D:\C\PROG\ 100 100 0
1 *SAVE TEST1
2 *
3 *Here is our data set:
4 *DATA TEST
5 *Name     Sex   Test1   Test2   Test3
6 *Karen     F     1.45    3.46     5
7 *Charles   M     3.22    2.43     3
8 *Anthony   M     5.00    3.27     2
9 *Lisa      F    -0.76    4.03     3
10 *Mike      M     1.37    1.88     3
11 *William   M     4.65    -        2
12 *Ann       F     2.16    4.98     2
13 *
14 *MASK=--AAW   / to indicate selection of variables (columns)
15 *CASES=Sex:M  / to indicate selection of observations (lines)
16 *
17 *MEAN TEST,19_
18 *
19 * Means of variables in TEST N=4 Weight=Test3
20 * Variable     Mean     N(missing)
21 * Test1     3.307000       0
22 * Test2     2.433750       1
23 *
``````

Here we have a small application where the data set is on edit lines 4-12, the MEAN operation on line 17 and results (which we hope to receive after activation of the MEAN line) on lines 19-22.

We assume that the MEAN operation has the following syntax:

``````MEAN <SURVO_84C_data>,<first_line_for_the_results>
``````

To select variables and observations, we have used two extra specifications (on lines 14-15). There `MASK=--AAW` selects only columns #3 and #4 `(Test1,Test2)` for the analysis and column #5 `(Test3)` is used as a weight variable. `CASES=Sex:M` indicates that only observations with `Sex=M` are selected.

We shall see that there will be still more options available if the MEAN module is written according to the standards of SURVO 84C, and all this is achieved with a minimal effort by using ready-made tools of the SURVO 84C libraries.

It should also be noted that the structure of more complicated modules does not differ from that of this example.

The !MEAN module has only one compiland and its main function is listed below in several parts. The line numbers have been added for easier reference.

``````  1 /* !mean.c 21.2.1986/SM (19.3.1989)
2 */
3
4 #include <stdio.h>
5 #include <stdlib.h>
6 #include <conio.h>
7 #include <malloc.h>
8 #include "survo.h"
9 #include "survoext.h"
10 #include "survodat.h"
11
12 SURVO_DATA d;
13 double *sum;       /* sums of active variables */
14 long   *f;         /* frequencies */
15 double *w;         /* sums of weights */
16
17 long n;
18 int weight_variable;
19 int results_line;
20
21 main(argc,argv)
22 int argc; char *argv[];
23         {
24         int i;
25
26         if (argc==1)
27             {
28             printf("This program can be used as a SURVO 84C module only.");
29             return;
30             }
31         s_init(argv[1]);
32         if (g<2)
33             {
34             init_remarks();
35             rem_pr("MEAN <data>,<output_line>        / S.Mustonen 4.3.1989");
36             rem_pr("computes means of active variables. Cases can be limited");
37             rem_pr("by IND and CASES specifications. The observations can be");
38             rem_pr("weighted by a variable activated by 'W'.");
39             wait_remarks(2);
40             return;
41             }
42         results_line=0;
43         if (g>2)
44             {
45             results_line=edline2(word[2],1,1);
46             if (results_line==0) return;
47             }
48         i=data_open(word[1],&d); if (i<0) return;
49         i=sp_init(r1+r-1); if (i<0) return;
50         i=mask(&d); if (i<0) return;
51         weight_variable=activated(&d,'W');
52         i=test_scaletypes(); if (i<0) return;
53         i=conditions(&d); if (i<0) return;  /* permitted only once */
54         i=space_allocation(); if (i<0) return;
55         compute_sums();
56         printout();
57         free(sum); free(f); free(w);
58         data_close(&d);
59         }
``````

Among the include lines, 8-10 refer to special SURVO 84C include files. Lines 8-9 should always be present in modules. Line 10 (`survodat.h`) is needed especially in those modules where SURVO 84C data sets and data files are employed.

Line 12 declares the `SURVO_DATA` structure `d` which may represent either a data set in the edit field (as DATA TEST in our example) or a SURVO 84C data file or part of it or even a matrix file. The writer of the module has no need to know the actual form of the data set. By using the tools provided by the SURVO 84C library (like `data_open` on line 48), all these alternatives can be handled similarly. In rare cases where a distinction has to be made, the `d.type` member of the `SURVO_DATA` structure `d` gives the type of the data set at hand.
On lines 13-15, pointers to various arrays used in MEAN are declared. In order to make the modules general and flexible, we avoid fixed limits in arrays. Therefore all arrays whose sizes depend on application (like number of variables in the analysis) should be defined dynamically. This is done by using the standard space allocation function `malloc`. It has been employed here for all space reservations through the `space_allocation` call on line 54.
Finally, before the main function starts, certain global variables are declared on lines 17-19. To shorten the function calls, we usually prefer using static variables.

When calling the !MEAN module as a child process, the main program of SURVO 84C passes only one parameter (address of the pointer to the array of system pointers as a string). In the main function of !MEAN this parameter (`argv[1]`) is needed in the `s_init` call (line 31). It declares all important SURVO 84C system parameters and variables for !MEAN. Thereafter writing of code in !MEAN is like making more functions for the main program.

However, before the `s_init` call, lines 26-30 are given in order to prevent misuse of !MEAN (direct call of !MEAN from the MS-DOS level).
After the `s_init` call we have, for example, `r`=current line on the screen and `r1`=first visible edit line on the screen. Hence `r1+r-1` is the current (activated) edit line. See the library reference of `s_init` for the complete list of system variables which are initialized by `s_init`.
The `s_init` function also analyzes the edit line (`MEAN TEST,19`) which was activated by the user and splits it into parts `word[0]="MEAN"`, `word[1]="TEST"` and `word[2]="19"`, giving the total number of 'words' found as `g`. (In this case `g`=3.)
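
The splitting of the activated line into words can be pictured with a small stand-alone sketch. It is not the library's `s_init`; the helper name `split_words` and the limit `MAX_WORDS` are hypothetical, and the sketch assumes that spaces and commas are the only separators.

```c
#include <string.h>

#define MAX_WORDS 16   /* hypothetical limit, not a SURVO 84C constant */

/* Hypothetical helper: splits the activated line at spaces and commas,
   stores pointers to the parts in word[] and returns their count (the
   role played by g in the text). The line is modified in place. */
int split_words(char *line, char *word[], int max_words)
{
    int g = 0;
    char *p = strtok(line, " ,");

    while (p != NULL && g < max_words) {
        word[g++] = p;
        p = strtok(NULL, " ,");
    }
    return g;
}
```

Applied to a writable copy of `MEAN TEST,19`, the sketch yields three words, which matches the values of `word[]` and `g` described above.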

Lines 32-41 are for testing the completeness of the user's call. Observe that `MEAN TEST` without an edit line for the results is allowed and thus only the case (`g`<2) (mere `MEAN` activated) leads to an error message.
In such a case, the standard modules typically give a short notice of their usage like "Usage: MEAN <data>, L" and the user can get more information by consulting the inquiry system of SURVO 84C.
On a new module written by the user, the inquiry system cannot provide any information. Therefore it is important to give longer explanations telling all essential features. This should be done with functions `init_remarks`, `rem_pr`, and `wait_remarks` as shown on lines 32-41. These functions emulate the behaviour of the inquiry system. For example, the user can load the explanations appearing on the screen to the edit field.

The next section in the main function (lines 42-47) deals with output in the edit field. As pointed out earlier, the line label (or number) for the results in the edit field may be omitted (case `results_line=0`). If the line for the results is given (i.e. `g`>2), it is found by the SURVO 84C library function `edline2` (line 45). If no edit line corresponding to the user's command is found, `edline2` gives an error message and returns 0 instead of the line number.
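
The convention of returning 0 for an unusable line can be illustrated as follows. This is only a sketch under stated assumptions: `parse_result_line` is a hypothetical helper that accepts plain line numbers, whereas the text says the real function also accepts line labels and displays an error message itself.

```c
#include <stdlib.h>

/* Hypothetical helper: converts a numeric line argument to a line number
   in the range 1..last_line, or returns 0 if it is not a valid line. */
int parse_result_line(const char *s, int last_line)
{
    char *end;
    long value = strtol(s, &end, 10);

    if (end == s || *end != '\0') return 0;        /* not a plain number */
    if (value < 1 || value > last_line) return 0;  /* outside the edit field */
    return (int)value;
}
```

Because 0 is never a valid edit line, the same value can serve both as "no line given" and as the failure signal, which is how `results_line` is used in the listing.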

Line 48, `i=data_open(word[1],&d); if (i<0) return;`, opens the data set and initializes several variables (members of the structure `SURVO_DATA d`) describing the size and the structure of the data set. For example, we have the following information readily available for the subsequent processing:

| Member | Meaning | Type |
|---|---|---|
| `d.m` | # of variables in data | `int` |
| `d.m_act` | # of active variables | `int` |
| `d.n` | # of observations in data | `long` |
| `d.l1` | first active observation | `long` |
| `d.l2` | last active observation | `long` |
| `d.varname[0], ..., d.varname[d.m-1]` | names of variables | `char **` |
| `d.vartype[0], ..., d.vartype[d.m-1]` | types of variables; byte 0: type 1, 2, 4, 8 or S; byte 1: activation; byte 2: protection; byte 3: scale type; bytes 4-: other mask bytes | `char **` |
| `d.v[0], ..., d.v[d.m_act-1]` | indices of the active variables | `int *` |

If the data is not available, `data_open` displays an error message and returns -1. In that case there is an immediate return to the main program of SURVO 84C.

In SURVO 84C, the operations are not only controlled by parameters written on the activated line (like `TEST` and `19` in our example), but the modules can also be guided by using various specifications written around the activated line anywhere in the edit field. In our example, such specifications are `MASK=--AAW` and `CASES=Sex:M` .
To take their effects into consideration, we must first read all the specifications written in the current edit field. This happens by calling the `sp_init` function once (line 49: `sp_init(r1+r-1);`) where the argument refers to the line currently activated. It tells `sp_init` to look for specifications primarily around that line. Later the `spfind` function is called repeatedly to find specifications from the list generated by `sp_init`.
The `mask` function (on line 50) has the task of analysing the VARS specification (or if it does not appear, the MASK specification) through the `spfind` function. If VARS or MASK exists, `mask` corrects the activation status of each variable accordingly. If VARS (MASK) is not given, the status of the data set itself determines which are active variables.

Line 51 checks whether any of the variables in the data set have been activated by `'W'` (using the `activated` function). If such a variable is found (as `Test3` in our example) the index of that variable is returned and it serves as a weight variable in the computations. Otherwise `activated` returns -1.
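
How a mask such as `--AAW` translates into the list of active variables and the index of the weight variable can be sketched without the library. Both helpers below are hypothetical stand-ins, not the real `mask` and `activated` functions, and they assume that every character other than `-` activates the corresponding column.

```c
#include <string.h>

/* Hypothetical helper: fills v[] with the indices of the variables whose
   mask character is not '-', and returns their count (the role of d.m_act). */
int active_indices(const char *mask, int v[], int max_vars)
{
    int i, m_act = 0;

    for (i = 0; mask[i] != '\0' && m_act < max_vars; ++i)
        if (mask[i] != '-') v[m_act++] = i;
    return m_act;
}

/* Hypothetical helper: returns the index of the first variable marked
   with the character ch, or -1 if there is none. */
int find_activated(const char *mask, char ch)
{
    const char *p = strchr(mask, ch);

    return (p == NULL) ? -1 : (int)(p - mask);
}
```

For the mask of our example the sketch gives three active variables with indices 2, 3 and 4, and the weight variable has index 4 (`Test3`).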

One of the unique features of SURVO 84C is the possibility to assess the validity of various statistical methods by checking the scale types of variables. Scale types can be declared for variables in data files only. The user has the freedom to use or not to use this facility. The `test_scaletypes` call on line 52 does the job in a positive case.
The observations may be restricted by the CASES and IND specifications. The `conditions` function (called on line 53) tests that those specifications, if used at all, are written correctly and initializes system variables which are used for scanning data during the computation (through a function called `unsuitable`).
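
The effect of a restriction like `CASES=Sex:M` on a single observation can be pictured as follows. This is a simplified sketch: `case_is_suitable` is a hypothetical helper, it handles only one `variable:value` pair, and its return value has the opposite sense of the library's `unsuitable`.

```c
#include <string.h>

/* Hypothetical helper: spec is the value part of the specification
   (for example "Sex:M"); names[] and values[] are parallel arrays
   describing one observation with m variables.
   Returns 1 = suitable, 0 = unsuitable,
   -1 = specification malformed or variable unknown. */
int case_is_suitable(const char *spec, const char *names[],
                     const char *values[], int m)
{
    const char *colon = strchr(spec, ':');
    size_t name_len;
    int i;

    if (colon == NULL || colon == spec) return -1;
    name_len = (size_t)(colon - spec);
    for (i = 0; i < m; ++i)
        if (strlen(names[i]) == name_len &&
            strncmp(names[i], spec, name_len) == 0)
            return strcmp(values[i], colon + 1) == 0;
    return -1;
}
```

The -1 case corresponds to the situation in which the check of the specifications fails before any data is scanned.
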
After these preliminary checks, we are ready to allocate space for frequencies, sums of weights and weighted sums of observations. The dimension of these arrays must be `d.m_act`. This happens by calling `space_allocation` (line 54).
If the space is successfully allocated (there is no negative response), the actual computations can start (`compute_sums`) and the results are printed (`printout`).
Finally (on lines 57-58), the allocated space is freed and the data set closed before returning to the main program of SURVO 84C and to the normal editing mode.

Most of the functions called by the main function of !MEAN are either in the Microsoft C run-time library or in the SURVO 84C libraries. The descriptions of the SURVO 84C library functions will be given later in this paper.
Only 4 of the functions called in the main function are specific to the !MEAN module, namely `test_scaletypes`, `space_allocation`, `compute_sums`, and `printout`. Since !MEAN is a very small module, all of them are in the same compiland together with the main function.

The `test_scaletypes` function has the following form:

`````` 61 test_scaletypes()
62         {
63         int i,scale_error;
64
65         scales(&d);
66         if (weight_variable>=0)
67             {
68             if (!scale_ok(&d,weight_variable,RATIO_SCALE))
69                 {
70                 sprintf(sbuf,"\nWeight variable %.8s must have ratio scale!",
71                           d.varname[weight_variable]); sur_print(sbuf);
72                 WAIT; if (scale_check==SCALE_INTERRUPT) return(-1);
73                 }
74             }
75         scale_error=0;
76         for (i=0; i<d.m_act; ++i)
77             {
78             if (!scale_ok(&d,d.v[i],SCORE_SCALE))
79                 {
80                 if (!scale_error)
81                     sur_print("\nInvalid scale in variables: ");
82                 scale_error=1;
83                 sprintf(sbuf,"%.8s ",d.varname[d.v[i]]); sur_print(sbuf);
84                 }
85             }
86         if (scale_error)
87             {
88             sur_print("\nIn MEAN score scale at least is expected!");
89             WAIT; if (scale_check==SCALE_INTERRUPT) return(-1);
90             }
91         return(1);
92         }
``````

The task of this function is to check the scale types of variables selected for the analysis. In small data sets written in the edit field, the scale types of the variables (columns) cannot be given and then no checks are performed; `test_scaletypes` will simply return 1 which means that everything is OK. However, in data sets saved in SURVO 84C data files, each variable can be labelled with a one-character label (mask column #3) which tells the scale type. For example, variables with a ratio scale are labelled with `R` (discrete), `r` (continuous) or `F` (variable is a frequency). If the user omits these labels (each scale label is then a blank), SURVO 84C will skip all scale checks.
In any case, at first the `scales` function is called to remove variables which have the scale type label `-`, which means that the variable in question has no scale at all. For example, 'names' and 'addresses' are typically variables (fields) without a scale. Of course, a careful user does not select such variables for computations, but it is safer to have an extra check by the `scales` function in order to avoid harmful consequences.
On lines 66-74 the program tests the scale of the weight variable (if it is used). It is done by using the `scale_ok` function which is set to require `RATIO_SCALE` for the weight variable. `RATIO_SCALE` is a predefined (in `survodat.h`) string constant `" RrF"` telling the permitted scale type alternatives.
If the scale is not OK, an error message is displayed (on lines 70-71). The continuation depends on the value of the SURVO 84C system parameter `scale_check`. This parameter can be set to 0, 1 or 2 by the user, where 0 means that `scale_ok` always returns 1 and no warnings or error messages are given, i.e. everything is accepted. The value `scale_check`=1 implies that messages are given as warnings, but the analysis can be continued. At the strictest level (value `SCALE_INTERRUPT`=2) the process is actually interrupted as we can see on line 72.
The remaining lines of `test_scaletypes` are devoted to corresponding checks for active variables which now should have a `SCORE_SCALE` at least. See how the `d.v[]` array selects the `d.m_act` variables from all `d.m` variables. (In our example `d.m`=5, `d.m_act`=3 and `d.v[0]`=2, `d.v[1]`=3, `d.v[2]`=4.)
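
The decision behind a scale check can be condensed into a few lines. The sketch below is not the library's `scale_ok`: `scale_label_ok` is a hypothetical function, the string of permitted labels is passed in by the caller, and the checking level is an ordinary argument instead of the system parameter `scale_check`.

```c
#include <string.h>

/* Hypothetical stand-in for the test behind scale_ok: a scale label is
   accepted if checking is switched off (level 0) or if the label occurs
   in the string of permitted alternatives. */
int scale_label_ok(char label, const char *permitted, int check_level)
{
    if (check_level == 0) return 1;
    return label != '\0' && strchr(permitted, label) != NULL;
}
```

With a permitted string that contains a blank, an unlabelled variable passes at every level, which is consistent with the rule that omitted labels switch the checks off.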

The error messages and warnings are given by producing an output string by the standard `sprintf` function (usually to a global buffer `sbuf` of max. 256 characters) and then yielding the output by `sur_print(sbuf)`.

The next function to be introduced is `space_allocation`:

`````` 94 space_allocation()
95         {
96         sum=(double *)malloc(d.m_act*sizeof(double));
97         if (sum==NULL) { not_enough_memory(); return(-1); }
98         f=(long *)malloc(d.m_act*sizeof(long));
99         if (f==NULL) { not_enough_memory(); return(-1); }
100         w=(double *)malloc(d.m_act*sizeof(double));
101         if (w==NULL) { not_enough_memory(); return(-1); }
102         return(1);
103         }
104
105 not_enough_memory()
106         {
107         sur_print("\nNot enough memory! (MEAN)");
108         WAIT;
109         }
``````

This function allocates memory for arrays `sum`, `f` and `w`, which all should have `d.m_act` elements.
It is strongly recommended to use dynamic memory allocation for all working space which depends on the size of the data set. Then there are no theoretical limits on the number of variables, etc. In practice there are always some limits. On 16-bit micros we typically still have the 64KB limit for a single array unless the huge memory model is used.
Since errors in memory allocation may have very surprising consequences, it is, of course, possible to start with fixed dimensions and to switch to dynamic arrays later, when all the space requirements are clear.
For example, the lines 13-16 in the main function could read:

`````` 13 #define MAX 100
14 double sum[MAX];       /* sums of active variables */
15 long f[MAX];           /* frequencies */
16 double w[MAX];         /* sums of weights */
``````

and `space_allocation` is not needed at all, but this should be a temporary arrangement only.
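
One possible variation of the dynamic approach, sketched here under our own assumptions and not taken from the module, gathers the three arrays into a structure and releases everything reserved so far if any single allocation fails. The names `work_space`, `work_space_alloc` and `work_space_free` are hypothetical.

```c
#include <stdlib.h>

struct work_space {
    double *sum;   /* weighted sums */
    long   *f;     /* frequencies */
    double *w;     /* sums of weights */
};

/* Hypothetical variant of the allocation step: reserves three arrays of
   m_act elements. calloc clears them to all-bits-zero, which reads as
   0 and 0.0 on the usual IEEE 754 systems. On failure everything
   reserved so far is released and -1 is returned. */
int work_space_alloc(struct work_space *ws, size_t m_act)
{
    ws->sum = calloc(m_act, sizeof *ws->sum);
    ws->f   = calloc(m_act, sizeof *ws->f);
    ws->w   = calloc(m_act, sizeof *ws->w);
    if (ws->sum == NULL || ws->f == NULL || ws->w == NULL) {
        free(ws->sum); free(ws->f); free(ws->w);   /* free(NULL) is harmless */
        ws->sum = NULL; ws->f = NULL; ws->w = NULL;
        return -1;
    }
    return 1;
}

/* Releases the arrays and resets the pointers. */
void work_space_free(struct work_space *ws)
{
    free(ws->sum); free(ws->f); free(ws->w);
    ws->sum = NULL; ws->f = NULL; ws->w = NULL;
}
```

The return values 1 and -1 follow the convention of the listing, so a caller could test the result in the same way as on line 54.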

The data set will be scanned by the `compute_sums` function:

``````111 compute_sums()
112         {
113         int i;
114         long l;
115
116         n=0L;
117         for (i=0; i<d.m_act; ++i)
118             { f[i]=0L; w[i]=0.0; sum[i]=0.0; }
119
120         sur_print("\n");
121         for (l=d.l1; l<=d.l2; ++l)
122             {
123             double weight;
124
125             if (unsuitable(&d,l)) continue;
126             if (weight_variable==-1) weight=1.0;
127             else
128                 {
129                 data_load(&d,l,weight_variable,&weight);
130                 if (weight==MISSING8) continue;
131                 }
132             ++n;
133             sprintf(sbuf,"%ld ",l); sur_print(sbuf);
134             for (i=0; i<d.m_act; ++i)
135                 {
136                 double x;
137
138                 if (d.v[i]==weight_variable) continue;
139                 data_load(&d,l,d.v[i],&x);
140                 if (x==MISSING8) continue;
141                 ++f[i]; w[i]+=weight; sum[i]+=weight*x;
142                 }
143             }
144         }
``````

At first, the work space is cleared (lines 116-118) and then the rest of the function consists of a loop for active observations (from `d.l1` to `d.l2`). In this loop the function `unsuitable` checks (line 125) whether the conditions (set by `conditions` in the main function) are met in the current observation `l`. If not, the rest of the loop is skipped.
If the observation is accepted, first the value of the possible weight variable is read by the `data_load` function (line 129). If `weight` is missing (line 130), the rest of the loop is skipped. If there is no weight variable, `weight=1.0` is selected (line 126).
Thereafter the number of cases `n` is increased by one and the order of the current observation is displayed on the screen to indicate that the run is going on (lines 132-133).
In the inner loop (lines 134-142) all the active variables are scanned and the cumulative sums updated. However, the weight variable is skipped (on line 138). Similarly, possible missing values of active variables are omitted. By comparing `n` to `f[i]` we can see the number of missing observations in each variable separately.
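
The accumulation scheme can be tried out without the SURVO 84C libraries by keeping the data in ordinary arrays. In the sketch below the function `accumulate` and the missing-value code `MISSING_SKETCH` are hypothetical (the latter is not the library's `MISSING8`), and the selection of cases is assumed to have been done already.

```c
#define MISSING_SKETCH (-1.0e30)   /* hypothetical missing-value code */

/* Accumulates frequencies f[], sums of weights w[] and weighted sums
   sum[] for m variables over n_obs observations stored row by row in
   x[]. weight[] holds one weight per observation. Returns the number
   of accepted observations. */
long accumulate(const double *x, const double *weight, long n_obs, int m,
                long *f, double *w, double *sum)
{
    long l, n = 0;
    int i;

    for (i = 0; i < m; ++i) { f[i] = 0L; w[i] = 0.0; sum[i] = 0.0; }
    for (l = 0; l < n_obs; ++l) {
        if (weight[l] == MISSING_SKETCH) continue;  /* skip whole observation */
        ++n;
        for (i = 0; i < m; ++i) {
            double value = x[l * m + i];

            if (value == MISSING_SKETCH) continue;  /* skip this variable only */
            ++f[i]; w[i] += weight[l]; sum[i] += weight[l] * value;
        }
    }
    return n;
}
```

Fed with the four male observations of the example (`Test1` and `Test2` as variables, `Test3` as weight), the quotients `sum[i]/w[i]` come to 33.07/10 = 3.307 and 19.47/8 = 2.43375, and `n-f[1]` equals 1, in agreement with the expected results on edit lines 21-22.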

The final task of the !MEAN module is to give the results by calling the `printout` function:

``````146 printout()
147         {
148         int i;
149         char line[LLENGTH];
150         char mean[32];
151
152         output_open(eout);
153         sprintf(line," Means of variables in %s N=%ld%c",
154                           word[1],n,EOS);
155         if (weight_variable>=0)
156             {
157             strcat(line," Weight=");
158             strncat(line,d.varname[weight_variable],8);
159             }
160         print_line(line);
161         strcpy(line," Variable     Mean     N(missing)");
162         print_line(line);
163         for (i=0; i<d.m_act; ++i)
164             {
165             if (d.v[i]==weight_variable) continue;
166             if (w[i]==0.0)
167                 sprintf(line," %-8.8s            -  %6ld",d.varname[d.v[i]],
168                          n-f[i]);
169             else
170                 {
171                 fnconv(sum[i]/w[i],accuracy+2,mean);
172                 sprintf(line," %-8.8s %s  %6ld",d.varname[d.v[i]],
173                              mean,n-f[i]);
174                 }
175             print_line(line);
176             }
177         output_close(eout);
178         }
179
180 print_line(line)
181 char *line;
182         {
183         output_line(line,eout,results_line);
184         if (results_line) ++results_line;
185         }
``````

At first the output file/device `eout` is opened by the `output_open` function. Thereafter lines can be written to `eout` by the `output_line` function (called in the function `print_line` on line 183). The lines are appended to the file. So no previous results are overwritten.
The SURVO 84C library function `output_line` writes also lines in the current edit field provided that the third argument (here `results_line`) gives a valid line number. Remember that the first line for the results was optional in the MEAN operation and we set `results_line=0` (on line 42) if that line label was missing.
`print_line` (lines 180-185) is only an auxiliary function to keep an eye on the current output line in the edit field.
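
The bookkeeping done by `print_line` can be mimicked with an in-memory array standing in for the edit field. Everything in this sketch is hypothetical (`emit_line`, `field`, `FIELD_LINES`, `FIELD_WIDTH`); it is not the library's `output_line`, it only shows how a line counter of 0 suppresses writing to the field.

```c
#include <stdio.h>
#include <string.h>

#define FIELD_LINES 100   /* hypothetical size of the in-memory edit field */
#define FIELD_WIDTH 100

static char field[FIELD_LINES + 1][FIELD_WIDTH + 1];  /* lines 1..FIELD_LINES */

/* Hypothetical stand-in for the output step: appends the line to the
   open stream (if any) and, when *results_line is a valid line number,
   also copies it to that line of the field and advances the counter. */
void emit_line(const char *line, FILE *out, int *results_line)
{
    if (out != NULL) fprintf(out, "%s\n", line);
    if (*results_line >= 1 && *results_line <= FIELD_LINES) {
        strncpy(field[*results_line], line, FIELD_WIDTH);
        field[*results_line][FIELD_WIDTH] = '\0';
        ++*results_line;
    }
}
```

Starting with a counter of 19, two calls fill lines 19 and 20 and leave the counter at 21; with a counter of 0 the field is left untouched.
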
It is a practice in SURVO 84C that the numerical accuracy of the printed numbers can be controlled by the user. This happens by using the system parameter `accuracy` (typically set to the value 7 in SURVO.APU) which gives the desired number of significant digits in printed results. The writers of the modules must take the current value of `accuracy` into account when selecting the printout parameters. The library function `fnconv` is often useful in this task. Here (on line 171) it formats the means. `accuracy+2` gives the total length of the resulting string `mean`; we must have one extra place for the sign and one for the decimal point.
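
The width arithmetic can be illustrated with standard C only. The function below is a hypothetical stand-in and not the library's `fnconv`, whose exact rules are not reproduced here: it reserves one place for the sign and one for the decimal point and spends the remaining places on digits.

```c
#include <stdio.h>

/* Hypothetical stand-in, NOT the library's fnconv: prints x
   right-justified in `width` characters when it fits, reserving one
   place for the sign and one for the decimal point. Returns the value
   of snprintf. Rounding that adds an integer digit (for example
   9.9999999) is not handled in this sketch. */
int format_fixed_width(double x, int width, char *out, size_t out_size)
{
    double a = (x < 0.0) ? -x : x;
    int int_digits = 1;
    int decimals;

    while (a >= 10.0) { a /= 10.0; ++int_digits; }
    decimals = width - 2 - int_digits;
    if (decimals < 0) decimals = 0;
    return snprintf(out, out_size, "%*.*f", width, decimals, x);
}
```

With a width of 9 (that is, `accuracy`=7 plus 2) the value 3.307 becomes the 9-character string ` 3.307000`, which agrees with the column layout of the expected results in the edit field.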

These 185 lines constitute the whole !MEAN module in its source form. Since several library functions were employed and there are many 'hidden' or optional properties included, the total amount of code after compiling and linking is about 60KB. However, if the module grows, the actual code size does not grow proportionally. For example, !MEAN can be considered a tiny special case of the !CORR module which computes standard deviations and correlations in addition to means, but the size of !CORR is only 6KB more than the size of !MEAN. Thus it is profitable to create modules with several tasks and options.

All compilands of SURVO 84C modules have to be compiled in the large memory model because the SURVO 84C libraries (`SURVO.LIB`, `SURVOMAT.LIB`, etc.) are available in this model only. Thus, the `!MEAN.C` file is compiled by the command

``````   CL /c /AL !MEAN.C
``````
and it is linked by
``````   LINK !MEAN,,NUL.MAP,SURVO /STACK:4000 /NOE
``````

!MEAN was made and presented only for illustration. Source codes for selected true SURVO 84C modules are available separately.

Each module (as an `.EXE` file) is normally saved in the SURVO 84C system directory (typically `C:\E`) and activated by the user as `MEAN`. During the testing stage, it can be activated from any disk or path. For example, if `!MEAN.EXE` is on the disk `A:`,

``````   A:!MEAN DATA1,11
``````
is a valid command in SURVO 84C.
