        This helpfile describes the various identifiers within
        P-STAT's TURF command.                                                  
        This helpfile was last updated on April 8, 2013.                        
              *  required identifiers;        *                                 
              *  there must be an input file  *                                 
        TURF xxx,        this supplies the input filename.                      
                         Except for an optional weight variable,                
                         all variables are treated as analysis items.           
                         The values on the analysis items should                
                         be zeros or positive numbers.                          
                         A positive value signifies a "hit".                    
                         Cases with any missing or negative values on           
                         the analysis items are ignored.                        
                         The SET.MISS.TO.ZERO identifier, described             
                         below, sets missing analysis items to zeros.           
                         When case weighting is being used, any cases           
                         with a missing, negative or zero value on              
                         the weight variable are also ignored.                  
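                         The case screening rules above can be sketched
                         in Python (not P-STAT syntax). The function name
                         usable_case and the use of None for a missing
                         value are assumptions made for illustration.

```python
from typing import List, Optional, Sequence


def usable_case(items: Sequence[Optional[float]],
                weight: Optional[float] = None,
                use_weights: bool = False,
                set_miss_to_zero: bool = False) -> Optional[List[float]]:
    """Return the cleaned item values, or None if the case is ignored.

    None stands in for a missing value (an assumption of this sketch).
    """
    cleaned = []
    for value in items:
        if value is None:
            if not set_miss_to_zero:
                return None      # missing item: the case is ignored
            value = 0.0          # SET.MISS.TO.ZERO turns missing into zero
        if value < 0:
            return None          # negative item: the case is ignored
        cleaned.append(value)
    if use_weights and (weight is None or weight <= 0):
        return None              # missing, negative or zero case weight
    return cleaned
```

                         A case with a missing item is dropped unless
                         set_miss_to_zero is on; the weight is checked
                         only when case weighting is in use.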
              *  required identifiers, SIZE or STEP:      *                     
              *  what size combinations should be tried,  *                     
              *  and should the processing be stepwise ?  *                     
        SIZE 6,                                                                 
        SIZE 4 to 7  9,                                                         
        SIZE 6 to 3,                                                            
        SIZE 4 6 8,                                                             
        STEP 8 12,                                                              
        STEP 1 TO 15,                                                           
                         Either SIZE or STEP must be used.                      
                         They provide the size (or sizes) of                    
                         the item combinations to be evaluated,                 
                         and indicate if the run should be stepwise.            
                         SIZE, followed by one or more sizes,                   
                         causes a separate, independent run                     
                         for each given size. This evaluates all                
                         possible combinations for each size.                   
                         One or more sizes can be done in one run.              
                         They are done in the order given.                      
                         STEP, followed by two or more sizes,                   
                         processes the sizes in a stepwise manner.              
                         The sizes must ascend.                                 
                         The first step size is done normally, i.e.,            
                         all possible combinations are evaluated,               
                         exactly like a SIZE run.                               
                         Then, the items in its best combination for            
                         reach (or optionally for frequency)                    
                         are fed into the second size, and so on.               
                         A STEP run is usually much faster than a SIZE          
                         run because it is trying fewer combinations.           
                         However, it will probably not provide the              
                         absolutely best combination.                           
              *  STEP details  *                                                
                         STEP 8 12,  for example, does a size 8 run,            
                         and then forces the items from the best                
                         size 8 combination into a size 12 run.                 
                         STEP 1 TO 9,   creates a cascading effect              
                         in the specified results file.                         
                         This type of STEP run takes very little time.          
                         Either a reach.results file or a freq.results          
                         file (but not both) must be requested                  
                         in a STEP run.                                         
                         Providing a reach.results output file                  
                         indicates that the stepping should maximize            
                         the reach scores. Therefore, the items in              
                         the best combination on reach are forced               
                         into the next step.                                    
                         Similarly, providing a freq.results output             
                         file indicates that the stepping should                
                         maximize the frequency scores.                         
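                         The stepping logic can be sketched in Python
                         (not P-STAT syntax). This is a minimal sketch,
                         assuming unweighted reach, a threshold of one,
                         and ties going to the first combination found.
                         The names reach, best_combo and step_run are
                         hypothetical.

```python
from itertools import combinations


def reach(cases, combo):
    """Count the cases with a positive value on any item in combo."""
    return sum(1 for case in cases if any(case[i] > 0 for i in combo))


def best_combo(cases, n_items, size, forced=()):
    """Evaluate every combination of `size` items that contains `forced`."""
    free = [i for i in range(n_items) if i not in forced]
    best = None
    for extra in combinations(free, size - len(forced)):
        combo = tuple(forced) + extra
        score = reach(cases, combo)
        if best is None or score > best[0]:   # first combination wins a tie
            best = (score, combo)
    return best


def step_run(cases, n_items, sizes):
    """Run ascending sizes, forcing each best combination into the next."""
    forced = ()
    results = []
    for size in sizes:
        score, combo = best_combo(cases, n_items, size, forced)
        results.append((size, score, combo))
        forced = combo                        # carried into the next step
    return results
```

                         The first size is a full search because nothing
                         is forced yet. Each later size only chooses the
                         additional items.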
                         Trade-offs. Suppose there are 32 items.                
                         A SIZE 12 run examines all 225,792,840                 
                         combinations and by so doing finds the                 
                         truly best combination of size 12.
                         In a STEP 8 12 run, a full size 8 step is done         
                         first; it examines all 10,518,300 combinations.        
                         The next step forces the best 8 items, so it
                         chooses 4 of the remaining 24 items, which is
                         10,626 combinations.
                         Thus, the STEP 8 12 run takes only 4.7% of             
                         the time of the full SIZE 12 run.                      
                         It should be a good result, close to the best,         
                         but cannot be guaranteed to be THE best.
                         In general, the larger the initial step,               
                         the better, given that the running time                
                         is tolerable. STEP 9 12 could well yield               
                         a better result than STEP 8 12.                        
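                         The trade-off arithmetic can be checked with a
                         short Python sketch (not P-STAT syntax). The
                         helper name step_cost is hypothetical. It counts
                         combinations only, so it stands in for running
                         time on the assumption that each combination
                         costs about the same.

```python
from math import comb


def step_cost(n_items, sizes):
    """Total combinations tried when stepping through ascending sizes."""
    total = 0
    forced = 0
    for size in sizes:
        # Choose the extra items from those not already forced.
        total += comb(n_items - forced, size - forced)
        forced = size
    return total


full = comb(32, 12)              # SIZE 12 with 32 items
stepped = step_cost(32, [8, 12])  # STEP 8 12 with 32 items
print(f"SIZE 12   : {full:,}")
print(f"STEP 8 12 : {stepped:,} ({stepped / full:.1%} of the full run)")
print(f"STEP 9 12 : {step_cost(32, [9, 12]):,}")
```

                         A larger first step costs more combinations,
                         which is the price of a result that may be
                         closer to the best.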
                         The final report shows the result for each             
                         size separately. The output files show the             
                         best results from the first size, then the             
                         second, and so forth.                                  
                         Many (up to 40) sizes can be done in a run;            
                         each size must be from 1 to 60, and there              
                         should not be any repeated sizes in a run.             
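                         The size rules can be sketched in Python (not
                         P-STAT syntax). The names expand_sizes and
                         check_sizes are hypothetical. Reading
                         "6 to 3" as 6, 5, 4, 3 is an assumption based
                         on sizes being done in the order given.

```python
from typing import List


def expand_sizes(spec: str) -> List[int]:
    """Expand a size list such as '4 to 7 9' into [4, 5, 6, 7, 9]."""
    tokens = spec.split()
    sizes: List[int] = []
    i = 0
    while i < len(tokens):
        if tokens[i].lower() == "to":
            start, stop = sizes[-1], int(tokens[i + 1])
            step = 1 if stop >= start else -1   # assumed: ranges may descend
            sizes.extend(range(start + step, stop + step, step))
            i += 2
        else:
            sizes.append(int(tokens[i]))
            i += 1
    return sizes


def check_sizes(sizes: List[int], stepwise: bool = False) -> List[int]:
    """Apply the limits stated in this helpfile; raise ValueError if broken."""
    if not 1 <= len(sizes) <= 40:
        raise ValueError("a run takes from 1 to 40 sizes")
    if any(not 1 <= s <= 60 for s in sizes):
        raise ValueError("each size must be from 1 to 60")
    if len(set(sizes)) != len(sizes):
        raise ValueError("sizes must not repeat")
    if stepwise and (len(sizes) < 2 or sizes != sorted(sizes)):
        raise ValueError("STEP needs two or more ascending sizes")
    return sizes
```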
                         Note...some sizes cannot be run in a                   
                         reasonable amount of time. Consider 40 items.          
                         Depending on number of cases and on options:           
                         Size  4 takes 91,390      iterations. Seconds.         
                         Size  6 takes 3.8 million iterations. Minutes.         
                         Size 10 takes 847 million iterations. An hour.         
                         Size 15 takes  40 billion iterations. A day.           
                         Size 20 takes 137 billion iterations. A week.          
                         This command produced the above numbers.               
                         DO  #j = 1, 20;                                        
                         PUT @commas #j (combinations( 40,#j));                 
                         ENDDO $                                                
                         On PC/Windows, the F2 key can be used to               
                         cause a TURF command to abandon the current            
                         size being processed.                                  
                         It will produce the report and the output              
                         files for the sizes already completed.                 
              *  general identifiers  *                                         
        REACH.THRESHOLD 2,    optional.  can be fractional.                     
                         This permits the user to control                       
                         what constitutes a successful "reach".                 
                         The default is one; if a case has a positive           
                         response on any of the items in a given                
                         combination, that case is added to the reach           
                         total for that set of items.                           
                         Using REACH.THRESHOLD 3, for example, means            
                         a case needs a reach score of 3 or more                
                         to have been reached on a given group.                 
                         Having several responses increases a case's            
                         reach score; weighting of either items or              
                         responses can also affect the reach score.             
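                         The threshold test can be sketched in Python
                         (not P-STAT syntax). This sketch assumes the
                         reach score is the sum, over the items in the
                         combination, of the stored response times the
                         item weight. The exact formula is not given in
                         this helpfile, and the names reach_score and
                         is_reached are hypothetical.

```python
def reach_score(case, combo, item_weights=None):
    """Sum of response value times item weight over the items in combo."""
    item_weights = item_weights or {}
    return sum(case[item] * item_weights.get(item, 1.0) for item in combo)


def is_reached(case, combo, threshold=1.0, item_weights=None):
    """A case counts toward reach when its score meets the threshold."""
    return reach_score(case, combo, item_weights) >= threshold


case = {"a": 1, "b": 0, "c": 1, "d": 1}
print(is_reached(case, ["a", "b"]))                 # default threshold of 1
print(is_reached(case, ["a", "b"], threshold=2))    # one hit is not enough
print(is_reached(case, ["a", "c", "d"], threshold=3))
```

                         With 0/1 data and the default threshold, this
                         reduces to a positive response on any item.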
        PROGRESS 5,      optional.   controls how often the progress            
                         window or report line is updated.                      
                         The default is 1, which means every million            
                         combinations. PROGRESS 0  turns it off.                
        SET.MISS.TO.ZERO,  optional.  If used, missing analysis                 
                           values in the input file are set to zeros.           
                           If needed, this saves having to write                
                           some PPL as the file is read.                        
        FULL.REPORT,     The final report includes a summary about              
                         each size that was run. It uses 4 lines when           
                         there are 3 or fewer sizes, and uses a 2-line
                         form when there are more than 3 sizes.                 
                         Using FULL.REPORT causes the 4-line form               
                         (which contains more information) to be used           
                         in all situations.                                     
              *  identifiers that control the makeup  *                         
              *  of the combinations to be used       *                         
        FILTER list-of-vars min max,                                            
                         optional. This provides a limitation on the            
                         makeup of the combinations to be tried.                
                         This option used the identifier CONSTRAIN in           
                         Version 2 (before 2012). CONSTRAIN still works.        
                         Of the variables whose names (or ranges)               
                         follow FILTER, at least MIN of them                    
                         and at most MAX of them should be in                   
                         every combination that will be tried.                  
                         The MIN value can be zero.                             
                         Up to 100 such FILTER phrases can be given.            
                         Combinations are used only if they pass the            
                         constraints in every one of the FILTER phrases.
                         Each use of FILTER is followed by:                     
                        (1) The names of the variables in the group.            
                            Ranges, like TOPPING.1 TO TOPPING.8,                
                            can be used.                                        
                        (2) The smallest number of those variables              
                            that are required. Can be zero.                     
                            A combination must have AT LEAST that               
                            many of the variables in the group.                 
                        (3) The largest number of those variables               
                            that may be used.                                   
                            A combination may have AT MOST that many            
                            of the variables in the group.                      
                            All of the group could be used if the               
                            supplied number is equal to or larger               
                             than the size of the group. Therefore,
                            using 999 is a vivid way of saying there            
                            is no upper limit for the group.                    
                         For example:                                           
                         TURF xxx, size 8,                                      
                             filter  aaa         bbb to ddd  1 999,             
                             filter  eee to ggg  jjj to mmm  2 4,               
                             filter  yyy         zzz         0 1 $              
                         In the above command, the only combinations            
                         that will be evaluated are those that have             
                         at least one variable from the first group, and        
                         at least two but no more than four variables           
                         from the second group, and                             
                         no more than one variable from the third group.        
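                         The FILTER test on a single combination can be
                         sketched in Python (not P-STAT syntax). The
                         function name passes_filters is hypothetical.
                         The ranges are written out by hand, assuming
                         that bbb to ddd covers bbb, ccc, ddd, and so on;
                         the real expansion depends on the variable
                         order in the input file.

```python
def passes_filters(combo, filters):
    """True if combo satisfies the min and max of every filter group."""
    chosen = set(combo)
    return all(low <= len(chosen & set(group)) <= high
               for group, low, high in filters)


# The three FILTER phrases from the example, ranges expanded by assumption.
filters = [
    (["aaa", "bbb", "ccc", "ddd"], 1, 999),
    (["eee", "fff", "ggg", "jjj", "kkk", "lll", "mmm"], 2, 4),
    (["yyy", "zzz"], 0, 1),
]

ok = ["aaa", "eee", "fff", "yyy", "p1", "p2", "p3", "p4"]
bad = ["aaa", "eee", "fff", "yyy", "zzz", "p1", "p2", "p3"]
print(passes_filters(ok, filters), passes_filters(bad, filters))
```

                         Here p1 to p4 are made-up items that belong to
                         no filter group.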
        FORCE vars,      optional.   names or ranges of items that              
                         should be part of every combination.                   
                         Suppose there are 30 items and size is 6;              
                         without force, 593,775 combinations are done,          
                         because we take 30 items 6 at a time.                  
                         If 2 items are forced, only 20,475                     
                         combinations will be done because the run              
                         reduces to 28 items taken 4 at a time.                 
                         If size is 6 and all 6 items are forced,               
                         just that one pass will be done.                       
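                         The effect of FORCE on the number of
                         combinations can be sketched in Python (not
                         P-STAT syntax). The generator name
                         forced_combinations is hypothetical.

```python
from itertools import combinations
from math import comb


def forced_combinations(items, size, forced=()):
    """Yield each combination of `size` items that contains all of `forced`."""
    forced = list(forced)
    if len(forced) > size:
        raise ValueError("more forced items than the combination size")
    free = [item for item in items if item not in forced]
    for rest in combinations(free, size - len(forced)):
        yield tuple(forced) + rest


items = list(range(30))
print(comb(30, 6))                                          # no forced items
print(sum(1 for _ in forced_combinations(items, 6, [0, 1])))  # 2 items forced
print(list(forced_combinations(items, 6, [0, 1, 2, 3, 4, 5])))
```

                         The forced items lead each combination in this
                         sketch, in the order they were given.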
        FORCE.FIRST,     optional.  This causes the force items to be           
                         shown first in the display of each of the              
                         combinations in the reach.results and                  
                         freq.results files.                                    
                         The FORCE items are themselves in the order            
                         used in the FORCE statement itself.                    
              *  identifiers for various  *                                     
              *  kinds of weighting       *                                     
        CASE.WEIGHTS varname,  optional.                                        
                         The named variable will be used as a                   
                         case weight, and not as an analysis item.
        ITEM.WEIGHTS filename,                                                  
                         the default is to treat all of the items
                         the same, i.e., with weights of 1.                     
                         When ITEM.WEIGHTS is used, it should                   
                         be followed by the name of a p-stat system             
                         file which itself has exactly 2 variables.             
                         In each record, the first variable has the             
                         name of an item being used for the TURF
                         analysis, the second is the weight to be used          
                         for that item. The first variable is therefore         
                         character, and the second is numeric.                  
                         The file is not required to have a record              
                         for every item. In other words, some items             
                         can be given changed weights; others can               
                         be left as is ( i.e., still set to 1).                 
                         The file can have names and weights for                
                         items not used in the current run; if so,              
                         they are ignored.                                      
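                         The lookup rules for the weights file can be
                         sketched in Python (not P-STAT syntax). The
                         function name resolve_item_weights is
                         hypothetical, and the records are shown as
                         (name, weight) pairs rather than a system file.

```python
def resolve_item_weights(items, weight_records):
    """Map each analysis item to its weight, defaulting to 1."""
    weights = {item: 1.0 for item in items}
    for name, weight in weight_records:
        if name in weights:          # names not in this run are ignored
            weights[name] = float(weight)
    return weights


items = ["cola", "lemon", "cherry"]
records = [("lemon", 2.5), ("grape", 4)]   # grape is not an item in this run
print(resolve_item_weights(items, records))
```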
        RESPONSE.WEIGHTS, the default is to store the input data                
                         as zeros or ones, with one meaning a yes.              
                         This option leaves the input values intact;            
                         they should be zero (no) or a positive
                         value (not necessarily an integer) to show             
                         the INTENSITY of a yes.                                
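                         The difference between the default storage and
                         RESPONSE.WEIGHTS can be sketched in Python (not
                         P-STAT syntax). The function name stored_value
                         is hypothetical, and the input is assumed to be
                         already screened for missing and negative
                         values.

```python
def stored_value(value, response_weights=False):
    """Return the value kept for an analysis item."""
    if response_weights:
        return value                 # intensity is left intact
    return 1 if value > 0 else 0     # default: any positive value is a yes


raw = [0, 2.5, 1]
print([stored_value(v) for v in raw])
print([stored_value(v, response_weights=True) for v in raw])
```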
              *  REACH.RESULTS   *                                              
              *   FREQ.RESULTS   *                                              
        The following eight identifiers are fully described                     
        in the TURF.RESULTS helpfile.                                           
        REACH.RESULTS rrr,                                                      
        REACH.RESULTS rrr 5,                                                    
                         optional output p-stat system file.                    
                         This file holds the combinations with the              
                         best REACH values. If an integer follows the           
                         file name (like the 5 above), that many                
                         combinations will be written.                          
        REACH.STATS  cumulative.pct  unique,                                    
                         This controls which lines are printed in the           
                         REACH.RESULTS file to show the importance of           
                         the items in a combination.                            
                         REACH.STATS ALL,  would cause 7 extra stats            
                         lines to follow each combination shown.                
        FREQ.RESULTS fff,                                                       
        FREQ.RESULTS fff 10,                                                    
                         optional output p-stat system file.                    
                         This file holds the combinations with the              
                         best FREQ values.                                      
        FREQ.STATS  cumulative.pct,                                             
                         This controls which lines are printed in the           
                         FREQ.RESULTS file to show the importance of            
                         the items in a combination.                            
                         This option applies to STEP runs only.                 
                         This causes all of the items in the                    
                         REACH.RESULTS or FREQ.RESULTS files to be              
                         reordered (from best to worst) at the end              
                         of each step.                                          
                         The default is to reorder only the newly               
                         added items, keeping the original ordering             
                         of the items that were added in previous steps.        
        SHOW  pct.reached stats,                                                
                         This provides a way to specify which summary           
                         variables should appear in the REACH.RESULTS           
                         and FREQ.RESULTS files.                                
                         The default is to provide the following:               
                         size.and.rank, reach, pct.reached, freq, and           
                         stats (which identifies the additional lines).         
        OMIT  freq stats,                                                       
                         This provides a way to specify which of the            
                         default summary variables should NOT appear            
                         in the REACH.RESULTS and FREQ.RESULTS files.           
                         SHOW can accomplish the same thing, but                
                         OMIT will sometimes be easier to use.                  
                         The default is to provide the following:               
                         size.and.rank, reach, pct.reached, freq, and           
                         stats (which identifies the additional lines).         
        SHOW.ALL.COMBOS,
                         When writing a reach.results or freq.results           
                         file in a STEP run, the default is to write            
                         only the best combination from the initial             
                         steps, and then write the requested amount             
                         (default 100) from the final step.                     
                         This makes the cascading effect of the items           
                         in a STEP run easier to see.                           
                         Using SHOW.ALL.COMBOS causes the full amount           
                         of combinations to be written from EVERY step.         
                         The default in the two RESULTS files is to             
                         use 16-character short names to identify               
                         the items making up a given combination.               
                         Using FULL causes full names to be used                
                         instead. These can be as many as 64 characters.
                         The LIST command, given such a file, will fold         
                         values of more than 32 characters, using extra         
                         lines but saving width.                                
              *  other optional output file identifiers  *                      
        The following four identifiers are fully described in the               
        TURF.FILES helpfile.                                                    
        REACH.SUMMARY qqq,                                                      
                         optional output p-stat system file.                    
                         This file shows how many combinations                  
                         had each of the reach values that were found.          
        FREQ.SUMMARY qqq,                                                       
                         optional output p-stat system file.                    
                         This file shows how many combinations                  
                         had each of the freq values that were found.           
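                         The tally behind a reach summary can be
                         sketched in Python (not P-STAT syntax). This
                         assumes unweighted reach with a threshold of
                         one. The function name reach_summary and the
                         (reach, count) layout are hypothetical and do
                         not describe the actual file format.

```python
from collections import Counter
from itertools import combinations


def reach_summary(cases, n_items, size):
    """Count how many combinations produced each reach value."""
    counts = Counter()
    for combo in combinations(range(n_items), size):
        reached = sum(1 for case in cases
                      if any(case[i] > 0 for i in combo))
        counts[reached] += 1
    # Highest reach first: (reach value, number of combinations).
    return sorted(counts.items(), reverse=True)


cases = [
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 1, 1],
]
print(reach_summary(cases, 4, 2))
```

                         A frequency summary could be tallied the same
                         way, using each combination's frequency score
                         in place of its reach.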
        FULL.OUTPUT fff,                                                        
                         optional output p-stat system file.                    
                         This has the results of ALL combinations               
                         in the order that they were processed.                 
                         This should only be used in very small runs.           
        TEMPLATE ttt,                                                           
                         optional output p-stat system file.                    
                         This contains the names of the items                   
                         that comprised the best combination.                   
                         It is intended for the TURF.SCORES command.