INTERNET COBOL
Data Validation Lab

Bad data is a problem for businesses because it produces incorrect results and errors. In this lab, we will run data validation routines against the input data file to "clean" the data. Generally, data validation is done when the data is first captured rather than in batch processing.


 

We will use a new version of our gender / marital / salary data file that contains intentional errors. (Link to the data file). The record layout is the same as before:

columns 1 - 4 - Employee ID number (pic X(4))
columns 5 - 16 - Employee First name  (pic X(12))
columns 17 - 31 - Employee Last name  (pic X(15))
columns 32 - 36 - Employee gross salary (pic 9(5))
column 37 - Marital status (S - single; M - married)
column 38 - Gender (F - female; M - male)
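
The column layout above could be described in COBOL roughly as follows. This is only a sketch: GENDER-RECORD, EMPLOYEE-ID-IN, SALARY-IN and VALID-MARITAL are names used later in this lab, while FIRST-NAME-IN, LAST-NAME-IN, MARITAL-STATUS-IN, GENDER-IN and VALID-GENDER are assumed names. In the lab program the record would sit under the FD for the input file; here it is placed in WORKING-STORAGE and filled with made-up data so the example runs on its own.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. LAYOUT-DEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * 38-byte record, one field per column range in the layout.
       01  GENDER-RECORD.
           05  EMPLOYEE-ID-IN      PIC X(4).
           05  FIRST-NAME-IN       PIC X(12).
           05  LAST-NAME-IN        PIC X(15).
           05  SALARY-IN           PIC 9(5).
           05  MARITAL-STATUS-IN   PIC X.
               88  VALID-MARITAL   VALUE "S" "M".
           05  GENDER-IN           PIC X.
               88  VALID-GENDER    VALUE "F" "M".
       PROCEDURE DIVISION.
       0000-MAIN.
      * Hypothetical sample data, moved in field by field.
           MOVE "1234"     TO EMPLOYEE-ID-IN.
           MOVE "JANE"     TO FIRST-NAME-IN.
           MOVE "DOE"      TO LAST-NAME-IN.
           MOVE 45000      TO SALARY-IN.
           MOVE "S"        TO MARITAL-STATUS-IN.
           MOVE "F"        TO GENDER-IN.
      * The group length should match the 38 columns of the file.
           DISPLAY "RECORD LENGTH: " FUNCTION LENGTH(GENDER-RECORD).
           DISPLAY "[" GENDER-RECORD "]".
           STOP RUN.
```

The 88-level condition names let the validation code say IF NOT VALID-MARITAL instead of comparing against each literal.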

 

Each of these fields can have errors in the data.

The Employee ID number field can be spaces or can have non-numeric values.

The Employee First name and the Employee Last name fields can be filled with spaces. (There could be other errors such as misspelled names - but we can't handle that!)

The salary field can have non-numeric values or could be filled with spaces.

The only valid entries in the marital status field are "S" and "M" - any other value (including spaces) will be in error.

The only valid entries in the gender field are "F" and "M" - any other data will be in error.
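
As a rough illustration, the rules above can each be written as a single condition. The sketch below is a stand-alone program, not the lab solution: the name and gender field names, the VALID-GENDER condition, ERROR-COUNT and the sample data are all assumptions made for the example. It only displays a message for each failed check.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. RULES-DEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical bad record: ID, first name, salary and
      * marital status are wrong; last name and gender are fine.
       01  SAMPLE-RECORD.
           05  FILLER              PIC X(4)   VALUE "12A4".
           05  FILLER              PIC X(12)  VALUE SPACES.
           05  FILLER              PIC X(15)  VALUE "DOE".
           05  FILLER              PIC X(5)   VALUE "4500O".
           05  FILLER              PIC X      VALUE "D".
           05  FILLER              PIC X      VALUE "M".
       01  GENDER-RECORD.
           05  EMPLOYEE-ID-IN      PIC X(4).
           05  FIRST-NAME-IN       PIC X(12).
           05  LAST-NAME-IN        PIC X(15).
           05  SALARY-IN           PIC 9(5).
           05  MARITAL-STATUS-IN   PIC X.
               88  VALID-MARITAL   VALUE "S" "M".
           05  GENDER-IN           PIC X.
               88  VALID-GENDER    VALUE "F" "M".
       01  ERROR-COUNT             PIC 9 VALUE 0.
       PROCEDURE DIVISION.
       0000-MAIN.
      * Group move: bytes are copied as-is, bad data included.
           MOVE SAMPLE-RECORD TO GENDER-RECORD.
      * An all-space field also fails the numeric class test.
           IF EMPLOYEE-ID-IN IS NOT NUMERIC
               DISPLAY "ID ERROR"
               ADD 1 TO ERROR-COUNT
           END-IF.
           IF FIRST-NAME-IN = SPACES
               DISPLAY "FIRST NAME ERROR"
               ADD 1 TO ERROR-COUNT
           END-IF.
           IF LAST-NAME-IN = SPACES
               DISPLAY "LAST NAME ERROR"
               ADD 1 TO ERROR-COUNT
           END-IF.
           IF SALARY-IN IS NOT NUMERIC
               DISPLAY "SALARY ERROR"
               ADD 1 TO ERROR-COUNT
           END-IF.
           IF NOT VALID-MARITAL
               DISPLAY "MARITAL ERROR"
               ADD 1 TO ERROR-COUNT
           END-IF.
           IF NOT VALID-GENDER
               DISPLAY "GENDER ERROR"
               ADD 1 TO ERROR-COUNT
           END-IF.
           DISPLAY "ERRORS FOUND: " ERROR-COUNT.
           STOP RUN.
```

With this sample record, four of the six checks should fail (ID, first name, salary and marital status). The sample is moved as a whole group so that the non-numeric salary reaches SALARY-IN unchanged, the way it would arrive from a file.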


 

You will have two output files:

(1) the standard report file (from lab 8 - with the date/time and page routines)

(2) an error report
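
One possible shape for the error report is sketched below. ERROR-RECORD, ERROR-LINE, ERROR-RECORD-OUT and ERROR-MSG are the names used in the processing code of this lab; the file name ERRORS.TXT, the name ERROR-FILE, the field widths and the sample data are assumptions. ORGANIZATION IS LINE SEQUENTIAL is a common compiler extension (GnuCOBOL and Micro Focus accept it) rather than standard COBOL, so adjust it to your compiler.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. ERRFILE-DEMO.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      * "ERRORS.TXT" is a placeholder name for this sketch.
           SELECT ERROR-FILE ASSIGN TO "ERRORS.TXT"
               ORGANIZATION IS LINE SEQUENTIAL.
       DATA DIVISION.
       FILE SECTION.
       FD  ERROR-FILE.
       01  ERROR-RECORD            PIC X(80).
       WORKING-STORAGE SECTION.
      * One error line: the rejected record plus a short message.
       01  ERROR-LINE.
           05  ERROR-RECORD-OUT    PIC X(38).
           05  FILLER              PIC X(2)  VALUE SPACES.
           05  ERROR-MSG           PIC X(15).
       PROCEDURE DIVISION.
       0000-MAIN.
           OPEN OUTPUT ERROR-FILE.
      * Sample text stands in for the rejected GENDER-RECORD.
           MOVE "12A4JANE"         TO ERROR-RECORD-OUT.
           MOVE "ID-ERROR"         TO ERROR-MSG.
           WRITE ERROR-RECORD      FROM ERROR-LINE.
           CLOSE ERROR-FILE.
           STOP RUN.
```

Echoing the whole input record next to the message makes it easy to find and correct the bad record in the data file.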


 

Processing:

You must check the data for errors before doing any processing; otherwise the program will fail with an "illegal character in numeric field" or similar error.
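
A minimal sketch of why the check has to come first: the class test looks at the characters in the field without doing arithmetic, so it is safe to run on bad data, and the ADD is only reached when the field holds digits. SALARY-TEXT and TOTAL-SALARY are hypothetical names, and the REDEFINES is used here only to plant non-numeric data in a numeric field.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. GUARD-DEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Bad salary as it might arrive from the data file.
       01  SALARY-TEXT             PIC X(5) VALUE "4A000".
       01  SALARY-IN REDEFINES SALARY-TEXT PIC 9(5).
       01  TOTAL-SALARY            PIC 9(7) VALUE 0.
       PROCEDURE DIVISION.
       0000-MAIN.
      * Test the field before using it in arithmetic.
           IF SALARY-IN IS NUMERIC
               ADD SALARY-IN TO TOTAL-SALARY
           ELSE
               DISPLAY "SALARY NOT NUMERIC, RECORD SKIPPED"
           END-IF.
           DISPLAY "TOTAL: " TOTAL-SALARY.
           STOP RUN.
```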

Here is some of my processing:

       3000-PROCESS.
      * Validate first; calculate and print only clean records.
           PERFORM 3100-ERROR-ROUTINE.
           IF ERROR-FLAG = "NO"
               PERFORM 3200-MARITAL-CALCULATIONS
               PERFORM 3400-GENDER-CALCULATIONS
               PERFORM 3600-WRITE-DETAIL-LINE
           END-IF.

       3100-ERROR-ROUTINE.
      * Assume the record is good until a check fails.
           MOVE "NO"                   TO ERROR-FLAG.
      * ID must be numeric; an all-space ID is also an error.
           IF EMPLOYEE-ID-IN = SPACES OR EMPLOYEE-ID-IN IS NOT NUMERIC
               MOVE "YES"              TO ERROR-FLAG
               MOVE GENDER-RECORD      TO ERROR-RECORD-OUT
               MOVE "ID-ERROR"         TO ERROR-MSG
               WRITE ERROR-RECORD      FROM ERROR-LINE
           END-IF.
      * Salary must be numeric; spaces fail this test too.
           IF SALARY-IN IS NOT NUMERIC
               MOVE "YES"              TO ERROR-FLAG
               MOVE GENDER-RECORD      TO ERROR-RECORD-OUT
               MOVE "SALARY"           TO ERROR-MSG
               WRITE ERROR-RECORD      FROM ERROR-LINE
           END-IF.
      * Marital status must be "S" or "M".
           IF NOT VALID-MARITAL
               MOVE "YES"              TO ERROR-FLAG
               MOVE GENDER-RECORD      TO ERROR-RECORD-OUT
               MOVE "MARITAL ERR"      TO ERROR-MSG
               WRITE ERROR-RECORD      FROM ERROR-LINE
           END-IF.

In each case, we check one field for errors. If the field is in error, we move "YES" to ERROR-FLAG, copy the input record and an error message into the error line, and write the error line to the error report.


 
