Power to Build

File Checking – Perl to the rescue

One of my coworkers posted about an issue with one of the data files we receive from the outside world. We have a batch program that processes these files and posts transactions into our system. The issue was that one record in a file had an extra byte, and no amount of checking revealed the “bad” row. After trying different checks, he downloaded the file to a PC, opened it in TextPad and turned on visible spaces to find which row had the extra space. Very tedious, but it works.

As you may know, Perl is a very good scripting tool for such purposes. I’ve created a sample Perl script that does basic checks, from file size to record size. If it finds any discrepancy, it prints the line numbers and the records involved (see the sample output below). For those interested, I’m attaching a copy (text version of the script) here for reference.

(If you upload the script to a Unix machine, you need to run chmod +x on it to make it executable.)

Please let me know if you want more information. Feel free to change the script as needed (but please send me a copy, so I can keep mine updated).

Sample Usage:

/tmp/chkfile.pl FIN_09022011_131353.txt 259

The first parameter is the file name and the second parameter is the expected record size.

Sample Output:

$/home/svaradar/dev/perl
$ /tmp/chkfile.pl FIN_09022011_131353.txt 259
Name of the file              : FIN_09022011_131353.txt
File Size                     : 1554 bytes
Record Size expected          : 259
Total # of lines in file      : 6
File appears to be a DOS file. (contains carriage returns)
All rows match!

After creating a bad record (I just “fixed” one of the records to make it 260 chars long):

$ /tmp/chkfile.pl FIN_09022011_131353.txt 259
Name of the file              : FIN_09022011_131353.txt
File Size                     : 1555 bytes
Record Size expected          : 259
Total # of lines in file      : 6
File appears to be a DOS file. (contains carriage returns)
Following rows were unmatched:
+4 –  Size: 260 – << P0001CHK0000000000000374.4600099999990001400001111              011000015                                                                                     Sample Record                                                                                     <CR>>>
Note on <CR>: the script translates CTRL-M (carriage return) to a printable <CR>; otherwise the carriage return would have shown up as just a blank line in the output!
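
The translation of carriage returns into a printable tag could look roughly like the sketch below. This is not taken from chkfile.pl; the helper name show_hidden and the <0xNN> tag for other non-printable bytes are assumptions made for illustration, and the data is assumed to be plain ASCII.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): make hidden bytes
# in a record visible before printing it.
sub show_hidden {
    my ($rec) = @_;
    $rec =~ s/\n\z//;    # drop only the trailing newline
    # Carriage return becomes <CR>; any other byte outside printable
    # ASCII becomes a hex tag such as <0x09> (assumed format).
    $rec =~ s/([^\x20-\x7E])/$1 eq "\r" ? '<CR>' : sprintf('<0x%02X>', ord $1)/ge;
    return $rec;
}

# A DOS-style record: the carriage return shows up as a tag.
print show_hidden("Sample Record \r\n"), "\n";
```

Tagging the byte instead of deleting it keeps the reported record size and the printed record consistent with each other, which helps when the extra byte is the very thing being hunted.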

The file below contains the Perl script in PDF format:

chkfile.pl

 

Also, copied to pastebin.com here.
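
For readers who cannot open the attachment, the core record-size check can be sketched as below. This is a minimal sketch and not the original chkfile.pl: the sub name find_bad_records is hypothetical, and it assumes the expected size counts the line terminator, which is consistent with the sample above (6 lines × 259 = 1554 bytes).

```perl
use strict;
use warnings;

# Hypothetical helper (not the original script): read records from an
# open filehandle and return [line number, size, raw record] for each
# record whose size differs from the expected one.
# Assumption: size includes the terminator (CR LF on a DOS file).
sub find_bad_records {
    my ($fh, $expected) = @_;
    my $line_no = 0;
    my @bad;
    while (my $rec = <$fh>) {
        $line_no++;
        my $size = length $rec;
        push @bad, [$line_no, $size, $rec] if $size != $expected;
    }
    return @bad;
}

# Runs only when called with <file> <record size>, as in the usage above.
if (@ARGV == 2) {
    my ($file, $expected) = @ARGV;
    open my $fh, '<:raw', $file or die "Cannot open $file: $!";
    my @bad = find_bad_records($fh, $expected);
    close $fh;
    if (@bad) {
        print "Following rows were unmatched:\n";
        for my $r (@bad) {
            my ($line_no, $size, $rec) = @$r;
            $rec =~ s/\r/<CR>/g;    # make carriage returns visible
            $rec =~ s/\n//g;
            print "+$line_no - Size: $size - <<$rec>>\n";
        }
    } else {
        print "All rows match!\n";
    }
}
```

Opening the file with the :raw layer keeps Perl from translating line endings, so the carriage returns of a DOS file are counted in each record's size.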


Comments, please?
