I recently found myself in quite a predicament. I had a very large flat file in a specific format, and I needed to delete certain record types. This is easy with regular expressions when each record sits on a single line, but that was not the case here: the record types I needed to remove spanned multiple lines. Below is an example of the type of record I needed to delete (the file spec is proprietary, so I have changed the actual values):
START PAYMENT_RECORD
123456789 80000 JOHN DOE BANK OF ANYWHERE USA 123 ANY STREET ANYTOWN USA
END PAYMENT_RECORD
With sed, however, you can still search for and delete a multi-line block using the following command:
sed '/pattern/{N;N;N;d}' filename
For my particular record type, I accomplished the task with:
sed '/^START PAYMENT_RECORD/{N;N;N;d}' payment_data_file.txt
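This command tells sed to find the text START PAYMENT_RECORD at the beginning of a line, then delete that line and the next three lines. Each N appends the next line of input to the pattern space (sed's working buffer), so N is not a line count: the number of N commands is the number of additional lines pulled in after the matching one. The d then deletes everything in the pattern space. Use one N per extra line in the record, so a record that spans four lines needs N;N;N and one that spans three lines needs N;N. Note that sed writes the result to standard output and leaves payment_data_file.txt itself unchanged, so redirect the output to a new file to keep it.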
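Before running this on real data, it is worth trying it on a throwaway file. The sketch below is only an illustration: the sample contents are made up, and it assumes a record that spans exactly four lines. It also shows a range address as an alternative. That is not the command I used above, but it deletes everything from the START line through the END line without counting lines, which may be safer if records vary in length.

```shell
# Build a small sample file. All names and values here are invented
# for illustration; the record is assumed to span four lines.
sample=$(mktemp)
cat > "$sample" <<'EOF'
HEADER LINE
START PAYMENT_RECORD
123456789 80000
JOHN DOE
END PAYMENT_RECORD
TRAILER LINE
EOF

# Fixed count: on a match, pull in three more lines (N;N;N),
# then delete all four lines held in the pattern space.
fixed=$(sed '/^START PAYMENT_RECORD/{N;N;N;d}' "$sample")

# Range address (alternative): delete from the START line through
# the END line, however many lines lie between them.
ranged=$(sed '/^START PAYMENT_RECORD/,/^END PAYMENT_RECORD/d' "$sample")

# Show both results; with this sample they should be identical.
printf '%s\n' "$fixed"
printf '%s\n' "$ranged"

rm -f "$sample"
```

With this sample, both commands should leave only the header and trailer lines. If the fixed-count version removes too much or too little on your own data, the number of N commands does not match the record length.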