Remove only certain duplicate lines with awk

Basic solution

http://www.unix.com/shell-programming-and-scripting/153131-remove-duplicate-lines-using-awk.html demonstrates and explains how to use awk to remove duplicate lines in a stream without having to sort them. This statement is really useful.
awk '!x[$0]++'

The fancy solution

But if you need certain duplicated lines preserved, such as the COMMIT statements in the output of iptables-save, you can use this one-liner:
iptables-save | awk '!asdf[$0]++; /COMMIT|Completed on|Generated by/;' | uniq
The second awk rule prints again any line that matches “COMMIT” or “Completed on” or “Generated by,” which appear multiple times in the iptables-save output. I was programmatically adding rules and one host in particular was just adding new ones despite the identical rule already existing. So I had to remove the duplicates and save the output, but keep all the duplicate “COMMIT” statements. I also wanted to keep all the comments as well.

Advertisements

One thought on “Remove only certain duplicate lines with awk

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s