In a three-day week, I only managed to learn one thing: how to get distinct IP addresses from log files.

How to get distinct IP addresses from log files

For a customer of ours, I had to go through two years of log files and find the distinct IP addresses on lines matching certain criteria. You could check those log files by hand. Sure, it would take a month or two, but it can be done. However, if you are not keen on spending your days reading log files line by line, here is what you can do:

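  1. You can grep the log files for your criteria (-i ignores case, -R searches subdirectories recursively, -n prefixes each match with its line number):
    grep -iRn "<my criteria>" --include=*.log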
  2. Then you can parse the results to get all IP addresses (each octet must be one to three digits):
    grep -o '[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}'
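  3. You can then pass them through awk to print them on separate lines (grep -o already prints one match per line, so this step leaves the list unchanged and can be skipped):
    awk 'NR%2{printf $0"\n";next;}1'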
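  4. Use awk again to print only the distinct ones (the array a counts how often each line has been seen, and a line is printed only the first time it appears):
    awk -F: '{ if (!a[$1]++ ) print ;}'
  5. Optionally, you can store the output in a file: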
    > _ip_addresses.log
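
To see what each stage produces, here is a minimal sketch that runs the steps on a made-up log file. The temporary directory, the file name app.log, the search term "login failed" and the IP addresses are all assumptions for illustration, not the customer's data. The regex requires at least one digit per octet, and step 3 is left out because grep -o already prints one match per line.

```shell
# Hypothetical sample data in a throwaway directory (all values are made up)
workdir=$(mktemp -d)
cat > "$workdir/app.log" <<'EOF'
2019-03-01 10:00:01 login failed from 10.0.0.5
2019-03-01 10:00:02 login ok from 10.0.0.7
2019-03-01 10:00:03 Login Failed from 192.168.1.20
2019-03-01 10:00:04 login failed from 10.0.0.5
EOF

# Step 1: keep only the lines matching the criteria (case-insensitive, recursive)
matches=$(grep -iRn "login failed" --include='*.log' "$workdir")

# Step 2: extract everything that looks like an IPv4 address, one per line
ips=$(printf '%s\n' "$matches" |
  grep -o '[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}')

# Step 4: keep only the first occurrence of each address
distinct=$(printf '%s\n' "$ips" | awk -F: '{ if (!a[$1]++ ) print ;}')

printf '%s\n' "$distinct"
rm -rf "$workdir"
```

On this sample, the address 10.0.0.5 appears on two matching lines but is printed once, followed by 192.168.1.20. The address 10.0.0.7 is dropped because its line does not match the criteria.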

Ideally, you want to run this in one command:

grep -iRn "<my criteria>" --include=*.log | grep -o '[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}' | awk 'NR%2{printf $0"\n";next;}1' | awk -F: '{ if (!a[$1]++ ) print ;}' > _ip_addresses.log

There you have it! The file _ip_addresses.log now contains only distinct IP addresses.
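
One caveat: the pattern only checks the shape of an address, so something like 999.1.1.1 would pass too. If that matters for your logs, a small filter along these lines could weed such values out. This is a sketch, and the function name valid_ipv4 is my own invention, not part of any tool.

```shell
# Hypothetical filter: keep only lines whose four octets are all in 0..255
valid_ipv4() {
  awk -F. 'NF == 4 {
    ok = 1
    for (i = 1; i <= 4; i++)
      if ($i !~ /^[0-9]+$/ || $i + 0 > 255) ok = 0   # reject non-numeric or too-large octets
    if (ok) print
  }'
}
```

It reads addresses on standard input, so it could be run over the result file, for example: valid_ipv4 < _ip_addresses.log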

I am pretty sure it can be done differently. You can leave your solution in the comments below.
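
For instance, here is one shorter variant, as a sketch rather than a tested replacement for the above. It assumes GNU-style grep and swaps the awk de-duplication for sort -u, which means the addresses come out sorted instead of in the order they were first seen. The function name distinct_ips and its two arguments are hypothetical.

```shell
# Hypothetical helper: print the distinct IPv4-looking strings found on lines
# that match a case-insensitive search term, in *.log files under a directory.
distinct_ips() {
  criteria=$1
  dir=$2
  # -h drops the file name prefix, so digits in file names cannot look like an IP
  grep -iRh --include='*.log' -e "$criteria" "$dir" |
    grep -Eo '[0-9]{1,3}(\.[0-9]{1,3}){3}' |   # -E allows the repeated group
    sort -u                                    # sort and drop duplicates in one go
}
```

You would call it as: distinct_ips "<my criteria>" /path/to/logs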