How to analyze 404 HTTP status codes in weblogs

The dreaded 404 HTTP status code means the requested page was not found. However, many 403 and 404 responses in the weblogs can also mean that someone is attempting to crack the website, for example by probing for hidden or vulnerable URLs.

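The awk one-liner below can be a useful tool for analyzing weblogs and spotting repeated attempts to crack the web application. It assumes the common Apache/Nginx access log format, where field 9 is the status code and field 7 is the requested path. The output lists each path that returned 404 with its hit count, most frequent last.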

awk '$9 == 404 {print $7}' access.log | sort | uniq -c | sort -n

The script can also be tweaked for other HTTP status codes by changing the value compared against field 9, for example 403.
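
As one possible tweak, the sketch below counts 403 and 404 responses per client IP rather than per path, which may help show who is doing the probing. It is only an illustration: the function name count_status_by_ip is made up for this example, and it assumes the common/combined log format, where field 1 is the client IP and field 9 is the status code. Adjust the field numbers if your log format differs.

```shell
# Hypothetical helper (not part of any tool): count hits per client IP and
# status code for a chosen set of statuses.
# Assumes common/combined log format: $1 = client IP, $9 = status code.
# Usage: count_status_by_ip access.log "403 404"
count_status_by_ip() {
    log_file="$1"
    statuses="$2"   # space-separated list of status codes, e.g. "403 404"
    awk -v statuses="$statuses" '
        # Build a lookup table of the wanted status codes.
        BEGIN { n = split(statuses, s, " "); for (i = 1; i <= n; i++) want[s[i]] = 1 }
        # Print "IP status" for every line whose status code is wanted.
        ($9 in want) { print $1, $9 }
    ' "$log_file" | sort | uniq -c | sort -n
}
```

Each output line has the form "count IP status", sorted so that the busiest offenders appear at the bottom. A single IP with a large number of 404s or 403s in a short log is usually worth a closer look.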