Written by Steve Perry
Gathering data from log files
I learn more about the command line every day, particularly when it comes to working with large text files and cleaning data. This, along with learning some Python, is a huge help when sifting through server logs and the like.
For example, I have a very large Magento payment gateway log that I'm monitoring for errors. To grab only the entries from a certain known time onwards, instead of sifting through the whole log, I run:
$ awk '/2018-06-28T14:56/,0' gene_braintree.log > gene_from_2018_06_28T14:56.txt
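The range pattern here starts printing at the first line matching the timestamp and, because the end condition 0 is never true, keeps going to the end of the file. As a minimal sketch of making this reusable (the variable name ts and the output filename tail_of_log.txt are my own), the timestamp can be passed in as an awk variable instead of being baked into the pattern:

$ awk -v ts='2018-06-28T14:56' '$0 ~ ts, 0' gene_braintree.log > tail_of_log.txt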
I’m then left with just the data I want to look through. Little things like this are such time savers once they add up to daily use. I’m also doing:
$ grep 'failed' gene_braintree.log | grep 'DEBUG' > gene_all_DEBUG_failed.txt
This gets me every line that contains both failed and DEBUG; usefully, each of those lines also includes a timestamp.
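Chaining two greps like this acts as a logical AND. The same filter could also be done in a single pass with awk; here is a sketch, reusing the output filename from above:

$ awk '/failed/ && /DEBUG/' gene_braintree.log > gene_all_DEBUG_failed.txt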
As I’ll be running this a few times over the next few days, I’ve created a bash script that runs both the awk and grep commands for me and creates the output files. So a quick $ sh script.sh now gives me the data I need from an input file:
#!/bin/bash
awk '/2018-06-28T14:56/,0' gene_braintree.log > gene_from_2018_06_28T14:56.txt
grep 'failed' gene_braintree.log | grep 'DEBUG' > gene_all_DEBUG_failed.txt
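If the log file or the timestamp changes later, the script could take them as arguments rather than hard-coding them. A minimal sketch, assuming defaults matching the values above (the variable names, argument handling, and output filenames are my own):

#!/bin/bash
# Usage: sh script.sh [logfile] [timestamp]
log="${1:-gene_braintree.log}"
ts="${2:-2018-06-28T14:56}"
# Everything from the first line matching the timestamp to the end of the file.
awk -v ts="$ts" '$0 ~ ts, 0' "$log" > "gene_from_${ts}.txt"
# Every line containing both failed and DEBUG.
grep 'failed' "$log" | grep 'DEBUG' > gene_all_DEBUG_failed.txt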