Written by Steve Perry

Gathering data from log files

I learn more about the command line every day, lately mostly about working with large text files and cleaning data. Along with learning some Python, it is a big help when sifting through server logs.

For example, I have a very large Magento payment gateway log that I’m monitoring for errors. To grab the entries from a known time onwards, instead of sifting through the whole log, I run:

$ awk '/2018-06-28T14:56/,0' gene_braintree.log > gene_from_2018_06_28T14:56.txt
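This works because an awk range pattern of the form start,end prints from the first line matching start until a line satisfies end, and an end of 0 is never true, so the range runs to the end of the file. A minimal sketch on a made-up three-line log (the entries and the sample_log and from_time names are invented for illustration, not real gateway output):

```shell
# Build a tiny sample log (hypothetical entries, not real gateway output).
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
2018-06-28T14:55:10 DEBUG request sent
2018-06-28T14:56:02 DEBUG transaction failed
2018-06-28T14:57:45 INFO retry succeeded
EOF

# /start/,0 is a range whose end condition (0) is never true,
# so awk prints from the first matching line to the end of the file.
from_time=$(awk '/2018-06-28T14:56/,0' "$sample_log")
printf '%s\n' "$from_time"
```

One caveat: if no line contains that exact minute, the range never starts and the output is empty.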

The output file then holds just the entries I want to look through. Small shortcuts like this add up to a real time saver in daily use. I also run:

$ grep 'failed' gene_braintree.log | grep 'DEBUG' > gene_all_DEBUG_failed.txt

This returns every line, timestamp included, that contains both failed and DEBUG.
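The same filter can also be written as a single awk condition instead of two chained greps. A rough sketch, again on invented sample lines (sample_log, both_grep and both_awk are illustrative names); both forms are case-sensitive:

```shell
# Hypothetical sample lines, made up for this example.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
2018-06-28T14:55:10 DEBUG request sent
2018-06-28T14:56:02 DEBUG transaction failed
2018-06-28T14:56:30 ERROR transaction failed
2018-06-28T14:57:45 INFO retry succeeded
EOF

# Two greps chained: keep lines with "failed", then keep those with "DEBUG".
both_grep=$(grep 'failed' "$sample_log" | grep 'DEBUG')

# One awk call: print lines where both patterns match.
both_awk=$(awk '/failed/ && /DEBUG/' "$sample_log")

printf '%s\n' "$both_awk"
```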

As I’ll be running this a few times over the next few days, I’ve put both the awk and grep commands into a bash script that creates the output files for me. A quick $ sh script.sh now gives me the data I need from the input file:

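#!/bin/bash

# Everything from the first entry stamped 2018-06-28T14:56 to the end of the log
awk '/2018-06-28T14:56/,0' gene_braintree.log > gene_from_2018_06_28T14:56.txt
# Lines that contain both "failed" and "DEBUG"
grep 'failed' gene_braintree.log | grep 'DEBUG' > gene_all_DEBUG_failed.txt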
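
If the start time or log file changes between runs, the script could take them as arguments rather than hard-coding them. This is only a sketch under my own assumptions: the extract_logs function, its argument order, and the from_start.txt and DEBUG_failed.txt output names are hypothetical, and it swaps the regex range for a literal substring match with awk's index() so the timestamp needs no escaping.

```shell
#!/bin/bash
# Hypothetical usage: sh script.sh gene_braintree.log 2018-06-28T14:56 [out_dir]

extract_logs() {
    log_file=$1   # log to read
    start=$2      # timestamp text to start from
    out_dir=$3    # directory for the output files

    # Print from the first line containing $start to the end of the file.
    # index() is a literal substring match, so no regex escaping is needed.
    awk -v start="$start" 'index($0, start) { found = 1 } found' \
        "$log_file" > "$out_dir/from_start.txt"

    # Lines containing both "failed" and "DEBUG".
    grep 'failed' "$log_file" | grep 'DEBUG' > "$out_dir/DEBUG_failed.txt"
}

# Run only when a log file and a start time are supplied.
if [ "$#" -ge 2 ]; then
    extract_logs "$1" "$2" "${3:-.}"
fi
```

Run this way, the output directory defaults to the current one when the third argument is left out.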

