Reducing large data file with awk

I have a data file with 20000 lines, collected with an Arduino datalogger that I built. When I plotted all of these lines with gnuplot, I got a 400K EPS graph, which worked great but loaded slowly.

In order to get a lighter graph, here’s what I did:

Each line of the datafile looks like this:

1340622508 26.12 27.65 129.70 78.90 22.97

The first column is the Unix timestamp of the acquired data, followed by the measurements.

I simply wrote an AWK script that averages each column over groups of 100 lines, which reduces the 20000 lines to 200. You can change the group size by setting N to a value other than 100.

Here’s the script:

awk '{N=100; sum1+=$1; sum2+=$2; sum3+=$3; sum4+=$4; sum5+=$5; sum6+=$6} (NR%N)==0 {printf("%d %.2f %.2f %.2f %.2f %.2f\r\n", sum1/N, sum2/N, sum3/N, sum4/N, sum5/N, sum6/N); sum1=0; sum2=0; sum3=0; sum4=0; sum5=0; sum6=0;}' input.csv > output.dat
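If the number of columns changes, the same idea can be written with an array instead of one variable per column. This is only a sketch, not the script I actually ran: the four sample rows and the variable name `reduced` are made up for illustration, N is passed in with `-v` (set to 2 here so the result is easy to check by hand), and it writes plain `\n` line endings, assuming Unix-style output is fine.

```shell
# Hypothetical sample data: 4 rows, averaged in groups of N=2.
reduced=$(printf '%s\n' \
  '100 1.0 2.0' \
  '102 3.0 4.0' \
  '104 5.0 6.0' \
  '106 7.0 8.0' |
awk -v N=2 '
  # Accumulate every column of the current line; remember the column count.
  { for (i = 1; i <= NF; i++) sum[i] += $i; nf = NF }

  # Every N lines, print the averages and reset the sums.
  NR % N == 0 {
    printf "%d", sum[1] / N                     # averaged timestamp as an integer
    for (i = 2; i <= nf; i++) printf " %.2f", sum[i] / N
    printf "\n"
    for (i = 1; i <= nf; i++) sum[i] = 0
  }')
echo "$reduced"
```

With the sample rows above this prints `101 2.00 3.00` and `105 6.00 7.00`. Like the one-liner, it drops any leftover lines at the end of the file that do not fill a complete group of N.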

Thanks to the powers of awk 🙂
