[CentOS] Text Processing script - advice?

Tue Dec 21 19:40:42 UTC 2010
Roland RoLaNd <r_o_l_a_n_d at hotmail.com>

Thanks to your help, I've reached this step.

Original data:

01,01368,2010-12-02,09:07:00,Pass
01,01368,2010-12-02,10:54:00,Pass
01,01368,2010-12-02,13:07:04,Pass
01,01368,2010-12-02,18:54:01,Pass
01,01368,2010-12-03,09:02:00,Pass
01,01368,2010-12-03,13:53:00,Pass
01,01368,2010-12-03,16:07:00,Pass




awk -F , '{if ($4 > "09:10:00") print $2 " was late on", $3 " by coming at ",$4}' test | tee DaysLate ; wc -l DaysLate

01368 was late on 2010-12-02 by coming at  10:54:00
01368 was late on 2010-12-02 by coming at  13:07:04
01368 was late on 2010-12-02 by coming at  18:54:01
01368 was late on 2010-12-03 by coming at  13:53:00
01368 was late on 2010-12-03 by coming at  16:07:00

       5 DaysLate


The only thing missing is a way to consider just the earliest time of each day.

In other words, the above output should be:

       0 DaysLate

On 12-02 he came in at 09:07, which is before 09:10, and on 12-03 he came in at 09:02, which is also before the set time.
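
One possible way to get there, as a minimal sketch rather than a definitive answer: remember the smallest time string seen for each id,date pair, and only compare that earliest time against the cutoff in the END block. The helper name late_days and the variable names are hypothetical, the cutoff is passed as an argument, and the sketch assumes the times are always zero-padded HH:MM:SS so that plain string comparison orders them correctly.

```shell
#!/bin/sh
# Sample rows copied from the original data in this post.
data='01,01368,2010-12-02,09:07:00,Pass
01,01368,2010-12-02,10:54:00,Pass
01,01368,2010-12-02,13:07:04,Pass
01,01368,2010-12-02,18:54:01,Pass
01,01368,2010-12-03,09:02:00,Pass
01,01368,2010-12-03,13:53:00,Pass
01,01368,2010-12-03,16:07:00,Pass'

# late_days CUTOFF (hypothetical helper): read CSV rows on stdin and print
# one line per id,date pair whose earliest time is later than CUTOFF.
late_days() {
    awk -F, -v cutoff="$1" '
    {
        key = $2 "," $3
        # Keep only the smallest HH:MM:SS string seen for this id,date.
        if (!(key in first) || $4 < first[key]) first[key] = $4
    }
    END {
        for (key in first) {
            if (first[key] > cutoff) {
                split(key, part, ",")
                print part[1] " was late on " part[2] " by coming at " first[key]
            }
        }
    }' | sort    # "for (key in first)" has no defined order, so sort the report
}

# Count the late days for the 09:10:00 cutoff used above.
count=$(printf '%s\n' "$data" | late_days "09:10:00" | awk 'END { print NR }')
echo "$count DaysLate"
```

With the sample rows this prints "0 DaysLate", because the earliest arrivals (09:07:00 and 09:02:00) are both before 09:10:00. To run it against a file instead, something like late_days "09:10:00" < test | tee DaysLate should behave the same way.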




----------------------------------------
> Date: Tue, 21 Dec 2010 14:35:13 -0500
> From: m.roth at 5-cent.us
> To: centos at centos.org
> Subject: Re: [CentOS] Text Processing script - advice?
>
> John Lundin wrote:
> > On Tue, Dec 21, 2010 at 08:30:43PM +0200, Roland RoLaNd wrote:
> >
> > (chuckle) That's a bit more verbose than necessary. As a one-liner:
> >
> > awk -F, '($4>"09:00:00"){c[$2 "," $3]++};END{for (i in c){print i "," c[i]}}' $filename
> >
>
> Well, yes, but he also wanted a count....
>
> mark
>
> > 01368,2010-12-02,4
> > 01368,2010-12-03,3
> >
> > (You might check whether you want >="09:00:00" instead, to include the edge case.)
> >
> > -F,                      # set separator to comma
> >
> > # (automatic loop over all data lines)
> > ($4>"09:00:00"){         # do if fourth field greater than 09:...
> >   c[$2 "," $3]++         # increment hash element pointed to by
> >                          # second and third fields separated by comma
> >                          # (that is, hash on id,date)
> > }
> >
> > END{                     # after finishing the data
> >   for (i in c){          # for each observed hash key in array c
> >     print i "," c[i]     # print the hash key, comma, count
> >   }
> > }
> >
> > --
> > lundin at fini.net
> > _______________________________________________
> > CentOS mailing list
> > CentOS at centos.org
> > http://lists.centos.org/mailman/listinfo/centos
> >
>
>