Collecting bulk information from files

1

I want to collect ctime, atime, mtime, and crtime from a large number of files.

As a partial solution, I put together the following command:

sudo debugfs -R 'stat <1055890>' /dev/sda1|awk -F': ' -v c='' -v a="" -v m="" 'BEGIN {} $1==" ctime" {c=$2} $1==" atime" {a=$2} $1==" mtime" {m=$2} $1=="crtime" {print c, a, m, $2}'
  

debugfs 1.42.13 (17-May-2015)
0x5ade9510:c7eb0e9c -- Mon Apr 23 23:23:12 2018 0x5b05601f:111ab67c -- Wed May 23 09:35:43 2018 0x5ade9510:c7eb0e9c -- Mon Apr 23 23:23:12 2018 0x5ade9510:c7eb0e9c -- Mon Apr 23 23:23:12 2018

I want to capture the information as follows:

Mon Apr 23 23:23:12 2018, Wed May 23 09:35:43 2018, Mon Apr 23 23:23:12 2018, Mon Apr 23 23:23:12 2018

These will be, respectively, ctime, atime, mtime and crtime in a future CSV file.

How can I handle the variables so that only the part after "--" is kept?

    
asked by anonymous 13.06.2018 / 16:31

3 answers

1

You can use an awk script to solve your problem (script.awk):

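BEGIN {
    OFS = ",";
    FS = " -- ";

    # CSV header
    print "ctime,atime,mtime,crtime"
}
{
    # Fields 2 to 5 each start with a date; all but the last are
    # followed by the " 0x..." value of the next timestamp.
    for( i = 0; i < 4; i++ ){
        split( $(i+2), a, " 0x" );
        b[i] = a[1];
    }

    print b[0], b[1], b[2], b[3];
}
END {
}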

Usage:

debugfs -R 'stat <1055890>' /dev/sda1 | awk -f script.awk > saida.csv

All in one line:

debugfs -R 'stat <1055890>' /dev/sda1 | awk 'BEGIN{OFS=",";FS=" -- ";print "ctime,atime,mtime,crtime"}{for(i=0;i<4;i++){split($(i+2),a," 0x");b[i]=a[1]};print b[0],b[1],b[2],b[3];}' > saida.csv

Output (saida.csv):

ctime,atime,mtime,crtime
Mon Apr 23 23:23:12 2018,Wed May 23 09:35:43 2018,Mon Apr 23 23:23:12 2018,Mon Apr 23 23:23:12 2018
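
The script above expects the four timestamps on one line, as printed by the command in the question. As a minimal sketch, the same CSV row could also be built in a single awk pass over the raw stat output. This assumes each timestamp line has the form " ctime: 0x5ade9510:c7eb0e9c -- Mon Apr 23 23:23:12 2018" (the layout the question's filter matches on); the function name stat_to_csv is hypothetical.

```shell
#!/bin/sh
# stat_to_csv (hypothetical name): read raw `debugfs -R 'stat <inode>'` output
# on stdin and print one "ctime,atime,mtime,crtime" CSV row.
# Assumption: timestamp lines look like
#   " ctime: 0x5ade9510:c7eb0e9c -- Mon Apr 23 23:23:12 2018"
stat_to_csv() {
    awk -F' -- ' '
        # Reduce the text before " -- " to its label (e.g. "ctime").
        { label = $1; sub(/^ +/, "", label); sub(/:.*/, "", label) }
        label == "ctime"  { c = $2 }
        label == "atime"  { a = $2 }
        label == "mtime"  { m = $2 }
        # As in the question, crtime is expected after the other three,
        # so it triggers the output row.
        label == "crtime" { print c "," a "," m "," $2 }
    '
}

# Usage (needs root and a real device):
#   sudo debugfs -R 'stat <1055890>' /dev/sda1 | stat_to_csv
```

Lines that carry no timestamp label are simply ignored, so no intermediate filter is needed.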
    
14.06.2018 / 18:57
1

Summarizing my answer, the complete command would be:

sudo debugfs -R 'stat <1055890>' /dev/sda1 | 
awk -F'--' '{gsub(/0x[a-z0-9:]+/, ""); print $2, $3, $4, $5}' | 
sed -r 's/^ //;s/  +/, /g'

Complete answer:

With awk it is not strictly necessary to assign your own variables, since the positional fields ($2, $3, ...) already let you select exactly what you want to output.

So, after setting the field separator to the string "--" with the -F flag, you only need to print the desired fields. In the example below, where [stat...] stands for the command from the question, I also ran a gsub to remove the hexadecimal values:

[stat...] | awk -F'--' '{gsub(/0x[a-z0-9:]+/, ""); print $2, $3, $4, $5}'

> Mon Apr 23 23:23:12 2018    Wed May 23 09:35:43 2018    Mon Apr 23 23:23:12 2018    Mon Apr 23 23:23:12 2018

Finally, to eliminate the unwanted spaces and add commas between the fields, a quick sed:

sed -r 's/^ //;s/  +/, /g'

The complete command, according to the question, would be:

sudo debugfs -R 'stat <1055890>' /dev/sda1 | 
awk -F'--' '{gsub(/0x[a-z0-9:]+/, ""); print $2, $3, $4, $5}' | 
sed -r 's/^ //;s/  +/, /g'
  

Mon Apr 23 23:23:12 2018, Wed May 23 09:35:43 2018, Mon Apr 23 23:23:12 2018, Mon Apr 23 23:23:12 2018

For performance purposes, if you want to run stat on many files, I suggest saving the output of all the stat commands to a single file and then running the awk | sed pipeline only once, passing the stats file as input:

awk -F'--' '{gsub(/0x[a-z0-9:]+/, ""); print $2, $3, $4, $5}' arquivo-de-stats | 
sed -r 's/^ //;s/  +/, /g'
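
As a rough sketch of that batch step, the awk and sed stages could be folded into one awk process. The function below (hypothetical name lines_to_csv) assumes the stats file holds one line per inode in the "0x...:... -- date 0x...:... -- date" form shown in the question.

```shell
#!/bin/sh
# lines_to_csv (hypothetical name): turn a file with one line per inode,
# in the "0x...:... -- date 0x...:... -- date ..." form, into
# comma-separated rows using a single awk process.
lines_to_csv() {
    awk -F' *-- *' '
        {
            # Drop the hexadecimal values; changing $0 re-splits the fields.
            gsub(/0x[a-f0-9:]+/, "")
            # The separator regex also swallows the blanks around "--",
            # so fields 2 to 5 are the bare dates.
            print $2 ", " $3 ", " $4 ", " $5
        }
    ' "$1"
}

# Usage, assuming arquivo-de-stats already holds one such line per inode:
#   lines_to_csv arquivo-de-stats
```

Splitting on the "--" marker rather than on runs of spaces also keeps a date intact if its day is space-padded (for example "Apr  2").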
    
14.06.2018 / 19:38
0

Using awk, you can define the "-- " string as the delimiter.

A simple script using your example command would look like this:

#!/bin/bash

var=$(sudo debugfs -R 'stat <1055890>' /dev/sda1|awk -F': ' -v c='' -v a="" -v m="" 'BEGIN {} $1==" ctime" {c=$2} $1==" atime" {a=$2} $1==" mtime" {m=$2} $1=="crtime" {print c, a, m, $2}')

ctime=$(echo $var | awk -F"-- " '{print $2}'| rev | cut -d" " -f3-7 | rev)
atime=$(echo $var | awk -F"-- " '{print $3}'| rev | cut -d" " -f3-7 | rev)
mtime=$(echo $var | awk -F"-- " '{print $4}'| rev | cut -d" " -f3-7 | rev)
crtime=$(echo $var | awk -F"-- " '{print $5}')
echo "$ctime", "$atime", "$mtime", "$crtime"
  

Mon Apr 23 23:23:12 2018, Mon Apr 23 23:23:12 2018, Mon Apr 23 23:23:12 2018, Mon Apr 23 23:23:12 2018

Explanation:

In awk, the -F flag sets the delimiter and print selects the column to display. Because each of the first three columns still ends with unwanted characters (the hexadecimal value that precedes the next date), cut is used together with rev to remove that last column and keep only the date. Note: for crtime there is no such last column, so the cut step can be dispensed with.
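
To make the rev/cut step concrete, here is a small sketch applied to one column as it comes out of awk -F"-- ". The sample value is taken from the question's output, and the function name drop_last_column is hypothetical. Note the trailing blank in the column: after rev it becomes an empty first field, which is why the date occupies fields 3 to 7.

```shell
#!/bin/sh
# One column produced by splitting the question's output on "-- ":
# the date, then the hexadecimal value of the next timestamp, then a blank.
field='Mon Apr 23 23:23:12 2018 0x5b05601f:111ab67c '

# drop_last_column (hypothetical name): reverse the text so the unwanted
# trailing column comes first, keep fields 3-7 (the reversed date),
# then reverse back.
drop_last_column() {
    printf '%s\n' "$1" | rev | cut -d' ' -f3-7 | rev
}

drop_last_column "$field"   # prints: Mon Apr 23 23:23:12 2018
```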

If you run the command many times, I recommend encapsulating the solution in a function such as the one below:

function parse_date {
  ctime=$(echo $1 | awk -F"-- " '{print $2}'| rev | cut -d" " -f3-7 | rev)
  atime=$(echo $1 | awk -F"-- " '{print $3}'| rev | cut -d" " -f3-7 | rev)
  mtime=$(echo $1 | awk -F"-- " '{print $4}'| rev | cut -d" " -f3-7 | rev)
  crtime=$(echo $1 | awk -F"-- " '{print $5}')

  echo "$ctime", "$atime", "$mtime", "$crtime"
}

Then you would call the function passing the result of your command as an argument:

parse_date "$var"
  

Mon Apr 23 23:23:12 2018, Mon Apr 23 23:23:12 2018, Mon Apr 23 23:23:12 2018, Mon Apr 23 23:23:12 2018

    
14.06.2018 / 16:47