Sunday, 16 February 2014

BASH text file processing

In this post, we show some common Bash commands for processing text files.

cut: cut extracts selected character positions or fields from each line of a text file.
The common options are:
  • -c <list>:      the character positions to output.
  • -d <delimiter>: the field delimiter used with -f; the default is tab.
  • -f <fields>:    the fields to output.


For example,
The command to print the 1st and 7th fields of the /etc/passwd file using : as the delimiter:
cut -d : -f 1,7 /etc/passwd

The command to print the 1st to 10th characters of each line of /etc/passwd:
cut -c 1-10 /etc/passwd
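
To make the result of the two extractions described above concrete, here is a small sketch that runs them on a made-up file in /etc/passwd format instead of the real one. The two sample entries and the variable names are assumptions chosen for illustration.

```shell
# Hypothetical sample in /etc/passwd format, so the output is predictable.
sample=$(mktemp)
printf '%s\n' \
  'root:x:0:0:root:/root:/bin/bash' \
  'daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin' > "$sample"

# Fields 1 and 7 (login name and shell), with ":" as the delimiter.
fields=$(cut -d : -f 1,7 "$sample")
echo "$fields"
# root:/bin/bash
# daemon:/usr/sbin/nologin

# Characters 1 through 10 of every line.
chars=$(cut -c 1-10 "$sample")
echo "$chars"
# root:x:0:0
# daemon:x:1

rm -f "$sample"
```

Note that cut keeps the delimiter between the selected fields in its output.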


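sort: print the lines of a file, sorted by the chosen field
Some important parameters:
  • -b: ignore leading blanks
  • -d: dictionary order (only blanks and alphanumeric characters are compared)
  • -g: general numeric sort (floating point, including scientific notation)
  • -f: ignore case
  • -k: define the sort key (field)
  • -n: numeric sort
  • -o: write the output to the given file
  • -t: field delimiter
  • -u: output only the first of each group of equal lines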


Example: sort a file by its second field as floating-point numbers:
sort -k 2,2 -g <file>
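
A minimal sketch of a float sort on the second field, assuming a sort that supports -g (GNU and BSD sort do; it is not required by POSIX). The data file and its contents are made up for illustration, and LC_ALL=C is set only so that "." is treated as the decimal point regardless of locale.

```shell
# Hypothetical measurements: a name and a floating-point value per line.
data=$(mktemp)
printf '%s\n' 'beta 1e3' 'alpha 2.5' 'gamma 10.25' 'delta -0.5' > "$data"

# -k 2,2 restricts the key to the second field; -g compares it as a float,
# so scientific notation such as 1e3 is understood.
by_float=$(LC_ALL=C sort -k 2,2 -g "$data")
echo "$by_float"
# delta -0.5
# alpha 2.5
# gamma 10.25
# beta 1e3

rm -f "$data"
```

With -n instead of -g, 1e3 would be read as 1, because -n does not parse exponents.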


Sort the /etc/passwd file by UID in descending order:
sort -t : -k 3,3 -n -r /etc/passwd
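
The effect of the numeric, reversed sort on the UID field can be checked on a small made-up file. The four entries below are assumptions for illustration; the pipe into cut is added only to keep the output short.

```shell
# Hypothetical sample in /etc/passwd format.
sample=$(mktemp)
printf '%s\n' \
  'alice:x:1000:1000::/home/alice:/bin/bash' \
  'root:x:0:0:root:/root:/bin/bash' \
  'nobody:x:65534:65534:nobody:/nonexistent:/usr/sbin/nologin' \
  'bin:x:2:2:bin:/bin:/usr/sbin/nologin' > "$sample"

# Sort on field 3 only (3,3), numerically (-n), largest first (-r),
# then keep just the login name and the UID.
by_uid=$(sort -t : -k 3,3 -n -r "$sample" | cut -d : -f 1,3)
echo "$by_uid"
# nobody:65534
# alice:1000
# bin:2
# root:0

rm -f "$sample"
```

Without -n the UIDs would be compared as text, which would place 2 above 1000.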


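uniq: remove adjacent duplicate lines (sort the input first so that all duplicates are adjacent)
  • -c: prefix each line with the number of times it occurs
  • -i: ignore case
  • -u: only show the lines that are not repeated
  • -d: only show the duplicated lines, once each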
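
The options above can be sketched on a small made-up word list. Because uniq only compares neighbouring lines, the input is passed through sort first. The file contents and variable names are assumptions for illustration.

```shell
# Hypothetical word list with repeated entries.
words=$(mktemp)
printf '%s\n' apple banana apple cherry banana apple > "$words"

# Count how often each line occurs: 3 apple, 2 banana, 1 cherry
# (the width of the padding before each count varies between implementations).
counts=$(sort "$words" | uniq -c)
echo "$counts"

# Lines that occur more than once: apple and banana.
dups=$(sort "$words" | uniq -d)
echo "$dups"

# Lines that occur exactly once: cherry.
singles=$(sort "$words" | uniq -u)
echo "$singles"

rm -f "$words"
```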



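wc command:
show the line count, word count and byte count of each file, followed by the file name (use -m to count characters instead of bytes)
words are sequences of characters separated by whitespace (spaces, tabs or newlines)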
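
A short sketch of the individual counters, using the standard -l, -w and -c options on a made-up two-line file. Reading from standard input (with <) keeps the file name out of the output; the file contents and variable names are assumptions for illustration.

```shell
# Hypothetical two-line text file.
text=$(mktemp)
printf 'one two three\nfour five\n' > "$text"

lines=$(wc -l < "$text")   # number of newline characters: 2
words=$(wc -w < "$text")   # whitespace-separated words: 5
bytes=$(wc -c < "$text")   # bytes, including the two newlines: 24

echo "lines=$lines words=$words bytes=$bytes"

rm -f "$text"
```

Some implementations pad the numbers with leading spaces, so the exact spacing of the echoed line can differ between systems.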


head and tail commands show the first and last lines of a file (10 by default)
head -n number <file>: the first number lines
head -n -number <file>: all lines except the last number lines (GNU head)
tail -n number <file>: the last number lines
tail -n +number <file>: all lines starting from line number
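
The four forms can be compared side by side on a made-up file holding the numbers 1 to 10, one per line. This is a sketch that assumes GNU head, since the negative count for head -n is a GNU extension; the file and variable names are hypothetical.

```shell
# Hypothetical file with the numbers 1 to 10, one per line.
nums=$(mktemp)
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 > "$nums"

first3=$(head -n 3 "$nums")          # lines 1 to 3
all_but_last3=$(head -n -3 "$nums")  # lines 1 to 7 (GNU head only)
last3=$(tail -n 3 "$nums")           # lines 8 to 10
from4=$(tail -n +4 "$nums")          # lines 4 to 10

# Unquoted expansion joins the lines with spaces for compact display.
echo "first3:        "$first3
echo "all_but_last3: "$all_but_last3
echo "last3:         "$last3
echo "from4:         "$from4

rm -f "$nums"
```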
