Working with text files

Because they are simple and human-readable, text files are widely used in computing systems for many purposes. In this practice, we will use text files to store numerical data. One benefit of storing data in a text file is that many tools that come with the Linux system can process the data directly.

In the examples below, we will create two text files to store the final-exam scores of four students in the mathematics and language courses. We will then introduce a few useful Linux commands to browse and analyse the data.

Before we start, make sure the directory $HOME/tutorial/labs is already available; otherwise create it with

$ mkdir -p $HOME/tutorial/labs

and change the present working directory to it:

$ cd $HOME/tutorial/labs

Creating and editing a text file

There are many text editors in Linux. Here we use the editor called nano, which is relatively easy to learn. Let’s first create a text file called score_math.dat using the following command:

Note

In Linux, the suffix of the filename is irrelevant to the file type. Use the file command to examine the file type.

$ nano score_math.dat

You will enter an empty editing area provided by nano. Copy or type the following text into the area:

Thomas 81
Percy 65
Emily 75
James 55

Press Control+o followed by the Enter key to save the file. Press Control+x to quit the editing environment and return to the prompt.

Now repeat the steps above to create another file called score_lang.dat, and paste the data below into it.

Thomas 53
Percy 85
Emily 70
James 65
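As an alternative to typing the data in nano, the two files can also be written straight from the shell. The sketch below assumes a POSIX-style shell such as bash and uses a here-document; note that it overwrites any file of the same name in the present working directory.

```shell
# Write score_math.dat without opening an editor.
# The quoted 'EOF' delimiter stops the shell from expanding anything in the text.
cat > score_math.dat <<'EOF'
Thomas 81
Percy 65
Emily 75
James 55
EOF

# Same approach for the language scores.
cat > score_lang.dat <<'EOF'
Thomas 53
Percy 85
Emily 70
James 65
EOF
```

Either way of creating the files should give the same content; the here-document form is mainly convenient when the steps need to be repeated in a script.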

When you list the contents of the present working directory, you should see the two data files.

$ ls -l
total 0
-rw-r--r-- 1 honlee tg 40 Sep 30 15:06 score_lang.dat
-rw-r--r-- 1 honlee tg 37 Sep 30 15:06 score_math.dat
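To illustrate the earlier note that the filename suffix does not determine the file type, the sketch below saves two lines of text under a misleading suffix and asks the file command what it is. The filename demo_scores.png is a hypothetical name chosen only for this illustration, and the sketch assumes the file command is installed.

```shell
# Hypothetical demo file: plain text saved under a misleading .png suffix.
printf 'Thomas 81\nPercy 65\n' > demo_scores.png

# file inspects the content of the file, not its name.
file demo_scores.png
```

On most systems the reported type should mention text (for example "ASCII text") rather than an image format. The demo file can be removed afterwards with rm demo_scores.png.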

Browsing a text file

Several commands can be used to browse a text file. First of all, the command cat prints the entire content to the terminal. For example:

$ cat score_math.dat
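As a small extension, cat accepts more than one file, and wc -l reports how many lines each file has, which gives a rough idea of whether the content will fit on one screen. The sketch below assumes the two data files from this practice; the first two lines only recreate them if they are missing, so that the sketch runs on its own.

```shell
# Recreate the data files if they are missing, so the sketch runs on its own.
[ -f score_math.dat ] || printf 'Thomas 81\nPercy 65\nEmily 75\nJames 55\n' > score_math.dat
[ -f score_lang.dat ] || printf 'Thomas 53\nPercy 85\nEmily 70\nJames 65\n' > score_lang.dat

# cat prints the files one after another, in the order given.
cat score_math.dat score_lang.dat

# wc -l counts the lines in each file and prints a total.
wc -l score_math.dat score_lang.dat
```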

When the content is too long to fit in the terminal, one can use either the more or the less command to print it page by page. For example,

$ more score_math.dat
$ less score_math.dat

Tip

The less command provides more functionality than the more command, such as scrolling up and down and searching the text.

When only the top or the bottom of the content is of interest, one can use the commands head and tail. To print the first 2 lines, one does

$ head -n 2 score_math.dat

To print the last 2 lines, one does

$ tail -n 2 score_math.dat
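The two commands can also be combined with a pipe to pick out lines from the middle of a file. The following is a minimal sketch, assuming score_math.dat holds the four lines entered earlier; the first line only recreates the file if it is missing.

```shell
# Recreate the data file if it is missing, so the sketch runs on its own.
[ -f score_math.dat ] || printf 'Thomas 81\nPercy 65\nEmily 75\nJames 55\n' > score_math.dat

# Lines 2 to 3: keep the first 3 lines, then the last 2 of those.
head -n 3 score_math.dat | tail -n 2

# Everything from line 2 onward: a leading + makes tail count from the top.
tail -n +2 score_math.dat
```

With the data above, the first command prints the lines for Percy and Emily, and the second prints every line except the one for Thomas.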

Searching in a text file

To search for a string in a text file, one uses the command grep. For example, to search for the name Thomas in the file score_math.dat, we do

$ grep 'Thomas' score_math.dat

Tip

grep supports advanced pattern searching using regular expressions.
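A few regular-expression searches on the score data might look like the sketch below. It assumes score_math.dat holds the four lines entered earlier, with a single space between name and score, and uses the -E option to enable extended regular expressions; the first line only recreates the file if it is missing.

```shell
# Recreate the data file if it is missing, so the sketch runs on its own.
[ -f score_math.dat ] || printf 'Thomas 81\nPercy 65\nEmily 75\nJames 55\n' > score_math.dat

# Names starting with T or P: ^ anchors the match at the start of the line.
grep -E '^(T|P)' score_math.dat

# Scores in the 70s or 80s: $ anchors the match at the end of the line.
grep -E ' [78][0-9]$' score_math.dat

# Count the matching lines instead of printing them.
grep -c -E ' [78][0-9]$' score_math.dat
```

With the data above, the first search matches Thomas and Percy, the second matches Thomas and Emily, and the count is 2.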