3.5. Working with text streams

Regular Expressions

Caution

As with the commands mv and cp from Section 3.3, “Manipulating the file system”, if you point the I/O redirection operator > at a pre-existing file, that file will be silently replaced (overwritten). Be careful, therefore, when choosing the name of the file to which > will direct its output.
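The overwriting behavior described in this caution can be seen in a scratch directory. The sketch below is only an illustration: the file name notes.txt is made up, and the noclobber shell option shown at the end is not covered elsewhere in this section; it is mentioned here as one possible safeguard.

```shell
# Work in a scratch directory so no real files are at risk.
workdir=$(mktemp -d)
cd "$workdir"

echo 'first' > notes.txt     # creates notes.txt
echo 'second' > notes.txt    # silently replaces its contents
cat notes.txt                # prints: second

# With the noclobber option set, > refuses to overwrite an existing file.
set -o noclobber
echo 'third' > notes.txt || echo 'notes.txt was left untouched'
set +o noclobber             # restore the default behavior

cat notes.txt                # still prints: second
```

While noclobber is set, the shell reports an error for the third redirection instead of replacing the file.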

The Linux CLI depends on programs communicating with each other (and with the user) through text streams.[31] These include the commands that you type at the keyboard (the standard input stream, or stdin) and the results or errors displayed on the terminal screen (the standard output and standard error streams, or stdout and stderr). Some ability to work with text streams is therefore useful.

The command grep and its variants (such as egrep, which is equivalent to grep -E) are particularly powerful and complex commands, since they are designed to search for textual patterns called regular expressions.[32] Consulting man grep and info grep is therefore highly recommended, possibly along with the sources mentioned in Section A.1, “Suggested resources for finding information”.
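As a small illustration of searching for regular expressions with grep, the sketch below builds a throwaway file and runs two searches against it. The file name fruits.txt and its contents are invented for this example; the -E option selects extended regular expressions.

```shell
# Work in a scratch directory so nothing existing is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical sample data, one entry per line.
printf '%s\n' 'apple 12' 'banana' 'cherry 7' 'apricot' > fruits.txt

# Basic regular expression: lines that begin with "ap".
grep '^ap' fruits.txt
# prints:
#   apple 12
#   apricot

# Extended regular expression (-E): lines that end in one or more digits.
grep -E '[0-9]+$' fruits.txt
# prints:
#   apple 12
#   cherry 7
```

Quoting the pattern in single quotes keeps the shell from interpreting characters such as $ and * before grep sees them.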

More information on text processing with the shell can be found in Section A.3.3, “Text processing with the shell”.

More information on I/O redirection can be found in Barr, CLI for Noobies, Chapter 5 ("Everything's a File") and Garrels, Introduction to Linux, Chapter 5 ("I/O redirection").

Example 3.8. Using cat with pipes

cat *.c | grep printf | less will search all files in the current directory whose name ends with .c for lines containing the string printf and then display the search results on the screen in the less[a] read-only text viewer.[33]
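To try this pipeline without any real source code at hand, the following sketch creates two small, hypothetical C files in a scratch directory first. The file names main.c and util.c are assumptions made for the example, and less is left out so that the commands also work non-interactively.

```shell
# Work in a scratch directory so nothing existing is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Two hypothetical C source files, written with "here documents".
cat > main.c <<'EOF'
#include <stdio.h>
int main(void) {
    printf("hello\n");
    return 0;
}
EOF
cat > util.c <<'EOF'
#include <stdio.h>
void warn(void) {
    printf("warning\n");
}
EOF

# The pipeline from the example, without the pager at the end.
cat *.c | grep printf

# Count the matching lines instead of listing them.
cat *.c | grep -c printf     # prints: 2
```

Appending | less to the first pipeline gives the command shown in the example.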


Example 3.9. Using /dev/null

./some_verbose_program > /dev/null will run some_verbose_program but discard its output by redirecting it to /dev/null (the "bit bucket") instead of stdout (the terminal screen). If some_verbose_program has output that is directed to stderr, that output will still be displayed on the terminal screen.
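The difference between stdout and stderr can be checked with a stand-in for some_verbose_program. In the sketch below, the shell function noisy is hypothetical and exists only for illustration; the 2> and 2>&1 forms, which redirect stderr, go beyond the operators listed in this section.

```shell
# Hypothetical stand-in for some_verbose_program:
# one line goes to stdout, one line goes to stderr.
noisy() {
    echo "progress message"
    echo "error message" >&2
}

# Discard stdout only; the stderr line still reaches the terminal.
noisy > /dev/null

# Discard stderr only; the stdout line still reaches the terminal.
noisy 2> /dev/null

# Discard both streams by sending stderr to wherever stdout now points.
noisy > /dev/null 2>&1
```

The first call displays only "error message", the second only "progress message", and the third displays nothing.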


Example 3.10. Using diff

diff file1 file2 > differences.txt will find all textual differences between file1 and file2 and store the results in a file named differences.txt, without sending any of the results to stdout (the screen). The file will be created if it does not exist; if it does exist, it will be silently replaced.
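A concrete run may make the stored output easier to picture. The sketch below assumes two three-line files whose contents are invented for the example and which differ only in their second line.

```shell
# Work in a scratch directory so nothing existing is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical input files that differ only on line 2.
printf '%s\n' 'alpha' 'beta' 'gamma' > file1
printf '%s\n' 'alpha' 'BETA' 'gamma' > file2

# diff exits with status 1 when the files differ; "|| true" keeps
# that from being treated as a failure in a script.
diff file1 file2 > differences.txt || true

# Nothing was shown on the screen; the report is in the file.
cat differences.txt
# prints:
#   2c2
#   < beta
#   ---
#   > BETA
```

In this default output format, 2c2 means that line 2 of file1 was changed into line 2 of file2; lines marked < come from file1 and lines marked > come from file2.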


Example 3.11. Using echo

echo $SHELL will display the value of the SHELL environment variable (that is, the full path of the default shell) on stdout (the terminal screen).
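The way echo treats its arguments depends on how they are quoted. The following sketch uses a made-up variable named greeting, rather than SHELL, so that the results are the same on any system.

```shell
# Unquoted arguments are printed separated by single spaces.
echo Hello,   world          # prints: Hello, world

# Hypothetical variable for this example.
greeting="Hello,   world"

# Double quotes expand the variable and preserve its spacing.
echo "$greeting"             # prints: Hello,   world

# Single quotes suppress expansion, so the name is printed literally.
echo '$greeting'             # prints: $greeting
```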


Table 3.5. Standard streams

Name     Description       Default Location
stdin    Standard input    Keyboard
stdout   Standard output   Terminal screen
stderr   Standard error    Terminal screen

Table 3.6. I/O redirection operators

Operator in Context    Action
command1 | command2    Direct the output from command1 to the input of command2
command1 < file1       Direct the contents of file1 to the input of command1
command2 > file2       Direct the output from command2 to be stored as file2
command3 >> file3      Direct the output from command3 to be appended to the contents of file3
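The four operators in Table 3.6 can be tried together in a scratch directory. The sketch below is only an illustration: the file name fruits.txt is invented, and sort and head are used here simply as convenient commands that read from stdin.

```shell
# Work in a scratch directory so nothing existing is touched.
workdir=$(mktemp -d)
cd "$workdir"

# ">" stores the command's output in a file (creating or replacing it).
printf '%s\n' 'pear' 'apple' 'fig' > fruits.txt

# ">>" appends to the file instead of replacing it.
echo 'cherry' >> fruits.txt

# "<" feeds the contents of the file to the command's stdin.
sort < fruits.txt
# prints:
#   apple
#   cherry
#   fig
#   pear

# "|" sends one command's stdout to the next command's stdin.
sort < fruits.txt | head -n 2
# prints:
#   apple
#   cherry
```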

Table 3.7. Commands for working with text streams

Command                    Action
cat file_list              Concatenates the contents of the given list of files and sends the results to stdout
diff file1 file2           Performs a line-by-line comparison of file1 and file2, reporting any differences
echo some_text             Displays some_text (given as command-line arguments, not read from stdin) on stdout
grep pattern [file_list]   Searches stdin by default (or the files in file_list, if given) for lines matching pattern, then prints the matching lines to stdout
less file                  Allows read-only viewing of file[a]

[a] less has a diverse set of keyboard shortcuts available that seems to be designed to cover a wide range of common conventions for keyboard shortcuts, including those for the man pages and the vi and Emacs text editors. The commands mentioned in Section 2.1, “Manual ("man") pages” ought to be enough to get by, but you can get the full listing of commands by typing h while you are using less (and typing q still quits).




[31] Eric Raymond calls this the "Rule of Composition: Design programs to be connected with other programs" in The Art of Unix Programming (Boston: Addison-Wesley, 2003). Found online at http://www.catb.org/~esr/writings/taoup/html/ch01s06.html#id2877684.

[32] Wikipedia's article on regular expressions may be helpful.

[33] This type of search will not indicate which files the lines containing printf came from, though. One way to obtain that information is to use find (discussed in Section 3.2, “Navigating the file system”), by typing find . -name "*.c" | xargs grep printf at the prompt. Be aware that while the *.c in the cat command matches files in the current directory only (the shell expands the pattern before cat runs), find searches for files in the current directory and all of its subdirectories (and their subdirectories and so on).

