3.5. Working with text streams

Caution

As with the commands mv and cp from Section 3.3, “Manipulating the file system”, if you direct the I/O redirection operator > to a pre-existing file, that file will be silently replaced (overwritten). Thus you must be careful when selecting the name of the file to which > will direct its output.
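One possible safeguard, not covered elsewhere in this section, is the shell's noclobber option. The following is a minimal sketch; the file name notes.txt and the temporary directory are made up for illustration.

```shell
# Work in a throwaway directory so that no real files are touched.
workdir=$(mktemp -d)

echo "first version"  > "$workdir/notes.txt"
echo "second version" > "$workdir/notes.txt"   # silently replaces the first version
cat "$workdir/notes.txt"                       # only the second version remains

# With noclobber set, > refuses to overwrite an existing file and the
# shell prints an error message instead.
set -o noclobber
if ! echo "third version" > "$workdir/notes.txt"; then
    echo "notes.txt was left alone"
fi
set +o noclobber                               # restore the default behaviour
```

Whether to enable noclobber permanently (for example, in a shell startup file) is a matter of taste; some scripts rely on > replacing files.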

The Linux CLI depends on programs communicating with each other (and with the user) through text streams,[29] including the commands that you type at the keyboard (the standard input stream, or stdin) and the results or errors displayed on the terminal screen (the standard output and standard error streams, or stdout and stderr). Some ability to work with text streams is therefore well worth having.

The command grep and its variants (such as egrep) are particularly powerful, and correspondingly complex, because they are designed to search for textual patterns called regular expressions.[30] Consulting man grep and info grep is therefore highly recommended, possibly along with the sources mentioned in Section A.1, “Suggested resources for finding information”.

More information on text processing with the shell can be found in Section A.3.3, “Text processing with the shell”.

More information on I/O redirection can be found in Barr, CLI for Noobies, Chapter 5 (“Everything's a File”) and Garrels, Introduction to Linux, Chapter 5 (“I/O redirection”).

Example 3.8. Using cat with pipes

cat *.c | grep printf | less will search all files in the current directory whose name ends with .c for lines containing the string printf and then display the search results on the screen in the less[a] read-only text viewer.[31]
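The pipeline above can be tried safely with a couple of throwaway files. This is only a sketch: the files main.c and util.c are hypothetical, and less is left out so that the commands also work non-interactively.

```shell
# Create two small sample files in a temporary directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' '#include <stdio.h>' 'int main(void) { printf("hi"); return 0; }' > main.c
printf '%s\n' 'void helper(void) { }' > util.c

# cat concatenates both files; grep keeps only the lines containing printf.
cat *.c | grep printf

# Appending "| less" would show the same lines in the pager instead.
```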


Example 3.9. Using /dev/null

./some_verbose_program > /dev/null will run some_verbose_program but discard its standard output by redirecting stdout to /dev/null (the “bit bucket”) instead of the terminal screen. Any output that some_verbose_program sends to stderr is not affected by > and will still be displayed on the terminal screen.
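The difference between the two output streams can be seen with a small stand-in for some_verbose_program. The shell function below is hypothetical, and the 2> and 2>&1 forms are additions that do not appear in Table 3.6; they redirect stderr rather than stdout.

```shell
# Hypothetical stand-in: writes one line to each output stream.
some_verbose_program() {
    echo "progress report"        # goes to stdout
    echo "something failed" >&2   # goes to stderr
}

some_verbose_program > /dev/null        # stdout discarded; the stderr line is still shown
some_verbose_program 2> /dev/null       # stderr discarded; the stdout line is still shown
some_verbose_program > /dev/null 2>&1   # both discarded; nothing is shown
```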


Example 3.10. Using diff

diff file1 file2 > differences.txt will find all textual differences between file1 and file2 and store the results in a file named differences.txt instead of displaying them on the terminal screen. The file will be created if it does not exist; if it does exist, it will be silently replaced.
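As a rough illustration, the following sketch builds two nearly identical files (their contents are invented for this example) and stores the comparison in differences.txt.

```shell
workdir=$(mktemp -d)
printf '%s\n' alpha beta gamma > "$workdir/file1"
printf '%s\n' alpha BETA gamma > "$workdir/file2"

# diff exits with a non-zero status when the files differ, so "|| true"
# keeps a script that runs under "set -e" from stopping here.
diff "$workdir/file1" "$workdir/file2" > "$workdir/differences.txt" || true

# Nothing was shown on the screen so far; display the stored report.
cat "$workdir/differences.txt"
```

In the report, lines beginning with < come from file1 and lines beginning with > come from file2.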


Example 3.11. Using echo

echo $SHELL will display the value of the SHELL environment variable (that is, the full path of the default shell) on stdout (the terminal screen).
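A short sketch of the same idea follows. The variable greeting is hypothetical and exists only for this illustration; the double quotes are an addition that keeps a value containing spaces in one piece.

```shell
# Display the value of an existing environment variable.
echo "$SHELL"

# Define a variable and display its value in the same way.
greeting="hello from the shell"
echo "$greeting"
```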


Example 3.12. Using xargs

find dir -name "*.cpp" | xargs grep boost will search all files ending in .cpp in the directory dir and all of its subdirectories recursively, listing every line where the string boost is found in those files.

By contrast, find dir -name "*.cpp" | grep boost will search the path names printed by find (each beginning with dir, exactly as it was given on the command line, so they are absolute only if dir was given as an absolute path) for the string boost, listing every matching path.[32] It will not search the contents of the files, just their names (paths).
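The contrast between the two pipelines can be reproduced with a tiny directory tree. The layout below (dir/main.cpp and dir/boost_utils/helper.cpp) is invented for this sketch.

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/dir/boost_utils"
echo '#include <boost/shared_ptr.hpp>' > "$workdir/dir/main.cpp"
echo 'int helper() { return 0; }'      > "$workdir/dir/boost_utils/helper.cpp"
cd "$workdir" || exit 1

# Searches file contents: reports the #include line found in dir/main.cpp.
find dir -name "*.cpp" | xargs grep boost

# Searches the paths printed by find: reports dir/boost_utils/helper.cpp,
# whose path contains "boost" even though its contents do not.
find dir -name "*.cpp" | grep boost
```

Note that plain xargs splits its input on whitespace, so file names containing spaces need extra care; with GNU and BSD tools, find ... -print0 | xargs -0 ... is a common way around this.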


Table 3.5. Standard streams

Name    Description      Default Location
stdin   Standard input   Keyboard
stdout  Standard output  Terminal screen
stderr  Standard error   Terminal screen

Table 3.6. I/O redirection operators

Operator in Context   Action
command1 | command2   Direct the output from command1 to the input of command2
command1 < file1      Direct the contents of file1 to the input of command1
command2 > file2      Direct the output from command2 to be stored as file2
command3 >> file3     Direct the output from command3 to be appended to the contents of file3
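The four operators can be exercised together in a few lines. This is a sketch only; the file log.txt and its contents are assumptions made for the example.

```shell
workdir=$(mktemp -d)
log="$workdir/log.txt"

echo "one"    > "$log"    # > creates log.txt (or replaces it if it exists)
echo "two"   >> "$log"    # >> appends a second line
echo "three" >> "$log"    # ... and a third

grep "t" < "$log"         # < sends the contents of log.txt to grep's stdin
cat "$log" | grep -c "e"  # | sends cat's stdout to grep's stdin; -c counts matching lines
```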

Table 3.7. Commands for working with text streams

Command                    Action
cat file_list              Concatenates the contents of the given list of files and sends the results to stdout
diff file1 file2           Performs line-by-line comparison of file1 and file2, reporting any differences
echo some_text             Takes some_text from its command-line arguments and displays it on stdout
grep pattern [file_list]   Searches stdin by default [or the files in file_list instead] for lines matching pattern, then prints the matching lines to stdout
less file                  Allows read-only viewing of file[a]
xargs cmd                  Passes stdin as arguments to some command cmd, which is itself an argument to xargs

[a] less has a diverse set of keyboard shortcuts available that seems to be designed to cover a wide range of common conventions for keyboard shortcuts, including those for the man pages and the vi and Emacs text editors. The commands mentioned in Section 2.1, “Manual (man) pages” ought to be enough to get by, but you can get the full listing of commands by typing h while you are using less (and typing q still quits).




[29] Eric Raymond calls this the “Rule of Composition: Design programs to be connected with other programs” in The Art of Unix Programming (Boston: Addison-Wesley, 2003). Found online at http://www.catb.org/~esr/writings/taoup/html/ch01s06.html#id2877684.

[30] Wikipedia has an article on regular expressions; the references in Section A.3.3, “Text processing with the shell” may also be helpful.

[31] This type of search will not indicate which files the lines containing printf came from, though. One way to obtain that information is to use find (discussed in Section 3.2, “Navigating the file system”), by typing find . -name "*.c" | xargs grep printf at the prompt. Be aware that while the pattern *.c given to cat matches files in the current directory only, find searches for files in the current directory and all of its subdirectories (and their subdirectories and so on).

[32] If you're not sure what path means in this context, see the Wikipedia article on paths (in the computing sense).

