Linux Lexicon — Input And Output With Pipes And Redirection In Linux


Short Bytes: Command output and input can be controlled in extremely flexible ways using pipes and redirection. This is much more than smoke and mirrors; it’s the magic of Linux and UNIX-like shells.

When working with the command-line interface it’s common to want to use the results of one command in another, but it’s extremely tedious to retype the output of the previous command into the next, and sometimes impossible if the output is larger than the terminal can show. The output could be saved to a file using a feature of the program you’re using, but then you’re stuck dealing with files for every other command you run. This is where pipes and redirection magically move your command output from one place to another. Pipes were championed by Douglas McIlroy at Bell Labs and added to the original UNIX operating system during its development.

First, we’ll cover pipes, as they’re the most often used part of the redirection toolset. Pipes work just like they do in Super Mario: what goes in one end comes out the other. To demonstrate this we’ll use the standard commands echo and wc. The echo command simply prints the arguments it is given, and the wc command counts lines, words, and characters.

Recommended — Complete Linux Lexicon Series

$ echo Tux
Tux

We can see that the name Tux is printed to the terminal just as expected. Now, let’s pipe Tux around and see what information we can get from him.

$ echo Tux | wc
1        1        4

Here, wc tells us that it received one line of input, one word, and four characters. Tux only has three characters, but wc also counts the line-feed character ( \n ) that echo adds to the end. It doesn’t print a visible character, it just forces a new line. Let’s move from that contrived example to one that is equally contrived, but much more realistic and commonplace in IT. Below is an excerpt of an SSH log:
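To make the extra character visible, here is a minimal sketch that compares echo with printf, which prints only what it is given and adds no line feed of its own. The variable names are just for illustration.

```shell
# printf prints exactly its argument, so wc -c sees only the three letters.
without_newline=$(printf 'Tux' | wc -c)

# echo appends a line feed, which wc -c counts as one more character.
with_newline=$(echo Tux | wc -c)

echo "printf: $without_newline, echo: $with_newline"
```

The first count is 3 and the second is 4; the difference is the line feed.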

Sept 3 05:31:12 server sshd[19400]: Connection from port 32889
Sept 5 20:43:14 server sshd[19878]: Failed password for Tux from port 35245 ssh2
Sept 5 21:31:14 server sshd[20785]: Received disconnect from 11: Bye Bye
Sept 6 01:04:56 server sshd[22658]: Connection from port 45196
Sept 7 22:04:58 server sshd[25928]: Failed password for Tux from port 59843 ssh2
Sept 8 01:04:58 server sshd[30894]: Received disconnect from 11: Bye Bye
Sept 8 22:04:59 server sshd[32896]: Connection from port 45528
Sept 9 22:05:01 server sshd[32890]: Failed password for Tux from port 32498 ssh2
Sept 10 22:05:01 server sshd[34798]: Received disconnect from 11: Bye Bye

As you can imagine, a log that’s been appended to for weeks or months can take a long time to read through, and we don’t have the time for that. So, using a pipe and another command called grep, we can search through a large log in very little time.

$ cat sshd.log | grep "Failed"
Sept 5 20:43:14 server sshd[19878]: Failed password for Tux from port 35245 ssh2
Sept 7 22:04:58 server sshd[25928]: Failed password for Tux from port 59843 ssh2
Sept 9 22:05:01 server sshd[32890]: Failed password for Tux from port 32498 ssh2

With a single command line utilizing a pipe, we can see that someone is trying, unsuccessfully, to get into Tux’s account over SSH.
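Pipes can also be chained, so the output of grep can feed yet another command. The sketch below counts the failed attempts instead of listing them. It assumes printf as a stand-in for cat sshd.log and feeds in only four lines of the excerpt, so the count applies to those four lines, not to the whole log.

```shell
# printf stands in for "cat sshd.log" here, emitting four lines of the excerpt.
# grep keeps only the matching lines, and wc -l counts what grep lets through.
failed=$(printf '%s\n' \
  'Sept 3 05:31:12 server sshd[19400]: Connection from port 32889' \
  'Sept 5 20:43:14 server sshd[19878]: Failed password for Tux from port 35245 ssh2' \
  'Sept 5 21:31:14 server sshd[20785]: Received disconnect from 11: Bye Bye' \
  'Sept 7 22:04:58 server sshd[25928]: Failed password for Tux from port 59843 ssh2' \
  | grep "Failed" | wc -l)

echo "Failed attempts: $failed"
```

Each stage only sees the output of the stage before it, which is why the order of the commands matters.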

Next to pipes in the redirection toolset is what’s simply known as ‘redirection’. There’s no special name for this one, but it’s very useful nonetheless. The redirection tools come in two variants, input and output, and output itself can be further divided into two streams.

Pipes are a form of output redirection, but a very particular case: they redirect the output to the input of another program. The more generic output redirection allows the output of a program to be written to a file. This is very useful when running scheduled scripts, as it lets you capture the output that you weren’t there to witness. Let’s say you’d like to capture the last time the files in a given directory were accessed because you think someone might be trying to steal your notes. You could schedule the following (by whichever method you prefer) to snapshot the access times (atime) of the files in the directory and write the output to a file.

$ ls -lu ~/secret_notes/ > script.log

And as simple as that, the output of your command will be captured in the script.log file. There is one very important thing to note here: this command will overwrite the existing file every time it’s executed. If the desired effect is to append to the log file, then you should use the following:

$ ls -lu ~/secret_notes/ >> script.log

The only difference is >> instead of >.
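The difference is easy to see with a throwaway file. This is a minimal sketch that uses a temporary file from mktemp in place of script.log; the variable names are only for illustration.

```shell
log=$(mktemp)                   # scratch file standing in for script.log

echo "first run"  > "$log"      # > empties the file, then writes
echo "second run" > "$log"      # the first line is now gone
overwritten=$(cat "$log" | wc -l)

echo "third run" >> "$log"      # >> keeps what is there and adds to the end
appended=$(cat "$log" | wc -l)

echo "after >: $overwritten line(s), after >>: $appended line(s)"
rm -f "$log"
```

After the two > writes the file holds one line, and after the >> write it holds two.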

Now it’s time to address the two streams behind output redirection. We’ve seen >, which on its own is shorthand for 1> and redirects only the first of them. Every program run from the shell has two output streams, stdout and stderr, both of which are shown in the terminal by default, often with no way to distinguish between them. The stdout stream is the standard output, or normal output, of the program; this is where our echo command was printing to. The stderr stream is the standard error output; this is typically where errors are printed. The benefit in splitting output into two streams is that they can be redirected to different places. We can see how this works below.

Imagine a directory filled with files, some of which belong to you, some don’t. When attempting to delete them, you will successfully delete those that belong to you, but the ones that don’t will produce an error.

$ rm -v * 1>succeed.log 2>fail.log

You can see that we use 1> and 2>, which refer to stdout and stderr respectively, and the files following them are where each stream is redirected. The -v flag is there because rm prints nothing on success by default; with it, rm names each file it removes on stdout. Again, we can use >> instead to ensure that we don’t overwrite the file if it already exists, which is very useful in the case of repeated logging.
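To try this without deleting anything, the sketch below uses a small made-up function, report, that writes one line to each stream, and two temporary files in place of succeed.log and fail.log. The function, its messages, and the file names are all hypothetical.

```shell
out=$(mktemp)   # scratch file standing in for succeed.log
err=$(mktemp)   # scratch file standing in for fail.log

# A hypothetical command that writes one line to each stream.
report() {
  echo "removed notes.txt"              # goes to stdout (file descriptor 1)
  echo "cannot remove root.txt" >&2     # >&2 sends this line to stderr (file descriptor 2)
}

report 1>"$out" 2>"$err"

out_text=$(cat "$out")
err_text=$(cat "$err")
echo "stdout file: $out_text"
echo "stderr file: $err_text"
rm -f "$out" "$err"
```

Each file ends up with only the line that was written to its stream, even though both lines would have appeared together in the terminal.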

Lastly, we have input redirection. Input redirection provides a file as the source of input, instead of the user typing the input manually or piping in the output of a previous command. We’ll use the same excerpt from our SSHD log from above.

$ wc < sshd.log
9        108        727
$ wc sshd.log
9        108        727 sshd.log
$ cat sshd.log | wc
9        108        727

All three commands produce the same counts (the second also prints the file name, because wc was told which file it was reading), which, while redundant, is exactly what we want to see. But why use input redirection when you can just use a pipe? When using a pipe to push a file to another program, you’re involving an extra program, cat, to open the file, read it, and finally pass it along the pipe. This means that there’s an additional executable being run, creating overhead of its own, and beyond that, the data is handled twice: once when cat reads the file and again when it writes it onto the pipe. This is trivial when dealing with small files, but it can slow down processes that are passed larger files, so it’s beneficial to simply let Bash (or whichever shell you use) open the file for your program to read from.
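The two approaches can be compared side by side on a throwaway file. This is a minimal sketch, assuming a three-line temporary file in place of sshd.log; the variable names are only for illustration.

```shell
log=$(mktemp)                        # scratch file standing in for sshd.log
printf 'one\ntwo\nthree\n' > "$log"

# Input redirection: the shell opens the file and wc is the only program that runs.
from_redirect=$(wc -l < "$log")

# Pipe: cat runs as an extra program, reads the file, and copies it into the pipe.
from_pipe=$(cat "$log" | wc -l)

echo "redirect: $from_redirect, pipe: $from_pipe"
rm -f "$log"
```

Both variables hold the same count of 3. The only difference is how the data reached wc.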

How do you make use of input and output redirection in your daily tasks? Let us know in the comments below.

Devin McElheran

IT professional by day and various hobbies by night.