Shuf: A Linux Command To Shuffle Text; Try It On 78 Billion Line Text File

A meme about Shuf, a Linux and Unix command, has been getting a lot of attention on Reddit lately, especially in the subreddit /r/ProgrammerHumor. Credit goes to Redditor Nexuist, who first noticed and shared a picture mentioning a ‘78 billion line text file.’

So, before I tell you about Shuf, let’s look at why this command-line utility is in the limelight. On Stack Overflow, in a conversation about ‘selecting random lines from a file’, a user named Ash commented that Shuf is fantastic and that he had tried it on a text file with 78 billion lines.

He also claimed that Shuf took less than a minute to shuffle the text in the file. However, he didn’t share any further information. Surprisingly, some other comments support the answer and agree that Shuf is very quick.

[Image: 78 billion line text file]

No one knows the truth, but Redditors found humor in the comment and started circulating loads of memes. One user, Stein van Broekhoven, posted an answer describing his own experiment on a file with 78 billion newlines.

And whether you believe it or not, he reported that Shuf finished shuffling 78 000 000 000 newlines within one minute. Can you believe it? If not, let’s get to know the Shuf Linux command so you can try the challenge on your own system.
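If you want a feel for the challenge without 78 billion lines, here is a scaled-down sketch. It assumes GNU coreutils and uses 1,000,000 empty lines; the line count and the file names are my own choices for illustration, not details from the Stack Overflow answer.

```shell
# Scaled-down stand-in for the meme: 1,000,000 empty lines, not 78 billion.
# The count and the file names below are assumptions for illustration.
lines=1000000

# Each zero byte from /dev/zero becomes one newline, i.e. one empty line.
head -c "$lines" /dev/zero | tr '\0' '\n' > newlines.txt

start=$(date +%s)
shuf newlines.txt > shuffled_newlines.txt
end=$(date +%s)

echo "Shuffled $(wc -l < shuffled_newlines.txt) lines in $((end - start)) second(s)"
```

Keep in mind that a file made only of empty lines looks identical after shuffling, so a timing like this says little about how Shuf handles real text.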

What Is Shuf?

Shuf is a Linux and Unix command-line utility that puts its input lines in random order, so the output is a random permutation of the input. In simple terms, it shuffles the lines of a text file or of standard input.

Shuf has three modes of operation, which differ in how it accepts input. First, by default, it reads lines from a file or from standard input. Second, with the -e or --echo option, it treats each command-line operand as an input line.

And lastly, with -i lo-hi or --input-range=lo-hi, it acts as if the input were the integers from lo to hi, one per line.

shuf [option]… [file]
shuf -e [option]… [arg]…
shuf -i lo-hi [option]…

There are other options as well that change how Shuf reads input and writes output.

[Image: Shuf options]
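To give a rough idea of what some of those options do, here is a small sketch using three flags from GNU shuf: -o, --random-source, and -z. The file names (words.txt, seed.bin, and so on) and the sample words are placeholders I made up for the example.

```shell
# Sample input; the contents are placeholders.
printf '%s\n' alpha beta gamma delta > words.txt

# -o FILE writes the result to FILE instead of standard output.
shuf -o words_shuffled.txt words.txt

# --random-source=FILE takes its random bytes from FILE. Reusing the
# same bytes should make the shuffle repeatable.
head -c 1024 /dev/urandom > seed.bin
shuf --random-source=seed.bin words.txt > run_1.txt
shuf --random-source=seed.bin words.txt > run_2.txt
cmp -s run_1.txt run_2.txt && echo "same order both times"

# -z uses NUL instead of newline as the line delimiter.
printf 'a\0b\0c\0' | shuf -z | tr '\0' '\n'
```

The repeatable shuffle is handy when you want a random order that you can reproduce later, for example when splitting data for a test.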

How To Use Shuf Linux/Unix Command?

Now let’s start using Shuf and learn by practicing in a terminal. To demonstrate Shuf, I’ve created a text file, fossbytes.txt, with ten lines of input.

cat fossbytes.txt
[Image: Text file with inputs]
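If you want to follow along, you can build a similar ten-line file yourself. This is only a sketch: the "line N" contents are placeholders of my own, not the actual contents of the file used in this article.

```shell
# Build a ten-line sample file. The "line N" text is a placeholder,
# not the article's real file content.
seq 1 10 | sed 's/^/line /' > fossbytes.txt

# Print it to confirm what Shuf will be working with.
cat fossbytes.txt
```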

Shuffle Content Of Files

If you just want to print the lines of a file in random order, you can run Shuf directly with the filename.

shuf fossbytes.txt
[Image: Get text file content in random order]
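One way to convince yourself that Shuf only reorders lines, without dropping or duplicating any, is to sort the original and the shuffled copy and compare them. The sketch below assumes a placeholder ten-line fossbytes.txt; shuffled.txt is a name I picked for the example.

```shell
# Placeholder input file (not the article's real content).
seq 1 10 | sed 's/^/line /' > fossbytes.txt

shuf fossbytes.txt > shuffled.txt

# After sorting, both files should be identical: same lines, different order.
if [ "$(sort fossbytes.txt)" = "$(sort shuffled.txt)" ]; then
    echo "same lines, reordered"
fi
```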

Shuffle With A Limited Number Of Output Lines

If you want only a limited number of random lines from a text file, use the -n option, which sets the maximum number of lines to output.

shuf -n 5 fossbytes.txt
[Image: Shuffle content with limited number]
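Here is a short sketch of how -n behaves, again assuming a placeholder ten-line fossbytes.txt. The file name sample.txt is hypothetical. It also shows that -n is an upper limit: asking for more lines than the file has simply returns every line.

```shell
# Placeholder input file (not the article's real content).
seq 1 10 | sed 's/^/line /' > fossbytes.txt

shuf -n 5 fossbytes.txt > sample.txt
wc -l < sample.txt             # number of lines picked

# Prints nothing: without -r, no input line is picked twice.
sort sample.txt | uniq -d

# Asking for 50 lines from a 10-line file returns all 10, shuffled.
shuf -n 50 fossbytes.txt | wc -l
```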

Pass Shuffled Output To Another File

Sometimes you may want to save some random content to another file. To do that, use the previous command with the > output redirection operator followed by the target file.

shuf -n 5 fossbytes.txt > fossbytes_1.txt
[Image: Pass shuffled output to another file]

If you want to write all the shuffled content to another file, remove the -n option and just run:

shuf fossbytes.txt > fossbytes_1.txt
[Image: Pass all shuffled output to another file]
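One caveat with redirection: the shell empties the target of > before Shuf starts, so running shuf fossbytes.txt > fossbytes.txt would leave you with an empty file. If you want to shuffle a file in place, a minimal sketch is to write to a temporary file first. The input below is a placeholder ten-line file.

```shell
# Placeholder input file (not the article's real content).
seq 1 10 | sed 's/^/line /' > fossbytes.txt

# Write the shuffled lines to a temporary file, then replace the
# original only if shuf succeeded.
tmp=$(mktemp)
shuf fossbytes.txt > "$tmp" && mv "$tmp" fossbytes.txt

wc -l < fossbytes.txt          # still ten lines, now in random order
```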

Shuffle Input Passed As Command-Line Argument

You can also get random content by passing the input as command-line arguments. Use the -e option, which treats each argument as an input line.

shuf -e text_1 text_2 text_4 text_3 text_5
[Image: Shuffle input passed from command-line argument]
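Two small uses of -e, as a sketch: picking one argument at random, and shuffling the names matched by a shell glob. The notes_*.txt files are hypothetical and are created only for the example.

```shell
# Pick one argument at random: a coin flip.
flip=$(shuf -n 1 -e heads tails)
echo "$flip"

# Shuffle the file names matched by a glob. The shell expands the
# pattern, and shuf -e treats each name as one input line.
touch notes_a.txt notes_b.txt notes_c.txt
shuf -e notes_*.txt
```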

Shuffle Random Content Within Range

To shuffle a range of integers, use the -i option with the lower and upper bounds of the range:

shuf -i 1-10
[Image: Shuffle range of input]
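As a quick check that -i 1-10 outputs every integer from 1 to 10 exactly once, you can sort the result numerically and compare it with seq. This is a sketch; numbers.txt and expected.txt are file names I chose for the example.

```shell
shuf -i 1-10 > numbers.txt

# Sorted numerically, the shuffled output should match seq 1 10 exactly.
seq 1 10 > expected.txt
sort -n numbers.txt | cmp -s - expected.txt && echo "every number appears once"

# Combined with -n 1, a range makes a simple dice roll.
roll=$(shuf -n 1 -i 1-6)
echo "rolled $roll"
```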

Furthermore, you can also specify the number of random outputs using the -n flag. I’ve also used the -r option, which allows values to repeat; without it, a range of six values (0 to 5) could never produce ten lines of output.

shuf -r -n 10 -i 0-5
[Image: Shuffle range of input with limited number of outputs]
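The effect of -r is easiest to see by running the same command with and without it. In this sketch, the variable names and repeated.txt are my own choices.

```shell
# Without -r, -n cannot exceed the six available values (0 through 5).
without_r=$(shuf -n 10 -i 0-5 | wc -l)
echo "without -r: $without_r lines"

# With -r, values may repeat, so all ten requested lines are produced.
shuf -r -n 10 -i 0-5 > repeated.txt
with_r=$(wc -l < repeated.txt)
echo "with -r: $with_r lines"
```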

Conclusion

I hope you’ve learned about ‘Fantastic Shuf’, which can speed up generating random content from input lines. If you’ve tried it on your own text file with 78 billion lines, share your results in the comment section below.