In this recitation, we will look at how to apply redirection and pipes to various tasks when working in the shell. Though you are not expected to memorize each of these commands, what they do, and the flags they support, this exercise will hopefully inspire you to use some of these commands and their features in your daily workflows, in this course and beyond.
Before getting started, make sure you have a copy of our Wordle solutions.
You can obtain those solutions by following our Lab Workflow guide, or just by
cloning a fresh copy of the skeleton repo (which now contains the solutions)
somewhere outside of your ~/cs3157/
directory:
git clone ~j-hui/cs3157-pub/lab1
cd lab1/solutions/part2
For the first part of this recitation, we will look at several commands you should have already seen and used by this point. To answer these questions, read each command’s man page, and try running it yourself.
ls
(“LiSt”): list the files in the current directory
(6.1.1.1) What does the -l
flag do?
(6.1.1.2) What does the -a
flag do?
(6.1.1.3) The default behavior of ls
is to list files horizontally,
but if it detects that stdout isn’t a terminal (e.g., if you run it in
a pipeline like ls | other-cmd
), then it will list each file in
a separate line. What flag can you pass to force ls
to print each
file on its own line?
echo
: print arguments to stdout
(6.1.2.1) What does echo "\n"
print? Is it what you expect?
cat
(“conCATenate”): concatenate file contents and print to stdout
(6.1.3.1) What does cat
normally take as its (non-flag) arguments?
(6.1.3.2) What is the significance of the -
argument?
(6.1.3.3) What is the default behavior of cat
?
cat
new tricks
We normally just use cat
to display the contents of text files, but it can do
far more than that, despite being an extremely simple shell utility.
(6.2.1) How can we use cat
to copy a file? For instance, how do we do
something like cp game.c game2.c
, without using cp
?
(6.2.2) How can we use cat
to write to a text file? For instance, how
can we write the following multi-line text into a file named cat.txt
,
without using a text editor (e.g., vim
)?
They say, you can't teach an old dog new tricks.
But maybe an old cat can do what an old dog cannot.
- John
wc
is a handy utility used to count the number of lines, words, or bytes in
files. Like cat
, wc
normally takes a list of file names as arguments, but
treats the -
file name specially.
wc
normally reports all of those counts, but we can ask it to only report the
number of lines by passing it the -l
flag. For example, we can verify that
there are indeed 1000 words in the common1000
words list:
$ wc -l < words/common1000
1000
(6.3.1) How long is the longest word in the words/common1000
list?
(6.3.2) What’s the difference between the following invocations of wc
?
wc -l game.c
wc -l < game.c
cat game.c | wc -l
Are any of them equivalent?
(6.3.3) By default, echo
adds a newline after whatever it prints; e.g.,
if you run echo | wc -l
, wc
will report that it counted 1 newline.
What flag can you pass to echo
to suppress that behavior?
(6.3.4) What arguments can you pass to echo
to make it so that it
prints two newlines? Verify using echo <args..> | wc -l
.
(6.3.5) How many processes are running on CLAC right now? (Hint: running
the ps -A
command displays all running processes.)
(6.3.6) How many students have (or had) an account on CLAC? (Hint: every
student’s home directory is in /students/
.)
grep
?
grep
is another staple of UNIX shell utilities, and is used to search through
and filter lines of text. Like cat
and wc
, it can be used to search
through a list of files, or through input coming from stdin.
For example, to search for any words that contain the letter m
in our
short list of words:
grep m words/short-list
Or to search for any words that contain the substring gre
in common1000
:
grep gre words/common1000
grep
has a lot of features, which you can read about using man grep
.
One of the more useful flags is the -r
“recursive” flag, which
asks grep
to search every file in every subdirectory:
grep -r hello
Another useful flag is -v
, which inverts the grep
query and only prints
out lines that do not contain your search term.
(6.4.1) How many words contain the letter m
in the common1000
list?
(6.4.2) How many student UNIs on CLAC contain the number 2
?
(6.4.3) How many instances of Vim are running on CLAC right now?
(6.4.4) In the Wordle solutions, how many times is print_words()
used,
and where? What about valid_word()
? And what about the WORD_SIZE
macro?
If you haven’t already, compile the Wordle solutions by running make
.
In shell syntax, you can expand the output of one command into another, using
$()
. For example, if you run:
wc -l $(echo game.c words.c)
echo game.c words.c
outputs game.c words.c
, so the above command will
effectively evaluate:
wc -l game.c words.c
The run-tests.sh
script uses this to automatically run each test case. For
example, for the test case named hello1
, it runs:
./wordle $(cat tests/hello1.test) < tests/hello1.in > test-output/hello1.out
The $(cat tests/hello1.test)
expands to the arguments needed for the hello1
test case, while < tests/hello1.in
redirects test input to ./wordle
’s
stdin.
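The same pattern can be tried without Wordle. The following is a minimal sketch in which head stands in for ./wordle; the file names demo.test, demo.in, and demo.out are hypothetical and exist only for illustration:

```shell
# Work in a scratch directory so no real files are touched.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# demo.test holds the command-line arguments; demo.in holds the input.
printf '%s\n' '-n 2' > demo.test
printf 'first\nsecond\nthird\n' > demo.in

# $(cat demo.test) expands to the two words -n and 2, so this runs:
#   head -n 2 < demo.in > demo.out
head $(cat demo.test) < demo.in > demo.out

# Show what ended up in the output file.
result=$(cat demo.out)
echo "$result"
```

Because the expansion is left unquoted, the shell splits it into separate arguments, which is what we want here.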
(6.5.1) Write your own new test case for Wordle. You should not have to
write your own output .out
file. (Tip: you may find the tee
command
useful, but it’s not necessary.)
(6.5.2) What are the line counts of all files in the Wordle solutions
tracked by Git? (Hint: git ls-files
lists all files tracked by Git.)
There’s also a system-wide word list installed in /usr/share/dict/words
, that
(supposedly) contains every word in the English dictionary, with each word on
its own line.
(6.5.3) How many lines are there in /usr/share/dict/words
?
(6.5.4) /usr/share/dict/words
contains duplicate entries to account for
the fact that some words may be followed by 's
. For example, it contains
both meme
and meme's
. It does this so that spellcheckers using this
file don’t need to consider these edge cases, even though humans would not
really consider these distinct words. How many “actual” words are there in
/usr/share/dict/words
, i.e., those that don’t contain an apostrophe?
(6.5.5) What’s the longest “actual” word in /usr/share/dict/words
?
(6.5.6) You can actually use /usr/share/dict/words
as the words file for
a game of Wordle. Give this a try.
(6.5.7) You can also use /usr/share/dict/words
to “brute force” your way
through a game of Wordle, provided you have unlimited guesses. Run
./wordle
with the -g -
flag to allow unlimited guesses, and use
/usr/share/dict/words
to feed guesses (whether valid or invalid) into
your game.
Note that the answers to some of these questions don’t actually matter (some of them don’t even have a fixed answer). What’s more important are the commands you need to run to obtain those answers, and the thought process that goes into figuring out what command to run. What’s most important is understanding that thought process, and how you incorporate it into your command-line workflow.
(6.1.1) See man ls
/try running it yourself.
(6.1.1.1) List files in long format, which also shows other metadata like the owner, permissions, and last modified timestamp of each file.
(6.1.1.2) List all files, including hidden ones (whose file names
begin with a .
).
(6.1.1.3) The -1
flag will force each file to be printed on its own
line. Use of this flag isn’t usually necessary, but is generally
considered good practice when writing shell scripts with ls
.
(6.1.2) See man echo
/try running it yourself.
(6.1.2.1) echo "\n"
just prints \n
, not a newline. This might be
surprising at first, but it’s important that echo
sees "\n"
as
the characters \
and n
, not as a newline character. Use the -e
flag to tell echo
to interpret \n
as a newline.
By the way, we had to write "\n"
because \
means something special
in shell syntax, outside of the quotation marks.
(6.1.3) See man cat
/try running it yourself.
(6.1.3.1) cat
takes file names as its non-flag arguments, whose
contents it concatenates.
(6.1.3.2) The -
argument tells cat
to read from stdin instead of
a file. It can be mixed with other arguments, e.g., cat myfile -
will first output the contents of myfile
, then whatever it reads from
stdin.
(6.1.3.3) If you run cat
with no arguments, it defaults to reading
from stdin, i.e., as if you had run cat -
.
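To see the - argument in action, here is a small sketch; myfile is created on the spot, and its contents are made up for illustration:

```shell
# Work in a scratch directory so no real files are touched.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

printf 'from file\n' > myfile

# cat first prints myfile, then whatever arrives on stdin (the - argument).
combined=$(printf 'from stdin\n' | cat myfile -)
echo "$combined"
```

Swapping the order to cat - myfile would print the stdin text first.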
(6.2.1) cat game.c > game2.c
reads from game.c
, and prints it to
stdout, which is redirected to game2.c
, effectively copying the contents
of game.c
into game2.c
. cat < game.c > game2.c
also works.
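If you want to convince yourself that the copy is exact, one option is to compare the two files with cmp. This is only a sketch: the game.c created here is a tiny hypothetical stand-in, not the real Wordle source.

```shell
# Work in a scratch directory so no real files are touched.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Hypothetical stand-in for game.c.
printf 'int main(void) { return 0; }\n' > game.c

# Copy using cat and output redirection instead of cp.
cat game.c > game2.c

# cmp -s prints nothing and exits with status 0 only if the files are identical.
if cmp -s game.c game2.c; then
    copy_status=identical
else
    copy_status=different
fi
echo "$copy_status"
```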
(6.2.2) If you run cat
or cat -
, cat
will just read from stdin (i.e.,
what you type in with your keyboard), and print it to stdout. If you
redirect stdout to a file, you essentially get a crude text editor. So
you can do something like cat > cat.txt
, and then type in the text.
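When typing the text interactively, press Ctrl-D on an empty line to signal the end of input. If you would rather not type interactively, a here-document is one alternative (a different technique from the one above, shown here only as a sketch); it feeds the lines between the two EOF markers to cat's stdin:

```shell
# Work in a scratch directory so no real files are touched.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Quoting 'EOF' keeps the shell from expanding anything inside the text.
cat > cat.txt <<'EOF'
They say, you can't teach an old dog new tricks.
But maybe an old cat can do what an old dog cannot.
- John
EOF

# Count the lines that were written (tr strips padding some wc versions add).
line_count=$(wc -l < cat.txt | tr -d ' ')
echo "$line_count"
```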
(6.3.1) According to man wc
, the -L
flag counts the maximum line
length, so either wc -L words/common1000
or wc -L < words/common1000
will work here. You should find that the longest word/line is 16 characters long.
(6.3.2) By running these commands, you’ll find that wc -l game.c
prints
out the name of the file (game.c
), while the other two don’t; since
they’re reading from stdin, there’s no actual “file” whose name it will
show. wc -l < game.c
and cat game.c | wc -l
are equivalent, because
both are using the contents of game.c
as the stdin of wc -l
. In general,
any cmd < file
behaves like cat file | cmd
, although the pipeline starts an extra cat process and hands cmd a pipe rather than the file itself.
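Here is a small sketch of the three invocations side by side, using a hypothetical three-line sample.txt in place of game.c:

```shell
# Work in a scratch directory so no real files are touched.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

printf 'one\ntwo\nthree\n' > sample.txt

# wc is given a file name, so it prints the name next to the count.
with_name=$(wc -l sample.txt)

# wc only sees stdin in these two cases, so there is no name to print.
from_redirect=$(wc -l < sample.txt)
from_pipe=$(cat sample.txt | wc -l)

echo "file argument: $with_name"
echo "redirection:   $from_redirect"
echo "pipe:          $from_pipe"
```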
(6.3.3) According to man echo
, you can use the -n
flag to suppress the
newline. You should find that echo -n | wc -l
reports 0.
(6.3.4) According to man echo
, the -e
flag tells echo
to interpret
escape sequences like \n
, so echo -e "\n" | wc -l
will report 2 (one
line for the interpreted \n
, one line that echo
adds by default).
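One caveat: echo's flags are not the same in every shell, so -n and -e may behave differently outside of bash. As a sketch of a more portable alternative (not what the questions above ask for), printf prints exactly what its format string says and never adds a newline on its own:

```shell
# printf with an empty format prints nothing at all, so wc counts 0 newlines.
none=$(printf '' | wc -l | tr -d ' ')

# printf interprets \n in its format string, so this emits exactly 2 newlines.
two=$(printf '\n\n' | wc -l | tr -d ' ')

echo "$none $two"
```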
(6.3.5) You can find out by running ps -A | wc -l
, and subtracting 1 from
the line count. You should run ps -A
on its own first, to get an idea of
its output format; it outputs a header row showing the meaning of each
column, and then outputs a line per running process. If you want more
information about those processes, you can run ps -A -f
, which will tell
you things like exactly what command was run to start that process, and which
user ran it.
(6.3.6) Each student account has a home directory associated with it in
/students/
. You can use ls /students/
to list all the students home
directories in /students/
, so running ls -1 /students/ | wc -l
will
count the number of home directories (the -1
is not strictly necessary
because ls
will figure out that you are piping its output elsewhere, and
automatically output each file name on a separate line). At this time of
writing, the number was 466 (though not all of those accounts are active).
(6.4.1) grep m
outputs all lines containing the letter m
from stdin;
grep m < words/common1000 | wc -l
indicates that there are 134.
(6.4.2) You can use grep 2
as a filter for any lines containing the
character 2
, so ls -1 /students/ | grep 2 | wc -l
will tell you the
number of student UNIs that contain a 2
. At this time of writing,
there are 358 such UNIs. You can run ls -1 /students/ | grep 2
to see
what they are.
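You can rehearse the pipelines from 6.3.6 and 6.4.2 away from CLAC. The sketch below builds a scratch directory standing in for /students/; the four UNIs are invented for illustration. It also shows grep -c, which counts matching lines itself and so can replace grep ... | wc -l.

```shell
# Hypothetical stand-in for /students/, with made-up UNIs as home directories.
students_dir=$(mktemp -d)
mkdir "$students_dir/abc1234" "$students_dir/xy2200" \
      "$students_dir/jd9876" "$students_dir/mn3131"

# Count all home directories.
total=$(ls -1 "$students_dir" | wc -l | tr -d ' ')

# Count only those whose name contains the character 2.
with_two=$(ls -1 "$students_dir" | grep 2 | wc -l | tr -d ' ')

# Same count, letting grep do the counting with -c.
with_two_c=$(ls -1 "$students_dir" | grep -c 2)

echo "$total total, $with_two containing 2"
```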
(6.4.3) You can use ps -A
to list all the processes, and grep vim
to
filter for only vim
processes, so running ps -A | grep vim | wc -l
counts the number of Vim processes. You can even use ps -A -f | grep vim
to
peek at what files others are editing.
(6.4.4) You can look for these using the -r
flag:
grep -r print_words
grep -r valid_word
grep -r WORD_SIZE
If grep
tells you that you are getting matches in binary files, you can
skip them using the -I
flag.
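To answer the "where" part of the question, it helps to add the -n flag, which prefixes each match with its line number. Here is a sketch on a scratch directory; the two source files are hypothetical and much smaller than the real solutions:

```shell
# Work in a scratch directory so no real files are touched.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Hypothetical miniature source tree.
mkdir src
printf '#define WORD_SIZE 5\n' > src/words.h
printf 'char guess[WORD_SIZE + 1];\nint n = 0;\n' > src/game.c

# -r searches every file under the current directory (.);
# -n shows file:line:text for each match. sort makes the order predictable.
matches=$(grep -rn WORD_SIZE . | sort)
echo "$matches"

# Count how many lines matched across all files.
match_count=$(grep -r WORD_SIZE . | wc -l | tr -d ' ')
echo "$match_count"
```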
(6.5.1) The tee
command copies its input from stdin to stdout and to
each file you specify as an argument. I use it to capture the input I type
and the output generated by Wordle.
Here’s what I did to create a new test. First, I wrote my tests/mytest.test
file, which contains the arguments I run wordle
with (and I made sure to
specify the -n
argument to make sure that the test behaves
deterministically, rather than choosing a random word). Then, I ran:
tee tests/mytest.in | ./wordle $(cat tests/mytest.test) | tee tests/mytest.out
tee tests/mytest.in
captures the input that I type into stdin, and saves
it in tests/mytest.in
; tee tests/mytest.out
captures the output emitted
by Wordle, and saves it to tests/mytest.out
, while also showing it on the
screen.
Of course, this way of creating tests only works because I have a working implementation of Wordle. You should always closely inspect the output of Wordle to make sure that it is what you expect. If you are writing your own test cases before you have a working solution, you will need to write these files manually.
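If you want to see the double-tee pipeline work before trying it on Wordle, here is a sketch where tr stands in for ./wordle and the input comes from printf rather than the keyboard. The names demo.in and demo.out are hypothetical.

```shell
# Work in a scratch directory so no real files are touched.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# The first tee saves the input; tr (our stand-in program) upper-cases it;
# the second tee saves the output while still showing it on the screen.
printf 'hello\nworld\n' | tee demo.in | tr 'a-z' 'A-Z' | tee demo.out

saved_in=$(cat demo.in)
saved_out=$(cat demo.out)
```

After the pipeline finishes, demo.in holds exactly what was fed in and demo.out holds exactly what the program printed.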
(6.5.2) Running wc -l $(git ls-files)
counts all the lines in the current
repo; git ls-files
expands to all the tracked files, which then all given
to wc -l
as parameters. Note that running this is not the same as
running git ls-files | wc -l
, which will just count the number of tracked
files (i.e., the number of lines of git ls-files
).
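The difference between the two forms is easy to see on a small example. In this sketch, ls stands in for git ls-files, and the two text files are made up for illustration:

```shell
# Work in a scratch directory so no real files are touched.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

printf 'a\nb\n' > one.txt
printf 'c\nd\ne\n' > two.txt

# Command substitution: the file names become arguments to wc,
# so wc opens each file and counts the lines inside it.
wc -l $(ls)
total_line=$(wc -l $(ls) | tail -n 1)

# Pipe: the file names themselves are wc's input,
# so wc counts how many names there are.
name_lines=$(ls | wc -l | tr -d ' ')
echo "$name_lines"
```

Note that the unquoted $(...) splits on whitespace, so this form misbehaves if any file name contains a space.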
(6.5.3) There are lots of words. wc -l < /usr/share/dict/words
tells us
there are 104334 to be exact (for the current version of the dictionary).
(6.5.4) You can filter out lines containing '
using grep -v "'"
; note
that the quotation marks around '
are necessary because otherwise your
shell will interpret '
as starting a string, in shell syntax. So the
command to run here is grep -v "'" < /usr/share/dict/words | wc -l
, which
should give 74744 (for the current version of the dictionary).
(6.5.5) Building on 6.5.4, you can use wc -L
to count the longest line,
so grep -v "'" < /usr/share/dict/words | wc -L
tells us the longest
actual word is 22 characters long.
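wc -L only reports the length, not the word itself. One way to print the word too is a short awk program; awk is my own addition here rather than something the questions require. The sketch runs on a tiny made-up word list instead of /usr/share/dict/words:

```shell
# Hypothetical miniature word list, one word per line.
words=$(mktemp)
printf "meme\nmeme's\ncat\ncat's\nelephant\n" > "$words"

# Count the "actual" words, i.e., lines without an apostrophe.
actual=$(grep -v "'" < "$words" | wc -l | tr -d ' ')

# Remember the longest line seen so far, and print it at the end.
longest=$(grep -v "'" < "$words" |
    awk 'length($0) > length(longest) { longest = $0 } END { print longest }')

echo "$actual actual words; longest is $longest"
```

If several words tie for the longest, this prints the first one it encounters.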
(6.5.6) Just run ./wordle -f /usr/share/dict/words
(:
(6.5.7) You can use the words
file to drive a “dictionary attack” on
Wordle with ./wordle -g - -f words/common100 < /usr/share/dict/words
.
This should also work when you use -f /usr/share/dict/words
.