COMS 3157 Advanced Programming

Recitation 8: TCP/IP Networking in the Shell

Solutions/supplemental code for this recitation can be found on CLAC, here:

git clone ~j-hui/cs3157-pub/examples/sh-networking

8.0 `hello.sh`

For this recitation, you will implement a shell script. A shell script is just a text file containing shell commands which you execute. It allows you to automate commands you would otherwise type in yourself in the shell.

Let’s start by writing a basic shell script that prints out a simple message. Name your shell script hello.sh; it should behave like this:

$ ./hello.sh
hello

$ ./hello.sh John
hello John

$ ./hello.sh Johnathan Huiman
hello Johnathan

In particular, when you run it with no arguments, it should print hello; when you run it with a single argument, it should print hello followed by that argument. As the last example shows, any arguments after the first are ignored.

Here are some tips to get you started:

You will need to make your script executable. You can do so using chmod:
```
chmod +x hello.sh
```
The +x means “add executable permissions,” and will allow you to execute it like ./hello.sh.
The first line of a script should start with a “shebang” that indicates what interpreter should be used to execute the script:
```
#!/bin/sh
```
This shebang tells your operating system that you would like this script to be executed using the shell interpreter, /bin/sh. Note that the shebang MUST be the very first line of the script.
You can read the first argument by writing $1, which will “expand” to the first argument after ./hello.sh.
You can use echo to print inside of your shell script, exactly how you use echo in the terminal.
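
To see these tips working together without giving away hello.sh itself, here is a minimal sketch that writes a tiny script to a temporary directory, makes it executable, and runs it with different arguments. The name show-arg.sh and the use of mktemp are assumptions made for this demonstration only; they are not part of the recitation code.

```shell
#!/bin/sh
# Sketch: create a tiny script, make it executable, and run it with arguments.
# show-arg.sh is a hypothetical name used only for this demonstration.

dir=$(mktemp -d)                   # scratch directory so we don't clutter the current one

# Write the script. The quoted 'EOF' keeps $1 from being expanded right now.
cat > "$dir/show-arg.sh" <<'EOF'
#!/bin/sh
# "$1" expands to the first argument, or to nothing if none was given.
echo "first argument: [$1]"
EOF

chmod +x "$dir/show-arg.sh"        # add executable permissions

"$dir/show-arg.sh"                 # prints: first argument: []
"$dir/show-arg.sh" John            # prints: first argument: [John]
"$dir/show-arg.sh" John Hui        # prints: first argument: [John]
```

If you skip the chmod step, running the script directly fails with a "Permission denied" error. Remove the scratch directory with rm -r "$dir" when you are done.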

8.1 `mdb-lookup-server.sh`

In lecture, we saw how netcat sends bytes from stdin to its peer, and prints bytes received from its peer to stdout. In this exercise, we will use netcat to make mdb-lookup-cs3157 available over the internet. In other words, we will convert it into a server running on CLAC.

At first, this does not seem possible. On one hand, piping nc to mdb-lookup-cs3157 will allow a TCP client to send queries, but the results of the lookup will be printed to your terminal on CLAC, not sent back to the client:

nc -l 8888 | ~j-hui/cs3157-pub/bin/mdb-lookup-cs3157

On the other hand, piping mdb-lookup-cs3157 to nc will send the output of mdb-lookup-cs3157 to the client, but it will still read input from the terminal where you ran this command:

~j-hui/cs3157-pub/bin/mdb-lookup-cs3157 | nc -l 8888

To overcome this, we can use a “named pipe” (also known as a “FIFO”). A named pipe is a special kind of file that behaves like a pipe instead of a regular file: what you write into it, another program will read out of it at the same time.

To create a named pipe, we use mkfifo, whose argument is the name of the named pipe:

mkfifo mypipe
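
Before wiring a named pipe into netcat, it may help to watch one work locally. The following is a minimal sketch in which a background writer and a foreground reader talk through a FIFO; the names fifo_dir, fifo, and line, and the use of mktemp, are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: one process writes into a FIFO while another reads out of it.

fifo_dir=$(mktemp -d)              # scratch directory (illustrative)
fifo="$fifo_dir/demo-pipe"
mkfifo "$fifo"

# Writer: runs in the background. Opening the FIFO for writing blocks
# until some other process opens it for reading.
echo "hello through the pipe" > "$fifo" &

# Reader: opening the FIFO for reading unblocks the writer; read one line.
read -r line < "$fifo"
echo "reader got: $line"

wait                               # let the background writer finish
rm -r "$fifo_dir"                  # remove the FIFO and its directory
```

Note that nothing is ever stored in the FIFO on disk: the writer blocks until a reader shows up, and the data is handed directly from one process to the other.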

Next, we use it to “close the loop” in our shell pipeline:

nc -l 8888 < mypipe | ~j-hui/cs3157-pub/bin/mdb-lookup-cs3157 > mypipe

~j-hui/cs3157-pub/bin/mdb-lookup-cs3157 < mypipe | nc -l 8888 > mypipe   # equivalent

Try running these commands, and connect to your server from a netcat client.

Next, you will implement this as a shell script, named mdb-lookup-server.sh.

Here are a few tips and guidelines to get you going:

Remember that you will still need to make your script executable, and include a shebang.
Rather than hard-coding the port number into your script, you should make your script use its first argument as the port number.
Your script should create a named pipe each time it is run, and clean up the named pipe before exiting.
Use echo to debug your script; think of it like printf(), but for shell scripts!
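
The bookkeeping described in these tips (reading the port from the first argument, creating a fresh named pipe, and cleaning it up) might be sketched as follows. This is only one possible arrangement, not the recitation solution: the names port, fifo, and cleanup, the default port of 8888, the location of the pipe, and the use of trap are all assumptions. The netcat pipeline itself is left out.

```shell
#!/bin/sh
# Sketch: read a port from "$1", create a uniquely named FIFO, remove it on exit.
# All names here are illustrative.

port=${1:-8888}                    # assumption: fall back to 8888 if no argument is given
fifo="${TMPDIR:-/tmp}/mypipe-$$"   # $$ expands to this shell's process ID

cleanup() {
    rm -f "$fifo"
}
trap cleanup EXIT                  # run cleanup whenever the shell exits
trap 'exit 1' INT TERM             # on Ctrl-C, exit so that the EXIT trap fires

mkfifo "$fifo"
echo "debug: port=$port fifo=$fifo" >&2   # debugging output goes to stderr

# ... the nc pipeline shown earlier would go here, using "$port" and "$fifo" ...
```

Including the process ID in the pipe's name lets two copies of the script run at once without fighting over the same file. The traps matter because a server like this is usually stopped with Ctrl-C, which would otherwise skip any cleanup written at the end of the script.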

8.2 Exploring HTTP with netcat

Let’s try to learn HTTP without reading documentation! The lecture notes are full of typos anyway.

By default, cURL and Wget don’t show you the contents of the HTTP session itself; they only show the resource attached to the server’s HTTP response. For example, here’s cURL:

$ curl http://clac.cs.columbia.edu/~j-hui/cs3157/index.html
<html>
<body>
<h1>
    Hi, I'm John!
</h1>

... (( truncated for brevity )) ...

You could use the -v flag to ask cURL to show the HTTP session content:

$ curl -v http://clac.cs.columbia.edu/~j-hui/cs3157/index.html

(( output not shown ))

cURL actually includes additional information here beyond the bytes exchanged as part of the HTTP session. It also limits the flexibility of what we are able to do over that TCP connection; we can’t craft our own HTTP contents.

Instead, we can set up netcat as a fake HTTP peer, and try to reverse engineer the protocol! (This technique will also be useful for debugging your own HTTP programs.)

For instance, let’s peek at what an HTTP request looks like. First, run netcat on CLAC as a TCP server on your port of choice:

clac$ nc -l 8888

Then, send it a request using a real HTTP client like cURL, Wget, or your web browser. The request should appear on CLAC, where you ran netcat. What do you see? Compare it with the output of curl -v.

After receiving the request, nothing else will appear to happen. The client is waiting for a response from the server, but netcat doesn’t know what to respond with—it’s just a simple TCP server and doesn’t know how to speak HTTP. We can forcibly quit netcat (or cURL) by pressing Ctrl-C.

So now let’s see what an HTTP response looks like! We’ll run netcat as a TCP client, but we need to craft a valid HTTP request to first send the server. We can do this by forwarding a request captured from cURL (or Wget or any browser), using a pipe! Here’s how to set that up:

$ nc -l 8888 | nc clac.cs.columbia.edu 80

Then, on the same machine, make an HTTP request to http://localhost:8888/~j-hui/cs3157/index.html (using cURL, Wget, or a browser). The response should appear as the output of your netcat pipeline. What do you see now? Compare it with the output of curl -v.

(Note that some browsers will insist on using HTTPS instead of HTTP, which will appear as garbage bytes being printed to your terminal. If your browser does this, try another browser, or stick to cURL or Wget.)

Finally, let’s craft some of our own HTTP requests and responses, by doctoring what we got from the client and the server. First, save a request and a response to text files, using shell redirection:

nc -l 8888 > request.txt                                # capture request

nc -l 8888 | nc clac.cs.columbia.edu 80 > response.txt  # capture response

You will need to quit these commands forcibly using Ctrl-C.

Try the following:

Hex dump request.txt (e.g., using xxd or hexyl). What do you notice about line endings? Do the same for response.txt.
Using only netcat (i.e., without using an HTTP client like cURL or Wget) and your request file, make an HTTP request to CLAC for http://clac.cs.columbia.edu/~j-hui/cs3157/index.html.
Modify the request file to be a request for http://clac.cs.columbia.edu/index.html.
See if you can create a request that will make an HTTP server respond with a 404 status code.
See if you can create a request that will make an HTTP server respond with a 400 status code.

Note that when editing the request and response files, you’ll want to make sure you preserve newline encoding. In vim, you can do this by editing the file like this:

$ vim ++ff=dos response.txt
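
After editing, it is worth checking that the line endings survived. Here is a minimal sketch of one way to do that with standard tools; count_crlf_lines and the sample file are hypothetical, and od is used here only because it is available almost everywhere (xxd or hexyl work just as well).

```shell
#!/bin/sh
# Sketch: count how many lines of a file end in CR LF.
# count_crlf_lines is a hypothetical helper, not part of the recitation code.

cr=$(printf '\r')                  # a literal carriage return character

count_crlf_lines() {
    # grep -c prints the number of lines whose last character before LF is CR
    grep -c "${cr}\$" "$1"
}

sample=$(mktemp)                   # scratch file (illustrative)
printf 'line one\r\nline two\r\nline three\n' > "$sample"

count_crlf_lines "$sample"         # prints 2: the third line ends in a bare LF
od -c "$sample" | head -n 3        # \r and \n show up explicitly in the dump
rm -f "$sample"
```

Comparing this count against the total from wc -l tells you whether an editor quietly converted some of the line endings.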

I also recommend keeping around multiple copies of the request and response files, for each experiment you attempt. You could keep track of your progress using Git.
