Wireshark Lab HTTP

In this blog series I will be solving a number of labs to understand a bit more about how different Internet protocols send and receive information, and how we can use Wireshark to analyze their packets.

So what is Wireshark? In simple words, Wireshark is a packet analyzer. It is most commonly used for network troubleshooting and analysis, and for software and communications protocol development.

In this blog I will focus mainly on solving each lab and won’t explain basic concepts in much detail. If you are new to Wireshark, I recommend reviewing the introduction to these labs, which explains some basic concepts and the initial steps for using Wireshark.

So, to start with this lab, let’s open Wireshark. You should see a window similar to the one shown in the figure below.

ws_00

To start capturing packets, first select the interface you will be working with, e.g. Ethernet card, wireless card, etc. (in my case I am using a wireless connection, so I’ll select the “Wireless Network Controller”). Then select Capture in the top menu and click on Start, or click on the shark fin icon at the top-left of the window.
ws_01
To stop capturing data, just click on the stop icon (red square).
ws_02

To work with Wireshark it is important to understand how it shows the captured information. The Wireshark window is divided into five major components (see figure below):
1. The command menus, located at the top of the window, contain buttons and/or pull-down menus for actions such as save, capture, exit, etc.
2. The packet-listing window shows a one-line summary for each packet captured.
3. The packet-header details window provides details about the packet selected in the packet-listing window.
4. The packet-content window displays the content of the captured frame in both ASCII and hexadecimal format.
5. The packet-display filter field is where you can enter a filter expression to restrict the packets shown in the packet-listing window.
ws_03

Now that we have had a very short overview of Wireshark, let us start with the Wireshark HTTP lab.

The Basic HTTP GET/response interaction

For the first part of this lab do the following:

  • Start up your web browser.
  • Start up the Wireshark packet sniffer. Enter http in the display-filter-specification window, so that only captured HTTP messages will be displayed later in the packet-listing window.
  • Wait a bit more than one minute, and then begin Wireshark packet capture.
  • Enter the following URL into your browser: http://gaia.cs.umass.edu/wireshark-labs/HTTP-wireshark-file1.html Your browser should display a very simple, one-line HTML file.
  • Stop Wireshark packet capture.

ws_04

The figure above shows, in the packet-listing window, that two HTTP messages were captured:

  • The GET request message (from your browser to the gaia.cs.umass.edu web server) and,
  • The response message from the server to your browser.

The HTTP response message consists of a status line, followed by header lines, followed by a blank line, followed by the entity body.
Note that since the HTTP message is carried inside a TCP segment, which is carried inside an IP datagram, which is carried within an Ethernet frame, Wireshark displays the Frame, Ethernet, IP, and TCP packet information as well.
However, because this lab is about HTTP, let us ignore all the non-HTTP data displayed and expand the data displayed for “Hypertext Transfer Protocol”.
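
The layout of the response described above (status line, header lines, blank line, entity body) can be sketched in a few lines of Python. This is only an illustration: the function name split_http_response and the sample response bytes are made up for the example and are not taken from the capture.

```python
def split_http_response(raw: bytes):
    """Split a raw HTTP/1.x response into status line parts, headers, and body."""
    # The blank line (CRLF CRLF) separates the header section from the entity body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    # Status line: HTTP version, status code, reason phrase.
    version, code, phrase = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), phrase, headers, body


# Hypothetical sample response, for illustration only.
sample = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 13\r\n"
    b"\r\n"
    b"<html></html>"
)
version, code, phrase, headers, body = split_http_response(sample)
print(version, code, phrase)
print(headers["content-length"], len(body))
```

The fields pulled out here are the same ones Wireshark shows when you expand “Hypertext Transfer Protocol” in the packet-header details window.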

By looking at the information in the HTTP GET and response messages, answer the following questions.

  1. Is your browser running HTTP version 1.0 or 1.1? What version of HTTP is the server running?
    You can see that both my browser and the server are running HTTP 1.1.
    ws_05
  2. What languages (if any) does your browser indicate that it can accept to the server?
    My browser can accept Japanese and English.
    ws_06
  3. What is the IP address of your computer? Of the gaia.cs.umass.edu server?
    Client IP: 10.104.52.95
    Server IP: 128.119.245.12
    As shown in the figure below, you can get the IP information from the HTTP GET and response messages in the packet-listing window, or from the Internet Protocol line in the packet-header details window.
    ws_07
  4. What is the status code returned from the server to your browser?
    Status code: 200
    The HTTP status code gives you information about the outcome of your HTTP request. For example, a 2xx code (such as 200) means that the request was successfully completed.
    ws_08
  5. When was the HTML file that you are retrieving last modified at the server?
    Last modified: Mon, 05 Dec 2016
    Notice that the document you just retrieved was last modified within a minute before you downloaded it. That is because, for this particular file, the gaia.cs.umass.edu server is setting the file’s last-modified time to be the current time, and is doing so once per minute.
    ws_09
  6. How many bytes of content are being returned to your browser?
    Content length: 128
    The Content-Length header gives the length in bytes of the body of the message.
    ws_10
  7. By inspecting the raw data in the packet content window, do you see any headers within the data that are not displayed in the packet-listing window? If so, name one.
    No.
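
As a small aside on question 4: the first digit of a status code tells you its class. A minimal sketch of that rule, where the helper name status_class is made up for this example:

```python
def status_class(code: int) -> str:
    """Map an HTTP status code to its class, based on the first digit."""
    classes = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    return classes.get(code // 100, "unknown")


for code in (200, 404, 500):
    print(code, status_class(code))
```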

The HTTP CONDITIONAL GET/response interaction

For the next series of questions, do the following:

  • Start up your web browser.
  • Empty the browser’s cache (clear recent history on your browser; no information of webpages opened should be stored on your pc).
  • Start up the Wireshark packet Sniffer, and start the packet capture.
  • Enter the following URL into your browser
    http://gaia.cs.umass.edu/wireshark-labs/HTTP-wireshark-file2.html
    (Your browser should display a very simple five-line HTML file.)
  • Quickly enter the same URL into your browser again (or simply refresh the page on your browser)
  • Stop Wireshark packet capture, and enter “http” in the display-filter, so that only captured HTTP messages will be displayed later in the packet-listing window.

ws_11-0
As shown in the figure, we get two HTTP GET requests: the first from entering the URL, with its 200 OK response message, and the second from refreshing the page (or re-entering the URL), with a 304 Not Modified response message.

Answer the following questions:

  1. Inspect the contents of the first HTTP GET request from your browser to the server. Do you see an “IF-MODIFIED-SINCE” line in the HTTP GET?
    ws_11-1
    As you can see in the figure, there is no “IF-MODIFIED-SINCE” line in the packet-header details of the HTTP GET request. This is because we are looking at the first HTTP GET request, and our browser did not have any cached data for the URL we opened.
  2. Inspect the contents of the server response. Did the server explicitly return the contents of the file? How can you tell?
    ws_12-3
    We can tell that the server returned the contents of the file because, as shown in the figure, the content of the message appears in the packet-header details window under “Line-based text data”.
  3. Now inspect the contents of the second HTTP GET request from your browser to the server. Do you see an “IF-MODIFIED-SINCE:” line in the HTTP GET? If so, what information follows the “IF-MODIFIED-SINCE:” header?
    ws_13
    As shown in the figure, the second HTTP GET request contains the “IF-MODIFIED-SINCE” line. It is followed by “Tue, 10 Jan 2017 06:59:01 GMT\r\n”, which is the date shown in the “Last-Modified” field of the first HTTP response message (see question 2’s figure).
  4. What is the HTTP status code and phrase returned from the server in response to this second HTTP GET? Did the server explicitly return the contents of the file? Explain.
    ws_14
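    We can see in the figure that the status code and phrase are 304 Not Modified. In this case the server did not return the contents of the file: the file has not changed since the date sent in the “IF-MODIFIED-SINCE” header, so the browser can use the copy already in its cache.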
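
The decision the server makes for a conditional GET can be modelled roughly as follows. This is a simplified sketch, not the actual logic of the gaia.cs.umass.edu server: the function name conditional_get_status is made up, and the only real value used is the date from question 3.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_get_status(last_modified, if_modified_since=None):
    """Return 304 if the client's cached copy is still current, else 200."""
    if if_modified_since is None:
        # First request: the browser has no cached copy, so send the full body.
        return 200
    since = parsedate_to_datetime(if_modified_since)
    # HTTP dates have one-second resolution, so compare whole seconds.
    if last_modified.replace(microsecond=0) <= since:
        return 304
    return 200


# Last-Modified value of the file, taken from the first response.
last_modified = datetime(2017, 1, 10, 6, 59, 1, tzinfo=timezone.utc)

print(conditional_get_status(last_modified))
print(conditional_get_status(last_modified, "Tue, 10 Jan 2017 06:59:01 GMT"))
```

The first call corresponds to the initial GET (no header, full response), the second to the refresh, where the header date matches the file's last-modified time.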

Retrieving Long Documents

In previous questions, the documents retrieved have been simple and short HTML files. In the next series of questions you will see what happens when we download a long HTML file. Do the following:

  • Start up your web browser, and make sure your browser’s cache is cleared.
  • Start up the Wireshark packet sniffer
  • Enter the following URL into your browser
    http://gaia.cs.umass.edu/wireshark-labs/HTTP-wireshark-file3.html
    Your browser should display the rather lengthy US Bill of Rights.
  • Stop Wireshark packet capture, and enter “http” in the display-filter-specification window.
    ws_15-3

In the packet-listing window, you should see your HTTP GET message, followed by a multiple-packet TCP response to your HTTP GET request.
With the newer version of Wireshark, entering http in the display filter showed me only the HTTP GET request and the response. To see the underlying TCP segments, right-click one of the packets, select Conversation Filter, and then select TCP.
ws_15-4
There are multiple TCP packets because the HTML file is rather long: at 4500 bytes it is too large to fit in one TCP packet. Therefore, the HTTP response message is broken into several parts, with each part being contained within a separate TCP segment.
As you can see, in newer versions Wireshark lists each TCP segment as a separate packet, and the fact that the single HTTP response was split across multiple TCP packets is indicated by “TCP segment of a reassembled PDU” in the Info column of the Wireshark display.
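
To get a feel for the numbers, here is a back-of-the-envelope sketch of how 4500 bytes would be split into segments. The maximum segment size (MSS) of 1460 bytes is an assumption (a typical value on Ethernet), not something read from this capture, and the calculation ignores the HTTP headers that travel with the file.

```python
import math


def segment_sizes(total_bytes: int, mss: int):
    """Split total_bytes of application data into MSS-sized chunks."""
    count = math.ceil(total_bytes / mss)
    return [min(mss, total_bytes - i * mss) for i in range(count)]


# 4500 bytes is the file size mentioned above; the MSS is assumed.
sizes = segment_sizes(4500, 1460)
print(len(sizes), sizes)
```

Under that assumption the data needs several segments, with the last one carrying only the remainder.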

In earlier versions of Wireshark (see figure below), the “Continuation” phrase was used to indicate that the content of an HTTP message was split across multiple TCP segments. (Sorry about the Bing desktop app in the figure.)
ws_15-2

Now that we have cleared up the differences between the Wireshark versions, let us answer the following questions:

  1. How many HTTP GET request messages did your browser send? Which packet number in the trace contains the GET message for the Bill of Rights?
    There is only one HTTP GET request. In the newer version, the HTTP GET request was captured in frame 3017. In the older version, it was captured in frame 63.
    ws_16-1
    ws_16-2
    A better way to see the whole exchange is to use a feature called “Follow TCP Stream”. Just right-click on any of the HTTP/TCP packets associated with a given TCP stream, then select “Follow” and “TCP Stream”. The “Follow TCP Stream” window will open. This window contains the data exchanged in the selected stream.
    ws_17
    The figure above shows the “Follow TCP Stream” window on the older version of Wireshark (the newer version will be very similar) for the HTTP GET request (highlighted in red) and its complete associated response (highlighted in blue). Also, you can see the total number of packets the client and server sent for that particular TCP stream.
  2. Which packet number in the trace contains the status code and phrase associated with the response to the HTTP GET request?
    Packet number (frame) 67.
    ws_18
  3. What is the status code and phrase in the response?
    The Status Code of the response is 200 OK. (See figure above).
  4. How many data-containing TCP segments were needed to carry the single HTTP response and the text of the Bill of Rights?
    On the older version of Wireshark there are 3 packets with the “Continuation” phrase.
    ws_15-2
    On the newer version of Wireshark there are 3 packets with the “TCP segment of a reassembled PDU” indication.
    ws_15-4
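
The “reassembled PDU” wording reflects the idea that the payloads of several TCP segments are stitched back together, in sequence-number order, into one HTTP message. A rough sketch of that idea follows; it is not Wireshark's actual implementation, and the function name reassemble, the sequence numbers, and the payloads are all hypothetical.

```python
def reassemble(segments):
    """Concatenate (sequence_number, payload) pairs in sequence order."""
    data = b""
    expected = None
    for seq, payload in sorted(segments):
        # Each segment should start exactly where the previous one ended.
        if expected is not None and seq != expected:
            raise ValueError(f"gap or overlap at sequence number {seq}")
        data += payload
        expected = seq + len(payload)
    return data


# Hypothetical segments, listed out of order on purpose.
segments = [
    (20, b"<html>Bill of "),
    (1, b"HTTP/1.1 200 OK\r\n\r\n"),
    (34, b"Rights</html>"),
]
message = reassemble(segments)
print(message)
```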

HTML Documents with Embedded Objects

For the next step, let’s look at what happens when your browser downloads a file with embedded objects that are stored on other server(s).

Do the following:

  •  Start up your web browser, and clear the browser’s cache.
  • Start up the Wireshark packet sniffer
  • Enter the following URL into your browser
    http://gaia.cs.umass.edu/wireshark-labs/HTTP-wireshark-file4.html
    Your browser should display a short HTML file with two images. These two images are referenced in the base HTML file. That is, the images themselves are not contained in the HTML; instead the URLs for the images are contained in the downloaded HTML file. The browser will have to retrieve these logos from the indicated websites. The publisher’s logo is retrieved from the gaia.cs.umass.edu website. The image of the cover is stored at the caite.cs.umass.edu server.
  • Stop Wireshark packet capture, and enter “http” in the display-filter-specification window.
    ws_19

Now let us answer the following questions:

  1. How many HTTP GET request messages did your browser send? To which Internet addresses were these GET requests sent?
    In this case I got 4 HTTP GET request messages. Note that I got 4 and not 3 because the first GET request for the jpg file was redirected to a different server. You can confirm this on the response message with code 302 Found.
    ws_20
  2. Can you tell whether your browser downloaded the two images serially, or whether they were downloaded from the two web sites in parallel? Explain.
    Images were downloaded serially. The time stamp shows that the png image was downloaded after the 3rd GET and the jpg image was downloaded after the 4th GET request.
    ws_21
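
The reasoning in question 2 boils down to checking whether the two downloads overlap in time. A minimal sketch, using made-up timestamps (the real ones are in the figure above) and a hypothetical helper named overlaps:

```python
def overlaps(a, b):
    """True if two (request_time, response_time) intervals overlap."""
    return a[0] < b[1] and b[0] < a[1]


# Hypothetical timestamps in seconds since the start of the capture.
png = (0.120, 0.310)  # GET for the png image, and its response
jpg = (0.350, 0.720)  # GET for the jpg image, and its response

print("parallel" if overlaps(png, jpg) else "serial")
```

If the second GET had been sent before the first response arrived, the intervals would overlap and the downloads would count as parallel.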

HTTP Authentication

Finally, in the last part of this lab we will access a website that is password-protected and examine the sequence of HTTP messages exchanged for such a site.
The username is “wireshark-students” (without the quotes), and the password is “network” (without the quotes).

Do the following:

Now let’s check the Wireshark output. You might want to first read the easy-to-read material on HTTP authentication recommended in this lab, “HTTP Access Authentication Framework”.

Answer the following questions:

  1. What is the server’s response (status code and phrase) in response to the initial HTTP GET message from your browser?
    The status code of the server response to the initial HTTP GET request was 401 Unauthorized.
    ws_23
  2. When your browser sends the HTTP GET message for the second time, what new field is included in the HTTP GET message?
    The second HTTP GET request includes the field “Authorization: Basic”, followed by the username and password that were entered, in encoded form.
    ws_24

The username (wireshark-students) and password (network) that you entered are encoded in the string of characters (d2lyZXNoYXJrLXN0dWRlbnRzOm5ldHdvcms=) following the “Authorization: Basic” header in the client’s HTTP GET message. While it may appear that your username and password are encrypted, they are simply encoded in a format known as Base64. The username and password are not encrypted! To see this, go to this link, enter the Base64-encoded string “d2lyZXNoYXJrLXN0dWRlbnRzOm5ldHdvcms=”, select “decode the data from a Base64 string (base64 decoding)”, and decode. That is all you need: you have translated from Base64 encoding to ASCII, and should see “wireshark-students:network”. I color coded the encoded message: the red part corresponds to the username and the blue part to the password.
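
If you would rather not paste credentials into a website, the same check can be done locally with Python’s standard base64 module. This sketch encodes the username and password the way the header value is built, and then decodes the string seen in the capture:

```python
import base64

username, password = "wireshark-students", "network"

# Encoding: the value that follows "Authorization: Basic " in the GET request.
token = base64.b64encode(f"{username}:{password}".encode("ascii")).decode("ascii")
print(token)

# Decoding: what anyone who captures the packet can do.
decoded = base64.b64decode("d2lyZXNoYXJrLXN0dWRlbnRzOm5ldHdvcms=").decode("ascii")
user, _, pwd = decoded.partition(":")
print(user, pwd)
```

No key or secret is involved in either direction, which is exactly why Base64 offers no confidentiality.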

Since anyone can download a tool like Wireshark and sniff packets (not just their own) passing by their network adaptor, and anyone can translate from Base64 to ASCII, it should be clear that simple passwords on WWW sites are not secure unless additional measures are taken.

This ends the HTTP lab of the Wireshark Labs series. In my next blog post I’ll continue with the DNS lab.
