Lab 5: Caffeine

In this lab, we'll be using Excel1). Specifically, we'll crunch some numbers regarding caffeine and caffeine intake and then we'll make some graphs.


After you're done, call me over and I'll give you credit (assuming you didn't make a mistake, in which case, I'll help you fix it). If you run out of time, and have to go somewhere, you can email me your lab and I'll give you credit.


Getting the data

Go to http://www.energyfiend.com/the-caffeine-database and notice that they've got a nice big table with lots of different name brand drinks, the serving size of each drink, and the caffeine content in a serving.


For this step and the purposes of this lab, you'll want to use Internet Explorer to visit that site just to ensure that everyone has the same experience and does the same amount of work (Firefox behaves a bit differently). So if you're not using IE, load it up now and visit the above site. Also, start Excel.


Going to the caffeine listing site, starting just before the word "Drink" in the gray table, click and drag all the way to the bottom right of the table. This will take an annoying long amount of time, so be sure to not let go of the mouse button until the whole table is selected. Copy (ctrl-c) the content and then paste (ctrl-v) it into the top left corner of an Excel spread sheet.


You should notice that some of the data that we pasted into Excel is colored. We'd like it to be plain. Use Excel's Format Painter to make the entire table have a plain / standard Excel look to it.


One thing you should notice is that Excel is treating the energy drink called "180" as a number (and is therefore aligning it to the right). We'd like it to appear just like all the other product names. So, on the cell containing the "180" format it as text.


Auto adjust the columns to fit their data.


Even though all the cells look normal, if you click the name of a drink it you'll quickly see that it is still behaving as a link to a website. I personally find this quite annoying because if I accidentally click on a cell, my web browser starts up. You can get rid of the link in one cell by right clicking on the link and choosing "Remove Hyperlink." We have 200+ drinks, so we do not want to do that manually for each cell.


There's a bit of a hack we can do to automate getting rid of the links. If you know a better way, I'd be interested to here it. In a blank cell somewhere, just type in the number 1. Copy (ctrl-c) that cell. Now, select all of the drink cells (remember to use ctrl-shift-down as a shortcut). On the down arrow on the paste button, click on the "Paste Special" entry. This window gives us a few non-typical options that we might want to do with the copied cells. We'll ignore most of the options.


The settings that we are interested in are: For the paste section make sure "all" is clicked and for the operation section, click "multiply." Using these settings, Excel will multiply our copied cell (which contains 1) with all the selected drink cells. I realize this doesn't make any mathematical sense, as all of our drinks are pieces of text which can't really be multiplied with a number, but in Excel, this works. Click Ok. You should notice that the appearance of your spreadsheet has not changed, but that the drink names no longer have hyperlinks associated with them. Get rid of the "1" cell we made.


Go ahead and replace all the "mg/oz" cells, which are hard coded with data, to instead use a formula to calculate the milligrams of caffeine per ounce of drink. Then, apply formatting on all of the mg/oz data cells to always show 2 decimal places. Apply the formatting necessary to the ounces and caffeine data cells to show always 1 decimal place.


Rename this sheet to be called "Data".

Analyzing the Data

Select all the data (ctrl-a) then on the Home tab, click "Format as Table." Choose a style/color that you like. Yes, your table has headers (Excel will ask you this). Notice now that you can sort the data by any one of the columns. For example, you can find the drinks with the most/least caffeine per ounce, or the drinks with the most/least caffeine in general.


Compute the average amount of caffeine in each drink. Compute the average amount of caffeine per ounce in each drink.


Off to the side of the data, write down at least 3 facts that you find by sorting various columns. This can just be in sentence form and should say something like "The drink with the most [blank] is [blank]."


Find the drinks (at least 3) that you typically consume in an average week and which also have some caffeine (e.g., "Bawls, Coffee (Brewed), and Snapple Tea"). If you never ever drink any drink on the list, pretend that you drink at least 3. Off to the side, or on another data sheet, make a list of those drinks and their various attributes (size, caffeine, mg caffeine/ounce) (see paragraph below).


Normally, I would say that the data should reference the original table data, so that, if you change the size of a drink in the table your list will automatically update. However, because we are using a table and because we may sort things, the positions of items may change. For example, the drink 180 was at =A2 but if we sort in some ways, its position may change. To keep your table of the top 3 drinks that you consume in a week, instead copy and paste the values off to the side.


According to Wikipedia, a typical adult would need to drink 80 to 100 cups of coffee within a limited time frame in order to experience a lethal dosage of caffeine. How much caffeine (mg) is in 80 cups of coffee? How much caffeine (mg) is in 100 cups of coffee? Using these lower and upper bounds on the amount of caffeine a person would have to have in order to die, then pick your favorite drink and determine an upper and lower range on how many servings of that drink you would need in order to overdose and die. Please do not in any way attempt to find out how accurate these bounds are.


Using the names of your top 3 favorite drinks and the amount of caffeine/ounce in them, make a pie chart. Give it custom colors to match the drink. For example, Pepsi could be blue, Coca-Cola red, Root Beer could be brown etc).


Use Excel's conditional formatting capabilities to let the viewer quickly get an understanding of the data. Use your own judgment in what you want to show and why someone might want to see/understand that.


Now, we'll make a much more complicated chart: a histogram. We'd like to see which concentrations of caffeine (milligrams/ounce) are more common than others. This might give some indication about society's needs and preferences regarding caffeine intake.


Make a chart in Excel that looks like the table below. The 1st column will be the boundaries that Excel will use to decide which row a product belongs in. The 2nd column will be the labels that we'll use when we make the chart. The 3rd column we'll have Excel fill in. So for example, if the number 3 appeared in the product count cell for the row corresponding to products having caffine "up to 10" then there are 3 products in our list that have up to 10 mg/oz of caffeine, but no more.

Up to: Caffeine (mg/oz) Product Count
10 Up to 10
20 Up to 20
30 Up to 30
40 Up to 40
50 Up to 50
60 Up to 60
70 Up to 70
80 Up to 80
90 Up to 90
100 Up to 100
110 Up to 110
120 Up to 120
Over 120


Now for the toughest part (which is still pretty easy). Select all the cells in the product count column (they should be empty). Then, start typing =FREQUENCY( and go ahead and select all of the cells in the mg/oz section of the product table (don't forget the "(" after FREQUENCY). Type "," (comma) and then select all the cells in the "Up To:" column. Don't forget to select the empty one at the end. Hit ctrl-shift-enter. All of the product count cells should be filled in with some numbers. Verify that the numbers look approximately right.


Select the last 2 columns in your histogram table, including their headers and then insert a bar chart or column chart (your choice). Adjust the colors / look of the chart. Also, be sure to adjust the vertical axis so that it better fits the data. Change at least 3 things in the chart's look or behavior that we did not cover in class.


Go ahead and raise your hand so I can come check your work. I apologize in advance if you end up waiting for a long time. I'll try my best to quickly give everyone credit, while at the same time making sure everyone was able to complete the lab and learn something.

1) This should not surprise you.