Statistical analysis
Introduction
In science, we often repeat an experiment many times. In doing so, we are able to compute an average value, which would be the best value to report. We make the assumption, however, that both accuracy and precision are maximized. In performing experiments, we utilize properly calibrated instruments and we use them to their fullest capability when reporting values. An experiment performed with an N-number of trials would generally have a significant population clustered about the mean value of all trials when there are random outcomes. When examining the results of a large number of N-trials, there are typically outliers that appear in many cases for experiments such as these. Rolling dice for instance can yield such a case. If a pair of dice were rolled, say, 100 times, in many instances the number seven would be obtained when adding the two dice. However, there would also be outliers such as two and twelve present in the data set. Why would we expect seven frequently? In simple terms, there are several ways seven could be the sum of the two rolled dice. In terms of this lab, we say the probability for achieving a seven is larger than the probability of achieving the outliers two and twelve.
Lab objectives
To study the statistical nature in science by investigating many rolls of four pieces of dice. These “rolls” will be simulated using Excel.
Introduction to basic probability for rolling dice
Roll a single die. A die has six faces. It is equally likely that when the die is rolled, it will be either 1, 2, 3, 4, 5 or 6. In a word, the probability that it would be any of those numbers is the same (1/6). Thus, since all outcomes are equally likely, then the probability equation for this scenario is simply ?(?)=?????? ?? ????? ?????????????? ????????.
Note that the above equation is assuming that each event scenario is equally likely as the other.
It gets much more interesting when there is a pair of dice. For example, the number of outcomes increases to 36 (each dice has a total of six faces and each dice is unique to the other). Let us call the first die (?) and the second (?). Each die is unique. We will call the “roll” scenario as (?,?). [If it is easier to imagine, call one die the “red” die and the other the “white” die.] For instance, if you roll the dice and the sum total ?+?=3. There are two scenarios which can give us this sum; (2,1) or (1,2), since both die are unique. We call our sample space {(2,1), (1,2)}. Therefore, the probability of obtaining a 3 from the roll based on our sample space is simply ?(?=????????? ? ??? ?? 3)=136+136=236=118.
The entire sample space is seen below. Note again that there are 36 different (unique) outcomes. Importantly, if all of the probabilities were added from this entire sample space, the total would add up to one.
Basic probability for rolling dice
The table below shows all of the possible scenarios listed in the sample space (36 total).
Sum of two dice
# of possible ways to get sum
2
1
3
2
4
3
5
4
6
5
7
6
8
5
9 4
10 3
11 2
12 1
Notice that there are 6 ways to obtain the sum of 7 from a pair of dice. Thus we expect the
probability to achieve a sum of 7 to be the greatest. What would be the probability of rolling a
sum of 2?
Let us assume there is a large sample of data collected by (many) rolls. The mean value is
simply the center of the distribution of values within the sample. Let there be N trials of a
measurement of a quantity x. The mean value of the measured quantity x over N trials can be
written as
.
1
1
N
i
i x
N
x (1)
This is an equation that simply states: For the “best” value of a measurement of some quantity x,
it is simply the arithmetic average x of an N number of trials. Suppose you were rolling a single
piece of die. There are 6 possible outcomes. When rolling two or more dice many times, the
mean value from all N-rolls, and all possible outcomes (?) that have a frequency of occurrence
can be written as
( ) .
1
1
N
i
i i f x x
N
x (2)
Two other important quantities are variance and standard deviation. Variance is defined as the
arithmetic average of the square of the difference between each trial and the mean. That is, it is a
measure as to how far a random number is from the mean. The standard deviation (?? ) or the
square root of the variance (?2) is expressed as:
N
i
i
N
x i x x
N
d
N 1
2
1
2 ( )
1
1
( )
1
1
. (3)
For the above equation, the quantity xi is the value for the ith trial. The deviation or the residual is
defined as the difference between the ith trial and x , di = xi – x .
Let us assume that there is some fluctuation within an N number of trials. If N is
sufficiently large, then the fluctuations (the spread) would follow a Gaussian (Normal)
distribution. The figure below shows what this
would look like with a large N-number of trials.
The width of this curve is simply the standard
deviation. Notice the percentages of a large data
set that fall within the inflection points. For
example, 68.2 % of the data fall within one
standard deviation from the mean value.
The equation describing the normal distribution is written as
?(?) =
1
?√2?
?
[−
(?−?̅)2
2?2 ].
(4)
For a large collection of data points, the values typically cluster around the mean value, if the
precision of the test is well within acceptable means. In a word, since the standard deviation is a
measure of the width of the above distribution curve, a large standard deviation correlates to
measurements of low precision.
Rolling dice is a measurement, which is random in nature. If there are two dice, one
expects the average to float around the sum of seven. The values will of course fluctuate above
and below seven. Due to these fluctuations, random uncertainty is present. This is an important
point: Many measurements should be undertaken to result in a mean value close to what one
might expect. Note that random uncertainties are assumed to follow the above distribution if
there are a (large) number of rolls and several dice.
Experimental set-up (Excel)
1. Open up Excel and pair with one other student.
2. Create a tab on the spreadsheet and call it “dice-rolling”.
3. Refer to the sample spreadsheet at the end of the experiment for labeling purposes. Please label your spreadsheet accordingly.
4. In cell A3, type “=RANDBETWEEN(1,6)” and then press enter. This random number generator is simulating a rolled die. Hit F9 to make another roll. Copy this formula in cells B3 through D3. This represents a simulation of four rolled dice.
5. Highlight the cells A3 through D3. Right click EDIT/COPY. With the cells highlighted, click on A3 and with the mouse button pressed, scroll down to A2002. Now, right-click EDIT/PASTE.
6. In cell E3, type”=SUM(A3:D3)”. Click on E3, and then right-click to EDIT/COPY. Scroll down to E2002 and with the E-column highlighted, right-click to EDIT/PASTE the formula.
7. This next step is crucial for visual appearance. Go down and click on any cell (say, D30). With the cell chosen, and with a left-click mouse, scroll down (in the same column) to row 1998. With these cells highlighted, go to the top tool bar and chose format/Hide & Unhide/Hide rows. Note: This DOES NOT affect your calculations whatsoever. It merely allows a single-page snapshot of the data and analysis. If you were going to print, it would not print many pages of data, but just this singular snap-shot.
8. In column I, starting in I3, enter “4”. This is the minimum sum with four dice. Add one down until you reach 24 in I23.
9. Left-click on cell J3 and scroll down to J23. With the cells highlighted, type “=FREQUENCY(E3:E2002,I3:I23), BUT DO NOT PRESS ENTER! This is an array. After you type, simply press both the Control and the Shift key simultaneously and then press enter. Now, to verify, in cell J24 type “=SUM(J3:J23). Press enter. Did you get what you were expected? Stop and think.
10. Now let us set-up a visual of our “collected data”. Highlight I3 through J23 (columns I and J with the appropriate cells should be highlighted). Go to Insert/Recommended Charts/All Charts. You should see the vertical bar chart that resembles a “Histogram”,
with the “x-series” being 4 through 24 (sum of dice). The y-series is frequency. Please label these series on your chart. Title the chart as you see fit.
11. It is now time to calculate other relevant statistical data.
12. In cell G3, type “=SUM(E3:E2002)/2000”. This is equation 1.
13. To verify that we should also get this mean from equation 2, in cell K3 type “=I3*J3” and copy down to K23. In K24, type”=SUM(K3:K24)”. Finally, we need the sum of all f(x) (2000). Thus, in K25 type “=K24/J24”. Compare this mean with the mean you previously calculated. [Again, do not worry if every operation you do generates another roll and numbers slightly change. Statistically, this is meaningless to what we are trying to do here.].
14. To calculate the variance (see above), we need the deviations squared. Namely, (?−?̅)2.
15. In cell F3, type in the formula “=(E3-$G$3)^2” and press enter. Copy this formula down to F2002. Notice the dollar signs around “G” in the formula. Why are those needed?
16. To calculate the standard deviation, we take the square-root of the variance. In H3, type the formula “=SQRT(SUM(F3:F2002)/1999)”. Hit enter. Why is it 1999?
17. You are now ready to calculate the normal distribution.
18. In cell L3, type “=2000*(1/((SQRT(2*PI())*$H$3)))*EXP(-((I3-$G$3)^2/(2*$H$3^2)))”. Please be careful, as this is a very long formula. You may also use the Excel function: “=2000*NORM.DIST(x, mean, stdev, FALSE)” as a back-up to compare.
19. Copy the formula down to L23.
20. Next, sum cells L3 through L23. To do that, in L24, type “=SUM(L3:L23)”. Is the answer what you expected? Mention that in your summary.
21. Lastly, in M3 we will calculate the standard deviation of the mean (SDOM). To do that, in M3 type “=H3/SQRT(J24)”. Hit enter. This calculation is simply the standard deviation divided by the square-root of the number of measurements. Notice that if the number of measurements gets even larger, the SDOM gets smaller. We will mention SDOM later in the course.
22. Save the spreadsheet
23. Right click on the graph. Go to Select Data and then click Add. Go to Series Values. Clear the information and then go and highlight cells L3 through L23. Click OK and then OK again.
24. Finally, right click on one of the bars in the graph. Go to Change Series Chart type. For series 2, which is the theoretical Gaussian, select line graph.
25. Your graph should look like the model.
26. Right click on the graph and go up to Add Chart Element on the top toolbar. Label the horizontal and vertical series.
Summary
1. Create a fourth tab called “summary”.
2. In the tab, address the following:
a. Can the 2000 rolls of four dice be properly described by a normal distribution? What do your calculations tell you about this?
b. Did you get roughly 68 % of the data within one standard deviation?
c. In your own words, how would you define “Probability” based on your experimental work in this lab?
d. Identify an experiment in YOUR area that this analysis might be useful.
3. Save the file.
4. Upload it in the link provided before the due date.
Order any type of Service We Provide
We are the best assignment service that can satisfy student’s demands in different scientific fields. We perform tasks of any kind. Our specialists provide diverse custom assignment writing services to students from all over the world every day. You can contact us for assistance with:
The list is far from complete!
Representatives of our student assignment service are connoisseurs of the peculiarities of presentation in regards to academics. Your college assignment will never turn out an untoward surprise! Whether you need research for high school, an undergraduate program, or a Master’s or Doctoral degree, you will get exactly what you are looking for to sound smart and well-informed.
If you are pressed for time, request our experts!
We are your academic saver. It is a perfect solution for people who need academic help but has no professionals nearby to provide it. Just redirect your home tasks to us and forget about any educational issues. With us, you will get 100% plagiarism-free content delivered on time by an experienced specialist in a particular field.
When you order professional assignments here, you will get:
Original custom papers. We value your academic reputation. Just as well, we value the years of thorough work on our reputation for reliability and never compromise the originality of delivered papers. We will never endanger both. Every custom assignment is written from zero – the only possible first stage of work on the order is research. We apply the latest plagiary checking tools on the final stage, so plagiarism has no chance to emerge in your college assignment writing.
Control over the order completion. To make the process of collaboration comfortable and efficient, we offer our clients to choose the writer themselves taking into account all the requirements and the budget. For you to save nerves, having entrusted your fateful task to some qualified “stranger”, we enable communication with the assigned writer in the process of order completion.
Affordable assistance. We are eager to contribute to the academic achievements of students in need of help with college assignments. We are eager to help you start investing in your career growth today. That is why our service is so affordable. We don’t charge to pay for the latest marketing tricks or advertisements – we choose the best experts in the labor market who write great custom papers in any area, level of complexity, and time frame. Our client’s grateful feedback is a top advertising trick.
On-time delivery. The quality of a paper is only valued when it is delivered within the determined time frame – you will get your custom assignment writing service in time and have a couple of days to revise it and ask for changes in case they are needed. And you will still have time to learn the material.
Protection. We value your trust and take all measures needed to keep your private and banking details safe. We offer only well-tested payment methods. Due to the system of encryption and protected servers, no third party can have access to your data.
Custom support 24/7. We do our best to make the process of collaboration comfortable for the customer. Our “write my assignment” help is always here for you to provide a solution to your problems, give you a prompt answer to any question, and offer clarification as to any issue related to the services. Feel free to contact us anytime!
If you feel that your dissatisfaction with student life is growing every day and you just feel exhausted, don’t hesitate to change the situation for the better today. Order your first assignment from reliable custom assignment services today to evaluate the advantages.
Having seen the difference once, you won’t want to go back to your previous lifestyle. You deserve to find fulfilment in other spheres of your life, have hobbies, spend time with close people but continue making progress. It is possible when you have a professional helper. Don’t put off your life for someday after graduation!
Delivering a high-quality product at a reasonable price is not enough anymore.
That’s why we have developed 5 beneficial guarantees that will make your experience with our service enjoyable, easy, and safe.
You have to be 100% sure of the quality of your product to give a money-back guarantee. This describes us perfectly. Make sure that this guarantee is totally transparent.
Read moreEach paper is composed from scratch, according to your instructions. It is then checked by our plagiarism-detection software. There is no gap where plagiarism could squeeze in.
Read moreThanks to our free revisions, there is no way for you to be unsatisfied. We will work on your paper until you are completely happy with the result.
Read moreYour email is safe, as we store it according to international data protection rules. Your bank details are secure, as we use only reliable payment systems.
Read moreBy sending us your money, you buy the service we provide. Check out our terms and conditions if you prefer business talks to be laid out in official language.
Read more