A. Genetics
Ribosomal 5S RNA can be represented as a sequence of 120 nucleotides. Each nucleotide can be represented by one of four characters: A (Adenine), G (Guanine), C (Cytosine), or U (Uracil). The characters occur with different probabilities for each position. We wish to test if a new sequence is the same as ribosomal 5S RNA. For this purpose, we replicate the new sequence 100 times and find that there are 60 A’s in the 20th position. Use a 0.05 level of significance.

1. If the probability of an A in the 20th position is 0.79 in ribosomal 5S RNA, then test the hypothesis that the new sequence is the same as the ribosomal 5S RNA using the critical-value method.
2. Report a p-value corresponding to your result in problem 1.

1A.

* H0: p = 0.79 (the proportion of A’s in the 20th position of the new sequence is the same as in ribosomal 5S RNA). Ha: p ≠ 0.79 (the proportion differs).

* X = 60, n = 100, sample proportion = 60/100 = 0.60, level of significance = 0.05
* Test: one-sample z test for a proportion, with standard error sqrt(0.79 × 0.21 / 100) = 0.0407 and z = (0.60 - 0.79) / 0.0407 = -4.665
* Reject the null hypothesis
GENETICS| |
| |
Data|
Null Hypothesis p =| 0.79|
Level of Significance| 0.05|
Number of A’s Observed| 60|
Sample Size| 100|
| |
Intermediate Calculations|
Sample Proportion| 0.60|
Standard Error| 0.0407|
Z Test Statistic| -4.665|
| |
Two-Tail Test| |
Lower Critical Value| -1.959963985|
Upper Critical Value| 1.959963985|
p-Value| < 0.0001|
Reject the null hypothesis| |
* Conclusion: Because z = -4.665 falls below the lower critical value of -1.96, we reject H0 at the 0.05 level. The proportion of A’s in the 20th position of the new sequence differs from that of ribosomal 5S RNA.

2A.
* p-value: 2 × P(Z ≤ -4.665) ≈ 0.000003, so p < 0.0001
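The test in problems 1 and 2 can be checked with a short script. This is a minimal sketch, assuming the appropriate procedure is a two-sided one-sample z test for a proportion with the standard error computed under the null value 0.79; the function name `one_proportion_z_test` and its return keys are illustrative, not taken from the original solution.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0, alpha=0.05):
    """Two-sided z test of H0: p = p0 (normal approximation)."""
    p_hat = successes / n
    # Standard error uses the null proportion p0, not the sample proportion.
    se = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se
    std_normal = NormalDist()  # mean 0, standard deviation 1
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    p_value = 2 * std_normal.cdf(-abs(z))
    return {
        "p_hat": p_hat,
        "se": se,
        "z": z,
        "z_crit": z_crit,
        "p_value": p_value,
        "reject": abs(z) > z_crit,
    }


# 60 A's observed in 100 replications, null proportion 0.79
result = one_proportion_z_test(successes=60, n=100, p0=0.79)
for key, value in result.items():
    print(f"{key}: {value}")
```

The critical-value decision (`reject`) and the p-value always agree: the null hypothesis is rejected exactly when the p-value is below `alpha`.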
Pharmacology:
One method for assessing the effectiveness of a drug is to note its concentration in blood and/or urine samples at certain periods of time after giving the drug. Suppose we wish to compare the concentrations of two types of aspirin (type A and B) in urine specimens taken from the same person, 1 hour after he or she has taken the drug. Hence, a specific dosage of either type A or type B aspirin is...

...
INTRODUCTION TO NORMAL DISTRIBUTIONS
The normal distribution is the most important and most widely used distribution in statistics. It is sometimes called the "bell curve," although the tonal qualities of such a bell would be less than pleasing. It is also called the "Gaussian curve" after the mathematician Karl Friedrich Gauss. As you will see in the section on the history of the normal distribution, although Gauss played an important role in its history, Abraham de Moivre first discovered the normal distribution.
Strictly speaking, it is not correct to talk about "the normal distribution" since there are many normal distributions. Normal distributions can differ in their means and in their standard deviations. Figure 1 shows three normal distributions. The green (left-most) distribution has a mean of -3 and a standard deviation of 0.5, the distribution in red (the middle distribution) has a mean of 0 and a standard deviation of 1, and the distribution in black (right-most) has a mean of 2 and a standard deviation of 3. These as well as all other normal distributions are symmetric with relatively more values at the center of the distribution and relatively few in the tails.
Figure 1. Normal distributions differing in mean and standard deviation.
The density of the normal distribution (the height for a given value on the x axis) is shown below. The parameters μ and σ are the mean and standard deviation, respectively, and define the normal distribution. The...
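As a rough illustration of how μ and σ shape the curve, the sketch below evaluates the density for the three distributions described for Figure 1. It assumes the standard form of the normal density, f(x) = 1 / (σ √(2π)) · exp(−(x − μ)² / (2σ²)); the helper name `normal_density` and the dictionary keys are illustrative.

```python
from math import exp, pi, sqrt
from statistics import NormalDist


def normal_density(x, mu, sigma):
    """Height of the normal curve at x for mean mu and standard deviation sigma."""
    return exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))


# (mean, standard deviation) of the three curves described for Figure 1
figure_1 = {"green": (-3, 0.5), "red": (0, 1), "black": (2, 3)}

# The curve is highest at its mean; a smaller sigma gives a taller, narrower curve.
peaks = {name: normal_density(mu, mu, sigma) for name, (mu, sigma) in figure_1.items()}

for name, (mu, sigma) in figure_1.items():
    library_value = NormalDist(mu, sigma).pdf(mu)  # cross-check with the standard library
    print(f"{name}: peak height {peaks[name]:.4f} (library: {library_value:.4f})")
```

The density is symmetric about the mean, so `normal_density(mu - d, mu, sigma)` equals `normal_density(mu + d, mu, sigma)` for any distance `d`.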

...BIOSTATISTICS
* Effect modification: the effect of the main exposure on the outcome differs according to the level of another variable. It is NOT a bias!
* Case-control study: also known as a retrospective study. Subjects are divided into “cases” and “controls.” If the disease is rare, the odds ratio approximates the relative risk that would be obtained by following subjects over time.
* Cohort study (retrospective or prospective): subjects are divided into “exposed” and “non-exposed.” Study subjects are free of the outcome at the time the study begins.
* Cross-sectional study: exposure and outcome are studied at one point in time. (Think: snapshot study)
* Randomized controlled trial: gold standard for studying the efficacy of a treatment or procedure. Subjects are randomized into an experimental or a control group. Randomization reduces bias and supports a stronger causal inference.
* Cross-over study: one group of participants is randomized to one treatment for a set period of time, and the other group receives the alternative for the same period. After the period ends, the groups switch treatments for the remainder of the trial.
* Parallel study: think “drug group vs. placebo group.” No other variables are measured.
* Hazard ratio: the higher the ratio, the higher the risk of hazardous events. If the ratio is 1 (or close to 1), there is little difference between the two groups.
* Factorial design studies: randomization of different interventions with additional study of 2 or more variables
* Cluster...
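The rare-disease point in the case-control entry can be illustrated numerically. This is a minimal sketch using made-up 2x2 counts (not data from these notes) for a full cohort, so that both measures can be computed side by side; the function name `risk_measures` is illustrative.

```python
def risk_measures(a, b, c, d):
    """Relative risk and odds ratio from a 2x2 table.

    a = exposed with disease,   b = exposed without disease
    c = unexposed with disease, d = unexposed without disease
    """
    relative_risk = (a / (a + b)) / (c / (c + d))
    odds_ratio = (a * d) / (b * c)
    return relative_risk, odds_ratio


# Hypothetical rare disease: 0.2% risk in the exposed, 0.1% in the unexposed
rare_rr, rare_or = risk_measures(20, 9980, 10, 9990)

# Hypothetical common disease: 40% risk in the exposed, 20% in the unexposed
common_rr, common_or = risk_measures(400, 600, 200, 800)

print(f"rare disease:   RR = {rare_rr:.3f}, OR = {rare_or:.3f}")
print(f"common disease: RR = {common_rr:.3f}, OR = {common_or:.3f}")
```

Both tables have the same relative risk, but the odds ratio stays close to it only when the disease is rare; with a common outcome the odds ratio overstates the relative risk.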

...I wish to apply for admission to the PhD program in Biostatistics at your esteemed university, commencing from Fall 2002. I am currently in the second year of the two-year Master's program in Statistics at the Indian Statistical Institute, Calcutta, known all over the world for the seminal contributions that some of its alumni have made and continue to make in the field of Statistics, both theoretical and applied. Before this, I successfully completed the three-year Bachelor's program in Statistics with distinction, full stipend and scholarship from the Government of India.
My academic endeavours at the institute have been quite encouraging. I have been consistently ranking fifth in my class for the past few semesters at the institute which has culminated in my receipt of an award in the form of books and cash. Ranking among the top few in class was not something new to me as I had been doing so right from my days at school. At school, mathematics and the physical sciences used to be my favorites and I mastered the subjects with unmatched proficiency.
Upon making it to the Indian Statistical Institute after high school and after clearing a tough countrywide admission test, through which only thirty students are selected among ten thousand candidates, I was introduced to the world of Statistics and Probability. The most appealing facet of Statistics that actually motivated me to think of a career in the subject is its usefulness in...

...Tutorial 3
Question 1
a) A company wishes to review its distribution operation and from its time sheet records it found that 144 vehicles were loaded in a 24 hour period. A frequency distribution table was prepared from the data as follows:
Time to load (minutes) | Number of vehicles
40 up to 50 | 17
50 up to 60 | 61
60 up to 70 | 59
70 up to 80 | 7
From the above data construct :
i) a frequency histogram.
ii) a frequency polygon.
iii) a cumulative frequency curve (ogive).
iv) From your ogive estimate:
- the median loading time
- the interquartile range and quartile deviation.
- the number of vehicles loaded in under 52 minutes.
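The ogive estimates in part iv) can be cross-checked numerically. This is a minimal sketch, assuming the ogive is drawn with straight lines between class boundaries (linear interpolation) and that the quartiles sit at ranks n/4, n/2 and 3n/4; the function names are illustrative. Readings taken from a hand-drawn curve will differ slightly.

```python
from itertools import accumulate

boundaries = [40, 50, 60, 70, 80]   # class boundaries in minutes
frequencies = [17, 61, 59, 7]       # vehicles per class
cum_freq = [0] + list(accumulate(frequencies))  # ogive heights at each boundary
total = cum_freq[-1]


def time_at_rank(rank):
    """Loading time at a given cumulative frequency, read off the ogive."""
    for i, freq in enumerate(frequencies):
        if rank <= cum_freq[i + 1]:
            lower, upper = boundaries[i], boundaries[i + 1]
            return lower + (rank - cum_freq[i]) / freq * (upper - lower)
    raise ValueError("rank exceeds the total frequency")


def vehicles_under(minutes):
    """Estimated number of vehicles loaded in under the given time."""
    if minutes <= boundaries[0]:
        return 0.0
    for i, freq in enumerate(frequencies):
        lower, upper = boundaries[i], boundaries[i + 1]
        if minutes <= upper:
            return cum_freq[i] + (minutes - lower) / (upper - lower) * freq
    return float(total)


median = time_at_rank(total / 2)
q1 = time_at_rank(total / 4)
q3 = time_at_rank(3 * total / 4)
iqr = q3 - q1
quartile_deviation = iqr / 2
under_52 = vehicles_under(52)

print(f"median = {median:.2f} min, Q1 = {q1:.2f}, Q3 = {q3:.2f}")
print(f"IQR = {iqr:.2f}, quartile deviation = {quartile_deviation:.2f}")
print(f"vehicles loaded in under 52 minutes = {under_52:.1f}")
```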
Question 2
The frequency distribution shows the times taken by 30 pupils to do their Mathematics homework. Times have been measured to the nearest minute.
Time (min) | Frequency
5-14 | 4
15-24 | 8
25-34 | 10
35-44 | 8
i) Construct a histogram of the data.
ii) Construct a polygon.
iii) Construct a less than cumulative frequency table and draw the corresponding curve. Use your curve to estimate:
- the median,
- the lower and...

...Exam 1
Chapter 1
* In statistics, the group we wish to study is called the population.
* A sample is a subset of the population which is used to gain insight about the population. Samples are used to represent a larger group, the population.
* Descriptive statistics – the collection, organization, analysis, and presentation of data.
* Inferential statistics – uses descriptive statistics to estimate population parameters; an educated guess about the population based on sample data.
Chapter 2
* During the experiment a treatment is applied to the experimental group.
* The exact treatment will depend on the particular experiment.
* The treatment changes the level of the explanatory variable in the experiment. The effect of the treatment can be measured by comparing the response variable in the control and experimental groups.
* Qualitative: Descriptions and Labels
* Quantitative: counts and measurements
* Discrete: Usually counts of things, restricted set of values
* Continuous: Usually measurements, data can take on any value in an interval
* Nominal measures offer names or labels for certain characteristics
* Ordinal data represents data in an associated order.
* If the data can be ordered and the arithmetic difference is meaningful, the data is interval.
* Ratio data has a meaningful zero point and the ratio of two data points is meaningful.
* Quantitative data is measured on the interval...

1) Permutation: $^{n}P_{r} = \dfrac{n!}{(n-r)!}$
2) Combination: $^{n}C_{r} = \dfrac{^{n}P_{r}}{r!} = \dfrac{n!}{r!\,(n-r)!}$
3) Summation: $\sum_{i=1}^{n} X_i$
4) Product: $\prod_{i=1}^{n} X_i$
5) Age specific fertility rate (Asfr) = $\dfrac{\text{No. of live births at specific age}}{\text{No. of women in specific age group}}$
Asfr=∑
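6) Total fertility rate (TFR) = $\sum \text{Asfr} \times \text{group of year}$
7) Gross Reproductive Rate (GRR) = $\text{TFR} \times \dfrac{100}{205}$, where 100/205 is the female (f) share of births given by the sex ratio
8) Net Reproduction Rate (NRR) = $\left\{ \sum \left( \text{Asfr} \times {}_{n}S_{x} \right) \right\} \times \text{group of year}$
9) Life Table:
${}_{n}L_{x}$ = person-years lived in the interval
$l_{x}$ = number of persons alive at a particular age
${}_{n}q_{x}$ = proportion of persons alive at a particular age who die within the interval
${}_{n}S_{x} = \dfrac{{}_{n}L_{x}}{l_{0} \times \text{group of year}}$, where $l_{0}$ = 5,00,000
10) Crude Death Rate (CDR) = $\sum P_i \times \text{asdr}_i$ → Direct Standardisation...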
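A few of the formulas above can be tried out directly. This is a minimal sketch: the age-specific fertility rates are made-up illustrative values (births per woman per year for seven 5-year age groups), not data from these notes, and the 100/205 factor is the one shown in the GRR formula.

```python
from math import comb, perm, prod

# 1) and 2): permutations and combinations of n = 5 items taken r = 2 at a time
n, r = 5, 2
n_p_r = perm(n, r)   # n! / (n - r)!
n_c_r = comb(n, r)   # n! / (r! (n - r)!)

# 3) and 4): summation and product over a small list of values
values = [2, 3, 4]
total = sum(values)
product = prod(values)

# 6) and 7): TFR and GRR from hypothetical age-specific fertility rates
asfr = [0.05, 0.20, 0.25, 0.15, 0.08, 0.03, 0.01]  # assumed values, ages 15-49
group_width = 5                                     # years per age group
tfr = sum(asfr) * group_width
grr = tfr * 100 / 205

print(f"5P2 = {n_p_r}, 5C2 = {n_c_r}")
print(f"sum = {total}, product = {product}")
print(f"TFR = {tfr:.2f}, GRR = {grr:.3f}")
```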

Test 1
Monday, February 3, 2014
Follow the instructions for each problem and, when necessary, show as much work as necessary to justify your answers. Point values are listed next to each problem.
1. The Centers for Disease Control and Prevention (CDC) lists causes of death in the United States during 2003:
a. Is it reasonable to conclude that heart or respiratory disease were the cause of approximately 33% of U.S. deaths in 2003? Yes or No and briefly explain. (5 pts)
b. What percent of deaths were from causes not listed here?
...an appropriate graphical display for these data. (10 pts)
...distribution of average length of stay can best be described as... (5 pts)
[Stem-and-leaf display: stems 5 through 11; leaf rows as extracted: 23, 02466, 0000, 7, 7888899, 45, 46, 0]
3. Using the Stem and Leaf display above, find the Median and Mode of the data. (5 pts)
4. How hot does it get in Death Valley? The following data are taken from a study conducted by the...