PGEG371: Data Analysis & Geostatistics
Laboratory Exercise # 3
1st and 5th February, 2015
Read through this instruction sheet then answer the ‘pre-Lab’ quiz BEFORE starting the exercises!
The purpose of this laboratory exercise is to use a Normal Distribution to find information about a data population.
On successful completion of this exercise, you should be able to describe:
- What a Normal Distribution is;
- What the histogram for a whole population looks like;
- How to find information about a population using the probability density function (PDF) and the cumulative distribution function (CDF);
- How to calculate the best estimate of the mean and standard deviation, and the confidence interval.
In this lab section, you will study the PDF and CDF of a Normal distribution. PDF is the Probability Density Function, and CDF is the Cumulative Distribution Function. A Normal Distribution is often regarded as the simplest model available. Could you speculate why?
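For reference, the standard textbook forms of the two functions are written out below. The symbols are not defined in this sheet and are assumed here: \mu is the population mean, \sigma the population standard deviation, and erf the error function.

```latex
% PDF of a Normal distribution with mean \mu and standard deviation \sigma
f(x) = \frac{1}{\sigma\sqrt{2\pi}}
       \exp\!\left( -\frac{(x-\mu)^{2}}{2\sigma^{2}} \right)

% CDF: the area under the PDF from -\infty up to x
F(x) = \int_{-\infty}^{x} f(t)\, dt
     = \frac{1}{2}\left[ 1 + \operatorname{erf}\!\left( \frac{x-\mu}{\sigma\sqrt{2}} \right) \right]
```

Only \mu and \sigma appear in these expressions, which may be a useful hint when thinking about the question above.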
The Normal (Gaussian) distribution has several very important characteristics. Some of them are: (1) it always has the same shape; (2) the mean value is at the centre of the distribution (any thoughts about the median and mode?); (3) the standard deviation scales the horizontal axis (what does this mean?); (4) the skewness is zero (why?); and (5) for a perfect Normal distribution, the kurtosis is 3.
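One way to explore these characteristics is to simulate a large Normal sample and compute its statistics. The following is only a sketch: the values of mu, sigma and n are arbitrary illustration values, not lab data, and skewness and kurtosis are assumed to be available through the Statistics Toolbox.

```matlab
% Sketch: check the listed properties on a simulated Normal sample.
% mu, sigma and n are assumed illustration values, not lab data.
rng(1);                          % fix the random seed so the run is repeatable
mu = 10; sigma = 2; n = 1e5;
x = mu + sigma * randn(n, 1);    % draw n values from a Normal(mu, sigma) model

fprintf('mean     = %.3f\n', mean(x));
fprintf('median   = %.3f\n', median(x));
fprintf('std      = %.3f\n', std(x));
fprintf('skewness = %.3f\n', skewness(x));   % Statistics Toolbox
fprintf('kurtosis = %.3f\n', kurtosis(x));   % Statistics Toolbox

% Fraction of values lying within one standard deviation of the mean
within1 = mean(abs(x - mu) <= sigma);
fprintf('fraction within 1 sigma = %.3f\n', within1);
```

For a sample this large, the mean and median should come out close to each other, the skewness close to 0 and the kurtosis close to 3. Changing sigma and re-running shows how the standard deviation stretches or compresses the horizontal axis.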
We understand that samples are a small part of a much larger and “well behaved” population. Therefore, statistics calculated from the sample data can be used to estimate the corresponding statistics for the population when the population follows a “Normal Distribution”. You can also quantify the uncertainty of these estimates using confidence intervals. The width of a confidence interval depends on how many samples you have relative to the total population, as well as on how variable the original distribution is.
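The estimation step can be sketched in MATLAB as follows. This is a minimal example under stated assumptions: the vector x is simulated stand-in data, the interval is the Student t interval for the mean at an assumed 95% level, and the population is taken to be much larger than the sample (no finite-population correction). Your course notes may prescribe a different interval, for example one based on the z-value for large samples. tinv requires the Statistics Toolbox.

```matlab
% Sketch: best estimates and a 95% confidence interval for the mean.
% x is simulated stand-in data; replace it with your assigned variable.
rng(2);
x = 50 + 8 * randn(40, 1);         % hypothetical sample of 40 measurements

n    = numel(x);
xbar = mean(x);                    % best estimate of the population mean
s    = std(x);                     % best estimate of the population std (n-1 divisor)
sem  = s / sqrt(n);                % standard error of the mean

alpha = 0.05;                      % assumed 95% confidence level
tcrit = tinv(1 - alpha/2, n - 1);  % Student t critical value (Statistics Toolbox)
ci    = [xbar - tcrit*sem, xbar + tcrit*sem];

fprintf('mean = %.2f, std = %.2f, 95%% CI = [%.2f, %.2f]\n', ...
        xbar, s, ci(1), ci(2));
```

The interval narrows as n grows (through the sqrt(n) term) and widens as s grows, which matches the dependence on sample count and variability described above.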
- Open MATLAB;
- Set your current directory to your ‘home directory’;
- Create a new M file with the name ‘Lab3-your-last-name’ in MATLAB (if unsure, refer to Lab_1 for instructions on how to do this)!!
- Save your variables in the workspace at the end of your section, or whenever you consider it necessary, from the file menu → Save Workspace As → ----------------
Some simple yet important statistical measures are given in the following list, each followed by the equivalent MATLAB built-in functions.
Visual PDF and CDF:
ksdensity(x, 'function', 'pdf') computes a probability density estimate of the sample in the vector x.
ksdensity(x, 'function', 'cdf') computes a cumulative probability estimate of the sample in the vector x.
Labelling X and Y axis:
xlabel('text') adds text beside the X-axis on the current axes (same for ylabel('text')).
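The functions listed above can be combined as in the following sketch. The vector x is simulated stand-in data and the label text is illustrative only; substitute your assigned variable and its units.

```matlab
% Sketch: plot the estimated PDF and CDF of a sample with ksdensity.
% x is simulated stand-in data; the axis label text is illustrative.
rng(3);
x = 100 + 15 * randn(200, 1);    % hypothetical sample

[f_pdf, xi_pdf] = ksdensity(x, 'function', 'pdf');   % density estimate
[f_cdf, xi_cdf] = ksdensity(x, 'function', 'cdf');   % cumulative estimate

figure;
subplot(2, 1, 1);
plot(xi_pdf, f_pdf);
xlabel('Value');
ylabel('Probability density');
title('Estimated PDF');

subplot(2, 1, 2);
plot(xi_cdf, f_cdf);
xlabel('Value');
ylabel('Cumulative probability');
title('Estimated CDF');
```

Requesting two outputs returns the estimate and the points at which it was evaluated, so the curves can be plotted and labelled explicitly. Calling ksdensity with no output arguments draws the plot directly.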
Knowing your data:
In this lab section, each of you will be assigned an exercise from the textbook. Before starting the lab, answer the following questions:
1. What is the source of your data? Explain
GASA is a data set named for the Geostatistical Association of South Africa and was used to illustrate geostatistical techniques. The sample data were taken from deep boreholes drilled into a typical Witwatersrand-type gold reef. The measurements of interest are the gold grade in grams per tonne of rock (parts per million) and the thickness of the reef intersection in the borehole (centimetres). The boreholes lie approximately 1 kilometre apart and constitute a typical data set for the planning and design of a new Wits gold mine. Coordinates are in metres.
2. What variable was assigned to you? What are its units?
(3pts) GASA data set, column 7: width of the reef, measured in centimetres.
3. What is the relevance of this data in Geoscience?
(3pts) GASA is used to illustrate geostatistical techniques, and it is important for estimating the potential gold in the reef mines.