Simple Random Sampling (SRS), Distribution of the Data and Statistics

Simple Random Sampling (SRS)

Data are collected using a simple random sampling (SRS) method, typically by drawing units at random from a sampling frame or grid.

The method gives every possible collection of n observations (sampling units) the same chance of being chosen.

The sampling is done in a fair and unbiased manner, so the data collected are not systematically skewed toward any part of the population.


An SRS is NOT a convenience sample. A convenience sample is typically biased.
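As a minimal sketch of how an SRS might be drawn in practice, the Python snippet below uses the standard library's random.sample, which selects without replacement so that every subset of size n is equally likely. The sampling frame of unit IDs 1 to 1000, the helper name draw_srs, and the seed are all hypothetical choices made for illustration.

```python
import random

def draw_srs(frame, n, seed=None):
    """Draw a simple random sample of n units, without replacement."""
    rng = random.Random(seed)
    # random.sample gives every subset of size n the same chance of selection
    return rng.sample(frame, n)

# Hypothetical sampling frame: unit IDs 1..1000
frame = list(range(1, 1001))
sample = draw_srs(frame, n=10, seed=42)
print(sample)
```

Fixing the seed only makes the illustration repeatable; in a real study the draw would normally not be tied to a hand-picked seed.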

A Parameter versus A Statistic

A parameter is a numerical characteristic of a population.

A statistic is a numerical characteristic of the sample.

The "true" value of a population parameter is usually unknown, but the numeric value of a statistic is known from the sample and changes from sample to sample.

Variability

Variability is the difference in the value of the statistic between samples of the same size (n).

Representativeness

The sample statistic for an SRS should be representative of the broader population.

Sampling Distribution

The predictable pattern of values that a statistic takes in repeated sampling is called the sampling distribution.
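The idea of repeated sampling can be illustrated with a small simulation. The sketch below builds a hypothetical skewed population (the exponential shape, its size of 10,000, the sample size of 100, and the 2,000 repeats are all assumptions for illustration), draws many simple random samples, and summarizes the resulting sample means.

```python
import random
import statistics

rng = random.Random(0)

# Hypothetical population: 10,000 values from a skewed (exponential) distribution
population = [rng.expovariate(1 / 50) for _ in range(10_000)]
# The parameter is known here only because we built the population ourselves
true_mean = statistics.fmean(population)

def sample_means(population, n, repeats, rng):
    """Return the sample mean from each of `repeats` SRS draws of size n."""
    return [statistics.fmean(rng.sample(population, n)) for _ in range(repeats)]

means = sample_means(population, n=100, repeats=2_000, rng=rng)

print(f"population mean (parameter): {true_mean:.2f}")
print(f"mean of the sample means:    {statistics.fmean(means):.2f}")
print(f"std dev of the sample means: {statistics.stdev(means):.2f}")
```

Each entry in means is one value of the statistic; the list as a whole approximates the sampling distribution, and its standard deviation is a measure of the variability between samples of the same size.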

Types of Error (Bias and Precision)

The sampling distribution of a statistic describes the bias and precision of the sampling.

The objective of a good sampling program is to have both low bias and high precision, meaning the sampling results are representative of the true population parameter "p" (low bias) and repeatable from sample to sample (high precision).
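Bias and precision are separate properties, which a rough simulation can make concrete. The sketch below compares an SRS with a hypothetical convenience sample that can only reach the lowest-valued 20% of the population; the population itself, the sample size of 50, and the 1,000 repeats are assumptions chosen purely for illustration.

```python
import random
import statistics

rng = random.Random(7)

# Hypothetical population, sorted so the "easy to reach" units come first
population = sorted(rng.gauss(50, 10) for _ in range(5_000))
true_mean = statistics.fmean(population)

def srs_estimate(n):
    # Every unit in the population can be selected
    return statistics.fmean(rng.sample(population, n))

def convenience_estimate(n):
    # Only the first 1,000 (lowest-valued) units are ever reachable
    return statistics.fmean(rng.sample(population[:1000], n))

srs = [srs_estimate(50) for _ in range(1_000)]
conv = [convenience_estimate(50) for _ in range(1_000)]

for label, estimates in (("SRS", srs), ("convenience", conv)):
    bias = statistics.fmean(estimates) - true_mean
    spread = statistics.stdev(estimates)
    print(f"{label:>11}: bias={bias:+.2f}  spread={spread:.2f}")
```

In this setup the convenience estimates cluster tightly (they look precise) but sit well below the true mean (they are biased), whereas the SRS estimates are centered on the true mean.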

The precision with which the sample statistic estimates the "true" population parameter increases as the size of the sample increases, but does not depend on the size of the population, as long as the population is much larger than the sample.

Precision can therefore be made as high as desired by taking a large enough sample, regardless of the size of the population.
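The relationship between sample size, population size, and precision can be checked with a short simulation. In the sketch below, the normal population (mean 100, standard deviation 15), the two population sizes, and the three sample sizes are hypothetical values chosen for illustration; the spread of the sample means is used as a stand-in for (lack of) precision.

```python
import random
import statistics

def spread_of_sample_means(pop_size, n, repeats=500, seed=1):
    """Std dev of the sample mean over repeated SRS draws of size n."""
    rng = random.Random(seed)
    # Hypothetical population: normal with mean 100 and std dev 15
    population = [rng.gauss(100, 15) for _ in range(pop_size)]
    means = [statistics.fmean(rng.sample(population, n)) for _ in range(repeats)]
    return statistics.stdev(means)

for pop_size in (10_000, 100_000):
    for n in (25, 100, 400):
        sd = spread_of_sample_means(pop_size, n)
        print(f"population={pop_size:>7}  n={n:>3}  std dev of sample means={sd:.2f}")
```

Under these assumptions the spread should roughly halve each time n is quadrupled, while changing the population from 10,000 to 100,000 units should make little difference.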

Assessing the Distribution of the Data

In large population studies consisting of many repeated samples, the distribution of the data should be assessed to determine which statistical tests are appropriate.

Normal Distribution - the curve is symmetrical on both sides of the peak and is also called a "bell curve", with both tails extending to infinity. The peak marks the mean, median, and mode, which are all the same. The spread of the data is described by the standard deviation: about 68%, 95%, and 99.7% of the area under the normal curve lies within 1, 2, and 3 standard deviations of the mean, respectively.

Skewed Distribution - the curve is asymmetrical (non-normally distributed), with more extreme values in one tail than in the other; the mean, median, and mode are not the same.

Central Tendency of the Data - most commonly defined by the mean, median and the mode; the central tendency is also called the measure of central location in epidemiology studies.

Spread of the Data - commonly described by the interquartile range, variance, and standard deviation. A box-and-whisker plot provides a visual representation of the data.


Median and Mean - the mean is affected by extreme values, whereas the median and the mode are largely unaffected.

Geometric Mean - the mean computed on the log scale and transformed back to the original scale; it is used when a log transformation of the data gives a more symmetrical (normal) curve than the unadjusted observations.

Midrange - the halfway point between the smallest and largest observations in the dataset.
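The measures of central location and spread described above can be computed with Python's standard library. The sketch below uses a small, made-up, right-skewed dataset containing one extreme value (40); the data and variable names are purely illustrative.

```python
import math
import statistics

# Hypothetical right-skewed data with one extreme value
data = [2, 3, 3, 4, 5, 6, 8, 40]

mean = statistics.fmean(data)
median = statistics.median(data)
mode = statistics.mode(data)
# Geometric mean: average the logs, then transform back
geo_mean = math.exp(statistics.fmean([math.log(x) for x in data]))
# Midrange: halfway between the smallest and largest observations
midrange = (min(data) + max(data)) / 2

# Quartiles use the module's default ("exclusive") method
q1, _, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1
variance = statistics.variance(data)  # sample variance
std_dev = statistics.stdev(data)      # sample standard deviation

# Effect of the extreme value: recompute without it
trimmed = [x for x in data if x != 40]
mean_shift = mean - statistics.fmean(trimmed)
median_shift = median - statistics.median(trimmed)

print(f"mean={mean}, median={median}, mode={mode}")
print(f"geometric mean={geo_mean:.2f}, midrange={midrange}")
print(f"IQR={iqr}, variance={variance:.2f}, std dev={std_dev:.2f}")
print(f"shift from dropping 40: mean {mean_shift:.2f}, median {median_shift:.2f}")
```

Removing the single extreme value moves the mean far more than the median, which is why the median is usually preferred for skewed data. Note that different software may use slightly different quartile conventions, so IQR values can differ between tools.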

Figures (sourced from CDC's Principles of Epidemiology in Public Health Practice): a) bell-shaped curve, b) three identical curves with different central locations, c) three distributions with the same central location but different spreads, d) normal curve with 1, 2, and 3 standard deviations, e) three distributions with different skewness, f) box-and-whisker plot.

Recommended Measures of Central Location and Spread by Type of Data (CDC)

Type of Distribution          Measure of Central Location   Measure of Spread
Normal                        Arithmetic mean               Standard deviation
Asymmetrical or skewed        Median                        Range or interquartile range
Exponential or logarithmic    Geometric mean                Geometric standard deviation

Reference: U.S. Centers for Disease Control and Prevention (CDC), http://www.cdc.gov