| 
                     
                  
                    - 
                      
Populations and samples 
                      
                        - 
                          
 What counts as the population depends on the data or survey 
                         - 
                          
 Example 
                          
                            - 
                              
 Population – survey CEOs of the world’s top 500 corporations 
                             - 
                              
 Parameters 
                              
                                - 
                                  
 Mean, μ
                                   
                                 - 
                                  
 Standard deviation, σ
                                   
                                
                             
                            - 
                              
 Sample – used when the population has too many individuals to survey 
                             - 
                              
 Choose a sample from the population 
                            
                         
                        - 
                          
 Conditions 
                          
                            - 
                              
 Every individual in a population has a known non-zero chance of being sampled 
                             - 
                              
 Everyone has an equal chance of being selected 
                             - 
                              
 Selection has to be independent: choosing one individual does not influence the choice of another 
                              
                             
                           
                         
                        - 
                          
 Have to be careful when defining a population 
                          
                            - 
                              
 Book – each member of the population is assigned a number 
                             - 
                              
 Use a random number table to randomly select individuals 
                             - 
                              
 Excel – the function is =rand() 
                              
                                - 
                                  
 The result is uniformly distributed between 0 and 1 
                                 - 
                                  
 X ~ UNIF(0, 1) 
                                
                             
                            - 
                              
 Select whole numbers between 0 and 1,000 
                              
                                - 
                                  
 =round(1000*rand(), 0) 
                                 - 
                                  
 The round function, with 0 as its second argument, rounds a number to the nearest integer 
                                
                             
                            - 
                              
 Each time you change something in Excel, Excel recalculates the random numbers 
                              
                                - 
                                  
 Use Copy and Paste Special to freeze the random numbers and stop them from changing 
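
The spreadsheet steps above can be mirrored in Python. This is a minimal sketch, assuming a hypothetical population of 1,000 numbered members and a sample size of 10, neither of which comes from the notes. The fixed seed plays the role of Paste Special: the numbers stay the same on every run.

```python
import random

rng = random.Random(42)   # fixed seed, so the draws are "frozen" between runs

# Analogue of =rand(): one draw from UNIF(0, 1)
u = rng.random()

# Analogue of =round(1000*rand(), 0): a whole number between 0 and 1,000
k = round(1000 * rng.random())

# Hypothetical population numbered 1..1000 (an assumption for illustration);
# draw 10 distinct members, as one would with a random number table.
population = range(1, 1001)
sample = rng.sample(population, 10)

print(u, k, sample)
```

Note that rounding gives the endpoints 0 and 1,000 only half the chance of every other integer; `rng.randint(0, 1000)` treats all 1,001 values equally if that matters.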
                                
                             
                           
                        
                     
                    - 
                      
Trick – Generate random numbers with any distribution 
                      
                        - 
                          
 Example – generate normally distributed random numbers 
                          
                            - 
                              
 Probability Density Function (PDF) – a function that gives the relative likelihood of each value of a continuous random variable; for a discrete random variable, the corresponding function gives the probability that each value will occur. 
                              
                                - 
                                  
 Denoted as p(x) or f(x) 
                                
                             
                            - 
                              
 Cumulative Distribution Function (CDF) – the integral of the probability density function, giving the probability that the variable is less than or equal to x 
                              
                                - 
                                  
 Denoted by a capital letter, such as P(x) or F(x). 
                                
                             
                           
                        
                     
                   
                  
                     
                   
                             Summing over all probabilities (or integrating the density over all values) must give one 
                  
                     
                   
                  
                    
                      
                        - 
                          
 A PDF and a CDF are shown below 
                        
                      
                     
                   
                  
                     
                   
                  
                    
                      - 
                        
 Draw a UNIF(0, 1) random number and treat it as a probability between 0 and 1 
                        
                          - 
                            
 Apply the inverse of the CDF, P(x), to that random number to recover the value x 
                           - 
                            
 To randomly create a normally distributed variable with a given mean and standard deviation, the Excel function is 
                           - 
                            
 =norminv(rand(), mean, standard deviation) 
                          
                       
                      - 
                        
 Example 
                        
                          - 
                            
 Find the random numbers for the distribution X_i ~ N(10, 25) 
                            
                              - 
                                
 The notation is X_i ~ N(μ, σ²)
                                 
                               - 
                                
 The Excel function is =norminv(rand(), 10, 5), because the standard deviation is the square root of the variance, √25 = 5 
                              
                           
                          - 
                            
 This method can generate random numbers from any distribution whose CDF can be inverted 
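
The inverse-CDF trick can be sketched in Python as follows. This is a minimal illustration, not the notes' own code: `NormalDist.inv_cdf` from the standard library stands in for Excel's `norminv`, and the exponential distribution is an assumed second example chosen only because its inverse CDF has a simple closed form.

```python
import math
import random
from statistics import NormalDist

rng = random.Random(0)   # fixed seed so the run is repeatable

def draw_normal(mean, sd):
    """Push a UNIF(0, 1) draw through the inverse normal CDF."""
    u = rng.random()                 # analogue of =rand()
    while u == 0.0:                  # inv_cdf needs 0 < u < 1
        u = rng.random()
    return NormalDist(mean, sd).inv_cdf(u)   # analogue of =norminv(u, mean, sd)

def draw_exponential(rate):
    """Same idea for an exponential: the inverse CDF is -ln(1 - u) / rate."""
    return -math.log(1.0 - rng.random()) / rate

# X_i ~ N(10, 25): the variance is 25, so the standard deviation is 5
draws = [draw_normal(10, 5) for _ in range(10_000)]
sample_mean = sum(draws) / len(draws)
print(round(sample_mean, 2))
```

With 10,000 draws the sample mean should land close to 10, which is a quick check that the transformed numbers follow the intended distribution.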
                          
                       
                     
                   
                  
                    - 
                      
Stratified Random Sampling 
                      
                        - 
                          
 You take a sample and then divide it by gender (male or female) 
                         - 
                          
 Then you divide by age, creating four categories 
                          
                            - 
                              
 0 – 30 years 
                             - 
                              
 31 – 40 years 
                             - 
                              
 41 – 60 years 
                             - 
                              
 > 60 years 
                            
                         
                        - 
                          
 This gives a total of eight compartments (2 genders × 4 age categories) 
                          
                            - 
                              
 You randomly select individuals and fill the compartments equally 
                             - 
                              
 Each compartment has 10 individuals 
                            
                         
                        - 
                          
 Unfortunately, genders and age categories may not be distributed evenly in the population 
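
The eight-compartment scheme can be sketched as below. The population here is entirely made up (2,000 people with random gender and ages 18 to 90), and the helper `age_group` is a hypothetical name; only the gender split, the four age categories, and the 10 individuals per compartment follow the notes.

```python
import random

rng = random.Random(1)

def age_group(age):
    """Map an age to one of the four categories."""
    if age <= 30:
        return "0-30"
    if age <= 40:
        return "31-40"
    if age <= 60:
        return "41-60"
    return ">60"

# Hypothetical population of (id, gender, age) records
population = [
    (i, rng.choice(["male", "female"]), rng.randint(18, 90))
    for i in range(2000)
]

# Sort everyone into a compartment keyed by (gender, age category)
strata = {}
for person in population:
    _, gender, age = person
    strata.setdefault((gender, age_group(age)), []).append(person)

# Fill each compartment equally: 10 random individuals per compartment
PER_STRATUM = 10
sample = {
    key: rng.sample(members, min(PER_STRATUM, len(members)))
    for key, members in strata.items()
}

for key in sorted(sample):
    print(key, len(strata[key]), len(sample[key]))
```

The printed compartment sizes differ from one another even though each contributes 10 people, which is the uneven-distribution caveat in practice.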
                          
                         
                       
                     
                    - 
                      
Unbiasedness – on average, the mean of a sample will equal its true parameter value 
                      
                        - 
                          
 The notation is E(x̄) = μ
                           
                         - 
                          
 E stands for expected value 
                         - 
                          
 Precise – the study is repeatable; if we took another sample, we would get similar results 
                         - 
                          
 Nonrandom samples – make our parameter estimates biased 
                          
                            - 
                              
 Some people in the population will never be selected; they may be transient 
                             - 
                              
 Some people may not fill out the surveys 
                             - 
                              
 Some people may lie on surveys 
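
Unbiasedness can be checked by simulation: draw many random samples, compute each sample mean, and average those means. The sketch below assumes a hypothetical normal population with true mean 50 and standard deviation 10; these values and the sample sizes are illustrative choices, not taken from the notes.

```python
import random
from statistics import mean

rng = random.Random(7)

MU, SIGMA = 50.0, 10.0     # hypothetical true parameters
N, REPEATS = 25, 4000      # sample size and number of repeated samples

# One sample mean per repeated random sample
sample_means = [
    mean([rng.gauss(MU, SIGMA) for _ in range(N)])
    for _ in range(REPEATS)
]

# Individual means scatter around MU, but their average sits very close to it
average_of_means = mean(sample_means)
print(round(min(sample_means), 1), round(max(sample_means), 1))
print(round(average_of_means, 2))
```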
                            
                         
                       
                     
                    - 
                      
Block Randomization  
                      
                        - 
                          
 Use Table F and choose a block size of 2, 4, 6, 8, or 10 
                         - 
                          
 Example – testing effectiveness of a new drug 
                          
                            - 
                              
 We have 8 patients, and choose block size 8 
                             - 
                              
 Four patients get the new drug, while four patients get the placebo 
                             - 
                              
 Each of the 8 patients in our study has a unique number between 1 and 8 
                             - 
                              
 Patients could be a biased sample; however, we are testing the drug’s effectiveness 
                             - 
                              
 The 8 patients are then assigned to the following treatments 
                            
                         
                       
                    
                  
                    
                      | Treatment | 2 | 3 | 8 | 5 |
                      | Placebo | 1 | 4 | 6 | 7 |
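
An assignment like the one above can be produced in code. This sketch swaps the printed random permutation table (Table F) for a software shuffle, which is an assumption on my part; the resulting split will generally differ from the table's, but every block still receives equal numbers of treatment and placebo.

```python
import random

rng = random.Random(3)

patients = list(range(1, 9))   # patients numbered 1..8
block_size = 8                 # must be even so each block splits in half

assignment = {}
for start in range(0, len(patients), block_size):
    block = patients[start:start + block_size]
    half = len(block) // 2
    # Half the labels are Treatment, half Placebo, in random order
    labels = ["Treatment"] * half + ["Placebo"] * half
    rng.shuffle(labels)
    assignment.update(zip(block, labels))

treatment = sorted(p for p, arm in assignment.items() if arm == "Treatment")
placebo = sorted(p for p, arm in assignment.items() if arm == "Placebo")
print("Treatment:", treatment)
print("Placebo:  ", placebo)
```

With a smaller block size, for example 4, the same loop balances the arms within every group of four patients, which keeps the groups even if enrollment stops early.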
                     
                   
                  
                    - 
                      
Standard Error 
                      
                        - 
                          
 Each time we take a sample, we get a different mean 
                         - 
                          
 Example 
                          
                            - 
                              
Sample 1: x̄ = 29.3 
                             - 
                              
Sample 2: x̄ = 33.3 
                            
                         
                       
                     
                   
                  
                     
                   
                  
                    
                      
                        - 
                          
Sample 100: x̄ = 27.7 
                        
                      
                     
                    
                      - 
                        
 We do not want to keep taking samples to find the variability in the mean 
                       - 
                        
 The standard error (SE) gives the variability in the mean for repeated sampling 
                       - 
                        
 The formula is $SE = s / \sqrt{n}$, where s is the sample standard deviation and n is the sample size 
                      
                    
                   
                  
                     
                   
                  
                    
                      - 
                        
 As the sample size increases, the standard error decreases 
                      
                    
                   
                  
                     
                   
                  
                    
                      - 
                        
 With an infinite sample size, the standard error is zero and we know the true parameter for the mean 
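
The following sketch compares the standard error computed from a single sample with the brute-force alternative of drawing many samples and measuring how much their means vary. It assumes the usual formula for the standard error of the mean, SE = s / sqrt(n), and a hypothetical normal population with mean 30 and standard deviation 12; none of these numbers come from the notes.

```python
import math
import random
from statistics import mean, stdev

rng = random.Random(11)

MU, SIGMA = 30.0, 12.0   # hypothetical population parameters
N = 50                   # size of each sample

def standard_error(sample):
    """Standard error of the mean: sample standard deviation over sqrt(n)."""
    return stdev(sample) / math.sqrt(len(sample))

# SE estimated from one sample only
one_sample = [rng.gauss(MU, SIGMA) for _ in range(N)]
se_from_one_sample = standard_error(one_sample)

# Brute force: the spread of the means of 2,000 repeated samples
means = [mean([rng.gauss(MU, SIGMA) for _ in range(N)]) for _ in range(2000)]
sd_of_means = stdev(means)

# The theoretical SE shrinks as the sample size grows
se_small_n = SIGMA / math.sqrt(50)
se_large_n = SIGMA / math.sqrt(5000)

print(round(se_from_one_sample, 2), round(sd_of_means, 2))
print(round(se_small_n, 3), round(se_large_n, 3))
```

The two numbers on the first printed line should be of similar size, which is why a single sample is enough to gauge the variability of the mean.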
                      
                    
                   
                  
                    - 
                      
Binomial Distribution 
                      
                        - 
                          
 We have two states: 
                          
                            - 
                              
 P is the probability that Event A happens 
                             - 
                              
 1 – P is the probability that Event A does not happen 
                            
                         
                        - 
                          
 The states or events are mutually exclusive 
                          
                            - 
                              
 We sampled 80 people and 43 went to college 
                             - 
                              
 The mean, or sample proportion, of people who went to college (the event) 
                              
                                - 
                                  
 P = 43 / 80 = 0.5375 
                                
                             
                            - 
                              
 The probability for people who did not go to college 
                              
                                - 
                                  
 1 – P = (80 – 43) / 80 = 1 – 0.5375 = 0.4625 
                                
                             
                           
                         
                        - 
                          
 The variance of a single yes/no outcome is $P(1 - P) = 0.5375 \times 0.4625 \approx 0.2486$ 
                          
                         
                        - 
                          
 The standard error is $\sqrt{P(1 - P) / n} = \sqrt{0.2486 / 80} \approx 0.0557$ 
                          
                         
                        - 
                          
 Probabilities of events can also be expressed as percentages. 
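
The college example can be worked through numerically. This is a small sketch assuming the textbook formulas for a yes/no outcome: variance P(1 – P) for a single observation and standard error sqrt(P(1 – P) / n) for the sample proportion.

```python
import math

n = 80            # people sampled
successes = 43    # people who went to college

p = successes / n                  # proportion who went to college
q = 1 - p                          # proportion who did not
variance = p * q                   # variance of a single yes/no outcome
se = math.sqrt(variance / n)       # standard error of the sample proportion

print(f"P = {p:.4f}, 1 - P = {q:.4f}")
print(f"variance = {variance:.4f}, SE = {se:.4f}")

# The same quantities expressed in percent
print(f"P = {100 * p:.2f}%, SE = {100 * se:.2f} percentage points")
```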
                          
                         
                       
                     
                   
                  
                     
                   
                 |