Suppose you are an intern at a consulting firm that advises businesses and government agencies on various social and economic issues. You are involved in a project advising a pharmaceutical company that is interested in entering the market of Luckland (a country endowed with rich natural resources). Your supervisor, the project manager, needs some vital facts about the adult population of Luckland and asks you to answer the following questions.
(i) What are the main characteristics of the adult population: weight, height, age, gender, smoker or not, married or not, income, education?
(ii) What is the proportion of individuals who have kids and currently smoke?
(iii) What are the main factors that influence an individual’s weight?
(iv) How does income influence an individual’s weight, ceteris paribus?
(v) How does education level influence an individual’s weight, ceteris paribus?
(vi) Is there any evidence that smokers weigh less than non-smokers, ceteris paribus? Is there any evidence that smoke-quitters (those who have quit smoking) weigh more than non-smokers, ceteris paribus?
(vii) What is the expected (or mean) weight of a single male, non-smoker, 35 years of age, childless, 170 centimeters tall, with a trade certificate and an annual income of 30 thousand Luckland dollars?
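Question (ii) asks for a population proportion, which a random sample estimates by the corresponding sample share. The following is a minimal sketch only: it reads the question as the joint event "has kids and currently smokes", the 0/1 indicators `has_kids` and `smokes` are hypothetical names with toy values (the actual variables are defined in the survey's data description), and the interval is the usual large-sample normal approximation.

```python
import math

def joint_proportion(has_kids, smokes, z=1.96):
    """Sample share of respondents who have kids AND currently smoke.

    has_kids, smokes: equal-length sequences of 0/1 indicators
    (hypothetical names, not taken from the survey files).
    Returns (p_hat, standard_error, (ci_low, ci_high)).
    """
    if len(has_kids) != len(smokes) or len(has_kids) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(has_kids)
    # Count respondents for whom both indicators equal 1.
    hits = sum(1 for k, s in zip(has_kids, smokes) if k and s)
    p_hat = hits / n
    # Large-sample standard error of a sample proportion.
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    # Normal-approximation interval, clipped to the unit interval.
    ci = (max(0.0, p_hat - z * se), min(1.0, p_hat + z * se))
    return p_hat, se, ci

# Toy data for illustration only; not drawn from the survey.
has_kids = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
smokes = [1, 0, 0, 1, 0, 0, 1, 0, 1, 0]
p_hat, se, ci = joint_proportion(has_kids, smokes)
print(f"share = {p_hat:.3f}, se = {se:.3f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

With a sample as small as the toy one the normal approximation is rough; it is more defensible with the sample sizes typical of a national survey.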
Your supervisor instructs that you use the linear regression model $\log(\mathit{weight}) = \beta_0 + \beta_1 \log(\mathit{height}) + \beta_2 \log(\cdot) + \beta_3 (\cdot) + \beta_4 (\cdot) + \beta_5 (\cdot)^2 + \beta_6 (\cdot) + \beta_7 (\cdot) + \beta_8 (\cdot) + \beta_9 (\cdot) + \beta_{10} (\cdot) + \beta_{11} (\cdot) + \beta_{12} (\cdot) + u$ to find the answers to questions (iii)-(vii), where the $\beta$s are parameters to be estimated. In addition to answering questions (i)-(vii), you are encouraged to comment on the adequacy of this model for analyzing the questions. You have access to a data set from a recent national health survey of Luckland, which can be regarded as a random sample. The data description is in the file “NHS.des” and the data are in the file “NHS.raw”. Read “NHS.des” carefully and make sure that you understand the meaning of each variable in