Assignment Task
Question
1. As part of a study on salaries in different industries, a random sample of people was selected and, for each person, their salary in thousands of dollars (X1) and the industry in which they work (X2) were recorded. You have been tasked with performing some analyses of the data. The data are stored in the file Assignment Data.RData in the data frame salaries.
(a) Based on what was covered in the course, create a single plot that would be most appropriate for describing the overall distribution of salaries for all people not in the secondary education industry. Make sure to give your plot a proper descriptive title and appropriate axis labels (do not just use the default title or label). Provide a clear description of the overall distribution of salaries for all people not in the secondary education industry. Be specific in your description, making sure to mention any interesting and/or important aspects of the distribution.
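One possible way to approach part (a) is sketched below, assuming a histogram is the plot the course intends for a single quantitative variable. The simulated data frame is a hypothetical stand-in so the sketch runs on its own, and the industry labels (such as "Secondary Education") are assumptions; the actual labels in salaries$X2 should be checked first, for example with unique(salaries$X2).

```r
# Hypothetical stand-in for the real data. In the assignment, `salaries`
# would instead come from load()-ing the provided .RData file.
set.seed(1)
salaries <- data.frame(
  X1 = round(rlnorm(200, meanlog = 4.4, sdlog = 0.3), 2),
  X2 = sample(c("Secondary Education", "Tertiary Education",
                "Finance", "Technology"), 200, replace = TRUE)
)

# Keep the salaries of everyone NOT in secondary education
# (the label "Secondary Education" is an assumed spelling).
not_secondary <- salaries$X1[salaries$X2 != "Secondary Education"]

# Histogram with a descriptive title and axis labels instead of the defaults
hist(not_secondary,
     main = "Salaries of People Outside the Secondary Education Industry",
     xlab = "Salary (thousands of dollars)",
     ylab = "Number of people")

# Numerical summaries that can support the written description
# (centre, spread, and hints of skewness or outliers)
print(summary(not_secondary))
print(sd(not_secondary))
```

A written description would then typically cover the shape (symmetry or skewness, number of modes), the centre, the spread, and any unusual observations visible in the plot.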
(b) Calculate an 87% confidence interval for the population mean salary of people in the education industry (i.e., secondary or tertiary education). Give a clear interpretation of the confidence interval. Do not use any R functions that are designed to calculate confidence intervals or perform hypothesis tests.
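A minimal sketch of the calculation in part (b), assuming the course uses the one-sample t interval, i.e. sample mean plus or minus t critical value times s / sqrt(n), with n - 1 degrees of freedom. Only the quantile function qt() is used, which is a distribution function rather than a confidence-interval routine. The simulated data and the industry labels are hypothetical stand-ins, not the real assignment data.

```r
# Hypothetical stand-in for the real data. In the assignment, `salaries`
# would instead come from load()-ing the provided .RData file.
set.seed(1)
salaries <- data.frame(
  X1 = round(rlnorm(200, meanlog = 4.4, sdlog = 0.3), 2),
  X2 = sample(c("Secondary Education", "Tertiary Education",
                "Finance", "Technology"), 200, replace = TRUE)
)

# Salaries of people in either education industry (labels are assumed)
edu <- salaries$X1[salaries$X2 %in% c("Secondary Education",
                                      "Tertiary Education")]

n    <- length(edu)   # sample size
xbar <- mean(edu)     # sample mean
s    <- sd(edu)       # sample standard deviation

conf_level <- 0.87
alpha      <- 1 - conf_level

# Critical value: the 1 - alpha/2 = 0.935 quantile of t with n - 1 df
t_crit <- qt(1 - alpha / 2, df = n - 1)

# Margin of error and the resulting interval
margin <- t_crit * s / sqrt(n)
ci <- c(lower = xbar - margin, upper = xbar + margin)
print(ci)
```

The interpretation would then be phrased in terms of the population: one can be 87% confident that the population mean salary of people in the education industry lies between the lower and upper limits, in thousands of dollars.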
(c) Test whether the population proportion of salaries in the finance industry that are less than 85.15 thousand dollars is equal to the population proportion of salaries in the technology industry that are greater than 102.65 thousand dollars. Clearly state your hypotheses, making sure to define any parameters, and use a significance level of α = 4%. Do not use any R functions that are designed to perform hypothesis tests.
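Part (c) could be sketched as a two-sided, two-sample z test for proportions with a pooled standard error, assuming that is the method taught in the course. Here p1 denotes the population proportion of finance salaries below 85.15 and p2 the population proportion of technology salaries above 102.65, with H0: p1 = p2 against Ha: p1 ≠ p2. Only pnorm() is used, which is a distribution function rather than a testing routine. The simulated data and the industry labels are hypothetical stand-ins.

```r
# Hypothetical stand-in for the real data. In the assignment, `salaries`
# would instead come from load()-ing the provided .RData file.
set.seed(1)
salaries <- data.frame(
  X1 = round(rlnorm(200, meanlog = 4.4, sdlog = 0.3), 2),
  X2 = sample(c("Secondary Education", "Tertiary Education",
                "Finance", "Technology"), 200, replace = TRUE)
)

# Split out the two industries (labels are assumed spellings)
finance <- salaries$X1[salaries$X2 == "Finance"]
tech    <- salaries$X1[salaries$X2 == "Technology"]

# Sample sizes and counts of "successes" in each group
n1 <- length(finance); x1 <- sum(finance < 85.15)
n2 <- length(tech);    x2 <- sum(tech > 102.65)

p1_hat <- x1 / n1
p2_hat <- x2 / n2

# Pooled proportion, used because H0 states p1 = p2
p_pool <- (x1 + x2) / (n1 + n2)

# Test statistic under H0
se <- sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z  <- (p1_hat - p2_hat) / se

# Two-sided p-value from the standard normal distribution
p_value <- 2 * pnorm(-abs(z))

# Decision at the 4% significance level
sig_level <- 0.04
reject_h0 <- p_value < sig_level
print(c(z = z, p_value = p_value))
print(reject_h0)
```

Before relying on the normal approximation, it is usual to check that the expected numbers of successes and failures in each group are large enough under whatever rule of thumb the course uses.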