You have gained a short-term internship working as a junior data analyst in your city council’s employment division. Your supervisor has asked you to assist her in examining the relationship between the number of hours worked per week and the income earned per year among casual bookkeepers. A random sample of 200 bookkeepers from different age groups, who work in small to medium-sized accounting firms across different suburbs in the city, was selected to collect data on the two variables. These data are stored in the .xls file attached.
Your supervisor has categorised the data into six age group categories and four suburb categories. She further calculated the frequencies for each category and presented the information in the tables below.
Age
Age category    Frequency
Age group 1     37
Age group 2     60
Age group 3     47
Age group 4     IS
Age group 5     21
Age group 6     15
Use the data provided in the tables above to answer the following questions.
(a) Your supervisor is interested in comparing the number of bookkeepers in each age category. Which chart would you recommend her to use? Explain your reason for selecting this chart. 1 mark
(b) Use Excel to construct the chart you selected for part (a). Display the chart. Then briefly describe what you have observed about the number of bookkeepers in each age category. 1 mark
(c) Your supervisor is interested in comparing the proportion of the number of bookkeepers in each suburb category. Which chart would you recommend her to use? Explain your reason for selecting this chart. 1 mark
(d) Use Excel to construct the chart you selected for part (c). Display the chart. Then briefly describe what you have observed about the proportion of the number of bookkeepers in each suburb category. 1 mark
Second, your supervisor would like to use an appropriate graphical descriptive technique in presenting summaries of the data on each of the two variables: hours worked per week and income earned per year. Refer to the .xls file attached.
(a) Your supervisor believes that using 9 class intervals would be best in constructing a histogram for each of the two variables. Explain how your supervisor concluded 9 as the appropriate number of class intervals. :i mad;
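One common rule of thumb for choosing the number of class intervals is Sturges' formula, k = 1 + log2(n), rounded up to the next whole number. The brief does not state which rule your supervisor applied, so the sketch below is an assumption rather than her confirmed method; the function name `sturges_classes` is illustrative only. It shows that a sample of 200 leads to 9 classes under that rule.

```python
import math


def sturges_classes(n: int) -> int:
    """Number of class intervals suggested by Sturges' formula, rounded up."""
    if n < 1:
        raise ValueError("sample size must be at least 1")
    return math.ceil(1 + math.log2(n))


sample_size = 200  # number of bookkeepers in the sample (from the brief)
num_classes = sturges_classes(sample_size)
print(num_classes)  # 1 + log2(200) is about 8.64, which rounds up to 9
```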
(b) Your supervisor suggests using the class intervals listed below: 10 < X < 15, 15 < X < 20, …, 50 < X < 55 for the hours per week variable, and 20 < X < 25, 25 < X < 30, …, 60 < X < 65 for the yearly income variable.
Explain how your supervisor could have decided on the width of the above class intervals. 2 marks
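A usual way to arrive at a class width is to divide the span covered by the classes by the number of classes, then round to a convenient value. The sketch below is a minimal illustration under that assumption. It takes the lower and upper limits directly from the intervals listed above instead of from the data file, and the helper names `class_width` and `bin_upper_limits` are hypothetical. The upper limits it produces are the kind of values Excel's Histogram tool expects as BIN values.

```python
def class_width(lower: float, upper: float, num_classes: int) -> float:
    """Width of each class when the span [lower, upper] is split evenly."""
    return (upper - lower) / num_classes


def bin_upper_limits(lower: float, upper: float, num_classes: int) -> list[float]:
    """Upper limit of each class interval, suitable as histogram BIN values."""
    width = class_width(lower, upper, num_classes)
    return [lower + width * i for i in range(1, num_classes + 1)]


# Limits taken from the class intervals listed in the brief.
hours_bins = bin_upper_limits(10, 55, 9)
income_bins = bin_upper_limits(20, 65, 9)

print(class_width(10, 55, 9))  # width of the hours classes
print(hours_bins)
print(income_bins)
```

Both variables span 45 units across 9 classes, which is why the same width of 5 works for each.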
(c) Draw a histogram for each of the two variables. In drawing the histograms, you are to use the appropriate BIN values from part (b). Moreover, provide comment(s) on the shape of the two distributions. 2 marks
Moreover, your supervisor is interested in obtaining numerical descriptive measures to further summarize the data on each of the two variables: hours worked per week and income earned per year. Present and display two numerical summary reports – one report for the hours worked variable, and another one for the income earned variable. Ensure that the calculated measures of mean, median, mode, range, variance, standard deviation, smallest value, largest value and the three quartiles are included in the report.
2 marks
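The measures listed above can be cross-checked outside Excel with Python's standard `statistics` module. The sketch below is illustrative only: the `hours` values are hypothetical and do not come from the attached .xls file, and `summary_report` is an assumed helper name. It uses the sample variance and standard deviation, and the "inclusive" quartile method, which interpolates in the same way as Excel's QUARTILE.INC.

```python
import statistics


def summary_report(values: list[float]) -> dict[str, float]:
    """Numerical summary: centre, spread, extremes and the three quartiles."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "range": max(values) - min(values),
        "variance": statistics.variance(values),        # sample variance
        "standard deviation": statistics.stdev(values),  # sample std. dev.
        "smallest": min(values),
        "largest": max(values),
        "Q1": q1,
        "Q2": q2,
        "Q3": q3,
    }


# Hypothetical hours worked per week, NOT the real sample.
hours = [12, 18, 20, 20, 25, 30, 34, 41]
report = summary_report(hours)
for measure, value in report.items():
    print(f"{measure}: {value}")
```

The same function can be called a second time on the income values to produce the second report.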
Give a brief interpretation of the reported mean value for the income earned variable. 1 mark
Give a brief interpretation of the reported third quartile value for the hours worked variable. 1 mark
Calculate and present the correlation coefficient value of the linear relationship between the two variables. Give an interpretation of the calculated correlation value.
1 marks
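The linear correlation coefficient requested above is normally the Pearson coefficient, which is also what Excel's CORREL function returns. The sketch below computes it from first principles as a cross-check. The paired `hours` and `income` values are hypothetical and are not taken from the attached file, and `pearson_r` is an assumed helper name.

```python
import math


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired observations, NOT the real sample.
hours = [12, 18, 20, 25, 30]
income = [22, 30, 31, 40, 47]
r = pearson_r(hours, income)
print(round(r, 4))
```

A value close to +1 indicates a strong positive linear relationship, a value close to -1 a strong negative one, and a value near 0 little or no linear relationship.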
“In the digital economy the world has seen an exponential increase in the amount of data generated per second. When strategically managed and analysed, data transform into useful information for business decision making.”
Herewith, you are tasked to select and utilize real available data from any relevant source. These could be data from a variety of areas of student interest (e.g. economics, finance, accounting, marketing, management, sports, tourism, etc). You could also select and utilise data regarding the Covid-19 pandemic situation if you like. Copy and paste the data you select prior to presenting your report in addressing items (a) to (d) below.
The emphasis here would be for you to demonstrate your skills in data visualization and descriptive data analysis using graphical and numerical techniques taught in modules six to eight of our course. Moreover, drawing from the graphical and numerical outputs obtained, students are to present a report (fVOtt words maximum in word length). In your report, please ensure that you cover the following items:
(a) Identification of the type of data you have. 2 marks
(b) Discussion on why you chose certain graphical technique/s for the data you have.
1 marks
(c) Presentation and analysis of the relevant graphical outputs* and the numerical measures summary report. 5 marks
(d) Discussion on the important information you extract from the graphical outputs and the numerical measures summary report. Specifically, in the discussion you also need to offer recommendations that propose innovative business solutions for decision makers who may benefit from the use of the data. * riiariu
*In producing the graphical outputs for this task, you can either use Microsoft Excel or Tableau Public.