Task 1: Project definition and Data Collection
In this project, the relationship between adult literacy rate and life expectancy will be evaluated. Adult literacy rate is the measure of the proportion of adults in the country with basic literacy. Life expectancy is a measure of the expected length of life for the people living in a country. Both literacy levels and life expectancy are measures of development in the country. High literacy rates are associated with higher development. Similarly, higher life expectancy is associated with higher levels of development.
In this study, life expectancy is the dependent variable while literacy rate is the independent variable. We shall evaluate the relationship between the two variables by using both descriptive and inferential statistics. The measures of central tendency, dispersion as well as correlation and regression analysis will be used in the study.
The data for the study is collected from www.gapminder.org. A sample of African and Middle East countries is then selected, giving a study sample of 61 countries from the two regions. The data collected contains the two variables, both of which are quantitative and measured on the ratio scale. Most of the selected countries in both regions are developing countries. However, not all of the countries have the same level of development.
Task 2.1: Data Analysis
We shall carry out a regression analysis on the dataset to determine the impact of literacy on life expectancy. First, we need to determine whether there is a linear relationship between the dependent variable and the independent variable. The scatter plot below shows the relationship between the two variables.
There is a linear relationship between literacy rate and life expectancy as shown above. The line of best fit is upward sloping, which indicates a positive linear relationship between the independent variable and the dependent variable. In addition, some extremely low values lie well below the line of best fit and are likely outliers. Outliers of this kind arise from countries with a high literacy rate but low life expectancy, or from countries with a low literacy rate but high life expectancy.
The equation for the line of best fit obtained is shown below.
Y = 0.2056X + 51.879
Where Y is life expectancy and X is literacy rate.
The above regression equation is determined using the following formulas.
$$a = \frac{(\sum y)(\sum x^2) - (\sum x)(\sum xy)}{n(\sum x^2) - (\sum x)^2}$$
$$b = \frac{n(\sum xy) - (\sum x)(\sum y)}{n(\sum x^2) - (\sum x)^2}$$
Where the linear regression equation is of the form,
Y = a + bx where a is the y-intercept and b is the slope of the equation.
$$a = \frac{(4017.76)(305902.7259) - (4149.288)(278158.1616)}{(61)(305902.7259) - (4149.288)^2} = \frac{74{,}885{,}413.98}{1{,}443{,}475.373} = 51.8786$$
$$b = \frac{(61)(278158.1616) - (4149.288)(4017.76)}{(61)(305902.7259) - (4149.288)^2} = \frac{296{,}804.5027}{1{,}443{,}475.373} = 0.2056$$
From the above calculations, the regression equation is Y = 51.8786 + 0.2056 X
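The calculation above can be checked with a short script. The following is a minimal Python sketch that applies the same least-squares formulas to the sums reported in this section; the function name `least_squares_from_sums` is an illustrative choice and not part of the original analysis, which was carried out in Excel.

```python
# Sums reported in the worked calculation (n = 61 countries).
n = 61
sum_x = 4149.288       # sum of x (literacy rate)
sum_y = 4017.76        # sum of y (life expectancy)
sum_x2 = 305902.7259   # sum of x squared
sum_xy = 278158.1616   # sum of x times y


def least_squares_from_sums(n, sum_x, sum_y, sum_x2, sum_xy):
    """Return (a, b) for the least-squares line y = a + b*x (hypothetical helper)."""
    denominator = n * sum_x2 - sum_x ** 2  # shared by intercept and slope
    a = (sum_y * sum_x2 - sum_x * sum_xy) / denominator
    b = (n * sum_xy - sum_x * sum_y) / denominator
    return a, b


a, b = least_squares_from_sums(n, sum_x, sum_y, sum_x2, sum_xy)
print(f"intercept a = {a:.4f}, slope b = {b:.4f}")
```

Both coefficients share the same denominator, n(Σx²) – (Σx)², so it is computed once and reused.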
We need to determine Pearson's product-moment correlation coefficient, r, for the relationship between the dependent variable and the independent variable. The value of r obtained from the Excel formula is 0.5164. The formula below will be used to calculate the value of r.
$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}}$$
$$r = \frac{(61)(278158.1616) - (4149.288)(4017.76)}{\sqrt{[(61)(305902.7259) - (4149.288)^2][(61)(268381.6826) - (4017.76)^2]}} = \frac{296{,}804.5027}{574{,}798.2835} = 0.5164$$
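The same sums can be used to check the correlation coefficient. The sketch below is an illustrative Python version of the formula for r, assuming the sums reported in this section; the helper name `pearson_r_from_sums` is hypothetical.

```python
import math

# Sums reported in the worked calculation (n = 61 countries).
n = 61
sum_x = 4149.288       # sum of x (literacy rate)
sum_y = 4017.76        # sum of y (life expectancy)
sum_x2 = 305902.7259   # sum of x squared
sum_y2 = 268381.6826   # sum of y squared
sum_xy = 278158.1616   # sum of x times y


def pearson_r_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Pearson's r computed from raw sums (hypothetical helper)."""
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


r = pearson_r_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy)
print(f"r = {r:.4f}, r squared = {r ** 2:.4f}")
```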
Task 2.2: Data Grouping
The maximum life expectancy for the sampled countries is 82.91 years while the minimum life expectancy is 48.86 years. The range for life expectancy is 34.05. Using the value of range above, we create the following frequency table of 12 groups with a class interval of 3 years. The table below represents the frequency for the 12 groups.
| Group (years) | Frequency |
|---|---|
| 48 - 51 | 1 |
| 51 - 54 | 3 |
| 54 - 57 | 1 |
| 57 - 60 | 7 |
| 60 - 63 | 15 |
| 63 - 66 | 10 |
| 66 - 69 | 5 |
| 69 - 72 | 4 |
| 72 - 75 | 5 |
| 75 - 78 | 4 |
| 78 - 81 | 5 |
| 81 - 84 | 1 |
Using the above table, we can create the following histogram.
The figure below shows the ogive for the above distribution.
The table below shows the calculation of the angle for pie chart.
| Group (years) | Frequency | Cumulative frequency | Angle calculation | Angle size (degrees) |
|---|---|---|---|---|
| 48 - 51 | 1 | 1 | (1/61)*360 | 5.901639 |
| 51 - 54 | 3 | 4 | (3/61)*360 | 17.70492 |
| 54 - 57 | 1 | 5 | (1/61)*360 | 5.901639 |
| 57 - 60 | 7 | 12 | (7/61)*360 | 41.31148 |
| 60 - 63 | 15 | 27 | (15/61)*360 | 88.52459 |
| 63 - 66 | 10 | 37 | (10/61)*360 | 59.01639 |
| 66 - 69 | 5 | 42 | (5/61)*360 | 29.5082 |
| 69 - 72 | 4 | 46 | (4/61)*360 | 23.60656 |
| 72 - 75 | 5 | 51 | (5/61)*360 | 29.5082 |
| 75 - 78 | 4 | 55 | (4/61)*360 | 23.60656 |
| 78 - 81 | 5 | 60 | (5/61)*360 | 29.5082 |
| 81 - 84 | 1 | 61 | (1/61)*360 | 5.901639 |
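The cumulative frequencies and pie chart angles in the table follow directly from the class frequencies. As a rough illustration, the Python sketch below recomputes both columns; the variable names are illustrative and not taken from the original workbook.

```python
from itertools import accumulate

# Class labels and frequencies from the frequency table.
groups = ["48 - 51", "51 - 54", "54 - 57", "57 - 60", "60 - 63", "63 - 66",
          "66 - 69", "69 - 72", "72 - 75", "75 - 78", "78 - 81", "81 - 84"]
frequencies = [1, 3, 1, 7, 15, 10, 5, 4, 5, 4, 5, 1]

n = sum(frequencies)                          # total number of countries
cumulative = list(accumulate(frequencies))    # running totals used for the ogive
angles = [f / n * 360 for f in frequencies]   # each class's share of the circle

for group, f, cf, angle in zip(groups, frequencies, cumulative, angles):
    print(f"{group}: f = {f}, cumulative = {cf}, angle = {angle:.5f}")
```

Because each angle is the class's share of 360 degrees, the angles add up to a full circle, which is a quick check on the table.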
The pie chart below shows the distribution of life expectancy among the groups.
We can use the table below to calculate the mean, median, mode, standard deviation and interquartile range. In the table, x is the midpoint of each group.

| Group (years) | x | Frequency (f) | Cumulative frequency | fx | fx^2 |
|---|---|---|---|---|---|
| 48 - 51 | 49.5 | 1 | 1 | 49.5 | 2450.25 |
| 51 - 54 | 52.5 | 3 | 4 | 157.5 | 8268.75 |
| 54 - 57 | 55.5 | 1 | 5 | 55.5 | 3080.25 |
| 57 - 60 | 58.5 | 7 | 12 | 409.5 | 23955.75 |
| 60 - 63 | 61.5 | 15 | 27 | 922.5 | 56733.75 |
| 63 - 66 | 64.5 | 10 | 37 | 645 | 41602.5 |
| 66 - 69 | 67.5 | 5 | 42 | 337.5 | 22781.25 |
| 69 - 72 | 70.5 | 4 | 46 | 282 | 19881 |
| 72 - 75 | 73.5 | 5 | 51 | 367.5 | 27011.25 |
| 75 - 78 | 76.5 | 4 | 55 | 306 | 23409 |
| 78 - 81 | 79.5 | 5 | 60 | 397.5 | 31601.25 |
| 81 - 84 | 82.5 | 1 | 61 | 82.5 | 6806.25 |
| Sum | | 61 | | 4012.5 | 267581.25 |

Mean = Σfx / n = 4012.5 / 61 = 65.78 years.
Mode = the modal class is 60 – 63 years.
Median = 63 + [(30.5 – 27)/10]*3 = 63 + 1.05 = 64.05
$$s = \sqrt{\frac{\sum fx^2 - (\sum fx)^2 / n}{n - 1}} = \sqrt{\frac{267581.25 - 4012.5^2 / 61}{60}} = \sqrt{60.7377}$$
Standard deviation = 7.79
Q1 = 60 + [(15.25 – 12)/15]*3 = 60 + 0.65 = 60.65
Q3 = 69 + [(45.75 – 42)/4]*3 = 69 + 2.81 = 71.81
Interquartile range = Q3 – Q1 = 71.81 – 60.65 = 11.16
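The median and quartile calculations above all use the same interpolation rule: lower class boundary + [(position – cumulative frequency before the class) / class frequency] * class width. The Python sketch below is one possible way to express that rule; the function `grouped_quantile` is a hypothetical helper, and it assumes the positions n/4, n/2 and 3n/4 used in the calculations above.

```python
# Lower class boundaries, class width and frequencies from the grouped table.
lower_bounds = [48, 51, 54, 57, 60, 63, 66, 69, 72, 75, 78, 81]
class_width = 3
frequencies = [1, 3, 1, 7, 15, 10, 5, 4, 5, 4, 5, 1]


def grouped_quantile(p, lower_bounds, frequencies, class_width):
    """Interpolated quantile for grouped data at position p * n (hypothetical helper)."""
    n = sum(frequencies)
    position = p * n          # e.g. 30.5 for the median when n = 61
    cumulative_before = 0     # cumulative frequency below the current class
    for lower, f in zip(lower_bounds, frequencies):
        if cumulative_before + f >= position:
            # The quantile falls in this class: interpolate within it.
            return lower + (position - cumulative_before) / f * class_width
        cumulative_before += f
    raise ValueError("p must lie between 0 and 1")


q1 = grouped_quantile(0.25, lower_bounds, frequencies, class_width)
median = grouped_quantile(0.50, lower_bounds, frequencies, class_width)
q3 = grouped_quantile(0.75, lower_bounds, frequencies, class_width)
print(f"Q1 = {q1:.2f}, median = {median:.2f}, Q3 = {q3:.2f}, IQR = {q3 - q1:.2f}")
```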
Task 3: Analysis and Write-up
Project Definition
The objective of this project is to determine the relationship between life expectancy and literacy rate. One of the roles of the government is to ensure high standards of living for its citizens. Although it is difficult to measure and quantify standards of living, several parameters may be used as proxies. For example, the life expectancy of a country is an indication of the standards of living and the general social and economic welfare of the citizens. Countries with higher life expectancy rank higher on economic welfare and development than countries with lower life expectancy (Cervellati & Sunde, 2009).
Therefore, it is important for the government and other stakeholders to review and evaluate ways of increasing the life expectancy of the citizens. However, the challenge is that the government does not have direct control over life expectancy. It can only be increased by influencing other social and economic aspects of citizens' lives. One of the factors that may influence life expectancy is education. When more citizens of a country are educated, the general welfare of the community improves. Therefore, it is important for the government to determine the impact that education has on the life expectancy of the citizens. The following hypotheses will be tested in the project.
Null hypothesis: Literacy rate is not a significant predictor of life expectancy.
Alternative hypothesis: Literacy rate is a significant predictor of life expectancy.
The research question for this project is: Does literacy rate have a significant impact on life expectancy?
Data Collection
Secondary data is obtained from an online source, www.gapminder.org, and the raw data is recorded. The source is a global report covering all countries, from which a non-probability sample is drawn: a convenience sample of African and Middle East countries. The sample is made up of 61 countries. Two variables are recorded, the literacy rate and life expectancy.
Literacy rate is a continuous quantitative variable measured on the ratio scale. It represents the percentage of adult citizens aged above 15 years with at least the primary level of education that is crucial for using the written word. Therefore, the literacy rate represents the proportion of the adult population that is able to read and communicate in written text. Literacy level provides an indication of the quality of the education system of schools and adult literacy centers. The measure for literacy level ranges between zero and 100%.
On the other hand, life expectancy is a continuous quantitative variable that is measured on the ratio scale. It represents the number of years that a new born child in the country is expected to live under the current conditions. Therefore, citizens of countries with higher life expectancy are expected to live longer on average than the citizens of countries with shorter life expectancy. Life expectancy changes over time depending on some social, human, political and environmental factors. The following hypotheses will be tested to answer the research question.
Null hypothesis 1: There is no significant relationship between literacy level and life expectancy.
Null hypothesis 2: Literacy rate does not significantly influence life expectancy.
Null hypothesis 1 will be tested using correlation analysis. The correlation coefficient will indicate the direction of the relationship as well as its strength. Null hypothesis 2 will be tested using regression analysis. The coefficients of the regression will provide the quantitative impact that literacy rate has on life expectancy. Excel will be used to analyze the data.
Analysis
Both descriptive and inferential statistics will be used to analyze the data. The table below shows the descriptive statistics of the two variables.
| Statistic | X (literacy rate, %) | Y (life expectancy, years) |
|---|---|---|
| Mean | 68.02111 | 65.86492 |
| Standard Error | 2.542727 | 1.012524 |
| Median | 71.29051 | 64.91 |
| Mode | #N/A | 61.4 |
| Standard Deviation | 19.85933 | 7.908065 |
| Sample Variance | 394.3932 | 62.53749 |
| Kurtosis | -0.69131 | -0.51785 |
| Skewness | -0.4954 | 0.298781 |
| Range | 70.97599 | 34.05 |
| Minimum | 25.30775 | 48.86 |
| Maximum | 96.28374 | 82.91 |
| Sum | 4149.288 | 4017.76 |
| Count | 61 | 61 |
The average literacy rate for the sampled countries is 68.02% with a standard deviation of 19.86%. The maximum literacy rate is 96.28% while the minimum literacy rate is 25.31%. The average life expectancy is 65.86 years with a standard deviation of 7.91 years. The maximum life expectancy is 82.91 years while the minimum life expectancy is 48.86. We need to determine whether the sample data complies with the assumptions of linear regression. There should be a linear relationship between the independent and dependent variables. In addition, the variables should be quantitative without outliers. The dependent variable should also be normally distributed (Witte & Witte, 2015). The histogram below shows the distribution of the dependent variable.
The histogram shows that the dependent variable is approximately normally distributed with a slight positive skew (skewness = 0.30). The graph below shows the relationship between the variables.
The scatter plot above shows that there is a linear relationship between the dependent variable and the independent variable.
Next, we carry out a correlation analysis to determine the correlation between the dependent and independent variables. The table below shows the results of the correlation analysis.
| | X | Y |
|---|---|---|
| X | 1 | |
| Y | 0.516365 | 1 |
The correlation coefficient is 0.5164, which indicates a moderate positive correlation between the two variables. The coefficient of determination, r² = 0.5164² ≈ 0.2666, shows that about 26.66% of the variation in life expectancy can be explained by the variation in literacy rate.
The table below shows the results of the regression analysis.
| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
|---|---|---|---|---|---|---|
| Intercept | 51.87851 | 3.143887 | 16.50139 | 8.9E-24 | 45.5876 | 58.16941 |
| X | 0.205619 | 0.044396 | 4.631508 | 2.05E-05 | 0.116783 | 0.294454 |
The regression equation obtained from the analysis is Y = 51.8785 + 0.2056x
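Each t statistic in the regression output is the coefficient divided by its standard error, and the fitted equation can be used to predict life expectancy for a given literacy rate. The Python sketch below illustrates both steps using the reported coefficients; the function `predict_life_expectancy` and the literacy rate of 70% are hypothetical examples, not values from the study.

```python
# Coefficients and standard errors reported in the regression output.
intercept, slope = 51.87851, 0.205619
se_intercept, se_slope = 3.143887, 0.044396

# t statistic = coefficient / standard error.
t_intercept = intercept / se_intercept
t_slope = slope / se_slope


def predict_life_expectancy(literacy_rate):
    """Predicted life expectancy (years) for a literacy rate in percent (hypothetical helper)."""
    return intercept + slope * literacy_rate


print(f"t (intercept) = {t_intercept:.3f}, t (slope) = {t_slope:.3f}")
# Illustrative input only: a country with a 70% adult literacy rate.
print(f"predicted life expectancy at 70% literacy = {predict_life_expectancy(70):.2f} years")
```

Small differences from the t Stat column in the later decimal places are expected because the inputs here are already rounded.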
Data Interpretation
The regression equation implies that when the literacy rate is 0%, the predicted life expectancy is 51.88 years. When the literacy rate increases by one percentage point, life expectancy increases by 0.2056 of a year. The coefficient for the independent variable is significant because the p-value is lower than the conventional 5% significance level (t = 4.63, p-value = 2.05E-05). Therefore, literacy rate is a significant predictor of life expectancy, and an increase in literacy rates can increase the life expectancy of the citizens. The government can increase the life expectancy of the citizens by promoting ease of access to schools and education institutions.
The correlation coefficient is relatively low. This implies that there are other variables that may influence the life expectancy of the citizens, such as access to health care and climate. Therefore, the government should seek to evaluate more factors that have significant effects on life expectancy.
Conclusion
The research question for the project was analyzed and evaluated. The research findings reveal that there is a positive relationship between literacy rate and life expectancy. Thus, the government can increase life expectancy by promoting policies that improve literacy rates. However, the findings of this study assume that the correlation between the two variables is an indication of causality. This may not be the case. The positive correlation between the two variables may be due to confounding variables that cause both to change in the same direction (Tokunaga, 2016). In this case, the government would not be able to control life expectancy by influencing literacy rate. In addition, the results of the project may not be reliable because the sampling method used is not probability-based. Therefore, the findings may be influenced by sampling bias. The findings of the study are limited to the geographical areas from which the sample was drawn. They may not be generalized to the world population because the sample is based on African and Middle East countries. It is therefore important for further study to be conducted using a global sampling frame.
Bibliography
Cervellati, M. & Sunde, U., 2009. Life expectancy and economic growth: the role of the demographic transition. London: Centre for Economic Policy Research.
Tokunaga, H., 2016. Fundamental statistics for the social and behavioral sciences. Los Angeles: Sage.
Witte, R. S. & Witte, J. S., 2015. Statistics. Hoboken: John Wiley and Sons, Inc.
Appendix
Table 1: Multiple correlation for regression analysis
| Regression Statistics | |
|---|---|
| Multiple R | 0.516365 |
| R Square | 0.266633 |
| Adjusted R Square | 0.254203 |
| Standard Error | 6.829368 |
| Observations | 61 |
Table 2: Analysis of variance for regression model.
| ANOVA | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 1 | 1000.474 | 1000.474 | 21.45087 | 2.05E-05 |
| Residual | 59 | 2751.775 | 46.64026 | | |
| Total | 60 | 3752.25 | | | |
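The regression statistics in Table 1 can be recovered from the sums of squares in Table 2. The Python sketch below is an illustrative check under the standard definitions for simple linear regression (R Square = SS regression / SS total, F = MS regression / MS residual, standard error = square root of MS residual); the variable names are assumptions made for the example.

```python
import math

# Values reported in the ANOVA table.
n = 61
ss_regression = 1000.474
ss_residual = 2751.775
df_regression, df_residual = 1, 59

ss_total = ss_regression + ss_residual
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

r_squared = ss_regression / ss_total                              # share of variation explained
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / df_residual  # penalised for model size
f_statistic = ms_regression / ms_residual
standard_error = math.sqrt(ms_residual)                           # standard error of the regression

print(f"R Square = {r_squared:.6f}, Adjusted R Square = {adjusted_r_squared:.6f}")
print(f"F = {f_statistic:.4f}, Standard Error = {standard_error:.4f}")
```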