Do poor countries grow faster than rich countries?¶

Dataset¶

The dataset contains the Barro-Lee growth data: a panel of 138 countries for the period 1960 to 1985. The dependent variable is the national growth rate in GDP per capita for the periods 1965-1975 and 1975-1985. The growth rate in GDP over a period from t1 to t2 is commonly defined as log(GDPt2 /GDPt1 ). The number of variables is p=62. The number of complete observations is n=90.

The full data set and further details can be found at http://www.nber.org/pub/barro.lee, http://www.barrolee.com, and http://www.bristol.ac.uk/Depts/Economics/Growth/barlee.htm.
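The growth-rate definition above is easy to check numerically. A minimal sketch with made-up GDP figures (the values here are illustrative, not from the dataset):

```python
import numpy as np

# Hypothetical GDP per capita at the start and end of a decade
gdp_t1 = 1000.0  # GDP per capita in year t1
gdp_t2 = 1350.0  # GDP per capita in year t2

# Growth over the period, defined as log(GDP_t2 / GDP_t1)
growth = np.log(gdp_t2 / gdp_t1)
print(growth)
```

For small changes, this log difference is approximately the percentage change over the period (here, roughly 30%).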

  1. Outcome : National growth rate in GDP per capita for the period 1965-1975.
  2. intercept: Constant.
  3. gdpsh465 : Real GDP per capita (1980 international prices) in 1965
  4. bmp1l : Black market premium. Log (1+BMP)
  5. freeop : Measure of "free trade" openness.
  6. freetar : Measure of tariff restriction
  7. h65 : Total gross enrollment ratio for higher education in 1965.
  8. hm65 : Male gross enrollment ratio for higher education in 1965.
  9. hf65 : Female gross enrollment ratio for higher education in 1965.
  10. p65 : Total gross enrollment ratio for primary education in 1965.
  11. pm65 : Male gross enrollment ratio for primary education in 1965.
  12. pf65 : Female gross enrollment ratio for primary education in 1965.
  13. s65 : Total gross enrollment ratio for secondary education in 1965.
  14. sm65 : Male gross enrollment ratio for secondary education in 1965.
  15. sf65 : Female gross enrollment ratio for secondary education in 1965.
  16. fert65 : Total fertility rate (children per woman) in 1965.
  17. mort65 : Infant Mortality Rate in 1965.
  18. lifee065 : Life expectancy at age 0 in 1965.
  19. gpop1 : Growth rate of population.
  20. fert1 : Total fertility rate (children per woman).
  21. mort1 : Infant Mortality Rate (ages 0-1).
  22. invsh41 : Ratio of real domestic investment (private plus public) to real GDP.
  23. geetot1 : Ratio of total nominal government expenditure on education to nominal GDP.
  24. geerec1 : Ratio of recurring nominal government expenditure on education to nominal GDP.
  25. gde1 : Ratio of nominal government expenditure on defense to nominal GDP.
  26. govwb1 : Ratio of nominal government "consumption" expenditure to nominal GDP (using current local currency).
  27. govsh41 : Ratio of real government "consumption" expenditure to real GDP. (Period average).
  28. gvxdxe41 : Ratio of real government "consumption" expenditure net of spending on defense and on education to real GDP.
  29. high65 : Percentage of "higher school attained" in the total pop in 1965.
  30. highm65 : Percentage of "higher school attained" in the male pop in 1965.
  31. highf65 : Percentage of "higher school attained" in the female pop in 1965.
  32. highc65 : Percentage of "higher school complete" in the total pop.
  33. highcm65 : Percentage of "higher school complete" in the male pop.
  34. highcf65 : Percentage of "higher school complete" in the female pop.
  35. human65 : Average schooling years in the total population over age 25 in 1965.
  36. humanm65 : Average schooling years in the male population over age 25 in 1965.
  37. humanf65 : Average schooling years in the female population over age 25 in 1965.
  38. hyr65 : Average years of higher schooling in the total population over age 25.
  39. hyrm65 : Average years of higher schooling in the male population over age 25.
  40. hyrf65 : Average years of higher schooling in the female population over age 25.
  41. no65 : Percentage of "no schooling" in the total population.
  42. nom65 : Percentage of "no schooling" in the male population.
  43. nof65 : Percentage of "no schooling" in the female population.
  44. pinstab1 : Measure of political instability.
  45. pop65 : Total Population in 1965.
  46. worker65 : Ratio of total Workers to population.
  47. pop1565 : Population Proportion under 15 in 1965.
  48. pop6565 : Population Proportion over 65 in 1965.
  49. sec65 : Percentage of "secondary school attained" in the total pop in 1965.
  50. secm65 : Percentage of "secondary school attained" in male total pop in 1965.
  51. secf65 : Percentage of "secondary school attained" in female total pop in 1965.
  52. secc65 : Percentage of "secondary school complete" in the total pop in 1965.
  53. seccm65 : Percentage of "secondary school complete" in the male pop in 1965.
  54. seccf65 : Percentage of "secondary school complete" in female pop in 1965.
  55. syr65 : Average years of secondary schooling in the total population over age 25 in 1965.
  56. syrm65 : Average years of secondary schooling in the male population over age 25 in 1965.
  57. syrf65 : Average years of secondary schooling in the female population over age 25 in 1965.
  58. teapri65 : Pupil/Teacher Ratio in primary school.
  59. teasec65 : Pupil/Teacher Ratio in secondary school.
  60. ex1 : Ratio of export to GDP (in current international prices).
  61. im1 : Ratio of import to GDP (in current international prices).
  62. xr65 : Exchange rate (domestic currency per U.S. dollar) in 1965.
  63. tot1 : Terms of trade shock (growth rate of export prices minus growth rate of import prices).

Importing the necessary libraries and overview of the dataset¶

In [ ]:
#Import required libraries
import numpy as np
import pandas as pd
import warnings
warnings.filterwarnings("ignore")
In [1]:
from google.colab import drive
drive.mount('/content/drive')
Mounted at /content/drive
In [ ]:
# Load data
df = pd.read_csv('growth.csv')
In [ ]:
# See variables in the dataset
df.head()
Out[ ]:
Outcome intercept gdpsh465 bmp1l freeop freetar h65 hm65 hf65 p65 ... seccf65 syr65 syrm65 syrf65 teapri65 teasec65 ex1 im1 xr65 tot1
0 -0.024336 1 6.591674 0.2837 0.153491 0.043888 0.007 0.013 0.001 0.29 ... 0.04 0.033 0.057 0.010 47.6 17.3 0.0729 0.0667 0.348 -0.014727
1 0.100473 1 6.829794 0.6141 0.313509 0.061827 0.019 0.032 0.007 0.91 ... 0.64 0.173 0.274 0.067 57.1 18.0 0.0940 0.1438 0.525 0.005750
2 0.067051 1 8.895082 0.0000 0.204244 0.009186 0.260 0.325 0.201 1.00 ... 18.14 2.573 2.478 2.667 26.5 20.7 0.1741 0.1750 1.082 -0.010040
3 0.064089 1 7.565275 0.1997 0.248714 0.036270 0.061 0.070 0.051 1.00 ... 2.63 0.438 0.453 0.424 27.8 22.7 0.1265 0.1496 6.625 -0.002195
4 0.027930 1 7.162397 0.1740 0.299252 0.037367 0.017 0.027 0.007 0.82 ... 2.11 0.257 0.287 0.229 34.5 17.6 0.1211 0.1308 2.500 0.003283

5 rows × 63 columns

In [ ]:
# Dimensions of the dataset
df.shape
Out[ ]:
(90, 63)
  • In this segment, we provide an empirical example of using partialling-out with Lasso to estimate the regression coefficient $β_1$ in a high-dimensional linear regression model. For this inference question, we can write Y as

Y = $β_1 D$ + $β_2' W$ + ε.

  • Specifically, we are interested in how the rates at which economies of different countries grow, denoted by Y, are related to the initial wealth levels in each country, denoted by D, controlling for a country's institutional, educational, and other similar characteristics, denoted by W.

  • The relationship is captured by the regression coefficient $β_1$.

  • In this example, this coefficient is called the "speed of convergence/divergence", as it measures the speed at which poor countries catch up with ($β_1$<0) or fall behind ($β_1$>0) wealthy countries, controlling for W.

  • Our inference question here is: Do poor countries grow faster than rich countries, controlling for educational and other characteristics? In other words, is the speed of convergence negative: $β_1$<0?

  • This is the Convergence Hypothesis predicted by the Solow Growth Model. Robert M. Solow is a world-renowned MIT economist who won the Nobel Prize in Economics.

  • The dataset contains 90 countries and about 60 controls. Thus p ≈ 60, n = 90, and p/n is not small. This means that we operate in a high-dimensional setting.

  • Therefore, we expect the least squares method to provide a poor, very noisy estimate of $β_1$.

  • In contrast, we expect the method based on partialling-out with Lasso to provide a high quality estimate of $β_1$.
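The partialling-out idea described above can be sketched with scikit-learn on synthetic data. This is a simplified illustration of the procedure, not the notebook's exact implementation (which uses statsmodels below); the data-generating process and the fixed penalty are assumptions chosen for the sketch:

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression

rng = np.random.default_rng(0)
n, p = 90, 60
W = rng.normal(size=(n, p))                               # controls
D = W[:, 0] + rng.normal(size=n)                          # treatment, correlated with controls
Y = -0.05 * D + W[:, 0] + rng.normal(scale=0.1, size=n)   # outcome with true beta_1 = -0.05

# Step 1: residualize Y and D on the controls W using Lasso
rY = Y - Lasso(alpha=0.1).fit(W, Y).predict(W)
rD = D - Lasso(alpha=0.1).fit(W, D).predict(W)

# Step 2: regress the Y-residuals on the D-residuals to estimate beta_1
beta1 = LinearRegression().fit(rD.reshape(-1, 1), rY).coef_[0]
print(beta1)
```

The Lasso residualization removes the part of Y and D explained by the controls; the final one-dimensional regression then recovers the partial effect of D on Y.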

In [ ]:
# Extract the names of control and treatment variables from varnames
xnames = df.columns[3:] # names of X variables
dandxnames = df.columns[2:] # names of D and X variables
print(xnames)
print(dandxnames)
Index(['bmp1l', 'freeop', 'freetar', 'h65', 'hm65', 'hf65', 'p65', 'pm65',
       'pf65', 's65', 'sm65', 'sf65', 'fert65', 'mort65', 'lifee065', 'gpop1',
       'fert1', 'mort1', 'invsh41', 'geetot1', 'geerec1', 'gde1', 'govwb1',
       'govsh41', 'gvxdxe41', 'high65', 'highm65', 'highf65', 'highc65',
       'highcm65', 'highcf65', 'human65', 'humanm65', 'humanf65', 'hyr65',
       'hyrm65', 'hyrf65', 'no65', 'nom65', 'nof65', 'pinstab1', 'pop65',
       'worker65', 'pop1565', 'pop6565', 'sec65', 'secm65', 'secf65', 'secc65',
       'seccm65', 'seccf65', 'syr65', 'syrm65', 'syrf65', 'teapri65',
       'teasec65', 'ex1', 'im1', 'xr65', 'tot1'],
      dtype='object')
Index(['gdpsh465', 'bmp1l', 'freeop', 'freetar', 'h65', 'hm65', 'hf65', 'p65',
       'pm65', 'pf65', 's65', 'sm65', 'sf65', 'fert65', 'mort65', 'lifee065',
       'gpop1', 'fert1', 'mort1', 'invsh41', 'geetot1', 'geerec1', 'gde1',
       'govwb1', 'govsh41', 'gvxdxe41', 'high65', 'highm65', 'highf65',
       'highc65', 'highcm65', 'highcf65', 'human65', 'humanm65', 'humanf65',
       'hyr65', 'hyrm65', 'hyrf65', 'no65', 'nom65', 'nof65', 'pinstab1',
       'pop65', 'worker65', 'pop1565', 'pop6565', 'sec65', 'secm65', 'secf65',
       'secc65', 'seccm65', 'seccf65', 'syr65', 'syrm65', 'syrf65', 'teapri65',
       'teasec65', 'ex1', 'im1', 'xr65', 'tot1'],
      dtype='object')

Barro-Lee GrowthData¶

  • The outcome (Y) is the realized annual growth rate of a country's wealth (Gross Domestic Product per capita).

  • The target regressor (D) is the initial level of the country's wealth.

  • The target parameter $β_1$ is the speed of convergence, which measures the speed at which poor countries catch up with rich countries.

  • The controls (W) include measures of education levels, quality of institutions, trade openness, and political stability in the country.

In [ ]:
from sklearn.linear_model import LinearRegression # import linear regression from scikit-learn
from sklearn import metrics
from statsmodels.regression.linear_model import OLS #import Ordinary Least Squares from statsmodels
import statsmodels.api as sm

Y = df['Outcome'] # target variable
D = df['gdpsh465'] # target regressors
D_X = df[dandxnames] # control variables with target regressor
X = df[xnames] # control variables
D_X = sm.add_constant(D_X) # adding constants for intercept to control variables with target regressor
X = sm.add_constant(X) # adding constants for intercept to control variables
model = sm.OLS(Y, D_X) # OLS model object
results = model.fit() # training the model
results.summary() # summary of the trained model
Out[ ]:
OLS Regression Results
Dep. Variable: Outcome R-squared: 0.887
Model: OLS Adj. R-squared: 0.641
Method: Least Squares F-statistic: 3.607
Date: Sat, 07 May 2022 Prob (F-statistic): 0.000200
Time: 08:45:19 Log-Likelihood: 238.24
No. Observations: 90 AIC: -352.5
Df Residuals: 28 BIC: -197.5
Df Model: 61
Covariance Type: nonrobust
coef std err t P>|t| [0.025 0.975]
const 0.2472 0.785 0.315 0.755 -1.360 1.854
gdpsh465 -0.0094 0.030 -0.314 0.756 -0.071 0.052
bmp1l -0.0689 0.033 -2.117 0.043 -0.135 -0.002
freeop 0.0801 0.208 0.385 0.703 -0.346 0.506
freetar -0.4890 0.418 -1.169 0.252 -1.346 0.368
h65 -2.3621 0.857 -2.755 0.010 -4.118 -0.606
hm65 0.7071 0.523 1.352 0.187 -0.364 1.779
hf65 1.6934 0.503 3.365 0.002 0.663 2.724
p65 0.2655 0.164 1.616 0.117 -0.071 0.602
pm65 0.1370 0.151 0.906 0.373 -0.173 0.447
pf65 -0.3313 0.165 -2.006 0.055 -0.670 0.007
s65 0.0391 0.186 0.211 0.835 -0.341 0.419
sm65 -0.0307 0.117 -0.263 0.795 -0.270 0.209
sf65 -0.1799 0.118 -1.523 0.139 -0.422 0.062
fert65 0.0069 0.027 0.254 0.801 -0.049 0.062
mort65 -0.2335 0.817 -0.286 0.777 -1.908 1.441
lifee065 -0.0149 0.193 -0.077 0.939 -0.411 0.381
gpop1 0.9702 1.812 0.535 0.597 -2.742 4.682
fert1 0.0088 0.035 0.252 0.803 -0.063 0.081
mort1 0.0666 0.685 0.097 0.923 -1.336 1.469
invsh41 0.0745 0.108 0.687 0.498 -0.148 0.297
geetot1 -0.7151 1.680 -0.426 0.674 -4.157 2.726
geerec1 0.6300 2.447 0.257 0.799 -4.383 5.643
gde1 -0.4436 1.671 -0.265 0.793 -3.867 2.980
govwb1 0.3375 0.438 0.770 0.447 -0.560 1.235
govsh41 0.4632 1.925 0.241 0.812 -3.481 4.407
gvxdxe41 -0.7934 2.059 -0.385 0.703 -5.012 3.425
high65 -0.7525 0.906 -0.831 0.413 -2.608 1.103
highm65 -0.3903 0.681 -0.573 0.571 -1.786 1.005
highf65 -0.4177 0.561 -0.744 0.463 -1.568 0.732
highc65 -2.2158 1.481 -1.496 0.146 -5.249 0.818
highcm65 0.2797 0.658 0.425 0.674 -1.069 1.628
highcf65 0.3921 0.766 0.512 0.613 -1.177 1.961
human65 2.3373 3.307 0.707 0.486 -4.437 9.112
humanm65 -1.2092 1.619 -0.747 0.461 -4.525 2.106
humanf65 -1.1039 1.685 -0.655 0.518 -4.555 2.347
hyr65 54.9139 23.887 2.299 0.029 5.983 103.845
hyrm65 12.9350 23.171 0.558 0.581 -34.529 60.400
hyrf65 9.0926 17.670 0.515 0.611 -27.102 45.287
no65 0.0372 0.132 0.282 0.780 -0.233 0.308
nom65 -0.0212 0.065 -0.326 0.747 -0.154 0.112
nof65 -0.0169 0.067 -0.252 0.803 -0.154 0.120
pinstab1 -0.0500 0.031 -1.616 0.117 -0.113 0.013
pop65 1.032e-07 1.32e-07 0.783 0.440 -1.67e-07 3.73e-07
worker65 0.0341 0.156 0.218 0.829 -0.286 0.354
pop1565 -0.4655 0.471 -0.988 0.332 -1.431 0.500
pop6565 -1.3575 0.635 -2.138 0.041 -2.658 -0.057
sec65 -0.0109 0.308 -0.035 0.972 -0.641 0.619
secm65 0.0033 0.151 0.022 0.983 -0.306 0.313
secf65 -0.0023 0.158 -0.015 0.988 -0.326 0.321
secc65 -0.4915 0.729 -0.674 0.506 -1.985 1.002
seccm65 0.2596 0.356 0.730 0.471 -0.469 0.988
seccf65 0.2207 0.373 0.591 0.559 -0.544 0.985
syr65 -0.7556 7.977 -0.095 0.925 -17.095 15.584
syrm65 0.3109 3.897 0.080 0.937 -7.671 8.293
syrf65 0.7593 4.111 0.185 0.855 -7.661 9.180
teapri65 3.955e-05 0.001 0.051 0.959 -0.002 0.002
teasec65 0.0002 0.001 0.213 0.833 -0.002 0.003
ex1 -0.5804 0.242 -2.400 0.023 -1.076 -0.085
im1 0.5914 0.250 2.363 0.025 0.079 1.104
xr65 -0.0001 5.42e-05 -1.916 0.066 -0.000 7.18e-06
tot1 -0.1279 0.113 -1.136 0.266 -0.359 0.103
Omnibus: 0.439 Durbin-Watson: 1.982
Prob(Omnibus): 0.803 Jarque-Bera (JB): 0.417
Skew: 0.158 Prob(JB): 0.812
Kurtosis: 2.896 Cond. No. 7.52e+08


Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 7.52e+08. This might indicate that there are
strong multicollinearity or other numerical problems.
  • The R² of 0.887 looks good at first glance, but with roughly 60 regressors and only 90 observations it largely reflects in-sample overfitting.
  • The much lower adjusted R² of 0.641 makes sense: with this many features, the model is partly capturing noise in the dataset.
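The gap between R² and adjusted R² follows directly from the adjustment formula. A quick check with the numbers reported in the summary above (n = 90 observations, 61 model degrees of freedom):

```python
n, k = 90, 61  # observations and model degrees of freedom (excluding the intercept)
r2 = 0.887     # R-squared from the OLS summary

# Adjusted R-squared penalizes for the number of regressors
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
print(round(adj_r2, 3))  # matches the 0.641 adjusted R-squared in the summary
```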
In [ ]:
# Lasso model with target and control variables
model = sm.OLS(Y, X) # OLS model object
results = model.fit() # training the model
results_Y = model.fit_regularized(alpha = 0.002, L1_wt=1.0,start_params=results.params)
# alpha = penalty weight (regularization strength), L1_wt = 1 represents lasso


final = sm.regression.linear_model.OLSResults(model,
                                              results_Y.params,
                                              model.normalized_cov_params)
r_Y = final.resid # find the residuals
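The fixed penalty alpha = 0.002 above is a choice; in practice the penalty level is often tuned by cross-validation instead. A hedged sketch using scikit-learn's LassoCV on synthetic data (the data and settings here are illustrative assumptions, not the notebook's procedure):

```python
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(1)
X = rng.normal(size=(90, 60))                        # synthetic design matrix
y = 0.5 * X[:, 0] + rng.normal(scale=0.1, size=90)   # outcome driven by one regressor

# LassoCV searches a grid of penalty levels by k-fold cross-validation
model = LassoCV(cv=5).fit(X, y)
print(model.alpha_)  # the data-driven penalty level
```

A cross-validated penalty adapts to the noise level in the data rather than being fixed in advance.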
In [ ]:
# Lasso model with target regressor and control variables
model = sm.OLS(D, X)
results = model.fit()
results_D = model.fit_regularized(alpha = 0.002, L1_wt=1.0,start_params=results.params)
final = sm.regression.linear_model.OLSResults(model,
                                              results_D.params,
                                              model.normalized_cov_params)
r_D = final.resid # find the residuals
In [ ]:
# Linear model between the residuals of the lasso models created above
Y = r_Y # target variable
X = r_D
X = sm.add_constant(X)
model = sm.OLS(Y, X)
results = model.fit() # train the model
print(results.summary())
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                      y   R-squared:                       0.045
Model:                            OLS   Adj. R-squared:                  0.034
Method:                 Least Squares   F-statistic:                     4.143
Date:                Sat, 07 May 2022   Prob (F-statistic):             0.0448
Time:                        08:45:20   Log-Likelihood:                 128.89
No. Observations:                  90   AIC:                            -253.8
Df Residuals:                      88   BIC:                            -248.8
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const          0.0011      0.006      0.177      0.860      -0.011       0.013
x1            -0.0431      0.021     -2.035      0.045      -0.085      -0.001
==============================================================================
Omnibus:                        2.328   Durbin-Watson:                   1.681
Prob(Omnibus):                  0.312   Jarque-Bera (JB):                2.020
Skew:                           0.250   Prob(JB):                        0.364
Kurtosis:                       2.463   Cond. No.                         3.44
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
Method                       Estimate   Standard Error   95% Confidence Interval
Least squares                 -0.0094            0.030         [-0.071,  0.052]
Partialling-out via lasso     -0.0431            0.021         [-0.085, -0.001]
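The 95% confidence interval in the comparison above follows from the estimate, its standard error, and the t critical value at the residual degrees of freedom reported in the summary (88):

```python
from scipy import stats

est, se, dof = -0.0431, 0.021, 88   # estimate, std err, residual df from the summary
t_crit = stats.t.ppf(0.975, dof)    # two-sided 95% critical value
lower, upper = est - t_crit * se, est + t_crit * se
print(round(lower, 3), round(upper, 3))  # reproduces the interval in the table
```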
  • As expected, least squares provides a rather noisy estimate of the speed of convergence, and does not allow us to answer the question about the convergence hypothesis.

  • In sharp contrast, partialling-out via Lasso provides a more precise estimate.

  • The lasso-based point estimate is -0.043, and the 95% confidence interval for the (annual) rate of convergence is [-0.085, -0.001].

  • This empirical evidence does support the convergence hypothesis.

Conclusions¶

  • In this segment, we have examined an empirical example in the high-dimensional setting.

  • Least squares yields a very noisy estimate of the target regression coefficient and does not allow us to answer an important empirical question.

  • Lasso does yield a precise estimate of the regression coefficient and does allow us to answer that question.

  • We have found significant empirical evidence supporting the convergence hypothesis of Solow.

In [ ]:
# Convert notebook to html
!jupyter nbconvert --to html "/content/drive/My Drive/Colab Notebooks/Copy of FDS_Project_LearnerNotebook_FullCode.ipynb"