Load the Boston Housing data set (Harrison & Rubinfeld, 1978). It contains 506 examples of housing values in suburbs of Boston, each with 13 continuous attributes and 1 binary attribute.
The data contains the following columns:
| Feature | Description |
| --- | --- |
| CRIM | per capita crime rate by town |
| ZN | proportion of residential land zoned for lots over 25,000 sq.ft. |
| INDUS | proportion of non-retail business acres per town |
| CHAS | Charles River dummy variable (1 if tract bounds river; 0 otherwise) |
| NOX | nitric oxides concentration (parts per 10 million) |
| RM | average number of rooms per dwelling |
| AGE | proportion of owner-occupied units built prior to 1940 |
| DIS | weighted distances to five Boston employment centres |
| RAD | index of accessibility to radial highways |
| TAX | full-value property-tax rate per $10,000 |
| PTRATIO | pupil-teacher ratio by town |
| B | 1000(Bk - 0.63)^2, where Bk is the proportion of Black residents by town |
| LSTAT | % lower status of the population |
| MEDV | median value of owner-occupied homes in $1000’s |
: str. Path to the directory that either already stores the file or into which the file will be downloaded and extracted. The filename is housing.data.
Tuple of np.ndarray x_train and dictionary metadata of column headers (feature names).
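As a rough illustration of the return value described above, the following sketch reads a local copy of housing.data into an array and a metadata dictionary. The helper name `load_housing`, the `"columns"` metadata key, and the assumption that the file is whitespace-delimited with 14 columns in the order of the table are all illustrative choices, not the documented interface; the sketch also does not download anything.

```python
import os

import numpy as np

# Column names in file order, taken from the column table above.
COLUMNS = [
    "CRIM", "ZN", "INDUS", "CHAS", "NOX", "RM", "AGE",
    "DIS", "RAD", "TAX", "PTRATIO", "B", "LSTAT", "MEDV",
]


def load_housing(path):
    """Hypothetical helper: read `housing.data` from the directory `path`.

    Assumes the file already exists there and is whitespace-delimited,
    with one row per example and 14 numeric columns.
    """
    filename = os.path.join(os.path.expanduser(path), "housing.data")
    # np.loadtxt splits on any whitespace by default.
    x_train = np.atleast_2d(np.loadtxt(filename))
    if x_train.shape[1] != len(COLUMNS):
        raise ValueError(
            "expected %d columns, found %d" % (len(COLUMNS), x_train.shape[1])
        )
    # The "columns" key is an assumption made for this sketch.
    metadata = {"columns": list(COLUMNS)}
    return x_train, metadata
```

With such a pair in hand, the column headers can be used to separate the target from the features, for example `target = x_train[:, metadata["columns"].index("MEDV")]`.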
Harrison, D., & Rubinfeld, D. L. (1978). Hedonic housing prices and the demand for clean air. Journal of Environmental Economics and Management, 5(1), 81–102.