Alternative descriptions of the model
The normal linear model describes the distribution of Y for any value of X:
Y ~ normal (μy , σy)
where
μy = β0 + β1x
σy = σ
An equivalent way to write the same model is:
y = β0 + β1x + ε
where ε is called the model error and has a distribution
ε ~ normal (0 , σ)
The error, ε, for a data point is the signed vertical distance from the regression line to the point's cross on a scatterplot: positive when the cross lies above the line and negative when it lies below.
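The equivalence of the two descriptions can be checked by simulation. The sketch below is only an illustration: the parameter values (β0 = 2, β1 = 0.5, σ = 1.5), the value x = 4 and the function names are hypothetical choices, not taken from any data set. It draws responses once directly from normal(β0 + β1x, σ) and once by adding a normal(0, σ) error to the line, then compares the means and standard deviations.

```python
import random
import statistics

# Hypothetical parameter values, chosen only for illustration.
BETA0, BETA1, SIGMA = 2.0, 0.5, 1.5


def simulate_direct(x, rng):
    """Draw Y directly from normal(mean = beta0 + beta1*x, sd = sigma)."""
    return rng.gauss(BETA0 + BETA1 * x, SIGMA)


def simulate_with_error(x, rng):
    """Draw an error from normal(0, sigma) and add it to the line."""
    epsilon = rng.gauss(0.0, SIGMA)
    return BETA0 + BETA1 * x + epsilon


rng = random.Random(1)  # fixed seed so the run is repeatable
x = 4.0                 # an arbitrary value of the explanatory variable

direct = [simulate_direct(x, rng) for _ in range(20000)]
with_error = [simulate_with_error(x, rng) for _ in range(20000)]

# Both pairs should be close to beta0 + beta1*x = 4.0 and sigma = 1.5.
print(statistics.mean(direct), statistics.stdev(direct))
print(statistics.mean(with_error), statistics.stdev(with_error))
```

Both samples should have a mean near β0 + β1x and a standard deviation near σ, which is what the two ways of writing the model have in common.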
Band containing about 95% of values
By the 70-95-100 rule of thumb, about 95% of the errors will be within 2 standard deviations of zero, i.e. between −2σ and +2σ.
Since the errors are vertical distances from the data points to the regression line, a band extending 2σ above and below the line should contain about 95% of the crosses on a scatterplot of the data.
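The band can also be checked numerically. The following sketch assumes the same hypothetical parameter values as an illustration (β0 = 2, β1 = 0.5, σ = 1.5) and x-values spread evenly at random between 0 and 10; none of these come from real data. It simulates data from the model and counts the proportion of points whose vertical distance from the line is at most 2σ.

```python
import random

# Hypothetical parameter values, chosen only for illustration.
BETA0, BETA1, SIGMA = 2.0, 0.5, 1.5

rng = random.Random(2)  # fixed seed so the run is repeatable

# Simulate 10,000 data points from the normal linear model.
xs = [rng.uniform(0, 10) for _ in range(10000)]
ys = [BETA0 + BETA1 * x + rng.gauss(0.0, SIGMA) for x in xs]


def in_band(x, y):
    """True if the point lies within 2*sigma (vertically) of the line."""
    error = y - (BETA0 + BETA1 * x)
    return abs(error) <= 2 * SIGMA


# Proportion of simulated points inside the band.
coverage = sum(in_band(x, y) for x, y in zip(xs, ys)) / len(xs)
print(coverage)
```

The printed proportion should be close to 0.95. Note that this uses the true line and the true σ; with real data both would have to be estimated, so the proportion in the band would only be approximately 95%.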