Because it is a dozen... today in the metric world it would default to 10.
It isn't a magic number in any sense of the word, since 12 doesn't come into play when using a fixed radius with either the distance band or the minimum number of points.
Any rationale is largely absent from the original literature. Nor is there any discussion in the "tools" in whatever software about important issues such as the spread of the points about the 'core' cell (the one being calculated). Often, using the default of 12 points in a sparsely populated point pattern will pull in extrema that aren't even close to the core cell and might belong to a different population group.
Consider 6 points on the top of a hill... then being forced to pull in 6 more points from the bottom of the hill to satisfy the 12 requirement... How do you think the interpolation will go for new locations on top or on the bottom of the hill?
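To put rough numbers on the hill example, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the coordinates and elevations are made up, `k_nearest` and `idw` are hypothetical helpers rather than any software's actual tool, and the weighting is a generic inverse-distance scheme with a power of 2. It compares how far the search has to reach, and what estimate comes back, when a hilltop location uses 6 neighbours versus 12.

```python
import math

def k_nearest(points, query, k):
    """Return the k samples closest to query as (distance, value) pairs, nearest first."""
    dists = [(math.dist((x, y), query), z) for x, y, z in points]
    return sorted(dists)[:k]

def idw(points, query, k, power=2.0):
    """Inverse-distance-weighted estimate at query from its k nearest samples."""
    neighbours = k_nearest(points, query, k)
    for d, z in neighbours:
        if d == 0:
            return z  # query coincides with a sample, so use its value directly
    weights = [1.0 / d ** power for d, _ in neighbours]
    return sum(w * z for w, (_, z) in zip(weights, neighbours)) / sum(weights)

# Made-up samples as (x, y, elevation): 6 on the hilltop, 6 at the bottom of the hill.
hilltop = [(0, 0, 100), (1, 0, 98), (0, 1, 99), (1, 1, 97), (2, 1, 96), (1, 2, 95)]
bottom = [(20, 20, 10), (21, 20, 12), (20, 21, 11), (21, 21, 9), (22, 21, 8), (21, 22, 10)]
samples = hilltop + bottom

query = (0.5, 0.5)  # a new location on top of the hill

for k in (6, 12):
    reach = k_nearest(samples, query, k)[-1][0]  # distance to the farthest neighbour used
    print(f"k={k}: farthest neighbour at {reach:.1f}, estimate {idw(samples, query, k):.2f}")
```

With 6 neighbours the search stays on the hilltop; with 12 it has to reach all the way to the bottom of the hill, and those values drag the hilltop estimate down. With a power of 2 the drag is small in this particular layout, but a lower power or a sparser pattern gives the far points more say, which is exactly the spread issue the tools don't discuss.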
There are more important things than 12. So why 12? because