Fun with datasets and Shiny in R.
I am part of a Mayfield Lab group project that will use the US Social Security Baby Names dataset, which I did not even know existed. So what do I do? Get very distracted by an awesome, large dataset. It contains the first names of all male and female infants recorded at birth by the US Social Security Administration from 1880 to 2015. Names with fewer than 5 occurrences in a year are excluded to protect those with unique names.
There have been 95,025 distinct names over the past 136 years, and 9,026 of them have been used for both males and females. Click each image for a larger PDF.
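For anyone who wants to reproduce counts like these, here is a minimal sketch in base R. It assumes the data has been read into one data frame with columns `year`, `name`, `sex` and `n`. Those column names, and the tiny toy table itself, are made up for illustration and are not the real dataset.

```r
# Toy stand-in for the full dataset; rows and column names are made up.
babynames <- data.frame(
  year = c(1880, 1880, 1880, 1881, 1881, 1881),
  name = c("Mary", "John", "Jessie", "Mary", "Jessie", "John"),
  sex  = c("F", "M", "F", "F", "M", "M"),
  n    = c(100, 90, 10, 110, 5, 95),
  stringsAsFactors = FALSE
)

# Number of distinct names across all years and both sexes.
n_distinct_names <- length(unique(babynames$name))

# For each name, count how many different sexes it was recorded with.
sexes_per_name <- tapply(babynames$sex, babynames$name,
                         function(s) length(unique(s)))

# Names recorded for both males and females.
both_sexes <- names(sexes_per_name)[sexes_per_name == 2]
n_both <- length(both_sexes)
```

On the toy table this leaves three distinct names, one of which ("Jessie") appears for both sexes.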
It appears that although there have been more births in the US over time, the number of births adjusted for the number of women between 15 and 44 years old decreases into the late 20th Century.
I was surprised to see such a drastic shift to nearly 75% of births being female in the late 19th and early 20th centuries. I am still trying to figure out the cause (whether a real event or an artifact of poor data). In 1902 the US Congress established the Bureau of the Census, whose duties included the registration of births, and in 1946 the collection of population vital statistics was moved to the US Public Health Service. This increase in regulation may be the cause of the equalizing shown in the figure in the 1920s. Before then, parents may have chosen not to register a birth (although that still does not explain a heavier proportion of female births).
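The yearly share of female births can be computed roughly as follows. This is a sketch under the same assumptions as before: a data frame with hypothetical columns `year`, `sex` and `n`, filled here with made-up numbers chosen only to make the arithmetic easy to follow.

```r
# Toy data: several names per year and sex, with made-up counts.
births <- data.frame(
  year = c(1900, 1900, 1900, 1950, 1950, 1950),
  sex  = c("F", "F", "M", "F", "M", "M"),
  n    = c(200, 100, 100, 500, 300, 200),
  stringsAsFactors = FALSE
)

# Sum the counts into a year-by-sex table of total births.
totals <- tapply(births$n, list(births$year, births$sex), sum)

# Share of each year's recorded births that were female.
prop_female <- totals[, "F"] / rowSums(totals)
```

With these toy numbers, 1900 comes out at 0.75 and 1950 at 0.5, which is the kind of gap the figure shows between the early years and the 1920s onward.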
According to Figs 3-5, it appears that uncommon names are becoming more prevalent. Fewer infants share first names in the US now than before the 1920s.
What about your name?
You have survived the luridly colored figures. Now test out the app that I made with Shiny in R**. Type in a name and see a plot of the number of births over time (you can choose Male or Female births, or select Both to show both sexes on the graph). Then you can hover over the graph to find a specific year and search for that year in the output table to find the exact number of births.
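For readers curious how an app like this fits together, here is a minimal sketch of the general shape. It is not the code behind the app linked below. The helper `name_counts()`, the wrapper `make_app()`, the input IDs, the column names and the toy data are all hypothetical, and the hover feature is left out to keep it short.

```r
# Toy stand-in for the full dataset; rows and column names are made up.
toy <- data.frame(
  year = c(1880, 1881, 1880, 1881, 1880, 1881),
  name = c("Mary", "Mary", "John", "John", "Jessie", "Jessie"),
  sex  = c("F", "F", "M", "M", "F", "M"),
  n    = c(100, 110, 90, 95, 10, 5),
  stringsAsFactors = FALSE
)

# Plain R helper: rows for one name, optionally limited to one sex.
name_counts <- function(data, name, sex = c("Both", "Female", "Male")) {
  sex <- match.arg(sex)
  keep <- tolower(data$name) == tolower(trimws(name))
  if (sex != "Both") keep <- keep & data$sex == substr(sex, 1, 1)
  out <- data[keep, c("year", "sex", "n")]
  out[order(out$year, out$sex), ]
}

# Build the app; Shiny is only needed when this function is called.
make_app <- function(data) {
  ui <- shiny::fluidPage(
    shiny::textInput("name", "First name", value = "Mary"),
    shiny::radioButtons("sex", "Births to show",
                        choices = c("Both", "Female", "Male")),
    shiny::plotOutput("trend"),
    shiny::tableOutput("counts")
  )
  server <- function(input, output, session) {
    # Recomputed whenever the name or the sex selection changes.
    selected <- shiny::reactive(name_counts(data, input$name, input$sex))
    output$trend <- shiny::renderPlot({
      d <- selected()
      shiny::validate(shiny::need(nrow(d) > 0, "No births recorded for that name."))
      plot(d$year, d$n, pch = 19,
           col = ifelse(d$sex == "F", "red", "blue"),
           xlab = "Year", ylab = "Number of births")
    })
    output$counts <- shiny::renderTable(selected())
  }
  shiny::shinyApp(ui, server)
}

# To try it in an interactive session: shiny::runApp(make_app(toy))
```

Keeping the filtering in a plain function means it can be checked at the console without starting the app.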
Find Your Name App*** #
** This is my first Shiny App. Let me know if you have any major issues...
*** The app will take a long time to initially load because it is a large dataset (over 90,000 names for 2 sexes over 136 years - 185,689 entries). It also may need to be reloaded from the server and may crash. See ** comment above. :)
# 07.07.2016 - The app has been crashing quite a bit and is probably not usable currently. I need to figure out how to make the large dataset easier to query.
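One possible way to make lookups cheaper is to index the table by name once, so that each search is a keyed lookup rather than a scan over every row. The sketch below shows the idea on a made-up toy table. The column names and the `lookup_name()` helper are hypothetical, and it has not been tried against the full dataset or the crashing app.

```r
# Toy stand-in for the full table; rows and column names are made up.
babynames <- data.frame(
  year = c(1880, 1880, 1881, 1881),
  name = c("Mary", "John", "Mary", "John"),
  sex  = c("F", "M", "F", "M"),
  n    = c(100, 90, 110, 95),
  stringsAsFactors = FALSE
)

# Do the expensive work once: split the rows by name...
by_name <- split(babynames[c("year", "sex", "n")], babynames$name)

# ...and store the pieces in a hashed environment keyed by name.
name_index <- list2env(by_name, hash = TRUE, parent = emptyenv())

# Each query is now one keyed lookup; NULL means the name is absent.
lookup_name <- function(name, index = name_index) {
  get0(name, envir = index, inherits = FALSE)
}

mary <- lookup_name("Mary")
```

In a Shiny app the index would be built outside the server function, so it is created once per R process rather than once per visitor.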
Check out these sites that used the same database, R and Shiny but had more luck than I did.