Census Demographic Data Links
https://www.census.gov/quickfacts/fact/faq/grandviewcitymissouri,kansascitycitymissouri/PST045219#1 - QuickFacts in a nice visualizer
https://data.census.gov/cedsci/all?q=grandview%20mo - age + gender in Grandview, MO
https://data.census.gov/cedsci/table?q=grandview%20mo&tid=ACSST5Y2018.S0601&hidePreview=false - “Selected Characteristics of the Total and Native Populations in the United States”
Dr. Wang’s Feedback: Demographic data is useful for answering the “Who” part of our profile, but we need enough datasets to speak to each detail point so that we have enough data for future parts of the assignment. In other words, we need data covering not only the who, but also the what/where/when/why, etc. (she suggested information about allergies, nutritional facts, home addresses and where people live, and grocery store locations via business license addresses). She said we should write about 200 words per data set to describe them for assignment 2.
Allergy-Related Datasets:
- Gupta, R. S., Warren, C. M., Smith, B. M., Jiang, J., Blumenstock, J. A., Davis, M. M., Schleimer, R. P., & Nadeau, K. C. (2019). Prevalence and Severity of Food Allergies Among US Adults. JAMA Network Open, 2(1), e185630. https://doi.org/10.1001/jamanetworkopen.2018.5630
- “State-By-State Data for Food Allergy Chartbook.” p. 27. Food Allergy Research & Education, Food Allergy Research & Education, 2018, www.foodallergy.org/resources/state-state-data-food-allergy
Note: I have included two different example sources of data on allergies here because most medical research data sets are not widely available, either because they are privately funded or because they contain sensitive demographic information that raises HIPAA concerns. If we were to obtain funding and develop this app, we would be able to pay for the data sets behind research such as the above examples.
Example 1 is a cross-sectional medical survey designed to collate self-reported allergy survey responses with positive clinical diagnoses of allergy supported by laboratory testing. The aim of the study was to estimate the prevalence (occurrence) of allergies in the United States based on a representative sample and small-area estimation. This data includes deidentified survey participants 18+ years of age who were able to complete surveys in English or Spanish, and a variety of methodologies were employed in generating the participant pool in order to create a representative sample. This data set records the occurrence of broadly categorized allergies in the population nationally, using an extended version of the national child food allergy survey which includes a section for severity of reaction. Surveys were administered at the University of Chicago from October 9, 2015 to September 18, 2016; however, the majority of the study authors are professionals associated with the Northwestern University Feinberg School of Medicine (Chicago). Per the abstract, the study was conducted because “Food allergy is a costly, potentially life-threatening condition. Although studies have examined the prevalence of childhood food allergy, little is known about prevalence, severity, or health care utilization related to food allergies among US adults.” Similar data sets could likely be sourced from local universities’ and teaching hospitals’ public health labs, depending on study availability.
Example 2 is a data overview drawn from a data set of private insurance claims. Per the website, “The Chartbook relies on analysis of FAIR Health’s proprietary in-house FAIR Health National Private Insurance Claims (FH NPIC®) database. As the nation’s largest collection of private healthcare claims data, FH NPIC® contains over 27 billion billed medical and dental procedures in claim records contributed by payors and administrators who insure or process claims for private insurance plans covering more than 150 million individuals.” This data set would include claims data for medical services billed to customers of the contributing payors, collected for record-keeping purposes, from 2007 to 2016. Although not statistically representative, similar data from large local providers, such as Aetna, Blue Cross/Blue Shield of Kansas City, etc., would provide local data on the prevalence of food allergies leading to medical intervention, which would give a better picture of which consumers need specialty food items or replacement ingredients, and where. Since this is privately financed data for a business entity, there are likely to be other biases in the data aside from sampling issues. However, in terms of scale, insurance data would likely be a great source.
Business-Related Datasets:
- “Retail Trade: Summary Statistics for the U.S., States, and Selected Geographies: 2017.” Data.census.gov, 2017, data.census.gov/cedsci/table?q=Grandview+city%2C+Missouri+Business+and+Economy.
- “Business License by Type Count.” Open Data KC, 2014, data.kcmo.org/Business/Business-License-by-Type-Count/4kja-guc3/data.
Our application is directed towards users in the Grandview, MO area. Because the app intends to help with grocery shopping, having a good idea of what sort of grocery shopping is available in the area is very important.
Dataset 1:
This dataset organizes information about all sorts of retail stores in Grandview, MO in 2017. It tells us how many grocery/food stores there are in the Grandview area, as well as the establishments’ sales, value of shipments, and revenue. In 2017, there were 8 food and beverage stores in Grandview. Unfortunately, a significant amount of data is withheld to avoid identifying specific businesses. However, the data would still be applicable to our application, as it could help users be aware of the different shopping options in their area. Generally, business census data is collected to help businesses decide where to locate, how much to produce, and how they compare to their competitors. The data is available because of the Economic Census.
Dataset 2:
This dataset lists the number of business licenses that have been granted in KCMO, by license type. Based on this dataset, we see that there are 59 Food (Health) Supplement Stores, 54 All Other Specialty Food Stores, 3 Baked Goods Stores, etc. This data was provided by the Revenue Division of KCMO Finance. As with the first dataset, this data would allow us to know the number and types of businesses that would be available to our target audience in the KC area.
https://data.census.gov/cedsci/table?q=Grandview%20city,%20Missouri%20Business%20and%20Economy&tid=ECNBASIC2017.EC1744BASIC&hidePreview=false
https://data.kcmo.org/Business/Business-License-by-Type-Count/4kja-guc3
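As a rough illustration of how the license-by-type counts could be narrowed down to food-related businesses, here is a minimal Python sketch. The column names (`business_type`, `license_count`), the keyword list, and the last sample row are assumptions made for the example and are not taken from the Open Data KC export; only the three counts come from the description above.

```python
import csv
import io

# Sample in the shape of a "license type, count" export.
# Column names are assumed; the first three counts are the ones quoted in
# the dataset description, and the last row is made up to show filtering.
SAMPLE_CSV = """business_type,license_count
Food (Health) Supplement Stores,59
All Other Specialty Food Stores,54
Baked Goods Stores,3
Example Hardware Store (made up),10
"""

# Hypothetical keywords used to decide whether a license type is food-related.
FOOD_KEYWORDS = ("food", "grocery", "baked goods", "supermarket")


def food_related_licenses(csv_text, keywords=FOOD_KEYWORDS):
    """Return {business_type: count} for rows whose type mentions a keyword."""
    result = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        name = row["business_type"]
        if any(keyword in name.lower() for keyword in keywords):
            result[name] = int(row["license_count"])
    return result


food = food_related_licenses(SAMPLE_CSV)
for name, count in sorted(food.items(), key=lambda item: -item[1]):
    print(f"{count:>4}  {name}")
print("Total food-related licenses in sample:", sum(food.values()))
```

With the real export, the sample text would be replaced by the downloaded file contents, and the keyword list would need to be checked against the actual license type names.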
The Demographic Profile contains the demographic summary of Grandview, Missouri from the United States Census Bureau for 2018. The information provided includes 100-percent data as well as margins of error for sampled items such as units in structure.
The Demographic Profile contains 100-percent topics, such as sex, age, race, Hispanic or Latino, household relationship, household type, group quarters population, housing occupancy, and housing tenure.
The sample items include sample population topics, such as household and family size, married-couple family households, male householder (no wife present) family households, female householder (no husband present) family households, and nonfamily households.
The sample items also include sample housing topics, such as units in structure, family size, age of children, unmarried partner households, selected household types, and housing tenure.
Dataset #1 Families: The total number of families in the Grandview area in 2018, with the average family size and the number of families with children under the age of 18, broken down into under 6 years only, under 6 years and 6 to 17 years, and 6 to 17 years only.
Dataset #2 Households: This dataset includes data on households in the Grandview area from the 2018 Census. It covers selected households by type: households with one or more people under 18 years, households with one or more people 60 years and over, householders living alone (including those 65 years and over), and unmarried-partner households (same sex and opposite sex). This data set also includes housing tenure: owner-occupied and renter-occupied housing units.
World Food Facts: https://www.kaggle.com/openfoodfacts/world-food-facts
This data set gives us information on different foods and all of their nutrients. It provides web links explaining what is in the foods, the serving sizes, and what people think. From this data we will be able to gain information on the foods that the user will be shopping for. We can also gain insight into other people’s opinions of the food based on the discussions section of the data source. This source is accessible 24/7 and is constantly being updated, which keeps it relevant.
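To show how food-facts data could be tied back to the allergy side of our profile, here is a minimal Python sketch that screens products against a user's allergens. The column names (`product_name`, `allergens`), the tab-separated layout, and all sample rows are assumptions for illustration; the real file's fields should be checked before relying on this.

```python
import csv
import io

# Made-up rows in the shape of a tab-separated product export.
# The column names are assumptions about the file layout.
SAMPLE_TSV = (
    "product_name\tallergens\n"
    "Peanut Butter Cookies\tpeanuts, wheat, milk\n"
    "Plain Rice Cakes\t\n"
    "Almond Milk\tnuts\n"
)


def products_without(tsv_text, user_allergens):
    """Return product names whose listed allergens do not overlap the user's.

    Note: an empty allergens field only means nothing was listed; it does
    not prove that the product is allergen-free.
    """
    avoid = {a.strip().lower() for a in user_allergens}
    safe = []
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        listed = {
            a.strip().lower()
            for a in (row["allergens"] or "").split(",")
            if a.strip()
        }
        if not (listed & avoid):
            safe.append(row["product_name"])
    return safe


print(products_without(SAMPLE_TSV, ["milk", "peanuts"]))
```

Exact string matching is a simplification: a real version would need to map related terms (for example "nuts" and "almonds") onto the same allergen.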
Recipe ingredients: This data set organizes a variety of information about different recipe ingredients, specific to Grandview, Missouri in 2016. It gives us information about some of the strongest geographic and cultural associations tied to our local region’s foods. The data set comes from a competition whose objective was to predict the category of a dish’s cuisine given a list of its ingredients. The training set contains a recipe ID, the type of cuisine, and the list of ingredients. This unique dataset was provided by Yummly and was featured on Kaggle as a practice competition. The data set gives us a huge list of different ingredients and how often each ingredient appears, which gives us a better understanding of which ingredients are used, and likely purchased, more often than others. With that understanding, we can then categorize the cuisine. The data set also helps illustrate what is being used and how often, what the more common preferences are, and how we determine what the next step is. This information is always accessible to consumers, to help them have the most enjoyable and convenient shopping/cooking experience.
Recipe Ingredients Dataset
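The ingredient-frequency idea described for the recipe data can be sketched in a few lines of Python. This assumes the training file is a JSON list of records with `id`, `cuisine`, and `ingredients` fields, matching the contents described above; the three sample recipes are made up for illustration.

```python
import json
from collections import Counter

# Made-up records in the assumed shape: recipe ID, cuisine, ingredient list.
SAMPLE_JSON = """
[
  {"id": 1, "cuisine": "italian", "ingredients": ["olive oil", "garlic", "tomatoes"]},
  {"id": 2, "cuisine": "mexican", "ingredients": ["garlic", "onions", "tomatoes"]},
  {"id": 3, "cuisine": "italian", "ingredients": ["olive oil", "garlic", "basil"]}
]
"""


def ingredient_counts(recipes):
    """Count how many recipes each ingredient appears in."""
    counts = Counter()
    for recipe in recipes:
        # set() so an ingredient repeated within one recipe is counted once
        counts.update(set(recipe["ingredients"]))
    return counts


def ingredient_counts_by_cuisine(recipes):
    """Return {cuisine: Counter of ingredient -> number of recipes}."""
    by_cuisine = {}
    for recipe in recipes:
        counter = by_cuisine.setdefault(recipe["cuisine"], Counter())
        counter.update(set(recipe["ingredients"]))
    return by_cuisine


recipes = json.loads(SAMPLE_JSON)
overall = ingredient_counts(recipes)
by_cuisine = ingredient_counts_by_cuisine(recipes)

print("Most common ingredient overall:", overall.most_common(1))
print("Italian ingredient counts:", dict(by_cuisine["italian"]))
```

Note that these counts measure how often an ingredient appears in recipes, not how often it is purchased, so any link to shopping behavior would be an inference on our part.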
Unstructured data like text data / image data
Provide a short walkthrough about the data description, fields description and a few insights / observations.
Reference: EduKC
Upload / share your folder with ocela.ai@gmail.com with images for all classes.
Add an image for each class as an example.
Reference: EduKC
Open the following Google Drive folder (which only your team has access to) and upload the data for your project.