PROPOSAL
This section outlines our main ideas and plans for the rat inspection
data analysis project and defines the project's objectives and
requirements.
The Group Members
- Jia Ji (jj3205)
- Yijia Jiang (yj2687)
- Gonghao Liu (gl2716)
- Yifei Xu (yx2638)
- Ziyan Xu (zx2373)
The Tentative Project Title
Have You Ever Seen Rats in NYC?
Motivation
One of our team members and her friends were finishing an outdoor
dinner in Chelsea recently when, from the corner of her eye, she saw
something move near the edge of their table: a rat had been on the
table. Rats are among New York’s permanent features, and across the
city one hears the same thing: they are running amok like never before.
Rats in NYC are almost everywhere: in the park, on your block, and even
at your table.
Our dataset contains information on rat inspections, provided by the
Rat Information Portal (RIP) to support research on evidence of rat
activity. Gaining deeper insight into the temporal and spatial patterns
of rat inspections will help those who fear rats minimize their
exposure. More broadly, identifying potential risks of disease spread
supports building a clean and pleasant living environment, which is
what we as public health students should be working towards.
Anticipated Data Sources
We intend to mainly use three datasets:
Intended Final Product
- A multi-page website that presents our analysis of rat sightings in
NYC and recommends locations to avoid for those who do not want to
encounter rats
- A report summarizing the spatial distribution of rats in NYC, its
changes over time, and its potential relationships with other
factors
- A Shiny App predicting rat sightings at specific locations and times
within NYC
Planned Analysis / Visualizations / Coding challenges
Planned Analysis
- Analyze climatic and geographical characteristics alongside the
results of rat inspections
- Compare the number of rat inspection cases across years
- Explore the trend in the number of rat inspections throughout the
years
- Evaluate how efficiently different agencies handle rat-related
complaints
- Conduct hypothesis tests to compare rat density across boroughs and
time periods
- Construct a linear model that predicts the rate of observing a rat
from various potential covariates, including climatic and geographical
factors
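As a rough illustration of the borough comparison listed above, the sketch below runs a two-sided two-proportion z-test in Python. It rests on assumptions that are not part of the proposal: "rat density" is approximated by the share of inspections that found rat activity, the function name `two_proportion_z_test` is hypothetical, and the counts in the usage lines are made-up placeholders rather than values from the data. The project itself may well use a different test or different tooling.

```python
import math


def two_proportion_z_test(hits_a, n_a, hits_b, n_b):
    """Two-sided z-test for equal proportions in two groups.

    hits_* : number of inspections that found rat activity (assumed measure)
    n_*    : total number of inspections in the group
    Returns (z statistic, two-sided p-value).
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("each group needs at least one inspection")
    p_a = hits_a / n_a
    p_b = hits_b / n_b
    # Pooled proportion under the null hypothesis of equal rates.
    pooled = (hits_a + hits_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # Every inspection (or none) was positive in both groups.
        return 0.0, 1.0
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Placeholder counts for two hypothetical boroughs, not real data.
z_stat, p_val = two_proportion_z_test(60, 100, 40, 100)
print(f"z = {z_stat:.3f}, p = {p_val:.4f}")
```

The normal approximation is only reasonable when each group has a fair number of positive and negative inspections; for comparisons across more than two boroughs, a chi-square test would be a more natural choice.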
Visualizations
- Visualize rat activities
  - at different times of day (line chart)
  - in different seasons (bar chart)
  - in different locations (boxplot)
  - at different temperatures (heatmap)
- Visualize the results of rat inspection (pie chart)
- Visualize the trend of rat inspection over years (line chart)
- Construct interactive maps for rat inspection cases
- Compare how different agencies perform in dealing with rat issues
(boxplot)
- Make a correlation matrix and a PCA plot to check collinearity and to
reduce the dimensionality of the data used in the modeling
section
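The collinearity check in the last item could be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the function names, the 0.8 threshold, the covariate names, and all numbers are hypothetical placeholders.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant column")
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """Nested dict of pairwise correlations, keyed by column name."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


def collinear_pairs(columns, threshold=0.8):
    """List (name_a, name_b, r) for pairs with |r| >= threshold (assumed cutoff)."""
    names = list(columns)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(columns[a], columns[b])
            if abs(r) >= threshold:
                pairs.append((a, b, r))
    return pairs


# Hypothetical covariates with placeholder values, not taken from the data.
covariates = {
    "temperature_c": [1.0, 2.0, 3.0, 4.0],
    "temperature_f": [33.8, 35.6, 37.4, 39.2],
    "precipitation": [2.0, -1.0, 2.0, -1.0],
}
for name_a, name_b, r in collinear_pairs(covariates):
    print(f"{name_a} and {name_b} are strongly correlated (r = {r:.2f})")
```

Pairs flagged this way are candidates for dropping or combining before fitting the linear model; the appropriate threshold is a judgment call.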
Coding challenges
- Dealing with latitude and longitude variables
- Handling the many missing values in certain variables
- Building interactive maps for overall rat inspection cases
- Carefully defining a metric to evaluate the efficiency of handling
rat-related complaints
- Accounting for multiple hypotheses or multiple comparisons in
hypothesis testing
- Preprocessing the data for each year and location before
modeling
- Specifying the inspection type when constructing the linear
model
- Building a Shiny App to predict rat sightings at specific locations
and times (according to the user’s choice)
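One common way to handle the multiple-comparisons challenge above is the Holm-Bonferroni step-down adjustment. The Python sketch below is an illustration only: the proposal does not say which correction will be used, the function name `holm_adjust` is hypothetical, and the p-values in the usage lines are placeholders rather than results.

```python
def holm_adjust(p_values):
    """Return Holm-Bonferroni adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Adjusted values must not decrease as the raw p-values increase.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Placeholder p-values from three hypothetical pairwise borough comparisons.
raw = [0.01, 0.04, 0.03]
for p_raw, p_adj in zip(raw, holm_adjust(raw)):
    print(f"raw p = {p_raw:.3f}, Holm-adjusted p = {p_adj:.3f}")
```

A comparison is then declared significant when its adjusted p-value falls below the chosen level. Holm's method controls the family-wise error rate and is never less powerful than a plain Bonferroni correction.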
Planned Timeline
Nov 13 - Nov 18 | Project review meeting
Nov 19 - Nov 22 | Data preprocessing
Nov 23 - Nov 27 | Exploratory data analysis
Nov 28 - Dec 2 | Statistical analysis (hypothesis test and modeling)
Dec 3 - Dec 10 | Work on report, webpage and screencast
Dec 15 | In-class discussion of projects