PROPOSAL

 

In this section, we are going to showcase our primary ideas and plans concerning on the rat inspection data analysis project, which clearly conveys and defines the objectives and requirements of our project.

 

 

The Group Members

  • Jia Ji (jj3205)
  • Yijia Jiang (yj2687)
  • Gonghao Liu (gl2716)
  • Yifei Xu (yx2638)
  • Ziyan Xu (zx2373)

 

The Tentative Project Title

Have You Ever Seen Rats in NYC?

 

Motivation

One of our team members and her friends were finishing an outdoor dinner in Chelsea recently when, from the corner of her eye, she saw something move near the edge of their table: A rat had been on the table. Rats are among New York’s permanent features and across the city, one hears the same thing: they are running amok like never before. Rats in NYC are almost everywhere, in the park, on your block and even at your table.

Our dataset contains some information on rat inspection, which is provided by the Rat Information Portal (RIP) to do research on evidence of rat activities. Gaining deeper sights into the pattern of rat inspections in both the temporal and spatial dimensions will help those in fear of rats minimize exposure as much as possible. Broadly, it is beneficial to identify potential risks of disease-spread and then build a clean and pleasant living environment, which is what we as public health students should be working towards.

 

Anticipated Data Sources

We intend to mainly use three datasets:

 

Intended Final Product

  • A website containing many pages, which will deliver rat sighting analysis in NYC and give recommendations on locations to avoid encountering rats
  • A report summarizing the spatial analysis, temporal changes, and potential relation to other factors of the distribution of rats in NYC
  • A Shiny App predicting rat sighting in specific locations and times within NYC

 

Planned Analysis / Visualizations / Coding challenges

Planned Analysis

  • Analyze climatic and geographical characteristics, and results of rat inspection
  • Compare the number of rat inspection cases across years
  • Explore the trend in the number of rat inspections throughout the years
  • Evaluate the efficiency in handling rat related complaints among different agencies
  • Conduct hypothesis test to compare the rat density in different boroughs and time
  • Construct a linear model that predicts the rate of observing a rat from various potential covariates, including climatic and geographical factors

 

Visualizations

  • Visualize rat activities
    • at different times in a day (line chart)
    • in different seasons (bar chart)
    • in different locations (boxplot)
    • in terms of temperatures (heatmap)
  • Visualize the results of rat inspection (pie chart)
  • Visualize the trend of rat inspection over years (line chart)
  • Construct interactive maps for rat inspection cases
  • Compare the performance among different agencies in dealing with rat issues (boxplot)
  • Make correlation matrix, PCA plot to check colinearity and to simplify the complexity in high-dimensional data in modeling section

 

Coding challenges

  • Dealing with latitude and longitude variables
  • Many missing values in certain variables
  • Interactive maps for overall rat inspection cases
  • Need to carefully quantify the value to evaluate the efficiency of handling rat related complaints
  • Multiple hypotheses or multiple comparisons when hypothesis testing
  • Need to preprocess the data for each year and location before modeling
  • Need to specify inspection type when constructing the linear model
  • Shiny App to predict rat sightings in specific locations and times (according to the user’s choice)

 

Planned Timeline

Time Task
Nov 13 - Nov 18 Project review meeting
Nov 19 - Nov 22 Data preprocessing
Nov 23 - Nov 27 Exploratory data analysis
Nov 28 - Dec 2 Statistical analysis (hypothesis test and modeling)
Dec 3 - Dec 10 Work on report, webpage and screencast
Dec 15 In-class discussion of projects