Identifying Occupied Properties
Previously, I attempted to gain some insight into the apparent vacancies across regions of Saint Paul using public data on Vacant Properties in Saint Paul. That earlier analysis is available at Saint Paul Property Vacancy. However, the data set was insufficient for what I am searching for. Although it provided some interesting insights into vacancies and Wards, it didn't account for buildings that were partially occupied, which is a glaring flaw in the previous attempt.
Another Data Set
I decided to explore some additional data sets, ideally ones that could provide insight into how many units there are in Saint Paul. To start this exploration I decided to use Certificate of Occupancy data. I found two potentially valuable data sets:
- Certificate of Occupancy - Residential (Last Refreshed May 31, 2024)
- Certificate of Occupancy - Commercial
For this article I decided to start with the Residential data set. To kick things off I downloaded the raw CSV file and imported it into Tableau Public Desktop. Then I reviewed the available fields to see if I could find something useful. Unfortunately, there was no Ward data. However, there was Latitude and Longitude data along with address, unit numbers, property type, and Grade.
Attempting to Glean Something Useful or Interesting
After exploring how to represent the Latitude and Longitude data I was able to utilize the map options in Tableau Desktop to produce a visualization of where the properties were. A map with a bunch of points on its own is not a useful visualization though, at least not for my goal of identifying area trends within Saint Paul.
To enhance this map I added dimensions to it. Two obvious dimensions were available: the color of the dot and the size of the dot. I went with fire Grade for the color of the dot and the number of units in the property for the size of the dot. This yielded some interesting things. Notably, it was immediately visible that some of the properties with the largest unit counts didn't have a Grade.
This in itself was concerning since, from my understanding, the Grade reflects fire safety. Could this mean that fire safety is not being enforced for large buildings in Saint Paul? Maybe. It could also just be a data entry issue. Being the eternal optimist, I decided to proceed with the assumption that it was a data entry issue. This raised other questions though.
Data Integrity and Cleaning
Having decided to assume that the data set has some data entry issues, I looked closer at some of the other values as well. Sure enough, I found property sub-types including Residential, Residential 1 Unit, Residential 2 Units, and Residential 3+ units. However, the average unit count for the generic Residential sub-type was 100, so those properties obviously should instead be classified as Residential 3+ units.
Exploring further, I found other data entry errors as well, including null property types and duplicate property types with spelling issues. I also found properties with many more than 3 units classified as Residential 1 Unit or Residential 2 Units. After cleaning up the data source I reproduced my visualization.
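For readers who would rather script this kind of cleanup than do it by hand in Tableau, here is a minimal sketch in Python with pandas. It is not the process I used above, just one possible approach. The column names (property_type, units) and the sample rows are made up for illustration and will not match the real CSV headers, and the rule of trusting the unit count over the recorded sub-type is itself an assumption.

```python
import pandas as pd


def classify_by_units(units):
    """Return a property sub-type label based purely on the unit count."""
    if pd.isna(units) or units < 1:
        return "Unknown"
    if units == 1:
        return "Residential 1 Unit"
    if units == 2:
        return "Residential 2 Units"
    return "Residential 3+ Units"


def clean_property_types(df):
    """Add a property_type_clean column derived from the unit count.

    The original property_type column is left untouched so the two
    can be compared afterwards.
    """
    cleaned = df.copy()
    cleaned["property_type_clean"] = cleaned["units"].apply(classify_by_units)
    return cleaned


# Made-up rows mimicking the problems described above: a generic label,
# a mislabeled large building, a null type, and a misspelled type.
sample = pd.DataFrame({
    "property_type": ["Residential", "Residential 1 Unit", None, "Residental 2 Units"],
    "units": [100, 12, 2, 1],
})

cleaned = clean_property_types(sample)
print(cleaned[["property_type", "units", "property_type_clean"]])
```

Keeping the original column next to the derived one makes it easy to count how many rows disagree, which is a rough measure of how messy the source data is.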
Further Down the Rabbit Hole
Although I haven't reached my goal yet, I do have a lingering question from this analysis. Why are some larger properties listed as not having a fire safety Grade? Through additional investigation I discovered another field I had been ignoring: the status field. Initial research indicated that if an inspection was in progress, pending paperwork, or due for renewal, the property may not have an actively assigned Grade. So let's look at this further data point. Sure enough, I found that all of the properties with a null Grade had a status of In Progress (actively being inspected), Pending (awaiting code violation fixes and paperwork), or Renewal Due (waiting for inspection). But how many actual units are in this limbo state? I put the statuses and unit counts together in another visualization for your reading pleasure.
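The same question can be asked of the data outside of Tableau. Below is a minimal pandas sketch of the idea: keep only the rows with no Grade, then total the units per status. The column names (grade, status, units), the "Certified" status, and every number in the sample are hypothetical stand-ins, not values from the actual data set.

```python
import pandas as pd


def units_without_grade_by_status(df):
    """Sum the unit counts of properties with no Grade, grouped by status."""
    no_grade = df[df["grade"].isna()]
    totals = no_grade.groupby("status")["units"].sum()
    return totals.sort_values(ascending=False)


# Made-up rows: two graded properties and three without a Grade.
sample = pd.DataFrame({
    "grade": ["A", None, None, "B", None],
    "status": ["Certified", "In Progress", "Pending", "Certified", "In Progress"],
    "units": [4, 120, 30, 2, 50],
})

limbo = units_without_grade_by_status(sample)
print(limbo)
print("Total units without a Grade:", int(limbo.sum()))
```

Summing units rather than counting properties matters here, since a single large building can account for more units than dozens of single-family homes.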
Possible Takeaways
But what does all this mean? How does it contribute to my search for reasons why parts of Saint Paul may feel like a ghost town? I am no expert on fire safety or inspections, so I took this question to a general intelligence AI, and it provided a few possibilities for how this data may relate to the ghost town feel of Saint Paul. It may result in long-standing scaffolding while fire safety violations are corrected. It may result in buildings waiting for inspection until inspectors and bureaucrats have capacity. It may result in partially empty buildings that need to have fire safety violations corrected before they can be rented out again. In the meantime, we know from this analysis that as of May 31, 2024 there were more than 3,000 residential units being inspected, pending repairs, or pending inspection.
It's all a little uncertain. The only thing certain from this dive into the data is that I need to keep going: I need to keep digging and try to find some correlating data points. At the very least, I learned how to use maps in Tableau, which I am sure will be valuable again in the future.
Thanks for reading!