Predicting rent prices using Craigslist data

This was a project with the Urban Analytics Lab to explore and predict rental prices in the Bay Area, using Craigslist rental listings.

We built predictive models using linear regression, random forest, gradient boosting, and geographically weighted regression algorithms. Details here.
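Of the four methods, geographically weighted regression is probably the least familiar, so here is a minimal sketch of the idea. It is an illustration, not the project's code: the listings are made up, the `(x_km, y_km, sqft, rent)` schema, the `gwr_predict` name, and the 2 km Gaussian bandwidth are all assumptions, and it uses a single predictor (square footage) to keep the math short. The point is that each prediction refits the regression with nearby listings weighted more heavily than distant ones.

```python
import math


def gwr_predict(listings, target, bandwidth_km=2.0):
    """Fit a distance-weighted line (rent ~ sqft) around `target` and predict its rent.

    listings: list of (x_km, y_km, sqft, rent) tuples (hypothetical schema).
    target:   (x_km, y_km, sqft) for the unit being priced.
    """
    tx, ty, tsqft = target
    # Gaussian kernel: nearby listings get weight near 1, distant ones near 0.
    weights = [
        math.exp(-0.5 * (math.hypot(x - tx, y - ty) / bandwidth_km) ** 2)
        for x, y, _, _ in listings
    ]
    total = sum(weights)
    mean_sqft = sum(w * s for w, (_, _, s, _) in zip(weights, listings)) / total
    mean_rent = sum(w * r for w, (_, _, _, r) in zip(weights, listings)) / total
    # Weighted least-squares slope and intercept for a single predictor.
    cov = sum(
        w * (s - mean_sqft) * (r - mean_rent)
        for w, (_, _, s, r) in zip(weights, listings)
    )
    var = sum(w * (s - mean_sqft) ** 2 for w, (_, _, s, _) in zip(weights, listings))
    slope = cov / var
    intercept = mean_rent - slope * mean_sqft
    return intercept + slope * tsqft


# Synthetic listings: a pricier cluster near x = 0 km and a cheaper one near x = 20 km.
listings = [
    (0.0, 0.0, 500, 2500), (0.5, 0.2, 700, 3100),
    (0.3, 0.8, 900, 3700), (1.0, 0.5, 1100, 4300),
    (20.0, 0.0, 500, 1500), (20.4, 0.3, 700, 1900),
    (19.8, 0.9, 900, 2300), (20.7, 0.6, 1100, 2700),
]

# The same 800 sq ft unit priced in each neighborhood.
west = gwr_predict(listings, (0.4, 0.4, 800))
east = gwr_predict(listings, (20.2, 0.4, 800))
print(round(west), round(east))
```

A single global regression would average the two neighborhoods together. The local fit lets the price per square foot differ by location, which is the reason to reach for this method with rental data.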

I also built a web app, which should be up on the UAL website soon. In the meantime, you can see it in the demo video here, and the code is in the GitHub repo here.

The taxi market in NYC vs SF: inferring trip purpose using LDA

Was the taxi market really that much more developed in New York than in San Francisco? I used Latent Dirichlet Allocation with GPS data to infer taxi trip purposes in each city, to see just how different the two markets were. 

Back in 2014, when Uber and Lyft were taking off, people in San Francisco were much more excited about them than people in New York were. New Yorkers told me, “Yeah, Uber might be better than a taxi, but it’s not really that big a deal.”

Why the difference? Maybe it’s just that San Franciscans are techno-optimists, especially compared to critical New Yorkers. But I also suspected it had something to do with differences between the two cities’ taxi markets.
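For readers who have not seen Latent Dirichlet Allocation outside of text analysis, here is a small sketch of how it can be pointed at trip data. The setup is an assumption for illustration, not necessarily how the analysis in the post was done: each drop-off zone is treated as a "document", each trip's time-of-day bucket as a "word", and the inferred topics stand in for trip purposes. The data is synthetic, and `lda_gibbs` is a hypothetical helper that implements a plain collapsed Gibbs sampler.

```python
import random


def lda_gibbs(docs, n_topics, vocab_size, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA; docs is a list of lists of word ids."""
    rng = random.Random(seed)
    doc_topic = [[0] * n_topics for _ in docs]                # count of topic k in doc d
    topic_word = [[0] * vocab_size for _ in range(n_topics)]  # count of word w in topic k
    topic_total = [0] * n_topics                              # words assigned to topic k
    assignments = []
    # Start from a random topic assignment for every word.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(n_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(zs)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assignments[d][i]
                # Remove the current assignment before resampling it.
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Unnormalized conditional probability of each topic for this word.
                probs = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=probs)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    # Smoothed topic-word distributions; each row sums to 1.
    return [
        [
            (topic_word[k][w] + beta) / (topic_total[k] + vocab_size * beta)
            for w in range(vocab_size)
        ]
        for k in range(n_topics)
    ]


# Synthetic trips per drop-off zone, coded by time of day:
# 0 = morning, 1 = midday, 2 = evening, 3 = late night.
trips_by_zone = [
    [0] * 15 + [1] * 4 + [3] * 1,   # office-like zone
    [0] * 14 + [1] * 5 + [2] * 1,   # office-like zone
    [3] * 15 + [2] * 4 + [0] * 1,   # nightlife-like zone
    [3] * 13 + [2] * 6 + [1] * 1,   # nightlife-like zone
]

topics = lda_gibbs(trips_by_zone, n_topics=2, vocab_size=4, n_iter=50)
for k, row in enumerate(topics):
    print(k, [round(p, 2) for p in row])
```

Each printed row is one inferred "purpose", expressed as a distribution over times of day. In practice a library implementation and richer features (pickup and drop-off location, day of week) would be the more sensible choice than this hand-rolled sampler.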

Mapping NIMBY voting in San Francisco


I wanted to know who votes for NIMBY ballot measures in SF, and whether that has changed over time. So I analyzed local election data since 1996 along with census data and made this map.

The lower the turnout, the better the chance that NIMBY ballot measures pass. In predicting whether anti-development measures will win at the ballot box, turnout matters more than income, race, or housing tenure, although those factors play a role as well.
