Assimilation & Gentrification

San Francisco Residents


The Silicon Valley world of top tech companies and startups often comes under fire for being too male, too white and too wealthy. For several years, this privileged group of Bay Area tech workers has been gentrifying many San Francisco neighborhoods. The resulting rise in rents and the arrival of new, high-end restaurants and shops have priced people out of the city and left many dispossessed. This is a pressing issue, and I want to use tech to help solve it. To begin the analysis, I focused on the interaction of tech and non-tech communities in various SF neighborhoods, using the following method:

Data Collection & Classification: Using LotaData’s proprietary datasets, I gathered timestamped, device-specific geolocation data for ~100K devices. After clustering the data, I determined a likely work location and a likely home city for about 25% of the devices. From this sample, I compared two broad groups of San Francisco devices: tech workers and non-tech workers. From the location trail of each device, I compiled a list of places visited (e.g. restaurants, airports, etc.) and events attended (e.g. concerts, basketball games, etc.). From this list, I classified the types of places and the types of events that each device tends to visit.
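As a rough illustration of the classification step, here is a minimal sketch of how a device's visit history could be turned into a vector. The category names and the `visit_profile` helper are hypothetical, made up for this example; the actual categories and features used with the LotaData datasets are not described here.

```python
from collections import Counter

# Hypothetical category vocabulary (an assumption for illustration only).
CATEGORIES = ["restaurant", "airport", "concert", "basketball_game"]

def visit_profile(visits, categories=CATEGORIES):
    """Turn a list of visited category labels into a normalized frequency vector."""
    counts = Counter(v for v in visits if v in categories)
    total = sum(counts.values())
    if total == 0:
        # A device with no recognized visits gets an all-zero profile.
        return [0.0] * len(categories)
    return [counts[c] / total for c in categories]

# One device's (made-up) visit history.
profile = visit_profile(["restaurant", "restaurant", "concert", "airport"])
```

Each position in the vector is the share of that device's visits that fall into one category, so devices with very different activity levels can still be compared.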

Network Creation: With these vectorized classifications, I created a similarity scoring method and calculated the similarity score between all pairs of devices. From the scores, I created a series of k-NN (k-nearest neighbors) networks. In each network, every node represents a device, and each node is connected by an edge to the k other nodes most similar to it. Further, the length of each edge is inversely proportional to the similarity score. This means similar devices are grouped close together and dissimilar devices are farther apart.
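The network construction described above can be sketched in a few lines. This is only an illustration under stated assumptions: cosine similarity stands in for the similarity scoring method, which is not specified here, and the device names, profiles and the `knn_edges` helper are all hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (assumed stand-in metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def knn_edges(profiles, k):
    """Connect every device to its k most similar devices.

    profiles: dict mapping device id -> profile vector.
    Returns a dict mapping frozenset({u, v}) -> edge length, where the
    length is 1 / similarity (infinite when the similarity is not positive).
    """
    edges = {}
    for u, pu in profiles.items():
        scored = [(cosine_similarity(pu, pv), v)
                  for v, pv in profiles.items() if v != u]
        # Most similar first; ties keep their original order.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        for sim, v in scored[:k]:
            length = 1.0 / sim if sim > 0 else float("inf")
            edges[frozenset((u, v))] = length
    return edges

# Four made-up devices: "a" and "b" behave alike, as do "c" and "d".
profiles = {
    "a": [1.0, 0.0],
    "b": [0.9, 0.1],
    "c": [0.0, 1.0],
    "d": [0.1, 0.9],
}
edges_k1 = knn_edges(profiles, k=1)
edges_k3 = knn_edges(profiles, k=3)
```

With k = 1 the toy example produces two separate pairs, while k = 3 connects every device to every other one, which mirrors how the choice of k controls how dense the network becomes.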

As seen in the above left plot, if we set k = 1, there are few connections and the graph is simple and sparse. If k = 50, the graph is dense and highly connected. Another feature that jumps out is the presence of a few clearly defined cyan clusters. These clusters represent tight-knit groups of devices whose connections rarely extend past their own grouping.

The middle plot of the Mission neighborhood in San Francisco highlights one of these main clusters. Because the Mission is one of the main sites of gentrification in the city, this cluster makes intuitive sense. There is a large influx of tech workers in that area, although a strong Latino and minority presence remains in the neighborhood. In the fight to preserve the Latino character of the Mission, there seems to be a strong pocket that remains separate from the tech community.

With this idea of assimilation in mind, I took the connectivity of the graph and calculated a weighted assimilation score between tech and non-tech groups. This value represents how much one community has behaviorally assimilated with the other. A high score means the two groups behave similarly, while a low score means there is a wide behavioral gap between them. I applied the metric to 7 San Francisco neighborhoods and 3 top tech companies (Google, Apple & Facebook) to obtain the scale below.

Amongst the 3 major companies, Google employees had the highest assimilation score, followed by Apple and Facebook. The differences could reflect the personalities of the people each company hires, or they could stem from age, gender, ethnicity or a whole host of other factors.

Amongst neighborhoods, the Mission and the Marina had the lowest assimilation scores, but for very different reasons. As mentioned before, there is a broad gap in the Mission between the minority groups (e.g. Latino, Black and immigrant communities) and the new wave of tech workers. In the predominantly white and wealthy Marina community, this gap is more likely a difference between finance workers (i.e. non-tech for this analysis) and tech workers.
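To make the idea of a weighted assimilation score more concrete, here is one possible formulation. It is an assumption for illustration, not the formula behind the results above: the score is taken to be the share of total edge weight (similarity) that falls on edges linking a tech device to a non-tech device. The edge weights, group labels and the `assimilation_score` helper are hypothetical.

```python
def assimilation_score(edges, group):
    """Share of total edge weight on cross-group edges, between 0 and 1.

    edges: dict mapping (u, v) -> similarity weight of that edge.
    group: dict mapping device id -> group label, e.g. "tech" or "non_tech".
    """
    total = 0.0
    cross = 0.0
    for (u, v), weight in edges.items():
        total += weight
        if group[u] != group[v]:
            # This edge links the two communities.
            cross += weight
    return cross / total if total > 0 else 0.0

# Made-up network: only the a-c edge crosses between the groups.
edges = {("a", "b"): 0.5, ("a", "c"): 0.25, ("c", "d"): 0.25}
group = {"a": "tech", "b": "tech", "c": "non_tech", "d": "non_tech"}
score = assimilation_score(edges, group)
```

Under this sketch, a score near 0 means the two groups almost never appear among each other's nearest neighbors, as with the isolated clusters described earlier. A more careful version would also need to correct for the relative sizes of the two groups.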

This type of analysis absolutely requires more complete data and finer tuning. Moving forward, I believe a more rigorous and more comprehensive network analysis of workers in the Bay Area can help identify and address the needs of our communities. We can push specific companies and their employees to act in a more responsible and selfless manner. To help preserve and enrich the neighborhoods in San Francisco, we can implement actual solutions that mitigate the negative consequences of gentrification.