Uber has troves of data on how people navigate cities. Urban planners have begged, pleaded, and gone to court for access. Will they ever get it?
Joe Castiglione compares his job to playing SimCity.
As the deputy director for technology, data, and analysis at the San Francisco County Transportation Authority, Castiglione spends his days manipulating models of the Bay Area and its 7 million residents.
Drawing on everything from sweeping ridership and traffic data to deep dives into personal travel choices via surveys, his models can estimate the number of people who will disembark at a specific train platform at a certain time of day and predict how that might change if a new housing development is built nearby, or if train frequency is increased.
The models are exceedingly complex, because people are so complex. “Think about the travel choices you’ve made in the last week, or the last year,” Castiglione says. “How do you time your trips? What tradeoffs do you make? What modes of transportation do you use? How do those choices change from day to day?” He has the deep voice of an NPR host and the demeanor of a patient professor. “The models are complex but highly rational,” he says.
The San Francisco County Transportation Authority participates in planning across the nine counties of the Bay Area, considering current issues like congestion pricing, while also creating plans decades into the future. To build models of the 25 million trips Bay residents take every day, Castiglione and his team process a lot of publicly available datasets for many modes of transportation: private cars, buses, trains, bicycles, walking. But one growing gap in the data is the footprint of ride-hailing services like Uber.
If regional agencies had that data, they could add public transit routes or adjust service times to offer more incentives to get people out of cars and onto trains and buses. What Castiglione craves is an anonymized dataset of all Uber rides with the origin and destination ZIP codes, dates, and times. San Francisco agencies right now are working on plans for $100 billion in transportation improvements like tunnels, bridges, and rail lines. “What happens in the absence of data,” Castiglione says, “is the risk that we make poor investments and poor choices.”
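To make the request concrete, here is a minimal Python sketch of the kind of record Castiglione describes and one way a planner might tally it. The field names, the `trips_by_od_hour` helper, and the three sample rows are all invented for illustration; they are not Uber's or the CPUC's actual format, and the sample trips are not real data.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Trip:
    """One anonymized ride: no rider or driver identity, only place and time."""
    origin_zip: str
    dest_zip: str
    start: datetime


def trips_by_od_hour(trips):
    """Count trips per (origin ZIP, destination ZIP, hour of day)."""
    counts = Counter()
    for trip in trips:
        counts[(trip.origin_zip, trip.dest_zip, trip.start.hour)] += 1
    return counts


# Invented sample rows, for illustration only.
sample = [
    Trip("94103", "94105", datetime(2019, 9, 3, 17, 5)),
    Trip("94103", "94105", datetime(2019, 9, 3, 17, 40)),
    Trip("94122", "94103", datetime(2019, 9, 3, 12, 15)),
]

counts = trips_by_od_hour(sample)
for (origin, dest, hour), n in sorted(counts.items()):
    print(f"{origin} -> {dest} at {hour:02d}:00: {n} trip(s)")
```

Grouping by ZIP pair and hour is only one possible cut. The same records could just as easily be grouped by weekday or by proximity to a transit station, which is the kind of question the planners say they cannot answer today.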
Data covering potentially billions of trips per year in the Bay Area might sound like a lot to ask for. In actuality, Uber is already providing this exact data to a California regulatory agency, on the assurance that the agency won’t release it to anyone else.
City planners have been asking for access to detailed ride-hailing data for years. Uber has generally pushed back, citing protection of its business advantage and of passenger privacy. The company has started to share limited, aggregated datasets on traffic speeds and travel times for certain cities, including San Francisco, over the past two years. But transportation researchers say Uber is cherry-picking data to reinforce the narrative that it isn’t to blame for increases in congestion or declines in transit ridership.
So the city of San Francisco has subpoenaed Uber to get the data it needs. In the past decade, the city’s population has grown by more than 10%, and the number of jobs has outpaced housing by a factor of eight. That means a lot more commuters and a lot more traffic, all jockeying for position in a city that is just seven by seven miles in size. In the Bay Area, rush hour commuters each lost nearly five days of time last year just sitting in traffic.
There are dire implications for cities all across the country. “If we let these companies decide who does or doesn’t get to do research or analysis, we’re biasing that work,” says Greg Erhardt, who runs a transportation lab at the University of Kentucky. “Cities have to have their own access to data to make intelligent decisions.”
The real value of Uber — $82.4 billion in its IPO earlier this year — isn’t derived from the ever-growing list of services it provides: ridesharing, food delivery, cargo shipping, all those damn scooters. What makes Uber valuable is the massive amount of data it’s amassed about how we move, and how much we’re willing to pay to do it.
Uber might have posted a loss of $5.2 billion last quarter, leading its present market cap to dwindle to around $54 billion. But the data it reaps on its riders, and its proprietary grip on that data, continue to be priceless.
Add a stop: regulation
In the Golden State, taxis are regulated at the city level, and limousines and black cars are overseen by the California Public Utilities Commission (CPUC). The CPUC’s purview spans from power companies to railroads, ensuring that services are equitably provided to all California residents. Anyone who’s followed the ride-sharing wars since they first emerged a decade ago witnessed this new breed of startup quickly recast itself not as a transportation company, but as a platform or marketplace, in an attempt to sidestep local or state regulations.
But in 2012, Uber was facing challenges from upstarts Lyft and Sidecar, which were not only offering on-demand rides via an app, but also enlisting civilian drivers who could be paid less than chauffeurs or taxi drivers. (Initially, Uber offered only black car services, which were a bit more expensive than taxis.) Uber co-founder Travis Kalanick went to the CPUC that summer, demanding that it shut down the peer-to-peer services, according to Harvard researchers Onesimo Flores Dewey and Lisa Rayle.
When the commission declined to crack down on his competitors, Kalanick quietly introduced his own peer-to-peer service, UberX. That forced the commission’s hand, and at the end of 2012, the CPUC began its bureaucratic process to create new rules for what they now call Transportation Network Companies.
Uber, Lyft, and Sidecar all hired lobbyists to represent their companies’ interests with the CPUC. The commission’s rulemaking was unusually fast, Dewey and Rayle wrote, and, unlike the taxi industry, did not include any requirements for rate setting or limits on the number of vehicles ride-hailing companies could put on the road. Today there are fewer than 1,500 taxis operating in San Francisco. As of late 2016, there were already 45,000 Uber and Lyft drivers operating in the city, a number that has certainly grown since then.
The commission required ride-hailing services to deliver annual reports with anonymized data for every single ride request made in California — down to whether trips were accepted or declined, and whether accessible ride requests were fulfilled. It’s a massive amount of granular data that would reveal to officials in San Francisco, Los Angeles, and elsewhere in California where ride-hailing services are filling in public transit gaps, where pickups and dropoffs are snarling traffic, and where congestion pricing might need to be implemented.
But no one outside of the commission can access those reports because the CPUC, which Dewey and Rayle describe as “extremely business-friendly and supportive of innovation,” said that the data was proprietary and “commercially sensitive,” and would therefore be kept confidential. When transit authorities in San Francisco asked for the data, wanting to better evaluate congestion in the city, the commission refused, claiming the request was “not in the public interest.”
In 2017, the city attorney of San Francisco filed court orders against Lyft and Uber, arguing that the city and county had a right to see the California data the ride-hailing companies were providing to the commission. In addition to the detailed trip data reports, those companies also submit reports on the availability of disabled-accessible vehicles, traffic incidents, and hours and miles logged by drivers.
Jason Henderson, a professor who studies the politics of mobility at San Francisco State University, is frustrated by the stalemate. “Theoretically, if a governor wanted the CPUC to be more aggressive toward collecting data and regulating TNCs [Transportation Network Companies], he or she could do that,” he says. But instead, Henderson believes the political class of California has acted like their hands are tied.
“The CPUC has a huge amount of power and a lot of discretion as to what they can do,” Henderson says. “If the members of this commission woke up one day and said, ‘We’re going to cap how many Ubers and Lyfts there can be,’ they could do that. They choose not to. The reason why is a mystery.”
Cracking the congestion conundrum
In 2015, a stealth team scattered 43 computers posing as phones across San Francisco and Manhattan to better understand how Uber’s surge pricing system worked. The experiment by Northeastern University computer science researchers Le Chen, Christo Wilson, and Alan Mislove aimed to make the pricing algorithms slightly less opaque. Over the course of four weeks, the programs would ping the ride-hailing services every few minutes and record the prices offered without actually booking rides.
The Northeastern researchers’ work caught the attention of San Francisco transit authorities who were struggling to keep up with the rise of ride-hailing in the Bay Area. When Uber was founded in 2009 and Lyft in 2012, the promise of ride-hailing was to cut personal vehicle miles and make more efficient use of cars. The legacy taxi system was ripe for Silicon Valley disintermediation because the system limited the number of taxis in operation and wasn’t adaptive to real-time conditions. Before Uber, getting a ride meant calling one of a dozen taxi companies and hoping that a car would eventually show up. Ride-hailing companies would offer reliable pickups, potentially reduce personal car use and congestion, and enable a new class of workers to make money in a flexible way.
Ride-hailing was also touted as a solution to “the last-mile problem” — getting people to and from public transit stations. Without having access to trip data, transit planners can only guess how many people use ride-hailing for this purpose. Yet, while a few cities, such as Seattle and New York, have seen rising public transit use since 2010, the bulk of U.S. cities have seen a steady decline in ridership. For every year after Uber and others entered a market, bus ridership dropped by 1.7%, according to research from the University of Kentucky. After six years, those cities experienced on average a 10% drop. In Los Angeles, ridership has plummeted by more than 25%. So it didn’t come as much of a surprise when Uber stated in its IPO filing in April that it believes it can “replace personal vehicle ownership and usage and public transportation one use case at a time.”
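The two Kentucky figures line up if the annual decline compounds. The short Python check below assumes a constant 1.7% drop applied to each year's remaining ridership; that compounding assumption and the `cumulative_drop` helper are illustrative, not a description of how the researchers computed their estimate.

```python
def cumulative_drop(annual_decline: float, years: int) -> float:
    """Total fractional decline after compounding a fixed annual decline."""
    return 1.0 - (1.0 - annual_decline) ** years


# 1.7% per year for six years, as reported in the article.
drop = cumulative_drop(0.017, 6)
print(f"Cumulative decline after 6 years: {drop:.1%}")  # about 9.8%
```

Under that assumption the six-year decline comes to roughly 9.8%, which matches the approximately 10% average drop cited.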
The San Francisco County Transportation Authority was terrified of this kind of scenario. The study conducted with the Northeastern researchers determined that Uber and Lyft vehicles, both with passengers and without passengers, accounted for about 20% of all vehicle miles traveled in the city. And it estimated that ride-hailing vehicles were responsible for about half of the increase in congestion in San Francisco between 2010 and 2016.
Uber and Lyft released their own study in August with consultancy Fehr & Peers that provided estimates of ride-hailing traffic as a share of overall traffic in certain U.S. metro areas. Uber admitted that the services are likely increasing congestion, but said private cars and commercial traffic remain much bigger problems — ride-hailing is a tiny percentage of traffic by comparison, accounting for less than 2% of all vehicle miles traveled in Seattle and up to 13.4% in San Francisco.
Those numbers are meaningless out of context. “If there’s one more car on the road at noon on a weekday in the more suburban western parts of San Francisco, it’s not going to affect travel speeds at all,” Castiglione says. “But if you add one more South of Market during the evening peak period, it adds a lot of extra delay.” In other words, it all depends on where and when trips are happening. And since most ride-hailing trips take place in the central core of the city, where congestion is worst, those delays can snowball.
Uber’s campaign to play nice
Uber needed a new look. In 2017, company co-founder Kalanick was ousted after a series of revelations ranging from institutional sexism to greylisting public officials using the service in battleground cities so that they would not be served with rides. Under Kalanick, human drivers were always just an inconvenient stage of operations to deal with until autonomous electric cars finally arrived. The arrival of Dara Khosrowshahi, the former head of Expedia, marked a new era that aspired to exhibit a more humane Uber. That meant rather than employing the shock-and-awe strategy it had once used to expand into new markets, it would attempt outreach with local governments.
Its charm offensive included giving a peek into Uber’s data troves. Uber Movement, which launched earlier in 2017, showcased select cities in color-coordinated interactive maps that tracked average travel times and speeds for specific dates. Uber Movement now covers 12 North American cities, and another two dozen cities globally.
The public loved it, but experts rolled their eyes at what was clearly a data visualization stunt. Says San Francisco County Transportation Authority’s Joe Castiglione: “That doesn’t really help us because there are plenty of other big data vendors that already provide that information that we already use.” Castiglione knows that congestion exists and that travel times vary by day. What he wants to know is where people are going and why.
Officials in Cincinnati would also like to know these kinds of details. Nestled in the southwest corner of Ohio, the city has seen an influx of people and capital transforming its core over the past decade. Though Cincinnati proper has about 300,000 residents, the greater metro area is home to more than 2 million. Uber chose the Queen City in January 2018 for a unique, multiyear project called the Mobility Lab, an initiative driven by an urban planner named Andrew Salzberg. Uber has promised to share its experts and data with a coalition of regional governments and transit authorities and is offering “enhanced” Uber services to Cincinnati.
Salzberg grew up in Montreal, raised with an appreciation for bike rides and public transit. He studied civil engineering and got a master’s in urban planning from Harvard before going to work for the World Bank, focusing on transportation investments in Asia. He joined Uber in 2013 to lead its operations in New York City, but once the company realized he would be the ideal emissary to interact with transit agencies, he became part of its new transportation policy team. (Salzberg, who left Uber this summer to start a fellowship at Harvard’s Graduate School of Design focusing on using mobility technology for the public good, was unavailable to talk to Marker for this story.)
“We saw this [Mobility Lab] as an opportunity to collaborate with a city in a deeper way than we’ve done in the past and demonstrate our commitment to the cities in which we are operating,” says Alix Anfang, an Uber spokesperson. “This project aims to answer the question, how can new technology deliver innovation that leads to public benefit for all?”
Pete Metz, transportation policy and coalition manager at the Cincinnati Chamber, is a proponent of the Uber project. “Less than one-fourth of all jobs are reachable by a 90-minute public transit commute in our region,” he says. “We have to fix our bus system, and we’re working on that. But we want an integrated transit system that’s 21st-century-oriented.”
A new Amazon facility next to the Cincinnati/Northern Kentucky International Airport will provide another 2,000 jobs when it opens in 2021. But today that airport only gets two buses an hour from downtown Cincinnati, with even less frequency on weekends, so authorities know they need to think about the future. “The future of transit is multimodal: bus, Uber, Lyft, bikes, scooters,” says Brandy Jones, vice president of external affairs at the Southwest Ohio Regional Transit Authority (SORTA). “Whatever we can do to encourage people to get out of their own personal cars, that’s awesome.”
SORTA and the Transit Authority of Northern Kentucky shared ridership data with Uber, which in turn is providing some of its data to Fehr & Peers Transportation Consultants to create new studies. But supporters of public transit in Cincinnati are worried that the Uber Mobility Lab is a superficial fix for a very serious mobility problem: SORTA has been running at a deficit for years. “Our bus system is basically held together with duct tape and popsicle sticks,” says Derek Bauman, a retired police officer and transit activist in Cincinnati who lives car-free.
A preliminary curb study report, released one year into the partnership between Uber and Cincinnati, found that there should be more pick-up and drop-off zones for ride-hailing services in entertainment areas of downtown. Cincinnati’s Public Services department installed new signs in those zones just last month.
University of Kentucky’s Erhardt, an assistant professor of civil engineering who focuses on transportation, is skeptical about the findings. “We should be asking holistically, how much of the public right-of-way should go to pick up and drop off, bus stops, bike lanes, cars?” he says. “What we choose to do reflects our values as a city. If we put in [rideshare] pick-up zones, are we not building bike lanes? We have to consider the tradeoffs as a whole.”
When it comes down to it, taking a $20 Uber trip instead of a $2.65 bus ride isn’t an option for people with limited means. Says Cam Hardy, president and co-founder of the Better Bus Coalition in Cincinnati: “I don’t think more Uber cars on the road is going to solve our transportation crisis or remedy congestion.”
Uber’s black box
Other cities are now following San Francisco’s lead of trying to get data from ride-hailing companies. Massachusetts requires aggregated tallies of trips based on origin community and destination community, and the governor has just proposed collecting California-level data. New York City, which has a long history of tussles with Uber, has capped the number of vehicles that ride-hailing companies can have in service, set minimum wages for drivers, and generally regulates them as black car services, requiring drivers to have livery licenses, undergo annual drug tests, and adhere to 164 pages of miscellaneous regulations.
Despite the olive branches of data Uber has extended to select cities, the company is still fighting San Francisco’s demand, seeking to keep secret the data it gives the California Public Utilities Commission. (Lyft eventually made a deal with the San Francisco city attorney to provide historical trip data as long as access is restricted to relevant city workers.)
The Superior Court of California in San Francisco and the Court of Appeal both sided with San Francisco, agreeing that opening up data from ride-hailing companies was very much in the public interest. In July, Uber appealed the case to the Supreme Court of California, where it remains on the docket. Uber declined to comment to Marker on the ongoing litigation.
Even without Uber’s trove, Joe Castiglione will do his best to use the data he has to tweak his models of the Bay Area, trying to understand how planning decisions made today will affect travel patterns in 2020, 2030, or even 2050. Meanwhile, what Uber uses its petabytes of data for — improving services, complementing public transportation options, incentivizing drivers, or just crushing the competition (public or private) — will remain a mystery.
Days after this report was first published, the California Supreme Court declined to review Uber’s appeal of the data case with San Francisco. City Attorney Dennis Herrera had this to say: “The California Supreme Court has spoken. Uber must follow the law. Every court that has looked at this has acknowledged that we’re entitled to these records. We look forward to Uber complying with our subpoena so we can further this investigation into whether ride-hailing companies are following the law.”