Starting this July, Transport for London (TfL) will begin tracking phones that have Wi-Fi enabled on the London Underground network, in an effort to better understand travel conditions in real time. The tracking will run at more than 260 Underground stations and will use the MAC address of each device to see the routes that devices, and their owners, are taking, looking for signs of congestion or delays.
“The depersonalized data collection, which will begin from 8 July 2019, will look to harness existing Wi-Fi connection data from more than 260 Wi-Fi enabled London Underground stations to understand how people navigate the network. This will then be used by TfL to provide better, more targeted information to its customers as they move around London, helping them better plan their route to avoid congestion and delays,” says TfL in an official statement.
But how does this system work? TfL calls it depersonalized data collection: it will track the movement of phones to understand which routes are being used most at any given time. TfL already has ticket purchase data, but real-time data collection will allow it to respond more quickly to unexpected delays. The system has been developed in-house by TfL.
When a device such as a smartphone, laptop or tablet has Wi-Fi enabled, it continually searches for Wi-Fi networks by sending out connection requests that carry a unique identifier known as a Media Access Control (MAC) address. These requests are picked up by nearby Wi-Fi routers, which in this case are the ones installed in London Underground stations.
But what about the inevitable privacy concerns? TfL takes pains to assure users that only the MAC address of a device will be tracked, and that no browsing or historical data is collected from any device. The MAC address will also be tokenized, that is, replaced with another identifier, so that there is no link back to the source device. At no point will the London Underground network connect to any user's smartphone, tablet or laptop, nor will it have access to any data on any device. It will simply use the regular connection requests that every Wi-Fi-enabled device sends out to understand the movement of a phone and detect delays or congestion.
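TfL has not described its tokenization scheme in this statement, so the following is only a minimal sketch of one common approach: replacing each MAC address with a keyed hash (HMAC-SHA256), so that the same device always maps to the same token but the token cannot easily be turned back into the address without the key. The key value, the `tokenize_mac` function name and the 16-character token length are all assumptions made for illustration.

```python
import hashlib
import hmac

# Hypothetical secret key (an assumption for this sketch). A real system
# would generate it securely and keep it private.
SECRET_KEY = b"example-secret-key"


def tokenize_mac(mac: str, key: bytes = SECRET_KEY) -> str:
    """Return a keyed-hash token that stands in for a MAC address."""
    # Normalize case and separators so one device always yields one token.
    normalized = mac.strip().lower().replace("-", ":")
    digest = hmac.new(key, normalized.encode("ascii"), hashlib.sha256).hexdigest()
    # Keep a shortened token; the length is an arbitrary choice for this sketch.
    return digest[:16]


token_a = tokenize_mac("AA-BB-CC-DD-EE-FF")
token_b = tokenize_mac("aa:bb:cc:dd:ee:ff")
token_c = tokenize_mac("aa:bb:cc:dd:ee:00")
```

Here `token_a` and `token_b` are identical because both spellings refer to the same address, while `token_c` differs. Downstream analysis can then follow a token from station to station without ever storing the original MAC address.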
“While I am excited about the potential of this new dataset, I am equally mindful of the responsibility that comes with it. We take our customers' privacy extremely seriously and will not identify individuals from the Wi-Fi data collected. Transparency, privacy and ethics need to be at the forefront of data work in society and we recognise the trust that our customers place in us, and safeguarding our customers' data is absolutely fundamental,” says Lauren Sager Weinstein, Chief Data Officer at Transport for London.
“The transparency shown by TfL around the data collection taking place and the steps taken to make customers aware of its purpose is welcome and should be seen as an example for others. If we are to realise the full potential and value of real-time data, it is vital that we bring the public on this journey and build a culture of data trust and confidence,” says Sue Daley, Associate Director, Technology & Innovation, techUK.
In 2016, TfL ran a trial that logged these Wi-Fi connection requests as depersonalized data, which TfL's in-house analytics team then analyzed to help understand where customers were at particular points of their journeys. The agency says that more than 509 million depersonalized pieces of data were collected from 5.6 million mobile devices making around 42 million journeys. TfL says many of the findings could not have been detected from ticketing data alone. One example it puts forward is that London Underground users travelling between King's Cross St Pancras and Waterloo take at least 18 different routes, with around 40 per cent of customers not taking one of the two most popular routes.
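To make the route analysis concrete, here is a rough sketch of how tokenized sightings could be turned into routes and how the share of travellers outside the two most popular routes could be computed. It is not TfL's method, which is not described here. The function names, the `(token, timestamp, station)` record layout and the five sample journeys are all made up for illustration.

```python
from collections import Counter, defaultdict


def routes_from_sightings(sightings):
    """Group (token, timestamp, station) records into one route per token."""
    by_token = defaultdict(list)
    for token, timestamp, station in sightings:
        by_token[token].append((timestamp, station))
    routes = {}
    for token, seen in by_token.items():
        seen.sort()  # order each device's sightings by time
        route = []
        for _, station in seen:
            # Collapse repeated sightings at the same station.
            if not route or route[-1] != station:
                route.append(station)
        routes[token] = tuple(route)
    return routes


def share_outside_top(routes, top_n=2):
    """Fraction of journeys that do not use one of the top_n routes."""
    counts = Counter(routes.values())
    total = sum(counts.values())
    top = sum(count for _, count in counts.most_common(top_n))
    return (total - top) / total


# Made-up sample data: five tokens travelling King's Cross to Waterloo.
KX, WAT = "King's Cross St Pancras", "Waterloo"
via = {"t1": "Oxford Circus", "t2": "Oxford Circus",
       "t3": "Leicester Square", "t4": "Leicester Square", "t5": "Euston"}
sample = []
for token, middle in via.items():
    sample += [(token, 0, KX), (token, 1, KX), (token, 5, middle), (token, 9, WAT)]

sample_routes = routes_from_sightings(sample)
outside_share = share_outside_top(sample_routes)
```

In this toy data, four of the five journeys use one of the two most common routes, so `outside_share` is 0.2. TfL's reported figure of around 40 per cent for the real King's Cross St Pancras to Waterloo journeys is the same kind of ratio, computed over many more routes.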