New shortcomings in the NHSX contact-tracing app could further limit its effectiveness and scare away users. E&T investigated concerns raised by computer engineers about timestamp and Google Analytics tracking.
At the beginning of May, the open-source code of the beta version of the NHS Covid-19 Android app was uploaded to GitHub, a popular code-hosting platform. Since then, developers have scrutinised every line of code and raised 27 issues on the Android version and 17 on the iOS version.
Computer engineers have raised a number of concerns about the app. Problems that have already received a lot of attention include the app failing when both devices are locked, click events for constant use, and secret keys. But E&T has investigated further issues, including the app’s tracking of timestamps: data points that record the exact moment when the app makes a connection with another phone belonging to someone with a self-reported Covid-19 status.
The NHS argues that it needs timestamp tracking data to know if and when two people met, but it could also be used by law enforcement agencies and other groups. Joshua Berry, electrical engineer and security consultant at cyber-security firm Synopsys, reviewed the open-source code for E&T and checked on claims made on timestamps and several other issues. “Timestamp tracking is a very valid concern because it makes it easier to correlate data from other sources”, Berry says.
It’s a potential problem for the decentralised solutions adopted by several other EU countries as well as centralised systems, which the UK opted for. In the case of the centralised system, the app is unable to access the API that Google and Apple created for Bluetooth Low Energy (BLE). This makes it harder to limit data tracking and enhance privacy.
According to the open-source code, the app tracks timestamps at millisecond level. But Berry doubts whether that is the granularity the developers need. “Milliseconds might be too granular. Maybe even seconds are too much”, he says. The data could potentially be used for other purposes, he cautions. Millisecond-exact timestamps could be correlated with other data, which could allow someone to work out the personal details of a user.
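To illustrate what coarser granularity would mean in practice, here is a minimal Python sketch that rounds a millisecond timestamp down to a fixed time bucket. It is not taken from the NHSX code: the function name `coarsen_timestamp` and the 15-minute bucket size are assumptions chosen purely for illustration.

```python
def coarsen_timestamp(ts_ms: int, bucket_seconds: int = 900) -> int:
    """Round a millisecond Unix timestamp down to the start of its bucket.

    The result is in whole seconds. The 900-second (15-minute) default is a
    hypothetical choice, not a value used by the NHSX app.
    """
    ts_s = ts_ms // 1000                    # drop the millisecond part
    return ts_s - (ts_s % bucket_seconds)   # snap to the start of the bucket


# Two contact events 200 seconds apart land in the same 15-minute bucket,
# so the stored values no longer reveal the exact moment of either event.
first = coarsen_timestamp(1589000000123)
second = coarsen_timestamp(1589000200456)
```

With bucketing like this, the stored value still shows roughly when a contact happened, but it is far less useful as a fingerprint for matching against other data sets.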
Berry concedes the app needs to determine when contact with other devices is made. The data is then uploaded to the central server where the matching is done. Those matches help to determine if and at what point in time Covid-19 infected people were close to others. But with such granular data, Berry thinks the NHSX team may not exactly know what they are after. “They may not know what they want to use [the timestamps] for. They could end up not having enough contact event data points and use the additional timing precision. They may want to figure out when a person came in contact with another person, how much time elapsed before they were in contact with a third person. There are a lot of viable questions NHSX may ask of the collected data, some relevant for public health,” he says.
Tracking location and contact data will need to involve a discussion about ethics and how the data could be used for other purposes, he stresses. “The more useful [timestamps are] for the NHS, the more attractive it gets for malicious folks to figure out where people are going and who they’re interacting with”. Law enforcement and intelligence agencies could become interested in this data.
Other experts are also concerned. According to an independent assessment by Dr Chris Culnane, a security and privacy consultant, “an adversary could establish the set of BroadcastValues [the values the app broadcasts over Bluetooth] for known associates by observing their initial interaction and matching the timestamps. In effect it provides an easy tool for adversaries (of cyberattacks) to assert their control”, Culnane wrote.
Also, Berry thinks the NHS takes an unnecessary risk in promising the data is never going to be used for law enforcement purposes. In its privacy guidance NHSX promises “the app will not be able to track your location and it cannot be used for monitoring whether people are self-isolating or for any law enforcement purposes”. This could be misleading if groups who want to use the data for their purposes become creative, he says.
The police will already have access to LTE (Long Term Evolution) data, namely metadata from mobile phone towers, and would be able to correlate exactly when the data is uploaded. They can also find out about people’s identities and the anonymised tokens used by handsets.
Law enforcement and intelligence services may have access to all of those types of data anyway: “With access to cellular metadata, police could correlate timing and location metadata for a person with contact events in the NHSX system. They could know that they came in contact with another person. They could find it was sent by this application through this cellular provider and what their International mobile subscriber identity (IMSI) number is, which is registered to them and their postal address”, he says.
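The effect of timestamp precision on this kind of correlation can be shown with a small Python sketch. All data, device names and the `candidates` function are hypothetical; the sketch only demonstrates that a tight time tolerance narrows the set of matching network events, while a loose one leaves several candidates.

```python
def candidates(contact_time_s, network_events, tolerance_s):
    """Return the devices whose network event falls within the tolerance
    of the recorded contact time (all times in seconds)."""
    return [
        device
        for device, event_time_s in network_events
        if abs(event_time_s - contact_time_s) <= tolerance_s
    ]


# Hypothetical cellular metadata: (device label, time of an upload in seconds).
network_events = [
    ("device-A", 1000.120),
    ("device-B", 1000.900),
    ("device-C", 1450.0),
    ("device-D", 1890.0),
]

contact_time_s = 1000.123  # hypothetical contact event with millisecond precision

precise = candidates(contact_time_s, network_events, tolerance_s=0.01)
coarse = candidates(contact_time_s, network_events, tolerance_s=450)
```

In this made-up data set, the millisecond-level match singles out one device, whereas a tolerance of several minutes leaves three possible devices.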
In his view, the problem is not necessarily of a technical nature. Instead, it is a legal one: whether the authorities are allowed to dig so deeply into the data.
Berry thinks NHSX should explain what the granular timestamps are for: “It’s easy for application developers to say, ‘we have access to a timer that’s accurate to two milliseconds, having this precision may come in handy later’”.
It is also feasible to track citizens without them knowing. There are connector points that are being stored on the server. Many home automation hubs and devices contain chipsets that support BLE. The current version of the software/firmware running on these devices may not support the tracking of BLE-based contact-tracing events. These features could be enabled without people being aware of it. In theory, says Berry, passive sniffing devices could be placed in various GPS locations to collect data on what people do and where they are going. Technically, there is nothing stopping the police from doing so. They could place them in strategic areas and record NHSX data. “For this, the NHS doesn’t even need to be privy to the police doing such a thing until after”, he says.
There are limits, though. The system is intended to store only four weeks’ worth of keys that allow matching. “The NHS is trying to have a clean slate every four weeks”, Berry says. It’s a sliding window of data that is at most four weeks old. However, the server-side code of the back end isn’t available online, so the NHS’s assertions about how long data is retained cannot be reviewed.
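A four-week sliding window of this kind can be sketched in a few lines of Python. Because the back-end code is not public, this is only an assumption about how such a retention rule might look; the names `RETENTION` and `prune` and the record layout are hypothetical.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(weeks=4)  # hypothetical constant matching the stated four weeks


def prune(records, now):
    """Keep only (key, stored_at) records that are at most four weeks old."""
    cutoff = now - RETENTION
    return [(key, stored_at) for key, stored_at in records if stored_at >= cutoff]


# Hypothetical stored keys with the time they were recorded.
now = datetime(2020, 5, 29, tzinfo=timezone.utc)
records = [
    ("key-old", datetime(2020, 4, 20, tzinfo=timezone.utc)),     # older than four weeks
    ("key-recent", datetime(2020, 5, 15, tzinfo=timezone.utc)),  # inside the window
]
kept = prune(records, now)
```

Running a rule like this on a schedule would keep the stored data to a rolling four-week window, but whether the real system does so cannot be checked from the published code.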
Another issue is crowds. The app may fail with too many potential connections in one spot.
“If you have a sparse group of people that are in range, this approach could be perfectly effective. How effective it is when you really load-test it and you turn it on in a crowded subway or the Tube in London with a lot of people in range, is a different question”.
A handset can hold at most five to eight concurrent connections over BLE. This is a hardware and operating system limitation. “If you’re in a very busy area, you may not be able to track all of the people that are around you”, he says.
E&T spoke to an IET employee on the Isle of Wight, where the contact-tracing app was piloted and where it showed wide-ranging security flaws. Paul Deards from the IET’s books publishing team installed the app on his phone and says that, so far, it has worked without complication. The little bar on the top of his phone screen has not picked up any connections. On the concern about crowds he says: “It’s really not going to be a problem on the Isle of Wight. I’ve lived here three years. I don’t think I ever walked through a crowd.”
Google Analytics tracking is another problem. More detail and context of the interaction could be available to Google, Berry thinks. URLs fetched by the Android and iOS Covid-19 apps include Urchin Tracking Module (UTM) tags. A UTM tag is a simple piece of code that can be attached to any URL to generate Google Analytics data for digital campaigns. UTM tags are appended to HTTP requests and help track the progress of a campaign across online platforms. These requests in turn typically get sent to Google Analytics for web tracking, which makes the embedded tags visible in the website team’s Google Analytics dashboard, he says.
Allowing Google Analytics tracking could help groups to identify users. It’s possible to use this information to help correlate a user’s Google identity, including name, email address and so on, with that user’s new NHSX app registration, which the user may want to keep as anonymous as possible. With sufficient access to the data collected by Google, as well as the NHSX tracing back end, Berry is confident a correlation is possible between the NHSX-stored sonarID, which identifies a given user device, and the Google identity.
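For readers unfamiliar with UTM tags, the following Python sketch shows what they look like in a URL and how they can be separated from the rest of the query string. The URL and the function name `split_utm` are hypothetical and are not taken from the NHSX apps; only the standard `utm_` parameter prefix is assumed.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def split_utm(url):
    """Return (utm_tags, url_without_utm_tags) for the given URL."""
    parts = urlsplit(url)
    params = parse_qsl(parts.query, keep_blank_values=True)
    # UTM tags are ordinary query parameters whose names start with "utm_".
    utm = {name: value for name, value in params if name.startswith("utm_")}
    kept = [(name, value) for name, value in params if not name.startswith("utm_")]
    clean = urlunsplit(parts._replace(query=urlencode(kept)))
    return utm, clean


# Hypothetical URL of the kind an app might fetch.
tags, clean_url = split_utm(
    "https://example.org/advice?page=symptoms&utm_source=app&utm_medium=android"
)
```

Here `tags` holds the campaign parameters that an analytics service would record, and `clean_url` is the same address with those parameters stripped out.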
At the beginning of May, the UK government’s contact-tracing app failed important tests, namely those needed to be included in the NHS Apps Library, including cyber security, clinical safety, and performance.
It was only possible to find so many security flaws thanks to the heightened scrutiny by the software engineering community, says Paul Farrington, EMEA CTO at Veracode, a US app security firm: “There has been a great deal of scrutiny of the software from ethical hackers and security researchers, and perhaps predictably, security bugs have been found”.
The NHSX contact-tracing app is no outlier in its sector, either. Some 52 per cent of healthcare apps have severity four or five flaws, a Veracode report found. Security bugs also take longer to fix in healthcare than in other industries, experts found.
Developers helping to check the open-source code on GitHub has one big advantage, Mark Richards, software engineer and online privacy researcher, points out: “People can see experts debating about the technical quality of NHS apps. But whether it is pacemakers or mobile apps, why can’t all NHS software be open to the same review and scrutiny?”.