Mar 16 2017
Researchers studying a deal in which Google's artificial intelligence subsidiary, DeepMind, acquired access to millions of sensitive NHS patient records have warned that more must be done to regulate data transfers from public bodies to private firms.
The academic study says that "inexcusable" mistakes were made when, in 2015, the Royal Free NHS Foundation Trust in London signed an agreement with Google DeepMind. This allowed the British AI firm to access sensitive information about 1.6 million patients who use the Trust's hospitals each year.
The access was used to create monitoring software for mobile devices, called Streams, which promises to improve clinicians' ability to support patients with Acute Kidney Injury (AKI). But according to the study's authors, the purposes stated in the agreement were far less specific, and made more open-ended references to using data to improve services.
More than seven months after the deal was put in place, an investigation by New Scientist revealed that DeepMind had gained access to a huge number of identifiable patient records and that it was not possible to track how these were being used. They included information about people who were HIV-positive, details about drug overdoses and abortions, and records of routine hospital visits.
As of November 2016, DeepMind and the Trust have replaced the old agreement with a new one. The original deal is being investigated by the Information Commissioner's Office (ICO), which has yet to report any findings publicly. The National Data Guardian (NDG) is also continuing to look into the arrangement. DeepMind retained access to the data that it had been given even after the ICO and NDG became involved, and the app is being deployed.
The new study reviews the original agreement in depth, with a systematic synthesis of publicly available documentation, statements, and other details obtained by Freedom of Information requests. It was carried out by Dr Julia Powles, a Research Associate in law and computer science at St John's College, University of Cambridge, and Hal Hodson, who broke the New Scientist story and is now Technology Correspondent for The Economist.
Both authors say that it is unlikely that DeepMind's access ever represented a data security risk, but that the terms were nonetheless highly questionable, in particular because they lacked transparency and suffered from an inadequate legal and ethical basis for Trust-wide data access.
They say the case should be a "cautionary tale" for the NHS and other public institutions, which are increasingly seeking tech companies' help to improve services, but could, in the process, surrender substantial amounts of sensitive information, creating "significant power asymmetries between citizens and corporations".
"Data mining and machine learning offer huge promise in improving healthcare and clearly digital technology companies will have a major role to play," Powles said. "Nevertheless, we think that there were inadequacies in the case of this particular deal."
"The deal betrays a level of naivety regarding how public sector organisations set up data-sharing arrangements with private firms, and it demonstrates a major challenge for the public and public institutions. It is worth noting, for example, that in this case DeepMind, a machine learning company, had to make the bizarre promise that it would not yet use machine learning, in order to engender trust."
Powles and Hodson argue that the transfer of data to DeepMind did not proceed as it should have, questioning in particular its invocation of a principle known as "direct care". This assumes that an "identified individual" has given implied consent for their information to be shared for uses that involve the prevention, investigation, or treatment of illness.
No patient whose data was shared with DeepMind was ever asked for their consent. Although direct care would clearly apply to those monitored for AKI, the records that DeepMind received covered every other patient who used the Trust's hospitals. These extended to people who had never been tested or treated for kidney injuries, people who had left the catchment area, and even some who had died.
In fact, the authors note that, according to the Royal Free and DeepMind's own announcements, only one in six of the records DeepMind accessed would have involved AKI patients. For a substantial number of patients, therefore, the relationship was indirect. As a result, special permissions should have been sought from the Government, and agencies such as the ICO and NDG should have been consulted. This did not happen.
Such applications of "direct care" have been queried before. In December 2016, Dr Alan Hassey of the NDG, which provides national guidance on the use of confidential information, wrote that "an erroneous belief has taken hold in some parts of the health and care system that if you believe what you are doing is direct care, you can automatically share information on a basis of implied consent". Dr Hassey noted that direct care is not "of itself a catch-all... The crucial thing is that information sharing must be in line with the reasonable expectations of the individual concerned".
The researchers' study also criticises the lack of transparency in the agreement, pointing out that neither party made clear the volume of data involved, nor that it involved so many identifiable records. How that data has been used, and is being used, has never been independently scrutinised. Last week, DeepMind announced plans to develop a new data-tracking system, to make such processes more transparent, at an unspecified future stage.
The authors liken the relationship overall to a one-way mirror. "Once our data makes its way onto Google-controlled servers, our ability to track it - to understand how and why decisions are made about us - is at an end," they write.
The paper says that an obvious lesson is that no such deal should be launched without full disclosure of the framework of documents and approvals which underpins it. In light of the 2013 Caldicott review of sharing of patient information, they write that: "The failure of both sides to engage in any conversation with patients and citizens is inexcusable."
They also suggest that private companies should have to account for their use of public data to properly resourced and independent bodies. Without this, they argue, tech companies could gradually gain an unregulated monopoly over health analytics.
"The reality is that the exact nature and extent of Google's interests in NHS patient data remain ambiguous," the authors add. Powles notes that while Google has no stated plans to exploit the data for advertising and other commercial uses, its unparalleled access to such information, without any meaningful oversight, does not rule out the possibility in future.
"I personally think that because data like this can get out there, we are almost becoming resigned to the idea," Powles added. "This case stresses that we shouldn't be. Before public institutions give away longitudinal data sets of our most sensitive details, they should have to account to a comprehensive, forward-thinking and creative regulatory system."