The Limited Times


Coronavirus: How swarm intelligence could automate corona diagnosis

2021-05-29T19:16:54.957Z


Artificial intelligence needs huge amounts of data, which often conflicts with data protection. A medical team is now reporting on a new type of software that links decentralized databases while preserving confidentiality.



X-ray (Pittsburgh, USA, 2009)

Photo: Keith Srakocic / AP

A new type of computer system is intended to help diagnose complex diseases by pooling the knowledge of many specialists who work in hospitals all over the world.

Blockchain technology, familiar from cryptocurrencies such as Bitcoin or Dogecoin, also plays a role, but more on that later.

So-called artificial intelligence can already help to detect patterns in laboratory data.

The catch: it only works if the artificial neural networks can evaluate huge amounts of patient data, which is therefore stored in gigantic databases, also known as “data lakes”.

For privacy advocates, however, this pooling of sensitive patient data is a nightmare: it is considered prone to data leaks and is a valuable target for hackers. Now a team has tested how a machine learning system could square the circle: evaluate huge amounts of data while keeping data collection to a minimum and working in a decentralized way.

"Swarm Learning" is what the team calls this approach, which they are presenting on Wednesday in the science journal "Nature": learning in swarms.

“Medical research data is a treasure. It can make a decisive contribution to developing personalized therapies that are more precisely tailored to each individual than conventional treatments,” says Joachim Schultze, 56, lead author of the article and director of systems medicine at the German Center for Neurodegenerative Diseases (DZNE). The DZNE focuses on researching diseases such as Parkinson's and Alzheimer's and cooperates with universities, university clinics, research institutions and companies around the world. The new system is intended to help with this research work.

For its experiment, Schultze's team evaluated diagnostic data for four different diseases and used it to train various computer models. It examined certain genetic data (so-called transcriptomes) from people who suffer from blood cancer, Covid-19, tuberculosis and other lung diseases. To do this, the group evaluated data from 127 clinical studies. It also analyzed more than 95,000 X-ray images. The diagnostic AI trained with this data quickly learned to diagnose the respective diseases with an accuracy of 90 percent for transcriptome data. With X-ray images, the accuracy was around ten percent lower, mainly because of the low image quality.

That is a respectable result, comparable to what conventional methods achieve. What is special in this case is that the more than 100,000 data records were not stored and processed centrally in a data lake, but remained on site on the computers of the participating research institutions, at up to 32 different locations. All that was passed to the AI model was which features distinguish the data records of the sick from those of the healthy. The computation is decentralized; only the final result is communicated. This data minimization is the key to the new swarm learning approach, writes Schultze's team: "Global cooperation with complete confidentiality".
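
The idea that raw records stay on site while only what the model has learned is exchanged can be sketched in a few lines of Python. This is a minimal illustration, not the team's software: the linear model, the simple averaging of parameters, and the names `local_step`, `merge` and `swarm_round` are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple

# A record is (features, label). Records never leave the site that owns them.
Record = Tuple[Sequence[float], float]


def local_step(weights: Sequence[float], records: Sequence[Record],
               lr: float = 0.1) -> List[float]:
    """One gradient step of a linear model on one site's private records."""
    grad = [0.0] * len(weights)
    for features, label in records:
        pred = sum(w * x for w, x in zip(weights, features))
        err = pred - label
        for i, x in enumerate(features):
            grad[i] += err * x / len(records)
    return [w - lr * g for w, g in zip(weights, grad)]


def merge(site_weights: Sequence[Sequence[float]]) -> List[float]:
    """Average the parameter vectors; only these vectors are shared (assumed rule)."""
    n = len(site_weights)
    width = len(site_weights[0])
    return [sum(ws[i] for ws in site_weights) / n for i in range(width)]


def swarm_round(weights: Sequence[float], sites: Sequence[Sequence[Record]],
                lr: float = 0.1) -> List[float]:
    """Each site trains locally, then the resulting parameters are merged."""
    return merge([local_step(weights, records, lr) for records in sites])
```

In this sketch, `swarm_round` only ever sees the output of `local_step`, so the records themselves are never passed between sites.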

A lot of health data is currently stored in separate silos because there is no privacy-preserving way to link them. The rarer an illness, the more difficult it currently is to collect the necessary number of data sets from all over the world for training learning algorithms. Swarm learning could help with this kind of networking across national borders.

From a purely technical point of view, it is of course possible to cleanly anonymize the data in a central data lake. The only question is: does everyone involved trust the central facility to do this properly, especially if it is in a country whose laws and customs are unfamiliar? If, with swarm learning, the sensitive raw data is never transferred to a remote control center somewhere in the world, but instead remains with the respective hospital, stored as securely as possible, then this creates an additional hurdle against possible data leaks. Attackers would have to break into dozens of servers in order to collect all the sensitive raw data with which the AI model was trained.

A similar approach is so-called "Federated Learning", in which the data for training an AI is also stored in a decentralized manner. The difference from swarm learning is that the intermediate results are transferred to a central server. For Schultze's experiment, on the other hand, all 32 participants agreed beforehand on rules for how the data is collected, processed and distributed. These rules are recorded in a blockchain, a data protocol that works like a kind of digital contract, as is also used with cryptocurrencies such as Ethereum. With the medical blockchain, however, not everyone is allowed to feed in data, only institutions that hold a special security token. This makes the medical blockchain fast and lean, unlike some extremely sluggish cryptocurrencies. "All members of the swarm have equal rights," says Schultze: "There is no central power over what happens and over the results, so in a sense there is no spider controlling the data web."

A random draw on the swarm's blockchain determines which of the participating research institutions or hospitals collects the interim results of all participants on its servers for a while and automatically feeds them into the joint AI model.

After a while, it is the next hospital's turn to act as record keeper, so to speak.
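
The random draw and hand-over described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the blockchain protocol used in the study: there is no consensus mechanism here, and the names `admitted` and `pick_collector`, as well as the rule of skipping to the next member when one is offline, are hypothetical.

```python
import random
from typing import Iterable, List, Set


def admitted(members: Iterable[str], token_holders: Set[str]) -> List[str]:
    """Keep only institutions that hold a security token (permissioned swarm)."""
    return [m for m in members if m in token_holders]


def pick_collector(members: List[str], online: Set[str],
                   rng: random.Random) -> str:
    """Draw a random member; if it is offline, fall through to the next one."""
    if not members:
        raise RuntimeError("no admitted members")
    start = rng.randrange(len(members))
    for offset in range(len(members)):
        candidate = members[(start + offset) % len(members)]
        if candidate in online:
            return candidate
    raise RuntimeError("no member is online")
```

Because every admitted member can be drawn and an offline member is simply skipped, no single server is required for the swarm to keep working.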

"This makes our system very fault-tolerant," says Schultze: "If a network node fails, the next server simply takes over the same function."

Gray box instead of black box

Much of swarm learning seems promising, but there are still a few possible problems, as a team led by computer scientist Nicola Rieke pointed out in the journal "Digital Medicine" last September. For example, an attacker could possibly draw conclusions about the original data of individual participants through clever back-calculation from the AI model, speculates Rieke's team. If, for example, the AI model of an illness changes sharply shortly after a certain hospital has supplied its data, then the other hospitals involved could theoretically use a so-called "model inversion" to calculate backwards from the change in the AI model which raw data must have been fed in last. And when it comes to extremely rare diseases with a low number of cases per country, the information reconstructed in this way could, in the worst case, even be traced back to individual people. In the case of a central data lake, on the other hand, it would be more difficult to see which data delivery caused the AI model to change.
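
Why a sudden change in a shared model can leak information is easiest to see in a toy case. The following sketch assumes, purely for illustration, that the joint model is a plain average of each site's parameter vector; it is a simple differencing calculation, not an actual model inversion attack, and the function name `recover_contribution` is hypothetical.

```python
from typing import List, Sequence


def recover_contribution(mean_before: Sequence[float],
                         mean_after: Sequence[float],
                         sites_before: int) -> List[float]:
    """Recover the newest site's parameters from the average before and after it joined."""
    n = sites_before
    return [(n + 1) * after - n * before
            for before, after in zip(mean_before, mean_after)]


# Two sites average to [2.0, 3.0]; after a third joins, the average is [3.0, 4.0].
print(recover_contribution([2.0, 3.0], [3.0, 4.0], sites_before=2))  # [5.0, 6.0]
```

Note that this recovers only the newest site's model parameters, not its patient records; getting from parameters back to raw data is the harder step that the two teams assess differently.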

When asked, Joachim Schultze gives his assurance that it is mathematically impossible to calculate back from the AI model to the original data of individual patients. But this data minimization and decentralization has its price. If it is no longer possible to trace the raw data with which an AI model was trained, then the model could become a black box, a kind of oracle that is 90 percent correct, but whose results can hardly be checked in case of doubt. Computer scientists who follow the "explainable AI" approach warn against this; they call for AI whose decision-making patterns are fully transparent. However, it is likely to be difficult to have both at the same time, because there is a trade-off between perfect data protection and perfectly transparent AI. Nevertheless, swarm learning seems to be on the trail of a compromise. "We can understand the learning process of our system very well," says Schultze: "We are not dealing with a black box - at best you could call it a gray box."

Source: spiegel
