Big Data Analytics for Containing the Spread of SARS-CoV-2

Chi-Mai Chen
Hong-Wei Jyan
Shih-Chieh Chien
Hsiao-Hsuan Jen
Chen-Yang Hsu
Po-Chang Lee
Chun-Fu Lee
Yi-Ting Yang
Meng-Yu Chen
Li-Sheng Chen
Hsiu-Hsi Chen
Chang-Chuan Chan

After learning of the outbreak on the Diamond Princess cruise ship on February 5, 2020, the Central Epidemic Command Center (CECC) formed a task force on February 6, 2020, to begin the preliminary investigation. Contact tracing was recommended for people who might have come into contact with passengers who were already infected. The design and process of the contact investigation and management are described below.

Because the passengers took a 1-day excursion on January 31, 2020, while the Diamond Princess cruise ship was docked at Keelung harbor, the team designed possible solutions for tracing their routes from their itinerary in Taiwan. As it was impossible to interview every passenger retrospectively, the methods used to determine where and when contact could have occurred were classified into four main categories: GPS in the shuttle buses, credit card transaction logs, closed-circuit television (CCTV), and mobile position data.

Among the four categories, mobile geopositioning was the mainstay for identifying passengers’ routes in the COVID-19 contact investigation, because mobile position data provided more accurate information on the location and time of exposure. This method overcomes the incomplete coverage of the shuttle bus GPS, card transactions, and CCTV, each of which represented only some of the passengers. These three methods were used instead to cross-validate the routes estimated from the mobile position data.

The mobile position data for more than 3000 passengers on January 31, 2020, were obtained from five local mobile phone companies. The positioning measurements were accurate to within 150 meters of the true location of the phone, which set the geolocation accuracy for identifying possible contact persons. Mobile positioning might not be as exact as GPS, but GPS tracking may infringe on individual confidentiality. Contact locations were ascertained from roaming signals registered at multiple mobile base stations between 5 AM and 8 PM; locations with an exposure time of over 30 minutes were recognized as the major tracking routes. Because the mobile signals were registered to the base stations of five domestic telecom operators, the first challenge was to identify the 3000 passengers among all tourists in the Keelung area. According to the record, the cruise ship was moored at the harbor from 6 AM to 6 PM. We therefore checked the data from 1 hour before to 2 hours after the mooring period (5 AM to 8 PM). This confirmed the exact mobile phone numbers of the people who traveled on the cruise ship.
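The window-and-dwell filter described above can be sketched as follows. This is a minimal illustration, not the team's actual pipeline: the record layout `(device_id, station_id, start, end, is_roaming)` and the function names are hypothetical, and summing repeated stays at the same base station is an assumption. Only the 5 AM to 8 PM window and the 30-minute threshold come from the text.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Window from the text: 1 hour before 6 AM to 2 hours after 6 PM on January 31, 2020.
WINDOW_START = datetime(2020, 1, 31, 5, 0)
WINDOW_END = datetime(2020, 1, 31, 20, 0)
MIN_DWELL = timedelta(minutes=30)  # exposure time threshold from the text


def dwell_by_station(records):
    """Sum the time each roaming device spent at each base station inside the window.

    records: iterable of (device_id, station_id, start, end, is_roaming).
    This layout is a hypothetical simplification of telecom signalling data.
    """
    totals = defaultdict(timedelta)
    for device_id, station_id, start, end, is_roaming in records:
        if not is_roaming:
            continue  # assumption: passengers appear as roaming devices
        lo = max(start, WINDOW_START)  # clip the stay to the window
        hi = min(end, WINDOW_END)
        if hi > lo:
            totals[(device_id, station_id)] += hi - lo
    return totals


def tracked_locations(records):
    """Return {device_id: [station_id, ...]} for stays longer than MIN_DWELL."""
    result = defaultdict(list)
    for (device_id, station_id), total in sorted(dwell_by_station(records).items()):
        if total > MIN_DWELL:
            result[device_id].append(station_id)
    return dict(result)
```

In practice the output of such a filter would still need the cross-validation against bus GPS, card transactions, CCTV, and interviews that the protocol describes, since a time window alone cannot separate passengers from other roaming visitors.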

After collecting those phone numbers, the team mapped the approximate locations of those phones. With the assistance of the local government, we found that about 34% of passengers took shuttle buses for local tours and 5.2% took taxis; the others biked or walked around the harbor or nearby areas. More than 24 buses and 50 taxis were covered by interviews and records. The estimated passenger routes were further validated against the itineraries provided by the travel agency. The team then checked the detailed tour information for each route, interviewed taxi drivers in the harbor area about destinations, and integrated all the information to confirm more precisely where passengers had stayed.

The most important part of this stage was identifying the possible positions of the passengers. It also showed how big data analysis can draw on a mixture of different data sources.

At the second stage, we used the passengers’ mobile position information described above to identify the mobile phones of possible contact persons. Citizens who carried a mobile phone and stayed within 500 meters of the marked locations for more than 5 minutes were classified as people who had possibly been in contact with the passengers of the Diamond Princess cruise ship on January 31, 2020.
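The 500-meter, 5-minute rule can be illustrated with a short sketch. It assumes each stay is available as a point estimate `(device_id, lat, lon, minutes)`, which is a simplification, since real mobile position data are tied to base stations rather than exact coordinates. The function names, the great-circle distance, and the choice to add up separate stays near the same location are all illustrative assumptions.

```python
import math
from collections import defaultdict

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters
MAX_DISTANCE_M = 500        # distance threshold from the text
MIN_STAY_MIN = 5            # stay threshold from the text, in minutes


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def possible_contacts(stays, marked_locations):
    """Return sorted device IDs that stayed near any marked location for over 5 minutes.

    stays: iterable of (device_id, lat, lon, minutes); hypothetical layout.
    marked_locations: {name: (lat, lon)} of places the passengers visited.
    Assumption: minutes from separate stays near the same location are added up.
    """
    minutes_near = defaultdict(float)
    for device_id, lat, lon, minutes in stays:
        for name, (mlat, mlon) in marked_locations.items():
            if haversine_m(lat, lon, mlat, mlon) <= MAX_DISTANCE_M:
                minutes_near[(device_id, name)] += minutes
    return sorted({dev for (dev, _), total in minutes_near.items() if total > MIN_STAY_MIN})
```

Because the positioning accuracy reported above is about 150 meters, a 500-meter radius leaves some margin for location error at the cost of including more people than were truly exposed.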

On February 7, 2020, the CECC sent an alert by SMS through the Public Warning System to remind the contact persons to start the mitigation plan. The potential contact persons were advised to quarantine at home and not to engage in public gatherings, to avoid further contact. They were also told to self-monitor for COVID-19–compatible symptoms (fever, cough, and shortness of breath) and to seek medical attention if symptoms developed.

On February 9, the CECC sent a notice to all health care providers describing this event and giving guidance for the management of potential contacts. Health care professionals were advised to perform SARS-CoV-2 testing for symptomatic contacts. After testing, symptomatic contacts were hospitalized as indicated or returned home for self-isolation. Health care professionals were also advised to proactively contact public health authorities to initiate active follow-up of the contacts.

In order to capture those in the contact population who sought medical attention but did not report to public health authorities, we used the National Health Insurance Claims data to track the health status of all subjects with potential contact. Those who were hospitalized due to pneumonia were identified. For those who remained hospitalized but had not been tested for SARS-CoV-2, the health care providers were informed of the potential exposure of the patient and screening for SARS-CoV-2 was suggested.

Because a few asymptomatic patients may have a long duration of COVID-19 development and are very difficult to identify with the reverse transcription–polymerase chain reaction (RT-PCR) test, it is also informative to compare the rates of respiratory syndrome and pneumonia between the contact population (n=627,386 residents) and the general population in Taiwan (n=23,877,447 residents). For these subjects, information on respiratory syndrome or pneumonia cases was ascertained by linkage with the National Health Insurance claims database from January 31, 2020, to March 10, 2020. During this period, subjects with at least one outpatient visit coded with ICD-10 (International Statistical Classification of Diseases and Related Health Problems, 10th Revision) codes “J00” to “J11” were identified as having respiratory syndrome. Subjects who had pneumonia were identified by ICD-10 codes “J12” to “J18”.
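The case definitions above can be expressed as a small classifier over claims records. This is a sketch under assumptions: the `(person_id, icd10_code)` layout and the function names are hypothetical, and the outpatient-visit restriction for respiratory syndrome is assumed to have been applied before the records reach this code. Only the code ranges and the two population sizes are taken from the text; no case counts are implied.

```python
import re

CONTACT_POPULATION = 627_386      # contact population size from the text
GENERAL_POPULATION = 23_877_447   # general population size from the text


def icd_category(code):
    """Map an ICD-10 code to 'respiratory' (J00-J11), 'pneumonia' (J12-J18), or None."""
    match = re.match(r"^J(\d{2})", code.strip().upper())
    if not match:
        return None
    number = int(match.group(1))
    if number <= 11:
        return "respiratory"
    if 12 <= number <= 18:
        return "pneumonia"
    return None


def count_cases(visits):
    """Count distinct persons per category.

    visits: iterable of (person_id, icd10_code); hypothetical claims layout.
    A person with several qualifying visits is counted once per category.
    """
    people = {"respiratory": set(), "pneumonia": set()}
    for person_id, code in visits:
        category = icd_category(code)
        if category is not None:
            people[category].add(person_id)
    return {category: len(ids) for category, ids in people.items()}


def rate_per_1000(cases, population):
    """Cases per 1000 residents, used to compare the two populations."""
    return 1000 * cases / population
```

Applying `count_cases` separately to the contact and general populations and dividing by `CONTACT_POPULATION` and `GENERAL_POPULATION` gives rates that can be compared directly, despite the large difference in population size.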
