
Synthetic data case studies
Synthetic data is revolutionizing how organizations leverage their data assets. By preserving the insights of real data without containing sensitive information, synthetic datasets make it possible to securely and rapidly capitalize on data opportunities. Applications include extraction and visualization of business intelligence; advanced analytics; software testing; product demonstrations; and development of AI models for prediction, personalization, profiling, and more.
For these applications, synthetic data will soon overtake real data in processed volume [1]. Organizations must anticipate this change. To help them do so, we have collected some of the benefits of Aindo’s synthetic data platform, along with its success stories.
Synthetic data allows organizations to extract the full value of their data assets. It enables secure and free exchange and analysis of data and removes data shortcomings through augmentation. As such, its key benefits include:
Challenge: A car insurance provider wants to use an internet-of-things (IoT) application to collect and manage customer data. The company collects data through IoT devices installed in its customers’ cars. It needs a platform to manage this data and turn it into business intelligence.
Four potential vendors are offering such platforms. The insurance provider wants product demonstrations from each of them to make an informed decision. Unfortunately, such a demonstration requires the insurer’s sensitive customer data.
Solution: The car insurance provider integrated Aindo’s Synthetic DataOps Platform into its infrastructure and connected it to a relational database containing customer information. Our platform generated a database of artificial customer records with the same format and properties as the sensitive database. This synthetic data was securely generated on-site, without the insurer’s real data ever leaving its original IT environment.
The insurer provided the synthetic dataset to the four potential vendors. These vendors used it to demonstrate their products without needing access to the insurer’s confidential information.
Synthetic data was also used to simulate special events. For example, in an additional experiment the data was rebalanced to over-represent long-distance commuters. This showed how well each vendor’s software responded to changes in customer behavior.
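The rebalancing step described above can be illustrated with a minimal sketch. It uses plain resampling with replacement from the Python standard library, which is an assumption made for illustration and not Aindo’s generative method. The function name `rebalance`, the `commute_km` field, and the 50 km threshold are all hypothetical.

```python
import random

def rebalance(records, is_target, target_share, seed=0):
    """Oversample records matching `is_target` until they make up
    roughly `target_share` of the returned dataset."""
    if not 0 < target_share < 1:
        raise ValueError("target_share must be strictly between 0 and 1")
    rng = random.Random(seed)
    target = [r for r in records if is_target(r)]
    others = [r for r in records if not is_target(r)]
    if not target:
        raise ValueError("no records match the target condition")
    # Solve n_target / (n_target + len(others)) = target_share for n_target.
    needed = round(target_share * len(others) / (1 - target_share))
    # Keep every original target record; add resampled copies if more are needed.
    extra = [rng.choice(target) for _ in range(max(0, needed - len(target)))]
    rebalanced = target + extra + others
    rng.shuffle(rebalanced)
    return rebalanced

# Hypothetical synthetic customer records: daily commute distance in km.
customers = [{"id": i, "commute_km": 80 if i % 10 == 0 else 12} for i in range(100)]

def long_distance(record):
    return record["commute_km"] >= 50

boosted = rebalance(customers, long_distance, target_share=0.4)
share = sum(long_distance(r) for r in boosted) / len(boosted)
print(f"long-distance share: {share:.2f}")
```

A generative model could instead sample new long-distance records rather than duplicate existing ones; the sketch only shows how a target share translates into a number of records.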

Benefits: Through Aindo’s synthetic data, the insurer could make an informed decision, substantially reducing risks. The process also showcased that synthetic data can dramatically shorten software development cycles. Risks were further reduced through data augmentation for simulating special events, showcasing the robustness of each of the products.
Challenge: A telemedicine company wants to leverage AI to improve its predictive model estimating fall risks of remote elderly patients. The company wants to combine its proprietary database with external socio-demographic data sources for a more complete understanding of its patients.
Solution: Synthetic versions are created of the datasets the telemedicine company intends to acquire. These synthetic datasets are harmonized and integrated with the company’s proprietary data. Aindo’s platform also integrates other data sources, including automatically structured transcriptions of phone calls. All this data is combined to build an improved risk estimation model.

Benefits: The project leads to the development of a next-generation risk prediction model. Through the use of synthetic data, new data synergies are explored and data can be monetized directly and safely.
Challenge: A large investment bank wants to offer personalized guidance to small and medium-sized corporate clients. The bank has a large relational database of corporate clients and their business trajectories. Through AI, the bank wants to leverage this database to predict which clients are likely to encounter financial difficulties. It will then tailor advice to these clients’ specific needs.
However, building the required AI methods calls for an external consultant. The consultant needs data access, but the client data is highly confidential and contains trade secrets. Sharing the data goes against the bank’s commitment to discretion.
Solution: A synthetic client database is created with the same format and properties as the real database. The generation process takes place on a dedicated server at the bank. Hence, the data never leaves its original institution and remains subject to the bank’s customary privacy protocols and standards.
The fidelity, privacy and utility of the synthetic dataset are assessed to verify that quality and safety standards are met. The synthetic data is then provided to the external consulting firm. This firm builds a predictive AI model using the synthetic dataset. The model can then be employed by the bank to better tailor advice to clients.
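To make the assessment step more concrete, here is a minimal sketch of two simple checks: a fidelity proxy (total variation distance between category frequencies) and a privacy proxy (the share of synthetic rows that exactly replicate a real row). These are generic illustrative metrics, not the ones Aindo’s platform uses, and the toy records and column meanings are hypothetical. Utility is not covered here; it is usually measured by training a model on synthetic data and evaluating it on held-out real data.

```python
from collections import Counter

def total_variation_distance(real_values, synthetic_values):
    """Fidelity proxy: 0.0 means identical category frequencies, 1.0 means disjoint."""
    real_freq = Counter(real_values)
    synth_freq = Counter(synthetic_values)
    categories = set(real_freq) | set(synth_freq)
    return 0.5 * sum(
        abs(real_freq[c] / len(real_values) - synth_freq[c] / len(synthetic_values))
        for c in categories
    )

def exact_copy_rate(real_rows, synthetic_rows):
    """Privacy proxy: share of synthetic rows that replicate a real row exactly."""
    real_set = set(real_rows)
    return sum(row in real_set for row in synthetic_rows) / len(synthetic_rows)

# Hypothetical toy data: (sector, employees, risk_flag) per corporate client.
real = [("retail", 12, 0), ("retail", 40, 1), ("tech", 8, 0), ("tech", 95, 0)]
synthetic = [("retail", 15, 0), ("tech", 8, 0), ("tech", 60, 1), ("retail", 33, 1)]

sector_tvd = total_variation_distance([r[0] for r in real], [s[0] for s in synthetic])
risk_tvd = total_variation_distance([r[2] for r in real], [s[2] for s in synthetic])
copy_rate = exact_copy_rate(real, synthetic)
print(sector_tvd, risk_tvd, copy_rate)
```

A low distance suggests the synthetic column preserves the real distribution, while a non-zero copy rate flags rows that deserve a closer look before the data is shared.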

Benefits: Synthetic data allows the bank to consult external experts and service providers easily and safely. The bank can thus benefit from innovative AI methods to personalize its product offering.
Challenge: A hospital wants to optimize the oncology patient journey. They have a large database of electronic health records (EHR) from previous patients. Through consulting, they know that by leveraging this database, they could improve the patient journey by detecting pathological signs early; improving and personalizing treatments; and offering guided support.
Unfortunately, the database is subject to substantial privacy restrictions. The EHR data is also unstructured, with information collected in text form. This makes analysis challenging at scale. Granting access to external data scientists for AI development involves time-consuming, costly processing steps.
Solution: The EHR data is automatically structured through Aindo’s generative AI technology. All involved attributes are recognized automatically and represented in tables. Subsequently, a synthetic database is created to mimic these tables, without containing sensitive information about real patients. This synthetic dataset can readily be transferred to an AI team.
The team uses the synthetic data to build three AI tools: a diagnostic model that helps physicians identify a collection of oncological pathologies; a prognostic model that predicts the risk of patients developing oncological pathologies based on attributes in their EHRs; and a model that helps optimize the administration of treatments to oncology patients.

Benefits: Because the synthetic data is rapidly available, the project takes only two months. This is a 78% decrease compared to the typical nine-month duration of AI projects in healthcare. This pace is possible because Aindo’s platform removes the need for cumbersome manual data preparation and anonymization processes and protocols. Similarly, the budget involved is significantly lower than for previous projects of similar scope.
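As a quick check of the figure above, the relative reduction from nine months to two can be computed directly. The helper name `percent_decrease` is introduced here for illustration only.

```python
def percent_decrease(baseline, actual):
    """Relative reduction of `actual` versus `baseline`, in percent."""
    return 100 * (baseline - actual) / baseline

# Durations in months: typical healthcare AI project vs. this project.
reduction = percent_decrease(baseline=9, actual=2)
print(f"{reduction:.1f}%")  # prints "77.8%", which rounds to 78%
```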
[1] Judah, S., White, A., Sicular, S., Jones, C.J., De Simoni, G., Friedman, T., Beyer, M., Heizenberg, J. and Parker, S. (2020). “Gartner Predicts 2021: Data and Analytics Strategies to Govern, Scale and Transform Digital Business.”
