
How a National Healthcare Authority Broke Data Silos with Google Cloud & AI


Challenge: Overcoming Data Silos and Operational Bottlenecks

A major national healthcare authority (referred to hereafter as “The Client Organisation”) initiated a project to develop a comprehensive system for strategic insights and market segmentation. The primary objective was to consolidate and categorize structured and unstructured data from multiple sources and databases.

The complexity of the challenge centered on data acquisition and quality control:

  1. Data Disparity and Location: The system required gathering disparate data from internal sources (such as Informatics, Healthcare, and Communications departments) and numerous external web sources, which often included unstructured documents like PDFs, images, or general files. The majority of internal data resided in a Microsoft OneDrive environment, which had to be seamlessly integrated with the new platform.
  2. Integration Roadblocks: Automating the Extraction, Transformation, and Loading (ETL) pipelines proved challenging due to access restrictions. For instance, attempts to connect directly via APIs to social media platforms like Meta (Facebook, Instagram), TikTok, and LinkedIn were rejected under platform policies, necessitating workarounds such as extracting the data through OneDrive. Furthermore, external data sources requiring web scraping, such as those detailing Elite Member Hospitals or internationally accredited facilities, presented technical difficulties due to non-standardized web structures, advanced JavaScript usage, or dynamic URLs.
  3. Data Quality and Standardization: Data cleaning was a critical prerequisite. The sources exhibited common issues like file name inconsistencies, missing values, facility code duplication, and non-standardized country identification. For example, the system needed to standardize country names across submissions, handling over 239 potential variations or entries often filed under ‘OTHERS’.
  4. Stringent Timeline: The scope included multiple complex deliverables (data consolidation, Big Data Analytics, AI-driven market insights, Information Portal, and Data Monetization) to be delivered within a defined project period.
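
The country-name standardization described in item 3 can be illustrated with a minimal sketch. It assumes a small, hypothetical alias table and maps anything it does not recognize to ‘OTHERS’; the names `COUNTRY_ALIASES`, `normalize_key`, and `standardize_country` are illustrative and not taken from the project.

```python
import unicodedata

# Hypothetical alias table for illustration only; a real table would
# need to cover far more spelling variants than the few shown here.
COUNTRY_ALIASES = {
    "indonesia": "Indonesia",
    "republic of indonesia": "Indonesia",
    "uk": "United Kingdom",
    "united kingdom": "United Kingdom",
    "great britain": "United Kingdom",
}


def normalize_key(raw: str) -> str:
    """Strip accents, drop periods, turn other punctuation into spaces,
    lowercase, and collapse repeated whitespace."""
    text = unicodedata.normalize("NFKD", raw)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = text.replace(".", "")
    text = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in text)
    return " ".join(text.lower().split())


def standardize_country(raw: str) -> str:
    """Return the canonical country name, or 'OTHERS' when no alias matches."""
    return COUNTRY_ALIASES.get(normalize_key(raw), "OTHERS")


if __name__ == "__main__":
    for entry in ["  U.K. ", "REPUBLIC OF  INDONESIA", "Atlantis"]:
        print(repr(entry), "->", standardize_country(entry))
```

A lookup table like this only covers variants someone has already catalogued, which is one reason entries can still end up under ‘OTHERS’.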

 

Solution: A Cloud-Native, AI-Enabled Data Backbone

The Service Provider, MatrixC, deployed a cloud-native architecture leveraging the Google Cloud Platform (GCP), using separate development and production environments. This multi-phased approach ensured robust data governance and analytical capability.

Phase 1: Data Consolidation and ETL Automation

The foundational work focused on ingesting and structuring disparate data into a centralized repository.

  • Core Infrastructure: BigQuery (BQ) was established as the serverless, large-scale data warehouse for efficient querying and storage of large volumes of structured data.
  • Data Pipelines: Data ingestion used automated ETL pipelines orchestrated by Cloud Dataflow, Cloud Scheduler, and Cloud Functions/Cloud Run. These pipelines extracted data from two cloud services: the Client Organisation’s existing OneDrive and Google Cloud Storage (GCS), which served as the raw data lake.
  • AI for Data Cleansing: To achieve rapid data standardization, particularly the country name reconciliation required for Informatics data, the team used the Gemini Large Language Model (LLM). Because the required advanced version of Gemini was not yet available in the preferred Asia-Southeast region, the data was temporarily held in the US data center (us-central1) to enable critical AI-powered cleaning.
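
As a rough sketch of how LLM-assisted reconciliation might be wired up, the example below builds a prompt, passes it to a pluggable `llm` callable, and accepts only answers that appear in an allowed list. Everything here is an assumption for illustration: the prompt wording, the `CANONICAL` subset, and the `fake_llm` stand-in are hypothetical, and no real Gemini API call is shown.

```python
import json
from typing import Callable, Dict, List

# Illustrative subset of canonical names (assumption, not the project's list).
CANONICAL = ["Indonesia", "United Kingdom", "Bangladesh"]


def build_prompt(raw_names: List[str]) -> str:
    """Ask the model for a JSON object mapping raw entries to allowed names."""
    return (
        "Map each raw country entry to exactly one name from the allowed list, "
        'or to "OTHERS" if none fits. Reply with a JSON object only.\n'
        f"Allowed: {json.dumps(CANONICAL)}\n"
        f"Raw entries: {json.dumps(raw_names)}"
    )


def reconcile(raw_names: List[str], llm: Callable[[str], str]) -> Dict[str, str]:
    """Call the model once, then validate every answer against CANONICAL.

    Missing, malformed, or unlisted answers fall back to 'OTHERS' so the
    model can never introduce a country name outside the allowed list.
    """
    try:
        proposed = json.loads(llm(build_prompt(raw_names)))
    except json.JSONDecodeError:
        proposed = {}
    if not isinstance(proposed, dict):
        proposed = {}
    allowed = set(CANONICAL)
    result = {}
    for name in raw_names:
        guess = proposed.get(name)
        valid = isinstance(guess, str) and guess in allowed
        result[name] = guess if valid else "OTHERS"
    return result


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model call, used only to run the sketch."""
    return json.dumps(
        {"Indonesa": "Indonesia", "U.K": "United Kingdom", "Narnia": "Atlantis"}
    )


if __name__ == "__main__":
    print(reconcile(["Indonesa", "U.K", "Narnia", "Mars"], fake_llm))
```

Validating the reply against a fixed list keeps the model's output reviewable, since every cleaned value is either a known country name or an explicit ‘OTHERS’.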

 

Phase 2: Insights Generation and Knowledge Retrieval

This phase implemented applications for user interaction and intelligence delivery:

  • Information Portal: A centralized front-end web application was deployed, hosting key AI-driven modules.
    •  Document AI (OCR): This module employed the Gemini LLM for Optical Character Recognition (OCR) screening, automating the extraction of data (such as medical information, travel details, and patient demographics) from supporting documents like e-Visa, Meet & Greet Services, and Visa Extension application PDFs and images. This automation drastically reduced manual data entry and improved processing speed, with an expected field-level accuracy of ≥95% (a human check is still required for validation).
    • Enterprise Search (STAX): An AI-powered search and summarization tool using Vertex AI was deployed. This capability allows users to search the organization’s document repository, including documents uploaded manually in bulk as CSV/URL lists or in PDF format, and retrieve highly relevant results quickly, with results loading in 3 seconds or less. The AI assistant was named AIRA (Artificial Intelligence for Research and Analytics).
  • Visualization and Analytics: Interactive dashboards for data analysis and visualization were developed using Looker Studio, enabling self-service analytics for internal users without relying on technical teams.
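
To make the ≥95% field-accuracy expectation for the OCR module concrete, the sketch below shows one plausible way such a figure could be computed from a human-verified sample. The matching rule (case-insensitive, whitespace-collapsed) and the field names are assumptions; the source does not say how accuracy was actually measured.

```python
from typing import Dict, List

ACCURACY_TARGET = 0.95  # the >=95% field-accuracy expectation quoted in the text


def _norm(value: str) -> str:
    """Collapse whitespace and ignore case before comparing values."""
    return " ".join(value.split()).casefold()


def field_accuracy(extracted: Dict[str, str], verified: Dict[str, str]) -> float:
    """Share of human-verified fields that the extraction got right."""
    if not verified:
        raise ValueError("verified must contain at least one field")
    matches = sum(
        1
        for field, expected in verified.items()
        if _norm(extracted.get(field, "")) == _norm(expected)
    )
    return matches / len(verified)


def mismatched_fields(extracted: Dict[str, str], verified: Dict[str, str]) -> List[str]:
    """Fields a human reviewer would need to correct."""
    return [
        field
        for field, expected in verified.items()
        if _norm(extracted.get(field, "")) != _norm(expected)
    ]


if __name__ == "__main__":
    # Hypothetical field names and values, for illustration only.
    verified = {
        "passport_no": "A1234567",
        "full_name": "Jane Doe",
        "nationality": "Indonesia",
        "arrival_date": "2024-05-01",
    }
    extracted = dict(verified, full_name="jane  doe", arrival_date="2024-05-07")
    score = field_accuracy(extracted, verified)
    print(score, score >= ACCURACY_TARGET, mismatched_fields(extracted, verified))
```

In this sample three of four fields match, so the document falls below the target and the mismatched field is the one a reviewer would check first.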

 

Phase 3: Advanced Analytics and Data Monetization

The final phase introduced strategic planning and external data sharing:

  • Predictive Models: Vertex AI was leveraged to develop and operationalize predictive analytical applications, including models for Market Segmentation, Country Profiling (demographic and treatment preferences), and Revenue Forecasting.
  • Data Monetization (Data-as-a-Service): The system enabled the Client Organisation to securely share derived datasets with external partners using GCP’s Analytics Hub. Access was managed via role-based security and defined access tiers (T1 to T5), controlling the granularity of shared data (e.g., by month, country, region, state, or facility type) to unlock new revenue streams.
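
The idea of access tiers controlling granularity can be sketched as follows. The mapping of T1 to T5 onto specific dimensions, the `patients` measure, and the function name `share_view` are all hypothetical; the source names the tiers and the candidate dimensions but not how they combine, and this sketch does not use the Analytics Hub API.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical tier definitions: each tier exposes one more dimension.
TIER_DIMENSIONS: Dict[str, Tuple[str, ...]] = {
    "T1": ("month",),
    "T2": ("month", "country"),
    "T3": ("month", "country", "region"),
    "T4": ("month", "country", "region", "state"),
    "T5": ("month", "country", "region", "state", "facility_type"),
}


def share_view(rows: List[dict], tier: str) -> List[dict]:
    """Aggregate the 'patients' measure to the dimensions a tier may see."""
    dims = TIER_DIMENSIONS[tier]
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[d] for d in dims)
        totals[key] += row["patients"]
    # Sort by key so the output order is deterministic.
    return [dict(zip(dims, key), patients=n) for key, n in sorted(totals.items())]


if __name__ == "__main__":
    rows = [
        {"month": "2024-01", "country": "Indonesia", "patients": 10},
        {"month": "2024-01", "country": "Indonesia", "patients": 5},
        {"month": "2024-01", "country": "Bangladesh", "patients": 2},
    ]
    print(share_view(rows, "T1"))
    print(share_view(rows, "T2"))
```

Aggregating before sharing means a lower-tier partner never receives the finer-grained rows at all, rather than receiving them with columns hidden.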

Impact: Data-Driven Strategy and Operational Efficiency

The project delivered a comprehensive, structured data intelligence system:

  • Unified Data Source: The system successfully consolidated complex, multi-source, multi-format healthcare and market data into a single, scalable BigQuery data warehouse, overcoming previous organizational silos.
  • Operational Automation: The deployment of AI-powered OCR significantly streamlined administrative processes within the Healthcare division, automating the extraction of information from patient application documents, which previously required extensive manual effort.
  • Enhanced Decision Making: The deployment of predictive models using Vertex AI enabled the Client Organisation to move beyond historical reporting to leverage AI for advanced analytical market segmentation and revenue forecasting, directly contributing to strategic activation plans.
  • Project Delivery: The defined scopes, including data management, data architecture, Big Data Analytics, AI-driven insights, Information Portal, and Data Monetization, were successfully implemented.

 

The success of integrating such complex and diverse sources of information, from external public websites that resist scraping to internal PDF attachments, and unifying them into a central system accessible via AI search and predictive models, is analogous to building a command center for a massive logistics network. Previously, data updates arrived via scattered maps, handwritten notes, and separate fax messages, making real-time planning impossible. The new system integrates all those inputs automatically, translates them into a single digital format, and then feeds them into high-speed planning computers, enabling strategic, informed decisions based on the complete, coherent picture of the entire operation.

 
