Developing and validating AI algorithms in prostate MRI – An evolving journey

Prostate cancer remains one of the most challenging diseases for radiologists to detect and diagnose accurately, a challenge that AI prostate MRI solutions aim to address. The disease's multifaceted nature and the intricacies of interpreting multiparametric MRI (mpMRI) images make the task even more daunting.

This blog post examines the complexities of developing and validating AI algorithms that detect prostate cancer on MRI, exploring the challenges and best practices in this evolving journey.

Understanding the complexity of prostate cancer detection

Prostate cancer detection poses unique challenges due to the variability in presentation and the diverse anatomical structures within the prostate gland. Radiologists must navigate multi-sequence MRI images and distinguish between benign and malignant lesions, particularly in the transition zone (TZ), where cancers can be elusive and are easily confused with benign hyperplastic nodules. Misreading such nodules as suspicious produces false positives that lead to unnecessary biopsies. AI tools can support radiologists in overcoming these challenges.

This challenge is exemplified by radiologists’ clinical performance when detecting clinically significant prostate cancer (csPCa): up to 20% of csPCa-positive patients are not identified, and false positive rates reach up to 70%. Equally important is the high inter-reader variability, reflected in a wide range of Positive Predictive Values (PPV) for lesions scored PI-RADS >= 3 when compared against biopsy outcomes.
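The performance figures above are all derived from the same four counts of reader calls against biopsy outcomes. The following is a minimal sketch of those definitions in Python. The class name, function names, and the counts used are hypothetical and chosen only to illustrate the arithmetic; they are not taken from any study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Patient-level counts of reader calls versus biopsy outcome."""
    tp: int  # called positive, csPCa confirmed
    fp: int  # called positive, no csPCa found
    fn: int  # called negative, csPCa confirmed
    tn: int  # called negative, no csPCa found


def _ratio(numerator: int, denominator: int) -> float:
    # Return NaN instead of raising when a metric is undefined.
    return numerator / denominator if denominator else float("nan")


def sensitivity(c: ConfusionCounts) -> float:
    """Share of csPCa-positive patients that were identified."""
    return _ratio(c.tp, c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Share of csPCa-negative patients correctly called negative."""
    return _ratio(c.tn, c.tn + c.fp)


def ppv(c: ConfusionCounts) -> float:
    """Share of positive calls that were confirmed as csPCa."""
    return _ratio(c.tp, c.tp + c.fp)


def npv(c: ConfusionCounts) -> float:
    """Share of negative calls that were confirmed as non-csPCa."""
    return _ratio(c.tn, c.tn + c.fn)


# Hypothetical counts, for illustration only.
counts = ConfusionCounts(tp=80, fp=60, fn=20, tn=140)
for metric in (sensitivity, specificity, ppv, npv):
    print(f"{metric.__name__}: {metric(counts):.3f}")
```

Computing PPV separately for each reader on the same cases is one simple way to make the inter-reader spread visible.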

A key factor contributing to this challenge is the heterogeneity of acquisition protocols in prostate MRI. Before the creation of the PI-RADS guidelines, there was no consensus on the acquisition protocols, sequences, and parameters needed for an efficient prostate MRI read. PI-RADS was created to provide a single reference describing best practices for prostate MRI acquisition and reporting. Under these guidelines, assessment of the transition zone relies primarily on T2-weighted (T2w) sequences captured in various acquisition planes. However, assessing the peripheral zone (PZ) heavily relies on the findings derived from the diffusion-weighted imaging (DWI) sequence. DWI offers insights into the movement of water molecules, which can be indicative of various pathologies, including prostate cancer (PCa).

According to PI-RADS 2.1, while T2w and DWI sequences take precedence in the evaluation process, the dynamic contrast-enhanced (DCE) sequence also plays an important role in specific scenarios. Additionally, supplementary series are often acquired to ease the diagnostic process. These may include a T1-weighted sequence, used to detect metastases or biopsy-induced bleeding, and a full-pelvis DWI, used to uncover incidental findings in tissues adjacent to the prostate.

Developing AI-based CAD software in prostate MRI

The shortage of qualified body radiologists who can confidently report prostate MRI, the high inter-reader variability, and the room for improvement in radiologists' sensitivity and specificity have created an opportunity for CAD devices, particularly AI prostate MRI reader tools, to support radiologists in their clinical routine.

The development of these devices, especially when leveraging AI-based models, is equally challenging and requires careful consideration by their manufacturers.

Source of ground truth

There are 3 main sources of ground truth to be used when developing and validating AI-based models in the detection of csPCa, namely PI-RADS annotations, biopsy outcomes, and radical prostatectomy.

PI-RADS annotations

PI-RADS scores are the most readily available ground truth, as they are assigned manually by a radiologist when reporting a prostate MRI study. However, there are 2 key shortcomings in using PI-RADS scores as ground truth for training and validation purposes:

PI-RADS scores suffer from high inter-reader variability. Inexperienced radiologists tend to overcall benign findings as PI-RADS > 3, so the ground truth reference point depends heavily on the reporting radiologist. Independent Review Panels (IRPs) have been suggested as a control measure, usually following a 2+1 approach: two radiologists independently evaluate the study, and a third reviews it only if the first two disagree. Despite these measures, variability remains a challenge, reinforcing the importance of prostate MRI AI solutions.
PI-RADS scores are not confirmed csPCa outcomes. The radiological assessment indicates the level of confidence that a given lesion is csPCa. However, some PI-RADS 5 lesions turn out to be non-csPCa, and some lesions not identified by the radiologist prove to be csPCa upon systematic biopsy. Thus, using PI-RADS scores to determine patient outcomes is a suboptimal strategy.

These factors collectively present a formidable challenge when relying solely on radiology as the ground truth.
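The 2+1 reading approach described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of any particular panel's protocol: the function names are hypothetical, a "discrepancy" is assumed to mean any difference between the two scores, and the third reader's score is assumed to be final.

```python
from typing import Callable

VALID_SCORES = range(1, 6)  # PI-RADS categories 1 to 5


def _check(score: int) -> int:
    """Reject values outside the PI-RADS 1-5 scale."""
    if score not in VALID_SCORES:
        raise ValueError(f"PI-RADS score must be 1-5, got {score}")
    return score


def two_plus_one(reader_a: int, reader_b: int,
                 adjudicate: Callable[[], int]) -> int:
    """Return the panel score for one study.

    `adjudicate` is a callable so that the third reader is consulted
    only when the first two readers disagree.
    Assumption: any difference in score counts as a discrepancy.
    """
    a, b = _check(reader_a), _check(reader_b)
    if a == b:
        return a
    return _check(adjudicate())


# Agreement: the third reader is never asked.
print(two_plus_one(4, 4, adjudicate=lambda: 2))
# Disagreement: the third reader's score is used.
print(two_plus_one(3, 5, adjudicate=lambda: 4))
```

A panel could instead define a discrepancy only as disagreement across a biopsy threshold; that would change the equality check but not the overall structure.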

Biopsy outcomes

Using biopsy outcomes helps overcome the limitations outlined for PI-RADS scores. By defining a csPCa lesion as one with an associated Gleason score equal to or higher than 7, inter-reader variability and the lack of a definitive diagnosis are greatly reduced, enhancing the performance of AI prostate MRI tools.

However, prostate biopsies are usually performed by extracting a very small sample of tissue across evenly distributed regions of the prostate (systematic biopsy) as well as in those regions where a radiologist identified a concerning lesion (targeted biopsy).

The match between the radiological finding and the tissue extracted significantly depends on the urologist’s ability to detect the lesion accurately in the real-time ultrasound image, often leading to undersampling the desired region. New techniques, including MRI+US fusion biopsy procedures, have tremendously helped overcome this challenge, though their adoption remains limited and is usually confined to patients with a targeted biopsy.

Another challenge arising from the use of biopsy outcomes as ground truth is the lack of biopsy data on patients for whom a biopsy was not clinically warranted upon radiological review. Automatically assuming those cases to be negative risks introducing confirmation bias, and further clinical follow-up is needed before they can be considered true negatives.
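The labelling rules discussed in this section can be illustrated with a short Python sketch: a patient is csPCa-positive if any biopsy core has a Gleason score of 7 or higher, and a patient without biopsy data is kept as unknown rather than assumed negative. The data layout (one pair of Gleason patterns per core, `None` for a benign core) and all names are assumptions made for illustration.

```python
from enum import Enum
from typing import Optional, Sequence, Tuple

# (primary pattern, secondary pattern); the Gleason score is their sum.
GleasonPatterns = Tuple[int, int]

CSPCA_THRESHOLD = 7  # Gleason score >= 7 is treated as csPCa


class Label(Enum):
    POSITIVE = "csPCa"
    NEGATIVE = "no csPCa on biopsy"
    UNKNOWN = "not biopsied"


def patient_label(
    cores: Optional[Sequence[Optional[GleasonPatterns]]],
) -> Label:
    """Derive a patient-level label from biopsy cores.

    `cores` is None or empty when no biopsy was performed; a core is
    None when pathology found no cancer in it.
    """
    if not cores:
        # No biopsy data: do not assume the patient is negative.
        return Label.UNKNOWN
    for patterns in cores:
        if patterns is not None and sum(patterns) >= CSPCA_THRESHOLD:
            return Label.POSITIVE
    return Label.NEGATIVE


print(patient_label([None, (3, 3), (3, 4)]))  # one core scores 3 + 4 = 7
print(patient_label([None, (3, 3)]))          # highest score is 6
print(patient_label(None))                    # no biopsy performed
```

Cases labelled unknown would need clinical follow-up before being used as negatives, and even a negative label remains subject to the undersampling issue described above.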

Radical prostatectomy

Radical prostatectomy overcomes the lack of spatial resolution present in biopsy outcomes. When the whole prostate gland is removed and histologically assessed, a full 3D picture at the pathological level is obtained, so csPCa regions can be matched directly to the MRI for training and validation, contributing to the accuracy of prostate MRI AI models.

As with biopsy outcomes, the main challenge with this approach lies in the small number of radical prostatectomies performed compared to the number of MRIs. Despite being the best scientific standard, it would not be ethical to perform this procedure on every patient solely for a validation study. Thus, its use remains limited due to the small sample size available and the inherent selection bias in this population.

Clinical input

Demonstrating safety and effectiveness through robust training and clinical validation is critical to ensuring the success of AI prostate MRI products. Collaborating with healthcare professionals ensures that algorithmic design choices align with clinical workflows and decision-making processes.

Usability testing and human factors engineering ensure that the device’s output is correctly understood by the user, intuitive to use, and clear about the device’s capabilities and limitations. Involving clinicians in algorithm development and validation facilitates relevant subgroup analyses, allowing for tailored approaches to patient care.

Developing and validating AI algorithms for prostate cancer MRI is a multifaceted journey. By acknowledging the challenges inherent in prostate cancer diagnosis, addressing the limitations of current methodologies, and incorporating clinical expertise into algorithm development and validation, AI MRI reader tools can provide more accurate and reliable diagnostic support, ultimately improving patient outcomes.

Quibim’s decision

At Quibim we embrace challenges by handling them better. We chose to use biopsy outcomes as ground truth, aiming to positively impact patient care by detecting clinically significant cancer. AI prostate MRI models trained on pathology allow for high sensitivity, low false positive rates, and high negative predictive value, empowering radiologists to standardize their reporting criteria. AI MRI reader tools may become as integral to diagnosis as adaptive cruise control is to driving, and potentially essential for prostate cancer diagnosis.

David Bazaga

Product Lead

Fabio Garcia Castro

VP of ML Engineering