Artificial intelligence in infertility treatment: Applications, challenges, and future directions: A narrative review

Ghadirkhomi, Elham; Fatehifar, Aliasghar

doi: 10.18502/ijrm.v24i4.21168

Volume 24, Issue 4 (April 2026) IJRM 2026, 24(4): 275-286



Ghadirkhomi E, Fatehifar A. Artificial intelligence in infertility treatment: Applications, challenges, and future directions: A narrative review. IJRM 2026; 24(4): 275-286.
URL: http://ijrm.ir/article-1-3655-en.html


Elham Ghadirkhomi*¹, Aliasghar Fatehifar²

1- Academic Center for Education, Culture, and Research (ACECR), Tabriz, Iran, ghadirkhomi@acecr.ac.ir
2- Academic Center for Education, Culture, and Research (ACECR), Tabriz, Iran.

Keywords: Artificial intelligence, Infertility treatment, Embryo transfer, Reproductive techniques, Assisted.


1. Introduction

Infertility is defined as the failure to conceive after 1 year of regular unprotected intercourse. An estimated 48 million couples worldwide are affected by some form of infertility (1, 2), and the problem is experienced by 10-15% of couples of reproductive age (3). Assisted reproductive technologies, such as in vitro fertilization (IVF) and intracytoplasmic sperm injection, have improved treatment options. Nevertheless, success with IVF remains moderate, with overall clinical pregnancy rates of 30-40% (4, 5). Clinical decisions are usually based on the clinician's experience, which is often subjective (6).
In recent years, artificial intelligence (AI) has emerged as a promising tool in many fields of medicine, including infertility treatment. AI can detect recurring patterns in large, complex datasets and use them to build predictions or classifications with accuracy at least comparable to that of human experts. In this capacity, AI could help healthcare providers make better-informed clinical decisions, improve diagnostic accuracy and outcome prediction, and tailor therapy to each patient's circumstances.
In particular, machine learning (ML), a subfield of AI, has been used to predict treatment success and adverse outcomes, assess embryo viability, and even improve gamete selection. Given the growing interest in AI, infertility treatment is well placed to demonstrate how AI tools can be integrated into clinical care. However, routine clinical adoption of these tools is still in progress, and the extent to which AI applies to clinical care is still being determined (7). This review focuses on recent advancements, potential benefits, and remaining challenges, offering insights into how AI may reshape reproductive healthcare in the near future.
In this narrative review, we searched for articles published between January 2015 and May 2025 using combinations of the keywords 'artificial intelligence', 'machine learning', 'deep learning', 'infertility', 'IVF', and 'reproductive medicine'. Inclusion criteria covered peer-reviewed original research articles, reviews, and meta-analyses on AI applications in infertility diagnosis, embryo selection, or reproductive treatment. Exclusion criteria included non-peer-reviewed sources, editorials, and conference abstracts. Two independent reviewers screened the articles and extracted data, resolving any discrepancies by consensus. Ultimately, 126 relevant studies were included and narratively synthesized to highlight emerging trends, methodological gaps, and clinical implications.

2. Application of AI in male infertility

Male infertility accounts for approximately 50% of infertility cases (8). However, its diagnosis has often been limited by subjective assessment. Semen analysis, a crucial step in identifying male factor infertility, is usually performed manually under a microscope to assess semen volume and sperm motility and morphology (9, 10). Such evaluations are prone to human error and variability. In male infertility, AI is mainly applied to improve the accuracy, precision, and speed of semen analysis (11). Examples of AI-powered tools include the "Yo Home Sperm Test" and "ExSeed", desktop or handheld devices that assess sperm quality using a smartphone camera and are designed for early assessment of sperm concentration (12, 13). More advanced analyzers, such as the "SQA-Vision" by Medical Electronic Systems, use AI to examine and categorize sperm motility and morphology more precisely. A recent study showed that AI-based semen analysis outperformed manual assessment in consistency and precision (14), and comparative research has likewise reported more reliable and precise results than traditional microscopic assessment. For instance, a 2017 study found that an AI-assisted smartphone-based semen analyzer achieved 98% accuracy in assessing sperm concentration and motility compared with standard laboratory methods, with reduced variability and faster results (15). Similarly, AI-enhanced analyzers such as SQA-Vision have shown improved inter-laboratory reliability and lower subjectivity (16).
Another application of AI is sperm selection, where it helps identify the highest-quality sperm for use in intracytoplasmic sperm injection procedures (17). The deep learning algorithms in "Zymōt Fertility" devices can evaluate the kinetics and spatial patterns of sperm motion, as well as sperm DNA quality (18). Research performed at Weill Cornell Medical Center and Massachusetts General Hospital has shown that embryos created in vitro with AI-selected sperm are associated with higher pregnancy success rates and fewer complications during pregnancy (18, 19).
AI-driven imaging technology is also improving non-invasive diagnosis, as it can detect testicular abnormalities, varicocele, and other structural defects that may cause infertility. When data from ultrasound, magnetic resonance imaging, and other imaging modalities are integrated, AI models can detect subtle signs of pathology that traditional methods may miss (20).

3. Application of AI in female infertility

Female factors, including ovulatory dysfunction, tubal blockage, endometriosis, and uterine abnormalities, contribute to about half of infertility cases (21). AI has considerable potential to improve the diagnosis and treatment of female infertility. However, the problem described above for semen analysis also arises here: precision medicine based on medical imaging, hormonal assays, and genetic evaluations depends strongly on human interpretation, which produces subjective assessments and interobserver variability (22). AI can address this by automating image analysis, which improves diagnostic reliability by reducing interobserver variability. For example, in polycystic ovary syndrome, convolutional neural networks have been trained to detect polycystic ovarian morphology with accuracy exceeding 90% (23). Previous studies reported that AI algorithms achieved over 93% accuracy, compared with 81% for traditional ultrasound interpretation by clinicians (24). For endometriosis, AI applications have focused on magnetic resonance imaging and laparoscopic image analysis. A deep learning model has been shown to detect deep infiltrating endometriosis with higher sensitivity than general radiologists, potentially improving early detection in asymptomatic or minimally symptomatic patients (25).
Female reproductive endocrinology is inherently dynamic, with hormone levels fluctuating across the menstrual cycle, so traditional single-point hormonal measurements provide only a limited view. AI systems, particularly those employing recurrent neural networks, can analyze sequential hormonal data to identify ovulatory patterns and luteal phase deficiencies more accurately than static interpretation methods (26). Moreover, AI can help distinguish between different types of ovulatory dysfunction by modeling complex endocrine interactions that may be missed in standard evaluations (27). AI can also integrate genetic and lifestyle data. Emerging research suggests that AI can combine genomic, epigenomic, and microbiome data to uncover subtle contributors to infertility. For example, ML models have been applied to genomic datasets to predict the risk of primary ovarian insufficiency with promising early results (28). Models that incorporate lifestyle factors such as caloric intake, sleep patterns, and stress levels show potential for predicting ovulatory disorders, thereby providing a more comprehensive diagnostic approach. Such integrated tools could identify modifiable risk factors early, enabling prompt intervention. Table I summarizes the AI applications in male and female infertility treatment.

4. AI in embryo selection

AI has also been used in embryo selection during the IVF cycle. Embryologists have traditionally evaluated embryo quality using visual morphological assessment and time-lapse imaging, an inherently subjective approach with inter-individual variation (29). AI-based tools, especially those using deep learning algorithms, offer a new level of standardization by analyzing large datasets of embryo images and clinical outcomes to identify features predictive of successful implantation (30). Studies have shown that AI models can outperform experienced embryologists in predicting embryo viability, with some algorithms reaching an area under the receiver operating characteristic curve (AUC) above 0.90 in internal validation (30, 31). Such tools help rank embryos within a cohort and allow clinicians to make data-driven decisions, increasing the chance of implantation while reducing the need for repeat transfers (30). Several fertility centers have begun adopting embedded AI systems, although mostly in pilot or adjunct roles. For example, Life Whisperer, an AI-based selection tool, has been documented in Australian and European clinics, where it functions as a decision support system rather than a standalone selector (32). Similarly, the CHLOE-EQ (embryo quality) platform is being explored in several centers in North America and Asia, with early studies suggesting improved inter-operator consistency, though not necessarily improved overall implantation rates (33). Table II compares the performance of some AI-based and traditional methods in embryo selection.
These examples show that, although AI in embryo selection is no longer merely theoretical, much remains to be done to integrate it seamlessly into clinical IVF practice, as discussed in the challenges and limitations section.
Beyond summarizing performance metrics, several methodological issues recur across embryo-selection studies. First, model generalizability is often limited by training on single-center datasets with non-standardized imaging pipelines. This raises the risk of domain shift when models are deployed across laboratories (30-32, 34). Second, many studies report high internal areas under the curve (AUC ≥ 0.85-0.90) but lack independent external validation, making overfitting a plausible concern (31, 34). Third, outcome labels (implantation, clinical pregnancy, live birth) are not uniform across studies, hampering comparability and potentially inflating apparent performance when intermediate endpoints are used (30, 34, 35).
Systems such as Life Whisperer and CHLOE-EQ illustrate these trade-offs. Life Whisperer has shown encouraging accuracy for blastocyst viability prediction using static images (32); however, evidence for improvement in live-birth outcomes over standard embryologist grading remains mixed, with gains more consistently observed in inter-operator consistency than in absolute success rates (31). CHLOE-EQ integrates time-lapse features and decision support overlays that can standardize grading within labs; yet published evaluations still rely largely on retrospective cohorts and surrogate endpoints (35).
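The gap between internal and external validation described in this section can be made concrete with a small calculation. The following minimal Python sketch computes the AUC from its pairwise definition on two tiny, entirely hypothetical sets of embryo scores; the function name `auc`, the labels, and the scores are illustrative assumptions and are not taken from any of the cited systems.

```python
def auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation.

    Returns the fraction of positive/negative pairs in which the positive
    case has the higher score; tied scores count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = implantation, 0 = no implantation.
internal_labels = [1, 1, 1, 0, 0, 0]
internal_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]    # held-out set, training center
external_labels = [1, 1, 1, 0, 0, 0]
external_scores = [0.9, 0.5, 0.4, 0.6, 0.45, 0.2]   # set from a different laboratory

internal_auc = auc(internal_labels, internal_scores)
external_auc = auc(external_labels, external_scores)
print(f"internal AUC = {internal_auc:.2f}, external AUC = {external_auc:.2f}")
```

In this toy case the same scoring rule separates the internal cases perfectly but ranks several external negatives above positives, which is the kind of drop that only independent external validation would reveal.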

5. AI in personalized infertility treatment

Many conventional infertility treatments, such as ovulation induction, intrauterine insemination, and IVF, follow a standardized cycle protocol. Although effective for many couples, these strategies may miss individual variability in hormonal responses, genetics, or embryo quality (7). AI-driven systems, particularly those using ML, can integrate complex, multidimensional data to predict treatment outcomes and optimize protocols for each patient (22).
To improve infertility treatment protocols, AI can synthesize patient-specific variables such as age, body mass index, serum anti-Müllerian hormone levels, antral follicle count, follicle-stimulating hormone levels, genetic polymorphisms (e.g., in the follicle-stimulating hormone receptor), previous ovarian response profiles, and even environmental exposures. These variables can then be used to develop predictive models that estimate the likelihood of successful ovarian response, fertilization, and live birth (36). One promising application is predicting ovarian response to controlled ovarian stimulation (37). Traditional prediction tools, such as the Bologna criteria or the POSEIDON classification, categorize patients into broad responder groups (37, 38). AI algorithms can refine this categorization by learning nuanced combinations of traits and outcomes from thousands of past cases.
A recent study employed supervised ML to predict the number of oocytes retrieved during IVF cycles. The model used patient clinical characteristics and hormone profiles and outperformed conventional logistic regression in prediction accuracy (39). Another study, from 2019, showed that ML models could predict ovarian response and optimize the initial gonadotropin dose for individualized stimulation protocols (40).
Such advances are critical for avoiding both understimulation and overstimulation, which can lead to cycle cancellation or ovarian hyperstimulation syndrome, respectively (40). Researchers have also explored how AI can predict live birth probabilities following IVF; for example, recent studies have reported gradient boosting algorithms that integrate clinical, laboratory, and demographic data (41). These personalized prediction tools serve as clinical decision support systems, helping fertility specialists customize each treatment detail, from stimulation dosing to embryo transfer timing, according to the patient's unique biological and clinical profile. They can also be updated as new data become available (such as hormone levels on stimulation day 5), enabling dynamic changes in protocol design (42). By shifting from protocol-driven to data-driven, AI-guided treatment design, reproductive medicine may improve success rates while lessening physical and financial burdens and reducing the time to pregnancy (41). However, these benefits hinge on integrating validated models into clinical workflows and on a commitment to continuously monitoring algorithmic performance and fairness.

6. Predictive analytics and outcome forecasting

"What are my chances of success?" is one of the most challenging questions in infertility treatment. As discussed in the previous sections, predictions of success in fertility treatment have traditionally been based on population statistics. These general predictors, while helpful, cannot account for the particular interaction of lifestyle and biological variables that determines individual outcomes.
Predictive analytics changes this by analyzing multidimensional patient data to produce individualized predictions of key endpoints such as clinical pregnancy, live birth, and miscarriage risk (43). For example, gradient boosting, random forest, and deep neural network models have outperformed traditional logistic regression in predicting live birth following IVF (41, 44). Possibly the most impactful use has been the development of AI models that provide pretreatment and in-treatment estimates of live birth probability. One large 2019 study applied ML algorithms to more than 60,000 IVF cycles from the Society for Assisted Reproductive Technology database. The models incorporated age, markers of ovarian reserve, sperm quality, stimulation protocol characteristics, and embryo grading to produce individualized live birth predictions with greater accuracy than conventional approaches (41). These technologies are increasingly incorporated into clinical decision-making tools, including the ART calculator and IVY platforms (IVF outcome prediction software), which give patients a visual, dynamic picture of their reproductive trajectory based on real-time clinical inputs (45).
Across studies using gradient boosting machines, random forests, and deep neural networks for IVF outcome prediction, the consistent pattern is improved discrimination over logistic regression when diverse multimodal inputs are available (e.g., age, ovarian reserve, semen parameters, stimulation details, and morphokinetics) (41, 44). For instance, models trained on large registries report superior AUCs for live birth compared with conventional scores, yet calibration and external validity vary (34, 41). Work that combines embryo imaging with patient-level clinical features tends to outperform imaging-only or clinicodemographic-only approaches, suggesting that morphologic and physiological signals are complementary (31, 34). Nevertheless, two caveats limit immediate bedside adoption: i) performance often drops under external validation, indicating sensitivity to site-specific practices and data drift; and ii) intermediate outcomes (implantation, clinical pregnancy) can overestimate ultimate benefit if not tied to live-birth endpoints. Prospective, adaptive studies that update risk estimates throughout stimulation show promise for dynamic decision support, but require embedded evaluation of patient-centered outcomes.
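The distinction between discrimination and calibration drawn above can be illustrated numerically. The sketch below is a minimal Python example under stated assumptions: the outcome labels and the two sets of predicted probabilities (`model_a`, `model_b`) are invented for illustration, and the helper names are not from any cited tool. Both hypothetical models rank every positive case above every negative case, so their discrimination is identical, yet their probability estimates differ sharply in quality.

```python
def brier_score(labels, probs):
    """Mean squared difference between predicted probability and outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)


def calibration_in_the_large(labels, probs):
    """Mean predicted probability minus observed event rate (0 means no overall bias)."""
    return sum(probs) / len(probs) - sum(labels) / len(labels)


# Hypothetical outcomes: 1 = live birth, 0 = no live birth.
labels = [1, 0, 1, 0]

# Both models score the positives above the negatives (same ranking, same AUC),
# but model_b inflates every probability.
model_a = [0.7, 0.3, 0.7, 0.3]
model_b = [0.99, 0.9, 0.99, 0.9]

for name, probs in (("model_a", model_a), ("model_b", model_b)):
    print(name,
          "Brier:", round(brier_score(labels, probs), 3),
          "mean bias:", round(calibration_in_the_large(labels, probs), 3))
```

A patient counseled with the second model's output would be told to expect success far more often than it occurs, even though a ranking metric alone would not reveal the problem.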
In a recent study, deep learning models were trained on time-lapse imaging and clinical data to predict implantation success at the blastocyst stage (34). These models were more predictive than embryologist grading alone, showing that AI can be a helpful adjunct in embryo selection. Another study found that combining time-lapse imaging with patient-specific demographic and hormonal data improved predictive accuracy for live birth and implantation by nearly 20% (31). Unlike static models, modern predictive analytics can be dynamic, updating outcome predictions as new data become available. For example, if a patient's estradiol level is unexpectedly low in mid-stimulation, or fewer oocytes are retrieved than expected, the AI system can recalculate the probability of success and suggest immediate protocol changes or counseling (39). This real-time adjustment enables clinicians to modify treatment pre-emptively and to manage patient expectations empathetically. For patients, this translates to more confident decisions, less worry, and perhaps fewer treatment cycles before conception (46).
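One simple way to picture such a dynamic update is Bayes' rule in odds form, where a prior probability is revised by a likelihood ratio attached to a new observation. The Python sketch below is only an illustration of that idea and is not the method of the cited systems, which may instead re-score a full model; the baseline probability and the likelihood ratio are hypothetical numbers chosen for the example, not clinical values.

```python
def update_probability(prior, likelihood_ratio):
    """Revise a probability with Bayes' rule in odds form.

    posterior odds = prior odds * likelihood ratio
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    if likelihood_ratio <= 0.0:
        raise ValueError("likelihood_ratio must be positive")
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical pretreatment estimate of success for one patient.
baseline = 0.35

# Hypothetical likelihood ratio for an unfavorable mid-stimulation finding
# (a value below 1 lowers the estimate). Not derived from any cited study.
HYPOTHETICAL_LR_LOW_ESTRADIOL = 0.6

revised = update_probability(baseline, HYPOTHETICAL_LR_LOW_ESTRADIOL)
print(f"baseline = {baseline:.2f}, revised = {revised:.2f}")
```

The same function can be applied repeatedly as further observations arrive, provided each likelihood ratio is estimated from appropriate data and the observations are treated as approximately independent.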
Figure 1 compares the predictive accuracy of different models for IVF outcome prediction.

7. Challenges and limitations

As noted in the previous sections, AI shows promise in infertility treatment, but several constraints limit its potential. AI systems in reproductive healthcare face technical, clinical, ethical, and regulatory obstacles that must be resolved before they can be integrated securely and fairly into standard practice. The general limitations below apply across the application areas discussed in this review.

7.1. Data quality, diversity, and annotation

AI systems require large collections of well-annotated data for model training and validation. In infertility treatment, these data consist mostly of imaging (e.g., ultrasound, embryo morphology), laboratory results (e.g., hormonal profiles, semen parameters), and clinical outcomes (e.g., pregnancy and live birth rates). In practice, the available datasets are often small, drawn from a single healthcare facility, and demographically narrow, so the resulting models may not generalize to other populations and clinical settings (47). Moreover, data labeling, particularly in areas like embryo grading, can be subjective and prone to inter-observer variability, which introduces noise and bias into AI models (32).
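Inter-observer variability in labels can be quantified before a model is trained. A common statistic for two raters is Cohen's kappa, which corrects raw agreement for agreement expected by chance. The following Python sketch is a minimal illustration; the two embryologists' grades are hypothetical, and the three-level grading scale is an assumption made for the example.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labeling the same items.

    kappa = (observed agreement - chance agreement) / (1 - chance agreement)
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    # Chance agreement: product of the raters' marginal proportions per category.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category for every item
    return (observed - expected) / (1.0 - expected)


# Hypothetical grades assigned by two embryologists to the same eight embryos.
embryologist_1 = ["good", "good", "fair", "poor", "good", "fair", "poor", "good"]
embryologist_2 = ["good", "fair", "fair", "poor", "good", "good", "poor", "good"]

kappa = cohens_kappa(embryologist_1, embryologist_2)
print(f"Cohen's kappa = {kappa:.2f}")
```

A low kappa on a labeled training set signals that the "ground truth" itself is noisy, which caps the performance any model trained on those labels can meaningfully claim.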

7.2. Integration and standardization

Infertility care spans several units, such as embryology labs, imaging departments, and endocrinology and genetics departments, each of which uses separate systems. Implementing AI tools across these disconnected workflows creates technical and operational obstacles. For example, clinical deployment of an AI model that uses embryo images to forecast IVF success requires compatibility between lab management systems and electronic medical records. The absence of standardized data formats and interface protocols makes integration difficult and resource-intensive (35).

7.3. Limited clinical validation

Most AI models in infertility treatment are still in the experimental or proof-of-concept stage. Validation often occurs retrospectively, using test datasets that do not fully represent the heterogeneity of real-world clinical environments. Few models have undergone prospective clinical trials or been validated in multi-center settings. For example, the reliability of time-lapse imaging for predicting embryo implantation in clinical practice remains unproven because most studies lack external validation across multiple clinic sites (31). The clinical reliability and safety of these systems will remain unclear until strong prospective data are available.

7.4. Explainability and clinician trust

Many high-performing AI models, especially those based on deep learning, operate as "black boxes", offering little insight into how decisions are made. In a field like infertility, where treatment choices are often emotionally charged and ethically sensitive, clinicians are less likely to trust or act upon opaque model outputs. Explainable AI is a growing subfield that aims to make these models more interpretable, but current applications in reproductive medicine are limited. Without transparency, it becomes difficult for practitioners to justify AI-driven decisions to patients, reducing adoption rates (22).

7.5. Ethical and legal considerations

Implementing AI in fertility treatment raises intricate ethical challenges. Algorithms that assess embryos and forecast IVF outcomes may introduce discriminatory biases if the historical data they learn from are unevenly distributed across age, ethnicity, and socioeconomic status. This raises concerns about fair and equitable access to treatment for all patients (48). Moreover, reproductive data are among the most sensitive types of personal information, making privacy and data protection critical. There is also the issue of clinical accountability: if an AI system contributes to a failed treatment or adverse outcome, determining liability is legally and ethically complex (49).
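A basic check for the kind of bias described above is to compare a model's error rates across patient subgroups. The Python sketch below computes the true positive rate per group, one of several possible fairness measures; the outcomes, predictions, and age bands are hypothetical, and the function name is an illustrative choice rather than part of any cited system.

```python
def true_positive_rate_by_group(labels, predictions, groups):
    """Share of actual positives that the model predicted as positive, per group."""
    hits = {}
    totals = {}
    for y, pred, g in zip(labels, predictions, groups):
        if y == 1:
            totals[g] = totals.get(g, 0) + 1
            hits[g] = hits.get(g, 0) + (1 if pred == 1 else 0)
    return {g: hits[g] / totals[g] for g in totals}


# Hypothetical data: 1 = live birth (labels) or predicted success (predictions).
labels      = [1, 1, 1, 1, 0, 1, 1, 1, 1, 0]
predictions = [1, 1, 1, 0, 0, 1, 0, 0, 1, 1]
groups      = ["<35"] * 5 + [">=35"] * 5   # hypothetical age bands

rates = true_positive_rate_by_group(labels, predictions, groups)
gap = max(rates.values()) - min(rates.values())
print(rates, "gap:", gap)
```

A large gap means the model misses successful outcomes more often in one group than another, which would warrant investigation before the model informs counseling or treatment decisions.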

8. Conclusion

AI is increasingly embedded across the infertility care pathway, from semen analysis and female factor diagnostics to embryo selection and individualized treatment planning. Evidence suggests that, when supported by diverse multimodal data and rigorous validation, AI can improve discrimination and consistency over conventional methods. However, durable gains in patient-centered outcomes (live birth, time-to-pregnancy, ovarian hyperstimulation syndrome reduction, and cost-effectiveness) require multi-center prospective trials, transparent reporting, fairness audits, and robust clinical integration. In the near term, AI should be positioned as clinician-supervised decision support rather than a standalone selector or prescriber.

Acknowledgments
No funding was received for the preparation or submission of this work. We employed AI-powered language tools, including ChatGPT (GPT-4), for grammar correction and writing refinement of the manuscript.

Conflict of Interest
The authors declare that there is no conflict of interest.

Type of Study: Review Article | Subject: Assisted Reproductive Technologies


Rights and permissions
	This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
