Using exploration and exploitation techniques to improve ranking models through (1+1)-Evolutionary Algorithms

Osman Ali Sadek Ibrahim ([email protected]) · Eman M. G. Younis
Minia University
ORCID: https://orcid.org/0000-0001-9254-3093

Research Article
Keywords: Learning to Rank, Exploration, Exploitation, Information Retrieval, Evolutionary Gradient Strategy
Posted Date: January 16th, 2024
DOI: https://doi.org/10.21203/rs.3.rs-3866499/v1
License: This work is licensed under a Creative Commons Attribution 4.0 International License.
Additional Declarations: The authors declare no competing interests.

Abstract Exploration and exploitation are fundamental concepts within the domain of Nature-Inspired Algorithms (NIAs) when optimizing solutions. Exploration aims to traverse a substantial portion of the solution space, whereas exploitation is directed at refining the current solution toward either local or global optima. For example, mutation represents an exploration technique, while crossover and enhanced initialization strategies are employed for exploitation. In this research, four probability distributions are employed for mutation within NIAs, and ranking models derived from Dependent Click and Linear Regression (LR) are leveraged to enhance the exploitation process. Empirical findings indicate a preference for employing Gaussian Random Number (GRN) generation in conjunction with LR and Dependent Click initialization when evolving on the training dataset. Conversely, Levy Random Number generation in combination with LR demonstrates superior performance on the unseen test dataset, with GRN emerging as a strong contender.
Furthermore, the integration of Simulated Annealing with the (1+1)-Evolutionary Strategy outperforms alternative methods in predictive ranking and evolutionary processes, both on the training and testing datasets, regardless of whether Linear ranking initialization is employed, in contrast to the novel (1+1)-Evolutionary Gradient Strategy and other (1+1)-Evolutionary Strategy variants. This paper presents experimental results conducted on the MQ2008 dataset and provides accompanying code packages.

Osman Ali Sadek Ibrahim, Assistant Professor, Computer Science Department, Minia University, Egypt. E-mail: [email protected]
Eman M. G. Younis, Associate Professor, Faculty of Computers and Information, Minia University, Egypt. E-mail: [email protected]

1 Introduction

Exploration and exploitation constitute foundational elements of problem-solving in the context of employing Evolutionary Algorithms (EAs) and Metaheuristic search methodologies for the attainment of optimal solutions. EAs necessitate the judicious incorporation of both exploration and exploitation strategies within their search domains. Exploration entails the deliberate traversal
of hitherto uncharted regions within the search space, with the objective of uncovering entirely novel territories that have remained unexplored by the algorithm. Conversely, exploitation encompasses the concentration of efforts on regions within the search space that exhibit close proximity to solutions previously encountered during the algorithm's traversal.

Furthermore, it is imperative that Evolutionary Algorithms (EAs) effectively navigate the intricate equilibrium between exploration and exploitation in order to optimize their operational efficiency. Numerous studies in the academic domain have undertaken the examination of various EA methodologies to tackle these multifaceted challenges, encompassing Genetic Algorithms (GAs), Evolutionary Strategies (ES), and Evolutionary Programming (EP). Notably, the dynamic behavior of Genetic Algorithms is intricately influenced by the interplay between exploitation and exploration throughout the entirety of the evolutionary process. It is widely acknowledged among researchers that the capacity of EAs to adeptly maintain this delicate equilibrium between exploration and exploitation is a pivotal factor contributing to their efficacy in resolving intricate problems [Michalewicz, 1996, Kramer, 2016, Fogel, 1999, Nannen et al., 2008, Herrera et al., 1996].

An early research inquiry pertaining to the equilibrium between exploitation and exploration was undertaken and subsequently documented within a seminal study authored by Eiben and Schippers [Eiben and Schippers, 1998]. This investigation underscored the exigency for a more extensive examination of this research quandary. The work undertaken by Eiben and Schippers called into question the conventional tenet positing that selection mechanisms principally cater to exploitation within EAs, while search operators such as mutation and crossover primarily facilitate exploration.
To redress the prevalent paradigms within the domain of Evolutionary Algorithms (EAs) concerning the interplay of exploration and exploitation, the study also scrutinized a corpus of ten additional research works. The outcomes of their comprehensive assessment revealed the absence of a universally endorsed consensus on this matter. It is plausible that, during the period in question, the management and quantification of exploration and exploitation were notably intricate or opaque pursuits.

The principal aim of this study is to elucidate the pragmatic ramifications associated with the equilibrium between exploration and exploitation within the domain of Learning to Rank (LTR). Of greater significance is the notion that an enhanced comprehension of these principles can facilitate the development of more efficacious Evolutionary Algorithms (EAs) tailored for LTR applications. For instance, a more profound insight can assist in ascertaining the optimal initialization procedure within a specific contextual framework or identifying the most appropriate random number generator for mutation operations. Additionally, it can engender elucidation regarding distinctions inherent in diverse EA methodologies. Consequently, the primary contributions of this manuscript can be succinctly encapsulated as follows:

– Evaluating the efficacy of various (1+1)-Evolutionary Algorithms (EAs) within the Learning to Rank (LTR) domain constitutes the central focus of this investigation.
– Introducing a novel methodology for Learning to Rank (LTR), denoted as the (1+1)-Evolutionary Gradient Strategy.
– Evaluating the impact of utilizing four distinct probability distributions as stochastic number generators for regulating mutation step magnitudes is of paramount importance. This method is deployed to quantify the exploratory capabilities of (1+1)-Evolutionary Algorithms, which heavily depend on a stochastic mutation generator.
– Detecting the exploitation process within (1+1)-Evolutionary Algorithms (EAs) entails the utilization of initial solutions derived from a synthesis of Online DC and Offline LR models.

Our research initiative was driven by the desire to delve into the equilibrium between exploration and exploitation in the context of Learning to Rank (LTR). Consequently, we embarked on an inquiry into the utilization of (1+1)-Evolutionary Algorithms (EAs) in conjunction with Online DC and Offline LR. In this endeavor, we incorporated four distinct probability distributions and introduced an innovative amalgamation of evolutionary strategy with the Gradient Ascent method. Our primary objective was to augment ranking accuracy by effectively navigating the delicate trade-off between exploring a broader spectrum of potential learning solutions and exploiting the most promising nearby solutions. This investigation constitutes the inaugural experiment conducted within the realm of LTR, with a dedicated focus on the application of computational intelligence techniques to tackle the Exploration and Exploitation quandary and enhance the performance of ranking models.

The organizational framework of this paper is delineated as follows. Section 2 elucidates the foundational concepts underpinning Linear Regression and Dependent Click Models, which have been employed to enhance the initialization process within the methodologies presented herein. In Section 3, we embark upon a comprehensive exploration of pertinent antecedent research endeavors. Section 4 introduces the EGS-Rank methodology, while Section 5 provides a comprehensive exposition of the outcomes obtained through the utilization of the MQ2008 dataset. Finally, Section 6 proffers our conclusive insights and delineates avenues for prospective research.

2 Background Material

This section presents the background material for the Linear Regression (LR) and Dependent Click Model (DCM).
These methods were used in the initialization process for the presented approaches in this paper.

2.1 Dependent Click Model

The Dependent Click Model (DCM) is a probabilistic model used in online learning to rank (LTR) to capture user click behavior in a more sophisticated manner compared to simpler models like Position-Based Models. DCM takes into account the dependence between clicks on different search results and incorporates features like examination behavior and attractiveness of snippets. Below, we provide an overview of DCM with mathematical equations. In DCM, the primary goal is to model the probability of a user clicking on a result given their previous interactions with other results. Let us define some variables and parameters:

– CTR(p): The Click-Through Rate (CTR) for a result at position p, representing the probability that a user clicks on a result at position p.
– p: The position (rank) of the search result in the ranking list, ranging from 1 (top position) to N (bottom position).
– A(p): The attractiveness of a result at position p. It captures the inherent quality or relevance of the result.
– X(p): A binary random variable that indicates whether a user examines the result at position p (1 for examination, 0 for no examination).
– C(p): A binary random variable that indicates whether a user clicks on the result at position p (1 for click, 0 for no click).

DCM models the probability of examining a result at position p, P(X(p) = 1), as follows:

P(X(p) = 1) = σ( Σ_{k=1}^{p} α_k · A(k) · X(k − 1) )    (1)

In this equation:

– σ(·) is the logistic function that maps the input to a probability between 0 and 1. This means that:

σ(x) = 1 / (1 + exp(−x))    (2)

– α_k is a parameter that captures the examination propensity at position k.
– A(k) represents the attractiveness of the result at position k.
– X(k − 1) is the examination state of the result at position k − 1.
It is used to capture the dependence between examinations at different positions. Once the examination probabilities are modeled, DCM models the click probability CTR(p) as follows:

P(C(p) = 1 | X(p) = 1) = β_p    (3)

This equation states that if a user examines a result at position p (X(p) = 1), the probability of clicking on it (C(p) = 1) is directly determined by a parameter β_p. Overall, DCM models the entire click sequence (whether a user examines and clicks on each result) as a dependent process, where the examination and click probabilities depend on previous interactions. This captures more realistic user behavior compared to simpler models. To estimate the parameters α_k and β_p of DCM, one would typically use click-through data collected from users' interactions with search results. Statistical techniques like maximum likelihood estimation (MLE) can be used to find the parameter values that best fit the observed click data, allowing the model to capture user click behavior accurately.

2.2 Linear Regression

Linear Regression (LR) represents a supervised machine learning methodology applicable within the domain of Information Retrieval (IR), specifically for the task of Learning to Rank. The fundamental objective is the acquisition of a linear model capable of prognosticating a relevance score for each constituent item within a ranked list, contingent upon its inherent feature set. The model undergoes a training process designed to minimize the mean squared error existing between the anticipated relevance scores and their actual counterparts. Subsequently, a succinct exposition of LR's application in the context of Learning to Rank is presented, incorporating pertinent mathematical formulations:

1. Model Representation:
– Consider the matrix X, wherein every row is associated with an individual item, and each column is indicative of a particular attribute pertaining to said item.
Each row in this context signifies a query-document pairing, while each column denotes a distinct feature, encompassing attributes like term frequency, document length, among others.
– Consider y as a vector representing the authentic relevance scores corresponding to the query-document pairs.
– We aspire to derive a weight vector, denoted as w, such that the anticipated relevance score (ŷ) for a query-document pair can be expressed as follows:

ŷ = Xw    (4)

2. Objective Function: In the context of LR, the primary goal is to minimize the mean squared error (MSE) that exists between the predicted values denoted as ŷ and the actual relevance scores represented as y. The MSE is formally defined as follows:

MSE(w) = (1/N) Σ_{i=1}^{N} (ŷ_i − y_i)²    (5)

where N is the number of query-document pairs in the training data.

3. Optimization entails the primary objective of identifying the weight vector w that serves to minimize the Mean Squared Error (MSE). Conventionally, this task is accomplished through the employment of optimization methodologies, notably including gradient descent. The calculation of the gradient of the MSE with respect to the parameter vector w is a pivotal step in this process:

∇MSE(w) = −(2/N) Xᵀ(y − Xw)    (6)

Gradient descent and various other optimization algorithms are employed iteratively to update the parameter vector w until it reaches a state of convergence:

w ← w − α ∇MSE(w)    (7)

where α is the learning rate.

4. Prediction: After the model has been adequately trained, i.e., upon the identification of the optimal parameter vector w, it becomes possible to employ this model for the purpose of estimating relevance scores for novel query-document pairs:

ŷ_new = X_new w    (8)

The ranked list can be obtained by sorting the ŷ_new values in descending order.
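The four steps above can be sketched in a few lines of NumPy. The feature matrix, relevance labels, learning rate, and iteration count below are illustrative assumptions for demonstration, not values taken from the paper:

```python
import numpy as np

def train_lr(X, y, alpha=0.01, iters=2000):
    """Fit relevance-score weights w by gradient descent on the MSE (Eqs. 5-7)."""
    N, M = X.shape
    w = np.zeros(M)
    for _ in range(iters):
        grad = -(2.0 / N) * X.T @ (y - X @ w)   # Eq. (6): gradient of MSE
        w -= alpha * grad                        # Eq. (7): descent update
    return w

def rank(X_new, w):
    """Score new query-document pairs (Eq. 8) and return row indices
    sorted by descending predicted relevance."""
    y_hat = X_new @ w
    return np.argsort(-y_hat)

# Tiny synthetic example: 4 query-document pairs, 2 features each
X = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [0.2, 0.1]])
y = np.array([2.0, 1.0, 0.0, 0.0])   # graded relevance labels
w = train_lr(X, y)
order = rank(X, w)
```

The learned w can then initialize the parent chromosome of the (1+1)-EAs discussed later, which is the exploitation-oriented role LR plays in this paper.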
The objective of this LR methodology is to acquire a linear correspondence between the feature vectors of query-document pairs and their corresponding relevance scores. It is imperative to diligently preprocess and standardize the features while also accounting for additional ranking-specific factors, such as click-through rates or user preferences, to enhance the sophistication of ranking models.

3 Related Work

The importance of achieving a balance between exploration and exploitation in the context of Evolutionary Algorithms (EAs) has received relatively limited attention, primarily stemming from the constrained application of Evolutionary Computation algorithms within the Learning to Rank (LTR) problem domain. To date, the extant body of research addressing the initialization procedure within this context is confined to the works of [Ibrahim and Landa-Silva, 2018, Ibrahim and Younis, 2022]. These studies introduce two distinct approaches: (1+1)-Evolutionary Strategies (ES-Rank) and the fusion of Simulated Annealing with (1+1)-Evolutionary Strategy (SAS-Rank). It is pertinent to note that these methodologies currently offer support exclusively for Linear Ranking models, with no existing LTR techniques facilitating initialization utilizing ranking models from alternative LTR methodologies. Consequently, the majority of the existing literature in this field is centered on well-established ranking algorithm categories. The dearth of research can be ascribed to the underexplored nature of the challenge pertaining to the equilibrium between exploration and exploitation in the realm of the LTR problem until the present juncture. As per the findings of [Liu, 2009], Learning to Rank (LTR) methods can be categorized into three primary groups: pointwise, pairwise, and listwise approaches. These categories are determined based on the measurement of the loss or fitness function.
In the pointwise method, each unique query-document pair is treated as an individual learning instance. This approach encompasses techniques like Linear Regression (LR) [Yan and Su, 2009], Boosting [Freund et al., 2003], Gradient Boosted Regression Trees (GBRT or MART) [Friedman, 2001, Mohan et al., 2011], and Random Forest (RF) [Breiman, 2001]. The pairwise methodology encompasses a loss or objective function that leverages pairs of query-document pairs pertaining to the identical query as the foundational learning instances. Prominent instances of pairwise techniques encompass RankNET (Rank Neural Net) [Burges et al., 2005], RankBoost, and SVMRank [Li, 2014]. In the context of information retrieval methodologies, the listwise technique involves utilizing the entire retrieved list of items as a singular learning instance. This all-encompassing approach aggregates all query-document instances pertaining to each specific query. Prominent algorithms within the listwise paradigm comprise ListNET (Listwise Neural Net) [Cao et al., 2007], RankGP [Lin et al., 2012, Mick, Accessed 2016], Coordinate Ascent [Metzler and Bruce Croft, 2007], AdaRank [Xu and Li, 2007], and RankGPES [Islam, 2013]. The EGS-Rank algorithm, like ES-Rank and SAS-Rank, can be classified within the listwise category. Existing research, as indicated by studies such as those conducted by [Ibrahim and Landa-Silva, 2018, Cao et al., 2007], has shown that listwise methodologies tend to exhibit superior performance in comparison to pointwise and pairwise approaches. The aforementioned techniques are advisable for offline Learning to Rank (LTR) applications. In contrast, Katja Hofmann et al.
have introduced a novel research paradigm within the LTR domain, referred to as online LTR, predicated on modeling user click behavior. This research avenue relies on implicit relevance labels to emulate user clicks, serving as an objective metric for assessing the efficacy of online LTR algorithms. More recently, Jianxun Lian et al. have proposed a Deep Learning methodology that amalgamates implicit and explicit feature interactions in recommendation systems. Furthermore, Fatih Cakir et al. have presented a Deep Learning approach designed for intensive learning, evaluating its performance using the Average Precision metric. Diverging from the conventional LTR methods like LR, SVM-Rank, ES-Rank, and SAS-Rank, Deep Learning techniques eschew linear ranking models as their initial framework during the learning process. Additionally, certain other LTR methods dispense with employing any ranking model during the initialization phase. Moreover, most of these methods do not yield a Linear Ranking model that can be assimilated by ES-Rank, EGS-Rank, or SAS-Rank. Furthermore, as substantiated by earlier studies documented in references [Ibrahim and Landa-Silva, 2018] and [Ibrahim and Younis, 2022], ES-Rank has demonstrated its potential to outperform LR and SVM-Rank. Consequently, in our investigation of the exploration and exploitation processes within (1+1)-Evolutionary Algorithms, we have chosen to incorporate ES-Rank and SAS-Rank, in conjunction with the innovative EGS-Rank approach.

4 Material and methods

In the course of this investigation, we utilized a spectrum of methodologies, encompassing established techniques such as LR, the Dependent Click (DC) model, ES-Rank, and SAS-Rank, all of which have been the subject of prior scholarly inquiry. In addition, we introduced a pioneering approach termed EGS-Rank.
The Java package links corresponding to these methodologies are provided: Linear Regression, DC, ES-Rank, and SAS-Rank. In the forthcoming section, we present EGS-Rank, and in the ultimate release, we shall incorporate its Java code package.

4.1 (1+1)-Evolutionary Gradient Strategy for Ranking (EGS-Rank)

This section introduces the proposed methodology known as EGS-Rank, an acronym denoting the (1+1)-Evolutionary Gradient Strategy for Ranking. Within the framework of EGS-Rank, each chromosome commences with an initial gene weight of 0.5 assigned to every query-document feature, thereby signifying an equal significance of each feature across all instances. To innovate the initialization process, we incorporate the LR and DC ranking models into EGS-Rank, giving rise to novel approaches termed IEGSR-Rank and IEGS-Click. The choice of (1+1)-Evolutionary Algorithms (EAs) for this novel procedure is grounded in their ability to adeptly transmute the initial solution into an improved one, characterized by computational efficiency and minimal memory utilization, as corroborated by earlier research [Arnold and Salomon, 2007, Ibrahim and Landa-Silva, 2017]. Moreover, the motivation behind enhancing the initialization procedures of (1+1)-EAs stems from the significance of exploration within diverse proposed solutions in the realm of hybrid solutions in Metaheuristic and Computational Intelligence approaches [Landa Silva, 2003, Diaz-Gomez and Hougen, 2007]. Furthermore, it leverages the potential for exploitation presented by offline LR models. Therefore, within the domain of offline Learning to Rank (LTR), the utilization of standard initialization values for the chromosomes of ranking models tends to emphasize the exploitation aspect of LTR models. In contrast, employing an online LTR model as an initialization method or introducing variable mutation step sizes leans more towards fostering exploration.
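As a minimal illustration of the two initialization regimes just described (uniform 0.5 genes versus seeding from a trained LR or DC ranking model), assuming a simple list-of-weights chromosome representation; the example weight values are hypothetical:

```python
def init_chromosome(n_features, model_weights=None):
    """Build the initial parent chromosome for a (1+1)-EA ranker.

    Without a model, every gene is 0.5 (equal feature importance), the
    default described for EGS-Rank. Passing the weight vector of an offline
    LR or online DC ranking model instead seeds the search from an already
    learned solution, shifting the balance toward exploitation."""
    if model_weights is not None:
        if len(model_weights) != n_features:
            raise ValueError("model must supply one weight per feature")
        return list(model_weights)
    return [0.5] * n_features

default_parent = init_chromosome(46)  # MQ2008 query-document pairs have 46 features
seeded_parent = init_chromosome(3, model_weights=[0.12, -0.4, 0.9])  # hypothetical LR weights
```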
The present study aims to evaluate the performance of EGS-Rank in relation to ES-Rank and SAS-Rank, with the objective of identifying the optimal equilibrium between exploration and exploitation. Figure 1 provides an illustration of the proposed EGS-Rank algorithm, which essentially operates as a (1+1)-Evolutionary Gradient Strategy. It iteratively evolves a single vector through gradient fitness steps across multiple generations, utilizing a training set composed of query-document pairings or feature vectors as input and yielding a linear ranking function as the output. The chromosome denoted as ParentCh comprises M genes, with each gene representing the significance of the corresponding feature in the context of the document ranking. The procedure of initializing the chromosomal vector unfolds in the initial four stages, whereby a uniform value of 0.5 is allocated to each genetic element. Subsequently, the fifth step introduces a Boolean variable designated as "Good," which governs the decision regarding whether the mutation process from the preceding generation should be reiterated. By default, it is configured as FALSE to preclude superfluous iteration or alteration throughout the EGS-Rank procedure, which seeks to enhance the mutational steps for a more efficacious evolutionary advancement of solutions. In the sixth step, "ParentCh" is cloned to yield a novel chromosome referred to as "OffspringCh." The evolutionary process commences at Step 7 and persists until Step 24, encompassing a total of 1300 generations as stipulated by the parameter "MaxGenerations." Steps 8 through 16 provide comprehensive insights into the selection of the number of genes subject to mutation, designated as "R," and the meticulous management of the mutation procedure, encompassing the determination of specific mutation steps.
To accomplish this task, Equation (m1) is employed, wherein PRG(0,1) represents a random number generator characterized by a Gaussian, Cauchy, Levy, or uniform distribution, each having a mean of 0 and a standard deviation of 1. Furthermore, the fitness values pertaining to both parental and progeny chromosomes, as computed utilizing the training dataset, are designated as Fitness(ParentCh) and Fitness(OffspringCh), respectively, in accordance with the guidelines outlined in [Zheng et al., 2021].

Mutated_Gene_i = Gene_i + PRG(0, 1) · Gene_i · (Fitness(ParentCh) − Fitness(OffspringCh))    (m1)

In the algorithm, if a mutation process within generation (G − 1) results in the generation of offspring with enhanced characteristics, this efficacious procedure is persisted and subsequently implemented in generation G, as elucidated in Step 9. Conversely, in instances where the mutation process fails to yield improvements, the parameters governing said mutation process are reset, in accordance with the prescriptive instructions delineated in Steps 11 through 15. In the subsequent stages, specifically Steps 17 to 23, a clear demarcation is established between the "ParentCh" and "OffspringCh" cohorts, predicated upon their respective performance scores in metrics such as MAP, NDCG@10, P@10, or RR@10. In Step 25, the EGS-Rank algorithm constructs the ultimate ranking function through the transposition of the evolved feature weight vector and the query-document pairs. The application of this Learning to Rank (LTR) methodology commences with the initial acquisition of datasets designed for training, validation, and testing purposes. Subsequently, the EGS-Rank algorithm is employed to iteratively evolve a linear ranking function using the training data. The performance of this evolved linear ranking function is subsequently assessed using the dedicated test dataset.
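The steps above can be condensed into a short Python sketch. This is a simplified reading of Figure 1, not the authors' Java package: the fitness function and generation budget are placeholders, and Equation (m1)'s fitness-difference factor is approximated here by the retry-on-success policy rather than applied multiplicatively:

```python
import random

def egs_rank(fitness, n_genes, max_generations=1300, prg=random.gauss):
    """(1+1)-Evolutionary Gradient Strategy sketch.
    fitness: maps a weight vector to a score to maximize (e.g. MAP on training data).
    prg: mutation step generator; Gaussian here, with Cauchy, Levy, or uniform
    variates (mean 0, scale 1) as the other options considered in the paper."""
    parent = [0.5] * n_genes          # Steps 1-4: uniform initial gene weights
    good = False                      # Step 5: repeat last mutation only if it helped
    f_parent = fitness(parent)
    last_genes, last_steps = [], []
    for _ in range(max_generations):  # Steps 7-24: evolve for MaxGenerations
        offspring = list(parent)      # Step 6: clone ParentCh into OffspringCh
        if good:                      # Step 9: reuse the successful mutation
            genes, steps = last_genes, last_steps
        else:                         # Steps 11-15: reset and draw a fresh mutation
            r = random.randint(1, n_genes)         # R genes to mutate
            genes = random.sample(range(n_genes), r)
            steps = [prg(0, 1) for _ in genes]
        for i, s in zip(genes, steps):
            offspring[i] += s * offspring[i]       # relative step, cf. Eq. (m1)
        f_off = fitness(offspring)
        if f_off > f_parent:          # Steps 17-23: survivor selection by fitness
            parent, f_parent = offspring, f_off
            good, last_genes, last_steps = True, genes, steps
        else:
            good = False
    return parent                     # Step 25: evolved linear ranking weights
```

A caller would pass a fitness function that scores a weight vector on the training queries (MAP, NDCG@10, P@10, or RR@10) and use the returned weights as the linear ranking function.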
It is noteworthy that this approach constitutes an innovative contribution to the field of Learning to Rank (LTR). The following section conducts an empirical analysis that compares various methodologies. The computational complexity of this technique is Ω(N · n · log(M)), where N represents the number of query-document training pairs, n denotes the count of evolutionary iterations, and M signifies the number of chromosomal genes.

5 Results

5.1 Benchmark Dataset Descriptions

Our approach was implemented on the MQ2008 Learning to Rank dataset, and this paper offers the source code of the algorithms, thereby ensuring the reproducibility of our research by fellow scholars. In the course of our experimental investigation, we employed the MQ2008 dataset, which has been previously discussed in related literature [Qin and Liu, 2013, Liu, 2011, Qin et al., 2010]. This dataset exhibits several salient characteristics: it encompasses a total of 46 distinct features and 784 unique queries. Significantly, the dataset utilizes relevance labels, which are categorized into three distinct classes: 0, 1, and 2. A relevance label of 0 denotes that the query and the document in the query-document pair are not related. Conversely, relevance labels 1 and 2 signify varying degrees of relevance, spanning from marginally relevant to highly pertinent. It is imperative to emphasize that this paper presents the findings pertaining to the MQ2008 dataset, with the complete code package set to be included in the final version of this manuscript to facilitate further academic inquiry.

Fig. 1 EGS-Rank Algorithm for Learning to Rank Approach

5.2 Experimental Results and Discussion

In this investigation, we employed five distinct fitness functions, namely, Mean Average Precision (MAP), Normalized Discounted Cumulative Gain at 10 (NDCG@10), Precision at 10 (P@10), Reciprocal Rank at 10 (RR@10), and Expected Reciprocal Rank at 10 (ERR@10) (Baeza-Yates 2011; Li 2014), during the analysis of the training datasets. These same fitness functions were subsequently applied to assess the performance of ranking functions on the test datasets. In this section, we present a summary of the results obtained through the application of our innovative research methodology to the MQ2008 dataset benchmarks. We conducted experiments using the Lerot toolkit in conjunction with the ES-Rank, EGS-Rank, or SAS-Rank library packages [Schuth et al., 2013, Ibrahim and Landa-Silva, 2018]. To construct the initial ranking model for the learning-to-rank (LTR) dataset, we first executed the Lerot package, an online LTR toolkit, on the dataset. Subsequently, the model generated was utilized in the (1+1)-EAs. These experiments were conducted with the default parameters for ES-Rank, EGS-Rank, and SAS-Rank. Specifically, for ES-Rank, EGS-Rank, and SAS-Rank, the number of evolutionary iterations was set to 1300, employing four different random number distribution probabilities. Meanwhile, the default configuration for Lerot was retained as provided in the package. The ranking model developed by the Lerot package, utilizing the DC model (Online LTR) as the initial parent chromosome, is adopted by ES-Rank, EGS-Rank, and SAS-Rank (Offline LTR). Subsequently, employing various evaluation fitness measures, each training dataset is utilized to refine and enhance the ranking model. This research endeavor conducts a comprehensive examination of the performance of SAS-Rank, EGS-Rank, and ES-Rank enhancements on both a training dataset and an unseen test dataset, with a particular focus on ranking outcomes.
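Among the fitness measures listed above, NDCG@10 is representative. The following is a small generic sketch of its computation using the common exponential-gain formulation, not code taken from the ES-Rank, EGS-Rank, or SAS-Rank packages:

```python
import math

def dcg_at_k(relevances, k=10):
    """Discounted Cumulative Gain over the top-k graded relevance labels,
    with gain (2^rel - 1) discounted by log2(rank + 1), ranks starting at 1."""
    return sum((2 ** rel - 1) / math.log2(rank + 2)
               for rank, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k=10):
    """NDCG@k: DCG of the ranked list divided by the DCG of the ideal
    (relevance-sorted) ordering; 0.0 when the query has no relevant documents."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# A ranked list with MQ2008-style graded labels (0, 1, 2)
labels = [2, 0, 1, 0, 0]
score = ndcg_at_k(labels, k=10)
```

During evolution, a candidate weight vector is used to sort each query's documents, the resulting label lists are scored as above, and the mean over queries serves as the fitness value.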
A previous investigation employed the training data to facilitate the evolution of a ranking model, subsequently deploying it as a predictive ranking algorithm on a novel and unfamiliar test dataset. This additional objective serves the crucial purpose of evaluating the effectiveness of evolutionary computation techniques when applied to previously unencountered data, mirroring the principles inherent in machine learning approaches utilized in Learning to Rank (LTR) datasets. As a result, in the context of diverse challenges related to ranking textual data, researchers are provided with the opportunity to closely scrutinize the performance of EGS-Rank, ES-Rank, and SAS-Rank, guided by LTR criteria as outlined by [Ibrahim and Landa-Silva, 2018, Ibrahim and Younis, 2022]. This research harnesses recently developed linear ranking model packages, which stand as the sole methods capable of accommodating linear ranking models during the initialization phase. Progressing from Table 1 to Table 8, the application of ES-Rank, SAS-Rank, and EGS-Rank to both the training and test datasets of the MQ2008 benchmarks demonstrates a notable enhancement in the performance of the DC ranking model. This augmentation in model performance is rigorously assessed through the utilization of diverse evaluation metrics, including but not limited to, Mean Average Precision (MAP), Normalized Discounted Cumulative Gain at 10 (NDCG@10), Precision at 10 (P@10), Reciprocal Rank at 10 (RR@10), and Expected Reciprocal Rank at 10 (ERR@10), as delineated in the reference by [Ibrahim and Younis, 2022]. To provide a more detailed analysis, Tables 5 and 6 present the outcomes of the Exploration mutation step-size optimization process, which is grounded in Gaussian, Cauchy, Levy, and Uniform probability distributions, applied to both the Training and Testing datasets.
This analysis is conducted both with and without initialization using Offline Linear Regression (LR) and Online Dependent Click (DC) models. Based on these empirical results, one can derive the conclusion that the optimal equilibrium between Exploration and Exploitation in the context of (1+1)-Evolutionary Algorithms (EAs) for ranking is attained through the utilization of an offline linear model, incorporating Gaussian and Levy mutation step-sizes within offline (1+1)-EAs. This preference is discernibly linked to the explicit acquisition of relevance labels by LR, ES-Rank, SAS-Rank, and EGS-Rank, as opposed to the implicit learning that characterizes online learning to rank, particularly diverging from the Online DC methodology in the initialization phase. Moreover, Tables 7 and 8 illustrate the success rates associated with each instantiated (1+1)-EA when applied to the task of learning to rank in various challenges. Remarkably, the amalgamation of Simulated Annealing with (1+1)-Evolutionary Strategy, denoted as SAS-Rank, consistently exhibits superior performance compared to ES-Rank and EGS-Rank, regardless of whether the initialization process employs offline or online linear ranking models.

Table 1 The results achieved through ranking models when applied to unseen test data, both with and without Offline LR initialization.
MQ2008 For Predictive Ranking Models
Methods vs Metric   | MAP   | NDCG@10 | RR@10 | ERR@10 | P@10   | Winning PDR | Winning EA
LR                  | 0.309 | 0.378   | 0.335 | 0.059  | 0.216  | 0 | 0
ES-Gaussian         | 0.457 | 0.482   | 0.506 | 0.093  | 0.262  | 1 | 1
ES-Cauchy           | 0.441 | 0.474   | 0.481 | 0.094  | 0.262  | 0 | 0
ES-Levy             | 0.457 | 0.492   | 0.498 | 0.092  | 0.2655 | 3 | 0
ES-Rank Uniform     | 0.434 | 0.459   | 0.49  | 0.094  | 0.262  | 1 | 0
EGS-Gaussian        | 0.454 | 0.538   | 0.495 | 0.094  | 0.259  | 2 | 0
EGS-Cauchy          | 0.451 | 0.527   | 0.475 | 0.091  | 0.266  | 1 | 0
EGS-Levy            | 0.44  | 0.525   | 0.466 | 0.094  | 0.262  | 0 | 0
EGS-Uniform         | 0.46  | 0.524   | 0.505 | 0.093  | 0.262  | 2 | 0
SAS-Rank Gaussian   | 0.454 | 0.517   | 0.5   | 0.097  | 0.266  | 2 | 1
SAS-Rank Cauchy     | 0.456 | 0.524   | 0.506 | 0.093  | 0.263  | 2 | 0
SAS-Rank Levy       | 0.429 | 0.527   | 0.475 | 0.089  | 0.264  | 0 | 0
SAS-Rank Uniform    | 0.449 | 0.536   | 0.493 | 0.095  | 0.262  | 1 | 1
IESR-Gaussian       | 0.42  | 0.472   | 0.476 | 0.095  | 0.263  | 0 | 0
IESR-Cauchy         | 0.44  | 0.475   | 0.476 | 0.087  | 0.259  | 0 | 0
IESR-Levy           | 0.457 | 0.48    | 0.488 | 0.095  | 0.264  | 4 | 0
IESR-Uniform        | 0.446 | 0.46    | 0.467 | 0.092  | 0.26   | 0 | 0
IEGSR-Gaussian      | 0.448 | 0.521   | 0.49  | 0.092  | 0.262  | 0 | 0
IEGSR-Cauchy        | 0.456 | 0.528   | 0.481 | 0.096  | 0.262  | 3 | 0
IEGSR-Levy          | 0.444 | 0.516   | 0.506 | 0.09   | 0.269  | 2 | 1
IEGSR-Uniform       | 0.452 | 0.52    | 0.494 | 0.095  | 0.262  | 0 | 0
ISASR-Gaussian      | 0.46  | 0.532   | 0.51  | 0.09   | 0.264  | 2 | 0
ISASR-Cauchy        | 0.466 | 0.512   | 0.47  | 0.094  | 0.264  | 2 | 1
ISASR-Levy          | 0.43  | 0.507   | 0.497 | 0.093  | 0.262  | 0 | 0
ISASR-Uniform       | 0.461 | 0.53    | 0.496 | 0.093  | 0.267  | 1 | 0

Table 2 The results obtained by ranking models on unseen test data, with and without Online DC initialization.
MQ2008 For Predictive Ranking Models for Hybrid Online-Offline
Methods vs Metrics  | MAP   | NDCG@10 | RR@10 | ERR@10 | P@10  | Winning PDR | Winning EA
DCM                 | 0.281 | 0.355   | 0.329 | 0.054  | 0.207 | 0 | 0
IES-Click Gaussian  | 0.453 | 0.49    | 0.483 | 0.094  | 0.264 | 2 | 0
IES-Click Cauchy    | 0.45  | 0.478   | 0.502 | 0.095  | 0.262 | 2 | 0
IES-Click Levy      | 0.448 | 0.465   | 0.49  | 0.093  | 0.263 | 0 | 0
IES-Click Uniform   | 0.453 | 0.473   | 0.47  | 0.094  | 0.262 | 1 | 0
IEGS-Click Gaussian | 0.451 | 0.511   | 0.483 | 0.094  | 0.263 | 0 | 0
IEGS-Click Cauchy   | 0.452 | 0.517   | 0.475 | 0.095  | 0.262 | 2 | 0
IEGS-Click Levy     | 0.422 | 0.518   | 0.493 | 0.094  | 0.264 | 1 | 0
IEGS-Click Uniform  | 0.436 | 0.521   | 0.496 | 0.093  | 0.259 | 2 | 0
ISAS-Click Gaussian | 0.462 | 0.508   | 0.498 | 0.095  | 0.267 | 2 | 2
ISAS-Click Cauchy   | 0.461 | 0.53    | 0.483 | 0.094  | 0.261 | 0 | 0
ISAS-Click Levy     | 0.426 | 0.532   | 0.506 | 0.095  | 0.265 | 1 | 1
ISAS-Click Uniform  | 0.468 | 0.538   | 0.5   | 0.094  | 0.262 | 2 | 2

Table 3 The results achieved by ranking models on training data, with and without Offline LR initialization.
MQ2008 On Training Data For Evolving Results
Methods vs Metrics  | MAP   | NDCG@10 | RR@10 | ERR@10 | P@10  | Winning PDR | Winning EA
LR                  | 0.283 | 0.364   | 0.295 | 0.051  | 0.217 | 0 | 0
ES-Gaussian         | 0.48  | 0.507   | 0.557 | 0.097  | 0.275 | 3 | 1
ES-Cauchy           | 0.445 | 0.482   | 0.515 | 0.094  | 0.269 | 0 | 0
ES-Levy             | 0.479 | 0.504   | 0.554 | 0.1    | 0.28  | 2 | 0
ES-Rank Uniform     | 0.456 | 0.474   | 0.509 | 0.095  | 0.268 | 0 | 0
EGS-Gaussian        | 0.446 | 0.552   | 0.548 | 0.098  | 0.272 | 0 | 0
EGS-Cauchy          | 0.467 | 0.547   | 0.539 | 0.098  | 0.275 | 0 | 0
EGS-Levy            | 0.467 | 0.544   | 0.521 | 0.097  | 0.275 | 0 | 0
EGS-Uniform         | 0.473 | 0.548   | 0.527 | 0.097  | 0.276 | 1 | 0
SAS-Rank Gaussian   | 0.48  | 0.536   | 0.557 | 0.099  | 0.28  | 3 | 0
SAS-Rank Cauchy     | 0.476 | 0.551   | 0.54  | 0.097  | 0.278 | 0 | 0
SAS-Rank Levy       | 0.45  | 0.546   | 0.537 | 0.097  | 0.277 | 0 | 0
SAS-Rank Uniform    | 0.475 | 0.554   | 0.549 | 0.11   | 0.279 | 2 | 2
IESR-Gaussian       | 0.451 | 0.479   | 0.532 | 0.1    | 0.279 | 3 | 0
IESR-Cauchy         | 0.45  | 0.47    | 0.5   | 0.087  | 0.265 | 0 | 0
IESR-Levy           | 0.478 | 0.5     | 0.527 | 0.097  | 0.276 | 2 | 0
IESR-Uniform        | 0.465 | 0.475   | 0.495 | 0.093  | 0.268 | 0 | 0
IEGSR-Gaussian      | 0.455 | 0.549   | 0.529 | 0.099  | 0.273 | 2 | 0
IEGSR-Cauchy        | 0.462 | 0.537   | 0.517 | 0.095  | 0.277 | 0 | 0
IEGSR-Levy          | 0.469 | 0.545   | 0.545 | 0.09   | 0.28  | 1 | 0
IEGSR-Uniform       | 0.474 | 0.543   | 0.527 | 0.092  | 0.274 | 1 | 0
ISASR-Rank Gaussian | 0.484 | 0.553   | 0.553 | 0.098  | 0.278 | 0 | 0
ISASR-Rank Cauchy   | 0.475 | 0.547   | 0.541 | 0.099  | 0.278 | 0 | 0
ISASR-Rank Levy     | 0.468 | 0.543   | 0.547 | 0.097  | 0.276 | 0 | 0
ISASR-Rank Uniform  | 0.483 | 0.557   | 0.551 | 0.12   | 0.28  | 4 | 2

Table 4 The results from ranking models on training data, with and without Online DC initialization.
MQ2008 On Training Data For Evolving Results
Methods vs Metrics  | MAP   | NDCG@10 | RR@10 | ERR@10 | P@10  | Winning PDR | Winning EA
DCM                 | 0.284 | 0.364   | 0.344 | 0.057  | 0.211 | 0 | 0
IES-Click Gaussian  | 0.479 | 0.499   | 0.556 | 0.098  | 0.279 | 4 | 0
IES-Click Cauchy    | 0.461 | 0.481   | 0.532 | 0.095  | 0.273 | 0 | 0
IES-Click Levy      | 0.476 | 0.469   | 0.57  | 0.096  | 0.277 | 1 | 1
IES-Click Uniform   | 0.469 | 0.479   | 0.495 | 0.096  | 0.268 | 0 | 0
IEGS-Click Gaussian | 0.48  | 0.538   | 0.534 | 0.096  | 0.276 | 2 | 0
IEGS-Click Cauchy   | 0.467 | 0.547   | 0.54  | 0.095  | 0.274 | 1 | 0
IEGS-Click Levy     | 0.462 | 0.529   | 0.525 | 0.092  | 0.275 | 0 | 0
IEGS-Click Uniform  | 0.465 | 0.553   | 0.521 | 0.097  | 0.274 | 2 | 0
ISAS-Click Gaussian | 0.481 | 0.53    | 0.552 | 0.099  | 0.28  | 2 | 1
ISAS-Click Cauchy   | 0.472 | 0.556   | 0.546 | 0.096  | 0.277 | 1 | 1
ISAS-Click Levy     | 0.468 | 0.553   | 0.539 | 0.099  | 0.277 | 0 | 0
ISAS-Click Uniform  | 0.485 | 0.553   | 0.547 | 0.11   | 0.278 | 2 | 2

Table 5 The exploration winning rates of the probability distributions used as random generators, on unseen data.
Winning Rate for MQ2008 Predictive Ranking Models
Type of Learning      | Probability Distribution | Winning Counts
Offline Ranking Model | Gaussian | 7
Offline Ranking Model | Cauchy   | 8
Offline Ranking Model | Levy     | 9
Offline Ranking Model | Uniform  | 5
Hybrid Online-Offline | Gaussian | 4
Hybrid Online-Offline | Cauchy   | 4
Hybrid Online-Offline | Levy     | 2
Hybrid Online-Offline | Uniform  | 5

Table 6 The exploration winning rates of the probability distributions used as random generators, on training data.
Winning Rate for MQ2008 Ranking Models On Training Data
Type of Learning      | Probability Distribution | Winning Counts
Offline Ranking Model | Gaussian | 11
Offline Ranking Model | Cauchy   | 0
Offline Ranking Model | Levy     | 5
Offline Ranking Model | Uniform  | 8
Hybrid Online-Offline | Gaussian | 8
Hybrid Online-Offline | Cauchy   | 2
Hybrid Online-Offline | Levy     | 1
Hybrid Online-Offline | Uniform  | 4

6 Conclusions
In summary, the balance between exploration and exploitation within different Evolutionary Algorithms (EAs) can be attained by employing offline linear ranking models with the available (1+1)-EAs. Specifically, the Offline EGS-Rank, SAS-Rank, and ES-Rank techniques have demonstrated the potential to enhance the performance of the Online DCM ranking model, as evaluated across five fitness metrics.
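The linear ranking models referred to throughout score each query-document pair as a weighted sum of its features, and NDCG@10 is one of the five fitness metrics used. A minimal sketch, assuming MQ2008-style graded relevance labels (0, 1, 2); the dictionary layout for documents is an illustrative assumption, not the packages' data format.

```python
import math

def score(weights, features):
    """Linear ranking model: one weight per LETOR feature."""
    return sum(w * f for w, f in zip(weights, features))

def rank(weights, docs):
    """Order a query's documents by descending model score."""
    return sorted(docs, key=lambda d: score(weights, d["features"]), reverse=True)

def ndcg_at_10(ranked_labels):
    """NDCG@10 over graded relevance labels in ranked order."""
    def dcg(labels):
        # Gain (2^rel - 1) discounted by log2 of the 1-based rank plus one.
        return sum((2 ** rel - 1) / math.log2(i + 2)
                   for i, rel in enumerate(labels[:10]))
    ideal = dcg(sorted(ranked_labels, reverse=True))
    return dcg(ranked_labels) / ideal if ideal > 0 else 0.0
```

A (1+1)-EA run would evaluate a candidate weight vector by ranking every query's documents with `rank` and averaging `ndcg_at_10` (or MAP, P@10, RR@10, ERR@10) over the queries.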
Notably, the ES-Rank, SAS-Rank, and EGS-Rank methods are the existing EAs that have been applied to the learning-to-rank problem so far, and they stand out as the sole software packages capable of initializing ranking models from other methods. These EAs incorporate linear ranking models in their initialization process and ultimately yield linear ranking models as their final output. The research began by learning the Online DC and Offline LR ranking models on benchmark data; subsequently, SAS-Rank, EGS-Rank, and ES-Rank leveraged the DC and LR ranking models to evolve superior ranking models. In a broader context, the SAS-Rank model consistently outperformed the other methods across the various predictive and evolving scenarios on the MQ2008 dataset. Furthermore, the Gaussian random generator demonstrated superior performance in both the evolutionary and the testing results.
Table 7 The winning rates of the exploration and exploitation algorithms, with different mutations and initializations, on new, unseen data.
Winning Rate for MQ2008 Predictive Ranking Models for Algorithms with/without Initialization
Type of Learning      | Algorithm Type | Winning Counts
Offline Ranking Model | ES-Rank    | 1
Offline Ranking Model | IESR-Rank  | 0
Offline Ranking Model | EGS-Rank   | 0
Offline Ranking Model | IEGSR-Rank | 1
Offline Ranking Model | SAS-Rank   | 2
Offline Ranking Model | ISASR-Rank | 1
Hybrid Online-Offline | IES-Click  | 0
Hybrid Online-Offline | IEGS-Click | 0
Hybrid Online-Offline | ISAS-Click | 5
Table 8 The winning rates of the algorithms utilizing exploration and exploitation, with diverse mutations and initializations, on training data.
Winning Rate for MQ2008 Training Ranking Models for Algorithms with/without Initialization
Type of Learning      | Algorithm Type | Winning Counts
Offline Ranking Model | ES-Rank    | 1
Offline Ranking Model | IESR-Rank  | 0
Offline Ranking Model | EGS-Rank   | 0
Offline Ranking Model | IEGSR-Rank | 0
Offline Ranking Model | SAS-Rank   | 2
Offline Ranking Model | ISASR-Rank | 2
Hybrid Online-Offline | IES-Click  | 1
Hybrid Online-Offline | IEGS-Click | 0
Hybrid Online-Offline | ISAS-Click | 4
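SAS-Rank couples (1+1)-ES mutation with a simulated-annealing acceptance rule, so that early in a run a slightly worse child may still replace the parent before the temperature cools and the search becomes purely elitist. The sketch below shows the general mechanism only; the cooling schedule, mutation scale, and Metropolis acceptance form are assumptions, not the settings used in SAS-Rank itself.

```python
import math
import random

def sa_step(current, current_f, fitness, temperature, rng):
    """One SAS-style iteration: Gaussian mutation of every weight plus a
    simulated-annealing acceptance rule that may keep a worse child."""
    child = [w + rng.gauss(0.0, 0.5) for w in current]   # illustrative sigma
    child_f = fitness(child)
    if child_f >= current_f:
        return child, child_f            # plain (1+1) elitist acceptance
    # Metropolis criterion: accept a worse child with probability
    # exp(delta / T); exploration fades as the temperature cools.
    if rng.random() < math.exp((child_f - current_f) / temperature):
        return child, child_f
    return current, current_f

def sas_search(weights, fitness, generations=500, t0=1.0, seed=0):
    """(1+1)-ES with simulated-annealing acceptance and linear cooling."""
    rng = random.Random(seed)
    current, current_f = list(weights), fitness(weights)
    for g in range(generations):
        temperature = t0 * (1.0 - g / generations)   # stays > 0 inside the loop
        current, current_f = sa_step(current, current_f, fitness,
                                     temperature, rng)
    return current, current_f
```

The temperature term is what distinguishes this hybrid from the plain (1+1)-ES: with `t0 = 0` the Metropolis branch never fires and the two searches coincide.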
Additionally, it is worth noting that Offline LTR techniques consistently outperformed Online LTR approaches across the five fitness evaluation metrics employed. This can be attributed to the capability of Offline methods to learn from explicit relevance labels, as opposed to Online LTR, which relies on implicit relevance labels. Looking ahead, there is potential for further research and analysis using the Java software packages on diverse learning to rank datasets, paving the way for future advancements in this field.
Declarations
Ethical Approval There is no involvement of humans or animals in this research.
Authors' contributions The paper was written and reviewed by the authors.
Availability of data and material The data is publicly accessible on the Microsoft Research website, where it has been provided by the Microsoft research team. The source code is referenced in Section 4.
Conflict of interest The author affirms that there are no conflicts of interest.
Funding and Support The author wishes to express appreciation to STDF, Egypt, for their generous support in covering the open access fees for this paper published in Springer journals.
References
D. V. Arnold and R. Salomon. Evolutionary gradient search revisited. IEEE Transactions on Evolutionary Computation, 11(4):480–495, 2007. ISSN 1089-778X. doi: 10.1109/TEVC.2006.882427.
Leo Breiman. Random forests. Machine Learning, 45(1):5–32, 2001. ISSN 1573-0565. doi: 10.1023/A:1010933404324. URL http://dx.doi.org/10.1023/A:1010933404324.
Chris Burges, Tal Shaked, Erin Renshaw, Ari Lazier, Matt Deeds, Nicole Hamilton, and Greg Hullender. Learning to rank using gradient descent. In Proceedings of the 22nd International Conference on Machine Learning, ICML '05, pages 89–96, New York, NY, USA, 2005. ACM. ISBN 1-59593-180-5. doi: 10.1145/1102351.1102363.
Zhe Cao, Tao Qin, Tie-Yan Liu, Ming-Feng Tsai, and Hang Li. Learning to rank: from pairwise approach to listwise approach.
In Proceedings of the 24th International Conference on Machine Learning, ICML '07, pages 129–136, New York, NY, USA, 2007. ACM. ISBN 978-1-59593-793-3. doi: 10.1145/1273496.1273513. URL http://doi.acm.org/10.1145/1273496.1273513.
Pedro A Diaz-Gomez and Dean F Hougen. Initial population for genetic algorithms: A metric approach. In Proceedings of the 2007 International Conference on Genetic and Evolutionary Methods (GEM), pages 43–49, 2007.
Agoston E Eiben and Cornelis A Schippers. On evolutionary exploration and exploitation. Fundamenta Informaticae, 35(1-4):35–50, 1998.
David B Fogel. An overview of evolutionary programming. In Evolutionary Algorithms, pages 89–109. Springer, 1999.
Yoav Freund, Raj Iyer, Robert E. Schapire, and Yoram Singer. An efficient boosting algorithm for combining preferences. J. Mach. Learn. Res., 4:933–969, December 2003. ISSN 1532-4435. URL http://dl.acm.org/citation.cfm?id=945365.964285.
Jerome H. Friedman. Greedy function approximation: A gradient boosting machine. The Annals of Statistics, 29(5):1189–1232, 2001. ISSN 00905364. URL http://www.jstor.org/stable/2699986.
Francisco Herrera, Manuel Lozano, et al. Adaptation of genetic algorithm parameters based on fuzzy logic controllers. Genetic Algorithms and Soft Computing, 8(1996):95–125, 1996.
Osman Ali Sadek Ibrahim and D. Landa-Silva. An evolutionary strategy with machine learning for learning to rank in information retrieval. Soft Computing, 22, 2018. ISSN 14337479. doi: 10.1007/s00500-017-2988-6.
Osman Ali Sadek Ibrahim and Dario Landa-Silva. (1+1)-Evolutionary Gradient Strategy to Evolve Global Term Weights in Information Retrieval, pages 387–405. Springer International Publishing, Cham, 2017. ISBN 978-3-319-46562-3. doi: 10.1007/978-3-319-46562-3_25. URL http://dx.doi.org/10.1007/978-3-319-46562-3_25.
Osman Ali Sadek Ibrahim and Eman MG Younis.
Hybrid online–offline learning to rank using simulated annealing strategy based on dependent click model. Knowledge and Information Systems, pages 1–15, 2022.
Mohammad Ashiful Islam. RankGPES: Learning to rank for information retrieval using a hybrid genetic programming with evolutionary strategies, 2013.
Oliver Kramer. Machine Learning for Evolution Strategies, volume 20. Springer, 2016.
Jesus Dario Landa Silva. Metaheuristic and multiobjective approaches for space allocation. PhD thesis, University of Nottingham, 2003.
Hang Li. Learning to Rank for Information Retrieval and Natural Language Processing, Second Edition. Morgan and Claypool Publishers, 2014. ISBN 9781627055857.
J. Y. Lin, J. Y. Yeh, and Chao Chung Liu. Learning to rank for information retrieval using layered multi-population genetic programming. In Computational Intelligence and Cybernetics (CyberneticsCom), 2012 IEEE International Conference on, pages 45–49, July 2012. doi: 10.1109/CyberneticsCom.2012.6381614.
Tie-Yan Liu. Learning to rank for information retrieval. Foundations and Trends in Information Retrieval, 3(3):225–331, 2009.
Tie-Yan Liu. Learning to Rank for Information Retrieval, chapter The LETOR Datasets, pages 133–143. Springer Berlin Heidelberg, Berlin, Heidelberg, 2011. ISBN 978-3-642-14267-3. doi: 10.1007/978-3-642-14267-3_10. URL http://dx.doi.org/10.1007/978-3-642-14267-3_10.
Donald Metzler and W. Bruce Croft. Linear feature-based models for information retrieval. Information Retrieval, 10(3):257–274, 2007. ISSN 1573-7659. doi: 10.1007/s10791-006-9019-z. URL http://dx.doi.org/10.1007/s10791-006-9019-z.
Zbigniew Michalewicz. Genetic Algorithms + Data Structures = Evolution Programs (3rd Ed.). Springer-Verlag, Berlin, Heidelberg, 1996. ISBN 3540606769.
Jung-Yi Lin Mick, Accessed 2016. URL http://people.cs.nctu.edu.tw/~jylin/lagep/lagep.html.
A. Mohan, Z. Chen, and K.Q. Weinberger. Web-search ranking with initialized gradient boosted regression trees.
In Journal of Machine Learning Research, Workshop and Conference Proceedings, volume 14, pages 77–89, 2011.
Volker Nannen, Selmar K Smit, and Agoston E Eiben. Costs and benefits of tuning parameters of evolutionary algorithms. In International Conference on Parallel Problem Solving from Nature, pages 528–538. Springer, 2008.
Tao Qin and Tie-Yan Liu. Introducing LETOR 4.0 datasets. CoRR, abs/1306.2597, 2013. URL http://arxiv.org/abs/1306.2597.
Tao Qin, Tie-Yan Liu, Jun Xu, and Hang Li. LETOR: A benchmark collection for research on learning to rank for information retrieval. Information Retrieval, 13(4):346–374, 2010. ISSN 1573-7659. doi: 10.1007/s10791-009-9123-y. URL http://dx.doi.org/10.1007/s10791-009-9123-y.
Anne Schuth, Katja Hofmann, Shimon Whiteson, and Maarten De Rijke. Lerot: An online learning to rank framework. In Proceedings of the 2013 Workshop on Living Labs for Information Retrieval Evaluation, pages 23–26, 2013.
Jun Xu and Hang Li. AdaRank: A boosting algorithm for information retrieval. In Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '07, pages 391–398, New York, NY, USA, 2007. ACM. ISBN 978-1-59593-597-7. doi: 10.1145/1277741.1277809. URL http://doi.acm.org/10.1145/1277741.1277809.
Xin Yan and Xiao Gang Su. Linear Regression Analysis: Theory and Computing. World Scientific Publishing Co., Inc., River Edge, NJ, USA, 2009. ISBN 9789812834102, 9812834109.
Yanfen Zheng, Jieqiong Wu, and Bin Wang. CLGBO: An algorithm for constructing highly robust coding sets for DNA storage. Frontiers in Genetics, 12, 2021. ISSN 1664-8021. doi: 10.3389/fgene.2021.644945. URL https://www.frontiersin.org/articles/10.3389/fgene.2021.644945.