Molecular Dynamics and Machine Learning in

1. Introduction

Catalysts have attracted growing interest due to their unique effects on chemical reactions. A catalyst can increase or decrease the rate of a chemical reaction without changing its own chemical properties, and it does not change the chemical equilibrium. Therefore, catalysts are widely used in numerous fields, such as electroreduction [1,2,3], chemical formation [4,5], combustion [6,7,8], and environmental conservation [9,10,11]. There are many kinds of catalysts, such as metal catalysts, metal oxide catalysts, molecular sieve catalysts [12], biocatalysts [13], and nanocatalysts [14]. As the field develops, different catalysts are constantly being discovered and created.

Although new catalysts are still being discovered, a deep understanding of the catalysis mechanism in chemical reactions is still lacking and needs continuous improvement. Computational simulation and experimentation are the two main approaches to studying catalysts. Compared with experiments, computational simulations can provide atomic insights that go deeper into the microscopic mechanism [15]. First-principles calculation is a common computational approach in the catalysis field. The ab initio molecular dynamics (AIMD) method is preferred for studying the mechanisms of catalytic reactions because it addresses the main difficulties in describing chemical reactions accurately, namely the precise calculation of the electronic structure and the dynamic process of atomic motion [16]. The AIMD approach solves the Schrödinger equation using various approximations [17]. It combines quantum mechanics and molecular dynamics and can therefore accurately describe both the electronic structure and the atomic motion. However, due to its high computational cost, this approach can only be applied to small systems.

Compared with AIMD, classical molecular dynamics (MD) simulation can handle larger and more complex systems. Still, the drawbacks of this simulation method, which is based on classical mechanics, are also evident, such as the lack of an accurate electronic structure calculation. The empirical interatomic potential (EIP), a parametrized approximate function that describes the interaction between atoms [18], is the basis of classical MD. The accuracy of the EIP determines the accuracy of the calculation results, so the EIP also strongly restricts the progress of classical MD. In recent years, the development of force fields such as the reactive force field (ReaxFF) [19] has improved the performance of classical MD and expanded the scope of its applications. On the one hand, ReaxFF is a bridge between quantum chemistry and non-reactive EIPs that can describe the chemical reaction process [20]. On the other hand, although its computational speed is about one order of magnitude slower than that of classical force fields, due to the charge equilibration calculations at each timestep and the modeling of bond formation and breaking [21], ReaxFF remains a useful and still evolving molecular dynamics method for studying chemical reactions [22,23].

The burst of numerical simulations has not only promoted the development of catalysts but also produced vast amounts of relevant data. With the widespread use of data science methods in numerous fields [24], especially machine learning (ML) [25,26,27,28,29], searching for new catalysts through big data has gained widespread attention [30,31]. A new catalyst with good performance in a given chemical reaction is difficult to discover because its performance depends on various properties, such as particle size [32], composition [33], and support [34,35,36,37]. Traditional catalyst selection was mainly empirical and required considerable time and capital to find the optimal candidate. ML offers a new approach to finding new catalysts, helps to select highly efficient catalysts, and shortens the research time [38]. Given sufficient relevant data, an ML model can consider different factors together to determine the optimized result and provide good predictions [39,40].

Many reviews have reported and discussed AIMD and ReaxFF molecular dynamics for catalysis. Stirling et al. [15] reviewed both experimental and computational studies of the Wacker process. That review reported in detail how static calculations and ab initio molecular dynamics have been used to study the mechanism of the Wacker process, and it explained the restrictions of static analyses and why the AIMD approach is necessary for studying this process. Senftle et al. [41] reviewed the development of the reactive force field from its beginning to maturity. The application range of ReaxFF was listed, including heterogeneous catalysis, atomic layer deposition, and others. Notably, this review proposed future developments of ReaxFF and highlighted the strong computing capacity and high computing speed of a ReaxFF molecular dynamics code, PuReMD-PGPU.

This review briefly introduces molecular dynamics, including ab initio molecular dynamics and reactive molecular dynamics, and their application in several chemical reactions. Most importantly, the advantages and disadvantages of AIMD and ReaxFF are comprehensively reviewed and compared with machine learning approaches in catalysis, especially machine learning potentials. This paper is divided into two main parts. The first part reviews studies that use the AIMD method and ReaxFF molecular dynamics to simulate different processes, including the growth of carbon materials, dehydrogenation, hydrogenation, oxidation reactions, segregation, and restructuring. The second part provides an overview of ML methods and their application in catalysis, including machine learning potentials, new catalyst discovery and design, and some helpful machine learning community projects.

2. Molecular Dynamics

2.1. Introduction of Molecular Dynamics

Although the mechanism of a chemical reaction can be investigated by experiments, even with robotic assistance [42], studying complex systems experimentally is still difficult. Calculations make it feasible to explore complex systems and reactions. Both static calculations and molecular dynamics have been intensively used to study reaction mechanisms. However, analyses of the reaction mechanism require accurate electronic structure calculations and the real-time tracking of atomic motions. Molecular dynamics provides more information than static calculations and yields dynamic configurations that offer kinetic and thermodynamic data as well as reaction trajectories for visual inspection. Furthermore, as the complexity of the reaction process increases, the different molecular dynamics methods offer more computational possibilities to help bridge the time-scale gap of AIMD.

2.1.1. Ab initio Molecular Dynamics

Ab initio molecular dynamics, which combines molecular dynamics with forces calculated directly from the electronic structure, is a helpful method in the theoretical study of chemical reactions [43]. The electronic structure is solved directly at each step, which allows for bond breaking and formation [44,45]. Although a direct solution of the Schrödinger equation would fully yield the wave function and the exact total energy of the nuclei and electrons [44], it is impossible to solve the Schrödinger equation directly for complex systems. Therefore, several approximations are employed. One of the most important is the Born-Oppenheimer approximation [46], which assumes that the motion of the nuclei and the electrons can be separated because of the large difference between the nuclear and electronic masses. Many other approximations have been used to simplify the problem further, giving rise to several different methods, such as Hartree-Fock molecular dynamics [47], Kohn-Sham molecular dynamics [48], Car-Parrinello molecular dynamics [49,50], and path integral molecular dynamics [51]. Numerous AIMD codes are popular and widely used, including VASP [52], Quantum ESPRESSO [53], CP2K [54], and CPMD [55].
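
The structure of an AIMD run, in which the forces are recomputed at every step and the nuclei are then propagated classically, can be illustrated with a short sketch. It assumes a velocity Verlet integrator in reduced units; `harmonic_force` is a hypothetical stand-in for the electronic structure step, which in a real AIMD code would be a DFT calculation, and all function names are chosen here for illustration only.

```python
from typing import Callable, List, Tuple


def harmonic_force(x: float, k: float = 1.0) -> float:
    """Hypothetical placeholder for the electronic structure step.

    A real AIMD code would solve the electronic problem for the current
    nuclear positions and return the resulting force on the nucleus.
    """
    return -k * x


def velocity_verlet(
    x: float,
    v: float,
    mass: float,
    dt: float,
    n_steps: int,
    force: Callable[[float], float],
) -> List[Tuple[float, float]]:
    """Propagate one nuclear coordinate; the force is recomputed every step."""
    trajectory = [(x, v)]
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # new force at the new position
        v = v_half + 0.5 * dt * f / mass   # complete the velocity update
        trajectory.append((x, v))
    return trajectory


def total_energy(x: float, v: float, mass: float = 1.0, k: float = 1.0) -> float:
    """Kinetic plus harmonic potential energy (reduced units)."""
    return 0.5 * mass * v * v + 0.5 * k * x * x


if __name__ == "__main__":
    traj = velocity_verlet(x=1.0, v=0.0, mass=1.0, dt=0.01, n_steps=1000,
                           force=harmonic_force)
    drift = abs(total_energy(*traj[-1]) - total_energy(*traj[0]))
    print(f"energy drift over {len(traj) - 1} steps: {drift:.2e}")
```

The call to `force` inside the loop is where the cost of AIMD arises: replacing the placeholder with an electronic structure calculation makes every time step expensive, which is why the approach is limited to small systems and short time scales.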

2.1.2. Reactive Force Field Molecular Dynamics

Despite the wide use of the AIMD approach, its computing speed and cost restrict the size of the systems that can be studied. For large systems, such as polymers [56,57] or systems with many active sites [58], different approaches, like ReaxFF molecular dynamics, should be employed. Unlike the AIMD approach, in which the electronic structure is solved directly, ReaxFF molecular dynamics is based on the reactive force field. ReaxFF is a bond-order-dependent force field whose total energy can be expressed as:

(1) $E_{system} = E_{bond} + E_{over} + E_{under} + E_{val} + E_{pen} + E_{tors} + E_{conj} + E_{vdw} + E_{coulomb}$

where the first term E_bond is the bond energy, and the second and third terms E_over and E_under are the over-coordination and under-coordination penalty terms, respectively. The fourth term E_val is the valence angle term. The fifth term E_pen is also a penalty term, representing the effects of over-coordination and under-coordination in the central atom. E_tors is the torsion angle term, and E_conj describes the contribution of conjugation effects to the total energy. The last two terms E_vdw and E_coulomb denote the non-bonded van der Waals and Coulomb interactions, respectively. The most important quantity in ReaxFF is the bond order, which is calculated directly from the interatomic distance r_ij by the following equation:

(2) $BO_{ij}^{\prime} = \exp\left[P_{bo,1} \left(\frac{r_{ij}}{r_{o}^{\sigma}}\right)^{P_{bo,2}}\right] + \exp\left[P_{bo,3} \left(\frac{r_{ij}}{r_{o}^{\pi}}\right)^{P_{bo,4}}\right] + \exp\left[P_{bo,5} \left(\frac{r_{ij}}{r_{o}^{\pi\pi}}\right)^{P_{bo,6}}\right]$

The three terms in Equation (2) represent the sigma bond, the first pi bond, and the second pi bond, respectively; P_bo,1 to P_bo,6 are fitted parameters, and r_o^σ, r_o^π, and r_o^ππ are the corresponding equilibrium bond lengths. The initial ReaxFF only described hydrocarbons and was gradually expanded to other materials, such as Si, SiO₂ [59], MgH [60], and Al₂O₃ [61]. This tree-like development process [60,62,63,64] is illustrated in Figure 1. The formulation of ReaxFF is complicated, and it was not until 2005 that its functional form became uniform; it is now directly supported by open-source software such as LAMMPS [65] and PuReMD [66,67].
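
The distance dependence of the bond order can be made concrete with a minimal sketch. It assumes the commonly published ReaxFF form, in which the sigma, first pi, and second pi contributions are each a separate exponential of the scaled interatomic distance. The parameter values and the names `BondOrderParams`, `bond_order_terms`, and `uncorrected_bond_order` are illustrative placeholders and are not taken from any published ReaxFF parameter set.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass
class BondOrderParams:
    """Illustrative values only; NOT a published ReaxFF parameter set."""
    p_bo1: float = -0.08    # sigma prefactor (negative: decays with distance)
    p_bo2: float = 6.0      # sigma exponent
    p_bo3: float = -0.10    # first pi prefactor
    p_bo4: float = 9.0      # first pi exponent
    p_bo5: float = -0.20    # second pi prefactor
    p_bo6: float = 16.0     # second pi exponent
    r0_sigma: float = 1.40  # assumed equilibrium distances (angstrom)
    r0_pi: float = 1.25
    r0_pipi: float = 1.15


def bond_order_terms(r_ij: float, p: BondOrderParams) -> Tuple[float, float, float]:
    """Return the sigma, first pi, and second pi contributions."""
    sigma = math.exp(p.p_bo1 * (r_ij / p.r0_sigma) ** p.p_bo2)
    pi = math.exp(p.p_bo3 * (r_ij / p.r0_pi) ** p.p_bo4)
    pipi = math.exp(p.p_bo5 * (r_ij / p.r0_pipi) ** p.p_bo6)
    return sigma, pi, pipi


def uncorrected_bond_order(r_ij: float, p: BondOrderParams) -> float:
    """Sum of the three contributions for one atom pair."""
    return sum(bond_order_terms(r_ij, p))


if __name__ == "__main__":
    params = BondOrderParams()
    for r in (1.2, 1.4, 1.6, 2.0, 3.0):
        s, p1, p2 = bond_order_terms(r, params)
        print(f"r = {r:.1f}: sigma = {s:.3f}, pi = {p1:.3f}, pi-pi = {p2:.3f}")
```

Because the bond order varies smoothly with distance and vanishes at large separation, bonds can form and break continuously during a simulation without a predefined connectivity.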

In the following subsections, different chemical processes studied by molecular dynamics are reported, including the growth of carbon materials, dehydrogenation, and oxidation reactions. In addition, several dynamical phenomena in catalysis that can only be modeled with molecular dynamics, such as segregation and restructuring, are also reported.

2.2. Application of AIMD and ReaxFF

2.2.1. The Growth of the Carbon Materials

Undoubtedly, carbon materials have attracted great interest in the last two decades, especially since the discovery of carbon nanotubes and graphene [68,69,70,71]. Carbon materials have been extensively studied and prepared by different experimental methods; even now, however, the growth mechanisms of some carbon materials, such as multi-walled carbon nanotubes, are still not well understood [72]. Chen et al. [73] investigated the dynamics of the growth of amorphous carbon in graphene using AIMD. The generated structures derived from sp³-carbon and sp²-carbon showed significant differences when the system temperature varied from 300 K to 1800 K under the catalysis of nickel, and this transformation process is depicted in Figure 2. In addition, a transformation process markedly different from conventional chemical vapor deposition (CVD) growth was found. Fukuhara et al. [74] investigated nickel-carbon binary clusters as catalysts for the formation of carbon nanotubes. Using AIMD, the kinetic process of ethanol dehydrogenation was simulated, and the catalytic mechanism of nickel-carbon clusters was revealed at the atomic scale. It was observed that more carbon atoms tended to stay on the surface of nickel. Meanwhile, carbon chains formed on the surface as the number of carbon atoms increased.

In addition, the problem of carbon nanotube chirality has long been discussed. The process of carbon nanotube growth, which includes the dissolution of carbon and the formation of the nanotube, has been widely studied with reactive force fields. Neyts et al. [75] employed ReaxFF molecular dynamics and Monte Carlo simulations to investigate the growth process of carbon nanotubes. The observed growth process was consistent with previous studies. Most importantly, a change of chirality during the growth process was reported for the first time, as shown in Figure 3.

2.2.2. Dehydrogenation and Hydrogenation

Ethylene is an essential chemical raw material, and ethane dehydrogenation is one of the main methods to produce it. Although the process is widely applied in industry, numerous challenges remain [76], such as deactivation of the catalyst. Studying the ethane dehydrogenation mechanism and finding more effective catalysts will help to confront these challenges. Using density functional theory and an ab initio microkinetic model, Jalid et al. [77] systematically investigated the reaction mechanism of ethane dehydrogenation with transition metals (Pt, Pd, Co, Ni, Rh, Ru, Re, Cu, Au, and Ag) as catalysts and CO₂ as a mild oxidant. Different surface types (111 and 211) were considered as factors affecting the reaction. The simulation results show that ethane is mainly dehydrogenated directly to ethylene, and that Rh and Pt are the most efficient catalysts among the transition metals considered.

In addition, the coupling of thermodynamics and kinetics, which has proven to be a more difficult challenge, is often neglected. In contrast to the ethane dehydrogenation reaction, the reaction mechanism of ethylene hydrogenation on the δ-MoC(001) surface was investigated by Jimenez et al. [78]. A suitable structure was optimized by density functional theory, and ab initio thermodynamics and kinetics were used to relate the hydrogen surface coverage and the activation energy barriers to the reaction rate on the δ-MoC(001) surface. In this study, the activation energy barrier on the δ-MoC(001) surface was found to be lower than on the Pt(111) and Pd(111) catalyst surfaces. Figure 4 illustrates the relationship between hydrogen coverage and ethylene hydrogenation.

As for the application of ReaxFF molecular dynamics in dehydrogenation, oxidative dehydrogenation plays an important role. Chenoweth et al. [79] fitted ReaxFF parameters for oxidative dehydrogenation over vanadium oxide catalysts using a quantum mechanical approach. The structures and energies of several vanadium oxides, such as V₂O₅, VO₂, and V₂O₃, were well reproduced with the fitted ReaxFF parameters. In addition, the oxidation of methanol was simulated by molecular dynamics, and the results agreed with experiments, supporting the accuracy of the fitted parameters. Feng et al. [80] studied the oxidation of methane by molecular dynamics, with ReaxFF chosen as the force field to simulate the oxidative dehydrogenation process. Several catalysts were considered, e.g., functionalized graphene sheets (FGS), Pt, and Pt@FGS, and the Pt@FGS catalyst showed the best catalytic performance. The essence of the catalytic oxidation of methane is the breaking of C–H bonds and the formation of hydroxyl groups. The Pt@FGS catalyst increases the dehydrogenation rate of methane and drives the catalytic cycle, both of which increase the reaction rate. In addition, the hydroxyl groups generated by oxidation further enhance the functionalization of FGS, leading to an enhanced reaction.

2.2.3. Oxidation Reaction

Oxidation reactions are common yet essential chemical reactions. As promising green technologies, the water and CO oxidation processes have been widely researched. The oxidation of water over a cobalt oxide catalyst was investigated at the atomic level using AIMD by Mattioli et al. [81]. The simulation results were directly compared with X-ray absorption spectroscopy results, and the calculated and measured bond distances were found to agree. Both calculations and experiments further revealed the real structure of the cobalt oxide catalyst in the water oxidation reaction and supported the conclusion that the cobalt oxide catalyst promotes the presence of low-resistance hydrogen bonds. Wang et al. [82] observed the CO oxidation reaction process at the atomic scale using AIMD. The catalytic reaction mechanism of the Au/TiO₂ interfacial oxidation reaction was further investigated. Owing to the Au/TiO₂ catalyst, the oxidation of CO can occur over a wide temperature range from 120 K to 700 K, with faster reaction rates at higher temperatures. In addition, the surface charge of gold greatly influences the oxidation process; the charge cycle diagram is shown in Figure 5.

In addition, many limitations remain in understanding the oxidation of complex organic matter, and some mechanisms are still unclear. Because powerful tools for such complicated systems are lacking, ReaxFF molecular dynamics is widely used to understand the oxidation process. Zhang et al. [83] employed ReaxFF molecular dynamics simulations to study ethanol oxidation in the presence of Al nanoparticles. The reaction temperature decreased to 324 K due to the presence of the Al nanoparticles, and more reaction pathways were found. Most importantly, as the reaction temperature increased, the Al nanoparticles changed from a solid to a liquid state, which resulted in more effective diffusion of H and O atoms in the nanoparticles. This provided more active sites and accelerated the reaction.

The oxidation of methane on a palladium catalyst surface was comprehensively investigated by Mao et al. [84]. ReaxFF molecular dynamics simulations were used to model bond breakage and formation. In addition to the bare surface, oxygen-covered surfaces with different levels of oxygen coverage were considered. The reaction temperature was used as an indicator of the difficulty of the reaction. During the oxidation of methane, oxygen is more likely to occupy the active sites, while the oxygen covering the surface of the palladium catalyst hinders the dissociation and adsorption of oxygen. However, the oxygen-covered palladium catalyst promotes the oxidation reaction more strongly than the bare palladium catalyst, as indicated by the lower reaction temperature. The optimization of catalysts is complicated by the kinetic processes and numerous factors involved in catalytic reactions.

2.2.4. Segregation and Restructuring

Many studies of reaction mechanisms have combined static calculations and AIMD approaches. Gibbs energy differences and free energy barriers, which can be related to experiments, are calculated by static analysis. However, some phenomena, like segregation, restructuring, and excitation, can only be simulated by molecular dynamics. Using AIMD methods, Hoppe et al. [85] studied the segregation behavior of Ag atoms. The DFT calculations and AIMD simulations suggested that the silver atom sits next to the chain and does not replace the gold atom.

Furthermore, a more intuitive picture of the dynamic process can be obtained with AIMD methods. Wittkamper et al. [43] studied the restructuring of the Rh–Ga model system caused by oxidation. In this research, the simulation results were consistent with the experiments. Both supported the claim that, compared to β-Ga₂O₃, Rh is less likely to stay in the bulk Ga solution. Barnard et al. [86] systematically investigated the role of interstitials in radiation-induced segregation (RIS). Due to the low migration barrier, interstitial diffusion can be easily simulated by molecular dynamics, and the AIMD approach was preferred for its accuracy. In this study, a Wiedersich-type rate theory model was applied to a Ni–18Cr alloy. Using the AIMD method, the prediction that interstitial diffusion may result in the enrichment of Cr near the grain boundary was confirmed. In addition, despite the errors in the lower-temperature simulations, AIMD can still be a valuable method for studying RIS.

2.2.5. Discussion

The successful use of molecular dynamics cannot hide the shortcomings of AIMD and ReaxFF molecular dynamics. For ReaxFF, many challenges still need to be confronted, including the charge description [87], parameter optimization [88], and the complexity of the bond order [41]. For example, the bond order is essential for ReaxFF molecular dynamics, but for condensed systems, its description becomes more complicated. In addition, ReaxFF molecular dynamics is an empirical method based on assumptions and a preset functional form. The parameters are obtained from DFT calculations and experimental data. However, even with well-optimized parameters, errors arising from the limited parameter set and the preset functional form remain. Thus, a method whose accuracy is close to that of DFT calculations and whose computing speed approaches that of ReaxFF molecular dynamics is highly desirable. Machine learning brings new opportunities and challenges; in particular, the development of machine learning potentials provides a new direction for solving the above problems. This topic is discussed in detail in the next part.

3. Machine Learning in Catalysts

More efficient catalysts and the discovery of new catalysts are goals pursued by chemists [89,90,91]. However, optimizing and searching for catalysts is complicated because the factors affecting catalyst performance are diverse, and sometimes a subtle structural change can cause a dramatic shift in catalyst performance [92]. Unfortunately, the catalytic mechanisms of some reactions are still poorly understood and need to be further explored. In addition, traditional catalyst optimization and search require keen scientific intuition and extensive experience, which poses a considerable challenge to scientists. ML has opened new approaches to address these issues.

Traditional problem-solving methods are based on deduction and inference, but ML methods are based on generalization and summarization [93]. With the development of big data science, ML has been extended to numerous fields, such as aiding medicinal chemical discovery [94,95,96] and material discovery [97,98,99]. Additionally, this approach can be used for catalyst discovery and optimization.

Machine learning is a broad concept that includes many methods, such as artificial neural networks [100], support vector machines [101], linear regression [102], and kernel methods [103]. The methods used in catalyst discovery and optimization are not uniform, and sometimes different methods are used simultaneously. However, the most critical issue in catalyst discovery and optimization is the choice of descriptors, which determines the model’s accuracy. Descriptors are important because catalyst performance is sensitive to changes in structure and energy; even an energy difference of 1 kcal/mol can change the choice of catalyst [92]. Another reason is that predicting and extrapolating results remains an important part of the ML approach, although the correctness of extrapolated results is not strictly proven, and an accurate descriptor helps to make reasonable predictions. Therefore, descriptors should be carefully chosen when using ML methods. In this section, we briefly introduce three types of methods, namely neural networks, random forests, and regression, and then report on machine learning potentials, which represent a critical development.

3.1. Introduction of Methods

Convolutional neural networks (CNNs) have become a widely used method in image recognition because of their powerful feature-capturing ability. In recent years, CNN methods have been applied in the area of catalysts. Xie et al. [104] introduced CNN methods for catalyst performance prediction. The key step of this CNN method is to transform the catalyst structure into a graph. Both atomic and bonding information is considered, and the workflow is shown in Figure 6. A significant change in this study is the data input layer, which transforms the entire structure into a graph. In Figure 6, each point represents an atom, and the connections between points encode the environment of a particular atom. Only one convolutional layer, one pooling layer, and two fully connected layers are used. The convolutional layer captures the features of each atom through a nonlinear convolution function, and the pooling layer then generates the features of the whole crystal. The final two fully connected layers and the output layer are used to predict the target properties. The optimization problem solved during training can be expressed as:

(3) $\min_{W} J (y, f (ε; W))$

where J is the cost function, y is the reference value of the target property, f is the network that maps the structure ε to the predicted value, and W denotes the network weights. With sufficient training data, this model can achieve an accuracy close to that of density functional theory methods. In addition, this CNN method was used to classify material types in the same study, reaching an accuracy of up to about 0.95 on 9350 catalysts. Building on the CNN method, Back et al. [105] modified the technique proposed by Xie et al. [104] to improve the accuracy of predicted adsorption energies. It was demonstrated that the CNN method could be used to predict surface coverage and site activity, which can be helpful for catalyst design.
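
The layer sequence described for this graph-based CNN (one convolution over neighboring atoms, one pooling step, two fully connected layers, and an output layer) can be sketched as a forward pass. This is a simplified illustration and not the implementation of [104]: the gated message function, the feature sizes, and every name (`init_params`, `convolve`, `predict`) are assumptions made here, and the weights are random rather than trained.

```python
import numpy as np


def softplus(x):
    return np.log1p(np.exp(x))


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_params(node_dim, edge_dim, hidden_dim, seed=0):
    """Random, untrained weights; training would fit them by minimizing J."""
    rng = np.random.default_rng(seed)
    z_dim = 2 * node_dim + edge_dim
    return {
        "w_gate": rng.normal(scale=0.1, size=(z_dim, node_dim)),
        "w_core": rng.normal(scale=0.1, size=(z_dim, node_dim)),
        "w_fc1": rng.normal(scale=0.1, size=(node_dim, hidden_dim)),
        "w_fc2": rng.normal(scale=0.1, size=(hidden_dim, hidden_dim)),
        "w_out": rng.normal(scale=0.1, size=hidden_dim),
    }


def convolve(node_feats, edges, edge_feats, params):
    """Update each atom with gated messages from its neighbors."""
    updated = node_feats.copy()
    for (i, j), u_ij in zip(edges, edge_feats):
        # Concatenate center atom, neighbor atom, and bond features.
        z = np.concatenate([node_feats[i], node_feats[j], u_ij])
        updated[i] += sigmoid(z @ params["w_gate"]) * softplus(z @ params["w_core"])
    return updated


def predict(node_feats, edges, edge_feats, params):
    atoms = convolve(node_feats, edges, edge_feats, params)  # convolutional layer
    crystal = atoms.mean(axis=0)                              # pooling layer
    hidden = softplus(crystal @ params["w_fc1"])              # fully connected 1
    hidden = softplus(hidden @ params["w_fc2"])               # fully connected 2
    return float(hidden @ params["w_out"])                    # output layer


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    node_feats = rng.normal(size=(3, 4))           # 3 atoms, 4 features each
    edges = [(0, 1), (1, 0), (1, 2), (2, 1)]       # directed neighbor pairs
    edge_feats = rng.normal(size=(len(edges), 2))  # 2 features per bond
    params = init_params(node_dim=4, edge_dim=2, hidden_dim=8)
    print(predict(node_feats, edges, edge_feats, params))
```

Averaging over atoms in the pooling step makes the prediction independent of the order in which the atoms are listed, which is one reason a graph representation suits atomic structures.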

The random forest method is another commonly used ML approach. The decision tree method is briefly introduced first because it is the foundation of the random forest method. Generally, a decision tree consists of a finite, nonempty set of nodes and a set of edges. Through a series of decisions at successive child nodes, the original data set is divided into subsets with different attribute values. Information gain is usually used to select the attribute on which to split, and it can be expressed as:

(4) $Gain(D, a) = Ent(D) - \sum_{v = 1}^{V} \frac{|D^{v}|}{|D|} Ent(D^{v})$

where D is the data set, a is the attribute with V possible values, D^v is the subset of D in which a takes its v-th value, and Ent is the information entropy. Can et al. [106] used a decision tree to study the factors leading to high hydrogen production, which provides a simple example. The random forest method combines the classifications produced by several different decision trees.
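
As a worked illustration of Equation (4), the following sketch computes the information gain of two attributes on a tiny, made-up catalyst data set. The attribute names, values, and labels are hypothetical and serve only to show the arithmetic; the entropy is taken in bits (base-2 logarithm), which is an assumption about the convention.

```python
import math
from collections import Counter
from typing import Dict, Hashable, List, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Information entropy Ent(D) of a list of class labels, in bits."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(attribute_values: Sequence[Hashable],
                     labels: Sequence[Hashable]) -> float:
    """Gain(D, a): entropy reduction from splitting D on attribute a."""
    total = len(labels)
    subsets: Dict[Hashable, List[Hashable]] = {}
    for value, label in zip(attribute_values, labels):
        subsets.setdefault(value, []).append(label)   # build each D^v
    weighted = sum(len(s) / total * entropy(s) for s in subsets.values())
    return entropy(labels) - weighted


if __name__ == "__main__":
    # Hypothetical data: four catalysts with two attributes and a yield label.
    support = ["oxide", "oxide", "carbon", "carbon"]
    temperature = ["high", "low", "high", "low"]
    hydrogen_yield = ["high", "high", "low", "low"]

    print(information_gain(support, hydrogen_yield))      # 1.0
    print(information_gain(temperature, hydrogen_yield))  # 0.0
```

In this toy case the support separates the two yield classes perfectly, so a decision tree would split on it first, while the temperature carries no information about the label.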

Regression methods can generally be split into two types: linear regression, which includes ridge regression [107] and least absolute shrinkage and selection operator (LASSO) regression [108], and nonlinear regression, such as kernel ridge regression [109] and support vector regression [110]. The most straightforward linear regression model is a linear combination of the input variables:

(5) $y(x, w) = w_{0} + w_{1} x_{1} + \dots + w_{D} x_{D}$

where x = (x_1, ..., x_D) are the input variables and w_0, ..., w_D are the linear parameters. This model can be extended by taking a linear combination of nonlinear functions of the inputs:

(6) $y(x, w) = w_{0} + \sum_{j = 1}^{M - 1} w_{j} φ_{j}(x)$

where φ_j are the nonlinear functions and M is the total number of parameters. The parameters of this equation are obtained by minimizing the error function:

(7) $E(w) = \frac{1}{2} \sum_{n = 1}^{N} \{ y(x_{n}, w) - t_{n} \}^{2}$

where E is the error value, t_n is the target value for the n-th input x_n, and N is the number of data points. An example of the use of linear regression in catalysis can be found in the study of Werth et al. [111].
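
Equations (6) and (7) can be illustrated with a short least-squares fit. The sketch below assumes polynomial basis functions, φ_j(x) = x^j, and a synthetic noise-free target; both are choices made here for illustration and are not taken from the cited studies.

```python
import numpy as np


def design_matrix(x, n_basis):
    """Columns are phi_0(x) = 1, phi_1(x) = x, ..., phi_{M-1}(x) = x**(M-1)."""
    return np.vander(x, N=n_basis, increasing=True)


def fit(x, t, n_basis):
    """Weights w that minimize the sum-of-squares error E(w)."""
    phi = design_matrix(x, n_basis)
    w, *_ = np.linalg.lstsq(phi, t, rcond=None)
    return w


def sum_of_squares_error(x, t, w):
    """E(w) = 1/2 * sum_n (y(x_n, w) - t_n)**2."""
    y = design_matrix(x, len(w)) @ w
    return 0.5 * float(np.sum((y - t) ** 2))


if __name__ == "__main__":
    x = np.linspace(0.0, 1.0, 20)
    t = 1.0 + 2.0 * x - 3.0 * x ** 2      # synthetic target values
    w = fit(x, t, n_basis=3)
    print("weights:", np.round(w, 6))
    print("error:", sum_of_squares_error(x, t, w))
```

The model stays linear in the parameters w even though it is nonlinear in x, which is why the minimum of E(w) can be found with a single linear least-squares solve.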

3.2. Applications of Machine Learning in Catalysis

3.2.1. Machine Learning Potentials

Machine learning potentials are one of the most critical computational advances in recent years and have been intensively studied and applied in catalysis [112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127]. A machine learning potential uses a machine learning algorithm to find the underlying relationship between atomic configuration and energy [128]. It differs from empirical interatomic potentials, which are based on a presupposed mathematical formula. Hence, the errors associated with the assumed mathematical expressions and with parameter optimization can be largely avoided, and the simulation accuracy increases compared to empirical interatomic potentials. As an example, the workflow for constructing a machine learning potential [129] is shown in Figure 7. First, a series of configurations is acquired from AIMD or other methods. Then, a sufficient number of configurations is chosen, and their energies, forces, and other critical physical quantities are calculated with the DFT method. Next, each atomic structure is converted to descriptors that serve as the input of the machine learning model, and the calculated energies and forces are designated as the target quantities. Finally, the model is trained, and the machine learning potential is obtained. Several machine learning models can be used to construct machine learning potentials, such as neural networks [130,131] and Gaussian process regression [132,133,134,135,136,137,138,139]. Ulissi et al. [140] studied the active sites of bimetallic catalysts. Using DFT calculations, hundreds of possible active sites were found, and neural network potentials were used to accelerate the calculations. A nickel gallium bimetallic catalyst was used as the example.
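
The four workflow steps (sample configurations, compute reference energies, convert structures to descriptors, train a model) can be mimicked on a toy problem. In the sketch below, every ingredient is an assumption chosen to keep the example small: a Lennard-Jones dimer in reduced units stands in for the DFT reference, the inverse distance serves as the descriptor, and a Gaussian kernel ridge regression plays the role of the machine learning model. None of the function names come from an existing package.

```python
import numpy as np


def reference_energy(r):
    """Stand-in for the DFT step: Lennard-Jones dimer energy (reduced units)."""
    return 4.0 * (r ** -12 - r ** -6)


def descriptor(r):
    """Convert a configuration (here a single distance) to a model input."""
    return 1.0 / r


def gaussian_kernel(a, b, length_scale):
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)


def train(r_train, length_scale=0.1, ridge=1e-8):
    """Fit kernel ridge regression weights to the reference energies."""
    x = descriptor(r_train)
    k = gaussian_kernel(x, x, length_scale)
    alpha = np.linalg.solve(k + ridge * np.eye(len(x)), reference_energy(r_train))
    return x, alpha


def predict(r, model, length_scale=0.1):
    """Energy predicted by the trained potential for distances r."""
    x_train, alpha = model
    return gaussian_kernel(descriptor(r), x_train, length_scale) @ alpha


if __name__ == "__main__":
    # Step 1: sample configurations (evenly spaced in descriptor space).
    r_train = 1.0 / np.linspace(0.4, 1.05, 25)
    # Steps 2 to 4: reference energies, descriptors, and training.
    model = train(r_train)
    r_test = np.array([1.0, 1.122, 1.5, 2.0])
    error = np.abs(predict(r_test, model) - reference_energy(r_test))
    print("max abs error on test distances:", error.max())
```

A production potential would use many-body descriptors, fit forces as well as energies, and be trained on DFT data; the toy model only shows how the pieces of the workflow connect.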

3.2.2. The Development of Descriptors

Identifying simple features that govern the properties of a small group of materials and using them as descriptors is a valuable approach to property prediction, for example of catalytic activity, and to materials discovery [141,142,143,144,145,146,147,148,149]. Several descriptors can be used, e.g., interatomic distance, nearest neighbor coordination number (CN), surface strain, the number of facets, and p-band center [150]. However, the accuracy of these simple descriptors is difficult to verify experimentally because catalysts have complex structures that change dynamically during the reaction. Timoshenko et al. [151] proposed an ML method that directly processes data from X-ray absorption spectroscopy (XAS), which contain structural and electronic information. This information is used directly to obtain simple descriptors, such as the charge states and the radial distribution function. Both supervised and unsupervised ML methods, shown in Figure 8, were used to reveal the relationships hidden in the XAS data. Sinthika et al. [151] proposed a special descriptor, the π electronic structure, for nitrogen-, boron-, and co-doped graphene. In this article, several descriptors are summarized in Table 1, such as surface energy [152,153], vacancy formation energy [154], occupancy [155,156], and d-band center [157]. Takahashi et al. [158] used the random forest method to search for new catalysts for the oxidative coupling of methane (OCM). To overcome the uncertainty regarding methane activation, three key factors that determine C2 yields were first identified as descriptors from 1868 OCM catalyst data. Using these descriptors and the random forest method, new catalysts that could improve the C2 yield were found.
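
One of the simple descriptors named above, the nearest neighbor coordination number, can be computed directly from atomic positions. The sketch below assumes a fixed distance cutoff and a finite cluster without periodic boundary conditions; the five-atom geometry and the cutoff values are made up for illustration.

```python
import math
from typing import List, Sequence, Tuple

Position = Tuple[float, float, float]


def coordination_numbers(positions: Sequence[Position], cutoff: float) -> List[int]:
    """Count, for each atom, the neighbors within the cutoff distance."""
    counts = [0] * len(positions)
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            if math.dist(positions[i], positions[j]) <= cutoff:
                counts[i] += 1   # j is a neighbor of i
                counts[j] += 1   # and i is a neighbor of j
    return counts


if __name__ == "__main__":
    # Hypothetical planar cluster: one central atom with four neighbors.
    cluster = [
        (0.0, 0.0, 0.0),
        (1.0, 0.0, 0.0),
        (-1.0, 0.0, 0.0),
        (0.0, 1.0, 0.0),
        (0.0, -1.0, 0.0),
    ]
    print(coordination_numbers(cluster, cutoff=1.1))  # [4, 1, 1, 1, 1]
    print(coordination_numbers(cluster, cutoff=1.5))  # [4, 3, 3, 3, 3]
```

The result depends on the chosen cutoff, which illustrates why such fixed-function descriptors need to be selected with care.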

Graph neural networks (GNNs) have opened a new direction for acquiring descriptors [159] that differs from the traditional approach of obtaining them from fixed functions. A simple GNN [160] contains a set of nodes, edges, node attributes, and edge attributes. Only the atomic structure is needed as input in the GNN approach. Nodes represent the atoms, and the neighbor information of a given atom is encoded by the edges. The feature conversion and parameter optimization are then performed on the graph. This approach has gained more and more attention and is widely used in materials discovery and property prediction [161,162,163,164].

3.3. Discussion

Machine learning approaches have been extensively and successfully applied in numerous fields, including the discovery of new materials, the prediction of materials properties, and the acceleration of calculations. Additionally, helpful machine learning community projects, like DeepChem [165,166] and OpenCatalysts, have been proposed to support the use of machine learning in materials science and chemistry [165,166,167,168,169,170,171]. However, deficiencies still exist that restrict the development of machine learning approaches and need to be overcome. (1) Descriptors have been introduced in detail, and several kinds were listed above. However, they are all static descriptors based on fixed functions with no optimizable parameters. Graph neural networks (GNNs) offer a new way to obtain descriptors [159], in contrast to traditional descriptors computed from fixed functions. (2) As for machine learning potentials, training time and accuracy remain problems as the amount of training data increases. In addition, the development of universal machine learning potentials faces enormous challenges, because existing machine learning potentials are obtained for specific problems by fitting calculated data.

Table 1

Descriptor	Class of Catalyst	Reaction	Optimal Catalyst(s) Identified
d-band center [157]	Transition metals, transition metal alloys	ORR	Pt and Pd [172]
eg occupancy [155]	Transition metal oxides	ORR	Pt₃Ni [173], LaCoO₃ (t₂g⁵eg¹) and LaNiO₃ (t₂g⁶eg¹)
t2g occupancy [168]	Transition metal oxides	OER	CuCoO₂, PtCoO₂
O p-band center [174]	Transition metal oxides	OER	(Pr_0.5Ba_0.5)CoO₃
E_vac vacancy formation energy [154]	Core shell transition metal nanoparticles	ORR	Pd₃Cu₁@Pt (core@shell)
E_surf surface energy [155]	Pure metals	Hydrogen evolution reaction	Pt
E_surf surface energy [156]	Transition metal carbides	Hydrogen evolution reaction	Pt/Mo₂C
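
For the d-band center listed in Table 1, a common definition is the first moment of the d-projected density of states (DOS), i.e., the energy-weighted integral of the DOS divided by its plain integral. The sketch below evaluates this with the trapezoidal rule. The function name and the triangular toy DOS are hypothetical, and integrating over the full energy grid (rather than only up to the Fermi level) is an assumption of this example.

```python
def d_band_center(energies, dos):
    """First moment of a d-projected DOS on a sorted energy grid.

    `energies` are relative to the Fermi level; `dos` holds the matching
    d-projected DOS values. Both integrals use the trapezoidal rule.
    """
    def trapezoid(values):
        return sum(
            0.5 * (values[k] + values[k + 1]) * (energies[k + 1] - energies[k])
            for k in range(len(energies) - 1)
        )

    weighted = [e * d for e, d in zip(energies, dos)]
    return trapezoid(weighted) / trapezoid(dos)


# Hypothetical symmetric triangular DOS centered 2 eV below the Fermi level.
energies = [-4.0, -3.0, -2.0, -1.0, 0.0]
dos = [0.0, 1.0, 2.0, 1.0, 0.0]

center = d_band_center(energies, dos)
print(center)
```

For a symmetric DOS the first moment coincides with the peak position, which gives a quick sanity check when applying the same function to DOS data from an electronic structure calculation.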

4. Conclusions and Outlook

This review focuses on molecular dynamics calculations of catalysts. Different types of molecular dynamics are outlined, including AIMD and ReaxFF molecular dynamics. The development of both methods in applications including growth, dehydrogenation, hydrogenation, oxidation reactions, bias, and recombination of carbon materials is discussed. Although both AIMD and ReaxFF molecular dynamics simulations have been successfully applied in mechanistic studies of different catalytic interactions, some limitations remain, such as the high computational cost of AIMD and its limitations in complex systems, as well as the parameter optimization and charge description problems of ReaxFF. In recent years, ML methods have been widely applied in various fields. An overview of the application of ML methods in catalysis, which can address the above limitations, is given. Several ML algorithms, such as neural networks, random forests, and regression, are briefly described, and their applications in the search for new catalysts and in performance prediction are reported. Most importantly, one of the most significant advances, the machine learning potential, is presented. With accuracy close to that of DFT calculations but at lower computational cost, the machine learning potential has become one of the most promising directions in this field. In addition, the challenges of applying machine learning methods, especially the limitations of descriptors, are discussed. Finally, GNNs are discussed as a viable solution.

Author Contributions

Conceptualization, W.L., J.Z., Y.Y. and B.H.; methodology, W.L., Y.Z., Y.W., C.C. and Y.H.; software, W.L., Y.Z., Y.W., C.C. and Y.H.; validation, W.L., Y.Z., Y.W., C.C. and Y.H.; formal analysis, W.L., J.Z., Y.Y. and B.H.; investigation, W.L., J.Z., Y.Y. and B.H.; resources, W.L., J.Z., Y.Y. and B.H.; data curation, W.L., Y.Z., Y.W., C.C. and Y.H.; writing—original draft preparation, W.L., J.Z., Y.Y. and B.H.; writing—review and editing, W.L., J.Z., Y.Y. and B.H.; visualization, W.L., Y.Z., Y.W., C.C. and Y.H.; supervision, J.Z., Y.Y. and B.H.; project administration, W.L., Y.Z., Y.W., C.C. and Y.H.; funding acquisition, Y.Y. All authors have read and agreed to the published version of the manuscript.

Funding

Y. Yue acknowledges the support from the National Natural Science Foundation of China (No. 52076156), National Key Research and Development Program (No. 2019YFE0119900), and Fundamental Research Funds for the Central Universities (No. 2042020kf0194).

Data Availability Statement

This review work did not generate any new data.

Acknowledgments

The authors appreciate the support from the Supercomputing Center of Wuhan University. J.Z. acknowledges support for this work from the NVIDIA AI Technology Center (NVAITC).

Conflicts of Interest

The authors declare no conflict of interest.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Figures and Table

Figure 1. The development of ReaxFF [41]. Copyright 2016 Springer Nature.

Figure 2. The conversion from an initial configuration (a-C/Ni3C, a-C with 12 Ni, a-C, sp2-C/Ni3C, sp2-C with 12 Ni, and sp2-C) to a final model with different initial temperature [73]. Copyright 2016, Royal Society of Chemistry.

Figure 3. Chirality changes during the growth of carbon nanotube [75]. Copyright 2011, American Chemical Society.

Figure 4. The relationship of hydrogen coverage and ethylene hydrogenation with the δ-MoC (001) catalyst through figurative imagery [78]. Copyright 2020, American Chemical Society.

Figure 5. Charge cycle and catalytic cycle for CO oxidation reaction [82]. Copyright 2013, ACS Publications.

Figure 6. Schematic of (a) crystal graph construction and (b) convolutional neural network [105]. Copyright 2018, APS Physics.

Figure 7. (a) A simple flow chart of machine learning potentials [129]. (b,c) Descriptors of empirical interatomic potentials and machine learning potentials. Two different models: (d) neural networks and (e) kernel methods. Copyright 2019, Advanced Materials.

Figure 8. Schematic of supervised machine learning and unsupervised machine learning in catalysts [151]. Copyright 2019, ACS Publications.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Given the importance of catalysts in the chemical industry, they have been extensively investigated by experimental and numerical methods. With the development of computational algorithms and computer hardware, large-scale simulations have enabled influential studies with more atomic details reflecting microscopic mechanisms. This review provides a comprehensive summary of recent developments in molecular dynamics, including ab initio molecular dynamics and reactive force-field molecular dynamics. Recent research applying both approaches to catalyst calculations is reviewed, including growth, dehydrogenation, hydrogenation, oxidation reactions, bias, and recombination of carbon materials, which can guide future catalyst calculations. Machine learning has attracted increasing interest in recent years, and its combination with the field of catalysts has inspired promising development approaches. Its applications in machine learning potentials, catalyst design, performance prediction, structure optimization, and classification are summarized in detail. This review aims to shed light on, and offer a perspective for, ML approaches in catalysts.

Details

Title

Molecular Dynamics and Machine Learning in Catalysts

Author

Liu, Wenxiang¹; Zhu, Yang²; Wu, Yongqiang²; Chen, Cen³; Yang, Hong⁴; Yue, Yanan¹; Zhang, Jingchao⁵; Hou, Bo⁶

¹ School of Power and Mechanical Engineering, Wuhan University, Wuhan 430072, China
² Weichai Power Co., Ltd., Weifang 261061, China
³ Firebird Biomolecular Sciences LLC, Alachua, FL 32615, USA
⁴ School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, GA 30332, USA
⁵ NVIDIA AI Technology Center (NVAITC), Santa Clara, CA 95051, USA
⁶ School of Physics and Astronomy, Cardiff University, The Parade, Cardiff CF24 3AA, Wales, UK

First page

1129

Publication year

2021

Publication date

2021

Publisher

MDPI AG

e-ISSN

2073-4344

Source type

Scholarly Journal

Language of publication

English

DOI

https://doi.org/10.3390/catal11091129

ProQuest document ID

2576383687
