1. Introduction
Soil analysis holds paramount significance across diverse domains such as agriculture, ecology, environmental science, and sustainable development. The physical properties of soil, particularly soil texture, play a significant role in determining moisture, nutrients, permeability, and compaction, directly influencing the selection and growth conditions of crops [1]. Concretely, on the one hand, since different crops have varying requirements for soil texture, understanding the soil texture enables farmers to select the most suitable crops for their land, leading to better yields and more efficient use of resources. On the other hand, soil texture influences nutrient retention, drainage, and aeration, so detecting the texture helps agronomists recommend appropriate fertilizers and amendments, optimizing soil fertility and reducing environmental impact. Additionally, by optimizing crop selection, resource use, and management practices based on soil texture, farmers and agronomists can increase profitability while minimizing costs, leading to more sustainable farming operations. However, conventional laboratory-based soil texture detection depends heavily on physical examination techniques. Typically, the soil samples are first air-dried, crushed, and sieved, and then the hydrometer method is employed to measure the particle size by allowing the particles to settle under the force of gravity [2]. Despite their favorable detection quality, these techniques face two major limitations. Firstly, they generally involve significant cost, slow speed, low efficiency, and high complexity. Secondly, soil texture analysis often relies on laser particle size analyzers and diffraction particle size analyzers [3], whose expensive equipment and intricate operation cast their practical applications into doubt.
To overcome the above drawbacks, researchers have turned to novel sensors and advanced techniques for rapid, precise, and reliable soil texture detection. Recently, increasing attention has been given to research on proximal sensors [4], which can rapidly and accurately detect soil texture. Spectroscopy has been used to identify soil physical characteristics, and various studies have demonstrated its potential in detecting soil texture. Precisely, Viscarra Rossel et al. [5] used a γ-ray spectrometer to identify soil texture, revealing a significant correlation between gamma-ray spectral data and soil texture. It was shown by Villas-Boas et al. [6] that the distribution of various particle sizes in soil texture can be quickly identified through laser-induced breakdown spectroscopy. Utilizing diffuse reflectance spectroscopy to gather spectral data, Vasava et al. [7] adopted local weighted least squares to construct a soil texture detection model that successfully detected soil texture. By combining visible and near-infrared spectroscopy techniques, Davari and Jaconi et al. [8,9] yielded promising results in the detection of clay. Silva and Benedet et al. [10,11] exploited portable X-ray fluorescence spectroscopy and machine learning models to precisely identify the composition of clay, silt, and sand particles in the soil.
The utilization of electromagnetic induction technology is extensive in inferring and mapping soil parameters [12]. Benedetto et al. [13] applied ground-penetrating radar to examine soil texture and observed that the clay content in the soil fluctuates in response to the frequency of radiation. Researchers [14,15,16] have found that there is a direct correlation between magnetic susceptibility and soil texture and have determined that magnetization is an effective method for detecting clay particles in soil. The usefulness of electromagnetic induction and spectral technologies for quickly determining soil texture has thus been demonstrated. However, these methods have limited accuracy, even with highly expensive monitoring equipment [17].
The advent of digital imaging technology [18] has enhanced the capabilities of cameras as close-range sensors for studying soil parameters. Machine vision can be employed to categorize, examine, and quantify soil composition. Researchers have demonstrated the correlation between image information and soil texture. For instance, Sudarsan et al. [19] gathered soil photos through the utilization of microscopes and discovered a correlation between the sand particles present in the soil and the S (saturation) and V (brightness) channels in the HSV color space of the photographs. In a similar vein, Simon et al. [20] discovered a positive correlation between the presence of sand particles in the soil and the CIELab color space through the analysis of soil pictures and color space. Furthermore, Ding et al. [21] determined that the sandy soil surface exhibits a notable degree of irregularity. The soil’s surface texture was examined by Jia et al. [22], and it was revealed that a smoother surface texture resulted from an increase in the clay content of the soil samples.
Soil texture is intimately associated with image data. In the application of image analysis and machine learning techniques for soil texture detection, researchers such as Sudarsan et al. [23] and Qi et al. [24] utilized portable digital microscopes to capture high-resolution images and established machine learning regression prediction models. These models indicated a certain correlation between image information and soil texture. Moreover, Swetha et al. [25] and Barman et al. [26] developed models to identify soil texture from soil images captured with cellphones, demonstrating high efficacy in this area. Mirzaeitalarposhti et al. [27] utilized three machine learning models (i.e., the random forest (RF), the support vector machine (SVM), and extreme gradient boosting (XGB)) to estimate soil texture fractions from satellite imagery. On their testing set, the prediction accuracy in terms of the correlation coefficient and root mean square error for clay detection was 0.682 and 4.655, respectively, leaving room for improvement. Azadnia et al. [28] achieved an average classification accuracy of 98.6% in soil texture classification by constructing a combined model of convolutional neural networks (CNNs) and SVM. Despite promising detection performances, these works have constrained implementation conditions: they only capture images of a limited range of soil sample types and have a restricted field of view. Additionally, the majority of them depend on conventional machine learning techniques, which can result in prolonged detection times and stringent environmental requirements. More recently, in [29], three image processing techniques, i.e., HSV conversion, RGB extraction, and adaptive histogram equalization, were first employed to exploit the texture characteristics of the soil image; then, a lightweight CNN architecture was customized for soil classification, yielding satisfactory accuracy.
Inspired by these studies, this research aims to detect soil texture by collecting various soil samples, photographing them, and customizing a deep learning texture coding algorithm. The building blocks of the proposed algorithm are the texture encoding module and the frequency channel attention network (FcaNet). The texture encoding module focuses on exploiting deep feature representations, while FcaNet explores the feature correlations among the soil textures in the frequency domain; together, they facilitate the representation power of the network architecture and enhance detection performance. As we will show in the experimental section, this approach enables the complete identification of soil texture across various soil types, improving the accuracy of detection.
The main contributions of this work are summarized as follows:
(1) We investigate the correlation between the texture and color characteristics of various soil samples, revealing that different soil types exhibit a strong relationship with distinct textures.
(2) Built upon this analysis, we propose a simple yet flexible soil texture detection model, where texture encoding and hierarchical multi-frequency attention mechanisms are embedded into a unified framework for effective soil texture identification.
(3) The proposed model demonstrates significant accuracy in predicting the clay, silt, and sand content properties. Notably, our model shows strong generalizability, enjoying high potential in practical scenarios.
2. Materials and Methods
2.1. Soil Sample Collection
The study of soil texture is focused on the eastern part of Guangdong and the Pearl River Delta, regions characterized by a tropical environment with high temperatures and elevated humidity. The average yearly temperature is approximately 20 °C, and the annual precipitation is about 2000 mm, conditions that significantly influence soil characteristics. The soil in the area exhibits a varied range of colors, including gray, yellow, white, brown, and red, with red being the most typical.
A total of 236 soil samples were collected during five time periods: March, May, and October 2022, and January and March 2023. As shown in Figure 1, most samples were gathered from the Pearl River Delta region, with 200 samples designated for training and testing the detection algorithm. Additionally, 36 samples were collected from Jieyang City in eastern Guangdong to assess the model’s capacity for accurately identifying soil texture. To ensure optimal planting conditions, various sites were selected, and surface debris was removed. Samples were obtained using a shovel at a depth of over 200 mm, and then placed in airtight bags and labeled appropriately.
To remove moisture, the collected soil samples were placed in a laboratory drying oven set at 100 °C, and the drying process continued until the samples reached an equilibrium weight. Following this, to exclude larger particles, the dehydrated samples were pulverized and filtered through a 2 mm screen. Ultimately, the soil was divided into two portions: one was sent to an external testing agency specializing in evaluating the soil’s mechanical composition, while the other was used in the laboratory to analyze soil images with machine vision-based features and to construct a model for detecting soil texture.
2.2. Image Acquisition System
The setup for capturing photographs is crucial for the analysis of soil samples. As illustrated in Figure 2, this system primarily consists of a Hikvision MV-CE200-10UC camera (Hangzhou Hikvision Digital Technology Co., Ltd., Hangzhou, China), a flat light source, a light source driver CST-DPS24200B-4TD (Dongguan Kangshida Automation Technology Co., Ltd., Dongguan, China), a detection box, and a laptop computer. The industrial camera has a resolution of 5472 × 3648, allowing it to capture high-quality photographs with precise and high-definition details. The flat light source produces natural and uniformly dispersed illumination, enhancing the quality of the images. The light source driver allows the brightness of the light source to be adjusted via the panel buttons. The detection box is coated with a light-diffusing material that ensures consistent light distribution and eliminates the influence of external light.
To capture soil surface images, a portion of each soil sample is randomly selected and placed in a circular container, which is then positioned on the detection platform within the testing chamber. The distance between the soil sample and the camera is maintained at 20 cm (within the industrial camera's working range of 18 cm to 22 cm). A total of six soil surface images are captured under six different lighting intensities (3000 lux, 3800 lux, 4600 lux, 5400 lux, 6200 lux, and 7000 lux), with each set of images collected three times for repeatability. To pinpoint the precise regions of interest in the gathered photos, segmentation must be carried out, as depicted in Figure 2. The inscribed square of the circular container, measuring 42 mm × 42 mm with an image resolution of 1400 × 1400 pixels, is used for analyzing the soil images and constructing the soil texture detection model.
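The cropping step above can be sketched as a simple center crop, under the assumption that the circular container is centered in the frame (the function name is ours, for illustration only):

```python
import numpy as np

def crop_inner_square(image: np.ndarray, side: int = 1400) -> np.ndarray:
    """Crop the centered inscribed square (region of interest) from a soil image.

    Assumes the circular container is centered in the frame, so the inscribed
    square of the circle reduces to a center crop of the requested side length.
    """
    h, w = image.shape[:2]
    if side > min(h, w):
        raise ValueError("requested ROI is larger than the image")
    top = (h - side) // 2
    left = (w - side) // 2
    return image[top:top + side, left:left + side]

# Example: a full-resolution frame from the 5472 x 3648 industrial camera
frame = np.zeros((3648, 5472, 3), dtype=np.uint8)
roi = crop_inner_square(frame)
print(roi.shape)  # (1400, 1400, 3)
```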
2.3. Soil Surface Image Features
2.3.1. Color Features
Color features are intrinsic properties that depict the surface aspects of a particular area within a picture [30]. Different soil textures exhibit distinct color expressions, which can be analyzed using prominent color models such as the RGB model, the HSV model, and the CIELab model. The HSV model transforms color representation from the RGB model’s three-dimensional space into three distinct parameters, yielding a color description that is more intuitive and comprehensible. In this model, H denotes hue, S denotes saturation, and V denotes brightness. In contrast, the CIELab model is unaffected by lighting conditions and pigmentation, allowing for consistent color representation across different devices while minimizing the influence of external factors [31]. The model assigns the symbol L to represent brightness, a to represent the shift from green to red, and b to represent the transformation from blue to yellow.
This study employs the RGB model, the HSV model, and the CIELab model to examine and evaluate the color attributes of soil photographs. The average values of each channel from these color models are calculated to represent the color properties of the soil images, facilitating a comprehensive analysis.
2.3.2. Texture Features
Texture features in an image refer to the patterns and arrangements of pixels in relation to one another [30]. To analyze these texture features, various algorithms can be applied, such as the gray-level co-occurrence matrix, which is a commonly used method for representing picture texture information, as outlined by [32]. This matrix computes the global correlation between the grayscale values of neighboring pixels, focusing on the frequency distribution of pairs of adjacent pixels appearing together. Texture information in photographs is commonly characterized by five parameters: contrast, entropy, homogeneity, energy, and dissimilarity. The texture characteristics for different soil textures can be calculated using particular formulas as described below:
$\text{Contrast} = \sum_{i=0}^{L-1}\sum_{j=0}^{L-1} (i-j)^2\, P(i,j)$ (1)

$\text{Entropy} = -\sum_{i=0}^{L-1}\sum_{j=0}^{L-1} P(i,j)\, \log P(i,j)$ (2)

$\text{Homogeneity} = \sum_{i=0}^{L-1}\sum_{j=0}^{L-1} \frac{P(i,j)}{1+(i-j)^2}$ (3)

$\text{Energy} = \sum_{i=0}^{L-1}\sum_{j=0}^{L-1} P(i,j)^2$ (4)

$\text{Dissimilarity} = \sum_{i=0}^{L-1}\sum_{j=0}^{L-1} |i-j|\, P(i,j)$ (5)

where $P(i,j)$ denotes the normalized gray-level co-occurrence matrix entry for the gray-level pair $(i, j)$ and $L$ is the number of gray levels.
This work employs gray-level co-occurrence matrices to derive texture information from photographs of the soil surface. For this analysis, the five parameters, i.e., contrast, entropy, homogeneity (also termed inverse variance), energy, and dissimilarity, are computed for four orientations: horizontal (0°), vertical (90°), and two diagonals (45° and 135°). The averages of these parameters across the four directions are used to represent and evaluate the textural aspects of the soil photographs.
2.4. Texture Exploiting Network
Generally, detecting the soil texture via a deep network model involves two steps: one is to design the network architecture, and the other is to optimize the model from the training image samples. For the first step, we customize an effective CNN architecture with the frequency channel attention network and the texture encoding network, making the whole network suitable for soil texture detection. For the second step, we employ an end-to-end learning formulation and incorporate L2 normalization to optimize the network parameters and improve the detection performance. In what follows, we elaborate on the proposed network in detail.
2.4.1. Frequency Channel Attention Network
During feature extraction, deep learning employs attention techniques to guarantee that the model concentrates on the most pertinent and distinctive aspects of soil images while generating regression predictions. Specifically, the frequency channel attention network (FcaNet) plays a crucial role in amplifying intricate features in an image by assigning greater importance to the high-frequency elements, allowing the model to prioritize the texture details present in the image [33,34].
The weighted feature representation is obtained by applying the frequency attention technique. Precisely, the frequency characteristics of the input data are multiplied by the corresponding frequency attention weight, which increases the model’s ability to concentrate on the soil image features at various frequencies. In the following equations, we define the components involved in the frequency attention process, which takes the following form:
$f_{u,v}^{2d} = \sum_{i=0}^{H-1}\sum_{j=0}^{W-1} x_{i,j}\, \cos\left(\frac{\pi u}{H}\left(i+\frac{1}{2}\right)\right) \cos\left(\frac{\pi v}{W}\left(j+\frac{1}{2}\right)\right)$ (6)

where $f^{2d}$ represents the result of the 2D Discrete Cosine Transform (DCT), while $x$ denotes the input signal of the 2D feature map. The pixel coordinates of the image are encoded by $i$ and $j$. The height and width of the image are designated by $H$ and $W$, respectively. The frequencies of the basis cosine functions are defined by $u$ and $v$. Notably, if both $u$ and $v$ equal 0, Equation (6) becomes

$f_{0,0}^{2d} = \sum_{i=0}^{H-1}\sum_{j=0}^{W-1} x_{i,j} = \mathrm{GAP}(x)\, HW$ (7)

where $f_{0,0}^{2d}$ symbolizes the lowest-frequency component, i.e., the contribution to the frequency attention at the frequency indices $(u, v) = (0, 0)$. $\mathrm{GAP}(\cdot)$ indicates the process of calculating the average value of the input data over the spatial dimensions. Consequently, the process of converting data from the spatial domain to the frequency domain and extracting information from various frequency components of soil pictures can be summarized as follows:
$\mathit{Freq}^{i} = \mathrm{2DDCT}^{u_i, v_i}\left(X^{i}\right) = \sum_{h=0}^{H-1}\sum_{w=0}^{W-1} X^{i}_{h,w}\, B^{u_i, v_i}_{h,w}$ (8)

where $\mathit{Freq}^{i}$ represents the frequency component information of the $i$-th channel group $X^{i}$ of the processed original feature map, $(u_i, v_i)$ are the 2D indices of the frequency component, and $B^{u_i, v_i}$ is the corresponding cosine basis from Equation (6). Following that, the various frequency components are combined to create a comprehensive frequency representation, as shown in the following formula:

$\mathit{Freq} = \mathrm{cat}\left(\left[\mathit{Freq}^{0}, \mathit{Freq}^{1}, \ldots, \mathit{Freq}^{n-1}\right]\right)$ (9)
The frequency vector $\mathit{Freq}$ is ultimately subjected to a linear transformation, resulting in the creation of a new attention weight vector, defined as follows:

$att = \mathrm{Sigmoid}\left(\mathrm{fc}\left(\mathit{Freq}\right)\right)$ (10)
The output of the Sigmoid function can be interpreted as the significance or influence of each frequency location. Note that the Sigmoid activation function allows the network to learn complex patterns in the data by introducing non-linearity into the model. It maps the output values to a range between 0 and 1, effectively characterizing probabilities. Values approaching 1 imply a greater influence of the characteristics at that frequency location on the model, while values nearing 0 indicate a lesser impact. In this way, the frequency vector is converted into an attention weight vector through a linear transformation and an activation function, and the resulting attention weights guide the model in allocating varying weights to characteristics based on their frequencies.
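The mechanism of Equations (6)-(10) can be sketched in PyTorch as follows. This is an illustrative simplification in the spirit of FcaNet, not the authors' implementation: the chosen frequency pairs, the reduction ratio, and the class name are our assumptions.

```python
import math
import torch
import torch.nn as nn

class FrequencyChannelAttention(nn.Module):
    """Sketch of multi-spectral channel attention (FcaNet-style).

    Channels are split into groups; each group is pooled with one 2D-DCT
    basis function. The (0, 0) frequency reduces to global average pooling
    up to a constant. The pooled vector passes through a small MLP and a
    Sigmoid to produce per-channel attention weights.
    """
    def __init__(self, channels, h, w,
                 freqs=((0, 0), (0, 1), (1, 0), (1, 1)), reduction=16):
        super().__init__()
        assert channels % len(freqs) == 0
        self.group = channels // len(freqs)
        # Precompute one DCT basis per frequency pair (u, v).
        basis = torch.stack([self._dct_basis(u, v, h, w) for u, v in freqs])
        self.register_buffer("basis", basis)  # (n_freqs, h, w)
        self.fc = nn.Sequential(
            nn.Linear(channels, max(channels // reduction, 1)),
            nn.ReLU(inplace=True),
            nn.Linear(max(channels // reduction, 1), channels),
            nn.Sigmoid())

    @staticmethod
    def _dct_basis(u, v, h, w):
        i = torch.arange(h).float()
        j = torch.arange(w).float()
        bi = torch.cos(math.pi * u * (i + 0.5) / h)
        bj = torch.cos(math.pi * v * (j + 0.5) / w)
        return bi[:, None] * bj[None, :]

    def forward(self, x):
        n, c, h, w = x.shape
        xs = x.view(n, -1, self.group, h, w)           # (n, n_freqs, group, h, w)
        pooled = (xs * self.basis[None, :, None]).sum(dim=(-1, -2))
        weights = self.fc(pooled.view(n, c))           # (n, c), values in (0, 1)
        return x * weights.view(n, c, 1, 1)
```

Because the weights lie in (0, 1), the module can only rescale channels, never amplify them, which matches the gating interpretation given above.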
2.4.2. Texture Encoding Network
The texture encoding layer [35] has the purpose of encoding feature variables obtained through deep convolution, which allows the retrieval of unordered texture information. The encoding layer performs this task in a fully supervised manner. Given the feature variables $X = \{x_1, \ldots, x_N\}$ obtained through deep convolution and a dictionary $D = \{d_1, \ldots, d_K\}$ with learned codewords, the encoding can be calculated as $E = \{e_1, \ldots, e_K\}$. Each encoding can be formulated as follows:

$e_k = \sum_{i=1}^{N} e_{ik} = \sum_{i=1}^{N} w_{ik}\, r_{ik}, \quad r_{ik} = x_i - d_k$ (11)
By aggregating the feature encodings towards the codeword centers, the probability of assigning each feature point to each center is calculated, and the corresponding weights are given as follows:

$w_{ik} = \frac{\exp\left(-s_k \left\| r_{ik} \right\|^2\right)}{\sum_{j=1}^{K} \exp\left(-s_j \left\| r_{ij} \right\|^2\right)}$ (12)
where $s_k$ are the smoothing factors that can be learned. More texture information for soil image analysis can be obtained by encoding the convolutional network's feature information using the texture encoding layer and assigning weights to each feature point.

2.4.3. Bilinear Pooling
Bilinear pooling [36] is widely used in deep learning to enhance the representativeness and discriminativeness of the extracted feature variables. It combines the deep features obtained after convolution with the texture encoding features. The deep features capture spatial information, while the texture encoding features represent the unordered texture information. By combining these two types of features, bilinear pooling aims to improve the overall quality of the retrieved feature variables. The formula for bilinear pooling is as follows:
$y = x^{\mathsf{T}} W z$ (13)

where $W$ represents the learnable weight that captures the interaction between the spatial information in the deep features $x$ and the texture information in the texture encoding features $z$.

2.4.4. Deep Texture Encoding Residual Network
Regarding the representation of image features, the literature [24,25] has previously employed the visual bag-of-words methodology for encoding features and then using machine learning for prediction. Yet, the adoption of visual bag-of-words [37] feature encoding is gradually being substituted by more robust deep texture encoding. Deep texture encoding allows the acquisition of more extensive image characteristics without the requirement for manually creating visual dictionaries.
The soil images are conveyed to the convolutional network, as depicted in Figure 3. Deep features can be extracted from the soil images by applying a ResNet18 residual network [38], a deep learning architecture designed for efficient image classification. The reasons why we leverage ResNet18 for feature extraction are two-fold. Firstly, the ResNet framework enjoys several advantages, including (i) the ability to excavate multi-scale feature representations from the input for more powerful feature expression; and (ii) the ability to enlarge the receptive field while simultaneously transmitting rich low-level information to deeper semantic features [39,40]. This architecture helps to mitigate the vanishing and exploding gradient problems, thereby facilitating the training of deep networks. Secondly, in the experimental part (please refer to Section 3.4.2), we conducted comparative experiments between the proposed network model and the AlexNet and VGG architectures. Overall, one can observe that the ResNet framework achieved better performance than both AlexNet and VGG in detecting the clay, silt, and sand content in the soil. In the proposed network model, we therefore leveraged ResNet18 as the feature extraction backbone. More exquisite designs with ResNet50, ResNet101, or other ResNet variants could further push the performance boundary; nevertheless, such designs might entail more computational load at runtime. The FcaNet module is incorporated into the ResNet18 network after the max-pooling layer (with the purpose of reducing the network depth and parameters) to boost the network's emphasis on the texture features found within the images, and the backbone thus produces the deep feature representation of the soil image. The output of the last convolutional layer feeds two separate branches.
One branch serves as the texture encoding layer; its unordered output retains more detailed information about the soil texture and is derived from the deep feature representation of the soil. The global average pooling of the ResNet18 residual network serves as the other branch, which assists in preserving the deep spatial feature information of the soil image.
Next, the fusion of the two feature layers is accomplished through the bilinear pooling layer, resulting in the fused feature, which is subsequently normalized via L2 normalization [41]. In the final stage, a fully connected layer translates the normalized feature into the model output. This procedure yields the estimated proportions of clay (%), silt (%), and sand (%) in the soil.
2.4.5. Evaluation of Model Performance
In this study, the 200 soil samples collected from the Pearl River Delta region were leveraged for the experimental studies. These images were randomly split into an 80% training dataset and a 20% testing dataset. Additionally, a validation set consisting of the 36 soil samples obtained from eastern Guangdong (Jieyang City) was used to test the model's generalization. We then compared the proposed deep learning texture encoding model with several models (i.e., AlexNet, VGG, ResNet) for the purpose of establishing soil texture identification models. In addition, the proposed model was analyzed through ablation experiments.
The coefficient of determination ($R^2$) and the root mean square error ($RMSE$) were applied for performance evaluation. These metrics are commonly used to evaluate the precision and reliability of regression models. Note that a higher $R^2$ and a lower $RMSE$ indicate greater efficacy of the model. The formulas for the metrics are outlined as follows:

$R^2 = 1 - \frac{\sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n} \left(y_i - \bar{y}\right)^2}$ (14)

$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(\hat{y}_i - y_i\right)^2}$ (15)

where $\hat{y}_i$ denotes the predicted soil composition; $y_i$ reflects the actual texture; $n$ signifies the total number of samples; and $\bar{y}$ indicates the average value of the measured texture of the soil samples.

3. Results and Discussion
3.1. Descriptive Statistics of Soil Properties
Table 1 presents an overview of the descriptive statistics for the soil sample properties. The sand content exhibits a broad range, spanning from 7.6% to 99.1%, with an overall mean of 57.037% and a standard deviation of 17.593%. The silt percentage varies from 0.9% to 71.9%, with an average silt content of 33.034% and a standard deviation of 13.752%. The clay content ranges from 0% to 46.1%, with an average of 9.928% and a standard deviation of 7.713%. Most of the soil samples were collected from crop plots in Guangdong Province, where loam is the main soil type [42]. According to the USDA soil texture triangle (Figure 4), the soil texture in this study can be categorized into four principal types: sandy soil, sandy loam, loam, and silt loam. These soil texture classifications are commonly used to describe the composition and properties of the soil, which are important for various agricultural and environmental applications.
3.2. Image Appearance Feature Analysis
The physical features of different types of soil are correlated with soil particle size. Figure 5 presents grayscale 3D models, i.e., three-dimensional representations of the soil images using varying shades of gray, for three types of soil: sand, silt, and clay. The variation in texture and grayscale in the images is a result of the diverse soil textures. As observed from the grayscale 3D models, the grayscale intensity exhibits greater variability when the soil sample has a higher proportion of sand.
3.3. Image Feature Analysis
An analysis was conducted using a Pearson correlation heatmap to investigate the relationship between the soil texture parameters (clay, silt, and sand percentages) and the color and texture features of the images, as shown in Figure 6. The results demonstrate a moderate correlation between soil texture and certain image feature variables. The correlation between the RGB, HSV, and CIELab color space attributes and the soil texture parameters is relatively weak, with correlation coefficients all lower than 0.3. Nevertheless, the texture attributes derived from the gray-level co-occurrence matrix show a moderate correlation with the soil texture parameters of interest. In particular, contrast, entropy, and dissimilarity exhibit a negative correlation with clay and silt content, while they show a positive correlation with sand content. Conversely, inverse variance and energy exhibit a positive correlation with clay and silt content and a negative correlation with sand content.
As the contrast, entropy, and dissimilarity in the gray-level co-occurrence matrix increase, the soil image displays more pronounced texture grooves, enhanced contrast, more intricate information, greater variations between pixels, and a more uneven surface [22]. This corresponds to a higher sand content in the soil. Increasing values of inverse variance and energy in the gray-level co-occurrence matrix imply a soil image with a more uniform and fine texture, characterized by a smooth surface [22]. This corresponds to a higher clay content in the soil. The weak correlation between the color attributes and the soil texture parameters can be ascribed to the abundance and intricacy of the soil samples: soil color is determined by multiple factors, including soil texture as well as pH, mineral composition, and organic matter concentration [43].
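The correlation analysis itself is straightforward to reproduce. The snippet below sketches it with pandas; the feature values are random placeholders (not the measured data), and only a subset of the features from Figure 6 is shown:

```python
import numpy as np
import pandas as pd

# Illustrative Pearson correlation between image features and texture
# fractions, as used for the heatmap in Figure 6. Values are random
# placeholders drawn over the ranges reported in Table 1.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "clay": rng.uniform(0, 46, 50),
    "silt": rng.uniform(1, 72, 50),
    "sand": rng.uniform(8, 99, 50),
    "contrast": rng.uniform(0, 1, 50),
    "entropy": rng.uniform(0, 8, 50),
})
corr = df.corr(method="pearson")  # full correlation matrix
print(corr.loc[["contrast", "entropy"], ["clay", "silt", "sand"]])
```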
3.4. Evaluation of the Deep Texture Encoding Residual Network Model
We train the proposed network model for 80 epochs and utilize Adam as the optimizer. The learning rate and image batch size are set to 0.0001 and 32, respectively. To mitigate overfitting and improve the generalization of the model, this work employs RMSELoss as the loss function and incorporates an L2 norm penalty to constrain the weights of the model. We conduct the experiments on the Windows 10 operating system with an i7-8700 CPU and an Nvidia GeForce GTX 1060 graphics card with 6 GB of memory. The deep learning framework is implemented in PyTorch 2.0.1, using CUDA 11.7 to accelerate the training of the deep model.
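A minimal training-step sketch with the stated settings might look as follows. The RMSE loss is implemented as the square root of MSE, a small linear model stands in for the proposed network, and the `weight_decay` value standing in for the L2 penalty is our assumption, since the paper does not state it:

```python
import torch
import torch.nn as nn

class RMSELoss(nn.Module):
    """RMSE loss as described in the text: the square root of MSE."""
    def __init__(self):
        super().__init__()
        self.mse = nn.MSELoss()

    def forward(self, pred, target):
        return torch.sqrt(self.mse(pred, target))

# One training step with the stated hyperparameters (Adam, lr = 1e-4,
# batch size 32); L2 regularization is expressed via weight_decay.
model = nn.Linear(10, 3)  # placeholder for the proposed network
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=1e-4)
criterion = RMSELoss()

x, y = torch.randn(32, 10), torch.rand(32, 3)
optimizer.zero_grad()
loss = criterion(model(x), y)
loss.backward()
optimizer.step()
```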
3.4.1. Model Performance
As shown in Figure 7, the proposed deep texture coding network model achieves promising performance on the test set. Specifically, the model achieves an $R^2$ value of 0.931 for detecting clay content in soil, with an $RMSE$ of 2.106%. For silt content, it achieves an $R^2$ value of 0.936, with an $RMSE$ of 3.390%. The $R^2$ and $RMSE$ for sand content reached 0.957 and 3.602%, respectively. All the quantitative scores substantiate the accuracy and reliability of the proposed model in soil texture detection.
3.4.2. Comparison of Different Models on the Test Set
This section compares the performance of the proposed deep texture encoding network with AlexNet, VGG19, and ResNet18 on an identical test set. The $R^2$ and $RMSE$ values across the different network models are reported in Table 2.
As can be seen, the proposed deep texture encoding network consistently achieves superior performance compared to AlexNet, VGG19, and ResNet18 in detecting the clay, silt, and sand content in the soil. This can be attributed to the potential loss of detailed information during image feature extraction by AlexNet, VGG19, and ResNet18. This loss of information leads to the creation of a considerable number of redundant features, which in turn can cause excessive smoothing and blurring of texture information; consequently, accurately distinguishing between different texture features becomes challenging [44]. The combination of FcaNet and texture coding modules provides more efficient and in-depth feature extraction capabilities. FcaNet can effectively capture and emphasize important features in images by applying attention mechanisms in the frequency domain, especially those complex textures and patterns that may be difficult to capture in traditional spatial domain methods. This makes the network more effective in processing the global features and local details of images, thus improving the accuracy of soil texture detection. The texture coding module provides a rich and complex feature representation by capturing texture features in images. This module can capture and encode more detailed image texture information so that the network can better distinguish visually similar but different categories of objects.
In summary, the combination of FcaNet and texture coding modules allows the proposed network to deeply explore the texture and frequency characteristics of images. This, combined with the effective use of the network’s deep structure, enables the model to achieve higher accuracy in soil texture detection.
3.4.3. Ablation Study
In this subsection, we conduct several ablation studies to investigate the efficacy of each individual module within the network structure. The results are reported in Table 2.
Table 2 demonstrates a notable improvement in the precision of clay identification when the frequency attention mechanism is incorporated into the ResNet18 network, reducing the RMSE by 0.648% compared to the original ResNet18 network. When texture encoding is incorporated into the ResNet18 network, the accuracy of clay identification is greatly enhanced, reducing the RMSE by 1.114% compared to the original ResNet18 network. There are also small improvements in the precision of the silt and sand predictions. Detection performance is best when the frequency attention mechanism and texture encoding are jointly added to the ResNet18 network. For clay detection, the R² value increases by 0.099, while the RMSE decreases by 1.186%; for silt detection, the R² value increases by 0.043, while the RMSE decreases by 1.026%; for sand detection, the R² value increases by 0.041, while the RMSE decreases by 1.407%.
The ablation experiments conducted on the test dataset indicate that our proposed strategy is effective and improves the model’s performance. Feature extraction with plain convolutional networks tends to lose high-frequency features [45], whereas the frequency attention mechanism allows the model to effectively prioritize information at different frequencies in the input data. Meanwhile, the incorporation of the texture encoding module enhances the network’s ability to understand and represent the texture details present in the images [33], allowing the network to capture intricate image details more accurately. According to the test results, the improved network model outperforms the configurations that use only ResNet18, ResNet18 with FcaNet alone, or ResNet18 with the texture coding module alone. This is attributed to the complementary attributes and synergistic effect of the two modules. When integrated with ResNet18, they collaborate harmoniously, strengthening both the network’s ability to capture frequency and texture information and its overall learning ability. This integrated setup makes more thorough use of the spatial and frequency information of the image, facilitating higher accuracy in detecting soil texture.
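The residual-encoding operation underlying the texture coding module can likewise be illustrated. The sketch below follows the general recipe of Deep TEN [35]: each local descriptor is softly assigned to a set of codewords, and the assignment-weighted residuals are aggregated into a fixed-size, orderless texture representation. In the actual module the codewords and smoothing factors are learned end-to-end inside the network; here they are random placeholders, and all names are our own:

```python
import numpy as np

def texture_encode(features, codewords, smoothing):
    # features: (n, d) local descriptors; codewords: (k, d); smoothing: (k,)
    residuals = features[:, None, :] - codewords[None, :, :]   # (n, k, d)
    dist2 = (residuals ** 2).sum(axis=2)                       # (n, k)
    logits = -smoothing[None, :] * dist2
    logits -= logits.max(axis=1, keepdims=True)                # numerical stability
    assign = np.exp(logits)
    assign /= assign.sum(axis=1, keepdims=True)                # soft assignment weights
    encoded = (assign[:, :, None] * residuals).sum(axis=0)     # aggregate: (k, d)
    return encoded.ravel()                                     # fixed-size vector, k * d

rng = np.random.default_rng(1)
desc = rng.normal(size=(64, 8))   # 64 local descriptors of dimension 8
codes = rng.normal(size=(4, 8))   # 4 codewords
vec = texture_encode(desc, codes, np.ones(4))
print(vec.shape)  # (32,)
```

Because the aggregation sums over all descriptor positions, the output is orderless: it characterizes which textures occur and how strongly, not where, which is exactly the property that helps distinguish visually similar soil surfaces.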
3.4.4. Generalization Testing of the Model
The generalization capability of the proposed deep texture encoding network is evaluated using the validation set, which is composed of 36 soil samples obtained from the eastern Guangdong region (specifically Jieyang City). The complete procedure for soil image collection and detection adheres to the same stages outlined in Section 2. The results are listed in Figure 8 and Figure 9.
As can be seen, when soil samples from a different region are used to verify the generalization of the model, the R² value reaches 0.855 and the RMSE is 1.094% for detecting soil clay content. This accuracy is slightly lower than that achieved on the test set, possibly because of the variation in soil composition and the sensitivity of detecting fine clay particles. Although the detection is not flawless, the RMSE remains within the permissible range specified by the laboratory’s criteria for evaluating clay content in soil texture evaluation (≤3% absolute deviation). The R² value for detecting silt content is 0.931, with an RMSE of 2.929%; for detecting sand content, the R² value is 0.947, with an RMSE of 2.871%. The accuracy for silt and sand is comparable to that achieved on the test set. To summarize, the proposed network model demonstrates strong generalization capability in detecting soil texture, suggesting high potential in practical applications, particularly in analyzing various soil textures.
4. Conclusions
In this study, by examining the correlation between the texture and color characteristics of soil images and soil texture, we observe that different soil types present different textures. Building on this finding, we propose a deep learning model, based on texture encoding and a frequency channel attention network, to enhance the representation and capture of delicate texture information in images for effective soil texture identification. Experimental results demonstrate the favorable performance of the proposed network in soil texture detection. Specifically, on the test set, clay content identification achieves an R² value of 0.931 and an RMSE of 2.106%, silt content detection achieves an R² value of 0.936 and an RMSE of 3.390%, and sand content detection achieves an R² value of 0.957 and an RMSE of 3.602%. In the generalization testing on soil samples collected from a different region, the R² value for detecting clay content is 0.855, with an RMSE of 1.094%; for silt content, 0.931, with an RMSE of 2.929%; and for sand content, 0.947, with an RMSE of 2.871%. All errors are within a reasonable range, demonstrating robustness and generalization ability in detecting soil texture and substantiating the method’s high potential in practical use.
Author Contributions: Conceptualization, R.M., J.J. and H.X.; methodology, R.M., J.J., J.H. and L.Q.; software, J.J., L.O., Q.Y. and S.W.; validation, R.M. and H.X.; formal analysis, L.O.; investigation, S.W.; data curation, Q.Y., L.Q. and J.H.; writing—original draft preparation, R.M., J.J. and J.D.; writing—review and editing, Q.Y., S.W., H.X. and J.H.; visualization, L.O. and S.W.; supervision, L.Q., H.X. and J.H. All authors have read and agreed to the published version of the manuscript.
Data Availability Statement: The data presented in this study are available on request from the corresponding author.
Conflicts of Interest: The authors declare no conflicts of interest.
Footnotes
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Figure 1. Soil sample distribution. (a) Soil collection regions in Guangdong Province. (b) Number of soil samples collected in each region.
Figure 2. Image acquisition platform. From left to right: image acquisition platform, image acquisition setup, and regions of interest of the soil sample.
Figure 3. The architecture of the proposed deep learning texture encoding network model.
Figure 4. Diagram of the United States Department of Agriculture soil texture triangle.
Figure 5. Soil surface image and grayscale 3D model diagram: (a) sand surface image; (b) silt surface image; (c) clay surface image; (d) sand grayscale image 3D model diagram; (e) silt grayscale image 3D model diagram; (f) clay grayscale image 3D model diagram. Please note that (d–f) reflect the degree of variation in grayscale values of the three soil types: the higher the proportion of sand, the greater the amplitude of variation.
Figure 6. Thermal maps of soil clay, silt, sand, and soil image color and texture.
Figure 7. Performance of the ResNet18-FcaNet-Texture coding network on the test set. (a) Scatterplot of clay between the values measured by the hydrometer method and the values predicted by the method proposed in this study in the test set. (b) Scatterplot of silt between the values measured by the hydrometer method and the values predicted by the method proposed in this study in the test set. (c) Scatterplot of sand between the values measured by the hydrometer method and the values predicted by the method proposed in this study in the test set.
Figure 8. Prediction accuracy of different network models. (a) Prediction accuracy in terms of R². (b) Prediction accuracy in terms of RMSE.
Figure 9. Performance of the ResNet18-FcaNet-Texture coding network on the verification set: (a) scatterplot of clay between the values measured by the hydrometer method and the values predicted by the proposed method in the verification set; (b) scatterplot of silt between the values measured by the hydrometer method and the values predicted by the proposed method in the verification set; (c) scatterplot of sand between the values measured by the hydrometer method and the values predicted by the proposed method in the verification set.
Table 1. Descriptive statistics of soil properties.

Soil Property | Min (%) | Max (%) | Mean (%) | SD (%)
---|---|---|---|---
Sand | 7.6 | 99.1 | 57.037 | 17.593
Silt | 0.9 | 71.9 | 33.034 | 13.752
Clay | 0 | 46.1 | 9.928 | 7.713
Table 2. Ablation study on different components of the proposed model.

Models | Soil Type | R² | RMSE (%)
---|---|---|---
ResNet18 | Clay | 0.832 | 3.292
ResNet18-FcaNet | Clay | 0.891 | 2.644
ResNet18-FcaNet | Silt | 0.896 | 4.328
ResNet18-FcaNet | Sand | 0.912 | 5.021
ResNet18-Texture coding | Clay | 0.926 | 2.178
ResNet18-Texture coding | Silt | 0.933 | 3.477
ResNet18-Texture coding | Sand | 0.947 | 4.026
ResNet18-FcaNet-Texture coding | Clay | 0.931 | 2.106
ResNet18-FcaNet-Texture coding | Silt | 0.936 | 3.390
ResNet18-FcaNet-Texture coding | Sand | 0.957 | 3.602
References
1. Yu, H.; Zou, W.; Chen, J.; Chen, H.; Yu, Z.; Huang, J.; Tang, H.; Wei, X.; Gao, B. Biochar amendment improves crop production in problem soils: A review. J. Environ. Manag.; 2019; 232, pp. 8-21. [DOI: https://dx.doi.org/10.1016/j.jenvman.2018.10.117] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/30466010]
2. Kakuturu, S.P.; Xiao, M.; Kinzel, M. Effects of maximum particle size on the results of hydrometer tests on soils. Geotech. Test. J.; 2019; 42, pp. 945-965. [DOI: https://dx.doi.org/10.1520/GTJ20170236]
3. Yang, Y.; Wang, L.; Wendroth, O.; Liu, B.; Cheng, C.; Huang, T.; Shi, Y. Is the laser diffraction method reliable for soil particle size distribution analysis?. Soil Sci. Soc. Am. J.; 2019; 83, pp. 276-287. [DOI: https://dx.doi.org/10.2136/sssaj2018.07.0252]
4. Viscarra Rossel, R.A.; Adamchuk, V.I.; Sudduth, K.A.; Mckenzie, N.J.; Lobsey, C. Proximal soil sensing: An effective approach for soil measurements in space and time. Adv. Agron.; 2011; 113, pp. 243-291.
5. Viscarra Rossel, R.A.; Taylor, H.J.; Mcbratney, A.B. Multivariate calibration of hyperspectral γ-ray energy spectra for proximal soil sensing. Eur. J. Soil Sci.; 2007; 58, pp. 343-353. [DOI: https://dx.doi.org/10.1111/j.1365-2389.2006.00859.x]
6. Villas-Boas, P.R.; Romano, R.A.; de Menezes Franco, M.A.; Ferreira, E.C.; Ferreira, E.J.; Crestana, S.; Milori, D.M.B.P. Laser-induced breakdown spectroscopy to determine soil texture: A fast analytical technique. Geoderma; 2016; 263, pp. 195-202. [DOI: https://dx.doi.org/10.1016/j.geoderma.2015.09.018]
7. Vasava, H.B.; Gupta, A.; Arora, R.; Das, B.S. Assessment of soil texture from spectral reflectance data of bulk soil samples and their dry-sieved aggregate size fractions. Geoderma; 2019; 337, pp. 914-926. [DOI: https://dx.doi.org/10.1016/j.geoderma.2018.11.004]
8. Davari, M.; Karimi, S.A.; Bahrami, H.A.; Taher Hossaini, S.M.; Fahmideh, S. Simultaneous prediction of several soil properties related to engineering uses based on laboratory Vis-NIR reflectance spectroscopy. Catena; 2021; 197, 104987. [DOI: https://dx.doi.org/10.1016/j.catena.2020.104987]
9. Jaconi, A.; Vos, C.; Don, A. Near infrared spectroscopy as an easy and precise method to estimate soil texture. Geoderma; 2019; 337, pp. 906-913. [DOI: https://dx.doi.org/10.1016/j.geoderma.2018.10.038]
10. Benedet, L.; Faria, W.M.; Silva, S.H.G.; Mancini, M.; Demattê, J.A.M.; Guilherme, L.R.G.; Curi, N. Soil texture prediction using portable X-ray fluorescence spectrometry and visible near-infrared diffuse reflectance spectroscopy. Geoderma; 2020; 376, 114553. [DOI: https://dx.doi.org/10.1016/j.geoderma.2020.114553]
11. Silva, S.H.G.; Weindorf, D.C.; Pinto, L.C.; Faria, W.M.; Acerbi Junior, F.W.; Gomide, L.R.; de Mello, J.M.; de Pádua Junior, A.L.; de Souza, I.A.; Teixeira, A.F.D.S. et al. Soil texture prediction in tropical soils: A portable X-ray fluorescence spectrometry approach. Geoderma; 2020; 362, 114136. [DOI: https://dx.doi.org/10.1016/j.geoderma.2019.114136]
12. Beucher, A.; Koganti, T.; Iversen, B.V.; Greve, M.H. Mapping of peat thickness using a Multi-Receiver electromagnetic induction instrument. Remote Sens.; 2020; 12, 2458. [DOI: https://dx.doi.org/10.3390/rs12152458]
13. Benedetto, F.; Tosti, F. GPR spectral analysis for clay content evaluation by the frequency shift method. J. Appl. Geophys.; 2013; 97, pp. 89-96. [DOI: https://dx.doi.org/10.1016/j.jappgeo.2013.03.012]
14. Andrade, R.; Silva, S.H.G.; Faria, W.M.; Poggere, G.C.; Barbosa, J.Z.; Guilherme, L.R.G.; Curi, N. Proximal sensing applied to soil texture prediction and mapping in Brazil. Geoderma Reg.; 2020; 23, e321. [DOI: https://dx.doi.org/10.1016/j.geodrs.2020.e00321]
15. César De Mello, D.; Demattê, J.A.M.; Silvero, N.E.Q.; Di Raimo, L.A.D.L.; Poppiel, R.R.; Mello, F.A.O.; Souza, A.B.; Safanelli, J.L.; Resende, M.E.B.; Rizzo, R. Soil magnetic susceptibility and its relationship with naturally occurring processes and soil attributes in pedosphere, in a tropical environment. Geoderma; 2020; 372, 114364. [DOI: https://dx.doi.org/10.1016/j.geoderma.2020.114364]
16. Filla, V.A.; Coelho, A.P.; Ferroni, A.D.; Bahia, A.S.R.D.; Marques Júnior, J. Estimation of clay content by magnetic susceptibility in tropical soils using linear and nonlinear models. Geoderma; 2021; 403, 115371. [DOI: https://dx.doi.org/10.1016/j.geoderma.2021.115371]
17. Hartemink, A.E.; Minasny, B. Towards digital soil morphometrics. Geoderma; 2014; 230–231, pp. 305-317. [DOI: https://dx.doi.org/10.1016/j.geoderma.2014.03.008]
18. Mishra, V.K.; Kumar, S.; Shukla, N. Image acquisition and techniques to perform image acquisition. Samriddhi; 2017; 9, pp. 21-24. [DOI: https://dx.doi.org/10.18090/samriddhi.v9i01.8333]
19. Sudarsan, B.; Ji, W.; Biswas, A.; Adamchuk, V. Microscope-based computer vision to characterize soil texture and soil organic matter. Biosyst. Eng.; 2016; 152, pp. 41-50. [DOI: https://dx.doi.org/10.1016/j.biosystemseng.2016.06.006]
20. Simon, T.; Zhang, Y.; Hartemink, A.E.; Huang, J.; Walter, C.; Yost, J.L. Predicting the color of sandy soils from Wisconsin, USA. Geoderma; 2020; 361, 114039. [DOI: https://dx.doi.org/10.1016/j.geoderma.2019.114039]
21. Ding, W.; Huang, C. Effects of soil surface roughness on interrill erosion processes and sediment particle size distribution. Geomorphology; 2017; 295, pp. 801-810. [DOI: https://dx.doi.org/10.1016/j.geomorph.2017.08.033]
22. Jia, S.; Li, H.; Wu, X.; Li, Q. Laboratory-based hyperspectral image analysis for the classification of soil texture. J. Appl. Remote Sens.; 2019; 13, 046508. [DOI: https://dx.doi.org/10.1117/1.JRS.13.046508]
23. Sudarsan, B.; Ji, W.; Adamchuk, V.; Biswas, A. Characterizing soil particle sizes using wavelet analysis of microscope images. Comput. Electron. Agric.; 2018; 148, pp. 217-225. [DOI: https://dx.doi.org/10.1016/j.compag.2018.03.019]
24. Qi, L.; Adamchuk, V.; Huang, H.; Leclerc, M.; Jiang, Y.; Biswas, A. Proximal sensing of soil particle sizes using a microscope-based sensor and bag of visual words model. Geoderma; 2019; 351, pp. 144-152. [DOI: https://dx.doi.org/10.1016/j.geoderma.2019.05.020]
25. Swetha, R.K.; Bende, P.; Singh, K.; Gorthi, S.; Biswas, A.; Li, B.; Weindorf, D.C.; Chakraborty, S. Predicting soil texture from smartphone-captured digital images and an application. Geoderma; 2020; 376, 114562. [DOI: https://dx.doi.org/10.1016/j.geoderma.2020.114562]
26. Barman, U.; Choudhury, R.D. Soil texture classification using multi class support vector machine. Inf. Process. Agric.; 2020; 7, pp. 318-332. [DOI: https://dx.doi.org/10.1016/j.inpa.2019.08.001]
27. Mirzaeitalarposhti, R.; Shafizadeh-Moghadam, H.; Taghizadeh-Mehrjardi, R.; Demyan, M.S. Digital Soil Texture Mapping and Spatial Transferability of Machine Learning Models Using Sentinel-1, Sentinel-2, and Terrain-Derived Covariates. Remote Sens.; 2022; 14, 5909. [DOI: https://dx.doi.org/10.3390/rs14235909]
28. Azadnia, R.; Jahanbakhshi, A.; Rashidi, S.; Khajehzadeh, M.; Bazyar, P. Developing an automated monitoring system for fast and accurate prediction of soil texture using an image-based deep learning network and machine vision system. Measurement; 2022; 190, 110669. [DOI: https://dx.doi.org/10.1016/j.measurement.2021.110669]
29. Kiran Pandiri, D.N.; Murugan, R.; Goel, T. Smart soil image classification system using lightweight convolutional neural network. Expert Syst. Appl.; 2024; 238, 122185. [DOI: https://dx.doi.org/10.1016/j.eswa.2023.122185]
30. González, E.; Bianconi, F.; Álvarez, M.X.; Saetta, S.A. Automatic characterization of the visual appearance of industrial materials through colour and texture analysis: An overview of methods and applications. Adv. Opt. Technol.; 2013; 2013, 503541.
31. Sharma, G. Color Fundamentals for Digital Imaging, Digital Color Imaging Handbook; CRC Press: Boca Raton, FL, USA, 2017; pp. 1-114.
32. Haralick, R.M.; Shanmugam, K.; Dinstein, I.H. Textural features for image classification. IEEE Trans. Syst. Man Cybern.; 1973; 3, pp. 610-621. [DOI: https://dx.doi.org/10.1109/TSMC.1973.4309314]
33. Qin, Z.; Zhang, P.; Wu, F.; Li, X. Fcanet: Frequency channel attention networks. Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV); Montreal, BC, Canada, 11–17 October 2021; pp. 783-792.
34. Ma, R.; Zhang, Y.; Zhang, B.; Fang, L.; Huang, D.; Qi, L. Learning Attention in the Frequency Domain for Flexible Real Photograph Denoising. IEEE Trans. Image Process.; 2024; 33, pp. 3707-3721. [DOI: https://dx.doi.org/10.1109/TIP.2024.3404253] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/38809730]
35. Zhang, H.; Xue, J.; Dana, K. Deep ten: Texture encoding network. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR); Honolulu, HI, USA, 21–26 July 2017; pp. 708-717.
36. Lin, T.-Y.; Maji, S. Improved Bilinear Pooling with CNNs; Cornell University Library: Ithaca, NY, USA, 2017.
37. Zhang, Y.; Jin, R.; Zhou, Z. Understanding bag-of-words model: A statistical framework. Int. J. Mach. Learn. Cybern.; 2010; 1, pp. 43-52. [DOI: https://dx.doi.org/10.1007/s13042-010-0001-0]
38. Targ, S.; Almeida, D.; Lyman, K. Resnet in resnet: Generalizing residual architectures. arXiv; 2016; arXiv: 1603.08029
39. Ma, R.; Li, S.; Zhang, B.; Fang, L.; Li, Z. Flexible and Generalized Real Photograph Denoising Exploiting Dual Meta Attention. IEEE Trans. Cybern.; 2023; 53, pp. 6395-6407. [DOI: https://dx.doi.org/10.1109/TCYB.2022.3170472]
40. Ma, R.; Zhang, B.; Zhou, Y.; Li, Z.; Lei, F. PID Controller Guided Attention Neural Network Learning for Fast and Effective Real Photographs Denoising. IEEE Trans. Neural Netw. Learn. Syst.; 2022; 33, pp. 3010-3023. [DOI: https://dx.doi.org/10.1109/TNNLS.2020.3048031]
41. Ma, R.; Li, S.; Zhang, B.; Li, Z. Towards Fast and Robust Real Image Denoising with Attentive Neural Network and PID Controller. IEEE Trans. Multimed.; 2022; 24, pp. 2366-2377. [DOI: https://dx.doi.org/10.1109/TMM.2021.3079697]
42. Wang, R.; Zou, R.; Liu, J.; Liu, L.; Hu, Y. Spatial distribution of soil nutrients in farmland in a hilly region of the pearl river delta in China based on geostatistics and the inverse distance weighting method. Agriculture; 2021; 11, 50. [DOI: https://dx.doi.org/10.3390/agriculture11010050]
43. Schulze, D.G.; Nagel, J.L.; Van Scoyoc, G.E.; Henderson, T.L.; Baumgardner, M.F.; Stott, D.E. Significance of organic matter in determining soil colors. Soil Color; 1993; 31, pp. 71-90.
44. Alibabaei, K.; Gaspar, P.D.; Lima, T.M.; Campos, R.M.; Girão, I.; Monteiro, J.; Lopes, C.M. A review of the challenges of using deep learning algorithms to support Decision-Making in agricultural activities. Remote Sens.; 2022; 14, 638. [DOI: https://dx.doi.org/10.3390/rs14030638]
45. Xu, K.; Qin, M.; Sun, F.; Wang, Y.; Chen, Y.; Ren, F. Learning in the frequency domain. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); Seattle, WA, USA, 14–19 June 2020; pp. 1740-1749.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Abstract
Soil texture is a significant attribute of soil. Insight into the soil texture is beneficial when making agricultural decisions during production. Nevertheless, assessing soil texture under laboratory conditions entails substantial effort, is time-consuming, and incurs a high cost. In this paper, we propose a soil texture detection network by embedding a frequency channel attention network and a texture encoding network into the representation learning paradigm of the ResNet framework. Concretely, the former reliably exploits the feature correlations among multiple frequencies, while the latter focuses on encoding feature variables, jointly enhancing the ability of feature expression. The clay, silt, and sand contents of the soil are then output through the ResNet18 fully connected layer. Experimental results show that the coefficient of determination (R²) values for predicting clay, silt, and sand content are 0.931, 0.936, and 0.957, respectively, with root mean square errors (RMSE) of 2.106%, 3.390%, and 3.602%, respectively. The proposed network also exhibits promising generalization capability, yielding considerable results on soil samples from a different region. Notably, the detection results are almost in agreement with conventional laboratory measurements and, at the same time, outperform those of other competitors, making the method highly attractive for practical applications.
1 College of Engineering, South China Agricultural University, Guangzhou 510642, China;
2 College of Water Conservancy and Civil Engineering, South China Agricultural University, Guangzhou 510642, China
3 School of Automobile and Construction Machinery, Guangdong Communication Polytechnic, Guangzhou 510650, China