Prediction of berry calibers in blueberry Vaccinium corymbosum L. using phenological variables and machine learning techniques

Dávila Arrasco, Francis Ruth; Vásquez Velasco, Christian Richard Alberto

Date

2025-06-03

Abstract

This research focused on predicting weekly berry caliber in blueberry using phenological data and machine learning algorithms. The data were preprocessed, and the phenological variables with the greatest influence on caliber were identified. To evaluate algorithm performance, multiple linear regression, Random Forest, and Generalized Additive Mixed Model (GAMM) approaches were compared. The results show that the preprocessing approach made it possible to assess model generalization: it used an initial partition by Hold-Out Cross-Validation for time series to produce two validations (internal and external), followed by Time Series Cross-Validation. Natural spline transformations, generated temporal variables, and One-Hot Encoding of categorical variables were particularly relevant. The number of set fruits per plant had a significant relationship with berry caliber, which highlights the importance of phenological variables. The Random Forest (RF) model had the highest predictive power, with a Mean Absolute Percentage Error (MAPE) of 2.44 ± 0.13% in the cross-validation phase, and it produced low-bias forecasts within the historical range of observed caliber values. The study recommends feature engineering to improve the predictive performance of the algorithms, together with biometric variables such as the number of terminal branches per plant and variables for irrigation, pruning, fertilization, and bioregulation. It advises against using climatic variables because of their complex interactions with different varieties.
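The evaluation scheme named in the abstract (time series cross-validation scored with MAPE) can be illustrated with a minimal sketch. It is not the authors' code: the function names `mape` and `expanding_window_splits`, the fold layout, the caliber values, and the placeholder mean forecast are all hypothetical, and reporting the score as mean ± standard deviation across folds is an assumption about how a figure such as 2.44 ± 0.13% might be summarized.

```python
from statistics import mean, stdev


def mape(actual, predicted):
    """Mean absolute percentage error, expressed in percent."""
    return 100.0 * mean(abs(a - p) / abs(a) for a, p in zip(actual, predicted))


def expanding_window_splits(n_samples, n_folds, test_size):
    """Yield (train_idx, test_idx) pairs in which training always precedes testing."""
    first_test_start = n_samples - n_folds * test_size
    if first_test_start <= 0:
        raise ValueError("not enough samples for the requested folds")
    for k in range(n_folds):
        start = first_test_start + k * test_size
        yield list(range(start)), list(range(start, start + test_size))


# Hypothetical weekly mean berry calibers in mm (illustrative values only).
calibers = [14.0, 14.5, 15.0, 15.5, 16.0, 16.0, 16.5, 17.0, 17.0, 17.5]

fold_scores = []
for train_idx, test_idx in expanding_window_splits(len(calibers), n_folds=3, test_size=2):
    # Placeholder model (assumption): forecast the mean of the training window.
    forecast = mean(calibers[i] for i in train_idx)
    actual = [calibers[i] for i in test_idx]
    fold_scores.append(mape(actual, [forecast] * len(actual)))

print(f"MAPE: {mean(fold_scores):.2f} ± {stdev(fold_scores):.2f} %")
```

Each fold trains only on weeks that precede its test weeks, so no future observation leaks into a forecast; this is the property that distinguishes time series cross-validation from a shuffled k-fold split.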

URI

https://hdl.handle.net/20.500.12893/14782

Collections

Agronomy [294]

Except where otherwise noted, this item's license is described as info:eu-repo/semantics/openAccess