Potential Use of Data-Driven Models to Estimate and Predict Soybean Yields at National Scale in Brazil

Large-scale assessment of crop yields plays a fundamental role in agricultural planning and in achieving food security goals. In this study, we evaluated the robustness of data-driven models for estimating soybean yields at 120 days after sowing (DAS) in the main producing regions of Brazil; and evaluat...

Bibliographic Details
Published in:	International journal of plant production 2022-12, Vol.16 (4), p.691-703
Main Authors:	Monteiro, Leonardo A.; Ramos, Rafael M.; Battisti, Rafael; Soares, Johnny R.; Oliveira, Julianne C.; Figueiredo, Gleyce K. D. A.; Lamparelli, Rubens A. C.; Nendel, Claas; Lana, Marcos Alberto
Format:	Article
Language:	English
Subjects:	Agricultural Science; Agriculture; Biomedical and Life Sciences; Climatic and soil variables; Geospatial and temporal variability; Jordbruksvetenskap; Large-scale analysis; Life Sciences; Machine learning approaches; Plant Ecology; Plant Physiology; Public databases

cited_by	cdi_FETCH-LOGICAL-c406t-f0afc81162566f22fb587ebe518d187128b3a9d3847cef0cc2c3b71e832ed9c63
cites	cdi_FETCH-LOGICAL-c406t-f0afc81162566f22fb587ebe518d187128b3a9d3847cef0cc2c3b71e832ed9c63
container_end_page	703
container_issue	4
container_start_page	691
container_title	International journal of plant production
container_volume	16
creator	Monteiro, Leonardo A.; Ramos, Rafael M.; Battisti, Rafael; Soares, Johnny R.; Oliveira, Julianne C.; Figueiredo, Gleyce K. D. A.; Lamparelli, Rubens A. C.; Nendel, Claas; Lana, Marcos Alberto
description	Large-scale assessment of crop yields plays a fundamental role in agricultural planning and in achieving food security goals. In this study, we evaluated the robustness of data-driven models for estimating soybean yields at 120 days after sowing (DAS) in the main producing regions of Brazil, and evaluated the reliability of the “best” data-driven model as a tool for early prediction of soybean yields for an independent year. Our methodology explicitly describes a general approach for combining publicly available databases and building data-driven models (multiple linear regression, MLR; random forests, RF; and support vector machines, SVM) to predict yields at large scales using gridded weather and soil data. We filtered out counties with missing or suspicious yield records, resulting in a crop yield database containing 3450 records (23 years × 150 “high-quality” counties). RF and SVM performed similarly in the calibration and validation steps, whereas MLR showed the poorest performance. Our analysis revealed the potential of data-driven models to predict soybean yields at large scales in Brazil around one month before harvest (i.e. 90 DAS). Using a well-trained RF model to predict crop yield for a specific year at 90 DAS, the RMSE ranged from 303.9 to 1055.7 kg ha⁻¹, representing a relative error (rRMSE) between 9.2 and 41.5%. Although we demonstrated robust data-driven models for yield prediction at large scales in Brazil, there is still room for improving their accuracy. Potential strategies to improve model accuracy include adding explanatory variables related to the crop (e.g. growing degree-days, flowering dates), the environment (e.g. remotely sensed vegetation indices, number of dry and heat days during the cycle), and outputs from process-based crop simulation models (e.g. biomass, leaf area index and plant phenology).
doi_str_mv	10.1007/s42106-022-00209-0
format	article
fullrecord	<record><control><sourceid>swepub_cross</sourceid><recordid>TN_cdi_swepub_primary_oai_slubar_slu_se_119035</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>oai_research_chalmers_se_4456f599_82bd_4e82_8610_453a2d63b91c</sourcerecordid><originalsourceid>FETCH-LOGICAL-c406t-f0afc81162566f22fb587ebe518d187128b3a9d3847cef0cc2c3b71e832ed9c63</originalsourceid><addsrcrecordid>eNp9kctuFDEQRVsIJELgB1j5BzqUH-12LyEJEClApCELVlbZLhNHne7I9gSFr4-HGbGDVdWiztFV3a57y-GEA4zvihIcdA9C9AACph6edUd8lENvQMnnh10brl52r0q5BdBac3PUxau10lITzuy6EFsjO8OK_VlOD7SwL2ugubC6svNS0x1WYrgEdpUpJF_ZZn10hAv7kWgOhWFlX7GmdWmyjceZWFrYh4y_0_y6exFxLvTmMI-764_n308_95ffPl2cvr_svQJd-wgYveFci0HrKER0gxnJ0cBN4GbkwjiJU5BGjZ4ieC-8dCMnIwWFyWt53J3sveUX3W-dvc8tdX60KyZb5q3DvBu2kOV8Ajk0YPNPIFMhzP7G-huc7yiXHafUoOMwTdYIF6wiI6zRHKwaJIqgpZu4b1axt_q8lpIp_vW2y11hdl-YbYXZP4VZaJA8RGnHy0_K9nbd5vbM8j_qCdoTmRc</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Potential Use of Data-Driven Models to Estimate and Predict Soybean Yields at National Scale in Brazil</title><source>Springer Link</source><creator>Monteiro, Leonardo A. ; Ramos, Rafael M. ; Battisti, Rafael ; Soares, Johnny R. ; Oliveira, Julianne C. ; Figueiredo, Gleyce K. D. A. ; Lamparelli, Rubens A. C. ; Nendel, Claas ; Lana, Marcos Alberto</creator><creatorcontrib>Monteiro, Leonardo A. ; Ramos, Rafael M. ; Battisti, Rafael ; Soares, Johnny R. ; Oliveira, Julianne C. ; Figueiredo, Gleyce K. D. A. ; Lamparelli, Rubens A. C. ; Nendel, Claas ; Lana, Marcos Alberto ; Sveriges lantbruksuniversitet</creatorcontrib><description>Large-scale assessment of crop yields plays a fundamental role for agricultural planning and to achieve food security goals. 
In this study, we evaluated the robustness of data-driven models for estimating soybean yields at 120 days after sow (DAS) in the main producing regions in Brazil; and evaluated the reliability of the “best” data-driven model as a tool for early prediction of soybean yields for an independent year. Our methodology explicitly describes a general approach for wrapping up publicly available databases and build data-driven models (multiple linear regression—MLR; random forests—RF; and support vector machines—SVM) to predict yields at large scales using gridded data of weather and soil information. We filtered out counties with missing or suspicious yield records, resulting on a crop yield database containing 3450 records (23 years × 150 “high-quality” counties). RF and SVM had similar results for calibration and validation steps, whereas MLR showed the poorest performance. Our analysis revealed a potential use of data-driven models for predict soybean yields at large scales in Brazil with around one month before harvest (i.e. 90 DAS). Using a well-trained RF model for predicting crop yield during a specific year at 90 DAS, the RMSE ranged from 303.9 to 1055.7 kg ha –1 representing a relative error (rRMSE) between 9.2 and 41.5%. Although we showed up robust data-driven models for yield prediction at large scales in Brazil, there are still a room for improving its accuracy. The inclusion of explanatory variables related to crop (e.g. growing degree-days, flowering dates), environment (e.g. remotely-sensed vegetation indices, number of dry and heat days during the cycle) and outputs from process-based crop simulation models (e.g. 
biomass, leaf area index and plant phenology), are potential strategies to improve model accuracy.</description><identifier>ISSN: 1735-6814</identifier><identifier>ISSN: 1735-8043</identifier><identifier>EISSN: 1735-8043</identifier><identifier>DOI: 10.1007/s42106-022-00209-0</identifier><language>eng</language><publisher>Cham: Springer International Publishing</publisher><subject>Agricultural Science ; Agriculture ; Biomedical and Life Sciences ; Climatic and soil variables ; Geospatial and temporal variability ; Jordbruksvetenskap ; Large-scale analysis ; Life Sciences ; Machine learning approaches ; Plant Ecology ; Plant Physiology ; Public databases</subject><ispartof>International journal of plant production, 2022-12, Vol.16 (4), p.691-703</ispartof><rights>Springer Nature Switzerland AG 2022. Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c406t-f0afc81162566f22fb587ebe518d187128b3a9d3847cef0cc2c3b71e832ed9c63</citedby><cites>FETCH-LOGICAL-c406t-f0afc81162566f22fb587ebe518d187128b3a9d3847cef0cc2c3b71e832ed9c63</cites><orcidid>0000-0003-3889-6095</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>230,315,786,790,891,27957,27958</link.rule.ids><backlink>$$Uhttps://research.chalmers.se/publication/532111$$DView record from Swedish Publication Index$$Hfree_for_read</backlink><backlink>$$Uhttps://res.slu.se/id/publ/119035$$DView record from Swedish Publication Index$$Hfree_for_read</backlink></links><search><creatorcontrib>Monteiro, Leonardo 
A.</creatorcontrib><creatorcontrib>Ramos, Rafael M.</creatorcontrib><creatorcontrib>Battisti, Rafael</creatorcontrib><creatorcontrib>Soares, Johnny R.</creatorcontrib><creatorcontrib>Oliveira, Julianne C.</creatorcontrib><creatorcontrib>Figueiredo, Gleyce K. D. A.</creatorcontrib><creatorcontrib>Lamparelli, Rubens A. C.</creatorcontrib><creatorcontrib>Nendel, Claas</creatorcontrib><creatorcontrib>Lana, Marcos Alberto</creatorcontrib><creatorcontrib>Sveriges lantbruksuniversitet</creatorcontrib><title>Potential Use of Data-Driven Models to Estimate and Predict Soybean Yields at National Scale in Brazil</title><title>International journal of plant production</title><addtitle>Int. J. Plant Prod</addtitle><description>Large-scale assessment of crop yields plays a fundamental role for agricultural planning and to achieve food security goals. In this study, we evaluated the robustness of data-driven models for estimating soybean yields at 120 days after sow (DAS) in the main producing regions in Brazil; and evaluated the reliability of the “best” data-driven model as a tool for early prediction of soybean yields for an independent year. Our methodology explicitly describes a general approach for wrapping up publicly available databases and build data-driven models (multiple linear regression—MLR; random forests—RF; and support vector machines—SVM) to predict yields at large scales using gridded data of weather and soil information. We filtered out counties with missing or suspicious yield records, resulting on a crop yield database containing 3450 records (23 years × 150 “high-quality” counties). RF and SVM had similar results for calibration and validation steps, whereas MLR showed the poorest performance. Our analysis revealed a potential use of data-driven models for predict soybean yields at large scales in Brazil with around one month before harvest (i.e. 90 DAS). 
Using a well-trained RF model for predicting crop yield during a specific year at 90 DAS, the RMSE ranged from 303.9 to 1055.7 kg ha –1 representing a relative error (rRMSE) between 9.2 and 41.5%. Although we showed up robust data-driven models for yield prediction at large scales in Brazil, there are still a room for improving its accuracy. The inclusion of explanatory variables related to crop (e.g. growing degree-days, flowering dates), environment (e.g. remotely-sensed vegetation indices, number of dry and heat days during the cycle) and outputs from process-based crop simulation models (e.g. biomass, leaf area index and plant phenology), are potential strategies to improve model accuracy.</description><subject>Agricultural Science</subject><subject>Agriculture</subject><subject>Biomedical and Life Sciences</subject><subject>Climatic and soil variables</subject><subject>Geospatial and temporal variability</subject><subject>Jordbruksvetenskap</subject><subject>Large-scale analysis</subject><subject>Life Sciences</subject><subject>Machine learning approaches</subject><subject>Plant Ecology</subject><subject>Plant Physiology</subject><subject>Public databases</subject><issn>1735-6814</issn><issn>1735-8043</issn><issn>1735-8043</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><recordid>eNp9kctuFDEQRVsIJELgB1j5BzqUH-12LyEJEClApCELVlbZLhNHne7I9gSFr4-HGbGDVdWiztFV3a57y-GEA4zvihIcdA9C9AACph6edUd8lENvQMnnh10brl52r0q5BdBac3PUxau10lITzuy6EFsjO8OK_VlOD7SwL2ugubC6svNS0x1WYrgEdpUpJF_ZZn10hAv7kWgOhWFlX7GmdWmyjceZWFrYh4y_0_y6exFxLvTmMI-764_n308_95ffPl2cvr_svQJd-wgYveFci0HrKER0gxnJ0cBN4GbkwjiJU5BGjZ4ieC-8dCMnIwWFyWt53J3sveUX3W-dvc8tdX60KyZb5q3DvBu2kOV8Ajk0YPNPIFMhzP7G-huc7yiXHafUoOMwTdYIF6wiI6zRHKwaJIqgpZu4b1axt_q8lpIp_vW2y11hdl-YbYXZP4VZaJA8RGnHy0_K9nbd5vbM8j_qCdoTmRc</recordid><startdate>20221201</startdate><enddate>20221201</enddate><creator>Monteiro, Leonardo A.</creator><creator>Ramos, Rafael 
M.</creator><creator>Battisti, Rafael</creator><creator>Soares, Johnny R.</creator><creator>Oliveira, Julianne C.</creator><creator>Figueiredo, Gleyce K. D. A.</creator><creator>Lamparelli, Rubens A. C.</creator><creator>Nendel, Claas</creator><creator>Lana, Marcos Alberto</creator><general>Springer International Publishing</general><scope>AAYXX</scope><scope>CITATION</scope><scope>ADTPV</scope><scope>AOWAS</scope><scope>F1S</scope><orcidid>https://orcid.org/0000-0003-3889-6095</orcidid></search><sort><creationdate>20221201</creationdate><title>Potential Use of Data-Driven Models to Estimate and Predict Soybean Yields at National Scale in Brazil</title><author>Monteiro, Leonardo A. ; Ramos, Rafael M. ; Battisti, Rafael ; Soares, Johnny R. ; Oliveira, Julianne C. ; Figueiredo, Gleyce K. D. A. ; Lamparelli, Rubens A. C. ; Nendel, Claas ; Lana, Marcos Alberto</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c406t-f0afc81162566f22fb587ebe518d187128b3a9d3847cef0cc2c3b71e832ed9c63</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Agricultural Science</topic><topic>Agriculture</topic><topic>Biomedical and Life Sciences</topic><topic>Climatic and soil variables</topic><topic>Geospatial and temporal variability</topic><topic>Jordbruksvetenskap</topic><topic>Large-scale analysis</topic><topic>Life Sciences</topic><topic>Machine learning approaches</topic><topic>Plant Ecology</topic><topic>Plant Physiology</topic><topic>Public databases</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Monteiro, Leonardo A.</creatorcontrib><creatorcontrib>Ramos, Rafael M.</creatorcontrib><creatorcontrib>Battisti, Rafael</creatorcontrib><creatorcontrib>Soares, Johnny R.</creatorcontrib><creatorcontrib>Oliveira, Julianne C.</creatorcontrib><creatorcontrib>Figueiredo, Gleyce K. D. A.</creatorcontrib><creatorcontrib>Lamparelli, Rubens A. 
C.</creatorcontrib><creatorcontrib>Nendel, Claas</creatorcontrib><creatorcontrib>Lana, Marcos Alberto</creatorcontrib><creatorcontrib>Sveriges lantbruksuniversitet</creatorcontrib><collection>CrossRef</collection><collection>SwePub</collection><collection>SwePub Articles</collection><collection>SWEPUB Chalmers tekniska högskola</collection><jtitle>International journal of plant production</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Monteiro, Leonardo A.</au><au>Ramos, Rafael M.</au><au>Battisti, Rafael</au><au>Soares, Johnny R.</au><au>Oliveira, Julianne C.</au><au>Figueiredo, Gleyce K. D. A.</au><au>Lamparelli, Rubens A. C.</au><au>Nendel, Claas</au><au>Lana, Marcos Alberto</au><aucorp>Sveriges lantbruksuniversitet</aucorp><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Potential Use of Data-Driven Models to Estimate and Predict Soybean Yields at National Scale in Brazil</atitle><jtitle>International journal of plant production</jtitle><stitle>Int. J. Plant Prod</stitle><date>2022-12-01</date><risdate>2022</risdate><volume>16</volume><issue>4</issue><spage>691</spage><epage>703</epage><pages>691-703</pages><issn>1735-6814</issn><issn>1735-8043</issn><eissn>1735-8043</eissn><abstract>Large-scale assessment of crop yields plays a fundamental role for agricultural planning and to achieve food security goals. In this study, we evaluated the robustness of data-driven models for estimating soybean yields at 120 days after sow (DAS) in the main producing regions in Brazil; and evaluated the reliability of the “best” data-driven model as a tool for early prediction of soybean yields for an independent year. 
Our methodology explicitly describes a general approach for wrapping up publicly available databases and build data-driven models (multiple linear regression—MLR; random forests—RF; and support vector machines—SVM) to predict yields at large scales using gridded data of weather and soil information. We filtered out counties with missing or suspicious yield records, resulting on a crop yield database containing 3450 records (23 years × 150 “high-quality” counties). RF and SVM had similar results for calibration and validation steps, whereas MLR showed the poorest performance. Our analysis revealed a potential use of data-driven models for predict soybean yields at large scales in Brazil with around one month before harvest (i.e. 90 DAS). Using a well-trained RF model for predicting crop yield during a specific year at 90 DAS, the RMSE ranged from 303.9 to 1055.7 kg ha –1 representing a relative error (rRMSE) between 9.2 and 41.5%. Although we showed up robust data-driven models for yield prediction at large scales in Brazil, there are still a room for improving its accuracy. The inclusion of explanatory variables related to crop (e.g. growing degree-days, flowering dates), environment (e.g. remotely-sensed vegetation indices, number of dry and heat days during the cycle) and outputs from process-based crop simulation models (e.g. biomass, leaf area index and plant phenology), are potential strategies to improve model accuracy.</abstract><cop>Cham</cop><pub>Springer International Publishing</pub><doi>10.1007/s42106-022-00209-0</doi><tpages>13</tpages><orcidid>https://orcid.org/0000-0003-3889-6095</orcidid></addata></record>
fulltext	fulltext
identifier	ISSN: 1735-6814
ispartof	International journal of plant production, 2022-12, Vol.16 (4), p.691-703
issn	1735-6814 1735-8043 1735-8043
language	eng
recordid	cdi_swepub_primary_oai_slubar_slu_se_119035
source	Springer Link
subjects	Agricultural Science; Agriculture; Biomedical and Life Sciences; Climatic and soil variables; Geospatial and temporal variability; Jordbruksvetenskap; Large-scale analysis; Life Sciences; Machine learning approaches; Plant Ecology; Plant Physiology; Public databases
title	Potential Use of Data-Driven Models to Estimate and Predict Soybean Yields at National Scale in Brazil
url	http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-09-21T22%3A59%3A04IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-swepub_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Potential%20Use%20of%20Data-Driven%20Models%20to%20Estimate%20and%20Predict%20Soybean%20Yields%20at%20National%20Scale%20in%20Brazil&rft.jtitle=International%20journal%20of%20plant%20production&rft.au=Monteiro,%20Leonardo%20A.&rft.aucorp=Sveriges%20lantbruksuniversitet&rft.date=2022-12-01&rft.volume=16&rft.issue=4&rft.spage=691&rft.epage=703&rft.pages=691-703&rft.issn=1735-6814&rft.eissn=1735-8043&rft_id=info:doi/10.1007/s42106-022-00209-0&rft_dat=%3Cswepub_cross%3Eoai_research_chalmers_se_4456f599_82bd_4e82_8610_453a2d63b91c%3C/swepub_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c406t-f0afc81162566f22fb587ebe518d187128b3a9d3847cef0cc2c3b71e832ed9c63%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true