Updated November 9, 2023:
Project report (first quarter Jan 1 2023 to March 31, 2023)
Project funded by North Central Soybean Research Program, sponsored by the Soy Checkoff
Project title - Field phenotyping using machine learning tools integrated with genetic mapping to address heat and drought induced flower abortion in soybean
Participating institutions – Texas Tech University, Kansas State University, University of Missouri, and University of Tennessee
Goals & Objectives
Long-term Goal – Develop soybean cultivars with 20 to 30% lower flower abortion under favorable to challenging environmental conditions, leading to about 10-15% increase in yield potential
Objectives (Year 1)
• Explore the genetic diversity in flower abortion under different soil moisture and climatic conditions using a large diversity panel
• Develop an image-based field phenotyping system and deep-learning tools to precisely document temporal dynamics in flower abortion and pod retention in genetically diverse soybeans
• Discover environmentally stable and region-specific genomic regions controlling flower abortion in diverse soil types, moisture, and climatic conditions
Progress achieved
Objective 1 - Explore the genetic diversity in flower abortion under different soil moisture and climatic conditions using a large diversity panel
A total of 350 diverse soybean lines were sent for winter nursery seed increase in Costa Rica in December 2022. They were planted in foundation seed increase plots (a total of 150 ft of row length for each line) to make sure enough seed (5 lbs) is available for field planting at multiple locations in summer 2023. Among the 350 lines, 310 lines had good germination and plant stand in the seed multiplication field. We expect to receive sufficient seed for these lines in late April for 2023 summer planting. The panel targets genetic diversity, in terms of genetic structure, within maturity groups III and IV.
The 310 lines represent the genetic diversity of the USDA soybean germplasm collection in maturity groups III and IV. We have whole genome sequencing data for this set with an average sequencing coverage of 20x. Approximately 0.6 million high-quality SNPs and 0.5 million InDels are available for robust GWAS to identify genetic loci and genes regulating stress resilience and flower abortion in soybean. The average combined SNP and InDel density is about 1 marker/kbp.
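The marker density quoted above follows from simple arithmetic. The sketch below reproduces it, assuming a soybean genome size of roughly 1.1 Gbp; that figure is an assumption for illustration, since the report does not state the assembly size used.

```python
# Back-of-the-envelope marker density for the GWAS panel.
# ASSUMPTION: genome size of ~1.1 Gbp; the report does not state the value used.
GENOME_SIZE_BP = 1.1e9

snps = 0.6e6    # high-quality SNPs (from the report)
indels = 0.5e6  # InDels (from the report)


def markers_per_kbp(n_markers: float, genome_bp: float) -> float:
    """Average number of markers per 1,000 base pairs."""
    return n_markers / (genome_bp / 1_000)


density = markers_per_kbp(snps + indels, GENOME_SIZE_BP)
print(f"{density:.2f} markers per kbp")
```

Under this assumption, the 1.1 million combined markers work out to about one marker per kbp, in line with the figure given in the report.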
Preparation of field trials at multiple participating locations
The experimental site at the University of Missouri for this project is located at the Bradford Research Center (Columbia, MO). A three-acre field was reserved on the farm for this project. We will collect soil samples to identify basic soil properties. The field will be prepared for planting in April. The proposed ~310 diverse lines will be planted in mid-May to early June, depending on the local weather.
The experimental site to evaluate the diversity panel under rain-fed conditions at Kansas State University will be located at the Agronomy North Farm near Manhattan, KS. Three and one-half acres have been reserved for planting the experiment. Field preparation for planting is underway and soil samples will be taken following planting. We expect to receive seed of the panel from the winter nursery in April or early May (shared by University of Missouri colleagues) with an expected planting date in May.
The experimental site at the University of Tennessee will be located at the West TN Research and Education Center (WTREC), under rainfed conditions. We have secured a little over 2.5 acres for this study in 2023. Soil sample collection is in progress, and detailed information about the field will be documented. The burndown will be done in a couple of weeks. We will receive seeds of the 310 soybean lines from University of Missouri colleagues, and planting will be done in early May.
The experiment will be conducted on the Quaker Avenue Research Farm at Texas Tech University in Lubbock, TX. The experiment will be carried out under sub-surface drip irrigation (SDI). Multiple irrigation zones, totaling 3 acres, have been obtained for this trial. Soil samples will be collected and analyzed along with documentation of the field history over prior years. Herbicide applications for burndown will be completed in April, followed by a pre-emergence herbicide application in mid-May prior to planting of the ~310 soybean lines.
Objective 2 - Develop an image-based field phenotyping system and deep-learning tools to precisely document temporal dynamics in flower abortion and pod retention in genetically diverse soybeans
Ahead of the field season, the team has used greenhouse-grown soybean plants and other existing datasets to develop a robust machine learning tool to detect flower number and rate of abortion under field conditions.
The team is implementing two general strategies for enumerating aborted flowers and has begun to apply them to greenhouse grown soybean plants.
1. Pre-abortion: Counting flowers on the plant and comparing the counts over time
2. Post-abortion: Collecting and counting aborted flowers over time
Strategy 1: We have developed a preliminary imaging protocol by which images of greenhouse plants are collected from multiple views and with high enough resolution (e.g., 4K x 6K) such that the smallest flowers comprise a minimum of 30 pixels. Our proposed strategy would then detect the flowers in two stages (see page 3 of the attached PDF for the image).
a) Subsample the acquired image and feed it to a node-detection network. Subsampling the original high-resolution image makes it possible for the detection network to ingest it; image fidelity is not compromised, because the original image is retained for the next step.
b) Having the nodes localized from the previous step, crop the original image, and feed the resulting high-resolution sub-images to a flower-detection network. This ensures that even the smallest flowers are comprised of a sufficiently large number of pixels and yet, the cropped input images are small enough for the network to ingest.
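The coordinate bookkeeping behind the two stages above can be sketched as follows. This is a minimal illustration, not the team's implementation: the function names, the padding value, and the example image size and subsampling factor are all hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def to_full_resolution(box: Box, scale: float) -> Box:
    """Map a box found on the subsampled image back to original pixels."""
    x0, y0, x1, y1 = box
    return (x0 * scale, y0 * scale, x1 * scale, y1 * scale)


def crop_window(box: Box, width: int, height: int,
                pad: int = 32) -> Tuple[int, int, int, int]:
    """Pad a full-resolution node box and clip it to the image bounds."""
    x0, y0, x1, y1 = box
    return (max(0, int(x0) - pad), max(0, int(y0) - pad),
            min(width, int(x1) + pad), min(height, int(y1) + pad))


def node_crops(node_boxes: List[Box], scale: float,
               width: int, height: int) -> List[Tuple[int, int, int, int]]:
    """Crop windows (in original pixels) for every detected node."""
    return [crop_window(to_full_resolution(b, scale), width, height)
            for b in node_boxes]


# Hypothetical example: a 4000 x 6000 pixel image subsampled by a factor of 4.
# The two boxes stand in for node detections made on the subsampled image.
crops = node_crops([(100, 200, 150, 260), (0, 0, 20, 20)],
                   scale=4, width=4000, height=6000)
print(crops)
```

Each resulting window would be cut from the original image and passed to the flower-detection network, so the smallest flowers keep their full pixel count.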
Node-Detection Network: As an initial approach to detecting nodes, we have employed the Faster R-CNN architecture. We started by pre-training our model with a dataset provided by the study in 2023 that focuses on detecting nodes on Eggplant, Chili, and Tomato plants. (see pages 4 to 9 for different images related to this network approach)
Future work includes:
1) Simplifying the annotation process for the new dataset, which contains three times more images than the previous one, by using the existing model's predictions (as shown in Figure 5) as preliminary annotations. Therefore, the annotators will primarily focus on refining the predicted bounding boxes and occasionally making additions or deletions. This approach will significantly accelerate the annotation process, which is essential for efficient model development.
2) Exploring and implementing other state-of-the-art network architectures that may be better suited and capable of achieving superior performance for our application.
3) Associating the model predictions with the ground truth flower and node data to ascertain the efficiency of the model predictions and the extent of refinement needed for models to be precise to allow for deployment under field conditions.
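The model-assisted annotation idea in item 1 can be sketched as below. This is an illustrative assumption rather than the team's tooling: the prediction dictionary layout, the score threshold, and the `needs_review` flag are hypothetical, and the `bbox` field follows the COCO convention of `[x, y, width, height]`.

```python
def predictions_to_preannotations(predictions, image_id,
                                  min_score=0.5, category_id=1):
    """Turn model predictions into draft annotations for human refinement.

    `predictions` is a list of dicts with a "box" given as
    (x_min, y_min, x_max, y_max) and a confidence "score" (hypothetical layout).
    """
    annotations = []
    for pred in predictions:
        if pred["score"] < min_score:
            continue  # leave low-confidence regions for the annotator to add
        x0, y0, x1, y1 = pred["box"]
        annotations.append({
            "image_id": image_id,
            "category_id": category_id,
            "bbox": [x0, y0, x1 - x0, y1 - y0],  # COCO-style [x, y, w, h]
            "score": pred["score"],
            "needs_review": True,  # hypothetical flag for the annotation tool
        })
    return annotations


# Hypothetical predictions for one image.
preds = [
    {"box": (10, 20, 50, 80), "score": 0.92},
    {"box": (200, 40, 230, 90), "score": 0.31},
]
drafts = predictions_to_preannotations(preds, image_id=7)
print(len(drafts), drafts[0]["bbox"])
```

Annotators would then only adjust, add, or delete boxes instead of drawing every box from scratch.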
Flower Detection Network: Similar to the node detection network, the flower detection network is also based on the Faster R-CNN architecture. Specifically, we used the Faster R-CNN implementation available in Detectron2 (a library containing state-of-the-art detection and segmentation algorithms made publicly available by Facebook AI Research). We trained an initial model based on a dataset published by Zhu et al. (2022). A summary of the dataset, the statistics on the training/validation/test subsets, and all related images and tables can be found on pages 10 to 15 of the attached PDF.
Future work includes:
1) Fine-tuning the original model, trained on images from Zhu et al. (2022), on images selected from our own collection to ensure the model performs well on our images and is robust to variations in image resolution and other image characteristics (e.g., images with a smaller or larger number of flowers, images with more or fewer leaves, etc.)
2) Exploring and implementing other state-of-the-art network architectures (e.g., YOLOv7) that may be better suited and capable of achieving superior performance for our application.
Strategy 2: We have developed a preliminary imaging protocol by which the aborted flowers from greenhouse plants are collected, imaged, and annotated on capture plates. The approach for capturing and quantifying the aborted flowers, along with related images, can be found on pages 16 to 19 of the attached PDF.
Annotated images are used to train a network for aborted flower detection and counting. The network used is also a Faster R-CNN network available in Detectron2. To gain an understanding of what plate color may lead to best predicted counts for aborted flowers, we imaged aborted flowers on plates of three colors: Sky Blue (2 images), Deep Blue (2 images), and Black (3 images), and we trained a model for each plate color (we used one image for training and one for test). Furthermore, we trained a model based on all imaged plates regardless of the color (three images of three different colors were used for training and three images for testing). A total of 168 aborted flowers were annotated on the 7 plate images.
Future work includes:
1) Annotating more image plates and training a model that is robust to plate color/background.
2) Exploring transfer learning from a model that the team has trained in prior work for detecting sorghum seeds spread on a piece of paper.
3) Exploring and implementing other state-of-the-art network architectures (e.g., YOLOv7) that may be better suited and capable of achieving superior performance for our application.
Objective 3 - Discover environmentally stable and region-specific genomic regions controlling flower abortion in diverse soil types, moisture, and climatic conditions
Organ abscission (in this case pistil and flower) is an important process that regulates the detachment of flowers from the stem. However, the underlying genetic mechanism of flower abscission is largely unknown in plants. To understand flower abscission in soybean, we surveyed the key determinant genes involved in flower and flower organ abscission in Arabidopsis and identified orthologs in the soybean genome. The majority of genes expressed in the abscission layer in the model organisms are associated with hormone biosynthesis/transport and nutrient uptake. We have selected a subset of these genes (mainly transcription factors) involved in hormone regulation. We will conduct a gene-based haplotype analysis to select the group of lines and correlate the large effect variants with the phenotypic data. The confounding effect (if any) of flowering QTLs will be compared for the selected genes.
Updated November 9, 2023:
Project report (Second quarter April 1 2023 to June 30, 2023)
Project funded by North Central Soybean Research Program, sponsored by the Soy Checkoff
Project title - Field phenotyping using machine learning tools integrated with genetic mapping to address heat and drought induced flower abortion in soybean
Participating institutions – Texas Tech University, Kansas State University, University of Missouri, and University of Tennessee
Goals & Objectives
Long-term Goal – Develop soybean cultivars with 20 to 30% lower flower abortion under favorable to challenging environmental conditions, leading to about 10-15% increase in yield potential
Objectives (Year 1)
• Explore the genetic diversity in flower abortion under different soil moisture and climatic conditions using a large diversity panel
• Develop an image-based field phenotyping system and deep-learning tool to precisely document temporal dynamics in flower abortion and pod retention in genetically diverse soybeans
• Discover environmentally stable and region-specific genomic regions controlling flower abortion in diverse soil types, moisture, and climatic conditions
Progress achieved
Objective 1 - Explore the genetic diversity in flower abortion under different soil moisture and climatic conditions using a large diversity panel
Texas Tech University
Seed processing and field preparation activities for the 228 lines were initiated on May 19th, following the delivery of seeds from the University of Missouri on May 18th. On June 13th, the seeds were treated with HiStick (BASF), an inoculant and biofungicide, to promote seed emergence, growth, and protection.
However, unfavorable weather conditions characterized by frequent showers posed challenges, resulting in a delay in planting. On June 16th, the soybean seeds were planted, and the plants have currently progressed to the V3 growth stage (Figure 1; see attached PDF).
To ensure effective weed control, continuous monitoring efforts have been undertaken in the field. To facilitate image phenotyping, our team is currently exploring the use of a sprayer or a tractor with a sprayer implement for installing the cameras (Figure 2; see attached PDF).
University of Missouri
Seeds of a diverse set of soybean germplasm (228 lines) in the USDA Gene Bank were successfully increased in Costa Rica. Our group distributed seeds to collaborators in Tennessee, Kansas, and Texas in May.
We planted all these entries in Columbia, MO on May 24, 2023. Germination was excellent. Field plots are well established, and plants reached V4-V5 growth stages as of June 30th (Figure 3; see attached PDF). We expect initial flowering in 10-14 days.
We are preparing the image-based field phenotyping system as instructed by the engineering group in this project, and field phenotyping is expected to start in the last week of July or the first week of August.
University of Tennessee
Plots were planted on June 7, 2023 at WTREC. All 700 plots are well maintained. The beans are at growth stage V3 to V4 (3 to 4 trifoliate leaves) (Figure 4; see attached PDF). The soybean crop will be managed according to University of Tennessee recommendations for growth regulator, pesticide applications, etc.
Rainfall and environmental data will be provided by the National Oceanic and Atmospheric Administration Global Historical Climatology Network Weather Station (GHCND: USC00404561), located immediately adjacent to the experimental field. A Ph.D. student has joined us and will start his dissertation research activities on the current soybean project.
Kansas State University
The planting of the soybean plots took place on May 25th. Currently, we are actively monitoring the plots, and it is anticipated that the soybean plants will soon reach the R1 growth stage (Figure 5; see attached PDF).
To facilitate the installation of the imaging system, we have made specific modifications to a high-clearance spray vehicle (Figure 6; see attached PDF), which will serve as the mounting platform for the imaging system. The wheel spacing has been adjusted to straddle our 10' wide plots. This modification ensures optimal coverage and accessibility for capturing high-quality images of the soybean plants.
Objective 2 - Develop an image-based field phenotyping system and deep-learning tool to precisely document temporal dynamics in flower abortion and pod retention in genetically diverse soybeans
Texas Tech University
Five different models for node detection were evaluated, all of which were found to have comparable performances. We will hand these over to the K-State team so that they can integrate them into the flower detection pipeline and begin to process the images that will be collected at various sites in the coming weeks.
A 27-megapixel GoPro Hero11 camera was evaluated because of its ease of use and image collection. A protocol for collecting high-quality images, based on the GoPro camera parameters, was developed for all locations. The imaging system was tested in a greenhouse, and its ability to capture and record high-quality images at 60 frames per second was verified (Figure 7; see attached PDF). Furthermore, the captured images were used as input to the node detection model with a successful outcome (Figure 8; see attached PDF).
We expect that the respective teams at each location will innovate, assemble, and implement a strategy for conveying the imaging system through the field. We will provide backstopping and help with image processing as the teams start generating field-imaging videos.
Kansas State University
Improving the quality of the flower detection model
We have fine-tuned the original Faster R-CNN flower detection model to improve its predictions. Specifically, the model was fine-tuned with a variety of images: some taken in more controlled environments and others resembling images taken in the field; some sharply focused and others somewhat blurred; and images taken with different imaging systems/cameras producing different resolutions and quality. The Average Precision for detections whose bounding boxes overlap by at least 50% with the ground truth bounding boxes, i.e., an intersection-over-union (IoU) of at least 0.5 (denoted as AP50), was 79.53 on the test images. Some sample predictions on test images are provided together with their corresponding ground truth annotations in Figure 9 (see attached PDF); the predicted and ground truth counts are shown underneath each image.
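The 50% overlap criterion behind AP50 can be illustrated with a short sketch. This is a simplified, hedged example: it only counts true positives, false positives, and missed flowers at a single IoU threshold using greedy matching, whereas the full AP50 metric additionally averages precision over recall levels. The boxes and scores are hypothetical.

```python
def iou(a, b):
    """Intersection-over-union of two (x_min, y_min, x_max, y_max) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def match_detections(preds, truths, iou_threshold=0.5):
    """Greedily match (score, box) predictions to ground truth boxes.

    Predictions are visited in descending score order; each ground truth
    box can be matched at most once. Returns (tp, fp, fn).
    """
    unmatched = list(truths)
    tp = fp = 0
    for _score, box in sorted(preds, key=lambda p: p[0], reverse=True):
        best = max(unmatched, key=lambda t: iou(box, t), default=None)
        if best is not None and iou(box, best) >= iou_threshold:
            unmatched.remove(best)
            tp += 1
        else:
            fp += 1
    return tp, fp, len(unmatched)


# Hypothetical ground truth and predictions for one image.
truths = [(0, 0, 10, 10), (20, 20, 30, 30)]
preds = [(0.9, (1, 1, 10, 10)), (0.8, (50, 50, 60, 60)), (0.7, (0, 0, 10, 10))]
tp, fp, fn = match_detections(preds, truths)
print(tp, fp, fn)
```

In this example the third prediction duplicates a flower that was already matched, so it counts as a false positive rather than a second hit.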
Adding pods to the flower model
During the last reporting period, we have also enriched our model with the ability to detect pods. Specifically, we have adapted the previous Faster R-CNN model to detect pods (in addition to flowers) by fine-tuning it with 2693 annotated pod images (Table 1; see PDF). We used the Faster R-CNN implementation available in Detectron2 (a library containing state-of-the-art detection and segmentation algorithms made publicly available by Facebook AI Research).
Figure 10 (see attached PDF) shows the predicted bounding boxes alongside the ground truth annotations and the original images.
Flowers/Pods per whole plant images
To better estimate the overall prediction capability of the flower/pod detection model, we evaluated it by comparing the number of detected flowers/pods with the number of ground truth flowers/pods per whole plant image. (Note that this is different from the number of flowers/pods per plant, as some flowers/pods may not be visible in a particular image, depending on the angle of the image.) More specifically, we mapped the coordinates of the flowers/pods in each individual node image to coordinates in the whole plant image. This allows us to avoid duplicate detections. Some examples of predictions for each node in a plant image are shown in Figures 11 and 12 (see attached PDF).
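The mapping from node-image coordinates to whole-plant coordinates, and the removal of duplicates where node crops overlap, can be sketched as follows. This is an assumed approach for illustration (a greedy overlap-based filter); the report does not specify how duplicates are resolved, and the crop origins, boxes, and threshold are hypothetical.

```python
def iou(a, b):
    """Intersection-over-union of two (x_min, y_min, x_max, y_max) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def to_plant_coords(box, crop_origin):
    """Shift a box from node-crop coordinates to whole-plant image coordinates."""
    ox, oy = crop_origin
    x0, y0, x1, y1 = box
    return (x0 + ox, y0 + oy, x1 + ox, y1 + oy)


def deduplicate(detections, iou_threshold=0.5):
    """Keep the highest-scoring detection among heavily overlapping boxes."""
    kept = []
    for score, box in sorted(detections, key=lambda d: d[0], reverse=True):
        if all(iou(box, k[1]) < iou_threshold for k in kept):
            kept.append((score, box))
    return kept


# Hypothetical example: two overlapping node crops that both see the same flower.
crop_a_origin, crop_b_origin = (100, 200), (150, 200)
detections = [
    (0.9, to_plant_coords((60, 10, 80, 30), crop_a_origin)),
    (0.8, to_plant_coords((10, 10, 30, 30), crop_b_origin)),  # same flower
    (0.7, to_plant_coords((40, 40, 50, 50), crop_b_origin)),
]
unique = deduplicate(detections)
print(len(unique))
```

Counting the entries of `unique` then gives a per-image total that is not inflated by flowers appearing in more than one node crop.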
Objective 3 - Discover environmentally stable and region-specific genomic regions controlling flower abortion in diverse soil types, moisture, and climatic conditions
Texas Tech University
Floral organ abscission is an important process that regulates the detachment of flowers from the stem. In the well-characterized model species Arabidopsis, floral organ abscission involves four steps: initiation of the abscission zone (AZ), promotion of the AZ by ethylene, activation of separation, and deposition of a protective layer where organs have detached from the plant. We have shortlisted 6 genes in Arabidopsis (Blade on Petiole (BOP), KNAT (KNOX genes), BREVIPEDICELLUS 1 (BP1), INFLORESCENCE DEFICIENT IN ABSCISSION (IDA), HAE/HSL (leucine-rich repeat receptor-like kinase), and DNA BINDING WITH ONE FINGER 4.7 (DOF4.7)), which correspond to 27 orthologous genes in soybean involved in floral organ abscission. We also shortlisted genes that have been reported to play a role in floral organ abscission in addition to their known functions: ASYMMETRIC LEAVES1 (AS1), AGAMOUS-like 15 (AGL15), and FOREVER YOUNG FLOWER (FYF). Mutant alleles of these genes have shown significant effects on several stages of floral organ abscission. Lastly, the maturity loci E1-E4 play a significant role in the regulation of flowering in soybean. The J locus, an ortholog of AtELF3 (EARLY FLOWERING 3), is under the influence of E1. Functional analysis of mutant alleles of these genes showed an early flowering phenotype. The haplotype analysis for these genes is currently in progress; so far, it shows that some of the higher maturity group (MG) lines used in the current project (MG III, IV) retain one or more of the variant alleles. From this analysis, a group of lines in which large effect variants correlate with flowering traits (floral initiation and flower abortion) will be selected to identify causal genomic regions and thereby the underlying genes.
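The basic idea of a gene-based haplotype analysis can be sketched as below: lines are grouped by the combination of alleles they carry at a gene's variant sites, and a phenotype is summarized per group. This is a simplified illustration under stated assumptions, not the project's pipeline; the line names, alleles, and phenotype values are invented placeholders, and a real analysis would add statistical testing and account for population structure.

```python
from collections import defaultdict
from statistics import mean


def haplotype_means(genotypes, phenotypes):
    """Group lines by haplotype and summarize a trait per haplotype.

    genotypes:  {line_name: tuple of alleles at the gene's variant sites}
    phenotypes: {line_name: trait value, e.g. percent flower abortion}
    Returns {haplotype_string: (number_of_lines, mean_trait_value)}.
    Lines without a phenotype are skipped.
    """
    groups = defaultdict(list)
    for line, alleles in genotypes.items():
        if line in phenotypes:
            groups["".join(alleles)].append(phenotypes[line])
    return {hap: (len(vals), mean(vals)) for hap, vals in groups.items()}


# Placeholder data for illustration only (not project results).
genotypes = {
    "line_A": ("A", "T", "G"),
    "line_B": ("A", "T", "G"),
    "line_C": ("A", "C", "G"),
    "line_D": ("A", "C", "G"),
    "line_E": ("A", "C", "G"),  # no phenotype recorded yet
}
abortion_pct = {"line_A": 40.0, "line_B": 44.0, "line_C": 60.0, "line_D": 64.0}

summary = haplotype_means(genotypes, abortion_pct)
for hap, (n, avg) in sorted(summary.items()):
    print(hap, n, avg)
```

Haplotypes whose groups differ strongly in the trait would be candidates for the large effect variants mentioned above.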
Updated November 9, 2023:
Project report (Third quarter July 1, 2023, to September 30, 2023)
Project funded by North Central Soybean Research Program, sponsored by the Soy Checkoff
Project title - Field phenotyping using machine learning tools integrated with genetic mapping to address heat and drought induced flower abortion in soybean
Participating institutions – Texas Tech University, Kansas State University, University of Missouri, and University of Tennessee
Goals & Objectives
Long-term Goal – Develop soybean cultivars with 20 to 30% lower flower abortion under favorable to challenging environmental conditions, leading to about 10-15% increase in yield potential
Objectives (Year 1)
• Explore the genetic diversity in flower abortion under different soil moisture and climatic conditions using a large diversity panel
• Develop an image-based field phenotyping system and deep-learning tool to precisely document temporal dynamics in flower abortion and pod retention in genetically diverse soybeans
• Discover environmentally stable and region-specific genomic regions controlling flower abortion in diverse soil types, moisture, and climatic conditions
Progress achieved
Objective 1 - Explore the genetic diversity in flower abortion under different soil moisture and climatic conditions using a large diversity panel.
Texas Tech University
In early July, the soybean plants of all 228 lines were at the V2 developmental stage. In preparation for imaging and flower counting, several steps were taken. Labels were added to the plots, and measures were implemented for weed control. Additionally, cameras were mounted on the tractor for testing purposes (Figure 1). As the month progressed, the plants transitioned to the reproductive stage; this transition was delayed to around the 20th of July by stressful conditions, i.e., Lubbock had over 5 weeks of temperatures above 100 °F with no rain. Once the plants started to flower, manual flower counting and imaging began. Various aspects, including camera angles, lens types, and camera numbers, were systematically adjusted and tested on the tractor to determine the optimal position and speed for imaging. Some pictures were also taken to document the diversity of the genotypes.
To date, we have completed the 13th round of flower and pod counts for the 228 diverse genotypes. These counts were conducted every 4 to 5 days in conjunction with the imaging process. Most genotypes have now completed their flowering stage and have reached the R7 stage of development. As we approach the harvesting phase, a final round of imaging will be conducted on the dried plants to develop machine learning models for pod counting. Subsequently, each plot will be harvested manually (3 feet per genotype per rep). To ensure representative data, we will select the tagged plant used for flower counting, along with an additional four plants, for other yield-related parameters. From these five plants we will gather information on plant height, the number of branches, internode size, pod count, seeds per pod, 1000-seed weight, seed size, and grain weight per plant. Data from the tagged plant will be used as ground truth data for validating machine learning models for flower and pod numbers. Plants from a 3-foot row length will be used for yield determination. Lodging scores will be recorded at harvest. Figure 2 displays the current flower count progress at TTU, indicating the range in maximum flower counts across the 228 lines. Some of the very low numbers could be a result of rabbit damage.
In July 2023, Dr. Jagadish gave a radio interview on Dakota Farm Talk to highlight the project and the benefits that its progress will have for the US and global soybean industry.
On October 29th, 2023, Dr. Espíndola will deliver an oral presentation titled "Advancing Phenotyping for Flower Abortion in Soybeans through Image Analysis and Machine Learning" at the 2023 annual meeting of ASA-CSSA-SSSA in St. Louis, Missouri.
University of Missouri
Since it was highly challenging to obtain human help to physically count flowers on all 228 lines, all other participating locations selected a core set of 30 lines, based on genetic diversity, for manual flower counting. To date, we have completed the 7th round of flower counting for this core set of 30 lines in three replications (90 plots). These counts were conducted every 3 to 4 days in conjunction with the video imaging. Flower numbers are relatively consistent across the 3 replications of each genotype, and significant differences in flower number were observed among genotypes. All genotypes have finished flowering and have currently reached the R5 to R7 stages. As we approach the harvesting phase, a final round of imaging will be conducted on the dried plants for pod counting, together with manual pod counting. Subsequently, each plot of the 228 lines (684 plots) will be harvested manually (two center rows of 8 feet/row) to estimate seed yield. Seed harvest will start at the end of September, and the harvested seeds will be used for next year's planting at all locations. About 10 lines may not have enough seeds; these will be included in our winter nursery for seed increase.
University of Tennessee
At the University of Tennessee, imaging of the soybean plants was carried out using GoPro Hero11 cameras mounted on a Traxxas Hoss® 4x4 VXL conveyor (Figure 3). The GoPro cameras were set up based on camera parameters including FPS, image ratio, boost, high quality video mode, white balance, camera angles, lens types, and positioning (10-12 inches distance). Field phenotyping was carried out throughout the flowering period, during which flowers and pods were manually counted separately every 4 to 5 days. All 690 experimental plots were labeled. Plants were identified and tagged in each plot to facilitate manual counting of flowers and pods and imaging, which was initiated on July 27, 2023. An overhead shot showing the entire field, including all 228 genotypes, was taken using a UAS platform. Remarkable differences in foliage color were observed among the genotypes (Figure 4).
When most of the genotypes had completed their flowering stage and reached the R7 developmental stage (i.e., beginning maturity), we recorded another set of images, this time of the soybean pods for the 30 core lines (a considerable number of lines were completely defoliated). For plots with plants reaching R8 (full maturity), harvesting has already been initiated (Figure 5). Plants were harvested manually from an area of 10.8 ft² (~1 m²) within the two-row plots to calculate the final yield. For documenting the yield components and other morphological parameters, the tagged plant that was used for manual counting of flowers and pods during the season, along with 4 other plants in the same row, was sampled. The harvested soybean plants were threshed using the USDA single plant thresher, and the seeds collected from each plot were placed inside a labeled bag for quantifying yield. The growth stage of each soybean line was regularly monitored to estimate the harvest time. Plant height, number of branches, number of pods, seed number per plant, 100-seed weight, total seed weight per plot (1 m²), and final yield will be determined. Lodging scores were also recorded once during late August and will be recorded per soybean line at harvest.
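Scaling the seed weight from a roughly 1 m² harvest area up to a per-hectare or per-acre yield is a matter of unit conversion, sketched below. The plot seed weight used in the example is a hypothetical placeholder, the bushel conversion assumes the standard 60 lb soybean bushel, and no moisture adjustment is applied.

```python
SQFT_PER_M2 = 10.7639          # square feet in one square metre
KG_PER_BUSHEL_SOY = 27.2155    # 60 lb soybean bushel expressed in kg
HA_PER_ACRE = 0.404686         # hectares in one acre


def yield_kg_per_ha(seed_weight_g: float, harvested_area_m2: float) -> float:
    """Convert plot seed weight to kg/ha (1 g/m2 equals 10 kg/ha)."""
    return seed_weight_g / harvested_area_m2 * 10.0


def kg_per_ha_to_bu_per_acre(kg_ha: float) -> float:
    """Convert kg/ha to bushels per acre for soybean."""
    return kg_ha * HA_PER_ACRE / KG_PER_BUSHEL_SOY


harvested_area_m2 = 10.8 / SQFT_PER_M2  # the 10.8 ft2 harvest area, about 1 m2

# Hypothetical plot: 300 g of seed from the harvested area.
plot_kg_ha = yield_kg_per_ha(300.0, harvested_area_m2)
plot_bu_ac = kg_per_ha_to_bu_per_acre(plot_kg_ha)
print(f"{plot_kg_ha:.0f} kg/ha, {plot_bu_ac:.1f} bu/acre")
```

Because the harvested area is so small, any error in plot seed weight is multiplied by about 10,000 when scaled to a hectare, which is one reason replicated plots matter for yield estimates.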
We were able to release a podcast on the UTIAg website (available on Spotify for Podcasters) about our soybean flower abortion project. Find the link here:
https://podcasters.spotify.com/pod/show/utiag/episodes/Culture--Agriculture-Ep--4-Research-Could-Improve-Soybean-Yield-e277htq/a-aa5c7ku
Furthermore, we worked with the UTIA communication team to create and release a video about our current research project. To see the video, click here: https://www.youtube.com/watch?v=H5CVeWbiliU
The video will be broadcast via WBBJ and Nashville TV channels during late September 2023.
Finally, we put together a research abstract titled “Image-based field high throughput phenotyping for quantifying flower abortion in genetically diverse soybean germplasm” and submitted it to the 2023 ASA-CSSA-SSSA annual meeting. It will be presented in the CSSA section during the meeting in late October/early November 2023.
Objective 2 - Develop an image-based field phenotyping system and deep-learning tool to precisely document temporal dynamics in flower abortion and pod retention in genetically diverse soybeans.
Texas Tech University
In pursuit of Objective #2, Texas Tech has been working on dataset preparation for the flower detection model and implementing an algorithm for flower counting. These form the foundation for the successful development of our phenotyping system.
1. Dataset Preparation for Flower Detection Model Development:
1.1 Automated Video Frame Extraction
One of our initial tasks was to develop an automated pipeline to extract unique frames from videos captured at various locations. This pipeline streamlined the data collection process and ensured a consistent dataset for analysis.
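One simple way to keep only distinct frames from a video is to skip frames that differ too little from the last frame kept. The sketch below illustrates that idea only; the report does not describe how uniqueness was defined in the actual pipeline, so the difference measure and threshold are assumptions. Frames are represented as flat lists of grayscale values to keep the example self-contained; in practice they would come from a video decoding library.

```python
def mean_abs_diff(frame_a, frame_b):
    """Mean absolute pixel difference between two equally sized frames."""
    return sum(abs(x - y) for x, y in zip(frame_a, frame_b)) / len(frame_a)


def select_unique_frames(frames, min_diff=10.0):
    """Return indices of frames that differ enough from the last kept frame."""
    kept = []
    last = None
    for index, frame in enumerate(frames):
        if last is None or mean_abs_diff(frame, last) >= min_diff:
            kept.append(index)
            last = frame
    return kept


# Tiny synthetic "video": each frame is a flat list of 4 grayscale pixels.
video = [
    [0, 0, 0, 0],
    [1, 1, 1, 1],          # nearly identical to frame 0, skipped
    [50, 50, 50, 50],
    [52, 52, 52, 52],      # nearly identical to frame 2, skipped
    [120, 120, 120, 120],
]
print(select_unique_frames(video))
```

A threshold that is too low keeps near-duplicate frames and inflates annotation effort, while one that is too high can drop frames showing new plants.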
1.2 Dataset Compilation and Annotation
We compiled a new dataset consisting of 1314 images from four diverse locations, namely Missouri, Tennessee, Texas, and Kansas. In collaboration with annotation teams from Texas Tech and Kansas State University, these images were annotated for flower detection. Before these annotated images can be used for model development, they need to be validated by a domain expert, which involves confirming, removing, or adding annotations to enhance the dataset's quality and consistency.
To date, 1314 images (1037 from the previous dataset and 277 from the new dataset) have been validated by Dr. Espíndola, our domain expert and the post-doctoral fellow on the project, encompassing 9367 confirmed flower annotations.
Furthermore, an additional 1037 annotated images are currently undergoing validation. This expansion aims to increase dataset diversity and enable the development of better-generalized models.
2. Flower Counting Through Tracking:
Accurate flower counting in captured videos is essential for Objective #2. To achieve this, we focused on tracking detected flowers across frames to prevent overcounting.
2.1 Implementation and Modification of Tracking Algorithms
We implemented and modified three state-of-the-art multi-object tracking algorithms: SORT, OC-SORT, and OC-SORT with Byte. These algorithms were selected for evaluation in the context of flower counting.
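Why tracking prevents overcounting can be shown with a deliberately simplified tracker. The sketch below is not SORT or OC-SORT (which rely on motion models such as Kalman filtering); it is a greedy overlap-based stand-in with hypothetical thresholds, meant only to show how detections of the same flower in successive frames collapse into one count.

```python
def iou(a, b):
    """Intersection-over-union of two (x_min, y_min, x_max, y_max) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def count_flowers(frames, iou_threshold=0.3, max_missed=5):
    """Count distinct flowers across frames with a greedy IoU tracker.

    frames: list of per-frame detection lists (boxes only).
    A track survives up to `max_missed` frames without a match, which
    tolerates short occlusions. Returns the number of tracks created.
    """
    tracks = []  # each track: {"box": last matched box, "missed": frames unseen}
    total = 0
    for boxes in frames:
        unmatched = list(boxes)
        for track in tracks:
            best = max(unmatched, key=lambda b: iou(track["box"], b), default=None)
            if best is not None and iou(track["box"], best) >= iou_threshold:
                track["box"], track["missed"] = best, 0
                unmatched.remove(best)
            else:
                track["missed"] += 1
        tracks = [t for t in tracks if t["missed"] <= max_missed]
        for box in unmatched:  # anything left starts a new track
            tracks.append({"box": box, "missed": 0})
            total += 1
    return total


# Hypothetical clip: one flower drifts right and is hidden in the middle
# frame; a second flower appears in the last frame.
clip = [
    [(0, 0, 10, 10)],
    [],
    [(2, 0, 12, 10), (50, 50, 60, 60)],
]
print(count_flowers(clip), sum(len(f) for f in clip))
```

Summing raw detections over the clip gives 3, while counting tracks gives 2, which is the number of distinct flowers in this example.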
2.2 Annotation of Tracking Data
Evaluating these algorithms required annotating a series of consecutive frames in a video. This process was challenging and time-consuming, as it necessitated identifying and tracking flowers through frames, even in the presence of occlusions. We annotated 211 consecutive frames from a Kansas field video, encompassing a total of 35 flowers. This segment was chosen for its complexity, involving both long-term and short-term occlusions.
2.3 Algorithm Evaluation
We evaluated the three tracking algorithms with various parameter combinations. Surprisingly, our findings indicate that the choice of algorithm is not the critical factor for accurate tracking and counting of flowers. All three algorithms yielded accurate results when specific parameter settings were used.
To validate and consolidate our conclusions, further videos need to be annotated for tracking and subsequently used for evaluation.
Kansas State University
In this phase of the project, we made a significant shift in our approach to flower detection. Specifically, we switched from models for node detection followed by flower/pod detection to a single model that performs flower detection directly on full images or frames extracted from videos taken in the greenhouse. This shift was motivated by a preliminary exploration of the flower model on full greenhouse images, which showed that the model was capable of detecting flowers directly in those images. Furthermore, by training a flower detection model without the need for node detection, we aimed to streamline our workflow and improve efficiency.
To train an accurate model on full images, we labeled flowers in a set of 1200 images/frames based on guidelines from the TTU domain experts. The images that we labeled were taken by the K-State team in the greenhouse at the beginning of the flowering season, and exhibited many buds and small flowers. We used 800 labeled images for training a new model, 300 images for development and 100 images for testing. Some examples of the model's predictions are shown below. As can be seen, the model can accurately detect flowers directly in the greenhouse images, without the need to detect and extract the nodes in the first place.
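The 800/300/100 partition can be reproduced with a seeded shuffle, as in the sketch below. The report does not state how images were assigned to each subset, so the random assignment, the function name, and the seed are assumptions.

```python
import random


def split_dataset(image_ids, n_train=800, n_dev=300, n_test=100, seed=0):
    """Shuffle image ids reproducibly and cut them into train/dev/test subsets."""
    ids = list(image_ids)
    if len(ids) != n_train + n_dev + n_test:
        raise ValueError("subset sizes must add up to the number of images")
    random.Random(seed).shuffle(ids)  # seeded so the split can be regenerated
    train = ids[:n_train]
    dev = ids[n_train:n_train + n_dev]
    test = ids[n_train + n_dev:]
    return train, dev, test


# Hypothetical image identifiers standing in for the 1200 labeled frames.
train, dev, test = split_dataset(f"img_{i:04d}" for i in range(1200))
```

When frames are sampled from videos, it may be preferable to assign whole videos to a subset rather than individual frames, so that near-duplicate frames do not appear in both training and testing.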
While the model worked well on images and frames from videos taken in the greenhouse early in the flowering season, we encountered challenges when attempting to apply the model to field images (Figure 6). The model's performance suffered because the flowers presented significant differences in their characteristics (including color, shape and texture), as compared with the flowers in the greenhouse images. To account for such differences, we needed to enhance the labeled dataset by incorporating additional frames from videos taken at various flowering stages, which captured a large variety of flowers. A total of 1200 new frames sampled from videos from all four institutions were annotated and added to the original dataset. The original model was fine-tuned with the additional images and showed good performance overall in our testing as can be seen below. However, the performance on blurry frames and frames from videos taken at higher speed can still be improved.
As we are moving towards annotating pods in the next quarter of the project, and we may also need to annotate more flowers, we have also started to explore the use of large pre-trained foundation models, such as the recent Segment Anything model, to annotate images in a zero-shot setting with human-in-the-loop to improve its annotations. We have also worked on a script to identify differences between ground truth bounding boxes and predicted bounding boxes, with the goal of identifying mistakes in the human annotations as well as identifying challenging images that can help improve the robustness of the model.
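One way such a comparison script could work is sketched below: ground truth and predicted boxes are matched one-to-one by IoU, and whatever remains unmatched is flagged for review. This is an illustrative sketch and not the project's script; the function names and the 0.5 threshold are assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def diff_boxes(ground_truth, predictions, iou_threshold=0.5):
    """Match boxes one-to-one by descending IoU and report what is left over.

    Returns (matches, missed_gt, extra_pred):
      matches    - list of (gt_index, pred_index) pairs with IoU >= threshold
      missed_gt  - annotated boxes with no prediction (model miss, or spurious label)
      extra_pred - predictions with no annotation (false positive, or missing label)
    """
    candidates = sorted(
        ((iou(g, p), gi, pi)
         for gi, g in enumerate(ground_truth)
         for pi, p in enumerate(predictions)),
        reverse=True,
    )
    used_gt, used_pred, matches = set(), set(), []
    for score, gi, pi in candidates:
        if score < iou_threshold:
            break  # remaining candidates overlap too little to count as a match
        if gi in used_gt or pi in used_pred:
            continue
        used_gt.add(gi)
        used_pred.add(pi)
        matches.append((gi, pi))
    missed_gt = [gi for gi in range(len(ground_truth)) if gi not in used_gt]
    extra_pred = [pi for pi in range(len(predictions)) if pi not in used_pred]
    return matches, missed_gt, extra_pred


# Hypothetical boxes: one agreement, one unmatched annotation, one unmatched prediction.
gt = [(0, 0, 10, 10), (20, 20, 30, 30)]
pred = [(1, 1, 10, 10), (40, 40, 50, 50)]
matches, missed_gt, extra_pred = diff_boxes(gt, pred)
```

Images with many entries in `missed_gt` or `extra_pred` are candidates for a second look by an annotator, since a disagreement can come from either the model or the label.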
University of Missouri
We used GoPro cameras to take 3 rounds of videos of the selected core set of 30 lines. We encountered several challenges, including maintaining a steady walking speed with the camera, shading from soybean branches and leaves, and plant lodging. The group is designing a uniform imaging platform based on the experience from this year's field studies, aiming to standardize walking speed and avoid shade from leaves. Meanwhile, we are recording lodging scores and maturity dates for the whole set of 228 lines, which will be used to select upright genotypes (low lodging) with similar maturity dates for next year's field studies. Initial observations indicated that maturity group (MG) III lines showed significant lodging issues compared to the MG IV lines. We will discuss with the group whether to focus on the set of lines that had minimal lodging across locations. In this way, we can address the other 2 major issues, which are caused by plant lodging and differing peak flowering times.
Objective 3 - Discover environmentally stable and region-specific genomic regions controlling flower abortion in diverse soil types, moisture, and climatic conditions.
In our prior analysis, we meticulously selected six pivotal genes that are well-documented in their roles pertaining to the initiation of the abscission zone (AZ), the facilitation of AZ development through ethylene signaling, the activation of tissue separation mechanisms, and the subsequent deposition of protective layers following organ detachment from the plant. Furthermore, we incorporated genes known to be involved in soybean maturity and flowering processes. To perform a robust gene-based clustering analysis and to identify alleles with substantial effects, we leveraged a state-of-the-art gene-haplotype analysis framework, which was executed on the high-performance computing servers at TTU. In a preliminary study, we executed the gene-based haplotype analysis on a cohort comprising 481 lines as a means of testing our analytical pipeline using major flowering genes. During this analysis, we successfully pinpointed four significant haploblocks, with particular emphasis on haploblocks H1 and H4 (as illustrated in Figure 7), which exhibited pronounced allelic variations possessing substantial effects on the observed traits. While the haplotype analysis of additional genes remains an ongoing endeavor, our ultimate objective is to compare the lines that overlap with these haploblocks to field data, especially flower number and aborted flowers. Following data collection from all locations, we will overlay the phenotypic data with the haplotype analysis to identify the most diverse accessions for further analysis.
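The idea of overlaying phenotypic data on haplotypes can be sketched as follows: lines are grouped by their allele combination at the SNPs within a gene, and a trait is summarized per group. This is a simplified illustration and not the gene-haplotype framework run at TTU; the function name, the line names, the alleles, and the trait values are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def haplotype_trait_means(genotypes, trait):
    """Group lines by haplotype and summarize a trait for each group.

    genotypes: line name -> sequence of alleles at the gene's SNP positions
    trait:     line name -> phenotype value (e.g. flower abortion in percent)
    Returns haplotype string -> (number of lines, mean trait value).
    """
    groups = defaultdict(list)
    for line, alleles in genotypes.items():
        if line in trait:  # skip lines without phenotype data
            groups["".join(alleles)].append(trait[line])
    return {hap: (len(values), mean(values)) for hap, values in groups.items()}


# Hypothetical data: four lines, three SNPs in one gene, abortion in percent.
genotypes = {
    "line_A": ("A", "G", "T"),
    "line_B": ("A", "G", "T"),
    "line_C": ("C", "G", "A"),
    "line_D": ("C", "G", "A"),
}
abortion = {"line_A": 30.0, "line_B": 40.0, "line_C": 60.0, "line_D": 70.0}
summary = haplotype_trait_means(genotypes, abortion)
```

A large difference in mean trait value between haplotype groups would mark the gene as a candidate for follow-up, subject to proper statistical testing and correction for population structure.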
Updated December 17, 2023:
Project report (Fourth quarter October 1, 2023, to December 31, 2023)
Project funded by North Central Soybean Research Program
Project title - Field phenotyping using machine learning tools integrated with genetic mapping to address heat and drought induced flower abortion in soybean
Participating institutions – Texas Tech University, Kansas State University, University of Missouri, and University of Tennessee
Goals & Objectives
Long-term Goal – Develop soybean cultivars with 20 to 30% lower flower abortion under favorable to challenging environmental conditions, leading to about 10-15% increase in yield potential
Objectives (Year 1)
• Explore the genetic diversity in flower abortion under different soil moisture and climatic conditions using a large diversity panel
• Develop an image-based field phenotyping system and deep-learning tool to precisely document temporal dynamics in flower abortion and pod retention in genetically diverse soybeans
• Discover environmentally stable and region-specific genomic regions controlling flower abortion in diverse soil types, moisture, and climatic conditions
Progress achieved
Objective 1 - Explore the genetic diversity in flower abortion under different soil moisture and climatic conditions using a large diversity panel.
Note – All data presented on flower abortion is preliminary as the teams are yet to have thorough discussions on the approach taken to ensure the percentage abortion presented is confirmed.
Texas Tech University
Soybean harvesting started on September 22nd and concluded on October 18th (Figure 1 – Please refer to the attached PDF for all figures and images). Figures 2 and 3 showcase the data gathered from the field for flower and pod counts. Other measurements including yield, pods per node, number of seeds per plant, and 1000 seed weight, are currently in progress and will be reported upon completion.
In Figure 2, the data illustrates that among the 30 genotypes, the average flower abortion rate is approximately 47%. Meanwhile, Figure 3 presents the percentage variation in abortion across 161 genotypes exclusively analyzed at Texas Tech University. This range spans from 20% to 80% among the different genotypes. The temperatures and precipitation at the experimental farm during the trial are presented in Figure 4.
University of Missouri
Using phylogenetic analysis, a core set of 30 lines that represented the genetic diversity was selected for manual flower/pod counting and imaging. We counted flowers on 9 occasions at intervals of 3 to 4 days during the flowering stage. We performed the final pod counting in October to estimate the flower abortion rates of the selected 30 lines. Preliminary data on the flower and pod counts and the related abortion percentages are presented in Figure 5. The average flower abortion rate at Missouri is approximately 50%, ranging between 37% and 62%. Videos of these 30 lines were taken on all 9 occasions during flowering and shared with the group to optimize the ML-based automatic flower counting platform.
The entire diverse panel of 280 lines was harvested from the field (center 2 rows). The harvested plants will be threshed to estimate the yield of these lines. The yield data will be used to correlate with flower abortion rates. The temperatures and precipitation at the experimental farm during the trial are presented in Figure 6.
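As an illustration of how an abortion percentage can be derived from the counts, the sketch below assumes that abortion is the share of counted flowers that did not become pods at the final count. The report notes that the approach is still under discussion among the teams, so this formula, the function name, and the example numbers are assumptions, not the confirmed method.

```python
def abortion_percent(total_flowers, final_pods):
    """Percentage of flowers that did not develop into pods (assumed definition)."""
    if total_flowers <= 0:
        raise ValueError("total_flowers must be positive")
    if not 0 <= final_pods <= total_flowers:
        raise ValueError("final_pods must lie between 0 and total_flowers")
    return 100.0 * (total_flowers - final_pods) / total_flowers


# Hypothetical plant: 120 flowers counted over the season, 60 pods at the final count.
example = abortion_percent(120, 60)
```

How the season total is obtained from repeated counts (for example, whether flowers seen on consecutive visits are counted once or twice) affects the result and is outside the scope of this sketch.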
University of Tennessee
Harvesting of soybean plots at the University of Tennessee at the West TN Res. and Edu. Center (WTREC) started on September 14th (Figure 7). Plants were harvested manually from a 1 m² area within the two-row plots, and the total seed weight was recorded to calculate the yield in kg/acre (Figure 8). The tagged plant that was used for manual counting of flowers and pods, along with 4 other plants within the same row, was collected using a burlap fabric roll. Morphological characters such as plant height and number of branches, as well as yield component parameters including number of pods plant⁻¹ and number of seeds plant⁻¹, were recorded before threshing. All harvested plants were threshed using the USDA single plant thresher, and the collected seeds per plot were then weighed to determine yield. All 90 plots were harvested and data collection is near completion. Figure 9 shows the rate of flower abortion of the soybean genotypes. The abortion rate among the 30 genotypes was up to 29%. The temperature and precipitation at the experimental farm during the trial are presented in Figure 10.
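The conversion from seed weight on a sampled area to kg/acre can be sketched as below. It assumes the seed weight is recorded in grams and uses the standard value of 4046.8564224 m² per acre; the function name and the example weight are hypothetical, and no moisture adjustment is applied.

```python
SQUARE_METERS_PER_ACRE = 4046.8564224  # international acre


def yield_kg_per_acre(seed_weight_g, sampled_area_m2=1.0):
    """Scale seed weight from a sampled area up to kilograms per acre."""
    if sampled_area_m2 <= 0:
        raise ValueError("sampled_area_m2 must be positive")
    kg_per_m2 = (seed_weight_g / 1000.0) / sampled_area_m2
    return kg_per_m2 * SQUARE_METERS_PER_ACRE


# Hypothetical plot: 300 g of seed harvested from 1 square meter.
example_yield = yield_kg_per_acre(300.0)
```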
Kansas State University
Flower and pod counts were completed this quarter on the core set of 30 genotypes. At the end of the growing season, single plants used for flower and pod counts from this core set were harvested. Number of pods, nodes, and seeds per pod, along with total seeds, 100 seed weight and total seed weight were determined for each of the single plants. Those counts have been completed. The data is now being evaluated for quality prior to analysis. Based on a preliminary analysis of the flower and pod counts, we observed about a 40% difference in the relative flower abortion in the core set, with abortion ranging between 22% and 70%. This needs to be confirmed and compared to the results at the other locations. Seed yield, plant maturity, lodging and height were taken on the entire panel during September, October and early November. At harvest, these plots were threshed with a stationary thresher. The harvested seed is now being cleaned and weighed to measure final seed yield. Videos taken weekly of the developing plants are now being inventoried and labeled to evaluate the relationship between the flower counts in the field throughout the season, and the detection of the flowers in the videos.
Objective 2 - Develop an image-based field phenotyping system and deep-learning tool to precisely document temporal dynamics in flower abortion and pod retention in genetically diverse soybeans.
Texas Tech University
In this quarter, our focus has been on advancing the development of a customized Multi-Object Tracking (MOT) algorithm specifically tailored for counting soybean flowers in the field, aligning with the overarching objective of creating an image-based field phenotyping system.
1. Tracking Dataset Preparation, Continued
As mentioned in the previous report, preparing a dataset for the evaluation of MOT algorithms is quite challenging and time-consuming. This intricate task involves identifying and tracking individual flowers across an extensive sequence of consecutive frames, a laborious process exacerbated by the presence of long-term occlusions. This endeavor requires precision and attention to detail, as each flower needs to be tracked separately.
Our tracking dataset has expanded from one video (211 frames) to five videos (1,382 frames), comprising 22,606 individual flower annotations. These videos, hailing from diverse locations such as Kansas, Tennessee, Missouri, and Texas, enrich our dataset with varied environmental conditions and soybean varieties. This growth enhances the representativeness and applicability of our dataset for robust algorithm evaluation and refinement.
It's important to note that well-constructed MOT datasets are inherently scarce, given the complexities involved. Crafting a MOT dataset tailored specifically for Soybean Flower counting adds an extra layer of rarity. Even in its current state, we consider our dataset highly valuable, recognizing its uniqueness within the research landscape.
Our commitment to dataset expansion remains steadfast. The addition of more diverse videos is key to enhancing the accuracy of our algorithm evaluations, and we will continue this effort in the upcoming phases of our research.
2. Tracking and Counting Evaluation Method
We've delved into diverse evaluation methods for our tracking and counting algorithms. Specifically, we've implemented two approaches:
1. A dedicated approach for assessing the accuracy of counting flowers.
2. An evaluation method gauging the quality of flower tracking across frames. This method incorporates various metrics, with a particular emphasis on achieving high tracking accuracy. Inspired by the widely recognized HOTA paper by Luiten et al. (2020), this approach is instrumental in refining our counting performance.
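For reference, HOTA combines a detection score (DetA) with an association score (AssA) as their geometric mean. The sketch below computes these quantities at a single localization threshold, assuming the matching between ground truth and predicted boxes has already been done for each frame. It is a simplified illustration and not the project's evaluation code: the full metric also averages over a range of thresholds, and the input format and function name are assumptions.

```python
from collections import Counter
from math import sqrt


def hota_single_threshold(frames):
    """Return (HOTA, DetA, AssA) for pre-matched frames at one threshold.

    Each frame is a dict with:
      "gt":      list of ground-truth track ids present in the frame
      "pred":    list of predicted track ids present in the frame
      "matches": list of (gt_id, pred_id) pairs counted as true positives
    """
    gt_total, pred_total, pair_tp = Counter(), Counter(), Counter()
    tp = fn = fp = 0
    for frame in frames:
        gt_total.update(frame["gt"])
        pred_total.update(frame["pred"])
        pair_tp.update(frame["matches"])
        matched = len(frame["matches"])
        tp += matched
        fn += len(frame["gt"]) - matched
        fp += len(frame["pred"]) - matched
    if tp == 0:
        return 0.0, 0.0, 0.0
    det_a = tp / (tp + fn + fp)
    # For a true positive with ids (g, p): TPA is the number of frames where g
    # and p are matched together, FNA = gt_total[g] - TPA, FPA = pred_total[p] - TPA.
    # Each of the n true positives of a pair contributes n / (TPA + FNA + FPA).
    ass_sum = sum(
        n * n / (gt_total[g] + pred_total[p] - n) for (g, p), n in pair_tp.items()
    )
    ass_a = ass_sum / tp
    return sqrt(det_a * ass_a), det_a, ass_a


# Hypothetical case: one flower detected in all 4 frames, but its predicted
# identity switches from 7 to 8 halfway through, so it would be counted twice.
frames = (
    [{"gt": [1], "pred": [7], "matches": [(1, 7)]}] * 2
    + [{"gt": [1], "pred": [8], "matches": [(1, 8)]}] * 2
)
hota, det_a, ass_a = hota_single_threshold(frames)
```

In this example detection is perfect (DetA of 1.0) while association is halved by the identity switch. This is the kind of error that leads to overcounting, and an accuracy measure based only on detections would not reveal it.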
3. Tracking for Counting Algorithm Development
Expanding on our algorithm development, we've incorporated two additional state-of-the-art tracking algorithms, ByteTrack and DeepSORT, alongside our previous implementations of SORT, OC-SORT, and OC-SORT with Byte. The inclusion of DeepSORT is particularly noteworthy, introducing deep learning and a specialized neural network for tracking. While still in the active exploration phase, our preliminary findings indicate an unexpected trend: the integration of deep learning appears to be adversely affecting the performance of our tracking algorithm. These initial results carry significant implications and shed light on the current state of tracking algorithms. Our commitment to rigorously evaluating and investigating these outcomes remains paramount, guiding our next steps in algorithm refinement and optimization.
In summary, our investigation into tracking algorithms for counting soybean flowers is groundbreaking. While previous studies focus on detecting soybean flowers from still images, our work stands out by offering a novel solution to use detection algorithms for large-scale flower counting. This unique contribution addresses a gap in existing research, providing a practical approach to advance soybean phenotyping.
Kansas State University
Recognizing the intricate shape of soybean pods, we have opted for an instance segmentation method as opposed to bounding box object detection, for the task of identifying and counting the pods. The instance segmentation approach enables us to obtain precise segmentation masks of the pods, ensuring a more accurate representation of their complex structures.
Addressing the challenge of limited labeled data, we have devised a strategy of utilizing images extracted from field videos captured at various stages of the soybean growing phase. These images are subsequently annotated using AnyLabeling, a powerful tool driven by the Segment Anything Model (SAM) developed by Meta. This innovative tool allows us to generate precise masks by leveraging weakly supervised prompts, mitigating the need for laborious labeling of segmentation masks.
Through the utilization of AnyLabeling, we have successfully annotated approximately 300 frames of images extracted from videos recorded across all four locations. This annotation process not only facilitates the accurate identification of soybean pods, but also contributes to the enrichment of our dataset with diverse and representative samples. The incorporation of weakly supervised prompts, coupled with the efficiency of SAM, empowers our annotation process, ensuring that the generated masks accurately delineate soybean pods in their varying stages of development. This meticulous annotation approach enhances the robustness and reliability of our dataset, laying the foundation for more accurate and comprehensive analyses of soybean phenotypic traits.
The 300 annotated frames were split into three subsets, used for model training (Train), model development/hyper-parameter tuning (Valid) and model evaluation (Test), as shown below.
The performance of the current trained model is shown below in terms of average precision (AP), average precision at 50% IoU (AP50) and average precision at 75% IoU (AP75). Some samples of annotated frames are shown below (Figure 11), together with the original un-annotated frames.
In addition to pod segmentation, we have also worked on tracking the soybean pods. We are currently using multi-object trackers and fine-tuning the trackers, so that we can use them on field-level videos (Figure 12).
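For instance segmentation, the IoU underlying AP50 and AP75 is computed between masks rather than boxes: a predicted pod mask can only count as correct at AP50 if it overlaps a ground-truth mask with IoU of at least 0.5, and at AP75 if the IoU is at least 0.75. A minimal sketch of mask IoU follows, using nested lists of 0/1 values; the function name and the toy masks are hypothetical.

```python
def mask_iou(mask_a, mask_b):
    """Intersection over union of two binary masks of the same shape."""
    inter = union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            if a and b:
                inter += 1  # pixel belongs to both masks
            if a or b:
                union += 1  # pixel belongs to at least one mask
    return inter / union if union else 0.0


# Hypothetical 2x3 masks that share one column of pixels.
predicted = [[1, 1, 0],
             [1, 1, 0]]
annotated = [[0, 1, 1],
             [0, 1, 1]]
overlap = mask_iou(predicted, annotated)
passes_ap50 = overlap >= 0.5
```

Because elongated or curved pods fill only part of their bounding box, mask IoU penalizes poorly delineated shapes more directly than box IoU does.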
Objective 3 - Discover environmentally stable and region-specific genomic regions controlling flower abortion in diverse soil types, moisture, and climatic conditions
Previously, we selected key soybean homologs involved in flower abortion, including genes for initiation of the abscission zone (AZ) and promotion of AZ development by ethylene, and performed haplotype analysis. We identified two major and two minor haplotypes for one of the transcription factors (GmRNI) involved in flower organ abscission (Figure 13). The major haplotype carries two alleles and showed higher allelic diversity in wild accessions (Figure 13B). Most interestingly, this gene is expressed during the R1 flowering stage in multiple soybean accessions, suggesting a critical role in flower development and probably in floral abscission (Figure 14). We are currently performing additional analyses to identify allelic variants in a subset of accessions that were selected from Year 1.