Project Details:

Utilizing Unique Genetic Diversity to Combine Elevated Protein Concentration with High Yield in New Varieties and Experimental Lines (1720-152-0104)

Parent Project: This is the first year of this project.
Checkoff Organization: United Soybean Board
Categories: Breeding & genetics, Seed composition
Organization Project Code: 1720-152-0104
Project Year: 2017
Lead Principal Investigator: Steven Clough (USDA/ARS-University of Illinois)
Co-Principal Investigators:
David Hyten (Iowa State University)
Asheesh Singh (Iowa State University)
William Schapaugh (Kansas State University)
Brian Diers (University of Illinois at Urbana-Champaign)
Aaron Lorenz (University of Minnesota)
Hari Krishnan (University of Missouri)
George Graef (University of Nebraska)
Rusty Smith (USDA/ARS-University of Illinois)
Keywords: exotic germplasm, Protein, QTL

Information and Results

Final Project Results

1) Develop and release experimental lines and varieties that are higher in protein concentration and yield than current varieties.

In 2017, the project released LD13-1419, which had 2 percentage points greater protein than the average of the checks in the 2016 SCN Preliminary Test II, and LD11-2170, which had 1.1 percentage points greater protein than the average of the checks in the 2016 SCN Uniform Test III.

Analysis of the samples from the 2016 diversity tests found that most entries exceeded 35% seed protein concentration, with many in the 37-38% range. These preliminary data indicate that we have a potentially rich and diverse source of germplasm for increasing seed protein concentration in high-yield soybean lines. Thirty-one of the best lines were retested in 2017, but those data are not yet available. An additional 148 experimental lines in maturity groups 0 to IV were evaluated in cooperation with four commercial soybean breeding companies. The top 50% of these lines for yield will be analyzed for seed composition using samples from 5 locations.
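The selection step described above (keep the top 50% of entries by yield before running seed composition analysis) can be sketched as follows. This is only a minimal illustration: the record layout, the field name `yield_bu_ac`, and the example entries are assumptions made for the sketch, not project data or the project's actual workflow.

```python
def top_half_by_yield(lines):
    """Return the top 50% of entries, ranked by yield (highest first)."""
    ranked = sorted(lines, key=lambda rec: rec["yield_bu_ac"], reverse=True)
    keep = len(ranked) // 2  # e.g. 148 entries -> 74 kept
    return ranked[:keep]

# Placeholder records: names and yields are invented for illustration only.
example = [
    {"name": "line_A", "yield_bu_ac": 61.0},
    {"name": "line_B", "yield_bu_ac": 58.5},
    {"name": "line_C", "yield_bu_ac": 64.2},
    {"name": "line_D", "yield_bu_ac": 55.0},
]

selected = top_half_by_yield(example)
print([rec["name"] for rec in selected])
```

Applied to the 148 cooperative entries, a rule like this would send 74 lines forward for seed composition analysis.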

Approximately 100 new crosses were made to combine improved protein composition with high yield, and over 6000 new experimental lines were evaluated to identify lines with increased protein concentration and improved yield.

2) Characterize high protein sources for the presence of the major protein gene on chromosomes 15 and 20.

To refine the genetic map resolution of the protein gene on chromosome 15 (cqSeed protein-001), 1000 plants segregating for this gene were characterized to identify those with genetic recombinations close to the gene. Out of the 1000 plants, 153 were selected that have recombinations in 11 intervals spanning the region in which the gene was previously mapped. The selected plants were harvested and threshed, and approximately 100 seeds from each of 33 selected plants were planted in the field in 2017. DNA was extracted from each plant and tested with markers in the region surrounding the gene. Seeds from these plants will be analyzed for protein and oil concentration, and those values will be associated with the DNA markers to determine precisely which segment of the chromosome is related to increased protein concentration.
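The idea of associating seed protein values with marker genotypes in recombinant plants can be illustrated with a simple single-marker comparison. The sketch below is an assumption-laden toy, not the project's analysis: the marker names (`M1` to `M4`), the allele codes (`H` for the high-protein parent allele, `L` for the other), and all protein values are hypothetical.

```python
from statistics import mean

MARKERS = ["M1", "M2", "M3", "M4"]  # hypothetical marker names, in map order

def make_plant(calls, protein):
    """Pair one allele call per marker ('H' or 'L') with a seed protein value."""
    return {"geno": dict(zip(MARKERS, calls)), "protein": protein}

# Toy recombinant plants: every value here is invented for illustration.
PLANTS = [
    make_plant("HHLL", 34.0),
    make_plant("LLHH", 36.0),
    make_plant("HHHL", 36.2),
    make_plant("LLLH", 34.1),
    make_plant("HLLL", 33.9),
    make_plant("LHHH", 36.1),
]

def marker_effects(plants, markers):
    """Mean protein of 'H' plants minus mean protein of 'L' plants, per marker."""
    effects = {}
    for m in markers:
        high = [p["protein"] for p in plants if p["geno"][m] == "H"]
        low = [p["protein"] for p in plants if p["geno"][m] == "L"]
        # A marker with only one genotype class carries no contrast.
        effects[m] = mean(high) - mean(low) if high and low else None
    return effects

effects = marker_effects(PLANTS, MARKERS)
# The marker with the largest absolute contrast lies closest to the gene.
best = max((m for m in MARKERS if effects[m] is not None),
           key=lambda m: abs(effects[m]))
print(effects)
print("largest effect at:", best)
```

In this toy data the contrast peaks at `M3`, so the recombination breakpoints on either side of that marker would bound the candidate segment. A real analysis would use replicated phenotypes and a formal statistical test rather than a raw difference of means.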

A candidate gene for the protein QTL on chromosome 20 (cqSeed protein-003) was identified. This is the most common allele in high protein soybean lines. DNA has been extracted from 200 high protein accessions from the USDA Soybean Germplasm Collection, and work is in progress to determine whether cqSeed protein-003 is present in these lines. Our goal is to identify high protein lines without this high protein QTL as an alternative source of high protein that could have a smaller negative effect on yield.

3) Map new genes for high protein concentration from Glycine tomentella.

We mapped 3 QTL associated with protein concentration in populations from crosses between high protein lines that were themselves derived from crosses between soybean and G. tomentella. All of these QTL overlap with genomic regions previously identified in high protein soybean germplasm. These lines with 40 soybean chromosomes were derived from lines with 42 chromosomes (40 from soybean and 2 from G. tomentella). Perhaps more significantly, in related research we discovered that the variation in these types of lines may have come from enhanced cross pollination and not from G. tomentella. There are still many unanswered questions about these types of G. tomentella-derived lines, but it is highly likely that the genes responsible for the high protein concentration are not from G. tomentella.

4) Characterize protein composition and amino acid profiles of G. tomentella-derived lines and selected experimental soybean lines.

This research confirmed what we found in the mapping research. We completed the 2-D gel analyses of seed proteins of 15 Glycine tomentella-derived high protein lines and compared their protein profiles to those of their parents, Dwight and G. tomentella. Computer-assisted comparison of the 2-D protein profiles indicates that all 15 Glycine tomentella-derived high protein lines matched the protein profile of Dwight exactly. No introgression of G. tomentella proteins was detected in these hybrids, indicating that G. tomentella made little or no contribution to their seed protein composition. Our analyses confirmed an overall 5 to 8% increase in all seed protein spots in these hybrids compared to Dwight. Based on these observations, we conclude that the overall increase in protein content in the Glycine tomentella-derived high protein lines is due not to a preferential increase in specific seed proteins but to a uniform increase in all seed proteins.

We have also examined the accumulation of Bowman-Birk protease inhibitor (BBi), a cysteine-rich protein. The abundance of this protein in soybean seeds can be used as a marker for the relative concentration of sulfur-containing amino acids in the seeds. Differences in the accumulation of BBi were detected among the 15 Glycine tomentella-derived high protein lines. Because the accumulation of BBi is influenced by environmental conditions, the differences observed among these 15 lines should be confirmed in seeds grown in multiple environments.

5) Improve seed quality in the early planting soybean production system (ESPS).

Poor seed quality of soybeans grown in the early planting soybean production system can reduce protein concentration. The purpose of this research is to develop high, heat tolerant lines and varieties that will have acceptable levels of oil and protein. This project released MG IV DS25-1, the first improved heat tolerant soybean germplasm released for high stress heat/drought environments. DS25-1 has a pedigree that is 50% exotic.

Although DS25-1 yielded less under irrigated conditions, under early planting (April) and without irrigation at Stoneville, MS in 2012 and 2013, the yields of DS25-1, AG 4903, and C4926 did not significantly differ (38.0, 43.4, and 31.5 bu/ac, respectively). In addition, DS25-1 had superior seed quality traits (higher germination and accelerated aging, less damage from green seed, hard seed, and total Federal Grain Inspection Service damage) and potentially superior seed composition in stress environments.

Under early planting (April) without irrigation in 2012 and 2013 at Stoneville, MS, seed of DS25-1 had significantly higher (P=0.05) levels of protein (36.3% at 13% moisture) than seed of AG 4903 (33.2%) and C4926 (32.1%). Under less stress, average protein levels for DS25-1 (35.6%) were similar to those of AG 4903 (35.5%) across the locations of the Uniform Soybean Tests -- Southern States (UT) Preliminary IV-S-Late Test in 2011 (PIVSL). In the UT Uniform IV-S Test over 2013 and 2014 (UIVS), DS25-1 averaged 35.8% protein and 17.8% oil, whereas Monsanto 'AG 4907' averaged 35.4% protein and 19.7% oil. Planted early (April) but grown with irrigation in 2016, DS25-1 averaged 36.1% protein and 17.4% oil, compared to AG4835RR2, which averaged 35.4% protein and 18.1% oil. Under Southern growing conditions, DS25-1 consistently delivered a 48% protein meal, with greater than 10 pounds of oil per bushel.

There are other notable exotically-derived lines for the early production system with higher levels of protein and yield. MG III 10076-121-21 has a 12.5% exotic pedigree derived from PI 587982A and had higher yield, germination, and protein (60.7 bu/ac, 90%, and 38.3%, respectively) than commercial cultivar AG3803 (58.7 bu/ac, 71%, and 35%, respectively). The level of protein meal in 10076-121-21 greatly exceeded 48.5% and its oil exceeded 11 pounds per bushel. The two lines differed in maturity by one day. The line 4014-242-341, MG III and 50% exotic, also had greater yield, germination, and protein (60.8 bu/ac, 93%, and 36.2%, respectively) than AG3803. The levels of meal protein and oil for 4014-242-341 were approximately 48% and 11 pounds per bushel, respectively.

Early MG IV 10053-124-12 (25% exotic) had higher yield, germination, and protein (64.2 bu/ac, 94%, and 36.1%, respectively) than commercial cultivar AG 4232RR2Y (61.8 bu/ac, 92%, and 34.7%, respectively). Both lines had 19% oil. The line 10053-124-12 had 48% protein meal and 11 pounds of oil per bushel.
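The pounds-of-oil-per-bushel figures quoted in this section can be roughly cross-checked from the reported oil percentages. The sketch below assumes a standard 60-pound soybean bushel and assumes the oil percentages are expressed on the same 13% moisture basis as the protein values; it ignores processing losses, so it approximates seed oil content rather than crusher yield. The function name is illustrative.

```python
BUSHEL_LB = 60.0  # standard soybean bushel weight, in pounds

def oil_lb_per_bushel(oil_pct):
    """Approximate pounds of oil contained in one 60-lb bushel of seed."""
    return BUSHEL_LB * oil_pct / 100.0

# Oil percentages taken from the results reported above.
reported = [
    ("DS25-1, UIVS 2013-2014", 17.8),
    ("DS25-1, irrigated 2016", 17.4),
    ("10053-124-12", 19.0),
]

for label, oil_pct in reported:
    print(f"{label}: {oil_lb_per_bushel(oil_pct):.1f} lb oil/bu")
```

Under these assumptions the estimates come out near 10.7, 10.4, and 11.4 pounds per bushel, which is in line with the "greater than 10 pounds" and "11 pounds" figures given in the text.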

Did this project meet the intended Key Performance Indicators (KPIs)? List each KPI and describe progress made (or not made) toward addressing it, including metrics where appropriate.
1. At least 5 high protein experimental lines will be identified as candidates for release as varieties or germplasm by the end of FY17.

We released two high yielding varieties with protein concentrations 1 to 2 percentage points greater than the checks. We identified over 30 high yielding experimental lines with good protein concentrations for advanced regional testing.

2. At least 5 high protein experimental lines will have been exchanged within the project to increase the diversity for protein concentration and yield within each breeding program by the end of FY17.

Four high protein lines were exchanged for use as parents within the project and four lines were provided to commercial companies for use as parents.

3. A genetic marker for the chromosome 20 high protein gene will be available for use by soybean breeders and geneticists to characterize high protein germplasm by the end of FY17.

We identified a genetic marker for the chromosome 20 high protein gene, and it was deployed within the project. The marker has not yet been made public pending publication of the results of this research, but it will be released.

Expected Outputs/Deliverables - List each deliverable identified in the project, indicate whether or not it was supplied and if not supplied, please provide an explanation as to why.
This project was written as a 3-year project, so few of the expected outcomes/deliverables could be accomplished in year one, but some were achieved.

1. A report of all field and seed composition data for experimental lines tested jointly with commercial companies will be distributed to all participants and other interested soybean breeders.

These data are still being summarized or collected from plots harvested in 2017.

2. A list of all high protein germplasm accessions indicating the presence or absence of the high protein gene on chromosome 20 that can help select new sources of high protein that are genetically different from what is currently being used.

The research is underway but has not been completed.

3. Based on previous research supported by USB, we will make germplasm releases of experimental lines with improved yield and enhanced protein concentration to be used by both public and private sector breeders to develop new varieties.

Two varieties have been released.

4. Preliminary data indicating if the genes controlling high protein concentration in G. tomentella-derived lines are different from the major genes known in soybean.

The data collected indicate that the genes controlling high protein in these lines are not different from those in soybean.

5. Identification of specific protein components that are responsible for the increased protein concentration in experimental lines derived from G. tomentella.

The protein components in these lines are the same as those in the soybean parent.

6. Data on protein quality for both high protein soybean lines and high protein G. tomentella-derived lines that will help select lines with higher concentration of sulfur-containing amino acids.

Preliminary data indicate that some of these lines have elevated levels of Bowman-Birk protease inhibitor (BBi), a cysteine-rich protein, that may improve the quality of protein in these lines.

7. Genetic markers to identify the specific genes on chromosomes 15 and 20 that increase protein concentration.

The genetic marker on chromosome 20 was identified and research is underway to find a suitable marker on chromosome 15.

8. Identify genes in elite and exotic germplasm pools that influence seed protein concentration.

Based on our initial analysis, most of the high protein lines accumulated much higher amounts of 7S and 11S seed storage proteins than their parent lines. Additionally, our 1-D gel analyses confirmed that some of the high protein lines do not accumulate the glycinin 4 subunit (Gy4). The development of high protein lines in a Gy4 null background would be highly desirable for high quality tofu production. Interestingly, several of the high protein lines accumulated a unique high molecular weight (around 60 kDa) protein. We have also initiated 2-D gel electrophoresis of a few high protein lines and have confirmed the accumulation of these unique protein spots. We will soon identify these unique proteins by mass spectrometry. Plans are underway to perform western blot analysis to quantify the concentration of sulfur-rich proteins (leginsulin and Bowman-Birk protease inhibitor) in these lines. Amino acid analysis will also be performed to establish the correlation, if any, between the concentration of sulfur-rich proteins and the methionine and cysteine content of the seed.

9. Identify unique loci in exotic sources that are not present in the commercial gene pool.

This research is not completed.

Determine if there are genes in either the exotic or elite gene pools that are related to increased seed protein concentration without decreasing seed oil. If so, select lines with those genes for multi-location yield evaluations in Year 2.

These lines will be identified among the lines evaluated in 2016, for which seed composition data are not yet available.

Describe any unforeseen events or circumstances that may have affected project timeline, costs, or deliverables (if applicable.)
The finding that our high protein G. tomentella lines may have originated from outcrossing to other soybean lines, which provided the genes for high protein, was unexpected. This affects objectives 3 and 4.
What, if any, follow-up steps are required to capture benefits for all US soybean farmers? Describe in a few sentences how the results of this project will be or should be used.
Lines with high yield and enhanced protein identified in year one of this project will be evaluated more extensively in year 2 to confirm these attributes. When we have the confirmation data, we will make these lines broadly available to all seed companies, universities, USDA, and others within the soybean breeding community so that they can be used as parents in producing new varieties with improved yield and overall nutritional bundle in the seed.
List any relevant performance metrics not captured in KPIs.
We involve all the major soybean seed companies in the cooperative wide-area evaluation of new soybean lines developed from this program, so they have immediate access to the data and the most recent soybean lines developed from this project.
