Project Objectives
1) Yield trials. We will perform yield trials of the transgenic plant material already developed, after ensuring that we have homozygous transgenic events in the Thorne background. We will also trial other material developed in the remaining objectives below. The yield trials will be conducted to test the impact of the transgenes on yield and protein and oil concentration.
2) Introgression. We will introgress the RNAi transgene from Thorne into high-yielding elite lines developed by the Diers group, to determine its effects on protein content and yield in a commercially relevant current variety.
3) CRISPR/Cas9 editing. We will develop knockout lines in Glyma.20G085100 using the CRISPR/Cas9 system, in order to generate specific edits that we anticipate will maintain the high-yielding, high-seed-protein phenotype of the transgenic lines. The transgenic RNAi lines are tightly regulated and face several regulatory hurdles before they can be commercialized, whereas CRISPR-derived edited lines, once cleaned by back-crossing to elite material, no longer carry the transgenic DNA present in transgenic varieties.
4) Identification of the causative locus of the protein content gene located on Chromosome 15. We are close to identifying the molecular basis of a second protein content gene located on Chromosome 15 using fine mapping. We will extend the fine mapping data and use whole-genome sequencing of the line to identify the causative gene at this locus for future editing experiments.
Project Deliverables
We are applying to NCSRP for one year of funding to initiate this project. At the end of the first year we will have reached the following milestones:
1) Field trial completed for one growing season using RNAi transgenics
2) The high protein allele for Glyma.20G085100 downregulated and the effect on protein and oil tested in both a greenhouse and field and for yield in the field.
3) CRISPR guide construct designed, created and tested in transgenic soybean roots
Progress of Work
Updated January 29, 2021:
Report: March 2020.
At the beginning of the project in October 2019 we stated the following milestones and KPIs. A brief progress report is given for each milestone and KPI. In addition, we have found several interesting research leads that were not anticipated when the initial proposal was written.
Two key insights are:
1) A finding that the transgenic RNAi construct also affects protein but, unlike the original allele, has pleiotropic effects on maturity and seed mineral content.
2) A discovery that the low protein gene found in high-yielding conventional soybean varieties may revert (at a very low frequency) back to the high protein gene. Thus, we may be able to identify revertant alleles within existing elite lines to facilitate the creation of high protein varieties.
In addition to these novel findings, which both require further research before they can be exploited in breeding and agronomy, our project to engineer protein levels using RNAi transgenics has proceeded according to plan and all milestones and KPIs are on or ahead of schedule.
Milestones:
1) Field trial completed for one growing season using RNAi transgenics
This field trial was completed at the transgenic field facility in Nebraska and we are still analyzing the data. Preliminary indications are that at least one transgenic event shows increased protein and free amino acid levels, and that oil content does not appear to be affected. However, the results are preliminary and may not be statistically significant. The interpretation of yield data has been complicated by the fact that the RNAi transgenic constructs seem to affect maturity date, with the transgenic plants maturing later than the untransformed control line. This is unexpected, as the original pro/oil allele derived from PI468916 does not seem to affect maturity.
2) The high protein allele for Glyma.20G085100 down-regulated and the effect on protein and oil tested in both a greenhouse and field and for yield in the field.
We have shown down-regulation of the Glyma.20G085100 gene and have good data from the greenhouse and preliminary data from the field. The effect on protein appears to vary between different transgenic events, as expected. So far we are seeing a 1-2% increase in protein content in the best lines. The increase is also seen under field conditions, but it is not yet statistically significant with the current year’s data. Interestingly, we are also seeing relatively large differences in leaf mineral content in some transgenic events. Further work and additional years of field data will be needed to verify the significance of these results.
3) CRISPR guide construct designed, created and tested in transgenic soybean roots
The CRISPR guide has been designed and has been synthesized and placed in a plasmid vector. The construct has not yet been tested. Restrictions on laboratory work may slow the testing of this vector in the coming months.
We aim to complete the following Key Performance Indicators (KPIs) by the end of Year 1:
1) Data available for multiple RNAi transgenic events for yield trials in the field
We have extensive data for two RNAi events that are currently being analyzed.
2) The impact of down-regulating the high protein allele tested
We have demonstrated down-regulation of the allele in transgenic plants that are in the trials.
3) At least one plant transformation plasmid completed containing a guide RNA sequence targeted to the Glyma.20G085100 gene.
Although we have the guide RNA sequence, the plasmid is not yet ready for plant transformation. Laboratory facilities at the University of Illinois are currently largely closed. We anticipate this will be completed by the end of the project, assuming that University laboratories are allowed to reopen.
Updated January 29, 2021:
Report: Oct 2020.
At the beginning of the project in October 2019 we stated the following milestones and KPIs. A brief progress report is given for each milestone and KPI, along with additional progress. Our progress in 2020 has been substantially impeded by the COVID-19 pandemic: the graduate student who was the main laboratory investigator on this project graduated and left the program in Dec 2019, and problems in hiring and recruitment since February have prevented us from replacing her. However, all KPIs for this year have still been completed on schedule, and we are making good additional progress in field work and data analysis.
Milestones:
1) Field trial completed for one growing season using RNAi transgenics
This field trial was completed at the transgenic field facility in Nebraska. We have identified one transgenic event that shows increased protein and free amino acid levels, and in which oil content does not appear to be affected. The interpretation of yield data has been complicated by the fact that this RNAi transgenic event seems to affect maturity date, with the transgenic plants maturing later than the untransformed control line. This is unexpected, as the original pro/oil allele derived from PI468916 does not seem to affect maturity. Additional field trials have been conducted in 2020, which are awaiting harvest, and additional transgenic events are being developed.
2) The high protein allele for Glyma.20G085100 down-regulated and the effect on protein and oil tested in both a greenhouse and field and for yield in the field.
The effect of down-regulation on protein so far remains a 1-2% increase in protein content in the best lines. This increase is also seen under field conditions, although it is more statistically reliable in the greenhouse than in the current year’s field data. The 2020 field data will be available soon. The extent of down-regulation in the two events that have been characterized is less than expected, so further events are being developed and characterized.
3) CRISPR guide construct designed, created and tested in transgenic soybean roots
The CRISPR guide has been designed, synthesized and placed in a plasmid vector. Restrictions on laboratory work have limited our ability to advance this objective. We anticipate resuming laboratory work when the restrictions on hiring are lifted.
We aim to complete the following Key Performance Indicators (KPIs) by the end of Year 1:
1) Data available for multiple RNAi transgenic events for yield trials in the field
This has been completed for last year, and the 2020 data should be available shortly after harvest.
2) The impact of down-regulating the high protein allele tested
We have demonstrated down-regulation of the allele in transgenic plants that are in the trials.
3) At least one plant transformation plasmid completed containing a guide RNA sequence targeted to the Glyma.20G085100 gene
This has been completed; the plasmid will be transformed into plants as soon as staff are available at Illinois to do this.
Additional progress:
1) A manuscript is almost ready for submission describing the identification of the gene and its role in controlling oil and protein levels. This includes our recent finding that the low protein gene found in high-yielding conventional soybean varieties may revert (at a very low frequency) back to the high protein gene.
2) Further field experiments are in progress to investigate the effects on maturity and seed mineral content seen in the transgenic field trials last year.
3) We are working to identify revertant alleles within existing elite lines to facilitate the creation of high protein varieties.
Updated October 30, 2021:
Final Project Results
Updated November 18, 2021:
Final Report on “Manipulating a major gene governing seed reserves as a means to maintain yield and oil while increasing protein”, October 2021.
Introduction:
Of the many QTL controlling soybean seed protein content, alleles of the cqSeed protein-003 QTL on chromosome 20 exert the greatest additive effect. The high protein allele exists in both cultivated and wild soybean (Glycine soja Siebold & Zucc.) germplasm. We fine-mapped this QTL and identified the underlying causative gene. An insertion / deletion variant detected in Glyma.20G085100 was found to have near-perfect + / - concordance with the high / low protein allele genotypes inferred for this QTL in parents of published mapping populations. The indel structure was consistent with an evolutionarily recent insertion of a TIR transposon into the gene in the low protein lineage. For this project we developed and trialed transgenic plants designed to alter the expression level of Glyma.20G085100. Seed protein was significantly greater in greenhouse-grown soybean expressing an RNAi hairpin down-regulation element in two independent events relative to control null segregant lineages. We conclude that a transposon insertion within Glyma.20G085100, which encodes a CCT domain protein, accounts for the high / low seed protein alleles of the cqSeed protein-003 QTL. Field trials for the transgenic plants have been completed for three growing seasons, and show pleiotropic effects of the transgene that include higher protein content along with other phenotypes in mineral content and maturity.
At the beginning of the project in October 2019 we stated the following milestones and KPIs. A brief progress report is given for each milestone and KPI. Our project to engineer protein levels using RNAi transgenics has proceeded according to plan and all milestones and KPIs were met.
Milestones:
1) Field trial completed for one growing season using RNAi transgenics
We have now completed field trials for three seasons at the transgenic field facility in Nebraska and we are still analyzing the data. Preliminary indications are that at least one transgenic event shows increased protein and free amino acid levels, and that oil content does not appear to be affected. However, the results are preliminary and may not be statistically significant. The interpretation of yield data has been complicated by the fact that the RNAi transgenic constructs seem to affect maturity date, with the transgenic plants maturing later than the untransformed control line. This is unexpected, as the original pro/oil allele derived from PI468916 does not seem to affect maturity. The RNAi lines show very significant genotype-by-environment (GxE) interaction, and results vary between seasons and plots more than for PI468916-derived alleles.
2) The high protein allele for Glyma.20G085100 downregulated and the effect on protein and oil tested in both a greenhouse and field and for yield in the field.
We have shown down-regulation of the Glyma.20G085100 gene and have good data from the greenhouse and extensive data from the field. The effect on protein appears to vary between different transgenic events, as expected. So far we are consistently seeing a 1-2% increase in protein content in the greenhouse in the best transgenic lines. We also see an increase under field conditions, but it is more variable. Interestingly, we are also seeing relatively large differences in leaf mineral content in some transgenic events, as well as effects on maturity and flowering time that are not seen in germplasm with the naturally occurring pro/oil allele. Completing the analysis of the third year of field data should help to solidify these results from the field trials.
3) CRISPR guide construct designed, created and tested in transgenic soybean roots
The CRISPR guide was designed, synthesized and placed in a plasmid vector. Restrictions on laboratory work prevented the testing of this vector using the soybean hairy root system through 2020. However, we have now developed a new, high throughput CRISPR guide testing system that allows us to test in leaf cells directly, as well as a software application that predicts functional guides very accurately, so we no longer need to test the construct in roots.
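As a rough illustration of the first step in any guide-prediction workflow, the sketch below lists candidate guide sites on both strands of a DNA sequence. It is not the software application described above. It assumes an SpCas9-style NGG PAM and a 20-nt protospacer, which this report does not specify, and the function names and demo sequence are hypothetical (the sequence is made up and is not taken from Glyma.20G085100).

```python
import re

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase ACGT sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_guides(seq: str, guide_len: int = 20):
    """List candidate guide sites as (strand, start, protospacer, pam).

    Assumption: an SpCas9-style NGG PAM immediately 3' of the protospacer.
    For the "-" strand, start is given in reverse-complement coordinates.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{guide_len}}})([ACGT]GG))")
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for m in pattern.finditer(s):
            hits.append((strand, m.start(), m.group(1), m.group(2)))
    return hits

if __name__ == "__main__":
    # Made-up demo sequence, for illustration only.
    demo = "ATGCTAGCTAGGATCCGATCGATCGTACGATCGGCTAGCTAGGCTA"
    for strand, start, protospacer, pam in find_guides(demo):
        print(strand, start, protospacer, pam)
```

Listing sites is only a starting point. Predicting which guides are functional requires on-target and off-target scoring against the whole genome, which this sketch does not attempt.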
We aimed to complete the following Key Performance Indicators (KPIs) by the end of Year 1:
1) Data available for multiple RNAi transgenic events for yield trials in the field
By the end of Year 1 we had extensive field data for two RNAi events. We now have three field seasons of data for several events.
2) The impact of down-regulating the high protein allele tested
We have demonstrated down-regulation of the mRNA for the allele in transgenic plants that are in the trials.
3) At least one plant transformation plasmid completed containing a guide RNA sequence targeted to the Glyma.20G085100 gene.
The plasmid was completed. Work is continuing to transfer it to soybean plants.
In addition, we have found several interesting research leads that were not anticipated when the initial proposal was written. Two key insights are:
1) A finding that the transgenic RNAi construct also affects protein but, unlike the original allele, has pleiotropic effects on maturity and seed mineral content.
2) A discovery that the low protein gene found in high-yielding conventional soybean varieties may revert (at a very low frequency) back to the high protein gene. Thus, we may be able to identify revertant alleles within existing elite lines to facilitate the creation of high protein varieties.
In addition to these novel findings, which both require further research before they can be exploited in breeding and agronomy, we have developed straightforward markers that define the locus for the pro/oil trait that can be directly used by any breeders wishing to alter protein content in germplasm.
We used soybean breeding and genetics, along with genetically engineered soybean plants, to confirm that a gene we identified can increase the protein content of soybeans. The version of the gene we identified has relatively small effects on yield. As well as growing the engineered plants in the greenhouse, we also grew them in the field in Nebraska for three years and confirmed that the gene can increase protein content of soybeans. We need to do more work to investigate the variability of the protein content in the engineered plants between years.
Benefit to Soybean Farmers
Soybean is a major source of protein and oil globally. The protein component is currently the most valuable reserve of the seed, which is complemented by the oil co-product. Soybean protein is used primarily in feed applications and its demand is expected to increase. Unfortunately, there is an inherent negative correlation between protein and yield in soybean. An evaluation of soybean varieties released from the 1920s to 2010 showed that during this timeframe, seed protein content was reduced by approximately 2 percentage points (20 g/kg seed). This reduction in protein makes it difficult for crushers to produce a soybean meal with 48% protein, the industry standard, resulting in loss of value to growers.
Soybean is a global commodity and is sold based on volume with minimal regard to quality, which has given breeders little incentive to select for protein as they breed for higher yields. While the changes in protein concentration are small in percentage terms, the world soybean crop is projected to be around 370 million metric tons, so a single percentage point in protein concentration represents 3.7 million metric tons of protein. There is now a great deal of commercial demand for higher protein soybeans that are high yielding, but conventional breeding processes are slow to incorporate introgressed genes for high protein into elite lines. We are proposing a genetic strategy to control protein concentration in tandem with preserving high yields, without compromising oil levels.
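The arithmetic behind the figures in this section can be checked with a short calculation. The sketch below uses only the numbers quoted above (a 370 million metric ton crop, a one percentage point change, and a 2 percentage point historical decline). The function names are illustrative, not part of any established tool.

```python
# Figures quoted in the text above.
WORLD_CROP_MILLION_T = 370.0  # projected world soybean crop, million metric tons

def protein_mass_million_t(crop_million_t: float, percentage_points: float) -> float:
    """Protein mass (million metric tons) represented by a change in
    seed protein concentration, expressed in percentage points of seed mass."""
    return crop_million_t * percentage_points / 100.0

def points_to_g_per_kg(percentage_points: float) -> float:
    """Convert percentage points of seed mass to g per kg of seed
    (1 percentage point = 10 g/kg)."""
    return percentage_points * 10.0

if __name__ == "__main__":
    one_point = protein_mass_million_t(WORLD_CROP_MILLION_T, 1.0)
    print(f"1 percentage point of protein: {one_point:.1f} million metric tons")
    print(f"2 percentage points of protein: {points_to_g_per_kg(2.0):.0f} g/kg seed")
```

This assumes the percentage point change applies uniformly across the whole crop, which is a simplification.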
Performance Metrics
We aim to complete the following Key Performance Indicators (KPIs) by the end of Year 1:
1) Data available for multiple RNAi transgenic events for yield trials in the field
2) The impact of down-regulating the high protein allele tested
3) At least one plant transformation plasmid completed containing a guide RNA sequence targeted to the Glyma.20G085100 gene.