| Milk production traits are the most important quantitative economic traits in dairy cow production, and improving the yield and quality of milk is an important way to ensure the production efficiency of the dairy industry and to stabilize the fundamentals of agriculture, farmers, and rural areas. This project carried out a series of in-depth statistical genetics studies and molecular analyses on Chinese Holstein cows in Jiangsu province, including genetic parameter estimation, genome-wide association analysis, copy number variation analysis, candidate gene validation, and genomic prediction. First, the genetic parameters of five milk production traits (milk yield, milk fat percentage, milk fat yield, milk protein percentage, and milk protein yield) were estimated using a random regression test-day model. Then, a genome-wide association study (GWAS) and a copy number variation (CNV) analysis were carried out to locate quantitative trait loci (QTL) for the milk production traits. Subsequently, we selected a candidate gene, SIRT2, and validated at the cellular level the mechanism by which microRNA-212 (miR-212) and SIRT2 regulate lipid metabolism in bovine mammary epithelial cells. In addition, we imputed the SNP data and studied how the accuracy of genomic prediction changed when single nucleotide polymorphisms (SNPs) selected by GWAS from the imputed whole-genome sequencing (WGS) data were added to the 100K data. Finally, we used three haplotype construction strategies to explore how converting SNP data and imputed data into haplotype data affected the accuracy of genomic prediction. The results of the project are as follows: (1) Based on a random regression test-day model, we found that the total heritability of the milk production traits of cows in the Jiangsu area ranged from 0.271 to 0.342, placing them among traits of moderately low heritability. There were significant phenotypic correlations (p<0.05) between the five milk production traits, with the overall phenotypic correlation ranging from -0.423 to 0.945 
and the overall genetic correlation ranging from -0.625 to 0.938 for the first three parities. The genetic correlations of milk production traits between parities were high (0.552-0.989), but the genetic variance and heritability differed between parities over the whole lactation period, and those of the first parity were slightly larger and more stable than those of the second and third parities. We suggest that all phenotypic information and environmental factors of the first three parities be considered in the genetic evaluation of dairy cows, or that the evaluation focus on the first parity, to improve the accuracy of genetic evaluation of milk production traits and thereby raise the breeding level of cows in Jiangsu province. (2) Based on the GGP Bovine 100K SNP data, QTL detection for first-parity milk performance of dairy cows was carried out by GWAS and CNV analysis. A total of 16 SNPs were significantly associated with milk production traits. Some candidate genes and QTLs that might play key roles in the genetic variation of milk production traits were found, including DGAT1, SIRT2, LDLR, HSF1, MAF1, PRMT6, GLUD1, PYCR3, and PLA2G4A. These genes were mainly enriched in the biosynthesis and metabolic processes of amino acids and lipids, protein maturation, regulation of transcription, and regulation of macromolecular metabolic processes, and they were identified as candidate genes associated with milk production traits in dairy cows. In addition, we identified 1731 CNVs and 236 CNV regions (CNVRs) on the 29 autosomes of 984 Holstein dairy cows, and 19 CNVRs were significantly associated with milk production traits (p<0.05). Among them, CNVR124 (located on chromosome 14, 146715 bp to 891340 bp) and CNVR161 (located on chromosome 18, 48610254 bp to 48869465 bp) contained two significant SNPs from the GWAS results, respectively, and we speculate that these two regions could serve as candidate regions 
related to milk production traits of dairy cows. (3) A dual-luciferase reporter gene assay, qRT-PCR, western blot, a triacylglycerol (TAG) assay, and Oil Red O staining were used to verify the function of three miRNAs (miR-212, miR-375, and miR-655) and their potential target gene, SIRT2, which was selected by the GWAS and CNV analysis. The dual-luciferase reporter gene assay verified that miR-212 could target the 3′UTR of SIRT2 and regulate its expression, whereas miR-375 and miR-655 had no targeting relationship with the SIRT2 gene; qRT-PCR and western blot results showed that miR-212 was negatively correlated with the expression level of SIRT2; and the results of qRT-PCR, the TAG assay, and Oil Red O staining showed that the regulatory relationship between miR-212 and SIRT2 can affect the expression of lipid synthesis-related genes and lipid production in bovine mammary epithelial cells. This study confirmed the targeted regulatory relationship between miR-212 and the SIRT2 gene and showed that miR-375 and miR-655 did not have such a relationship with SIRT2. It also showed that miR-212 could participate in lipid metabolism in bovine mammary epithelial cells by targeting SIRT2, and that this targeted regulatory relationship may be a potential factor affecting milk fat metabolism. (4) We imputed the SNP data to the WGS level and screened significant SNPs from the imputed WGS data at different thresholds using two GWAS methods, Fixed and random model Circulating Probability Unification (FarmCPU) and the mixed linear model (MLM), to evaluate how combining the screened SNPs with the 100K SNP data affected the accuracy of genomic prediction. The results showed that the imputed WGS data did not significantly (p<0.05) improve the accuracy of genomic prediction compared with the 100K SNP data. Among the traits we studied, except for milk protein yield, the evaluation accuracy of 
the traits using the Bayesian four-distribution mixed model was higher than that of breeding value evaluation using genomic best linear unbiased prediction (GBLUP), with an improvement of 0.18%-1.60%. Compared with the original SNP data, using GWAS to select SNPs from the imputed WGS data and incorporating them into the SNP data could improve the accuracy of genomic prediction for the milk production traits (except milk protein content), somatic cell score, and body height, with an improvement of 0.16%-6.94%; the accuracy for milk fat percentage increased the most (6.94%), and that for milk protein percentage the least (0.16%). Generally, compared with GBLUP based on the 100K SNP data, selecting the SNP dataset with the MLM at p-value thresholds of 0.0001 and 0.001, and then using GBLUP with dual genetic components or the Bayesian four-distribution mixed model with a single genetic component, gave better accuracy and unbiasedness of genomic prediction than the other strategies. The study not only confirmed that adding SNPs from the imputed WGS data to the 100K data can improve the accuracy of genomic prediction but also represents an attempt to combine GWAS with genomic selection (GS). (5) We defined haploblocks according to linkage disequilibrium (LD), fixed length, and fixed number of SNPs, and converted the 100K SNP data, imputed high-density data, and imputed WGS data into haplotype variables according to the defined haploblocks to explore the effect of constructing haplotype variables on genomic prediction. The results show that, compared with SNP data, constructing haplotypes could effectively reduce the computational data volume of WGS data, and that the stability of genomic prediction with haplotype variables converted from imputed WGS data under the different haplotype construction strategies was better than that with haplotype variables converted from 100K data and high-density SNP data under the same strategy. For different traits, it 
was very important to determine the optimal haploblock construction method and threshold when converting SNP data into haplotype variables for genomic prediction. Constructing haploblocks from the 100K data with an LD (r²) threshold in the range of 0.3 to 0.5 could improve the accuracy and unbiasedness of genomic prediction for the traits. Genomic prediction based on haplotype variables constructed from a fixed length or a fixed number of SNPs might suffer from instability or low accuracy. The study shows that converting SNP data into haplotype variables could improve the accuracy of genomic prediction and reduce the number of variables in WGS data, but the best strategy and threshold for constructing haploblocks need to be determined according to the characteristics of the trait. In summary, this study revealed the genetic changes of milk production traits of Chinese Holstein cows in Jiangsu during different lactation periods, identified genes and QTLs that may affect milk production traits, verified the effect of the targeted regulation of SIRT2 by miR-212 on lipid metabolism, and explored the application of GWAS and haplotype construction in genomic prediction. The results will not only make theoretical contributions to the improvement of genetic evaluation methods and the understanding of the molecular regulation of milk production traits but also contribute to the genetic evaluation system for dairy cows in Jiangsu province. |
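
The haploblock step in result (5) can be illustrated with a minimal Python sketch. It is not the study's pipeline: it assumes phased 0/1 haplotypes are already available, all function names (`ld_r2`, `fixed_number_blocks`, `ld_blocks`, `haplotype_variables`) are hypothetical, and the LD rule used here (extend a block while adjacent SNPs have r² at or above the threshold) is a simplification chosen for brevity. The abstract does not specify the exact LD-based definition. The fixed-length strategy is omitted because it needs SNP map positions.

```python
def ld_r2(snp_a, snp_b):
    """r^2 between two biallelic SNPs, given phased 0/1 alleles per haplotype."""
    n = len(snp_a)
    p_a = sum(snp_a) / n
    p_b = sum(snp_b) / n
    p_ab = sum(a * b for a, b in zip(snp_a, snp_b)) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0  # a monomorphic SNP has undefined r^2; treated as 0 here
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    return d * d / denom


def fixed_number_blocks(n_snps, block_size):
    """Blocks of a fixed number of SNPs, as half-open (start, end) index pairs."""
    return [(s, min(s + block_size, n_snps)) for s in range(0, n_snps, block_size)]


def ld_blocks(haplotypes, threshold):
    """Assumed greedy rule: start a new block when adjacent-SNP r^2 < threshold."""
    n_snps = len(haplotypes[0])
    column = lambda j: [h[j] for h in haplotypes]
    blocks, start = [], 0
    for j in range(1, n_snps):
        if ld_r2(column(j - 1), column(j)) < threshold:
            blocks.append((start, j))
            start = j
    blocks.append((start, n_snps))
    return blocks


def haplotype_variables(phased, blocks):
    """Count copies (0, 1, or 2) of each haplotype allele carried by each animal.

    phased: one (haplotype_1, haplotype_2) pair per animal, each a list of 0/1.
    Returns (labels, matrix); labels[j] is (block_index, allele_tuple) and
    matrix[i][j] is the number of copies of that allele in animal i.
    """
    labels, columns = [], []
    for b, (start, end) in enumerate(blocks):
        alleles = sorted({tuple(h[start:end]) for pair in phased for h in pair})
        for allele in alleles:
            labels.append((b, allele))
            columns.append(
                [sum(tuple(h[start:end]) == allele for h in pair) for pair in phased]
            )
    matrix = [list(row) for row in zip(*columns)]
    return labels, matrix
```

Each row of the resulting matrix sums to twice the number of blocks, because every animal carries exactly two haplotype alleles per block. The number of columns depends on how many distinct alleles each block contains, which is why the block definition and threshold change the number of variables passed to the prediction model.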