Fine-mapping and identification of candidate causal genes for tail length in the merinolandschaf breed
- Select a language for the TTS:
- UK English Female
- UK English Male
- US English Female
- US English Male
- Australian Female
- Australian Male
- Language selected: (auto detect) - EN

Play all audios:

ABSTRACT Docking the tails of lambs in long-tailed sheep breeds is a common practice worldwide. But this practice is associated with pain. Breeding for a shorter tail could offer an
alternative. Therefore, this study aimed to analyze the natural tail length variation in the Merinolandschaf and to identify causal alleles for the short tail phenotype segregating within
long-tailed breeds. We used SNP-based association analysis and haplotype-based mapping in 362 genotyped (Illumina OvineSNP50) and phenotyped Merinolandschaf lambs. Genome-wide significant
regions were capture sequenced in 48 lambs and comparatively analyzed in various long and short-tailed sheep breeds and wild sheep subspecies. Here we show a SNP located in the first exon of
_HOXB13_ and a SINE element located in the promotor of _HOXB13_ as promising candidates. These results enable more precise breeding towards shorter tails, improve animal welfare by
amplification of ancestral alleles and contribute to a better understanding of differential embryonic development. SIMILAR CONTENT BEING VIEWED BY OTHERS ESTIMATES OF GENOMIC HERITABILITY
AND GENOME-WIDE ASSOCIATION STUDIES FOR BLOOD PARAMETERS IN AKKARAMAN SHEEP Article Open access 02 November 2022 RNA-SEQ BASED GENETIC VARIANT DISCOVERY PROVIDES NEW INSIGHTS INTO
CONTROLLING FAT DEPOSITION IN THE TAIL OF SHEEP Article Open access 11 August 2020 GENOME-WIDE IDENTIFICATION OF COPY NUMBER VARIATION AND ASSOCIATION WITH FAT DEPOSITION IN THIN AND
FAT-TAILED SHEEP BREEDS Article Open access 25 May 2022 INTRODUCTION Sedentary human communities began sheep management as early as 10,000–11,000 BP in an area stretching from central
Anatolia to northwestern Iran1. It is proposed that the Asiatic mouflon (_Ovis orientalis_), which was common in the same area, was the wild ancestor. The Asiatic mouflon, like other wild
sheep subspecies, is a short-tailed hair sheep. Accordingly, the first domesticated sheep were also short-tailed hair sheep, kept mainly for their meat2. The systematic production and
processing of wool did not occur until several millennia later, leading to the “Secondary Product Revolution”3 and the worldwide replacement of hair sheep by wool sheep. Long-term selection
for fine wool fibers culminated in the economically most important and widespread group of sheep breeds, the Merinos. All Merinos are characterized by long tails and the common occurrence of
fine wool and a long tail led to the frequent opinion that these phenotypes are also genetically coupled or the result of the same artificial selection4. This assumption could not be proven
directly, however, in today’s sheep husbandry systems the long woolly tail comes with several problems, e.g., the accumulation of dags in the tail area, which predisposes for flystrike5.
Therefore, most lambs of long-tailed breeds worldwide are docked shortly after birth4. With the increasing importance of animal welfare in our society and subsequent restrictions and
prohibitions of practices that cause pain and suffering to animals, tail docking has come under scrutiny. In Scandinavia, tail docking without a veterinary indication has already been made
illegal6 and in the Netherlands, exclusions from the docking ban have only been granted for three long-tailed English breeds under the condition of an effective breeding program for shorter
tails7. In Austria, tail docking in lambs is allowed until the age of 7 days, provided the operation is done by a veterinarian or another qualified person and an analgesic for intra- and
post-operative pain-relief is given8. The German Animal Welfare Act currently still allows tail docking without anesthesia for lambs under eight days of age9, but future amendments will
probably seek to eliminate exceptions to the amputation ban10. These developments clearly show that a long-term non-invasive alternative for the painful practice of tail docking is urgently
needed. Here, a genetic solution offers itself. A high ethical acceptance of genetic breeding for a shorter tail could be expected as all wild sheep subspecies and thus also the ancestor of
today’s domestic sheep have naturally short tails11,12. Therefore, this breeding could be seen as a “back to the roots” program. The genetic basis for shorter tail breeding efforts is
provided by the medium to high heritability of tail length (TL) in different sheep breeds, e.g., 0.58 in Merinos13 or 0.77 in Finnsheep14. James et al.15 suggest that the inheritance of TL
in Australian Merino depends on a small number of interacting genes of large effects, in which short tail genes show a possible dominance. The presence of various short-tailed Nordic breeds
offers a possibility of genetic reduction of TL by introgression of the desired genetic variants from short-tailed breeds. Scobie and O’Connell16 crossed short-tailed Finnsheep with
long-tailed Cheviot sheep and observed that an increased proportion of Finnsheep genes led to a proportional reduction in TL. However, this option is unpopular with breeders, as crossing is
associated with the loss of breed-specific traits and a possible decline in previously achieved breeding progress in important production traits17. In extreme cases, it could also introduce
undesirable traits/conditions in the population. For example, Jordan18 described fully or partially hindquarter paralysis in the No-tail lambs from the cross of Siberian fat-rumped sheep
with other sheep breeds such as Hampshire sheep and Rambouillet sheep. This condition is attributed to the appearance of a lethal gene, known as “sidewheeler”. The majority of the lambs with
this condition starved to death during early years of their life. Post-mortem examination of these lambs revealed the abnormal termination of the spinal cord. As an alternative, breeders in
Australia and New Zealand attempted genetic shortening of the tail by phenotypic selection within individual breeds. However, Carter19 reported for Romney sheep that breeding for the
short-tail phenotype possibly reduced the viability of embryos that were homozygous carriers of some putative short-tail alleles. In Merinos, James et al.15 observed increased incidences of
rear-end defects. Zhi et al.20 discovered a c.G334T mutation in the _T_ gene in the native Chinese Hulunbuir breed and showed that the T allele leads to the extreme short-tailed phenotype,
i.e., tailless animals with exposed anus. To prove the causality of this mutation, they genotyped 120 short-tailed Hulunbuir sheep. The observed frequencies of the genotypes (17 G/G, 103
G/T, and no T/T) are consistent with the embryonic lethality due to the T/T genotypes in _T_ gene20. On the other hand, Han et al.21 claim that the nonsynonymous c.334 G > T mutation is
predominantly or exclusively homozygous in Chinese fat-rumped and Hulunbuir sheep. Moreover, this genotype also segregates in other breeds and is always associated with the synonymous c.333
G > C mutation. Despite these two contradictory statements by Zhi et al.20 and Han et al.21, both studies concluded that an association between the (extremely) short tail phenotype and
the c.334 G > T mutation is likely. A comparable association between short tails and embryonic lethality or malformations has been demonstrated in various breeds of dogs and cats
too22,23,24. These undesirable negative side effects discouraged and slowed down active breeding programs against overlong tails in the economically most important wool sheep breeds.
Moreover, there have been no successful genetic mapping studies in long-tailed Merino sheep breeds and the possible relationship between the short tail phenotype and embryonic viability or
hind end malformations has, to the best of our knowledge, never been investigated on a genetic basis in Merinos. The aim of the present study was therefore: * 1. To investigate the
phenotypic and additive-genetic variance in TL in the Merinolandschaf, which belongs to the economically most important long-tailed Merino breed group worldwide; * 2. To map the position of
the major quantitative trait locus (QTL) affecting TL; * 3. To detect and confirm causal candidate genes by sequencing and genotyping; * 4. To determine the distribution of ancestral and
derived alleles in a wide range of domestic sheep breeds with different TLs as well as in different wild sheep subspecies; * 5. To contribute to the understanding of the relationship between
genotype and phenotype during embryonic patterning and early development; * 6. To put causal alleles in the evolutionary context of sheep species. Together, these objectives will enable
more efficient breeding towards the ancestral phenotype and thus improve animal welfare in sheep production without negative side effects. RESULTS INITIAL MIXED LINEAR MODEL ASSOCIATION
ANALYSIS _GCTA-GREML_ analysis revealed SNP-based heritability of 0.992 (standard error of 0.12), meaning that a very high proportion of the TL variance in Merinolandschaf breed is explained
by genome-wide SNP markers. Despite this very high heritability, i.e., close to 1, the initial association analysis (MLMA Model 1, Table 1) revealed no genome-wide significant association
between any SNPs and TL. Even the four most significant SNPs (Fig. 1a) remain below the suggestive significance threshold of _P_ = 2.22 × 10−5. INITIAL COMBINED LINKAGE DISEQUILIBRIUM AND
LINKAGE ANALYSIS The haplotype-based cLDLA mapping (cLDLA Model 3) resulted in two genome-wide significant QTLs associated with TL in Merinolandschaf (Fig. 1c). The most prominent and narrow
peak is on OAR11 at position 37,111,462 bp with _LRTmax_ = 29.460 corresponding to _P_ = 5.71 × 10−8 (Bonferoni corrected: _P_ = 6.43 × 10−5) as shown in Supplementary Fig. S1a. The second
genome-wide significant QTL affecting TL was mapped to the OAR2 at position 94,538,115 bp with _LRTmax_ = 19.356 corresponding to _P_ = 1.08 × 10−5 (Bonferoni corrected: _P_ = 1.22 × 10−2).
Applying the 2-LOD criterion, the corresponding confidence interval (CI) was set for the _LRTmax_ on OAR11 between positions 37,000,925 bp and 37,521,490 bp and for the maximum value on OAR2
between positions 93,441,900 bp and 96,402,884 bp. These intervals were then considered in the UCSC Genome Browser Oar_v4.0/oviAri4 Assembly, which resulted in the list of genes summarized
in Table 2 and Table 3. The list of positional candidates also includes obvious functional candidates from the sheep homeobox B gene cluster (Chr11:37,290,203–37,460,240) with _HOXB13_
(37,290,203–37,292,513) as the closest and most prominent candidate, lying only 179 Kb proximal to the _LRTmax_. Additional chromosome-wide significant peaks were observed on OAR2, OAR3,
OAR10, OAR14, and OAR17. However, these peaks show _LRT_ values far below the genome-wide significance and thus, were not investigated further. ESTIMATION OF QTL EFFECTS AND SELECTION OF
ANIMALS FOR CAPTURE SEQUENCING In the previous step of cLDLA, we used _ASReml_25 to estimate variance components, fixed and random effects affecting TL in Merinolandschaf breed. Here, we
analyzed in more detail the estimated effects at loci with the most significant association, i.e., at loci showing _LRTmax_ values. We sorted all 362 lambs according to the random
additive-genetic QTL effects (vector Q) estimated at _LRTmax_. Together with the QTL effects, we simultaneously considered all input (Y, sex, age, body weight (BW), withers height (WH),
maternal and paternal haplotypes at 40-SNP window with _LRTmax_ in interval between SNP 20 and 21) and output data (U, SS, and E) that contributed to _LRTmax_. Visual inspection of this
table and regression analysis allowed us to select 48 lambs for targeted capture sequencing. Supplementary Figs. S2, S3 show the distribution of the sequenced lambs regarding TL and
diplotype effects on OAR2 and OAR11. A regression analysis performed with the function _lm_ in _R_26 estimated the adjusted coefficient of determination of _R__2_ = 0.58 for the _LRTmax_ on
OAR11 and only _R__2_ = 0.15 for the _LRTmax_ on OAR2 when using TL as the dependent and diplotype effect as the independent variable. Adding, age, sex, BW, and WH as additional independent
variables yield _R__2_ = 0.78 for QTL on OAR11 and _R__2_ = 0.45 for QTL on OAR2 (see Supplementary Tables S1, S2). According to the shape of the LRT curve, the significance of the mapping
and the coefficient of determination, the haplotypes associated with the putative causative alleles are more distinct in QTL on OAR11 than on OAR2. However, the selection of 48 lambs for
capture sequencing represents a trade-off between the two QTLs, with the choice for OAR11 being more decisive. The selected lambs could be divided into two groups: 23 long-tailed with
positive QTL effect and 25 short-tailed lambs with negative QTL effect on OAR11. Sorting the same lambs by QTL effects on OAR2 changes the order within the group and results in 3 individuals
from the long-tailed group and 4 individuals from the short-tailed group moving to the other group. CAPTURE SEQUENCING OF 48 LAMBS AND DETECTION OF CANDIDATE MUTATIONS Capture Sequencing
was carried out at a mean depth between 0.41 and 2.45 on the target region of OAR2 and between 1.01 and 2.39 on the target region of OAR11. This coverage is much lower than intended and most
possibly caused by competition with whole-genome sequencing (WGS) performed on the same sequencing lane. However, applying default settings of GATK _HaplotypeCaller_27, we detected 40,433
short variants on the investigated region on OAR11. Next, we sought to identify variants showing remarkable differences in frequency between short-tailed and long-tailed groups by applying 5
steps described in supplementary methods. This allowed us to reduce the number of potential candidates from 40,433 to 19 (Supplementary Table S3). Of all these 19 most plausible genomic
variations, only one SNP nearly satisfied the frequency-based criterion (see Supplementary Methods). Interestingly, this SNP (C→G) was located at position 37,290,361. The visual examination
using _Jbrowse_28 confirmed the base substitution (_rs413316737_) as a nonsynonymous point mutation within the first exon of _HOXB13_ (relative position 23). The point mutation results in a
p.(Thr8Ser) substitution. All sequenced Merino lambs from the long-tailed group are homozygous for this missense variant (G/G). In the short-tailed group, 4 lambs are homozygous G/G, 6 are
heterozygous C/G and 15 are homozygous C/C on that position. The Ensembl Variant Effect Predictor29 predicted a SIFT score of 0.54 and classified the mutation as so-called “_tolerated_”
missense variant. In the next step, protein _BLAST_30 was used to align the amino acid sequence of the mutant HOXB13 protein against the amino-acid sequence of wild-type HOXB13 protein of
different mammals including all the extant wild sheep species shown in Supplementary Table S4. This cross-species alignment (Fig. 2) revealed that the amino acid at which the discovered
variant occurred is conserved. In addition, 5 downstream amino acids are also conserved among the aligned species. By comparing the allele frequency in long-tailed and short-tailed sheep in
multiple breeds, we observed that the above-mentioned derived allele G occurs more frequently in long-tailed sheep breeds (Supplementary Table S5). Further, at this position, we observed
only the ancestral allele in Urial, Argali, Snow sheep, Dall sheep, Canadensis and two ancient (~8000 years) sheep genomes31. On the other side out of 16 investigated Asiatic mouflon 3 were
heterozygous C/G and 2 homozygous G/G for the point mutation. However, it is worth mentioning here that one Asiatic mouflon (G/G) and the two ancient WGS have low coverage at this locus.
Further, we identified SVs in the targeted region of OAR11 by using three different approaches: _Smoove_32, _Delly_33, and visual examination with _Jbrowse_. Using the strict threshold
criteria, i.e., SV not present in a pooled sample of short tail sheep but present in a pooled sample of long tail sheep, we identified 27 and 32 SVs from _Smoove_ and _Delly_ approaches,
respectively. However, it is noteworthy that the captured sequencing had highly non-uniform and relatively low coverage, therefore, these approaches might have missed many true positives and
included high number of false positives. In fact, the visual examination of these regions also indicated so (Fig. 3). Interestingly, window-by-window visual examination of the targeted
region in _Jbrowse_ revealed two distinct clusters of soft-clipped reads (Fig. 3a) just about ~130 bp upstream of the candidate SNP. Both clusters were arranged next to each other. One
cluster was present in all sequenced animals, indicating either assembly error or assembly-specific variant as the likely cause. The other cluster of soft-clipped reads had distinct
frequency distribution between the groups of short-tailed and long-tailed reads. At position 37,290,229 on OAR11, the long-tailed group showed 34 of the 35 reads mapped as soft-clipped,
while the short-tailed group showed 13 of the 54 reads as soft-clipped. We also observed a remarkable difference in the frequency of the clipped reads around this position between the WGS
data of short and long-tailed sheep that were downloaded from NCBI SRA (Supplementary Table S4). Therefore, in the next step, we aligned the pooled reads of the short-tailed and the
long-tailed group and the WGS data of NCBI SRA against the latest sheep assembly (ARS-UI_Ramb_v2.0)34 (https://www.ncbi.nlm.nih.gov/assembly/GCF_016772045.1). Visual examination of the
region upstream of the first exon of _HOXB13_ gene on OAR11 revealed only one cluster of soft-clipped reads at positions 37,524,996 (Fig. 3b) indicating that another cluster which was
identified in the mapping against Oar 4.0 assembly was due to missing sequences of about 40 bp in Oar 4.0 assembly. Interestingly, we observed that a very high number of soft-clipped reads
in this cluster had supplementary alignment on OAR5. Further, on OAR11 at the breakpoint, we also observed discordant alignment (overlapped) between forward and reverse reads. Based on the
presence of the soft-clipped reads and discordant alignment, we suspected the presence of an insertion or translocation. While we were investigating this region on ARS-UI_Ramb_v2.0 assembly,
we came across a pre-print by Li et al.35 who reported an insertion in the same region. To investigate the soft-clipped regions further, we carried out Sanger sequencing of five samples,
based on the previously described candidate SNPs. The analysis of the Sanger sequencing data (Fig. 3c) and subsequent alignment against OAR11 in ARS-UI_Ramb_v2.0 using _BLAST_30 identified
the SV as an insertion of 167 bp. We further observed that this sequence is flanked by 14 bp of direct repeats (CTGCCAGCGATTTA) on both sides. Therefore, we hypothesized that this insertion
could be a part of short interspersed nuclear elements (SINE) repeat family. Next, we searched for this sequence in DFAM repeat database and identified it as belonging to OviAri-1.113 SINE
family. GENOTYPING OF THE MOST PLAUSIBLE CANDIDATES IN 362 LAMBS AND REMAPPING OF TAIL LENGTH To confirm the association between the detected variants and TL, we performed genotyping of
these two candidates in the entire mapping population and used genotypes in the GWAS and cLDLA Model 2, 4, and 5 (Table 1). The PCR-genotyping of the candidate SNP resulted in 220 G/G, 118
C/G, and 24 C/C lambs. The PCR-genotyping of the 132 bp upstream candidate insertion of 167 bp showed an identical distribution of genotypes over the entire mapping population, i.e., the
insertion occurred in all haplotypes harboring the base G and never in haplotypes harboring C on the position 37,290,361. Due to the complete linkage between the insertion and the missense
SNP, both candidates are considered synchronously and equally in Model 2 (MLMA) and Models 4 and 5 (cLDLA). The distribution of alleles in wild sheep subspecies and domestic sheep allows us
to infer ancestral and derived alleles at both candidate loci. The absence of the insertion and presence of base C at c.C23G SNP in _HOXB13_ are ancestral alleles, whereas the 167 bp
insertion and base G indicate derived alleles. All lambs homozygous for ancestral alleles could be classified as short-tailed with mean tail-length of 24.1 cm (±1.34) and mean QTL effect of
−2.92 cm (±0.71). On the other side lambs homozygous for derived alleles classified as long-tailed as well as short-tailed. Consequently, lambs with derived homozygous genotype show higher
average TL of 31.5 cm and 4.15 times higher standard deviation of TL (±5.56). Lambs with heterozygous genotype show average TL of 25.7 and 2.73 times higher standard deviation (±3.66). Table
4 summarizes phenotype, QTL, and polygenic effects of candidate insertion and SNP that are in population-wide linkage disequilibrium. MLMA AND CLDLA INCLUDING CANDIDATE VARIANTS AS MARKERS
To investigate the impact of derived alleles on the results of the SNP-based association analysis and the haplotype-based mapping, we considered the candidate locus as an additional marker
located on Chr11:37,290,361 (see Model 2 and 4 Table 1). Thereby, only the number of considered markers increases by one compared to the original analysis, while the other input data, as
well as the parameters and the model, do not change. This minimal change in the input data led to enormous changes in the results of the association analysis and limited changes in the
results of the haplotype-based mapping. MLMA Model 2 confirmed the candidate causal locus as a uniquely genome-wide significant (_P_ = 6.2 × 10−12) locus, while cLDLA revealed a slightly
altered significance (_LRTmax_ = 29.112) at the slightly altered position 37,311,842 bp. However, in the initial mapping _LRTmax_ was 179 Kb away from _HOXB13_ and the Model 4 of cLDLA
placed _LRTmax_ between _HOXB13_ and _HOXB9_ (Table 3). Therefore, the distance between _LRTmax_ and the candidate gene _HOXB13_ decreased from 179 Kb to 19 Kb and thus _HOXB13_ became the
closest gene to _LRTmax_. VARIANCE COMPONENTS, MLMA, AND CLDLA INCLUDING CANDIDATES AS FIXED EFFECT The above results point to _rs413316737_ and/or the insertion as plausible candidates for
causal variations. The mixed linear model (MLMA or cLDLA) allows us to model important causal candidates and thus improve the mapping of residual variance (if present) in the mapping
population. To investigate the presence of additional loci affecting TL in long-tailed Merino sheep, we modeled the genotypes at the candidate insertion and SNP as fixed effects, i.e., lambs
with homozygous ancestral genotype were classified as class 1, heterozygous as class 2, and homozygous derived as class 3, while the other input data, parameters, and model did not change.
We first estimated the SNP-based heritability of TL after correcting for the most significant causal candidates. The heritability decreased from _h__2_ = 0.992 to 0.898. However, as shown in
Fig. 1d, the cLDLA is unable to highlight additional candidates, although a relatively high proportion of additive-genetic variance is still present in the mapping population studied here.
As expected from the true candidate, the mapping signal on OAR11 disappeared completely (Supplementary Fig. S1b and Fig. 1d). Moreover, the LRT value at the peak on OAR2 decreased from
19.356 to 14.476 and marginally changed its position from 94,538,115 bp to 94,345,619 bp. With this change, a possible QTL on OAR2 loses its genome-wide significance. The mapping results for
the region between 35 Mb and 40 Mb on OAR 11 for the models 1, 2, 3, and 5 are shown in Supplementary Fig. S4. DISCUSSION The present study aimed to identify the genes or variants
responsible for the natural variability of TL in the Merinolandschaf breed. The results presented here suggest that the inheritance of TL depends largely on additive-genetic variance almost
without environment effects (_h__2_ = 0.992). Despite the relatively small number (362) of phenotyped and genotyped animals, heritability was estimated with a relatively low standard error
(SE = 0.12). According to Visscher et al.36, the SE depends only on the sample size and became below 0.1 by using more than ~3000 independent individuals. Phenotypic correlation in related
individuals is often due to a shared environment rather than SNP-associated genetic effects, leading to inflated estimates of heritability37,38. Since all the lambs studied here are from
only two different farms, the actual heritability is most likely lower than the calculated one, even when relatedness and population structure are taken into account. The GWAS (MLMA), i.e.,
the standard method for mapping loci associated with complex traits and diseases, showed no significant and even no suggestive association in the design with 362 animals and 45,114 markers.
Typically, GWAS36 uses several hundred thousand markers in large mapping designs. Therefore, the solution would be to enlarge the sample size and genotype this enlarged mapping design with
high marker density (e.g., OvineHD). However, in studies such as this, where phenotypes are not routinely collected, the number of phenotyped animals may be limited or can only be expanded
at great expense. On the other hand, the number of markers can also be a limiting factor in many species, e.g., there is only 50 K chip for domestic goats. The alternative solution could be
to apply a mapping method that uses more information from the current design. Due to time and cost constraints, we opted for haplotype-based mapping and obtained a highly significant (_P_ =
5.71 × 10−8) and fine mapped QTL on OAR11 (CI 37,000,925–37,521,490 Mb) as well as another genome-wide significant result (_P_ = 1.08 × 10−5) with less sharp mapping (OAR2, CI
93,441,900–96,402,884 Mb). This shows that variance component analysis in the haplotype-based mixed linear model can be more successful than methods such as GWAS when the number of
phenotyped individuals and genotyped markers is not large enough for GWAS. Previous research on TL in domesticated animals, including sheep, mainly pointed to the _T_ gene, also known as the
brachyury gene, as a candidate causal gene. In Hulunbuir short-tailed sheep, a c.G334T mutation in _T_ gene is the main cause of the extreme short-tail phenotype20. Mice that are
heterozygous for mutations in the _T_ gene have a short tail and homozygous embryos die in the middle of the gestation39,40,41. In Manx cats, the short-tailed phenotype is caused by
naturally occurring mutations in _T_ gene23, in particular by three 1-bp deletions. The _T_ gene has also been associated with the short-tailed phenotype in various dog breeds22,42,43.
Additionally, _ANKRD11_, _ACVR2B,_ and _SFRP2_ were detected as plausible candidate genes that could contribute to the reduction in TL in particular dog breeds42 and the somite
segmentation-related gene _HES7_44 in Asian domestic cats. In this study, we could not detect any increase in the LRT curve in the _T_ gene region (OAR8: 87,717,306–87,727,483bp).
Furthermore, neither _ANKRD11_ (OAR14: 13,810,611–13,882,911bp), _ACVR2B_ (OAR19: 11,794,562–11,802,479bp) or _SFRP2_ (OAR17: 3,727,698–3,736,241bp) showed a significant increase in the LRT
value. _HES7_ is located on OAR11 (27,284,897–27,287,414bp) but is 10 Mb away from the QTL CI and was therefore not considered as a candidate gene for TL in the Merinolandschaf breed. The CI
on OAR11 includes the complete ovine _HOXB_ cluster and seven additional genes (Table 3). Among these genes, a potential influence on TL could be predicted for _HOXB6_, _HOXB8,_ and
_HOXB13_. Here, _HOXB13_ is the gene closest to the maximum _LRT_ value, and a corresponding literature search (see Diaz-Cuadros et al.45 for a review) indicates this gene as the most likely
gene causing the large effect. The literature search for plausible candidates for genes at OAR2 (Table 2) yielded hardly any useful results. We tried to improve the search with MGI
Mammalian Phenotype level 4 ontology, a method for classifying and organizing phenotypic information related to mammalian species. After correction for multiple testing (adjusted _P_ <
0.05), 29 ontologies were significantly enriched, but we saw no plausible link to TL. To select the most suitable lambs for capture sequencing of CIs of QTLs on OAR11 and OAR2, we mutually
considered haplotypes, phenotypes, fixed effects, and random effects at _LRTmax_ of both regions. Again, the visual inspections, as well as linear regression analyses, confirmed QTL on OAR2
as inconclusive. This is evident from adjusted _R__2_ = 0.58 and only 0.15 for TL fitted to QTL of OAR11 and OAR2, respectively. Consistent with the clues in favor of OAR11 discussed above,
capture sequencing revealed one plausible point mutation and one SINE insertion in the QTL region. All Merinolandschaf lambs and their fathers show a complete linkage disequilibrium between
these two variants. This is not surprising because these mutations are only 132 base pairs away from each other and recombination in such a short segment should be extremely rare. We further
investigated this linkage in some typical long-tailed and short-tailed breeds (Supplementary Table S4). The linkage was confirmed for long-tailed White Swiss Alpine and Rambouillet breeds
but we observed the occurrence of the missense mutation without the insertion in some individuals of short-tailed domestic sheep breeds as well as in five Asiatic Mouflons (Supplementary
Table S4). Analyses of the WGS data (Supplementary Table S4) and amino acid sequence alignment (Fig. 2) identifies allele C to be ancestral. Therefore, allele G is a derived but relatively
ancient allele that segregates in _Ovis gmelini_ and _Ovis aries_. The 167 bp insertion in the promotor region is also derived but more recent and segregates exclusively in domestic sheep
and predominantly in long-tailed sheep breeds. This insertion occurred in the haplotype with the older missense allele G and both derived alleles segregate as a block. So, we did not observe
the insertion without allele G, but we did observe the older allele G without insertion. The presence of the insertion in long-tailed sheep breeds and its location in the promotor region of
_HOXB13_ increases the probability of the insertion to be the causative variant for the long tail phenotype. Most likely because of its location the insertion modulates the promoter
activity of _HOXB13_ and leads to a longer tail by reducing the expression of the _HOXB13_ gene. Indeed, a recent study by Li et al.35 also detected an insertion of 169 bp close to the 5’
UTR of _HOXB13_ at position 37,525,005 (ARS-UI_Ramb_v2.0) using the graph-assembly-based method on the Pac-bio sequencing data of 13 different sheep breeds including Merino. Therefore, it is
likely this study and Li et al.35 both identified the same insertion. The small differences in length and position of this insertion can be attributed to sequencing error due to the
presence of long homopolymer of “T” base in the identified SINE repeat element. Second generation short reads sequencing technology coupled with assembly errors can make finding promising
structural candidate variants difficult and thus visual examination of the targeted region should be considered. However, long-read sequencing technology like PacBio and ONT have already
proven to be effective in identifying such SVs. Interestingly, the missense mutation (_rs413316737_) in the first exon of _HOXB13_ (37,290,361 C→G) is included on the OvineHD array as SNP
marker _oar3_OAR11_37337253_. This offers the possibility to check the allele distribution in available open-source genotypes (Supplementary Table S5). It is noteworthy, that top-allele G of
_oar3_OAR11_37337253_ correspond to ancestral C in the coding sequence of the _HOXB13_. Genotyping of the candidate SNP and insertion throughout the mapping design provides the opportunity
to test the efficiency of GWAS and cLDLA with the candidates as a marker and as a fixed effect. The inclusion of the candidates as a marker supports a very strong association (i.e., _P_ =
6.2 × 10−12) or identity of the _rs413316737_ and/or insertion with the causative locus. Additionally, it demonstrates the power of GWAS when the design includes causal variants or markers
with population-wide LD. On the other hand, the inclusion of candidates as markers in haplotype-based mapping leads only to partial change, i.e., the mapping significance remains about the
same, but the mapping peak is 9-times closer to the most plausible candidate gene. Even more conclusive is the impact of these mutations as a fixed effect in the model, because the
correction for the true causal variant should cancel out the LRT peak on OAR11 and if this explains a full additive-genetic variance, the heritability should also be reduced towards zero.
Indeed, this model erase the LRT peak on OAR11 (Fig. 1d) but the heritability remains relatively high (0.898). We thus gathered further evidence pointing to a missense mutation in the first
exon of _HOXB13_ and/or a structural variation in the promotor region of _HOXB13_ as plausible causal mutations but also confirmed the presence of other, yet unknown causal variants that
explain a large part of the phenotypic variance. Conversely, it is also possible that heritability estimate of the present study is inflated because of the pedigree-associated genetic
variants and the shared environment37,38 of our samples and therefore, the variations around _HOXB13_ contribute more to TL than these results suggest. _HOXB13_ belongs to the family of
homeobox genes, which were first described in _Drosophila melanogaster_46. This gene codes for transcription factors and play an important role in structuring the body plan during
embryogenesis (reviewed by Diaz-Cuadros et al.45). In mammals, there are 39 _Hox_ genes organized in four clusters and 13 paralogous groups47. There is functional redundancy among the
paralogous _Hox_ genes48 and the paralog alleles can compensate for each other to a certain degree49. Therefore, the loss of function of one _Hox_ gene from a cluster is usually compensated
by the functionality of an intact paralogue, and only the loss of function of several paralogs results in more severe consequences in axial structuring47. There is indirect evidence that
_HOXB6_50 and _HOXB8_51 may influence TL and embryonic viability. However, capture sequencing finds no genetic variants associated with TL in these two candidate genes. Therefore, these
candidates are not discussed further. As vertebrate embryos develop progressively from head to tail, the HOX13 paralogous group has been proposed to control axis termination45, which is
mainly achieved through regulation of proliferation and apoptosis activity in the posterior embryonic regions52. In mice, loss-of-function mutations in _Hoxb13_ lead to overgrowth of the
spinal cord and caudal vertebrae in homozygous mice. These animals consequently show longer and thicker tails while viability and fertility remain unaffected52. Premature expression of genes
of the _Hox13_ paralogous group, on the other hand, negatively influences the extension of the caudal axis and results in a truncated phenotype51. Kingsley et al.53 also suggested that
_Hoxd13_ plays a role in the development of natural TL variation and found out that long-tailed forest mice express less _Hoxd13_ in the embryonic tail bud than short-tailed prairie mice and
(therefore) show better climbing performance. Among all candidate genes from the _Hox_ family, _HOXB13_ is thus the best functional candidate gene. Moreover, Li et al.35 used previously
published RNA-Seq data of sheep colon, performed luciferase reporter assays, and indicated an association of lower expression of _HOXB13_ with insertion in its promotor region. This is in
the line with above findings in mice. However, due to complete linkage, we cannot exclude the cooperative causality of the insertion and missense mutation in exon 1. The _HOXB13_ is
expressed in the prostate of adult humans and is intensively studied as a candidate biomarker for the prognosis of prostate cancer (see Ouhtit et al.54 for review). Studies show that
missense mutations in the coding sequence of _HOXB13_ can change the affinity55 or half-life56 of heterodimer between HOXB13 and, e.g., MEIS1 proteins. However, it was not within the scope
of this study to investigate the affinity between different transcription factors in growing sheep embryos or cell lines. It is well known that the SINE insertion alone can alter gene
expression in multiple ways57, but we propose to test the hypothesis of a possible causality of the combination of altered expression and altered amino acid sequence. As already introduced,
attempts were made in 1970s and 1990s to breed short-tailed Romney19 and Merino12,15 sheep. These attempts failed due to evidence of reduced viability of presumably homozygous short-tailed
Romney embryos and by increased incidences of rear-end defects in short-tailed Merinos. However, these observations contradict the fact that the short tail is the ancestral trait and that
old Nordic short-tailed breeds like Romanov and Finnsheep are very viable and highly fertile. Therefore, we assume that these early experiments were not carried out with animals that had
shorter tails due to ancestral genetic variants, but due to some recent deleterious variants. James et al.15 mentioned that the TLs of the four sires measured before mating were all less
than 5 cm. This is very short for an adult ram, even much shorter than the tail of short-tailed adult Romanov rams (~20 cm). Also, Romney sheep used for experimental mating by Carter19 were
described as “tailless”. Therefore, short or tailless phenotypes described in initial sheep breeding attempts15,19 are comparable with deleterious mutations described for certain dog and cat
breeds22,23,24,42,43 rather than to some ancestral alleles. The causal candidates for short tails detected in this study are highly frequent or fixed in some very viable and highly fertile
breeds (e.g., Finnsheep, Romanov, Dalapaels, and Soay, Supplementary Table S5). Therefore, the selection toward the ancestral allele do not carry any detrimental side effects for fertility
or malformations58 and can be achieved without introgression. However, further studies would be necessary to detect the causality between the extremely short-tailed animals and the
associated undesirable phenotypes as reported previously15,19,22,23,24,42,43. In our design, all homozygous ancestral and the majority (79.7%) of heterozygous lambs could be classified as
short-tailed, while homozygous derived lambs are mostly (71.4%) but not always long-tailed (Table 4). This could be caused by interaction (epistasis) with alleles at other loci in the sheep
genome or simply by the polygenic nature of the high proportion of additive-genetic variance which was still not explained by the mutations discovered here (0.898). In addition, functional
redundancy48 and synergistic interaction59 between the paralogus HOX genes could contribute to the additional complexity of the phenotype. However, the genomic regions containing the HOXA
(OAR4, 68 Mb), HOXC (OAR3, 132 Mb), and HOXD (OAR2, 132 Mb) gene clusters show no signals in the cLDLA analyses without (Model 3) and with (Model 5) candidate mutations as a fixed effect
(Fig. 1c, d). Tacking together, to map some remaining causal variants in long-tailed breeds, we will need a design with much higher statistical power than the one carried out in the present
study. The synergistic interaction of different paralogous HOX genes could also cause a possible link between TL and fine wool in sheep, since it is known that _HOXB13_60 and _HOXC13_61 has
an impact in hair follicle morphogenesis and therefore controls hair formation. This would explain an already discussed association between TL and fine wool. Further studies are needed to
prove or reject a correlation by taking into account the expression patterns of the different _HOX13_ genes. Our results provide a comprehensive insight into the genetic variance of TL in
long-tailed Merino sheep and offer information towards direct gene-assisted selection for shorter tails and thus contribute to animal welfare by avoiding tail docking in the future. Part of
the society, which is actively committed to animal welfare, frequently has prejudices against genetic methods in general. Therefore, it should be pointed out once again that this is a
natural original genetic variant and that selection in favor of this variant serves to restore the most natural original trait. We would not call it a “repair”, but a “back to the roots”.
Additional to commercial and animal welfare aspects this and follow-up studies could contribute to a better understanding of embryonal development too. According to Aires et al.62
quantitative differences within the _Gdf11-Lin28-Hoxb13-Hoxc13_ gene-network might account for the tail size variability observed among vertebrate species. Thereby, the determination of the
TL could result from the relative intensity or the sequence of the individual network components. The mechanisms regulating tail size are still not fully understood especially the pathways
downstream of _Lin28_ and _Hox13_ genes in the _Gdf11-Lin28-Hoxb13-Hoxc13_ network. Therefore, Aires et al.62 suggested testing the gene-network parameters in embryos of vertebrate species
with different tail sizes. The embryos of a sheep breed with an ancestral short tail and a derived long tail are suitable candidates for studying the control of axis termination. In the
meantime, the mapping design could be extended and improved to map genes pointing to further network candidates, possibly downstream mediators of the _HOXB13_. In conclusion, we revealed a
SNP in the first Exon of _HOXB13_ and a SV in the promotor region of _HOXB13_ on OAR11, which are highly associated with TL in the Merinolandschaf breed. The results presented here suggest
that the TL depends on many loci with minor effects, whereas mentioned variations around _HOXB13_ cause the main effect in TL. This is evident from the only slight reduction in heritability
after correcting for these most effective variations. To detect the additional causal loci a more powerful design is needed. Finally, we want to stress that selection for shorter tails is
possible without introgression and without negative side effects and that this selection is ethically unproblematic as it leads to an increased frequency of the ancestral allele METHODS
ANIMAL SAMPLES AND PHENOTYPES The entire mapping design of 362 phenotyped and genotyped animals were collected in three phases: (i) 236 Merinolandschaf lambs with very short (104) or long
(132) tails were selected for phenotyping and sampling from 2293 visually inspected lambs on a farm in Lower Bavaria with no custom of tail docking, (ii) 102 random male lambs from the same
farm were phenotyped and sampled, i.e., without pre-selection by visual inspection of tail-length and (iii) 24 Merinolandschaf lambs with short (19) or long (5) tails were selected from 102
visually inspected lambs from the sheep flock of the teaching and research farm “Oberer Hardthof” at the Justus Liebig University of Giessen (JLU). Figure 4 shows the variation in the tail
phenotype in Merinolandschaf and Supplementary Fig. S5 shows the distribution of TL from the 260 out of 2395 lambs that were selected for sampling after visual inspection. Phenotyping of
these 362 lambs was conducted according to the method proposed by Eck et al.63. Although an age of 5 weeks proved to be the optimal time point for phenotyping, we sampled and phenotyped also
younger and older lambs (Supplementary Table S6 shows descriptive statistics about the age and other phenotypic traits) in order not to disturb the work processes on the farm. TL was
measured with a custom-made wooden board from the anus to the tip of the tail, BW with a standard scale, and WH with a metal measuring device from dog sports from the floor to the highest
point of the withers. Furthermore, gender, age, and litter size were recorded. Unfortunately, the age and litter size are only approximately known for randomly sampled 102 animals. To
improve haplotype inference, we sampled and genotyped 22 putative fathers of the above lambs. These rams were not phenotyped and contributed only indirectly to the QTL mapping. The
collection of blood samples for this study was approved by the ethics committee of the Veterinary Faculty of LMU Munich. All blood samples were taken by experienced and qualified
veterinarians and under a permit from the Government of Upper Bavaria (permit number: 55.2-1-54-2532.0-47-2016), or the Regional Council of Gießen, Hassia (KTV number: 19 c 20 15 h 02 Gi
19/1 KTV 22/2020). GENOTYPES All 362 phenotyped lambs and 22 sires were genotyped using Illumina’s OvineSNP50 BeadChip (Illumina, San Diego, USA) according to the manufacturer’s
specifications. All physical marker positions were determined on the ovine reference genome assembly Oar_4.0 (https://www.ncbi.nlm.nih.gov/assembly/GCF_000298735.2) and the positions of all
markers or sequences in this work correspond to the Oar_4.0 reference genome unless otherwise stated. The chip contains 54,241 SNPs almost evenly distributed across the genome with an
average marker spacing of 50.9 Kb64. Not all of these 54,241 SNPs were used for mapping. We filtered SNPs according to the following exclusion criteria: (i) unsuccessful genotyping in >5%
of the animals, (ii) frequent paternity conflicts in animals with known paternity, (iii) unknown position in the reference genome, (iv) minor allele frequency (MAF) of >0.025 and (v)
localization on a sex chromosome since the analyses were exclusively carried out on autosomes. As a result, 45,114 markers remained in the marker set for the mapping analyses. Haplotype
phasing and imputation were conducted using a Hidden Markov Model implemented in _Beagle_ version 5.065. To improve the accuracy of haplotyping and imputation, genotype and pedigree
information from ~5100 additional animals genotyped with the OvineSNP50 BeadChip but not phenotyped were added to the haplotyping design66. ESTIMATION OF HERITABILITY AND MAPPING First, we
tested the heritability of the TL in the pure Merinolandschaf breed. For this purpose we used the software _GCTA_ v1.93.2, which has been extended with GRM, a tool for estimating the genetic
relationship matrix and a genomic-relatedness-based restricted maximum-likelihood approach (GREML), to estimate the proportion of variance explained by all SNPs (the SNP-based
heritability)67. The sex, age, BW, and WH of the lambs at phenotyping were modeled as fixed effects. MIXED LINEAR MODEL-BASED ASSOCIATION ANALYSIS (MLMA) To map a putative TL locus, we
performed MLMA analyses with a leave-one-chromosome-out approach as implemented in the software _GCTA_ v1.93.268. The model is described below:
$${{{{{\bf{y}}}}}}={{{{{\bf{X}}}}}}\ss+{{{{{{\bf{Z}}}}}}}_{{{{{{\bf{a}}}}}}}{{{{{\bf{a}}}}}}+{{{{{{\bf{Z}}}}}}}_{{{{{{\bf{u}}}}}}}{{{{{\bf{u}}}}}}+{{{{{\bf{e}}}}}}$$ (1) where Y is the
vector of TLs (cm), SS is a vector of fixed effects including the mean, sex, age, BW in kg, and WH in cm at tail phenotyping (BW and WH data were both standardized and centered), A is the
vector of the additive effect (fixed) of the candidate SNP to be tested for association, U is the vector of polygenic effect (random or accumulated) of all markers, excluding those on the
chromosome which contains the candidate SNP and E is a vector of residuals. X, ZA, and ZU are the incidence matrices for SS, A, and U, respectively. The suggestive significance threshold was
set at _P_ < 1/_N_ and the genome-wide significance threshold at _P_ < 0.05/_N_, according to Bonferroni method, _N_ stands for the number of markers69. For the initial MLMA-analysis
(Model 1) we considered 45,114 markers resulting in the suggestive _P_ value at 2.22 × 10−5 and genome-wide at 1.11 × 10−6. A second MLMA (Model 2) analysis included one additional locus,
which we detected during this research, as a consequence, we considered 45,115 markers. The suggestive and genome-wide significance threshold remained the same. COMBINED LINKAGE
DISEQUILIBRIUM AND LINKAGE ANALYSIS (CLDLA) Parallel with SNP-based association analyses using MLMA, we performed multiple haplotype-based cLDLA analyses70, which have been successfully used
for binary and quantitative trait mapping in previous studies71,72,73,74. To correct for familial relationships and population stratification, unified additive relationships (UARs) were
estimated between all animals on a genome-wide level75. The inverse of the UAR matrix was then included in the variance component analysis. To account for linkage disequilibrium in the form
of local haplotype relationships, the Locus IBD (LocIBD) was estimated according to the method of Meuwissen and Goddard76 using sliding windows of 40 SNPs. For each window, we estimated
LocIBD in the middle, i.e., between SNPs 20 and 21, based on the flanking marker haplotypes. Following the method for additive-genetic relationship matrices by Lee and Van der Werf77, the
matrix of LocIBD-values was converted into a diplotype relationship matrix (DRM). Variance component analyses were carried out with the program _ASReml_25 according to the method of
Meuwissen et al.70. _ASReml_ estimated the maximum likelihood, variance components and fixed and random effects simultaneously by considering the genome-wide UAR matrix as well as the
locus-specific (DRM) relationships matrices in the following mixed linear model:
$${{{{{\rm{y}}}}}}={{{{{\rm{X}}}}}}{{\ss}}+{{{{{\rm{Z}}}}}}_1{{{{{\rm{u}}}}}}+{{{{{\rm{Z}}}}}}_2{{{{{\rm{q}}}}}}+e$$ (2) where Y is again the vector of TLs and Β is the vector of fixed
effects (BW, WH, sex, age, and the overall mean µ; BW and WH data were both standardized and centered). The vector U is the vector of random polygenic effects (with U~_N_(0,
\({{{{{{\bf{G}}}}}}\sigma }_{u}^{2}\)) where G represents the matrix of genome-wide UAR estimations), Q is the vector of random additive-genetic QTL effects (with
Q~_N_(0,\({{{{{{\bf{D}}}}}}}_{{{{{{\bf{RMp}}}}}}}{\sigma }_{q}^{2}\)) where DRMP is the diplotype relationship matrix at position _p_ of a supposed QTL) and E is the vector of random
residual effects (with E~_N_(0, \({{{{{{\bf{I}}}}}}\sigma }_{e}^{2}\)), where I is an identity matrix). It is assumed that U, Q, and E are not correlated. X, Z1, and Z2 are incidence
matrices linking the observed values with the fixed and random QTL effects. The variance components and likelihood estimated with _ASReml_ were then used in a likelihood ratio test statistic
(_LRT_). The _LRT_ values follow a _χ_2 distribution with one degree of freedom78 and were calculated as: $${{{{{\boldsymbol{LRT}}}}}}{{{{{\boldsymbol{p}}}}}}=-{{{{{\bf{2}}}}}}\times
({{{{{\bf{log }}}}}}({{{{{\boldsymbol{L}}}}}}{{{{{\bf{0}}}}}})-{{{{{\mathbf{log }}}}}}({{{{{\boldsymbol{L}}}}}}{{{{{\bf{1}}}}}}{{{{{\boldsymbol{p}}}}}}))$$ (3) where log(_L__0_) is the
logarithm of the likelihood estimated by _ASReml_ for the model without QTL effects and log(_L__1p_) the model with QTL effects at the center of the window _p_. To obtain a significance
threshold, a Bonferroni correction was carried out to account for multiple testing due to the 40-SNP sliding windows79. This resulted in a corrected _P_ value of <4.44× 10−5 (i.e.,
0.05/1127 where 1127 is the number of non-overlapping 40-SNP windows), and a corresponding _LRT_ value with genome-wide significance is equal to 16.67. In addition to Model 3 described
above, we performed two further genome-wide cLDLA analyses, with both analyses using candidate locus genotypes for the same set of animals. Model 4 contains only one additional locus and is
therefore comparable to Model 2 of MLMA. In Model 5, the genotypes of the same candidate locus are considered as a fixed effect, i.e., Β is the vector of µ, sex, age, BW, WH, and candidate
locus effects. A comprehensive overview of the different models is provided in Table 1. For all maxima of the LRT curve (_LRT_max) that exceeded the genome-wide significance threshold, the
2-LOD (logarithm of the odds) criterion was used to determine the associated CIs80. Closely located LRT peaks were assumed to belong to the same QTL as described by Müller et al.74. The
results of the analyses were presented as Manhatten plots produced by the _R_ package _qqman_81. ANNOTATION OF GENE CONTENT AND GENE SET ENRICHMENT ANALYSIS The CIs of the _LRT_max were
compared with a map of annotated genes in the UCSC Genome Browser Oar_v4.0/oviAri4 Assembly (https://genome.ucsc.edu/cgi-bin/hgGateway) using the “RefSeq Genes” track. We refer to the
assembly of Oar_v3.1 (https://www.ncbi.nlm.nih.gov/assembly/GCF_000298735.1), the _Mus musculus_ Assembly (https://www.ncbi.nlm.nih.gov/assembly/GCF_000001635.27) and the _Homo sapiens_
Assembly (https://www.ncbi.nlm.nih.gov/assembly/GCF_000001405.39) as well to consider genes encompassing the CIs. Next, gene set enrichment analyses were carried out with the software
_Enrichr_ (Ontologies, MGI Mammalian Phenotype Level 4 2019)82,83. DNA EXTRACTION, SEQUENCING, AND ANALYSIS OF THE SEQUENCES Genomic regions where the LRT curve reached the maximum values
above the genome-wide significance threshold (in Model 3) were sequenced using targeted capture sequencing on 48 selected lambs (23 long-tailed and 25 short-tailed). To minimize the risk of
missing some causal variations that are close but outside the CI, we increased the candidate regions on both sides by ~300 Kb. This resulted in the captured regions on the sheep chromosome 2
(OAR2) from 93,200,000 to 96,700,000 bp and on chromosome 11 (OAR11) from 36,600,000 to 37,900,000 bp. Genomic DNA was extracted from the blood samples using the ReliaPrep™ Blood gDNA
Miniprep System. WGS libraries were prepared from 250 ng of genomic DNA by tagmentation with the NexteraFlex kit from Illumina. Subsequently, the libraries were dual-barcoded and amplified
by PCR, purified with SPRI beads and pooled in equimolar amounts. The pooled libraries were enriched for the target regions by hybridization to an Agilent capture array with 244k oligo
spots. The oligo probes were selected from the repeat-masked DNA sequence as all possible 60mers that do not overlap with repeat-masked bases and that are staggered in 15 nt tiling steps
concerning their neighbors. After 65 hours of hybridization in the presence of cot-I sheep DNA and adapter blocking oligos at 65 °C the capture array was washed and the captured library
molecules were eluted at 95 °C for 10 min in a volume of 500 µl DNA-grade water. The enriched libraries were then amplified by PCR, analyzed on the Bioanalyzer and sequenced in 2 × 110 bp
paired-end mode on a P2 flowcell of a NextSeq1000 sequencer from Illumina. _Sickle_84 was used to trim the adaptors and filter low-quality sequences of the raw reads in the FASTQ files. With
_FastQC_85, we assessed the quality parameters of the filtered sequencing data. The filtered reads were mapped to the sheep reference genome OAR_v4.0 using the default parameters of
_BWA-MEM_86 alignment tool. To convert the SAM files into coordinate sorted BAM files and to remove the duplicated reads, _Samtools_87 and _Picard_88 were used. Base quality recalibration
and indel realignment were done with _GATK_27. SNPs and small indels calling was performed with _GATK HaplotypeCaller_27 using: (i) 37 wild sheep genomes, (ii) 16 domestic sheep genomes
(representing 8 samples of short-tailed breeds and long-tailed breeds each) and (iii) two sequence pools, representing short and long-tailed group of the 48 Merinolandschaf lambs. These 48
lambs were sequenced individually (see above), then reads were pooled to increase the sequencing coverage of both groups. The wild and domestic sheep genomes downloaded from NCBI sequence
read archive are shown in Supplementary Table S4. To filter the obtained _GATK_ short variants, the criteria described in the Supplementary methods were applied. To detect structural
variants (SV) we used _SMOOVE_ (_LUMPY_)32 and _DELLY_33 with default parameters in the pooled reads of short- and long-tailed lambs. To ensure that we do not miss any candidate variant, we
performed a visual examination of the captured regions using _JBrowse_28, focusing on the region near the candidate gene detected here. VALIDATION OF CANDIDATE SNP USING PCR For one detected
candidate SNP, we performed genotyping by PCR-RFLP and electrophoresis on a 2% agarose gel on all 362 sampled lambs, and there 17 confirmed fathers. The PCR primer sequences designed by
_Primer3_89 were TTTAAAACGCTTTGGATT (forward) and CACTCGGCAGGAGTAGTA (reverse). The used restriction enzyme was _BsrI_. It recognizes the mutant sequence TGAC/CN, where the G is the variable
base and the “/” presents the site where cutting is performed. DNA amplification was performed with 35 cycles. The total reaction mixture was 15.0 µl containing 3.0 µl 5× buffer, 1.5 µl
dNTPs (10 mM), 0.6 µl of 10 µM forward and reverse primers, respectively, 1.0 µl DNA (15 ng/µl), 0.07 µl GoTaq®G2 DNA Polymerase (Promega, Madison, Wisconsin, USA) and distilled water. We
used 1.5 U of the enzyme _BsrI_, 3.0 µl DNA (PCR product), 2 µl Cut Smart Buffer, and distilled water for a total reaction volume of 20 µl. The reaction mixture was afterward incubated for 3
h at 65 °C. In the final step, we separated the DNA fragments by size and visualized them by GelRed™-stained agarose gel electrophoresis. Only sequences harboring the derived allele (SNP G)
were cut, the two resulting fragments had a length of 120 bp and 259 bp. The sequence with the ancestral allele (SNP C) retained its length of 379 bp. VALIDATION OF CANDIDATE INSERTION
USING PCR AND SANGER SEQUENCING We also performed genotyping by PCR and electrophoresis on a 2% agarose gel on the same 379 sampled sheep for one detected causal candidate insertion.
Multiple PCR primer sequences designed by _Primer3_89 were tested (see Supplementary Table S7), those that worked best are TTTATGAGCTTCTCTCCGCCA (forward) and CACTCGGCAGGAGTAGTA (reverse).
DNA amplification was performed with 35 cycles. The total reaction mixture was 25.0 µl containing 5.0 µl 5× buffer, 2.5 µl dNTPs (10 mM), 1 µl of 10 µM forward and reverse Primer,
respectively, 1.0 µl DNA (15 ng/µl), 0.07 µl GoTaq®G2 DNA Polymerase (Promega, Madison, Wisconsin, USA) and distilled water. In the final step, we separated the amplicons by size and
visualized them by GelRed™-stained agarose gel electrophoresis. Two lambs, which are homozygous for the SV, one lamb, which is homozygous for the ancestral allele, and two heterozygous lambs
were resequenced using Sanger sequencing with the above-mentioned best working primers. The amplicons were sequenced using the cycle sequencing technology (dideoxy chain termination/cycle
sequencing) on ABI 3730XL sequencing machines (Eurofins Genomics, Germany). The sequenced data were analyzed using _SnapGene_ software (from Insightful Science; available at
https://www.snapgene.com/). REPORTING SUMMARY Further information on research design is available in the Nature Research Reporting Summary linked to this article. DATA AVAILABILITY Genotype
and phenotype data as well as Sanger sequences that support the findings of this study have been deposited in Figshare (https://doi.org/10.6084/m9.figshare.19375712,
https://doi.org/10.6084/m9.figshare.19368962, https://doi.org/10.6084/m9.figshare.19368905, https://doi.org/10.6084/m9.figshare.19368734). Also, the source data for the plots (results of the
MLMAs and the cLDLAs) have been deposited on Figshare (https://doi.org/10.6084/m9.figshare.20439849 and https://doi.org/10.6084/m9.figshare.20439582). Capture sequencing data have been
deposited in European Nucleotide Archive (ENA) with the study accession PRJEB51698. All other data are available from the corresponding author on reasonable request. REFERENCES * Zeder, M.
A. Domestication and early agriculture in the Mediterranean basin: origins, diffusion, and impact. _Proc. Natl Acad. Sci. USA_ 105, 11597–11604 (2008). Article CAS PubMed PubMed Central
Google Scholar * Arbuckle, B. S. & Atici, L. Initial diversity in sheep and goat management in Neolithic south-western Asia. _Levant_ 45, 219–235 (2013). Article Google Scholar *
Sherratt, A. J. W. A. The secondary exploitation of animals in the Old World. _World Archaeol._ 15, 90–104 (1983). Article Google Scholar * Shelton, M. Studies on tail length of
Rambouillet and Mouflon sheep. _J. Hered._ 68, 128–130 (1977). Article CAS PubMed Google Scholar * Sutherland, M. A. & Tucker, C. B. The long and short of it: a review of tail
docking in farm animals. _Appl. Anim. Behav. Sci._ 135, 179–191 (2011). Article Google Scholar * Hannemann, R., Bauer, B., Ganter, M. & Strobel, H. Schmerzhafte Eingriffe beim
Schaf–Schwanzkupieren (Painful interventions in sheep - tail docking). _Tier.ärztliche Prax. Ausg. G: Großtiere/Nutztier._ 45, 302–311 (2017). Article Google Scholar * Bohte-Wilhelmus, D.
I., De Haas, Y., Veerkamp, R. F. & Windig, J. J. Genetic selection as alternative to tail docking in Hampshire Down and Clun Forest. In: Proceedings of the 9th world congress on genetics
applied to livestock production; Leipzig, Germany 2010. * Verordnung der Bundesministerin für Gesundheit und Frauen über die Mindestanforderungen für die Haltung von Pferden und
Pferdeartigen, Schweinen, Rindern, Schafen, Ziegen, Schalenwild, Lamas, Kaninchen, Hausgeflügel, Straußen und Nutzfischen 1. Tierhaltungsverordnung (Austrian Animal Husbandry Ordinance),
ThVO (2017). * Tierschutzgesetz in der Fassung der Bekanntmachung vom 18. Mai 2006 (BGBl. I S. 1206, 1313), das zuletzt durch Artikel 105 des Gesetzes vom 10. August 2021 (BGBl. I S. 3436)
geändert worden ist (German animal protection law), TierSchG (1972). * Bundestierärztekammer e. V. Positionspapier der Bundestierärztekammer zu notwendigen Weiterentwicklungen der
Rechtsetzung zur Verbesserung des Tierschutzes bei Nutztieren (Position statement of the German Veterinary Association on necessary further developments of legislation to improve animal
welfare in farm animals). (2017). * Lynch, J. J., Hinch, G. N. & Adams, D. B. _The behaviour of sheep: biological principles and implications for production_. 1st edn, (C.A.B.
international, 1992). * James, P. J., Gare, D. R., Singh, A. W. & Ancell, P. M. Studies of the potential for breeding short tail Merinos. _Wool. Technol. Sheep Breed._ 38, 106–111
(1990). Google Scholar * Greeff, J. C., Karlsson, L. J. E. & Schlink, A. C. Inheritance of tail length in Merino sheep. _Proc. Assoc. Advmt. Breed. Genet._ 21, 237–240 (2015). Google
Scholar * Oltenacu, E. A. & Boylan, W. J. Inheritance of tail length in crossbred Finnsheep. _J. Hered._ 65, 331–334 (1974). Article CAS PubMed Google Scholar * James, P., Ponzoni,
R., Gare, D. & Cockrum, K. Inheritance of short tailedness in South Australian Merinos. In: Proceedings of the Association for the Advancement of Animal Breeding and Genetics. p. 404–407
(1991). * Scobie, D. & O’Connell, D. Genetic reduction of tail length in New Zealand sheep. In: Proceedings of the New Zealand Society of Animal Production: New Zealand Society of
Animal Production. p. 195–198 (2002). * Eck, K. Untersuchung der natürlichen Schwanzlängenvariation beim Merinolandschaf als mögliche Zuchtalternative zur tierschutzrelevanten Praktik des
Schwanzkupierens (Investigation of natural tail length variation in Merinolandschaf as a possible breeding alternative to the welfare-relevant practice of tail docking), lmu, (2020). *
Jordan, R. The description of the No Tail breed of sheep following forty years of breeding. In: Proceedings of the South Dakota Academy of Science. p. 103–104 (1952). * Carter, A. H.
Inherited taillessness in sheep. _New Zealand Ministry of Agriculture and Fisheries. Annual report of the research division_., p. 44–45 (1976). * Zhi, D. et al. Whole genome sequencing of
hulunbuir short-tailed sheep for identifying candidate genes related to the short-tail phenotype. _G3 (Bethesda)_ 8, 377–383 (2018). Article CAS Google Scholar * Han, J. et al. Two linked
_TBXT_ (brachyury) gene polymorphisms are associated with the tailless phenotype in fat-rumped sheep. _Anim. Genet._ 50, 772–777 (2019). Article CAS PubMed PubMed Central Google Scholar
* Hytonen, M. K. et al. Ancestral T-box mutation is present in many, but not all, short-tailed dog breeds. _J. Hered._ 100, 236–240 (2009). Article CAS PubMed Google Scholar *
Buckingham, K. J. et al. Multiple mutant T alleles cause haploinsufficiency of Brachyury and short tails in Manx cats. _Mamm. Genome_ 24, 400–408 (2013). Article PubMed Google Scholar *
Indrebo, A. et al. A study of inherited short tail and taillessness in Pembroke Welsh corgi. _J. Small Anim. Pract._ 49, 220–224 (2008). Article CAS PubMed Google Scholar * Gilmour, A.
R. et al. ASReml User Guide Release 3.0. (VSN International Ltd, Hemel Hempstead, 2009). * R: A language and environment for statistical computing (R Foundation for Statistical Computing,
Vienna, Austria. URL https://www.R-project.org/). * Auwera, G. D. & O’Connor, B. _Genomics in the Cloud: Using Docker, GATK, and WDL in Terra_. 1st edn, (O’Reilly Media, 2020). * Buels,
R. et al. JBrowse: a dynamic web platform for genome visualization and analysis. _Genome Biol._ 17, 66 (2016). Article PubMed PubMed Central CAS Google Scholar * McLaren, W. et al. The
ensembl variant effect predictor. _Genome Biol._ 17, 122 (2016). Article PubMed PubMed Central CAS Google Scholar * Altschul, S. F. et al. Basic local alignment search tool. _J. Mol.
Biol._ 215, 403–410 (1990). Article CAS PubMed Google Scholar * Yurtman, E. et al. Archaeogenetic analysis of Neolithic sheep from Anatolia suggests a complex demographic history since
domestication. _Commun. Biol._ 4, 1279 (2021). Article CAS PubMed PubMed Central Google Scholar * Layer, R. M., Chiang, C., Quinlan, A. R. & Hall, I. M. LUMPY: a probabilistic
framework for structural variant discovery. _Genome Biol._ 15, R84 (2014). Article PubMed PubMed Central Google Scholar * Rausch, T. et al. DELLY: structural variant discovery by
integrated paired-end and split-read analysis. _Bioinformatics_ 28, i333–i339 (2012). Article CAS PubMed PubMed Central Google Scholar * Davenport, K. M. et al. An improved ovine
reference genome assembly to facilitate in-depth functional annotation of the sheep genome. _GigaScience_ 11, https://doi.org/10.1093/gigascience/giab096 (2022). * Li, R. et al. The first
sheep graph pan-genome reveals the spectrum of structural variations and their effects on different tail phenotypes. _bioRxiv_ https://www.biorxiv.org/content/10.1101/2021.12.22.472709v1
(2022). * Visscher, P. M. et al. Statistical power to detect genetic (co)variance of complex traits using SNP data in unrelated samples. _PLoS Genet._ 10, e1004269 (2014). Article PubMed
PubMed Central CAS Google Scholar * Zaitlen, N. et al. Using extended genealogy to estimate components of heritability for 23 quantitative and dichotomous traits. _PLoS Genet._ 9,
e1003520 (2013). Article CAS PubMed PubMed Central Google Scholar * Xia, C. et al. Pedigree- and SNP-associated genetics and recent environment are the major contributors to
anthropometric and cardiometabolic trait variation. _PLoS Genet._ 12, e1005804 (2016). Article PubMed PubMed Central CAS Google Scholar * Gluecksohn-Schoenheimer, S. The development of
normal and homozygous brachy (T/T) mouse embryos in the extraembryonic coelom of the chick. _Proc. Natl Acad. Sci. USA_ 30, 134–140 (1944). Article CAS PubMed PubMed Central Google
Scholar * Gruneberg, H. Genetical studies on the skeleton of the mouse. XXIII. The development of brachyury and anury. _J. Embryol. Exp. Morphol._ 6, 424–443 (1958). CAS PubMed Google
Scholar * Beddington, R. S. P., Rashbass, P. & Wilson, V. Brachyury-a gene affecting mouse gastrulation and early organogenesis. _Dev. Suppl._ 157–165 (1992). * Yoo, D. et al. The
genetic origin of short tail in endangered Korean Dog, DongGyeongi. _Sci. Rep._ 7, 10048 (2017). Article PubMed PubMed Central CAS Google Scholar * Haworth, K. et al. Canine homolog of
the T-box transcription factor T; failure of the protein to bind to its DNA target leads to a short-tail phenotype. _Mamm. Genome_ 12, 212–218 (2001). Article CAS PubMed Google Scholar *
Xu, X. et al. Whole genome sequencing identifies a missense mutation in _HES7_ associated with short tails in asian domestic cats. _Sci. Rep._ 6, 31583 (2016). Article CAS PubMed PubMed
Central Google Scholar * Diaz-Cuadros, M., Pourquie, O. & El-Sherif, E. Patterning with clocks and genetic cascades: Segmentation and regionalization of vertebrate versus insect body
plans. _PLoS Genet._ 17, e1009812 (2021). Article CAS PubMed PubMed Central Google Scholar * Lewis, E. B. A gene complex controlling segmentation in Drosophila. _Nature_ 276, 565–570
(1978). Article CAS PubMed Google Scholar * Wellik, D. M. Hox patterning of the vertebrate axial skeleton. _Dev. Dyn._ 236, 2454–2463 (2007). Article CAS PubMed Google Scholar *
McIntyre, D. C. et al. Hox patterning of the vertebrate rib cage. _Development_ 134, 2981–2989 (2007). Article CAS PubMed Google Scholar * Maconochie, M., Nonchev, S., Morrison, A. &
Krumlauf, R. Paralogous _Hox_ genes: function and regulation. _Annu. Rev. Genet._ 30, 529–556 (1996). Article CAS PubMed Google Scholar * Casaca, A., Nóvoa, A. & Mallo, M. _Hoxb6_
can interfere with somitogenesis in the posterior embryo through a mechanism independent of its rib-promoting activity. _Development_ 143, 437–448 (2016). CAS PubMed Google Scholar *
Young, T. et al. _Cdx_ and _Hox_ genes differentially regulate posterior axial growth in mammalian embryos. _Dev. Cell_ 17, 516–526 (2009). Article CAS PubMed Google Scholar *
Economides, K. D., Zeltser, L. & Capecchi, M. R. _Hoxb13_ mutations cause overgrowth of caudal spinal cordand tail vertebrae. _Dev. Biol._ 256, 317–330 (2003). Article CAS PubMed
Google Scholar * Kingsley, E. P. et al. Adaptive tail-length evolution in deer mice is associated with differential _Hoxd13_ expression in early development. _bioRxiv_
https://www.biorxiv.org/content/10.1101/2021.12.18.473263v1 (2021). * Ouhtit, A. et al. _Hoxb13_, a potential prognostic biomarker for prostate cancer. _Front. Biosci._ 8, 40–45 (2016).
Article Google Scholar * Van Opstall, C. et al. _MEIS_-mediated suppression of human prostate cancer growth and metastasis through _HOXB13_-dependent regulation of proteoglycans. _Elife_
9, https://doi.org/10.7554/eLife.53600 (2020). * Johng, D. et al. HOXB13 interaction with MEIS1 modifies proliferation and gene expression in prostate cancer. _Prostate_ 79, 414–424 (2019).
Article CAS PubMed Google Scholar * Elbarbary, R. A., Lucas, B. A. & Maquat, L. E. Retrotransposons as regulators of gene expression. _Science_ 351, aac7247 (2016). Article PubMed
PubMed Central Google Scholar * Stoffel, M. A., Johnston, S. E., Pilkington, J. G. & Pemberton, J. M. Genetic architecture and lifetime dynamics of inbreeding depression in a wild
mammal. _Nat. Commun._ 12, 2972 (2021). Article CAS PubMed PubMed Central Google Scholar * Chen, F. & Capecchi, M. R. Targeted mutations in _hoxa-9_ and _hoxb-9_ reveal synergistic
interactions. _Dev. Biol._ 181, 186–196 (1997). Article CAS PubMed Google Scholar * Komuves, L. G. et al. HOXB13 homeodomain protein is cytoplasmic throughout fetal skin development.
_Dev. Dyn._ 227, 192–202 (2003). Article CAS PubMed Google Scholar * Sander, G. R. & Powell, B. C. Structure and expression of the ovine _Hoxc-13_ gene. _Gene_ 327, 107–116 (2004).
Article CAS PubMed Google Scholar * Aires, R. et al. Tail bud progenitor activity relies on a network comprising _Gdf11_, _Lin28_, and _Hox13_ genes. _Dev. Cell_ 48, 383–395 e388 (2019).
Article CAS PubMed Google Scholar * Eck, K. et al. Morphometric measurements in lambs as a basis for future mapping studies. _Small Rumin. Res._ 181, 57–64 (2019). Article Google
Scholar * Illumina. _Data Sheet: Agrigenomics. OvineSNP50 Genotyping BeadChip_, <https://www.illumina.com> (2015). * Browning, B. L., Zhou, Y. & Browning, S. R. A one-penny
imputed genome from next-generation reference panels. _Am. J. Hum. Genet._ 103, 338–348 (2018). Article CAS PubMed PubMed Central Google Scholar * Browning, S. R. & Browning, B. L.
Haplotype phasing: existing methods and new developments. _Nat. Rev. Genet._ 12, 703–714 (2011). Article CAS PubMed PubMed Central Google Scholar * Yang, J., Lee, S. H., Goddard, M. E.
& Visscher, P. M. GCTA: a tool for genome-wide complex trait analysis. _Am. J. Hum. Genet._ 88, 76–82 (2011). Article CAS PubMed PubMed Central Google Scholar * Yang, J. et al.
Advantages and pitfalls in the application of mixed-model association methods. _Nat. Genet._ 46, 100–106 (2014). Article PubMed PubMed Central CAS Google Scholar * Xiong, X. et al.
Genome-wide association analysis reveals genetic loci and candidate genes for meat quality traits in Chinese Laiwu pigs. _Mamm. Genome_ 26, 181–190 (2015). Article PubMed Google Scholar *
Meuwissen, T. H. et al. Fine mapping of a quantitative trait locus for twinning rate using combined linkage and linkage disequilibrium mapping. _Genetics_ 161, 373–379 (2002). Article CAS
PubMed PubMed Central Google Scholar * Kunz, E. et al. Confirmation of a non-synonymous SNP in _PNPLA8_ as a candidate causal mutation for Weaver syndrome in Brown Swiss cattle. _Genet.
Sel. Evol._ 48, 21 (2016). Article PubMed PubMed Central CAS Google Scholar * Rothammer, S. et al. Remapping of the belted phenotype in cattle on _BTA3_ identifies a multiplication
event as the candidate causal mutation. _Genet. Sel. Evol._ 50, 36 (2018). Article PubMed PubMed Central CAS Google Scholar * Rothammer, S. et al. Genome-wide QTL mapping of nine body
composition and bone mineral density traits in pigs. _Genet. Sel. Evol._ 46, 68 (2014). Article PubMed PubMed Central Google Scholar * Muller, M. P. et al. Genome-wide mapping of 10
calving and fertility traits in Holstein dairy cattle with special regard to chromosome 18. _J. Dairy Sci._ 100, 1987–2006 (2017). Article PubMed CAS Google Scholar * Powell, J. E.,
Visscher, P. M. & Goddard, M. E. Reconciling the analysis of IBD and IBS in complex trait studies. _Nat. Rev. Genet._ 11, 800–805 (2010). Article CAS PubMed Google Scholar *
Meuwissen, T. H. & Goddard, M. E. Prediction of identity by descent probabilities from marker-haplotypes. _Genet. Sel. Evol._ 33, 605–634 (2001). Article CAS PubMed PubMed Central
Google Scholar * Lee, S. H. & Van der Werf, J. H. Using dominance relationship coefficients based on linkage disequilibrium and linkage with a general complex pedigree to increase
mapping resolution. _Genetics_ 174, 1009–1016 (2006). Article CAS PubMed PubMed Central Google Scholar * Heuven, H. C., Bovenhuis, H., Janss, L. L. & van Arendonk, J. A. Efficiency
of population structures for mapping of Mendelian and imprinted quantitative trait loci in outbred pigs using variance component methods. _Genet. Sel. Evol._ 37, 635–655 (2005). Article CAS
PubMed PubMed Central Google Scholar * Bland, J. M. & Altman, D. G. Multiple significance tests: the Bonferroni method. _BMJ_ 310, 170 (1995). Article CAS PubMed PubMed Central
Google Scholar * van Ooijen, J. W. Accuracy of mapping quantitative trait loci in autogamous species. _Theor. Appl Genet._ 84, 803–811 (1992). Article PubMed Google Scholar * qqman: Q-Q
and Manhattan Plots for GWAS Data. v. R package version 0.1.4. (2017). * Chen, E. Y. et al. Enrichr: interactive and collaborative HTML5 gene list enrichment analysis tool. _BMC
Bioinformatics_ 14, 128 (2013). Article PubMed PubMed Central Google Scholar * Kuleshov, M. V. et al. Enrichr: a comprehensive gene set enrichment analysis web server 2016 update.
_Nucleic Acids Res._ 44, W90–W97 (2016). Article CAS PubMed PubMed Central Google Scholar * Joshi NA, F. J. Sickle: a sliding-window, adaptive, quality-based trimming tool for FastQ
files (Version 1.33). (2011). * B, B. FastQC: a quality control tool for high throughput sequence data. (2011). * Li, H. & Durbin, R. Fast and accurate short read alignment with
Burrows-Wheeler transform. _Bioinformatics_ 25, 1754–1760 (2009). Article CAS PubMed PubMed Central Google Scholar * Li, H. et al. The sequence alignment/map format and SAMtools.
_Bioinformatics_ 25, 2078–2079 (2009). Article PubMed PubMed Central CAS Google Scholar * Institute, B. Picard toolkit. _Broad Institute, GitHub repository_ (2019). * Untergasser, A. et
al. Primer3-new capabilities and interfaces. _Nucleic Acids Res._ 40, e115 (2012). Article CAS PubMed PubMed Central Google Scholar * Goujon, M. et al. A new bioinformatics analysis
tools framework at EMBL-EBI. _Nucleic Acids Res._ 38, W695–W699 (2010). Article CAS PubMed PubMed Central Google Scholar * Sievers, F. et al. Fast, scalable generation of high-quality
protein multiple sequence alignments using clustal omega. _Mol. Syst. Biol._ 7, 539 (2011). Article PubMed PubMed Central Google Scholar Download references ACKNOWLEDGEMENTS The authors
thank the shepherd Hermann Stadler from Lower Bavaria as well as Stefan Mandler and Anja Henning from the teaching and research farm “Oberer Hardthof” at the JLU Giessen for making sheep
available for measuring and sampling. The authors thank Ch. Flury and C. Drögemüller for sending unfiltered SNP genotype data of four Swiss sheep breeds (see Supplementary Table S5). D.L.
was funded by a PhD Research Fellowship from the Tierzuchtforschung e.V. München. FUNDING Open Access funding enabled and organized by Projekt DEAL. AUTHOR INFORMATION AUTHORS AND
AFFILIATIONS * Population Genomics Group, Department of Veterinary Sciences, LMU Munich, Lena-Christ-Str. 48, 82152, Martinsried, Germany Dominik Karl Lagler, Elisabeth Hannemann, Kim Eck,
Jürgen Klawatsch, Maulik Upadhyay & Ivica Medugorac * Tierzuchtforschung e.V. München, Senator-Gerauer-Str. 23, 85586, Poing, Germany Dominik Karl Lagler, Kim Eck, Jürgen Klawatsch,
Doris Seichter & Ingolf Russ * Institute for Animal Breeding, Bavarian State Research Center for Agriculture, Prof.-Dürrwaechter-Platz 1, 85586, Poing, Germany Christian Mendel *
Institute of Animal Breeding and Genetics, JLU Gießen, Ludwigstr. 21, 35390, Gießen, Germany Gesine Lühken * Laboratory for Functional Genome Analysis, Gene Center,
Ludwig-Maximilians-University Munich, 80539, Munich, Germany Stefan Krebs & Helmut Blum Authors * Dominik Karl Lagler View author publications You can also search for this author
inPubMed Google Scholar * Elisabeth Hannemann View author publications You can also search for this author inPubMed Google Scholar * Kim Eck View author publications You can also search for
this author inPubMed Google Scholar * Jürgen Klawatsch View author publications You can also search for this author inPubMed Google Scholar * Doris Seichter View author publications You can
also search for this author inPubMed Google Scholar * Ingolf Russ View author publications You can also search for this author inPubMed Google Scholar * Christian Mendel View author
publications You can also search for this author inPubMed Google Scholar * Gesine Lühken View author publications You can also search for this author inPubMed Google Scholar * Stefan Krebs
View author publications You can also search for this author inPubMed Google Scholar * Helmut Blum View author publications You can also search for this author inPubMed Google Scholar *
Maulik Upadhyay View author publications You can also search for this author inPubMed Google Scholar * Ivica Medugorac View author publications You can also search for this author inPubMed
Google Scholar CONTRIBUTIONS D.L. analyzed and interpreted data and drafted the manuscript. E.H. and K.E. collected samples and phenotypes and assisted in drafting the manuscript. J.K.
assisted in the interpretation of the analyses. D.S. and I.R. coordinated genotyping of the samples and provided data. C.M. provided expertise about the status and problems of modern sheep
breeding and established contacts with breeders. G.L. provided samples and phenotypes. S.K. performed capture sequencing and assisted in drafting the manuscript. H.B. performed and
coordinated capture sequencing. M.U. carried out data analyses and interpretation and assisted in drafting the manuscript. I.M. conceived and guided this study, provided analysis tools,
analyzed data, and critically revised the manuscript. All authors read and approved the final manuscript. CORRESPONDING AUTHOR Correspondence to Ivica Medugorac. ETHICS DECLARATIONS
COMPETING INTERESTS The authors declare no competing interests. PEER REVIEW PEER REVIEW INFORMATION _Communications Biology_ thanks Xiaolong Wang and the other, anonymous, reviewer(s) for
their contribution to the peer review of this work. Primary Handling Editors: Luciano Matzkin and Luke R. Grinham. ADDITIONAL INFORMATION PUBLISHER’S NOTE Springer Nature remains neutral
with regard to jurisdictional claims in published maps and institutional affiliations. SUPPLEMENTARY INFORMATION SUPPLEMENTARY INFORMATION REPORTING SUMMARY RIGHTS AND PERMISSIONS OPEN
ACCESS This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format,
as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third
party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the
article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright
holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. Reprints and permissions ABOUT THIS ARTICLE CITE THIS ARTICLE Lagler, D.K., Hannemann, E., Eck, K.
_et al._ Fine-mapping and identification of candidate causal genes for tail length in the Merinolandschaf breed. _Commun Biol_ 5, 918 (2022). https://doi.org/10.1038/s42003-022-03854-3
Download citation * Received: 04 April 2022 * Accepted: 16 August 2022 * Published: 06 September 2022 * DOI: https://doi.org/10.1038/s42003-022-03854-3 SHARE THIS ARTICLE Anyone you share
the following link with will be able to read this content: Get shareable link Sorry, a shareable link is not currently available for this article. Copy to clipboard Provided by the Springer
Nature SharedIt content-sharing initiative