The case study handles the situation in German Holstein Frisian (HF) breeding where the general Index (RZG) is based on several traits. The composition of the index changed in 2021, so that there are now eight instead of 6 traits in the index. Also, the exact trait definition was changed. In total, there are 10 “breeding goal traits”. See Simianer et al. “How economic weights translate into genetic and phenotypic progress, and vice versa” Genet Sel Evol 55, 38 (2023) for more details on it.
The economic weights (w
) of the indices are as follows,
with traits not in the index having zero weight.
<- c("RZM", "RZN", "RZEo", "RZEn", "RZR", "RZKm", "RZKd", "RZH", "RZC", "RZS")
tn
<- c(0.45, 0.2, 0.15, 0, 0.1, 0.03, 0, 0, 0, 0.07)
w_old names(w_old) <- tn; w_old
#> RZM RZN RZEo RZEn RZR RZKm RZKd RZH RZC RZS
#> 0.45 0.20 0.15 0.00 0.10 0.03 0.00 0.00 0.00 0.07
<- c(0.36, 0.18, 0, 0.15, 0.07, 0.015, 0.015, 0.18, 0.03, 0)
w_new names(w_new) <- tn; w_old
#> RZM RZN RZEo RZEn RZR RZKm RZKd RZH RZC RZS
#> 0.45 0.20 0.15 0.00 0.10 0.03 0.00 0.00 0.00 0.07
Breeding values are scaled to a mean of 100 index points and a additive genetic standard deviation of 12 index points. This makes it easy to set up the genetic variance-covariance matrix ( Γ=G ) from the genetic correlation matrix by simply multiplying the correlation matrix with a constant of 122.
<- matrix(
G c(1.0,0.13,0.13,0.07,-0.15,0.11,0.07,0.09,-0.02,0.04,
0.13,1.0,0.23,0.28,0.43,0.25,0.22,0.78,0.13,0.46,
0.13,0.23,1.0,0.92,0.02,0.09,-0.05,0.25,-0.1,0.19,
0.07,0.28,0.92,1.0,0.06,0.08,-.03,0.31,-.1,0.25,
-0.15,0.43,.02,0.06,1.0,0.32,0.19,0.41,0.04,0.15,
0.11,0.25,0.09,0.08,0.32,1.0,0.0,0.25,0.04,0.13,
0.07,0.22,-.05,-.03,0.19,0,1.0,0.23,0.05,0.10,
0.09,0.78,0.25,0.31,0.41,0.25,0.23,1.0,0.1,0.57,
-.02,0.13,-.10,-.1,0.04,0.04,0.05,0.1,1.0,0.02,
0.04,0.46,0.19,0.25,0.15,0.13,0.10,0.57,0.02,1.0)
byrow = TRUE, nrow = length(tn), ncol = length(tn), dimnames = list(tn, tn)
,
)<- G*12^2
G
G#> RZM RZN RZEo RZEn RZR RZKm RZKd RZH RZC RZS
#> RZM 144.00 18.72 18.72 10.08 -21.60 15.84 10.08 12.96 -2.88 5.76
#> RZN 18.72 144.00 33.12 40.32 61.92 36.00 31.68 112.32 18.72 66.24
#> RZEo 18.72 33.12 144.00 132.48 2.88 12.96 -7.20 36.00 -14.40 27.36
#> RZEn 10.08 40.32 132.48 144.00 8.64 11.52 -4.32 44.64 -14.40 36.00
#> RZR -21.60 61.92 2.88 8.64 144.00 46.08 27.36 59.04 5.76 21.60
#> RZKm 15.84 36.00 12.96 11.52 46.08 144.00 0.00 36.00 5.76 18.72
#> RZKd 10.08 31.68 -7.20 -4.32 27.36 0.00 144.00 33.12 7.20 14.40
#> RZH 12.96 112.32 36.00 44.64 59.04 36.00 33.12 144.00 14.40 82.08
#> RZC -2.88 18.72 -14.40 -14.40 5.76 5.76 7.20 14.40 144.00 2.88
#> RZS 5.76 66.24 27.36 36.00 21.60 18.72 14.40 82.08 2.88 144.00
In our case, reliabilities (r2AI) of the estimated breeding values are available for all traits.
<- c(0.743, 0.673, 0.638, 0.717, 0.541, 0.635, 0.604, 0.720, 0.499, 0.764)
r2 names(r2) <- tn; r2
#> RZM RZN RZEo RZEn RZR RZKm RZKd RZH RZC RZS
#> 0.743 0.673 0.638 0.717 0.541 0.635 0.604 0.720 0.499 0.764
Note: If you calculate an index for observed phenotypes based on own performances, the reliability ( r2 ) equals the narrow sense heritability ( h2 ).
If we regard a situation where breeding value estimation is only performed for the actual index traits, r2 needs to be subsetted.
<- r2[w_old != 0]
r2_old <- r2[w_new != 0] r2_new
For the case of the old index, estimates of the observed genetic gains of the traits in the index are available from evaluations of the HF breeding program.
<- c(0.28401392, 0.21637703, 0.17932963, 0.09986764, 0.08767239, 0.13273939)
deltautr names(deltautr) <- tn[w_old>0]
We may further need heritabilities (h2) of the traits to translate genetic gain into phenotypic gain.
<- c(0.314, 0.090, 0.194, 0.194, 0.013, 0.049, 0.033, 0.061, 0.014, 0.273)
h2 names(h2) <- tn
Further, to allow residual errors to be correlated, we need a variance-covariance matrix of breeding values (H) as starting point for the internal estimation process.
<- matrix (
H c(1.00,0.06,0.06,-.20,0.05,0.03,
0.06,1.00,0.14,0.40,0.20,0.46,
0.06,0.14,1.00,-.03,0.03,0.15,
-.20,0.40,-.03,1.00,0.30,0.13,
0.05,0.20,0.03,0.30,1.00,0.11,
0.03,0.46,0.15,0.13,0.11,1.00),
nrow=6, ncol=6,
dimnames = list(tn[w_old != 0], tn[w_old != 0]))
<- H * 144
H
H#> RZM RZN RZEo RZR RZKm RZS
#> RZM 144.00 8.64 8.64 -28.80 7.20 4.32
#> RZN 8.64 144.00 20.16 57.60 28.80 66.24
#> RZEo 8.64 20.16 144.00 -4.32 4.32 21.60
#> RZR -28.80 57.60 -4.32 144.00 43.20 18.72
#> RZKm 7.20 28.80 4.32 43.20 144.00 15.84
#> RZS 4.32 66.24 21.60 18.72 15.84 144.00
IndexWizard
All index calculations are performed within the function
SelInd()
. Minimum required input are w
,
G
and r2
. Note that all vectors and matrices
need to be named to allow checks and sorting based on trait names.
<- SelInd(
res w = w_old,
G = G,
r2 = r2_old
)#> - only reliabilities given
#> --> setting up E based on r2 and G
#> --> residual errors are assumed to be uncorrelated!
#> - no selection intensity provided
#> --> can only compute the relative genetic and phenotypic trend
#> - no heritabilities provided
#> --> cannot compute the expected relative phenotypic trend
#> - No observed composition of genetic progress given
#> --> cannot calculate realized weights
The function by default informs the user, if certain calculations
cannot be performed. This behavior can be silenced with setting
verbose = FALSE
.
<- SelInd(w = w_old, G = G, r2 = r2_old, verbose = FALSE) res
The summary()
function further gives information on the
number of traits, and all available entries in the SelInd
object.
summary(res)
#> An object of class SelInd. The index is based on:
#> - n = 10 breeding goal traits
#> - 6 traits with economic weight != 0
#> - m = 6 index traits
#>
#> Residual errors were assumed to be uncorrelated.
#>
#> The SelInd object contains the entries w, G, E, r2, b, b_scaled, var_I, d_G_exp_scaled, r_IP, r_IH, del_d_scaled, delta, D, del_d_abs.
#> The SelInd object does not contain the entries H, h2, w_real, b_real, i, d_G_obs, d_G_exp, d_P_exp, d_P_exp_scaled, dG.
The print()
function further re-formats the output to a
more readable format, which includes rounding to two decimals. If you
want to see the “pure” list, use print.default()
instead.
res#> An object of class SelInd. The index is based on:
#> - n = 10 breeding goal traits
#> - 6 traits with economic weight != 0
#> - m = 6 index traits
#>
#> --------------------------------------------------------------------
#>
#> Matrices G, E present in SelInd object, but not printed to reduce complexity.
#> Extract them by the use of the `$` operator.
#>
#> --------------------------------------------------------------------
#>
#> Vaiance of the index (`var_I`), expected genetic gain (`dG`)
#> and assumed selection intensity (`i`):
#>
#> $var_I:
#> [1] 41.29
#>
#> --------------------------------------------------------------------
#>
#> Economic (`w`) and index weights (`b`). Potentially they are rescaled (`scaled`)
#> so that sum(abs()) = 1. The weights might represent realized (`real`) weights
#> based on an observed composition of the genetic trend:
#>
#> $w:
#> RZM RZN RZEo RZEn RZR RZKm RZKd RZH RZC RZS
#> 0.45 0.20 0.15 0.00 0.10 0.03 0.00 0.00 0.00 0.07
#>
#> $b:
#> RZM RZN RZEo RZR RZKm RZS
#> 0.45 0.24 0.17 0.09 0.07 0.11
#>
#> $b_scaled:
#> RZM RZN RZEo RZR RZKm RZS
#> 0.40 0.21 0.15 0.08 0.06 0.09
#>
#> --------------------------------------------------------------------
#>
#> reliabilities (`r2`) and heritabilities (`h2`) of the traits:
#>
#> $r2:
#> RZM RZN RZEo RZR RZKm RZS
#> 0.74 0.67 0.64 0.54 0.64 0.76
#>
#> --------------------------------------------------------------------
#>
#> Composition (`d`) of genetic (`G`) / phenotypic (`P`) trend.
#> The composition might be observed (`obs`) or expected (`exp`).
#> The composition might be scaled (`scaled`) so that sum(abs()) = 1:
#>
#> $d_G_exp_scaled:
#> RZM RZN RZEo RZEn RZR RZKm RZKd RZH RZC RZS
#> 0.20 0.16 0.11 0.10 0.05 0.08 0.04 0.14 0.00 0.11
#>
#> --------------------------------------------------------------------
#>
#> Analytic measures:
#> Correlation between the overall index and the phenotype of trait j (`r_IP`) and
#> the loss in prediction accuracy when omitting trait j from the index (`r_IH`).
#> Further the approximate first derivative of d_G_exp with respect to w (`del_d_scaled`) with rows
#> being scaled so that sum(abs()) = 1.
#>
#> $r_IP:
#> RZM RZN RZEo RZR RZKm RZS
#> 0.80 0.68 0.49 0.25 0.37 0.43
#>
#> $r_IH:
#> RZM RZN RZEo RZR RZKm RZS
#> 0.30 0.05 0.03 0.01 0.01 0.01
#>
#> $del_d_scaled:
#> RZM RZN RZEo RZEn RZR RZKm RZKd RZH RZC RZS
#> RZM 0.15 -0.15 -0.09 -0.10 -0.14 -0.06 -0.02 -0.14 -0.01 -0.13
#> RZN -0.16 0.23 -0.01 0.02 0.15 0.04 0.05 0.18 0.04 0.12
#> RZEo -0.11 -0.01 0.36 0.33 -0.03 -0.02 -0.05 0.02 -0.05 0.02
#> RZEn -0.13 0.03 0.33 0.31 -0.01 -0.02 -0.04 0.05 -0.04 0.06
#> RZR -0.14 0.15 -0.02 0.00 0.31 0.12 0.05 0.13 0.02 0.04
#> RZKm -0.09 0.06 -0.03 -0.02 0.19 0.49 -0.01 0.07 0.02 0.01
#> RZKd -0.08 0.16 -0.13 -0.10 0.18 -0.03 0.07 0.13 0.04 0.08
#> RZH -0.15 0.19 0.01 0.04 0.14 0.05 0.04 0.17 0.03 0.18
#> RZC -0.06 0.19 -0.16 -0.13 0.11 0.07 0.05 0.12 0.06 0.04
#> RZS -0.14 0.13 0.02 0.05 0.05 0.01 0.02 0.19 0.01 0.39
Nevertheless, each entry can also be extracted by the $
operator.
$w
res#> RZM RZN RZEo RZEn RZR RZKm RZKd RZH RZC RZS
#> 0.45 0.20 0.15 0.00 0.10 0.03 0.00 0.00 0.00 0.07
$b_scaled
res#> RZM RZN RZEo RZR RZKm RZS
#> 0.39887304 0.20882509 0.15191062 0.08378771 0.06243987 0.09416367
The basic usage of the selection index would be the calculation of
index weights (b
) to combine several estimated breeding
values to one combined selection criterium (I
). We can do
this for both of our indices as follows:
<- SelInd(w = w_old, G = G, r2 = r2_old, verbose = FALSE)
res_old <- SelInd(w = w_new, G = G, r2 = r2_new, verbose = FALSE) res_new
The according index weights would then be:
round(res_old$b,2)
#> RZM RZN RZEo RZR RZKm RZS
#> 0.45 0.24 0.17 0.09 0.07 0.11
round(res_new$b,2)
#> RZM RZN RZEn RZR RZKm RZKd RZH RZC
#> 0.36 0.22 0.17 0.07 0.05 0.04 0.23 0.03
Let’s assume an arbitrary individual with following estimated breeding values:
<- c(RZM = 130, RZN = 110, RZEo = 105, RZEn = 106, RZR = 95,
ind RZKm = 100, RZKd = 101, RZH = 115, RZC = 90, RZS = 120)
The RZG_old of this animal would then be calculated as follows:
t(res_old$b_scaled) %*% ind[names(res_old$b_scaled)]
#> [,1]
#> [1,] 116.2783
The same individual would have a slightly lower RZG given the new index.
t(res_new$b_scaled) %*% ind[names(res_new$b_scaled)]
#> [,1]
#> [1,] 114.4104
Further interest might be, how the index translates into genetic
gain. We can derive the expected composition of the total genetic gain
by extracting d_G_exp_scaled
. This is scaled so that the
sum of absolute values in the vector sum up to one. Note that genetic
gain is also expected in breeding goal traits that are not part of the
index as correlated selection response.
<- SelInd(w = w_old, G = G, r2 = r2_old, verbose = FALSE)
res_old round(res_old$d_G_exp_scaled, 2)
#> RZM RZN RZEo RZEn RZR RZKm RZKd RZH RZC RZS
#> 0.20 0.16 0.11 0.10 0.05 0.08 0.04 0.14 0.00 0.11
Even though the expected phenotypic trend in natural units equals the
expected genotypic trend in natural units, a comparison of phenotypes
across traits would usually follow a scaling in phenotypic standard
deviations to make traits comparable. Due to different heritabilities of
the traits, this also leads to another composition of the genetic gain.
To calculate the expected composition of the total phenotypic gain
(d_P_exp_scaled
), we need to pass also heritabilities to
the function. In our example, the phenotypic gain depends relatively
more on milk (“RZM”) than the genetic gain, as milk comes with a higher
heritability.
h2#> RZM RZN RZEo RZEn RZR RZKm RZKd RZH RZC RZS
#> 0.314 0.090 0.194 0.194 0.013 0.049 0.033 0.061 0.014 0.273
<- SelInd(w = w_old, G = G, r2 = r2_old, h2 = h2, verbose = FALSE)
res_old round(res_old$d_P_exp_scaled, 2)
#> RZM RZN RZEo RZEn RZR RZKm RZKd RZH RZC RZS
#> 0.29 0.13 0.13 0.12 0.02 0.05 0.02 0.09 0.00 0.15
The total genetic gain further depends on the selection intensity
(i
). So if we are interested in the total genetic gain in
genetic standard deviations (dG
), we need to further assume
a selection intensity.
<- SelInd(w = w_old, G = G, r2 = r2_old, h2 = h2, i = 0.1, verbose = FALSE)
res_old round(res_old$dG, 2)
#> [1] 0.64
We can then also retrieve the expected gain for the single traits
(d_G_exp
or d_P_exp
) scaled in genetic/
phenotypic standard deviations.
round(res_old$d_G_exp, 2)
#> RZM RZN RZEo RZEn RZR RZKm RZKd RZH RZC RZS
#> 0.83 0.67 0.47 0.44 0.22 0.35 0.16 0.58 0.02 0.45
round(res_old$d_P_exp, 2)
#> RZM RZN RZEo RZEn RZR RZKm RZKd RZH RZC RZS
#> 0.04 0.02 0.02 0.02 0.00 0.01 0.00 0.01 0.00 0.02
Of strong practical interest should be the question how the genetic gain changes when the index is changed. This question can either be answered by comparison of different indices, or by calculation of analytic measures.
We will first regard a situation where we not only have a numeric
change in the economic weights, but also a discrete change in the index
traits. Traits in w
, but not in r2
have an
economic weight of zero, which allows to calculate information on the
indirect response through correlations to index traits.
<- SelInd(w = w_old, G = G, r2 = r2_old, h2 = h2, verbose = FALSE)
res_old <- SelInd(w = w_new, G = G, r2 = r2_new, h2 = h2, verbose = FALSE) res_new
This then allows to compare the changes between the two indices, and we see that the total gain will be slightly less due to milk (“RZM”), but more due to gain in functional traits.
round(res_new$d_G_exp_scaled - res_old$d_G_exp_scaled,2)
#> RZM RZN RZEo RZEn RZR RZKm RZKd RZH RZC RZS
#> -0.06 0.01 -0.01 0.00 0.02 -0.01 0.02 0.03 0.01 -0.01
The same can also be done for the phenotypic trend, where this effect is even stronger due to the higher heritability of Milk.
round(res_new$d_P_exp_scaled - res_old$d_P_exp_scaled,2)
#> RZM RZN RZEo RZEn RZR RZKm RZKd RZH RZC RZS
#> -0.07 0.02 0.00 0.01 0.01 0.00 0.01 0.03 0.00 0.00
If we have a certain index and want to see which effect the single traits have on the composition of the total index, several analytic measures might help.
<- SelInd(w = w_old, G = G, r2 = r2_old, verbose = FALSE) res_old
First one would be the correlation between the overall index I and a phenotype in trait j (rI,yj; r_IP
). It
reflects, to what extent a trait contributes to the response in overall
genetic merit.
round(res_old$r_IP, 2)
#> RZM RZN RZEo RZR RZKm RZS
#> 0.80 0.68 0.49 0.25 0.37 0.43
Second one is the loss of accuracy of prediction if trait j is omitted from the index (rIH−jrIH;
r_IH
), which reflects the contribution of trait j to the overall selection objective.
round(res_old$r_IH, 2)
#> RZM RZN RZEo RZR RZKm RZS
#> 0.30 0.05 0.03 0.01 0.01 0.01
Further, the first derivative of d with respect to w (δdδw;
del_d_scaled
) tells us how a change in the economic weight
in some trait would affect the composition of the genetic trend. Note
that the values are scaled so that the sum of the absolute values within
a row sums up to one. The first row thereby tells us that a change of
the weight for “RZM” mainly affects the gain in “RZM”, but only
moderately affects the functional traits. A change in the weight for
“RZN” (second row) also indirectly affects e.g. “RZR”.
round(res_old$del_d_scaled, 2)
#> RZM RZN RZEo RZEn RZR RZKm RZKd RZH RZC RZS
#> RZM 0.15 -0.15 -0.09 -0.10 -0.14 -0.06 -0.02 -0.14 -0.01 -0.13
#> RZN -0.16 0.23 -0.01 0.02 0.15 0.04 0.05 0.18 0.04 0.12
#> RZEo -0.11 -0.01 0.36 0.33 -0.03 -0.02 -0.05 0.02 -0.05 0.02
#> RZEn -0.13 0.03 0.33 0.31 -0.01 -0.02 -0.04 0.05 -0.04 0.06
#> RZR -0.14 0.15 -0.02 0.00 0.31 0.12 0.05 0.13 0.02 0.04
#> RZKm -0.09 0.06 -0.03 -0.02 0.19 0.49 -0.01 0.07 0.02 0.01
#> RZKd -0.08 0.16 -0.13 -0.10 0.18 -0.03 0.07 0.13 0.04 0.08
#> RZH -0.15 0.19 0.01 0.04 0.14 0.05 0.04 0.17 0.03 0.18
#> RZC -0.06 0.19 -0.16 -0.13 0.11 0.07 0.05 0.12 0.06 0.04
#> RZS -0.14 0.13 0.02 0.05 0.05 0.01 0.02 0.19 0.01 0.39
Since V0.2.0.0, we calculate an approximate derivative by incrementing the weight of one trait slightly, while reducing the weight of the other traits accordingly, to be able to account for the side restriction that ∑w=0.
A further interesting feature of the method is to compare the
expected composition of the gain with the observed genetic trend. Note
that we subset w
and G
to the index traits, to
have a matching scaling.
<- SelInd(w = w_old[w_old>0], G = G[w_old>0,w_old>0], r2 = r2_old, verbose = FALSE)
res_old round(deltautr - res_old$d_G_exp_scaled, 3)
#> RZM RZN RZEo RZR RZKm RZS
#> 0.007 -0.007 0.022 0.027 -0.031 -0.018
This shows us that the resulting trend matches the expected well with probably slightly more weight on exterior (“RZEo”) and reproduction (“RZR”).
We can also check what this meant for the economic and index weights by calculating “realized weights” given the observed gains
<- SelInd(
res_old w = w_old[w_old>0],
G = G[w_old>0, w_old>0],
r2 = r2_old,
d_G_obs = deltautr,
verbose = FALSE)
round(res_old$w_real - res_old$w, 2)
#> RZM RZN RZEo RZR RZKm RZS
#> -0.05 -0.11 0.03 0.12 -0.10 -0.03
round(res_old$b_real - res_old$b_scaled, 2)
#> RZM RZN RZEo RZR RZKm RZS
#> -0.01 -0.05 0.04 0.10 -0.08 -0.03
This reveals that there was effectively slightly more weight on “RZEo” and “RZR” in practical breeding decisions than suggested by the index.