Calculates the Bradley-Terry probabilities of each item in a fully-connected component of the comparison graph, $$G_W$$, winning against every other item in that component (see Details).

## Usage

btprob(object, subset = NULL, as_df = FALSE)

## Details

Consider a set of $$K$$ items. Let the items be nodes in a graph and let there be a directed edge $$(i, j)$$ when $$i$$ has won against $$j$$ at least once. We call this the comparison graph of the data, and denote it by $$G_W$$. Assuming that $$G_W$$ is fully connected, the Bradley-Terry model states that the probability that item $$i$$ beats item $$j$$ is $$p_{ij} = \frac{\pi_i}{\pi_i + \pi_j},$$ where $$\pi_i$$ and $$\pi_j$$ are positive-valued parameters representing the strengths of items $$i$$ and $$j$$, for $$1 \le i, j \le K$$. The function btfit can be used to estimate the strength parameter $$\pi$$. It produces a "btfit" object that can then be passed to btprob to obtain the Bradley-Terry probabilities $$p_{ij}$$.
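The formula above can be illustrated with a few lines of base R. This is a minimal sketch, not the package's implementation: the strength vector `pi_hat` and its values are hypothetical, chosen only to make the arithmetic easy to follow.

```r
# Hypothetical strength parameters (illustrative values, not fitted from any data)
pi_hat <- c(A = 2, B = 1, C = 0.5)

# p[i, j] = pi_i / (pi_i + pi_j): probability that row item i beats column item j
p <- outer(pi_hat, pi_hat, function(a, b) a / (a + b))
diag(p) <- NA  # an item is never compared with itself

p["A", "B"]                # 2 / (2 + 1)
p["A", "B"] + p["B", "A"]  # the two outcomes of a pair are complementary
```

Because $$p_{ij} + p_{ji} = 1$$, only one triangle of the matrix carries independent information, which is why the data frame output in the Examples lists each pair once with both probabilities.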

If $$G_W$$ is not fully connected, then a penalised strength parameter can be obtained using the method of Caron and Doucet (2012) (see btfit, with a > 1), which allows for a Bradley-Terry probability of any of the $$K$$ items beating any of the others. Alternatively, the MLE can be found separately for each fully connected component of $$G_W$$ (see btfit, with a = 1), in which case the probability of each item beating any other item is defined only within its own component.
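The component-wise case can be sketched in base R as follows. This is an illustration under stated assumptions rather than the package's code: `pi_by_component` is a hypothetical named list of per-component strength vectors with made-up values, `bt_matrix` is a helper introduced here, and the filter function has the same shape as the `subset` function used in the Examples.

```r
# Hypothetical per-component strengths; names and values are illustrative only
pi_by_component <- list(
  `2` = c(Han = 1.5, Gal = 1.0, Fin = 0.25),
  `3` = c(Cyd = 2.0, Amy = 1.0, Ben = 0.8, Dan = 0.7)
)

# Pairwise win probabilities within a single component
bt_matrix <- function(strengths) {
  p <- outer(strengths, strengths, function(a, b) a / (a + b))
  diag(p) <- NA  # no self-comparisons
  p
}

# Keep only the components that contain "Amy", then compute probabilities
keep <- Filter(function(x) "Amy" %in% names(x), pi_by_component)
probs <- lapply(keep, bt_matrix)
names(probs)
```

No probability is computed between items in different components, because their strengths are estimated on separate scales and are not comparable.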

## References

Bradley, R. A. and Terry, M. E. (1952). Rank analysis of incomplete block designs: I. The method of paired comparisons. Biometrika, 39(3/4), 324-345.

Caron, F. and Doucet, A. (2012). Efficient Bayesian Inference for Generalized Bradley-Terry Models. Journal of Computational and Graphical Statistics, 21(1), 174-196.

## See also

btfit, btdata

## Examples

citations_btdata <- btdata(BradleyTerryScalable::citations)
fit1 <- btfit(citations_btdata, 1)
btprob(fit1)
#> 4 x 4 sparse Matrix of class "dgCMatrix"
#>               citing
#> cited              JRSS-B Biometrika       JASA Comm Statist
#>   JRSS-B       .           0.5672532 0.67936229    0.9615848
#>   Biometrika   0.43274683  .         0.61779270    0.9502388
#>   JASA         0.32063771  0.3822073 .             0.9219605
#>   Comm Statist 0.03841516  0.0497612 0.07803945    .
btprob(fit1, as_df = TRUE)
#> # A tibble: 6 x 5
#>      component      cited       citing prob1wins  prob2wins
#>          <chr>      <chr>        <chr>     <dbl>      <dbl>
#> 1 full_dataset     JRSS-B   Biometrika 0.5672532 0.43274683
#> 2 full_dataset     JRSS-B         JASA 0.6793623 0.32063771
#> 3 full_dataset Biometrika         JASA 0.6177927 0.38220730
#> 4 full_dataset     JRSS-B Comm Statist 0.9615848 0.03841516
#> 5 full_dataset Biometrika Comm Statist 0.9502388 0.04976120
#> 6 full_dataset       JASA Comm Statist 0.9219605 0.07803945
toy_df_4col <- codes_to_counts(BradleyTerryScalable::toy_data, c("W1", "W2", "D"))
toy_btdata <- btdata(toy_df_4col)
fit2a <- btfit(toy_btdata, 1)
btprob(fit2a)
#> $2
#> 3 x 3 sparse Matrix of class "dgCMatrix"
#>        player2
#> player1       Han       Gal       Fin
#>     Han .         0.5703074 0.8586132
#>     Gal 0.4296926 .         0.8206436
#>     Fin 0.1413868 0.1793564 .
#>
#> $3
#> 4 x 4 sparse Matrix of class "dgCMatrix"
#>        player2
#> player1       Cyd       Amy       Ben       Dan
#>     Cyd .         0.6364291 0.6975107 0.7259617
#>     Amy 0.3635709 .         0.5684605 0.6021258
#>     Ben 0.3024893 0.4315395 .         0.5346338
#>     Dan 0.2740383 0.3978742 0.4653662 .
#>
btprob(fit2a, as_df = TRUE)
#> # A tibble: 9 x 5
#>   component player1 player2 prob1wins prob2wins
#>       <chr>   <chr>   <chr>     <dbl>     <dbl>
#> 1         2     Han     Gal 0.5703074 0.4296926
#> 2         2     Han     Fin 0.8586132 0.1413868
#> 3         2     Gal     Fin 0.8206436 0.1793564
#> 4         3     Cyd     Amy 0.6364291 0.3635709
#> 5         3     Cyd     Ben 0.6975107 0.3024893
#> 6         3     Amy     Ben 0.5684605 0.4315395
#> 7         3     Cyd     Dan 0.7259617 0.2740383
#> 8         3     Amy     Dan 0.6021258 0.3978742
#> 9         3     Ben     Dan 0.5346338 0.4653662
btprob(fit2a, subset = function(x) "Amy" %in% names(x))
#> $3
#> 4 x 4 sparse Matrix of class "dgCMatrix"
#>        player2
#> player1       Cyd       Amy       Ben       Dan
#>     Cyd .         0.6364291 0.6975107 0.7259617
#>     Amy 0.3635709 .         0.5684605 0.6021258
#>     Ben 0.3024893 0.4315395 .         0.5346338
#>     Dan 0.2740383 0.3978742 0.4653662 .
#>
fit2b <- btfit(toy_btdata, 1.1)
btprob(fit2b, as_df = TRUE)
#> # A tibble: 28 x 5
#>       component player1 player2 prob1wins prob2wins
#>           <chr>   <chr>   <chr>     <dbl>     <dbl>
#>  1 full_dataset     Eve     Cyd 0.8067082 0.1932918
#>  2 full_dataset     Eve     Han 0.8396707 0.1603293
#>  3 full_dataset     Cyd     Han 0.5565123 0.4434877
#>  4 full_dataset     Eve     Amy 0.8784344 0.1215656
#>  5 full_dataset     Cyd     Amy 0.6338864 0.3661136
#>  6 full_dataset     Han     Amy 0.5797890 0.4202110
#>  7 full_dataset     Eve     Gal 0.8811003 0.1188997
#>  8 full_dataset     Cyd     Gal 0.6397156 0.3602844
#>  9 full_dataset     Han     Gal 0.5859168 0.4140832
#> 10 full_dataset     Amy     Gal 0.5063006 0.4936994
#> # ... with 18 more rows