Creates a btdata object, primarily for use in the btfit function.
btdata(x, return_graph = FALSE)

# S3 method for btdata
summary(object, ...)
Argument | Description |
---|---|
x | The data, which is either a three- or four-column data frame, a directed igraph object, a square matrix or a square contingency table. See Details. |
return_graph | Logical. If TRUE, an igraph object representing the comparison graph will be returned. |
object | An object of class "btdata", typically the result of a call to btdata. |
... | Other arguments |
An object of class "btdata", which is a list containing:
A \(K\) by \(K\) square matrix, where \(K\) is the total number of items. The \(i,j\)-th element is \(w_{ij}\), the number of times item \(i\) has beaten item \(j\). If the items in x are unnamed, the wins matrix will be assigned row and column names 1:K.
A list of the fully-connected components.
The comparison graph of the data (if return_graph = TRUE). See Details.
The x argument to btdata can be one of four types:
A matrix (either a base matrix or a class from the Matrix package), dimension \(K\) by \(K\), where \(K\) is the number of items. The \(i,j\)-th element is \(w_{ij}\), the number of times item \(i\) has beaten item \(j\). Ties can be accounted for by assigning half a win (i.e. 0.5) to each item.
A contingency table of class table, similar to the matrix described in the above point.
An igraph object representing the comparison graph, with the \(K\) items as nodes. For the edges:
If the graph is unweighted, a directed edge from node \(i\) to node \(j\) for every time item \(i\) has beaten item \(j\)
If the graph is weighted, then one edge from node \(i\) to node \(j\) if item \(i\) has beaten item \(j\) at least once, with the weight attribute of that edge set to the number of times \(i\) has beaten \(j\).
If x is a data frame, it must have three or four columns:
3-column data frame: The first column contains the name of the winning item, the second column contains the name of the losing item and the third column contains the number of times that the winner has beaten the loser. Multiple entries for the same pair of items are handled correctly. If x is a three-column data frame, but the third column gives a code for who won, rather than a count, see codes_to_counts.
4-column data frame: The first column contains the name of item 1, the second column contains the name of item 2, the third column contains the number of times that item 1 has beaten item 2 and the fourth column contains the number of times item 2 has beaten item 1. Multiple entries for the same pair of items are handled correctly. This kind of data frame is also the output of codes_to_counts.
In either of these cases, the data can be aggregated, or there can be one row per comparison.
Ties can be accounted for by assigning half a win (i.e. 0.5) to each item.
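As a rough illustration of the data frame and matrix formats described above, the sketch below writes the same set of results in 3-column form, in 4-column form and as a wins matrix, using base R only. The item names and counts are invented for illustration. The final call to btdata is guarded so that the sketch still runs if BradleyTerryScalable is not installed.

```r
# Hypothetical results for three items (names and counts are invented).
# 3-column form: winner, loser, number of wins. The pair (A, B) appears twice.
df3 <- data.frame(
  winner = c("A", "B", "A", "C", "A"),
  loser  = c("B", "A", "C", "B", "B"),
  wins   = c(2, 1, 3, 1, 1),
  stringsAsFactors = FALSE
)

# 4-column form of the same results:
# item 1, item 2, wins for item 1, wins for item 2 (one row per pair here).
df4 <- data.frame(
  item1 = c("A", "A", "B"),
  item2 = c("B", "C", "C"),
  wins1 = c(3, 3, 0),
  wins2 = c(1, 0, 1),
  stringsAsFactors = FALSE
)

items <- sort(unique(c(df3$winner, df3$loser)))
empty <- matrix(0, length(items), length(items), dimnames = list(items, items))

# Wins matrix from the 3-column form: sum repeated pairs, then place each
# total at [winner, loser].
agg <- aggregate(wins ~ winner + loser, data = df3, FUN = sum)
wins_mat <- empty
wins_mat[cbind(as.character(agg$winner), as.character(agg$loser))] <- agg$wins

# Wins matrix from the 4-column form: wins1 goes to [item1, item2] and
# wins2 goes to [item2, item1].
wins_mat4 <- empty
wins_mat4[cbind(df4$item1, df4$item2)] <- df4$wins1
wins_mat4[cbind(df4$item2, df4$item1)] <- df4$wins2

# Both routes describe the same comparisons.
same_matrix <- isTRUE(all.equal(wins_mat, wins_mat4))

# Any of these objects could then be passed to btdata.
if (requireNamespace("BradleyTerryScalable", quietly = TRUE)) {
  bt_from_df3 <- BradleyTerryScalable::btdata(df3)
  bt_from_mat <- BradleyTerryScalable::btdata(wins_mat)
}
```

The manual aggregation is only there to make the correspondence between the formats visible; btdata accepts the data frames directly.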
summary.btdata shows the number of items, the density of the wins matrix and whether the underlying comparison graph is fully connected. If it is not fully connected, summary.btdata will additionally show the number of fully-connected components and a table giving the frequency of components of different sizes. For more details on the comparison graph, and how its structure affects how the Bradley-Terry model is fitted, see btfit and the vignette: https://ellakaye.github.io/BradleyTerryScalable/articles/BradleyTerryScalable.html.
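To make the idea of fully-connected components more concrete, here is a minimal base R sketch. It assumes that a fully-connected component is a set of items in which every item can be reached from every other along a directed path of wins (a strongly connected component of the comparison graph); that reading is an assumption, so check it against the vignette. The wins matrix is hypothetical, and this is not the package's implementation.

```r
# Hypothetical wins matrix for five items: entry [i, j] is the number of
# times item i has beaten item j.
items <- c("A", "B", "C", "D", "E")
wins <- matrix(0, 5, 5, dimnames = list(items, items))
wins["A", "B"] <- 2
wins["B", "C"] <- 1
wins["C", "A"] <- 1   # A, B and C beat each other in a cycle
wins["D", "E"] <- 3
wins["E", "D"] <- 1   # D and E have each beaten the other
wins["A", "D"] <- 1   # one-way link from the first group to the second

# reach[i, j] is 1 if there is a directed path of wins from i to j
# (every item trivially reaches itself).
reach <- (wins > 0) * 1
diag(reach) <- 1
repeat {
  new_reach <- ((reach %*% reach) > 0) * 1
  if (all(new_reach == reach)) break
  reach <- new_reach
}

# Two items share a component if each can reach the other.
mutual <- reach > 0 & t(reach) > 0
labels <- apply(mutual, 1, function(r) paste(items[r], collapse = ","))
components <- unname(split(items, labels))

fully_connected <- length(components) == 1
component_sizes <- table(lengths(components))
```

Here A, B and C form one component and D and E form another, so under the stated assumption the comparison graph as a whole is not fully connected. Repeated squaring of the reachability matrix is simple but scales poorly; it is used only to keep the sketch dependency-free.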
codes_to_counts
select_components
#> Number of items: 4
#> Density of wins matrix: 1
#> Fully-connected: TRUE
toy_df_4col <- codes_to_counts(BradleyTerryScalable::toy_data, c("W1", "W2", "D"))
toy_btdata <- btdata(toy_df_4col)
summary(toy_btdata)
#> Number of items: 8
#> Density of wins matrix: 0.25
#> Fully-connected: FALSE
#> Number of fully-connected components: 3
#> Summary of fully-connected components:
#>   Component size Freq
#> 1              1    1
#> 2              3    1
#> 3              4    1