Metrics models
The repometrics package allows the construction of
“models” adapted
from those developed by the CHAOSS community (“Community Health
Analytics in Open Source Software”). Each
model collates several individual
“metrics”
to provide deeper context and answer more complex questions about a community’s health.
The metrics used in the models are described in the separate CHAOSS
metrics
vignette,
and are detailed in the data.frame returned by the
rm_metrics_list()
function.
In contrast to the metrics, each of which is encoded as a separate
function within this package, the models are not hard-coded. Instead, the
metrics used in each model are defined within a single JSON file stored at
extdata/metrics-models/metrics-models.json
in the inst sub-directory of the package. This allows the model
construction to be easily modified without needing to modify the
underlying code of this package.
The file can be accessed by loading the package:
library (repometrics)
and then locating the file with:
f <- system.file (
    fs::path ("extdata", "metrics-models", "metrics-models.json"),
    package = "repometrics"
)
This file defines all CHAOSS metrics used in this package, and lists the metrics used to construct each model.
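The contents of the file can then be inspected directly. The following is a minimal sketch, assuming the jsonlite package is available for parsing (the text above does not say which JSON parser the package itself uses). It makes no assumptions about field names inside the file, and only prints the top two levels of whatever structure is found.

```r
# Locate the JSON file; system.file() returns "" if the package or file
# cannot be found, so the code below is skipped in that case.
f <- system.file (
    "extdata", "metrics-models", "metrics-models.json",
    package = "repometrics"
)

# Assumption: jsonlite is used here purely for illustration.
if (nzchar (f) && requireNamespace ("jsonlite", quietly = TRUE)) {
    # Keep the nested list structure rather than simplifying to vectors.
    models_json <- jsonlite::fromJSON (f, simplifyVector = FALSE)
    # Show the top two levels without assuming any particular field names.
    str (models_json, max.level = 2)
}
```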
The models used here
Each model collates individual metrics defined in the
metrics-models.json
file,
with each final model value being the sum of the values of its component
metrics, as described in the following sub-section.
Combining metrics within models
Different CHAOSS
metrics quantify
different properties, and are measured on different scales. To combine
metrics into a single model, differences in scale must be reconciled.
The metrics-models.json
file
defines each metric according to one of five types of measurement:
binary, translated to 0 for false and 1 for true;
count, for integer numbers greater than or equal to 0;
ratio, for floating-point numbers between 0 and 1;
days, for integer measurements in days; and
real, for single, real values greater than zero.
Metrics measured as counts or days are log-transformed before being aggregated into final models. Each metric also indicates whether higher or lower values are desirable. Aggregate model scores are designed so that higher values are better, so any metrics for which lower values are better are negated before being aggregated into overall model scores.
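The combination rules described above can be sketched in a few lines of base R. This is an illustration only, and not the package's own implementation: the metric names, values, and the `higher_is_better` column are hypothetical, and the use of `log10 (1 + x)` is an assumption made here so that zero counts stay finite (the text above does not specify the logarithm base or how zeros are handled).

```r
# Hypothetical metrics: names, values, and directions are illustrative only,
# and are not taken from metrics-models.json.
metrics <- data.frame (
    name = c ("has_ci", "num_commits", "response_days", "test_coverage"),
    type = c ("binary", "count", "days", "ratio"),
    higher_is_better = c (TRUE, TRUE, FALSE, TRUE),
    # Binary values are translated to 0 (false) or 1 (true).
    value = c (as.numeric (TRUE), 99, 9, 0.5)
)

# Put a single metric value on the scale used for aggregation.
transform_metric <- function (value, type, higher_is_better) {
    # Assumption: log10 (1 + x), so that a count of zero maps to zero.
    if (type %in% c ("count", "days")) {
        value <- log10 (1 + value)
    }
    # Negate metrics for which lower values are better.
    if (!higher_is_better) {
        value <- -value
    }
    value
}

transformed <- mapply (
    transform_metric,
    metrics$value, metrics$type, metrics$higher_is_better
)
names (transformed) <- metrics$name

# The model score is the sum of all transformed metric values.
model_score <- sum (transformed)
```

With these made-up values, the transformed metrics are 1, 2, -1, and 0.5, giving a model score of 2.5. The `response_days` metric contributes negatively because fewer days are better.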