Randomly assigns a k-fold cross-validation number to each individual in a dataset.

add_cv_number(data_long, individual_id, k, seed = 1)

Arguments

data_long

Data frame in long format, i.e. there may be more than one row per individual.

individual_id

Character string specifying the name of the column in data_long that contains the individual identifiers.

k

Integer specifying the number of folds for cross-validation.

seed

Value of the random seed, so that the fold assignment is reproducible (default is 1).

Value

Data frame data_long updated to contain a new column cross_validation_number indicating the fold to which the individual has been assigned.

Details

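This function randomly divides the n individual IDs into k groups, each with n/k members (or as close to this number as possible when n is not a multiple of k). Folds are assigned per individual rather than per row, so every row belonging to the same individual receives the same cross_validation_number.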
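
The division described above can be illustrated with a minimal base R sketch. This is not the package's own implementation: the helper name assign_folds_sketch and the toy data are hypothetical, and the approach (shuffling a recycled vector of fold labels) is only one assumed way to obtain groups of near-equal size.

```r
# Hypothetical helper for illustration only; NOT the source of add_cv_number().
assign_folds_sketch <- function(data_long, individual_id, k, seed = 1) {
  set.seed(seed)                             # make the random assignment reproducible
  ids <- unique(data_long[[individual_id]])  # one entry per individual
  n <- length(ids)
  # Recycle the labels 1..k to length n (fold sizes then differ by at most one),
  # and shuffle them so that the assignment is random.
  folds <- sample(rep(seq_len(k), length.out = n))
  # Give every row the fold of its individual.
  data_long$cross_validation_number <-
    folds[match(data_long[[individual_id]], ids)]
  data_long
}

# Toy long-format data (assumed): 7 individuals with 1 to 3 rows each.
toy <- data.frame(id = rep(1:7, times = c(1, 2, 3, 1, 2, 3, 1)),
                  value = seq_len(13))
toy_cv <- assign_folds_sketch(toy, individual_id = "id", k = 3)

# Number of individuals (not rows) in each fold.
table(unique(toy_cv[, c("id", "cross_validation_number")])$cross_validation_number)
```

With 7 individuals and k = 3 the folds cannot all be the same size, so the sketch produces sizes that differ by at most one, which is what "as close to this number as possible" means here.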

Author

Isobel Barrott isobel.barrott@gmail.com

Examples

data(data_repeat_outcomes)
data_repeat_outcomes <- add_cv_number(data_long = data_repeat_outcomes,
                                      individual_id = "id",
                                      k = 10)