Fit a landmarking model using a linear mixed effects (LME) model for the longitudinal submodel

This function performs the two-stage landmarking analysis.

fit_LME_landmark(
  data_long,
  x_L,
  x_hor,
  predictors_LME,
  responses_LME,
  predictors_LME_time,
  responses_LME_time,
  random_slope_longitudinal = TRUE,
  random_slope_survival = TRUE,
  include_data_after_x_L = TRUE,
  individual_id,
  k,
  cross_validation_df,
  standardise_time = FALSE,
  lme_control = nlme::lmeControl(),
  event_time,
  event_status,
  survival_submodel = c("standard_cox", "cause_specific", "fine_gray"),
  b
)

Arguments

data_long	Data frame or list of data frames, each corresponding to a landmark age `x_L` (each element of the list must be named with the value of `x_L` it corresponds to). Each data frame contains repeated measurements data and time-to-event data in long format.
x_L	Numeric specifying the landmark time(s)
x_hor	Numeric specifying the horizon time(s)
predictors_LME	Vector of character strings specifying the column names in `data_long` which correspond to the predictor variables in the LME model
responses_LME	Vector of character strings specifying the column names in `data_long` which correspond to the response variables in the LME model
predictors_LME_time	Vector of character strings specifying the column names in `data_long` which contain the times at which the predictor variables were recorded. This should either be length 1 or the same length as `predictors_LME`. In the latter case the order of elements must correspond to the order of elements in `predictors_LME`.
responses_LME_time	Vector of character strings specifying the column names in `data_long` which contain the times at which response variables were recorded. This should either be length 1 or the same length as `responses_LME`. In the latter case the order of elements must correspond to the order of elements in `responses_LME`.
random_slope_longitudinal	Boolean indicating whether to include a random slope in the LME model. See Details section of `fit_LME_longitudinal` for more information.
random_slope_survival	Boolean indicating whether to include the random slope estimate from the LME model as a covariate in the survival submodel. See Details section of `fit_LME_longitudinal` for more information.
include_data_after_x_L	Boolean indicating whether to include all longitudinal data, including data after the landmark age `x_L`, in the model development dataset. See Details section of `fit_LME_longitudinal` for more information.
individual_id	Character string specifying the column name in `data_long` which contains the individual identifiers
k	Integer specifying the number of folds for cross-validation. An alternative to setting parameter `cross_validation_df` for performing cross-validation; if both are missing no cross-validation is used.
cross_validation_df	List of data frames containing the cross-validation fold each individual is assigned to. Each data frame in the list should be named according to the landmark time `x_L` to which it corresponds. Each data frame should contain the column `individual_id` and a column `cross_validation_number` which contains the cross-validation fold of the individual. An alternative to setting parameter `k` for performing cross-validation; if both are missing no cross-validation is used.
standardise_time	Boolean indicating whether to standardise the time variable in the LME model by subtracting the mean and dividing by the standard deviation. See Details section of `fit_LME_longitudinal` for more information.
lme_control	Object created using `nlme::lmeControl()`, which will be passed to the `control` argument of the `lme` function
event_time	Character string specifying the column name in `data_long` which contains the event time
event_status	Character string specifying the column name in `data_long` which contains the event status (where 0=censoring and 1=event of interest; if there are competing events, these are labelled 2 or above).
survival_submodel	Character string specifying which survival submodel to use. Three options: the standard Cox model i.e. no competing risks (`"standard_cox"`), the cause-specific regression model (`"cause_specific"`), or the Fine Gray regression model (`"fine_gray"`)
b	Integer specifying the number of bootstrap samples to take when calculating standard error of c-index and Brier score
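
As an illustration of the shape described for `cross_validation_df`, the following is a minimal sketch in base R. The identifiers, the helper `make_folds` and the fold counts are hypothetical, and it assumes the identifier column carries the name passed as `individual_id` (here `"id"`).

```r
# Hypothetical identifiers of individuals at risk at each landmark age
ids_60 <- c(1, 2, 3, 4, 5, 6)
ids_61 <- c(2, 3, 4, 6)

# Hypothetical helper: assign folds 1..k in rotation (deterministic, for illustration)
make_folds <- function(ids, k) {
  data.frame(
    id = ids,
    cross_validation_number = rep_len(seq_len(k), length(ids))
  )
}

# One data frame per landmark age, named by the value of x_L
cross_validation_df <- list(
  "60" = make_folds(ids_60, k = 3),
  "61" = make_folds(ids_61, k = 2)
)

str(cross_validation_df)
```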

Value

List containing information about the landmark model at each of the landmark times. Each element of this list is named with the corresponding landmark time, and is itself a list containing elements: data, model_longitudinal, model_LME, model_LME_standardise_time, model_survival, and prediction_error.

data has one row for each individual in the risk set at x_L and contains the values of the predictors_LME using the last observation carried forward (LOCF) approach and predicted values of the responses_LME using the LME model at the landmark time x_L. It also includes the predicted probability that the event of interest has occurred by time x_hor, labelled as "event_prediction".

model_longitudinal indicates that the longitudinal approach is LME.

model_LME contains the output from the lme function from package nlme. For a model using cross-validation, model_LME contains a list of outputs with each element in the list corresponding to a different cross-validation fold.

model_LME_standardise_time contains a list of two objects, mean_response_time and sd_response_time, if the parameter standardise_time=TRUE is used. These are the mean and standard deviation used to standardise times when fitting the LME model.
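
The standardisation of time can be sketched as follows. This is an illustration in base R with made-up measurement times, not the package's internal code; it assumes the mean and standard deviation are taken over the recorded response times, and shows that the landmark time would need to be placed on the same scale before predicting.

```r
# Hypothetical measurement times (ages) for one response variable
response_time <- c(50, 52, 55, 58, 60)

# Quantities analogous to those stored in model_LME_standardise_time
mean_response_time <- mean(response_time)
sd_response_time <- sd(response_time)

# Standardise: subtract the mean and divide by the standard deviation
response_time_stnd <- (response_time - mean_response_time) / sd_response_time

# A landmark time must be transformed with the same mean and sd
x_L <- 60
x_L_stnd <- (x_L - mean_response_time) / sd_response_time
```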

model_survival contains the outputs from the survival submodel functions, including the estimated parameters of the model. For a model using cross-validation, model_survival will contain a list of outputs with each element in the list corresponding to a different cross-validation fold.

prediction_error contains a list indicating the c-index and Brier score at time x_hor and their standard errors if parameter b is used.
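
The nested structure of the returned list can be navigated with standard list indexing. The sketch below uses a small stand-in object (`fit`, with made-up values and only the data element) rather than real output from fit_LME_landmark, so it shows the access pattern only.

```r
# Stand-in for the returned list: one element per landmark time, named by x_L
fit <- list(
  "60" = list(data = data.frame(id = c(1, 4), event_prediction = c(0.12, 0.30))),
  "61" = list(data = data.frame(id = 4, event_prediction = 0.27))
)

# Predicted probabilities of the event by x_hor for landmark time 60
pred_60 <- fit[["60"]]$data$event_prediction

# Size of the risk set at each landmark time
risk_set_sizes <- sapply(fit, function(landmark_fit) nrow(landmark_fit$data))
```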

Details

Firstly, this function selects the individuals in the risk set at the landmark time x_L. Specifically, the individuals in the risk set are those that have entered the study before the landmark time x_L (there is at least one observation for each of the predictors_LME and responses_LME on or before x_L) and exited the study after the landmark age (event_time is greater than x_L).
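
The risk set rule above can be sketched in base R as follows. This is a simplified illustration with made-up data and a single measurement time column (`response_time`); the function itself checks every variable in predictors_LME and responses_LME.

```r
# Hypothetical long-format data: one row per measurement
data_long <- data.frame(
  id            = c(1, 1, 2, 2, 3, 4),
  response_time = c(55, 62, 58, 59, 61, 57),
  event_time    = c(70, 70, 59.5, 59.5, 68, 66)
)
x_L <- 60

# Entered the study on or before x_L: earliest measurement is at or before x_L
entered_before <- tapply(data_long$response_time, data_long$id, min) <= x_L

# Exited the study after x_L: event (or censoring) time is greater than x_L
exited_after <- tapply(data_long$event_time, data_long$id, max) > x_L

# Individuals satisfying both conditions form the risk set
risk_set_ids <- names(entered_before)[entered_before & exited_after]
risk_set_ids
```

Here individual 2 is excluded because the event occurs before the landmark time, and individual 3 because the first measurement is after it.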

Secondly, if the option to use cross-validation is selected (using either parameter k or cross_validation_df), then an extra column cross_validation_number is added with the cross-validation folds. If parameter k is used, then the function add_cv_number randomly assigns these folds. For more details on this function see ?add_cv_number. If the parameter cross_validation_df is used, then the folds specified in this data frame are added. If cross-validation is not selected, then the landmark model is fitted to the entire group of individuals in the risk set (this is both the training and test dataset).
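
Random assignment of individuals to k folds can be sketched as below. This is a generic base R illustration with made-up identifiers, not the implementation of add_cv_number, and the balanced assignment via `rep_len` is an assumption.

```r
set.seed(1)  # for reproducibility of the random assignment

# Hypothetical risk set of ten individuals
risk_set <- data.frame(id = 1:10)
k <- 5

# Shuffle a balanced vector of fold labels 1..k across individuals
risk_set$cross_validation_number <- sample(rep_len(seq_len(k), nrow(risk_set)))

# For fold 1: the test set is fold 1, the training set is every other fold
test_fold_1  <- risk_set[risk_set$cross_validation_number == 1, ]
train_fold_1 <- risk_set[risk_set$cross_validation_number != 1, ]
```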

Thirdly, the landmark model is fitted to each of the training datasets. There are two stages to fitting the landmark model: using the longitudinal data and using the survival data. The longitudinal data is used in the first stage, which is performed by fit_LME_longitudinal. See ?fit_LME_longitudinal for more information about this function. The survival data is used in the second stage, which is performed by fit_survival_model. This function censors the individuals at the horizon time x_hor and fits the survival model. See ?fit_survival_model for more information about this function.
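
Censoring at the horizon time x_hor before fitting the survival model can be sketched as follows. The data are made up, and keeping an event that occurs exactly at x_hor is an assumption of this sketch rather than documented behaviour of fit_survival_model.

```r
x_hor <- 65

# Hypothetical survival data: one row per individual in the risk set
# event_status: 0 = censored, 1 = event of interest, 2 = competing event
surv <- data.frame(
  id           = 1:4,
  event_time   = c(63, 70, 65, 80),
  event_status = c(1, 1, 2, 0)
)

# Anyone whose follow-up extends beyond the horizon is censored at x_hor
beyond_horizon <- surv$event_time > x_hor
surv$event_status[beyond_horizon] <- 0
surv$event_time[beyond_horizon] <- x_hor

surv
```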

Fourthly, the performance of the model is assessed on the predictions for all individuals in the risk set by calculating the Brier score and C-index. This is performed using get_model_assessment. See ?get_model_assessment for more information about this function.
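
For intuition, the Brier score is the mean squared difference between the predicted event probability and the observed outcome. The sketch below uses made-up values and ignores censoring entirely, so it is a simplification and should not be read as the calculation carried out by get_model_assessment.

```r
# Hypothetical predicted probabilities of the event by x_hor
event_prediction <- c(0.1, 0.8, 0.3, 0.6)

# Hypothetical observed outcomes: 1 if the event occurred by x_hor, 0 otherwise
event_by_horizon <- c(0, 1, 0, 0)

# Brier score without any adjustment for censoring (lower is better)
brier_score <- mean((event_prediction - event_by_horizon)^2)
brier_score
```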

Author

Isobel Barrott isobel.barrott@gmail.com

Examples

# \donttest{
library(Landmarking)
data(data_repeat_outcomes)
data_model_landmark_LME <-
  fit_LME_landmark(
    data_long = data_repeat_outcomes,
    x_L = c(60, 61),
    x_hor = c(65, 66),
    k = 10,
    predictors_LME = c("ethnicity", "smoking", "diabetes"),
    predictors_LME_time = "response_time_sbp_stnd",
    responses_LME = c("sbp_stnd", "tchdl_stnd"),
    responses_LME_time = c("response_time_sbp_stnd", "response_time_tchdl_stnd"),
    individual_id = "id",
    standardise_time = TRUE,
    lme_control = nlme::lmeControl(maxIter = 100, msMaxIter = 100),
    event_time = "event_time",
    event_status = "event_status",
    survival_submodel = "cause_specific"
  )
#> Warning: 864 individuals have been removed from the model building as they are not in the risk set at landmark age 60
#> Warning: 737 individuals have been removed from the model building as they are not in the risk set at landmark age 61
#> [1] "Fitting longitudinal submodel, landmark age 60"
#> [1] "Complete, landmark age 60"
#> [1] "Fitting survival submodel, landmark age 60"
#> Warning: Loglik converged before variable  2 ; coefficient may be infinite. 
#> Warning: Loglik converged before variable  2 ; coefficient may be infinite. 
#> Warning: Loglik converged before variable  2 ; coefficient may be infinite. 
#> Warning: Loglik converged before variable  1,3,4 ; coefficient may be infinite. 
#> Warning: Loglik converged before variable  2 ; coefficient may be infinite. 
#> Warning: Loglik converged before variable  2,4 ; coefficient may be infinite. 
#> Warning: Loglik converged before variable  2 ; coefficient may be infinite. 
#> Warning: Loglik converged before variable  2 ; coefficient may be infinite. 
#> Warning: Loglik converged before variable  2 ; coefficient may be infinite. 
#> Warning: Loglik converged before variable  2 ; coefficient may be infinite. 
#> [1] "Complete, landmark age 60"
#> [1] "Fitting longitudinal submodel, landmark age 61"
#> [1] "Complete, landmark age 61"
#> [1] "Fitting survival submodel, landmark age 61"
#> Warning: Loglik converged before variable  1,2,3,4,6 ; coefficient may be infinite. 
#> Warning: Loglik converged before variable  1,3,4 ; coefficient may be infinite. 
#> Warning: Loglik converged before variable  1,3,4 ; coefficient may be infinite. 
#> Warning: Loglik converged before variable  1,3,4 ; coefficient may be infinite. 
#> Warning: Loglik converged before variable  1,3,4 ; coefficient may be infinite. 
#> Warning: Loglik converged before variable  1,3 ; coefficient may be infinite. 
#> Warning: Loglik converged before variable  1,3,4 ; coefficient may be infinite. 
#> Warning: Loglik converged before variable  1,3,4 ; coefficient may be infinite. 
#> Warning: Loglik converged before variable  1,3,4 ; coefficient may be infinite. 
#> Warning: Loglik converged before variable  1,3,4 ; coefficient may be infinite. 
#> [1] "Complete, landmark age 61"
# }