Compute the edgelist of a network from a database of movement records.
Source: R/NetworkBuilding.R
edgelist_from_base.Rd
This function computes the edgelist of a network of facilities across which subjects can be transferred. The edgelist is computed from a database that contains the records of the subjects' stays in the facilities.
Usage
edgelist_from_base(
base,
window_threshold = 365,
count_option = "successive",
prob_params = c(0.0036, 1/365, 0.128),
condition = "dates",
noloops = TRUE,
nmoves_threshold = NULL,
flag_vars = NULL,
flag_values = NULL,
keep_nodes = FALSE,
verbose = FALSE
)
Arguments
- base
(data.table) A database of records of stays of subjects in facilities. The table should have at least the following columns:
subjectID (character) unique subject identifier
facilityID (character) unique facility identifier
admDate (POSIXct) date of admission to the facility
disDate (POSIXct) date of discharge from the facility
- window_threshold
(integer) A number of days. If two stays of a subject at two facilities occur within this window, they constitute a connection between the two facilities (provided that any other conditions are met).
- count_option
(character) How to count connections. Options are "successive", "probability" or "all". See details.
- prob_params
(numeric vector) Three numerical values used to calculate the probability that a movement causes an introduction from hospital A to hospital B. For use with count_option = "probability". See Donker T, Wallinga J, Grundmann H. (2010) <doi:10.1371/journal.pcbi.1000715> for more details.
prob_params[1] is the rate of acquisition in hospital A (related to LOS in hospital A). Default: 0.0036
prob_params[2] is the rate of loss of colonisation (related to time between admissions). Default: 1/365
prob_params[3] is the rate of transmission to other patients in hospital B (related to LOS in hospital B). Default: 0.128
- condition
(character) Condition(s) used to decide what constitutes a connection. Can be "dates", "flags", or "both". See details.
- noloops
(logical) Should transfers within the same node (loops) be removed? Defaults to TRUE, which removes loops (sets the matrix diagonal to 0).
- nmoves_threshold
(numeric) A threshold for the minimum number of subject transfers between two facilities. Set to NULL to deactivate. Defaults to NULL.
- flag_vars
(list) Additional variables that can help flag a transfer, besides the dates of admission and discharge. Must be a named list of two character vectors giving the names of the columns that can flag a transfer: the column that can flag a potential origin, and the column that can flag a potential target. The list must be named with "origin" and "target". E.g.: list("origin" = "var1", "target" = "var2"). See details.
- flag_values
(list) A named list of two character vectors containing the values of the variables in flag_vars that are matched to flag a potential transfer. The list must be named with "origin" and "target". The character vectors may be of length greater than one. E.g.: list("origin" = c("value1", "value2"), "target" = c("value2", "value2")). The values in "origin" and "target" are the values that flag a potential origin of a transfer, or a potential target, respectively. See details.
- keep_nodes
(logical) Should nodes with no connections be kept in the edgelist? Defaults to FALSE.
- verbose
(logical) Set to TRUE to print computation steps. Defaults to FALSE.
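As a rough illustration of the inputs described above, the sketch below builds a tiny table with the four required columns and checks, for each subject, whether consecutive stays fall within a window. The gap is measured here from the discharge of one stay to the admission of the next. That definition, the toy data, and the names toy_base, next_facility, next_adm, gap_days and within_window are assumptions made for this example, not the package's internal logic.

```r
library(data.table)

# Toy records with the four required columns (all values are made up).
toy_base <- data.table(
  subjectID  = c("s1", "s1", "s2", "s2"),
  facilityID = c("f01", "f02", "f01", "f03"),
  admDate    = as.POSIXct(c("2020-01-01", "2020-01-20",
                            "2020-01-05", "2021-06-01"), tz = "UTC"),
  disDate    = as.POSIXct(c("2020-01-10", "2020-01-25",
                            "2020-01-08", "2021-06-04"), tz = "UTC")
)

window_threshold <- 365  # days, same value as the default shown in Usage

# Order stays chronologically within each subject.
setorder(toy_base, subjectID, admDate)

# Pair each stay with the same subject's next stay
# (an assumed reading of "successive" stays).
toy_base[, `:=`(
  next_facility = shift(facilityID, type = "lead"),
  next_adm      = shift(admDate, type = "lead")
), by = subjectID]

# Gap in days between a discharge and the next admission (assumed definition).
toy_base[, gap_days := as.numeric(difftime(next_adm, disDate, units = "days"))]
toy_base[, within_window := !is.na(gap_days) & gap_days <= window_threshold]

# Keep only the rows that have a following stay.
toy_base[!is.na(next_facility),
         .(subjectID, facilityID, next_facility, gap_days, within_window)]
```

In this toy data, subject s1 is readmitted 10 days after discharge, so the pair falls inside the window, whereas subject s2 is readmitted more than a year later and would not count under this reading.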
Value
A list of two data.tables, which are the edgelists. One in long format (el_long), and one aggregated by pair of nodes (el_aggr).
Details
The edgelist contains the information on the connections between nodes of the network, that is, the movements of subjects between facilities. The edgelist can be in two different formats: long or aggregated. In long format, each row corresponds to a single movement between two facilities, so only two columns are needed: one containing the origin facility of each movement, the other containing the target facility. In aggregated format, the edgelist is aggregated by unique pairs of origin-target facilities.
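The relation between the two formats can be sketched with data.table: counting the rows of a long edgelist by origin-target pair yields an aggregated one. The toy el_long below is invented for illustration. The column names sID, origin, target and N follow the printed output in the Examples, but this is a sketch of the idea, not the function's actual code.

```r
library(data.table)

# Toy long-format edgelist: one row per movement (values are made up).
el_long <- data.table(
  sID    = c("s001", "s002", "s003", "s004"),
  origin = c("f01", "f01", "f02", "f01"),
  target = c("f02", "f02", "f01", "f03")
)

# Aggregated format: one row per unique origin-target pair,
# with N = number of movements observed for that pair.
el_aggr <- el_long[, .N, keyby = .(origin, target)]
el_aggr
```

The sum of N in the aggregated table equals the number of rows in the long table, which is a quick consistency check when moving between the two formats.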
Examples
mydb <- create_fake_subjectDB(n_subjects = 100, n_facilities = 10)
myBase <- checkBase(mydb)
#> Checking for missing values...
#> Checking for duplicated records...
#> Removed 0 duplicates
#> Done.
edgelist_from_base(myBase)
#> $el_aggr
#> Key: <origin, target>
#> origin target N
#> <char> <char> <int>
#> 1: f01 f02 2
#> 2: f01 f03 2
#> 3: f01 f04 3
#> 4: f01 f05 4
#> 5: f01 f07 4
#> 6: f01 f08 3
#> 7: f01 f09 1
#> 8: f01 f10 2
#> 9: f02 f01 3
#> 10: f02 f03 1
#> 11: f02 f05 1
#> 12: f02 f08 4
#> 13: f02 f09 2
#> 14: f02 f10 1
#> 15: f03 f01 3
#> 16: f03 f02 1
#> 17: f03 f04 1
#> 18: f03 f05 4
#> 19: f03 f06 3
#> 20: f03 f08 1
#> 21: f03 f10 2
#> 22: f04 f01 2
#> 23: f04 f02 1
#> 24: f04 f05 3
#> 25: f04 f10 1
#> 26: f05 f01 3
#> 27: f05 f03 2
#> 28: f05 f04 1
#> 29: f05 f06 3
#> 30: f05 f07 1
#> 31: f05 f08 1
#> 32: f05 f09 2
#> 33: f05 f10 2
#> 34: f06 f01 4
#> 35: f06 f02 1
#> 36: f06 f03 2
#> 37: f06 f04 1
#> 38: f06 f05 1
#> 39: f06 f07 2
#> 40: f06 f08 3
#> 41: f06 f09 3
#> 42: f07 f01 1
#> 43: f07 f03 1
#> 44: f07 f05 1
#> 45: f07 f06 2
#> 46: f07 f09 1
#> 47: f07 f10 1
#> 48: f08 f01 5
#> 49: f08 f03 1
#> 50: f08 f04 1
#> 51: f08 f05 1
#> 52: f08 f06 2
#> 53: f08 f07 1
#> 54: f08 f09 1
#> 55: f08 f10 1
#> 56: f09 f02 1
#> 57: f09 f03 2
#> 58: f09 f04 4
#> 59: f09 f05 2
#> 60: f09 f06 1
#> 61: f09 f07 1
#> 62: f09 f08 1
#> 63: f10 f02 1
#> 64: f10 f03 3
#> 65: f10 f05 1
#> 66: f10 f07 1
#> 67: f10 f08 1
#> 68: f10 f09 1
#> origin target N
#>
#> $el_long
#> Key: <origin, target>
#> sID origin target
#> <char> <char> <char>
#> 1: s036 f01 f02
#> 2: s090 f01 f02
#> 3: s009 f01 f03
#> 4: s018 f01 f03
#> 5: s034 f01 f04
#> ---
#> 123: s084 f10 f03
#> 124: s004 f10 f05
#> 125: s038 f10 f07
#> 126: s010 f10 f08
#> 127: s086 f10 f09
#>