Removes trips with multiple overlapping operators or vehicles assigned to the same trip number.
Source: R/avl_cleaning.R
clean_overlapping_subtrips.Rd
In data from some AVL vendors, multiple vehicles or operators may be logged to the same trip ID at the same time. This may be acceptable in some scenarios (e.g., a vehicle or operator swap mid-trip). Other times it is an error, with distinct (trip, vehicle, operator) tuples running simultaneously. This function identifies both scenarios and gives the option to remove one or both.
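The distinction between overlapping and non-overlapping subtrips can be illustrated with a minimal base R sketch. This is not the package's implementation: the toy data, the helper `classify_trip()`, and the choice to define a subtrip as a (trip, vehicle) pair are all assumptions made for illustration.

```r
# Toy AVL pings. Column names follow the documented inputs; values are invented.
pings <- data.frame(
  trip_id_performed = c("A", "A", "A", "A", "B", "B", "B", "B"),
  vehicle_id        = c(1, 1, 2, 2, 3, 3, 4, 4),
  event_timestamp   = as.POSIXct("2026-02-16 11:00:00", tz = "UTC") +
    60 * c(0, 10, 5, 15, 0, 10, 11, 20)
)

# Work in numeric seconds so aggregation does not depend on POSIXct handling.
pings$t <- as.numeric(pings$event_timestamp)

# One row per (trip, vehicle) subtrip with its first and last ping time.
starts <- aggregate(t ~ trip_id_performed + vehicle_id, data = pings, FUN = min)
ends   <- aggregate(t ~ trip_id_performed + vehicle_id, data = pings, FUN = max)
names(starts)[3] <- "start"
names(ends)[3]   <- "end"
spans <- merge(starts, ends, by = c("trip_id_performed", "vehicle_id"))

# Hypothetical helper: label one trip's subtrips as overlapping or not.
classify_trip <- function(s) {
  if (nrow(s) < 2) return("single")
  s <- s[order(s$start), ]
  # A subtrip overlaps if it starts before the latest end seen so far.
  prev_end <- cummax(s$end)[-nrow(s)]
  if (any(s$start[-1] < prev_end)) "overlapping" else "non_overlapping"
}

status <- vapply(
  split(spans, spans$trip_id_performed),
  classify_trip,
  character(1)
)
print(status)
```

In this toy data, trip A has two vehicles whose time spans intersect, while trip B has two vehicles logged one after the other, which is the kind of case `remove_non_overlapping` is described as controlling.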
Usage
clean_overlapping_subtrips(
distance_df,
check_operator = FALSE,
remove_single_observations = TRUE,
remove_non_overlapping = FALSE,
return_removals = FALSE
)
Arguments
- distance_df
A dataframe of linearized AVL data. Must include event_timestamp, trip_id_performed, and vehicle_id. Optionally, may include operator_id.
- check_operator
Optional. A boolean: should overlaps of multiple operator_ids be checked for? Default is FALSE.
- remove_single_observations
Optional. A boolean: should subtrips with only one observation be removed? Default is TRUE.
- remove_non_overlapping
Optional. A boolean: should trips with multiple vehicles or operators that do not overlap be removed? Default is FALSE.
- return_removals
Optional. A boolean: should the function return a dataframe of trips removed and why? Default is FALSE.
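As a rough illustration of what removing single-observation subtrips could look like, the sketch below counts pings per (trip, vehicle) pair in base R and drops pairs seen only once. The toy data and the (trip, vehicle) definition of a subtrip are assumptions; the function itself may group differently, for example by also including operator_id.

```r
# Toy data: vehicle 2 appears on trip A for a single ping only.
obs <- data.frame(
  trip_id_performed = c("A", "A", "A", "B", "B"),
  vehicle_id        = c(1, 1, 2, 3, 3)
)

# Number of pings in each (trip, vehicle) subtrip, aligned to the rows of obs.
n_obs <- ave(
  seq_len(nrow(obs)),
  obs$trip_id_performed,
  obs$vehicle_id,
  FUN = length
)

# Keep only rows belonging to subtrips with more than one observation.
kept <- obs[n_obs > 1, ]
print(kept)
```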
Value
The input distance_df with violating trips removed. If return_removals = TRUE, a dataframe of trip IDs and the reason each was identified for removal.
Examples
# Get input data
c53_dists <- new_transittraj_data("get_linear_distances")
dim(c53_dists)
#> [1] 639 11
# Run function
c53_no_overlaps <- clean_overlapping_subtrips(distance_df = c53_dists)
dim(c53_no_overlaps)
#> [1] 639 11
head(c53_no_overlaps)
#> location_ping_id vehicle_id trip_id_performed service_date route_id
#> 1 1586 5516 13437100 2026-02-16 C53
#> 2 1667 5516 13437100 2026-02-16 C53
#> 3 1694 5516 13437100 2026-02-16 C53
#> 4 1775 5516 13437100 2026-02-16 C53
#> 5 2018 5516 13437100 2026-02-16 C53
#> 6 2261 5516 13437100 2026-02-16 C53
#> direction_id speed trip_stop_sequence event_timestamp stop_id distance
#> 1 0 6.4008 2 2026-02-16 11:08:31 13111 0.00000
#> 2 0 0.0000 2 2026-02-16 11:09:01 13111 2.08491
#> 3 0 0.0000 2 2026-02-16 11:09:11 13111 2.08491
#> 4 0 0.0000 2 2026-02-16 11:09:41 13111 2.08491
#> 5 0 0.0000 2 2026-02-16 11:11:12 13111 2.08491
#> 6 0 0.0000 2 2026-02-16 11:12:43 13111 2.08491