I am dealing with a tricky GTFS feed from the Belgian public transport operator De Lijn. The feed lists belbus services (demand-responsive buses) as regular bus routes that run every hour, so some poorly served countryside misleadingly appears to be a highly accessible area with excellent public transport connections.

In routes.txt, they are listed like this:

route_id,agency_id,route_short_name,route_long_name,route_desc,route_type,route_url,route_color,route_text_color
61135,1,460,Belbus Vlaamse Ardennen,Belbus Vlaamse Ardennen/Belbus Vlaamse Ardennen,3,,FFFFFF,000099

I really want to know how I can filter out any routes with "Belbus" in their route_desc or route_long_name.

At first I tried to find them in Excel, delete them, and save the result as routes.txt, but of course that didn't work when I calculated stop-level frequency in ArcGIS. I suppose it only looks at stop_times.txt and does not check whether the routes are missing from routes.txt.

I also tried filtering by route_type with gtfstools, but unfortunately that was all-or-nothing: it either removed every bus route or none.

Kloot Zak
    Assuming you load the txt into a dataframe called `df`, then with `tidyverse` try: `df |> filter(grepl(pattern = "Belbus", x = paste(route_long_name, route_desc)) == FALSE)` – Nicolás Velasquez Mar 01 '23 at 16:05

2 Answers

I recommend that you filter rows using the str_detect function from the stringr package.

library(dplyr)
library(stringr)

# keep only the routes whose long name does NOT contain "Belbus"
df_filtered <- df %>% filter(!str_detect(route_long_name, "Belbus"))
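
Note that filtering the routes table alone leaves trips.txt and stop_times.txt untouched, which is probably why the Excel attempt had no effect on the stop-level frequencies. Here is a minimal base R sketch of carrying the filter through all three tables. The toy data frames and their values (apart from the Belbus row quoted in the question) are made up for illustration, and a real feed has more columns and files that may need the same treatment.

```r
# Toy stand-ins for routes.txt, trips.txt and stop_times.txt (made-up values).
routes <- data.frame(
  route_id = c("61135", "10001"),
  route_long_name = c("Belbus Vlaamse Ardennen", "Gent - Oudenaarde"),
  stringsAsFactors = FALSE
)
trips <- data.frame(
  trip_id = c("t1", "t2", "t3"),
  route_id = c("61135", "10001", "10001"),
  stringsAsFactors = FALSE
)
stop_times <- data.frame(
  trip_id = c("t1", "t1", "t2", "t3"),
  stop_id = c("s1", "s2", "s1", "s2"),
  stringsAsFactors = FALSE
)

# 1. Routes whose long name mentions "Belbus".
is_belbus <- grepl("Belbus", routes$route_long_name, fixed = TRUE)
belbus_routes <- routes$route_id[is_belbus]

# 2. Trips that run on those routes.
belbus_trips <- trips$trip_id[trips$route_id %in% belbus_routes]

# 3. Drop the matching rows from all three tables.
routes_kept <- routes[!is_belbus, ]
trips_kept <- trips[!trips$route_id %in% belbus_routes, ]
stop_times_kept <- stop_times[!stop_times$trip_id %in% belbus_trips, ]
```

The chain follows the GTFS keys: stop_times.txt points to trips.txt through trip_id, and trips.txt points to routes.txt through route_id, so the Belbus rows have to be removed at every level.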
Leonardo19

{gtfstools} maintainer here.

What I'd do:

library(gtfstools)

path <- "path_to_gtfs.zip"

gtfs <- read_gtfs(path)

# select route ids whose route_long_name includes "Belbus"
selected_routes <- gtfs$routes[grepl("Belbus", route_long_name)]$route_id

# filter them out of the gtfs object
filtered_gtfs <- filter_by_route_id(gtfs, selected_routes, keep = FALSE)
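
The selection above only looks at route_long_name, while the question also mentions route_desc. Below is a rough sketch of building the id vector from both columns, ignoring case. The data frame is a made-up stand-in for gtfs$routes (only the first row comes from the question), so treat it as an illustration and not as output from a real feed.

```r
# Toy stand-in for gtfs$routes (made-up rows; the real table comes from read_gtfs()).
routes <- data.frame(
  route_id = c("61135", "20002", "10001"),
  route_long_name = c("Belbus Vlaamse Ardennen", "Flex 12", "Gent - Oudenaarde"),
  route_desc = c("Belbus Vlaamse Ardennen/Belbus Vlaamse Ardennen",
                 "belbus op aanvraag", NA),
  stringsAsFactors = FALSE
)

# Match "belbus" in either column, ignoring case; grepl() returns FALSE for NA.
in_name <- grepl("belbus", routes$route_long_name, ignore.case = TRUE)
in_desc <- grepl("belbus", routes$route_desc, ignore.case = TRUE)
selected_routes <- routes$route_id[in_name | in_desc]
```

A vector built this way can be passed to filter_by_route_id() with keep = FALSE in the same way as selected_routes above.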
dhersz