Building a Live Transit Departure Board with Discourse
Published October 8, 2025
10 min read
I've been working at Discourse for a few months now, learning how flexible the platform is. It's a forum. It's designed for conversations. But what if we used it for something completely different?
Something like tracking flights and trains?
I wanted to learn more about Discourse by building something that felt real and that would force me into actual problems. Most importantly, though, I wanted to build something that involved my interests. I love transit infrastructure and open data, so why not build a transit tracker using free government feeds? I had this image in my head: those old split-flap airport departure boards, the ones that click and whir as letters rotate into place.
Photo by Amsterdam City Archives on Unsplash
What if I could recreate that aesthetic, but powered by Discourse topics instead of actual flights?
You can see the result here: discourse-transit-tracker on GitHub
Live demo: discourse.gnarlyvoid.com/board
(Note: Running on a small demo droplet, so it might be a bit slow. The plugin runs great on properly-sized servers!)
Why This Makes No Sense (And Why I Did It Anyway)
Let's be clear: Discourse is not a transit tracking system. It's a forum platform built for human conversations, not GTFS feeds and real-time departure data.
But that's exactly what made it interesting.
Discourse topics are incredibly flexible. They have custom fields, tags, categories, and a robust permission system. If you squint hard enough, a flight departure is just a "post" with structured data. The departure time? A custom field. The status (on-time, delayed, departed)? A tag. The airline? Maybe a category.
It's ridiculous. But it could work.
Learning with Claude Code
I built this entire plugin using Claude Code, Anthropic's CLI tool. Not because I couldn't write it myself, but because I wanted to learn Discourse patterns while writing quality code from the start.
"Vibe programming" gets a bad rap. People hear "AI-assisted development" and think it's about blindly accepting generated code. But that's not how I used it. Claude Code became a learning tool. I'd describe what I wanted to build, Claude would suggest an approach using Discourse conventions, and I'd understand why those patterns exist.
This allowed me to learn the platform faster than reading docs alone would have taught me. I saw real implementations of custom fields, service objects, Ember components, and ActiveRecord patterns. And because Claude follows Discourse's style guide and architecture, the code I wrote actually fits the codebase.
This is what good AI-assisted development looks like: not replacing understanding, but accelerating it.
The Technical Stack
The plugin integrates three data sources, each with its own challenges.
Amtrak (GTFS)
I built an AmtrakGtfsService that downloads Amtrak's GTFS feed (a ZIP file with CSVs), parses routes, stops, trips, and schedules, and creates departure topics with detailed stop information. No API key required.
The service:
- Downloads and extracts GTFS.zip from Amtrak's CDN
- Parses routes.txt, stops.txt, trips.txt, and stop_times.txt
- For each trip in the next 24 hours, creates a topic with:
  - All basic departure info as custom fields
  - A detailed stops array with lat/lon coordinates and times
  - A formatted schedule table as Post #2
Running bin/rake transit_tracker:import_amtrak processes ~2,300 trips and creates ~600 departure topics (many trips share the same departure, so they get merged).
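The parsing step can be sketched in plain Ruby. This is a minimal illustration, not the plugin's AmtrakGtfsService: the sample rows are made up, the helper names (parse_gtfs, stops_by_trip) are hypothetical, and a real import would read the files out of the downloaded ZIP rather than from inline strings.

```ruby
require "csv"

# Made-up sample rows standing in for two of the files inside GTFS.zip.
TRIPS_TXT = <<~CSV
  route_id,service_id,trip_id,trip_headsign
  CONO,daily,59,New Orleans
  CONO,daily,58,Chicago
CSV

STOP_TIMES_TXT = <<~CSV
  trip_id,arrival_time,departure_time,stop_id,stop_sequence
  59,21:00:00,21:02:00,HMW,2
  59,20:05:00,20:05:00,CHI,1
  58,13:45:00,13:45:00,NOL,1
CSV

# Parse a GTFS CSV string into an array of hashes keyed by column name.
def parse_gtfs(text)
  CSV.parse(text, headers: true).map(&:to_h)
end

# Group stop_times rows by trip and order each group by stop_sequence.
def stops_by_trip(stop_times)
  stop_times
    .group_by { |row| row["trip_id"] }
    .transform_values { |rows| rows.sort_by { |r| r["stop_sequence"].to_i } }
end

trips = parse_gtfs(TRIPS_TXT)
stops = stops_by_trip(parse_gtfs(STOP_TIMES_TXT))

trips.each do |trip|
  first_stop = stops.fetch(trip["trip_id"]).first
  puts "#{trip['trip_headsign']}: departs #{first_stop['stop_id']} at #{first_stop['departure_time']}"
end
```

Sorting by stop_sequence rather than trusting file order matters because rows for a trip are not guaranteed to appear in travel order.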
The Problem: "Title has already been used"
The first import run worked fine, processing ~2,300 trips and creating ~600 topics. But when I ran it again to test updates, I hit a wall:
Processing 340 trips...
Created 1 topic
Error: Title has already been used (339 times)
Only 1 out of 340 trips succeeded on the second run. The rest failed.
Discourse requires unique topic titles. My title format was: "City of New Orleans to New Orleans at 19:05". Multiple different trains with the same route, destination, and departure minute produced identical titles.
My lookup strategy was to find topics by trip_id + service_date stored in custom fields. That worked fine when topics existed. But on subsequent runs, when a topic wasn't found by custom field (maybe the trip_id changed), I'd try to create one, and Discourse would reject it because another train had already claimed that title.
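To see how the collision arises, here is a small sketch that groups trips by the title they would generate. The trip records and their keys are hypothetical, and build_title is a stand-in that only mimics the title format quoted above.

```ruby
# Hypothetical trip records; only the attributes that feed the title are shown.
trips = [
  { trip_id: "59-a", route: "City of New Orleans", headsign: "New Orleans", departs: "19:05" },
  { trip_id: "59-b", route: "City of New Orleans", headsign: "New Orleans", departs: "19:05" },
  { trip_id: "58-a", route: "City of New Orleans", headsign: "Chicago",     departs: "13:45" },
]

# Stand-in for the plugin's build_title, following the quoted format.
def build_title(trip)
  "#{trip[:route]} to #{trip[:headsign]} at #{trip[:departs]}"
end

# Any title claimed by more than one trip_id is one a second create would reject.
collisions = trips
  .group_by { |t| build_title(t) }
  .select { |_title, group| group.size > 1 }

collisions.each do |title, group|
  puts "#{title.inspect} is shared by #{group.map { |t| t[:trip_id] }.join(', ')}"
end
```

Running a check like this against a feed before importing would show how many trips cannot be told apart by title alone.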
The Fix: Fallback Lookup by Title
The solution was to add a fallback lookup by title before trying to create:
# First, try to find by trip_id + service_date (the ideal natural key)
topic = Topic.joins(:_custom_fields)
  .where(topic_custom_fields: {
    name: "transit_trip_id",
    value: attributes[:trip_id]
  })
  .where(...)
  .first

# If still no match, try to find by title to avoid duplicates
if !topic
  title = build_title(attributes)
  category_id = determine_category(attributes[:mode])
  topic = Topic.where(title: title, category_id: category_id).first
  if topic
    Rails.logger.info "[TransitTracker] Found existing topic by title, will update"
    is_new = false
  end
end
This way, if multiple trips share a title, they merge into the same topic and get updated instead of failing.
Result: 327 departures created, 0 errors. All trips within the 24-hour window imported successfully. The duplicate trips just update the same topic with fresh data.
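The lookup order (natural key first, then title, then create) can be exercised without a database. The sketch below uses an in-memory store; FakeTopic and FakeTopicStore are hypothetical stand-ins and not part of Discourse or the plugin.

```ruby
# In-memory stand-in for a topic row; not Discourse's Topic model.
FakeTopic = Struct.new(:title, :category_id, :trip_id, :service_date, :updates, keyword_init: true)

class FakeTopicStore
  attr_reader :topics

  def initialize
    @topics = []
  end

  # Returns [topic, is_new]. Lookup order: natural key, then title, then create.
  def upsert(attrs)
    topic = @topics.find { |t| t.trip_id == attrs[:trip_id] && t.service_date == attrs[:service_date] }
    topic ||= @topics.find { |t| t.title == attrs[:title] && t.category_id == attrs[:category_id] }

    is_new = topic.nil?
    if is_new
      topic = FakeTopic.new(**attrs, updates: 0)
      @topics << topic
    else
      topic.updates += 1 # existing topic: refresh instead of failing on the title
    end
    [topic, is_new]
  end
end

store = FakeTopicStore.new
title = "City of New Orleans to New Orleans at 19:05"
_, first_new  = store.upsert(title: title, category_id: 1, trip_id: "59-a", service_date: "2025-10-08")
_, second_new = store.upsert(title: title, category_id: 1, trip_id: "59-b", service_date: "2025-10-08")

puts "topics: #{store.topics.size}, first new: #{first_new}, second new: #{second_new}"
```

One trade-off worth noting: in this sketch the merged topic keeps the first trip's trip_id, so two physically different trains end up sharing one record. That is acceptable for a display board but would lose information in a system that needs per-train identity.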
NYC MTA Subway (GTFS)
The NYC MTA subway system is massive. Their GTFS feed contains over 500,000 stop times covering weeks of schedules across dozens of routes.
My first approach was simple: import everything, just like I did with Amtrak. Parse the entire feed, create topics for every departure in the next 24 hours.
The Problem: 19GB of RAM
I ran the import and watched my Rails process climb. 1GB. 5GB. 10GB. It kept going. By the time it finished parsing, it had consumed over 19GB of RAM.
Loading and processing 500k+ stop times to create 20,000+ Discourse topics consumed massive amounts of memory. At scale, big data creates big problems.
The Fix: Reduce the Time Window
The fix? Reduce the import window from 24 hours to 6 hours:
# Only import departures within the next 6 hours
dep_time = parse_gtfs_time(today, first_stop[:departure_time])
next if dep_time < now || dep_time > (now + 6.hours)
Result: ~5,000 topics instead of 20,000+, memory usage stayed under 2GB, and the board still shows plenty of departures. For a live departure board, you don't need train schedules from tomorrow anyway.
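The window filter itself is easy to test in isolation. In the sketch below the feed is simulated and generated lazily so the full list is never materialized; the lazy enumeration is an assumption added for illustration, since the text only says the import window was narrowed. The names simulated_departures and within_window are hypothetical.

```ruby
WINDOW_SECONDS = 6 * 60 * 60 # the 6-hour window described above

# Simulated feed: one departure every 2 minutes for 48 hours, produced lazily.
# A real import would stream rows from stop_times.txt instead.
def simulated_departures(service_midnight)
  (0...(48 * 30)).lazy.map do |i|
    { trip_id: "T#{i}", departs_at: service_midnight + i * 120 }
  end
end

# Keep only departures between now and now + window (inclusive).
def within_window(departures, now, window = WINDOW_SECONDS)
  departures.select { |d| d[:departs_at] >= now && d[:departs_at] <= now + window }
end

midnight = Time.utc(2025, 10, 8)
now      = midnight + 9 * 3600 # pretend the import runs at 09:00
upcoming = within_window(simulated_departures(midnight), now).to_a

puts "kept #{upcoming.size} of #{48 * 30} departures"
```

Filtering before any topic is built is what keeps the work proportional to the window rather than to the size of the feed.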
The final implementation includes official MTA line colors (red 1/2/3, green 4/5/6, yellow N/Q/R/W, etc.) and creates ~5,000 departure topics with complete schedules.
AviationStack API
The AviationStack integration tracks flight departures with gate assignments, delays, and code-share detection. It requires an API key from aviationstack.com.
The Problem: Duplicate Code-Share Flights
Multiple airlines sell seats on the same physical flight under different flight numbers. That's called code-sharing. So you might have AA123, BA456, and IB789 all referring to the exact same plane leaving from Gate E7 at 07:30.
At first, I tried to build my own detection using departure time + gate + destination as a natural key. But then I looked closer at the AviationStack API response and found it: a codeshared field that tells you exactly which flight is the operating carrier.
The Fix: Use the API's Built-In Field
# Handle code-share flights: use operating carrier as natural key
codeshared = flight_info["codeshared"]
if codeshared.present?
  # This is a marketing carrier selling seats on another airline's flight
  # Use the operating flight as trip_id so all code-shares merge
  operating_flight = codeshared["flight_iata"] || codeshared["flight_icao"]
  trip_id = "#{operating_flight}-#{departure_info['scheduled']}"
  Rails.logger.info "[TransitTracker] Code-share detected: #{marketing_flight} operated by #{operating_flight}"
else
  # Regular flight, use its own flight number
  trip_id = "#{flight_info['iata']}-#{departure_info['scheduled']}"
end
Why reinvent the wheel? The API already does the hard work of identifying code-shares. I just use the operating carrier's flight number as the trip_id, and all marketing carriers automatically merge into the same topic.
Result: AA 1234 / BA 5678 / IB 789 displayed as one departure.
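The merging effect can be checked with plain hashes. The records below are made up and only mirror the fields the snippet above reads (the flight and departure objects, with codeshared nested under flight); trip_id_for is a hypothetical helper and no API call is made.

```ruby
# Made-up records shaped like the fields used in the snippet above.
flights = [
  { "flight" => { "iata" => "AA123", "codeshared" => nil },
    "departure" => { "scheduled" => "2025-10-08T07:30:00+00:00" } },
  { "flight" => { "iata" => "BA456", "codeshared" => { "flight_iata" => "AA123" } },
    "departure" => { "scheduled" => "2025-10-08T07:30:00+00:00" } },
  { "flight" => { "iata" => "IB789", "codeshared" => { "flight_iata" => "AA123" } },
    "departure" => { "scheduled" => "2025-10-08T07:30:00+00:00" } },
  { "flight" => { "iata" => "DL200", "codeshared" => nil },
    "departure" => { "scheduled" => "2025-10-08T08:00:00+00:00" } },
]

# Build the natural key: operating flight number plus scheduled departure.
def trip_id_for(record)
  flight_info = record["flight"]
  codeshared  = flight_info["codeshared"]

  operating = if codeshared
    codeshared["flight_iata"] || codeshared["flight_icao"]
  else
    flight_info["iata"]
  end
  "#{operating}-#{record['departure']['scheduled']}"
end

merged = flights.group_by { |r| trip_id_for(r) }
merged.each do |trip_id, group|
  puts "#{trip_id}: #{group.map { |r| r['flight']['iata'] }.join(' / ')}"
end
```

Because the operating flight and its marketing aliases all resolve to the same key, the four records collapse into two groups.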
The Architecture
Topics as Transit Legs
Each flight (or train, or bus) is a Discourse topic. I created a TransitLeg model that wraps Topic and handles all the custom fields:
- transit_dep_sched_at (scheduled departure time)
- transit_dep_est_at (estimated departure time for delays)
- transit_route_short_name (flight numbers)
- transit_headsign (destination)
- transit_gate / transit_platform (where to board)
- transit_dest (airport code)
- transit_stops (JSON array of all stops with coordinates and times)
Tags handle the mode (flight, train, bus) and status (status:scheduled, status:delayed, status:departed).
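As a rough idea of what such a wrapper does, here is a plain-Ruby sketch. It is not the plugin's TransitLeg: FakeTopic and TransitLegSketch are hypothetical, the custom fields are assumed to be stored as strings (ISO 8601 for times, JSON for stops), and the coordinates are illustrative.

```ruby
require "json"
require "time"

# Stand-in for a topic: a hash of string custom fields plus a list of tags.
FakeTopic = Struct.new(:custom_fields, :tags, keyword_init: true)

class TransitLegSketch
  STATUS_PREFIX = "status:"

  def initialize(topic)
    @topic = topic
  end

  def scheduled_at
    parse_time(@topic.custom_fields["transit_dep_sched_at"])
  end

  def estimated_at
    parse_time(@topic.custom_fields["transit_dep_est_at"])
  end

  # Estimate minus schedule, in minutes; 0 when either time is missing.
  def delay_minutes
    return 0 unless scheduled_at && estimated_at
    ((estimated_at - scheduled_at) / 60).round
  end

  # The stops array is assumed to be serialized as JSON in one custom field.
  def stops
    JSON.parse(@topic.custom_fields.fetch("transit_stops", "[]"))
  end

  # "status:delayed" -> "delayed"; nil when no status tag is present.
  def status
    tag = @topic.tags.find { |t| t.start_with?(STATUS_PREFIX) }
    tag&.delete_prefix(STATUS_PREFIX)
  end

  private

  def parse_time(value)
    value && Time.iso8601(value)
  end
end

topic = FakeTopic.new(
  custom_fields: {
    "transit_dep_sched_at" => "2025-10-08T19:05:00Z",
    "transit_dep_est_at"   => "2025-10-08T19:20:00Z",
    "transit_stops"        => '[{"name":"Chicago Union Station","lat":41.88,"lon":-87.64}]',
  },
  tags: ["train", "status:delayed"],
)
leg = TransitLegSketch.new(topic)
puts "#{leg.status}, #{leg.delay_minutes} min late, #{leg.stops.size} stop(s)"
```

Keeping the string-to-value conversion in one wrapper means the rest of the code never has to know that custom fields are stored as text.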
Posts as Schedule Details
Here's where using Discourse as the foundation really paid off.
When you click on a departure row, it expands to show the complete route schedule with all stops and arrival/departure times. But I didn't build a custom data structure for this. I used Discourse posts.
Each departure topic has:
- Post #1 (the OP): Basic departure info (route, times, gate)
- Post #2: A markdown table with the complete schedule
- Post #3+: Any delay notifications or status updates
When you expand a row, you're literally seeing the topic's replies rendered inline. It slides down with a smooth animation, showing the full schedule table styled to match the departure board aesthetic.
The schedule post looks like this:
## 🚂 Complete Schedule
**Route:** City of New Orleans
**Direction:** Chicago
| Stop | Arrival | Departure |
|---------------------------------------|---------|-----------|
| New Orleans Union Passenger Terminal | 12:45 | 12:45 |
| Hammond Amtrak Station | 13:42 | 13:45 |
| McComb | 14:30 | 14:32 |
| ... | ... | ... |
| Chicago Union Station | 08:15 | 08:15 |
_Schedule times are in local timezone. This is the planned schedule and may be subject to delays._
It renders beautifully in the expanded row with the board's dark styling.
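Generating a post like that from a stops array is mostly string padding. The sketch below is an assumption about how it could be done, not the plugin's code: schedule_markdown is a hypothetical helper, the stop keys (name, arrival, departure) are made up, and the emoji and footer are omitted for brevity.

```ruby
# Hypothetical stop records; the plugin's real stops array may use other keys.
stops = [
  { name: "New Orleans Union Passenger Terminal", arrival: "12:45", departure: "12:45" },
  { name: "Hammond Amtrak Station",               arrival: "13:42", departure: "13:45" },
  { name: "McComb",                               arrival: "14:30", departure: "14:32" },
]

# Build the schedule post body as a markdown string with an aligned table.
def schedule_markdown(route:, direction:, stops:)
  # Widest stop name decides the first column; never narrower than the header.
  width = (["Stop".length] + stops.map { |s| s[:name].length }).max

  header = [
    "| #{'Stop'.ljust(width)} | Arrival | Departure |",
    "|#{'-' * (width + 2)}|---------|-----------|",
  ]
  rows = stops.map do |s|
    "| #{s[:name].ljust(width)} | #{s[:arrival].ljust(7)} | #{s[:departure].ljust(9)} |"
  end

  [
    "## Complete Schedule",
    "",
    "**Route:** #{route}",
    "**Direction:** #{direction}",
    "",
    *header,
    *rows,
  ].join("\n")
end

markdown = schedule_markdown(route: "City of New Orleans", direction: "Chicago", stops: stops)
puts markdown
```

The padding is cosmetic (markdown tables render the same without it), but it keeps the raw post readable for moderators who edit it by hand.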
Why This Works Really Well
Using posts instead of a custom schema means:
- Update history is built-in. If a train gets delayed, we post an update and users see the entire timeline.
- Moderation tools work. If there's bad data, moderators can edit posts using Discourse's existing tools.
- Comments could work. Users could reply to departures (we don't allow this now, but the infrastructure is there).
- No additional database tables. Posts are just posts, Discourse handles all the storage.
Screenshots
Flights - Collapsed View
Flights - Expanded View
NYC Subway - Collapsed View
NYC Subway - Expanded View
Amtrak Trains - Collapsed View
Amtrak Trains - Expanded View
Bonus Tooling: Consistent Screenshots
Getting these screenshots pixel-perfect required some tooling. I wanted to be able to:
- Select a region once
- Click things in the browser to expand/collapse
- Take multiple screenshots of the exact same region
I built a Nix shell with Wayland screenshot tools (grim + slurp):
nix-shell ~/discourse/nix-shells/screenshot.nix
screenshot-region flight-1.png # Select region once
# Click to expand in browser
screenshot-repeat flight-2.png # Same exact region
The slurp tool saves the region geometry, and screenshot-repeat reuses it for perfect alignment across multiple screenshots.
Takeaways
I learn best by writing real tools. Tutorial projects teach syntax, but they don't force you into the messy, real-world problems that make you actually understand a framework.
Discourse Topics Are More Flexible Than You Think
Custom fields, tags, and categories gave me all the structured data I needed. Topics aren't just "posts". They're flexible containers for any kind of information.
Posts Are the Perfect Update Mechanism
Instead of building a custom "updates" system with timestamps and status changes, I just used Discourse posts. When a delay happens, post an update. The topic becomes a living history of what happened to that departure.
Entity Resolution: Check Your Data Before Building Logic
My first implementation created duplicate topics for every code-share flight. I started building my own deduplication logic using a natural key (departure time + gate + destination). But then I actually read the API response and found it: a codeshared field that identifies which flights are the same. I was about to solve a problem the API had already solved for me.
GTFS Parsing Has Edge Cases
GTFS times can exceed 24 hours (e.g., "25:30:00" means 1:30 AM the next day). ZIP files can have encoding issues. Stop sequences aren't always sequential. Real-world data is messy.
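The over-24-hours case is worth a concrete example. This is a minimal sketch of a parser for it; the function is a stand-in that happens to share the name parse_gtfs_time, and it simplifies by working in UTC and ignoring the agency's time zone and daylight-saving transitions.

```ruby
require "date"

# Convert a GTFS "HH:MM:SS" string into a Time, counted from midnight of the
# service date. Hours may be 24 or more for trips that run past midnight.
# Simplification: uses UTC and ignores time zones and DST.
def parse_gtfs_time(service_date, gtfs_time)
  match = /\A(\d{1,3}):(\d{2}):(\d{2})\z/.match(gtfs_time.strip)
  raise ArgumentError, "bad GTFS time: #{gtfs_time.inspect}" unless match

  hours, minutes, seconds = match.captures.map(&:to_i)
  midnight = Time.utc(service_date.year, service_date.month, service_date.day)
  midnight + hours * 3600 + minutes * 60 + seconds
end

service_date = Date.new(2025, 10, 8)
puts parse_gtfs_time(service_date, "07:05:09")
puts parse_gtfs_time(service_date, "25:30:00") # rolls over to the next calendar day
```

Adding the offset to midnight, instead of building a Time from the hour directly, is what lets "25:30:00" land on the following day without special-casing.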
Performance Matters at Scale
The MTA feed has 500k+ stop times. A 6-hour import window instead of 24 hours keeps memory usage reasonable and topic counts manageable (~5,000 instead of 20,000+).
Is This Practical?
Probably not for real transit tracking. But it's a great example of pushing Discourse in unexpected directions to understand the platform deeply.
The same pattern (topics as structured data + posts as updates) could work for:
- Package tracking (topics = packages, posts = scan events)
- Server status boards (topics = servers, posts = incidents)
- Deployment pipelines (topics = deploys, posts = stage completions)
- Event schedules (topics = sessions, posts = time/room changes)
- Support ticket boards (topics = tickets, expandable = full history)
The split-flap aesthetic is a bonus.
Try It Yourself
The plugin is open source: ducks/discourse-transit-tracker
Clone it, run the Amtrak import (no API key required), and see what Discourse topics can become when you push them beyond forum discussions.