designing data models for trains, timings, and station

A model for updating train timetables
published: (updated: )
by Harshvardhan J. Pandit
is part of: Pune-Lonavala Train Schedule App
automation python script trains

Train Stations

The train stations are setup as follows:

  1. Pune
  2. Shivajinagar
  3. Khadki
  4. Dapodi
  5. Kasarwadi
  6. Pimpri
  7. Chinchwad
  8. Akurdi
  9. Dehuroad
  10. Begdewadi
  11. Ghorawadi
  12. Talegaon
  13. Vadgaon
  14. Kanhe
  15. Kamshet
  16. Malavali
  17. Lonavala

Each station has an id number (the same as the index in above list) that is used as its unique identifier. Storing and retrieving data in a database is easier with primary keys, which are like the unique identifiers for each record. If the user doesn't supply one, the database creates one automatically. In this case, explicitly setting the ID is beneficial as it allows nifty queries such as select stations where id<=15; would give all stations from Pune to Kamshet.

Each station has three properties:

  1. Name
  2. Time taken to reach this station from the previous one: time_from_prev
  3. Time taken to reach the next station from this one: time_to_next

Based on these timings (next and previous) we can calculate the train timing once we know three things:

  1. Where does it start from? (initial station)
  2. Where does it go to? (destination)
  3. When does it leave the initial station? (timing)

Train

A train has five properties:

  1. train_id: a unique integer used to identify the train, like the station_id
  2. train_no: each train has a number used to identify the train (given by the Railways Department)
  3. initial station: station_start
  4. destination: station_end
  5. timing for leaving the initial station: timing

The two stations, initial and destination refer to the Train Station, whereas the timing is a Time object.

Storing train runs

Each train run consists of the train leaving the initial station at the specified time towards its destination. We can calculate the scheduled stops at all stations based on the station's next and previous timing properties. For eg. A train leaves Pune (station = 1) at exactly 12pm (noon) for Lonavala (station = 17). Then we can have:

station_list = [
  stations where id >= Pune.id 
  and id <= Lonavala.id ]
  ( order by id )
  train = 12:00
  time_to_next = 0
  for station in station_list:
    time = time + time_to_next
    train stops at <station> at <time>
    time_to_next = station.time_to_next

This generates the train's time runs from Pune to Lonavala. Using this approach to generate the timings every time a request comes in is not efficient. Instead, we can use this to calculate the timings for a train when it is added, and save it to database. So we have a separate table for Timings that contains: * timings: Time Field * train: reference to Train * station: reference to Station

This approach makes sense, since we can get the following frequently required things very easily:

  • get timings for train A: select timings where train = A
  • get all trains for station B: select timings where station = B
  • get all trains for station B going towards C: select timings where station = B and train in [ trains going up (pune) or down (lonavala) ]