7 April 2017

Optimization for Taxi Drivers

Building accurate recommendations that help taxi drivers maximize their earnings along planned routes is a hard problem. The recommendations need to take into account both customer and driver profiles. How large should the radius be to cover the routes recommended to a driver? What would be a good enough benchmark for the recommendations?

Take, for example, an imaginary case involving two drivers and their customers. Suppose Driver1 is located at Piccadilly Circus, Driver2 is located at Oxford Circus, and the potential customers are on Regent Street, equidistant from both drivers. Together the customers have a 30% likelihood that one of them will want a taxi cab, but one does not know which customer. How should the recommendation work for both drivers? Should it take into account the driver profiles as well as the behavior, intents, and propensity of the customers? And if it also takes into account customer demographics and the opportunity cost to drivers who satisfice in a game-theoretic sense, how can the recommendations provide an optimization threshold that is dynamically balanced and suited to both drivers at any given time, given that the drivers are continuously driving and the customers move in unpredictable directions? Is it sufficient to optimize for each driver individually, within their immediate clustering radius, to track hot spots for the nearest customers? What if the same location is also a hot spot densely populated with other taxi drivers, so that driver competition in the area is higher rather than lower?
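To make the competition question concrete, here is a minimal sketch of how the 30% figure from the example is diluted as more drivers chase the same request. It assumes exactly one possible request and that equidistant drivers are equally likely to win it. Both are simplifying assumptions for illustration, and the function name `expected_pickups` is hypothetical.

```python
def expected_pickups(p_request, n_drivers):
    """Expected pickups per driver.

    Assumes a single request arrives with probability p_request and that
    each of the n_drivers equidistant drivers is equally likely to win it.
    """
    if n_drivers < 1:
        raise ValueError("need at least one driver")
    return p_request / n_drivers


# Regent Street example: 30% combined likelihood of a request.
for n in (1, 2, 3):
    print(n, "driver(s):", expected_pickups(0.30, n))
```

Under these assumptions every extra competitor lowers each driver's expected payoff, which is why a recommendation that looks good for one driver in isolation may be poor once both drivers receive it.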

These issues are further compounded by the uncertainties of the road network, the variability of the drivers, and the number of geolocated customers who may be seeking a taxi ride at any given time and in any given direction of traffic. Other influencing factors could include: public and private events, seasonal variations, weather, security, roadworks, traffic congestion, public transport, density of customers, shopping arcades, poor vs rich neighborhoods, and other rural and urban dynamics.

One solution to this problem could be a complex network representation of the road network that incorporates semantically defined probabilistic reasoning, similar to a Bayesian or Markov Chain Monte Carlo approach, to generalize over all the drivers. At any given point of recommendation, one can then query for the drivers and their probable customers within the neighboring geolocations. Additionally, an aggregation step could combine regression and clustering to produce individually weighted driver recommendations.
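The querying step can be sketched roughly as follows. This is only an illustration: the road graph is a toy, every demand probability except the 30% for Regent Street is invented, Bond Street is added purely to give the graph a fourth node, and a simple heuristic score (demand divided by distance and by nearby rival drivers) stands in for the Bayesian or MCMC inference. The names `within_hops` and `recommend` are hypothetical.

```python
from collections import deque

# Toy road graph (assumed): nodes are locations, edges are road segments.
ROADS = {
    "Piccadilly Circus": ["Regent Street"],
    "Regent Street": ["Piccadilly Circus", "Oxford Circus"],
    "Oxford Circus": ["Regent Street", "Bond Street"],
    "Bond Street": ["Oxford Circus"],
}

# Assumed probability that a customer at each node wants a taxi right now.
DEMAND = {
    "Piccadilly Circus": 0.10,
    "Regent Street": 0.30,
    "Oxford Circus": 0.20,
    "Bond Street": 0.05,
}

DRIVERS = {"Driver1": "Piccadilly Circus", "Driver2": "Oxford Circus"}


def within_hops(start, max_hops):
    """Breadth-first search: map each node within max_hops to its hop distance."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # do not expand beyond the query radius
        for nxt in ROADS[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def recommend(driver, max_hops=1):
    """Rank neighboring nodes for one driver by a heuristic score."""
    reachable = within_hops(DRIVERS[driver], max_hops)
    scores = {}
    for node, hops in reachable.items():
        # Count rival drivers who could also reach this node.
        rivals = sum(
            1
            for other, loc in DRIVERS.items()
            if other != driver and node in within_hops(loc, max_hops)
        )
        # Heuristic: demand discounted by distance and by competition.
        scores[node] = DEMAND[node] / ((1 + hops) * (1 + rivals))
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


for name in DRIVERS:
    print(name, recommend(name))
```

In a fuller version the fixed `DEMAND` table would be replaced by probabilities inferred from data, and the ranking would feed the regression and clustering step that weights recommendations per driver.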