Trucks and pilots (drivers) are the two most fundamental elements of any full-truckload business. Almost every objective around operational efficiency and business performance can be met by optimizing and leveraging these two sources. At RIVIGO Labs, we make every decision, whether big or small, using data. For that to work well, we need to build accurate data sources for these two core entities.
The continuum captures events occurring in succession, ranging from past to the present and even into the future. It understands and predicts the behaviour of an entity and maps it to intelligent events. In RIVIGO’s context, Vehicle Continuum (VC) is business intelligence over GPS data.
To build a continuum for a vehicle, we needed to know the various nodes in the ecosystem. We grouped nodes into two categories: RIVIGO nodes and client nodes. RIVIGO nodes include our pitstops (establishments roughly 300 km apart where relay driver changeovers happen), parking spots, fuel pumps, toll booths, borders, workshops etc., whereas client nodes include warehouses, client parking spots etc. Identifying the correct set of toll booths, fuel pumps (the ones we use for refuelling), client warehouses etc. was a challenge. Since these nodes differ in nature, we had to come up with different detection logic for each type of node.
Let’s consider the toll booth dataset. We used NHAI data and the FASTag transaction details provided by our bank partner for RIVIGO vehicles. The idea was to use the vehicle’s GPS location at the time of the FASTag transaction, but it was not so straightforward. The FASTag transaction data stream had a delay of up to 15 minutes, during which a vehicle can travel around 10 km away from the toll booth. Therefore, we took the stoppage points from 15 minutes before to 15 minutes after each FASTag transaction and had to filter the noise out of this dataset. On this dataset, we ran the density-based spatial clustering of applications with noise (DBSCAN) algorithm to identify the correct location of each toll booth. To test the accuracy, we used Google Maps (satellite view) to check whether the identified location looks like a toll booth. We found that approximately 98% of toll booths were identified correctly with the help of DBSCAN. Fuel pumps were also located from fuel sensor data with the same clustering algorithm.
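To make the clustering step concrete, here is a minimal sketch of DBSCAN over stoppage points. It is an illustration under assumptions, not our production code: the function names, the 50 m radius, the minimum of 3 points and the sample coordinates are all hypothetical, and the largest cluster's centroid is taken as the estimated toll booth location.

```python
import math

EARTH_RADIUS_M = 6_371_000.0

def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))

def dbscan(points, eps_m, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbours(i):
        # Brute-force region query; includes the point itself.
        return [j for j, q in enumerate(points)
                if haversine_m(points[i], q) <= eps_m]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE          # may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point: joins but does not expand
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:   # core point: keep expanding the cluster
                queue.extend(nbrs)
    return labels

def estimate_location(points, eps_m=50.0, min_pts=3):
    """Centroid of the largest cluster, or None if every point is noise."""
    labels = dbscan(points, eps_m, min_pts)
    clustered = [l for l in labels if l != -1]
    if not clustered:
        return None
    best = max(set(clustered), key=clustered.count)
    members = [p for p, l in zip(points, labels) if l == best]
    return (sum(p[0] for p in members) / len(members),
            sum(p[1] for p in members) / len(members))

# Hypothetical stoppage points around one FASTag transaction:
# four tight points near the booth and two stray stops far away.
stoppages = [(28.0000, 77.0000), (28.0001, 77.0001), (28.0002, 77.0000),
             (28.0001, 76.9999), (28.0900, 77.0500), (27.9500, 76.9000)]
booth = estimate_location(stoppages)
```

The brute-force neighbour query is quadratic in the number of points, which is fine for a sketch over one booth's window but would need a spatial index at scale.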
Vehicle continuum data is currently processed via batch jobs in the following layers:
Chronos – Collects raw data
At RIVIGO, every vehicle is fitted with several IoT sensors such as GPS, fuel sensors, temperature sensors etc. Our proprietary pilot app also sends in data from GPS, gyro sensors etc. The Chronos layer collects data from these sensors, stores it in MongoDB after basic sanity checks and generates the necessary events. You can read more about this in our earlier post on IoT sensor data collection.
Athena – Processes raw data
Athena converts raw data streams into RUNNING, STOPPED or UNKNOWN legs. It also processes GPS data, fuel data etc. to attribute distance, time and fuel consumption to each leg.
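As a rough illustration of how a GPS stream can be turned into legs, the sketch below classifies each interval between consecutive pings and merges neighbouring intervals with the same state. The `Leg` class, the `build_legs` function, the 5 km/h speed threshold and the 10 minute gap limit are assumptions made for this example, not Athena's actual rules, and fuel attribution is left out.

```python
import math
from dataclasses import dataclass

@dataclass
class Leg:
    state: str          # "RUNNING", "STOPPED" or "UNKNOWN"
    start_s: int        # epoch seconds of the first ping in the leg
    end_s: int          # epoch seconds of the last ping in the leg
    distance_km: float  # straight-line distance summed over the leg's pings

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in degrees."""
    p1, l1, p2, l2 = map(math.radians, (lat1, lon1, lat2, lon2))
    h = (math.sin((p2 - p1) / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin((l2 - l1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def build_legs(pings, min_speed_kmph=5.0, max_gap_s=600):
    """pings: list of (epoch_seconds, lat, lon) tuples sorted by time."""
    legs = []
    for (t0, la0, lo0), (t1, la1, lo1) in zip(pings, pings[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue  # skip duplicate or out-of-order pings
        dist = haversine_km(la0, lo0, la1, lo1)
        if dt > max_gap_s:
            state = "UNKNOWN"   # no data for too long to say what happened
        elif dist / (dt / 3600) >= min_speed_kmph:
            state = "RUNNING"
        else:
            state = "STOPPED"
        if legs and legs[-1].state == state:
            legs[-1].end_s = t1            # extend the current leg
            legs[-1].distance_km += dist
        else:
            legs.append(Leg(state, t0, t1, dist))
    return legs

# Hypothetical pings: two minutes of driving, two minutes halted, then a long gap.
pings = [(0, 28.000, 77.0), (60, 28.005, 77.0), (120, 28.010, 77.0),
         (180, 28.010, 77.0), (240, 28.010, 77.0), (2000, 28.100, 77.0)]
legs = build_legs(pings)
```

Merging at build time keeps the output compact: one row per leg rather than one per ping, which is what the downstream layers consume.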
Base Processing Layer
This layer takes the output of the Athena layer as input and attaches trip context to it. It also detects the node, if any, for every stoppage. Node detection is a bit tricky. Nodes, as mentioned earlier, are categorized as client nodes and RIVIGO nodes. RIVIGO nodes can be identified anywhere and in any time interval, but client nodes can be identified only in the time window relevant to the trip. Let’s look at the picture below:
Here, only C1 and C2’s nodes will be identified between two trips. We then use these detected nodes to predict loading/unloading and dry runs to/from the client warehouse.
At the beginning, we used a linear search to find the nearest neighbour node for each compressed leg of raw data. However, as our dataset grew, we needed a faster approach to node identification to reduce overall processing time. Hence, we switched to a k-nearest neighbour algorithm for better performance.
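One common way to speed up nearest-neighbour lookups over a linear scan is a spatial index. The sketch below uses a k-d tree built on unit-sphere coordinates, so that straight-line (chord) distance ranks nodes in the same order as great-circle distance. The choice of a k-d tree is an assumption made for illustration, since the index structure we use is not described here, and the node names, coordinates and the 500 m matching radius are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0

def to_xyz(lat, lon):
    """Map (lat, lon) in degrees to a point on the unit sphere."""
    la, lo = math.radians(lat), math.radians(lon)
    return (math.cos(la) * math.cos(lo), math.cos(la) * math.sin(lo), math.sin(la))

def build_kdtree(items, depth=0):
    """items: list of (xyz, name). Returns (item, left, right) or None."""
    if not items:
        return None
    axis = depth % 3
    items = sorted(items, key=lambda it: it[0][axis])
    mid = len(items) // 2
    return (items[mid],
            build_kdtree(items[:mid], depth + 1),
            build_kdtree(items[mid + 1:], depth + 1))

def nearest(tree, target, depth=0, best=None):
    """Return (squared_chord_distance, name) of the node closest to target."""
    if tree is None:
        return best
    (xyz, name), left, right = tree
    d2 = sum((a - b) ** 2 for a, b in zip(xyz, target))
    if best is None or d2 < best[0]:
        best = (d2, name)
    diff = target[depth % 3] - xyz[depth % 3]
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest(near, target, depth + 1, best)
    if diff ** 2 < best[0]:  # the far side could still hold a closer node
        best = nearest(far, target, depth + 1, best)
    return best

def nearest_node(tree, lat, lon, max_m=500.0):
    """Return (name, metres); name is None if nothing lies within max_m."""
    hit = nearest(tree, to_xyz(lat, lon))
    if hit is None:
        return None, math.inf
    chord = math.sqrt(hit[0])
    metres = 2 * EARTH_RADIUS_M * math.asin(min(1.0, chord / 2))
    return (hit[1], metres) if metres <= max_m else (None, metres)

# Hypothetical node list; the tree is built once and reused for every stoppage.
nodes = [("pitstop-A", 28.4595, 77.0266), ("toll-B", 27.1767, 78.0081),
         ("fuel-C", 26.9124, 75.7873)]
tree = build_kdtree([(to_xyz(lat, lon), name) for name, lat, lon in nodes])
name, metres = nearest_node(tree, 27.1770, 78.0080)
```

Building the tree costs a sort per level, but each lookup then visits only a small part of the tree on average instead of every node, which is where the saving over a linear scan comes from as the node list grows.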
Advanced Processing Layer
This layer creates the polished data required for all analysis. It adds business intelligence over the data from the previous layer so that reporting of various KPIs is easy. It also aggregates data from various microservices within our technology architecture, such as Fuel Desk and the ticketing system, validating its accuracy to a certain level before adding it to the relevant legs. Let’s go through some utilities of VC with a few examples:
There are many more utilities of VC, such as the mileage performance of a vehicle, pilot driving behaviour that causes low mileage, the mileage performance of a route etc.
Given such impactful use cases, VC is used to measure almost every vehicle KPI at RIVIGO.
For any data-driven problem, getting a proper, sanitized source in place is a big win. The continuum delivers the facts. This source of truth is helping us drive insights that lead to intelligent business decisions every day. It is driving the future of intelligence and optimisation for us at RIVIGO.