-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
1 parent
ced825a
commit ea262b9
Showing
1 changed file
with
51 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,51 @@ | ||
# venturo | ||
"Swipe right to your next adventurous ride" | ||
|
||
|
||
Venturo is a ridesharing platform with common destinations, e.g. to complement city passes. Users select multiple destinations; and matched with a driver and other users. Currently serving 2 cities: Chicago and New York City. | ||
|
||
## Demo | ||
Demo available at: | ||
|
||
## Data | ||
The plaform takes 2 data stream, both in JSON format: | ||
1. 🙋 From passenger : {id, current location, status, destinations ..} | ||
|
||
2. 🚕 From the driver: {id, current location, status, destination, passenger, ..} | ||
|
||
click here for a complete schema. The data is generated by placing both drivers and passenger on a map, and simulate a moving location toward destinations within 1-2 hops (euclidian). | ||
|
||
## Matching | ||
The matching is heuristic, and driver-centric: | ||
1) if a driver available / idle, scan for passenger nearby | ||
2) Dispatch | ||
3) On trip to next destination. Along the way, unless cab is full, driver continuously scan other passenger nearby with common destinations, and re-route if necesary. | ||
4) Once arrived, drivers's status are set back to idle to pick up other passengers. Passengers removed after 2 hours later. | ||
|
||
|
||
## Dashboard and Queries | ||
The dashboard put the drivers and passengers on the map in realtime fashion within 1 hour window. Since the data is simulated, some passengers and drivers may appear on unlikely places (e.g. river/lake). | ||
|
||
Green dots represent passengers, Blue represents drivers. Green inside blue represents passenger(s) on a trip. | ||
The stats: | ||
- Total cabs: total active drivers (including idle, on trip, and dispatched drivers) | ||
- Idle driver: cabs without pasenger | ||
- Total passenger: total passengers (including passengers on trip, waiting passenger, or passengers being pickup, etc) | ||
- On going trips: total cabs with passengers on going to destination | ||
- Passenger waiting: total passengers not assigned to any drivers | ||
- Average waiting time: in seconds, average time between waiting passengers to the time passengers hop on into the car. | ||
|
||
## Architecture | ||
__Ingestion layer__ | ||
Data streams for passengers and drivers generated separately in python (kafka producers), and ingested in Kafka as different topics. Kafka is set with 4 hours retention policy. | ||
|
||
Secor is used to save all raw streams into Amazon S3 for later purposes (batch, re-play, forensics, or analytics). | ||
|
||
__Stream processing__ | ||
Stream processing is performed in Spark Streaming with window 3 seconds, consuming data streams from both drivers and passangers. Every incoming messages are subject of sanity check: e.g. driver's reported status is matched with previous status, etc. to anticipate latency. | ||
|
||
__Sink__ | ||
Elasticsearch is used as the buffer/transactional interface for the resulted messages | ||
|
||
__Output__ | ||
Output is served as API by using Flask. |