some readme
kristiyanto authored Oct 6, 2016
1 parent ced825a commit ea262b9
Showing 1 changed file with 51 additions and 0 deletions.
51 changes: 51 additions & 0 deletions README.md
@@ -0,0 +1,51 @@
# venturo
"Swipe right to your next adventurous ride"


Venturo is a ridesharing platform built around common destinations, e.g. to complement city passes. Users select multiple destinations and are matched with a driver and with other users. It currently serves two cities: Chicago and New York City.

## Demo
Demo available at:

## Data
The platform takes two data streams, both in JSON format:
1. 🙋 From the passenger: {id, current location, status, destinations ..}

2. 🚕 From the driver: {id, current location, status, destination, passenger, ..}

Click here for a complete schema. The data is generated by placing both drivers and passengers on a map and simulating their movement toward destinations within 1-2 hops (Euclidean).
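As a rough illustration of the simulation described above, the sketch below moves a point a fixed step along the straight (Euclidean) line toward its destination and serializes a passenger update as JSON. The function names, the field names, and the reading of a "hop" as one straight-line step are assumptions for illustration only; the actual schema and generator may differ.

```python
import json
import math


def step_toward(current, destination, step_size):
    """Move `current` (x, y) at most `step_size` toward `destination` in a straight line."""
    dx = destination[0] - current[0]
    dy = destination[1] - current[1]
    dist = math.hypot(dx, dy)
    if dist <= step_size:
        # Close enough: arrive exactly at the destination.
        return destination
    ratio = step_size / dist
    return (current[0] + dx * ratio, current[1] + dy * ratio)


def passenger_message(passenger_id, location, status, destinations):
    """Serialize one passenger update. Field names are assumed, not the real schema."""
    return json.dumps({
        "id": passenger_id,
        "location": list(location),
        "status": status,
        "destinations": destinations,
    })
```

Calling `step_toward` once per tick and emitting `passenger_message` after each call would produce a stream of moving locations similar in spirit to the one described.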

## Matching
The matching is heuristic and driver-centric:
1) If a driver is available (idle), scan for passengers nearby.
2) Dispatch the driver to the matched passenger.
3) On the trip to the next destination: unless the cab is full, the driver continuously scans for other nearby passengers with common destinations and re-routes if necessary.
4) Once arrived, the driver's status is set back to idle so the driver can pick up other passengers. Passengers are removed 2 hours later.
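The steps above can be sketched as a single lookup that a driver runs on every update. This is a minimal sketch, not the project's implementation: the cab capacity, the search radius, the status strings (`"idle"`, `"waiting"`), and the dictionary fields are all assumed for illustration.

```python
import math

CAB_CAPACITY = 4       # assumed; the README only says "unless the cab is full"
PICKUP_RADIUS = 0.01   # assumed search radius, in the same units as the coordinates


def distance(a, b):
    """Straight-line distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def find_match(driver, passengers):
    """Return the nearest eligible waiting passenger for `driver`, or None."""
    if len(driver["passengers"]) >= CAB_CAPACITY:
        return None
    candidates = []
    for p in passengers:
        if p["status"] != "waiting":
            continue
        if distance(driver["location"], p["location"]) > PICKUP_RADIUS:
            continue
        # An idle driver takes anyone nearby; a driver already on a trip only
        # takes passengers who share the driver's current destination.
        if driver["status"] == "idle" or driver["destination"] in p["destinations"]:
            candidates.append(p)
    if not candidates:
        return None
    return min(candidates, key=lambda p: distance(driver["location"], p["location"]))
```

Picking the nearest candidate is one simple tie-breaking choice; the README does not say how the original ranks nearby passengers.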


## Dashboard and Queries
The dashboard plots drivers and passengers on the map in real time, within a 1-hour window. Since the data is simulated, some passengers and drivers may appear in unlikely places (e.g. a river or lake).

Green dots represent passengers; blue dots represent drivers. A green dot inside a blue dot represents passenger(s) on a trip.
The stats:
- Total cabs: total active drivers (including idle, on-trip, and dispatched drivers)
- Idle driver: cabs without a passenger
- Total passenger: total passengers (including passengers on a trip, waiting, or being picked up, etc.)
- On going trips: total cabs with passengers on the way to a destination
- Passenger waiting: total passengers not assigned to any driver
- Average waiting time: in seconds, the average time from when a passenger starts waiting until the passenger boards the car.
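To make the definitions above concrete, here is a small sketch that computes the same six figures from in-memory lists. In the real system these would presumably be queries against the data store; the status strings and the `driver`, `request_time`, and `pickup_time` fields are hypothetical names chosen for this example.

```python
def dashboard_stats(drivers, passengers):
    """Compute the dashboard figures from lists of driver and passenger dicts."""
    # Waiting time is only defined for passengers who have already boarded.
    waits = [
        p["pickup_time"] - p["request_time"]
        for p in passengers
        if p.get("pickup_time") is not None
    ]
    return {
        "total_cabs": len(drivers),
        # Assumption: "idle" is read from the reported status, so a dispatched
        # driver with an empty cab is not counted as idle.
        "idle_drivers": sum(1 for d in drivers if d["status"] == "idle"),
        "total_passengers": len(passengers),
        "ongoing_trips": sum(
            1 for d in drivers if d["status"] == "ontrip" and d["passengers"]
        ),
        "passengers_waiting": sum(1 for p in passengers if p.get("driver") is None),
        "average_waiting_seconds": sum(waits) / len(waits) if waits else None,
    }
```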

## Architecture
__Ingestion layer__
Data streams for passengers and drivers are generated separately in Python (Kafka producers) and ingested into Kafka as separate topics. Kafka is set with a 4-hour retention policy.
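A producer for the two topics might look like the sketch below. The topic names are hypothetical, and the README does not say which client library is used. The `publish` helper only assumes an object with a `send(topic, value=...)` method, which is the shape of `KafkaProducer.send` in the kafka-python package; `RecordingProducer` is a stand-in so the example runs without a broker.

```python
import json

PASSENGER_TOPIC = "passengers"  # hypothetical topic name
DRIVER_TOPIC = "drivers"        # hypothetical topic name


class RecordingProducer:
    """In-memory stand-in for a Kafka producer, for local experiments only."""

    def __init__(self):
        self.sent = []

    def send(self, topic, value=None):
        self.sent.append((topic, value))


def publish(producer, topic, message):
    """JSON-encode `message` as UTF-8 bytes and send it to `topic`."""
    payload = json.dumps(message).encode("utf-8")
    producer.send(topic, value=payload)
    return payload
```

Keeping drivers and passengers on separate topics lets the stream processor consume and validate each stream independently.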

Secor is used to save all raw streams into Amazon S3 for later purposes (batch, re-play, forensics, or analytics).

__Stream processing__
Stream processing is performed in Spark Streaming with a 3-second window, consuming data streams from both drivers and passengers. Every incoming message is subject to a sanity check (e.g. a driver's reported status is checked against its previous status) to account for latency.
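One way to picture the sanity check is as a table of allowed status transitions: a message whose reported status cannot follow the previously stored status is treated as stale or out of order. The transition table and status names below are assumptions made for this sketch, not the project's actual rules, and the function is plain Python rather than Spark code.

```python
# Assumed transition table: previous status -> statuses that may follow it.
ALLOWED_TRANSITIONS = {
    "idle": {"idle", "dispatched"},
    "dispatched": {"dispatched", "ontrip", "idle"},
    "ontrip": {"ontrip", "idle"},
}


def is_consistent(previous_status, reported_status):
    """Return True if `reported_status` may follow `previous_status`."""
    if previous_status is None:
        # First message seen for this driver: nothing to compare against.
        return True
    return reported_status in ALLOWED_TRANSITIONS.get(previous_status, set())
```

For example, under this table a late "idle to ontrip" report is rejected because the driver was never dispatched, while "dispatched to ontrip" is accepted.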

__Sink__
Elasticsearch is used as the buffer/transactional interface for the resulting messages.

__Output__
Output is served as an API using Flask.
