Commit

wip
george24601 committed May 12, 2024
1 parent a82efbd commit 2ceb10e
Showing 16 changed files with 554 additions and 3 deletions.
6 changes: 3 additions & 3 deletions _includes/google_analytics.html
@@ -1,9 +1,9 @@
-<!-- Global site tag (gtag.js) - Google Analytics -->
-<script async src="https://www.googletagmanager.com/gtag/js?id=UA-63136411-1"></script>
+<!-- Google tag (gtag.js) -->
+<script async src="https://www.googletagmanager.com/gtag/js?id=G-WFH4EKD7ZG"></script>
 <script>
   window.dataLayer = window.dataLayer || [];
   function gtag(){dataLayer.push(arguments);}
   gtag('js', new Date());

-  gtag('config', 'UA-63136411-1');
+  gtag('config', 'G-WFH4EKD7ZG');
 </script>
10 files renamed without changes.
110 changes: 110 additions & 0 deletions _posts/2024/2024-03-01-minerva.md
@@ -0,0 +1,110 @@
---
layout: post
title: "Minerva: Airbnb's Business Metrics Platform"
description: ""
category:
tags: [arch]
---

### Pain points

* New tables were created manually on top of `core_data` tables every other day, but there was no way to tell if similar tables already existed.
* Data lineage became impossible to track. When a data issue upstream was discovered and fixed, there was no guarantee that the fix would propagate to all downstream jobs.
* Different teams reported different numbers for very simple business questions, and there was no easy way to know which number was correct.
* `core_data` standardized table consumption and allowed users to quickly identify which tables to build upon. On the other hand, it burdened the centralized data engineering team with the impossible task of gate-keeping and onboarding an endless stream of new datasets into new and existing core tables. Furthermore, pipelines built downstream of `core_data` created a proliferation of duplicate and diverging metrics. We learned from this experience that table standardization was not enough, and that standardization at the metrics level is key to enabling trustworthy consumption. After all, users do not consume tables; they consume metrics, dimensions, and reports.


### Use cases

* Minerva’s product vision is to allow users to “define metrics once, use them everywhere”. That is, a metric created in Minerva should be easily accessible in dashboards, in our A/B testing framework, or in our anomaly detection algorithms that spot business anomalies.
* Users simply ask for metrics and dimension cuts, and receive the answers without having to worry about the “where” or the “how”.
* Data Catalog
* When a user searches for a metric in the Dataportal, Minerva metrics are ranked at the top of the search results. The Dataportal also surfaces contextual information, such as certification status, ownership, and popularity, so that users can gauge the relative importance of metrics. For most non-technical users, the Dataportal is their first entry point to metrics in Minerva.
* Data Exploration
* Upon selecting a metric, users are redirected to Metric Explorer, a component of the Dataportal that enables out-of-the-box data exploration. On a metric page, users can see trends of a metric with additional slicing and drill-down options such as `Group By` and `Filter`.
* A/B Testing
* All base events for A/B tests are defined and sourced from Minerva.
* Executive Reporting
* Data Analysis
* Minerva data is exposed to Airbnb’s custom R and Python clients through Minerva’s API. This allows data scientists to query Minerva data in a notebook environment with ease; see the sketch after this list.
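
As a rough illustration of that notebook workflow, here is a minimal sketch of what a query through Minerva’s API might look like from Python. The client function, endpoint URL, and response shape are hypothetical placeholders, not Airbnb’s actual client library.

```python
# Hypothetical sketch of a notebook query against Minerva's API; the endpoint,
# function name, and response shape are illustrative, not Airbnb's real client.
import pandas as pd
import requests


def query_metric(metric: str, groupby_dimension: str,
                 start_date: str, end_date: str,
                 api_url: str = "https://minerva.example.internal/api/v1/query") -> pd.DataFrame:
    """Request an aggregated metric cut by a dimension and return it as a DataFrame."""
    payload = {
        "metric": metric,
        "groupby_dimension": groupby_dimension,
        "start_date": start_date,
        "end_date": end_date,
    }
    response = requests.post(api_url, json=payload, timeout=60)
    response.raise_for_status()
    return pd.DataFrame(response.json()["rows"])


# Example: nights booked per destination region for August 2021.
nights = query_metric("nights_booked", "destination_region", "2021-08-01", "2021-09-01")
```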


### Metrics

* Simple metrics are composed of single materialized events (e.g., bookings).
* Filtered metrics are composed of simple metrics filtered on a dimension value (e.g., bookings in China).
* Derived metrics are composed of one or more non-derived metrics (e.g., search-to-book rate). A sketch of each type follows below.
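
To make the three types concrete, here is a minimal, hypothetical sketch of how they might be declared; the field names are illustrative only and do not reflect Minerva’s actual configuration schema.

```python
# Hypothetical metric definitions illustrating the three types; the field
# names are illustrative, not Minerva's actual configuration schema.
simple_metric = {
    "name": "bookings",
    "type": "simple",
    "event_source": "fct_bookings",  # a single materialized event
    "aggregation": "count",
}

filtered_metric = {
    "name": "bookings_china",
    "type": "filtered",
    "base_metric": "bookings",                      # a simple metric...
    "filter": "dim_destination_country = 'China'",  # ...restricted by a dimension value
}

derived_metric = {
    "name": "search_to_book_rate",
    "type": "derived",
    "formula": "bookings / searches",  # composed from non-derived metrics
}
```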

### Sample Request

* Average daily price (ADR), cut by destination region, excluding private rooms, for the four weeks of August 2021.

```json
{
  "metric": "price_per_night",
  "groupby_dimension": "destination_region",
  "global_filter": "dim_room_type!=\"private-room\"",
  "aggregation_granularity": "W-SAT",
  "start_date": "2021-08-01",
  "end_date": "2021-09-01",
  "truncate_incomplete_leading_data": "true",
  "truncate_incomplete_trailing_data": "true"
}
```
* Given that the `price_per_night` metric is a ratio metric (a special case of a derived metric) that contains a numerator (`gross_booking_value_stays`) and a denominator (`nights_booked`), the Minerva API breaks this request into two sub-requests, as sketched below.
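
A minimal sketch of that decomposition: run the two sub-requests independently, then divide per group. The function and field names are hypothetical, not Minerva’s actual implementation.

```python
# Hypothetical sketch of how the API might decompose a ratio metric:
# run the numerator and denominator sub-requests, then divide per group.
def serve_ratio_metric(request: dict, run_subrequest) -> dict:
    """run_subrequest(metric_name, request) -> {group_value: aggregated_total}"""
    numerator = run_subrequest("gross_booking_value_stays", request)
    denominator = run_subrequest("nights_booked", request)
    return {
        group: numerator[group] / denominator[group]
        for group in numerator
        if denominator.get(group)  # skip groups with a missing or zero denominator
    }
```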


### Minerva vs Shasta

#### Differences

* Target audience: Minerva mainly serves internal users, while Shasta mainly serves external users of Google Ads Manager.
* Latency: sub-second latency is a P0 for Shasta but only a nice-to-have for Minerva.
* Storage: reading from different storage systems is a P0 for Shasta and a nice-to-have for Minerva.
* Consistent metric definition: a P0 for Minerva, a P1 for Shasta.
* Reverse ETL: Shasta focuses on reducing reverse ETL by reading OLTP storage directly; this use case is not a focus for Minerva.

#### Similarities

* Separation of dimensions and metrics.
* Metrics often contain business logic
* Declarative metric computation
* Separation of dimension columns and metric columns


* Minerva takes (normalized) fact and dimension tables as inputs, de-normalizes them, and serves the aggregated data to downstream applications.
* It uses Airflow for workflow orchestration, Apache Hive and Apache Spark as the compute engines, and Presto and Apache Druid for consumption.
* Minerva defines key business metrics, dimensions, and other metadata in a centralized GitHub repository.
* Minerva de-normalizes data efficiently by reusing data and intermediate joined results.
* Minerva provides a unified data API to serve both aggregated and raw metrics on demand.
* Users define the “what” and not the “how”. How the metrics are calculated, stored, or served is entirely abstracted away from end users.
* Minerva focuses on metrics and dimensions as opposed to tables and columns.
* Minerva requires metric authors to provide self-describing metadata, such as ownership, lineage, and metric descriptions.


* At the heart of Minerva’s configuration system are event sources and dimension sources, which correspond to fact tables and dimension tables in a Star Schema design, respectively.
* Event sources define the atomic events from which metrics are constructed, and dimension sources contain the attributes and cuts that can be used in conjunction with the metrics.
* With Minerva, users can simply define a dimension set, an analysis-friendly dataset joined from Minerva metrics and dimensions; a hypothetical example follows below.
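
A hypothetical sketch of what a dimension set definition might look like; the field names are illustrative, not Minerva’s actual configuration format.

```python
# Hypothetical sketch of a dimension set: an analysis-friendly table joined
# from Minerva metrics and dimensions. Field names are illustrative only.
dimension_set = {
    "name": "bookings_by_room_type_and_region",
    "metrics": ["nights_booked", "gross_booking_value_stays"],  # from event sources
    "dimensions": ["dim_room_type", "destination_region"],      # from dimension sources
    "join_key": "listing_id",   # shared key between fact and dimension tables
    "granularity": "daily",
}
```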

### Computational flow

* Ingestion Stage: Partition sensors wait for upstream data and data is ingested into Minerva.
* Data Check Stage: Data quality checks are run to ensure that upstream data is not malformed.
* Join Stage: Data is joined programmatically based on join keys to generate dimension sets.
* Post-processing and Serving Stage: Joined outputs are further aggregated and derived data is virtualized for downstream use cases.

In the ingestion stage, Minerva waits for upstream tables to land and ingests the data into static Minerva tables. This ingested data becomes the source of truth and can only be changed by modifying the Minerva configs.
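
As a schematic illustration only, the four stages above could be wired up as an Airflow DAG roughly like the sketch below (the notes above say Minerva uses Airflow for orchestration, but the task ids and no-op callables here are invented placeholders).

```python
# Schematic only: the four stages as an Airflow DAG. Task ids and the no-op
# callables are invented placeholders; the real pipeline would use partition
# sensors, Hive/Spark jobs, and Minerva's own operators.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def noop(**_):
    """Placeholder for the real ingestion/check/join/serving logic."""


with DAG(
    dag_id="minerva_dimension_set_example",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    ingest = PythonOperator(task_id="ingest_upstream_partitions", python_callable=noop)
    check = PythonOperator(task_id="run_data_quality_checks", python_callable=noop)
    join = PythonOperator(task_id="join_dimension_sets", python_callable=noop)
    serve = PythonOperator(task_id="post_process_and_serve", python_callable=noop)

    ingest >> check >> join >> serve
```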

Here are some typical checks that Minerva runs (a sketch follows after this list):
* sources should not be empty
* timestamps should not be NULL and should meet ISO standards
* primary keys should be unique
* dimension values should be consistent with what is expected
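
As a sketch, these checks map naturally onto simple assertions over the ingested data. The pandas-based helper below is illustrative only, not Minerva’s actual data-check framework.

```python
# Illustrative pandas versions of the checks above; not Minerva's actual
# data-check framework.
import pandas as pd


def run_source_checks(df: pd.DataFrame, primary_key: str,
                      timestamp_col: str, expected_dims: dict) -> None:
    # Sources should not be empty.
    assert not df.empty, "source table is empty"
    # Timestamps should not be NULL and should parse cleanly.
    assert df[timestamp_col].notna().all(), "NULL timestamps found"
    pd.to_datetime(df[timestamp_col], errors="raise")  # raises on malformed values
    # Primary keys should be unique.
    assert df[primary_key].is_unique, "duplicate primary keys found"
    # Dimension values should be consistent with what is expected.
    for dim, allowed in expected_dims.items():
        unexpected = set(df[dim].dropna()) - set(allowed)
        assert not unexpected, f"unexpected values in {dim}: {unexpected}"
```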



### References

* [How Airbnb Achieved Metric Consistency at Scale](https://medium.com/airbnb-engineering/how-airbnb-achieved-metric-consistency-at-scale-f23cc53dea70)
* [How Airbnb Standardized Metric Computation at Scale](https://medium.com/airbnb-engineering/airbnb-metric-computation-with-minerva-part-2-9afe6695b486)

60 changes: 60 additions & 0 deletions _posts/2024/2024-03-08-leading-teams.md
@@ -0,0 +1,60 @@
---
layout: post
title: "Leading Teams"
description: ""
category:
tags: [arch]
---

From [How to Lead a Team](https://abseil.io/resources/swe-book/html/ch05.html)

* Managers and tech leads work together to ensure that engineers' tasks match their skill sets and skill levels. On smaller teams, both roles may be held by the same person.
* Most TLs are ICs, and they should delegate tasks to team members even if it means slower delivery. Otherwise, the team is hard to grow in size and capacity.
* Act quickly on difficult situations, e.g., around productivity, skill, or motivation, because these problems rarely work themselves out.

#### Influence without authority

* Identify a strategic need for the company and show how it is linked to existing priorities.
* Automate your process to reduce friction.
* Work to build team consensus.
* If your team is moving quickly, sometimes it will voluntarily concede authority and direction to team leads. Even though this might look like a dictatorship or oligarchy, when it’s done voluntarily, it’s a form of consensus.


#### Moving from IC to a leadership role

* “Above all, resist the urge to manage.”
* A team's social health is as important as its technical health, but harder to manage.
* At the end of each one-on-one meeting, ask: “What do you need?”
* Great managers worry about what things get done (and trust their team to figure out how).
* Drive consensus and set direction, then let the people building the product figure out how it should be done.
* You will not get everything right, nor will you have all the answers, and if you act like you do, you’ll quickly lose the respect of your team.
* Your job is to inspire, but inspiration is a 24/7 job. Your visible attitude about absolutely everything, no matter how trivial, is unconsciously noticed and spreads infectiously to your team.


### Leading at Scale

From [Leading at Scale](https://abseil.io/resources/swe-book/html/ch06.html)

* Go “broad” rather than “deep”.
* Loss of detail
* Engineering expertise becomes less relevant
* You depend on technical intuition and the ability to move engineers in good directions
* Your job becomes less proactive and more reactive. The higher up in leadership you go, the more escalations you receive. You are the “finally” clause in a long list of code blocks!
* 95% observation and listening, and 5% making critical adjustments in just the right place. Listen to your leaders and skip-reports. Talk to your customers, who may not be end users out in the world, but your coworkers. Customers’ happiness requires just as much intense listening as your reports’ happiness.
* Walk a fine line between "too rigid" and "too vague" team structures
* A common mistake is to put a team in charge of a specific product rather than a general problem. A product is a solution to a problem. The life expectancy of solutions can be short, and products can be replaced by better solutions. However, a problem, if chosen well, can be evergreen.

#### Strategy

* Focus more on high-level strategy. Most of the decisions are about finding the correct set of trade-offs.
* To defend against “analysis paralysis”, frame your process as a continuous re-balancing of trade-offs.
* Guide people through solving difficult, ambiguous problems, i.e., problems with no obvious solution that may not even be solvable.
* See the forest, find a path to the important trees, and let the engineers chop them down.
* Your strategy needs to cover not just overall technical direction, but an organizational strategy as well. You’re building a blueprint for how the ambiguous problem is solved and how your organization can manage the problem over time.

#### Delegation

* Leave a trail of self-sufficient success in your wake.
* Resist the trap of “If you want something done right, do it yourself.”
* Delegation is absolutely the most effective way to train your leaders: give them an assignment, let them fail, and have them try again and again.
* Unless the task is truly time sensitive and on fire, bite the bullet and assign the work to someone else, presumably someone who you know can do it but will probably take much longer to finish. You need to create opportunities for your leaders to grow; they need to learn to "level up" and do this work themselves so that you’re no longer in the critical path.