Chronon, Airbnb's ML feature platform, is now open source

vquemener | 224 points

It's refreshing to read something about ML and inference and have it not be anything related to a transformer architecture sending up fruit growing from a huge heap of rotten, unknown, mostly irrelevant data. With traditional ML, it's useful to talk about the sources of bias and error, and even measure some of them. You can do things that improve them without starting over on everything else.

With LLMs, it's more like you buy a large pancake machine that you dump all of your compost into (and you suspect the installers might have hooked up to your sewage line as input too). It triples your electricity bill, it makes bizarre screeching noises as it runs, you haven't seen your cat in a week, but at the end out come some damn fine pancakes.

I apologize. I'm talking about the very thing I said it was a relief not to be talking about.

sfink | a year ago

What is the difference between an ML feature store and a low-latency OLAP DB platform/data warehouse? I see many similarities between the two, such as the ability to aggregate large data sets in a very short time.

giovannibonetti | a year ago

First of all, congrats on the release! Well done. A few questions:

- Since the platform is designed to scale, it would be nice to see scalability benchmarks

- Is the platform compatible with human-in-the-loop workflows? In my experience, those workflows tend to have vastly different needs from fully automated workflows (e.g. online advertising)

whiplash451 | a year ago

Author. Happy to answer any questions.

nikhilsimha | a year ago

How does Chronon handle mutable data when backfilling? Or does it make some assumptions about the underlying data?

xiasongh | a year ago

Looks very useful. I'm not aware of any open source alternative (although I could just be ignorant here!)

Reubend | a year ago

Great work! When it comes to batched computations, why not leverage intermediate state, much like streaming jobs do? For example, if we need to calculate the past 30-day sum for a value daily - it seems like this would compute it from scratch every day. Would it not make sense to model this as a sliding window that's updated daily?
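
To make the question concrete, here is a rough sketch of the kind of sliding window I have in mind. It is only an illustration: `SlidingSum`, `add_day`, and `from_scratch` are names I made up, none of this is Chronon's API, and I am not claiming to know how Chronon computes its windows internally.

```python
from collections import deque


class SlidingSum:
    """Rolling sum over the last `window` daily values, updated incrementally.

    Hypothetical helper for illustration only; not part of Chronon.
    """

    def __init__(self, window=30):
        self.window = window
        self.values = deque()  # intermediate state carried from day to day
        self.total = 0

    def add_day(self, value):
        # Add today's value, then evict the value that fell out of the window.
        self.values.append(value)
        self.total += value
        if len(self.values) > self.window:
            self.total -= self.values.popleft()
        return self.total


def from_scratch(daily, window=30):
    """Baseline: recompute the full window sum for every day."""
    return [sum(daily[max(0, i - window + 1): i + 1]) for i in range(len(daily))]


# Made-up daily values, just to check that both approaches agree.
daily = [d % 7 for d in range(100)]
roller = SlidingSum(window=30)
incremental = [roller.add_day(v) for v in daily]
assert incremental == from_scratch(daily)
```

The incremental version does constant work per day instead of re-reading 30 days of data. The subtraction trick only works for aggregations that can be reversed, like sum and count; something like min or max would need a different kind of intermediate state.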

evolutionblues | a year ago
[deleted]
| a year ago

What does Airbnb use ML for?

siquick | a year ago
[deleted]
| a year ago

Paywalled for me

travisporter | a year ago

The downside is after you use the platform for a week, you have to delete all the expired models yourself and clean up all the labels or face a hefty housekeeping surcharge.

djaykay | a year ago

Why do major sites still use Medium as a blog platform?

syntaxing | a year ago

[dead]

jumpora | a year ago