Previous discussions:
https://news.ycombinator.com/item?id=11388196
https://news.ycombinator.com/item?id=17773874
https://news.ycombinator.com/item?id=26886792
(Honest commentary, I'm not being snarky.)
I run Netdata on my home server using the official Docker image. I mostly use it to detect runaway containers and monitor system temperature. For these use cases it works great, and I like that it's self-contained; way less headache than stringing together Graphviz stuff, or setting up Nagios or Prometheus.
Beware: by default it'll send telemetry home, and the web UI will try to "register" your instance with some kind of cloud service. I find this super annoying, but it's possible to turn it off; it's just not well documented.
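For reference, the Netdata docs describe two opt-out mechanisms for the anonymous statistics: an environment variable for the Docker image, and a marker file for regular installs. A sketch (exact names per the docs at the time of writing; verify against the current documentation):

```shell
# Option 1: official Docker image, disable anonymous telemetry via env var
docker run -d --name netdata \
  -p 19999:19999 \
  -e DISABLE_TELEMETRY=1 \
  netdata/netdata

# Option 2: non-Docker installs, create an opt-out marker file
# in the Netdata config directory
touch /etc/netdata/.opt-out-from-anonymous-statistics
```

Note this only covers the usage statistics, not the cloud "registration" nag in the UI, which is configured separately.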
There are also a lot of plugins that scrape many kinds of logs, look at process data, etc. Again, they might be useful, but for a home user it's much better to turn them all off.
My notes have a write-up of how to run Netdata via Docker, with example config files that disable the unwanted features: https://jake.tl/notes/2019-11-09-cozy-server#netdata
It’s very neat for individual servers
It doesn't work well for monitoring multiple servers, though, from what I can tell.
Note that Netdata phones home without consent in the default configuration. For many, the whole point of doing system administration is self-hosting and autonomy, and privacy is frequently a big component of that.
Netdata blows a big hole in that by transmitting your usage information off of your box without getting permission.
I played around with Netdata just yesterday on my home server. Great tool, but the defaults are overkill for my needs. After spending an hour trying to simplify it (i.e., disable most of the "collectors") using the documentation, I finally gave up.
Settled on neofetch [1] instead: pure Bash. I wrote my own custom inputs, including color coding for incident reporting, in less time than it took me to strip down Netdata. Highly recommended if you want to spend your time on things other than (setting up) server monitoring.
Could someone enlighten me on the internals? How is Netdata able to get real-time granularity, whereas Prometheus defaults to 15s?
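Part of the answer (my understanding, not Netdata's actual code): the agent runs on the box itself and samples kernel counters in a per-second loop, so there's no network scrape in the hot path the way there is with Prometheus's pull model. A hypothetical sketch of per-second CPU sampling from /proc/stat:

```python
import time


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into (busy, total) jiffies."""
    fields = [int(x) for x in line.split()[1:]]
    idle = fields[3] + fields[4]  # idle + iowait
    return sum(fields) - idle, sum(fields)


def cpu_percent(prev, curr):
    """CPU utilisation between two (busy, total) samples."""
    dbusy, dtotal = curr[0] - prev[0], curr[1] - prev[1]
    return 100.0 * dbusy / dtotal if dtotal else 0.0


def sample():
    """One local read of /proc/stat -- cheap enough to do every second."""
    with open("/proc/stat") as f:
        return parse_cpu_line(f.readline())


# A collector loop would be: s0 = sample(); time.sleep(1); s1 = sample();
# emit cpu_percent(s0, s1). One local file read per second, no network
# round trip, which is why 1s granularity costs so little.
```

This is just an illustration of why per-second local collection is affordable; Netdata's real collectors are C and cover far more than CPU.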
I haven't been able to use its graphical interface to view historical data. At least it uses fewer resources than Grafana.
We have a central influxdb with telegraf metrics among others, and some grafana graphs.
I still install netdata on every machine though. Almost never use it, but there have been some times where it was useful to look at netdata. It's light weight enough that it hasn't been a problem.
The only gripe I have with it is the approach to security, i.e. the lack of user accounts (even a single one). So you either have to restrict access to the stats by IP (who does that these days?) or use other workarounds like proxying through Nginx, etc.
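The Nginx workaround is just a reverse proxy with basic auth in front of the dashboard. A sketch, assuming Netdata on its default port 19999, a hypothetical hostname, and an htpasswd file you create yourself:

```nginx
server {
    listen 443 ssl;
    server_name netdata.example.com;  # hypothetical hostname

    # ssl_certificate / ssl_certificate_key directives omitted for brevity

    location / {
        auth_basic "Netdata";
        auth_basic_user_file /etc/nginx/netdata.htpasswd;  # htpasswd -c ...
        proxy_pass http://127.0.0.1:19999;
        proxy_set_header Host $host;
    }
}
```

You'd also want Netdata itself bound to 127.0.0.1 only, so the proxy is the sole way in.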
Netdata is a great building block in a monitoring system. It now does a lot of monitoring via eBPF, connects to Prometheus, and integrates with k8s.
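On the Prometheus side, the agent exposes its metrics at `/api/v1/allmetrics`, so Prometheus can scrape it like any other target. A minimal scrape-config sketch (default port 19999 and an example target address assumed):

```yaml
scrape_configs:
  - job_name: "netdata"
    metrics_path: "/api/v1/allmetrics"
    params:
      format: ["prometheus"]
    static_configs:
      - targets: ["192.0.2.10:19999"]  # example host, substitute your own
```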
Why is Netdata popular on HN right now? There must be some big news or something.
Why would I use it over DataDog?
Is it any good? ;)
How does it compare to New Relic, which (if enabled) also monitors containers and system things?
I write and maintain an open source monitoring tool and I looked into adding a mode to output metrics in Netdata format and ran away screaming. It's just an unstructured text format where you output commands to stdout, one per line. Each command consists of whitespace-separated fields. Which field is the units? Oh, the 4th. And some fields are optional, I'm not even sure how that works but I think you can't skip an optional field if you then want to use any field after that. It's like structured data formats like JSON or god forbid XML never happened.
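For the curious, the external-plugin protocol looks roughly like this (from memory, so the field order and optionality may be off; the external-plugins docs are the authoritative spec). A plugin declares a chart and its dimensions once, then emits BEGIN/SET/END groups each second:

```
CHART example.random 'random' 'A random number' 'value' 'random' '' line 1000 1
DIMENSION rand 'rand' absolute 1 1
BEGIN example.random
SET rand = 42
END
```

Every line is a whitespace-separated command on stdout, which is what the parent comment is describing.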
All these graphs are never really actionable, and they're only of interest for a short period of time; you stop looking at them after a while because they don't mean anything unless you already know where and when the problem is.
A server admin wants an "Incidents" panel that shows only the anomalous components at the top, coupled with an adjustable alerting mechanism, not just a blind dump of all the data there is.
There are so many tools that do this and pretend it's impressive, including ELK, but whether it's Grafana or Kibana, you need a lot of manual tweaking to make the dashboards actually useful.