The pytest-evals README mentions that it's built on pytest-harvest, which works with pytest-xdist and pytest-asyncio.
pytest-harvest: https://smarie.github.io/python-pytest-harvest/ :
> Store data created during your pytest tests execution, and retrieve it at the end of the session, e.g. for applicative benchmarking purposes
westurner | 15 hours ago
Hey HN! Creator here. I recently found myself writing evaluations for actual production LLM projects, and kept facing the same dilemma: either reinvent the wheel or use a heavyweight commercial system with tons of features I don't need right now.
Then it hit me - evaluations are just (kind of) tests, so why not write them as such using pytest?
That's why I created pytest-evals - a lightweight pytest plugin for building evaluations. It's intentionally not a sophisticated system with dashboards (and not suitable as a "robust" solution). It's minimalistic, focused, and definitely not trying to be a startup
Would love to hear your thoughts and if you find this useful, a GitHub star would be appreciated