Launch HN: Midship (YC S24) – Turn PDFs, docs, and images into usable data

maxmaio | 53 points

Congrats on the launch. I just sent y'all an email – I'm curious what you can do with airline crew rosters.

ctippett | 4 hours ago

Here's a real-world use case: our company has changed pension providers. The new provider, like the old one, sucks at giving me a good way to navigate the 120 funds I can invest in.

I want to create something that can paginate through 12 pages of HTML, perform clicks, download each fund's PDF factsheet, and extract data from that factsheet into Excel or CSV. Can this help? What's the best way to deal with the initial task of automating webpage interactions systematically?
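
For illustration, the link-collection and CSV-export parts of a workflow like this could be sketched as follows. This is a minimal sketch using only the Python standard library, and it assumes the factsheets are linked as plain `<a href="….pdf">` anchors in the page HTML. The names `PdfLinkParser`, `find_pdf_links`, and `rows_to_csv` are hypothetical helpers, not part of any product mentioned in this thread. Pages that only reveal links after clicks or JavaScript would likely need a browser automation tool instead, and extracting fields from the PDFs themselves is not covered here.

```python
import csv
import io
from html.parser import HTMLParser
from urllib.parse import urljoin


class PdfLinkParser(HTMLParser):
    """Collects absolute URLs of <a> tags whose href points at a .pdf file."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Ignore any query string when checking the file extension.
        if href and href.lower().split("?")[0].endswith(".pdf"):
            self.links.append(urljoin(self.base_url, href))


def find_pdf_links(html, base_url):
    """Return all PDF links found in one page of HTML (hypothetical helper)."""
    parser = PdfLinkParser(base_url)
    parser.feed(html)
    return parser.links


def rows_to_csv(rows, fieldnames):
    """Serialize a list of dicts to CSV text (hypothetical helper)."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

Running `find_pdf_links` over each of the 12 listing pages would give the set of factsheet URLs to download; the extracted fields, however they are obtained, could then be passed to `rows_to_csv` as one dict per fund.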

monkeydust | 5 hours ago

What's pricing look like with HIPAA compliance?

prithvi24 | an hour ago

Honest question, but how do you see your business being affected as foundation models improve? While I have massive complaints about them, Gemini + structured outputs is working remarkably well for this internally and it's only getting better. It's also an order of magnitude cheaper than anything I've seen commercially.

serjester | 6 hours ago

Congrats on the launch!

I’m curious to hear more about your pivot from AI workflow builder to document parsing. I can see correlations there, but that original idea seems like a much larger opportunity than parsing PDFs to tables in what is an already very crowded space. What verticals did you find have this problem specifically that gave you enough conviction to pivot?

ivanvanderbyl | 7 hours ago

Saw that Reducto released a benchmark related to your product (https://reducto.ai/blog/rd-tablebench). Curious about your take on the benchmark and how well Midship performs.

zh2408 | 7 hours ago

How does your accuracy compare with VLMs like ColFlor and ColPali?

nostrebored | 8 hours ago

Congrats on the launch... You're in a crowded space. What differentiates Midship? What are you doing that's novel?

tlofreso | 8 hours ago

Are users able to export their organized data?

seany62 | 8 hours ago

This is interesting.

Can you do this with emails?

hk1337 | 8 hours ago

Congrats on the launch! A quick search in the YC startup directory brought up 5-10 companies doing pretty much the same thing:

- https://www.ycombinator.com/companies/tableflow

- https://www.ycombinator.com/companies/reducto

- https://www.ycombinator.com/companies/mindee

- https://www.ycombinator.com/companies/omniai

- https://www.ycombinator.com/companies/trellis

At the same time, accurate document extraction is becoming a commodity with powerful VLMs. Are you planning to focus on a specific industry, or how do you plan to differentiate?

hubraumhugo | 8 hours ago
