I love the idea - owning the browser definitely seems like the right approach.
I tried it out on a workflow I've been manually piecing together and it gave me a bunch of "Error encountered, contact support" messages when doing things like clicking on a form input field, or even a button.
The more complex "Instruction" block worked correctly, though (literally things like "click the 'Sign In' button"), but then I ran out of the 5 minutes of free run time when trying to go through the full flow. I expect this kind of thing will be fixed soon, as it grows.
In terms of ultimate utility, what I really want is something that can export scripts that run entirely locally, but fall back to the more dynamic AI-enhanced version when an error is encountered. I would want Autotab to generate the workflow, which I could then run on my own hardware in bulk.
Anyway, great work! This is definitely the best implementation I've seen of that glimpsed future of capable AI web browsing agents.
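The "run locally, fall back to the AI version on error" idea from the comment above could be sketched roughly like this. It is only an illustration of the control flow, not Autotab's actual export format or API: `run_workflow`, `StepError`, the step dictionaries, and both runner callbacks are hypothetical names made up for this example.

```python
from typing import Callable, Dict, List, Tuple

Step = Dict[str, str]


class StepError(Exception):
    """Raised by the (hypothetical) local runner when a scripted step fails."""


def run_workflow(
    steps: List[Step],
    run_local: Callable[[Step], None],
    run_fallback: Callable[[Step], None],
) -> List[Tuple[str, str]]:
    """Try each step with the cheap local runner; on failure, hand that
    single step to the slower fallback runner and then continue."""
    results = []
    for step in steps:
        try:
            run_local(step)
            results.append((step["name"], "local"))
        except StepError:
            run_fallback(step)
            results.append((step["name"], "fallback"))
    return results


# Stand-in runners so the sketch is runnable without a browser.
DEMO_STEPS = [
    {"name": "open_login", "selector": "#login"},
    {"name": "click_sign_in", "selector": "#stale",
     "instruction": "click the 'Sign In' button"},
    {"name": "submit", "selector": "#submit"},
]

fallback_log: List[str] = []


def demo_local(step: Step) -> None:
    # Pretend the recorded selector "#stale" no longer matches the page.
    if step["selector"] == "#stale":
        raise StepError(f"selector not found: {step['selector']}")


def demo_fallback(step: Step) -> None:
    # A real fallback would call an AI-driven agent; here we only record
    # the natural-language instruction it would have been given.
    fallback_log.append(step.get("instruction", step["name"]))
```

Calling `run_workflow(DEMO_STEPS, demo_local, demo_fallback)` runs two steps locally and routes only the broken one through the fallback, which is the property that would keep bulk runs cheap.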
This is awesome! What is your most common use case? Have you thought of competing with https://scribehow.com/ in the documentation space?
Very neat in theory but I'm failing to find any technical details.
At which layer is the automation happening? Inside the browser, using dev tools? Multiple?
What is the self-healing mechanic? I'm guessing invoking an LLM to find what happened and fix it?
I guess what I'm wondering is: is this some sort of hybrid between computer use and dev tools usage?
This is awesome. I was just trying to get a rudimentary version of this for some "user" interaction heavy data extraction. Definitely giving it a try.
For a case with lots of requests, how does Autotab handle IP blocking? Does each run use a different portal instance?
I see it's able to perform data extraction, but what if you wanted to enter in data from another system, or generated by an LLM during the workflow?
You say "try it for free" but your website has no pricing information at all. Is this free for just a while? Free forever? What is your monetization strategy?
Can I point it at my own LLM or am I locked into using OpenAI?
Pretty slick. I recorded a session for ordering from a restaurant website, and it did repeat the entire workflow. It had some issues with a modal that popped up, but all in all, well done! We have been trying to robotify the task of ordering from restaurants for our clients, and it seems like your solution could work well for us. I am guessing that you want your users to use the Autotab browser, so what is the use for the API?
Looks nice. Anybody else in this space? This one is on the pricier end, but I’m just a single user, so maybe not the target customer.
If this were an OSS project automating a specific service, many HN-ers would come and bleat about TOS violations and being scared/wary of C&Ds.
How does this not violate TOS? Do you have legal protection set up from megacorps trying to bully you with legal threats?
Automation despite TOS via Adversarial Interop should be a Digital Human Right. Godspeed.
Is Autotab able to scrape data from multiple websites with different structures and combine it into structured data in one CSV or JSON file? Example: scrape the interest rates offered on savings accounts from multiple bank websites, extract the name of the bank, bank logo, product name, and interest rate for each account, and run this saved query on a regular schedule (daily, weekly, etc.)?
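To make the "different structures, one file" part of that question concrete, here is a minimal sketch of the combining step only, assuming each site's scraped record has already been fetched. Nothing here is Autotab's API: the field names, the two `normalize_*` functions, and the sample records are invented placeholders, and the logo and scheduling parts are left out.

```python
import csv
import io
from typing import Dict, List

# Common schema every site-specific record is mapped into.
FIELDS = ["bank", "product", "rate_percent"]


def normalize_site_a(raw: Dict) -> Dict:
    # Hypothetical site A reports the rate as a string such as "4.10%".
    return {
        "bank": raw["name"],
        "product": raw["account"],
        "rate_percent": float(raw["apy"].rstrip("%")),
    }


def normalize_site_b(raw: Dict) -> Dict:
    # Hypothetical site B reports the rate as a fraction such as 0.0395.
    return {
        "bank": raw["institution"],
        "product": raw["product_title"],
        "rate_percent": round(raw["rate"] * 100, 2),
    }


def to_csv(rows: List[Dict]) -> str:
    """Serialize normalized rows to CSV text with a fixed column order."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Made-up placeholder records standing in for scraped output.
rows = [
    normalize_site_a(
        {"name": "Example Bank A", "account": "Saver", "apy": "4.10%"}
    ),
    normalize_site_b(
        {"institution": "Example Bank B", "product_title": "Online Savings",
         "rate": 0.0395}
    ),
]
csv_text = to_csv(rows)
```

The point of the per-site normalizer functions is that adding a new bank only means writing one more small mapping, while the output schema and the CSV writer stay unchanged.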