YC is wrong about LLMs for chip design
I don’t mind LLMs in the ideation and learning phases, which aren’t reproducible anyway. But I still find it hard to believe engineers of all people are eager to put a slow, expensive, non-deterministic black box right at the core of extremely complex systems that need to be reliable, inspectable, understandable…
Anything that requires deep “understanding” or novel invention is not a job for a statistical word regurgitator. I’ve yet to see a single example, in any field, of an LLM actually inventing something truly novel (as judged by the experts in that space). Where LLMs shine is in producing boilerplate -- though that is super useful. So far I have yet to see anything resembling an original “thought” from an LLM (and I use AI at work every day).
YC doesn't care whether it "makes sense" to use an LLM to design chips. They're as technically incompetent as any other VC, and their only interest is to pump out dogshit startups in the hopes one gets acquired. Garry Tan doesn't care about "making better chips": he cares about finding a sucker to buy out a shitty, hype-based company for a few billion. An old-school investment bank would be perfect.
YC is technically incompetent and isn't about making the world better. Every single one of their words is a lie and hides the real intent: make money.
I worked on the Qualcomm DSP architecture team for a year, so I have a little experience with this area but not a ton.
The author here is missing a few important things about chip design. Most of the time spent and work done is not writing high-performance Verilog. Designers spend a huge amount of time answering questions, writing documentation, copying around boilerplate, reading obscure manuals and diagrams, etc. LLMs can already help with all of those things.
I believe that LLMs in their current state could help design teams move at least twice as fast, and better tools could probably change that number to 4x or 10x even with no improvement in the intelligence of models. Most of the benefit would come from allowing designers to run more experiments and try more things, to get feedback on design choices faster, to spend less time documenting and communicating, and spend less time reading poorly written documentation.
I agree with most of the technical points of the article.
But there may still be value in YC calling for innovation in that space. The article correctly shows that there is no easy win in applying LLMs to chip design: either the market for a given application is too small, in which case LLMs can help but who cares, or the chip is too important, in which case you'd rather use the best engineers. Unlike software, we're not getting much of a long-tail effect in chip design. Taping out a chip is just not something a hacker can do, and even playing with an FPGA has a high cost of entry compared to hacking on your PC.
But if there was an obvious path forward, YC wouldn't need to ask for an innovative approach.
I know nothing about chip design. But saying "Applying AI to field X won't work, because X is complex, and LLMs currently have subhuman performance at this" always sounds dubious.
VCs are not investing in the current LLM-based systems to improve X, they're investing in a future where LLM based systems will be 100x more performant.
Writing is complex, LLMs once had subhuman performance, and yet. Digital art. Music (see suno.AI). There is a pattern here.
YC is just spraying & praying AI, like most investors
The way I read that, I think they're saying hardware acceleration of specific algorithms can be 100 times faster and more efficient than the same algorithm in software on a general-purpose processor. And since automated chip design has proven to be a difficult problem space, maybe we should try applying AI there so we can have a lower bar to specialized hardware accelerators for various tasks.
I do not think they mean to say that an AI would be 100 times better at designing chips than a human, I assume this is the engineering tradeoff they refer to. Though I wouldn't fault anyone for being confused, as the wording is painfully awkward and salesy.
I've been designing chips for almost 30 years.
We have a bunch of AI initiatives in my company but most of them are about using Copilot to help write scripts to automate the design flow. Our physical design flow is thousands of lines of Tcl and Python code.
The article mentions High Level Synthesis. I've been reading about this since my first job in the 1990s. I've worked on at least 80 chips and I've never seen any chip use one of these tools except for some tiny section that was written by some academics who didn't want to learn Verilog for reasons.
This is a great article, but the main principle at YC is to assume that technology will continue progressing at an exponential rate and then to think about what that would enable. Their proposals always assume the startups will ride some kind of Moore's Law for AI, and hardware synthesis is an obvious use case. So the assumption is that in 2 years there will be a successful AI hardware synthesis company, and all they're trying to do is get ahead of the curve.
I agree they're probably wrong but this article doesn't actually explain why they're wrong to bet on exponential progress in AI capabilities.
As a former chip designer (been 16 years, but looks like tools and our arguments about them haven't changed much), I'm both more and less optimistic than OP:
1. More because fine-tuning with enough good Verilog as data should let the LLMs do better at avoiding mediocre Verilog (existing chip companies have more of this data already, though). Plus non-LLM tools will remain, so you can chain those tools to test that the LLM hasn't produced Verilog that synthesizes to a large area, etc. (see the sketch after this list)
2. Less because when creating more chips for more markets (if that's the interpretation of YC's RFS), the limiting factor will become the cost of using a fab (mask sets cost millions), and then integrating onto a board/system the customer will actually use. A half-solution would be if FPGAs embedded in CPUs/GPUs/SiPs on our existing devices took off.
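To make point 1 concrete, here is a minimal sketch of that kind of tool chaining, assuming the open-source synthesizer yosys is installed; the generate_verilog() helper, file names, and cell-count threshold are placeholders for illustration, not a real flow.

```python
# Hedged sketch: gate LLM-generated Verilog behind a conventional synthesis check.
# Assumes yosys is installed and on PATH; generate_verilog() is a hypothetical
# stand-in for whatever LLM call produces the candidate design.
import re
import subprocess

def generate_verilog(prompt: str) -> str:
    raise NotImplementedError  # placeholder for the LLM call

def synthesized_cell_count(verilog_path: str) -> int:
    """Run generic yosys synthesis and parse the cell count from its 'stat' report."""
    result = subprocess.run(
        ["yosys", "-p", f"read_verilog {verilog_path}; synth; stat"],
        capture_output=True, text=True, check=True,  # raises if synthesis fails
    )
    match = re.search(r"Number of cells:\s+(\d+)", result.stdout)
    return int(match.group(1)) if match else -1

def check_llm_design(prompt: str, max_cells: int = 5000) -> bool:
    """Return True only if the LLM's Verilog synthesizes to a reasonable area."""
    with open("candidate.v", "w") as f:
        f.write(generate_verilog(prompt))
    cells = synthesized_cell_count("candidate.v")
    return 0 <= cells <= max_cells  # otherwise, re-prompt or reject
```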
One of the consistent problems I'm seeing over and over again with LLMs is people forgetting that they're limited by the training data.
Software engineers get hyped when they see the progress in AI coding and immediately begin to extrapolate to other fields—if Copilot can reduce the burden of coding so much, think of all the money we can make selling a similar product to XYZ industries!
The problem with this extrapolation is that the software industry is pretty much unique in the amount of information about its inner workings that is publicly available for training on. We've spent the last 20+ years writing millions and millions of lines of code that we published on the internet, not to mention answering questions on Stack Overflow (which still has 3x as many answers as all other Stack Exchanges combined [0]), writing technical blogs, sending hundreds of thousands of emails to public mailing lists, and so on.
Nearly every other industry (with the possible exception of Law) produces publicly-visible output at a tiny fraction of the rate that we do. Ethics of the mass harvesting aside, it's simply not possible for an LLM to have the same skill level in ${insert industry here} as they do with software, so you can't extrapolate from Copilot to other domains.
> (quoting YC) We know there is a clear engineering trade-off: it is possible to optimize especially specialized algorithms or calculations such as cryptocurrency mining, data compression, or special-purpose encryption tasks such that the same computation would happen faster (5x to 100x), and using less energy (10x to 100x).
> If Gary Tan and YC believe that LLMs will be able to design chips 100x better than humans currently can, they’re significantly underestimating the difficulty of chip design, and the expertise of chip designers.
I may be confused, but isn’t the author fundamentally misunderstanding YC’s point? I read YC as simply pointing out the benefit of specialized compute, like GPUs, not making any point about the magnitude of improvement LLMs could achieve over humans.
Nvidia is trying something similar: https://blogs.nvidia.com/blog/llm-semiconductors-chip-nemo/
I'd want to know about the results of these experiments before casting judgement either way. Generative modeling has actual applications in the 3D printing/mechanical industry.
Glad to see that the author is highlighting verification as the important factor in design productivity.
We at Silogy [0] are directly targeting the problem of verification productivity using AI agents for test debugging. We analyze code (RTL, testbench, specs, etc.) along with logs and waveforms, and incorporate interactive feedback from the engineer as needed to refine the hypothesis.
Generative models are bimodal: in certain tasks they are crazy terrible, and in certain tasks they are better than humans. The key is to recognize which is which.
And, much more importantly:
- LLMs can suddenly become more competent when you give them the right tools, just like humans. Ever try to drive a nail without a hammer?
- Models with spatial and physical awareness are coming and will dramatically broaden what’s possible
It’s easy to get stuck on what LLMs are bad at. The art is to apply an LLM’s strengths to your specific problem, often by augmenting the LLM with the right custom tools written in regular code.
They (YC) are interested in the use of LLMs to make the process of designing chips more efficient. Nowhere do they talk about LLMs actually designing chips.
I don't know anything about chip design, but like any area in tech I'm certain there are cumbersome and largely repetitive tasks that can't easily be done by algorithms but can be done with human oversight by LLMs. There's efficiency to be gained here if the designer and operator of the LLM system know what they're doing.
I agree LLMs aren't ready to design ASICs. It's likely that in a decade or less, they'll be ready for the times you absolutely need to squeeze out every square nanometer, picosecond, femtojoule, or nanowatt.
Garry Tan was right[1] in that there is a fundamental inefficiency inherent in the von Neumann architecture we're all using. This gross impedance mismatch[4] is a great opportunity for innovation.
Once ENIAC was "improved" from its original structure to a general-purpose compute device in the von Neumann style, it suffered an 83% loss in performance[2]. Everything since is 80 years of premature optimization that we need to unwind. It's the ultimate pile of technical debt.
Instead of throwing maximum effort into making specific workloads faster, why not build a chip that can make all workloads faster instead, and let economy of scale work for everyone?
I propose (and have for a while[3]) a general purpose solution.
A systolic array of simple 4-bit-in, 4-bit-out look-up tables (LUTs), latched so that timing issues are eliminated, could greatly accelerate computation in a far nearer timeframe.
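Not the actual design, but a toy behavioral sketch of the fabric being described: a grid of 4-bit-in, 4-bit-out LUT cells whose outputs are latched, so values advance one column per clock. The left-to-right connectivity and the array sizes are assumptions just to make the toy runnable.

```python
# Toy model of a systolic array of latched 4-bit LUT cells (illustrative only).
import random

WIDTH, DEPTH = 4, 8  # 4 rows, 8 pipeline stages; arbitrary toy sizes

# One 16-entry lookup table per cell, here filled with random 4-bit -> 4-bit functions.
luts = [[[random.randrange(16) for _ in range(16)] for _ in range(DEPTH)]
        for _ in range(WIDTH)]
regs = [[0] * DEPTH for _ in range(WIDTH)]  # latched cell outputs (pipeline registers)

def clock(inputs):
    """One cycle: column 0 reads external inputs, column c reads column c-1's latch."""
    new_regs = [[0] * DEPTH for _ in range(WIDTH)]
    for r in range(WIDTH):
        for c in range(DEPTH):
            src = inputs[r] if c == 0 else regs[r][c - 1]
            new_regs[r][c] = luts[r][c][src & 0xF]
    regs[:] = new_regs
    return [row[-1] for row in regs]  # the array's latched outputs

for cycle in range(DEPTH + 2):
    print(f"cycle {cycle}: outputs {clock([cycle & 0xF] * WIDTH)}")
```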
The challenges are that it's a greenfield environment, with no compilers (though it's probable that LLVM could target it), and a bus factor of 1.
[1] https://www.ycombinator.com/rfs-build#llms-for-chip-design
[2] https://en.wikipedia.org/wiki/ENIAC#Improvements
[3] https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu...
IDK about LLMs there either.
A non-LLM monte carlo AI approach: "Pushing the Limits of Machine Design: Automated CPU Design with AI" (2023) https://arxiv.org/abs/2306.12456 .. https://news.ycombinator.com/item?id=36565671
A useful target for whichever approach is most efficient at IP-feasible design:
From https://news.ycombinator.com/item?id=41322134 :
> "Ask HN: How much would it cost to build a RISC CPU out of carbon?" (2024) https://news.ycombinator.com/item?id=41153490
> If Gary Tan and YC believe that LLMs will be able to design chips 100x better than humans currently can, they’re significantly underestimating the difficulty of chip design, and the expertise of chip designers.
I don't think he's arguing that. More that ASICs can be 100x better than CPUs for, say, crypto mining, and that using LLM-type stuff it may be possible to make them for other applications where there is less money available to hire engineers.
(the YC request https://www.ycombinator.com/rfs-build#llms-for-chip-design)
> While LLMs are capable of writing functional Verilog sometimes, their performance is still subhuman.
The key word here is "still".
We don't know what the limits of LLMs are.
It's possible that they will reach a dead end. But it is also possible that they will be able to do logic and math.
If (or when) they achieve that point, their performance will quickly become "superhuman" in these kinds of engineering tasks.
But the very next step will be the ability to do logic and math.
I think the problem with this particular challenge is that it is incredibly non-disruptive to the status quo. There are already hundreds of billions flowing into using LLMs as well as GPUs for chip design. Nvidia has of course laid the groundwork with its cuLitho efforts. This kind of research area is very hot in the research world as well. It’s by no means difficult to pitch to a VC. So why should YC back it? I’d love to see YC identifying areas where VC dollars are not flowing. Unfortunately, the other challenges are mostly the same — govtech, civictech, defense tech. These are all areas where VC dollars are now happily flowing since companies like Anduril made it plausible.
I disagree with the premise of this article. Modern AI can absolutely be very useful and even disruptive when designing FPGAs. Of course, it isn't there today. That does not mean this isn't a solution whose time has come.
I have been working on FPGAs and, in general, programmable logic, for somewhere around thirty years (I started with Intel programmable logic chips like the 5C090 [0] for real-time video processing circuits).
I completely skipped over the whole High Level Synthesis (HLS) era that tried to use C, etc. for FPGA design. I stuck with Verilog and developed custom tools to speed up my work. My logic was simple: if you try to pound a square peg into a round hole, you might get it done, but the result will be a mess.
FPGA development is hardware development. Not software. If you cannot design digital circuits to begin with, no amount of help from a C-to-Verilog tool is going to get you the kind of performance (both in terms of time and resources) that a hardware designer can squeeze out of the chip.
This is not very different from using a language like Python vs. C or C++ to write software. Python "democratizes" software development at a cost of 70x slower performance and 70x greater energy consumption. Sure, there are places where Python makes sense. I'll admit that much.
Going back to FPGA circuit design, the issue likely has to do with the type, content and approach to training. Once again, the output isn't software; the end product isn't software.
I have been looking into applying my experience in FPGAs across the entire modern AI landscape. I have a number of ideas, none well-formed enough to even begin to consider launching a startup in the sector. Before I do that I need to run through lots of experiments to understand how to approach it.
[0] https://www.cpu-galaxy.at/cpu/ram%20rom%20eprom/other_intel_...
LLM based automated verification surely isn't something that easily works out of the box, but that doesn't mean ventures shouldn't try to work on it.
The purpose of capital is to make progress from where we are now.
They want to throw LLMs at everything even if it does not make sense. Same is true for all the AI agent craze: https://medium.com/thoughts-on-machine-learning/langchains-s...
I don't know the space well enough, but I think the missing piece is that YC's investment horizon is typically 10+ years. Not only could LLMs get massively better, but the chip industry could be massively disrupted with the right incentives. My guess is that that is YC's thesis behind the ask.
This is not my domain so my knowledge is limited, but I wonder if chip designers have some sort of standard library of ready-to-use components. Do you have to design, e.g., an ALU every time you design a new CPU, or is there some standard component to use? I think having proven components that can be glued together at a higher level may be the key to productivity here.
Returning to LLMs. I think the problem here may be that there is simply not enough learning material for the LLM. Verilog, compared to C, is a niche with little documentation and even less open source code. If open hardware were more popular, I think LLMs could learn to write better Verilog code. Maybe the key is to persuade hardware companies to share their closed-source code to train LLMs for the industry's benefit?
When I think of AI in chip design, optimizations like these come to mind,
https://optics.ansys.com/hc/en-us/articles/360042305274-Inve...
https://optics.ansys.com/hc/en-us/articles/33690448941587-In...
> If an application doesn’t warrant hardware acceleration yet, it’s probably because it’s a small market, and that makes it a poor target for a startup.
But selling shovels that are useful in many small markets can still be a viable play, and that’s how I understand YC’s position here.
I worry that this post assumes LLMs won't get much better over time. This is possible, but YC bets that they will. The right time to start an LLM application layer company is arguably 6-12 months before LLMs get good enough for that purpose, so you can be ahead of the curve.
The whole concept of "request for startup" is entirely misguided imo.
YC did well because they were good at picking ideas, not generating them.
I did my PhD on trying to use ML for EDA (de novo design/topology generation, because DeepMind was doing placement and I was not gonna compete with them as a single EE grad who self-taught ML/optimization theory during the PhD).
In my opinion, part of the problem is that training data is scarce (real-world designs are literally called "IP" in the industry, after all...), but more than that, circuit design is basically program synthesis, which means it's _hard_. Even if you try to be clever, dealing with graphs and designing discrete objects involves many APX-hard/APX-complete problems, which is _FUN_ on the one hand, but also means it's tricky to just scale through if the object you are trying to produce is a design that can cost millions if there's a bug...
LLMs autocomplete text. That's all.
Other intelligent effects are coincidental.
LLMs are wrong for most things imo. LLMs are great conversational assistants, but there is very little linguistic rigor to them, if any. They have almost no generalization ability, and anecdotally they fall for the same syntactic pitfalls they've fallen for since BERT. Models have gotten so good at predicting this n-dimensional "function" that sounds like human speech that we're getting distracted from seeing their actual purpose, and we're trying to apply them to all sorts of problems that rely on more than text-based training data.
Language is cool and immensely useful. LLMs, however, are fundamentally flawed in their basic assumptions about how language works. The distributional hypothesis is good for paraphrasing and summarization, but pretty atrocious for real reasoning. The concept of an idea living in a semantic "space" is incompatible with simple vector spaces, and we are starting to see this actually matter in the minutiae with scaling laws coming into play. Chip design is a great example of where we cannot rely on language alone to solve all our problems.
I hope to be proven wrong, but still not sold on AGI being within reach. We'll probably need some pretty significant advancements in large quantitative models, multi-modal models and smaller, composable models of all types before we see AGI
I think this whole article is predicated on misinterpreting the ask. It wasn't for the chip to take 100x less power; it was for the algorithm the chip implements. Modern synthesis tools and optimisers extensively look for design patterns, the same way software compilers do. That's why there are recommended inference patterns. I think it's not impossible to expect an LLM to expand the capture range of these patterns to cover maybe-suboptimal HDL. As a simple example, maybe a designer got really turned around and is doing some crazy math, and the LLM can go "uh, that's just addition my guy, I'll fix that for you."
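As a toy illustration of that kind of rewrite (my example, not one from the article or YC's RFS): "crazy math" that is secretly just an adder, via the XOR/carry identity, which a pattern matcher or an LLM could collapse back to plain addition.

```python
# A roundabout adder: XOR produces the sum bits, AND shifted left produces the carries.
# A synthesis pattern matcher (or an LLM) could recognize this as plain a + b.
def crazy_math(a: int, b: int) -> int:
    return (a ^ b) + ((a & b) << 1)

# Sanity check that the convoluted form really is just addition.
for a in range(64):
    for b in range(64):
        assert crazy_math(a, b) == a + b
print("crazy_math(a, b) == a + b for all tested values")
```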
I played around using genetic algorithms to design on FPGAs (the Xilinx 6200, I think) about 25 years ago... nothing came of that...
<sarcasm>You could have the LLM design a chip that accurately documents all the ways an LLM will get chip design wrong.
I disagree with most of the reasoning here, and think this post misunderstands the opportunity and economic reasoning at play here.
> If Gary Tan and YC believe that LLMs will be able to design chips 100x better than humans currently can, they’re significantly underestimating the difficulty of chip design, and the expertise of chip designers.
This is very obviously not the intent of the passage the author quotes. They are clearly talking about the speedup that can be gained from ASICs for a specific workload, e.g. dedicated mining chips.
> High-level synthesis, or HLS, was born in 1998, when Forte Design Systems was founded
This sort of historical argument is akin to arguing “AI was bad in the 90s, look at Eliza”. So what? LLMs are orders of magnitude more capable now.
> Ultimately, while HLS makes designers more productive, it reduces the performance of the designs they make. And if you’re designing high-value chips in a crowded market, like AI accelerators, performance is one of the major metrics you’re expected to compete on.
This is the crux of the author's misunderstanding.
Here is the basic economics explanation: creating an ASIC for a specific use is normally cost-prohibitive because the cost of the inputs (chip design) is much higher than the outputs (performance gains) are worth.
If you can make ASIC design cheaper on the margin, and even if the designs are inferior to what an expert human could create, then you can unlock a lot of value. Think of all the places an ASIC could add value if the design was 10x or 100x cheaper, even if the perf gains were reduced from 100x to 10x.
The analogous argument is “LLMs make it easier for non-programmers to author web apps. The code quality is clearly worse than what a software engineer would produce but the benefits massively outweigh, as many domain experts can now author their own web apps where it wouldn’t be cost-effective to hire a software engineer.”
Software folk underestimating hardware? Surely not.
> If Gary Tan and YC believe that LLMs will be able to design chips 100x better than humans currently can, they’re significantly underestimating the difficulty of chip design, and the expertise of chip designers. While LLMs are capable of writing functional Verilog sometimes, their performance is still subhuman. [...] LLMs primarily pump out mediocre Verilog code.
What is the quality of Verilog code output by humans? Is it good enough so that a complex AI chip can be created? Or does the human need to use tools in order to generate this code?
I've got the feeling that LLMs will be capable of doing everything a human can do, in terms of thinking. There shouldn't be an expectation that an LLM is able to do everything at once, which in this context would mean thinking about the chip and creating the final files in a single pass and without external help. And by external help I don't mean us humans, but specialized tools that also generate additional data (like embeddings) which the LLM (or another LLM) can use in the next pass to evaluate the design. And if we humans have spent enough time creating these additional tools, there will come a time when LLMs will also be able to create improved versions of them.
I mean, when I once randomly checked the content of a file in The Pile, I found a Craigslist "ad" for an escort offering her services. No chip-generating AI needs to have this in its parameters in order to do its job. So there is a lot of room for improvement, and this improvement will come over time. Such an LLM doesn't need to know that much about humans.
Past failures do not rule out the possibility of success in future attempts.
The bottleneck for LLMs is fast and large memory, not compute power.
Whoever is recommending investing in better chip (ALU) design hasn't done even a basic analysis of the problem.
Tokens per second = memory bandwidth divided by model size.
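As a rough worked example of that formula, using illustrative numbers that are my own assumptions (about 1 TB/s of memory bandwidth and a 70B-parameter model at 2 bytes per weight), not figures from the comment:

```python
# tokens/s ≈ memory bandwidth / bytes streamed per token (≈ model size for a dense
# model, since every weight is read once per generated token).
bandwidth_bytes_per_s = 1.0e12   # assumed ~1 TB/s memory bandwidth
model_size_bytes = 70e9 * 2      # assumed 70B parameters at 2 bytes each
print(f"~{bandwidth_bytes_per_s / model_size_bytes:.1f} tokens/s")  # ~7.1 tokens/s
```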
If cryptocurrency mining could be significantly optimized (one of the example goals in the article) wouldn't that just destroy the value of said currency?
hi, this is my article! thanks so much for the views, upvotes, and comments! :)
This heavily overlaps with my current research focus for my Ph.D., so I wanted to provide some additional perspective to the article. I have worked with Vitis HLS and other HLS tools in the past to build deep learning hardware accelerators. Currently, I am exploring deep learning for design automation and using large language models (LLMs) for hardware design, including leveraging LLMs to write HLS code. I can also offer some insight from the academic perspective.
First, I agree that the bar for HLS tools is relatively low, and they are not as good as they could be. Admittedly, there has been significant progress in the academic community to develop open-source HLS tools and integrations with existing tools like Vitis HLS to improve the HLS development workflow. Unfortunately, substantial changes are largely in the hands of companies like Xilinx, Intel, Siemens, Microchip, MathWorks (yes, even Matlab has an HLS tool), and others that produce the "big-name" HLS tools. That said, academia has not given up, and there is considerable ongoing HLS tooling research with collaborations between academia and industry. I hope that one day, some lab will say "enough is enough" and create an open-source, modular HLS compiler in Rust that is easy to extend and contribute to—but that is my personal pipe dream. However, projects like BambuHLS, Dynamatic, MLIR+CIRCT, and XLS (if Google would release more of their hardware design research and tooling) give me some hope.
When it comes to actually using HLS to build hardware designs, I usually suggest it as a first-pass solution to quickly prototype designs for accelerating domain-specific applications. It provides a prototype that is often much faster or more power-efficient than a CPU or GPU solution, which you can implement on an FPGA as proof that a new architectural change has an advantage in a given domain (genomics, high-energy physics, etc.). In this context, it is a great tool for academic researchers. I agree that companies producing cutting-edge chips are probably not using HLS for the majority of their designs. Still, HLS has its niche in FPGA and ASIC design (with Siemens's Catapult being a popular option for ASIC flows). However, the gap between an initial, naive HLS design implementation and one refined by someone with expert HLS knowledge is enormous. This gap is why many of us in academia view the claim that "HLS allows software developers to do hardware development" as somewhat moot (albeit still debatable—there is ongoing work on new DSLs and abstractions for HLS tooling which are quite slick and promising). Because of this gap, unless you have team members or grad students familiar with optimizing and rewriting designs to fully exploit HLS benefits while avoiding the tools' quirks and bugs, you won't see substantial performance gains. All that to say, I don't think it is fair to completely write off HLS as a lost cause or not successful.
Regarding LLMs for Verilog generation and verification, there's an important point missing from the article that I've been considering since around 2020 when the LLM-for-chip-design trend began. A significant divide exists between the capabilities of commercial companies and academia/individuals in leveraging LLMs for hardware design. For example, Nvidia released ChipNeMo, an LLM trained on their internal data, including HDL, tool scripts, and issue/project/QA tracking. This gives Nvidia a considerable advantage over smaller models trained in academia, which have much more limited data in terms of quantity, quality, and diversity. It's frustrating to see companies like Nvidia presenting their LLM research at academic conferences without contributing back meaningful technology or data to the community. While I understand they can't share customer data and must protect their business interests, these closed research efforts and closed collaborations they have with academic groups hinder broader progress and open research. This trend isn't unique to Nvidia; other companies follow similar practices.
On a more optimistic note, there are now strong efforts within the academic community to tackle these problems independently. These efforts include creating high-quality, diverse hardware design datasets for various LLM tasks and training models to perform better on a wider range of HLS-related tasks. As mentioned in the article, there is also exciting work connecting LLMs with the tools themselves, such as using tool feedback to correct design errors and moving towards even more complex and innovative workflows. These include in-the-loop verification, hierarchical generation, and ML-based performance estimation to enable rapid iteration on designs and debugging with a human in the loop. This is one area I'm actively working on, both at the HDL and HLS levels, so I admit my bias toward this direction.
For more references on the latest research in this area, check out the proceedings from the LLM-Aided Design Workshop (now evolving into a conference, ICLAD: https://iclad.ai/), as well as the MLCAD conference (https://mlcad.org/symposium/2024/). Established EDA conferences like DAC and ICCAD have also included sessions and tracks on these topics in recent years. All of this falls within the broader scope of generative AI, which remains a smaller subset of the larger ML4EDA and deep learning for chip design community. However, LLM-aided design research is beginning to break out into its own distinct field, covering a wider range of topics such as LLM-aided design for manufacturing, quantum computing, and biology—areas that the ICLAD conference aims to expand on in future years.
Thank god humans are superior at chip design, especially when you have dozens of billions of dollars behind you, just like Intel. Oh wait.
but.. but.. muh AI
The AI hype train is basically investors not understanding tech. Don’t get me wrong, AI in itself could be a huge thing if used right, but the things getting the most attention in the current market aren’t it.
I wonder if it’s because the LLMs don’t have access to state-of-the-art Verilog?
I mean I assume the best is heavily guarded.
The "naive", all-or-nothing view on LLM technology is, frankly, more tiring than the hype.
Had to nop out at "just next token prediction". This article isn't worth your time.
Please don’t do this, Zach. We need to encourage more investment in the overall EDA market, not less. Garry’s pitch is meant for the dreamers; we should all be supportive. It’s a big boat.
Would appreciate the collective energy being spent instead towards adding to and/or refining Garry’s request.
The article seems to be based on the current limitations of LLMs. I don't think YC and other VCs are betting on what LLMs can do today; I think they are betting on what they might be able to do in the future.
As we've seen in the recent past, it's difficult to predict what the possibilities are for LLMs and what limitations will hold. Currently it seems pure scaling won't be enough, but I don't think we've reached the limits with synthetic data and reasoning.
LLMs have a long way to go in the world of EDA.
A few months ago I saw a post on LinkedIn where someone fed the leading LLMs a counter-intuitively drawn circuit with 3 capacitors in parallel and asked what the total capacitance was. Not a single one got it correct: not only did they say the caps were in series (they were not), they even got the series capacitance calculation wrong. I couldn’t believe they whiffed it and had to check myself, and sure enough I got the same results as the author, and tried all types of prompt magic to get the right answer… no dice.
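For reference, and with made-up example values rather than the ones from the LinkedIn post: capacitances in parallel simply add, while the series combination is the reciprocal of the sum of reciprocals, the formula the models reached for and then still botched.

```python
# Three example capacitors (illustrative values, not from the post in question).
caps_farads = [1e-6, 2.2e-6, 4.7e-6]

c_parallel = sum(caps_farads)                   # parallel: C = C1 + C2 + C3
c_series = 1 / sum(1 / c for c in caps_farads)  # series: 1/C = 1/C1 + 1/C2 + 1/C3

print(f"parallel: {c_parallel * 1e6:.2f} uF")   # 7.90 uF
print(f"series:   {c_series * 1e6:.2f} uF")     # 0.60 uF
```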
I also saw an ad for an AI tool that’s designed to help you understand schematics. In its pitch to you, it’s showing what looks like a fairly generic guitar distortion pedal circuit and does manage to correctly identify a capacitor as blocking DC but failed to mention it also functions as a component in an RC high-pass filter. I chuckled when the voice over proudly claims “they didn’t even teach me this in 4 years of Electrical Engineering!” (Really? They don’t teach how capacitors block DC and how RC filters work????)
If you’re in this space you probably need to compile your own carefully curated codex and train something more specialized. The general purpose ones struggle too much.