Start at the Tails
Why the first useful place for AI in architecture and engineering is the tails: the bounded, repetitive, checkable work that surrounds engineering judgment rather than the judgment itself. Three working demos you can open.
Architecture, engineering, and construction (AEC) is one of the largest industries in the world and one of the least automated. The easy explanation is that it is a conservative trade, slow to adopt software, but that mostly misreads the actual constraint. The caution is rational: engineering is genuinely hard to experiment on. Every structural decision carries a stamp, a seal, and a liability that outlives the building. The real question is not whether AEC is ready for AI, but where AI is allowed to start.
It starts at the tails. Underneath the judgment that engineers are paid for sits a large body of work that is bounded, repetitive, and checkable: reading members off a drawing, cross-reading a set for contradictions, rating routine members against a code, running the same connection analysis again and again. None of that is the engineering call itself; it is the reading and arithmetic around it. Those are the tails, defined by distance from the decision rather than by how often the work comes up. By volume this is most of an engineer's day, but each task sits out at the edge of the call rather than at its center, where the stamp and the liability are. That is what makes the tails safe to experiment on: a mistake stays contained and the result can be checked.
The tails are also where the money leaks. Construction is among the least digitized industries in the world, and its labor productivity has grown about 1 percent a year for two decades while productivity across the broader economy grew nearly 3 percent a year; McKinsey puts the prize for closing that gap near $1.6 trillion a year. Little of that waste is in pouring concrete. It is in the work around it. Construction professionals spend more than a third of their time, over 14 hours a week, hunting for information, resolving conflicts, and fixing mistakes, according to a PlanGrid and FMI survey. A separate Autodesk and FMI study tied bad data and miscommunication to roughly $88 billion of avoidable rework in a single year, with nearly half of it traced to poor information. Reviews, coordination, quality control, and rework are where the hours and the margin go, and unlike the cost of the concrete, much of that cost is recoverable.
Until recently the tails were out of reach anyway, because they begin with reading a drawing, and a drawing resists software in two ways. One is perception. A line on a sheet might be a beam, a brace, or a label, and a note that says "all beams" might mean everything drawn or only what is left unmarked. The other is interpretation. Pulling text out of a document and scanning it for conflicts was always possible; what was missing was reading that text the way an engineer reads it, against the conventions, codes, and habits of the discipline. Multimodal large language models now do both. They take in a sheet and reason about it roughly the way a practitioner would: imperfectly, but with errors a practitioner can catch.
Reading a drawing is not the same as being trusted with it, and the trust comes from how the work is divided. The model reads the sheet; it does not do the engineering. The calculations run in deterministic code you can inspect, checked against the relevant standard, and every claim points back to the spot on the sheet it came from, one click away. The model is never the source of a number, only of where to look for it. That split is what makes it responsible to hand a tail task over in the first place.
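One way to picture that split is the minimal Python sketch below. It is an illustration under stated assumptions, not the demos' actual code: the names SourceRef, ReadValue, CheckResult, and check_beam are hypothetical, the textbook simply supported beam formula stands in for whatever calculation a real task needs, and the capacity is assumed to be supplied by the engineer from the governing standard rather than looked up here. The point is only the shape: every input carries the place on the sheet it was read from, the arithmetic is plain code, and the result keeps those references attached.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceRef:
    """Where on the drawing set a value was read (hypothetical schema)."""
    sheet: str                               # sheet identifier, e.g. "S-201"
    bbox: tuple[float, float, float, float]  # x0, y0, x1, y1 in sheet coordinates


@dataclass(frozen=True)
class ReadValue:
    """A value transcribed from a sheet, always paired with its location."""
    value: float
    unit: str
    source: SourceRef


@dataclass(frozen=True)
class CheckResult:
    demand_kNm: float
    capacity_kNm: float
    ratio: float
    passes: bool
    sources: tuple[SourceRef, ...]  # lets a reviewer jump back to the sheet


def max_moment_simple_span(w_kN_per_m: float, span_m: float) -> float:
    """Maximum moment of a simply supported beam under uniform load: w * L^2 / 8."""
    return w_kN_per_m * span_m ** 2 / 8.0


def check_beam(load: ReadValue, span: ReadValue, capacity_kNm: float) -> CheckResult:
    """Deterministic check. The model only locates the inputs; it computes nothing.

    capacity_kNm is an assumption of this sketch: it is supplied by the caller,
    not derived from any standard in this code.
    """
    if load.unit != "kN/m" or span.unit != "m":
        raise ValueError("unexpected units; refusing to compute")
    if capacity_kNm <= 0:
        raise ValueError("capacity must be positive")
    demand = max_moment_simple_span(load.value, span.value)
    ratio = demand / capacity_kNm
    return CheckResult(
        demand_kNm=demand,
        capacity_kNm=capacity_kNm,
        ratio=ratio,
        passes=ratio <= 1.0,
        sources=(load.source, span.source),
    )


if __name__ == "__main__":
    # Illustrative values only; they do not come from any real drawing.
    load = ReadValue(12.0, "kN/m", SourceRef("S-201", (410.0, 220.0, 520.0, 236.0)))
    span = ReadValue(6.0, "m", SourceRef("S-201", (130.0, 540.0, 190.0, 556.0)))
    result = check_beam(load, span, capacity_kNm=80.0)
    print(result.demand_kNm, result.passes, [s.sheet for s in result.sources])
```

Because the check refuses to run on unexpected units and returns the source references alongside the ratio, a wrong reading shows up as a visible, traceable input rather than as a silent error in the answer.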
Three tail tasks, built and working:
These three are only a floor. They run off-the-shelf models on public drawings, with nothing trained or tuned. The leverage shows up when the same approach is pointed at a firm's own data: its drawing standards, its detail library, the RFIs it has already answered, the way it labels a plan. Tuned on that, a model reads a firm's work the way the firm does, and the range of tails it can take on widens quickly. The demos show the mechanism; a firm's own data is what makes it worth deploying.
None of this asks anyone to trust a model past what it can show. Each demo does one bounded task in the open: the model reads, code does the engineering, the source stays in view. It is a narrow place to start, and a real one, in a field where being wrong is expensive and starting at all has been the hard part.