Story
GenAI for Systems: Recurring Challenges and Design Principles from Software to Silicon
Key takeaway
Generative AI is transforming how computing systems are designed and built, but research is fragmented across software, hardware and chip design. This cross-stack perspective highlights common challenges and design principles for applying generative AI more broadly.
Quick Explainer
Generative AI is reshaping the design and optimization of computing systems, from software to silicon. Despite rapid progress, the field faces recurring challenges such as slow evaluators, tacit knowledge barriers, and cross-layer coordination. In response, effective design principles have emerged, including hybrid approaches that combine learning with classical structures, continuous feedback loops, role-based modularity, and reuse of existing systems knowledge. By bridging silos across the computing stack, this integrated perspective aims to make generative AI a reliable tool for systems optimization, rather than one that repeats the brittleness that limited earlier ML-based efforts.
Deep Dive
Technical Deep Dive: GenAI for Systems
Overview
Generative AI techniques are reshaping how computing systems are designed, optimized, and built, with research spanning software, hardware architecture, and chip design. Despite this rapid progress, the field remains fragmented, with the same structural challenges and effective design principles recurring across layers.
This technical deep dive synthesizes the key findings from a cross-stack analysis of over 275 papers, distilling five recurring challenges and five design principles that have emerged as generative AI is applied to real systems problems. The analysis also identifies open research questions that become visible only from a cross-layer perspective.
Recurring Challenges
The five key challenges that consistently limit progress in applying generative AI to systems optimization are:
- The Feedback Loop Crisis: Across the stack, generators keep getting faster while evaluators remain slow, expensive, noisy, or incomplete. The evaluator becomes the bottleneck, so the loop cannot close quickly enough for learning and refinement.
- The Tacit Knowledge Problem: Systems design is shaped by knowledge that is real, decisive, and hard to write down, making it difficult to extract, represent, and update the implicit expertise that historically lived in people, tool defaults, and institutional memory.
- Trust and Validation: As generative components move closer to correctness-critical and cost-critical decisions, validation becomes the gating factor for deployment, requiring models to produce artifacts that are independently checked by formal verification engines, simulators, or proof tools.
- Co-Design Across Boundaries: Generative AI consistently breaks layer boundaries, but our abstractions, organizations, and toolchains are still largely layered, making it difficult to achieve cross-layer gains that require cross-layer control and feedback.
- From Determinism to Dynamism: A growing fraction of the stack is shifting from static artifacts and deterministic heuristics toward adaptive policies that change with workload, context, and time, making it challenging to retain predictability, debuggability, and accountability.
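One common way to work around a slow evaluator, as described in the Feedback Loop Crisis item above, is to rank generated candidates with a cheap proxy and spend a small budget of expensive evaluations only on the most promising ones. The following is a minimal sketch of that idea, not a method taken from the surveyed papers. The function `two_stage_search`, the `proxy` scoring rule, and the toy `simulator` are all hypothetical names and stand-ins chosen for illustration.

```python
from typing import Callable, Iterable, Optional, Tuple


def two_stage_search(
    candidates: Iterable[int],
    cheap_score: Callable[[int], float],
    slow_evaluate: Callable[[int], float],
    budget: int,
) -> Tuple[Optional[int], float, int]:
    """Rank candidates with a cheap proxy, then spend at most `budget`
    slow evaluations on the most promising ones.

    Returns (best_candidate, best_cost, number_of_slow_calls).
    """
    # Drop duplicates while keeping the generator's original order.
    unique = list(dict.fromkeys(candidates))
    # Lower proxy score means "more promising"; sorted() is stable.
    ranked = sorted(unique, key=cheap_score)

    best, best_cost, slow_calls = None, float("inf"), 0
    for cand in ranked[:budget]:
        cost = slow_evaluate(cand)  # the expensive step being rationed
        slow_calls += 1
        if cost < best_cost:
            best, best_cost = cand, cost
    return best, best_cost, slow_calls


# Hypothetical stand-ins: tile sizes as candidates, an analytical guess
# as the proxy, and a toy function in place of a real simulator run.
def proxy(tile: int) -> float:
    return abs(tile - 20)


def simulator(tile: int) -> float:
    return (tile - 16) ** 2 + 3


result = two_stage_search([1, 2, 4, 8, 16, 32, 64, 16], proxy, simulator, budget=3)
print(result)  # (16, 3, 3)
```

The proxy does not need to be accurate, only good enough to keep the best candidates inside the budget. If the proxy is badly miscalibrated, the slow evaluator never sees the true optimum, which is one reason evaluator quality remains the limiting factor.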
Effective Design Principles
In response to these recurring challenges, five design principles have independently emerged as effective across the layers of the computing stack:
- Embrace Hybrid Approaches: The most robust systems combine learning with classical structure rather than replacing it, preserving stability while benefiting from adaptive guidance.
- Design for Continuous Feedback: Successful applications make the feedback loop a first-class design object, operationalizing iteration as a structured, measurable process rather than a final evaluation step.
- Separate Concerns by Role, Not by Tool: As stacks become agentic, modularity shifts from code boundaries to responsibility boundaries, which helps localize errors and keep complexity manageable when models participate in decision loops.
- Match Approach to Problem Structure: There is no single GenAI method for systems; the right approach depends on what the system exposes as structure and feedback.
- Build on Decades of Systems Knowledge: The strongest progress comes from reusing existing abstractions, benchmarks, and invariants, then extending them to support learning rather than attempting to replace them entirely.
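The hybrid principle above can be sketched as a guard around a learned suggestion: a classical heuristic always supplies a safe baseline, and the model's proposal is accepted only if it passes an independent validity check and beats the baseline under a cost model. This is an illustrative sketch under stated assumptions, not an implementation from the surveyed work. The names `hybrid_choose`, `heuristic_batch`, `model_batch`, `is_valid`, and `cost`, along with the batch-size scenario and its constants, are hypothetical.

```python
import math
from typing import Callable, Optional, Tuple


def hybrid_choose(
    workload: int,
    heuristic: Callable[[int], int],
    suggest: Callable[[int], Optional[int]],
    is_valid: Callable[[int, int], bool],
    cost: Callable[[int, int], float],
) -> Tuple[int, str]:
    """Return (decision, source), where source is "model" or "heuristic"."""
    baseline = heuristic(workload)  # classical structure: always available
    proposal = suggest(workload)    # learned guidance: may be missing or wrong

    # Validation is independent of the model that produced the proposal.
    if proposal is None or not is_valid(workload, proposal):
        return baseline, "heuristic"
    # Accept the proposal only when it strictly improves on the baseline.
    if cost(workload, proposal) < cost(workload, baseline):
        return proposal, "model"
    return baseline, "heuristic"


# Hypothetical scenario: pick a batch size for `n` items.
def heuristic_batch(n: int) -> int:
    return min(n, 32)


def model_batch(n: int) -> Optional[int]:
    return n // 2  # stand-in for a learned or generated suggestion


def is_valid(n: int, batch: int) -> bool:
    return 1 <= batch <= min(n, 64)  # assumed hardware limit of 64


def cost(n: int, batch: int) -> float:
    # Assumed cost model: fixed overhead of 10 per batch plus per-item work.
    return math.ceil(n / batch) * (10 + batch)


for n in (100, 200, 10):
    print(n, hybrid_choose(n, heuristic_batch, model_batch, is_valid, cost))
# 100 (50, 'model')
# 200 (32, 'heuristic')
# 10 (10, 'heuristic')
```

Because the fallback path never depends on the model, a bad or missing suggestion degrades the system to the classical baseline rather than to a failure, which is the stability property the principle describes.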
Opportunities and Next Steps
The cross-stack convergence of challenges and principles points toward a more systematic engineering approach, where practitioners can use the identified patterns as a diagnostic and design aid. This includes developing shared methodologies for feedback loop design, building cross-layer benchmark suites, and establishing best practices for when to apply hybrid approaches versus end-to-end learning.
By bridging the current silos between domains, the field can move beyond ad hoc construction and instead compound progress across the computing stack, so that generative AI becomes a reliable tool for systems optimization rather than repeating the brittleness that limited earlier ML-based efforts.
