Just created a new video using Avatars to demo Gemini and HAgent to extend a RISC-V Dino core with the B extension.
youtu.be/DMT0Xz_-U5g
A single system prompt will not solve this. They need a tiered solution, e.g., filter clearly bad/conspiracy content out of the RAG, do not trust the RAG results, create a filtering...
🎉 Proud to be one of 70 Amazon Research Award recipients this year! Great news for my students' funding too.
Thanks Amazon for supporting academic research!
www.amazon.science/research-awa...
arxiv 📄 μRL: Discovering Transient Execution Vulnerabilities Using Reinforcement Learning
http://arxiv.org/abs/2502.14307v1
We propose using reinforcement learning to address the challenges of discovering microarchitectural vulnerabilities, such as Spectre and Meltdown, which exploit subtle int...
arxiv 📄 VulnBot: Autonomous Penetration Testing for A Multi-Agent Collaborative Framework
http://arxiv.org/abs/2501.13411v1
Penetration testing is a vital practice for identifying and mitigating vulnerabilities in cybersecurity systems, but its manual execution is labor-intensive and time-consu...
arxiv 📄 MARL-OT: Multi-Agent Reinforcement Learning Guided Online Fuzzing to Detect Safety Violation in Autonomous Driving Systems
http://arxiv.org/abs/2501.14451v1
Autonomous Driving Systems (ADSs) are safety-critical, as real-world safety violations can result in significant losses. Rigorous ...
arxiv 📄 Evaluating Agent-based Program Repair at Google
http://arxiv.org/abs/2501.07531v1
Agent-based program repair offers to automatically resolve complex bugs end-to-end by combining the planning, tool use, and code generation abilities of modern LLMs. Recent work has explored the use of age...
I have been working on this paper for a while. Pushing an extended version to arXiv after some rejections and improvements.
arxiv 📄 Enabling New HDLs with Agents
http://arxiv.org/abs/2501.00642v1
Large Language Models (LLMs) based agents are transforming the programming language landscape by facilitating learning for beginners, enabling code generation, and optimizing documentation workflows. Hardware Description La...
The new O3 ARC results are another "oh shit moment", like the first time I tried BERT or GPT-3.
www.youtube.com/watch?v=duQu...
going for a walk to think....
arxiv 📄 Design choices made by LLM-based test generators prevent them from finding bugs
http://arxiv.org/abs/2412.14137v1
There is an increasing amount of research and commercial tools for automated test case generation using Large Language Models (LLMs). This paper critically examines whether ...
arxiv 📄 GHIssuemarket: A Sandbox Environment for SWE-Agents Economic Experimentation
http://arxiv.org/abs/2412.11722v2
Software engineering agents (swe-agents), as key innovations in intelligent software engineering, are poised in the industry's end-of-programming debate to transcend from assis...
arxiv 📄 Generating Move Smart Contracts based on Concepts
http://arxiv.org/abs/2412.12513v1
The growing adoption of formal verification for smart contracts has spurred the development of new verifiable languages like Move. However, the limited availability of training data for these languages h...
arxiv 📄 PromptV: Leveraging LLM-powered Multi-Agent Prompting for High-quality Verilog Generation
http://arxiv.org/abs/2412.11014v1
Recent advances in agentic LLMs have demonstrated remarkable automated Verilog code generation capabilities. However, existing approaches either demand substanti...
Applications for our GenAI faculty position in the CSE department at UCSC close on Friday. Come and join our amazing team in Silicon Valley.
> recruit.ucsc.edu/JPF01825
Ensure your applications are submitted by Friday as they will be reviewed over the Holiday Break!
Nice work that iterates with different agents over a waveform and creates/fixes the Verilog. A justification for TDD in Verilog, but requires a test first.
arxiv 📄 DialogAgent: An Auto-engagement Agent for Code Question Answering Data Production
http://arxiv.org/abs/2412.08069v1
Large Language Models (LLMs) have become increasingly integral to enhancing developer productivity, particularly in code generation, comprehension, and repair tasks. How...
arxiv 📄 Automated Soap Opera Testing Directed by LLMs and Scenario Knowledge: Feasibility, Challenges, and Road Ahead
http://arxiv.org/abs/2412.08581v1
Exploratory testing (ET) harnesses tester's knowledge, creativity, and experience to create varying tests that uncover unexpected bugs from t...
arxiv 📄 You Name It, I Run It: An LLM Agent to Execute Tests of Arbitrary Projects
http://arxiv.org/abs/2412.10133v1
The ability to execute the test suite of a project is essential in many scenarios, e.g., to assess code quality and code coverage, to validate code changes made by developers o...
arxiv 📄 MAGE: A Multi-Agent Engine for Automated RTL Code Generation
http://arxiv.org/abs/2412.07822v1
The automatic generation of RTL code (e.g., Verilog) through natural language instructions has emerged as a promising direction with the advancement of large language models (LLMs). However, p...
arxiv 📄 AiEDA: Agentic AI Design Framework for Digital ASIC System Design
http://arxiv.org/abs/2412.09745v1
The paper addresses advancements in Generative Artificial Intelligence (GenAI) and digital chip design, highlighting the integration of Large Language Models (LLMs) in automating hardware...
arxiv 📄 ExploraCoder: Advancing code generation for multiple unseen APIs via planning and chained exploration
http://arxiv.org/abs/2412.05366v1
Through training on publicly available source code libraries, large language models (LLMs) can invoke multiple encapsulated APIs to solve complex pro...
arxiv 📄 The BrowserGym Ecosystem for Web Agent Research
http://arxiv.org/abs/2412.05467v1
The BrowserGym ecosystem addresses the growing need for efficient evaluation and benchmarking of web agents, particularly those leveraging automation and Large Language Models (LLMs) for web interaction ta...
arxiv 📄 GEE-OPs: An Operator Knowledge Base for Geospatial Code Generation on the Google Earth Engine Platform Powered by Large Language Models
http://arxiv.org/abs/2412.05587v1
As the scale and complexity of spatiotemporal data continue to grow rapidly, the use of geospatial modeling on the ...
arxiv 📄 Applications and Implications of Large Language Models in Qualitative Analysis: A New Frontier for Empirical Software Engineering
http://arxiv.org/abs/2412.06564v1
The use of large language models (LLMs) for qualitative analysis is gaining attention in various fields, including softwa...
arxiv 📄 Examining the Use and Impact of an AI Code Assistant on Developer Productivity and Experience in the Enterprise
http://arxiv.org/abs/2412.06603v1
AI assistants are being created to help software engineers conduct a variety of coding-related tasks, such as writing, documenting, and tes...
arxiv 📄 Why Do Developers Engage with ChatGPT in Issue-Tracker? Investigating Usage and Reliance on ChatGPT-Generated Code
http://arxiv.org/abs/2412.06757v1
Large language models (LLMs) like ChatGPT have shown the potential to assist developers with coding and debugging tasks. However, their ...
arxiv 📄 DRC-Coder: Automated DRC Checker Code Generation Using LLM Autonomous Agent
http://arxiv.org/abs/2412.05311v1
In the advanced technology nodes, the integrated design rule checker (DRC) is often utilized in place and route tools for fast optimization loops for power-performance-area. I...