
Chinese Research Team Uses AI to Design a Processor in 5 Hours

Joined
May 30, 2015
Messages
1,928 (0.56/day)
Location
Seattle, WA
A group of researchers in China has used a new AI approach to create a full RISC-V processor from scratch. The team set out to answer the question of whether an AI could design an entire processor on its own, without human intervention. While AI design tools already exist and are used for complex circuit design and validation today, they are generally limited in use and scope. The key improvements of this approach over traditional or AI-assisted logic design are its degree of automation and its speed. Traditional assistive tools still require many hours of manual programming and validation to produce a functional circuit. Even for a processor as simple as the one created by the AI, the team claims the design would have taken 1000x as long if done by humans. The AI was trained by observing specific inputs and outputs of existing CPU designs, and the paper summarizes the approach as follows:
(...) a new AI approach, which generates large-scale Boolean function with almost 100% validation accuracy (e.g., > 99.99999999999% as Intel) from only external input-output examples rather than formal programs written by the human. This approach generates the Boolean function represented by a graph structure called Binary Speculation Diagram (BSD), with a theoretical accuracy lower bound by using the Monte Carlo based expansion, and the distance of Boolean functions is used to tackle the intractability.
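
To make the excerpt a bit more concrete, here is a minimal sketch of learning a Boolean function purely from input-output queries. It is not the paper's BSD algorithm: it only illustrates the general idea of Shannon expansion combined with Monte Carlo sampling, where a sub-function is speculated to be constant when every random sample agrees. The names `learn`, `evaluate`, and `majority` and the sample count are assumptions made for this illustration.

```python
import random
from itertools import product

def learn(oracle, n_vars, samples=64, seed=0):
    """Build a decision tree for `oracle` using only input-output queries.

    At each node, random completions of the partial assignment are sampled.
    If every sample returns the same bit, the sub-function is speculated to
    be that constant; otherwise the node splits on the next variable.
    """
    rng = random.Random(seed)

    def expand(prefix):
        free = n_vars - len(prefix)
        if free == 0:
            return oracle(prefix)
        seen = {
            oracle(prefix + tuple(rng.randint(0, 1) for _ in range(free)))
            for _ in range(samples)
        }
        if len(seen) == 1:
            return seen.pop()          # speculated constant (may be wrong)
        return (len(prefix), expand(prefix + (0,)), expand(prefix + (1,)))

    return expand(())

def evaluate(tree, bits):
    """Walk the tree: internal nodes are (var, low, high), leaves are 0 or 1."""
    while isinstance(tree, tuple):
        var, low, high = tree
        tree = high if bits[var] else low
    return tree

def majority(bits):
    """Hypothetical black box standing in for a circuit: 3-input majority."""
    return int(sum(bits) >= 2)

tree = learn(majority, n_vars=3)
mismatches = [b for b in product((0, 1), repeat=3)
              if evaluate(tree, b) != majority(b)]
print("mismatches:", mismatches)
```

The speculation step is where accuracy becomes probabilistic: a sub-function that is almost always 0 can be mistaken for constant 0 if too few samples are drawn, which is presumably why the paper talks about an accuracy lower bound rather than exactness.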



The resulting RISC-V32IA processor, dubbed "CPU-AI", was taped out on a 65 nm process and operates at 300 MHz (with up to 600 MHz being possible). It successfully runs Linux, SPEC CINT 2000, and Dhrystone, performing similarly to Intel's venerable i486SX from 1991, though the team did not specify which speed of i486SX they compared against, be it 16, 25, or 33 MHz. They suggest that performance could still be improved with augmented algorithms, and in the conclusion of the paper the research team speculates on a self-evolving machine that can design its own iterative upgrades and improvements. While that may be far off in the future, the AI did independently discover the von Neumann architecture through its observation of inputs and outputs. This leads to speculation that the algorithm can be tweaked to focus on fine-grained architecture optimization to work around traditional bottlenecks that it may encounter, a task which can be quite difficult for human engineers to accomplish.



 

Solaris17

Super Dainty Moderator
Staff member
Joined
Aug 16, 2005
Messages
26,967 (3.83/day)
Location
Alabama
System Name RogueOne
Processor Xeon W9-3495x
Motherboard ASUS w790E Sage SE
Cooling SilverStone XE360-4677
Memory 128gb Gskill Zeta R5 DDR5 RDIMMs
Video Card(s) MSI SUPRIM Liquid X 4090
Storage 1x 2TB WD SN850X | 2x 8TB GAMMIX S70
Display(s) 49" Philips Evnia OLED (49M2C8900)
Case Thermaltake Core P3 Pro Snow
Audio Device(s) Moondrop S8's on schitt Gunnr
Power Supply Seasonic Prime TX-1600
Mouse Lamzu Atlantis mini (White)
Keyboard Monsgeek M3 Lavender, Moondrop Luna lights
VR HMD Quest 3
Software Windows 11 Pro Workstation
Benchmark Scores I dont have time for that.
Makes sense, everyone already used it in a limited capacity as is. The next logical step was using it for entire generation. I am sure, or rather it /should/ be, manually looked over before, you know, you're burning $$ on fab time, but otherwise: cool, we are starting to enter this phase of design. Other companies will follow or are following suit, and I'm sure things like mobo PHY etc. will be next if they aren't already.
 
Joined
Apr 15, 2009
Messages
1,034 (0.18/day)
Processor Ryzen 9 5900X
Motherboard Gigabyte X570 Aorus Master
Cooling ARCTIC Liquid Freezer III 360 A-RGB
Memory 32 GB Ballistix Elite DDR4-3600 CL16
Video Card(s) XFX 6800 XT Speedster Merc 319 Black
Storage Sabrent Rocket NVMe 4.0 1TB
Display(s) LG 27GL850B x 2 / ASUS MG278Q
Case be quiet! Silent Base 802
Audio Device(s) Sound Blaster AE-7 / Sennheiser HD 660S
Power Supply Seasonic Vertex PX-1200
Software Windows 11 Pro 64
Next comes the automation of chip and manufacturing foundries, because there's lots of efficiencies to be had with AI there, right?
Humans are writing themselves right out of the equation.
 
Joined
Jan 11, 2022
Messages
877 (0.83/day)
A 300 MHz part doing a job at the same speed as a 30-year-old part running at a 10th of the clock?
It's a first step I guess, I have a hard time placing the achievement.

could it be given an fpga and try out its own designs and improve upon them?
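
For reference, the clock comparison can be worked out from the numbers in the news post (300 MHz for CPU-AI, and 16, 25, or 33 MHz for the i486SX grades). This is only a quick arithmetic sketch; the function name `clock_ratio` is made up here, and clock ratio says nothing about per-clock performance.

```python
def clock_ratio(new_mhz: float, old_mhz: float) -> float:
    """How many times faster the new part is clocked than the old one."""
    return new_mhz / old_mhz

# i486SX speed grades listed in the news post; the paper did not say which one.
ratios = {mhz: clock_ratio(300, mhz) for mhz in (16, 25, 33)}
for mhz, ratio in ratios.items():
    print(f"vs {mhz} MHz i486SX: {ratio:.1f}x the clock")
```

So "a 10th of the clock" is roughly right against the 33 MHz grade, and generous against the slower ones.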
 
Joined
May 30, 2015
Messages
1,928 (0.56/day)
Location
Seattle, WA
A 300 MHz part doing a job at the same speed as a 30-year-old part running at a 10th of the clock?
It's a first step I guess, I have a hard time placing the achievement.

could it be given an fpga and try out its own designs and improve upon them?

A 30 year old part that took a team of engineers thousands of hours to develop.

The performance aspect is the first thing anyone ever focuses on, but it's incredibly short-sighted. The paper details how simple the starting algorithm actually is at the implementation level, and the amount of training required to construct a new processor from scratch using only input/output tracing was fairly modest. They also did not implement any algorithm changes beyond what it took to start producing accurate results. They didn't even give it an overview of the processors it was tracing, only their inputs and outputs. Yet it 'discovered' the underlying design elements from only that training data. Give it iterative capabilities and train it on much deeper wells of data, and even if it takes 100 hours to create a processor it's still exceeding the capabilities of a dedicated team of engineers by literal years.
 

eidairaman1

The Exiled Airman
Joined
Jul 2, 2007
Messages
42,213 (6.64/day)
Location
Republic of Texas (True Patriot)
System Name PCGOD
Processor AMD FX 8350@ 5.0GHz
Motherboard Asus TUF 990FX Sabertooth R2 2901 Bios
Cooling Scythe Ashura, 2×BitFenix 230mm Spectre Pro LED (Blue,Green), 2x BitFenix 140mm Spectre Pro LED
Memory 16 GB Gskill Ripjaws X 2133 (2400 OC, 10-10-12-20-20, 1T, 1.65V)
Video Card(s) AMD Radeon 290 Sapphire Vapor-X
Storage Samsung 840 Pro 256GB, WD Velociraptor 1TB
Display(s) NEC Multisync LCD 1700V (Display Port Adapter)
Case AeroCool Xpredator Evil Blue Edition
Audio Device(s) Creative Labs Sound Blaster ZxR
Power Supply Seasonic 1250 XM2 Series (XP3)
Mouse Roccat Kone XTD
Keyboard Roccat Ryos MK Pro
Software Windows 7 Pro 64
Next comes the automation of chip and manufacturing foundries, because there's lots of efficiencies to be had with AI there, right?
Humans are writing themselves right out of the equation.
AMD tried that with FX in a sense and it didn't bode well for them; however, software finally caught up as of Ryzen.
 
Joined
Aug 20, 2007
Messages
21,469 (3.40/day)
System Name Pioneer
Processor Ryzen R9 9950X
Motherboard GIGABYTE Aorus Elite X670 AX
Cooling Noctua NH-D15 + A whole lotta Sunon and Corsair Maglev blower fans...
Memory 64GB (4x 16GB) G.Skill Flare X5 @ DDR5-6000 CL30
Video Card(s) XFX RX 7900 XTX Speedster Merc 310
Storage Intel 905p Optane 960GB boot, +2x Crucial P5 Plus 2TB PCIe 4.0 NVMe SSDs
Display(s) 55" LG 55" B9 OLED 4K Display
Case Thermaltake Core X31
Audio Device(s) TOSLINK->Schiit Modi MB->Asgard 2 DAC Amp->AKG Pro K712 Headphones or HDMI->B9 OLED
Power Supply FSP Hydro Ti Pro 850W
Mouse Logitech G305 Lightspeed Wireless
Keyboard WASD Code v3 with Cherry Green keyswitches + PBT DS keycaps
Software Gentoo Linux x64 / Windows 11 Enterprise IoT 2024
Sorry folks, but this is a huge deal. Both incredible and scary, but nothing we did not expect would or could happen. This will render USA's tech limits on China moot very quickly.
By the same token, don't imagine USA isn't working on the same.
 
Joined
May 17, 2021
Messages
3,005 (2.33/day)
Processor Ryzen 5 5700x
Motherboard B550 Elite
Cooling Thermalright Perless Assassin 120 SE
Memory 32GB Fury Beast DDR4 3200Mhz
Video Card(s) Gigabyte 3060 ti gaming oc pro
Storage Samsung 970 Evo 1TB, WD SN850x 1TB, plus some random HDDs
Display(s) LG 27gp850 1440p 165Hz 27''
Case Lian Li Lancool II performance
Power Supply MSI 750w
Mouse G502
This is the thing AI is very good at, let it do its job. And keep it away from the party tricks of using math to guess the next word.
 
Joined
Oct 6, 2021
Messages
1,605 (1.40/day)
Uh, I don't think this is relevant. The current level of complexity of chip designs is infinitely higher. Plus, if I'm not mistaken, all large companies use "AI" (scripts) at some point in design validation or simulation.
 
Joined
Jun 18, 2018
Messages
158 (0.07/day)
Processor i5 3570K - 4x @ 5GHz (1.32V) - de-lid
Motherboard ASRock Z77 Extrem 4
Cooling Prolimatech Genesis (3x Thermalright TY-141 PWM) - direct die
Memory 2x 4GB Corsair VENGEANCE DDR3 1600 MHz CL 9
Video Card(s) MSI GTX 980 Gaming 4G (Alpenföhn Peter + 2x Noctua NF-A12) @1547 Mhz Core / 2000 MHz Mem
Storage 500GB Crucial MX500 / 4 TB Seagate - BarraCuda
Case IT-9001 Smasher -modified with a 140mm top exhaust
Audio Device(s) AKG K240 Studio
Power Supply be quiet! E9 Straight Power 600W
[...], all large companies use "AI"(Script) at some point in design validation or simulation.
We are still far from what one could call a proper AI, but neural network based simulations are not just “scripts” in classical terms like those past models were based on.
Further, they will only improve in complexity and scale with time.

Both incredible and scary but nothing we did not expect would or could happen. This will render USA's tech limits on China moot very quickly.
It wasn't a “would” or “could”, but very much a “when” and everyone with the bucks to spend worldwide is implementing machine learning currently.
 
Joined
Feb 3, 2023
Messages
217 (0.33/day)
This is a wonderful achievement. Hardly a new direction, simple machine learning was used to semi-automate some tedious work almost twenty years ago when I had an episode of working with ASIC designers, but this is many levels ahead. Give it a few years of development, more processing power, improved models and the barrier to entry for designing ASICs or even general purpose processors will be much lower.
 
Joined
May 30, 2015
Messages
1,928 (0.56/day)
Location
Seattle, WA
Plus, if I'm not mistaken, all large companies use "AI" (scripts) at some point in design validation or simulation.

AI/ML assisted integrated circuit design and validation tools exist but still require parameter programming and testing from trained engineers, as noted above. The approach presented in the paper required no validation from any secondary tool, and no human intervention except at tape-out to print it. They generated the BSD from I/O trace and the algorithm designed and validated its own ASIC in under 5 hours from that with zero manual tuning, bug checking, validation, or revising. It did it all on its own.

Excerpt:
Actually, though modern commercial electronic design automation (EDA) tools such as logic synthesis or high-level synthesis tools are available to accelerate the CPU design, all these tools require hand-crafted formal program code as the input. Concretely, a team of talented engineers must use formal programming languages (e.g., Verilog, Chisel, or C/C++) to implement the circuit logic of a CPU based on design specification, and then various EDA tools can be used to facilitate functional validation and performance/power optimization of the circuit logic. The above highly complex and non-trivial process typically iterates for months or years, where the key bottleneck is the manual implementation of the input circuit logic in terms of formal program code.
 
Joined
Apr 24, 2020
Messages
2,710 (1.61/day)
I already wrote up a post on this elsewhere. Imma copy/paste it really quick: https://www.techpowerup.com/forums/...stable-diffusion-and-more.304892/post-5050639


RISC-V CPU designed in under 5 hours using AI.
This comment would probably also fit the thread about AI being a danger to jobs, but that thread is old (its last comment is more than two years old) and I see no point in reviving it.


In this article, we report a RISC-V CPU automatically designed by a new AI approach, which generates large-scale Boolean function with almost 100% validation accuracy (e.g., > 99.99999999999% as Intel) from only external input-output examples rather than formal programs written by the human. This approach generates the Boolean function represented by a graph structure called Binary Speculation Diagram (BSD), with a theoretical accuracy lower bound by using the Monte Carlo based expansion, and the distance of Boolean functions is used to tackle the intractability.

These guys invent a new Binary Decision Diagram (calling it a Binary Speculation Diagram), and then have the audacity to call it "AI".

Look, advancements in BDDs are cool and all, but holy shit. These researchers are overhyping their product. When people talk about AI today, they don't mean 1980s style AI. Don't get me wrong, I like 1980s style AI, but I recognize that the new name for 1980s-style AI is "Automated Theorem Proving". You can accomplish awesome feats with automated theorem proving (such as this new CPU), but guess what? A billion other researchers are also exploring BDDs (ZDDs, and a whole slew of other alphabet-soup binary (blah) diagrams) because this technique is widely known.

"Chinese AI Team innovates way to call Binary Decision Diagram competitor an AI during the AI hype cycle". That's my summary of the situation. Ironically, I'm personally very interested in their results because BDDs are incredibly cool and awesome. But it's deeply misrepresenting what the typical layman considers AI today (which is mostly being applied to ChatGPT-like LLMs, or at a minimum, deep convolutional neural networks that underpin techniques like LLMs).

------------------

Furthermore, standard BDDs can create and verify Intel 486-like chips just fine. That's just a 32-bit function (64 bits with 2x inputs), probably without the x87 coprocessor (so no 80-bit floats or 160-bit 2x inputs). Modern BDD techniques that are used to automatically verify, say, the AMD EPYC or Intel AVX512 instructions are doing up to 3x inputs of 512 bits each, or ~1536 bits... and each bit is exponential worst-case for the BDD technique. So... yeah... 64-bit inputs vs 1536 isn't really that impressive.
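
To put numbers on the 64-bit vs 1536-bit comparison, here is a quick sketch of how the worst-case truth-table size scales with input width. The helper name `truth_table_rows` is made up for this illustration, and real BDD sizes depend heavily on the function and the variable ordering, so this is only the naive upper bound on rows.

```python
def truth_table_rows(input_bits: int) -> int:
    """Number of rows in a full truth table with the given number of input bits."""
    return 2 ** input_bits

small = truth_table_rows(64)      # two 32-bit operands
large = truth_table_rows(1536)    # three 512-bit operands

print("64-bit inputs:  ", len(str(small)), "decimal digits of rows")
print("1536-bit inputs:", len(str(large)), "decimal digits of rows")
print("ratio is 2 **", 1536 - 64)
```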

-----------

In fact, the underlying problem is: with only finite inputs and their expected outputs (i.e., IO examples) of a CPU, inferring the circuit logic in the form of a large-scale Boolean function that can be generalized to infinite IO examples with high accuracy.

I literally have an example of this sitting in my 1990s-era BDD textbook. This isn't new at all. This team is overselling their achievement. Albeit my textbook is only on the "textbook" level Reduced Ordered Binary Decision Diagram, with a few notes on ZDDs and the like... but I'm not surprised that "new BDD-style" could lead to some unique advantages.

Now, BSD (or whatever this "Binary Speculation Diagram" thingy is) might be interesting. Who knows? New BDDs are discovered all the time, it's an exceptionally interesting and useful field. Necessary to advance the state-of-the-art of CPU design, synthesis, testing. Furthermore, this is exactly the kind of technology I'd expect hardcore chip designers to be using (it's obviously a great technique). But... it's industry standard. This is what CPU researchers have been studying / experimenting with every day for the past 40+ years, I kid you not.

------------

BTW: ROBDDs (and all data structures based on BDDs) are awesome. I'd love to divert this topic and talk about ROBDDs, their performance characteristics, #P-complete problems, etc. etc. But... it's not AI. It's automated theorem proving, it's exhaustive 100% accurate search with perfectly accurate designs of perfectness.

They generated the BSD from I/O trace

It's not very hard to build a circuit today if you already have the boolean truth table.

Remember: BDDs (and BSD, being a derivative technique of BDDs) store the entire truth table of boolean operations. In a highly optimized, compressed fashion in a graph database, yes. But it's a complete representation of the boolean function.
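
A toy version of that compression, as a sketch only: the code below builds a textbook reduced ordered BDD by brute-force enumeration, sharing identical subgraphs and dropping redundant tests. The function names are made up for this post, and production BDD packages build the graph without enumerating every row the way this does.

```python
from itertools import product

def build_robdd(func, n_vars):
    """Return (root, nodes) for a reduced ordered BDD of func over n_vars bits.

    Terminals are the ints 0 and 1; internal node ids start at 2 and map to
    (var, low, high) triples.
    """
    unique = {}   # (var, low, high) -> node id, so identical subgraphs are shared
    nodes = {}    # node id -> (var, low, high)

    def mk(var, low, high):
        if low == high:               # both branches agree: the test is redundant
            return low
        key = (var, low, high)
        if key not in unique:
            unique[key] = len(unique) + 2
            nodes[unique[key]] = key
        return unique[key]

    def build(prefix):
        if len(prefix) == n_vars:
            return int(bool(func(prefix)))
        return mk(len(prefix), build(prefix + (0,)), build(prefix + (1,)))

    return build(()), nodes

def evaluate(root, nodes, bits):
    """Follow low/high edges from the root until a terminal (0 or 1) is reached."""
    node = root
    while node > 1:
        var, low, high = nodes[node]
        node = high if bits[var] else low
    return node

def parity(bits):
    return sum(bits) % 2

root, nodes = build_robdd(parity, 4)
print("truth table rows:", 2 ** 4, "| BDD nodes:", len(nodes))
```

For 4-bit parity the 16-row truth table collapses to 7 internal nodes, and the diagram still answers every input exactly.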

It's easy to automatically create a multiplier, if you have the 32x32 exhaustive input list of multiplication already. And indeed, modern verification / testing / BDD data structures build circuits using this.
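
Scaled way down, the idea looks like the sketch below: enumerate a 2x2-bit multiplier exhaustively and turn the table directly into an OR-of-ANDs circuit, one list of minterms per output bit. Everything here is illustrative (the names `synthesize_sop`, `run_sop`, and `mul2x2` are made up), and a real flow would minimize the logic and could never enumerate a 32x32 table row by row like this.

```python
from itertools import product

def synthesize_sop(func, n_in, n_out):
    """For each output bit, collect the input rows (minterms) where it is 1."""
    minterms = [[] for _ in range(n_out)]
    for bits in product((0, 1), repeat=n_in):
        out = func(bits)
        for i in range(n_out):
            if out[i]:
                minterms[i].append(bits)
    return minterms

def run_sop(minterms, bits):
    """Evaluate the OR-of-ANDs circuit: an output is 1 if any minterm matches."""
    return tuple(int(any(term == bits for term in terms)) for terms in minterms)

def mul2x2(bits):
    """Reference behaviour: multiply two 2-bit numbers, 4-bit result, MSB first."""
    a = bits[0] * 2 + bits[1]
    b = bits[2] * 2 + bits[3]
    p = a * b
    return tuple((p >> shift) & 1 for shift in (3, 2, 1, 0))

circuit = synthesize_sop(mul2x2, n_in=4, n_out=4)
print("minterms per output bit:", [len(terms) for terms in circuit])
print("3 x 3 ->", run_sop(circuit, (1, 1, 1, 1)))
```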
 
Joined
Jan 22, 2020
Messages
945 (0.53/day)
Location
Turkey
System Name MSI-MEG
Processor AMD Ryzen 9 3900X
Motherboard MSI MEG X570S ACE MAX
Cooling AMD Wraith Prism + Thermal Grizzly
Memory 32 GB
Video Card(s) MSI Suprim X RTX 3080
Storage 500 GB MSI Spatium nvme + 500 GB WD nvme + 2 TB Seagate HDD + 2 TB Seagate HDD
Display(s) 27" LG 144HZ 2K ULTRAGEAR
Case MSI MPG Velox Airflow 100P
Audio Device(s) Altec Lansing
Power Supply Seasonic 750W 80+ Gold
Mouse HP OMEN REACTOR
Keyboard Corsair K68
Software Windows10 LTSC 64 bit
Is this what they call the "technological singularity"?
 
Joined
Dec 30, 2010
Messages
2,198 (0.43/day)
Next comes the automation of chip and manufacturing foundries, because there's lots of efficiencies to be had with AI there, right?
Humans are writing themselves right out of the equation.

That has already been going on for at least 10 years, where a lot of AI is used in the ground design of chips.

It takes on average 150 experienced chip designers. What if you could replace half of that workforce, and the financial benefit you get from it?

That's 75 fewer engineers on the payroll for the next 3 years, with perhaps the job even done better, faster or more efficiently.
 
Joined
Apr 24, 2020
Messages
2,710 (1.61/day)
That has already been going on for at least 10 years, where a lot of AI is used in the ground design of chips.

It takes on average 150 experienced chip designers. What if you could replace half of that workforce, and the financial benefit you get from it?

That's 75 fewer engineers on the payroll for the next 3 years, with perhaps the job even done better, faster or more efficiently.

What do you think those chip designers are doing?

They make Verilog or VHDL. This then compiles into RTL or some other lower level language. Eventually, the synthesis programs create a BDD. That BDD then synthesizes into NAND gates to be laid out, and then the autorouter automatically runs the wires between these parts and makes the final layout.

Maybe humans go over the layout and try to assist the autorouter. But all these low level steps are already programs.

All this paper would do is replace the BDD step (fully automated computer program) with BSD, a new data structure that this paper discusses.

------------

Like, imagine if LLVM compiler (aka clang) billed itself as an AI that would reduce programmer time. Like, it's not wrong. There are traditional AI tasks (such as the coloring problem to solve register allocation) and better compilers will allow fewer programmers to accomplish the task of programming.

But it'd be misleading to pitch it as an AI to a lay audience.
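
For anyone who hasn't seen the register-allocation coloring problem mentioned above, here is a minimal sketch. It uses a simple greedy heuristic and is not how LLVM actually allocates registers; the function name and the interference graph are made up for this example.

```python
def color_registers(interference, num_regs):
    """Greedy graph coloring: map each variable to a register index, or None to spill.

    `interference` maps a variable to the set of variables live at the same
    time as it, which therefore cannot share its register.
    """
    assignment = {}
    # Handle the most-constrained variables first (ties broken by name).
    for var in sorted(interference, key=lambda v: (-len(interference[v]), v)):
        taken = {assignment[n] for n in interference[var]
                 if assignment.get(n) is not None}
        free = [r for r in range(num_regs) if r not in taken]
        assignment[var] = free[0] if free else None
    return assignment

# Hypothetical interference graph: a, b, c are all live together; d overlaps only c.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}

with_three = color_registers(graph, num_regs=3)
with_two = color_registers(graph, num_regs=2)
print("3 registers:", with_three)
print("2 registers:", with_two)
```

With three registers everything fits; with two, the a/b/c triangle forces one variable to be spilled to memory. It is a classic search problem from the AI literature, and nobody markets it as AI.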
 
Joined
Dec 17, 2020
Messages
139 (0.10/day)
Sorry folks but this is a huge deal. Both incredible and scary but nothing we did not expect would or could happen. This will render USA's tech limits on China moot very quickly.
I guess we'll all just have to learn to live in peace.
 
Joined
Aug 27, 2020
Messages
235 (0.15/day)
Location
Hungary
System Name Main rig
Processor AMD Ryzen 7 2700 @ 3.5 GHz/1.18750 V|SoC 0.9 V
Motherboard ASRock Fatal1ty B450 Gaming-ITX/ac |BIOS 3.80
Cooling Noctua NH-D9L
Memory 2 x 8 GB G.Skill Ripjaws V DDR4-3600 @ 3200/C18-18-18-36|1T /1.25 V
Video Card(s) EVGA GeForce GTX 950 SC+ ACX 2.0 2 GB GDDR5
Storage 500 GB Crucial MX500|1 TB Crucial MX500|250 GB Intel 510|128 GB Netac N600S|6+2 TB WD Purple
Display(s) 2 x LG 24GM77-B @ 144 Hz, DAS on, Motion 240 off | 1 x Icy Box VESA arm @ pivot
Case Fractal Design Define Nano S (no window, 1 x Noctua A14 PWM + 1 x Noctua NF-S12B redux-1200 PWM)
Audio Device(s) Edifier R1700 BTs
Power Supply Seasonic Prime Fanless PX-500
Mouse HSK|Hati S|Hati M Ace|Skoll SK-S|Pulsar Xlite|MM710|MM730|NP-01|Viper M.|Krait|MX300|MX500 etc
Keyboard Ducky One 3 TKL |One 3 SF|One 2 Horizon|Miya Pro|Akko 3068B Plus|Huntsman TE|K70 Pro Mini WL etc
VR HMD 3xShidenkai XS|Zero Classic XS|Hien XS|Raiden XS|P-51|Turbulence Teal SE|Talent L|2xAllsop XL etc
Software MS Windows 10 Home x64
Benchmark Scores over 9000
Sorry folks but this is a huge deal. Both incredible and scary but nothing we did not expect would or could happen. This will render USA's tech limits on China moot very quickly.
It's way more scary than it is incredible though.
 