AI Gets Agents: ChatGPT Now Has Deep Research with Agentic Capabilities

AleksandarK · Feb 3, 2025

Today, OpenAI announced a new ChatGPT feature called "Deep Research," which can carry out complex, multi-step research tasks entirely on its own. Using so-called agents, which are autonomous bots working on top of the AI model, the feature searches the web and gathers the information it needs. This agentic behavior was trained on real-world browser usage, accompanied by Python code execution. Deep research, like OpenAI's o1 and o3 models, uses reinforcement learning and steps back to "think," creating a chain of thought before delivering an answer to the user's question. Depending on the topic, deep research can take 5 to 30 minutes to search the web, crawl through data, and compile it in a reader-friendly manner.

As for performance, OpenAI published a number of comparisons and evaluations. Unlike previous models running on their own, deep research supplies the underlying model with additional context gathered during its search. As a result, on evaluation benchmarks like Humanity's Last Exam, deep research scored 26.6%, whereas o1 and o3-mini scored 9.1% and 13%, respectively. Other evaluations showed a modest improvement, while concrete comparisons were made in UX, business, and medical research. Turning the deep research feature on delivered more information every time.

However, as with every Transformer-based AI model and technology, it is prone to hallucinations. Specifically, it can create false references, pick up on rumors and treat them as facts, and fail to signal how confident it is in the information it presents. Even so, it is reportedly much better in this respect than an average AI model in ChatGPT. Interestingly, OpenAI expects these issues to diminish with more usage as deep research advances and learns more about processing information from user prompts. This officially marks OpenAI's level three of AGI. Level one was chatbots, which we got with ChatGPT; level two was reasoning models, which arrived with o1/o3; and level three is agents, which can now perform tasks on their own. Level four is next: an AI model that can aid in human development and invention.


Harthad · Feb 3, 2025

[GIF]

tpa-pr · Feb 3, 2025

Interesting. Since it's bots crawling the web, will they be made to respect robots.txt or some similar standard?
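
For reference, here is a minimal sketch of what respecting robots.txt looks like on the crawler side, using Python's standard `urllib.robotparser`. The bot name `ExampleResearchBot`, the rules, and the URLs are all made up for illustration; nothing here says how OpenAI's agents actually behave.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents; a real crawler would download this
# from https://<site>/robots.txt before fetching any other page.
ROBOTS_TXT = """\
User-agent: ExampleResearchBot
Disallow: /private/

User-agent: *
Disallow:
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for path in ("/private/report.html", "/public/page.html"):
        url = "https://example.com" + path
        print(path, is_allowed(ROBOTS_TXT, "ExampleResearchBot", url))
```

Note that robots.txt is purely advisory: the check only helps if the crawler runs it and honors the result.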

ZoneDymo · Feb 3, 2025

and I take it AI will credit the sources used for this ermmm "research document" ?

Assimilator · Feb 3, 2025

Please can we not turn TPU into yet another LLM company PR regurgitation mouthpiece, kthx.

AleksandarK · Feb 3, 2025

Assimilator said:
Please can we not turn TPU into yet another LLM company PR regurgitation mouthpiece, kthx.

Technology progress is always reported, no matter if it's AI or crypto or anything else.

MacZ · Feb 3, 2025

So you would expect that people start to realize that having accurate and truthful information on the web is important.

And for example, in the news business, remember that their role is to inform but also to act as a bulwark against the BS, twists, manipulation and outright lies, especially of powerful/rich persons and organizations, and especially of their own government.

And this requires a bit of ability to be skeptical, and a bit of courage (because they will be retaliated against).

Otherwise :

1/ If this is just cheerleading for companies and technologies (for example), you just need a model with 3 inputs : the press release, sentiment and style. It will do just as fine and nothing of value will be lost. And don't think about learning to code : it's too late.

2/ The AI superintelligence will have some other ways to go haywire than just hallucinations.

Tomorrow · Feb 3, 2025

VI Agents were predicted decades ago so this is hardly surprising.

Veseleil · Feb 4, 2025

ZoneDymo said:
and I take it AI will credit the sources used for this ermmm "research document" ?

People will figure out, sooner or later, that datamining operations (meta, google, etc.) aren't used just for the sake of the marketing companies.

