
Chat with NVIDIA RTX Tech Demo Review


Installation and Teething Issues

For an AI chatbot to store all of its knowledge locally, it takes tens of gigabytes of data. Your Chat with RTX journey hence begins with a massive 35.1 GB installer download from NVIDIA. This comes as a ZIP file, with the datasets inside heavily compressed. Once you've unzipped the contents to a folder, you run the installer executable.

But before you do this, make sure you meet the system requirements:
  • A GeForce RTX 30-series "Ampere" or RTX 40-series "Ada" GPU with at least 8 GB of video memory. For some reason, the RTX 20-series "Turing" is not supported at this time
  • 100 GB of disk space, preferably on an SSD, because installation is a lot more painful on HDDs—we've checked
  • Windows 11 or Windows 10
  • The latest NVIDIA graphics drivers

The Chat with RTX installer looks very similar to your GeForce driver installer. On top of the 35 GB application download, the installer fetches additional dependencies as needed for Chat with RTX to work. Depending on what your machine already has, these dependencies run into several additional gigabytes of downloads (NVIDIA did ask you to set aside 100 GB), including nearly 10 GB worth of Python and Anaconda-related components. NVIDIA has made a conscious attempt to keep the installation process as easy as possible, so that it doesn't appear as complicated as installing other generative AI tools on your machine.

The installed size of Chat with RTX is 69.1 GB. Of that, 6.5 GB goes to the Python-based Anaconda environment, the Llama2 and Mistral models take up 31 GB and 17 GB respectively, and the rest is other Python-related libraries (yes, around 10 GB of them).

For users with GeForce RTX GPUs that have 16 GB or more of video memory, the installer offers to install both the Llama2 and Mistral AI models. For those with 8 GB or 12 GB of video memory, it only offers Mistral. This is because the Llama2 model and its dataset consume an enormous amount of video memory. You can, however, override this limitation by editing the installer's config file, found in a sub-folder of the unzipped installer (ask us in the comments if you need more guidance).
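For the adventurous: at the time of writing, the community-documented tweak was a single property in the "RAG\llama13b.nvi" file inside the unzipped installer folder. Treat the exact file name, property, and default value below as our observation of this particular installer build, not official NVIDIA guidance:

    <!-- RAG\llama13b.nvi (observed, not official): the installer's minimum VRAM cutoff for Llama2, in GB -->
    <string name="MinSupportedVRAMSize" value="15"/>

Lower the value to match your card's VRAM and re-run the installer; just don't expect Llama2 to run well on cards NVIDIA deemed too small.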

Toward the end of the installation, the installer places a shortcut on the Windows Desktop, and offers to launch the application. It's highly advisable to accept both, because otherwise beginners will be lost trying to figure out how to launch this thing: Chat with RTX is installed in your AppData folder by default. If for whatever reason the installer failed to create a Desktop shortcut, or you're lost, you can start the application by running the "%LOCALAPPDATA%\NVIDIA\ChatWithRTX\RAG\trt-llm-rag-windows-main\app_launch.bat" Windows Batch file.
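If you're doing it by hand, the same batch file (at the default install path quoted above) can be run from a command prompt:

    cd /d "%LOCALAPPDATA%\NVIDIA\ChatWithRTX\RAG\trt-llm-rag-windows-main"
    app_launch.bat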


When you run this batch file, a CMD Shell window pops up, and the application builds and loads the existing data. This takes 30 seconds to a minute, and allocates around 6-8 GB of your graphics card's video memory for the AI models to run in, so don't try gaming or graphics benchmarking on the side. We've had no problems with video playback (YouTube), though.
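If you want to see that VRAM allocation for yourself, the nvidia-smi command-line utility that ships with the GeForce driver can report per-GPU memory use:

    nvidia-smi --query-gpu=memory.used,memory.total --format=csv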

Chat with RTX, like most current generative AI tools, is a client-server application: that CMD Shell window needs to keep running in the background, as that's where the Chat with RTX service session is in progress. The application's front-end is web-browser based. With the service running, you point your browser to an address like "http://127.0.0.1:1088/?__theme=dark" to launch the application's front-end. The port number seems to be random; it is displayed in the CMD Shell window after startup.
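If you'd rather not squint at the console for the port, a few lines of Python can probe localhost for a responding web server and open it. This is purely our own convenience sketch, not something NVIDIA ships, and it will happily latch onto any other local web server it finds first:

    # Convenience sketch (ours, not NVIDIA's): probe local ports for an HTTP
    # server and open the first one that responds in the default browser.
    import urllib.request
    import webbrowser

    for port in range(1024, 65536):
        url = f"http://127.0.0.1:{port}/?__theme=dark"
        try:
            with urllib.request.urlopen(url, timeout=0.2) as resp:
                if resp.status == 200:
                    webbrowser.open(url)
                    break
        except OSError:
            continue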


This is the Chat with RTX application. By default, an NVIDIA tech demo AI model and a small dataset of RTX marketing materials are loaded. You can ask it questions related to various NVIDIA RTX features. It gives you snappy text responses, and links to the exact text files it drew its references from. You can use the "Select AI model" dropdown to toggle between this, Llama2, and Mistral. By default, you get your chosen AI model along with a dataset last updated in mid-2022 that's around 16-17 GB in size, so you can get talking about pretty much anything. The datasets aren't as comprehensive as GPT 3.5's, and some of the responses aren't as thoroughly researched as ChatGPT's.


But this is hardly the whole story. The real ace up NVIDIA's sleeve is that Chat with RTX can be fed any amount of data, either in plain text (.txt) or as documents in Word or PDF formats, and it will learn from them. We fed it all the news articles ever posted on TechPowerUp, to build a hardware technology mastermind AI. This required some extra coding work, because we had to export all our news posts into text files: some 250 MB of plain text, across around 60,000 articles. The application took about an hour on our GeForce RTX 4080 to train itself on this data, and then began to answer questions related to computer hardware and technology.
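Our export step boiled down to something like the sketch below. The database file, table, and column names here are hypothetical stand-ins for illustration; point it at wherever your documents actually live:

    # Illustrative sketch only: dump articles from a database into one .txt file
    # each, producing a folder Chat with RTX can index. The database file, table,
    # and column names ("news.db", "news", id/title/body) are hypothetical.
    import sqlite3
    from pathlib import Path

    out_dir = Path("rtx_dataset")
    out_dir.mkdir(exist_ok=True)

    conn = sqlite3.connect("news.db")
    for row_id, title, body in conn.execute("SELECT id, title, body FROM news"):
        (out_dir / f"{row_id}.txt").write_text(f"{title}\n\n{body}", encoding="utf-8")
    conn.close()

Point Chat with RTX at the resulting folder and it indexes the text files from there.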