News Posts matching #Moore Threads


Moore Threads MTLink Scales Up to 10,000 Home-Grown GPUs in AI Cluster

Chinese GPU manufacturer Moore Threads has announced a significant upgrade to its KUAE data center server. The company now has the ability to connect up to 10,000 GPUs in a single cluster, marking a huge leap in its scale-out capabilities for artificial intelligence and high-performance computing applications. The enhanced KUAE server incorporates eight MTT S4000 GPUs, leveraging Moore Threads' proprietary MTLink interconnect technology. These GPUs, based on the MUSA architecture, each feature 128 tensor cores and 48 GB of GDDR6 memory, delivering a bandwidth of 768 GB/s. While the full performance metrics of a 10,000-GPU cluster remain undisclosed, the sheer scale of 1,280,000 tensor cores suggests decent computing potential. Moore Threads' GPUs currently lag behind NVIDIA's GPU offerings in terms of performance. However, the company claims its MTT S4000 remains competitive against certain NVIDIA models, particularly in large language model training and inference tasks.
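The tensor-core total quoted above is plain multiplication, and the same arithmetic gives the server count and the summed memory. A minimal Python sketch using only the figures from the announcement; the function name and dictionary keys are illustrative, and the memory figure is a simple sum, not a claim about how much memory a single job can actually address:

```python
def cluster_totals(gpus: int, gpus_per_server: int,
                   tensor_cores_per_gpu: int, memory_gb_per_gpu: int) -> dict:
    """Aggregate headline figures for a cluster of identical GPUs."""
    return {
        "servers": gpus // gpus_per_server,          # 8 GPUs per KUAE server
        "tensor_cores": gpus * tensor_cores_per_gpu,
        "memory_gb": gpus * memory_gb_per_gpu,       # naive sum across all cards
    }


totals = cluster_totals(gpus=10_000, gpus_per_server=8,
                        tensor_cores_per_gpu=128, memory_gb_per_gpu=48)
print(totals)
```

With these inputs the sketch reports 1,250 servers, 1,280,000 tensor cores, and 480,000 GB of GDDR6 in aggregate.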

The Chinese company faces significant challenges due to its inclusion on the U.S. Department of Commerce's Entity List, which restricts its access to advanced manufacturing processes. Despite these obstacles, the firm has secured partnerships with major Chinese state-run telecom operators and technology companies, focusing on developing new computing cluster projects. A recent financing round that raised approximately $343.7 million will help fund Moore Threads' ambitious expansion plans. However, limited access to cutting-edge semiconductor fabrication technologies may constrain the company's future growth. Nonetheless, creating a scale-out server infrastructure with up to 10,000 GPUs is vital for LLM training and inference, especially as Chinese AI labs catch up to Western labs in terms of the performance of their AI models.

Moore Threads MTT S80 dGPU Struggles to Keep Up with Modern Radeon iGPUs

The Moore Threads MTT S80 first attracted wider media attention last summer when it was introduced as the world's first PCIe Gen 5 gaming graphics card. Unfortunately, its performance in gaming benchmarks did not match early expectations, especially for a 200 W TDP-rated unit with 4096 "MUSA" cores. Evaluators discovered that driver issues have limited the full potential of MTT GPUs, and it is speculated that Moore Threads has simply repurposed existing PowerVR architecture under its in-house design, "Chunxiao." The Chinese firm has concentrated on driver improvements in the interim: mid-February tests indicated 100% performance boosts for MTT S80 and S70 discrete GPUs courtesy of driver version 240.90. Germany's ComputerBase managed to import Moore Threads MTT S80 and S30 models for testing, in an effort to corroborate recently published performance figures from Asian review outlets.

The Moore Threads MTT S80, discounted to $164 last October, was likely designed with MMO gamers in mind. VideoCardz (based on ComputerBase findings) discussed the card's struggles when weighed against Team Red's modern-day integrated solutions: "S80 falls short when compared to the Ryzen 5 8600G, featuring the Radeon 760M iGPU with RDNA 3 graphics. A geometric mean across various titles reveals the S80's lag, but there are exceptions, like DOTA 2, where it takes the lead in framerate. It's clear that MTT GPUs (have a) less emphasized focus on supporting AAA titles." ComputerBase confirmed that DirectX 12 API support is still lacking, meaning that many popular Western game titles remain untested on the Moore Threads MTT S80 graphics card. The freshly launched entry-level MTT S30 card produced "1/4 of the performance" when compared to its flagship sibling.
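The quoted comparison rests on a geometric mean across titles, which is why a single win such as DOTA 2 does not reverse the overall picture. A small Python sketch of that calculation; the per-title ratios below are hypothetical placeholders chosen for illustration, not ComputerBase or VideoCardz measurements:

```python
import math


def geometric_mean(values):
    """Geometric mean of positive numbers, computed through logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical per-title ratios (card A fps / card B fps), NOT measured data.
# Two titles run at half the speed of the rival; one runs twice as fast.
ratios = [0.5, 0.5, 2.0]

overall = geometric_mean(ratios)
print(f"overall relative performance: {overall:.2f}")
```

In this made-up case the overall ratio stays below 1.0 even though one title favors card A by 2x, because the geometric mean weighs relative gains and losses symmetrically.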

Moore Threads MTT S30 Features AV1 Decode Acceleration, Supports Direct3D API

More details are emerging of the elusive Moore Threads MTT S30 entry-level graphics card, which was released earlier this week. VideoCardz reports that the GPU features hardware-accelerated decoding of AV1, besides H.265 and H.264, which should cover all the bases for its use as a streaming content consumption GPU. The card features an HDMI 2.0 port that supports 4K Ultra HD up to 60 Hz. The D-Sub connector tops out at 1920 x 1200. Most Windows-based media applications use Direct3D or DXVA codepaths for accelerated video decode, and so the GPU has some form of DirectX API support, although we still don't know up to which version that is. It also supports OpenGL, which should come in handy with certain Adobe applications that use GL contexts to draw their workspaces. The source also reports a rather attractive retail price of just RMB ¥399, or about USD $55.

Moore Threads Releases MTT S30 Entry-level GPU

Moore Threads, the Chinese company aiming to build a contemporary PC GPU family indigenous to China, formally introduced the MTT S30, an entry-level GPU. Given the performance positioning of the company's flagship MTT S80 GPU even with its recent performance doubling driver update, one can conclude that the MTT S30 isn't quite a gaming GPU. It has a quarter of the unified shaders of the MTT S80, 1/6th its FP32 throughput, and a quarter of its memory size; which means the GPU really is an iGPU replacement that accelerates one or more high-resolution displays for non-gaming productivity workloads, and perhaps some media acceleration.

The Moore Threads MTT S30 features 1,024 unified shaders, an unknown number of tensor accelerators, a 1.30 GHz GPU clock, and 4 GB of GDDR6 memory across a 128-bit wide memory bus. The reference design card is single-slot, half-height, and draws all its power from the PCIe slot, given that its power draw is rated at just 40 W. This card has just two display connectors—HDMI and D-Sub. It features a PCI-Express 4.0 x8 host interface.
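The S30-to-S80 ratios mentioned above can be checked against the published specifications. A rough Python sketch, assuming the S80 figures reported elsewhere on this page (4,096 cores, 1.8 GHz, 16 GB) and assuming peak FP32 throughput of two operations per shader per clock (one fused multiply-add); that factor is a common rule of thumb, not something Moore Threads is quoted on here:

```python
def peak_fp32_tflops(shaders: int, clock_ghz: float,
                     flops_per_clock: int = 2) -> float:
    """Theoretical peak FP32 TFLOPS; flops_per_clock=2 assumes one FMA per clock."""
    return shaders * clock_ghz * flops_per_clock / 1000.0


s30 = {"shaders": 1024, "clock_ghz": 1.30, "memory_gb": 4}
s80 = {"shaders": 4096, "clock_ghz": 1.80, "memory_gb": 16}

shader_ratio = s30["shaders"] / s80["shaders"]
memory_ratio = s30["memory_gb"] / s80["memory_gb"]
fp32_ratio = (peak_fp32_tflops(s30["shaders"], s30["clock_ghz"])
              / peak_fp32_tflops(s80["shaders"], s80["clock_ghz"]))

print(f"shaders: {shader_ratio:.2f}, memory: {memory_ratio:.2f}, "
      f"FP32: {fp32_ratio:.2f}")
```

Under these assumptions the shader and memory ratios both come out at exactly one quarter, while the FP32 ratio lands near 0.18, between one-fifth and one-sixth. The "1/6th" figure in the text may rest on official throughput numbers rather than this estimate.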

Moore Threads S80 and S70 GPUs Get 100% Performance Boost with Driver Update

Moore Threads S80 and S70 discrete GPUs get a doubling in their gaming performance thanks to the latest driver update, which seems to re-architect the software end of the graphics pipeline. The new version 240.90 driver from Moore Threads was tested by Chinese technology publication EJ Hardware, which showed an over 100% performance gain in synthetic benchmarks such as 3DMark Fire Strike, and a doubling in games such as "Call of Duty: Modern Warfare 3," "League of Legends," "Crossfire," "Risk of Rain 2," and "DOTA 2." The performance impact wasn't very pronounced in "Genshin Impact." In DOTA 2, where the older 230.11 driver caused the S80 to fail the test, the new driver posts an impressive 100% gain over older drivers. With its launch drivers, the Moore Threads S80 was tested to be comparable to a GeForce GTX 1050. This 100% performance boost should change its market positioning.

Moore Threads Launches MTT S4000 48 GB GPU for AI Training/Inference and Presents 1000-GPU Cluster

Chinese chipmaker Moore Threads has launched its first domestically-produced 1000-card AI training cluster, dubbed the KUAE Intelligent Computing Center. A central part of the KUAE cluster is Moore Threads' new MTT S4000 accelerator card with 48 GB of VRAM, utilizing the company's third-generation MUSA GPU architecture and 768 GB/s of memory bandwidth. In FP32, the card can output 25 TeraFLOPS; in TF32, it can achieve 50 TeraFLOPS; and in FP16/BF16, up to 200 TeraFLOPS. Also supported is INT8 at 200 TOPS. The MTT S4000 focuses on both training and inference, leveraging Moore Threads' high-speed MTLink 1.0 intra-system interconnect to scale cards for distributed model-parallel training of models with hundreds of billions of parameters. The card also provides graphics, video encoding/decoding, and 8K display capabilities for graphics workloads. Moore Threads' KUAE cluster combines the S4000 GPU hardware with RDMA networking, distributed storage, and integrated cluster management software. The KUAE Platform oversees multi-datacenter resource allocation and monitoring. KUAE ModelStudio hosts training frameworks and model repositories to streamline development.

With integrated solutions now proven at the thousand-GPU scale, Moore Threads is positioned to power ubiquitous intelligent applications, from scientific computing to the metaverse. The KUAE cluster reportedly achieves a near-linear scaling efficiency of 91%. Taking 200 billion training data as an example, Zhiyuan Research Institute's 70 billion parameter Aquila2 can complete training in 33 days; a model with 130 billion parameters can complete training in 56 days on the KUAE cluster. In addition, the Moore Threads KUAE kilocard cluster supports long-term continuous and stable operation, supports resuming training from a breakpoint, and offers asynchronous checkpointing that takes less than 2 minutes. For software, Moore Threads also boasts full compatibility with NVIDIA's CUDA framework, where its MUSIFY tool translates CUDA code for the MUSA GPU architecture at supposedly zero cost of migration, i.e., no performance penalty.
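The scaling and training-time figures above can be related with a little arithmetic. A minimal Python sketch, assuming the 91% figure means the cluster delivers 91% of the throughput that perfectly linear scaling would give (the source does not define it precisely); the helper names are illustrative:

```python
def effective_gpus(gpu_count: int, scaling_efficiency: float) -> float:
    """GPU-equivalents of useful throughput under the assumed efficiency definition."""
    return gpu_count * scaling_efficiency


def days_per_billion_params(days: float, params_billion: float) -> float:
    """Training days divided by model size, for comparing the two quoted runs."""
    return days / params_billion


equivalent = effective_gpus(1000, 0.91)
small = days_per_billion_params(33, 70)    # 70B-parameter Aquila2, 33 days
large = days_per_billion_params(56, 130)   # 130B-parameter model, 56 days

print(f"{equivalent:.0f} GPU-equivalents")
print(f"{small:.3f} vs {large:.3f} days per billion parameters")
```

Under that reading, a 1,000-card cluster behaves like roughly 910 ideally scaled cards. The quoted runs work out to about 0.47 and 0.43 days per billion parameters, so the reported training time grows slightly slower than model size.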

Moore Threads MTT S80, World's First PCIe Gen 5 Gaming Graphics Card, Now Priced at $164

The Moore Threads MTT S80 discrete graphics card is now available as part of a special 11-11 (Singles' Day) promotion in China, for the equivalent of USD $164, making it both the world's first PCIe Gen 5 gaming graphics card and the most affordable one to feature 16 GB of memory. Moore Threads is a Chinese GPU manufacturer that has been aiming for a few years now to build a contemporary GPU and grab a slice of the entry-mainstream gaming market in China.

Much of the PC gaming scene in China doesn't involve AAA productions in need of the fastest GPU out there, but rather GPUs from the mainstream performance tier. Moore Threads knows this, and has been reinventing many wheels in the absence of the kind of graphics IP cross-licensing entanglement that exists among NVIDIA, AMD, and Intel. The company's fastest GPU is the MTT S80, launched in late 2022, which has the bragging rights to be the world's first with a PCI Express Gen 5 bus interface. Does it need this kind of bandwidth? We honestly don't know, after seeing how sensitive to PCIe interface and resizable-BAR even mainstream Intel GPUs can be. At launch, the performance level of the MTT S80 made it more of a novelty than anything, with performance barely matching a Radeon RX 6400, making it about as fast as the iGPU of AMD's Ryzen 5000G "Cezanne" desktop APUs. This is just enough for China's homebrew MOBAs and MMORPGs that are designed to maximize market reach, and hence tend to contain a lot of pre-baked content.
Image Courtesy: Expreview

Moore Threads Prepares S90 and S4000 GPUs for Gaming and Data Center

Moore Threads Technology (MTT), a Chinese GPU manufacturer, is reportedly testing its next-generation graphics processors for client PCs and data centers. The products under scrutiny are the MTT S90 for client/gaming computers and the MTT S4000 for data centers. Their Device IDs, 0301 and 0323, could imply that these GPUs belong to MTT's third-generation GPU lineup. While few details about these GPUs are available, the new Device IDs suggest the possible introduction of a new microarchitecture following the MTT Chunxiao GPU series. The current-generation Chunxiao series, featuring the MTT S70, MTT S80, and MTT S3000, failed to compete effectively with AMD, Intel, and NVIDIA GPUs.

Thanks go to @Löschzwerg, who found the Device Hunt submission. Hardware identifiers often appear in PCI ID and USB ID repositories ahead of launch, which usually signals that companies are testing new chips or drivers. In the case of MTT, the latest developments are complicated by its recent inclusion on the U.S. Entity List, which limits its access to US-made technologies. This poses a problem for the company, as it cannot access TSMC's facilities for chip production and will likely have to turn to domestic production, with SMIC being the only leading option to consider.

Moore Threads Driver Update Brings up to 40% Performance Uplift for S70 and S80 GPUs

Moore Threads' latest driver update, 230.40.0.1, is a noteworthy advancement, bringing many improvements and new features, most significantly OpenGL 3.3 support. This inclusion is crucial, as the API was previously unsupported on MTT S70 and S80 GPUs, and it highlights MTT's commitment to broadening the user experience across various platforms and games. Moreover, the update offers substantial performance enhancements, with notable increases in frame rates in popular games. Valorant supposedly gets a 40% FPS improvement at 1080p, while Project CARS sees a 10% increase. Game engines such as CryEngine v5.7 also receive a 40% performance uplift. However, the release of only percentage improvements without specific framerate values calls for measured optimism, as the tangible impact on playability is yet to be assessed.
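The caution about percentage-only figures is easy to illustrate: the same 40% uplift means very different things depending on the starting frame rate. A short Python sketch; the baseline frame rates are hypothetical, since Moore Threads published no absolute numbers:

```python
def fps_after_uplift(baseline_fps: float, uplift_percent: float) -> float:
    """Frame rate after applying a relative performance uplift."""
    return baseline_fps * (1 + uplift_percent / 100)


# Hypothetical baselines, NOT measured values for any Moore Threads GPU.
results = {baseline: fps_after_uplift(baseline, 40) for baseline in (20, 60, 120)}

for baseline, boosted in results.items():
    print(f"{baseline} fps -> {boosted:.0f} fps")
```

A card starting at 20 fps would still sit below 30 fps after the update, while one starting at 60 fps would gain a comfortable margin, which is why the absolute values matter for judging playability.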

Equally important in this update is the emphasis on stability, with many popular titles seeing enhancements for a smoother and more reliable gaming experience. This focus underscores MTT's dedication to maintaining a robust and stable gaming environment on the Windows 10 operating system. With 20 fixes integrated and work continuing on existing issues, the update matters for users who want a refined experience in graphics-intensive applications and games. The combination of better compatibility, higher performance, and improved stability is pivotal for users looking to maximize graphics and gaming capabilities.

Moore Threads MTT S80 GPU Benchmarked by PC Watch Japan

The Moore Threads MTT S80 gaming-oriented graphics card has been tested mostly by Chinese hardware publications, but Japan's PC Watch has managed to get hold of a sample unit configured with 16 GB of GDDR6 (14 Gbps) for evaluation purposes and soon published its findings in a "HotHot REVIEW!" The MTT S80 GPU appears to be based on PowerVR architecture (developed by Imagination Technologies), but official Moore Threads literature boasts that the company's own Chunxiao design is behind all proceedings with 4096 "MUSA" cores. The GPU's clock speed is set at 1.8 GHz, and maximum compute performance has been measured at 14.2 TFLOPS. A 256-bit memory bus provides 448 GB/s of memory bandwidth. PC Watch notes that the card's support for PCIe Gen 5 x16 (offering up to 128 GB/s of bidirectional bandwidth) is quite surprising, given the early nature of this connection standard.
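Both bandwidth figures in this paragraph can be reproduced from the stated specifications. A minimal Python sketch; the function names are illustrative, and the PCIe helper assumes the standard Gen 5 signaling rate of 32 GT/s per lane with 128b/130b encoding, ignoring protocol overhead beyond that:

```python
def memory_bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Memory bandwidth in GB/s: bus width in bytes times per-pin data rate."""
    return bus_width_bits / 8 * data_rate_gbps


def pcie_bandwidth_gbs(lanes: int, gt_per_s: float,
                       encoding: float = 128 / 130) -> float:
    """Per-direction PCIe bandwidth in GB/s after line encoding."""
    return lanes * gt_per_s * encoding / 8


vram = memory_bandwidth_gbs(256, 14)               # 256-bit bus, 14 Gbps GDDR6
pcie_one_way = pcie_bandwidth_gbs(16, 32)          # Gen 5 x16, encoded
pcie_raw_both = 2 * pcie_bandwidth_gbs(16, 32, 1)  # raw, both directions

print(f"VRAM: {vram:.0f} GB/s")
print(f"PCIe Gen 5 x16: {pcie_one_way:.1f} GB/s per direction, "
      f"{pcie_raw_both:.0f} GB/s raw in both directions")
```

The memory calculation matches the 448 GB/s in the review. The 128 GB/s PCIe figure corresponds to the raw rate summed over both directions; each direction carries about 63 GB/s once encoding is accounted for.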

Moore Threads has claimed in the past that its cards support DirectX, but PC Watch has discovered that the S80 does not work with DX12, and its tests also demonstrated significant compatibility issues under DX11, with plenty of system freezes and error messages logged. The reviewer(s) had to downshift in some cases to DX9 game environments in order to gather reliable, stable data. TPU's GPU-Z utility is shown to have no registration information for the S80, and it cannot read the GPU's clock. PC Watch compared its sample unit to an NVIDIA GeForce GTX 1050 Ti graphics card, and the entry-level 2016-era GPU managed to best the newer competition in terms of in-game performance and power efficiency.

Moore Threads Unveils MTT S60 & MTT S2000 Graphics Cards with DirectX Support

Chinese company Moore Threads has unveiled its MTT GPU series just 18 months after the company's establishment in 2020. The MT Unified System Architecture (MUSA) is the first architecture from any Chinese company to be developed fully domestically and includes support for DirectX, OpenCL, OpenGL, Vulkan, and CUDA. The company announced the MTT S60 and MTT S2000 single-slot desktop graphics cards for gaming and server applications at a recent event. The MTT S60 is manufactured on a 12 nm node and features 2,048 MUSA cores paired with 8 GB of LPDDR4X memory, offering 6 TFLOPS of performance. The MTT S2000 is also manufactured on a 12 nm node and doubles the number of MUSA cores to 4,096, paired with 32 GB of undisclosed video memory, allowing it to reach 12 TFLOPS.

Moore Threads joins Intel in supporting AV1 encoding on a consumer GPU with MUSA cards featuring H.264, H.265, and AV1 encoding support in addition to H.264, H.265, AV1, VP8, and VP9 decoding. The company is also developing a physics engine dubbed Alphacore which is said to work with existing tools such as Unity, Unreal Engine, and Houdini to accelerate physics performance by 5 to 10 times. The only gaming performance shown was a simple demonstration of the MTT S60 running League of Legends at 1080p without any frame rate details.