News Posts matching #LLM

Return to Keyword Browsing

Micron Unveils Its First PCIe Gen5 NVMe High-Performance Client SSD

Micron Technology, Inc., today announced the Micron 4600 PCIe Gen 5 NVMe SSD, an innovative client storage drive for OEMs that is designed to deliver exceptional performance and user experience for gamers, creators and professionals. Leveraging Micron G9 TLC NAND, the 4600 SSD is Micron's first Gen 5 client SSD and doubles the performance of its predecessor.

The Micron 4600 SSD showcases sequential read speeds of 14.5 GB/s and write speeds of 12.0 GB/s. These capabilities allow users to load a large language model (LLM) from the SSD to DRAM in less than one second, enhancing the user experience with AI PCs. For AI model loading times, the 4600 SSD reduces load times by up to 62% compared to Gen 4 performance SSDs ensuring rapid deployment of LLMs and other AI workloads. Additionally, the 4600 SSD provides up to 107% improved energy efficiency (MB/s per watt) compared to Gen 4 performance SSDs, enhancing battery life and overall system efficiency.

AMD & Nexa AI Reveal NexaQuant's Improvement of DeepSeek R1 Distill 4-bit Capabilities

Nexa AI, today, announced NexaQuants of two DeepSeek R1 Distills: The DeepSeek R1 Distill Qwen 1.5B and DeepSeek R1 Distill Llama 8B. Popular quantization methods like the llama.cpp based Q4 K M allow large language models to significantly reduce their memory footprint and typically offer low perplexity loss for dense models as a tradeoff. However, even low perplexity loss can result in a reasoning capability hit for (dense or MoE) models that use Chain of Thought traces. Nexa AI has stated that NexaQuants are able to recover this reasoning capability loss (compared to the full 16-bit precision) while keeping the 4-bit quantization and all the while retaining the performance advantage. Benchmarks provided by Nexa AI can be seen below.

We can see that the Q4 K M quantized DeepSeek R1 distills score slightly less (except for the AIME24 bench on Llama 3 8b distill, which scores significantly lower) in LLM benchmarks like GPQA and AIME24 compared to their full 16-bit counter parts. Moving to a Q6 or Q8 quantization would be one way to fix this problem - but would result in the model becoming slightly slower to run and requiring more memory. Nexa AI has stated that NexaQuants use a proprietary quantization method to recover the loss while keeping the quantization at 4-bits. This means users can theoretically get the best of both worlds: accuracy and speed.

Moore Threads Teases Excellent Performance of DeepSeek-R1 Model on MTT GPUs

Moore Threads, a Chinese manufacturer of proprietary GPU designs is (reportedly) the latest company to jump onto the DeepSeek-R1 bandwagon. Since late January, NVIDIA, Microsoft and AMD have swooped in with their own interpretations/deployments. By global standards, Moore Threads GPUs trail behind Western-developed offerings—early 2024 evaluations presented the firm's MTT S80 dedicated desktop graphics card struggling against an AMD integrated solution: Radeon 760M. The recent emergence of DeepSeek's open source models has signalled a shift away from reliance on extremely powerful and expensive AI-crunching hardware (often accessed via the cloud)—widespread excitement has been generated by DeepSeek solutions being relatively frugal, in terms of processing requirements. Tom's Hardware has observed cases of open source AI models running (locally) on: "inexpensive hardware, like the Raspberry Pi."

According to recent Chinese press coverage, Moore Threads has announced a successful deployment of DeepSeek's R1-Distill-Qwen-7B distilled model on the aforementioned MTT S80 GPU. The company also revealed that it had taken similar steps with its MTT S4000 datacenter-oriented graphics hardware. On the subject of adaptation, a Moore Threads spokesperson stated: "based on the Ollama open source framework, Moore Threads completed the deployment of the DeepSeek-R1-Distill-Qwen-7B distillation model and demonstrated excellent performance in a variety of Chinese tasks, verifying the versatility and CUDA compatibility of Moore Threads' self-developed full-featured GPU." Exact performance figures, benchmark results and technical details were not disclosed to the Chinese public, so Moore Threads appears to be teasing the prowess of its MTT GPU designs. ITHome reported that: "users can also perform inference deployment of the DeepSeek-R1 distillation model based on MTT S80 and MTT S4000. Some users have previously completed the practice manually on MTT S80." Moore Threads believes that its: "self-developed high-performance inference engine, combined with software and hardware co-optimization technology, significantly improves the model's computing efficiency and resource utilization through customized operator acceleration and memory management. This engine not only supports the efficient operation of the DeepSeek distillation model, but also provides technical support for the deployment of more large-scale models in the future."

ASUS AI POD With NVIDIA GB200 NVL72 Platform Ready to Ramp-Up Production for Scheduled Shipment in March

ASUS is proud to announce that ASUS AI POD, featuring the NVIDIA GB200 NVL72 platform, is ready to ramp-up production for a scheduled shipping date of March 2025. ASUS remains dedicated to providing comprehensive end-to-end solutions and software services, encompassing everything from AI supercomputing to cloud services. With a strong focus on fostering AI adoption across industries, ASUS is positioned to empower clients in accelerating their time to market by offering a full spectrum of solutions.

Proof of concept, funded by ASUS
Honoring the commitment to delivering exceptional value to clients, ASUS is set to launch a proof of concept (POC) for the groundbreaking ASUS AI POD, powered by the NVIDIA Blackwell platform. This exclusive opportunity is now open to a select group of innovators who are eager to harness the full potential of AI computing. Innovators and enterprises can experience firsthand the full potential of AI and deep learning solutions at exceptional scale. To take advantage of this limited-time offer, please complete this surveyi at: forms.office.com/r/FrAbm5BfH2. The expert ASUS team of NVIDIA GB200 specialists will guide users through the next steps.

NVIDIA GeForce RTX 50 Series AI PCs Accelerate DeepSeek Reasoning Models

The recently released DeepSeek-R1 model family has brought a new wave of excitement to the AI community, allowing enthusiasts and developers to run state-of-the-art reasoning models with problem-solving, math and code capabilities, all from the privacy of local PCs. With up to 3,352 trillion operations per second of AI horsepower, NVIDIA GeForce RTX 50 Series GPUs can run the DeepSeek family of distilled models faster than anything on the PC market.

A New Class of Models That Reason
Reasoning models are a new class of large language models (LLMs) that spend more time on "thinking" and "reflecting" to work through complex problems, while describing the steps required to solve a task. The fundamental principle is that any problem can be solved with deep thought, reasoning and time, just like how humans tackle problems. By spending more time—and thus compute—on a problem, the LLM can yield better results. This phenomenon is known as test-time scaling, where a model dynamically allocates compute resources during inference to reason through problems. Reasoning models can enhance user experiences on PCs by deeply understanding a user's needs, taking actions on their behalf and allowing them to provide feedback on the model's thought process—unlocking agentic workflows for solving complex, multi-step tasks such as analyzing market research, performing complicated math problems, debugging code and more.

DeepSeek-R1 Goes Live on NVIDIA NIM

DeepSeek-R1 is an open model with state-of-the-art reasoning capabilities. Instead of offering direct responses, reasoning models like DeepSeek-R1 perform multiple inference passes over a query, conducting chain-of-thought, consensus and search methods to generate the best answer. Performing this sequence of inference passes—using reason to arrive at the best answer—is known as test-time scaling. DeepSeek-R1 is a perfect example of this scaling law, demonstrating why accelerated computing is critical for the demands of agentic AI inference.

As models are allowed to iteratively "think" through the problem, they create more output tokens and longer generation cycles, so model quality continues to scale. Significant test-time compute is critical to enable both real-time inference and higher-quality responses from reasoning models like DeepSeek-R1, requiring larger inference deployments. R1 delivers leading accuracy for tasks demanding logical inference, reasoning, math, coding and language understanding while also delivering high inference efficiency.

KIOXIA Releases AiSAQ as Open-Source Software to Reduce DRAM Needs in AI Systems

Kioxia Corporation, a world leader in memory solutions, today announced the open-source release of its new All-in-Storage ANNS with Product Quantization (AiSAQ) technology. A novel "approximate nearest neighbor" search (ANNS) algorithm optimized for SSDs, KIOXIA AiSAQ software delivers scalable performance for retrieval-augmented generation (RAG) without placing index data in DRAM - and instead searching directly on SSDs.

Generative AI systems demand significant compute, memory and storage resources. While they have the potential to drive transformative breakthroughs across various industries, their deployment often comes with high costs. RAG is a critical phase of AI that refines large language models (LLMs) with data specific to the company or application.

NVIDIA Outlines Cost Benefits of Inference Platform

Businesses across every industry are rolling out AI services this year. For Microsoft, Oracle, Perplexity, Snap and hundreds of other leading companies, using the NVIDIA AI inference platform—a full stack comprising world-class silicon, systems and software—is the key to delivering high-throughput and low-latency inference and enabling great user experiences while lowering cost. NVIDIA's advancements in inference software optimization and the NVIDIA Hopper platform are helping industries serve the latest generative AI models, delivering excellent user experiences while optimizing total cost of ownership. The Hopper platform also helps deliver up to 15x more energy efficiency for inference workloads compared to previous generations.

AI inference is notoriously difficult, as it requires many steps to strike the right balance between throughput and user experience. But the underlying goal is simple: generate more tokens at a lower cost. Tokens represent words in a large language model (LLM) system—and with AI inference services typically charging for every million tokens generated, this goal offers the most visible return on AI investments and energy used per task. Full-stack software optimization offers the key to improving AI inference performance and achieving this goal.

NVIDIA AI Helps Fight Against Fraud Across Many Sectors

Companies and organizations are increasingly using AI to protect their customers and thwart the efforts of fraudsters around the world. Voice security company Hiya found that 550 million scam calls were placed per week in 2023, with INTERPOL estimating that scammers stole $1 trillion from victims that same year. In the U.S., one of four noncontact-list calls were flagged as suspected spam, with fraudsters often luring people into Venmo-related or extended warranty scams.

Traditional methods of fraud detection include rules-based systems, statistical modeling and manual reviews. These methods have struggled to scale to the growing volume of fraud in the digital era without sacrificing speed and accuracy. For instance, rules-based systems often have high false-positive rates, statistical modeling can be time-consuming and resource-intensive, and manual reviews can't scale rapidly enough.

Seagate Anticipates Cloud Storage Growth due to AI-Driven Data Creation

According to a recent, global Recon Analytics survey commissioned by Seagate Technology, business leaders from across 15 industry sectors and 10 countries expect that adoption of artificial intelligence (AI) applications will generate unprecedented volumes of data, driving a boom in demand for data storage, in particular cloud-based storage. With hard drives delivering scalability relative to terabyte-per-dollar cost efficiencies, cloud service providers rely on hard drives to store mass quantities of data.

Recently, analyst firm IDC estimated that 89% of data stored by leading cloud service providers is stored on hard drives. Now, according to this Recon Analytics study, nearly two-thirds of respondents (61%) from companies that use cloud as their leading storage medium expect their cloud-based storage to grow by more than 100% over the next 3 years. "The survey results generally point to a coming surge in demand for data storage, with hard drives emerging as the clear winner," remarked Roger Entner, founder and lead analyst of Recon Analytics. "When you consider that the business leaders we surveyed intend to store more and more of this AI-driven data in the cloud, it appears that cloud services are well-positioned to ride a second growth wave."

NVIDIA NeMo AI Guardrails Upgraded with Latest NIM Microservices

AI agents are poised to transform productivity for the world's billion knowledge workers with "knowledge robots" that can accomplish a variety of tasks. To develop AI agents, enterprises need to address critical concerns like trust, safety, security and compliance. New NVIDIA NIM microservices for AI guardrails—part of the NVIDIA NeMo Guardrails collection of software tools—are portable, optimized inference microservices that help companies improve the safety, precision and scalability of their generative AI applications.

Central to the orchestration of the microservices is NeMo Guardrails, part of the NVIDIA NeMo platform for curating, customizing and guardrailing AI. NeMo Guardrails helps developers integrate and manage AI guardrails in large language model (LLM) applications. Industry leaders Amdocs, Cerence AI and Lowe's are among those using NeMo Guardrails to safeguard AI applications. Developers can use the NIM microservices to build more secure, trustworthy AI agents that provide safe, appropriate responses within context-specific guidelines and are bolstered against jailbreak attempts. Deployed in customer service across industries like automotive, finance, healthcare, manufacturing and retail, the agents can boost customer satisfaction and trust.

Aetina & Qualcomm Collaborate on Flagship MegaEdge AIP-FR68 Edge AI Solution

Aetina, a leading provider of edge AI solutions and a subsidiary of Innodisk Group, today announced a collaboration with Qualcomm Technologies, Inc., who unveiled a revolutionary Qualcomm AI On-Prem Appliance Solution and Qualcomm AI Inference Suite for On-Prem. This collaboration combines Qualcomm Technologies' cutting-edge inference accelerators and advanced software with Aetina's edge computing hardware to deliver unprecedented computing power and ready-to-use AI applications for enterprises and industrial organizations.

The flagship offering, the Aetina MegaEdge AIP-FR68, sets a new industry benchmark by integrating Qualcomm Cloud AI family of accelerator cards. Each Cloud AI 100 Ultra card delivers an impressive 870 TOPS of AI computing power at 8-bit integer (INT8) while maintaining remarkable energy efficiency at just 150 W power consumption. The system supports dual Cloud AI 100 Ultra cards in a single desktop workstation. This groundbreaking combination of power and efficiency in a compact form factor revolutionizes on-premises AI processing, making enterprise-grade computing more accessible than ever.

Supermicro Begins Volume Shipments of Max-Performance Servers Optimized for AI, HPC, Virtualization, and Edge Workloads

Supermicro, Inc. a Total IT Solution Provider for AI/ML, HPC, Cloud, Storage, and 5G/Edge is commencing shipments of max-performance servers featuring Intel Xeon 6900 series processors with P-cores. The new systems feature a range of new and upgraded technologies with new architectures optimized for the most demanding high-performance workloads including large-scale AI, cluster-scale HPC, and environments where a maximum number of GPUs are needed, such as collaborative design and media distribution.

"The systems now shipping in volume promise to unlock new capabilities and levels of performance for our customers around the world, featuring low latency, maximum I/O expansion providing high throughput with 256 performance cores per system, 12 memory channels per CPU with MRDIMM support, and high performance EDSFF storage options," said Charles Liang, president and CEO of Supermicro. "We are able to ship our complete range of servers with these new application-optimized technologies thanks to our Server Building Block Solutions design methodology. With our global capacity to ship solutions at any scale, and in-house developed liquid cooling solutions providing unrivaled cooling efficiency, Supermicro is leading the industry into a new era of maximum performance computing."

UGREEN Shows Off High-End, AI Capable NAS Devices at CES 2025

At CES this year, UGREEN was showing off two new NAS models, the NASync iDX6011 and the iDX6011 Pro, with the i in the model name seemingly denoting that both models are using Intel Core Ultra processors. The basic design builds on last years NASync models and the UGOS Pro operating system, but with several added features that may or may not appeal to the target audience. The common feature set between the two models is six 3.5-inch drive bays and a pair of M.2 slots that can be used for either storage or as a cache for the mechanical drives. Both models are expected to ship with 32 GB of RAM as standard and can be expanded to 64 GB and both of them also supports a PCIe 4.0 x8 expansion card slot, although at this point it's not clear what that slot can be used for. As with the already launched NASync models, the two new SKUs will come with the OS installed on a 128 GB SSD.

Where things get interesting is on the connectivity side of things, as both models sports dual 10 Gbps Ethernet ports, what is said to be an 8K capable HDMI port, a pair of USB 3.2 (10 Gbps) ports, two USB 2.0 ports and an SD 4.0 card slot. However, the NASync iDX6011 comes with a pair of Thunderbolt 4 ports around the front, although it's not clear if it can be used as a DAS using these ports, or if they simply act as virtual network ports. The iDX6011 Pro on the other hand, sports two USB Type-C ports around the front—as well as a small status LCD display—in favour of an OCuLink port around the back. The OCuLink port is capable of up to 64 Gbps of bandwidth, compared to 40 Gbps for the Thunderbolt 4 ports. It's currently not know what the OCuLink port can be used for, but it's more or less an external PCIe interface. It's also unknown what type of AI or LLM features the two new NASync devices will support, but it's clear they'll rely on the capabilities of the Intel processors they're built around. No pricing was announced at CES and the NASync iDX6011 is expected to launch sometime in the second quarter this year, with the NASync iDX6011 Pro launching in the third quarter. We should also note that the NASync iDX6011 Pro wasn't on display at CES, hence the renders below.

UnifyDrive is Redefining AI-Driven Data Storage at CES 2025

UnifyDrive's participation at CES 2025 marked a pivotal moment in the evolution of portable data storage, as the company unveiled the UP6, the world's first AI-equipped portable tablet NAS. Met with overwhelming acclaim, attendees praised the UP6 as a leap forward in portable data storage, combining intuitive file organization, performance, and Artificial Intelligence to meet the demands of creators, businesses, and modern consumers.

Powered by Intel Core Ultra Processor and integrated with a Large Language Model (LLM), the UP6 has reshaped how users interact with their data. The device's ability to enable natural language searches, retrieve local data, and restore and enhance images resonated strongly with attendees. Many praised its potential to revolutionize productivity workflows, with one industry analyst describing it as "a fundamental change in smart storage solutions."

UGREEN Showcases Pioneering NASync AI NAS Lineup and More at CES 2025

UGREEN, a leading innovator in consumer electronics, is due to showcase its latest innovations at CES 2025 under the theme of "Activate the Possibility of AI." The highlight of the event will be the unveiling of the highly anticipated NASync iDX6011 and NASync iDX6011 Pro devices, which are from the cutting-edge AI NAS lineup of the NASync series. Alongside these groundbreaking products, the Nexode 500 W 6-Port GaN Desktop Fast Charger and the Revodok Max 2131 Thunderbolt 5 Docking Station will also take center stage.

The NASync series AI NAS models are set to redefine expectations with integrated large language models (LLMs) for advanced natural language processing and AI-driven interactive capabilities. Powered by cutting-edge Intel Core Ultra Processors, the iDX6011 and iDX6011 Pro deliver unmatched performance, enabling seamless functionality and exceptional AI applications. These models build on the success of earlier NASync series products, such as the NASync DXP models, which garnered widespread attention and raised over $6.6 million during a Kickstarter campaign in March 2024.

Gigabyte Unveils a Diverse Lineup of AI PCs With Groundbreaking GiMATE AI Agent at CES 2025

GIGABYTE, the world's leading computer brand, unveiled its next-gen AI PCs at CES 2025. GiMATE, a groundbreaking AI agent for seamless hardware and software control, takes center stage in the all-new lineup, redefining gaming, creation, and productivity in the AI era. Powered by NVIDIA GeForce RTX 50 Series Laptop GPUs and NVIDIA NIM microservices for advanced AI NIM and RTX AI, AMD Ryzen AI, Intel NPU AI, and enhanced by Microsoft Copilot, the AORUS MASTER, GIGABYTE AERO, and the GIGABYTE GAMING series deliver cutting-edge performance with upgraded WINDFORCE cooling in sleek, portable designs.

GiMATE, GIGABYTE's exclusive AI agent, integrates with an advanced Large Language Model (LLM) and the "Press and Speak" feature, making laptop control more natural and intuitive. From AI Power Gear II for optimal energy efficiency to AI Boost II's precision overclocking, GiMATE ensures optimal settings for every scenario. AI Cooling delivers 0dB ambiance, perfect for work environments, while AI Audio and AI Voice optimize sound for any setting. Safeguard your screen with AI Privacy, which detects prying eyes and activates protection instantly. GiMATE aims to be users' smart AI Mate, that redefines laptops in users' daily lives.

Gigabyte Demonstrates Omni-AI Capabilities at CES 2025

GIGABYTE Technology, internationally renowned for its R&D capabilities and a leading innovator in server and data center solutions, continues to lead technological innovation during this critical period of AI and computing advancement. With its comprehensive AI product portfolio, GIGABYTE will showcase its complete range of AI computing solutions at CES 2025, from data center infrastructure to IoT applications and personal computing, demonstrating how its extensive product line enables digital transformation across all sectors in this AI-driven era.

Powering AI from the Cloud
With AI Large Language Models (LLMs) now routinely featuring parameters in the hundreds of billions to trillions, robust training environments (data centers) have become a critical requirement in the AI race. GIGABYTE offers three distinctive solutions for AI infrastructure.

Qualcomm Launches On-Prem AI Appliance Solution and Inference Suite at CES 2025

At CES 2025, Qualcomm Technologies, Inc., today announced Qualcomm AI On-Prem Appliance Solution, an on-premises desktop or wall-mounted hardware solution, and Qualcomm AI Inference Suite, a set of software and services for AI inferencing spanning from near-edge to cloud. The combination of these new offerings allows for small and medium businesses, enterprises and industrial organizations to run custom and off-the-shelf AI applications on their premises, including generative workloads. Running AI inference on premises can deliver significant savings in operational costs and overall total cost of ownership (TCO), compared to the cost of renting third-party AI infrastructure.

Using the AI On-Prem Appliance Solution in concert with the AI Inference Suite, customers can now use generative AI leveraging their proprietary data, fine-tuned models, and technology infrastructure to automate human and machine processes and applications in virtually any end environment, such as retail stores, quick service restaurants, shopping outlets, dealerships, hospitals, factories and shop floors - where the workflow is well established, repeatable and ready for automation.

Axelera AI Partners with Arduino for Edge AI Solutions

Axelera AI - a leading edge-inference company - and Arduino, the global leader in open-source hardware and software, today announced a strategic partnership to make high-performance AI at the edge more accessible than ever, building advanced technology solutions based on inference and an open ecosystem. This furthers Axelera AI's strategy to democratize artificial intelligence everywhere.

The collaboration will combine the strengths of Axelera AI's Metis AI Platform with the powerful SOMs from the Arduino Pro range to provide customers with easy-to-use hardware and software to innovate around AI. Users will enjoy the freedom to dictate their own AI journey, thanks to tools that provide unique digital in-memory computing and RISC-V controlled dataflow technology, delivering high performance and usability at a fraction of the cost and power of other solutions available today.

NVIDIA Unveils New Jetson Orin Nano Super Developer Kit

NVIDIA is taking the wraps off a new compact generative AI supercomputer, offering increased performance at a lower price with a software upgrade. The new NVIDIA Jetson Orin Nano Super Developer Kit, which fits in the palm of a hand, provides everyone from commercial AI developers to hobbyists and students, gains in generative AI capabilities and performance. And the price is now $249, down from $499.

Available today, it delivers as much as a 1.7x leap in generative AI inference performance, a 70% increase in performance to 67 INT8 TOPS, and a 50% increase in memory bandwidth to 102 GB/s compared with its predecessor. Whether creating LLM chatbots based on retrieval-augmented generation, building a visual AI agent, or deploying AI-based robots, the Jetson Orin Nano Super is an ideal solution to fetch.

Advantech Introduces Its GPU Server SKY-602E3 With NVIDIA H200 NVL

Advantech, a leading global provider of industrial edge AI solutions, is excited to introduce its GPU server SKY-602E3 equipped with the NVIDIA H200 NVL platform. This powerful combination is set to accelerate the offline LLM for manufacturing, providing unprecedented levels of performance and efficiency. The NVIDIA H200 NVL, requiring 600 W passive cooling, is fully supported by the compact and efficient SKY-602E3 GPU server, making it an ideal solution for demanding edge AI applications.

Core of Factory LLM Deployment: AI Vision
The SKY-602E3 GPU server excels in supporting large language models (LLMs) for AI inference and training. It features four PCIe 5.0 x16 slots, delivering high bandwidth for intensive tasks, and four PCIe 5.0 x8 slots, providing enhanced flexibility for GPU and frame grabber card expansion. The half-width design of the SKY-602E3 makes it an excellent choice for workstation environments. Additionally, the server can be equipped with the NVIDIA H200 NVL platform, which offers 1.7x more performance than the NVIDIA H100 NVL, freeing up additional PCIe slots for other expansion needs.

Amazon AWS Announces General Availability of Trainium2 Instances, Reveals Details of Next Gen Trainium3 Chip

At AWS re:Invent, Amazon Web Services, Inc. (AWS), an Amazon.com, Inc. company, today announced the general availability of AWS Trainium2-powered Amazon Elastic Compute Cloud (Amazon EC2) instances, introduced new Trn2 UltraServers, enabling customers to train and deploy today's latest AI models as well as future large language models (LLM) and foundation models (FM) with exceptional levels of performance and cost efficiency, and unveiled next-generation Trainium3 chips.

"Trainium2 is purpose built to support the largest, most cutting-edge generative AI workloads, for both training and inference, and to deliver the best price performance on AWS," said David Brown, vice president of Compute and Networking at AWS. "With models approaching trillions of parameters, we understand customers also need a novel approach to train and run these massive workloads. New Trn2 UltraServers offer the fastest training and inference performance on AWS and help organizations of all sizes to train and deploy the world's largest models faster and at a lower cost."

Microsoft Office Tools Reportedly Collect Data for AI Training, Requiring Manual Opt-Out

Microsoft's Office suite is the staple in productivity tools, with millions of users entering sensitive personal and company data into Excel and Word. According to @nixCraft, an author from Cyberciti.biz, Microsoft left its "Connected Experiences" feature enabled by default, reportedly using user-generated content to train the company's AI models. This feature is enabled by default, meaning data from Word and Excel files may be used in AI development unless users manually opt-out. As a default option, this setting raises security concerns, especially from businesses and government workers relying on Microsoft Office for proprietary work. The feature allows documents such as articles, government data, and other confidential files to be included in AI training, creating ethical and legal challenges regarding consent and intellectual property.

Disabling the feature requires going to: File > Options > Trust Center > Trust Center Settings > Privacy Options > Privacy Settings > Optional Connected Experiences, and unchecking the box. Even with an unnecessary long opt-out steps, the European Union's GPDR agreement, which Microsoft complies with, requires all settings to be opt-in rather than opt-out by default. This directly contradicts EU GDPR laws, which could prompt an investigation from the EU. Microsoft has yet to confirm whether user content is actively being used to train its AI models. However, its Services Agreement includes a clause granting the company a "worldwide and royalty-free intellectual property license" to use user-generated content for purposes such as improving Microsoft products. The controversy raised from this is not new, especially where more companies leverage user data for AI development, often without explicit consent.

Aetina Debuts at SC24 With NVIDIA MGX Server for Enterprise Edge AI

Aetina, a subsidiary of the Innodisk Group and an expert in edge AI solutions, is pleased to announce its debut at Supercomputing (SC24) in Atlanta, Georgia, showcasing the innovative SuperEdge NVIDIA MGX short-depth edge AI server, AEX-2UA1. By integrating an enterprise-class on-premises large language model (LLM) with the advanced retrieval-augmented generation (RAG) technique, Aetina NVIDIA MGX short-depth server demonstrates exceptional enterprise edge AI performance, setting a new benchmark in Edge AI innovation. The server is powered by the latest Intel Xeon 6 processor and dual high-end double-width NVIDIA GPUs, delivering ultimate AI computing power in a compact 2U form factor, accelerating Gen AI at the edge.

The SuperEdge NVIDIA MGX server expands Aetina's product portfolio from specialized edge devices to comprehensive AI server solutions, propelling a key milestone in Innodisk Group's AI roadmap, from sensors and storage to AI software, computing platforms, and now AI edge servers.
Return to Keyword Browsing
Feb 21st, 2025 07:01 EST change timezone

New Forum Posts

Popular Reviews

Controversial News Posts