Groq, an AI chip company founded by two former Google employees, has drawn attention for its distinctive approach: LPU chips capable of extremely high processing speeds.
In February 2024, Groq CEO Jonathan Ross held a meeting in Oslo, Norway, attended by members of the Norwegian parliament and numerous technology executives. During the event, he demonstrated a breakthrough product: an AI chatbot capable of answering questions almost instantly – faster than human reading speed.
However, the presentation did not go entirely as planned. The chatbot responded more slowly than expected, which worried Ross, since the system was running on Groq chips in a European data center and the demo was meant to showcase their speed.
“I kept checking the numbers,” Ross told Forbes. “People didn’t understand why I was so distracted.”
The issue was later identified: a sudden surge of new users. A day before the demonstration, a developer had unexpectedly posted on X about “a super-fast AI chatbot,” sending a flood of traffic to Groq’s servers and temporarily overloading the system.

(Groq CEO Jonathan Ross. Photo: Groq)
According to CNBC, Nvidia is in the process of acquiring assets from AI chip startup Groq in a deal valued at approximately $20 billion. As announced on Groq’s blog, the agreement is a non-exclusive license, meaning Nvidia would gain the right to use Groq’s technology rather than taking over the entire company. Groq would continue operating independently under a new CEO.
Meanwhile, Ross, along with President Sunny Madra and other senior executives, would join Nvidia to help develop and scale the licensed technology.
The Power of Groq Chips
When videos showing Groq’s chatbot answering questions at extraordinary speeds began circulating on social media in early 2024, the company was seen as a potential challenger to industry giants such as Nvidia, AMD, and Intel. Analysts suggested Groq could open the door to new AI applications and accelerate the broader AI race.
According to Reuters, while Nvidia adapts GPU architectures originally designed for graphics processing to AI training, Groq has taken a different path, building a fully custom architecture called the Language Processing Unit (LPU). The LPU is optimized for deterministic, single-token inference. Each LPU reportedly costs around $20,000, similar in price to Nvidia’s A100 GPU.
One key metric for evaluating AI hardware is tokens per second (TPS) when running large language models. Higher TPS generally translates into faster and smoother AI responses.
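As a rough illustration of how the metric is computed, the sketch below streams tokens from a simulated endpoint and divides the count by wall-clock time. The `fake_stream` function and its fixed per-token delay are hypothetical stand-ins for a real streaming LLM service, not measurements of any vendor’s hardware.

```python
# Minimal sketch of a TPS measurement, assuming a streaming LLM endpoint.
# `fake_stream` and its 4 ms per-token delay are illustrative stand-ins;
# they do not model any real chip.
import time

def fake_stream(n_tokens: int, delay_s: float):
    """Yield placeholder tokens at a fixed interval, mimicking a streamed reply."""
    for i in range(n_tokens):
        time.sleep(delay_s)
        yield f"tok{i}"

start = time.perf_counter()
count = sum(1 for _ in fake_stream(n_tokens=100, delay_s=0.004))
elapsed = time.perf_counter() - start

print(f"{count} tokens in {elapsed:.2f} s -> {count / elapsed:.0f} TPS")
```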
In May, Indian technology expert Prathisht Aiyappa published an analysis on Medium comparing Groq LPUs with Nvidia’s H100 GPUs. He noted that Groq chips operate in a deterministic, token-by-token manner, while Nvidia GPUs rely on probabilistic, batch-based processing. According to his analysis, Groq delivers approximately 300–500 TPS with latency as low as 1–2 milliseconds, while Nvidia chips typically achieve 60–100 TPS with 8–10 milliseconds of latency.
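To put those figures in user-facing terms, total response time is roughly first-token latency plus generation time (tokens divided by TPS). The back-of-the-envelope sketch below uses the midpoints of Aiyappa’s ranges and an assumed 200-token reply; both choices are illustrative, not benchmarks.

```python
# Back-of-the-envelope response-time comparison from the figures above.
# The 200-token reply length and the range midpoints are assumptions
# for illustration, not measured benchmarks.
REPLY_TOKENS = 200

def response_time_s(tps: float, latency_ms: float, tokens: int = REPLY_TOKENS) -> float:
    """Approximate total time: first-token latency + tokens / throughput."""
    return latency_ms / 1000 + tokens / tps

print(f"Groq LPU   (~400 TPS, 1.5 ms): {response_time_s(400, 1.5):.2f} s")  # ~0.50 s
print(f"Nvidia GPU (~80 TPS, 9 ms):    {response_time_s(80, 9.0):.2f} s")   # ~2.51 s
```

On these assumptions, a reply that takes roughly half a second on a Groq LPU takes about two and a half seconds on a GPU, which is the gap Aiyappa’s quote below refers to.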
Groq’s own tests showed that when running LLaMA 2 with 70 billion parameters, token generation speed reached 241 TPS – more than double that of many competing services.
“When latency becomes the bottleneck affecting user experience, Groq’s advantage is maximized,” Aiyappa wrote.
While Nvidia GPUs remain the industry standard for both training and inference, thanks to their parallel architecture and a software ecosystem that includes CUDA and TensorRT, Groq’s LPUs excel specifically at text-sequence inference. For model training, companies still rely on Nvidia GPUs or similar hardware.

(A Groq LPU chip. Photo: Groq)
According to Aiyappa, Groq is not trying to “win AI” outright but is instead targeting a narrow yet high-value market segment where Nvidia has been less effective: ultra-low latency, determinism, and developer control. Rather than addressing every possible use case, Groq focuses on delivering the fastest and most precise inference performance.
In the second quarter of 2024, Groq quietly signed contracts with a range of organizations, from startups to national institutions, in areas such as real-time transcription, industrial robotics, defense-grade edge AI, and healthcare. Its clients reportedly include Meta, Argonne National Laboratory (for nuclear research), and Aramco Digital.
“The ability to tightly integrate the compiler with Groq hardware allows us to deliver speeds we couldn’t achieve using Nvidia or AWS solutions,” the CTO of a US-based AI company told CNBC.
“Groq chips really strike at a critical weakness,” Yann LeCun remarked last year.
The Founder’s Vision
Jonathan Ross is known for maintaining a low public profile. According to his LinkedIn page, he studied mathematics and computer science at the Courant Institute of Mathematical Sciences at New York University and is listed as a former student of Yann LeCun, one of the pioneers of modern AI.
Ross began a PhD program at NYU in 2006 but left in 2008. He later joined Google in 2011, where he played a key role in early systems that laid the foundation for Google’s Tensor Processing Unit (TPU). From 2013 to 2015, he worked on the design and deployment of first-generation TPUs before joining Google X.
In 2016, Ross co-founded Groq with Doug Wightman, aiming to build chips optimized specifically for AI inference. Although the company struggled for years and faced repeated near-failures, it eventually gained traction. By 2021, Groq had raised $300 million and achieved unicorn status.
Despite posting just $3.4 million in revenue and an $88.3 million loss in 2023, Groq benefited from the explosion in AI demand. The company raised $640 million in August 2024 and another $750 million in September 2025, reaching a valuation of $6.9 billion.
Now, Groq’s journey appears to be entering a new phase, as its core technology and key personnel become integrated into Nvidia through the reported deal – pending regulatory approval.
“There’s always room for innovative companies,” Ross said when named to the TIME 100 AI 2024 list. “Demand is enormous. As AI chips get cheaper, people will buy more.”
VnExpress (Summary)