Researchers at Tsinghua University and Z.ai built IndexCache to eliminate redundant computation in sparse attention models ...
Tech Xplore on MSN
Can AI understand literature? Researchers put it to the test
Even with all the recent advances in the ability of large language models (like ChatGPT) to help us think, research, ...
Nvidia's Nemotron-Cascade 2 is a 30B MoE model that activates only 3B parameters at inference time, yet achieved gold ...
Have you ever found yourself deep in the weeds of training a language model, wishing for a simpler way to make sense of its learning process? If you’ve struggled with the complexity of configuring ...
OpenAI released a new base model on Thursday called GPT-4.5, which the company said is its best and smartest model for chat yet. It’s not a reasoning model like OpenAI’s o1 and o3 models, but it can ...
A Chinese AI company's more frugal approach to training large language models could point toward a less energy-intensive—and more climate-friendly—future for AI, according to some energy analysts. "It ...
Researchers have identified key components in large language models (LLMs) that play a critical role in ensuring these AI ...
The focus of artificial-intelligence spending has gone from training models to using them. Here’s how to understand the ...
Mark Stevenson has previously received funding from Google. The arrival of AI systems called large language models (LLMs), like OpenAI’s ChatGPT chatbot, has been heralded as the start of a new ...