Edinburgh University researchers develop software to make AI 10 times faster
Researchers at the University of Edinburgh have developed software that could allow artificial intelligence (AI) systems to operate 10 times faster.
The researchers developed a software system called WaferLLM, designed specifically to improve the performance of wafer-scale chips. Wafer-scale chips are the world's largest computer chips, each roughly the size of a dinner plate.
"Wafer-scale computing has shown remarkable potential, but software has been the key barrier to putting it to work,” said Dr Luo Mai, lead researcher and reader at the University of Edinburgh's School of Informatics.
The software lets trained large language models (LLMs) draw conclusions from fresh data – a process called inference – far more efficiently.
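In broad strokes, inference means generating output one token at a time, with the model consulting everything produced so far at each step. The toy sketch below illustrates only that loop; the model stand-in and function names are invented for illustration and bear no relation to WaferLLM's actual code.

```python
def toy_model(tokens):
    # Hypothetical stand-in for a trained LLM's next-token prediction.
    # A real model would run billions of matrix operations per step.
    return (sum(tokens) + 1) % 50  # dummy deterministic prediction

def generate(prompt_tokens, n_new_tokens):
    # Autoregressive inference: each new token is predicted from
    # everything generated so far, then appended to the sequence.
    tokens = list(prompt_tokens)
    for _ in range(n_new_tokens):
        tokens.append(toy_model(tokens))
    return tokens

print(generate([3, 7, 11], n_new_tokens=5))
```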
“With WaferLLM, we show that the right software design can unlock that potential, delivering real gains in speed and energy efficiency for large language models,” said Dr Mai. “This is a step toward a new generation of AI infrastructure – one that can support real-time intelligence in science, healthcare, education, and everyday life.”
The researchers evaluated the software at EPCC, the UK's National Supercomputing Centre, based at the University of Edinburgh. EPCC operates Europe's largest cluster of advanced Wafer Scale Engine processors and is also the future home of the UK's next supercomputer, funded by the UK Government to the tune of £750m.
“Dr Mai’s work is truly ground-breaking and shows how the cost of inference can be massively reduced,” said Professor Mark Parsons, the director of EPCC.
Wafer-scale chips differ from typical AI chips not only in size but also in how they operate. They are designed to carry out many computations simultaneously on a single chip, aided by massive on-chip memory.
With all the computation taking place on the same piece of silicon, data can move between different parts of the chip much faster than if it had to travel between separate groups of chips and memory via a network.
Because of this, a wafer-scale chip can integrate hundreds of thousands of computation cores all working in parallel, making it more efficient at completing the mathematical operations that power neural networks – the backbone of LLMs like ChatGPT.
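The mathematical operations in question are dominated by matrix multiplications, which divide naturally into independent blocks that many cores can compute at once. The sketch below is a loose illustration of that principle only, using ordinary Python threads to stand in for groups of on-chip cores; it is not how WaferLLM itself is implemented.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

A = np.random.rand(1024, 1024)
B = np.random.rand(1024, 1024)

def row_block(i, block=256):
    # Each worker computes one horizontal slice of the result,
    # standing in for one group of parallel on-chip cores.
    return A[i:i + block] @ B

# Compute the four slices in parallel, then stitch them together.
with ThreadPoolExecutor() as pool:
    blocks = list(pool.map(row_block, range(0, 1024, 256)))

C = np.vstack(blocks)
assert np.allclose(C, A @ B)  # same result as the single-shot product
```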
The accelerated performance could have a major impact on industries that need LLMs to generate fresh insights in real time – in under a millisecond – such as chatbots, finance, healthcare, and scientific discovery, the researchers say.