JFS6.2 | Cost-effective LLM accelerator using processing in memory technology (Invited)

Event Time

Originally Aired - Thursday, June 20 10:20 AM - 10:45 AM

Event Location

Location: Tapa 3


Event Information

Title: JFS6.2 | Cost-effective LLM accelerator using processing in memory technology (Invited)

Description:


Authors:
Hyungdeok Lee¹, Guhyun Kim¹, Dayeon Yun¹, Ilkon Kim¹, Yongkee Kwon¹, Euicheol Lim¹
¹SK hynix

As large language model (LLM)-based services continue to improve in performance, they require systems with both large memory capacity and high memory bandwidth. The GPT-3 175-billion-parameter model requires at least 800 GB of storage just to operate. In addition, frequent memory access and limited data reuse put pressure on memory bandwidth. Meeting these more demanding memory requirements, however, comes with a significant cost increase: the expenses of operating the equipment and services needed to handle these capacity and bandwidth requirements are considerable.
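The capacity and data-reuse points above can be illustrated with some back-of-the-envelope arithmetic. The sketch below is not taken from the talk: the helper names, the choice of 2-byte (FP16) weights, and the 12288 x 12288 layer shape are assumptions made for illustration only, and it does not attempt to reproduce the storage figure quoted in the abstract. It estimates the footprint of the weights alone and the arithmetic intensity (FLOPs per byte moved) of one fully connected layer at different batch sizes.

```python
def weight_footprint_gb(num_params: float, bytes_per_param: int) -> float:
    """Memory needed for the model weights alone, in GB (10^9 bytes)."""
    return num_params * bytes_per_param / 1e9


def arithmetic_intensity(batch_size: int, rows: int, cols: int,
                         bytes_per_param: int) -> float:
    """FLOPs per byte for y = W x applied to `batch_size` input vectors.

    Assumes each weight is read from memory once per batch, and that
    inputs and outputs are read or written once each.
    """
    flops = 2 * batch_size * rows * cols               # one multiply + one add per weight per input
    weight_bytes = rows * cols * bytes_per_param       # dominant term at small batch sizes
    activation_bytes = batch_size * (rows + cols) * bytes_per_param
    return flops / (weight_bytes + activation_bytes)


if __name__ == "__main__":
    # Hypothetical setting: 175e9 parameters stored as 2-byte values.
    print(f"weights only: {weight_footprint_gb(175e9, 2):.0f} GB")

    # Hypothetical layer shape; intensity grows roughly linearly with batch size.
    for batch in (1, 8, 64):
        ai = arithmetic_intensity(batch, 12288, 12288, 2)
        print(f"batch {batch:>2}: {ai:.2f} FLOPs/byte")
```

At a batch size of 1, each weight byte fetched supports roughly one floating-point operation, so throughput is bounded by memory bandwidth rather than by compute. This is the regime in which performing the computation close to the memory banks is most attractive.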



SK hynix aims to solve this issue by introducing a processing-in-memory (PIM) device and a PIM-based accelerator, called AiM and AiMX, respectively. By exploiting true bank-level parallelism, AiM and AiMX are expected to enhance the performance of LLM-based services as a core component of disaggregated systems and of multi-head attention acceleration. Additionally, AiM has potential in on-device AI, in terms of both performance and energy consumption, given low batch sizes and reduced off-chip data movement.

Type: Joint Technical Session


Parent Sessions

Thursday, June 20, 2024 - 09:55 AM
JFS6 | Memory-Centric Computing for LLM