
Acceleration of CUDA Kernels through Fusion Measurements for AvgPool2D and ReLU


Author :  Andreas Falkenberg

Affiliation :  Dr Falkenberg Technology Consulting Inc

Country :  USA

Category :  Artificial Intelligence

Volume, Issue, Month, Year :  15, 2, January, 2025

Abstract :


Accelerating LLMs (large language models) relies on continually advancing compiler technologies. Operator fusion is one of the promising techniques for considerably improving LLM throughput. This paper discusses the impact of operator fusion on the performance of the operators themselves. It compares the throughput of three implementations of AvgPool2D followed by ReLU: a pure CPU implementation, a two-kernel implementation, and a fused single-kernel implementation.
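
To make the idea of fusion concrete, the following is a minimal CPU-side sketch in Python, not the paper's CUDA code. It assumes non-overlapping pooling windows (stride equal to the window size `k`) and no padding, and all function names are hypothetical. The unfused version writes the pooled result to an intermediate buffer and then makes a second pass for ReLU, which mirrors the two-kernel structure; the fused version applies ReLU to each average before storing it, so no intermediate buffer is written or re-read.

```python
from typing import List

Matrix = List[List[float]]


def avg_pool2d(x: Matrix, k: int) -> Matrix:
    """Average pooling with k x k windows, stride k, no padding (assumed)."""
    rows, cols = len(x) // k, len(x[0]) // k
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            total = 0.0
            for di in range(k):
                for dj in range(k):
                    total += x[i * k + di][j * k + dj]
            out[i][j] = total / (k * k)
    return out


def relu(x: Matrix) -> Matrix:
    """Element-wise ReLU as a separate pass over the data."""
    return [[v if v > 0.0 else 0.0 for v in row] for row in x]


def unfused_avgpool_relu(x: Matrix, k: int) -> Matrix:
    """Two passes: the pooled intermediate is stored, then read again."""
    pooled = avg_pool2d(x, k)
    return relu(pooled)


def fused_avgpool_relu(x: Matrix, k: int) -> Matrix:
    """One pass: ReLU is applied to each average before it is stored."""
    rows, cols = len(x) // k, len(x[0]) // k
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            total = 0.0
            for di in range(k):
                for dj in range(k):
                    total += x[i * k + di][j * k + dj]
            avg = total / (k * k)
            out[i][j] = avg if avg > 0.0 else 0.0
    return out
```

Both functions return the same values; the difference lies only in how many times the data is traversed and whether an intermediate result is stored. On a GPU, the analogous saving would be one kernel launch and one round trip through global memory, though the actual gain depends on the hardware and problem size.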

Keyword :  AvgPool2D, ReLU, Kernel, AI, LLM, GPU, CPU

Journal/ Proceedings Name :  CS & IT

URL :  https://csitcp.org/abstract/15/152csit13




Related Research Work

  • Augmented and Synthetic Data in Artificial Intelligence
  • NoHumansRequired: Autonomous High-Quality Image Editing Triplet Mining
  • CerberusDet: Unified Multi-Dataset Object Detection
  • GigaCheck: Detecting LLM-Generated Content
