Polars has announced the release of its new GPU engine, powered by RAPIDS cuDF, which significantly enhances data processing speeds on NVIDIA GPUs. This advancement allows data scientists to process hundreds of millions of rows of data in seconds on a single machine, according to the NVIDIA Technical Blog.
Growing Data Challenges
Traditional data processing libraries such as pandas are single-threaded and often become impractical when handling datasets beyond a few million rows. While distributed data processing systems can manage billions of rows, they introduce complexity and overhead for smaller datasets. This presents a gap in tools that can efficiently process tens of millions to a few hundred million rows of data, a common need in industries such as finance, retail, and manufacturing for tasks like model development, demand forecasting, and logistics.
Polars, a rapidly growing Python library designed for data scientists and engineers, aims to address these challenges. It employs advanced query optimizations to minimize unnecessary data movement and processing, enabling smooth handling of hundreds of millions of rows on a single machine. Polars offers an appealing solution for medium-scale data processing, bridging the gap between single-threaded tools and complex distributed systems.
Bringing NVIDIA Accelerated Computing to Polars
Polars leverages multi-threaded execution, advanced memory optimizations, and lazy evaluation to deliver significant out-of-the-box acceleration compared to other CPU-only data manipulation tools. However, as data processing demands grow across various industries, higher performance is required. This is where accelerated computing becomes essential.
cuDF, part of the NVIDIA RAPIDS suite of CUDA-X libraries, is a GPU-accelerated DataFrame library that harnesses the massive parallelism of GPUs to significantly enhance data processing performance. By partnering with NVIDIA, the Polars team has integrated the speed of cuDF with Polars’ efficiency, resulting in performance boosts of up to 13x compared to CPU-based Polars. This integration allows users to maintain an interactive experience even as their data processing workloads scale to hundreds of millions or billions of rows.
The Polars GPU engine is built directly into the Polars Lazy API. Users can enable GPU acceleration by installing polars[gpu] via pip and passing engine="gpu" to the collect operation. Because the engine runs behind Polars’ query optimizer, this approach keeps execution efficient and memory usage low, remains fully compatible with Polars’ ecosystem of data visualization, I/O, and machine learning libraries, and requires no changes to existing query code beyond the engine argument.
    pip install polars[gpu] --extra-index-url=https://pypi.nvidia.com

    import polars as pl

    # `transactions` is assumed to be an existing pl.LazyFrame
    # with CUST_ID and AMOUNT columns.
    (transactions
     .group_by("CUST_ID")
     .agg(pl.col("AMOUNT").sum())
     .sort(by="AMOUNT", descending=True)
     .head()
     .collect(engine="gpu"))
Conclusion
The Polars GPU engine powered by RAPIDS cuDF is now available in open beta, offering data scientists and engineers a powerful tool for medium-scale data processing. By accelerating Polars workflows up to 13x on NVIDIA GPUs, the engine efficiently handles datasets of hundreds of millions of rows without the overhead of distributed systems. The Polars GPU engine is seamlessly integrated into the Polars API, making it easily accessible to all users.
Getting Started with the Polars GPU Engine
For more information and to get started with the Polars GPU engine, visit the official NVIDIA Technical Blog.