
Benchmark results for Pandas, Polars and DuckDB


🚀 Benchmark of Python data analysis libraries

I compared the performance of Pandas 3.0, Polars, and DuckDB on both large (5 million rows) and small tables. The goal was to measure execution time and RAM usage for typical data analysis operations.
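
To make "execution time and RAM usage" concrete, here is a minimal sketch of a measurement helper using only the standard library. It is an illustration, not the harness used for this benchmark: the names `measure` and `build_squares` are hypothetical, and `tracemalloc` only sees allocations that go through Python's allocator, so memory allocated natively by Polars (Rust) or DuckDB (C++) may not show up. Tracing also adds overhead, so a real benchmark might time and trace in separate runs, or read the process memory from the operating system instead.

```python
import time
import tracemalloc
from typing import Any, Callable


def measure(operation: Callable[[], Any], repeats: int = 5) -> dict:
    """Run `operation` several times; report best wall time and peak traced memory."""
    timings = []
    peak_bytes = 0
    for _ in range(repeats):
        tracemalloc.start()
        start = time.perf_counter()
        operation()
        timings.append(time.perf_counter() - start)
        # get_traced_memory() returns (current, peak) in bytes
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        peak_bytes = max(peak_bytes, peak)
    return {"best_seconds": min(timings), "peak_mib": peak_bytes / 2**20}


def build_squares() -> list:
    # Stand-in workload; replace with a Pandas, Polars, or DuckDB operation.
    return [i * i for i in range(200_000)]


result = measure(build_squares)
print(f"best time: {result['best_seconds']:.4f} s, peak memory: {result['peak_mib']:.2f} MiB")
```

Taking the best of several runs reduces noise from other processes; reporting the median instead is an equally reasonable choice.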

📊 Key results

  • Polars was the fastest in almost all operations.
  • 🧠 Pandas remains the simplest to use and very competitive.
  • 🐤 DuckDB shines in SQL-like queries, but used more memory in complex transformations that write tables. It ran faster when using the Arrow format.

🧩 Explanation in brief

  • Pandas → The classic Python library for data analysis. Easy, familiar, but not the fastest. Simple to write, read, and debug.
  • Polars → A modern alternative written in Rust. Very fast and memory-efficient. CSV reading, groupbys, and joins usually see the biggest gains.
  • DuckDB → An embedded SQL engine that lets you work with data as if it were database tables. Useful when SQL is clearer than chaining DataFrame operations.

If you’re just starting, Pandas is the friendliest path. If you’re looking for raw speed, Polars is the one to try. If you prefer to express your analysis in SQL, go with DuckDB.

More information at the link 👇

Other tests to consider

More in the following external reference.
Also published on LinkedIn.
Author: Juan Pedro Bretti Mandarano