Python 3.11 micro-benchmark

speed.python.org tracks CPython performance across versions using a suite of benchmarks. In the real world, those module-level speedups don't directly translate into application-level improvements: an application is composed of hundreds of dependencies, and making one specific module faster rarely improves total application performance. Nonetheless, it can improve the performance of parts of an API or certain flows.

When I first heard about the faster CPython initiative, I was intrigued to find out how it translates to the performance of a small application across versions, since a lot of the critical components, like the PostgreSQL driver, are already written in C. The faster CPython presentation clearly states that the performance boost is only guaranteed for pure-Python code and not for C extensions.

In this post, I'll share my benchmark results for a few hand-picked snippets. Each one takes PyPI package data and performs some transformation, network operations, or file operations. How do they perform across different Python versions?

Setup

  • The benchmark was run on an Intel 11th Gen i7 @ 2.30GHz with 16 cores. No other user-initiated programs, such as a browser or text editor, were running during the benchmark.

  • The benchmark results were measured using the hyperfine command-line tool with the --warmup 1 flag and 10 runs for each version.

  • No CPU pinning during benchmark.

  • Python 3.9.13, Python 3.10.5, Python 3.11.0 versions were used for benchmarking.

  • The median of the 10 runs is reported rather than the mean; the sketch below shows how the numbers were derived from hyperfine's JSON output.
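
Since hyperfine exports each run's timings as JSON (see the runner script later in the post), the medians and the "3.11 Change" column can be computed from those files. Here is a minimal sketch of that calculation; the field names assume hyperfine's --export-json layout and the file names match the runner script.

import json
from statistics import median


def median_runtime(path):
    # hyperfine's --export-json output has a "results" list with the raw
    # per-run timings under "times"
    with open(path) as fp:
        result = json.load(fp)["results"][0]
    return median(result["times"])


def main():
    py39 = median_runtime("py_3_9.json")
    py310 = median_runtime("py_3_10.json")
    py311 = median_runtime("py_3_11.json")
    change = (py39 - py311) / py39 * 100  # positive means 3.11 is faster than 3.9
    print(f"3.9: {py39:.4f}s  3.10: {py310:.4f}s  3.11: {py311:.4f}s  "
          f"3.11 change: {change:.2f}%")


if __name__ == "__main__":
    main()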

Here are the results of the benchmark.

                           Python performance - 3.9 vs 3.10 vs 3.11
┏━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┓
┃ Name                     ┃ Median 3.9 (s) ┃ Median 3.10 (s) ┃ Median 3.11 (s) ┃ 3.11 Change ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━┩
│ pypicache                │ 7.4096         │ 7.2654          │ 6.9122          │ 6.71%       │
│ pypi_compression         │ 57.2634        │ 57.3878         │ 57.3969         │ -0.23%      │
│ pypi_postgres            │ 11.4657        │ 11.3525         │ 11.1345         │ 2.89%       │
│ pypi_sqlite_utils        │ 35.6113        │ 34.8789         │ 34.3522         │ 3.54%       │
│ pypi_write_file          │ 17.7075        │ 17.2318         │ 16.7363         │ 5.48%       │
│ pypi_write_file_parallel │ 12.7005        │ 13.0702         │ 12.5040         │ 4.33%       │
│ pypi_zstd_compression    │ 1.4794         │ 1.4687          │ 1.4643          │ 1.02%       │
└──────────────────────────┴────────────────┴─────────────────┴─────────────────┴─────────────┘

Experiments

PyPI Cache

import json
from operator import itemgetter
from urllib.parse import urlparse
from rich.console import Console
from rich.table import Table
from collections import defaultdict


def get_domain(addr):
    domain = urlparse(addr)
    return domain.netloc


def count_domains(data):
    result = defaultdict(int)
    for package in data:
        domain = get_domain(package['info']['home_page'])
        result[domain] += 1

    return result


def count_licenses(data):
    result = defaultdict(int)
    for package in data:
        classifiers = package['info'].get('classifiers', '')
        license = ''
        if classifiers:
            for classifier in classifiers:
                if 'License' in classifier:
                    license = classifier.split('::')[-1].strip()
                    result[license] += 1
    return result


def get_top(data, n=10):
    return sorted(data.items(), key=itemgetter(1), reverse=True)[:n]


def main():
    data = json.load(open('../pypicache/pypicache.json'))
    domains = count_domains(data)
    top_domains = get_top(domains, 10)
    licenses = count_licenses(data)
    top_licenses = get_top(licenses, 10)

    # Print result in a table
    table = Table(title="Project domains")
    table.add_column("Domain")
    table.add_column("Count")
    table.add_column("Percentage")

    for domain, count in top_domains:
        table.add_row(str(domain), str(count), str(count/len(data) * 100))

    console = Console()
    console.print(table)

    table = Table(title="Project licenses")
    table.add_column("License")
    table.add_column("Count")
    table.add_column("Percentage")


    for license, count in top_licenses:
        table.add_row(str(license), str(count), str(count/len(data) * 100))

    console.print(table)


if __name__ == "__main__":
    main()

The snippet loads the PyPI JSON file (720 MB) downloaded from https://pypicache.repology.org/pypicache.json.zst and does IO operations on it. Extract the zstd file and place it in the pypicache directory.
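
To reproduce the setup, the download can be decompressed with the zstandard package (the same library used in the zstd benchmark below); a minimal sketch, assuming the file layout used by the snippets:

import shutil
import zstandard as zstd


def main():
    # stream-decompress pypicache.json.zst into pypicache.json
    with zstd.open('../pypicache/pypicache.json.zst', 'rb') as src, \
            open('../pypicache/pypicache.json', 'wb') as dst:
        shutil.copyfileobj(src, dst)


if __name__ == "__main__":
    main()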

The snippet performs five activities:

  • Find home page domain frequencies across various packages.
  • Get top 10 home page domains.
  • Find license frequencies of the packages.
  • Get top 10 used licenses.
  • Print the results in the table.

Python 3.11 is faster than Python 3.9 by 6.71%. The median execution times: Python 3.9 - 7.40s, Python 3.10 - 7.26s, Python 3.11 - 6.91s.

PyPI Compression

import bz2
import json
import pathlib


def main():
    with open('../pypicache/pypicache.json', 'rb') as fp:
        filename = '../pypicache/pypicache.json.bz2'
        with bz2.open(filename, 'wb') as wfp:
            wfp.write(bz2.compress(fp.read()))

        pathlib.Path(filename).unlink()


if __name__ == "__main__":
    main()

The snippet compresses the decompressed PyPI JSON data to the bz2 format and then deletes the compressed file. (Note that the output of bz2.compress is written through a bz2.open writer, so the payload effectively gets compressed twice.)

Python 3.11 was the slowest here: performance degraded by 0.23% compared to 3.9. The median execution times: Python 3.9 - 57.26s, Python 3.10 - 57.38s, Python 3.11 - 57.39s.

The interesting part is that Python 3.9 is faster than Python 3.10, and Python 3.10 is faster than 3.11. (I have no hunch why that is.)

PyPI Postgres

import json
import psycopg2


def main():
    data = json.load(open('../pypicache/pypicache.json'))
    conn = psycopg2.connect("dbname=scratch user=admin password=admin")
    cur = conn.cursor()
    stop = 100000
    for idx, package in enumerate(data[:stop]):
        info = package['info']
        cur.execute("""insert into
        pypi(author, author_email, bugtrack_url, license, maintainer, maintainer_email, name, summary, version)
        values(%s, %s, %s, %s, %s, %s, %s, %s, %s)""",
                    (info['author'], info['author_email'],
                     info['bugtrack_url'], info['license'], info['maintainer'],
                     info['maintainer_email'], info['name'], info['summary'],
                     info['version']))
        if idx % 100 == 1 or idx == stop:
            conn.commit()
    conn.commit()
    cur.execute('select count(*) from pypi')
    print("Total rows: ", cur.fetchone())
    cur.execute('delete from pypi')
    conn.commit()


if __name__ == "__main__":
    main()

The snippet uses psycopg2 to insert a hundred thousand packages into a Postgres database (an assumed table schema is sketched after the list below).

  • Insert PyPI package details into the pypi table, committing 100 records at a time.
  • Count the total number of records inserted into the pypi table.
  • Delete all rows from the pypi table.
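
The snippet assumes the pypi table already exists. A minimal sketch of a schema matching the insert statement (the column types here are my assumption; text works for all of the fields pulled from the package info):

import psycopg2


def create_table():
    conn = psycopg2.connect("dbname=scratch user=admin password=admin")
    # the connection context manager wraps the statement in a transaction,
    # the cursor context manager closes the cursor afterwards
    with conn, conn.cursor() as cur:
        cur.execute("""create table if not exists pypi(
            author text, author_email text, bugtrack_url text,
            license text, maintainer text, maintainer_email text,
            name text, summary text, version text)""")


if __name__ == "__main__":
    create_table()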

Python 3.11 is faster than Python 3.9 by 2.89%. The median execution times: Python 3.9 - 11.46s, Python 3.10 - 11.35s, Python 3.11 - 11.13s.

Since most of the time is spent on network calls to the database, it's surprising to see even a small performance improvement in Python 3.11.

PyPI SQLite Utils

import json
from sqlite_utils.db import Database, Table
from pathlib import Path

def main():
    data = json.load(open('../pypicache/pypicache.json'))
    db_name = 'pypi.db'
    db = Database(db_name)
    table = Table(db, 'pypi')

    for idx in range(1000):
        table.insert_all(data[idx * 100:idx * 100 + 100])


    print("Rows: ", table.count)
    Path(db_name).unlink()

if __name__ == "__main__":
    main()

The snippet inserts a hundred thousand PyPI package records into a SQLite database over a thousand iterations (100 records per batch) using the sqlite_utils package, then deletes the SQLite file.
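
As an aside, sqlite_utils can handle the batching itself through insert_all's batch_size parameter. The explicit loop above is what was benchmarked; a rough equivalent would be:

import json
from pathlib import Path
from sqlite_utils.db import Database, Table


def main():
    data = json.load(open('../pypicache/pypicache.json'))
    table = Table(Database('pypi.db'), 'pypi')
    # insert_all splits the records into batches of batch_size rows,
    # which is what the explicit slicing loop above does by hand
    table.insert_all(data[:100000], batch_size=100)
    print("Rows: ", table.count)
    Path('pypi.db').unlink()


if __name__ == "__main__":
    main()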

Python 3.11 is faster than Python 3.9 by 3.54%. The median execution times: Python 3.9 - 35.61s, Python 3.10 - 34.87s, Python 3.11 - 34.35s.

PyPI Write To File

import json
from pathlib import Path


def write_to_file(directory, package):
    name = package['info']['name']

    with open(directory / (name + ".json"), "w") as fp:
        fp.write(json.dumps(package))


def delete_files(directory):
    for filename in list(directory.iterdir()):
        filename.unlink()

    directory.rmdir()


def main():
    data = json.load(open('../pypicache/pypicache.json'))
    directory = Path("/tmp/pypi")
    directory.mkdir()
    for package in data:
        write_to_file(directory=directory, package=package)
    delete_files(directory)


if __name__ == "__main__":
    main()

The snippet writes each PyPI package's info to a separate JSON file and then deletes all the files.

Python 3.11 is faster than Python 3.9 by 5.48%. The median execution times: Python 3.9 - 17.70s, Python 3.10 - 17.23s, Python 3.11 - 16.73s.

PyPI Parallel Write To File

import json
from pathlib import Path
from multiprocessing import Pool
from functools import partial


def write_to_file(directory, package):
    name = package['info']['name']

    with open(directory / (name + ".json"), "w") as fp:
        fp.write(json.dumps(package))


def delete_files(directory):
    for filename in list(directory.iterdir()):
        filename.unlink()

    directory.rmdir()


def main():
    data = json.load(open('../pypicache/pypicache.json'))
    directory = Path("/tmp/pypi")
    directory.mkdir()
    with Pool(33) as p:
        p.map(partial(write_to_file, directory), data)
    delete_files(directory)


if __name__ == "__main__":
    main()

The snippet uses Python multiprocessing, with 33 workers in the pool, to write each PyPI package to a separate JSON file, then deletes all the JSON files serially.

Python 3.11 is faster than Python 3.10 by 4.33%. The median execution times: Python 3.9 - 12.70s, Python 3.10 - 13.07s, Python 3.11 - 12.50s.

PyPI zstd Compression

import zstandard as zstd
import json
import pathlib


def main():
    with open('../pypicache/pypicache.json', 'rb') as fp:
        filename = '../pypicache/pypicache_benchmark.json.zstd'
        with zstd.open(filename, 'wb') as wfp:
            wfp.write(zstd.compress(fp.read()))

        pathlib.Path(filename).unlink()


if __name__ == "__main__":
    main()

The snippet compresses the PyPI JSON file to the zstd format using the zstandard library.

Python 3.11 is faster than Python 3.9 by 1.02%. In general, it's safe to say there is no meaningful performance improvement here. The median execution times: Python 3.9 - 1.47s, Python 3.10 - 1.46s, Python 3.11 - 1.46s.

Compared to bz2 compression, zstd was roughly 40 times faster in this benchmark.

Benchmark runner

The benchmark runner is similar for all experiments.

#!/usr/bin/env fish
ls -lat | grep venv | xargs rm -rf
echo "Running 3.9 benchmark"
python3.9 -m venv .venv_3_9
source .venv_3_9/bin/activate.fish
pip install -r requirements.txt
hyperfine --warmup 1 'python run_benchmark.py' --export-json py_3_9.json
echo "Running 3.10 benchmark"
python3.10 -m venv .venv_3_10
source .venv_3_10/bin/activate.fish
pip install -r requirements.txt
hyperfine --warmup 1 'python run_benchmark.py' --export-json py_3_10.json
echo "Running 3.11 benchmark"
python3.11 -m venv .venv_3_11
source .venv_3_11/bin/activate.fish
pip install -r requirements.txt
hyperfine --warmup 1 'python run_benchmark.py' --export-json py_3_11.json

  • python3.9, python3.10, and python3.11 are aliases to pyenv-installed Python versions.

Conclusion

  • If the code makes heavy use of the standard-library json module, switching to Python 3.11 can provide a significant performance improvement.
  • If you're using bz2 for compression, consider zstd, which was roughly 40 times faster here. (I wasn't aware of the zstd format until I downloaded the PyPI cache data.)
  • If you're using Python 3.8 or 3.9, it's better to upgrade to Python 3.11 rather than Python 3.10.
  • Even when the code relies heavily on a C extension like psycopg2, there is a small performance improvement. So it's worth benchmarking against 3.11 once all of your dependencies support it.
  • Since all of the programs load the same JSON data, part of the performance gain may simply come from faster JSON loading.
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
