This guide provides practical recommendations for starting and structuring a Python project in a clean and sustainable way.
Python project recommendations
Setup and project structure
Project structure
A well-structured Python project generally follows this layout:
my_project/
├── src/
│ └── my_project/
│ ├── __init__.py
│ ├── some_package
│ │ ├── module_name.py
│ │ └── __init__.py
│ └── module.py
├── tests/
│ ├── __init__.py
│ └── test_module.py
├── docs/
│ ├── conf.py # Sphinx config
│ └── index.rst
├── notebooks/ # Jupyter notebooks
├── scripts/ # Utility scripts
├── data/ # useful test data
│ ├── raw/
│ └── processed/
├── pyproject.toml
├── README.md
├── LICENSE
└── .gitignore
As explained in packaging.python.org
the src/ layout is now the recommended approach because it prevents a classic pitfall:
without it, Python may import your package directly from the root instead of the installed version,
masking packaging errors. With src/, you must install the package to import it, ensuring your
tests run under the same conditions as your users.
Warning
All project data should be kept as small as possible. Large datasets can significantly slow down the CI pipeline, increase repository size, and make cloning the project more expensive. As a general recommendation, each individual file should remain below 10 MB. If larger datasets are required, consider alternative approaches such as downloading data during the CI pipeline, generating synthetic datasets, or storing them externally.
When versioning large files is unavoidable, tools such as Git LFS (Large File Storage) can be used to keep the repository lightweight while still tracking large assets.
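As a rough illustration of the 10 MB recommendation, the following sketch lists files that exceed a size limit. The function name find_large_files and the constant MAX_BYTES are assumptions made for this example, not part of any tool mentioned here.

```python
from pathlib import Path

MAX_BYTES = 10 * 1024 * 1024  # 10 MB, the limit suggested above


def find_large_files(root: Path, max_bytes: int = MAX_BYTES) -> list[Path]:
    """Return the files under `root` that are larger than `max_bytes`, sorted by path."""
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.stat().st_size > max_bytes
    )
```

Calling find_large_files(Path("data")) from the project root would then report the files worth moving to external storage or Git LFS.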
Files typically found at the root
- README.md: Project entry point: description, installation, quick usage, CI badges.
- LICENSE: Essential for any public project. MIT, Apache 2.0, GPL-2 and GPL-3 are the most common for open source. Without an explicit license, code is technically "all rights reserved".
- .gitignore: Specifies intentionally untracked files that Git should ignore.
- pyproject.toml: Modern unified configuration file.
- CHANGELOG.md: History of changes per version, ideally in Keep a Changelog format.
- CITATION.cff: A YAML file that tells others how to cite your project in academic work. It is recognized natively by GitHub (which displays a "Cite this repository" button) and by platforms like Zenodo or Zotero. Particularly important for research software.
packaging
Python dependency management has historically been built around tools like pip (with virtual environments) and conda,
which remain the most widely adopted solutions today. They are stable, well-documented, and form the foundation of many
existing projects and ecosystems.
More recently, a new generation of tools such as uv and pixi has emerged. These tools aim to simplify workflows,
improve performance, and provide more reproducible environments, often combining features that previously required multiple tools.
While uv and pixi can be seen as modern replacements or improvements in certain contexts, it is important to note that:
- pip / pyproject.toml remains the standard for packaging and distribution on PyPI
- conda remains essential for scientific computing and system-level dependencies
For this reason, most projects still rely on pip or conda for distribution and compatibility, while optionally using uv or pixi to improve the developer experience. In any case, built packages are still pushed and distributed through PyPI and Conda channels like conda-forge. uv and pixi use those distribution platforms and standards to build environments.
pip: pyproject.toml
Since PEP 517/518 and especially PEP 621,
pyproject.toml is the unified standard replacing setup.py and setup.cfg. It centralizes build system config, metadata,
and tooling (linters, formatters, etc.).
As a minimal example, a pyproject.toml file may contain:
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
[project]
name = "my_project"
version = "0.1.0"
description = "A short description"
readme = "README.md"
requires-python = ">=3.10"
license = { text = "MIT" }
authors = [{ name = "Your Name", email = "you@example.com" }]
dependencies = [
"requests>=2.28",
"pandas>=2.0",
]
[project.optional-dependencies]
dev = ["pytest", "ruff", "mypy"]
docs = ["sphinx", "furo"]
[project.scripts]
my-cli = "my_project.cli:main"
[project.urls]
Homepage = "https://github.com/you/my_project"
Details on every section are available in the writing-pyproject-toml guide.
Note
project.optional-dependencies are extra dependencies that are installed only on request, with commands such as
pip install my_project[dev]
Note
The tip "How to push a package on Pypi" can help you publish your first package on PyPI.
conda
Conda is particularly well suited for data science and machine learning projects because it can also manage non-Python dependencies (such as C libraries, CUDA, BLAS or GDAL).
There are two main ways to use Conda in a project: either by defining an environment file (commonly environment.yml),
or by describing the project and its dependencies in a Conda recipe in order to distribute it through a Conda channel (e.g., conda-forge).
environment file
An environment file defines a reproducible development environment for a project. It lists the required dependencies and their versions, allowing anyone to recreate the same environment easily.
The environment can then be created with:
conda env create -f environment.yml
as explained in www.anaconda.com/working-with-conda/environments
The environment.yml file can be generated with the command:
conda env export > environment.yml
This approach is commonly used for local development, research projects, or internal workflows, where the goal is simply
to ensure that collaborators can reproduce the same environment.
conda recipe
A Conda recipe describes how to build and package a project as a Conda package.
It specifies metadata, build instructions, and dependencies required to install and run the package.
Recipes are typically used when you want to distribute a project through a Conda channel, such as conda-forge, so that users can install it directly with:
conda install my-package
This approach is more suitable for publishing reusable packages that others can easily install through the Conda ecosystem. A step-by-step procedure is available at https://conda-forge.org/docs/maintainer/adding_pkgs/
Pixi
Pixi is a modern, cross-platform, multi-language package manager and workflow tool built on top of the pip/Conda ecosystem.
It simplifies environment and dependency management for Python projects and multi-language workflows, offering reproducible builds and a user-friendly CLI.
Pixi makes use of packages deployed on Conda channels or PyPI, but is much faster at resolving dependencies than Conda, and better at producing reproducible environments.
While still relatively new and less widely adopted than Conda or pip, Pixi provides tools for managing complex projects with system-level dependencies.
Here is a batch of useful Pixi commands:
- Creating a new project
pixi init my_project
- Installing packages
pixi add numpy pandas
- Activating a Pixi environment
pixi shell # from the directory containing the pixi.lock
- Running a command in the environment
Run a command within the environment without activating it manually:
pixi run python script.py
- Updating packages
Update all dependencies in your environment:
pixi update
- Exporting the environment
Export the current environment to a pixi.lock file for reproducibility:
pixi export
This lockfile can then be shared with collaborators to recreate the exact same environment with:
pixi install # from the directory containing the pixi.lock
uv
uv is a modern Python package manager designed as a fast alternative to tools like pip and virtualenv, with a strong focus on speed and simplicity. Written in Rust, it enables fast dependency installation and virtual environment management while remaining fully compatible with standard formats such as requirements.txt and pyproject.toml. Lightweight and efficient, uv is ideal for developers who want a smooth and streamlined Python workflow without the overhead of more complex tools.
Here is a batch of useful uv commands:
- Create a virtual environment
uv venv
- Install a package
uv pip install requests
# or install all project dependencies
uv pip install -r requirements.txt
- List installed packages
uv pip freeze
- Generate a locked dependency file
uv pip compile pyproject.toml
- Run a command in the environment
uv run script.py
- Run a command in the environment with a temporary dependency
uv run --with requests script.py
- Add a dependency to the project
uv add numpy
- Synchronize your environment with the project
uv sync
uv includes uvx, a powerful feature that lets you run Python-based CLI tools instantly without installing them globally. This is especially useful because it keeps your system clean: instead of setting up environments or managing versions, you can just run a command and uv will fetch and execute it in an isolated, temporary environment.
Docker
Docker is a containerization solution that allows you to package an application together with its full runtime environment (OS, system libraries, dependencies, etc.) into a portable image. Unlike pip, Conda, Pixi, or uv which focus on dependency management at the project or language level, Docker operates at a lower level by fully isolating the system environment.
In that sense, Docker does not replace these tools but rather complements them: while pyproject.toml or environment.yml
define dependencies, Docker ensures that the application runs consistently across environments (local machine, CI, production, cloud).
A minimal example of a Dockerfile for a Python project based on pyproject.toml:
FROM python:3.14
# Set working directory
WORKDIR /app
# Copy project files
COPY pyproject.toml README.md ./
COPY src/ ./src/
# Install build tool
RUN pip install --upgrade pip setuptools wheel
# Install project
RUN pip install --no-cache-dir .
# Default command: the console script declared under [project.scripts]
CMD ["my-cli"]
Build and run the image:
docker build -t my_project .
docker run --rm my_project
Docker is particularly useful in the following cases:
- Production deployment (APIs, microservices, batch jobs)
- Continuous integration (CI/CD)
- Complex environments or system-level dependencies
- Strong reproducibility across machines
Note
Docker can embed tools like pip, Conda, Pixi, or uv inside the container. It is therefore common to combine Docker with an existing dependency management tool rather than treating them as alternatives.
Versioning
Versioning is essential for tracking a project’s evolution and communicating changes clearly. The most common approach is Semantic Versioning or SemVer, which uses the MAJOR.MINOR.PATCH format:
- MAJOR: when you make incompatible API changes
- MINOR: when you add functionality in a backward compatible manner
- PATCH: when you make backward compatible bug fixes
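The bump rules can be sketched in a few lines of Python. This is a simplified illustration that assumes a plain MAJOR.MINOR.PATCH string and ignores pre-release and build metadata; the function name bump_version is hypothetical.

```python
def bump_version(version: str, part: str) -> str:
    """Return `version` (MAJOR.MINOR.PATCH) with the given part incremented."""
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"  # incompatible change: reset minor and patch
    if part == "minor":
        return f"{major}.{minor + 1}.0"  # new feature: reset patch
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"  # bug fix only
    raise ValueError(f"unknown part: {part!r}")


print(bump_version("1.4.2", "major"))  # 2.0.0
print(bump_version("1.4.2", "minor"))  # 1.5.0
print(bump_version("1.4.2", "patch"))  # 1.4.3
```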
Another approach is date-based versioning (often called calendar versioning, or CalVer), where versions are based on release dates (e.g., 2026.03.30), which works well for projects that release frequently.
In Python, you can automate versioning using tools that integrate with Git and your build system. For example, setuptools_scm, a plugin for the setuptools build backend, automatically derives the version from Git tags. More advanced tools like python-semantic-release bring full SemVer automation by analyzing commit messages (using Conventional Commits) to determine whether to bump the major, minor, or patch version, and can even handle publishing. These tools integrate well with CI/CD pipelines, making it easy to automate releases and maintain consistent versioning without manual intervention.
Code Quality and testing
Testing your code and enforcing quality standards are essential practices that should be adopted from the start. Tests (unit, integration, etc.) first ensure that the code behaves as expected, but more importantly, they help quickly detect regressions when adding new features or modifying existing parts. Without tests, every change becomes risky and requires manually checking the entire program’s behavior, which is slow, incomplete, and error-prone. Tools like pytest make it much easier to write and run automated tests. However, quality does not stop at testing: enforcing coding standards with tools like black for code formatting or ruff for static analysis helps keep code readable, consistent, and maintainable over time. This reduces technical debt, improves collaboration between developers, and makes the project easier to understand, even months later. Finally, these practices integrate naturally into continuous integration pipelines, where every change is automatically tested and validated before being merged.
Note
Using Black may seem counterintuitive at first, since it enforces a single coding style across the entire project. However, this is actually the intended effect: to unify and standardize the project’s style so that every contribution follows the same conventions (line breaks, spaces vs. tabs, spacing around certain special characters, etc.).
Info
Testing your code and applying quality standards is an investment in reliability and maintainability: you spend less time fixing unexpected bugs and more time building robust features.
In the next section, we will take a closer look at how pytest works and some of its key features, including fixtures, how to parametrize tests, and a few useful plugins.
After that, we will see how linters such as ruff can significantly improve your project’s code quality by enforcing consistency, detecting potential issues early, and speeding up development. We will then introduce pre-commit hooks, which help automate quality checks before code is committed to the repository. To conclude, we will cover continuous integration, showing how to automatically run tests and quality checks on every change to ensure the reliability and stability of the project over time.
pytest
Pytest is currently one of the most widely used testing frameworks in the Python
ecosystem. It is valued for its simplicity, flexibility, and ability to scale from small scripts to large industrial
applications. Unlike Python’s built-in unittest module, pytest
allows you to write tests in a very natural way using plain Python functions and standard assert statements.
It automatically discovers test files and test functions (by convention test_*.py and def test_*), which makes it extremely
easy to adopt.
One of its most powerful features is the fixture system, which allows you to define reusable setup logic for your tests,
improving code reuse and maintainability. Pytest also supports parameterized tests, enabling you to run the same test
function with multiple input datasets, which greatly reduces duplication. On top of that, a rich ecosystem of plugins
extends its capabilities, including code coverage reporting, parallel test execution, and CI/CD integration.
Overall, pytest has become a core tool in modern Python development workflows to ensure code reliability and quality.
Basic usage
To illustrate how pytest works, we can start with a file named test_math.py that contains the following code:
def add(a, b):
    return a + b

def multiply(a, b):
    return a * b

def test_add():
    print("hello from test_add")
    assert add(2, 3) == 5
    assert add(-1, 1) == 0

def test_multiply():
    print("hello from test_multiply")
    assert multiply(2, 3) == 6
    assert multiply(-2, 3) == -6
This file simply tests the behavior of two functions, add(a, b) and multiply(a, b), with two test
functions, test_add() and test_multiply().
To run these tests, you just need to install pytest and execute it on the file:
pip install pytest
pytest -s test_math.py # -s to activate prints
which produces the following output:
======================================================================================================= test session starts ========================================================================================================
platform linux -- Python 3.11.15, pytest-9.0.3, pluggy-1.6.0
rootdir: ...
collected 2 items
test_math.py hello from test_add
.hello from test_multiply
.
======================================================================================================== 2 passed in 0.03s =========================================================================================================
Select only one test
pytest allows you to select one or more tests through its command-line interface:
pytest -s test_math.py::test_add # to select only test_add in test_math.py
pytest -s test_math.py::test_add test_math.py::test_multiply # to select test_add and test_multiply in test_math.py
parametrize
Starting from the test_add example, it quickly becomes clear why parameterizing tests can be very useful.
In its basic form, test_add might contain several assertions to check different input values:
def test_add():
    print("hello from test_add")
    assert add(2, 3) == 5
    assert add(-1, 1) == 0
While this works, it has a few limitations. First, all test cases (or asserts) are grouped into a single test function. If one assertion fails, the entire test is marked as failed, and you don’t immediately know which specific input caused the issue without inspecting the output carefully. This can make debugging less straightforward, especially as the number of test cases grows.
This is where parameterization with pytest becomes very helpful. By using @pytest.mark.parametrize, you can separate
each set of inputs into its own test case:
import pytest

@pytest.mark.parametrize(
    "a, b, expected",
    [
        (2, 3, 5),
        (-1, 1, 0),
        (-1, 1, 2),  # will fail, -1 + 1 = 0, not 2
    ],
)
def test_add(a, b, expected):
    assert add(a, b) == expected
With this approach, each tuple of parameters is treated as an individual test. This has several advantages:
failures are more precise and easier to identify, test output is clearer, and adding new test cases becomes as
simple as adding another line to the parameter list. It also improves readability by separating the test logic from the
test data.
Running only test_add produces the following output:
======================================================================================================= test session starts ========================================================================================================
platform linux -- Python 3.11.15, pytest-9.0.3, pluggy-1.6.0
rootdir: ...
collected 3 items
test_math.py ..F
============================================================================================================= FAILURES =============================================================================================================
_________________________________________________________________________________________________________ test_add[-1-1-2] _________________________________________________________________________________________________________
a = -1, b = 1, expected = 2
    @pytest.mark.parametrize(
        "a, b, expected",
        [
            (2, 3, 5),
            (-1, 1, 0),
            (-1, 1, 2),  # will fail
        ],
    )
    def test_add(a, b, expected):
>       assert add(a, b) == expected
E       assert 0 == 2
E        +  where 0 = add(-1, 1)
test_math.py:17: AssertionError
===================================================================================================== short test summary info ======================================================================================================
FAILED test_math.py::test_add[-1-1-2] - assert 0 == 2
=================================================================================================== 1 failed, 2 passed in 0.15s ====================================================================================================
As expected, three tests were executed: one failed and two passed. The traceback produced by pytest allows us to quickly identify the context in which the test failed.
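In the output above, the failing case is identified by an auto-generated ID (test_add[-1-1-2]). As a sketch, explicit IDs can be attached with pytest.param(..., id=...), which tends to make reports and test selection easier to read. The ID strings below are illustrative.

```python
import pytest


def add(a, b):
    return a + b


@pytest.mark.parametrize(
    "a, b, expected",
    [
        pytest.param(2, 3, 5, id="two-positives"),
        pytest.param(-1, 1, 0, id="opposites-cancel"),
        pytest.param(0, 0, 0, id="zeros"),
    ],
)
def test_add(a, b, expected):
    assert add(a, b) == expected
```

A single case can then be selected by its ID, for example with pytest "test_math.py::test_add[zeros]".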
fixtures
Fixtures are one of the most powerful features of pytest and play a central role in writing clean, maintainable, and reusable tests. A fixture is a function that provides a predefined context or data to your tests. Instead of duplicating setup code in every test, you define it once and inject it wherever needed.
This is particularly useful when your tests rely on shared data, resources, or initialization steps, such as creating input data, configuring objects, connecting to a database, or preparing files. By centralizing this logic in fixtures, you avoid repetition and make your tests easier to read and maintain.
We can extend our test_math.py file with a concrete example of fixture usage. We will create two fixtures, each
returning a numpy array representing satellite data containing cloudy conditions or vegetation.
These fixtures can then be directly used in our test functions.
import pytest
import numpy as np

@pytest.fixture
def cloudy_image():
    return np.random.uniform(0.0, 0.1, size=(2, 2))

@pytest.fixture
def vegetation_image():
    return np.random.uniform(0.5, 1.0, size=(2, 2))

def compute_mean(image):
    return np.mean(image)

def test_low_ndvi(cloudy_image):
    assert compute_mean(cloudy_image) < 0.1

def test_high_ndvi(vegetation_image):
    assert compute_mean(vegetation_image) > 0.5
Typically, fixtures are defined in a conftest.py file (which is automatically discovered by pytest) and are made
available across all test files.
For instance, fixtures can be used to:
- Read environment variables such as DATA_ROOT, AWS_BUCKET, etc.
- Create a temporary directory that can be automatically cleaned up (optionally only on success, which is a useful pattern)
- Provide a preconfigured processing pipeline to avoid repeating setup in every test
- Load configuration data from JSON files
- Simulate data in memory instead of loading it from disk (which improves test speed, reduces I/O overhead, and avoids managing multiple test files)
Note
Simulating data in memory is generally preferable to writing multiple files to disk for tests. It makes tests significantly faster by avoiding I/O operations, reduces external dependencies, and improves reproducibility since the data is generated in a controlled and deterministic way. It also simplifies test setup and cleanup, as there is no need to manage temporary files or deal with file system side effects. In addition, when fixtures are used to generate this data, their outputs can be reused across multiple tests (including in parallel test runs, depending on fixture scope), which further improves efficiency.
Warning
Simulating data in memory is also often better than versioning and reusing test data files directly in git, since storing datasets in the repository quickly increases its size, slows down cloning and CI pipelines, and makes the project harder to maintain over time. Large or numerous data files also tend to be duplicated or slightly modified across tests, which reduces clarity and increases the risk of inconsistencies between versions of the data.
pytest also ships with a set of built-in fixtures that can be useful: https://docs.pytest.org/en/stable/reference/fixtures.html#built-in-fixtures
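To make a few of the use cases listed earlier concrete, here is a small sketch combining a custom fixture with two of pytest's built-in fixtures, tmp_path and monkeypatch. The helpers resolve_data_root and load_config, and the config contents, are hypothetical names introduced only for this example.

```python
import json
import os
from pathlib import Path

import pytest


def resolve_data_root(default: str = "data") -> Path:
    """Return the data directory from DATA_ROOT, or `default` if it is unset."""
    return Path(os.environ.get("DATA_ROOT", default))


def load_config(path: Path) -> dict:
    """Load a JSON configuration file."""
    return json.loads(path.read_text(encoding="utf-8"))


@pytest.fixture
def config_file(tmp_path):
    # tmp_path is a built-in fixture: a unique temporary directory per test.
    path = tmp_path / "config.json"
    path.write_text(json.dumps({"threshold": 0.5}), encoding="utf-8")
    return path


def test_load_config(config_file):
    assert load_config(config_file) == {"threshold": 0.5}


def test_data_root_from_env(monkeypatch, tmp_path):
    # monkeypatch is a built-in fixture: the change is undone after the test.
    monkeypatch.setenv("DATA_ROOT", str(tmp_path))
    assert resolve_data_root() == tmp_path
```

In a real project, config_file would usually live in conftest.py so that every test file can request it by name.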
pytest plugins
As mentioned earlier, pytest is highly configurable: not only through built-in fixtures, but also thanks to the Python community, which provides a rich ecosystem of plugins. The following section presents a few plugins that are particularly useful in everyday development.
The full list of pytest plugins is available at: https://docs.pytest.org/en/stable/reference/plugin_list.html
pytest-cov
pytest-cov allows you to measure test coverage, i.e., which parts of the code are actually executed during tests.
To generate an HTML coverage report for a project named my_project, you can use the following command:
pytest --cov=my_project --cov-report=html
The main advantages of this plugin are as follows:
- Identifying untested parts of the code
- Improving test robustness
- Avoiding untested areas in complex pipelines
When integrated into a CI system, this plugin helps ensure that external contributions do not degrade code quality by reducing coverage. It also verifies that newly added code is properly tested.
pytest-xdist
pytest-xdist enables running tests in parallel across multiple CPUs or machines.
Instead of executing tests sequentially, pytest distributes them across multiple workers:
pytest -n auto
Its main benefit is significantly reducing test execution time.
pytest-testmon
pytest-testmon allows running only the tests impacted by recent code changes.
Instead of executing the entire test suite after every modification, it analyzes which parts of the code have changed and runs only the relevant tests.
It can be enabled simply with:
pytest --testmon
Main advantages:
- Time saving: when test suites are large and slow, running only impacted tests can greatly speed up development
- Encourages well-structured tests: works best when tests are isolated and target specific functions
pytest-mock
pytest-mock simplifies the use of mocks by providing a mocker fixture. It allows you to easily replace functions, objects, or behaviors during tests.
def test_api_call(mocker):
    # call_api is replaced with a mocked version
    mock = mocker.patch("my_module.call_api", return_value={"status": "ok"})
    result = my_function()
    assert result == "success"
    # Ensure the mock was called
    mock.assert_called_once()
Main benefits:
- Isolating external dependencies (e.g., S3 access, APIs)
- Avoiding network calls during tests
Warning
Using mocks in integration tests is generally discouraged, as these tests are meant to validate real-world usage scenarios.
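The mocker fixture is a thin wrapper around the standard library's unittest.mock, so the same idea can be sketched without any plugin. The example below is self-contained; ApiClient, fetch_status, and my_function are hypothetical stand-ins for real project code.

```python
from unittest import mock


class ApiClient:
    """Hypothetical client; a real version would perform a network request."""

    def fetch_status(self) -> dict:
        raise RuntimeError("network access is not available in tests")


def my_function(client: ApiClient) -> str:
    """Hypothetical code under test that depends on the client."""
    response = client.fetch_status()
    return "success" if response["status"] == "ok" else "failure"


def test_my_function_without_network():
    # Replace the method on the class for the duration of the `with` block.
    with mock.patch.object(ApiClient, "fetch_status", return_value={"status": "ok"}) as fake:
        assert my_function(ApiClient()) == "success"
    # The mock records its calls, so usage can still be checked afterwards.
    fake.assert_called_once()
```

With pytest-mock, the with block would be replaced by a call to mocker.patch.object(...), and the patch would be undone automatically at the end of the test.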
pytest-benchmark
pytest-benchmark allows you to measure the performance of your code directly within your tests. It provides reliable and reproducible benchmarking while integrating seamlessly with pytest.
Example usage:
import numpy as np

def compute_mean(image):
    return np.mean(image)

def test_compute_mean(benchmark):
    image = np.random.rand(1000, 1000)
    result = benchmark(compute_mean, image)
In this case, pytest will:
- execute the function multiple times
- measure execution time
- produce statistics (min, max, mean, etc.)
The plugin also comes with useful features such as --benchmark-compare to compare results with previous runs.
linters
Beyond testing, maintaining a consistent and high-quality codebase is essential. This is where linters and formatters come in. They help enforce coding standards, detect potential issues early, and improve readability and maintainability across the project.
black
black is an opinionated code formatter that automatically formats Python code according to a consistent style.
Main benefits:
- Enforces a single, consistent coding style across the entire project
- Removes debates about formatting during code reviews
- Improves readability and maintainability
- Fully deterministic (same input = same output)
black is designed to be “uncompromising,” meaning it minimizes configuration in favor of consistency.
ruff
ruff is a modern, extremely fast linter and formatter, written in Rust. It can replace multiple tools such as flake8, isort, and even parts of black.
Main benefits:
- Very fast (orders of magnitude faster than traditional linters)
- Combines linting, formatting, and import sorting
- Detects a wide range of issues (unused imports, bugs, style problems…)
- Easy to configure via pyproject.toml
Type checking (to go further)
In addition to linters, using a type checker like mypy helps catch errors before runtime by validating that functions are used with the correct types. This reduces bugs that would otherwise only appear during execution, sometimes in production. It also improves code readability and maintainability, as type annotations act as lightweight documentation, making it easier to understand how functions are meant to be used. This is especially valuable in collaborative projects. Finally, type checking encourages better API design and refactoring safety: when modifying code, mypy can quickly highlight inconsistencies across the codebase, reducing the risk of introducing regressions.
Main benefits:
- Detects type inconsistencies before execution
- Improves code reliability and documentation
- Encourages better API design
def add(a: int, b: int) -> int:
    return a + b
mypy will report errors if incorrect types are used.
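As an illustration of the kind of mistake this catches, the call below runs without any error at runtime, because Python does not enforce annotations, yet a type checker would flag it. The quoted message is indicative of mypy's wording and may differ slightly between versions.

```python
def add(a: int, b: int) -> int:
    return a + b


# Runs fine at runtime: annotations are not enforced by the interpreter.
result = add(2.5, 3)
print(result)  # 5.5

# A type checker such as mypy flags the call above, with a message along the lines of:
#   error: Argument 1 to "add" has incompatible type "float"; expected "int"
```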
Pre-commit hooks
Pre-commit hooks allow you to automatically run checks before each commit, ensuring that code quality standards are consistently enforced. While it is good practice to ask developers to run linters manually, it is even better to automate this process so that every commit is validated without relying on individual discipline.
With tools like pre-commit, you can easily integrate linters and formatters such as black, ruff, and mypy.
Each time a commit is made, these tools are executed automatically to check formatting, detect potential issues, and
validate type correctness.
This approach has several advantages:
- Ensures consistent code quality across all contributions
- Prevents poorly formatted or invalid code from being committed
- Reduces the burden during code reviews
- Improves overall developer productivity
In practice, pre-commit hooks act as a first line of defense, catching issues early in the development workflow before they reach CI or production.
Continuous integration
Continuous Integration (CI) is a development practice where every change pushed to the repository automatically triggers a set of checks and validations. The goal is to ensure that new contributions do not break existing functionality and that the codebase remains in a healthy, consistent state.
Testing and visualizing coverage
A typical CI pipeline for a Python project includes several key steps. First, running the test suite with pytest is
essential to validate the correctness of the code. Coupling this with pytest-cov allows you to generate a code coverage
report, which measures which parts of the code are executed by tests. Integrating coverage thresholds into CI ensures
that new contributions do not reduce the overall quality of the test suite.
As explained earlier, the CI pipeline may include the following command:
pytest --cov=my_project --cov-report html:cov_html
Here, cov_html is a directory where all HTML reports generated by pytest-cov are stored.
To make this directory accessible after the pipeline runs, it must be saved as an artifact. The example below uses GitLab CI; GitHub offers an equivalent mechanism.
artifacts:
  when: always
  paths:
    - ./cov_html/
  expire_in: 1 day
In this configuration, the cov_html directory is stored as an artifact for one day and can be accessed through the
GitLab interface.
To further enhance the coverage analysis produced by pytest, it is worth noting that you can directly visualize, in merge
requests (within the Changes tab), which lines are covered by tests executed in the CI. To enable this feature, you need
to generate an additional report in Cobertura XML format:
pytest --cov=my_project --cov-report html:cov_html --cov-report xml:coverage.xml
Then, store this file as an artifact using:
artifacts:
  reports:
    coverage_report:
      coverage_format: cobertura
      path: coverage.xml
  paths:
    - coverage.xml
  expire_in: 1 day
This will display coverage information directly in the merge request diff, highlighting which lines are covered and which are not.
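The coverage threshold mentioned at the start of this section can also be checked from the Cobertura XML report itself. The sketch below assumes the report's root coverage element carries a line-rate attribute expressed as a fraction between 0 and 1; the function names and the sample XML are illustrative, not taken from a real pipeline.

```python
import xml.etree.ElementTree as ET


def line_coverage_percent(xml_text: str) -> float:
    """Return overall line coverage (0-100) from a Cobertura-style XML report."""
    root = ET.fromstring(xml_text)
    # The root <coverage> element carries the overall rate as a 0..1 fraction.
    return float(root.attrib["line-rate"]) * 100


def check_threshold(xml_text: str, minimum: float) -> bool:
    """Return True when coverage meets the minimum percentage."""
    return line_coverage_percent(xml_text) >= minimum


# Trimmed-down sample standing in for a real coverage.xml.
SAMPLE = '<coverage line-rate="0.875" branch-rate="0"><packages/></coverage>'

print(f"{line_coverage_percent(SAMPLE):.1f}%")  # 87.5%
print(check_threshold(SAMPLE, minimum=80))  # True
```

In practice, pytest-cov's --cov-fail-under option performs this check directly, so a custom script is only needed for more specific rules.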
linters
In addition to testing, a robust CI pipeline should include linters and static analysis tools. Linters automatically analyze the source code to detect programming errors, stylistic issues, and potential bugs without executing the code.
Integrating these tools into the CI pipeline ensures that all contributions follow consistent coding standards and helps catch issues early in the development process. For example:
lint:
  stage: quality
  script:
    - ruff check  # Lint files in the current directory.
    - mypy .
Using linters improves code readability, reduces technical debt, and facilitates collaboration by enforcing a shared code style across the team.
In practice, linters can be executed at different stages of the development workflow:
- locally: using tools such as pre-commit hooks, to catch issues before committing code
- in CI pipelines: to enforce quality checks on all contributions and prevent regressions
These approaches are complementary: running linters locally provides fast feedback to developers, while CI ensures that all code merged into the project complies with the defined standards.
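To give an idea of what these tools report, here is a small hypothetical module (the function name and content are assumptions made for illustration). The code is clean as written. The comments describe the kind of change that each tool would be expected to flag with a default configuration.

```python
"""Hypothetical module used to illustrate lint and type checks."""


def mean(values: list[float]) -> float:
    """Return the arithmetic mean of a non-empty list of numbers."""
    if not values:
        raise ValueError("values must not be empty")
    return sum(values) / len(values)


# Adding an unused `import os` at the top of this file would be reported
# by `ruff check` (rule F401, unused import).
# Calling mean("abc") would be rejected by mypy, because a str is not
# compatible with the annotated parameter type list[float].
print(mean([1.0, 2.0, 3.0]))
```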
Documentation build and deployment
In addition to running tests and linters, a CI pipeline should also validate that the project documentation can be successfully built, and optionally publish it.
For projects using tools such as Sphinx or MkDocs, it is recommended to include a step that builds the documentation
(e.g., make html or mkdocs build). This ensures that:
- the documentation remains valid and free of build errors
- all references, links, and code snippets are consistent
- the documentation can be successfully built by external platforms such as Read the Docs
This step is particularly important because documentation issues (broken links, invalid directives, missing dependencies) are often only detected at build time.
In projects hosted on platforms like GitHub Pages or GitLab Pages, the CI pipeline typically goes one step further: it is responsible not only for building the documentation but also for deploying it as a static website.
A typical workflow includes:
- building the documentation during CI
- storing the generated site (e.g., build/html or site/)
- deploying it automatically to a hosting service (GitHub Pages or GitLab Pages)
This approach ensures that the documentation is always up to date with the latest version of the code and removes the need for manual deployment.
Documentation and collaboration
Clear and accessible documentation is essential for the sustainability of a Python project. It improves onboarding, facilitates collaboration, and ensures that both users and contributors can understand and use the project effectively.
README
The README.md file, stored at the project root, is the entry point of your project. It should provide a concise
but comprehensive overview, typically including:
- Project description and purpose
- Installation instructions
- Quick start / usage examples
- Project structure (optional)
- Contribution guidelines (link to contributing section)
- License information
A good README allows a new user to understand and run the project in just a few minutes. Keeping it simple, structured, and up to date is key.
To go further, it is common practice to create more detailed documentation in a dedicated docs directory. This directory
typically contains all the documentation sources, usually static, written in various formats such as notebooks,
.md, or .rst files.
The following sections will explain how to use Sphinx to generate static HTML.
Sphinx
Sphinx is one of the most widely used tools for generating Python project documentation. It allows you to create structured, versioned, and navigable documentation websites.
reStructuredText (RST)
Sphinx natively uses reStructuredText (RST), a powerful markup language designed for writing structured and maintainable documentation.
Compared to Markdown, RST offers more advanced features (such as cross-references and directives), but it is also more verbose.
When getting started, the best approach is to experiment. Try out new directives or syntax elements in a local version of your documentation to understand how they behave before integrating them into your main project. Over time, this exploration will make RST much more natural to use.
The official Sphinx documentation provides excellent step-by-step guides to help you begin:
- Install Sphinx: https://www.sphinx-doc.org/en/master/usage/installation.html
- Create your first documentation: https://www.sphinx-doc.org/en/master/usage/quickstart.html
Taking the time to go through these initial steps will give you a solid foundation for building clear, structured, and scalable documentation.
Markdown
Sphinx can also support Markdown through extensions such as MyST-parser. This allows teams to write documentation in a simpler and more widely adopted format while still benefiting from Sphinx features.
Markdown is generally easier to read and write, making it a good choice for collaborative environments.
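As a minimal sketch, enabling Markdown sources usually comes down to a few lines in the Sphinx configuration file. The excerpt below assumes that the myst-parser package is installed and that the configuration lives in docs/conf.py. Adapt it to your own project.

```python
# docs/conf.py (excerpt), assuming `pip install myst-parser`
extensions = [
    "myst_parser",  # lets Sphinx read Markdown (.md) sources
]

# Map file suffixes to parsers so that .rst and .md pages can coexist.
source_suffix = {
    ".rst": "restructuredtext",
    ".md": "markdown",
}
```

With this in place, existing .rst pages keep working and new pages can be written in Markdown.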
Integrating notebooks
It is often very useful to include Jupyter notebooks directly in the documentation, especially when the project involves tutorials, step-by-step explanations, or data analysis examples. Notebooks provide a unique combination of narrative text, executable code, and visual outputs, which makes them particularly effective for demonstrating how a project works in practice.
This integration can be achieved using tools such as nbsphinx or MyST-NB,
which allow notebooks to be rendered as part of a Sphinx-based documentation site. These tools make it possible to
seamlessly include .ipynb files in the documentation build process, ensuring that code examples remain synchronized with the rest of the project.
In addition, notebooks provide a more interactive and engaging experience for users. They allow readers to understand not only what the code does, but also how and why it works, making them particularly valuable for educational content and onboarding new users.
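A possible configuration for nbsphinx is sketched below. It assumes that nbsphinx is installed and that the notebooks are stored with their outputs. The choice of not re-executing them during the build is only one option among others.

```python
# docs/conf.py (excerpt), assuming `pip install nbsphinx`
extensions = [
    "nbsphinx",  # renders .ipynb files as documentation pages
]

# Do not re-run notebooks during the documentation build;
# the outputs already stored in the .ipynb files are rendered instead.
nbsphinx_execute = "never"

# Ignore build output and Jupyter checkpoint copies of notebooks.
exclude_patterns = ["_build", "**.ipynb_checkpoints"]
```

Setting nbsphinx_execute to "always" instead makes every documentation build run the notebooks, which is slower but catches broken notebooks earlier.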
Note
Keeping notebooks up to date and functional is very important. Any content exposed to end users should be fully operational. However, notebook maintenance is often overlooked during development. To ensure reliability, notebooks can be tested using pytest together with the nbmake plugin, which allows them to be executed as part of the CI pipeline.
Interactive notebooks with Binder or JupyterLite
When working with Jupyter notebooks, it can be very useful to let users execute them directly in a browser without installing anything locally. This is where Binder and Notebook.link become particularly valuable. Binder is a free service that builds a runnable environment from a Git repository and launches it in the cloud. Users can open notebooks, run code, and experiment interactively with your project in just one click.
Notebook.link is based on JupyterLite, an in-browser version of JupyterLab that needs no backend. It can greatly simplify running notebooks, but it is not compatible with every Python library.
To enable Binder, the repository must include environment configuration files such as requirements.txt, environment.yml
or pyproject.toml. Binder then automatically builds the environment and exposes a live session.
Typical use cases:
- sharing tutorials or demos
- providing interactive documentation
- simplifying onboarding for new users
- showcasing research workflows
However, Binder has some limitations:
- limited computational resources
- slower startup times (environment build)
- not suitable for heavy workloads or production use
Despite these limitations, Binder is a powerful tool for improving accessibility and user experience, especially in data science and research-oriented projects.
Automatic documentation (API)
One of the major strengths of Sphinx is its ability to automatically generate documentation directly from the source code through docstrings, using extensions such as autodoc. This approach makes it possible to build a large part of the API documentation without manually rewriting content outside the codebase.
By extracting information directly from docstrings, the documentation remains closely aligned with the implementation. This significantly reduces the risk of inconsistency between the code and its documentation, which is a common issue in long-lived projects. It also encourages developers to write clearer and more structured docstrings, since they become an integral part of the user documentation.
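As an illustration, here is the kind of docstring that autodoc can turn into an API page. The function is hypothetical and exists only for this example. The docstring uses the Google style, which Sphinx renders properly when the napoleon extension is enabled.

```python
def rescale(values: list[float], factor: float = 1.0) -> list[float]:
    """Multiply every value by a constant factor.

    Args:
        values: Numbers to rescale.
        factor: Multiplier applied to each value.

    Returns:
        A new list containing the rescaled values.
    """
    return [value * factor for value in values]


# The docstring is available at runtime, which is what autodoc relies on.
print(rescale.__doc__.splitlines()[0])
```

In an .rst page, such a function would typically be documented with the autofunction directive, for example `.. autofunction:: my_project.module.rescale` (the module path here is an assumption based on the layout shown earlier).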
In more advanced configurations, Sphinx can go beyond pure API documentation and serve as a complete documentation framework for the project. It can include high-level explanations, usage examples, and tutorials that help users understand how to interact with the software in real-world scenarios. It is also commonly used to describe system architecture, design decisions, and internal workflows, which are particularly valuable for onboarding new contributors or maintaining complex projects over time.
A well-known companion extension is autosummary, which is generally used as a complement to autodoc rather than a replacement. While autodoc directly injects documentation into a page, autosummary automatically generates structured pages for each module, class, or function. This results in more organized documentation, similar to what is found in large libraries.
You can also use napoleon, which is not a direct alternative but a very useful extension if you write your docstrings in the Google or NumPy format. It greatly improves readability and integrates smoothly with Sphinx.
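The three extensions mentioned above ship with Sphinx and are enabled from the configuration file. The excerpt below is a minimal sketch, assuming the configuration lives in docs/conf.py. The option values are illustrative defaults, not a recommendation for every project.

```python
# docs/conf.py (excerpt)
extensions = [
    "sphinx.ext.autodoc",      # pull documentation from docstrings
    "sphinx.ext.autosummary",  # generate summary tables and stub pages
    "sphinx.ext.napoleon",     # parse Google and NumPy style docstrings
]

# Generate the stub pages referenced by autosummary directives.
autosummary_generate = True

# Options applied to every autodoc directive unless overridden locally.
autodoc_default_options = {
    "members": True,
    "show-inheritance": True,
}
```

Note that autodoc imports the package to read its docstrings, so the project must be installed (or otherwise importable) in the environment that builds the documentation.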
Alternatives to Sphinx
While Sphinx is powerful, there are simpler or more modern alternatives.
For example, Zensical provides a clean and lightweight approach to documentation.
Warning
Note that Zensical does not host documentation by itself. You still need to deploy your site separately (see: https://zensical.org/docs/publish-your-site/).
Other alternatives:
- MkDocs (very popular, Markdown-based)
- Docusaurus (more frontend-oriented)
Hosting
Once documentation is generated, it needs to be hosted and made accessible. All hosting solutions below are free.
Read the Docs
Read the Docs is one of the most widely used platforms for hosting documentation. It can automatically build your documentation directly from your repository whenever changes are pushed, which greatly simplifies maintenance. It also provides built-in versioning, allowing users to navigate between different versions of the documentation that match specific releases of your project. Its seamless integration with Sphinx makes it a natural choice for many Python projects, especially when you want a solution that requires minimal configuration while still offering robust features.
Note
The tip "How to build a documentation on Read the Docs" can help you publish your first documentation on Read the Docs.
GitLab Pages / GitHub Pages
Another common approach is to use GitHub Pages or GitLab Pages, which provide static site hosting tightly integrated with their respective platforms. In these setups, the documentation is built as part of your CI/CD pipeline and then deployed as a static website. This approach gives you full control over how and when the documentation is built and deployed. It is also a cost-effective solution, as hosting is free for public projects, and it fits naturally into existing development workflows. Because the deployment is handled through CI/CD, it becomes easy to keep the documentation in sync with the codebase and to automate updates whenever changes are made.
Badges
Badges provide quick, visual information about the project status directly in the README.
Common badges can include:
- build status (CI passing/failing)
- test coverage
- documentation status
- license
- Python version compatibility
Badges improve transparency and give immediate feedback on project health.