Xarray makes working with labeled multi-dimensional arrays in Python simple. It integrates well with other PyData tools like pandas, NumPy, and Dask to provide flexible, efficient, and scalable data handling.
Overview
Xarray is a library for working with labeled multi-dimensional arrays in Python. It is particularly useful in scientific domains for handling large datasets such as climate or geographical data. By extending NumPy arrays with named dimensions and coordinates, Xarray lets users work more efficiently with N-dimensional data, including label-based indexing, group-by operations, and automatic alignment of data.
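As a rough illustration of the alignment behaviour mentioned above, the following minimal sketch combines two small arrays whose coordinate labels only partly overlap. The array names and values are hypothetical and chosen only for the example.

```python
import xarray as xr

# Two hypothetical 1-D arrays whose "x" labels only partly overlap.
a = xr.DataArray([1.0, 2.0, 3.0], dims="x", coords={"x": [0, 1, 2]})
b = xr.DataArray([10.0, 20.0, 30.0], dims="x", coords={"x": [1, 2, 3]})

# Arithmetic aligns on coordinate labels, not on position.
# With default options, only the labels present in both arrays are kept.
total = a + b

# An explicit outer alignment keeps every label and fills the gaps with NaN.
a_outer, b_outer = xr.align(a, b, join="outer")

print(total)
print(a_outer)
```

Here the sum is computed only for the labels x=1 and x=2, which both arrays share, while the outer alignment extends both arrays to the labels 0 through 3.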
Key Features:
- Labeled Data: Associate dimensions and coordinates with array data.
- Interoperability: Easily integrates with other PyData libraries such as pandas, NumPy, and Dask.
- Scalability: Supports large datasets by leveraging Dask for parallel computing.
- Flexible Indexing: Intuitive and flexible tools for indexing and selecting data.
- NetCDF Support: Read and write data in NetCDF and other common formats like HDF5.
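The labeled-data and indexing features listed above can be sketched in a few lines. This is only an illustrative example: the station names, dates, and temperature values are made up, and it assumes NumPy and pandas are installed alongside Xarray.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical data: six daily readings at two stations, spanning two months.
times = pd.date_range("2024-01-30", periods=6, freq="D")
temps = xr.DataArray(
    np.arange(12, dtype=float).reshape(6, 2),
    dims=("time", "station"),
    coords={"time": times, "station": ["A", "B"]},
    name="temperature",
)

# Label-based selection: pick a station by name, or a date range by label.
station_a = temps.sel(station="A")
january = temps.sel(time=slice("2024-01-30", "2024-01-31"))

# Group-by: average the readings within each calendar month.
monthly_mean = temps.groupby("time.month").mean()

print(station_a)
print(monthly_mean)
```

Selections are written in terms of dimension names and coordinate labels rather than integer positions, so the code stays readable when the array layout changes.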
Usage/Documentation
Xarray can be installed with pip: pip install xarray. Required and optional dependencies are listed in the Xarray installation documentation.
Resources
Tutorials
- Dask Cookbook by Project Pythia
- Using Geodes API to retrieve and process Satellite products on the CNES Datalake
- HoloViz Tutorial
- Data usage on hydroweb.next
- Data Types Tutorials
- Pangeo tutorial on CNES infrastructure