JuaDataset

The JuaDataset is the primary container for weather data in the Jua Python SDK. When you request weather forecasts or hindcasts from any Jua model, the results are returned as a JuaDataset object.

Overview

JuaDataset serves as a specialized wrapper around xarray Datasets, providing:

  • Extensions to xarray functionality for weather-specific operations

  • Convenient methods to save data to disk in Zarr format

  • Memory usage information and management
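To get a feel for why memory information matters, the in-memory size of a gridded variable can be estimated from its dimensions. The sketch below is a back-of-the-envelope calculation, not the SDK's own accounting. The helper name estimate_nbytes, the 0.25° grid spacing, the 49 lead times, and the float32 dtype are all illustrative assumptions.

```python
from math import prod


def estimate_nbytes(shape: tuple[int, ...], itemsize: int = 4) -> int:
    """Estimate the in-memory size of a dense array (hypothetical helper).

    itemsize=4 assumes float32 values; use 8 for float64.
    """
    return prod(shape) * itemsize


# Assumed example: 49 lead times over a 3 x 6 degree box on a 0.25 degree grid.
n_lead_times = 49
n_lat = int(3 / 0.25) + 1  # 13 grid points, counting both edges
n_lon = int(6 / 0.25) + 1  # 25 grid points, counting both edges

size = estimate_nbytes((n_lead_times, n_lat, n_lon))
print(f"{size / 1e6:.2f} MB per variable")
```

The estimate scales linearly with every dimension, so doubling the bounding box in both directions or requesting more variables multiplies the footprint accordingly.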

Working with JuaDataset

Getting a JuaDataset

from datetime import datetime

from jua import JuaClient
from jua.weather import Models

client = JuaClient()

# Get the forecast for all of Switzerland from the 1st of January 2024
model = client.weather.get_model(Models.EPT2)
forecast_data = model.get_forecasts(  # Returns a JuaDataset
    init_time=datetime(2024, 1, 1, 0),
    latitude=slice(45, 48),
    longitude=slice(5, 11),
)

Accessing Variables

You can access variables directly using dictionary syntax:

# Using string variable names
temperature = forecast_data["air_temperature_at_height_level_2m"]

# Using the Variables enum (recommended for type safety)
from jua.weather import Variables
temperature = forecast_data[Variables.AIR_TEMPERATURE_AT_HEIGHT_LEVEL_2M]
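The two access styles can coexist when the enum members behave like their string values. The following is a minimal sketch of that idea using only the standard library. ExampleVariable and the plain dict are hypothetical stand-ins, and whether the SDK's Variables enum is implemented this way is an assumption.

```python
from enum import Enum


class ExampleVariable(str, Enum):
    """Hypothetical stand-in for an SDK-style variable enum."""

    AIR_TEMPERATURE_2M = "air_temperature_at_height_level_2m"
    WIND_SPEED_10M = "wind_speed_at_height_level_10m"


# A plain dict stands in for a dataset keyed by variable name.
data = {
    "air_temperature_at_height_level_2m": [271.3, 272.0, 273.4],
    "wind_speed_at_height_level_10m": [3.1, 4.7, 5.2],
}

# A str-mixin enum member hashes and compares like its string value,
# so both access styles reach the same entry.
by_string = data["air_temperature_at_height_level_2m"]
by_enum = data[ExampleVariable.AIR_TEMPERATURE_2M]

# A typo in a string key only fails when the lookup runs (KeyError),
# while a typo in an enum member name is an AttributeError that editors
# and type checkers can flag before the code runs.
```

This is the practical meaning of "type safety" here: misspelled names surface earlier, and autocompletion lists the valid variables.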

Working with xarray

JuaDataset seamlessly integrates with xarray's functionality:

# Convert to xarray Dataset
ds = forecast_data.to_xarray()

# Direct variable access returns a TypedDataArray (extended xarray.DataArray)
temperature = forecast_data[Variables.AIR_TEMPERATURE_AT_HEIGHT_LEVEL_2M]

# Use xarray's powerful selection methods
temp_zurich = temperature.sel(latitude=47.3769, longitude=8.5417, method="nearest")

# Convert temperature from Kelvin to Celsius
temperature_celsius = temperature.to_celcius()

# Convert lead time to absolute time
temperature_abs_time = temperature.to_absolute_time()

# Create visualizations
temperature.plot()
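The two conversions above are simple to reason about by hand. The sketch below illustrates the arithmetic they imply, using only the standard library. It is not the SDK's implementation, and the helper names kelvin_to_celsius and to_absolute_times, the 6-hour lead-time step, and the sample values are assumptions made for illustration.

```python
from datetime import datetime, timedelta

KELVIN_OFFSET = 273.15


def kelvin_to_celsius(kelvin: float) -> float:
    """Convert a temperature from Kelvin to degrees Celsius."""
    return kelvin - KELVIN_OFFSET


def to_absolute_times(init_time: datetime, lead_hours: list[int]) -> list[datetime]:
    """Turn lead times (hours after the forecast run) into valid times."""
    return [init_time + timedelta(hours=h) for h in lead_hours]


# Assumed sample data: one forecast run with three lead times.
init_time = datetime(2024, 1, 1, 0)
lead_hours = [0, 6, 12]
temps_kelvin = [273.15, 275.65, 278.15]

valid_times = to_absolute_times(init_time, lead_hours)
temps_celsius = [kelvin_to_celsius(t) for t in temps_kelvin]

for when, temp in zip(valid_times, temps_celsius):
    print(when.isoformat(), f"{temp:.1f} C")
```

Absolute time is usually the more convenient axis for plotting against observations, while lead time is the natural axis for comparing forecast skill across runs.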

Saving Data

JuaDataset can be saved to disk in Zarr format for later use:

# Save with default settings (to ~/.jua/datasets/<model_name>/<dataset_name>.zarr)
forecast_data.save()

# Save to custom location with progress bar
from pathlib import Path
forecast_data.save(
    output_path=Path("./my_data/forecast"),
    show_progress=True,
    overwrite=True
)
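To locate a dataset saved with the default settings, the documented pattern ~/.jua/datasets/<model_name>/<dataset_name>.zarr can be rebuilt with pathlib. This is a sketch only: default_dataset_path is a hypothetical helper, not part of the SDK, and the model and dataset names passed to it are placeholder values.

```python
from pathlib import Path


def default_dataset_path(model_name: str, dataset_name: str) -> Path:
    """Build ~/.jua/datasets/<model_name>/<dataset_name>.zarr (hypothetical helper)."""
    return Path.home() / ".jua" / "datasets" / model_name / f"{dataset_name}.zarr"


# Placeholder names; the actual names depend on the model and dataset used.
path = default_dataset_path("ept2", "forecast_2024-01-01")
print(path)
print("already saved:", path.exists())
```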

Obtaining Statistics from Ensemble Models

Ensemble statistics are available from EPT-2e. By default, get_forecasts returns only the ensemble mean. To obtain other statistics, pass the ones you need through the statistics argument.

from jua.weather import Statistics

# Select the ensemble model
model = client.weather.get_model(Models.EPT2_E)

# Using strings
forecast_data = model.get_forecasts(
    latitude=51.5,
    longitude=-0.12,
    statistics=["mean", "std", "q5", "q95"]
)

# Using the Statistics enum (recommended for type safety)
forecast_data = model.get_forecasts(
    latitude=51.5,
    longitude=-0.12,
    statistics=[
        Statistics.MEAN,
        Statistics.STD,
        Statistics.QUANTILE_5,
        Statistics.QUANTILE_95,
    ]
)

Statistics can then be accessed through the stat coordinate:

import matplotlib.pyplot as plt

# Convert to xarray, then pick the first point and forecast run
ds_forecast = forecast_data.to_xarray()
ds_forecast = ds_forecast.isel(points=0, init_time=0)
wind_speed = ds_forecast[Variables.WIND_SPEED_AT_HEIGHT_LEVEL_10M]
wind_speed = wind_speed.to_absolute_time()

# Create visualizations
wind_speed.sel(stat=Statistics.MEAN).plot(label=Statistics.MEAN.display_name)
wind_speed.sel(stat=Statistics.QUANTILE_5).plot(label=Statistics.QUANTILE_5.display_name)
wind_speed.sel(stat=Statistics.QUANTILE_95).plot(label=Statistics.QUANTILE_95.display_name)
plt.legend()
plt.show()
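For intuition about what these statistics summarize, the sketch below computes a mean, standard deviation, and 5th and 95th percentiles across a small set of ensemble members with the standard library. It is an illustration, not the SDK's computation: the member values are made up, reading q5 and q95 as the 5th and 95th percentiles is an assumption, and so are the population standard deviation and the "inclusive" quantile method.

```python
import statistics

# Assumed 10 m wind speeds (m/s) from ten ensemble members at one lead time.
members = [3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0]

# quantiles(n=20) returns 19 cut points at 5 % steps: the first is the
# 5th percentile and the last is the 95th.
cut_points = statistics.quantiles(members, n=20, method="inclusive")

summary = {
    "mean": statistics.fmean(members),
    # Population standard deviation; the SDK's convention is not stated.
    "std": statistics.pstdev(members),
    "q5": cut_points[0],
    "q95": cut_points[-1],
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Plotting the mean together with the q5 to q95 band, as in the example above, gives a quick view of the forecast's central estimate and its spread.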

Best Practices

  1. Use the Variables enum for type-safe access to variables

  2. Convert to xarray for complex operations and analysis

  3. Save large datasets to disk for repeated use

Complete Example

from datetime import datetime

import matplotlib.pyplot as plt
from jua import JuaClient
from jua.weather import Models, Variables

# Initialize client
client = JuaClient()

# Get a specific forecast for a model
#   specify the forecast run to obtain data for (init_time)
#   specify the latitude, longitude to get the data for
model = client.weather.get_model(Models.EPT2)
forecast_paris = model.get_forecasts(
    init_time=datetime(2024, month=8, day=19, hour=0),
    latitude=48.8566,
    longitude=2.3522,
    method="bilinear",
    variables=[
        Variables.AIR_TEMPERATURE_AT_HEIGHT_LEVEL_2M,
        Variables.WIND_SPEED_AT_HEIGHT_LEVEL_10M,
    ],
)

# Access temperature data
temperature = forecast_paris[Variables.AIR_TEMPERATURE_AT_HEIGHT_LEVEL_2M]

# Convert to Celsius and plot
temperature.to_celcius().plot()
plt.title("Temperature Forecast for Paris")
plt.ylabel("Temperature (°C)")
plt.show()

# Access wind data
wind_speed = forecast_paris[Variables.WIND_SPEED_AT_HEIGHT_LEVEL_10M]

# Plot wind speed
wind_speed.plot()
plt.title("Wind Speed (10m) for Paris")
plt.ylabel("Wind Speed (m/s)")
plt.show()

# Save the dataset for later use
forecast_paris.save(show_progress=True)
