JuaDataset
The JuaDataset is the primary container for weather data in the Jua Python SDK. When you request weather forecasts or hindcasts from any Jua model, the results are returned as a JuaDataset object.
Overview
JuaDataset is a specialized wrapper around xarray Datasets that provides:
Extensions to xarray functionality for weather-specific operations
Convenient methods to save data to disk in Zarr format
Memory usage information and management
Working with JuaDataset
Getting a JuaDataset
from jua import JuaClient
from jua.weather import Models
client = JuaClient()
# Get a forecast
model = client.weather.get_model(Models.EPT2)
forecast_data = model.forecast.get_forecast() # Returns a JuaDataset
Accessing Variables
You can access variables directly using dictionary syntax:
# Using string variable names
temperature = forecast_data["air_temperature_at_height_level_2m"]
# Using the Variables enum (recommended for type safety)
from jua.weather import Variables
temperature = forecast_data[Variables.AIR_TEMPERATURE_AT_HEIGHT_LEVEL_2M]
Working with xarray
JuaDataset integrates with xarray's functionality:
# Convert to xarray Dataset
ds = forecast_data.to_xarray()
# Direct variable access returns a TypedDataArray (extended xarray.DataArray)
temperature = forecast_data[Variables.AIR_TEMPERATURE_AT_HEIGHT_LEVEL_2M]
# Use xarray's powerful selection methods
temp_london = temperature.sel(latitude=51.5, longitude=-0.12, method="nearest")
# Convert temperature from Kelvin to Celsius
temperature_celsius = temperature.to_celcius()
# Convert lead time to absolute time
temperature_abs_time = temperature.to_absolute_time()
# Create visualizations
temperature.plot()
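The two conversions above can be illustrated without the SDK. The following is a minimal standard-library sketch of the underlying arithmetic, not the SDK's implementation: it assumes temperatures arrive in kelvin and lead times are hours after the forecast initialization time. The helper names and sample values are hypothetical.

```python
from datetime import datetime, timedelta, timezone

KELVIN_OFFSET = 273.15  # 0 °C expressed in kelvin


def kelvin_to_celsius(values_k):
    """Convert an iterable of temperatures from kelvin to degrees Celsius."""
    return [v - KELVIN_OFFSET for v in values_k]


def lead_to_absolute(init_time, lead_hours):
    """Turn lead times (hours after init_time) into absolute timestamps."""
    return [init_time + timedelta(hours=h) for h in lead_hours]


# Illustrative values only, not real forecast output
init_time = datetime(2024, 1, 1, 0, tzinfo=timezone.utc)
temps_k = [273.15, 283.15, 293.15]
lead_hours = [0, 6, 12]

temps_c = kelvin_to_celsius(temps_k)
valid_times = lead_to_absolute(init_time, lead_hours)

for t, c in zip(valid_times, temps_c):
    print(f"{t.isoformat()}  {c:.2f} °C")
```

In practice the SDK methods shown above perform these steps on the whole array at once, so a hand-written version like this is only useful for checking results.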
Saving Data
A JuaDataset can be saved to disk in Zarr format for later use:
# Save with default settings (to ~/.jua/datasets/<model_name>/<dataset_name>.zarr)
forecast_data.save()
# Save to custom location with progress bar
from pathlib import Path
forecast_data.save(
    output_path=Path("./my_data/forecast"),
    show_progress=True,
    overwrite=True,
)
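To locate a dataset saved with the default settings, it can help to build the expected path explicitly. The sketch below takes the pattern from the comment above literally. The helper `default_save_path` and the sample model and dataset names are hypothetical, and the SDK's actual naming may differ.

```python
from pathlib import Path


def default_save_path(model_name: str, dataset_name: str) -> Path:
    """Hypothetical helper: build ~/.jua/datasets/<model_name>/<dataset_name>.zarr."""
    return Path.home() / ".jua" / "datasets" / model_name / f"{dataset_name}.zarr"


# Placeholder names, chosen only for illustration
path = default_save_path("ept2", "my_forecast")
print(path)
```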
Obtaining Statistics from Ensemble Models
Ensemble statistics can be obtained from EPT-2e. The statistics available for a given forecast are returned in the forecast metadata.
model = client.weather.get_model(Models.EPT2_E)
metadata = model.forecast.get_metadata()
print("EPT-2e ensemble stats:")
print(metadata.available_ensemble_stats)
By default, get_forecast returns only the ensemble mean. To obtain other forecast statistics, specify which ones are needed.
from jua.weather import Statistics
# Get a forecast
model = client.weather.get_model(Models.EPT2_E)
# Using strings
forecast_data = model.forecast.get_forecast(
    latitude=51.5,
    longitude=-0.12,
    statistics=["mean", "std", "q5", "q95"],
)
# Using the Statistics enum (recommended for type safety)
forecast_data = model.forecast.get_forecast(
    latitude=51.5,
    longitude=-0.12,
    statistics=[
        Statistics.MEAN,
        Statistics.STD,
        Statistics.QUANTILE_5,
        Statistics.QUANTILE_95,
    ],
)
Statistics can then be accessed through the stat coordinate:
import matplotlib.pyplot as plt
# Use xarray's powerful selection methods
ds_forecast = forecast_data.to_xarray()
ds_forecast = ds_forecast.isel(points=0, time=0)
wind_speed = ds_forecast[Variables.WIND_SPEED_AT_HEIGHT_LEVEL_10M]
wind_speed = wind_speed.to_absolute_time()
# Create visualizations
wind_speed.sel(stat=Statistics.MEAN).plot(label=Statistics.MEAN.display_name)
wind_speed.sel(stat=Statistics.QUANTILE_5).plot(label=Statistics.QUANTILE_5.display_name)
wind_speed.sel(stat=Statistics.QUANTILE_95).plot(label=Statistics.QUANTILE_95.display_name)
plt.legend()
plt.show()
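For intuition about what these statistics summarize, the sketch below computes a mean, standard deviation, and 5th and 95th percentiles from a small set of made-up ensemble member values using only the standard library. How EPT-2e derives its statistics is not described here, so the population standard deviation and the inclusive quantile method are assumptions chosen for illustration.

```python
import statistics

# Hypothetical wind speeds (m/s) from a 10-member ensemble at one point and time
members = [4.2, 5.1, 4.8, 6.3, 5.5, 4.9, 5.8, 5.0, 6.1, 4.6]

mean = statistics.fmean(members)
std = statistics.pstdev(members)  # population standard deviation (assumed)

# quantiles(n=20) returns 19 cut points at 5 %, 10 %, ..., 95 %
cuts = statistics.quantiles(members, n=20, method="inclusive")
q5, q95 = cuts[0], cuts[-1]

summary = {"mean": mean, "std": std, "q5": q5, "q95": q95}
print(summary)
```

The q5 to q95 band plotted above can be read the same way: roughly 90 % of ensemble members fall between the two curves.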
Best Practices
Use the Variables enum for type-safe access to variables
Convert to xarray for complex operations and analysis
Save large datasets to disk for repeated use
Complete Example
from jua import JuaClient
from jua.weather import Models, Variables
import matplotlib.pyplot as plt
# Initialize client
client = JuaClient()
# Get a forecast for a specific model
model = client.weather.get_model(Models.EPT2)
forecast = model.forecast.get_forecast()
# Access temperature data
temperature = forecast[Variables.AIR_TEMPERATURE_AT_HEIGHT_LEVEL_2M]
# Select data for a specific location and time range
temp_paris = temperature.sel(
    latitude=48.8566,
    longitude=2.3522,
    method="nearest",
).sel(time=slice("2023-07-01", "2023-07-07"))
# Convert to Celsius and plot
temp_paris.to_celsius().plot()
plt.title("Temperature Forecast for Paris")
plt.ylabel("Temperature (°C)")
plt.show()
# Save the dataset for later use
forecast.save(show_progress=True)
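The example above relies on method="nearest", which in xarray picks, for each dimension independently, the coordinate value closest to the one requested. A minimal standard-library sketch of that lookup follows. The grid values and the helper `nearest_index` are hypothetical and do not reflect any particular Jua model's grid.

```python
def nearest_index(coords, target):
    """Return the index of the coordinate value closest to target."""
    return min(range(len(coords)), key=lambda i: abs(coords[i] - target))


# Hypothetical 0.25-degree grid around Paris
latitudes = [48.5, 48.75, 49.0, 49.25]
longitudes = [2.0, 2.25, 2.5, 2.75]

i = nearest_index(latitudes, 48.8566)
j = nearest_index(longitudes, 2.3522)
print(latitudes[i], longitudes[j])
```

Because the returned point is the nearest grid cell rather than the exact location, the selected coordinates are worth checking when a site lies between grid points.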