# JuaDataset

`JuaDataset` is the primary container for weather data in the Jua Python SDK. Forecast and hindcast requests to any Jua model return their results as a `JuaDataset` object.

### Overview

`JuaDataset` serves as a specialized wrapper around [xarray](https://xarray.dev/) Datasets, providing:

* Extensions to xarray functionality for weather-specific operations
* Convenient methods to save data to disk in Zarr format
* Memory usage information and management

### Working with JuaDataset

#### Getting a JuaDataset

```python
from datetime import datetime

from jua import JuaClient
from jua.weather import Models

client = JuaClient()

# Get the forecast for all of Switzerland from the 1st of January 2024
model = client.weather.get_model(Models.EPT2)
forecast_data = model.get_forecasts(  # Returns a JuaDataset
    init_time=datetime(2024, 1, 1, 0),
    latitude=slice(45, 48),
    longitude=slice(5, 11),
)
```

#### Accessing Variables

You can access variables directly using dictionary syntax:

```python
# Using string variable names
temperature = forecast_data["air_temperature_at_height_level_2m"]

# Using the Variables enum (recommended for type safety)
from jua.weather import Variables
temperature = forecast_data[Variables.AIR_TEMPERATURE_AT_HEIGHT_LEVEL_2M]
```
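
To illustrate why the enum is the safer choice, here is a minimal standalone sketch. `DemoVariables`, `lookup`, and the toy dictionary are hypothetical stand-ins, not part of the Jua SDK; the sketch assumes only that each enum member carries the same string as the variable name.

```python
from enum import Enum


class DemoVariables(str, Enum):
    """Hypothetical stand-in for an enum of variable names (not the SDK's Variables)."""

    AIR_TEMPERATURE_AT_HEIGHT_LEVEL_2M = "air_temperature_at_height_level_2m"
    WIND_SPEED_AT_HEIGHT_LEVEL_10M = "wind_speed_at_height_level_10m"


def lookup(data: dict, key) -> list:
    """Accept either a plain string or an enum member as the key."""
    name = key.value if isinstance(key, Enum) else key
    return data[name]


# Toy data keyed by variable name (values in Kelvin and m/s)
data = {
    "air_temperature_at_height_level_2m": [275.2, 276.0],
    "wind_speed_at_height_level_10m": [3.1, 4.4],
}

by_string = lookup(data, "air_temperature_at_height_level_2m")
by_enum = lookup(data, DemoVariables.AIR_TEMPERATURE_AT_HEIGHT_LEVEL_2M)

# A misspelled enum member fails immediately with AttributeError, and static
# checkers or IDEs can flag it before the code runs. A misspelled string only
# fails once the lookup is attempted.
try:
    DemoVariables.AIR_TEMPERATURE_AT_HEIGHT_LEVEL_2N
    typo_caught = False
except AttributeError:
    typo_caught = True
```

Both access styles resolve to the same entry; the enum simply moves typo detection earlier.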

#### Working with xarray

`JuaDataset` seamlessly integrates with xarray's functionality:

```python
# Convert to xarray Dataset
ds = forecast_data.to_xarray()

# Direct variable access returns a TypedDataArray (extended xarray.DataArray)
temperature = forecast_data[Variables.AIR_TEMPERATURE_AT_HEIGHT_LEVEL_2M]

# Use xarray's powerful selection methods
temp_zurich = temperature.sel(latitude=47.3769, longitude=8.5417, method="nearest")

# Convert temperature from Kelvin to Celsius
temperature_celsius = temperature.to_celcius()

# Convert lead time to absolute time
temperature_abs_time = temperature.to_absolute_time()

# Create visualizations
temperature.plot()
```
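
The two conversions used above are easy to state in plain Python. The sketch below is not the SDK's implementation: it assumes temperatures in Kelvin and lead times given in hours from `init_time`, and the helper names are hypothetical.

```python
from datetime import datetime, timedelta


def kelvin_to_celsius(kelvin: float) -> float:
    """Convert Kelvin to degrees Celsius (hypothetical helper)."""
    return kelvin - 273.15


def lead_to_absolute(init_time: datetime, lead_hours: int) -> datetime:
    """Valid time of a forecast step: init time plus lead time (hypothetical helper)."""
    return init_time + timedelta(hours=lead_hours)


init_time = datetime(2024, 1, 1, 0)
lead_hours = [0, 6, 12, 24]  # assumed lead times, in hours
temps_kelvin = [273.15, 275.65, 278.15, 274.15]  # made-up values

valid_times = [lead_to_absolute(init_time, h) for h in lead_hours]
temps_celsius = [round(kelvin_to_celsius(t), 2) for t in temps_kelvin]

for when, temp in zip(valid_times, temps_celsius):
    print(f"{when:%Y-%m-%d %H:%M}  {temp:.1f} °C")
```

Working in absolute time is usually more convenient for plotting and for comparing forecasts from different runs, since the time axis then refers to the same moments regardless of `init_time`.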

### Saving Data

`JuaDataset` can be saved to disk in Zarr format for later use:

```python
# Save with default settings (to ~/.jua/datasets/<model_name>/<dataset_name>.zarr)
forecast_data.save()

# Save to custom location with progress bar
from pathlib import Path
forecast_data.save(
    output_path=Path("./my_data/forecast"),
    show_progress=True,
    overwrite=True
)
```
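
To locate a dataset saved with the default settings, the path pattern from the comment above can be rebuilt with `pathlib`. This is only a sketch: `default_dataset_path` is a hypothetical helper, the example names are invented, and how the SDK derives `<model_name>` and `<dataset_name>` is not specified here.

```python
from pathlib import Path


def default_dataset_path(model_name: str, dataset_name: str) -> Path:
    """Build ~/.jua/datasets/<model_name>/<dataset_name>.zarr (hypothetical helper)."""
    return Path.home() / ".jua" / "datasets" / model_name / f"{dataset_name}.zarr"


# Example values are made up for illustration
path = default_dataset_path("ept2", "forecast_2024_01_01")
print(path)
print("already saved:", path.exists())
```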

### Obtaining Statistics from Ensemble Models

Ensemble statistics are available from EPT-2e. By default, `get_forecasts` returns only the ensemble mean. To obtain other statistics, list them in the `statistics` argument.

```python
from jua.weather import Statistics

# Get a forecast
model = client.weather.get_model(Models.EPT2_E)

# Using strings
forecast_data = model.get_forecasts(
    latitude=51.5,
    longitude=-0.12,
    statistics=["mean", "std", "q5", "q95"]
)

# Using the Statistics enum (recommended for type safety)
forecast_data = model.get_forecasts(
    latitude=51.5,
    longitude=-0.12,
    statistics=[
        Statistics.MEAN,
        Statistics.STD,
        Statistics.QUANTILE_5,
        Statistics.QUANTILE_95,
    ]
)
```

Statistics can then be accessed through the `stat` coordinate:

```python
import matplotlib.pyplot as plt

from jua.weather import Statistics, Variables

# Select the first point and the first forecast run
ds_forecast = forecast_data.to_xarray()
ds_forecast = ds_forecast.isel(points=0, init_time=0)
wind_speed = ds_forecast[Variables.WIND_SPEED_AT_HEIGHT_LEVEL_10M]
wind_speed = wind_speed.to_absolute_time()

# Create visualizations
wind_speed.sel(stat=Statistics.MEAN).plot(label=Statistics.MEAN.display_name)
wind_speed.sel(stat=Statistics.QUANTILE_5).plot(label=Statistics.QUANTILE_5.display_name)
wind_speed.sel(stat=Statistics.QUANTILE_95).plot(label=Statistics.QUANTILE_95.display_name)
plt.legend()
plt.show()
```
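
For intuition about what these statistics summarize, the sketch below computes a mean, standard deviation, and 5th/95th percentiles from a toy ensemble using only the standard library. The member values are made up, and the choice of population standard deviation and of the "inclusive" quantile method are assumptions; this page does not say how EPT-2e computes its statistics.

```python
import statistics

# Toy ensemble: 10 m wind speed (m/s) from 20 hypothetical members at one lead time
members = [4.1, 4.8, 5.0, 5.2, 5.3, 5.5, 5.6, 5.8, 6.0, 6.1,
           6.2, 6.4, 6.5, 6.7, 6.9, 7.0, 7.3, 7.6, 8.0, 8.9]

mean = statistics.fmean(members)
std = statistics.pstdev(members)  # population std (an assumption, see above)

# quantiles(n=20) returns 19 cut points at 5 %, 10 %, ..., 95 %
cuts = statistics.quantiles(members, n=20, method="inclusive")
q5, q95 = cuts[0], cuts[-1]

print(f"mean={mean:.2f}  std={std:.2f}  q5={q5:.2f}  q95={q95:.2f}")
```

The band between `q5` and `q95` contains roughly 90 % of the members, which is why plotting it around the mean gives a quick view of forecast uncertainty.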

### Best Practices

1. **Use Variables enum** for type-safe access to variables
2. **Convert to xarray** for complex operations and analysis
3. **Save large datasets** to disk for repeated use

### Complete Example

```python
from datetime import datetime

import matplotlib.pyplot as plt

from jua import JuaClient
from jua.weather import Models, Variables

# Initialize client
client = JuaClient()

# Get a specific forecast for a model
#   specify the forecast run to obtain data for (init_time)
#   specify the latitude, longitude to get the data for
model = client.weather.get_model(Models.EPT2)
forecast_paris = model.get_forecasts(
    init_time=datetime(2024, month=8, day=19, hour=0),
    latitude=48.8566,
    longitude=2.3522,
    method="bilinear",
    variables=[
        Variables.AIR_TEMPERATURE_AT_HEIGHT_LEVEL_2M,
        Variables.WIND_SPEED_AT_HEIGHT_LEVEL_10M,
    ],
)

# Access temperature data
temperature = forecast_paris[Variables.AIR_TEMPERATURE_AT_HEIGHT_LEVEL_2M]

# Convert to Celsius and plot
temperature.to_celcius().plot()
plt.title("Temperature Forecast for Paris")
plt.ylabel("Temperature (°C)")
plt.show()

# Access wind data
wind_speed = forecast_paris[Variables.WIND_SPEED_AT_HEIGHT_LEVEL_10M]

# Plot wind speed
wind_speed.plot()
plt.title("Wind Speed (10m) for Paris")
plt.ylabel("Wind Speed (m/s)")
plt.show()

# Save the dataset for later use
forecast_paris.save(show_progress=True)
```
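
The example above requests `method="bilinear"` for a single point. As a rough illustration of what bilinear interpolation does, the sketch below interpolates inside one grid cell. The `bilinear` function, the grid spacing, and the corner values are all hypothetical; the SDK's actual interpolation is not documented on this page.

```python
def bilinear(lat, lon, lat0, lat1, lon0, lon1, v00, v01, v10, v11):
    """Interpolate inside one grid cell.

    v00 is the value at (lat0, lon0), v01 at (lat0, lon1),
    v10 at (lat1, lon0), v11 at (lat1, lon1).
    """
    ty = (lat - lat0) / (lat1 - lat0)  # fractional position along latitude
    tx = (lon - lon0) / (lon1 - lon0)  # fractional position along longitude
    south = v00 * (1 - tx) + v01 * tx  # interpolate along longitude at lat0
    north = v10 * (1 - tx) + v11 * tx  # interpolate along longitude at lat1
    return south * (1 - ty) + north * ty


# Made-up 2 m temperatures (K) at the corners of a cell around Paris
value = bilinear(
    lat=48.8566, lon=2.3522,
    lat0=48.75, lat1=49.0, lon0=2.25, lon1=2.5,
    v00=290.0, v01=291.0, v10=289.0, v11=290.5,
)
print(f"{value:.2f} K")
```

Unlike nearest-neighbour selection, which returns the value of the closest grid point, this weights the four surrounding grid points by their distance to the requested location.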


