File Access
This guide demonstrates how to access Jua's weather forecasts through our HTTP file server at https://data.jua.ai/forecasts/.
Authentication
To access the file server, use the same API key credentials that you use for the API. These consist of a key ID and a secret. The file server uses HTTP Basic authentication, with the key ID as the username and the secret as the password.
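To illustrate what HTTP Basic authentication involves, the sketch below builds the Authorization header by hand using only the Python standard library. The function name basic_auth_header and the placeholder credentials are assumptions made for this example. In practice, an HTTP client will usually build this header for you.

```python
import base64


def basic_auth_header(key_id: str, key_secret: str) -> dict:
    """Build an HTTP Basic Authorization header from an API key ID and secret."""
    # Basic auth joins the two parts with a colon and base64-encodes the result.
    raw = f"{key_id}:{key_secret}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Placeholder credentials, for illustration only (not real keys).
headers = basic_auth_header("my-key-id", "my-key-secret")
print(headers)
```

Any client that accepts a username and password for Basic authentication should produce an equivalent header when given the key ID and the secret.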
File format
We use xarray-compatible Zarr datasets for our forecasts. These datasets contain all the metadata needed to read them conveniently using the Xarray Python library.
For more help on using Zarr and Xarray, please see the official documentation:
Each Zarr dataset contains the full set of weather parameters for the corresponding forecast data. For the full list of available parameters, please see:
Folder structure
Our file server is located at https://data.jua.ai. All forecasts are uploaded to this server, but the path structure differs between model versions.
EPT-1.5 model
For the EPT-1.5 model, each lead time is stored as a separate Zarr file at the following path:
https://data.jua.ai/forecasts/MODEL/RUNDATEHOUR/LEADTIME.zarr
where:
MODEL is the name of the model (for example ept-1.5 or ept-1.5-early)
RUNDATEHOUR is the initialisation date/time in UTC of the forecast in format "YYYYMMDDHH", for example 2025020106
LEADTIME is the number of hours since the initialisation time for this specific predicted timestamp
For example:
https://data.jua.ai/forecasts/ept-1.5/2025020106/24.zarr
where:
MODEL is ept-1.5
RUNDATEHOUR is 2025020106, or the 1st of February 2025 at 06:00
LEADTIME is 24 (hours)
So this dataset will contain the weather forecast for the 2nd of February 2025 at 06:00 UTC, as predicted by the forecast initiated at 06:00 on the 1st of February.
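The path scheme and the lead-time arithmetic above can be sketched with the standard library alone. The helper names ept15_url and valid_time are hypothetical, and the sketch assumes that LEADTIME is written without zero padding, as in the 24.zarr example.

```python
from datetime import datetime, timedelta, timezone

BASE_URL = "https://data.jua.ai/forecasts"


def ept15_url(model: str, init_time: datetime, lead_time_hours: int) -> str:
    """Return the per-lead-time Zarr URL for one forecast run."""
    run = init_time.strftime("%Y%m%d%H")  # RUNDATEHOUR, e.g. 2025020106
    return f"{BASE_URL}/{model}/{run}/{lead_time_hours}.zarr"


def valid_time(init_time: datetime, lead_time_hours: int) -> datetime:
    """Return the timestamp that a given lead time predicts."""
    return init_time + timedelta(hours=lead_time_hours)


init = datetime(2025, 2, 1, 6, tzinfo=timezone.utc)
print(ept15_url("ept-1.5", init, 24))
# https://data.jua.ai/forecasts/ept-1.5/2025020106/24.zarr
print(valid_time(init, 24).isoformat())
# 2025-02-02T06:00:00+00:00
```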
EPT-2 model
For the EPT-2 model, all lead times for a given forecast run are stored in a single Zarr file at the following path:
https://data.jua.ai/forecasts/MODEL/RUNDATEHOUR.zarr
where:
MODEL is the name of the model (for example ept-2)
RUNDATEHOUR is the initialisation date/time in UTC of the forecast in format "YYYYMMDDHH", for example 2025020106
For example:
https://data.jua.ai/forecasts/ept-2/2025020106.zarr
This single Zarr dataset contains all lead times for the forecast initialised on the 1st of February 2025 at 06:00 UTC.
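Because RUNDATEHOUR is always expressed in UTC, it can help to convert timestamps explicitly before building a path. The sketch below shows one way to do that. The helper names parse_run_date_hour and ept2_url are hypothetical, and ept2_url assumes it is given a timezone-aware datetime.

```python
from datetime import datetime, timedelta, timezone


def parse_run_date_hour(run: str) -> datetime:
    """Parse a RUNDATEHOUR string such as "2025020106" into a UTC datetime."""
    return datetime.strptime(run, "%Y%m%d%H").replace(tzinfo=timezone.utc)


def ept2_url(model: str, init_time: datetime) -> str:
    """Return the single-file Zarr URL for a run (init_time must be timezone-aware)."""
    run = init_time.astimezone(timezone.utc).strftime("%Y%m%d%H")
    return f"https://data.jua.ai/forecasts/{model}/{run}.zarr"


# 07:00 at UTC+1 is the same instant as 06:00 UTC.
local_init = datetime(2025, 2, 1, 7, tzinfo=timezone(timedelta(hours=1)))
print(ept2_url("ept-2", local_init))
# https://data.jua.ai/forecasts/ept-2/2025020106.zarr
print(parse_run_date_hour("2025020106").isoformat())
# 2025-02-01T06:00:00+00:00
```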
Accessing Forecast Data
Below are examples of how to authenticate and access forecast data for each model type. Before getting started, you will require several Python packages for the examples to work, namely:
xarray
aiohttp
fsspec
zarr
dask
We recommend installing the dependencies in a Python virtual environment.
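As a quick sanity check, the following sketch reports which of the packages listed above cannot be imported in the current environment. The helper name missing_packages is an assumption made for this example. It only checks that each package can be found, not that its version is compatible.

```python
from importlib.util import find_spec

REQUIRED = ["xarray", "aiohttp", "fsspec", "zarr", "dask"]


def missing_packages(names):
    """Return the package names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```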
Authentication Setup
First, import the required libraries and set up authentication which will be used in all examples:
import os
from aiohttp import BasicAuth
import xarray as xr
import numpy as np
API_KEY_ID = os.environ["JUA_API_KEY_ID"]
API_KEY_SECRET = os.environ["JUA_API_KEY_SECRET"]
AUTH = BasicAuth(API_KEY_ID, API_KEY_SECRET)
Accessing EPT-2 Forecast Data
For EPT-2, all lead times are stored in a single Zarr file:
# Define the path to the specific forecast
model = "ept-2" # or "ept-1.5-b"
init_time = "2025061112" # 2025-06-11 at 12:00 UTC
zarr_url = f"https://data.jua.ai/forecasts/{model}/{init_time}.zarr"
ds = xr.open_dataset(
    zarr_url,
    engine="zarr",
    decode_timedelta=True,
    storage_options={"auth": AUTH}
)
# Display basic information
print(ds)
# The dataset already includes all lead times as a dimension
# You can access specific lead times like this:
lead_time = np.timedelta64(24, "h") # 24 hours ahead
ds_at_lead_time = ds.sel(prediction_timedelta=lead_time)
Checking Forecast Availability
When accessing the latest forecast, the Zarr files may still be partially written. Reading partially written files can result in errors or inconsistent data.
To make sure that the latest Zarr dataset is complete, use the API to check that the latest forecast is fully available.
Below is a partial example of checking whether a forecast is available. For more details, please refer to the API data access guide or the API reference documentation.
from datetime import datetime
import requests
AUTH_HEADER = {"X-API-Key": f"{API_KEY_ID}:{API_KEY_SECRET}"}
# In this example, we want the first two days' predictions of
# the EPT-1.5 forecast initialised on 2025-05-08 at 06:00 UTC
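# An explicit "+00:00" offset is used because datetime.fromisoformat()
# only accepts a trailing "Z" from Python 3.11 onwards
desired_init_time = datetime.fromisoformat("2025-05-08T06:00:00.000+00:00")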
desired_forecasted_hours = 48
# Get the latest available forecast metadata from the API
response = requests.get(
    "https://api.jua.ai/v1/forecasting/ept1_5/forecasts/latest",
    headers=AUTH_HEADER
).json()
found_init_time = datetime.fromisoformat(response["init_time"].replace("Z", "+00:00"))
available_forecasted_hours = response["available_forecasted_hours"]
# Check that our desired data is ready to be accessed:
if found_init_time > desired_init_time:
    print("Yes, we are accessing a historic forecast, therefore it should already be fully written")
elif found_init_time == desired_init_time and available_forecasted_hours >= desired_forecasted_hours:
    print("Yes, we are accessing the latest forecast, which already has the desired hours completed")
else:
    print("No, we are trying to access a forecast which is not yet fully available")
When attempting to retrieve the most recent forecast, poll for the latest available init_time and available_forecasted_hours via the API above and only read the data once it's available. Also refer to the expected dissemination times.
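One possible shape for such a polling loop is sketched below. The function names is_available and wait_until_available, the polling interval and the attempt limit are all assumptions made for this example. The API call is passed in as a fetch_latest callable so that the sketch runs without network access. It also assumes that init_time is returned as an ISO 8601 string with a UTC offset or a trailing "Z".

```python
import time
from datetime import datetime, timezone


def is_available(latest: dict, desired_init_time: datetime, desired_hours: int) -> bool:
    """Apply the availability check to one 'latest forecast' response."""
    found = datetime.fromisoformat(latest["init_time"].replace("Z", "+00:00"))
    if found > desired_init_time:
        return True  # A newer run exists, so the desired run should be complete.
    return found == desired_init_time and latest["available_forecasted_hours"] >= desired_hours


def wait_until_available(fetch_latest, desired_init_time, desired_hours,
                         interval_s=60.0, max_attempts=30, sleep=time.sleep):
    """Poll fetch_latest() until the desired data is ready or attempts run out."""
    for attempt in range(max_attempts):
        if is_available(fetch_latest(), desired_init_time, desired_hours):
            return True
        if attempt < max_attempts - 1:
            sleep(interval_s)
    return False


# Stub standing in for the real API call: 24 hours ready at first, then 48.
responses = iter([
    {"init_time": "2025-05-08T06:00:00Z", "available_forecasted_hours": 24},
    {"init_time": "2025-05-08T06:00:00Z", "available_forecasted_hours": 48},
])
desired = datetime(2025, 5, 8, 6, tzinfo=timezone.utc)
ready = wait_until_available(lambda: next(responses), desired, 48, sleep=lambda s: None)
print(ready)  # True
```

In real use, fetch_latest would wrap the requests.get call shown earlier and return the parsed JSON response.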
Accessing the Legacy EPT-1.5 Forecast Data
We currently support two access types for the EPT-1.5 and EPT-1.5-early models. The standard approach, described above for EPT-2, can be used to access the EPT-1.5 files under the model names ept-1.5-b and ept-1.5-early-b. The legacy approach is described below.
For the legacy EPT-1.5 format, each lead time is stored in a separate Zarr file. Here is how to open one or more lead times with Python:
# Define the path to the specific forecast
model = "ept-1.5" # or "ept-1.5-early"
init_time = "2025050806" # 2025-05-08 at 06:00 UTC
lead_time = 24 # 24 hours ahead
zarr_url = f"https://data.jua.ai/forecasts/{model}/{init_time}/{lead_time}.zarr"
# Open the remote dataset
ds = xr.open_dataset(
    zarr_url,
    engine="zarr",
    decode_timedelta=True,
    storage_options={"auth": AUTH}
)
# Print an overview of the dataset
print(ds)
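To open several lead times of the same legacy run, one option is to build one URL per lead time, open each dataset and concatenate them. The sketch below is a minimal illustration, not a definitive implementation. The helper names legacy_urls and open_legacy_lead_times are hypothetical, the host is the one given in the folder structure section, and concatenating along prediction_timedelta assumes that the legacy files use the same dimension name as the EPT-2 example.

```python
BASE_URL = "https://data.jua.ai/forecasts"


def legacy_urls(model, init_time, lead_times):
    """Return one Zarr URL per lead time (in hours) for a legacy forecast run."""
    return [f"{BASE_URL}/{model}/{init_time}/{lead}.zarr" for lead in lead_times]


def open_legacy_lead_times(model, init_time, lead_times, auth):
    """Open each lead time and combine them into a single dataset."""
    import xarray as xr  # Imported lazily so legacy_urls works without xarray.

    datasets = [
        xr.open_dataset(
            url,
            engine="zarr",
            decode_timedelta=True,
            storage_options={"auth": auth},
        )
        for url in legacy_urls(model, init_time, lead_times)
    ]
    # Assumption: the lead-time dimension is named "prediction_timedelta".
    return xr.concat(datasets, dim="prediction_timedelta")


print(legacy_urls("ept-1.5", "2025050806", [0, 24, 48]))
```

Calling open_legacy_lead_times("ept-1.5", "2025050806", [0, 24, 48], AUTH) would then return the three lead times as one dataset, provided that all three files have been fully written.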