Quickstart Examples

These examples provide a rapid introduction to accessing Jua's data products using Python. They focus on common initial tasks like finding and loading the latest forecast data.

Before running these scripts, ensure you have:

  1. Set up your API credentials by following the instructions in our Getting Started guide. The scripts read your Key ID and Secret from the environment variables JUA_API_KEY_ID and JUA_API_SECRET.

  2. Installed the necessary Python packages, which are listed with each example. More details are available in the documentation on recommended packages.
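
Both prerequisites can be checked before running anything that touches the network. The following is a minimal sketch for the first one: it reports which of the two credential variables are unset or empty. The helper name missing_credentials is hypothetical and not part of any Jua tooling.

```python
import os

# Variable names as used by the authentication code in this guide.
REQUIRED_VARS = ("JUA_API_KEY_ID", "JUA_API_SECRET")


def missing_credentials(env=os.environ):
    """Return the names of required variables that are unset or empty."""
    # Hypothetical helper for illustration; `env` can be any mapping.
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_credentials()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("All Jua credentials are set.")
```

Running this first gives a readable message instead of the KeyError that os.environ["JUA_API_KEY_ID"] raises when the variable is absent.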

Finding and Accessing the Latest Forecast

This example demonstrates how to programmatically find the most recent forecast initialization time available via Zarr/HTTP access and then load data for a specific lead time and location.

Dependencies

This script requires the following Python packages:

aiohttp
fsspec
requests
xarray >= 2023.6.0 # Ensure version supports decode_timedelta=False with zarr
zarr
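
To confirm that the packages above are installed, and to see which xarray version is present, a short check along these lines may help. It is a sketch that uses only the standard library (importlib.metadata, Python 3.8+); the helper name installed_versions is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version

# Package names from the dependency list above.
PACKAGES = ("aiohttp", "fsspec", "requests", "xarray", "zarr")


def installed_versions(names=PACKAGES):
    """Map each package name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions().items():
    print(f"{name}: {ver or 'NOT INSTALLED'}")
```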

Setting up Authentication

The script reads your Jua API Key ID and Secret from the environment variables JUA_API_KEY_ID and JUA_API_SECRET.

import os
from aiohttp import BasicAuth

# Retrieve credentials from environment variables
API_KEY_ID = os.environ["JUA_API_KEY_ID"]
API_SECRET = os.environ["JUA_API_SECRET"]

# Create authentication object for data requests
AUTH = BasicAuth(login=API_KEY_ID, password=API_SECRET)

# NB: It may take up to 5 minutes for new API keys to be deployed.
# In the meantime, you may get unauthorized errors.
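
Because a freshly created key can be rejected for a few minutes, it may be convenient to retry the first request instead of failing immediately. The sketch below is one possible approach, not part of the Jua tooling: call_with_retry is a hypothetical helper, and it assumes the raised exception carries the HTTP status code in a status attribute, as aiohttp.ClientResponseError does.

```python
import time


def call_with_retry(func, attempts=6, delay_seconds=60.0, sleep=time.sleep):
    """Call func(), retrying while it fails with HTTP status 401 or 403.

    Assumption: the exception exposes the status code as `exc.status`.
    Any other error, or the final failed attempt, is re-raised unchanged.
    """
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception as exc:
            unauthorized = getattr(exc, "status", None) in (401, 403)
            if not unauthorized or attempt == attempts:
                raise
            print(f"Unauthorized (attempt {attempt}/{attempts}); retrying...")
            sleep(delay_seconds)
```

With the defaults, the helper waits up to five times for 60 seconds each, which covers the five-minute deployment window. Any request can be wrapped in a lambda, for example call_with_retry(lambda: fs.ls(folder_url, refresh=True)) once fs and folder_url are defined.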

Script

This script uses fsspec to list available forecast directories and xarray to open a specific dataset.

import xarray as xr
from fsspec.implementations.http import HTTPFileSystem
from aiohttp import BasicAuth
import os

# --- Authentication Setup ---
API_KEY_ID = os.environ["JUA_API_KEY_ID"]
API_SECRET = os.environ["JUA_API_SECRET"]
AUTH = BasicAuth(login=API_KEY_ID, password=API_SECRET)

# --- Find Latest Forecast Initialization ---
# Create an HTTP FileSystem object with authentication
fs = HTTPFileSystem(client_kwargs={"auth": AUTH})

# Base URL for the EPT 1.5km forecasts
folder_url = "https://data.jua.ai/forecasts/ept-1.5/"

# List available forecast initialization directories (e.g., "2024010100").
# fs.ls returns a list of dictionaries; we keep the 'name' (full URL) of each
# directory entry. Authentication is already configured on fs via client_kwargs,
# so it does not need to be passed again here.
forecast_inits = sorted(
    f["name"]
    for f in fs.ls(folder_url, refresh=True)  # Ensure fresh listing
    if f["type"] == "directory"  # Skip any non-directory entries
)

# Get the path to the latest forecast initialization
if not forecast_inits:
    raise RuntimeError(f"No forecast initializations found under {folder_url}")
latest_init_path = forecast_inits[-1]
print(f"Latest forecast initialization path: {latest_init_path}")

# --- List and Sort Available Lead Times ---
# List Zarr files (lead times) within the latest initialization directory
lead_time_paths = [
    f["name"] 
    for f in fs.ls(latest_init_path, refresh=True)
    if f["name"].endswith(".zarr") # Ensure we only list .zarr files
]

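# Sort paths by the integer lead time value (e.g., 0, 1, 2, ...).
# A plain string sort would place "10.zarr" before "2.zarr".
# Note: str.removesuffix requires Python 3.9 or newer.
lead_time_paths.sort(
    key=lambda path: int(path.split("/")[-1].removesuffix(".zarr"))
)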

print(f"Found {len(lead_time_paths)} lead times.")
print(f"First available lead time: {lead_time_paths[0].split('/')[-1]}")
print(f"Last available lead time: {lead_time_paths[-1].split('/')[-1]}")

# --- Load Data for a Specific Lead Time and Location ---
# Select the path for the latest available lead time
latest_lead_time_path = lead_time_paths[-1]
print(f"\nOpening latest lead time: {latest_lead_time_path}")

# Open the Zarr dataset for this single lead time
single_lead_time_ds = xr.open_dataset(
    latest_lead_time_path,
    engine="zarr",
    decode_timedelta=False, # Keep prediction_timedelta as numeric hours
    storage_options={"auth": AUTH}, # Pass auth for data reading
)

# --- Example Data Selection ---
# Select the 'ssrd' (Surface Solar Radiation Downwards) variable
# at the first time step (time=0) for a specific lat/lon (nearest grid point)
latitude = 47.3
longitude = 8.5
variable = "ssrd"

selected_data = single_lead_time_ds[variable].sel(
    latitude=latitude, 
    longitude=longitude, 
    method="nearest"
).isel(time=0) # Assuming only one time step per file or selecting the first

print(f"\nSelected data ({variable}) at ({latitude}, {longitude}):")
print(selected_data)

# Close the dataset (good practice, though often handled by context managers)
single_lead_time_ds.close()
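
The initialization path and lead time path found by the script can also be combined into the time the forecast is valid for. The sketch below rests on two assumptions that are not confirmed by this page: that directory names such as "2024010100" are UTC timestamps in YYYYMMDDHH format, and that the number in front of ".zarr" is the lead time in hours. The helpers last_segment and valid_time are hypothetical, and the example paths are for illustration only.

```python
from datetime import datetime, timedelta, timezone


def last_segment(path):
    """Return the final path component, ignoring any trailing slash."""
    return path.rstrip("/").split("/")[-1]


def valid_time(init_path, lead_time_path):
    """Combine an initialization path and a lead time path into a datetime."""
    # Assumption: directory names are UTC timestamps formatted YYYYMMDDHH.
    init = datetime.strptime(last_segment(init_path), "%Y%m%d%H")
    init = init.replace(tzinfo=timezone.utc)
    # Assumption: the number before ".zarr" is the lead time in hours.
    lead_hours = int(last_segment(lead_time_path).removesuffix(".zarr"))
    return init + timedelta(hours=lead_hours)


# Illustrative paths only; real paths come from fs.ls in the script above.
example = valid_time(
    "https://data.jua.ai/forecasts/ept-1.5/2024010100/",
    "https://data.jua.ai/forecasts/ept-1.5/2024010100/36.zarr",
)
print(example.isoformat())  # 2024-01-02T12:00:00+00:00
```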

Key Concepts Demonstrated

  • Authenticating access to remote data using aiohttp.BasicAuth and fsspec.HTTPFileSystem.

  • Listing directory contents on a remote HTTP server using HTTPFileSystem.ls to find available forecast initializations.

  • Identifying and sorting forecast lead times based on their numerical value extracted from Zarr file paths.

  • Opening a remote Zarr dataset corresponding to a specific forecast initialization and lead time using xarray.open_dataset.

  • Selecting specific variables and geographical points (using nearest neighbour) from the loaded dataset using the xarray .sel and .isel methods.

  • Handling authentication details for both directory listing and data reading.
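
The point about sorting lead times by numerical value is easy to overlook, so here is a small standalone illustration. It uses made-up file names and a hypothetical helper, lead_time, that also tolerates a trailing slash in case a listing reports Zarr stores as directories.

```python
def lead_time(path):
    """Extract the integer lead time from a path such as '.../12.zarr'."""
    name = path.rstrip("/").split("/")[-1]
    return int(name.removesuffix(".zarr"))


# Hypothetical file names, for illustration only.
paths = ["init/10.zarr", "init/2.zarr", "init/0.zarr", "init/1.zarr"]

# Lexicographic order places 10 before 2.
print(sorted(paths))
# Numeric order gives 0, 1, 2, 10.
print(sorted(paths, key=lead_time))
```

Sorting with the numeric key is what makes lead_time_paths[-1] in the script the longest lead time rather than the alphabetically last file name.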
