Response Formats

The Jua Query Engine supports three response formats optimized for different use cases: JSON, Apache Arrow, and Arrow Streaming. This guide explains each format and provides examples for working with responses in Python and JavaScript.


Format Overview

| Feature | JSON | Arrow | Arrow Streaming |
|---|---|---|---|
| Query Parameter | ?format=json | ?format=arrow | ?format=arrow&stream=true |
| Content-Type | application/json | application/vnd.apache.arrow.stream | application/vnd.apache.arrow.stream |
| Max Rows | 50,000 | 5,000,000 | 1,000,000,000 |
| Use Case | Small queries, quick testing | Medium to large datasets | Very large datasets |
| Memory Usage | Higher | Lower | Lowest (incremental) |
| Parse Speed | Slower | Faster | Fastest |
| Best For | Quick exploration, small data | Data analysis, medium data | Production, historical data |
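
As a rough illustration of the query parameters listed above, the sketch below builds the request URL for each format. The base URL is a placeholder (the real endpoint is not given in this guide), and `build_query_url` and `FORMAT_PARAMS` are hypothetical names introduced only for this example.

```python
from urllib.parse import urlencode

# Placeholder endpoint: an assumption, not the real Jua query URL.
BASE_URL = "https://example.invalid/v1/query"

# Query parameters per response format, as documented in the overview.
FORMAT_PARAMS = {
    "json": {"format": "json"},
    "arrow": {"format": "arrow"},
    "arrow_streaming": {"format": "arrow", "stream": "true"},
}


def build_query_url(response_format: str, base_url: str = BASE_URL) -> str:
    """Return base_url with the query string for the requested format."""
    if response_format not in FORMAT_PARAMS:
        raise ValueError(f"unknown response format: {response_format!r}")
    return f"{base_url}?{urlencode(FORMAT_PARAMS[response_format])}"


for name in FORMAT_PARAMS:
    print(name, "->", build_query_url(name))
```

Keeping the parameters in one mapping means the format is chosen in a single place, which makes it easy to switch a query from JSON to Arrow as the result size grows.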


JSON Format

Description

JSON format returns data in a columnar structure where each column is a key with an array of values. This format is human-readable and easy to work with but becomes inefficient for large datasets.
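
To make the columnar layout concrete, here is a minimal sketch that turns a column-keyed object into a list of row dictionaries using only the standard library. It assumes the response body is a flat object mapping column names to equal-length arrays; the payload, the column names, and the helper `columns_to_rows` are invented for illustration and may not match the actual response.

```python
import json

# Hypothetical payload: column names and values are made up for this example.
raw = (
    '{"time": ["2024-01-01T00:00:00Z", "2024-01-01T01:00:00Z"], '
    '"temperature": [271.4, 272.1]}'
)


def columns_to_rows(columns: dict) -> list:
    """Convert {"col": [v0, v1]} into [{"col": v0}, {"col": v1}]."""
    lengths = {len(values) for values in columns.values()}
    if len(lengths) > 1:
        raise ValueError("columns have different lengths")
    names = list(columns)
    # zip(*values) walks all columns in lockstep, yielding one tuple per row.
    return [dict(zip(names, row)) for row in zip(*columns.values())]


columns = json.loads(raw)
rows = columns_to_rows(columns)
print(rows[0])
```

The columnar form is compact on the wire because each column name appears once, but every value is still parsed into a Python object, which is why this format gets slow for large results.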

Request

Add ?format=json to the query endpoint. JSON is the default format.

Response Structure

Limitations

  • Maximum 50,000 rows

  • Larger memory footprint compared to Arrow

  • Slower parsing for large datasets

  • Not suitable for production queries with large data volumes

When to Use

  • Quick testing and exploration

  • Small queries (<10k rows)

  • Debugging query structure

  • Simple web applications with limited data needs


Apache Arrow Format

Description

Apache Arrow is a high-performance columnar data format designed for efficient data interchange. It provides zero-copy reads and significantly faster parsing compared to JSON.

Request

Add ?format=arrow to the query endpoint.

Response Format

Returns a binary stream in Apache Arrow IPC format. The response must be parsed using an Arrow library.

Limitations

  • Maximum 5,000,000 rows (non-streaming)

  • Requires Arrow library to parse

  • Binary format (not human-readable)

When to Use

  • Medium to large datasets (10k - 1M rows)

  • Data science workflows with pandas/polars

  • When performance is important

  • Batch processing


Arrow Streaming Format

Description

Arrow Streaming builds on the Arrow format but streams data in chunks, allowing you to process datasets larger than available memory. This is the most efficient format for very large queries.

Request

Add ?format=arrow&stream=true to the query endpoint.

Response Format

Returns a chunked binary stream in Apache Arrow IPC format. Data arrives incrementally and can be processed as it's received.

Limitations

  • Requires Arrow library with streaming support

  • Binary format (not human-readable)

  • Cannot easily inspect data during transfer

When to Use

  • Very large datasets (>1M rows)

  • Historical data queries

  • Production applications

  • When memory is constrained

  • Long-running queries

Are you using Python? Jua's Python SDK handles requests and streaming responses for you.


Choosing the Right Format

Decision Flow
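
One way to express the guidance from the format sections as code is a small helper that picks a format from the expected row count. The thresholds are taken from the "When to Use" lists (under 10k rows for JSON, 10k to 1M for Arrow, over 1M for Arrow Streaming) and the hard limit from the Format Overview; the function name and the idea of estimating rows up front are assumptions of this sketch.

```python
# Hard row limits per format, from the Format Overview.
MAX_ROWS = {
    "json": 50_000,
    "arrow": 5_000_000,
    "arrow_streaming": 1_000_000_000,
}


def choose_format(expected_rows: int) -> str:
    """Pick a response format from a rough estimate of the result size."""
    if expected_rows < 0:
        raise ValueError("expected_rows must be non-negative")
    if expected_rows > MAX_ROWS["arrow_streaming"]:
        raise ValueError("query exceeds the maximum supported row count")
    if expected_rows < 10_000:
        return "json"
    if expected_rows <= 1_000_000:
        return "arrow"
    return "arrow_streaming"


for estimate in (500, 250_000, 40_000_000):
    print(estimate, "->", choose_format(estimate))
```

The recommended thresholds sit well below the hard limits, so a query slightly above a threshold still succeeds in the smaller format; it is just less efficient.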

Recommendations by Use Case

| Use Case | Recommended Format | Reason |
|---|---|---|
| API testing in browser | JSON | Easy to inspect |
| Dashboard (live data) | JSON or Arrow | Fast updates, moderate data |
| Data analysis (Jupyter) | Arrow | Fast pandas conversion |
| Historical data download | Arrow Streaming | Handles large volumes |
| Production ETL pipeline | Arrow or Arrow Streaming | Most efficient |


Examples

Prerequisites

Example 1: JSON Format

Output:


Example 2: Apache Arrow Format

Output:


Example 3: Arrow Streaming Format

Output:


Next Steps

  • Query Structure: Learn how to construct queries in docs-query-structure.md

  • Examples: See complete examples in docs-examples.md

  • API Reference: Explore all endpoints in the OpenAPI documentation