Response Formats
The Jua Query Engine supports three response formats optimized for different use cases: JSON, Apache Arrow, and Arrow Streaming. This guide explains each format and provides examples for working with responses in Python and JavaScript.
Format Overview
| Feature | JSON | Apache Arrow | Arrow Streaming |
| --- | --- | --- | --- |
| Query Parameter | ?format=json | ?format=arrow | ?format=arrow&stream=true |
| Content-Type | application/json | application/vnd.apache.arrow.stream | application/vnd.apache.arrow.stream |
| Max Rows | 50,000 | 5,000,000 | 1,000,000,000 |
| Use Case | Small queries, quick testing | Medium to large datasets | Very large datasets |
| Memory Usage | Higher | Lower | Lowest (incremental) |
| Parse Speed | Slower | Faster | Fastest |
| Best For | Quick exploration, small data | Data analysis, medium data | Production, historical data |
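As a minimal sketch of how the three query parameters listed above could be attached to a request URL, the snippet below builds the query string for each format. The base URL and the `arrow_stream` label are placeholders invented for illustration; only the parameter names and values come from this page.

```python
from urllib.parse import urlencode

# Query parameters per format, as listed in the Format Overview.
FORMAT_PARAMS = {
    "json": {"format": "json"},
    "arrow": {"format": "arrow"},
    "arrow_stream": {"format": "arrow", "stream": "true"},
}

# Hypothetical endpoint, used only for illustration.
BASE_URL = "https://query.example.com/v1/query"


def build_query_url(fmt: str, base_url: str = BASE_URL) -> str:
    """Return base_url with the query string for the requested format."""
    if fmt not in FORMAT_PARAMS:
        raise ValueError(f"unknown format: {fmt!r}")
    return f"{base_url}?{urlencode(FORMAT_PARAMS[fmt])}"


print(build_query_url("json"))          # ...?format=json
print(build_query_url("arrow_stream"))  # ...?format=arrow&stream=true
```

Keeping the mapping in one place makes it harder to send `stream=true` with the wrong `format` value by accident.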
JSON Format
Description
The JSON format returns data in a columnar structure: each column name is a key whose value is an array of that column's values. It is human-readable and easy to work with, but becomes inefficient for large datasets.
Request
Add ?format=json to the query endpoint (this is the default):
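The original request sample is not available here, so the following is only a sketch of what such a call might look like with Python's standard library. The endpoint URL, the POST method, and the payload shape are all assumptions, and authentication headers are omitted because this page does not describe them.

```python
import json
import urllib.request

# Hypothetical endpoint and query payload, for illustration only.
QUERY_URL = "https://query.example.com/v1/query?format=json"
payload = {"example_field": "example_value"}


def build_json_request(url: str, query: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST request that asks for a JSON response."""
    return urllib.request.Request(
        url,
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def run_query(url: str, query: dict, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON body into a Python dict."""
    request = build_json_request(url, query)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Only the request object is built here; nothing is sent over the network.
request = build_json_request(QUERY_URL, payload)
print(request.get_method(), request.full_url)
```

Calling `run_query(QUERY_URL, payload)` would perform the actual round trip once the real endpoint and query body are substituted.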
Response Structure
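Since the response is columnar (one key per column, each holding an array of values), client code often needs to pivot it into rows. The sketch below shows one way to do that. The column names and values are made up, and it assumes the columns sit at the top level of the JSON object, which the real response may not do.

```python
import json

# Hypothetical response body; the column names and values are placeholders.
body = '{"time": ["2024-01-01T00:00", "2024-01-01T01:00"], "value": [1.5, 2.0]}'


def columns_to_rows(columns: dict) -> list:
    """Turn {"col": [v0, v1, ...]} into [{"col": v0}, {"col": v1}, ...]."""
    names = list(columns)
    lengths = {len(columns[name]) for name in names}
    if len(lengths) > 1:
        raise ValueError("columns have different lengths")
    # zip(*values) walks all columns in lockstep, yielding one tuple per row.
    return [dict(zip(names, values)) for values in zip(*columns.values())]


rows = columns_to_rows(json.loads(body))
print(rows[0])  # {'time': '2024-01-01T00:00', 'value': 1.5}
```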
Limitations
Maximum 50,000 rows
Larger memory footprint compared to Arrow
Slower parsing for large datasets
Not suitable for production queries with large data volumes
When to Use
Quick testing and exploration
Small queries (<10k rows)
Debugging query structure
Simple web applications with limited data needs
Apache Arrow Format
Description
Apache Arrow is a high-performance columnar data format designed for efficient data interchange. It provides zero-copy reads and significantly faster parsing compared to JSON.
Request
Add ?format=arrow to the query endpoint:
Response Format
Returns a binary stream in Apache Arrow IPC format. The response must be parsed using an Arrow library.
Limitations
Maximum 5,000,000 rows (non-streaming)
Requires Arrow library to parse
Binary format (not human-readable)
When to Use
Medium to large datasets (10k to 1M rows)
Data science workflows with pandas/polars
When performance is important
Batch processing
Arrow Streaming Format
Description
Arrow Streaming builds on the Arrow format but streams data in chunks, allowing you to process datasets larger than available memory. This is the most efficient format for very large queries.
Request
Add ?format=arrow&stream=true to the query endpoint:
Response Format
Returns a chunked binary stream in Apache Arrow IPC format. Data arrives incrementally and can be processed as it's received.
Limitations
Requires Arrow library with streaming support
Binary format (not human-readable)
Cannot easily inspect data during transfer
When to Use
Very large datasets (>1M rows)
Historical data queries
Production applications
When memory is constrained
Long-running queries
Are you using Python? Jua's Python SDK handles requests and streaming responses for you.
Choosing the Right Format
Decision Flow
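One way to express the choice in code is sketched below. It is derived from the row ranges in the "When to Use" lists and the documented streaming row limit, not from the original decision diagram, so treat the thresholds as a reasonable reading rather than an official rule. The function name and the `arrow_stream` label are hypothetical.

```python
# Thresholds taken from the "When to Use" lists on this page.
JSON_ROW_THRESHOLD = 10_000          # small queries (<10k rows)
STREAMING_ROW_THRESHOLD = 1_000_000  # very large datasets (>1M rows)
STREAMING_MAX_ROWS = 1_000_000_000   # documented maximum for Arrow Streaming


def choose_format(expected_rows: int) -> str:
    """Pick a response format from an estimated row count."""
    if expected_rows < 0:
        raise ValueError("expected_rows must be non-negative")
    if expected_rows > STREAMING_MAX_ROWS:
        raise ValueError("query exceeds the documented streaming row limit")
    if expected_rows > STREAMING_ROW_THRESHOLD:
        return "arrow_stream"
    if expected_rows >= JSON_ROW_THRESHOLD:
        return "arrow"
    return "json"


for n in (500, 250_000, 20_000_000):
    print(n, choose_format(n))
```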
Recommendations by Use Case
| Use Case | Recommended Format | Reason |
| --- | --- | --- |
| API testing in browser | JSON | Easy to inspect |
| Dashboard (live data) | JSON or Arrow | Fast updates, moderate data |
| Data analysis (Jupyter) | Arrow | Fast pandas conversion |
| Historical data download | Arrow Streaming | Handles large volumes |
| Production ETL pipeline | Arrow or Arrow Streaming | Most efficient |
Examples
Prerequisites (Python)
Example 1: JSON Format
Output:
Example 2: Apache Arrow Format
Output:
Example 3: Arrow Streaming Format
Output:
Prerequisites (JavaScript)
Example 1: JSON Format (Node.js)
Output:
Example 2: JSON Format (Browser)
Example 3: Apache Arrow Format (Node.js)
Output:
Example 4: Arrow Streaming Format (Node.js)
Output:
Next Steps
Query Structure: Learn how to construct queries in docs-query-structure.md
Examples: See complete examples in docs-examples.md
API Reference: Explore all endpoints in the OpenAPI documentation