Data Mapping
JayDeBeApiArrow converts JDBC data types through two stages: JDBC to Arrow (in the JVM) and Arrow to Python (on the Python side). This page documents the mapping and known limitations.
Type Mapping Table
Numeric Types
| JDBC Type |
Arrow Type |
Python Type |
Notes |
INTEGER |
Int32 |
int |
|
BIGINT |
Int64 |
int |
|
SMALLINT |
Int16 |
int |
|
TINYINT |
Int8 |
int |
|
BOOLEAN / BIT |
Boolean |
bool |
|
FLOAT / REAL |
Float32 |
float |
|
DOUBLE |
Float64 |
float |
|
DECIMAL / NUMERIC |
Decimal128 |
decimal.Decimal |
Full precision preserved |
String Types
| JDBC Type |
Arrow Type |
Python Type |
Notes |
VARCHAR |
Utf8 |
str |
|
CHAR / NCHAR |
Utf8 |
str |
|
NVARCHAR |
Utf8 |
str |
|
LONGVARCHAR / LONGNVARCHAR |
Utf8 |
str |
|
CLOB / NCLOB |
Utf8 |
str |
|
OTHER |
Utf8 |
str |
Fallback |
Temporal Types
| JDBC Type |
Arrow Type |
Python Type |
Notes |
DATE |
Date32 |
datetime.date |
|
TIME |
Time32 |
datetime.time |
|
TIME_WITH_TIMEZONE |
Utf8 |
str |
Fallback - not natively supported by Arrow |
TIMESTAMP |
Timestamp |
datetime.datetime |
Naive (no timezone) |
TIMESTAMP_WITH_TIMEZONE |
Timestamp(tz=UTC) |
datetime.datetime |
Timezone-aware, UTC |
Binary Types
| JDBC Type |
Arrow Type |
Python Type |
Notes |
BINARY |
Binary |
bytes / memoryview |
|
VARBINARY |
Binary |
bytes / memoryview |
|
BLOB |
Binary |
bytes / memoryview |
|
LONGVARBINARY |
Binary |
bytes / memoryview |
|
Special Types
| JDBC Type |
Arrow Type |
Python Type |
Notes |
ARRAY |
list |
list |
Nested list per element type (e.g. [1, 2, 3] for INT ARRAY) |
SQLXML |
Utf8 |
str |
Fallback via getString() |
ROWID |
Utf8 |
str |
Fallback |
JSON / JSONB / UUID / XML |
Utf8 |
str |
Detected from type name when reported as OTHER |
How Type Detection Works
The ExplicitTypeMapper in arrow-jdbc-extension.jar inspects ResultSetMetaData to build a per-column type mapping. It handles driver-specific quirks:
- Standard types - JDBC type codes like
Types.INTEGER, Types.VARCHAR are mapped directly
- Unknown type codes - Some drivers use non-standard codes (e.g., Oracle
BINARY_DOUBLE = 101). The mapper falls back to matching the column type name against known patterns
- Misreported types - Some drivers misreport types (e.g., SQLite reports
TIME as VARCHAR). The mapper detects these by comparing type code against type name
Known Limitations
Arrow-Unsupported Types
These types are not natively supported by the Arrow JDBC adapter. They are converted to VARCHAR as a degraded fallback:
SQLXML - Returned as string via getString().
ROWID - Returned as string.
OTHER - Returned as string. Columns with type names containing JSON, UUID, or XML are auto-detected and mapped to VARCHAR.
TIME_WITH_TIMEZONE - Not natively supported. Falls back to string representation.
Driver-Specific Quirks
SQLite
- TIME type: SQLite JDBC reports
TIME columns as VARCHAR. This is auto-detected and corrected.
- NUMERIC/DECIMAL precision: SQLite stores all numeric values as doubles internally, so precision information from
ResultSetMetaData is unreliable.
Oracle
- BINARY_DOUBLE: Oracle JDBC reports this with type code 101 (non-standard). Auto-detected by type name matching.
- TIMESTAMP WITH TIME ZONE: Type code varies between
ojdbc8 (101) and ojdbc11 (2013). Auto-detected by type name matching.
- BIGINT: Not supported natively by some Oracle JDBC driver versions.
PostgreSQL
- TIMESTAMPTZ: Reported as
Types.TIMESTAMP by the PostgreSQL JDBC driver. Auto-detected by checking if the type name is "timestamptz" and overriding to TIMESTAMP_WITH_TIMEZONE.
Parameter Binding
| Python Type |
JDBC Type |
Example |
str |
VARCHAR |
"hello" |
int |
INTEGER |
42 |
float |
DOUBLE |
3.14 |
bool |
BOOLEAN |
True |
decimal.Decimal |
DECIMAL |
Decimal("10.50") |
datetime.datetime |
TIMESTAMP |
datetime(2024, 1, 15, 10, 30) |
datetime.date |
DATE |
date(2024, 1, 15) |
datetime.time |
TIME |
time(10, 30, 0) |
bytes / bytearray |
BINARY |
b"\x00\x01\x02" |
None |
NULL |
None (uses setNull() internally) |
list |
ARRAY |
[1, 2, 3] converted to Java array |
None parameters are handled via JDBC setNull() rather than setObject(i, null) for driver compatibility (e.g., Teradata rejects setObject with null).
list parameters are converted to Java arrays (int[], String[], etc.) and bound via setObject(). The ARRAY column type is supported for both reading and writing with drivers that implement java.sql.Array.
Comparison with Parent JayDeBeApi
| Column Type |
Parent (jaydebeapi) |
Fork (jaydebeapiarrow) |
TIMESTAMP |
str |
datetime.datetime (naive) |
TIME |
str |
datetime.time |
DATE |
str |
datetime.date |
DECIMAL / NUMERIC |
float / int |
decimal.Decimal (full precision) |
BINARY |
str |
bytes / memoryview |
TIMESTAMP_WITH_TIMEZONE |
Raw Java object |
datetime.datetime (timezone-aware) |
ARRAY |
Raw Java object |
list (read and write) |
See Differences for the full list of changes from the parent project.