Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

can't handle postgres table with geometry column #190

Open
xhwhis opened this issue Nov 29, 2024 · 5 comments
Open

can't handle postgres table with geometry column #190

xhwhis opened this issue Nov 29, 2024 · 5 comments

Comments

@xhwhis
Copy link
Contributor

xhwhis commented Nov 29, 2024

I can't handle postgres table with geometry column because of PR #151.
I think we should modify SCHEMA_QUERY. When c.data_type = 'USER-DEFINED', return t.typname.

WITH custom_type_details AS (
SELECT 
t.typname,
t.typtype,
CASE 
    WHEN t.typtype = 'e' THEN 
        jsonb_build_object(
            'type', 'enum',
            'values', (
                SELECT jsonb_agg(e.enumlabel ORDER BY e.enumsortorder)
                FROM pg_enum e 
                WHERE e.enumtypid = t.oid
            )
        )
    WHEN t.typtype = 'c' THEN
        jsonb_build_object(
            'type', 'composite',
            'attributes', (
                SELECT jsonb_agg(
                    jsonb_build_object(
                        'name', a.attname,
                        'type', pg_catalog.format_type(a.atttypid, a.atttypmod)
                    )
                    ORDER BY a.attnum
                )
                FROM pg_attribute a
                WHERE a.attrelid = t.typrelid 
                AND a.attnum > 0 
                AND NOT a.attisdropped
            )
        )
END as type_details
FROM pg_type t
WHERE t.typnamespace = (SELECT oid FROM pg_namespace WHERE nspname = $1)
)
SELECT 
c.column_name,
CASE 
WHEN c.data_type = 'USER-DEFINED' THEN
    CASE 
        WHEN t.typtype = 'e' THEN 'enum'
        WHEN t.typtype = 'c' THEN 'composite'
        ELSE t.typname
    END
WHEN c.data_type = 'ARRAY' THEN
    'array'
ELSE pg_catalog.format_type(a.atttypid, a.atttypmod)
END as data_type,
c.is_nullable,
CASE 
WHEN c.data_type = 'ARRAY' THEN
    jsonb_build_object(
        'type', 'array',
        'element_type', (
            SELECT pg_catalog.format_type(et.oid, a.atttypmod)
            FROM pg_type t
            JOIN pg_type et ON t.typelem = et.oid
            WHERE t.typname = c.udt_name
        )
    )
ELSE td.type_details
END as type_details
FROM 
information_schema.columns c
LEFT JOIN custom_type_details td ON td.typname = c.udt_name
LEFT JOIN pg_type t ON t.typname = c.udt_name
LEFT JOIN pg_attribute a ON 
a.attrelid = (
    SELECT oid 
    FROM pg_class 
    WHERE relname = c.table_name 
    AND relnamespace = (
        SELECT oid 
        FROM pg_namespace 
        WHERE nspname = c.table_schema
    )
)
AND a.attname = c.column_name
WHERE 
c.table_schema = $1
AND c.table_name = $2
ORDER BY 
c.ordinal_position;
   column_name   |       data_type        | is_nullable |                     type_details
-----------------+------------------------+-------------+-------------------------------------------------------
 id              | integer                | NO          |
 name            | character varying(100) | NO          |
 chinese         | character varying(100) | NO          |
 adcode          | character varying(20)  | NO          |
 center          | array                  | NO          | {"type": "array", "element_type": "double precision"}
 centroid        | array                  | NO          | {"type": "array", "element_type": "double precision"}
 geom            | geometry               | NO          |
 children_num    | integer                | NO          |
 level           | character varying(100) | NO          |
 parent          | character varying(20)  | NO          |
 acroutes        | array                  | NO          | {"type": "array", "element_type": "integer"}
 centro_id       | array                  | YES         | {"type": "array", "element_type": "double precision"}
 bound           | text                   | YES         |
 simplified_geom | geometry               | YES         |
@xhwhis xhwhis changed the title postgres table can't get schema with geometry column can't handle postgres table with geometry column Nov 29, 2024
@phillipleblanc
Copy link
Collaborator

I'm guessing geometry is being represented as a composite type? Otherwise we already return t.typname

WHEN c.data_type = 'USER-DEFINED' THEN
    CASE 
        WHEN t.typtype = 'e' THEN 'enum'
        WHEN t.typtype = 'c' THEN 'composite'
        ELSE t.typname
    END

@xhwhis
Copy link
Contributor Author

xhwhis commented Dec 3, 2024

I'm guessing geometry is being represented as a composite type? Otherwise we already return t.typname

No. geometry is being represented as a base type. So can we change to this sql?

WHEN c.data_type = 'USER-DEFINED' THEN
    CASE 
        WHEN t.typtype = 'e' THEN 'enum'
        WHEN t.typtype = 'c' THEN 'composite'
        WHEN t.typtype = 'b' THEN t.typname
        ELSE c.data_type
    END

@xhwhis
Copy link
Contributor Author

xhwhis commented Jan 8, 2025

I'm guessing geometry is being represented as a composite type? Otherwise we already return t.typname

No. geometry is being represented as a base type. So can we change to this sql?

WHEN c.data_type = 'USER-DEFINED' THEN
    CASE 
        WHEN t.typtype = 'e' THEN 'enum'
        WHEN t.typtype = 'c' THEN 'composite'
        WHEN t.typtype = 'b' THEN t.typname
        ELSE c.data_type
    END

@phillipleblanc plz push it forward

@xhwhis
Copy link
Contributor Author

xhwhis commented Jan 8, 2025

When dealing with special types like money, I've noticed that external table column types are converted to 'DataType::Int64', which results in losing the original formatting. I suggest preserving the original type in the Schema's Field metadata so that, for example, 'money' can still be displayed with a '$' symbol after import. I'm curious about how spiceai handles this type mapping to prevent the loss of semantic meaning?

@phillipleblanc
Copy link
Collaborator

When dealing with special types like money, I've noticed that external table column types are converted to 'DataType::Int64', which results in losing the original formatting. I suggest preserving the original type in the Schema's Field metadata so that, for example, 'money' can still be displayed with a '$' symbol after import. I'm curious about how spiceai handles this type mapping to prevent the loss of semantic meaning?

We currently only map into the best-matching Arrow type that exists - it's up to the consuming application to preserve the semantics if they need to.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants