Steps to reproduce the behavior (Required)
The messages table has about 600M records, averaging 4M records per created date bucket. The average record size is around 1 KB.
Export one day of data to S3 with INSERT INTO FILES() in Parquet format with zstd compression, using the following SUBMIT TASK query:
SUBMIT TASK AS
INSERT INTO FILES (
"path" = "s3://test/exports/export1/messages/2024_12_26",
"format" = "parquet",
"compression" = "zstd",
"single" = "true", -- "true" for Case 1, "false" for Case 2
"target_max_file_size" = "104857600",
"aws.s3.access_key" = "AAAA",
"aws.s3.secret_key" = "BBBB",
"aws.s3.region" = "ap-south-1",
"aws.s3.use_instance_profile" = "false"
)
SELECT
tenantId,
created,
msgId,
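`from`, -- reserved keyword, so it must be quoted with backticks
`to`, -- reserved keyword, so it must be quoted with backticks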
type,
source,
sent,
code,
description,
json_string(`payload`) as payload,
json_string(`metadata`) as metadata,
updatedAt,
version
FROM
messages
WHERE
created >= '2024-12-25T18:30:00.000Z'
AND created < '2024-12-26T18:30:00.000Z';
Check the output .parquet files in AWS S3.
Expected behavior (Required)
Case 1 - with single=true:
There should be a single .parquet file in the destination S3 location.
Case 2 - with single=false:
There should be multiple files of around 1 GB each in the destination S3 location.
Real behavior (Required)
Case 1 - with single=true:
Thousands of files (1500+) appear, seemingly at random, for one day of data. The count grows when the query targets more rows.
File sizes range from 100 KB to 1 GB, and only 4-5 files reach 1 GB.
Case 2 - with single=false:
Thousands of files (1500+) appear, seemingly at random, for one day of data. The count grows when the query targets more rows.
File sizes range from 100 KB to 1 GB, and only 4-5 files reach 1 GB.
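For scale, the figures above allow a rough estimate of how many files a size-bounded export would produce. The sketch below is only a back-of-envelope calculation, not a description of how StarRocks splits files. It assumes the 1 KB average record size refers to uncompressed data and that every file is filled up to the limit; the helper name expected_file_count is hypothetical. Note that the target_max_file_size value in the query, 104857600 bytes, is 100 MiB rather than 1 GiB.

```python
import math


def expected_file_count(rows: int, avg_row_bytes: int, target_max_file_bytes: int) -> int:
    """Estimate the number of output files if each file is filled to the size limit.

    Assumption: sizes are uncompressed; zstd-compressed Parquet would be smaller,
    so the real count could be lower still.
    """
    total_bytes = rows * avg_row_bytes
    return math.ceil(total_bytes / target_max_file_bytes)


ROWS_PER_DAY = 4_000_000            # average rows per created date bucket (from the report)
AVG_ROW_BYTES = 1024                # about 1 KB per record (from the report)
TARGET_MAX_FILE_SIZE = 104_857_600  # value used in the query: 100 MiB
ONE_GIB = 1024 ** 3                 # the roughly 1 GB size mentioned in the expected behavior

files_at_100_mib = expected_file_count(ROWS_PER_DAY, AVG_ROW_BYTES, TARGET_MAX_FILE_SIZE)
files_at_1_gib = expected_file_count(ROWS_PER_DAY, AVG_ROW_BYTES, ONE_GIB)

print(files_at_100_mib)  # 40
print(files_at_1_gib)    # 4
```

Under these assumptions, one day of data would fit in a few dozen files at most, which is far below the 1500+ files observed.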
StarRocks version (Required)
3.3.9
3.3.7