[Presto] 배열함수 ① - Array

Array_agg

- 여러개의 row들을 하나의 array로 합쳐주는 함수

- x 원소를 input으로 받아 array로 변환하여 반환한다.

(원문)

array_agg(x) → array<[same as input]>

: Returns an array created from the input x elements.

< 예시 코드 >

WITH dataset (id, n) AS (
  VALUES
    (1, 1),
    (1, 2),
    (1, 3),
    (2, 4),
    (2, 5),
    (2, 6)
)
SELECT id, array_agg(n)
FROM dataset
GROUP BY id;

< 실행 결과 >

추가 1 : Array_distinct로 고유원소만 맵핑하기

< 예시 코드 >

WITH dataset (id, n) AS (
  VALUES
    (1, 1),
    (1, 2),
    (1, 2),
    (1, 3),
    (2, 4),
    (2, 5),
    (2, 5),
    (2, 6)
)
SELECT id, array_distinct(array_agg(n))
FROM dataset
GROUP BY id;

< 실행 결과 >

추가 2 : 서브쿼리로요소별빈도순으로정렬하여 array로변환/맵핑하기 (내장기능 X)

< 예시 코드 >

WITH dataset (id, n) AS (
  VALUES
    (1, 1),
    (1, 2),
    (1, 2),
    (1, 3),
    (2, 4),
    (2, 5),
    (2, 5),
    (2, 6)
), freq AS (
  SELECT id, n, COUNT(*) OVER (PARTITION BY id, n) AS cnt
  FROM dataset
)
SELECT id, array_agg(n ORDER BY cnt DESC)
FROM freq
GROUP BY id;

< 실행 결과 >

Unnest

- Array 또는 Map을 각각 한 개 컬럼 또는 두 개 컬럼(key, value)으로 이루어진 원소들로 확장시켜주는 함수

- join과 보통 같이 쓰이며, 아래와 같이 cross join을 활용하여, 본래의 배열 내 원소들과 같은 구성으로 확장이 되게 통상적으로 사용한다.

(원문)

UNNEST can be used to expand an ARRAY or MAP into a relation. Arrays are expanded into a single column, and maps are expanded into two columns (key, value). UNNEST can also be used with multiple arguments, in which case they are expanded into multiple columns, with as many rows as the highest cardinality argument (the other columns are padded with nulls). UNNEST can optionally have a WITH ORDINALITY clause, in which case an additional ordinality column is added to the end. UNNEST is normally used with a JOIN and can reference columns from relations on the left side of the join.

< 예시 코드 >

WITH DATASET (ID, ARR) AS (
	VALUES
    	(1, ARRAY[1,2,3]),
        (2, ARRAY[4,5,6])
)
SELECT ID, N
FROM DATASET
CROSS JOIN UNNEST(ARR) T (N) ;

< 실행 결과 >

- 출처 : presto 공식 documentation ( SELECT — Presto 0.280 Documentation (prestodb.io) )

Bridge into Data Science

[Presto] 배열함수 ① - Array_agg, Unnest

Array_agg

Unnest

티스토리툴바