Questions tagged [paraccel]

Column-oriented DBMS for decision support and complex processing.

23 questions
23
votes
4 answers

How to measure table space on disk in RedShift / ParAccel

I have a table in RedShift. How can I see how much disk space it uses?
diemacht
  • 2,022
  • 7
  • 30
  • 44
11
votes
5 answers

S3 -> Redshift cannot handle UTF8

We have a file in S3 that is loaded into Redshift via the COPY command. The import is failing because a VARCHAR(20) value contains an Ä which is being translated into .. during the copy command and is now too long for the 20 characters. I have…
Elliot Chance
  • 5,526
  • 10
  • 49
  • 80
6
votes
1 answer

Long runtime when query is executed the first time in RedShift

I noticed that the first time I run a query on RedShift, it takes 3-10 seconds. When I run the same query again, even with different arguments in the WHERE condition, it runs fast (0.2 sec). The query in question runs on a table of ~1M rows, on 3 integer…
diemacht
  • 2,022
  • 7
  • 30
  • 44
5
votes
2 answers

Efficient GROUP BY a CASE expression in Amazon Redshift/PostgreSQL

In analytics processing there is often a need to collapse "unimportant" groups of data into a single row in the resulting table. One way to do this is to GROUP BY a CASE expression where unimportant groups are coalesced into a single row via the…
Sim
  • 13,147
  • 9
  • 66
  • 95
5
votes
1 answer

Pivot a table with Amazon RedShift

I have several tables in Amazon RedShift that follow the pattern of several dimension columns and a pair of metric name/value columns. DimensionA DimensionB MetricName MetricValue ---------- ---------- ---------- ----------- dimA1 dimB1 …
Sim
  • 13,147
  • 9
  • 66
  • 95
5
votes
1 answer

Amazon Redshift Equality filter performance and sortkeys

Does Redshift efficiently (i.e. binary search) find a block of a table that is sorted on a column A for a query with a condition A=? As an example, let there be a table T with ~500m rows, ~50 fields, distributed and sorted on field A. Field A has…
user2886358
  • 91
  • 2
  • 3
4
votes
2 answers

Concurrent query performance in Amazon Redshift

On Amazon Redshift, do concurrent queries affect each other's performance? For example, let's say there are two queries: one on a relatively small table (~5m rows) retrieving all rows, and another on a large table (~500m rows). Both tables have the…
user2886358
  • 91
  • 2
  • 3
3
votes
1 answer

Redshift SELECT * performance versus COUNT(*) for non-existent row

I am confused about what Redshift is doing when I run 2 seemingly similar queries. Neither should return a result (querying a profile that doesn't exist). Specifically: SELECT * FROM profile WHERE id = 'id_that_doesnt_exist' and project_id = 1; …
AndySavage
  • 1,729
  • 1
  • 20
  • 34
2
votes
2 answers

Dynamic SQL in Postgres

I've coded a simple Function using Postgres but keep getting the following: ERROR: syntax error at or near "$2". The underlying database is ParAccel and I'm new to both Postgres and ParAccel. I'm using TOAD Data Point as the IDE: CREATE OR…
2
votes
1 answer

postgresql cursor is slow on update

As a short foreword, I'm new to PostgreSQL. Further, the PostgreSQL version I need advice on is 8.1, because PostgreSQL 8.1 is the last version implemented and supported by ParAccel. PostgreSQL cursor, at least in…
DiStas
  • 21
  • 2
2
votes
1 answer

Upsert in Amazon RedShift without Function or Stored Procedures

As there is no support for user-defined functions or stored procedures in RedShift, how can I achieve an UPSERT mechanism in RedShift, which uses ParAccel, a PostgreSQL 8.0.2 fork? Currently, I'm trying to achieve an UPSERT mechanism using…
Pratik Borkar
  • 475
  • 2
  • 7
  • 17
2
votes
2 answers

What is the FastLoad (in Teradata) equivalent for ParAccel?

I have recently shifted from Teradata to ParAccel, and I use my BI DBMS with a SAS environment. Teradata has a utility called FastLoad for loading large datasets quickly and efficiently. I often have to make use of this utility to transfer…
Macbook
  • 125
  • 1
  • 2
  • 9
1
vote
1 answer

SSAS tabular mode processing fails with "a lot of rows"

I have an SSAS tabular mode cube that reads data from an Actian Matrix database using ODBC. The project processes fine when I'm using a data set with 1 million rows, but when I try to use a bigger one (300 million rows), the process runs for around…
Diego
  • 34,802
  • 21
  • 91
  • 134
1
vote
1 answer

Alternative to references in a GROUP BY column to the results of a correlated subquery

This question comes as a result of a limitation in Amazon Redshift, the columnar analytics database based on ParAccel. One of the unsupported features is references in a GROUP BY column to the results of a correlated subquery. For example, the…
Sim
  • 13,147
  • 9
  • 66
  • 95
1
vote
2 answers

Very bad performance of UNION select query in RedShift / ParAccel

I have two tables in Redshift: tbl_current_day (about 4.5M rows) and tbl_previous_day (about 4.5M rows, with exactly the same data as tbl_current_day). In addition, I have a view called qry_both_days, defined as follows: CREATE OR REPLACE…
diemacht
  • 2,022
  • 7
  • 30
  • 44