Questions tagged [greenplum]

Greenplum is the world's first open-source massively parallel processing database based on PostgreSQL. It provides powerful and rapid analytics on petabyte-scale data volumes. Uniquely geared toward big data analytics, Greenplum Database is powered by the world’s most advanced cost-based query optimizer, delivering high analytical query performance on large data volumes.

Greenplum is a massively parallel processing database based on PostgreSQL and is designed for analytic data warehouses to manage, store and analyze terabytes to petabytes of data. Greenplum is developed by Pivotal.

797 questions
31
votes
7 answers

20 Billion Rows/Month - Hbase / Hive / Greenplum / What?

I'd like to use your wisdom for picking up the right solution for a data-warehouse system. Here are some details to better understand the problem: Data is organized in a star schema structure with one BIG fact and ~15 dimensions. 20B fact rows…
Haggai
29
votes
4 answers

NOT EXISTS clause in Postgresql

Does anyone know how to perform such a query in PostgreSQL? SELECT * FROM tabA WHERE NOT EXISTS ( SELECT * FROM tabB WHERE tabB.id = tabA.id ) When I execute such a query, PostgreSQL complains "ERROR: Greenplum Database does not yet support…
cheng
  • 2,106
  • 6
  • 28
  • 36
21
votes
2 answers

How to use a SQL window function to calculate a percentage of an aggregate

I need to calculate percentages of various dimensions in a table. I'd like to simplify things by using window functions to calculate the denominator, however I am having an issue because the numerator has to be an aggregate as well. As a simple…
14
votes
6 answers

Advantages of databases like Greenplum or Vertica compared to MongoDB or Cassandra

I am currently working on a few projects with MongoDB and Apache Cassandra respectively. I am also using Solr a lot and I am handling "lots" of data with them (approx. 1-2TB). I heard of Greenplum and Vertica for the first time last week and I…
disco crazy
  • 31,313
  • 12
  • 80
  • 83
14
votes
3 answers

Where clause inside an over clause in postgres

Is it possible to use a WHERE clause inside an OVER clause, as below? SELECT SUM(amount) OVER(partition by prod_name WHERE dateval > dateval_13week) I cannot use preceding and following inside the OVER clause as my dates are not in order. All I…
user2569524
  • 1,651
  • 7
  • 32
  • 57
12
votes
7 answers

Greenplum vs PostgreSQL

What are the arguments for and against using Greenplum instead of PostgreSQL in a webapp (django) environment? My gut reaction is to prefer PostgreSQL's open-source approach and huge knowledgebase. My configuration (though I'd love to hear about any…
0atman
  • 3,298
  • 4
  • 30
  • 46
9
votes
1 answer

psql: database "template0" is not currently accepting connections

We have installed a fresh gpdb database. But when trying to connect to the template0 database: [gpadmin@mdw~]$ psql -d template0 psql: FATAL: database "template0" is not currently accepting connections [gpadmin@mdw~]$ We tried to update the FLAG…
NEO
  • 389
  • 8
  • 31
8
votes
3 answers

Error : relation does not exist, on greenplum database

I'm working on PostgreSQL 8.2.15 (Greenplum database 4.2.0 build 1)(HAWQ 1.2.1.0 build 10335). I wrote a function like create or replace function my_function ( ... select exists(select 1 from my_table1 where condition) into result; I tested it…
Clxy
  • 505
  • 1
  • 5
  • 13
7
votes
4 answers

rodbc character encoding error with PostgreSQL

I'm getting a new error which I've never gotten before when connecting from R to a GreenPlum PostgreSQL database using RODBC. I've gotten the error using both EMACS/ESS and RStudio, and the RODBC call has worked as is in the past. library(RODBC) gp…
wahalulu
  • 1,447
  • 2
  • 17
  • 23
7
votes
1 answer

Talend greenplumRow error handling

I want to create views in greenplum HAWQ using a simple talend job, that would basically have a fileinput that contains all the views then I need to execute the CREATE VIEW script. Since these views (50-60.000) come from an oracle system I need to…
Balazs Gunics
  • 2,017
  • 2
  • 17
  • 24
6
votes
4 answers

Get last inserted row ID with Psycopg2 and a Greenplum database

How can you get the ID of the last inserted row using psycopg2 on a Greenplum database? Here are several things I've tried already that don't work. RETURNING isn't supported by Greenplum. psycopg2's cursor.lastrowid always returns 0. SELECT…
Kyo
  • 277
  • 3
  • 10
5
votes
2 answers

Data storage for financial analysis

I am building a system to analyze large quantities of financial data regarding securities trading prices. A large challenge in this is determining what storage method to use for the data, given that the data will be in the tens of terabytes. There…
user396404
  • 2,759
  • 7
  • 31
  • 42
5
votes
9 answers

Why do Column oriented databases such as Vertica/InfoBright/GreenPlum make a fuss of Hadoop?

What is the point in feeding a Hadoop cluster and using that cluster to feed data into a Vertica/InfoBright data warehouse? All these vendors keep saying "we can connect with Hadoop", but I don't understand what the point is. What is the interest of…
SCO
  • 1,832
  • 1
  • 24
  • 45
5
votes
2 answers

How to increase greenplum concurrency and # query per sec

We have a fairly big Greenplum v4.3 cluster. 18 hosts, each host has 3 segment nodes. Each host has approx 40 cores and 60G memory. The table we have is 30 columns wide, which has 0.1 billion rows. The query we are testing has 3-10 secs response…
Shengjie
  • 12,336
  • 29
  • 98
  • 139
4
votes
1 answer

LEFT JOIN LATERAL showing error with SELECT

I am trying to run below query : SELECT tc.ID_NUMBER AS AFC_RPP_Number, hc.BUSINESS AS Business, hc.DIRECTOR AS Director, tc.REASON_FOR_REVISION AS Description_of_Change FROM alo_gg.AWS_PIM tc left join lateral( select BUSINESS,DIRECTOR …
Safala
  • 51
  • 1
  • 2