
We are trying to pull data from an external source (MSSQL) into Postgres. But when I checked the invoicedate column, the entries are blank, while MSSQL shows invoicedate values for those same entries.

That is, we tried the following query on both DBMSs.

When the query is executed in SQL Server:

select * from tablename where salesorder='168490'

we get 12 rows where the invoicedate column is '2015-10-26 00:00:00.000'.

But when the same query is executed on Postgres:

select "InvoceDt" from tablename where salesorder='168490'

we get 12 rows where the invoicedate column is null.

The question is why: the Postgres InvoiceDt column comes back null, while we can see that SQL Server shows the appropriate data values.

Why is the data different between SQL Server and Postgres for this particular column?

NEO
  • How do you copy the data? And what is the data type of that column in Postgres and SQL Server? –  Oct 30 '15 at 08:22
  • There is also a mismatch in the column names: `invoicedate` vs. `InvoceDt`. Is that a copy & paste error in your question, or do they really differ? If they _do_ differ, then maybe the tool that copies the data cannot match the two columns. –  Oct 30 '15 at 08:25
  • The data type of that column (invoicedate) is date in Postgres and datetime in MSSQL. We are using the Spring XD tool for data ingestion. And the data type of salesorder is char in MSSQL and character(6) in Postgres. – NEO Oct 30 '15 at 08:35
  • That is my typing mistake here; both columns are the same. The column name is InvoiceDt in both the MSSQL Server and Postgres DBs. – NEO Oct 30 '15 at 08:37
  • At the same time, we do get values for InvoiceDt in Postgres when I search for distinct values of InvoiceDt. – NEO Oct 30 '15 at 08:38
  • Why do you store date **and** time in SQL Server, but only the date in Postgres? Maybe that's why your migration tool doesn't copy the values. Without the actual migration code, this is impossible to answer. –  Oct 30 '15 at 08:41
  • When we created the DDL in Postgres for the table containing InvoiceDt, we tried to keep MSSQL's datetime type as the corresponding data type for Postgres, but it is not supported. When we created the Postgres DDL with date instead, it worked. – NEO Oct 30 '15 at 08:45
  • The corresponding data type is `timestamp` (a type-mapping sketch follows this comment thread). –  Oct 30 '15 at 08:48
  • Yes, exactly. I tried that, but it is not picking up timestamp, so I replaced it with date. – NEO Oct 30 '15 at 08:49
  • I have no idea what "*it is not picking timestamp*" is supposed to mean. Postgres most certainly supports a data type named `timestamp`. –  Oct 30 '15 at 10:49
  • CREATE TABLE tablename ( "InvoiceDt" timestamp ) LOCATION ('pxf://hostname/path/to/hdfs/?profile=HdfsTextSimple') FORMAT 'csv' ( delimiter '^' null 'null' quote '~'); – NEO Oct 30 '15 at 10:55
  • Here is my new DDL statement in which the mapping is timestamp, as discussed here, but I am getting the same null values in the InvoiceDt column for that particular salesorder entry. – NEO Oct 30 '15 at 10:57
  • That is not a valid Postgres statement. Are you _sure_ you are using Postgres? (And please do not post code in comments, **edit** your question.) –  Oct 30 '15 at 10:59
  • We are using Postgres 8.2; my Postgres version is PostgreSQL 8.2.15. Originally it is HAWQ, version 1.3.0. – NEO Oct 30 '15 at 11:00
  • As I said: that is **not** valid for Postgres, neither for the current version nor for the outdated and unsupported version 8.2. Plus, it seems the data is coming from a CSV file, not from SQL Server. –  Oct 30 '15 at 11:01
  • Our external source is MSSQL. As discussed, Spring XD is the tool that ingests the file into Hadoop in CSV format; then I am reading it through HAWQ, which is Postgres 8.2. – NEO Oct 30 '15 at 11:03
  • When I run the same query against MSSQL it gives me 2015-10-26 for InvoiceDt, but when I read through pgAdmin, which is a tool for HAWQ, it shows me a null value. And even when I use a Unix machine for HAWQ, it also shows null values. – NEO Oct 30 '15 at 11:05
  • Is it possible to avoid the additional nulls in the DBMS-to-DBMS migration? Because HAWQ is creating additional null values. Or is there any way to get the actual value of InvoiceDt, as it is in MSSQL Server, at the time of migration? – NEO Oct 31 '15 at 07:36
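
For reference, here is a minimal sketch of the type mapping discussed in the comment thread above. The table and column list are illustrative only (the real table has 34 columns); the point is that SQL Server's datetime maps to Postgres timestamp, not date:

-- SQL Server side: datetime carries both date and time
CREATE TABLE tablename (
    salesorder CHAR(6),
    InvoiceDt  DATETIME        -- e.g. 2015-10-26 00:00:00.000
);

-- Postgres side: timestamp preserves the time portion
CREATE TABLE tablename (
    salesorder  CHARACTER(6),
    "InvoiceDt" TIMESTAMP      -- DATE would silently drop the time of day
);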

1 Answer


Vicps, you aren't using Postgres, and that is why a_horse_with_no_name is having such a hard time understanding your question. You are using Pivotal HDB (formerly called HAWQ). HAWQ is now associated with the incubator project "Apache HAWQ", and the commercial version is "Pivotal HDB".

Pivotal HDB is a fork of the Pivotal Greenplum database, which is itself a fork of PostgreSQL 8.2. It has many similarities to Postgres, but it is most definitely not Postgres.

You are also using Spring XD to move the data from SQL Server to HDFS, which is critical to understanding what the true problem is.

You provided this example:

CREATE TABLE tablename ( "InvoiceDt" timestamp ) 
LOCATION ('pxf://hostname/path/to/hdfs/?profile=HdfsTextSimple') 
FORMAT 'csv' ( delimiter '^' null 'null' quote '~');

Your file only has one column in it? How is that possible? Above, you mention the salesorder column. Secondly, have you tried looking at the file written by Spring XD?

hdfs dfs -cat hdfs://hostname:8020/path/to/hdfs | grep 168490

I bet you have an extra delimiter, null character, or escape character in the data that is causing the problem. You may also want to tag your question with spring-xd.
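
As an illustration of how such badly formatted rows can be surfaced instead of silently producing nulls, here is a minimal sketch using the Greenplum/HAWQ single-row error handling clauses. This assumes your HAWQ build supports LOG ERRORS with SEGMENT REJECT LIMIT, and the column list is abbreviated:

CREATE EXTERNAL TABLE tablename (
    salesorder  CHARACTER(6),
    -- ... the other 32 columns ...
    "InvoiceDt" TIMESTAMP
)
LOCATION ('pxf://hostname/path/to/hdfs/?profile=HdfsTextSimple')
FORMAT 'csv' (DELIMITER '^' NULL 'null' QUOTE '~')
LOG ERRORS SEGMENT REJECT LIMIT 100 ROWS;  -- reject and log bad rows instead of failing the whole scan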

Jon Roberts
  • By changing the ordinal position of the InvoiceDt column in my external table syntax, it is working: CREATE TABLE tablename ( "InvoiceDt" timestamp ) LOCATION ('pxf://hostname/path/to/hdfs/?profile=HdfsTextSimple') FORMAT 'csv' ( delimiter '^' null 'null' quote '~'); – NEO Nov 03 '15 at 09:38
  • We have 34 such columns in the create external table syntax, and InvoiceDt was listed at position 34. When we changed its position to be the first column, the null values started resolving to the actual date values. We changed the Spring XD job's sql option as well, fixing the InvoiceDt column position as the first column; that way the external table started returning the expected data. – NEO Nov 03 '15 at 09:41
  • We have not changed our format clause. Our format clause is FORMAT 'csv' ( delimiter '^' null 'null' quote '~' ), and our job delimiter is '^' inside Spring XD. Is that extra delimiter causing the problem? If it is the problem, then why did it start reading without any changes being made to it? Thanks in advance. – NEO Nov 03 '15 at 09:42
  • The default escape character is "\", so a field containing "c:\" for example will effectively escape the trailing double quote and leave you with too few columns. You should look at gp_read_error_log() to see the errors from these poorly formatted rows (a query sketch follows these comments). – Jon Roberts Nov 04 '15 at 17:20
  • The default escape is picking up the same special character as the quote, that is '~'. And thanks for the information about gp_read_error_log() for seeing the errors from these poorly formatted rows; the command you gave me earlier is also helpful for debugging. – NEO Nov 05 '15 at 08:48
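
For completeness, a minimal sketch of the error-log inspection suggested in these comments, assuming the external table was created with the LOG ERRORS clause shown in the sketch under the answer:

SELECT cmdtime,
       filename,
       linenum,
       errmsg,    -- why the row was rejected
       rawdata    -- the offending line as read from HDFS
FROM gp_read_error_log('tablename');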