Thank you for the support in advance.

Using a StreamSets pipeline, I am trying to read MSSQL CDC data with the SQL Server CDC Client origin and write it to two destinations: Google Cloud Storage and Local FS.

While the Local FS destination writes as expected, the file in Google Cloud Storage has an inverted exclamation mark (¡) at the beginning of the file.

I could not figure out why it happens. Any idea?

¡{"header":{"stageCreator":"SQLServerCDCClient_01","sourceId":"cdc.HumanResources_Shift_CT::___$seqval=0000005500002F380008::___$operation=1::___$start_lsn=0000005500002F38000E","stagesPath":"SQLServerCDCClient_01","trackingId":"cdc.HumanResources_Shift_CT::___$seqval=0000005500002F380008::___$operation=1::___$start_lsn=0000005500002F38000E::SQLServerCDCClient_01","previousTrackingId":null,"raw":null,"rawMimeType":null,"errorDataCollectorId":null,"errorPipelineName":null,"errorStage":null,"errorStageLabel":null,"errorCode":null,"errorMessage":null,"errorTimestamp":0,"errorStackTrace":null,"errorJobId":null,"values":{"sdc.operation.type":"2","jdbc.__$seqval.jdbcType":"-2","jdbc.__$start_lsn":"0000005500002F38000E","jdbc.__$operation.jdbcType":"4","jdbc.__$update_mask.jdbcType":"-3","jdbc.cdc.source_name":"Shift","jdbc.tables":"HumanResources_Shift_CT","jdbc.__$seqval":"0000005500002F380008","jdbc.__$update_mask":"1F","jdbc.ShiftID.jdbcType":"-6","jdbc.__$operation":"1","jdbc.StartTime.jdbcType":"92","jdbc.ModifiedDate.jdbcType":"93","jdbc.cdc.source_schema_name":"HumanResources","jdbc.Name.jdbcType":"-9","jdbc.EndTime.jdbcType":"92","jdbc.__$start_lsn.jdbcType":"-2"}},"value":{"type":"LIST_MAP","value":[{"type":"SHORT","value":15,"sqpath":"/ShiftID","dqpath":"/ShiftID"},{"type":"STRING","value":"Full Day Shift","sqpath":"/Name","dqpath":"/Name"},{"type":"TIME","value":38836977,"sqpath":"/StartTime","dqpath":"/StartTime"},{"type":"TIME","value":67636977,"sqpath":"/EndTime","dqpath":"/EndTime"},{"type":"DATETIME","value":1625132836977,"sqpath":"/ModifiedDate","dqpath":"/ModifiedDate"}],"sqpath":"","dqpath":""}}
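To narrow this down, it may help to look at the raw bytes rather than the rendered text. The character ¡ is U+00A1, which is the single byte `0xA1` in Latin-1 or the pair `0xC2 0xA1` in UTF-8, so the hex dump shows what was actually written before the JSON. The following is only a minimal sketch: the helper name `split_leading_bytes` and the shortened sample record are hypothetical, and the sample assumes a single `0xA1` byte, which I have not confirmed against the real file.

```python
import json


def split_leading_bytes(data: bytes):
    """Return (prefix, rest), where prefix is everything before the first '{'."""
    start = data.find(b"{")
    if start == -1:
        raise ValueError("no JSON object found in data")
    return data[:start], data[start:]


# Hypothetical, shortened stand-in for the file downloaded from the bucket.
# Assumption: the stray character is stored as the single byte 0xA1.
sample = b"\xa1" + (
    b'{"header":{"stageCreator":"SQLServerCDCClient_01"},'
    b'"value":{"type":"LIST_MAP","value":[]}}'
)

prefix, body = split_leading_bytes(sample)
print(prefix.hex())  # hex of whatever precedes the JSON object

# Once the prefix is set aside, the rest parses as ordinary JSON.
record = json.loads(body.decode("utf-8"))
print(record["header"]["stageCreator"])
```

For the real file, replace `sample` with the bytes read from the downloaded object, for example `open(path, "rb").read()`.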

Configuration of the Google Cloud Storage stage (four screenshots in the original post).

Hari
  • What does your Data Format configuration look like in the Google Cloud Storage stage? Which settings do you have there? – Andrey E Jul 07 '21 at 11:18
  • @AndreyE, I have updated my question with what you were asking for. – Hari Jul 07 '21 at 13:35
  • Thanks, looking at it. Do you really need the record header data to be written as well? If not, you could use JSON as the data format; it doesn't seem to have this problem. – Andrey E Jul 08 '21 at 11:40
  • Well, the record header gives us a lot of information when we load the CDC data into the target database. For example, the timestamp, offset, and operation (update/delete/insert) help us perform many other actions. – Hari Jul 08 '21 at 13:02

0 Answers