0

I have a orders table which contain 134537 rows I want to fetch data from this table and insert to testing table. To do the same I wrote a query and used unnest function in it which increased rows 134537 to 234832.

I found some duplicate rows of users orders which reflect in final result. How to handle it?

Akshay
  • 85
  • 9
  • Does this answer your question? [Delete duplicate rows from a BigQuery table](https://stackoverflow.com/questions/36675521/delete-duplicate-rows-from-a-bigquery-table) – Kevin Quinzel Jul 01 '20 at 00:25
  • Why did you use UNNEST in the first place - you can copy nested and repeated (array) columns from one table to another without flattening. – Mosha Pasumansky Jul 01 '20 at 01:10
  • How can i select nested array columns without unnest? – Akshay Jul 01 '20 at 06:58

1 Answers1

1

It makes sense to have more rows after unnested your data. You are actually "flattening" your data .

There are different approaches to remove data duplication. It can be at the same moment you are unnested your data, or after that. In the second scenario, there is already an answer from Jordan Tigani that should help you.

SELECT *
FROM ( SELECT *, ROW_NUMBER()
          OVER (PARTITION BY <COLUMN_NAME>)
          row_number FROM <TABLE>)
WHERE row_number = 1
Alvaro
  • 813
  • 8
  • 16