We are trying to implement a clickstream flow for our e-commerce site on AWS. The clickstream will capture all actions performed by anonymous users. Anonymous users are tracked via a UUID, generated during their first visit, that is stored in a cookie. We used the AWS example here to come up with a solution architecture like the diagram below:
Now, two questions:
1. Different pages of the e-commerce site produce different clickstream data. For example, on the Item view page we would also like to send item-related info such as itemId, and on the Checkout page we would like to attach a few order-related fields to the clickstream data. Should we have separate Firehose delivery streams for different pages to support custom clickstream data? Or should we send a generic clickstream record (with possibly null values for some attributes) to a single Firehose delivery stream?
2. At some point our anonymous users become identified (e.g. they log in, so we know their User_ID). We would then like to link the UUID and the User_ID in order to have a customer 360 view. Should we consider a separate stream flow plus a separate S3 bucket for tracking the UUID + User_ID mappings? Should we then use Athena to show aggregated reports for customer 360? Or should we aggregate the data and create a customer dimension in Redshift? What would be a good solution for this?
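To make both questions more concrete, here is a rough sketch of what I mean by a "generic" record and by linking the UUID to the User_ID. It is only an illustration under my own assumptions: the field names (`anonymous_id`, `event_type`, `item_id`, `order_id`, `user_id`), the `identify` event type, and the helper functions are all hypothetical, not something taken from the AWS example. The join at the end is done in plain Python only to show the idea; in practice it would presumably be a query in Athena or Redshift.

```python
import json
import uuid
from datetime import datetime, timezone
from typing import Optional


def build_event(anonymous_id: str, event_type: str, page: str,
                item_id: Optional[str] = None,
                order_id: Optional[str] = None,
                user_id: Optional[str] = None) -> dict:
    """Build one generic clickstream record (hypothetical schema).

    Page-specific attributes default to None, so every page can send
    the same record shape to a single delivery stream.
    """
    return {
        "event_id": str(uuid.uuid4()),
        "event_time": datetime.now(timezone.utc).isoformat(),
        "anonymous_id": anonymous_id,   # the UUID stored in the cookie
        "event_type": event_type,
        "page": page,
        "item_id": item_id,             # only set on item pages
        "order_id": order_id,           # only set on checkout pages
        "user_id": user_id,             # only set once the user logs in
    }


def to_firehose_data(event: dict) -> bytes:
    """Serialize to newline-terminated JSON.

    Firehose concatenates records when writing to S3, so a trailing
    newline keeps them separable. The bytes could then be sent with
    boto3, e.g. client.put_record(DeliveryStreamName=...,
    Record={"Data": data}); the stream name is up to you.
    """
    return (json.dumps(event) + "\n").encode("utf-8")


def resolve_users(events: list) -> list:
    """Attach a resolved_user_id to every event of an identified visitor.

    The mapping comes from the (hypothetical) "identify" events that
    carry both the cookie UUID and the User_ID.
    """
    mapping = {
        e["anonymous_id"]: e["user_id"]
        for e in events
        if e["event_type"] == "identify" and e["user_id"]
    }
    return [
        dict(e, resolved_user_id=mapping.get(e["anonymous_id"]))
        for e in events
    ]


events = [
    build_event("cookie-a", "page_view", "item", item_id="sku-1"),
    build_event("cookie-a", "identify", "login", user_id="42"),
    build_event("cookie-b", "page_view", "home"),
]
resolved = resolve_users(events)
```

With this shape, the item view recorded before login for `cookie-a` ends up with the same resolved user as the login event, while `cookie-b` stays anonymous. Whether the `identify` events should go through the same delivery stream or a separate one is exactly what I am unsure about.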
Regards, Lina
[Update]: Is the following diagram an acceptable solution for the question?