I have two tables:
"Sessions" - it have int key identity, "session_id" - varchar, "device_category" - varchar and some other colums. There are 149239 rows.
Session_events" - it have int key identity, "session_id" - uniqueidentifier and some other fields. There are 3140768 rows there.
This tables has been imported from not relational database - Cassandra, so I not created any connections in MS SQL Server designer. But real connection between Sessions and Session_events on column session_id is Many-To-Many
Now I want to delete all web-sessions that was not take place on Personal Computer "device_category". So I run request Delete * FROM sessions where device_category != "PC"
that was fast. Now I want to to delete all not PC sessions from Session_events table. So I run request
Delete FROM session_events where session_id Not In (SELECT distinct session_id FROM sessions)
That request is currently running for more then 24 hour and I don't know how long it can take...
(I have 16 GB ram and Intel Xenon).
I know that Left Join can be faster but 20% is not interesting. Do you see the way to finish my task much faster?
----
CREATE TABLE [dbo].[session_events](
[key] [bigint] IDENTITY(1,1) NOT NULL,
[session_id] [uniqueidentifier](max) NULL,
[visitor_id] [uniqueidentifier] NULL,
[shipping_method] [varchar](max) NULL,
[shipping_price] [varchar](max) NULL,
[site_id] [int] NULL,
[stream_data_chunk] [varbinary](max) NULL,
[total] [varchar](max) NULL,
[total_inc_tax] [varchar](max) NULL,
[tracker_ver] [varchar](max) NULL
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
CREATE TABLE [dbo].[visitor_sessions](
[key] [int] IDENTITY(1,1) NOT NULL,
[visitor_id] [varchar](max) NULL,
[created] [varchar](max) NULL,
[session_id] [varchar](max) NULL
)
CONSTRAINT [PK_visitor_sessions4] PRIMARY KEY CLUSTERED
(
[key] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
SET ANSI_PADDING OFF
GO