14

I am extracting content of the Files in SQL File Table. The following code works if I do not use Parallel.

I am getting the following exception, when reading sql file stream simultaneously (Parallel).

The process cannot access the file specified because it has been opened in another transaction.

TL;DR:

When reading a file from FileTable (using GET_FILESTREAM_TRANSACTION_CONTEXT) in a Parallel.ForEach I get the above exception.

Sample Code for you to try out:

https://gist.github.com/NerdPad/6d9b399f2f5f5e5c6519

Longer Version:

Fetch Attachments, and extract content:

var documents = new List<ExtractedContent>();
using (var ts = new TransactionScope(TransactionScopeAsyncFlowOption.Enabled))
{
    var attachments = await dao.GetAttachmentsAsync();

    // Extract the content simultaneously
    // documents = attachments.ToDbDocuments().ToList(); // This works
    Parallel.ForEach(attachments, a => documents.Add(a.ToDbDocument())); // this doesn't

    ts.Complete();
}

DAO Read File Table:

public async Task<IEnumerable<SearchAttachment>> GetAttachmentsAsync()
{
    try
    {
        var commandStr = "....";

        IEnumerable<SearchAttachment> attachments = null;
        using (var connection = new SqlConnection(this.DatabaseContext.Database.Connection.ConnectionString))
        using (var command = new SqlCommand(commandStr, connection))
        {
            connection.Open();

            using (var reader = await command.ExecuteReaderAsync())
            {
                attachments = reader.ToSearchAttachments().ToList();
            }
        }

        return attachments;
    }
    catch (System.Exception)
    {
        throw;
    }
}

Create objects for each file: The object contains a reference to the GET_FILESTREAM_TRANSACTION_CONTEXT

public static IEnumerable<SearchAttachment> ToSearchAttachments(this SqlDataReader reader)
{
    if (!reader.HasRows)
    {
        yield break;
    }

    // Convert each row to SearchAttachment
    while (reader.Read())
    {
        yield return new SearchAttachment
        {
            ...
            ...
            UNCPath = reader.To<string>(Constants.UNCPath),
            ContentStream = reader.To<byte[]>(Constants.Stream) // GET_FILESTREAM_TRANSACTION_CONTEXT() 
            ...
            ...
        };
    }
}

Read the file using SqlFileStream: Exception is thrown here

public static ExtractedContent ToDbDocument(this SearchAttachment attachment)
{
    // Read the file
    // Exception is thrown here
    using (var stream = new SqlFileStream(attachment.UNCPath, attachment.ContentStream, FileAccess.Read, FileOptions.SequentialScan, 4096))
    {
        ...
        // extract content from the file
    }

    ....
}

Update 1:

According to this article it seems like it could be an Isolation level issue. Has anyone ever faced similar issue?

Zuhaib
  • 1,420
  • 3
  • 18
  • 34
  • Try opening the file on the same thread that did the rest of the SQL. Maybe this is simply not allowed. – usr May 11 '15 at 21:11
  • You are writing to `documents` on multiple threads, `List` is not thread safe and you can not do this (It is not likely the source of your problem but it is a problem) – Scott Chamberlain May 11 '15 at 22:01

1 Answers1

4

The transaction does not flow in to the Parallel.ForEach, you must manually bring the transaction in.

//Switched to a thread safe collection.
var documents = new ConcurrentQueue<ExtractedContent>();
using (var ts = new TransactionScope(TransactionScopeAsyncFlowOption.Enabled))
{
    var attachments = await dao.GetAttachmentsAsync();
    //Grab a reference to the current transaction.
    var transaction = Transaction.Current;
    Parallel.ForEach(attachments, a =>
    {
        //Spawn a dependant clone of the transaction
        using (var depTs = transaction.DependentClone(DependentCloneOption.RollbackIfNotComplete))
        {
            documents.Enqueue(a.ToDbDocument());
            depTs.Complete();
        }
    });

    ts.Complete();
}

I also switched from List<ExtractedContent> to ConcurrentQueue<ExtractedContent> because you are not allowed call .Add( on a list from multiple threads at the same time.

Scott Chamberlain
  • 124,994
  • 33
  • 282
  • 431
  • Assuming the transaction is copied over, is concurrent SQL filestream access safe? – usr May 11 '15 at 22:08
  • @usr Yes, that is one of the benefits of using [SQL filestreams](https://technet.microsoft.com/en-us/library/bb933993(v=sql.105).aspx), they can take part in transactions. The actual file access is done via UNC network shares to a transient path that exists for the lifetime of the transaction. It would be no different than opening two Read only `FileStream` objects to the same network path. – Scott Chamberlain May 11 '15 at 22:16
  • 2
    @usr Out of curiosity I checked the reference source. SqlFileStream is just a wrapper around a normal FileStream and a managed wrapper around a [FILE_FULL_EA_INFORMATION](https://msdn.microsoft.com/en-us/library/windows/hardware/ff545793(v=vs.85).aspx) (The `byte[]` you pass in to the constructor is the data for that struct) – Scott Chamberlain May 11 '15 at 22:21
  • That's pretty conclusive assuming that the SQL Server file share can take concurrent access to the same file which seems likely. – usr May 11 '15 at 22:25
  • Just to sure I verified the data again. There are no duplicates in the collection. So each thread should be accessing a different file. – Zuhaib May 12 '15 at 14:56
  • This isn't working for me. Could you please check this gist that I created. https://gist.github.com/NerdPad/6d9b399f2f5f5e5c6519 – Zuhaib May 12 '15 at 16:02
  • Looks like it's an isolation level issue. – Zuhaib May 12 '15 at 18:07