4

I have a list of ids. How can I select all rows whose id is in the list and preserve the order?

This is what I have:

var ids = new int[]{5, 1, 8, 2, 3};
var items = Db.Items.Where(x => ids.Contains(x.Id));

But it doesn't preserve the order.

I know I can sort items based on ids, but I want O(n) complexity.

This question is similar to: Select multiple records based on list of Id's with linq but I need to keep the rows in order

Mathieu Renda
  • 14,069
  • 2
  • 35
  • 33
ison
  • 173
  • 2
  • 9
  • In order to declare a 1-d `int` array you should use `var ids = new int[] {5, 1, 8, 2, 3};`. – Willem Van Onsem Aug 24 '17 at 20:36
  • 3
    Furthermore this query is "compiled" so the where probably done at database level. Most databases do not make any guarantees on the order, except if one uses an `ORDER BY` statement (or the LINQ equivalent). – Willem Van Onsem Aug 24 '17 at 20:37
  • Thanks. It's done at database level, but there should be a way to perform my query and keep O(n) complexity, because it's theoretically possible. If there's no syntax to get this result then it seems like a design flaw. e.g. I can execute N queries and preserve the order, and have proper complexity. But I'd like to do this in 1 query. – ison Aug 24 '17 at 20:40
  • no, since a query without a where has no guaranteed order as well. If you would reimport the database, modify a few rows, etc. it can have dramatic impact on the order in which objects are obtained. – Willem Van Onsem Aug 24 '17 at 20:46
  • You're missing my point. I'm not saying Where should preserve order. I'm saying that there should be a way to achieve what I want in a single query, because it's possible to achieve the same result with N queries. If it's not possible, then it looks like a design flaw. – ison Aug 24 '17 at 20:50
  • This is interesting. I'm not sure if `O(n)` is possible if you want to keep the order as well. If you have `n` items in the DB and `m` in your list, isn't the worst caste scenario `O(m*n)`? – Sach Aug 24 '17 at 20:52
  • @ison: not per se. Mind that if you do `.Where` at the database level, that does *not* mean it does *n* queries per se. Usually a database will check if it has an index build around the `where` column, then it can boost performance by a lookup in a hastable, B-tree, etc. Why should `.Where` have to guarantee order if `Db.Items` has a random order. You can use `Db.Items.AsEnumerable().Where(..)`, but this will have a bad impact on performance. – Willem Van Onsem Aug 24 '17 at 20:52
  • Sach: I think theoretically it is, but realistically it's something like O(m*log(n)) or even O(m), because Contains is translated by EF into a simple Join operation. @Willem Van Onsem it means that there should be some kind of a new "FindMany" operation which you could execute at the database level which yields rows of given ids in the same order. Such operation would be easy to implement at the database level I think. – ison Aug 24 '17 at 20:57
  • @ison I don't understand what kind of order you would like to preserve. Do you want to have items with id==5 first, then with id==1 and so on? – Jakub Dąbek Aug 24 '17 at 21:00
  • @JakubDąbek Yes. In this case the order should be id==5 first, then 1, 8, 2, 3. – ison Aug 24 '17 at 21:03
  • @ison Actually `Contains()` can have a complexity ranging from `O(1)` to `O(n)` depending on the data structure; a `HashSet` would be `O(1)` whereas a `List` would be `O(n)`. So it's not really possible given that you have a list. [See here](https://stackoverflow.com/questions/2799427/what-guarantees-are-there-on-the-run-time-complexity-big-o-of-linq-methods). – Sach Aug 24 '17 at 21:10
  • 1
    @Sach Normally you would be right, but in this case LINQ operations are handled specially by Entity Framework. The query is analyzed and translated into an SQL query. In this case it's smart enough to translate Contains into a Join operation. – ison Aug 24 '17 at 21:12
  • I'm sure you would have tried this, and I'm not at work to try it and see if it works as a db query, but can you `var items = db.Items.Where(x => ids.Contains(x.Id)).OrderBy(x => ids.IndexOf(x.Id));` ? – Craig H Aug 24 '17 at 22:14
  • @CraigH great one-liner, but it's not optimal, IndexOf is linear – ison Aug 25 '17 at 15:12

4 Answers4

3

I'm not into queries and databases that much but you can do this quite quickly with just doing C# tricks

var ids = new int[]{5, 1, 8, 2, 3};
var dict = ids.Select((id, i) => new { Index = i, Id = id })
    .ToDictionary(x => x.Id, x => x.Index);
var items = Db.Items
    .Where(x => ids.Contains(x.Id))
    .OrderBy(x => dict[x.Id]);

I can't tell how it will get translated into a database query


I haven't tested it, but here'e a version without OrderBy, but less space-efficient (it might actually be slower):

var ids = new int[]{5, 1, 8, 2, 3};
var temp = Db.Items
.Where(x => ids.Contains(x.Id))
.ToLookup(x => x.Id);

var tempList = new List<IGrouping<int, Item>>();
for(int i = 0; i < ids.Length; i++)
{
    tempList.Add(temp[ids[i]]);
}

var items = tempList.SelectMany(x => x);

There is also another way - simply do a reverse join:

var ids = new int[]{5, 1, 8, 2, 3};
var items = from id in ids
            join item in Db.Items
            on id equals item.Id
            select item;

This will result in a query sorted by ids

Jakub Dąbek
  • 1,044
  • 1
  • 8
  • 17
  • This is the best answer, this could easily be wrapped into an extension method too for re-use. – Oliver Aug 25 '17 at 09:51
  • Looks good, but in theory there should be a better approach without having to use OrderBy. – ison Aug 25 '17 at 15:13
  • @ison I just got some fresh ideas if you would like to see something I have come up with – Jakub Dąbek Aug 25 '17 at 18:02
  • @JakubDąbek The new solution looks great and does exactly what I need, but won't it execute N queries instead of just 1? I think that starting with "from id in ids" will result in a simple non-queryable LINQ operation executed in place it was called instead of the database, but I could be wrong. – ison Aug 28 '17 at 16:38
  • @ison I've never worked with databases, so I can't tell you how it will behave. I only have experience with plain LINQ. You'd have to test it yourself – Jakub Dąbek Aug 28 '17 at 16:55
2

What about projecting into an intermediate type to preserve the original index, building a union, and sorting over the indexes?

class ItemWithIndex
{
    public int Index { get; set; }
    public Item Item { get; set; }
}

class Item
{
    public int Id { get; set; }
}

int[] ids = { 5, 1, 8, 2, 3 };

IQueryable<ItemWithIndex> query = null;

for(int index = 0; index < ids.Length; index++)
{
    int currentIndex = index;
    int currentId = ids[index];

    IQueryable<ItemWithIndex> next = db.Items
        .Where(i => i.Id == currentId)
        .Select(i => new ItemWithIndex { Index = currentIndex, Item = i });

    query = query == null ? next : query.Concat(next);
}

ItemWithIndex[] items = query
    .OrderBy(i => i.Index)
    .ToArray();

Here is the generated query:

SELECT 
    [UnionAll4].[Id] AS [C1], 
    [UnionAll4].[C1] AS [C2], 
    [UnionAll4].[Id1] AS [C3]
    FROM  (SELECT 
        [Extent1].[Id] AS [Id], 
        @p__linq__1 AS [C1], 
        [Extent1].[Id] AS [Id1]
        FROM [dbo].[Items] AS [Extent1]
        WHERE [Extent1].[Id] = @p__linq__0
    UNION ALL
        SELECT 
        [Extent2].[Id] AS [Id], 
        @p__linq__3 AS [C1], 
        [Extent2].[Id] AS [Id1]
        FROM [dbo].[Items] AS [Extent2]
        WHERE [Extent2].[Id] = @p__linq__2
    UNION ALL
        SELECT 
        [Extent3].[Id] AS [Id], 
        @p__linq__5 AS [C1], 
        [Extent3].[Id] AS [Id1]
        FROM [dbo].[Items] AS [Extent3]
        WHERE [Extent3].[Id] = @p__linq__4
    UNION ALL
        SELECT 
        [Extent4].[Id] AS [Id], 
        @p__linq__7 AS [C1], 
        [Extent4].[Id] AS [Id1]
        FROM [dbo].[Items] AS [Extent4]
        WHERE [Extent4].[Id] = @p__linq__6
    UNION ALL
        SELECT 
        [Extent5].[Id] AS [Id], 
        @p__linq__9 AS [C1], 
        [Extent5].[Id] AS [Id1]
        FROM [dbo].[Items] AS [Extent5]
        WHERE [Extent5].[Id] = @p__linq__8) AS [UnionAll4]
Mathieu Renda
  • 14,069
  • 2
  • 35
  • 33
  • This still does N queries – Jakub Dąbek Aug 24 '17 at 21:24
  • I think it does one round trip to the database so it looks good. But the problem is that you still have to sort the elements. Ideally they should come sorted from the database. – ison Aug 24 '17 at 21:28
  • Actually, I have the SQL Server profiler opened in front of me right now, and it all runs in a single (very ugly) query. – Mathieu Renda Aug 24 '17 at 21:29
  • They DO come sorted from the database. The OrdeyBy generates an ORDER BY in the query. – Mathieu Renda Aug 24 '17 at 21:30
  • Oh, this is interesting then. I wonder if it outperforms the multiple round trips approach. It looks a bit complex, but it's an interesting solution. – ison Aug 24 '17 at 21:32
  • Your idea is good, but Instead of `ids = { 5, 1, 8, 2, 3 }` go with `ids = {{ 5,1}, {1,2}, {8,3}, {2,4}, {3,5} }` – Juan Carlos Oropeza Aug 24 '17 at 21:37
  • It's (much) cleaner with Concat. I fixed the answer. – Mathieu Renda Aug 24 '17 at 21:38
  • It probably outperforms the multiple round trips approach. But I'd guess that the increase in complexity might cost you more than retrieving the rows in unsorted order from the database and sorting them in-memory afterwards. – Mathieu Renda Aug 24 '17 at 21:46
  • @MathieuRenda you can still change `query = query == null ? next : query.Concat(next);` into `query = query?.Concat(next) ?? next;` – Jakub Dąbek Aug 24 '17 at 21:54
  • In theory there should be a better solution without having to use OrderBy. E.g. it should be possible to execute N queries (items.Find(id)) with a single round trip to the database. But it looks like unfortunately it's not possible for some reason. Though I'd like someone to come and prove me wrong. – ison Aug 25 '17 at 15:18
  • EF Plus has something like that: https://github.com/zzzprojects/EntityFramework-Plus/wiki/EF-Query-Future-%7C-Entity-Framework-Combine-and-Execute-Multiple-SQL-Command – Mathieu Renda Aug 26 '17 at 07:30
1

I'd say that,

1st, you can get the items as you did before:

var ids = new int[]{5, 1, 8, 2, 3};
var items = Db.Items.Where(x => ids.Contains(x.Id));

and then you could do something like:

var orderedItems = new int[ids.Length()] // sorry, I'm codign in SO edit, not sure on the syntax
foreach(id in items)
{
var position = Array.IndexOf(items, id)
orderedITems[position] = id;
}

That should do what you asked (also could be simplified in a single line).

I hope it helps,

Juan

Juan
  • 2,156
  • 18
  • 26
0

This is a possible solution:

var ids = new int[] {5, 1, 8, 2, 3};
var items = new List<Item>();
for (int i = 0; i < ids.Length; i++)
{
    items.Add(Db.Items.Find(ids[i]));
}

However, it performs N queries, so there should be a better way.

ison
  • 173
  • 2
  • 9