
While writing a plugin for Jekyll, I got stuck on some Ruby code, as I am not yet familiar with how to use .select, .map and their friends on arrays (except for simple single-array cases).

So my problem now is this: I have an array of docs, each paired with its sha1sum:

[
    [
        #<Jekyll::Document _projects/1st-project.md collection=projects>,
        "8f918a3d8263a957c206a9864d3507aaa5277a79"
    ],
    [
        #<Jekyll::Document _posts/2nd-project.markdown collection=posts>,
        "ca81eda5d49100bdf23db16fe1c9d17040fd33f8"
    ],
    [
        #<Jekyll::Document _posts/3rd-project.markdown collection=posts>,
        "37bf18464b00c9a808436853bc6d868aff218eb6"
    ],
    ...
    ...
]

And on the other side, the hash of linkage groups like so:

{
    "linkage_groups"=>
    [
        {
            "group_name"=>"test1",
            "sha1_sums"=> [
                "ca81eda5d49100bdf23db16fe1c9d17040fd33f8",
                "37bf18464b00c9a808436853bc6d868aff218eb6"
            ]
        },
        {
            "group_name"=>"test2",
            "sha1_sums"=> [
                "154c255d59e6063fc609ade2254dc1b09d1de8ab",
                "3e9ef9f059786888e39df608d104263cf0fdae76"
            ]
        }
    ]
}

How can I cycle through these groups one at a time and, for each group, return the docs from the array above whose sha1_sum is present in that group's sha1_sums?

The expected output is an array of hashes, one per group, each holding the docs whose sha1_sum is present in that group.

For example, the 2nd and 3rd projects fulfill the condition because their sha1 sums are in the group named test1:

[
    {
        "test1" => [#<Jekyll::Document _posts/2nd-project.markdown collection=posts>, #<Jekyll::Document _posts/3rd-project.markdown collection=posts>]
    },
    {
        "test2" => [..., ...]
    },
    {
        "test3" => [..., ...]
    },
    ...
]

Following the reply from @Lukas Baliak:

Here is what I get when both hashes belong to the same group:

{
    "ca81eda5d49100bdf23db16fe1c9d17040fd33f8"=>"test1",
    "b673be35ad73ab48da23b271ab0dda95ea07c905"=>"test1",
    "154c255d59e6063fc609ade2254dc1b09d1de8ab"=>"test2",
    "3e9ef9f059786888e39df608d104263cf0fdae76"=>"test2"
}

[
    [
        #<Jekyll::Document _projects/my-first-project.md collection=projects>,
        "b673be35ad73ab48da23b271ab0dda95ea07c905"
    ],
    [
        #<Jekyll::Document _posts/2016-06-05-one-more-post.markdown collection=posts>,
        "ca81eda5d49100bdf23db16fe1c9d17040fd33f8"
    ]
]

{
    "test1"=> [#<Jekyll::Document _posts/2016-06-05-one-more-post.markdown collection=posts>, "ca81eda5d49100bdf23db16fe1c9d17040fd33f8"]
}

Only one document is listed, why? Where is b673be35ad73ab48da23b271ab0dda95ea07c905?

branquito

2 Answers


I prefer to use simple data structures, so I "migrate" the groups to a Hash.

doc

doc = [
  [
    "#<Jekyll::Document _projects/my-first-project.md collection=projects>",
    "8f918a3d8263a957c206a9864d3507aaa5277a79"
  ],
  [
   "#<Jekyll::Document _posts/2016-06-05-one-more-post.markdown collection=posts>",
    "ca81eda5d49100bdf23db16fe1c9d17040fd33f8"
  ]
]

group_config

group_config = {
  "linkage_groups" => [
    {
      "group_name" => "test1",
      "sha1_sums" => [
        "ca81eda5d49100bdf23db16fe1c9d17040fd33f8",
        "b673be35ad73ab48da23b271ab0dda95ea07c905"
      ]
    },
    {
      "group_name" => "test2",
      "sha1_sums" => [
        "154c255d59e6063fc609ade2254dc1b09d1de8ab",
        "8f918a3d8263a957c206a9864d3507aaa5277a79"
      ]
    }
  ]
}

Migrate to Hash

groups = group_config["linkage_groups"].each_with_object({}) do |h, exp|
  h["sha1_sums"].each { |sha1| exp[sha1] = h["group_name"] }
end

export groups

p groups

# {
#   "ca81eda5d49100bdf23db16fe1c9d17040fd33f8" => "test1",
#   "b673be35ad73ab48da23b271ab0dda95ea07c905" => "test1",
#   "154c255d59e6063fc609ade2254dc1b09d1de8ab" => "test2",
#   "8f918a3d8263a957c206a9864d3507aaa5277a79" => "test2"
# }
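
For readers new to `each_with_object`, the migration above can be read as a plain loop that fills a hash and then returns it. Here is a minimal sketch of that equivalence, assuming a shortened, made-up config (`a1`, `b2` and `c3` are placeholder strings, not real sha1 sums; `with_helper` and `by_hand` are names chosen only for this illustration):

```ruby
# Hypothetical, shortened config; only the structure mirrors the real one.
group_config = {
  "linkage_groups" => [
    { "group_name" => "test1", "sha1_sums" => ["a1", "b2"] },
    { "group_name" => "test2", "sha1_sums" => ["c3"] }
  ]
}

# each_with_object hands the same accumulator (here {}) to every iteration
# and returns that accumulator when the loop is done.
with_helper = group_config["linkage_groups"].each_with_object({}) do |h, exp|
  h["sha1_sums"].each { |sha1| exp[sha1] = h["group_name"] }
end

# The same thing written as an explicit loop over a local hash.
by_hand = {}
group_config["linkage_groups"].each do |h|
  h["sha1_sums"].each { |sha1| by_hand[sha1] = h["group_name"] }
end

p with_helper == by_hand
```

Both variants build the same sha1-to-group-name lookup; `each_with_object` only saves declaring and returning the hash yourself.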

Then process the docs to generate the hash structure

export = doc.each_with_object({}) do |arr, exp|
  exp[groups[arr[1]]] = arr
end

output

p export

# {
#   "test2" => ["#<Jekyll::Document _projects/my-first-project.md collection=projects>", "8f918a3d8263a957c206a9864d3507aaa5277a79"],
#   "test1" => ["#<Jekyll::Document _posts/2016-06-05-one-more-post.markdown collection=posts>", "ca81eda5d49100bdf23db16fe1c9d17040fd33f8"]
# }

EDIT:

If you need more than one document per group, use this modification:

export = doc.each_with_object(Hash.new { |hash, key| hash[key] = [] }) do |arr, exp|
  exp[groups[arr[1]]] << arr
end
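
The `Hash.new { ... }` form in the modification above gives the hash a default block: it runs only when a missing key is read, and it receives the hash itself and the requested key. A small sketch of that behaviour in isolation, with made-up names (`lists`, `doc_a`, `doc_b`, `doc_c` are illustrative only):

```ruby
# The block is called with the hash and the missing key; storing [] under
# that key means each group gets its own array on first access.
lists = Hash.new { |hash, key| hash[key] = [] }

lists["test1"] << "doc_a"   # key missing: the block stores [] first, then << appends
lists["test1"] << "doc_b"   # key present: the block is not called again
lists["test2"] << "doc_c"

p lists
```

With a plain `{}` as the accumulator, `exp[key] << arr` would fail on the first document of each group, because reading a missing key returns `nil` and `nil` has no `<<` method.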

EDIT 2

OK, if you need one sha1 sum to belong to more than one group, you can use this update:

groups = group_config["linkage_groups"].each_with_object(Hash.new { |hash, key| hash[key] = [] }) do |h, exp|
  h["sha1_sums"].each { |sha1| exp[sha1] << h["group_name"] }
end

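groups (this output assumes 8f918a3d8263a957c206a9864d3507aaa5277a79 has also been added to the sha1_sums of test1)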

p groups

# {
#   "ca81eda5d49100bdf23db16fe1c9d17040fd33f8" => ["test1"],
#   "b673be35ad73ab48da23b271ab0dda95ea07c905" => ["test1"],
#   "8f918a3d8263a957c206a9864d3507aaa5277a79" => ["test1", "test2"],
#   "154c255d59e6063fc609ade2254dc1b09d1de8ab" => ["test2"]
# }

process

export = doc.each_with_object(Hash.new { |hash, key| hash[key] = [] }) do |arr, exp|
  groups[arr[1]].each { |group| exp[group] << arr }
end

output

p export

# {
#   "test1" => [
#     ["#<Jekyll::Document _projects/my-first-project.md collection=projects>", "8f918a3d8263a957c206a9864d3507aaa5277a79"],
#     ["#<Jekyll::Document _posts/2016-06-05-one-more-post.markdown collection=posts>", "ca81eda5d49100bdf23db16fe1c9d17040fd33f8"]
#   ],
#   "test2" => [
#     ["#<Jekyll::Document _projects/my-first-project.md collection=projects>", "8f918a3d8263a957c206a9864d3507aaa5277a79"],
#     ["#<Jekyll::Document _posts/2016-06-05-one-more-post.markdown collection=posts>", "154c255d59e6063fc609ade2254dc1b09d1de8ab"]
#   ]
# }

I hope this helps.

Lukas Baliak
  • Great, now just to wrap my head around these `each_with_object` things.. Not getting it. – branquito Jun 13 '16 at 13:16
  • Actually I got the first ones with an `{}`, but the last one with `Hash.new{}` is confusing. – branquito Jun 13 '16 at 13:22
  • @branquito maybe this will help: http://stackoverflow.com/questions/19064209/how-is-each-with-object-supposed-to-work and http://stackoverflow.com/questions/37679425/need-to-understand-hash-of-hashes-in-ruby/ – Lukas Baliak Jun 13 '16 at 13:23
  • Another problem occurred: your first hash migration does not allow `sha1 sums` to repeat in different combinations, which is the whole purpose of what I am doing: to make it possible to have, say, `9ce...` and `3bf...` in the `test1` group, and `9ce...` and `4a8...` in `test2`. But right now that `9ce`, as the `[ index ]`, gets overwritten the next time it is encountered. Is there some easy fix for this? Thanks. – branquito Jun 13 '16 at 23:02
  • And what is your expected output? – Lukas Baliak Jun 14 '16 at 07:22
  • And is it possible to have one sha1 hash inside two groups? – Lukas Baliak Jun 14 '16 at 08:49
  • exactly, but no duplicate hashes within the same group. – branquito Jun 14 '16 at 08:53
  • you've been of a great help – branquito Jun 14 '16 at 10:23
  • I am glad to help ;) – Lukas Baliak Jun 14 '16 at 10:26

You are not very clear in your question. Sometimes it helps to be detailed about what output is expected for which input. I assume the array contains the Document/Sha1 pairs and that hash contains the linkage_groups.

The hash looks like this:

{"linkage_groups"=>[{"group_name"=>"test1", "sha1_sums"=>["ca81eda5d49100bdf23db16fe1c9d17040fd33f8", "b673be35ad73ab48da23b271ab0dda95ea07c905"]}, {"group_name"=>"test1", "sha1_sums"=>["154c255d59e6063fc609ade2254dc1b09d1de8ab", "3e9ef9f059786888e39df608d104263cf0fdae76"]}]}

And the array looks like this:

[["Document1", "8f918a3d8263a957c206a9864d3507aaa5277a79"], ["Document2", "ca81eda5d49100bdf23db16fe1c9d17040fd33f8"]]

I'd try something like this:

hash["linkage_groups"].each { |group|  # for each linkage group
    group["sha1_sums"].each { |sha1|   # for each sha1 in the group
        array.each { |array_element|   # for each [document, sha1] pair in the array
            if array_element[1] == sha1
                puts "#{array_element[0]} found in #{group["group_name"]} with #{sha1}"
            end
        }
    }
}

I leave it up to you to manage how to return the elements according to your needs.
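
One possible way to return the elements instead of printing them is to combine `map` and `select`. The following is only a sketch under stated assumptions: plain strings stand in for the `Jekyll::Document` objects, the sha1 values are shortened placeholders, and the names `docs`, `config` and `result` are made up for the example.

```ruby
# Stand-in data: strings replace the Jekyll::Document objects and the
# sha1 values are shortened placeholders.
docs = [
  ["Document1", "aaa"],
  ["Document2", "bbb"],
  ["Document3", "ccc"]
]
config = {
  "linkage_groups" => [
    { "group_name" => "test1", "sha1_sums" => ["bbb", "ccc"] },
    { "group_name" => "test2", "sha1_sums" => ["aaa", "bbb"] }
  ]
}

# For each group, keep the pairs whose sha1 is listed in that group,
# then drop the sha1 and keep only the document.
result = config["linkage_groups"].map do |group|
  matching = docs.select { |_doc, sha1| group["sha1_sums"].include?(sha1) }
  { group["group_name"] => matching.map(&:first) }
end

p result
```

This yields one single-key hash per group, and a document whose sha1 appears in several groups is listed under each of them.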

Ely