I have a problem running my scrapers in threads. I have 3 services that scrape data from web pages, and I want to run them in 3 threads and see how they work together. In the future I also want to add more scrapers.
parser_controller.rb
def call_all_parsers
  file = File.read('app/controllers/matches.json')
  data = JSON.parse(file)
  threads = []
  data.each_key do |office|
    data[office].each_key do |link|
      if office == 'first_office'
        p threads << Thread.new { Services::Scrapers::FirstScraperService.new.parse(link, data[office][link]) }
      elsif office == 'second_office'
        p threads << Thread.new { Services::Scrapers::SecondScraperService.new.parse(link, data[office][link]) }
      elsif office == 'third_office'
        p threads << Thread.new { Services::Scrapers::ThirdScraperService.new.parse(link, data[office][link]) }
      end
    end
  end
  p threads.map(&:join)
  render 'calculate_arbitration/index'
end
When I call the call_all_parsers method, it hangs. How should I do this, or could you advise something to use instead of threads?
Update
My scrapers perform database operations (read/write/delete). When I said it hangs, I meant that the threads start running but no results appear in the database, and I don't know how long I should wait for the result. Here is an example of the status of the 3 threads:
Started GET "/parser" for 127.0.0.1 at 2019-12-18 13:08:35 +0300
Processing by ParserController#call_all_parsers as HTML
[Thread:0x000000031a2158@/home/test/web-programming/parser/backend/app/controllers/parser_controller.rb:19 run, Thread:0x0000000315ec50@/home/test/web-programming/parser/backend/app/controllers/parser_controller.rb:21 run, Thread:0x0000000315c450@/home/test/web-programming/parser/backend/app/controllers/parser_controller.rb:23 run]
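Since I don't know how long to wait, one thing that might help narrow it down is joining with a time limit instead of a bare join, so the request returns and shows which threads are still running. This is only a minimal standalone sketch and not a fix: FakeScraper and join_with_timeout are hypothetical names made up for the illustration, and the real services would replace the sleeping stand-in. It relies on Thread#join(limit) returning nil when the thread has not finished within the limit.

```ruby
# Hypothetical stand-in for the real scraper services; it only sleeps.
class FakeScraper
  def initialize(delay)
    @delay = delay
  end

  def parse(link, _payload)
    sleep(@delay)
    "parsed #{link}"
  end
end

# Hypothetical helper: waits at most `timeout` seconds in total for all
# threads and reports, per thread, whether it finished or timed out.
def join_with_timeout(threads, timeout)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  threads.map do |thread|
    now = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    remaining = [deadline - now, 0].max
    # Thread#join(limit) returns nil if the thread is still running
    # after `limit` seconds, otherwise it returns the thread itself.
    if thread.join(remaining)
      { status: :done, value: thread.value }
    else
      { status: :timeout, value: nil }
    end
  end
end

threads = [
  Thread.new { FakeScraper.new(0.01).parse('link_a', {}) }, # finishes quickly
  Thread.new { FakeScraper.new(5).parse('link_b', {}) }     # simulates a stuck scraper
]

results = join_with_timeout(threads, 0.5)
p results

# Stop anything still running so this demo exits promptly.
threads.each(&:kill)
```

In the controller this would replace the `threads.map(&:join)` line, with a timeout chosen for the slowest scraper that is still acceptable.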