My question is similar to "RXJS: Single Observable from dynamically created Observables", which has no answer.

I want to parse some pages continuously. A main page provides a list of links to the pages to parse, and this list changes over time. I follow each link and keep parsing that page until its link disappears from the main page, and I start parsing new pages as their links appear on the main page.

My setup for now is as follows. I have a class that, given a URL, returns an Observable that continuously emits items (it parses a single link from the main page). This works great. However, I want a "master" class that returns an Observable of the same items, but taken from multiple pages. The problem is that the list of pages keeps changing (and so does the list of underlying Observables), so I can't just use Observable.merge.

TL;DR: I have multiple Observables that I want to merge. But this list of Observables is changing dynamically and I don't know how to handle this.

How can I approach this?

  • Is this what you are looking for? https://stackoverflow.com/questions/35254323/rxjs-observable-pagination/35494766#35494766 – Oles Savluk Sep 30 '18 at 18:29
  • @OlesSavluk thank you for your suggestion. I am not sure how that is related to my question, though. I think I can redefine how my link parser constructs `Observable` using your answer but I struggle with merging those while they get created and destroyed under the hood. – Mikhail Borisov Oct 02 '18 at 11:18

1 Answer

If you already have an Observable of the "main" page and a function that fetches items based on its data, you can use the switchMap operator to switch between the versions of this dynamically changing list, something like:

import { switchMap } from 'rxjs/operators';

getMainPages().pipe(
  switchMap(main => getItemsFromMultiplePages(main))
)

where:

  • getMainPages() - returns an Observable of the main page data
  • getItemsFromMultiplePages(main) - returns an Observable of items, created by combining (maybe using merge) the data from multiple pages
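
To make the behaviour concrete, here is a minimal sketch of that pipeline. Everything apart from `switchMap` and `merge` is an assumption made for illustration: `mainPage` is a `Subject` standing in for `getMainPages()`, and `parsePage` is a hypothetical per-link parser backed by a `Subject` so the example can be driven by hand.

```typescript
import { Observable, Subject, merge } from 'rxjs';
import { switchMap } from 'rxjs/operators';

// Hypothetical stand-ins: one Subject per link plays the role of a page parser.
const pageSources = new Map<string, Subject<string>>();

function parsePage(url: string): Observable<string> {
  let source = pageSources.get(url);
  if (!source) {
    source = new Subject<string>();
    pageSources.set(url, source);
  }
  return source.asObservable();
}

// Merge the item streams of every link currently listed on the main page.
function getItemsFromMultiplePages(links: string[]): Observable<string> {
  return merge(...links.map(url => parsePage(url)));
}

// Stand-in for getMainPages(): emits the current list of links.
const mainPage = new Subject<string[]>();

// Each new list replaces the previous merged stream.
const items$ = mainPage.pipe(
  switchMap(links => getItemsFromMultiplePages(links))
);

const received: string[] = [];
items$.subscribe(item => received.push(item));

mainPage.next(['/a', '/b']);
pageSources.get('/a')!.next('a1');
pageSources.get('/b')!.next('b1');

mainPage.next(['/b', '/c']);        // '/a' disappears, '/c' appears
pageSources.get('/a')!.next('a2');  // dropped: '/a' is no longer subscribed
pageSources.get('/b')!.next('b2');
pageSources.get('/c')!.next('c1');

console.log(received);
```

Note that `switchMap` unsubscribes from the whole previous merged stream on every new list, so a page that is still listed (here `/b`) is unsubscribed and subscribed again.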
Oles Savluk
  • Thank you, I think that solves my problem. The main issue I see with this in my context is that each `Observable` (basically each page) internally uses an instance of Selenium WebDriver, so I am going to initialize and destroy a lot of these objects (if I understood correctly how this pipeline works). Maybe this can be solved with some sort of caching in the `getItemsFromMultiplePages()` function. – Mikhail Borisov Oct 04 '18 at 10:13
  • I think you can use `concat` / `concatMap` inside this function, so that all actions are done sequentially (instead of in parallel, as with `merge`) and the WebDriver instance can be reused. You may want to create a separate question with your code examples to get a more detailed answer. – Oles Savluk Oct 04 '18 at 15:51
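
The sequential variant suggested in the last comment could look roughly like the sketch below. It rests on assumptions that are not in the thread: `FakeDriver` is a hypothetical stand-in for a Selenium WebDriver session, and `parsePageOnce` makes a single finite pass over a page. That second assumption matters, because `concatMap` only moves on to the next link once the current inner Observable completes, so a page Observable that never completes would block every page after it.

```typescript
import { Observable, defer, from } from 'rxjs';
import { concatMap } from 'rxjs/operators';

// Hypothetical stand-in for a WebDriver session: it only records visited URLs
// and returns two fake items per page.
class FakeDriver {
  visited: string[] = [];

  load(url: string): string[] {
    this.visited.push(url);
    return [`${url}#1`, `${url}#2`];
  }
}

// One finite pass over a single page. defer() delays the load until subscription,
// so pages are only visited when concatMap reaches them.
function parsePageOnce(driver: FakeDriver, url: string): Observable<string> {
  return defer(() => from(driver.load(url)));
}

// Visit the links one after another, reusing the same driver for all of them.
function getItemsSequentially(driver: FakeDriver, links: string[]): Observable<string> {
  return from(links).pipe(
    concatMap(url => parsePageOnce(driver, url))
  );
}

const driver = new FakeDriver();
const seen: string[] = [];
getItemsSequentially(driver, ['/a', '/b']).subscribe(item => seen.push(item));

console.log(seen, driver.visited);
```

The trade-off in this sketch is that pages are polled in turn by one driver rather than watched in parallel, so updates from a page are only picked up when its turn comes around again.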