I wrote code that scrapes a website continuously, and after several iterations I got this message:

    <--- Last few GCs --->

    [17744:00000270608DE2C0] 16122001 ms: Scavenge 2023.5 (2082.0) -> 2017.3 (2082.5) MB, 3.6 / 0.1 ms  (average mu = 0.908, current mu = 0.941) task
    [17744:00000270608DE2C0] 16122645 ms: Scavenge 2023.9 (2082.5) -> 2017.7 (2083.0) MB, 3.5 / 0.0 ms  (average mu = 0.908, current mu = 0.941) task
    [17744:00000270608DE2C0] 16128334 ms: Scavenge 2024.1 (2083.0) -> 2017.7 (2099.0) MB, 4.7 / 0.0 ms  (average mu = 0.908, current mu = 0.941) task

    <--- JS stacktrace --->

    FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
     1: 00007FF66A07013F v8::internal::CodeObjectRegistry::~CodeObjectRegistry+112495
     2: 00007FF669FFF396 DSA_meth_get_flags+65526
     3: 00007FF66A00024D node::OnFatalError+301
     4: 00007FF66A9319EE v8::Isolate::ReportExternalAllocationLimitReached+94
     5: 00007FF66A91BECD v8::SharedArrayBuffer::Externalize+781
     6: 00007FF66A7BF61C v8::internal::Heap::EphemeronKeyWriteBarrierFromCode+1468
     7: 00007FF66A7BC754 v8::internal::Heap::CollectGarbage+4244
     8: 00007FF66A76C3B5 v8::internal::IndexGenerator::~IndexGenerator+22165
     9: 00007FF669F90E9F v8::CFunctionInfo::HasOptions+22111
    10: 00007FF669F8F6B6 v8::CFunctionInfo::HasOptions+15990
    11: 00007FF66A0CF25B uv_async_send+331
    12: 00007FF66A0CE9EC uv_loop_init+1292
    13: 00007FF66A0CEB8A uv_run+202
    14: 00007FF66A09DC95 node::SpinEventLoop+309
    15: 00007FF669FB7AC3 cppgc::internal::NormalPageSpace::linear_allocation_buffer+53827
    16: 00007FF66A034FBD node::Start+221
    17: 00007FF669E588CC RC4_options+348108
    18: 00007FF66AEB08F8 v8::internal::compiler::RepresentationChanger::Uint32OverflowOperatorFor+14472
    19: 00007FFEB62C7034 BaseThreadInitThunk+20
    20: 00007FFEB78A2651 RtlUserThreadStart+33

After that my code stops working. Does anyone who has had this problem know how to solve it? I'm using Python 3.8.8 and Playwright 1.22.0.

I imported this library to load the web page:

    from playwright.sync_api import sync_playwright

Thanks guys!

  • [Refer to this link for details about the heap limit allocation](https://stackoverflow.com/questions/53230823/fatal-error-ineffective-mark-compacts-near-heap-limit-allocation-failed-javas?page=1&tab=scoredesc#tab-top) – Mohanraj Jul 22 '22 at 12:32

3 Answers

Please refer to this Stack Overflow link for more details. I hope it helps solve your issue.

Mohanraj

As of Q1 2023, this is probably the best response: https://github.com/microsoft/playwright/issues/6319#issuecomment-1227405461

Save the browser's state (session, local storage, etc.) to a local file after creating the browser/context and performing whatever actions you need:

    context.StorageState("state.json")

Close the browser and context, and kill all node.exe processes every 30 minutes. This is where the memory leak occurs for me: if you don't kill them, a separate node.exe process is created every time, and the previous process remains in memory, taking up space.

Create a new browser/context, load the saved state, and navigate back to where you need to be:

    context, err := browser.NewContext(playwright.BrowserNewContextOptions{
        StorageStatePath: playwright.String("state.json"),
    })

If you have memory problems with Playwright, read the whole issue; you may find some ideas there: https://github.com/microsoft/playwright/issues/6319
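
For the Python sync API used in the question, the same save/restart/restore cycle might look roughly like the sketch below. It is not taken from the linked issue: `STATE_FILE`, `batches`, `scrape_batch`, `scrape_all` and the batch size of 50 are hypothetical names and values, and it restarts after a fixed number of pages rather than every 30 minutes. It also assumes that leaving the `with sync_playwright()` block shuts down the Node.js driver process, so every batch starts with a fresh one.

```python
import os
from itertools import islice

STATE_FILE = "state.json"  # hypothetical path for the saved session state


def batches(items, size):
    """Yield consecutive lists of at most `size` items."""
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def scrape_batch(urls, state_file=STATE_FILE):
    """Visit one batch of URLs with a fresh driver, browser and context."""
    # Imported here so the batching helper works without Playwright installed.
    from playwright.sync_api import sync_playwright

    titles = []
    with sync_playwright() as p:
        browser = p.chromium.launch()
        # Restore cookies/local storage from the previous batch, if any.
        saved = state_file if os.path.exists(state_file) else None
        context = browser.new_context(storage_state=saved)
        page = context.new_page()
        for url in urls:
            page.goto(url)
            titles.append(page.title())  # placeholder for real scraping
        # Persist the session before tearing everything down.
        context.storage_state(path=state_file)
        context.close()
        browser.close()
    return titles


def scrape_all(urls, batch_size=50):
    """Scrape all URLs, restarting Playwright after every batch."""
    results = []
    for batch in batches(urls, batch_size):
        results.extend(scrape_batch(batch))
    return results
```

Collecting page titles is only a stand-in for the real scraping logic; the point is that nothing created by Playwright outlives a single batch.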

pbaranski

I had the same issue. I was using threads that referenced Playwright objects, and those objects were not getting cleared. Make sure that any thread referencing a Playwright object actually dies. You can check out an example on my GitHub with a bare-bones browser pool that does not leak any memory with a clean thread pool: https://github.com/CrazedCoderNate/BrowserMemTest/tree/main
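
A minimal sketch of that idea with the Python sync API, assuming each thread creates its own Playwright instance and closes it before returning. `fetch_title` and `run_in_threads` are hypothetical helpers, not code from the linked repository, and the `worker` parameter exists only so the thread handling can be tried without a browser.

```python
import threading


def fetch_title(url):
    """Open a private Playwright instance, read the page title, clean up."""
    # Imported here so run_in_threads can be exercised without Playwright.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        try:
            page = browser.new_page()
            page.goto(url)
            return page.title()
        finally:
            # Close the browser even if navigation fails, so the thread
            # does not keep Playwright objects alive.
            browser.close()


def run_in_threads(urls, worker=fetch_title, timeout=60):
    """Run `worker` once per URL in its own thread.

    Returns the collected results and the names of any threads that were
    still alive after `timeout` seconds.
    """
    results = {}

    def target(url):
        results[url] = worker(url)

    threads = [threading.Thread(target=target, args=(u,)) for u in urls]
    for t in threads:
        t.start()
    for t in threads:
        t.join(timeout)
    stuck = [t.name for t in threads if t.is_alive()]
    return results, stuck
```

If `stuck` is ever non-empty, those threads are still holding their Playwright objects, which is the situation described above.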

CrazedCoder