8

My php-fpm process is facing performance issues on Ubuntu 14.04 LTS (Nginx server, MariaDB database).

strace -f $(pidof php-fpm7.1 | sed 's/\([0-9]*\)/\-p \1/g')

Gave me

<... epoll_wait resumed> {}, 1, 1000) = 0
[pid 32533] epoll_wait(8, {}, 1, 103)   = 0
[pid 32533] epoll_wait(8,  <unfinished ...>
[pid 32535] mmap(NULL, 2097152, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fd933fdd000
[pid 32535] munmap(0x7fd933fdd000, 2097152) = 0
[pid 32535] mmap(NULL, 4190208, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fd933dde000
[pid 32535] munmap(0x7fd933dde000, 139264) = 0
[pid 32535] munmap(0x7fd934000000, 1953792) = 0
[pid 32535] madvise(0x7fd933e00000, 2097152, MADV_HUGEPAGE) = 0
[pid 32533] <... epoll_wait resumed> {}, 1, 897) = 0
[pid 32533] epoll_wait(8, {}, 1, 1000)  = 0
[pid 32533] epoll_wait(8, {}, 1, 1000)  = 0
[pid 32533] epoll_wait(8,  <unfinished ...>
[pid 32535] mmap(NULL, 2097152, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fd933c00000
[pid 32535] madvise(0x7fd933c00000, 2097152, MADV_HUGEPAGE) = 0
[pid 32533] <... epoll_wait resumed> {}, 1, 1000) = 0
[pid 32533] epoll_wait(8, {}, 1, 1000)  = 0
[pid 32533] epoll_wait(8, {}, 1, 1000)  = 0
[pid 32533] epoll_wait(8, {}, 1, 1000)  = 0
[pid 32533] epoll_wait(8,  <unfinished ...>
[pid 32535] mmap(NULL, 2097152, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fd933a00000
[pid 32535] madvise(0x7fd933a00000, 2097152, MADV_HUGEPAGE) = 0
[pid 32533] <... epoll_wait resumed> {}, 1, 1000) = 0
[pid 32533] epoll_wait(8,  <unfinished ...>
[pid 32535] open("/usr/share/zoneinfo/UTC", O_RDONLY) = 7
[pid 32535] fstat(7, {st_mode=S_IFREG|0644, st_size=118, ...}) = 0
[pid 32535] read(7, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 20) = 20
[pid 32535] lseek(7, 0, SEEK_SET)       = 0
[pid 32535] mmap(NULL, 118, PROT_READ, MAP_SHARED, 7, 0) = 0x7fd946835000
[pid 32535] close(7)                    = 0
[pid 32535] munmap(0x7fd946835000, 118) = 0
[pid 32535] pwrite(5, "_sf2_attributes|a:2:{s:14:\"_secu"..., 979, 0) = 979
[pid 32535] close(5)                    = 0
[pid 32535] mmap(NULL, 2097152, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fd933200000
[pid 32535] madvise(0x7fd933200000, 2097152, MADV_HUGEPAGE) = 0

I tried with php-fpm7.0, PHPMod7.1 but same issues.

CPU is up to 100% on requests with a small amount of data.

Configurations are the default ones.

On a duplicated instance php5.6-fpm works well.

Edit: Possibly related PHP script keeps doing mmap/munmap

Edit: I tried to enable hugepages https://wiki.debian.org/Hugepages A cat /proc/meminfo | grep Huge gave me

AnonHugePages:    108544 kB
HugePages_Total:     512
HugePages_Free:      497
HugePages_Rsvd:       50
HugePages_Surp:        0
Hugepagesize:       2048 kB

but still the same issue.

Edit: I tried to enable/disable OPCache, also set opcache.huge_code_pages=0, no results. There is no documentation about hugepages on http://php.net/

Etienne Gautier
  • 2,901
  • 4
  • 26
  • 35
  • Now happens on Ubuntu 16.04 LTS. – Etienne Gautier Sep 05 '17 at 12:13
  • What is the hypervisor used? – Wassim Dhif Sep 06 '17 at 09:52
  • Do you know if PHP is calling `mmap()`/`munmap()` directly, or is it through `malloc()`/`free()`? If it's through `malloc()`/`free()`, you can try setting the `M_MMAP_THRESHOLD` environment variable to the equivalent of something like 4MB or 8MB so the process uses "normal" heap memory for the 2MB allocations you're seeing. See [the `mallopt()` man page for details.](http://man7.org/linux/man-pages/man3/mallopt.3.html) – Andrew Henle Sep 06 '17 at 09:54
  • @WassimDhif ESXI – Etienne Gautier Sep 06 '17 at 11:40
  • @AndrewHenle Thanks, I'll try it but I'm pretty sure it uses `mmap`/`munmap()` directly. – Etienne Gautier Sep 06 '17 at 11:42
  • Cf http://developers-club.com/posts/270685/ – Etienne Gautier Sep 06 '17 at 11:45
  • @EtienneGautier Be aware that http://developers-club.com/posts/270685 is pretty poor code. Among other problems, it casts a `void *` to an `unsigned long` to try aligning the resulting pointer. A `void *` doesn't necessarily fit into a `long`. The linked code will fail on 64-bit Windows, for example, hardly an uncommon platform.. See https://en.wikipedia.org/wiki/64-bit_computing#64-bit_data_models – Andrew Henle Sep 06 '17 at 12:16
  • You right, thanks for the information. – Etienne Gautier Sep 06 '17 at 12:50

3 Answers3

3

I'm not 100% sure if we are facing the same issue here, but it's the closest thing I could find in my search of Stack Overflow for my issue.

My findings were on a virtual machine with the following specs.

CPUs: 16

RAM: 16GB

OS: Ubuntu 16.04.4 LTS

Containers: LXD with ZFS filesystem using block devices

PHP Version: 7.1

I'm running a few LXD containers running MariaDB, Nginx, and PHP-FPM 7.1. These were dev environments, so I was only traffic hitting the server. The application running was built with the Symfony framework.

I was noticing extremely slow page load times. I would consistently wait over a minute for the page to load in dev mode. I didn't spend as much time confirming this as well, but CLI scripts also felt quite slow. I tried tweaking all kinds of PHP settings (realpath caches, opcache on/off/various settings, etc) but nothing worked.

I could consistently strace one of the PHP-FPM processes and see one system call that appeared to be slow. All the other calls would breeze by, but it was getting stuck on the following call many times throughout the process's lifespan.

madvise(0x7f4dcca00000, 2097152, MADV_HUGEPAGE) = 0

Long story short, I was able to drastically change the performance of this application by disabling THP. I ran the following commands on the LXD host and the page load times changed like night and day.

sysctl -w vm.nr_hugepages=0
echo never > /sys/kernel/mm/transparent_hugepage/enabled

I know that Redis recommends disabling THP because of performance issues related to copy-on-write. I also know ZFS filesystems also does copy-on-write, so maybe this issue is related?

William Byrne
  • 506
  • 2
  • 4
  • I knew it! I couldn't be alone facing this problem, this is too big. Unfortunately I don't work on this environment anymore, I'm afraid I do not have time to reproduce this behavior. By the way the application was built with Symfony as well. – Etienne Gautier Apr 08 '18 at 14:59
  • Same here, ZFS, LXC, and Symfony. Also having a large database and memory-hungry java application working. Confirmed on Linux 4.9 and 4.19. it took like 1 week to show up after reboot. – bhelm Jan 08 '19 at 07:30
1

What is the value selected in brackets when you do

cat /sys/kernel/mm/transparent_hugepage/enabled 

Make sure that it is either madvise (as mmaps request Transparent Huge Pages explicitly by MADV_HUGEPAGE) or always.

The next step would be to see if your kernel is not too busy doing compaction of pages (THP has to find contiguous chunk of memory of 2MB for a huge page, which can be difficult if physical memory is fragmented). Have a look at numbers at:

cat /proc/vmstat |grep compact

Before and after your run. Did they grow?

Next step would be to catch the kernel part of the stack of your process:

cat /proc/YOUR_PROCESS_PID/stack

Other useful commands are :

cat /proc/buddyinfo

to see the physical memory fragmentation. Look for the last two columns. They are free chunks of memory in sizes of 2MB and 4MB. Another one :

cat /proc/pagetypeinfo

and look for unmovable page blocks.

Jacek Tomaka
  • 422
  • 7
  • 15
0

Disabling transparent huge page support on the host worked for me:

echo never > /sys/kernel/mm/transparent_hugepage/enabled

In my case the problem happened when running composer in an LXD container on ZFS backend.

user13097
  • 343
  • 3
  • 12