How to reduce the time_starttransfer (TTFB) with AWS EC2

Question

My website is on AWS EC2.

I checked the TTFB (Time to First Byte) with this command:

curl --output /dev/null --silent --write-out "time_namelookup=%{time_namelookup}\ntime_connect=%{time_connect}\ntime_appconnect=%{time_appconnect}\ntime_pretransfer=%{time_pretransfer}\ntime_redirect=%{time_redirect}\ntime_starttransfer=%{time_starttransfer}\ntime_total=%{time_total}\n" --url http://13.37.46.163/

Here is the result when I run the command on my computer:

time_connect=0,014614
time_appconnect=0,000000
time_pretransfer=0,014657
time_redirect=0,000000
time_starttransfer=0,119092
time_total=0,134436

Here is the result when I run the command on the webserver itself:

time_namelookup=0.000058
time_connect=0.001296
time_appconnect=0.000000
time_pretransfer=0.001336
time_redirect=0.000000
time_starttransfer=0.084576
time_total=0.085031

I noticed that in both cases, the longest time is time_starttransfer. How can I reduce this time?

What is time_starttransfer?

The time, in seconds, it took from the start until the first byte was just about to be transferred. This includes time_pretransfer and also the time the server needed to calculate the result.
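As a rough way to read these numbers, subtracting time_pretransfer from time_starttransfer leaves the time between curl being ready to send the request and the first response byte arriving. When curl runs on the web server itself, the network part is negligible, so what remains is mostly server-side processing. The sketch below is only an illustration; the function name `server_time` is made up for this example.

```shell
# server_time PRETRANSFER STARTTRANSFER
# Prints time_starttransfer - time_pretransfer, i.e. roughly the wait for
# the first byte once the connection is ready. (Hypothetical helper name.)
server_time() {
  # LC_ALL=C so awk reads "0.084576" with a decimal point in any locale
  LC_ALL=C awk -v pre="$1" -v start="$2" 'BEGIN { printf "%.6f\n", start - pre }'
}

# Values measured on the web server itself (from the question)
server_time 0.001336 0.084576

# Values measured from the remote computer (decimal commas written as points)
server_time 0.014657 0.119092
```

With the figures from the question this gives about 83 ms on the server and about 104 ms from the remote computer, which suggests that most of the delay is spent on the server rather than on the network.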

My website config

My website link is: http://13.37.46.163/

It is a Grav CMS site which runs on EC2 + ServerPilot + PHP7

Amazon Machine Image (AMI)
Ubuntu Server 20.04 LTS (HVM), EBS General Purpose (SSD) Volume Type, 64-bit (x86)

EC2 instance type
t2.micro

Web server
Nginx

Programming language
PHP

Reverse proxy
Nginx

Caching
I already use Opcache, which is enabled as you can see here: http://13.37.46.163/info.php#module_zend+opcache

As for a CDN, I already use the Grav CDN Plugin (https://github.com/getgrav/grav-plugin-cdn).

My website logs (requests/min.)

i.e., on average 1 request / minute
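A requests-per-minute figure like this can also be read from the web server's access log. The sketch below assumes nginx's default "combined" log format, where the fourth whitespace-separated field looks like `[23/Jul/2021:19:13:01`. The function name `requests_per_minute` and the sample log lines are invented for illustration, and the real log path depends on the ServerPilot setup.

```shell
# requests_per_minute [FILE...]
# Counts requests per minute in nginx "combined" format logs (assumed format).
# Reads the given files, or stdin when no file is given.
requests_per_minute() {
  awk '{
         minute = substr($4, 2, 17)            # e.g. 23/Jul/2021:19:13
         if (!(minute in count)) order[++n] = minute
         count[minute]++
       }
       END { for (i = 1; i <= n; i++) print order[i], count[order[i]] }' "$@"
}

# Demo with made-up log lines (not taken from the real server)
requests_per_minute <<'EOF'
203.0.113.5 - - [23/Jul/2021:19:13:01 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.68.0"
203.0.113.5 - - [23/Jul/2021:19:13:40 +0000] "GET /main.js HTTP/1.1" 200 1024 "-" "curl/7.68.0"
203.0.113.9 - - [23/Jul/2021:19:14:02 +0000] "GET / HTTP/1.1" 200 612 "-" "Mozilla/5.0"
EOF
```

On a real server the function would be pointed at the access log file instead of the here-document, for example `requests_per_minute /path/to/access.log`.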

Test(s) performed

Running the TTFB test against a static file that PHP does NOT serve

I carried out the TTFB test on the 'main.js' file.

Here is the result:

time_namelookup=0.000034
time_connect=0.002659
time_appconnect=0.000000
time_pretransfer=0.002702
time_redirect=0.000000
time_starttransfer=0.003983
time_total=0.004026

Analysis of the result:
The result is satisfying (time_starttransfer=0.003983). But I think this result is due to the weight of the file, which is light compared to the entire site. We can deduce that the problem is on the PHP side rather than in Nginx.
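A single curl run can be noisy, so the static-versus-PHP comparison is more convincing when each URL is requested several times and the medians are compared. The sketch below is one possible way to do that. The helper names (`ttfb_once`, `median`, `ttfb_median`) are made up, and the static file path in the usage comment is a placeholder, since the question does not give the full path of main.js.

```shell
# ttfb_once URL: prints time_starttransfer (seconds) for one request.
ttfb_once() {
  # LC_ALL=C keeps the decimal point as "." regardless of the user's locale
  LC_ALL=C curl --output /dev/null --silent \
    --write-out '%{time_starttransfer}\n' --url "$1"
}

# median: reads one number per line on stdin and prints the median.
median() {
  LC_ALL=C sort -n | LC_ALL=C awk '
    { v[NR] = $1 }
    END {
      if (NR == 0) exit 1
      if (NR % 2) print v[(NR + 1) / 2]
      else printf "%.6f\n", (v[NR / 2] + v[NR / 2 + 1]) / 2
    }'
}

# ttfb_median URL [RUNS]: median TTFB over RUNS requests (default 10).
ttfb_median() {
  runs="${2:-10}"
  i=0
  while [ "$i" -lt "$runs" ]; do
    ttfb_once "$1"
    i=$((i + 1))
  done | median
}

# Usage (not run here; the main.js path is a placeholder):
#   ttfb_median http://13.37.46.163/
#   ttfb_median http://13.37.46.163/path/to/main.js
```

Since time_starttransfer stops at the first byte, the size of the response body should matter little. A large gap between the two medians therefore points at the time PHP needs to generate the page.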

Running the top and free commands to check what's running and what's using resources (what don't I need?)

Here is the result of the top command:

%Cpu(s):  4.0 us,  0.3 sy,  0.0 ni, 95.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :    978.6 total,     75.8 free,    332.2 used,    570.6 buff/cache
MiB Swap:    512.0 total,    427.2 free,     84.8 used.    461.7 avail Mem

I captured this output while reloading my website, to check the CPU usage.

Here is the result of the free command:

+-------+---------+--------+--------+--------+------------+-----------+
|       | total   | used   | free   | shared | buff/cache | available |
+-------+---------+--------+--------+--------+------------+-----------+
| Mem:  | 1002052 | 334392 | 83368  | 16940  | 584292     | 478628    |
+-------+---------+--------+--------+--------+------------+-----------+
| Swap: | 524284  | 86784  | 437500 |        |            |           |
+-------+---------+--------+--------+--------+------------+-----------+

Analysis of the results:
Maybe I should use t3.micro instead of t2.micro, which is slightly faster and slightly cheaper (?)
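To make the `free` figures easier to judge, they can be turned into percentages. The sketch below assumes the column layout printed by `free` on recent Linux systems (with "available" as the seventh field of the `Mem:` line). The function name `mem_summary` is made up for this example.

```shell
# mem_summary: reads `free` output on stdin and prints the share of RAM
# still available and the share of swap in use. (Hypothetical helper.)
mem_summary() {
  LC_ALL=C awk '
    $1 == "Mem:"            { printf "available RAM: %.1f%%\n", 100 * $7 / $2 }
    $1 == "Swap:" && $2 > 0 { printf "swap in use: %.1f%%\n", 100 * $3 / $2 }
  '
}

# Numbers copied from the question
mem_summary <<'EOF'
              total        used        free      shared  buff/cache   available
Mem:        1002052      334392       83368       16940      584292      478628
Swap:        524284       86784      437500
EOF
```

On the live server the same function can be fed directly with `free | mem_summary`. With the question's numbers, a little under half of the RAM is still available and about a sixth of the swap is in use.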

What makes you sure this is an EC2 issue, and not an issue with the software you have configured to run on EC2 (about which you provide no information in your question)? — Mark B, Jul 22 '21 at 19:38
What's the difference in performance between running that command on your computer vs. running it on the webserver itself? This will help narrow down whether the issue is your server itself or the network connection. — MisterSmith, Jul 23 '21 at 17:08
@MisterSmith Good point. However, I have a really good connection (fibre optic), so I doubt that it's the problem. I am a novice, so I don't know how to run this command on the webserver itself. I tried connecting to my server through SSH and running the command, but the result is similar. Is that the right way to test the TTFB on the webserver itself? If not, could you explain how to do this, please? — MedMatrix, Jul 23 '21 at 17:47
If you ran that curl from an SSH session on your web server and the results are similar to running locally, that points to the server itself. Is it PHP or a static file being served from `/`? Try running your test against a static file that PHP does NOT host: are the results consistent or wildly different? I'm guessing it's PHP, but do the tests and follow the numbers. Also, can you expand on the spec of the machine (CPU/RAM etc.), and approximately how many requests is it serving per minute? — MisterSmith, Jul 23 '21 at 18:13
@MisterSmith Do you have an example of the test run against a static file that PHP does not host? Because I really don't know how to do that. I think `/` serves a PHP file (in this case: index.php). The spec of the machine: t2.micro instance / 1 GiB of memory, 1 vCPU, EBS only, 64-bit platform, RAM 1.0 GiB. Regarding requests per minute, I haven't managed to retrieve this info because I don't have an ELB. — MedMatrix, Jul 23 '21 at 18:59
@MisterSmith I tried to run the test on a static file (main.js) and the result is satisfying (time_starttransfer=0.003983). But I think this result is due to the weight of the file, which is light compared to the entire site. — MedMatrix, Jul 23 '21 at 19:13
OK, so it's probably PHP, not nginx; a bit closer. Take a look at the `grep` examples here - https://serverfault.com/questions/226982/how-to-measure-req-sec-by-analyzing-apache-logs - they will need tweaking for nginx/fpm logs, but those commands can extract a wealth of info from your log files (no ELB required). Also, that's quite a small instance: does a larger instance size respond faster (as a quick test)? Also, do you have a swap file enabled, and how big is your EBS volume (your disk's bandwidth is relative to its capacity: 3 IOPS per GB)? (By the way, edit your question instead of adding comments :) — MisterSmith, Jul 23 '21 at 19:23
Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/235241/discussion-between-medmatrix-and-mistersmith). — MedMatrix, Jul 23 '21 at 21:00
Since you are running PHP, you should look into tuning the [PHP Opcache](https://www.php.net/manual/en/book.opcache.php). Once you've tuned the server as much as possible you should really look into using a CDN like CloudFront or Cloudflare. — Mark B, Jul 24 '21 at 12:49
@MarkB As you can check here http://13.37.46.163/info.php, my Zend Opcache is already enabled. I am not familiar with this config, but I think it is correctly tuned. Could you confirm? About the CDN, I already use the Grav CDN Plugin. I invite you to open my website in Safari and inspect the network tab to check the "transfer size" column. For each resource, the transfer size is "memory". — MedMatrix, Jul 24 '21 at 15:07
@MarkB I have set up a CloudFront CDN, as you can see: http://13.37.46.163/ But there is no real impact on the TTFB (I gained maybe 0.01). It doesn't really solve my issue: the time_starttransfer is still large compared to the others. — MedMatrix, Jul 27 '21 at 19:36
Have you configured the CDN to cache your dynamic content, or just the default static content? — Mark B, Jul 27 '21 at 19:38
@MarkB I suppose so. Here are my cache behavior settings: https://i.ibb.co/QCTy1F2/Fire-Shot-Capture-001-Cloud-Front-console-aws-amazon-com.png I selected the GET, HEAD, OPTIONS, PUT, POST, PATCH, DELETE option. Is that sufficient? — MedMatrix, Jul 27 '21 at 22:20
There are far too many options. By default PHP will hold all the output until completed, and then pass it to NGINX. You can get around this by "flushing" or with some headers. Even with this, if you have GZIP enabled (and you should), that will buffer until the end. If using FastCGI, the output gets buffered in its entirety unless you change http://nginx.org/en/docs/http/ngx_http_fastcgi_module.html#fastcgi_buffering. There are more changes you can make (see third answer here: https://stackoverflow.com/questions/3133209/how-to-flush-output-after-each-echo-call ) — Robbie, Aug 02 '21 at 02:35
(...cont) but adjusting those will reduce the overall performance of your application. You will send more bytes, with more interfacing between server and user and more interfacing between NGINX and FastCGI. Is this really what you want? — Robbie, Aug 02 '21 at 02:36
Maybe this will help you: https://stackoverflow.com/a/60780320/2324206 — Haridarshan, Aug 03 '21 at 09:25

score 0 · Answer 1 · answered Jul 28 '21 at 14:05

First: Generally speaking, unless you run on an OS that hasn't been patched to support T3, you should prefer T3 over T2 (especially on Linux - I have seen some discussion about some minor cost advantages for T2 on Windows). The slight reduction in price is, in my opinion, meant to get you to use T3 over T2 so they can eventually retire T2. T3 uses their Nitro instance flavor, which is generally better (faster), especially in network IO, although I wouldn't expect an impact on your test. (BTW, if you are really looking for cheap, I have had good luck with the T3A instances, which are even lower in price.)

Second: You are using the T family of instances. From AWS:

T3 instances are the next generation burstable general-purpose instance type that provide a baseline level of CPU performance with the ability to burst CPU usage at any time for as long as required. T3 instances offer a balance of compute, memory, and network resources and are designed for applications with moderate CPU usage that experience temporary spikes in use.

That is all a very nice way of saying that there are a lot of users on the same physical machine. Of course that is true for a lot of the other families too, but in this case you aren't 'assigned' a core to use. You, and a lot of other people, are telling AWS that your workload isn't all that high and you would like a cheaper instance at the expense of only using the CPU occasionally. That is fine, but AWS is trying to make money here and isn't giving you a dedicated CPU for your T instance (again, that is the choice you made). In return, the CPU might not be available the millisecond you want it, and the instance may need to wait until the requested resources are available on the physical host. Depending on how many other people are on that host and how over-provisioned it is, your results may vary.

To my knowledge, AWS doesn't publish any information on how over-provisioned a T instance is. If you suspect you may have chatty neighbors, you could always switch to a different physical machine by stopping and starting the instance (you do not need to terminate the instance). That should switch which physical host you are running on, but there are no guarantees you will get a better machine. Intrinsically, you are asking for best-in-class performance from the cheapest-in-class instance family. That likely won't work out to your expectations.

In short, if you want minimum latency and guaranteed speed, you will need to switch to a different family of instance. The 'generic' instance type family of M5 may be more desirable if you need more guarantees on consistent and lower latency performance.

T3 may be better for performance, but t2.micro is free. Roughly how much does a t3.micro cost? You only talk about how AWS operates. — MedMatrix, Jul 29 '21 at 22:08
@MedMatrix I am confused. You actually asked if you should use t3 instead of t2 in your original question. Pricing varies by region, but can be found here: https://aws.amazon.com/ec2/pricing/ I would recommend stopping the instance (not terminating) and in the console changing the instance type to a larger m5. When you start it, the same OS will load and you will only be charged for the time it is up, so you can quickly run your test. I still suspect your fundamental issue is you are expecting a lot of performance out of AWS's cheapest (and sometimes free) T series EC2 service. — Foghorn, Jul 31 '21 at 15:56

score 0 · Accepted Answer · answered Sep 15 '21 at 13:56

To improve performance and decrease the TTFB, I performed these improvements:

1 - PHP caching is critical

You should run a PHP opcache and a user cache (such as APCu) to get the best performance.

2 - SSD drives

SSD drives can make a big difference. Most things can get cached in PHP user cache, but some are stored as files, so SSD drives can make a big impact on performance. Avoid using network filesystems such as NFS.

3 - Cleaning the CSS

UnCSS is particularly important here. This tool examines all the CSS selectors used in a set of files and removes all selectors that are not in use. You might think this sounds error-prone and unnecessary, but used intelligently it’s the most efficient reduction of a CSS file possible.

4 - Optimizing the server

The server I host on also supports Gzip compression, and I set Expires headers to avoid having the browser load files unnecessarily.

5 - Use a CDN

A CDN (content delivery network) like CloudFront, Cloudflare or MaxCDN can be used to cache data closer to users. Non-cached content can be retrieved from an origin.

Using a CDN can reduce asset delivery time from 30 to 3 seconds.

For CloudFront users: don't hesitate to configure the CDN to cache your dynamic content: https://www.youtube.com/watch?v=tqoDBNWBwas&t=2s

6 - Choose the right instance family (for AWS users)

For a very small website, prefer t3.micro over t2.micro: it is slightly faster and cheaper.
