The combination of sendfile and kTLS should avoid round-trips to userland while sending files.
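For readers who haven't used it, here is a minimal Python sketch of the sendfile half of that idea. It assumes an unencrypted connection, and a local socket pair stands in for a real client; `send_file_over_socket` and `demo` are illustrative names, not anything from nginx or the article. With HTTPS the same shortcut only helps if the kernel also does the encryption, which is the part kTLS is meant to provide.

```python
import os
import socket
import tempfile


def send_file_over_socket(sock: socket.socket, path: str) -> int:
    """Send a whole file over a connected stream socket; return bytes sent."""
    with open(path, "rb") as f:
        # socket.sendfile() uses os.sendfile() where the platform offers it,
        # so the payload is copied kernel-side rather than through a
        # read()/send() loop in this process. Elsewhere it falls back to send().
        return sock.sendfile(f)


def demo() -> bytes:
    # Assumption: a local socket pair stands in for a client connection.
    server_side, client_side = socket.socketpair()
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"hello from a static file\n")
        path = tmp.name
    try:
        sent = send_file_over_socket(server_side, path)
        received = client_side.recv(4096)
        assert sent == len(received)
        return received
    finally:
        server_side.close()
        client_side.close()
        os.unlink(path)


if __name__ == "__main__":
    print(demo())
```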
jms703 10 hours ago [-]
True, but the other OSes don't support that. If the goal is out-of-the-box testing, kTLS would not be representative of that.
ehutch79 8 hours ago [-]
That makes no sense. Why would you not be testing with optimized hosting?
If one of the OSs has features that improve performance, why would you not include that in the comparison?
toast0 8 hours ago [-]
IMHO, it might be worthwhile for NGINX to default to sendfile+kTLS enabled where appropriate. Maybe the potential for negative experience is too high.
I know sendfile originally had some sharp edges, but I'm not sure how sharp it still is? You would need to use sendfile only for plain http or https with kTLS, and maybe that's too complex? Apache lists some issues [1] with sendfile and defaults to off as well; but I don't know how many sites are still serving 2GB+ files on Itanium. :P AFAIK, lighttpd added SSL_sendfile support on by default 3 years ago, and you can turn it off if you want.
I think there's also some complexity on kTLS implementations that limit protocol versions and cipher choices. If kTLS is on by choice, it makes sense to refuse to operate when the configured cipher selection conflicts with kTLS cipher availability. But if kTLS is on by default, you probably need to fall back to traditional TLS for connections where the client selects a cipher that's not eligible for kTLS. Maybe that's extra code that nobody wants to write; maybe the inconsistency of performance depending on client cipher choice is unacceptable. But it seems like a worthwhile thing to me (but I didn't make a PR, did I?)
Just my two cents: as an end user choosing an OS to use on an N150 for static web hosting, I would sure like to know if those features make a meaningful difference.
But I also understand that looking at that might have been beyond the scope of the article.
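The per-connection fallback described above (kTLS when the negotiated cipher is eligible, traditional TLS otherwise) can be sketched in a few lines of Python. Everything here is hypothetical: `KTLS_ELIGIBLE` is a placeholder set rather than an authoritative list of what any kernel supports, and `choose_send_path` is not an nginx or OpenSSL function.

```python
# Placeholder set for illustration only. What a kernel's kTLS accepts
# varies by OS and version, so treat these entries as assumptions.
KTLS_ELIGIBLE = {
    ("TLSv1.3", "TLS_AES_128_GCM_SHA256"),
    ("TLSv1.3", "TLS_AES_256_GCM_SHA384"),
    ("TLSv1.2", "ECDHE-RSA-AES128-GCM-SHA256"),
}


def choose_send_path(version: str, cipher: str, eligible=KTLS_ELIGIBLE) -> str:
    """Pick a send path for one connection after the TLS handshake."""
    if (version, cipher) in eligible:
        # The kernel can encrypt, so file data never has to visit userland.
        return "sendfile+ktls"
    # Otherwise read the file and encrypt in userspace, as without kTLS.
    return "userspace-tls"


if __name__ == "__main__":
    print(choose_send_path("TLSv1.3", "TLS_AES_128_GCM_SHA256"))
    print(choose_send_path("TLSv1.2", "ECDHE-RSA-AES256-SHA384"))
```

The performance inconsistency mentioned above falls out of this directly: two clients fetching the same file can take different paths depending only on the cipher they negotiated.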
draga79 10 hours ago [-]
Exactly. That's why I didn't enable it
whartung 9 hours ago [-]
But that said, it would be interesting to see the different systems after a tuning pass, both as an example of capability and as a mechanism to discuss the tuning options available to users.
Mind, the whole "it's slow, get new hardware" comes from the fact that getting another 10% by tuning "won't fix the problem". By the time folks feel the sluggish performance, you're probably not looking for another 10 points. The 10 points matter at scale to lower overall hardware costs. 10% less hardware with 1000 servers is a different problem from 10% less hardware with just one.
But, still, a tuning blog would be interesting, at least to me.
fabioyy 10 hours ago [-]
The numbers seem too close to 65535 to be a coincidence.
Are you making the requests from a single source IP address?
Are you aware of the limit on using the same source IP address for the same destination IP address (and port)? Each connection needs a unique combination of source address and source port for a given destination, so a single source IP maxes out at 65535 source ports to the same destination.
Neil44 10 hours ago [-]
I wonder if that's why the cpu is idle for part of the time, it's waiting for sockets to become free.
toast0 10 hours ago [-]
I would expect http persistent connections (keep-alive) at these rates. It's very hard to get 64 k connections/second from a single IP to a single server ip:port without heavily tuning the client, which they don't mention doing. They're only testing for 10 seconds, but still, you'd need to clear all the closed connections out of TIME_WAIT pretty darn quick in order to re-use each port 10 times.
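The port arithmetic behind that can be written out as a rough Python sketch. The 64,000 usable ports and the 60-second TIME_WAIT are assumptions for illustration, and the estimate ignores tuning options such as tcp_tw_reuse; the function names are made up for this example.

```python
def port_reuses_needed(conn_per_second: float, duration_seconds: float,
                       usable_ports: int) -> float:
    """How many times each source port must be reused during the test."""
    return conn_per_second * duration_seconds / usable_ports


def max_new_connections_per_second(usable_ports: int,
                                   time_wait_seconds: float) -> float:
    """Rough ceiling for one source IP talking to one destination ip:port.

    A port cannot be reused for the same destination until its previous
    connection has left TIME_WAIT, so the rate is bounded by
    ports / time_wait.
    """
    return usable_ports / time_wait_seconds


if __name__ == "__main__":
    # 64k connections/second for a 10-second run over ~64k ports.
    print(port_reuses_needed(64_000, 10, 64_000))
    # Assumed 60-second TIME_WAIT with no reuse tuning.
    print(round(max_new_connections_per_second(64_000, 60)))
```

Under those assumptions a single untuned client tops out at roughly a thousand new connections per second, which is why keep-alive is the more plausible explanation for rates in the tens of thousands.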
spankibalt 10 hours ago [-]
Sucks that there's no ECC-RAM model. A phone-sized x86 slab, as opposed to those impractical mini-PC/Mini-Mac boxes, that one could carry around and connect to a powerbank of similar size, and/or various types of screens (including a smartphone itself), would make for a great ultramobile setup.
antonkochubey 3 hours ago [-]
Odroid H4 family (H4, H4 Plus, H4 Ultra) supports in-band ECC, which supports one-bit error correction and two-bit error detection. And the 8-core model is just $220 (+case, +heatsink/fan, +shipping, but oh well)
esseph 19 minutes ago [-]
Is the kernel support for those still awful or has it gotten better? It's been a long time since I had an Odroid... C1, I think.
zokier 10 hours ago [-]
If you want a relatively small low-power box with ECC, check out the Asustor AS6804T. It is nominally a NAS but really you can use it for anything you want; it is just an x86-64 server with some disk bays. You also get nice 2x10GbE, which is rare with these mini PCs.
LTL_FTC 9 hours ago [-]
If it had a few more cores, something like this would make for a great node in a distributed system like k8s or ceph for a homelab. At the asking price, however, one could also cross-shop an HP micro server gen11.
antonkochubey 3 hours ago [-]
Odroid H4 Ultra? It has 8 Gracemont cores that can stay boosted for quite a long time, and supports in-band ECC. 4x SATA too for those who care.
hollerith 6 hours ago [-]
But the price of that is $1200, which is about 5 times the price of the average N150 mini PC.
userbinator 50 minutes ago [-]
How many times do you think ECC RAM has caught an error? Online anecdotes I've found indicate almost no one experiences regularly corrected errors that weren't due to imminently failing hardware.
karlgkk 48 minutes ago [-]
Fun fact: DDR5 contains built-in (on-die) ECC by default. RAM sizes are getting so large it's causing issues in the field and also issues with yields.
So, the industry thinks it's a problem.
userbinator 41 minutes ago [-]
In other words, the industry has gone to shit as usual, starting with rowhammer.
Arm RK3399 SoC is blob free and some (Pinephone Pro, N4S, Chrome tablet) devices are small enough for sidecar usage.
snvzz 7 hours ago [-]
I like to pretend options without ECC simply do not exist. (i.e. as it should be)
It shortens the list of options, making choices much easier.
artimaeis 11 hours ago [-]
I love how capable these tiny N150 machines are. I've got one running Debian for my home media and backup solution and it's never stuttered. I'd be curious about exactly what machine they're testing with. I've got the Beelink ME mini running that media server. And I use a Beelink EQ14 as a kind of jump box to remote into my work desktop.
transpute 10 hours ago [-]
Would you mind sharing the Linux hardware platform security report ("fwupdmgr security") for those Beelink boxes, e.g. what is enabled/disabled by the OEM? N150 SoC supports Intel TXT, which was previously limited to $800+ vPro devices, but it requires BIOS support from OEMs like Beelink. Depending on HSI status, OSS coreboot might be feasible on some N150 boxes.
I'm not the author but my parents have pretty much decided they will never use a game console newer than the nintendo wii, but so far two of their wiis have died. Since no one is making wiis anymore, I decided to future-proof their gaming by setting them up with a mele quieter 4c [0], with the official wii bluetooth module attached over USB for perfect wiimote compatibility, running the dolphin emulator. Not every game runs perfectly, but every game they want to play runs perfectly AND it is smaller, silent, and consumes less power than the real wii.
[0] My experience with that mini computer: I bought two. The first one was great, but the 2nd one had coil whine so I had to return it. Aside from the whine, I love the box. If I could guarantee I wouldn't get whine I'd buy another today.
draga79 10 hours ago [-]
It's a Minisforum UN150P
transpute 10 hours ago [-]
HSI report on that box would be useful.
PaulKeeble 10 hours ago [-]
I didn't see the size of the test page as I went through (did I miss it?), and I think in this case it potentially matters. A 2.5 Gbps link can do ~280 MB/s, which at 63k requests per second is just 4.55 KB per request. That could easily be a single page saturating the link, which would explain the clustering at that value.
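That back-of-the-envelope figure can be reproduced with a few lines of Python. The sketch takes the ~280 MB/s and 63k requests/second from the comment as given and assumes binary units (MiB and KiB), which is what makes the 4.55 figure come out; `bytes_per_request` is just an illustrative helper.

```python
def bytes_per_request(throughput_mib_per_s: float, requests_per_s: float) -> float:
    """Average bytes on the wire per request if the link is fully saturated."""
    return throughput_mib_per_s * 1024 * 1024 / requests_per_s


size = bytes_per_request(280, 63_000)
print(f"{size:.0f} bytes = {size / 1024:.2f} KiB per request")
# prints "4660 bytes = 4.55 KiB per request"
```

So if the response (body plus headers) is around 4.5 KiB, the link rather than the OS would be the limit at roughly 63k requests per second.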
matthewhartmans 10 hours ago [-]
Love this! I have been running a N150 with Debian 13 as my daily driver and super impressed! For ~$150 it packs a punch!
transpute 10 hours ago [-]
Could you recommend make/model? Quality seems variable at those price points.
kevin_thibedeau 9 hours ago [-]
The Topton/CWWK boxes are consistently decent. Best choice if you want fanless.
sedawkgrep 8 hours ago [-]
For mini pcs, Beelink probably has the best support. I've owned a few and had one replaced under warranty.
baq 8 hours ago [-]
the N100 family has been the raspberry pi host killer for me, migrated to one from an rpi4, couldn't be happier.
toast0 10 hours ago [-]
I'd love to see benchmarks that hit CPU or NIC limits; the HTTPS test hit CPU limits on many of the configurations, but inquiring minds want to know how much can you crank out with FreeBSD. Anyway, overload behavior is sometimes very interesting (probably less so for static https). May well need more load generation nodes though; load generation is often harder than handling load.
OTOH, maybe this is a bad test on purpose? The blogger doesn't like running these tests, so do a bad one and hope someone else is baited into running a better test?
koakuma-chan 5 hours ago [-]
All these benchmarking utilities like wrk are notorious for not supporting HTTP/2. Why would you serve static content and not use HTTP/2?
YorickPeterse 4 hours ago [-]
At least one reason could be that `sendfile` is useless when using HTTP/2 or HTTP/3, as you can no longer just dump the contents directly onto a socket. Whether that actually makes a practical difference on modern hardware remains to be seen of course.
koakuma-chan 4 hours ago [-]
There is nothing that prevents you from using sendfile and HTTP/2 at the same time. You still dump the contents directly into the socket.
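To make that claim concrete, here is a rough Python sketch of interleaving HTTP/2 DATA frame headers with sendfile calls. It assumes a cleartext connection (over TLS the payload would also need kTLS to stay in the kernel), skips the HEADERS frame and flow control entirely, and uses the default 16,384-byte maximum frame size. `data_frame_header` and `send_file_as_data_frames` are illustrative names, not part of any server.

```python
import os
import socket

DATA_FRAME = 0x0    # HTTP/2 frame type for DATA (RFC 9113)
END_STREAM = 0x1    # flag set on the last frame of a stream
MAX_FRAME = 16_384  # default SETTINGS_MAX_FRAME_SIZE


def data_frame_header(length: int, stream_id: int, end_stream: bool) -> bytes:
    """Build the 9-byte frame header: 24-bit length, type, flags, stream id."""
    flags = END_STREAM if end_stream else 0
    return (
        length.to_bytes(3, "big")
        + bytes([DATA_FRAME, flags])
        + (stream_id & 0x7FFFFFFF).to_bytes(4, "big")
    )


def send_file_as_data_frames(sock: socket.socket, path: str, stream_id: int) -> int:
    """Send a file body as DATA frames; return the payload bytes sent.

    Only the small headers are built in userspace. Each payload chunk is
    handed to socket.sendfile(), which uses os.sendfile() where available.
    An empty file sends nothing in this simplified sketch.
    """
    size = os.path.getsize(path)
    offset = 0
    with open(path, "rb") as f:
        while offset < size:
            chunk = min(MAX_FRAME, size - offset)
            last = offset + chunk == size
            sock.sendall(data_frame_header(chunk, stream_id, last))
            sock.sendfile(f, offset=offset, count=chunk)
            offset += chunk
    return offset
```

The cost relative to plain HTTP/1.1 is one extra small write per frame, so whether sendfile still pays off under HTTP/2 is a measurement question rather than a hard incompatibility.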
Neil44 10 hours ago [-]
Imagine what a big piece of iron could do. It makes me think of the recent stories of people who came out of the cloud and run everything off one or a few bare-metal hosts.
draga79 10 hours ago [-]
That's the point!
LeoPanthera 9 hours ago [-]
Is there a guide somewhere to what low power CPUs exist in these new mini PC things? I feel like I'm increasingly out of touch.
n4bz0r 6 hours ago [-]
Mini PCs mostly run N-series Intel CPUs [0][1] nowadays AFAIK.
The cheaper and most popular one is the N150 [2], which is a replacement for the N100 [3]. The newer one boosts a bit higher. The 6-7W TDP in the specs is a lie, but these CPUs still have fairly modest consumption, working at about 10-20W on average.
There are some low power chips from AMD, but that's mostly NAS territory. Don't see them a whole lot and don't know much about them either.
N100/N150/N97 have similar performance. Power seems to be 6-12W at idle, depending. RAM is usually limited to 16GB. Low number of PCIe lanes (NAS builds are limited). Cost used to be $100, but now it has gone up to $120+.
On the AMD side I have a 4700U and a 5700U: similar idle power (12W), similar cost ($200 with 32GB of RAM, now more expensive). A lot more capable than the N100, at a cost.
I use a whole bunch of mini PCs in my lab; they are so much cheaper to run electricity-wise (and in cost).
hedora 4 hours ago [-]
There are also higher power AMD devices that work extremely well.
If you’re willing to go up to 60W TDP and $500-1000, then they’re good enough to run recent steam games under linux at 1080p and LLM inference (if you spring for > ~32GB of RAM).
Like many others on this thread, I’ve had good luck with beelink.
klipklop 10 hours ago [-]
Love these N150 systems. I wonder if the RAM/SSD/misc shortages are going to make these humble $140 boxes like $300+ soon.
transpute 10 hours ago [-]
Some N150 systems have integrated LPDDR5 from Chinese memory suppliers, who have been increasing production capacity, unlike Korean memory suppliers who have decreased production and increased prices in the face of higher demand. More NAND supplier competition needed.
klipklop 7 hours ago [-]
That is good news, but I have seen some sellers already jump their price +$100 on Amazon. Perhaps just price gouging to take advantage. I might pick up another if I can get it for ~$140.
waynesonfire 6 hours ago [-]
I'd really like one that has 2x M.2 slots. I'm very uncomfortable running a server on a single disk.
Also, ECC ram would be nice.
shadowpho 4 hours ago [-]
2x M.2 is usually reserved for more expensive (>$200) mini PCs, or NAS-based mini PCs, which have trade-offs.
ECC RAM is rare because very few people are asking for it, and it costs extra.
snvzz 7 hours ago [-]
It really should be "nginx static web hosting..." as it seems to be very specifically measuring nginx performance across OSs.
Otherwise, seL4/LionsOS webserver scenario could be tested.
Also it doesn't look like they enabled sendfile() in the nginx conf: https://nginx.org/en/docs/http/ngx_http_core_module.html#sen...
[1] https://httpd.apache.org/docs/2.4/mod/core.html#enablesendfi...
But my question still stands.
https://fwupd.github.io/libfwupdplugin/hsi.html
[0] https://www.techpowerup.com/cpu-specs/?f=codename_=Gracemont
[1] https://www.techpowerup.com/cpu-specs/?f=codename_=Twin%20La...
[2] https://www.techpowerup.com/cpu-specs/processor-n150.c4109
[3] https://www.techpowerup.com/cpu-specs/processor-n100.c3007