vendredi 22 janvier 2016

PostgreSQL : Linux VS Windows - part 2

After my initial test, I had some interesting feedback my post on reddit that pushed my to do another benchmark.

And to be sure, I'll put it into words again, this is not a Linux vs Windows benchmark !

I don't care about what's the best !

I have a client with a big windows infrastructure (and no Linux) and a lot of windows expertise (and no Linux) and I want to know if I must advise him to use PostgreSQL on Linux.
Because to move to Linux, he would have to hire a Linux specialist, develop in-house expertise and adapt his budgets consequently.

Following  /u/graycube's advice, I ditched my benchmark tool and used PgBench for this second round of benchmark of Windows vs Linux for PostgreSQL hosting.


Following ants_a's advice, I tested with bigger data than the memory size and with a long duration.
The m4.xlarge instances at Amazon have 16gb of data and with a scale of 2000, pgbench generated a db of +-30gb.

PgBench, for the unaware (like me) does exactly what my benchmark program did but better !
It has a lot of options and is more heavily tested than my app that's only been tested ... by me.
PGBench's not really close to what I wanted because I wanted to test it with a .NET app that uses npgsql (pgbench uses libpq), but, as the saying goes "When in Rome, do as the romans do."

For thoses tests, the architecture is the same as the previous one :

 "The Client "
Windows 2012 R2 server on amazon, m4.xlarge type, with all the default settings.
The client "application" is PgBench.

"The Windows PostgreSQL Server " aka WS
Windows 2012 R2 server on amazon, m4.xlarge type, with all the default settings and a 100gb ssd.
PostgreSQL 9.4.5 installed with the wizard.
I changed the listen_addresses to * and the required changes to pg_hba.conf for the connection to work.

"The Linux PostgreSQL Server " aka LS
Amazon Linux AMI, m4.xlarge type, with all the default settings and a 100gb ssd.
PostgreSQL 9.4.5 installed with yum.
I made the same changes for postgresql.conf and pg_hba.conf as the one I made for Windows.

The scenarios

Scale of 500
a.1 - init PgBench
a.2 - run locally PgBench for 240 seconds.
a.3 - run locally PgBench for 240 seconds for 42 clients.
a.4 - run locally PgBench for 240 seconds for 42 clients with 21 threads.
a.5 - Set max_connections to 300 and restart PostgreSQL
a.6 - run locally PgBench for 240 seconds for 42 clients with 21 threads.
a.7 - run from The Client for 240 seconds for 42 clients with 21 threads.

Scale of 1000
b.1 - init PgBench
b.2 - restart PostgreSQL
b.3 - run from The Client for 240 seconds for 42 clients with 21 threads.
b.4 - run from The Client for 1200 seconds for 42 clients with 21 threads.

Scale of 2000
c.1 - init PgBench
c.2 - restart PostgreSQL
c.3 - run from The Client for 2400 seconds for 250 clients with 125 threads.

The results !

INIT PgBench 500
My computer
50000000 of 50000000 tuples (100%) done (elapsed 118.58 s, remaining 0.00 s).

WS
50000000 of 50000000 tuples (100%) done (elapsed 59.88 s, remaining 0.00 s).

LS
50000000 of 50000000 tuples (100%) done (elapsed 48.87 s, remaining 0.00 s).

For some "fun", I tried PgBench on my computer to compare it with a 0.5$/hr machine on amazon but I stopped quickly...

As it was with the tests with my app, creation on Linux is faster.


Or is it really ?

INIT PgBench 1000
WS
100000000 of 100000000 tuples (100%) done (elapsed 115.18 s, remaining 0.00 s).

LS
100000000 of 100000000 tuples (100%) done (elapsed 120.97 s, remaining 0.00 s).

INIT PgBench 2000
WS
200000000 of 200000000 tuples (100%) done (elapsed 261.00 s, remaining 0.00 s).

LS
200000000 of 200000000 tuples (100%) done (elapsed 264.69 s, remaining 0.00 s).


Summarized init numbers :

ServerScale 500Scale 1000Scale 2000
WS59.88s115.18s261.00s
LS48.87s120.97s264.96s


Linux is a little bit (<5%) slower when there's a lot of data !

Here again, we should be reminded that it's the default PostgreSQL settings and default settings for both OSs.
So maybe autovacuum kicked in, maybe the Amazon's Linux VM pool is more used than the windows' one,... or whatever reason you could fit in there !

Let's see if this is the same when we run PgBench instead of just the initialization.
For the sake of sanity, I rounded some of PgBench's results but they are all available on my github.


Duration (sec)TRANS WSTRANS LSTPS WSTPS LSdiff tps
a.224012847210953553545617,32
a.324032847132010813011333-2,40
a.4240382147365494159215224,60
a.6240421287399478171416633,07
a.72404687614258391952177410,03
b.324029180329444112041226-1,79
b.412007516471077216626896-30,13
c.32400129658277348254032168,22

a.6

a.6 is a.5 after restart and setting max_connections to 300.
Now I'm in the fog .
How come pushing the max_connections to 300 allowed the tps to rise ?
I guess that's because I restarted the PostgreSQL process
Please note that I haven't used the -C parameter so there should only be one connection per client. (So a max of 42)
PgBench's doc :
 -C
    Establish a new connection for each transaction, rather than doing it just once per client session.

b.4

A scale a 1000 makes a db of +-16gb so I guess a big chunk of it can fit in the memory during the whole process and that Linux'S memory management is better than what some may think.

c.3

That one made my jaw drop !
Windows made 68% more TPS than Linux under heavy load for a long time.

I couldn't believe it so I waited one week and did it again.

Duration (sec)TRANS WSTRANS LSTPS WSTPS LSdiff tps
c.3.22400 130367784438754235154,42

Linux went up 10% but Windows stood almost the same for a new diff of 54%.

So on both C.3 test, we have almost 61% diff in favor of Windows.

Summary

Basically, windows is 5% faster with a scale of 500 , 30% slower with a scale of 1000 and 61% faster with a scale of 2000.


I discussed about it with my colleagues and one of them said :
"Of course you noob, you have to tune Linux a minimum to reach the best perf !"

My answer was :
"No. That's the goal of my exercise.
I don't want a Linux specialist fiddling for 6 weeks to have a top notch server, nor do I want a Windows specialist to fiddle for 6 weeks for the same reason.
Moreover, if I'm using the cloud, there's a lot of area that I won't be able to tune. (Disk, network, ...)"


In my previous post, I said that Linux and Windows are at par when it comes to PostgreSQL performance.

After this second test I'm forced to say :
  • You have Linux infrastructure and no Windows, use PostgreSQL on Linux !
  • You have Windows infrastructure and no Linux, use PostgreSQL on Windows !
  • You have both, well it depends... you should hire a specialist to help you !

Which means, for me, they are at par.

Follow-up from the boutade from my first post, PostgreSQL in Linux in a VM in Windows is NOT faster than PostgreSQL on the same Windows, not at all !


What do you think ?

Aucun commentaire:

Publier un commentaire

Ce commentaire sera validé par notre modérateur.