The current setup is:
- 4 Ceph nodes in total
- Each node has 4 x 1 GBit/s Ethernet, 1 x OS disk, 3 x Ceph OSD disks with 1 TB capacity each, an Intel Core i7, and 16 GB RAM
- That makes 12 OSDs in total
- All disks have been tested and deliver roughly 140 MB/s for sequential synchronous writes (see the sketch after this list)
- 2 systems are SATA-based, the other 2 are SAS-based with an Adaptec ASR 5405 controller
- As software I use PetaSAN (a Ceph management tool), which has dedicated links on all nodes for all required network connections.
That should be enough for a reasonable test, shouldn't it?
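How the per-disk figure was measured isn't shown below; a minimal sketch of such a sequential synchronous write test, assuming a raw disk at a placeholder device name /dev/sdX, would be:

# WARNING: overwrites /dev/sdX! oflag=direct,dsync bypasses the OS page
# cache and syncs every block, so this measures the disk, not a cache.
dd if=/dev/zero of=/dev/sdX bs=1M count=1024 oflag=direct,dsync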
Here are the HW details and test results once more:
***** Test system T1 *****
* Test VM on Proxmox
* Avago LSI MegaRAID SAS 9271-4i controller, 1 GB cache on the controller
* Synchronously DRBD-replicated
System: Host: kvm07: 4.4.83-1-pve x86_64 (64 bit gcc: 4.9.2) Console: tty 0
Distro: /etc/ corrupted, use -% to override
Machine: System: Supermicro product: Super Server v: 0123456789 serial: 0123456789
Mobo: Supermicro model: X10DRi-LN4+ v: 1.01 serial: VM148S014598
Bios: American Megatrends v: 2.0 date: 12/17/2015
Chassis: type: 17 v: 0123456789 serial: 0123456789
CPU(s): 2 Hexa core Intel Xeon E5-2620 v3s (-HT-MCP-SMP-) cache: 30720 KB
flags: (lm nx sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx) bmips: 57601
Clock Speeds: 1: 3200 MHz 2: 2354 MHz 3: 2646 MHz 4: 1200 MHz 5: 3157 MHz 6: 1711 MHz 7: 1985 MHz
8: 2120 MHz 9: 1864 MHz 10: 1871 MHz 11: 2110 MHz 12: 2705 MHz 13: 2301 MHz 14: 2109 MHz 15: 1985 MHz
16: 2917 MHz 17: 1688 MHz 18: 2914 MHz 19: 2189 MHz 20: 2075 MHz 21: 2498 MHz 22: 1903 MHz
23: 1767 MHz 24: 2065 MHz
Graphics: Card: ASPEED ASPEED Graphics Family bus-ID: 06:00.0 chip-ID: 1a03:2000
Display Server: N/A driver: N/A tty size: 196x66 Advanced Data: N/A for root out of X
Audio: Card Failed to Detect Sound Card! Sound: ALSA v: k4.4.83-1-pve
Network: Card-1: Intel I350 Gigabit Network Connection
driver: igb v: 5.3.5.3 ports: 5020 Root bus-ID: 03:00.0 chip-ID: 8086:1521
IF: eth2 state: up speed: 1000 Mbps duplex: full mac: 0c:c4:7a:14:0e:ca
Card-2: Intel I350 Gigabit Network Connection
driver: igb v: 5.3.5.3 ports: 5000 Root bus-ID: 03:00.1 chip-ID: 8086:1521
IF: eth3 state: down mac: 0c:c4:7a:14:0e:cb
Card-3: Intel I350 Gigabit Network Connection
driver: igb v: 5.3.5.3 ports: 6020 Root bus-ID: 02:00.0 chip-ID: 8086:1521
IF: eth0 state: up speed: 1000 Mbps duplex: full mac: 0c:c4:7a:14:0e:c8
Card-4: Intel I350 Gigabit Network Connection
driver: igb v: 5.3.5.3 ports: 6000 Root bus-ID: 02:00.1 chip-ID: 8086:1521
Drives: HDD Total Size: 8000.5GB (0.1% used)
ID-1: /dev/sda model: MR9271 size: 8000.5GB serial: scsi-3600605b00b5804701e63789607dcae16 temp: 0C
Optical: No optical drives detected.
Partition: ID-1: / size: 37G used: 3.7G (11%) fs: ext4 dev: /dev/dm-0
label: N/A uuid: caf0fb82-1afa-4458-88f5-012fe4110269
ID-2: /boot size: 1.4G used: 173M (13%) fs: ext2 dev: /dev/sda4
label: N/A uuid: 73035898-02ff-485b-aaff-e1a3290eff1c
ID-3: /boot/efi size: 476M used: 132K (1%) fs: vfat dev: /dev/sda1 label: N/A uuid: DC0A-2677
ID-4: /etc/pve size: 30M used: 80K (1%) fs: fuse dev: /dev/fuse label: N/A uuid: N/A
ID-7: swap-1 size: 5.00GB used: 0.41GB (8%) fs: swap dev: /dev/dm-1
label: N/A uuid: 74592c9e-ff4e-4f64-818f-3d6e12569714
RAID: System: supported: N/A
No RAID devices: /proc/mdstat, md_mod kernel module present
Unused Devices: none
Sensors: System Temperatures: cpu: 34.0C mobo: N/A
Fan Speeds (in rpm): cpu: N/A
Info: Processes: 417 Uptime: 24 days Memory: 53786.3/128831.1MB
Init: systemd v: 215 runlevel: 5 Gcc sys: 4.9.2
Client: Shell (bash 4.3.301 running in tty 0) inxi: 2.1.28
***** Test system T2 *****
* VMware ESXi VM
* Ceph iSCSI storage with 3-way replication
* 12 x 1 TB individual SAS/SATA disks
* 1 GBit network switch
A.1) 1 GB sequential read (synchronous, random data, without OS cache)
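The dd invocations themselves are not in the output below; a read of 1000 x 1 MiB blocks with direct I/O, against a file previously filled with random data (testfile is a placeholder name), would match the figures shown:

# Read 1 GB past the OS page cache; testfile must already contain random data
time dd if=testfile of=/dev/null bs=1M count=1000 iflag=direct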
***** T1 *****
This test presumably runs entirely from the controller cache.
1048576000 bytes (1.0 GB) copied, 1.85688 s, 565 MB/s in 0m1.858s
1048576000 bytes (1.0 GB) copied, 0.927414 s, 1.1 GB/s in 0m0.929s
1048576000 bytes (1.0 GB) copied, 0.921461 s, 1.1 GB/s in 0m0.924s
1048576000 bytes (1.0 GB) copied, 1.0073 s, 1.0 GB/s in 0m1.009s
1048576000 bytes (1.0 GB) copied, 0.846048 s, 1.2 GB/s in 0m0.848s
1048576000 bytes (1.0 GB) copied, 0.654031 s, 1.6 GB/s in 0m0.656s
1048576000 bytes (1.0 GB) copied, 0.878819 s, 1.2 GB/s in 0m0.882s
***** T2 *****
1048576000 bytes (1.0 GB, 1000 MiB) copied, 19.8506 s, 52.8 MB/s in 0m24.084s
1048576000 bytes (1.0 GB, 1000 MiB) copied, 21.6789 s, 48.4 MB/s in 0m23.993s
1048576000 bytes (1.0 GB, 1000 MiB) copied, 18.2593 s, 57.4 MB/s in 0m18.307s
1048576000 bytes (1.0 GB, 1000 MiB) copied, 21.7597 s, 48.2 MB/s in 0m21.835s
1048576000 bytes (1.0 GB, 1000 MiB) copied, 18.3634 s, 57.1 MB/s in 0m18.452s
1048576000 bytes (1.0 GB, 1000 MiB) copied, 19.1834 s, 54.7 MB/s in 0m21.134s
A.2) 1 GB sequential write (synchronous, random data, without OS cache)
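Again the exact command isn't shown; a matching write test, assuming the random data is pre-generated into a file (randfile and testfile are placeholder names) so /dev/urandom itself doesn't become the bottleneck, might look like this:

# Prepare 1 GB of random data once
dd if=/dev/urandom of=randfile bs=1M count=1000
# Write it with direct, synchronous I/O, bypassing the OS page cache
time dd if=randfile of=testfile bs=1M count=1000 oflag=direct,dsync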
***** T1 *****
This test presumably runs entirely from the controller cache.
1048576000 bytes (1.0 GB) copied, 1.93722 s, 541 MB/s in 0m1.938s
1048576000 bytes (1.0 GB) copied, 1.88414 s, 557 MB/s in 0m1.886s
1048576000 bytes (1.0 GB) copied, 2.00204 s, 524 MB/s in 0m2.003s
1048576000 bytes (1.0 GB) copied, 2.004 s, 523 MB/s in 0m2.006s
1048576000 bytes (1.0 GB) copied, 2.18278 s, 480 MB/s in 0m2.184s
1048576000 bytes (1.0 GB) copied, 2.07272 s, 506 MB/s in 0m2.074s
1048576000 bytes (1.0 GB) copied, 2.44182 s, 429 MB/s in 0m2.444s
1048576000 bytes (1.0 GB) copied, 2.0344 s, 515 MB/s in 0m2.036s
1048576000 bytes (1.0 GB) copied, 2.11452 s, 496 MB/s in 0m2.116s
***** T2 *****
1048576000 bytes (1.0 GB, 1000 MiB) copied, 52.1188 s, 20.1 MB/s in 0m52.120s
1048576000 bytes (1.0 GB, 1000 MiB) copied, 51.0985 s, 20.5 MB/s in 0m51.128s
1048576000 bytes (1.0 GB, 1000 MiB) copied, 48.5335 s, 21.6 MB/s in 0m48.535s
1048576000 bytes (1.0 GB, 1000 MiB) copied, 50.9038 s, 20.6 MB/s in 0m50.905s
1048576000 bytes (1.0 GB, 1000 MiB) copied, 49.6982 s, 21.1 MB/s in 0m49.699s
1048576000 bytes (1.0 GB, 1000 MiB) copied, 48.89 s, 21.4 MB/s in 0m48.891s
1048576000 bytes (1.0 GB, 1000 MiB) copied, 50.3211 s, 20.8 MB/s in 0m50.322s
1048576000 bytes (1.0 GB, 1000 MiB) copied, 49.5101 s, 21.2 MB/s in 0m49.511s
1048576000 bytes (1.0 GB, 1000 MiB) copied, 50.0243 s, 21.0 MB/s in 0m50.025s
1048576000 bytes (1.0 GB, 1000 MiB) copied, 49.8115 s, 21.1 MB/s in 0m49.812s
1048576000 bytes (1.0 GB, 1000 MiB) copied, 50.5847 s, 20.7 MB/s in 0m50.586s
A.3) Sequential writes of larger amounts of data with random data generated in real time
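My test script isn't shown; a minimal sketch of the idea, with chunk size, chunk count, and file names as placeholders, could be:

#!/bin/bash
# Write chunks of freshly generated random data with direct, synchronous
# I/O and report the rate per chunk in MB/s.
CHUNK_MB=861
for i in $(seq 1 20); do
    start=$(date +%s.%N)
    dd if=/dev/urandom of=chunk_$i bs=1M count=$CHUNK_MB oflag=direct,dsync 2>/dev/null
    end=$(date +%s.%N)
    echo "scale=2; $CHUNK_MB / ($end - $start)" | bc
done

Keep in mind that, depending on kernel and CPU, /dev/urandom itself can limit the achievable rate.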
***** T1 *****
Speed
MB/second   MB written
===============================
70.36 861
72.77 861
75.75 861
63.35 861
61.44 861
63.37 861
61.15 862
66.04 861
65.87 861
69.50 861
63.24 861
61.83 861
65.67 861
68.31 861
68.30 861
69.47 861
73.16 1681
69.86 1681
62.91 1681
64.41 1681
67.97 1681
71.05 1681
63.57 1681
64.89 1681
57.75 1681
63.18 1681
65.89 1681
66.93 1681
70.08 1681
62.27 1681
60.17 1681
68.13 1681
67.71 1681
69.21 1681
72.29 1681
63.32 1681
62.61 1681
67.19 1681
73.02 1681
64.56 1681
63.47 1681
62.84 1681
65.71 1681
66.60 1681
73.46 1681
65.71 1681
***** T2 *****
Note: Especially with larger amounts of data
under sustained full write load, performance
drops off sharply.
Speed
MB/second   MB written
===============================
54.88 1426
50.06 1301
40.34 1290
58.31 1574
46.74 1589
47.29 1702
63.61 1081
48.56 1456
48.37 338
45.02 180
62.50 187
50.65 202
28.22 169
58.07 174
42.73 299
54.94 604
52.87 2061
20.79 2328
12.37 1447
15.80 1216
13.27 1101
12.53 2179
13.62 1797
12.44 2576
13.16 2408
11.72 2015
14.56 1660
A.4) fio disk test
***** T1 *****
fio --rw=readwrite --name=test --size=500M --direct=1 --bs=1024k --numjobs=30 --group_reporting --runtime=10
test: (g=0): rw=rw, bs=1M-1M/1M-1M/1M-1M, ioengine=sync, iodepth=1
...
test: (g=0): rw=rw, bs=1M-1M/1M-1M/1M-1M, ioengine=sync, iodepth=1
fio-2.1.3
Starting 30 processes
test: Laying out IO file(s) (1 file(s) / 500MB)
test: Laying out IO file(s) (1 file(s) / 500MB)
test: Laying out IO file(s) (1 file(s) / 500MB)
test: Laying out IO file(s) (1 file(s) / 500MB)
test: Laying out IO file(s) (1 file(s) / 500MB)
test: Laying out IO file(s) (1 file(s) / 500MB)
test: Laying out IO file(s) (1 file(s) / 500MB)
test: Laying out IO file(s) (1 file(s) / 500MB)
test: Laying out IO file(s) (1 file(s) / 500MB)
test: Laying out IO file(s) (1 file(s) / 500MB)
test: Laying out IO file(s) (1 file(s) / 500MB)
test: Laying out IO file(s) (1 file(s) / 500MB)
test: Laying out IO file(s) (1 file(s) / 500MB)
test: Laying out IO file(s) (1 file(s) / 500MB)
test: Laying out IO file(s) (1 file(s) / 500MB)
test: Laying out IO file(s) (1 file(s) / 500MB)
test: Laying out IO file(s) (1 file(s) / 500MB)
test: Laying out IO file(s) (1 file(s) / 500MB)
test: Laying out IO file(s) (1 file(s) / 500MB)
test: Laying out IO file(s) (1 file(s) / 500MB)
Jobs: 30 (f=30): [MMMMMMMMMMMMMMMMMMMMMMMMMMMMMM] [100.0% done] [499.2MB/533.1MB/0KB /s] [499/533/0 iops] [eta 00m:00s]
test: (groupid=0, jobs=30): err= 0: pid=3523: Thu Sep 28 16:05:43 2017
read : io=5432.0MB, bw=554021KB/s, iops=541, runt= 10040msec
clat (msec): min=1, max=178, avg=13.27, stdev= 9.07
lat (msec): min=1, max=178, avg=13.27, stdev= 9.07
clat percentiles (msec):
| 1.00th=[ 3], 5.00th=[ 4], 10.00th=[ 6], 20.00th=[ 7],
| 30.00th=[ 8], 40.00th=[ 9], 50.00th=[ 11], 60.00th=[ 13],
| 70.00th=[ 16], 80.00th=[ 20], 90.00th=[ 26], 95.00th=[ 32],
| 99.00th=[ 41], 99.50th=[ 50], 99.90th=[ 64], 99.95th=[ 69],
| 99.99th=[ 180]
bw (KB /s): min= 5678, max=46090, per=3.35%, avg=18579.28, stdev=6474.71
write: io=4536.0MB, bw=462636KB/s, iops=451, runt= 10040msec
clat (msec): min=2, max=146, avg=50.22, stdev=30.41
lat (msec): min=2, max=146, avg=50.26, stdev=30.40
clat percentiles (msec):
| 1.00th=[ 6], 5.00th=[ 9], 10.00th=[ 12], 20.00th=[ 18],
| 30.00th=[ 27], 40.00th=[ 39], 50.00th=[ 51], 60.00th=[ 60],
| 70.00th=[ 70], 80.00th=[ 79], 90.00th=[ 92], 95.00th=[ 103],
| 99.00th=[ 119], 99.50th=[ 124], 99.90th=[ 137], 99.95th=[ 137],
| 99.99th=[ 147]
bw (KB /s): min= 6131, max=26204, per=3.31%, avg=15294.94, stdev=3718.81
lat (msec) : 2=0.32%, 4=3.07%, 10=25.25%, 20=26.13%, 50=21.82%
lat (msec) : 100=20.55%, 250=2.86%
cpu : usr=0.18%, sys=0.72%, ctx=10080, majf=0, minf=842
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued : total=r=5432/w=4536/d=0, short=r=0/w=0/d=0
Run status group 0 (all jobs):
READ: io=5432.0MB, aggrb=554020KB/s, minb=554020KB/s, maxb=554020KB/s, mint=10040msec, maxt=10040msec
WRITE: io=4536.0MB, aggrb=462635KB/s, minb=462635KB/s, maxb=462635KB/s, mint=10040msec, maxt=10040msec
Disk stats (read/write):
dm-0: ios=10702/9059, merge=0/0, ticks=134568/449996, in_queue=586200, util=99.07%, aggrios=10864/9132, aggrmerge=0/117, aggrticks=136072/451104, aggrin_queue=587148, aggrutil=98.96%
sda: ios=10864/9132, merge=0/117, ticks=136072/451104, in_queue=587148, util=98.96%
***** T2 *****
### Note ###: This fio run uses only 10 jobs instead of the previous 30, because system T2 is considerably less powerful.
fio --rw=readwrite --name=test --size=500M --direct=1 --bs=1024k --numjobs=10 --group_reporting --runtime=10
test: (g=0): rw=rw, bs=1M-1M/1M-1M/1M-1M, ioengine=psync, iodepth=1
...
fio-2.16
Starting 10 processes
test: Laying out IO file(s) (1 file(s) / 500MB)
test: Laying out IO file(s) (1 file(s) / 500MB)
test: Laying out IO file(s) (1 file(s) / 500MB)
test: Laying out IO file(s) (1 file(s) / 500MB)
test: Laying out IO file(s) (1 file(s) / 500MB)
test: Laying out IO file(s) (1 file(s) / 500MB)
test: Laying out IO file(s) (1 file(s) / 500MB)
test: Laying out IO file(s) (1 file(s) / 500MB)
test: Laying out IO file(s) (1 file(s) / 500MB)
test: Laying out IO file(s) (1 file(s) / 500MB)
Jobs: 10 (f=10): [M(10)] [100.0% done] [31744KB/20480KB/0KB /s] [31/20/0 iops] [eta 00m:00s]
test: (groupid=0, jobs=10): err= 0: pid=28123: Thu Sep 28 16:11:16 2017
read : io=300032KB, bw=28667KB/s, iops=27, runt= 10466msec
clat (msec): min=19, max=627, avg=132.92, stdev=109.35
lat (msec): min=19, max=627, avg=132.92, stdev=109.35
clat percentiles (msec):
| 1.00th=[ 25], 5.00th=[ 35], 10.00th=[ 41], 20.00th=[ 51],
| 30.00th=[ 59], 40.00th=[ 69], 50.00th=[ 79], 60.00th=[ 102],
| 70.00th=[ 192], 80.00th=[ 241], 90.00th=[ 277], 95.00th=[ 322],
| 99.00th=[ 490], 99.50th=[ 515], 99.90th=[ 627], 99.95th=[ 627],
| 99.99th=[ 627]
write: io=310272KB, bw=29646KB/s, iops=28, runt= 10466msec
clat (msec): min=40, max=699, avg=205.95, stdev=138.94
lat (msec): min=40, max=699, avg=206.00, stdev=138.94
clat percentiles (msec):
| 1.00th=[ 57], 5.00th=[ 78], 10.00th=[ 86], 20.00th=[ 100],
| 30.00th=[ 110], 40.00th=[ 121], 50.00th=[ 133], 60.00th=[ 167],
| 70.00th=[ 277], 80.00th=[ 310], 90.00th=[ 449], 95.00th=[ 490],
| 99.00th=[ 603], 99.50th=[ 635], 99.90th=[ 701], 99.95th=[ 701],
| 99.99th=[ 701]
lat (msec) : 20=0.34%, 50=9.40%, 100=29.36%, 250=34.90%, 500=23.49%
lat (msec) : 750=2.52%
cpu : usr=0.02%, sys=0.09%, ctx=608, majf=0, minf=100
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued : total=r=293/w=303/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
READ: io=300032KB, aggrb=28667KB/s, minb=28667KB/s, maxb=28667KB/s, mint=10466msec, maxt=10466msec
WRITE: io=310272KB, aggrb=29645KB/s, minb=29645KB/s, maxb=29645KB/s, mint=10466msec, maxt=10466msec
Disk stats (read/write):
sda: ios=320/334, merge=0/2, ticks=42256/66628, in_queue=110092, util=98.96%
What struck me in particular is that the rates in test environment 2 with VMware are sometimes well above normal, then at some point collapse and stay at that level. I suspect some caching is active there, but how it works completely escapes me.
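One way to pin that down would be to write far more data than any plausible cache could hold and watch where the rate settles; a minimal sketch (20 GB is a guess, adjust to the suspected cache size):

# Write well beyond any cache and force a final flush; the rate reported
# at the end is what the backend can actually sustain.
dd if=/dev/zero of=bigfile bs=1M count=20480 oflag=direct conv=fdatasync status=progress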