YCSB on 9.1G Movie Data
- 1. Introduction
- 2. Test Method
- 2.1. Environment
- 3. Writing Performance
- 4. Compression
- 5. Reading Performance
- 5.1. Data Much Smaller than Memory (Memory 64GB)
- 5.2. Data Smaller than Memory (Memory 8GB)
- 5.3. Data Larger than Memory (Memory 4GB)
- 5.4. Data Much Larger than Memory (Memory 2GB)
TerarkSQL is a MySQL distribution that uses TerarkDB as its storage engine. We integrate TerarkDB into MySQL via MyRocks, Facebook's modification of MySQL that uses RocksDB as its storage engine.
- Test Tools
- Test Dataset
  - YCSB's default datasets are generated from random strings, which are far from real-world scenarios, so we modified YCSB to use Amazon movie data (~8 million reviews) as the testing dataset.
- Dataset Size
  - About 9.1GB
  - About 8 million records
  - About 1KB per record
- Storage Engines
  - MySQL + InnoDB (InnoDB for short)
  - MySQL on RocksDB, using RocksDB as the storage engine (aka MyRocks; RocksDB for short)
  - TerarkSQL, using TerarkDB as the storage engine (TerarkDB for short)
- All reading tests use uniform distribution and Zipf distribution
- We calculated the 95/99 percentile latency for all reading tests
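
As a point of reference for the 95/99 percentile figures, here is a minimal sketch of how such percentiles can be derived from raw per-request latencies using the nearest-rank method. It is not necessarily how YCSB or our test harness computes them; the function name `percentile` and the sample latencies are made up for illustration.

```python
import math


def percentile(samples, p):
    """Return the p-th percentile (0 < p <= 100) by the nearest-rank method."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # 1-based rank of the smallest value that covers p percent of the samples.
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


# Hypothetical per-request latencies in microseconds (not measured data).
latencies_us = [120, 95, 110, 105, 2300, 130, 98, 101, 115, 99,
                108, 97, 125, 102, 4100, 112, 96, 104, 109, 100]

p95 = percentile(latencies_us, 95)
p99 = percentile(latencies_us, 99)
print("p95:", p95, "us  p99:", p99, "us")
```

The two slow outliers barely move the mean but dominate the tail, which is why the tests report percentiles alongside throughput.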
- CPU: Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz x2 (16 cores in total with 32 threads)
- Memory: DDR4 16G @ 1866 MHz x4 (total 64G)
- SSD: Intel 730 (89,000 IOPS)
- OS: CentOS 7
- Writing Speed:
- Writing 95/99 Percentile Latency:
The original dataset is about 9.1GB: 8 million records at about 1KB each.
- Disk Usage:
- Memory limits are enforced with `cgroups`.
- The client application in all tests runs on a separate machine in the same local network (connected through a 1000M switch).
- All reading tests use 16 threads, one connection per thread.
- InnoDB uses default settings.
  - Since InnoDB uses the file API directly for data reading, `cgroups` can't limit its memory usage, so we use system kernel settings to do that.
- RocksDB enables `allow_mmap_reads` so that `cgroups` can limit its memory usage, and we set `BlockSize` to 16k.
- TerarkDB uses default option settings (TerarkDB enables `allow_mmap_reads` by default).
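
To make the memory caps concrete, the sketch below converts the limits used in the reading tests (8G, 4G, 2G) into byte values and pairs each with the control file a cgroup v1 memory controller reads. It assumes cgroup v1 mounted at `/sys/fs/cgroup/memory`; the group name `ycsb_bench` is hypothetical, and the code only computes the values without writing to the filesystem.

```python
from pathlib import PurePosixPath

# Binary units, as the cgroup memory controller interprets K/M/G suffixes.
_UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(text):
    """Convert a size such as '8G' or '500M' to a number of bytes."""
    text = text.strip().upper()
    if text and text[-1] in _UNITS:
        return int(text[:-1]) * _UNITS[text[-1]]
    return int(text)


def plan_limit(group, size, root="/sys/fs/cgroup/memory"):
    """Return (control file path, byte value) for a cgroup v1 memory limit.

    Assumption: cgroup v1 layout; `group` is a hypothetical cgroup name.
    """
    path = PurePosixPath(root) / group / "memory.limit_in_bytes"
    return str(path), parse_size(size)


for size in ("8G", "4G", "2G"):
    path, value = plan_limit("ycsb_bench", size)
    print(f"{size}: write {value} to {path}")
```

Applying a limit would then mean writing the byte value to that file as root and placing the database server process into the group.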
All reading tests use both uniform distribution and Zipf distribution.
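
The difference between the two request distributions can be illustrated with a small sketch: under a uniform distribution every key is equally likely to be read, while under a Zipf distribution a few "hot" keys receive most of the reads. This is not YCSB's generator; the exponent `s=0.99`, the key count, and the helper names are assumptions chosen for illustration.

```python
import bisect
import itertools
import random


def make_uniform_sampler(n_keys, rng):
    """Every key index in [0, n_keys) is equally likely."""
    return lambda: rng.randrange(n_keys)


def make_zipf_sampler(n_keys, rng, s=0.99):
    """Key index i is drawn with probability proportional to 1 / (i + 1) ** s."""
    weights = [1.0 / (rank ** s) for rank in range(1, n_keys + 1)]
    cumulative = list(itertools.accumulate(weights))
    total = cumulative[-1]

    def sample():
        # Invert the cumulative distribution with a binary search.
        u = rng.random() * total
        return min(bisect.bisect_right(cumulative, u), n_keys - 1)

    return sample


def top_share(sampler, draws, top_k):
    """Fraction of draws that land on the top_k lowest-numbered keys."""
    hits = sum(1 for _ in range(draws) if sampler() < top_k)
    return hits / draws


rng = random.Random(42)  # fixed seed so repeated runs give the same result
n_keys = 1000
uniform_share = top_share(make_uniform_sampler(n_keys, rng), 10_000, 10)
zipf_share = top_share(make_zipf_sampler(n_keys, rng), 10_000, 10)
print(f"share of reads hitting the 10 hottest keys: "
      f"uniform={uniform_share:.3f}, zipf={zipf_share:.3f}")
```

Because Zipf reads concentrate on a small working set, they tend to be served from cache more often, so the uniform distribution is the harsher case once the data no longer fits in memory.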
- Physical memory is 64GB.
- RocksDB's `block_cache_size` is set to half of total memory (which is its default behavior).
- Reading 95/99 percentile latency is measured under uniform distribution.
- Memory Usage
- `cgroups` limits the total memory to 8G.
- InnoDB's test uses 8.6G of memory, since the OS uses about 0.6GB.
- RocksDB's `block_cache_size` is set to 2G.
- TerarkDB only needs 3.02G of memory, much less than 8G, so it is not affected.
- Reading 95/99 percentile latency is measured under uniform distribution.
- Memory is limited to 4G.
- InnoDB needs 4.6G of memory, since the system uses about 0.6GB.
- RocksDB's `block_cache_size` is set to 1G.
- TerarkDB only needs 3.02G, which is less than 4G, so it is not affected.
- Reading 95/99 percentile latency is measured under uniform distribution.
- Memory is limited to 2G.
- InnoDB uses about 2.6GB of memory in total, including 0.6GB of system overhead.
- RocksDB's `block_cache_size` is set to 500M.
- In this scenario, neither RocksDB nor InnoDB has enough memory, so disk IO becomes the bottleneck and degrades performance significantly. Since TerarkDB compresses the data to about 2.5GB, its memory shortfall is less severe.
- Reading 95/99 percentile latency is measured under uniform distribution for the test.