Here I test the compression ratio and query speed of MySQL Infobright.
The test script is below, followed by the results.
#!/usr/bin/python
# Create a very wide test table on the Infobright instance (port 3307)
# and generate a tab-separated data file for the load test.
import random
import MySQLdb

#conn = MySQLdb.Connection('127.0.0.1', 'root', '', 'dmspi')
conn = MySQLdb.connect(host="127.0.0.1", port=3307, user="root",
                       passwd="", db="test")
cur = conn.cursor()

# Build a CREATE TABLE with 1000 columns a0..a999 (alternating varchar(20)
# and int(20)) plus a trailing int(20) column `a`.
st = "create table testtime4 ( "
try:
    for cc in xrange(1000):
        if cc % 2 == 0:
            st += 'a' + str(cc) + ' varchar(20),\n'
        else:
            st += 'a' + str(cc) + ' int(20),\n'
    st += 'a int(20)'
    st += ");"
    cur.execute(st)

    # Generate 10 million rows: only the first 15 columns get values,
    # the remaining columns are left empty so they load as NULL.
    ccs = lambda: random.choice(['apple', 'pear', 'peach', 'orange', 'lemon', ''])
    ccn = lambda: random.randint(0, 10000)
    fd = open('/data/logs/dataFormat/test/t4.data', 'w')
    for row in xrange(10000000):
        st = ''
        ss = ccs()
        nn = str(ccn())
        for col in xrange(1000):
            if col < 15:
                if col % 2 == 0:
                    st += ss + '\t'
                else:
                    st += nn + '\t'
            else:
                st += '\t'
        st += nn
        print >>fd, st
    fd.close()
    # cur.execute('load data infile \'/data/logs/dataFormat/test/t4.data\' into table testtime4 fields terminated by "\t";')
finally:
    cur.close()
    conn.close()
MySQL Infobright test results:
Time to load ten million rows:
mysql> load data infile '/data/logs/dataFormat/test/t4.data' into table testtime4 fields terminated by "\t";
Query OK, 10000000 rows affected (36 min 47.00 sec)
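The LOAD DATA step above was run from the mysql client; it can also be driven and timed from Python by uncommenting the load line in the generator script. A rough sketch, reusing the same connection settings and file path as above:

# Rough sketch: run and time the LOAD DATA step from Python,
# reusing the connection settings and data file from the generator script.
import time
import MySQLdb

conn = MySQLdb.connect(host="127.0.0.1", port=3307, user="root",
                       passwd="", db="test")
cur = conn.cursor()

start = time.time()
cur.execute("load data infile '/data/logs/dataFormat/test/t4.data' "
            "into table testtime4 fields terminated by '\\t'")
conn.commit()   # in case autocommit is disabled on this connection
print 'load took %.1f seconds' % (time.time() - start)

cur.close()
conn.close()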
Test 1:
1. The table has 500 columns.
2. Every column has a value; there is no NULL data.
3. Raw data file: 26 GB; size in the warehouse after loading: 5 GB.
Selected query times:
mysql> select count(*) from testtime where a0="pear" and a2="orange";
1 row in set (3.63 sec)
mysql> select a6,count(*) from testtime group by a6 order by a6 desc;
5 rows in set (2.24 sec)
mysql> select count(*) from testtime where a0="apple";
1 row in set (5.68 sec)
Test 2:
1. The table has 1000 columns.
2. Only the first 15 columns have values; all remaining columns are NULL.
3. Raw data file: 10 GB; size in the warehouse after loading: 215 MB (a sketch for checking these sizes follows).
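The raw-file and in-warehouse sizes can be compared with a few lines of Python. This is only a rough sketch: the file path matches the generator script, and the data_length value Infobright reports through information_schema is treated as an approximation of the on-disk warehouse size.

# Rough sketch: compare the raw data file with the size the server reports
# for testtime4. data_length from information_schema is only an approximation
# for an Infobright (brighthouse) table.
import os
import MySQLdb

raw_bytes = os.path.getsize('/data/logs/dataFormat/test/t4.data')

conn = MySQLdb.connect(host="127.0.0.1", port=3307, user="root",
                       passwd="", db="test")
cur = conn.cursor()
cur.execute("select data_length from information_schema.tables "
            "where table_schema = 'test' and table_name = 'testtime4'")
(stored_bytes,) = cur.fetchone()
cur.close()
conn.close()

print 'raw file     : %.2f GB' % (raw_bytes / 1024.0 ** 3)
print 'in warehouse : %.2f MB' % (stored_bytes / 1024.0 ** 2)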
mysql> select a0,count(*) from testtime4 group by a0 ;
+--------+----------+
| a0 | count(*) |
+--------+----------+
| lemon | 1665543 |
| peach | 1666276 |
| orange | 1667740 |
| pear | 1665910 |
| apple | 1665678 |
| NULL | 1668863 |
+--------+----------+
6 rows in set (4.55 sec)
mysql> select * from testtime4 order by a6 desc limit 2000000,1;
1 row in set (3.30 sec)
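The timings above are copied from the mysql client. For repeated runs, the same queries can also be timed from Python, for example with this minimal sketch against testtime4:

# Rough sketch: time the benchmark queries against testtime4 from Python.
import time
import MySQLdb

conn = MySQLdb.connect(host="127.0.0.1", port=3307, user="root",
                       passwd="", db="test")
cur = conn.cursor()

queries = [
    'select a0, count(*) from testtime4 group by a0',
    'select * from testtime4 order by a6 desc limit 2000000, 1',
]
for q in queries:
    start = time.time()
    cur.execute(q)
    cur.fetchall()                      # drain the result set
    print '%6.2f sec  %s' % (time.time() - start, q)

cur.close()
conn.close()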