Intel SSD 530 240GB vs Samsung 840 PRO 256GB

Posted on August 17, 2014 by yaoge123

Dell R620, 2*E5-2643, 32GB RAM, RHELS 6.5, Iozone 3.420, SSD partitions 4K-aligned, ext4 with TRIM enabled

./iozone -a -i 0 -i 1 -i 2 -y 4k -q 1m -s 64g -Rb ./test.xls

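Apart from random writes with large records (32 KB and above), where the 840 PRO falls far behind, the 840 PRO comes out ahead of the 530 in most other tests. Detailed results (iozone throughput in KB/s, record size in KB):

Intel SSD 530 240GB: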

record size	4	8	16	32	64	128	256	512	1024
Writer Report	524440	523974	526780	527053	526968	526036	527036	525766	525890
Re-writer Report	522612	522613	522275	523206	522439	523246	522352	521813	522232
Reader Report	401357	400040	402251	404237	404396	403728	404063	403219	402921
Re-reader Report	400585	399216	400826	402105	402526	402288	403591	402457	402526
Random Read Report	24057	42185	71473	116760	181263	257251	320765	373026	400579
Random Write Report	265987	367024	436929	485591	506996	523238	522900	522817	522449

Samsung 840 PRO 256GB:

record size	4	8	16	32	64	128	256	512	1024
Writer Report	531851	532568	532486	534077	535260	535288	535185	535542	535103
Re-writer Report	529750	530555	530921	530615	530606	530297	530888	530316	531189
Reader Report	527696	527396	527468	527315	527693	527813	527855	527261	527416
Re-reader Report	527420	527422	527873	527182	527601	527822	527689	527391	527203
Random Read Report	34822	58758	94271	145413	216819	295532	338092	374540	391692
Random Write Report	269031	373047	407259	288842	286163	284136	285640	290691	295829
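To put a number on how far the 840 PRO falls behind in random writes, the ratio can be computed directly from the Random Write rows of the two tables. The sketch below is only an illustration: the helper name pct is made up here, and the inputs are the figures copied from the tables above.

```bash
#!/bin/bash
# pct BASE VALUE: print VALUE as a rounded percentage of BASE.
# pct is a hypothetical helper name, not part of iozone.
pct() {
	awk -v base="$1" -v value="$2" 'BEGIN { printf "%.0f\n", 100 * value / base }'
}

# Random Write, 1024 KB records (KB/s): Intel SSD 530 vs Samsung 840 PRO
pct 522449 295829
# Random Write, 4 KB records (KB/s): Intel SSD 530 vs Samsung 840 PRO
pct 265987 269031
```

With these figures the 840 PRO reaches a little over half of the 530's throughput at 1024 KB records, while at 4 KB records the two drives are within about 1% of each other.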

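10Gb SFP+ vs 10GBASE-T

Posted on August 5, 2014 (Updated September 9, 2015) by yaoge123

10 Gigabit Ethernet currently comes with two kinds of interface: fiber SFP+ and copper 10GBASE-T. In terms of performance, the main difference is latency. Comparing the specifications of data center switches from IBM BNT, Dell Force10, Arista and others shows SFP+ latency in the range of 350 ns (Arista 7150S-24) to 880 ns, whereas 10GBASE-T needs 3.2 µs to 3.3 µs. For environments that demand high performance, SFP+ is currently the one to use.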

DELL MD3800f: initial performance test

Posted on May 25, 2014 by yaoge123

DELL MD3820f, dual controllers

Test command: iozone -i 0 -i 1 -r 1M -s 128G

	write	rewrite	read	reread
One RAID6 group of six 4TB 3.5-inch 7.2Krpm NL-SAS disks	532	558	911	765

How to update the root SSH key in xCAT

Posted on April 22, 2014 by yaoge123

The procedure: generate a new key, distribute the new key, replace the key on all nodes, then replace the xCAT key.

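# Run from /root/.ssh
ssh-keygen -t rsa -f id_rsa1    # generate a new key pair named id_rsa1
# Distribute the new public key to all nodes
pscp id_rsa1.pub all:/root/.ssh/authorized_keys
# Swap in the new key pair locally, keeping the old one as a backup
mv id_rsa id_rsa.old
mv id_rsa.pub id_rsa.pub.old
mv id_rsa1 id_rsa
mv id_rsa1.pub id_rsa.pub
# Replace the key pair on all nodes
pscp id_rsa all:/root/.ssh/
pscp id_rsa.pub all:/root/.ssh/
# Replace the xCAT key
cp /root/.ssh/id_rsa.pub /install/postscripts/_ssh/authorized_keys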

Restricting the IPs root can log in from

Posted on April 9, 2014 (Updated April 22, 2014) by yaoge123

Add the following to /etc/ssh/sshd_config:

DenyUsers root@"!10.1.0.0/16,*"

This denies root logins from any IP outside the 10.1 network (10.1.0.0/16).
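How the pattern list is evaluated can be sketched in shell: an address inside 10.1.0.0/16 hits the negated entry and is therefore not matched by DenyUsers, while every other address falls through to * and is denied. The functions below (ip_to_int, ip_in_cidr, root_login_verdict) are illustrative names invented for this sketch, which mimics the rule for IPv4 only and is not sshd's actual matching code.

```bash
#!/bin/bash
# ip_to_int ADDR: convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
	local old_ifs="$IFS"
	IFS=.
	set -- $1
	IFS="$old_ifs"
	echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# ip_in_cidr ADDR NET/BITS: succeed if ADDR lies inside the CIDR block.
ip_in_cidr() {
	local ip net bits mask
	ip=$(ip_to_int "$1")
	net=$(ip_to_int "${2%/*}")
	bits=${2#*/}
	mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
	[ $(( ip & mask )) -eq $(( net & mask )) ]
}

# root_login_verdict ADDR: mimic DenyUsers root@"!10.1.0.0/16,*".
root_login_verdict() {
	if ip_in_cidr "$1" 10.1.0.0/16; then
		echo allowed   # negated entry matched, so the deny rule does not apply
	else
		echo denied    # fell through to "*"
	fi
}

root_login_verdict 10.1.23.4
root_login_verdict 192.168.0.7
```

After editing sshd_config, sshd -t checks the file for syntax errors before the service is reloaded.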

Platform LSF ELIM

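Posted on March 17, 2014 (Updated April 22, 2014) by yaoge123

LSF lets users define custom resources; dynamic resources among them can be reported to LSF through an ELIM. The example below reports the load of the local disks (one hard disk and one SSD):

In the Begin Resource section of $LSF_ENVDIR/lsf.shared, add: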

diskut   Numeric    60    Y    (Percentage of CPU time during which I/O requests were issued to local disk)
ssdut   Numeric    60    Y    (Percentage of CPU time during which I/O requests were issued to local SSD disk)

In $LSF_ENVDIR/lsf.cluster. add the following lines to the Begin ResourceMap section:

diskut              [default]
ssdut               [default]

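Create a new file elim.disk under $LSF_SERVERDIR/ with the following content, then chmod +x elim.disk:

#!/bin/bash
# ELIM that reports local disk utilization to LIM (bash is needed for arrays)

declare -a util
while true; do
	# Sample for 60 s and collect %util (column 10) of the Average lines for dev8-* devices
	util=($(sar -d 60 1 | grep Average | grep dev8 | awk '{print $10}'))
	case "${#util[@]}" in
		1)
			# Output format: <number of resources> <name> <value> ...
			echo 1 diskut "${util[0]}"
			;;
		2)
			echo 2 diskut "${util[0]}" ssdut "${util[1]}"
			;;
	esac
done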

All nodes need an lsadmin limrestart; after that, lsload -l shows two additional columns.
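The parsing step of the ELIM can be tried without LSF by feeding it captured sar -d output. The function below is a sketch with a made-up name (elim_line); it assumes %util is the last column of the Average lines, and the sample figures are invented purely for illustration.

```bash
#!/bin/bash
# elim_line: read "sar -d" output on stdin and print one ELIM report line
# in the form "<number of resources> <name> <value> ...".
# Assumption: %util is the last column and SCSI disks show up as dev8-N.
elim_line() {
	awk '/^Average/ && $2 ~ /^dev8-/ { v[n++] = $NF }
	     END {
	         if (n == 1) print 1, "diskut", v[0]
	         else if (n == 2) print 2, "diskut", v[0], "ssdut", v[1]
	     }'
}

# Invented sample in the column layout of "sar -d" (not measured data)
cat <<'EOF' | elim_line
Average:          DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
Average:       dev8-0     12.00      0.00    512.00     42.67      0.05      4.10      1.20      1.44
Average:      dev8-16     80.00   1024.00    256.00     16.00      0.01      0.12      0.05      0.40
EOF
```

Reading the last column instead of a fixed column number is a design choice for this sketch: it keeps the function working if the number of columns before %util differs between sysstat versions.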

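These resources can then be used with bsub:
-R "order[diskut]" prefers the host with the lightest disk load
-R "select[diskut < 10]" requires the disk load to be below 10%
-R "rusage[diskut=10]" reserves 10% of disk load for the job. rusage does not change what lsload displays, but the reservation is added on top of the actual value shown by lsload and therefore affects the results of order and select, unless LSF can determine that the reserved resource is actually being used by the job (as with mem).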

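Platform LSF Compute Units scheduling policies

Posted on March 14, 2014 (Updated April 10, 2016) by yaoge123

Compute Units (CU) group the hosts of a queue for scheduling purposes, making it possible to control how a job is distributed across those groups.

Suppose there are three CUs with the following free job slots: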

cu name	free job slots
cu1	4
cu2	6
cu3	8

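cu[pref=minavail]: sorts the CUs by free job slots in ascending order and fills them in that order. Examples: -n 4 uses the 4 slots of cu1; -n 6 uses 4 slots of cu1 and 2 slots of cu2.

cu[pref=maxavail]: sorts the CUs by free job slots in descending order and fills them in that order. Examples: -n 6 uses 6 slots of cu3; -n 10 uses 8 slots of cu3 and 2 slots of cu2.
In both cases, CUs with the same number of free job slots are used in the order in which they appear in the Begin ComputeUnit section of lsb.hosts.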
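The two fill orders can be reproduced with a short script. This is a sketch of the behaviour described above, not LSF's scheduler: the function name cu_fill is made up, the free slot counts are the ones from the example table, and ties between CUs with equal free slots are not handled.

```bash
#!/bin/bash
# cu_fill PREF SLOTS: print how many job slots each CU contributes.
# PREF is minavail (fewest free slots first) or maxavail (most free slots first).
cu_fill() {
	local order=-n
	if [ "$1" = maxavail ]; then order=-nr; fi
	# "free-slots name" pairs from the example table
	printf '%s\n' "4 cu1" "6 cu2" "8 cu3" | sort "$order" |
	awk -v need="$2" '
		need > 0 {
			take = (need < $1) ? need : $1
			out = out (out ? " " : "") $2 "=" take
			need -= take
		}
		END { print out }'
}

cu_fill minavail 6
cu_fill maxavail 10
```

The two calls correspond to the -n 6 example for minavail and the -n 10 example for maxavail.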

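cu[balance]: following the order in the Begin ComputeUnit section of lsb.hosts, allocates across as few CUs as possible while keeping the job slots used in each CU as balanced as possible. Examples: -n 6 uses 6 slots of cu2; -n 8 uses 8 slots of cu3; -n 10 uses 5 slots each of cu2 and cu3; -n 12 uses 6 slots each of cu2 and cu3; -n 14 uses 4 slots of cu1 and 5 slots each of cu2 and cu3.
cu[balance:pref=minavail] and cu[balance:pref=maxavail]: sort the CUs by free job slots, then allocate across as few CUs as possible while keeping the job slots used in each CU as balanced as possible. Examples: -n 4 -R "cu[balance:pref=minavail]" uses cu1; -n 4 -R "cu[balance:pref=maxavail]" uses cu3.

For HPC, what one really wants is a policy similar to minavail that also spreads the job over as few CUs as possible; when a job has to span CUs, the split should be as uneven as possible to reduce cross-CU communication.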

Plextor M5p 256GB vs Intel SSD 530 240GB

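Posted on February 26, 2014 (Updated February 26, 2014) by yaoge123

Dell R620, 2*E5-2643, 32GB RAM, RHELS 5.3, Iozone 3.414, SSD partitions 4K-aligned, ext4 with TRIM enabled

./iozone -a -i 0 -i 1 -i 2 -y 4k -q 1m -s 64g -Rb ./test.xls

The Intel SSD 530 beats the Plextor M5p by a wide margin overall: its throughput is roughly 5 to 10 times higher in every write test, although the Plextor is slightly ahead in sequential reads. Detailed results (iozone throughput in KB/s, record size in KB):

Plextor M5p 256GB: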

record size	4	8	16	32	64	128	256	512	1024
Writer Report	73495	49706	53834	55819	51266	51434	52833	51928	52610
Re-writer Report	82056	80580	81662	71514	71218	70992	70998	71210	73930
Reader Report	437508	437398	437296	436977	437826	437586	437788	436948	437527
Re-reader Report	437118	437650	437002	436111	437281	437213	442700	437818	437569
Random Read Report	36441	64764	91808	129377	190359	231620	309157	278560	273607
Random Write Report	53990	53750	52256	52058	52626	51981	51429	52504	52852

Intel SSD 530 240GB:

record size	4	8	16	32	64	128	256	512	1024
Writer Report	498716	523831	500965	527952	524793	525040	528580	529695	529030
Re-writer Report	523820	524679	527672	528101	525403	528344	526986	526756	527916
Reader Report	399111	405611	403717	401873	404561	401628	401842	401729	403944
Re-reader Report	399121	402200	399726	399718	408133	401520	401671	401337	401857
Random Read Report	26518	51309	81293	126808	209616	265063	336988	382371	409023
Random Write Report	270073	370813	443589	489812	511003	520004	521321	521664	524645

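NetApp E2600 DDP test

Posted on February 25, 2014 (Updated November 23, 2015) by yaoge123

The storage is an Inspur AS500H (NetApp E2600) with 24 900GB 2.5-inch 10Krpm SAS disks. The two I/O nodes are Inspur NF5270M3 servers, each with 2*E5-2620v2, 64GB RAM and a dual-port MiniSAS card, and each I/O node is connected directly to both storage controllers via MiniSAS. The 24 disks form a single DDP with 1 or 2 disks reserved, split into two volumes, one per controller and one per I/O node. Sustained write is about 220MB/s and sustained read about 610MB/s. The results are, frankly, a complete mess!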

The correct way to disable IPv6 on RHEL6

Posted on February 21, 2014 (Updated April 22, 2014) by yaoge123

The correct method is:

sysctl -w net.ipv6.conf.all.disable_ipv6=1
sysctl -w net.ipv6.conf.default.disable_ipv6=1
sed -i '/net.ipv6.conf.all.disable_ipv6=/d' /etc/sysctl.conf
echo "net.ipv6.conf.all.disable_ipv6=1" >> /etc/sysctl.conf
sed -i '/net.ipv6.conf.default.disable_ipv6=/d' /etc/sysctl.conf
echo "net.ipv6.conf.default.disable_ipv6=1" >> /etc/sysctl.conf
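The delete-then-append pattern used for /etc/sysctl.conf can be wrapped in a small function. The sketch below uses a made-up name (set_sysctl_conf) and, as an assumption beyond the commands above, also removes existing entries written with spaces around the equals sign, which the sed patterns above would leave in place. It relies on GNU sed for -i, and the demonstration runs against a temporary file rather than the real /etc/sysctl.conf.

```bash
#!/bin/bash
# set_sysctl_conf KEY VALUE [FILE]: remove existing KEY entries, then append KEY=VALUE.
set_sysctl_conf() {
	local key="$1" value="$2" file="${3:-/etc/sysctl.conf}"
	local pattern
	# Escape the dots so they match literally in the sed expression
	pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
	sed -i "/^[[:space:]]*${pattern}[[:space:]]*=/d" "$file"
	echo "${key}=${value}" >> "$file"
}

# Demonstration on a scratch file
demo=$(mktemp)
printf '%s\n' 'net.ipv6.conf.all.disable_ipv6 = 0' 'kernel.sysrq = 0' > "$demo"
set_sysctl_conf net.ipv6.conf.all.disable_ipv6 1 "$demo"
cat "$demo"
rm -f "$demo"
```

Running sysctl -p afterwards loads the settings from /etc/sysctl.conf without a reboot.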

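In /etc/ssh/sshd_config, change AddressFamily any to AddressFamily inet; otherwise sshd will run into problems.

Creating a file under /etc/modprobe.d/ that contains install ipv6 /bin/true does disable IPv6, but it causes various problems such as NIC bonding failures.

Adding NETWORKING_IPV6=no or IPV6INIT=no to /etc/sysconfig/network has no effect.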