Lowering the Memory Usage on the vCenter Appliance
I set up many virtual labs that include one or more vCenters. In vCenter 5.1+ the default memory allocation has grown to 8GB, which is way more than I want to allocate in a lab.
The problem is that more and more services are being packed into vCenter, and the majority of them are written in Java. Java is a great language because it is cross-platform, but it can be bulky. Most of the bulk comes from the Java Virtual Machine (JVM), which is given a memory heap to draw from. The good news is that we can change the heap sizes for each JVM service. Here is a good post on changing these values; you can take that guide and get the vCenter Server down to a reasonable number. In this post we will look a little deeper into the vCenter Server Appliance.
NOTE: this procedure is not supported by VMware. Setting the heap allocations too low will result in allocation errors, so if you experience issues after lowering the values as described, look for these errors in the logs and increase the memory on the affected components.
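A quick way to spot those failures is to search the appliance logs for Java OutOfMemoryError messages. The path below is the usual log location on the appliance, but treat it as an assumption and adjust for your version.
# List log files that mention JVM heap exhaustion.
grep -ril "OutOfMemoryError" /var/log/vmware/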
Understanding the Heap
First we need to understand the java heap to be able to tweak it appropriately. This article goes over the basics of the java heap. Let’s take a look at the options that we have to set the heap.
# java -X |grep -i heap
-Xms<size> set initial Java heap size
-Xmx<size> set maximum Java heap size
So we have -Xms, which is the minimum heap size. This is the initial heap given to the JVM when it starts, even if the application is using less memory. VMware has set these to relatively high numbers for these services.
We also have -Xmx, which is the maximum heap size. This is the upper limit on how large the heap can grow for the application. Since different services have different memory usage patterns, we can expect this value to vary across the configuration.
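As a generic illustration (some-app.jar is just a placeholder here, not a VMware component), both flags are passed straight on the java command line:
# Start a JVM with a 64M initial heap that is allowed to grow to at most 256M.
java -Xms64m -Xmx256m -jar some-app.jar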
Determining Memory Usage
Before we start changing the Java heap sizes, we should get an idea of what the memory usage looks like. This will give us a better idea of what we need to cut and how little memory we can expect to run in.
First, let’s take a look at the appliance before it has been configured. I will take this as the baseline for the minimum amount of memory it can run in. I deployed a new vCenter Appliance and logged in after it booted. The following shows that we are using 169M of memory.
localhost:~ # free -m
total used free shared buffers cached
Mem: 8002 471 7530 0 8 293
-/+ buffers/cache: 169 7833
Swap: 15366 0 15366
So 169M is our baseline for low memory. Let’s see what is running and what services we may be able to get rid of if we wanted to lower the stock footprint.
localhost:~ # chkconfig |grep on
auditd on
cron on
dbus on
dcerpcd on
earlysyslog on
eventlogd on
fbset on
haldaemon on
haveged on
irq_balancer on
kbd on
ldap on
lsassd on
lwiod on
lwregd on
lwsmd on
netlogond on
network on
network-remotefs on
nfs on
purge-kernels on
random on
rpcbind on
sendmail on
splash on
splash_early on
syslog on
syslog-collector on
vmci on
vmmemctl on
vmware-netdumper on
vmware-tools-services on
vsock on
In the output above we can see many stock services running. This would be the starting point if you wanted to trim the stock footprint, but I am not going to pursue that here as it would only save a few MB and may break vCenter services.
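If you did want to experiment with trimming the stock services, the pattern is the usual chkconfig/service pair. Sendmail is just an arbitrary example here; disabling the wrong service can break the appliance.
# Example only: keep a stock service from starting at boot and stop it now.
chkconfig sendmail off
service sendmail stop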
Let’s break down these processes by memory size. There is a great utility called ps_mem that calculates the RAM used by each program, summing the private and shared memory of its processes. The script can be found here.
Download the script.
localhost:~ # wget http://www.pixelbeat.org/scripts/ps_mem.py
Run the script.
localhost:~ # python ps_mem.py
Private + Shared = RAM used Program
160.0 KiB + 30.0 KiB = 190.0 KiB irqbalance
220.0 KiB + 22.0 KiB = 242.0 KiB dhcpcd
208.0 KiB + 51.5 KiB = 259.5 KiB init
244.0 KiB + 71.0 KiB = 315.0 KiB cron
372.0 KiB + 19.5 KiB = 391.5 KiB klogd
264.0 KiB + 128.5 KiB = 392.5 KiB hald-runner
488.0 KiB + 36.5 KiB = 524.5 KiB syslog-ng
332.0 KiB + 272.5 KiB = 604.5 KiB hald-addon-acpi
568.0 KiB + 40.5 KiB = 608.5 KiB dbus-daemon
340.0 KiB + 301.0 KiB = 641.0 KiB hald-addon-input
804.0 KiB + 42.5 KiB = 846.5 KiB vami-lighttpd
804.0 KiB + 297.5 KiB = 1.1 MiB mingetty (5)
324.0 KiB + 911.0 KiB = 1.2 MiB udevd (3)
640.0 KiB + 618.0 KiB = 1.2 MiB hald-addon-storage (2)
964.0 KiB + 486.0 KiB = 1.4 MiB vami_login
1.3 MiB + 324.0 KiB = 1.6 MiB console-kit-daemon
2.0 MiB + 148.0 KiB = 2.1 MiB bash
2.4 MiB + 305.5 KiB = 2.7 MiB hald
2.5 MiB + 220.5 KiB = 2.7 MiB vmtoolsd
2.8 MiB + 785.0 KiB = 3.5 MiB sshd (2)
2.8 MiB + 1.3 MiB = 4.1 MiB vami-sfcbd (8)
4.2 MiB + 18.0 KiB = 4.2 MiB haveged
33.7 MiB + 160.5 KiB = 33.8 MiB slapd
---------------------------------
64.6 MiB
=================================
We can see in the output above that the user space processes are taking up about 64M of memory. The biggest user is slapd, which is the OpenLDAP daemon. Another notable one is vami-sfcbd, which is part of the management web interface on port 5480.
Now that we have a baseline, go ahead and configure the appliance the way you want it by connecting to https://hostname:5480. I went ahead and configured it with the default settings.
Let’s take a look at the memory using the free command again.
localhost:~ # free -m
total used free shared buffers cached
Mem: 8002 4658 3344 0 71 1487
-/+ buffers/cache: 3099 4903
Swap: 15366 0 15366
By starting the services, our memory utilization went from 169M to 3099M. So either processes are actively using that memory, or the Java initial heaps have been created and the memory is sitting there reserved. Let’s take a look at the ps_mem command again.
localhost:~ # python ps_mem.py |tail -n 15
2.5 MiB + 161.5 KiB = 2.6 MiB vmtoolsd
2.4 MiB + 272.5 KiB = 2.6 MiB hald
1.7 MiB + 1.0 MiB = 2.8 MiB vami-sfcbd (7)
1.8 MiB + 1.4 MiB = 3.2 MiB sendmail (3)
2.6 MiB + 726.0 KiB = 3.4 MiB sshd (2)
2.5 MiB + 993.0 KiB = 3.5 MiB lsassd
4.2 MiB + 12.0 KiB = 4.2 MiB haveged
4.7 MiB + 421.0 KiB = 5.1 MiB python2.6
27.7 MiB + 160.0 KiB = 27.9 MiB slapd
45.3 MiB + 12.6 MiB = 57.8 MiB postmaster (23)
133.4 MiB + 353.5 KiB = 133.7 MiB vpxd
2.9 GiB + 10.4 MiB = 2.9 GiB java (6)
---------------------------------
3.2 GiB
=================================
Based on the output above, some more services have been added. The notable ones are vpxd (the vCenter Server daemon) and java. Since ps_mem only reports the binary name, the six Java-based services are grouped together under the 2.9G of memory allocated to java.
Determining the Memory Configuration and Usage per Process
We can look deeper into the processes and see which one has the largest Resident Set Size (rss). This website provides some commands for viewing the processes and ordering them by usage. Using one of the commands we get the following.
localhost:~ # ps -eo rss,vsz,pid,cputime,cmd --width 100 --sort rss,vsz | tail --lines 10
15504 168324 8769 00:00:00 postgres: vc VCDB 127.0.0.1(50889) idle
16124 162236 4319 00:00:00 postgres: writer process
29056 182200 8708 00:00:00 /usr/lib/openldap/slapd -h ldap://localhost -f /etc/openldap/slapd.co
103444 819668 7022 00:00:02 /usr/java/jre-vmware/bin/java -Xms128m -Xmx512m -Dcom.vmware.vide.log.di
137808 365148 8759 00:00:03 /usr/lib/vmware-vpx/vpxd
272664 857096 9169 00:00:08 /usr/java/jre-vmware/bin/java -Dorg.tanukisoftware.wrapper.WrapperSimple
488692 3708228 8245 00:00:20 /usr/java/jre-vmware/bin/java -Dorg.tanukisoftware.wrapper.WrapperSimpl
508980 1766476 8946 00:00:47 /usr/java/jre-vmware/bin/java -Djava.util.logging.config.file=/usr/lib/
853220 1605380 7634 00:01:39 /usr/java/jre-vmware/bin/java -Xmx1024m -Xms512m -XX:PermSize=128m -XX:
869924 2884000 5652 00:00:44 /usr/java/jre-vmware/bin/java -Djava.util.logging.config.file=/usr/lib/
Of the processes above we can see a few things. The first few are postgres processes; vPostgres is the internal database for the vCenter appliance. We then see slapd (the OpenLDAP daemon), which assists with the AD authentication. We also see vpxd (the vCenter daemon) again. These processes are not running in a Java heap, so we will not be tuning them.
The other 6 processes, which correlate to the 6 we saw with the ps_mem command, are the ones that add up to 2.9G of memory allocation.
Let’s just make sure that the math in ps_mem is correct and add up the rss for the java processes. The rss (resident set size) is the portion of a process’s memory held in physical RAM, which ps reports in kilobytes.
localhost:~ # ps -eo rss,vsz,pid,cputime,cmd --width 100 --sort rss,vsz | tail --lines 10 |grep java |awk '{ SUM += $1;} END { print SUM/1024/1024 }'
2.97867
All of the rss of the java processes add up to 2.9G, so these are the processes taking up the majority of our memory.
Just to confirm, let’s add up the processes that are not based on java and see the memory utilization.
localhost:~ # ps -eo rss,vsz,pid,cputime,cmd --width 100 --sort rss,vsz |grep -v java |awk '{ SUM += $1;} END { print SUM/1024 }'
417.953
This comes out to 417M (notice that I removed a /1024 from the awk command to produce megabytes instead of gigabytes). So our base system uses about 417M before the Java processes.
In java, the initial memory allocation will come from the -Xms that we discussed above. Let’s see which processes are using more than their defined -Xms.
First let’s do a better job of correlating the rss to the service.
localhost:~ # ps -eo rss,vsz,pid,cputime,cmd --width 200 --sort rss,vsz | tail --lines 10 |grep java |awk '{ MB = $1/1024; print MB,$3,$6,$7,$8}'
103.223 7022 -Xms128m -Xmx512m -Dcom.vmware.vide.log.dir=/var/log/vmware/logbrowser/
283.336 9169 -Dorg.tanukisoftware.wrapper.WrapperSimpleApp.waitForStartMain=FALSE -Dxml.config=../conf/sps-spring-config.xml -XX:+ForceTimeHighResolution
481.582 8245 -Dorg.tanukisoftware.wrapper.WrapperSimpleApp.waitForStartMain=FALSE -Dvim.logdir=/var/log/vmware/vpx/inventoryservice/ -XX:+ForceTimeHighRes
501.492 8946 -Djava.util.logging.config.file=/usr/lib/vmware-vpx/tomcat/conf/logging.properties -Xss1024K -Xincgc
833.309 7634 -Xmx1024m -Xms512m -XX:PermSize=128m
857.297 5652 -Djava.util.logging.config.file=/usr/lib/vmware-sso/conf/logging.properties -Duser.timezone=+00:00 -Dhazelcast.logging.type=log4j
The bottom process is Single Sign On (SSO). Let’s get the full process command.
localhost:~ # ps -eaf |grep 5652
ssod 5652 1 2 15:55 ? 00:00:46 /usr/java/jre-vmware/bin/java -Djava.util.logging.config.file=/usr/lib/vmware-sso/conf/logging.properties -Duser.timezone=+00:00 -Dhazelcast.logging.type=log4j -XX:MaxPermSize=256M -Xms2048m -Xmx2048m -XX:ErrorFile=/var/log/vmware/sso/sso_crash_pid%p.log -Djava.security.krb5.conf=/usr/lib/vmware-sso/webapps/ims/WEB-INF/classes/krb5-lw.conf -Dsun.security.jgss.native=true -Dsun.security.jgss.lib=/opt/likewise/lib64/libgssapi_krb5.so -Djava.util.logging.manager=com.springsource.tcserver.serviceability.logging.TcServerLogManager -Djava.endorsed.dirs=/usr/lib/vmware-sso/endorsed -classpath /usr/local/tcserver/vfabric-tc-server-standard/tomcat-7.0.25.B.RELEASE/bin/bootstrap.jar:/usr/local/tcserver/vfabric-tc-server-standard/tomcat-7.0.25.B.RELEASE/bin/tomcat-juli.jar -Dcatalina.base=/usr/lib/vmware-sso -Dcatalina.home=/usr/local/tcserver/vfabric-tc-server-standard/tomcat-7.0.25.B.RELEASE -Djava.io.tmpdir=/usr/lib/vmware-sso/temp org.apache.catalina.startup.Bootstrap start
Notice that the process is started by PID 1, which is init. We also see the heap flags as follows.
- -XX:MaxPermSize=256M
- -Xms2048m
- -Xmx2048m
So we have a minimum and maximum heap size of 2048M. We also have a MaxPermSize which has been increased from the 64M default to 256M. More information on MaxPermSize can be found here.
Based on the rss, we can reduce the settings for this process a considerable amount. Let’s lower the minimum heap size of the process to 32M and restart it. We can then see how large it grows.
Edit the /usr/lib/vmware-sso/bin/setenv.sh file and change the -Xms option to 32M in the following line.
JVM_OPTS="$JVM_OPTS -XX:MaxPermSize=256M -Xms32m -Xmx2048m"
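If you prefer to script the edit, a sed replacement like the one below should work, assuming the flag appears in the file exactly as -Xms2048m; back up the file first.
# Back up setenv.sh, then lower the SSO initial heap from 2048M to 32M.
cp /usr/lib/vmware-sso/bin/setenv.sh /usr/lib/vmware-sso/bin/setenv.sh.bak
sed -i 's/-Xms2048m/-Xms32m/' /usr/lib/vmware-sso/bin/setenv.sh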
Now restart the service.
localhost:/usr/lib/vmware-sso/bin # /etc/init.d/vmware-sso restart
Let’s find the new PID.
localhost:~ # cat /usr/lib/vmware-sso/logs/tcserver.pid
2479
Now we can get the current rss from the service.
localhost:~ # ps -o rss,vsz,pid,cputime,cmd --width 200 --sort rss,vsz -p 2479 |awk '{ MB = $1/1024; print MB}'
422.766
So the rss is now down to 422M for this process. The memory pages will vary under load, but this is likely close to the minimum we will be able to run with. Using this procedure you can go back and determine the minimum for each service. I would advise checking it under load to see what the optimum settings are for your environment.
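To make that round of checking quicker, here is a small sketch that prints each Java service’s current rss next to its configured -Xms/-Xmx flags. It assumes pgrep and GNU grep are available on the appliance, which was the case on mine.
# For every VMware Java process, show PID, resident memory, and heap flags.
for pid in $(pgrep -f jre-vmware/bin/java); do
    rss_kb=$(ps -o rss= -p "$pid")
    flags=$(tr '\0' '\n' < /proc/$pid/cmdline | grep -o -e '-Xm[sx][0-9]*[mMgG]' | tr '\n' ' ')
    echo "PID $pid: rss=$((rss_kb / 1024))M heap: $flags"
done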
Changing the Default Heap Allocation
Now that we know the new values for our environment, we can go ahead and set them. Below I have listed the places to change the values and what I put in my environment. My environment only uses the most basic services, so these numbers are likely lower than you will get.
Active VMware Services using Java
VMware SSO Service
/usr/lib/vmware-sso/bin/setenv.sh
- -XX:MaxPermSize=128M
- -Xms256m
- -Xmx1024m
VMware Inventory Service
This is with 2 hosts and only 20 VMs. More VMs will require more memory for the Inventory Service.
/usr/lib/vmware-vpx/inventoryservice/wrapper/conf/wrapper.conf
- wrapper.java.initmemory=64
- wrapper.java.maxmemory=512
VMware vCenter Service
/etc/vmware-vpx/tomcat-java-opts.cfg
- -Xmx256m
- -XX:MaxPermSize=128m
/usr/lib/vmware-vpx/tomcat/bin/setenv.sh
Leave the options as is, unless you want to lower -Xss (the thread stack size) below 1M.
/usr/lib/vmware-vpx/sps/wrapper/conf/wrapper.conf
- wrapper.java.initmemory=64
- wrapper.java.maxmemory=128
Log Browser
/etc/init.d/vmware-logbrowser
- -Xms64m
- -Xmx128m
vSphere Web Client
/usr/lib/vmware-vsphere-client/server/bin/dmk.sh
- -Xmx512m
- -Xms128m
- -XX:PermSize=64m
- -XX:MaxPermSize=128m
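These values only take effect after the corresponding services are restarted. The service names below come from the chkconfig listing shown later in this post; double-check them on your own system.
# Restart the Java-based services so the new heap settings are picked up.
service vmware-sso restart
service vmware-inventoryservice restart
service vmware-vpxd restart
service vmware-logbrowser restart
service vsphere-client restart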
Final Memory Consumption
The final memory consumption is much lower than it was previously. By changing the minimum and maximum heap sizes we can reduce the memory footprint. Here is the output from the free command after the changes.
localhost:~ # free -m
total used free shared buffers cached
Mem: 8002 3803 4199 0 32 1332
-/+ buffers/cache: 2437 5564
Swap: 15366 0 15366
We are now using 2.4G of memory for the system instead of the 3.2G we saw before.
localhost:~ # python ps_mem.py |tail -n 15
2.4 MiB + 161.5 KiB = 2.6 MiB vmtoolsd
2.4 MiB + 271.5 KiB = 2.6 MiB hald
1.8 MiB + 917.5 KiB = 2.7 MiB vami-sfcbd (7)
1.8 MiB + 1.4 MiB = 3.2 MiB sendmail (3)
2.6 MiB + 681.0 KiB = 3.3 MiB sshd (2)
2.5 MiB + 943.0 KiB = 3.4 MiB lsassd
4.2 MiB + 12.0 KiB = 4.2 MiB haveged
4.3 MiB + 635.5 KiB = 4.9 MiB python2.6
28.5 MiB + 162.5 KiB = 28.6 MiB slapd
34.2 MiB + 9.2 MiB = 43.4 MiB postmaster (21)
132.8 MiB + 350.5 KiB = 133.2 MiB vpxd
2.1 GiB + 10.3 MiB = 2.1 GiB java (6)
---------------------------------
2.4 GiB
=================================
The java services have been lowered from 2.9G to 2.1G. The biggest change is to the maximum amount of memory: instead of being able to grow to 8G, the ceiling is now much lower. If we add up the numbers we put in for the maximum heap sizes and MaxPermSizes we get 2688M, so the Java heaps should stay under 3G of memory. Added to our ~512M of system processes, we should be able to run comfortably in 4GB without any swapping. You can now lower the amount of memory on the vCenter Appliance in the vSphere Client.
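If you want to double-check that ceiling against what is actually running, you can sum the -Xmx flags of the live Java processes. This is only a sketch: it counts the -Xmx values and ignores MaxPermSize, and it assumes pgrep and GNU grep are available.
# Add up the -Xmx ceilings (in MiB) of the running VMware Java services.
for pid in $(pgrep -f jre-vmware/bin/java); do
    tr '\0' '\n' < /proc/$pid/cmdline | grep -o -e '-Xmx[0-9]*[mMgG]'
done | sed 's/-Xmx//' | awk '
    /[mM]$/ { sum += $1 + 0 }
    /[gG]$/ { sum += ($1 + 0) * 1024 }
    END { print sum " MiB of configured -Xmx" }'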
Stopping Unused Processes
Since these are lab vCenters, I do not need all of the processes to run on them. We will stop some of these services from starting at boot.
Let’s list the services.
vc:~ # chkconfig
...
vmware-inventoryservice on
vmware-logbrowser on
vmware-netdumper on
vmware-rbd-watchdog off
vmware-sps off
vmware-sso on
vmware-tools-services on
vmware-vpostgres on
vmware-vpxd on
vsock on
vsphere-client on
I still use the thick client for most operations, so I am going to disable the vsphere-client service.
vc:~ # chkconfig vsphere-client off
I am not going to be using the Log Browser or the Netdumper services either.
vc:~ # chkconfig vmware-logbrowser off
vc:~ # chkconfig vmware-netdumper off
Let’s stop those services.
vc:~ # service vsphere-client stop
vc:~ # service vmware-logbrowser stop
vc:~ # service vmware-netdumper stop
Now we can check the memory on the appliance using the free command again.
vc:~ # free -m
total used free shared buffers cached
Mem: 3962 2528 1433 0 27 782
-/+ buffers/cache: 1718 2244
Swap: 15366 0 15366
We are down to 1.7G utilization on this server. I have 2 hosts and 15 VMs attached to this vCenter Appliance. I have lowered the RAM to 3G and it works great.