Again about determining size of Java object

Sometimes it is necessary to estimate how much memory a Java object occupies. This article describes one approach, which uses the Java Instrumentation API.

When it can be necessary
Java object size estimation can be helpful in the following cases:

  • Cache implementations - Caches are normally used to speed up access to frequently used data. They usually cannot hold all the data from the database (or other storage) because the memory available to the Java process is limited. This means that a cache must somehow estimate the size of the data it holds and throw away old records so that it stays within its configured limit.

  • Memory leak detection - In some cases you notice a memory leak and measure the heap size before and after the leaking operations. If you suspect particular objects, you may need to measure their exact sizes and compare them with the amount of leaked memory. There are dedicated tools for this purpose, but they are normally quite heavyweight and affect performance; with a very basic size estimation method you could solve the problem faster in some cases.

  • Other memory size estimations - For instance, you can analytically estimate the JVM maximum heap size setting if you know how many objects your application is going to create.

  • Just for fun :)
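To make the cache case above more concrete, here is a minimal sketch of a cache that throws away its oldest records once an estimated byte budget is exceeded. The class name SizeBoundedCache and the pluggable sizer function are my own assumptions for illustration; any size estimation method can be plugged in as the sizer.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Objects;
import java.util.function.ToLongFunction;

/** Hypothetical cache that evicts its oldest entries when an estimated byte budget is exceeded. */
class SizeBoundedCache<K, V> {

    /** A cached value together with the size estimated for it at insertion time. */
    private static final class Sized<T> {
        final T value;
        final long bytes;

        Sized(T value, long bytes) {
            this.value = value;
            this.bytes = bytes;
        }
    }

    private final long maxBytes;
    private final ToLongFunction<? super V> sizer; // assumption: any size estimator can be plugged in
    private final LinkedHashMap<K, Sized<V>> entries = new LinkedHashMap<>(); // keeps insertion order
    private long currentBytes;

    SizeBoundedCache(long maxBytes, ToLongFunction<? super V> sizer) {
        this.maxBytes = maxBytes;
        this.sizer = Objects.requireNonNull(sizer);
    }

    void put(K key, V value) {
        Objects.requireNonNull(value);
        // Replace any previous entry so that the key moves to the "newest" end.
        Sized<V> previous = entries.remove(key);
        if (previous != null) {
            currentBytes -= previous.bytes;
        }
        long bytes = sizer.applyAsLong(value);
        entries.put(key, new Sized<>(value, bytes));
        currentBytes += bytes;

        // Throw away the oldest records until the cache fits its limit again.
        Iterator<Sized<V>> oldestFirst = entries.values().iterator();
        while (currentBytes > maxBytes && oldestFirst.hasNext()) {
            currentBytes -= oldestFirst.next().bytes;
            oldestFirst.remove();
        }
    }

    V get(K key) {
        Sized<V> entry = entries.get(key);
        return entry == null ? null : entry.value;
    }

    long estimatedBytes() {
        return currentBytes;
    }
}
```

Each value is measured once, when it is inserted, so the cost of size estimation is spread over the put operations instead of re-measuring the whole cache.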

Overview of different approaches
There are several approaches to estimating Java object size. Most of them were known before JDK 5.0 appeared.

  • http://jroller.com/page/mipsJava?entry=sizeof_java_objects - uses System.gc() together with the Runtime.freeMemory() and Runtime.totalMemory() methods to measure the size of a Java object. This approach normally needs a lot of resources to determine the object size precisely: you have to create many instances (preferably several thousand) of the object being estimated and measure the heap size before and after. This makes the approach almost useless for production code such as cache implementations. Its advantage is that it should give quite precise estimates on any Java implementation and OS, even where object sizes differ.

  • Another, more successful approach is described here: http://www.javaspecialists.co.za/archive/Issue078.html - It is trickier. It uses an experimentally determined table of primitive type sizes to compute the full object size. The Reflection API is used to iterate through the object's member variable hierarchy and add up the sizes of all primitive variables. This approach does not need as many resources as the previous one and can be used for cache implementations. The drawback is that the primitive type size table differs between JVM implementations and should be re-evaluated for each one.
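The table-based idea from the second bullet can be sketched roughly as follows. This is not the code from the linked article: the class name TableSizeEstimator, the header size, the reference size and the alignment are all assumptions I picked for illustration, and the sketch covers only the shallow size of a single object.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

/** Hypothetical shallow size estimator driven by a table of assumed field sizes. */
class TableSizeEstimator {

    // Assumed values in bytes; they differ between JVM implementations and must be measured.
    static final int HEADER_BYTES = 8;
    static final int REFERENCE_BYTES = 4;
    static final int ALIGNMENT = 8;

    /** The "size table": bytes taken by one field of the given declared type. */
    static int fieldBytes(Class<?> type) {
        if (type == boolean.class || type == byte.class) return 1;
        if (type == char.class || type == short.class) return 2;
        if (type == int.class || type == float.class) return 4;
        if (type == long.class || type == double.class) return 8;
        return REFERENCE_BYTES; // every non-primitive field is stored as a reference
    }

    /** Sums the assumed sizes of all instance fields, including inherited ones. */
    static long shallowSizeOf(Class<?> clazz) {
        long bytes = HEADER_BYTES;
        for (Class<?> c = clazz; c != null; c = c.getSuperclass()) {
            for (Field field : c.getDeclaredFields()) {
                if (!Modifier.isStatic(field.getModifiers())) {
                    bytes += fieldBytes(field.getType());
                }
            }
        }
        // Round up to the assumed alignment boundary.
        return (bytes + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }
}

/** Example class: 8 (header) + 4 + 4 + 8 + 4 = 28 bytes, rounded up to 32 under the assumptions above. */
class Point3 {
    int x;
    int y;
    long stamp;
    Object label;
}
```

The numbers such a sketch produces are only as good as the table, which is exactly the drawback mentioned above.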

Here are also several other articles describing similar approaches:
http://www.javaworld.com/javaworld/javatips/jw-javatip130.html
http://www.javaworld.com/javaworld/javaqa/2003-12/02-qa-1226-sizeof.html
http://www.javapractices.com/Topic83.cjp
http://forum.java.sun.com/thread.jspa?threadID=565721&messageID=2790847
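
For comparison, the heap measurement approach from the first bullet could look roughly like this. It is a sketch under assumptions: the class name HeapDeltaSizer, the number of System.gc() calls and the pauses between them are arbitrary choices of mine, and System.gc() is only a hint to the JVM, so the result is an estimate that may vary between runs.

```java
import java.util.function.Supplier;

/** Hypothetical helper that estimates object size from heap usage before and after allocation. */
class HeapDeltaSizer {

    private static long usedMemory() {
        Runtime runtime = Runtime.getRuntime();
        return runtime.totalMemory() - runtime.freeMemory();
    }

    /** Asks the JVM several times to collect garbage; this is a request, not a guarantee. */
    private static void collectGarbage() {
        for (int i = 0; i < 4; i++) {
            System.gc();
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    /** Creates count instances and returns the average heap growth per instance in bytes. */
    static long averageSizeOf(Supplier<?> factory, int count) {
        Object[] keep = new Object[count]; // allocated before the first measurement
        factory.get();                     // warm up class loading
        collectGarbage();
        long before = usedMemory();
        for (int i = 0; i < count; i++) {
            keep[i] = factory.get();
        }
        collectGarbage();
        long after = usedMemory();
        // Touch the array after measuring so that the instances stay reachable until here.
        if (keep[count - 1] == null) {
            throw new IllegalStateException("factory returned null");
        }
        return Math.round((double) (after - before) / count);
    }
}
```

A call such as HeapDeltaSizer.averageSizeOf(() -> new byte[1000], 10000) takes a noticeable fraction of a second and allocates about 10 MB just to measure one kind of object, which illustrates why this approach is impractical inside a cache.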


Using Instrumentation API for determining object size
Starting with version 5.0, the JDK includes the Instrumentation API, which finally provides the getObjectSize method. However, there are two issues with this method:

  1. It cannot be used directly. You have to implement an instrumentation agent, which must be packaged into a JAR file

  2. It returns the size of a single object only, without counting the sizes of the objects that its member variables refer to

Both issues are easy to solve. A Java agent can be implemented by declaring a premain method in any class:

    import java.lang.instrument.Instrumentation;

    public class SizeOfAgent {

        static Instrumentation inst;

        /** initializes agent */
        public static void premain(String agentArgs, Instrumentation instP) {
            inst = instP;
        }
    }


The premain method is called by the JVM on startup, and an instance of Instrumentation is passed to it. SizeOfAgent simply stores the reference to Instrumentation in a static variable. To tell the JVM that the agent class exists, the class must be packaged into a JAR file with special attributes in the manifest.mf file. In our case we need the following attributes:
    Premain-Class: sizeof.agent.SizeOfAgent
    Boot-Class-Path:
    Can-Redefine-Classes: false
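
If you prefer to generate the manifest from code rather than write it by hand, the standard java.util.jar classes can be used. The following is only a sketch: the helper class AgentManifest and its methods are hypothetical names of mine, and the empty Boot-Class-Path attribute is left out.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

/** Hypothetical helper that builds the manifest attributes needed by an instrumentation agent. */
class AgentManifest {

    static Manifest build(String agentClassName) {
        Manifest manifest = new Manifest();
        Attributes main = manifest.getMainAttributes();
        // Manifest-Version has to be present, otherwise the other main attributes are not written out.
        main.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        main.putValue("Premain-Class", agentClassName);
        main.putValue("Can-Redefine-Classes", "false");
        return manifest;
    }

    /** Renders the manifest as text, as it would appear inside the JAR file. */
    static String render(Manifest manifest) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            manifest.write(out);
            return out.toString("UTF-8");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The resulting Manifest object can then be passed to the java.util.jar.JarOutputStream constructor when the agent JAR is written.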

Additionally, the Java application must be launched with the -javaagent parameter pointing to the JAR file. In our case it looks like this:

    java -javaagent:sizeofag.jar

Once we have a reference to the Instrumentation instance, it is very simple to implement a basic sizeOf method.

    public class SizeOfAgent {

        static Instrumentation inst;

        // ...

        public static long sizeOf(Object o) {
            return inst.getObjectSize(o);
        }
    }

SizeOfAgent.sizeOf() can then simply be called from your application. As mentioned above, the method returns the size of a single object only, without taking into account the sizes of the objects its member variables refer to. The full object size can be estimated through reflection: we recursively go through all member variables of the object being estimated and sum up their sizes. Not everybody knows that it is possible to access private and protected variables through reflection. You just have to call the Field.setAccessible(true) method before obtaining the value of a private field. Here is the full source code of SizeOfAgent with the fullSizeOf method implementation:

    package sizeof.agent;

    import java.lang.instrument.Instrumentation;
    import java.lang.reflect.Array;
    import java.lang.reflect.Field;
    import java.lang.reflect.Modifier;
    import java.util.IdentityHashMap;
    import java.util.Map;
    import java.util.Stack;

    /** Instrumentation agent used */
    public class SizeOfAgent {

        static Instrumentation inst;

        /** initializes agent */
        public static void premain(String agentArgs, Instrumentation instP) {
            inst = instP;
        }

        /**
         * Returns object size without member sub-objects.
         * @param o object to get size of
         * @return object size
         */
        public static long sizeOf(Object o) {
            if (inst == null) {
                throw new IllegalStateException("Can not access instrumentation environment.\n" +
                        "Please check if jar file containing SizeOfAgent class is \n" +
                        "specified in the java's \"-javaagent\" command line argument.");
            }
            return inst.getObjectSize(o);
        }

        /**
         * Calculates full size of object iterating over
         * its hierarchy graph.
         * @param obj object to calculate size of
         * @return object size
         */
        public static long fullSizeOf(Object obj) {
            Map<Object, Object> visited = new IdentityHashMap<Object, Object>();
            Stack<Object> stack = new Stack<Object>();

            long result = internalSizeOf(obj, stack, visited);
            while (!stack.isEmpty()) {
                result += internalSizeOf(stack.pop(), stack, visited);
            }
            visited.clear();
            return result;
        }

        private static boolean skipObject(Object obj, Map<Object, Object> visited) {
            if (obj instanceof String) {
                // skip interned string
                if (obj == ((String) obj).intern()) {
                    return true;
                }
            }
            return (obj == null) // skip visited object
                    || visited.containsKey(obj);
        }

        private static long internalSizeOf(Object obj, Stack<Object> stack, Map<Object, Object> visited) {
            if (skipObject(obj, visited)) {
                return 0;
            }
            visited.put(obj, null);

            long result = 0;
            // get size of object + primitive variables + member pointers
            result += SizeOfAgent.sizeOf(obj);

            // process all array elements
            Class<?> clazz = obj.getClass();
            if (clazz.isArray()) {
                // skip primitive type array
                if (!clazz.getComponentType().isPrimitive()) {
                    int length = Array.getLength(obj);
                    for (int i = 0; i < length; i++) {
                        stack.add(Array.get(obj, i));
                    }
                }
                return result;
            }

            // process all fields of the object
            while (clazz != null) {
                Field[] fields = clazz.getDeclaredFields();
                for (int i = 0; i < fields.length; i++) {
                    if (!Modifier.isStatic(fields[i].getModifiers())) {
                        if (fields[i].getType().isPrimitive()) {
                            continue; // skip primitive fields
                        } else {
                            fields[i].setAccessible(true);
                            try {
                                // objects to be estimated are put to stack
                                Object objectToAdd = fields[i].get(obj);
                                if (objectToAdd != null) {
                                    stack.add(objectToAdd);
                                }
                            } catch (IllegalAccessException ex) {
                                assert false;
                            }
                        }
                    }
                }
                clazz = clazz.getSuperclass();
            }
            return result;
        }
    }

The basic idea is very similar to the approach published by Dr. Heinz M. Kabutz: http://www.javaspecialists.co.za/archive/Issue078.html. I even reused the skipObject method from there. The algorithm counts each object only once, which also keeps it from looping forever on cyclic references. Additionally, it skips interned strings (see String.intern() for more information).
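
To see how the "count each object only once" rule handles cycles without setting up an agent, here is a simplified sketch of the same traversal. It is not the SizeOfAgent code: the class GraphSizer, the demo class Node and the injected shallowSize function are assumptions of mine, an ArrayDeque replaces the Stack, and interned strings are not treated specially. Fields that cannot be opened through reflection (which can happen for JDK-internal classes on newer Java versions) are simply skipped.

```java
import java.lang.reflect.Array;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.Set;
import java.util.function.ToLongFunction;

/** Hypothetical graph walker that sums a caller-supplied shallow size over all reachable objects. */
class GraphSizer {

    static long fullSizeOf(Object root, ToLongFunction<Object> shallowSize) {
        // Identity-based set: objects are compared with ==, never with equals().
        Set<Object> visited = Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());
        Deque<Object> pending = new ArrayDeque<Object>();
        if (root != null) {
            pending.push(root);
        }
        long total = 0;
        while (!pending.isEmpty()) {
            Object current = pending.pop();
            if (!visited.add(current)) {
                continue; // already counted, this is what breaks cycles
            }
            total += shallowSize.applyAsLong(current);

            Class<?> type = current.getClass();
            if (type.isArray()) {
                if (!type.getComponentType().isPrimitive()) {
                    for (int i = 0; i < Array.getLength(current); i++) {
                        Object element = Array.get(current, i);
                        if (element != null) {
                            pending.push(element);
                        }
                    }
                }
                continue;
            }
            for (Class<?> c = type; c != null; c = c.getSuperclass()) {
                for (Field field : c.getDeclaredFields()) {
                    if (Modifier.isStatic(field.getModifiers()) || field.getType().isPrimitive()) {
                        continue;
                    }
                    try {
                        field.setAccessible(true); // allows reading private fields
                        Object value = field.get(current);
                        if (value != null) {
                            pending.push(value);
                        }
                    } catch (IllegalAccessException | RuntimeException e) {
                        // The field could not be opened; skip it in this sketch.
                    }
                }
            }
        }
        return total;
    }
}

/** Demo class with a private reference field, used to build a cyclic structure. */
class Node {
    private Node next;
    private final int id;

    Node(int id) {
        this.id = id;
    }

    void linkTo(Node other) {
        this.next = other;
    }
}
```

With two nodes that point at each other and a shallowSize function that returns 1 for every object, fullSizeOf returns 2: the walk terminates and each node is counted exactly once. In the real agent the shallowSize function would be SizeOfAgent.sizeOf.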

Disadvantages
The main disadvantage of this approach is that it cannot be used in sandbox environments such as unsigned applets or Web Start applications. The limitation exists because reflection is used to access private class members, and instrumentation agents may not work in sandbox environments either.

Files

The file sizeofag.jar contains the compiled class together with the Java sources. So you can just pass sizeofag.jar in your JVM's -javaagent option and use SizeOfAgent as a normal class in your program. Enjoy it :)

COMMENTS:
Wow. +10 
/dance 
/cheer 

Cheers. 

Posted by Mark Swanson on January 15, 2007 at 05:51 PM EET #

Nice one. 
Another drawback I see here is performance. E.g. if you want a FixedMemory eviction policy for a cache (which can indeed be achieved with this approach), you still have to create a graph of objects (Stack, Fields etc.) in order to determine the real memory size. That is not a small amount, and it will make the cache inefficient. I was thinking that perhaps there is another, lower-level way of instrumentation that can remove this impediment(?)

Posted by mmarkus on February 08, 2007 at 02:30 AM EET #

Markus, 

Thanks for your feedback. I think in your case performance may depend on how you organize your cache implementation. E.g. you can:
1. Measure the full cache size each time you are going to add a new object to it. This can be really too heavyweight.
2. Measure the size of each object before adding it to the cache and accumulate the total cache size. This would distribute the size calculation time among insertion operations, and may help with your issue.

Regards, 
Maxim

Posted by Maxim Zakharenkov on February 08, 2007 at 11:23 AM EET #

Hi, I know, this article was posted some time ago, but I try my luck getting an answer. :) 

How can I add the SizeOfAgent to an application in another way than passing the -javaagent option?
I want to use it in a web application and won't have control over the startup settings of tomcat. 
The agent shall observe the size of a LinkedHashMap in a class (implemented as Singleton) where database queries are cached. 

Many thanks in advance! 

Best regards, 
Chris

Posted by Christian Schwinn on June 05, 2008 at 02:32 AM EEST #

Hi Christian 

You cannot get this to work in Tomcat without this config being done. We have tried! This library also only works for 32 bit Java, although I would expect it would not be hard to create a 64 bit version.

Mark

Posted by Mark on June 05, 2008 at 03:46 AM EEST #

>I try my luck getting an answer 
Hi, no problem, we're still here:) 

Unfortunately I don't know any other way to launch my program without "-javaagent" parameter. 

BUT!!! you can take a bit different approach described here: 
http://www.javaspecialists.co.za/archive/Issue078.html 

It won't always give you 100% accurate estimations, but they will be pretty close. 

It would be really nice if JDK had sizeOf function available out of the box.

Posted by Maxim Zakharenkov on June 05, 2008 at 10:24 AM EEST #

Hi Maxim, 

Thanks a lot for this. It works great. 

charlie

Posted by eddiherd on October 04, 2008 at 07:08 AM EEST #

Hello Maxim, 

I added SizeOfAgent to Apache Tomcat 6, to the Manager application so it displays the size of the session and the session attributes. 
It seems to work just fine, but one thing puzzles me a bit: An "empty" session (no attributes) already has a size of several MB. The sessions of one webapp have a size of 12 MB but when I sum up the sizes of the attributes I maybe get to 4 MB. 
If I for example set a byte array as an attribute the size is shown as expected. 
I'll try to look a bit into this in the hope to understand where this "session overhead" comes from. 

Thanks a lot for sharing this very useful information! 

Best Regards, 
Torsten

Posted by Torsten on October 12, 2008 at 11:48 AM EEST #

Hi Torsten, 

Interesting use case for the sizeOf function. It is really useful for estimating the memory size necessary for running N users.
It would be nice if you described the problem's cause once you have solved it.

Regards, 
Maxim 

Posted by Maxim Zakharenkov on October 13, 2008 at 01:10 PM EEST #

Hi Maxim, 

It seems that a session in Tomcat has a quite impressive hierarchy of members - so I guess that's why even a session that does not have any attributes already shows quite some size. 

The problem also is, that the size shown for each session is not really the size of it. I guess the sessions have mainly references to "shared" object instances while their size is taken in account for each session. 

So I was looking only at the sum of the sizes of the session's attributes. 

But here I got some confusing results. In one session I have seen a very large attribute. Browsing a heap dump made right after, I couldn't find any member of that attribute with a significant size. So I am wondering where the size comes from. 

So I am a bit lost right now. 

Best Regards, 
Torsten

Posted by Torsten on October 15, 2008 at 07:41 PM EEST #

Hi again, 

Sorry for spamming your blog - but I understand now why the size of some session attributes is unexpectedly big: If an attribute somewhere in its member hierarchy has an instance of some particular class of Tomcat's codebase, the size of all kinds of such instances is taken in account as well. 

The resulting size then seems to be the session's "overhead" + the actual size of the attribute. 

A simple example. In a JSP: 

    class SomeClass {
    }

    SomeClass someClass = new SomeClass();

    request.getSession().setAttribute("someClass", someClass);

I'd expect this attribute to have a size of just a few bytes. But the size shown is around 1500000. 
Browsing a heap dump, I can see that the SomeClass instance has a member: 

this$0 (L) : org.apache.jsp.index_jsp@0x8cf3ae88 (20 bytes) 

And this instance has a giant member hierarchy, so I guess that is where the size comes from. 

So, it is basically impossible to determine the size of a session attribute. Maybe by excluding the org.apache.* package... think I'll give it a try. 

And the bottom line is: Your approach works as expected ;-) 

Best Regards, 
Torsten

Posted by Torsten on October 16, 2008 at 05:02 PM EEST #

Hi, 

>Sorry for spamming your blog 
I don't think you are spamming. It is really good when a blog post gets discussed. I am pleased that I published this stuff 1.5 years ago and it is still useful to somebody.

>Maybe by excluding the org.apache.* package 
I think it is the right way to go, however an attribute might well use some org.apache.collection whose size should be measured. So in general it is not that simple :)

I think it is a problem of Java that it does not allow explicit memory management operations. In some cases that is helpful of course, but I suppose effective server programming needs more low-level memory management.
I'm not sure whether such a thing could easily be added to Java.

With explicit memory management it would be possible to
- Fully isolate all user session memory spaces
- Control that the session size does not exceed a specified limit
- Fully clean/reuse session memory after the request is complete or the session is closed

These are the reasons why I consider Java not an ideal environment for implementing servers. That's actually the main reason why Java hosting is not as popular as Apache + PHP stuff.

Cheers, 
Maxim 

Posted by Maxim on October 16, 2008 at 05:39 PM EEST #

Hi Maxim, 

> I suppose for some effective Server programming more low level memory management is necessary. 

If you know what you are doing, then I can agree. But many developers don't, I guess that's why Java deliberately doesn't allow to do low level memory stuff. Some manage to make nice memory leaks even in Java. 

I actually think that if a session is expired, all of the memory it occupied can be freed by the garbage collector. At least I have never experienced any memory leaks caused by sessions not completely cleaned up. 
I rather have the problem that the JVM runs out of PermGenSpace after a number of webapp redeployments in Tomcat. For a long time I wanted to find out more about this, why this happens and if something can be done about that. 

Greetings, 
Torsten

Posted by Torsten on October 16, 2008 at 08:41 PM EEST #

Maxim, 
thanks a lot for this utility. I am using eclipse, for a standard java project I have hard time setting -javaagent option. can you kindly help me out there? 

How can I set option -javaagent? 

In run dialog I specified -javaagent:sizeofag.jar but it gives me error 
"Error occurred during initialization of VM 
agent library failed to init: instrument 
Error opening zip file: sizeofag.jar" 

thanks in advance. 

Posted by nix on September 19, 2009 at 09:30 AM EEST #

Nix, 

I have no idea why this happens. Maybe your sizeofag.jar is corrupted? Have you tried to run a very basic program with it (without Eclipse)?

Posted by 94.30.130.104 on September 21, 2009 at 10:50 PM EEST #

Although I haven't done this myself, it looks like you can also load an agent at runtime using the Java Attach API: http://java.sun.com/javase/6/docs/jdk/api/attach/spec/index.html. This may help with the cases where users have no control over how the JVM is started. This API is located in the JDK's lib/tools.jar.

Posted by Martin Serrano on October 28, 2009 at 02:25 PM EET #

More specifically, the relevant part of the Attach API is at http://java.sun.com/javase/6/docs/jdk/api/attach/spec/com/sun/tools/attach/VirtualMachine.html 

Posted by Martin Serrano on October 28, 2009 at 02:53 PM EET #

Hi Maxim 
thanks for the jar file. 
have you ever tried this jar file inside a plugin in eclipse? I have problem with invoking its method inside a plugin, the premain is executed without any problem, but when it comes to invoking the method it gives me "java.lang.NoClassDefFoundError". I checked the classpath at runtime, the jar file exists in the classpath. I also loaded the class again but i still receive the error. 
any idea? 

Thanks, 
Emi

Posted by Emi on December 01, 2009 at 02:06 PM EET #

Hi Emi, 

You could try to add some more classes to the bootclasspath. The problem might be that your application and the Java agent have different classloaders. Normally I was able to solve such issues by adding more classes to the bootclasspath. Take a look at the java -Xbootclasspath option.

Hope it helps :) 
Regards, 
Maxim

Posted by Maxim Zakharenkov on December 01, 2009 at 09:47 PM EET #

Hi Maxim 
I checked the classloader while the premain is executed and then inside the application, and they are the same, so apparently the java agent and the application have same classloaders. 
Following your suggestion, I added the jar file to the bootclasspath attribute in the manifest, but it does not work. Can you tell me what classes I should add? 

Thanks again, 
Cheers, 
Emi