Friday, May 14, 2010

Garbage.Collector - 4.0

Managed memory in .NET is handled by the Garbage Collector (G.C). It works out which memory is still in use by following references to objects: an object held by a strong reference stays alive, while an object reachable only through weak references can be collected. With the release of .NET 4.0, the G.C has been significantly enhanced. Let's first see how the G.C works and then what's new in .NET 4.0.

The G.C starts from the G.C roots, visits every object a root points to, then every object those objects point to, and so on. This builds an object graph of everything reachable on the heap, and those objects are marked as alive. All unreferenced objects are freed and the memory is compacted to avoid fragmentation.
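
The traversal described above can be sketched as a small graph walk. This is only a toy model and not how the CLR is implemented: the Node type and the MarkSketch.Mark method are hypothetical names made up for this illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a managed object; not a CLR type.
public sealed class Node
{
    public string Name;
    public List<Node> References = new List<Node>();
    public Node(string name) { Name = name; }
}

public static class MarkSketch
{
    // Returns every node reachable from the given roots (the "alive" set).
    // Anything not in the returned set would be eligible for collection.
    public static HashSet<Node> Mark(IEnumerable<Node> roots)
    {
        var alive = new HashSet<Node>();
        var pending = new Stack<Node>(roots);
        while (pending.Count > 0)
        {
            Node current = pending.Pop();
            if (!alive.Add(current)) continue; // already marked, skip it
            foreach (Node child in current.References)
                pending.Push(child);
        }
        return alive;
    }
}
```

If a root points to object a, and a points to b, both end up in the alive set. An object c that only points at a, with nothing pointing at c itself, is left out and would be freed.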

The G.C heap is allocated by the O.S and is part of the process working set. It is one contiguous block of memory addresses. To track the lifetime of objects, the G.C divides the heap into three generations: Gen0, Gen1 and Gen2. Objects that survive a G.C cycle are promoted to the next generation.
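
To watch promotion happen, we can ask the runtime which generation an object is in before and after forcing collections. A minimal sketch, assuming a plain console application; the class and method names are made up for this post, while GC.GetGeneration, GC.Collect and GC.KeepAlive are the real framework calls.

```csharp
using System;

public static class GenerationDemo
{
    // Records the generation of one surviving object before any collection,
    // after one forced collection, and after a second forced collection.
    public static int[] ObserveGenerations()
    {
        object survivor = new object();
        int[] seen = new int[3];
        seen[0] = GC.GetGeneration(survivor);
        GC.Collect();
        seen[1] = GC.GetGeneration(survivor);
        GC.Collect();
        seen[2] = GC.GetGeneration(survivor);
        GC.KeepAlive(survivor); // keep the reference live until this point
        return seen;
    }

    public static void Main()
    {
        int[] seen = ObserveGenerations();
        Console.WriteLine("{0}, {1}, {2}", seen[0], seen[1], seen[2]);
    }
}
```

On a typical run the object starts in Gen0 and moves up one generation per collection it survives, but the exact numbers depend on the runtime and build settings, so treat the output as an observation rather than a guarantee.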

The older an object is, the lower its memory address; newly allocated objects sit at higher memory addresses. The size limit of each generation is not fixed. It changes dynamically, depending on the allocation rate, to optimize allocation.
Memory is allocated to the generations in segments, and typically 16 MB is assigned to Gen0, and so on for Gen1 and Gen2. New segments are only allocated to Gen1 and Gen2.
Newly created objects start in Gen0, and as an object gets older it is promoted to the next generation. Performing a G.C on Gen2 is more expensive than on Gen0 and Gen1, so the more objects we have in Gen1 and Gen2, the longer they take to be deallocated.
Objects of 85,000 bytes or more (roughly 85 KB) are allocated on the Large Object Heap.
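
A quick way to see the Large Object Heap at work is to compare the generation reported for a small array and a large one. A rough sketch, with class and method names invented for this post; the runtime reports Large Object Heap objects as belonging to the oldest generation.

```csharp
using System;

public static class LargeObjectDemo
{
    // Allocates a byte array of the given length and returns the
    // generation the runtime reports for it right after allocation.
    public static int GenerationOfArray(int lengthInBytes)
    {
        byte[] buffer = new byte[lengthInBytes];
        int generation = GC.GetGeneration(buffer);
        GC.KeepAlive(buffer);
        return generation;
    }

    public static void Main()
    {
        Console.WriteLine("1,000-byte array:   generation {0}", GenerationOfArray(1000));
        Console.WriteLine("100,000-byte array: generation {0}", GenerationOfArray(100000));
    }
}
```

The large array is reported in generation GC.MaxGeneration straight away, even though it has not survived any collection. The small array normally shows up in Gen0.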

Gen0 is the most active heap generation. The G.C performs collection and compaction in two ways:

1) Full collection, which collects all the generations.

2) Partial collection, which collects only the ephemeral generations (Gen0 and Gen1).

A full collection is expensive, so the CLR delays it for as long as it can.
A collection is triggered when Gen0 reaches its threshold, when GC.Collect() is called, or when the system is low on memory.
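
Two of these triggers are easy to observe from code with GC.CollectionCount. A small sketch under the assumption of default G.C settings; the class, field and method names are hypothetical, and the number of Gen0 collections you see will vary from machine to machine.

```csharp
using System;

public static class CollectionTriggerDemo
{
    // Storing each array in a static field keeps the allocation from being optimized away.
    private static byte[] sink;

    // Allocates many short-lived arrays and returns how many Gen0
    // collections the runtime performed while we were doing so.
    public static int Gen0CollectionsWhileAllocating(int arrayCount)
    {
        int before = GC.CollectionCount(0);
        for (int i = 0; i < arrayCount; i++)
            sink = new byte[1024]; // the previous array becomes garbage
        return GC.CollectionCount(0) - before;
    }

    // Forces a collection and returns how much the counter for the
    // given generation went up because of it.
    public static int CollectionsFromExplicitCall(int generation)
    {
        int before = GC.CollectionCount(generation);
        GC.Collect();
        return GC.CollectionCount(generation) - before;
    }

    public static void Main()
    {
        Console.WriteLine("Gen0 collections while allocating: {0}",
            Gen0CollectionsWhileAllocating(1000000));
        Console.WriteLine("Gen2 collections from GC.Collect(): {0}",
            CollectionsFromExplicitCall(GC.MaxGeneration));
    }
}
```

The first number grows as Gen0 keeps hitting its threshold. The second shows that a plain GC.Collect() is a full collection, since it bumps the Gen2 counter as well.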

The G.C can be divided into two types: blocking G.C, which is what server G.C uses, and concurrent G.C, which is what workstation G.C uses.

Server G.C is a blocking G.C. That means when the G.C starts the collection process, all other managed threads of the application are suspended. After the G.C has reclaimed the memory and finished compacting the heap, all the suspended managed threads are resumed.

With concurrent G.C, we are allowed to allocate memory while the G.C is in progress, but this applies only to a full G.C. If it is not a full G.C then it is an ephemeral G.C (Gen0 and Gen1), which is blocking and not concurrent. So server G.C and ephemeral G.C are always blocking.

Concurrent G.C lets us keep allocating even while a full G.C is in progress. But if those allocations exceed the segment limit during the full G.C, we have to wait for an ephemeral G.C to be performed. So there is again an issue of latency (the time for which the application threads are suspended).

CLR 4.0 introduces the concept of background and foreground G.C. Unlike concurrent G.C, when we run out of memory in the ephemeral segment we do not have to wait for another ephemeral G.C cycle. Instead, a foreground G.C is started to collect the ephemeral generations while the full G.C (the background G.C) is still in progress. So we can have a background full G.C running alongside a foreground G.C for the ephemeral generations, which reduces latency. For now, background G.C is only available for workstation G.C; server G.C is still blocking.
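
To check which flavor an application is actually running with, we can read the GCSettings class from System.Runtime. A minimal sketch; the GcModeDemo class and Describe method are names made up for this post. The flavor itself is normally chosen in the application configuration file through the gcServer and gcConcurrent elements, not from code.

```csharp
using System;
using System.Runtime;

public static class GcModeDemo
{
    // Builds a one-line description of the G.C flavor and latency mode
    // the current process is running with.
    public static string Describe()
    {
        string flavor = GCSettings.IsServerGC ? "server" : "workstation";
        return string.Format("{0} G.C, latency mode {1}", flavor, GCSettings.LatencyMode);
    }

    public static void Main()
    {
        Console.WriteLine(Describe());
    }
}
```

A latency mode of Interactive means concurrent (background) collection is enabled, while Batch means it is disabled.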

The G.C heap is allocated in segments and is part of the process working set.
When we allocate enough memory for new objects that one segment fills up, the O.S gives the G.C another segment for new allocations. If both segments are full and we haven't released any of our application's objects, these two segments are the committed memory in our process working set. If we now start releasing objects, the G.C will find the unreferenced ones and start collecting them. Memory will be freed and our process heap will have free memory available. But the CLR may or may not release this free memory to the O.S, depending on how much memory pressure the O.S is under.

The CLR optimizes memory management and can assume that the application will need memory again soon, so it keeps the free memory on the process heap.

This free memory can be returned to the O.S if the O.S detects that memory is scarce. This is why we sometimes see a high working set in Task Manager, whereas the real indication of tied-up memory is the committed memory.
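
The gap between these numbers can be seen by printing them side by side. A rough sketch, with a hypothetical MemoryReport class; GC.GetTotalMemory, Process.WorkingSet64 and Process.PrivateMemorySize64 are the real framework members, and the values will differ on every run.

```csharp
using System;
using System.Diagnostics;

public static class MemoryReport
{
    // Returns, in bytes: the memory the G.C believes is allocated on the
    // managed heap, the process working set, and the private (committed)
    // memory of the process.
    public static long[] Snapshot()
    {
        using (Process me = Process.GetCurrentProcess())
        {
            return new long[]
            {
                GC.GetTotalMemory(false), // false: do not force a collection first
                me.WorkingSet64,
                me.PrivateMemorySize64
            };
        }
    }

    public static void Main()
    {
        long[] s = Snapshot();
        Console.WriteLine("Managed heap in use: {0:N0} bytes", s[0]);
        Console.WriteLine("Working set:         {0:N0} bytes", s[1]);
        Console.WriteLine("Private memory:      {0:N0} bytes", s[2]);
    }
}
```

Running this before and after releasing a large batch of objects usually shows the managed heap figure dropping well before the working set does.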

http://blogs.msdn.com/maoni/archive/2008/11/19/so-what-s-new-in-the-clr-4-0-gc.aspx

http://blogs.msdn.com/salvapatuel/archive/2009/06/10/background-and-foreground-gc-in-net-4.aspx
