mar

21

The Java Virtual Machine (JVM for short), well known to Java developers, interprets and executes bytecode. The main benefit of the JVM is portability: programs written in the languages it supports can run on any platform that has a JVM.

How does it work exactly?

The different memory areas

Here is an overview of the JVM's memory layout (Sun HotSpot):

We can see that the JVM is made up of different areas, which can be grouped into 2 main categories: the PERM and the HEAP:

PERM

When the application is loaded, its classes (.class files) are loaded into this memory space. The JVM then uses the classes loaded here to create their instances in the HEAP.

HEAP

The HEAP consists of 2 generations of objects:

1. YOUNG GENERATION: This category represents the new objects. It is organized in 3 areas:

* EDEN: new objects are created in this area. When it reaches its maximum size, a first garbage collection (a minor GC) moves the live objects (those still referenced by other objects) into the FROM area.

* FROM: this area receives objects moved from EDEN. When a GC occurs and this area is full, the live objects are moved into the TO area.

* TO: this area receives objects moved from FROM. When a GC occurs and this area is full, the live objects are moved into the OLD area.

2. OLD GENERATION: This category holds objects that are still alive after successive garbage collections.
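The areas listed above can be observed on a running JVM. The following is a minimal sketch using the standard `java.lang.management` API; the class name `MemoryPools` and the method `describePools` are made up for this illustration. The pool names it prints depend on the JVM version and on the collector in use, so they will not necessarily match the labels used in this article (for example, recent HotSpot versions report a Metaspace pool rather than a permanent generation).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of any library.
class MemoryPools {

    /** Returns one line per memory pool: its type (HEAP or NON_HEAP) followed by its name. */
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            // getType() tells whether the pool belongs to the heap or not;
            // getName() is collector-specific (names vary between JVMs).
            lines.add(pool.getType().name() + "  " + pool.getName());
        }
        return lines;
    }

    public static void main(String[] args) {
        describePools().forEach(System.out::println);
    }
}
```

Running it with different collectors is a quick way to see how the young and old generations are named on a given JVM.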

How the Garbage Collector works

The JVM is made up of several memory areas, and its operation can be likened to a system of communicating vessels: objects created in a first area are gradually shifted to the next one whenever an area reaches its maximum size. The mechanism that drives this is called the Garbage Collector. We can distinguish 2 types of garbage collection: minor and major.

Minor garbage collector

When a new object is allocated on the heap, the JVM creates it in the Eden area. When this area reaches its maximum size, the JVM's Garbage Collector launches a so-called minor collection. It walks through all the objects in Eden and flags them according to 2 criteria:

Live objects: objects that are still referenced by other objects.

Orphaned objects (orphans): objects that are no longer referenced.

The minor garbage collector then removes the orphans and moves the live objects from the Eden area to the From area. This operation is repeated each time the Eden area is full. If the From area is full, objects are moved to the To area.
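To make the flag-then-move idea concrete, here is a toy simulation in plain Java. It is only a sketch of the principle, not how HotSpot is implemented: the names `MinorGcSketch`, `Obj`, `minorCollect` and `demo` are invented for the example, areas are modelled as simple lists, and "live" is taken to mean "reachable from a set of root references", which is an assumption the text above does not spell out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of a minor collection; all names here are hypothetical.
class MinorGcSketch {

    /** A toy object: a name plus references to other toy objects. */
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        Obj(String name) { this.name = name; }
    }

    /**
     * Flags every object reachable from the roots as live, moves the live
     * objects found in eden to a new survivor list, then empties eden
     * (which drops the orphans in one step).
     */
    static List<Obj> minorCollect(List<Obj> eden, List<Obj> roots) {
        Set<Obj> live = new HashSet<>();
        Deque<Obj> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Obj current = pending.pop();
            if (live.add(current)) {          // first visit only
                pending.addAll(current.refs); // follow outgoing references
            }
        }
        List<Obj> survivor = new ArrayList<>();
        for (Obj candidate : eden) {
            if (live.contains(candidate)) {
                survivor.add(candidate);
            }
        }
        eden.clear(); // eden is considered free again
        return survivor;
    }

    /** Builds a tiny object graph and returns the names of the survivors. */
    static List<String> demo() {
        Obj a = new Obj("a");
        Obj b = new Obj("b");
        Obj c = new Obj("c");        // never referenced: an orphan
        a.refs.add(b);               // a keeps b alive
        List<Obj> eden = new ArrayList<>(Arrays.asList(a, b, c));
        List<Obj> roots = Arrays.asList(a);
        List<String> names = new ArrayList<>();
        for (Obj o : minorCollect(eden, roots)) {
            names.add(o.name);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [a, b]
    }
}
```

Note that the orphan `c` is never touched individually: it disappears simply because Eden is emptied once the live objects have been moved out.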

The major advantage of this algorithm lies in its execution speed (1/100 to 1/10 of a second), because memory is not freed object by object. In practice, the JVM just changes a pointer to each live object. After a minor collection, Eden and the survivor space that has just been emptied are considered free. The copy itself is made cheap by a characteristic of current JVMs: the whole HEAP is a single contiguous segment of memory. Over successive minor collections, the objects that stay alive are moved from one survivor space to the other.

Major garbage collector (Full GC)

We have seen that minor garbage collections reorganize the storage areas of the Young Generation, which fills up over time. When it is saturated, the JVM needs to free memory. It then invokes a major garbage collection, called a Full GC. This operation is very expensive; we'll see why.

The JVM stops all running threads in order to analyze the whole memory. It determines which objects must be deleted (orphans) and which must survive. It then copies the survivors from the Young area to the Old (or tenured) area.

This is called a "stop the world collection" because it needs to stop everything. It is very expensive (up to several seconds), in part because the sweep operations involved (delete / copy) are processor-intensive.
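One way to see how often collections happen and how much time they cost is to query the JVM's garbage collector beans. The sketch below relies on the standard `GarbageCollectorMXBean` interface; the class `GcStats` and its methods `churn` and `report` are hypothetical names chosen for this example. Collector names differ between JVMs and collectors, and the counters may read 0 if the workload is too small to trigger a collection (or if the JIT optimizes the allocations away).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper class, not part of any library.
class GcStats {

    /** Allocates `count` short-lived 1 KB arrays and returns the total bytes requested. */
    static long churn(int count) {
        long total = 0;
        for (int i = 0; i < count; i++) {
            byte[] shortLived = new byte[1024]; // becomes garbage right away
            total += shortLived.length;
        }
        return total;
    }

    /** One line per collector: name, number of collections, accumulated time in ms. */
    static String report() {
        StringBuilder sb = new StringBuilder();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append(gc.getName())
              .append(": ")
              .append(gc.getCollectionCount())   // -1 if the JVM does not report it
              .append(" collections, ")
              .append(gc.getCollectionTime())    // approximate elapsed time in ms
              .append(" ms\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        churn(2_000_000); // put some pressure on the young generation
        System.out.print(report());
    }
}
```

On a generational collector, the report typically shows one entry for young (minor) collections and one for old (major) collections, which makes the difference in cost between the two visible.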

For good performance, it is preferable to configure the JVM to avoid unnecessary copies from the young space to the tenured space.

We shall see in the next section how to optimize the JVM and (especially) our code to avoid these costly copies.
