What is GeoNetwork?

GeoNetwork is a server-side application that lets you maintain a catalogue of georeferenced metadata. In other words: a search portal that lets you view metadata combined with maps.

GeoNetwork logo (the "yoga man")

Based on Free and Open Source Software, it strictly follows the different metadata standards, from INSPIRE to OGC. It implements the CSW interface so that generic clients looking for data can interact with it, and it has built-in harvesters to connect to other servers and populate the catalogue.

This has allowed GeoNetwork to spread to many organizations: for example, the Swiss geoportal, the Brazilian one and the New Zealand one. GeoNetwork is the most widely used open source spatial catalogue in the world, and you can find it in most public administrations that use free and open source software.

The catalogue is deployed on a Java application container (such as Tomcat or Jetty) and runs on top of the Jeeves framework, which is built around an XSLT transformation library. This allows powerful development of interfaces for both humans (HTML) and machines (XML), so GeoNetwork metadata is easily accessible from different platforms.

Recently partially refactored to Spring and AngularJS, GeoNetwork has a REST API and an event hook system that make extensions and customizations easier.

Annotations and Decorators in Java

Annotations (or decorators) on code have become very common. They allow the programmer to add useful extra information about the code or to change how a particular class is compiled or run. In Java they are the extension that enables aspect-oriented programming.

There are three types of annotations, depending on when they are used:

Information for the Compiler

These annotations let you tell the compiler whether to ignore errors and warnings, or what to do with them. If you have worked with a Java IDE (like Eclipse), you have probably used this type of annotation. For example, you can use @Override on a method to indicate that you are overriding a method defined in a parent class.

This annotation is completely optional, but it allows both the compiler and the developer to check that existing functionality from the class hierarchy is indeed being overridden.

For example:

public class Parent {
    public void doSomething() {
        System.out.println("Parent");
    }
}

public class Son extends Parent {
    @Override
    public void doSomething() {
        System.out.println("Son");
    }
}
Compile-time and deployment-time processing

These annotations allow the compiler to add extra information about how to generate the code. By adding or modifying functionality relative to what is in the source code, they can alter how a class behaves, create new classes (based on a descriptor file), and so on.

These annotations are only visible at that stage: depending on their retention policy they are either discarded by the compiler or written to the .class file without being loaded by the JVM, so they are not available at runtime.
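
As a minimal sketch of that visibility rule (the annotation names here are only illustrative), the retention policy is what controls it:

import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Discarded by the compiler: only tools working on the source code
// (e.g. annotation processors / code generators) can see it.
@Retention(RetentionPolicy.SOURCE)
@interface SourceOnly {
}

// Recorded in the .class file but not loaded by the JVM,
// so it is invisible to reflection at runtime.
@Retention(RetentionPolicy.CLASS)
@interface ClassOnly {
}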

Runtime Annotations

You can use these annotations at runtime, and they work in a very similar way to an interface.

Let's see an example of how to create a runtime annotation and how we can use it. The annotation, named MyAnnotation, can be applied to elements of type field:

import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
public @interface MyAnnotation {
}

Now we can create a class annotated with this annotation:

public class MyObject {
    @MyAnnotation
    public String field;
}

This way, anywhere in the code, we can use reflection to check whether an object has an annotated field:

import java.lang.reflect.Field;

MyObject obj = new MyObject();
Class<?> res = obj.getClass();
for (Field f : res.getFields()) {
    if (f.isAnnotationPresent(MyAnnotation.class)) {
        System.out.println("OK");
    }
}


High Concurrency

When working on high-concurrency applications, we often run into a number of generic problems. In this article I will focus on resource problems (CPU and memory) and on the most typical and most direct solutions.

When we discover threads and the advantages of parallel processing, we can end up abusing them. With a lot of threads (100? 1000?) running simultaneously, the processor keeps jumping from one to another without letting them finish, no matter how fast their real execution is, and over time there are more and more threads, only slowing the process down. On top of the execution cost of each thread, we must also consider the added cost of creating and destroying threads, which can become significant when we are talking about so many threads at once.

High Concurrency with the Thread Pool Pattern

Threads: the holy grail

In this case, the first approach that comes to mind is the Thread Pool pattern, which limits the number of threads running at the same time. Instead of creating new threads, we create tasks, which are queued, and a pool of threads picks them up and runs them as soon as possible. A classic example of this pattern can be found in SwingWorker. If we want to implement our own version by hand, we should take a look at the ExecutorService interface.
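
As a minimal sketch of the pattern (the printed message is just a placeholder for real work), a fixed-size pool from ExecutorService queues the submitted tasks and runs them on a bounded number of threads:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class PoolExample {
    public static void main(String[] args) throws InterruptedException {
        // Only 4 threads will ever run at the same time.
        ExecutorService pool = Executors.newFixedThreadPool(4);

        // Tasks beyond the pool size are queued, not given new threads.
        for (int i = 0; i < 100; i++) {
            final int id = i;
            pool.submit(() ->
                System.out.println("Task " + id + " on " + Thread.currentThread().getName()));
        }

        // Stop accepting new tasks and wait for the queued ones to finish.
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
    }
}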

If we have a background thread that is making heavy use of the processor, but we do not mind slowing it down, we can use sleep (Thread.sleep(...)) to periodically release the processor, allowing other threads to run faster.
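
A minimal sketch of that idea, where the heavy processing step is just a placeholder comment:

// Background worker that voluntarily yields the processor now and then.
Thread background = new Thread(() -> {
    while (!Thread.currentThread().isInterrupted()) {
        // ... heavy processing step here ...
        try {
            Thread.sleep(50); // release the CPU so other threads can run
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the flag and stop
            break;
        }
    }
});
background.start();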

This is useful for threads running in maintenance mode, which must keep running but do not have to respond in real time. Another way to temporarily stop a running thread while another one works is the join method (Thread.join()), which makes one thread wait until another thread ends. Although this is more useful when one thread has a clearly higher priority than another, it is not viable if the lower-priority thread cannot get a reference to the higher-priority one to know which thread it has to wait for.
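
For example, assuming the maintenance thread does hold a reference to the more important thread, it can wait for it like this (both thread bodies are placeholders):

Thread important = new Thread(() -> {
    // ... high-priority work ...
});
Thread maintenance = new Thread(() -> {
    try {
        important.join(); // wait here until the important thread finishes
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
    }
    // ... low-priority maintenance work ...
});
important.start();
maintenance.start();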

High Concurrency issues

But high concurrency is not only a matter of processor usage. It may be that multiple threads need access to large amounts of information almost simultaneously. These threads will not only be duplicating the information in memory, but will often be repeating the whole process of extracting that information.

This problem is usually solved by most data access libraries (mainly database ones). For example, we have Ehcache, which applies the Thread-Specific Storage pattern to store information. This way, access to and storage of this information is shared, decreasing both the memory usage and the processor time required to extract and shape the information. As threads need to process this information, they ask Ehcache for the data, and the cache optimizes those hits.

To improve on this solution, we have the concurrent collections. These allow different threads to use the same objects without concurrency problems.
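
A minimal sketch using a concurrent collection from the JDK; the loadFromDatabase method is a hypothetical stand-in for the expensive extraction:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class SharedCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    // Many threads can call this at once; the value for a given key is
    // computed only when it is missing, and every caller shares the result.
    public String get(String key) {
        return cache.computeIfAbsent(key, k -> loadFromDatabase(k));
    }

    // Hypothetical expensive extraction of the data.
    private String loadFromDatabase(String key) {
        return "value-for-" + key;
    }
}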

There are more ways to improve high-concurrency handling (without going into optimizations of the code itself), but the ones described here are usually a good starting point.
