The other day, a friend of mine asked me, “What is the difference between a Java process and a thread?” My instant reply was that a thread is a lightweight process, but of course that was not a complete answer. After some googling, I was able to reach a logical conclusion. Here are my study notes capturing that information.
When we start an application (say, our old friend Tomcat), one instance of the JVM starts. In most cases, an instance of the JVM runs as a single process. If multiple Java applications run at the same time, each starts its own JVM instance and hence its own process. Of course, this is a much simplified version of the story. In reality, an application may run as a set of processes communicating with each other through inter-process communication (IPC) mechanisms, but as end users we get the feeling that only one process is being executed.
Now, let’s have a look at the definition of a process. As per Sun’s Java Tutorial, a process is an independent, self-contained entity to which the system allocates a private set of resources: CPU time and, importantly, memory. As mentioned earlier, each instance of the JVM runs as a single process and hence has its own memory (a separate heap, method area, etc., which do not overlap with those of other JVM instances running on the system at the same time).
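To make the "one JVM, one process" idea a little more concrete, here is a minimal sketch that asks the running JVM for its own operating system process ID and heap limit. It assumes Java 9 or later (for `ProcessHandle`); the class and method names are made up for illustration. Starting the program twice at the same time should show two different process IDs, one per JVM instance.

```java
class ProcessInfoDemo {
    // ID of the operating system process that hosts this JVM instance.
    static long currentPid() {
        return ProcessHandle.current().pid();
    }

    // Upper bound, in bytes, of the heap this JVM instance will try to use.
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    public static void main(String[] args) {
        System.out.println("JVM process id : " + currentPid());
        System.out.println("Max heap bytes : " + maxHeapBytes());
    }
}
```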
So, what about a thread? A thread can be called a lightweight process, i.e. it does not require as many resources as a process. A process contains one or more threads, and all the threads of a process share its common set of resources (including memory). In the context of the JVM, all threads share the same heap and method area, but each has its own stack. And of course, communication between threads is cheaper than IPC. Thus, an application has at least one process, and each process has at least one thread. That thread may in turn start several other threads at runtime.
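The shared heap and the per-thread stacks can be seen in a small sketch like the one below. The class name, method name, and thread counts are hypothetical, chosen only for illustration. The counter object lives on the heap and is visible to every thread, while the variable `local` lives on each worker's own stack.

```java
import java.util.concurrent.atomic.AtomicInteger;

class SharedHeapDemo {
    // Lives on the heap, so every thread in this process sees the same object.
    static final AtomicInteger sharedCounter = new AtomicInteger();

    static int runWorkers(int threadCount, int incrementsPerThread) {
        sharedCounter.set(0);
        Thread[] workers = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            workers[i] = new Thread(() -> {
                // 'local' lives on this thread's own stack; no other thread can see it.
                int local = 0;
                while (local < incrementsPerThread) {
                    local++;
                    sharedCounter.incrementAndGet(); // atomic update of shared heap state
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join(); // wait until every worker has finished
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sharedCounter.get();
    }

    public static void main(String[] args) {
        // Four workers, 1000 increments each, all landing on the same counter.
        System.out.println("Total increments: " + runWorkers(4, 1000));
    }
}
```

No IPC is needed here: the workers communicate simply by touching the same object, which is why communication between threads is so much cheaper than communication between processes.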
Now it is time to clear up one important doubt. Although each process may contain multiple threads, on a single-processor system only one thread, and hence only one process, can be executing at any point in time. (As Java developers, we can say that at any instant there may be multiple “runnable” threads but only one “running” thread.)
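One caveat when looking at this from Java code: `Thread.State` has no separate "running" value. A thread reported as `RUNNABLE` may be executing right now or may be waiting for the operating system to give it a processor. The sketch below (hypothetical class and method names, Java 9 or later assumed for `Thread.onSpinWait`) samples a thread's state before it starts, while it is busy, and after it ends.

```java
class ThreadStateDemo {
    // Written by the main thread, read by the worker; volatile makes the write visible.
    static volatile boolean stop;

    static Thread.State[] observeStates() {
        stop = false;
        Thread worker = new Thread(() -> {
            while (!stop) {
                Thread.onSpinWait(); // keep the thread busy without blocking
            }
        });
        Thread.State beforeStart = worker.getState();   // NEW: not started yet
        worker.start();
        Thread.State whileSpinning = worker.getState(); // expected to be RUNNABLE
        stop = true;
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        Thread.State afterJoin = worker.getState();     // TERMINATED: run() has returned
        return new Thread.State[] { beforeStart, whileSpinning, afterJoin };
    }

    public static void main(String[] args) {
        for (Thread.State state : observeStates()) {
            System.out.println(state);
        }
    }
}
```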
But why? To answer this question, we need a very basic idea of the JVM. The JVM can be divided into two logical entities: the class loader subsystem and the execution engine. The class loader subsystem loads the class files (i.e. imports the bytecode from them), whereas the execution engine executes the instructions embedded in the bytecode (which are nothing but the code we write for the application). Here comes the BIG surprise: each thread of a running application is nothing but an instance of the execution engine. Throughout its life, a thread executes bytecode. This holds for the threads that run application code; it does not apply to the JVM’s internal threads (e.g. the garbage collection threads), which are invisible to the running application. And, not surprisingly, on a single-processor system (with a single core), only one instruction can be executed at any point in time.
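To get a rough feel for this limit on your own machine, the following sketch (hypothetical class and method names) compares the number of processors available to the JVM with the number of live threads it can see. The thread count covers only the threads visible to Java code, so the JVM's internal threads are not necessarily included.

```java
class ProcessorCountDemo {
    // Number of processors (cores) the JVM may schedule threads on.
    static int availableCores() {
        return Runtime.getRuntime().availableProcessors();
    }

    // Number of live threads visible to Java code in this JVM instance.
    static int liveThreads() {
        return Thread.getAllStackTraces().size();
    }

    public static void main(String[] args) {
        System.out.println("Available processors: " + availableCores());
        System.out.println("Live Java threads   : " + liveThreads());
    }
}
```

On a machine that reports a single processor, at most one of those threads can actually be executing at any given instant; the rest are waiting for their turn.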