What is Thread in Java?
Ans) In Java, “thread” means two different things:
- An instance of class java.lang.Thread.
- A thread of execution.
An instance of Thread is just…an object. Like any other object in Java, it has variables and methods, and lives and dies on the heap.
But a thread of execution is an individual “lightweight” process that has a separate path of execution. It is called a separate path of execution because each thread runs on its own call stack.
In Java, there is one thread per call stack—or, to think of it in reverse, one call stack per thread. Even if you don’t create any new threads in your program, threads are back there running.
Threads are a way to take advantage of the multiple CPUs available in a machine. By employing multiple threads you can speed up CPU-bound tasks. Java provides excellent support for multi-threading at the language level, and it is also one of Java’s strong selling points.
The main() method, that starts the whole ball rolling, runs in one thread, called (surprisingly) the main thread. If you looked at the main call stack (and you can, any time you get a stack trace from something that happens after main begins, but not within another thread), you’d see that main() is the first method on the stack— the method at the bottom. But as soon as you create a new thread, a new stack materializes and methods called from that thread run in a call stack that’s separate from the main() call stack.
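To illustrate, here is a minimal sketch (the class and thread names are invented for the example) showing that `start()` gives the new thread its own call stack, separate from the one `main()` runs on:

```java
public class TwoStacksDemo {
    static String workerThreadName;

    public static void main(String[] args) throws InterruptedException {
        // main() sits at the bottom of the main thread's call stack
        System.out.println("main() runs in: " + Thread.currentThread().getName());

        // start() creates a brand-new call stack; the run() method of the
        // Runnable is the first method at the bottom of that new stack
        Thread worker = new Thread(
                () -> workerThreadName = Thread.currentThread().getName(),
                "worker-1");
        worker.start();
        worker.join(); // wait for the second call stack to unwind completely
        System.out.println("worker ran in: " + workerThreadName);
    }
}
```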
Explain thread scheduling ?
Operating systems maintain a priority queue, and all the threads waiting to access the processor are placed in that queue. The first thread in the queue, or a thread with a higher priority than the others (usually system threads have higher priority), gets the first chance to use the processor. After a specific time, the operating system saves the current state of the executing thread into memory, puts it back into the queue, and selects the next thread for execution. This cycle continues for every thread until it completes its assigned task. When a thread gets its next chance to execute, the operating system restores its last known state and the thread continues from where it left off. Once a thread's assigned task has been completed, or if the thread has been terminated by the operating system, the thread is permanently removed from the queue and from memory. Sometimes a thread also needs to wait for an event before it can continue.
What is a scheduler?
A scheduler is the implementation of a scheduling algorithm that manages access of processes and threads to some limited resource like the processor or some I/O channel. The goal of most scheduling algorithms is to provide some kind of load balancing for the available processes/threads that guarantees that each process/thread gets an appropriate time frame to access the requested resource exclusively.
What is the difference between preemptive scheduling and time slicing?
Under preemptive scheduling, the highest priority task executes until it enters the waiting or dead states or a higher priority task comes into existence. Under time slicing, a task executes for a predefined slice of time and then reenters the pool of ready tasks. The scheduler then determines which task should execute next, based on priority and other factors.
What do we understand by the term concurrency?
Concurrency is the ability of a program to execute several computations simultaneously. This can be achieved by distributing the computations over the available CPU cores of a machine or even over different machines within the same network.
What is difference between thread and process?
A process is an execution environment provided by the operating system that has its own set of private resources (e.g. memory, open files, etc.). Threads, in contrast to processes, live within a process and share their resources (memory, open files, etc.) with the other threads of the process. The ability to share resources between different threads makes threads more suitable for tasks where performance is a significant requirement.
To word it differently, a process corresponds to a running Java Virtual Machine (JVM), whereas threads live within the JVM and can be created and stopped by the Java application dynamically at runtime.
- Threads share the address space of the process that created them; processes have their own address space.
- Threads have direct access to the data segment of their process; processes have their own copy of the data segment of the parent process.
- Threads can directly communicate with other threads of their process; processes must use inter-process communication to communicate with sibling processes.
- Threads have almost no overhead; processes have considerable overhead.
- New threads are easily created; new processes require duplication of the parent process.
- Threads can exercise considerable control over threads of the same process; processes can only exercise control over child processes.
- Changes to the main thread (cancellation, priority change, etc.) may affect the behavior of the other threads of the process; changes to the parent process do not affect child processes.
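The resource-sharing point can be shown with a small sketch (names are illustrative): two threads update the same heap object directly, something two processes could only achieve through inter-process communication:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class SharedHeapDemo {
    // One object on the heap, visible to every thread in the process;
    // two separate processes would each get their own copy instead
    static final AtomicInteger shared = new AtomicInteger(0);

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> {
            for (int i = 0; i < 1000; i++) {
                shared.incrementAndGet(); // no inter-process communication needed
            }
        };
        Thread a = new Thread(task);
        Thread b = new Thread(task);
        a.start(); b.start();
        a.join(); b.join();
        System.out.println(shared.get()); // 2000: both threads updated the same object
    }
}
```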
What are the advantages or usage of threads?
Threads support concurrent operations. For example,
• Multiple client requests on a server can each be handled by an individual thread.
• Long computations or high-latency disk and network operations can be handled in the background without disturbing foreground computations or screen updates.
Threads often result in simpler programs.
• In sequential programming, updating multiple displays normally requires a big while-loop that performs small parts of each display update. Unfortunately, this loop basically simulates an operating system scheduler. In Java, each view can be assigned a thread to provide continuous updates.
• Programs that need to respond to user-initiated events can set up service routines to handle the events without having to insert code in the main routine to look for these events. For example, you may have seen modern antivirus software show a pop-up asking whether to scan a newly detected pen drive. If a thread wants to execute a specific action on USB arrival, it does not need to check for USB drives continuously; instead, it can ask the operating system to notify it when a new USB drive is detected, and until then the thread can wait in a special memory area called the waiting pool.
Threads provide a high degree of control.
• Imagine launching a complex computation that occasionally takes longer than is satisfactory. A “watchdog” thread can be activated that will “kill” the computation if it becomes costly, perhaps in favor of an alternate, approximate solution. Note that sequential programs must muddy the computation with termination code, whereas a Java program can use thread control to non-intrusively supervise any operation.
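One possible sketch of such a watchdog uses `Future.get` with a timeout from `java.util.concurrent`; the deadline, the sleep that stands in for the computation, and the fallback value are all arbitrary choices for the example:

```java
import java.util.concurrent.*;

public class WatchdogDemo {
    static String compute() throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Future<String> result = pool.submit(() -> {
            Thread.sleep(5_000);           // stands in for a long-running computation
            return "exact answer";
        });
        try {
            // The "watchdog": wait at most 100 ms for the result
            return result.get(100, TimeUnit.MILLISECONDS);
        } catch (TimeoutException tooSlow) {
            result.cancel(true);           // interrupt ("kill") the slow computation
            return "approximate answer";   // fall back to a cheaper alternative
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(compute()); // approximate answer
    }
}
```

Unlike the sequential version, the computation itself contains no termination code; supervision lives entirely outside it.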
Threaded applications exploit parallelism.
• A computer with multiple CPUs can literally execute multiple threads on different functional units without having to simulate multi-tasking (“time sharing”).
• On some computers, one CPU handles the display while another handles computations or database accesses, thus, providing extremely fast user interface response times.
How many threads does a Java program have at least?
Each Java program is executed within the main thread; hence each Java application has at least one thread.
How will you go about identifying and debugging Java concurrency issues like thread starvation, deadlock, and contention?
Firstly, concurrency issues mainly surface under load. So, you need to write JMeter scripts to reproduce the load and scenario under which the concurrency issues surface.
Debugging concurrency issues and fixing any thread starvation, deadlock, and contention requires skill and experience to identify and reproduce these hard-to-resolve issues. Here are some techniques to detect concurrency issues.
- Manually reviewing the code for any obvious thread-safety issues. There are static analysis tools like Sonar, ThreadCheck, etc. for catching concurrency bugs at compile time by analyzing the bytecode.
- List all possible causes and add extensive log statements and write test cases to prove or disprove your theories.
- Thread dumps are very useful for diagnosing synchronization problems such as deadlocks. The trick is to take 5 or 6 sets of thread dumps at an interval of 5 seconds between each to have a log file that has 25 to 30 seconds worth of runtime action. For thread dumps, use kill -3 in Unix and CTRL+BREAK in Windows. There are tools like Thread Dump Analyzer (TDA), Samurai, etc. to derive useful information from the thread dumps to find where the problem is. For example, Samurai colors idle threads in grey, blocked threads in red, and running threads in green. You must pay more attention to those red threads.
- There are tools like JDB (i.e. the Java DeBugger) where a “watch” can be set up on a suspected variable. Whenever the application modifies that variable, a thread dump will be printed.
- There are dynamic analysis tools like jstack and JConsole; JConsole is a JMX-compliant GUI tool that can take a thread dump on the fly. The JConsole GUI has handy features like a “detect deadlock” button to perform deadlock detection and the ability to inspect threads and objects in error states. Similar tools are available for other languages as well.
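The deadlock detection behind JConsole's “detect deadlock” button is also available programmatically through `ThreadMXBean`. The sketch below (thread and lock names are invented for the example) deliberately creates a two-thread deadlock and then detects it; pointing jstack at such a hung application would reveal the same lock cycle in its thread dump:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

public class DeadlockDemo {
    static int countDeadlocked() throws InterruptedException {
        final Object lockA = new Object();
        final Object lockB = new Object();
        // Ensures both threads hold their first lock before trying the second,
        // so the deadlock forms reliably rather than by chance
        final CountDownLatch bothHoldFirstLock = new CountDownLatch(2);

        Thread t1 = new Thread(() -> {
            synchronized (lockA) {
                bothHoldFirstLock.countDown();
                awaitQuietly(bothHoldFirstLock);
                synchronized (lockB) { } // blocks forever: t2 holds lockB
            }
        }, "t1");
        Thread t2 = new Thread(() -> {
            synchronized (lockB) {
                bothHoldFirstLock.countDown();
                awaitQuietly(bothHoldFirstLock);
                synchronized (lockA) { } // blocks forever: t1 holds lockA
            }
        }, "t2");
        t1.setDaemon(true); // daemon threads let the JVM exit despite the deadlock
        t2.setDaemon(true);
        t1.start(); t2.start();
        Thread.sleep(500);  // give the deadlock time to form

        // The same JMX API JConsole uses for its "detect deadlock" button
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.findDeadlockedThreads();
        return ids == null ? 0 : ids.length;
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try { latch.await(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(countDeadlocked() + " threads deadlocked");
    }
}
```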
What are some of the best practices to keep in mind relating to writing concurrent programs?
- Favor immutable objects as they are inherently thread-safe.
- If you need to use mutable objects, and share them among threads, then a key element of thread-safety is locking access to shared data while it is being operated on by a thread. For example, in Java you can use the synchronized keyword.
- Generally, try to keep your locking to as short a duration as possible to minimize thread contention when you have many threads running. Putting a big, fat lock right at the start of a function and unlocking it at the end is acceptable for functions that are rarely called, but can adversely impact performance for frequently called functions. Putting one or more narrower locks inside the function, around only the data that actually needs protection, is a finer-grained approach that works better than the coarse-grained approach, especially when there are only a few places in the function that actually need protection and larger areas that are thread-safe and can be executed concurrently.
- Use proven concurrency libraries (e.g. java.util.concurrent) as opposed to writing your own. Well-written concurrency libraries provide concurrent access for reads while restricting concurrent writes.
- Favor “optimistic concurrency control” over pessimistic concurrency control.
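The locking-scope advice above can be sketched as follows (a minimal illustration; the method names and the placeholder work are invented for the example):

```java
public class FineGrainedLocking {
    private int counter = 0;

    // Coarse-grained: the entire method is serialized, including work
    // that is already thread-safe and could have run concurrently
    public synchronized void coarse() {
        expensiveThreadSafeWork();
        counter++;
    }

    // Fine-grained: the lock is held only while shared state is touched
    public void fine() {
        expensiveThreadSafeWork();   // many threads can run this part at once
        synchronized (this) {
            counter++;               // the only statement that needs exclusion
        }
    }

    public synchronized int count() {
        return counter;
    }

    private void expensiveThreadSafeWork() {
        // stands in for work that touches no shared mutable state
    }
}
```

Both methods are correct; the fine-grained version simply spends far less time holding the lock, which matters when the method is called frequently from many threads.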
What are common production issues, and how will you go about resolving them?
There could be general run-time production issues that either slow down a system or make it hang. In these situations, the general approach for troubleshooting would be to analyze the thread dumps to isolate the threads which are causing the slow-down or hang. For example, a Java thread dump gives you a snapshot of all threads running inside a Java Virtual Machine. There are graphical tools like Samurai to help you analyze the thread dumps more effectively.
- Application seems to consume 100% CPU and throughput has drastically reduced – Get a series of thread dumps, say 7 to 10, at a particular interval, say 5 to 8 seconds, and analyze them by closely inspecting the “runnable” threads to check whether a particular thread is progressing. If a particular thread is executing the same method across all the thread dumps, then that method could be the root cause, and you can continue your investigation by inspecting the code.
- Application consumes very little CPU and response times are very poor due to heavy I/O operations like file or database reads/writes – Get a series of thread dumps and inspect for threads in the “blocked” state. This analysis can also be used where the application server hangs because it has run out of runnable threads due to a deadlock, or because a thread is holding a lock on an object and never releasing it while other threads wait for the same lock.
The solutions to the above problems can vary widely, from fixing the thread-safety issue(s) to reducing the synchronization granularity, and from implementing appropriate caching strategies to setting appropriate connection timeouts, as discussed under the performance and scalability key areas.
Can you explain the terms “optimistic concurrency control” and “pessimistic concurrency control”?
Most web applications allow a user to query a database, retrieve a local copy of the queried data, make changes to that local copy, and then finally send the updates and the unchanged values back to the database. When two or more users are concurrently updating the same row in the database, it is possible to cause “lost update” issues. There are 2 ways to handle concurrent updates causing “lost update issues”.
1. Pessimistic concurrency control involves pessimistically locking the database row(s) with the appropriate database isolation levels or “select … for update nowait” SQL. Even when there is no concurrent access, the row(s) will be locked, which can adversely impact performance and scalability.
2. Optimistic concurrency control assumes that concurrent updates occur very rarely, and deals with concurrent updates when they occur by detecting them and prompting the users to retry via error messages like “Another user has changed the data since your last request. Please try again.”. Version numbers or timestamps are used to detect concurrent updates. In optimistic concurrency control, the SQL query will be something like “update … where id = ? and timestamp = ?” or “update … where id = ? and version = ?”. Hibernate provides support for optimistic concurrency control with “version” or “timestamp” columns.
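As an in-memory analogue of the versioned SQL update (this is not Hibernate code; the `Row` class and method names are invented for illustration), a compare-and-set succeeds only when the row has not changed since it was read:

```java
import java.util.concurrent.atomic.AtomicReference;

public class OptimisticUpdateDemo {
    // In-memory stand-in for a database row with a "version" column
    static final class Row {
        final String value;
        final int version;
        Row(String value, int version) { this.value = value; this.version = version; }
    }

    // Analogue of "update ... where id = ? and version = ?": the write
    // succeeds only if nobody changed the row since `copy` was read
    static boolean tryUpdate(AtomicReference<Row> row, Row copy, String newValue) {
        return row.compareAndSet(copy, new Row(newValue, copy.version + 1));
    }

    public static void main(String[] args) {
        AtomicReference<Row> row = new AtomicReference<>(new Row("old", 1));
        Row copy = row.get();                              // each "user" reads a local copy

        System.out.println(tryUpdate(row, copy, "new"));   // true: first writer wins
        System.out.println(tryUpdate(row, copy, "other")); // false: stale copy detected, so
        // the second user would see "Another user has changed the data. Please try again."
    }
}
```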
What are the 3 common causes of concurrency issues in a multi-threaded application?
1) Atomicity means an operation will either be completed or not done at all. Other threads will not be able to see the operation “in progress” — it will never be viewed in a partially complete state.
2) Visibility determines when the effects of one thread can be seen by another.
3) Ordering determines when actions in one thread can be seen to occur out of order with respect to another.
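Visibility, the second cause above, is the classic use case for `volatile`. A minimal sketch (the timings are arbitrary choices for the example): without `volatile` on the flag, the worker thread might cache the stale value and never observe the write from the main thread:

```java
public class VolatileFlagDemo {
    // volatile guarantees the worker sees main's write promptly; without it,
    // the JIT could cache `stop` in a register and spin forever (a visibility bug)
    static volatile boolean stop = false;

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            while (!stop) {
                // spin until main publishes stop = true
            }
            System.out.println("worker observed the stop flag");
        });
        worker.start();
        Thread.sleep(50);   // let the worker spin briefly
        stop = true;        // this write is immediately visible to the worker
        worker.join();      // terminates promptly thanks to volatile
    }
}
```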
Concurrency issues in Java can be fixed in a number of different ways:
1) Carefully using the synchronized keyword at block level, method level, or class level.
2) Using the combination of volatile variables and block-level synchronization.
3) Using the Atomic classes like AtomicInteger, as atomic classes are inherently thread-safe.
4) Favoring immutable objects as they are inherently thread-safe. Once constructed, immutable objects cannot be modified.
5) Using explicit locks and the concurrent utility classes like ConcurrentHashMap, Semaphore, CountDownLatch, CyclicBarrier, etc. provided by the java.util.concurrent package.
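A short sketch combining a few of these building blocks from `java.util.concurrent` (the map key, thread count, and iteration count are arbitrary): `ConcurrentHashMap.merge` performs an atomic update without any explicit synchronized block, and a `CountDownLatch` coordinates completion of the workers:

```java
import java.util.concurrent.*;

public class ConcurrentUtilitiesDemo {
    static int countHits() throws InterruptedException {
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        CountDownLatch done = new CountDownLatch(4);
        ExecutorService pool = Executors.newFixedThreadPool(4);

        for (int i = 0; i < 4; i++) {
            pool.submit(() -> {
                for (int j = 0; j < 1000; j++) {
                    counts.merge("hits", 1, Integer::sum); // atomic read-modify-write
                }
                done.countDown(); // signal this worker's completion
            });
        }
        done.await();   // block until all four workers have finished
        pool.shutdown();
        return counts.get("hits");
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(countHits()); // 4000
    }
}
```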