
Chapter 5 - Introduction to Java


Introduction to Java

 

"The Java platform is a fundamentally new way of computing, based on the power of networks and the idea that the same software should run on many different kinds of computers, consumer gadgets, and other devices"

 
 --SUN Microsystems

Java was initially developed at Sun Microsystems (now owned by Oracle Corporation) by a team of software engineers led by James Gosling. In 1991 the team was investigating the development of a hardware-independent software platform using C++. The aim became to develop an alternative to C++ for implementing control systems in consumer electronic devices.

C++ was found to be unsatisfactory for this task in many ways and was dropped in favour of a new language called Oak. This new language, subsequently renamed Java, was a powerful yet straightforward language waiting for a new application.

In 1994, as the World-Wide Web was emerging, the Sun developers used Java as the basis of a web browser, beginning the Java/HotJava project. HotJava is a WWW browser developed by Sun to demonstrate the power of the Java programming language. The name Java was chosen during a brainstorming session and was derived from the name of the programmers' favourite coffee. Java was well suited for use in WWW applications because its program code is compact and platform neutral, and it could be used to create small programs called applets that could be embedded in WWW pages.

In late 1995, Java (beta 2) was released, along with the announcement of JavaScript by Sun Microsystems and Netscape Communications Corporation. Support continued to grow, and in late 1995 both Microsoft and IBM requested licensing rights from Sun. In early 1996, Java 1.0 was officially released on the Internet, along with JavaScript.

The Java Life-cycle

As you are aware, computer programs are simply lists of instructions to be carried out by the six different logical units of a computer:

  • Input Unit: receives data from input devices such as the keyboard, mouse and any other peripherals.

  • Memory Unit: the primary memory unit (Random Access Memory (RAM)) provides fast-access storage for computer programs, data from the input devices and data to be sent to the output devices.

  • Arithmetic and Logic Unit (ALU): performs the arithmetic calculations on data in memory, such as addition, subtraction, multiplication, division and comparison.

  • Central Processing Unit (CPU): manages the other units, e.g. by sending messages to the input unit to read data into the memory unit and by informing the ALU which data to operate on.

  • Storage Unit: stores and reads data and programs in long-term storage (e.g. a hard disk drive) to be used at a later time.

  • Output Unit: sends information from the computer to make it available outside of the computer, e.g. to a printer or network device.

Computer programmers write programs to interact with these logical units, either in a form that is directly comprehended by the computer or in a form that is comprehended after some translation step. Code that is directly understood by a computer is called machine code. This language is dependent on the exact type of machine that you are working on, i.e. it will differ if you are working on an Intel PC or an Apple Macintosh. This is the lowest level of code, where an operation to load a number into a particular address could look like: 10110 1100 1011 1001 0000. As it is virtually impossible for humans to write programs in binary code (or decimal/hexadecimal), a higher-level language is required.

Assembly language improves on this situation as we use sequences of mnemonics to describe the operations that we wish to carry out. For example, MOV AX,[VarX] would load the accumulator with the value contained within the variable VarX. This code is much clearer to humans, but it still must be translated into machine code so that the computer can understand it.

At the level of assembly language it is possible to write complex computer programs, but they must still be written as low-level instructions. High-level languages were developed to allow a single statement to carry out many tasks. Translator programs called compilers then convert the high-level language into machine code. C, C++ and Java are all examples of high-level languages. Large programs can take significant time to compile from the high-level language form into the low-level machine code form. An alternative is to use interpreters: programs that execute high-level code directly by translating instructions on demand. Interpreted programs do not require compilation time, but they execute much more slowly.

Java programs exist in the form of compiled bytecode, which is similar to machine code except that the target platform is the Java Virtual Machine (JVM). A JVM resides within every Java-compatible WWW browser and is also available stand-alone as part of the Java Run-time Environment (JRE).
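One way to see that compiled bytecode is a fixed, platform-independent format is to look at the start of a .class file: every valid class file begins with the four bytes 0xCAFEBABE. The following is a minimal sketch only; the class name ClassFileHeader and the method hasClassFileMagic are illustrative names, and the main method assumes the class was loaded from a .class file on disk or inside a JAR.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

class ClassFileHeader {

    // Every valid .class file starts with this 4-byte magic number.
    static final int MAGIC = 0xCAFEBABE;

    // Returns true if the first four bytes match the class-file magic number.
    static boolean hasClassFileMagic(byte[] data) {
        if (data == null || data.length < 4) {
            return false;
        }
        // Combine the four bytes into one big-endian 32-bit value.
        int first = ((data[0] & 0xFF) << 24)
                  | ((data[1] & 0xFF) << 16)
                  | ((data[2] & 0xFF) << 8)
                  |  (data[3] & 0xFF);
        return first == MAGIC;
    }

    public static void main(String[] args) throws IOException {
        // Try to read this class's own compiled form (assumption: it is
        // available as a resource next to the loaded class).
        try (InputStream in =
                 ClassFileHeader.class.getResourceAsStream("ClassFileHeader.class")) {
            if (in == null) {
                System.out.println("Could not locate ClassFileHeader.class");
                return;
            }
            byte[] header = new byte[4];
            new DataInputStream(in).readFully(header);
            System.out.println("Starts with 0xCAFEBABE: " + hasClassFileMagic(header));
        }
    }
}
```

The same check gives the same answer on any operating system, because the file layout is defined by the JVM specification rather than by the host machine.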

A JVM is, in effect, a bytecode-interpreting machine running on a hardware machine. This interpreting stage has an overhead and slows the execution of Java applications. Java bytecode is extremely compact, allowing it to be easily delivered over a network. In theory, the JVM in each Web browser is built to the same specification, so that one version of the source code, compiled once to bytecode, should be compatible with many platforms, provided that a Java-enabled Web browser exists for each platform. In reality, not all Web browsers implement exactly the same JVM specification, which introduces minor inconsistencies when the same Java applet is viewed using different browsers on different platforms.
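The idea of a "bytecode interpreting machine" can be sketched with a toy example. The code below is not how a real JVM is implemented, and the opcodes PUSH, ADD, MUL and HALT are hypothetical values invented for this illustration (they are not real JVM opcodes). It simply shows the fetch, decode and execute loop that an interpreter performs for each instruction, which is where the interpretation overhead comes from.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class ToyInterpreter {

    // Hypothetical opcodes for illustration; these are NOT real JVM opcodes.
    static final int HALT = 0; // stop and return the value on top of the stack
    static final int PUSH = 1; // push the next value in the program onto the stack
    static final int ADD  = 2; // pop two values, push their sum
    static final int MUL  = 3; // pop two values, push their product

    // Interprets the program one instruction at a time.
    static int run(int[] program) {
        Deque<Integer> stack = new ArrayDeque<>();
        int pc = 0; // program counter: index of the next instruction
        while (true) {
            int opcode = program[pc++];            // fetch
            switch (opcode) {                      // decode
                case PUSH:
                    stack.push(program[pc++]);     // execute
                    break;
                case ADD:
                    stack.push(stack.pop() + stack.pop());
                    break;
                case MUL:
                    stack.push(stack.pop() * stack.pop());
                    break;
                case HALT:
                    return stack.pop();
                default:
                    throw new IllegalArgumentException("Unknown opcode: " + opcode);
            }
        }
    }

    public static void main(String[] args) {
        // Encodes (2 + 3) * 4
        int[] program = {PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT};
        System.out.println(run(program)); // prints 20
    }
}
```

The array of integers plays the role of the bytecode: it is the same on every platform, and only the interpreter itself needs to be built for each kind of hardware.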

The Java application life cycle can be illustrated as in Figure 5.1, “The Java Life Cycle”. We can use any text editor to create the high-level Java text file. This file is saved as a .java file on the disk. We then compile this text file using the Java compiler, which results in a .class file being created on the disk. The .class file contains the bytecodes. The file is then loaded into memory by the class loader. The bytecode verifier confirms that the bytecodes are valid and not hostile. Finally, the JVM reads the bytecodes in memory and translates them into machine code.
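The life cycle described above can be tried with a very small program. This is a minimal sketch: the class name HelloWorld and the helper method greeting are illustrative choices, and the commands in the comments assume that the standard javac and java tools are installed and on the path.

```java
// Step 1: create this text with any editor and save it as HelloWorld.java.
// Step 2: compile it:   javac HelloWorld.java   (creates HelloWorld.class)
// Step 3: run it:       java HelloWorld
//         The class loader loads HelloWorld.class, the bytecode verifier
//         checks it, and the JVM then executes the bytecodes.
class HelloWorld {

    // Returns the message so that it can also be checked separately.
    static String greeting() {
        return "Hello, World!";
    }

    public static void main(String[] args) {
        System.out.println(greeting()); // prints Hello, World!
    }
}
```

Note that only the .class file is needed at run time; the .java source file is the input to the compiler and is not read by the JVM.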

Figure 5.1. The Java Life Cycle


Just-In-Time Compilation (Dynamic Translation)

In early versions of Java, the speed at which an applet executed for the first time was disappointing. The reason is that the applet's intermediate bytecode was interpreted by the Java Virtual Machine rather than compiled to native machine instructions, as is the case for C++, as discussed previously.

One solution to this performance problem lies in Just-In-Time (JIT) Compilation (also called Dynamic Translation). The JIT compiler reads the bytecode of the Java applet and converts it into native machine instructions for the intended operating system, just after the applet is loaded from the disk. This can happen on a file-by-file basis or even on a method-by-method basis, hence the name just-in-time. Once the Java applet is converted into native instructions, the application/applet runs as if it were natively compiled. The JIT compiler can, in certain cases, improve the run-time execution speed of applets by a factor of 5 to 10, while still providing portability by retaining the use of intermediate bytecode. Compiled code is kept in memory until the application terminates. In effect, when using just-in-time compilation, Java applications are compiled twice: once when the source code is translated into platform-portable bytecodes, and again when the bytecode is being executed on that platform.
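The effect of just-in-time compilation can often be observed by timing the same method repeatedly. The sketch below is illustrative only: the class name JitWarmUp, the method sumOfSquares and the loop sizes are arbitrary choices, and the timings depend entirely on the JVM and the machine, so no particular speed-up is guaranteed. On many JVMs the later rounds run faster than the first ones, because the frequently called method has by then been compiled to native instructions.

```java
class JitWarmUp {

    // A simple "hot" method: the sum 1*1 + 2*2 + ... + n*n.
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += (long) i * i;
        }
        return total;
    }

    public static void main(String[] args) {
        final int n = 5_000_000;
        // Time the same call several times within one run of the JVM.
        for (int round = 1; round <= 10; round++) {
            long start = System.nanoTime();
            long result = sumOfSquares(n);
            long elapsedMicros = (System.nanoTime() - start) / 1_000;
            System.out.println("Round " + round + ": " + elapsedMicros
                    + " us (result " + result + ")");
        }
    }
}
```

On the HotSpot JVM, running the program with java -Xint JitWarmUp forces interpreted-only execution, which gives a rough comparison against the default behaviour with the JIT compiler enabled.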



These notes are copyright Dr. Derek Molloy, School of Electronic Engineering, Dublin City University, Ireland 2013-present. Please contact him directly before reproducing any of the content in any way.