Chapter 37
The Java Virtual Machine
This chapter probes into the Java virtual machine (JVM) phenomenon.
You will learn about the structures of a .class
file, look at the virtual machine architecture, and be given a
reference for the JVM instruction set. After you have completed
this chapter, you will be able to diagram the internal structure
of a .class file and will
understand the machine architecture of the JVM.
When Java was created, the goal was to create a machine-independent
programming language that then could be compiled into a portable
binary format. In theory, that is exactly what was achieved. Java
code is portable to any system that has a Java interpreter. However,
Java is not at all machine independent. Rather, Java is machine
specific to the Java virtual machine.
The JVM concept allows a layer of translation between the executable
program and the machine-specific code. In a non-Java compiler,
the source code is compiled into machine-specific assembly code.
In doing this, the executable limits itself to the confines of
that machine architecture. Compiling Java code creates an executable
using JVM assembly directives. The difference between the two approaches
is fundamental to the portability of the executable. Non-Java
executables communicate directly with the platform's instruction
set. Java executables communicate with the JVM instruction set,
which is then translated into platform-specific instructions.
Every machine has a certain form for its executable file. Java
is no exception. The Java compiler creates its executable files
in the form of .class files.
.class files are composed
of 8-bit values (bytes) that can be read in pairs to form 16-bit values,
or in 4-byte groups to form 32-bit values. The bytes are arranged
in big-endian order, where the first byte contains the highest
order bits of the 32-bit value and the last byte contains the
lowest-order bits of the 32-bit value.
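The byte ordering described above can be illustrated with a minimal Java sketch. The class and method names (BigEndianReader, readU2, readU4) are hypothetical and exist only for this illustration; they are not part of any Java library.

```java
// Hypothetical helper: assembles big-endian values from raw .class file bytes.
final class BigEndianReader {
    // Reads an unsigned 16-bit value; the first byte holds the high-order bits.
    static int readU2(byte[] data, int offset) {
        return ((data[offset] & 0xFF) << 8) | (data[offset + 1] & 0xFF);
    }

    // Reads a 32-bit value; the first byte holds the highest-order bits.
    static int readU4(byte[] data, int offset) {
        return ((data[offset] & 0xFF) << 24)
             | ((data[offset + 1] & 0xFF) << 16)
             | ((data[offset + 2] & 0xFF) << 8)
             |  (data[offset + 3] & 0xFF);
    }

    public static void main(String[] args) {
        byte[] bytes = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE};
        System.out.println(Integer.toHexString(readU4(bytes, 0))); // prints cafebabe
        System.out.println(Integer.toHexString(readU2(bytes, 0))); // prints cafe
    }
}
```

Masking each byte with 0xFF matters because Java bytes are signed; without the mask, a byte such as 0xCA would be sign-extended and corrupt the higher-order bits.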
A .class file itself is broken
into 15 separate regions:
- magic
- version
- constant_pool_count
- constant_pool[constant_pool_count - 1]
- access_flags
- this_class
- super_class
- interfaces_count
- interfaces[interfaces_count]
- fields_count
- fields[fields_count]
- methods_count
- methods[methods_count]
- attributes_count
- attributes[attributes_count]
The regions are not padded or aligned with one another. Each region
can be of either fixed or variable size. Regions that contain
variable amounts of information are preceded by a field specifying
the size of the variable region. The following sections provide
more information about these regions.
The magic region must contain
a value of 0xCAFEBABE.
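As a rough sketch, a tool could test for this value before attempting to parse anything else. The class name MagicCheck and the method hasMagic are hypothetical; DataInputStream.readInt is used because it reads four bytes in big-endian order.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical helper: checks whether a byte array starts with the magic value.
final class MagicCheck {
    static final int MAGIC = 0xCAFEBABE;

    static boolean hasMagic(byte[] classBytes) {
        if (classBytes.length < 4) {
            return false; // too short to contain the magic region
        }
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(classBytes))) {
            return in.readInt() == MAGIC; // readInt assembles 4 bytes, high byte first
        } catch (IOException e) {
            return false; // not expected for an in-memory stream
        }
    }

    public static void main(String[] args) {
        byte[] sample = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 45};
        System.out.println(hasMagic(sample)); // prints true
    }
}
```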
version holds the version
number of the compiler that created the .class
file. This is used to specify incompatible changes to either the
format of the .class file
or to the bytecodes.
constant_pool_count specifies
the size of the next region. As noted previously, there is no
alignment or padding. Instead, size fields are used to
denote the extents of different variable regions. These fields
are 2 bytes in length.
constant_pool contains an
array of constant_pool_count - 1
items that store string constants, class names, field names, and
all constants referenced in the body of the code.
The first byte in every entry of constant_pool
contains a type that specifies the content of the entry.
Table 37.1 identifies the items that are contained in the constant
pool.
Table 37.1. Constant types.
Constant Type | Value
CONSTANT_Asciiz | 1
CONSTANT_Unicode | 2
CONSTANT_Integer | 3
CONSTANT_Float | 4
CONSTANT_Long | 5
CONSTANT_Double | 6
CONSTANT_Class | 7
CONSTANT_String | 8
CONSTANT_Fieldref | 9
CONSTANT_Methodref | 10
CONSTANT_InterfaceMethodref | 11
CONSTANT_NameAndType | 12
CONSTANT_Asciiz and CONSTANT_Unicode
are represented by a 1-byte reference tag, a 2-byte length specifier,
and an array of bytes that is of the specified length.
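The layout just described (tag, 2-byte length, then the bytes) can be decoded with a short sketch like the following. The class AsciizEntry is hypothetical. It relies on DataInputStream.readUTF, which reads a 2-byte length followed by that many bytes of modified UTF-8; treating a CONSTANT_Asciiz entry as readable this way is an assumption of the sketch.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical decoder for a single tag-1 constant pool entry.
final class AsciizEntry {
    static final int CONSTANT_ASCIIZ = 1; // tag value from Table 37.1

    // Expects: 1-byte tag, 2-byte length, then 'length' bytes of string data.
    static String read(byte[] entry) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(entry))) {
            int tag = in.readUnsignedByte();
            if (tag != CONSTANT_ASCIIZ) {
                throw new IllegalArgumentException("unexpected tag " + tag);
            }
            // readUTF consumes the 2-byte length and the bytes that follow it.
            return in.readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] entry = {1, 0, 4, 'C', 'o', 'd', 'e'};
        System.out.println(read(entry)); // prints Code
    }
}
```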
CONSTANT_Integer and CONSTANT_Float
contain a 1-byte tag and a 4-byte value.
CONSTANT_Long and CONSTANT_Double
are used to store 8-byte values. The structure begins with a 1-byte
tag and includes a 4-byte value containing the high bytes, and
a 4-byte value containing the low bytes.
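Recombining the two 4-byte halves takes a shift and a mask. The sketch below uses hypothetical names (LongConstant, combine, combineDouble). It assumes the double's halves hold its IEEE 754 bit pattern, which is how Double.longBitsToDouble interprets them.

```java
// Hypothetical helper: rebuilds 8-byte constants from their high and low halves.
final class LongConstant {
    // The low half is masked so that its sign bit does not spill into the high half.
    static long combine(int highBytes, int lowBytes) {
        return ((long) highBytes << 32) | (lowBytes & 0xFFFFFFFFL);
    }

    // Assumption: the two halves together hold the IEEE 754 bit pattern of the double.
    static double combineDouble(int highBytes, int lowBytes) {
        return Double.longBitsToDouble(combine(highBytes, lowBytes));
    }

    public static void main(String[] args) {
        System.out.println(combine(1, 0));                // prints 4294967296
        System.out.println(combineDouble(0x3FF00000, 0)); // prints 1.0
    }
}
```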
CONSTANT_Class holds a 1-byte
tag as well as a 2-byte index into the constant_pool
that contains the string name of the class.
CONSTANT_String represents
an object of type String.
The structure contains two fields, a 1-byte tag, and a 2-byte
index into constant_pool,
which holds the actual string value encoded using a modified UTF
scheme. constant_pool stores
only 8-bit values, with the capability of combining them to form
8- and 16-bit characters.
CONSTANT_Fieldref, CONSTANT_Methodref,
and CONSTANT_InterfaceMethodref
represent their data with a 1-byte tag and two 2-byte indexes
into constant_pool. The first
index references the class; the second references the name and
type.
CONSTANT_NameAndType contains
information about constants not associated with a class. The first
byte is the tag, followed by two 2-byte indexes into constant_pool
specifying the name and signature of the constant.
The access_flags section
is a 2-byte field of bit flags (room for up to 16) describing
various properties of fields, classes, and methods. Table 37.2
lists the values of the access flags.
Table 37.2. Access flags.
Access Flag | Value
ACC_PUBLIC | 0x0001
ACC_PRIVATE | 0x0002
ACC_PROTECTED | 0x0004
ACC_STATIC | 0x0008
ACC_FINAL | 0x0010
ACC_SYNCHRONIZED | 0x0020
ACC_THREADSAFE | 0x0040
ACC_TRANSIENT | 0x0080
ACC_NATIVE | 0x0100
ACC_INTERFACE | 0x0200
ACC_ABSTRACT | 0x0400
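Because each flag in Table 37.2 occupies its own bit, a 2-byte access_flags value can be decoded by testing one bit at a time. The following is a minimal sketch; the class AccessFlags and its decode method are hypothetical, and the flag names are taken from the table in bit order.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical decoder for the access_flags bit field.
final class AccessFlags {
    // Index i corresponds to bit value (1 << i), following Table 37.2.
    private static final String[] NAMES = {
        "PUBLIC", "PRIVATE", "PROTECTED", "STATIC", "FINAL", "SYNCHRONIZED",
        "THREADSAFE", "TRANSIENT", "NATIVE", "INTERFACE", "ABSTRACT"
    };

    static List<String> decode(int accessFlags) {
        List<String> set = new ArrayList<>();
        for (int bit = 0; bit < NAMES.length; bit++) {
            if ((accessFlags & (1 << bit)) != 0) {
                set.add(NAMES[bit]);
            }
        }
        return set;
    }

    public static void main(String[] args) {
        System.out.println(decode(0x0019)); // prints [PUBLIC, STATIC, FINAL]
    }
}
```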
this_class is a 2-byte index
into constant_pool specifying
the information about the current class.
super_class is a 2-byte index
into constant_pool specifying
the superclass of the current class.
interfaces_count is a 2-byte
value denoting the size of the interfaces
array.
The interfaces array contains
indexes into the constant_pool
specifying the interfaces that the current class implements.
fields_count is a 2-byte
value denoting the size of the fields
array.
The fields array contains
complete information about the fields of a class. This array contains,
for each element, a 2-byte value of access_flags,
two 2-byte indexes into constant_pool,
a 2-byte attributes_count,
and an array of attributes.
The first index, name_index,
holds the name of the field. The second, signature_index,
holds the signature of the field. The last field stores any needed
attributes about the field. Currently, the only attribute
supported is ConstantValue,
which indicates that the field is a static constant value.
methods_count supplies the
number of methods stored in the methods
array. This number only includes the methods declared in the current
class.
The methods field contains
an array of elements containing complete information about the
method. The information is stored with a 2-byte access_flags
value, a 2-byte name_index
referencing the name of the method in the constant_pool,
a 2-byte signature_index
referencing signature information found in the constant_pool,
a 2-byte attributes_count
containing the number of elements in the attributes
array, and an attributes
array.
Currently, the only value that can be found in the attributes
array is the Code structure,
which provides the information needed to properly execute the
specified method. To facilitate this, the Code
structure provides the following information.
Contained in the first 2 bytes is attribute_name_index,
which provides an index into the constant_pool
identifying the attribute as a Code
structure.
The next 4 bytes, named attribute_length,
provide the length of the Code
structure, not including attribute_name_index
or attribute_length itself.
Actual Code-specific information
begins with three fields (a 2-byte max_stack, a 2-byte max_locals,
and a 4-byte code_length), followed by the method's
operation codes (opcodes). max_stack
contains the maximum number of entries on the operand stack during
the method's execution. max_locals
specifies the total number of local variables for the method.
code_length is the total
length of the next field, the code
field containing the opcodes.
After the code field, the
Code structure provides detailed
exception information for the method. This starts with exception_table_length
and exception_table, which
describe each exception handler in the method code. start_pc
and end_pc
give the starting and ending code positions between which the exception handler,
pointed to by handler_pc,
is active. catch_type, which
follows handler_pc, denotes
the type of exception handled.
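The way these four fields cooperate can be sketched as a simple table lookup. The class ExceptionTable and its members are hypothetical. The sketch assumes start_pc is inclusive and end_pc is exclusive, and it deliberately ignores catch_type matching, which a real virtual machine would also have to perform.

```java
// Hypothetical model of an exception table and a handler lookup.
final class ExceptionTable {
    static final class Entry {
        final int startPc;   // first code offset covered by the handler (inclusive)
        final int endPc;     // end of the covered range (assumed exclusive)
        final int handlerPc; // code offset where the handler begins
        final int catchType; // constant_pool index of the handled exception type

        Entry(int startPc, int endPc, int handlerPc, int catchType) {
            this.startPc = startPc;
            this.endPc = endPc;
            this.handlerPc = handlerPc;
            this.catchType = catchType;
        }
    }

    // Returns the handler offset of the first entry covering pc, or -1 if none does.
    // Simplification: catch_type is not checked against the thrown exception.
    static int findHandler(Entry[] table, int pc) {
        for (Entry e : table) {
            if (pc >= e.startPc && pc < e.endPc) {
                return e.handlerPc;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        Entry[] table = {new Entry(0, 10, 20, 5), new Entry(10, 15, 30, 6)};
        System.out.println(findHandler(table, 4));  // prints 20
        System.out.println(findHandler(table, 15)); // prints -1
    }
}
```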
The remainder of the Code
structure is devoted to information that is used for debugging
purposes.
line_number is the 2-byte
line number of the method's first line of code.
LocalVariableTable_attribute
contains a structure used by the debugger to determine the value
of local variables. The structure consists of a 2-byte attribute_name_index,
a 4-byte attribute_length, a 2-byte table length,
and a local_variable_table
structure.
The first two fields of the structure, attribute_name_index
and attribute_length, are
used to describe the structure. The third contains the length
of the local_variable_table.
local_variable_table contains
the following five 2-byte fields, in order: start_pc,
length, name_index,
signature_index, and slot.
start_pc and length
denote the range of code offsets within which the variable holds a value.
name_index and signature_index
are indexes into constant_pool,
where the variable's name and signature can be found.
slot denotes the position
in the local method frame where the variable can be found.
attributes_count is the size
of the attributes array containing
attribute structures. Currently, the only attribute structure is
the SourceFile structure.
The SourceFile structure
consists of three values: a 2-byte attribute_name_index, a 4-byte
attribute_length, and a 2-byte sourcefile_index. attribute_name_index
indexes into constant_pool
to the entry containing the string SourceFile.
attribute_length must contain
a value of 2. sourcefile_index
indexes into the constant_pool
to the entry containing the source filename.
The Java virtual machine's architecture revolves around the concept
of non-machine-specific implementation. It assumes no specific
platform architecture, but it does require certain facilities:
- Registers
- Stack
- Garbage-collected heap
- Method area
- Instruction set
Whether these facilities exist in hardware or software makes no
difference to the JVM. As long as they exist, the JVM can function
correctly.
The registers serve the same purpose as a conventional microprocessor's
registers, the main difference being the functions provided
by each register. The JVM is a stack-based machine, meaning it does
not define registers for the passing of variables and instructions.
This was a conscious decision when designing the JVM, and the
result is a model requiring fewer registers. These registers are
as follows:
- pc: The pc
register is a 32-bit-wide program counter.
- optop: optop
maintains a pointer to the top of the operand stack. Like all
JVM registers, optop is 32
bits wide.
- frame: frame
provides a pointer to the current stack frame from which the JVM
can retrieve needed operands or opcodes for the maintenance of
the stack.
- vars: vars
points to the base offset of the local variables in the current
stack frame. Through this mechanism, the JVM has access to
all local variables.
The Java stack is a 32-bit model used to supply the JVM with needed
operation data as well as store return values. As in conventional programming
languages, the stack is broken into separate stack frames, each containing
information about the method associated with the frame. The Java
stack frame comprises three separate regions:
- Local variable: The local variable region
of the method frame provides the vars
register with a base reference to access the local variables.
All local variables are 32 bits wide; 64-bit variables occupy
two variable entries.
- Execution environment: The execution environment
region of the stack frame is used to provide code for the maintenance
of the method's stack frame. It also maintains pointers to the
local variables, a pointer to the previous stack frame, and
pointers to the top and bottom of the current frame's operand region.
- Operand stack: The operand stack region
contains the operands for the current method.
All objects are allocated from the garbage-collection heap. The
heap is also responsible for performing garbage collection, due
primarily to the fact that Java does not allow the programmer
to deallocate space. The JVM does not assume any method of garbage
collection.
The method area contains the binary method retrieved from the
methods section of the class
file. This includes the method's code as well as all symbol information.
The instruction set is the set of operation codes that are executed
by the JVM. When Java source code is compiled, the compiler converts
the Java source code into the language of the JVM, the instruction
set.
The JVM instruction set is currently comprised of more than 160
instructions held in an 8-bit field. The JVM will pop operands
off the stack and push the result back onto the stack for some
operations. If the operands are greater than 8 bits, the JVM uses
the big-endian encoding scheme to pack the value into its 8-bit
instruction alignment.
Because the JVM instruction set contains more than 160 operations, the following
sections break them down into categories for quicker reference.
The instructions introduced in this section are used to push constants
onto the stack. In all these instructions, if the value pushed
onto the stack is less than 32 bits, the value is expanded into
a 32-bit form to fit properly onto the stack:
- bipush byte pushes
byte onto the stack
as a 1-byte signed integer.
- sipush byte1 byte2
pushes byte1 and byte2
onto the stack as a 2-byte signed integer.
- ldc1 index pushes
constant_pool[index]
value onto the stack.
- ldc2 index1 index2
constructs a 2-byte index into the constant_pool
and pushes the value onto the stack.
- ldc2w index1 index2
constructs a 2-byte index into the constant_pool
and pushes the long or double
values onto the stack. Because the stack is 32 bits wide, the
value will occupy two locations.
- aconst_null pushes a
null object reference onto the stack.
- iconst_m1 pushes the
integer constant -1 onto the stack.
- iconst_0 pushes the
integer constant 0 onto the stack.
- iconst_1 pushes the
integer constant 1 onto the stack.
- iconst_2 pushes the
integer constant 2 onto the stack.
- iconst_3 pushes the
integer constant 3 onto the stack.
- iconst_4 pushes the
integer constant 4 onto the stack.
- iconst_5 pushes the
integer constant 5 onto the stack.
- lconst_0 pushes the
long constant 0 onto the
stack.
- lconst_1 pushes the
long constant 1 onto the
stack.
- fconst_0 pushes the
float constant 0.0 onto the
stack.
- fconst_1 pushes the
float constant 1.0 onto the
stack.
- fconst_2 pushes the
float constant 2.0 onto the
stack.
- dconst_0 pushes the
double constant 0.0 onto the
stack.
- dconst_1 pushes the
double constant 1.0 onto the
stack.
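The widening that bipush and sipush perform can be sketched in a few lines of Java. The class PushOperands is hypothetical; the method names simply mirror the instruction names, and the operand bytes are passed as int values in the range 0 to 255.

```java
// Hypothetical sketch of how bipush and sipush operands become 32-bit values.
final class PushOperands {
    // bipush: the single operand byte is sign-extended to a 32-bit int.
    static int bipush(int operandByte) {
        return (byte) operandByte;
    }

    // sipush: byte1 supplies the high-order bits and byte2 the low-order bits
    // (big-endian); the 16-bit result is then sign-extended.
    static int sipush(int byte1, int byte2) {
        return (short) (((byte1 & 0xFF) << 8) | (byte2 & 0xFF));
    }

    public static void main(String[] args) {
        System.out.println(bipush(0xFF));       // prints -1
        System.out.println(sipush(0x01, 0x00)); // prints 256
    }
}
```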
In a stack-based computer, multiple registers are replaced by
a stack register from which operands are popped off as needed
and results are pushed on as generated. The following instructions
load a method's local variables onto the stack for later use:
- iload byte retrieves
the integer value at the byte
position in the local variable array of the current stack frame.
Once retrieved, the variable is pushed onto the stack.
- iload_0 retrieves the
integer value at position 0 in the local variable array
of the current stack frame. Once retrieved, the variable is pushed
onto the stack.
- iload_1 retrieves the
integer value at position 1 in the local variable array
of the current stack frame. Once retrieved, the variable is pushed
onto the stack.
- iload_2 retrieves the
integer value at position 2 in the local variable array
of the current stack frame. Once retrieved, the variable is pushed
onto the stack.
- iload_3 retrieves the
integer value at position 3 in the local variable array
of the current stack frame. Once retrieved, the variable is pushed
onto the stack.
- lload byte retrieves
the long value at the byte
and byte+1 positions in the
local variable array of the current stack frame. Once retrieved,
the values are assembled and pushed onto the stack.
- lload_0 retrieves the
long value at positions 0 and
1 in the local variable array of the current stack
frame. Once retrieved, the values are assembled and pushed onto
the stack.
- lload_1 retrieves the
long value at positions 1 and
2 in the local variable array of the current stack
frame. Once retrieved, the values are assembled and pushed onto
the stack.
- lload_2 retrieves the
long value at positions 2 and
3 in the local variable array of the current stack
frame. Once retrieved, the values are assembled and pushed onto
the stack.
- lload_3 retrieves the
long value at positions 3 and
4 in the local variable array of the current stack
frame. Once retrieved, the values are assembled and pushed onto
the stack.
- fload byte retrieves
the float value at the byte
position in the local variable array of the current stack frame.
Once retrieved, the variable is pushed onto the stack.
- fload_0 retrieves the
float value at position 0
in the local variable array of the current stack frame. Once retrieved,
the variable is pushed onto the stack.
- fload_1 retrieves the
float value at position 1
in the local variable array of the current stack frame. Once retrieved,
the variable is pushed onto the stack.
- fload_2 retrieves the
float value at position 2
in the local variable array of the current stack frame. Once retrieved,
the variable is pushed onto the stack.
- fload_3 retrieves the
float value at position 3
in the local variable array of the current stack frame. Once retrieved,
the variable is pushed onto the stack.
- dload byte retrieves
the double value at the byte
and byte+1 positions in the
local variable array of the current stack frame. Once retrieved,
the values are assembled and pushed onto the stack.
- dload_0 retrieves the
double value at positions 0
and 1 in the local variable array inside the current
stack frame. Once retrieved, the values are assembled and pushed
onto the stack.
- dload_1 retrieves the
double value at positions 1
and 2 in the local variable array inside the current
stack frame. Once retrieved, the values are assembled and pushed
onto the stack.
- dload_2 retrieves the
double value at positions 2
and 3 in the local variable array inside the current
stack frame. Once retrieved, the values are assembled and pushed
onto the stack.
- dload_3 retrieves the
double value at positions 3
and 4 in the local variable array inside the current
stack frame. Once retrieved, the values are assembled and pushed
onto the stack.
- aload byte retrieves
the object or array at the byte
position in the local variable array of the current stack frame.
Once retrieved, the object or array is pushed onto the stack.
- aload_0 retrieves the
object or array at position 0 in the local variable array
of the current stack frame. Once retrieved, the object or array
is pushed onto the stack.
- aload_1 retrieves the
object or array at position 1 in the local variable array
of the current stack frame. Once retrieved, the object or array
is pushed onto the stack.
- aload_2 retrieves the
object or array at position 2 in the local variable array
of the current stack frame. Once retrieved, the object or array
is pushed onto the stack.
- aload_3 retrieves the
object or array at position 3 in the local variable array
of the current stack frame. Once retrieved, the object or array
is pushed onto the stack.
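The load instructions can be modeled with an array of 32-bit slots and a stack of 32-bit words. The sketch below is illustrative only: the class LoadSketch is hypothetical, and the choice to keep the high-order word of a long in the lower-numbered slot is an assumption of the sketch rather than something this chapter specifies.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical model of a frame's local variables and operand stack.
final class LoadSketch {
    final int[] locals;                               // 32-bit local variable slots
    final Deque<Integer> stack = new ArrayDeque<>();  // operand stack of 32-bit words

    LoadSketch(int[] locals) {
        this.locals = locals;
    }

    // iload: pushes the integer held in one slot.
    void iload(int index) {
        stack.push(locals[index]);
    }

    // lload: pushes the two slots that make up a long.
    // Assumption: slot index holds the high word and index + 1 the low word.
    void lload(int index) {
        stack.push(locals[index]);
        stack.push(locals[index + 1]);
    }

    // Pops two words and reassembles them into a long (low word is on top).
    long popLong() {
        int low = stack.pop();
        int high = stack.pop();
        return ((long) high << 32) | (low & 0xFFFFFFFFL);
    }

    public static void main(String[] args) {
        LoadSketch frame = new LoadSketch(new int[] {7, 1, 2});
        frame.iload(0);
        System.out.println(frame.stack.pop()); // prints 7
        frame.lload(1);
        System.out.println(frame.popLong());   // prints 4294967298
    }
}
```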
As described earlier, each method frame has a local variable region.
When the method comes to the top of the stack, the base offset
of the local variables is placed into the vars
register. These instructions store information
into the local variables of the current stack frame:
- istore index value
stores the integer value
at the index position
in the local variable array of the current stack frame.
- istore_0 value
stores the integer value
at position 0 in the local variable array of the current
stack frame.
- istore_1 value
stores the integer value
at position 1 in the local variable array of the current
stack frame.
- istore_2 value
stores the integer value
at position 2 in the local variable array of the current
stack frame.
- istore_3 value
stores the integer value
at position 3 in the local variable array of the current
stack frame.
- lstore index value
stores the long value
at the index and index+1
positions in the local variable array of the current stack frame.
- lstore_0 value
stores the long value
at positions 0 and 1 in the local variable array of
the current stack frame.
- lstore_1 value
stores the long value
at positions 1 and 2 in the local variable array of
the current stack frame.
- lstore_2 value
stores the long value
at positions 2 and 3 in the local variable array of
the current stack frame.
- lstore_3 value
stores the long value
at positions 3 and 4 in the local variable array of
the current stack frame.
- fstore index value
stores the float value
at the index position
in the local variable array of the current stack frame.
- fstore_0 value
stores the float value
at position 0 in the local variable array of the current
stack frame.
- fstore_1 value
stores the float value
at position 1 in the local variable array of the current
stack frame.
- fstore_2 value
stores the float value
at position 2 in the local variable array of the current
stack frame.
- fstore_3 value
stores the float value
at position 3 in the local variable array of the current
stack frame.
- dstore index value
stores the double value
at the index and index+1
positions in the local variable array of the current stack frame.
- dstore_0 value
stores the double value
at positions 0 and 1 in the local variable array of
the current stack frame.
- dstore_1 value
stores the double value
at positions 1 and 2 in the local variable array of
the current stack frame.
- dstore_2 value
stores the double value
at positions 2 and 3 in the local variable array of
the current stack frame.
- dstore_3 value
stores the double value
at positions 3 and 4 in the local variable array of
the current stack frame.
- astore index value
stores an object or array of value
at the index position
in the local variable array of the current stack frame.
- astore_0 value
stores an object or array of value
at position 0 in the local variable array of the current
stack frame.
- astore_1 value
stores an object or array of value
at position 1 in the local variable array of the current
stack frame.
- astore_2 value
stores an object or array of value
at position 2 in the local variable array of the current
stack frame.
- astore_3 value
stores an object or array of value
at position 3 in the local variable array of the current
stack frame.
- iinc index const
increments the value stored at the index
position in the local variable array of the current stack frame
by a value of const.
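A store is the mirror image of a load: the value comes off the operand stack and lands in a local variable slot, while iinc modifies a slot in place without touching the stack. The following sketch uses a hypothetical class StoreSketch and assumes the iinc constant is a signed 1-byte operand.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical model of istore and iinc acting on a frame.
final class StoreSketch {
    final int[] locals = new int[4];                  // 32-bit local variable slots
    final Deque<Integer> stack = new ArrayDeque<>();  // operand stack

    // istore: pops the top integer and stores it at the given slot.
    void istore(int index) {
        locals[index] = stack.pop();
    }

    // iinc: adds a constant to a slot directly; the operand stack is not used.
    // Assumption: the constant operand is a signed byte, so 0xFF means -1.
    void iinc(int index, int constant) {
        locals[index] += (byte) constant;
    }

    public static void main(String[] args) {
        StoreSketch frame = new StoreSketch();
        frame.stack.push(5);
        frame.istore(2);
        frame.iinc(2, 3);
        System.out.println(frame.locals[2]); // prints 8
    }
}
```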
The garbage-collection heap is responsible for the allocation
and deallocation of referenced data. The following instructions
allocate, deallocate, and store data to the garbage-collection
heap:
- newarray type size allocates
a new array of size to
hold the variable type specified by the type
parameter. Table 37.3 lists the variable types
specified by the type parameter.
Table 37.3. Variable types specified by the type parameter.
Variable Type | Value
T_ARRAY | 0x0001
T_BOOLEAN | 0x0004
T_CHAR | 0x0005
T_FLOAT | 0x0006
T_DOUBLE | 0x0007
T_BYTE | 0x0008
T_SHORT | 0x0009
T_INT | 0x000A
T_LONG | 0x000B
- anewarray byte1 byte2 size
creates a new array with a length of size,
of the class type referenced by the constant_pool entry
at the 2-byte index constructed from byte1
and byte2. The handle
of the created array is passed back on the stack.
- multianewarray byte1 byte2 dimension
creates a multidimensional array from the information retrieved.
byte1 and byte2
are used to construct an index into the constant_pool
referencing the type of array to create. dimension
is the number of dimensions of the array to create; the actual size of each
dimension is popped off the stack. The handle of the created array
is passed back on the stack.
- arraylength handle
returns the size of the array referenced by the supplied array
handle.
- iaload handle index
returns the integer at the index
position of the array referenced by the array handle.
- laload handle index
returns the long at
the index position
of the array referenced by the array handle.
- faload handle index
returns the float
at the index position
of the array referenced by the array handle.
- daload handle index
returns the double
at the index position
of the array referenced by the array handle.
- aaload handle index
returns the object at the index
position of the array referenced by the array handle.
- caload handle index
returns the character at the index
position of the array referenced by the array handle.
- saload handle index
returns the short at the index
position of the array referenced by the array handle.
- iastore handle index value
stores the integer value
at the index position
of the array referenced by the array handle.
- lastore handle index value
stores the long value
at the index position
of the array referenced by the array handle.
- fastore handle index value
stores the float value
at the index position
of the array referenced by the array handle.
- dastore handle index value
stores the double
value at the index
position of the array referenced by the array handle.
- aastore handle index value_handle
stores the object value_handle
at the index position
of the array referenced by the array handle.
- bastore handle index value
stores the signed byte value
at the index position
of the array referenced by the array handle.
- castore handle index value
stores the character value
at the index position
of the array referenced by the array handle.
- sastore handle index value
stores the short value
at the index position
of the array referenced by the array handle.
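The handle-based array instructions above can be pictured with a toy heap. In this sketch the class ArrayHeapSketch is hypothetical, handles are modeled as plain integer keys into a map, and only int arrays are supported; a real virtual machine is free to represent handles and the heap quite differently.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical toy heap: array handles are integer keys into a map.
final class ArrayHeapSketch {
    private final Map<Integer, int[]> heap = new HashMap<>();
    private int nextHandle = 1;

    // Models newarray for int elements: allocates and returns a handle.
    int newIntArray(int size) {
        int handle = nextHandle++;
        heap.put(handle, new int[size]);
        return handle;
    }

    // Models arraylength.
    int arraylength(int handle) {
        return heap.get(handle).length;
    }

    // Models iastore: stores value at index in the referenced array.
    void iastore(int handle, int index, int value) {
        heap.get(handle)[index] = value;
    }

    // Models iaload: returns the integer at index in the referenced array.
    int iaload(int handle, int index) {
        return heap.get(handle)[index];
    }

    public static void main(String[] args) {
        ArrayHeapSketch vm = new ArrayHeapSketch();
        int handle = vm.newIntArray(3);
        vm.iastore(handle, 1, 42);
        System.out.println(vm.arraylength(handle)); // prints 3
        System.out.println(vm.iaload(handle, 1));   // prints 42
    }
}
```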
Any stack needs some fundamental
operations to manipulate it. The following instructions do
just that:
- nop has no effect; it
leaves the current stack state unchanged.
- pop pops the top word
off the stack.
- pop2 pops the top two
words off the stack.
- dup copies the top stack
word and places it on the stack.
- dup2 copies the top two
stack words and places them on the stack.
- dup_x1 copies the top
stack word and places the value two words down in the stack.
- dup2_x1 copies the top
two stack words and places the values two words down in the stack.
- dup_x2 copies the top
stack word and places the value three words down in the stack.
- dup2_x2 copies the top
two stack words and places the values three words down in the
stack.
- swap swaps the position
of the top two stack words. The word on top becomes the second
to the top, and the word second from the top becomes the new top
word.
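A few of these operations are easy to model on a java.util.Deque, whose push, pop, and peek methods act on the head of the deque as the top of the stack. The class StackOps is hypothetical, and the sketch treats every stack word as an Integer.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical models of dup, swap, and dup_x1 on a stack of 32-bit words.
final class StackOps {
    // dup: ..., a  becomes  ..., a, a
    static void dup(Deque<Integer> s) {
        s.push(s.peek());
    }

    // swap: ..., b, a  becomes  ..., a, b
    static void swap(Deque<Integer> s) {
        int a = s.pop();
        int b = s.pop();
        s.push(a);
        s.push(b);
    }

    // dup_x1: ..., b, a  becomes  ..., a, b, a (the copy lands two words down)
    static void dupX1(Deque<Integer> s) {
        int a = s.pop();
        int b = s.pop();
        s.push(a);
        s.push(b);
        s.push(a);
    }

    public static void main(String[] args) {
        Deque<Integer> s = new ArrayDeque<>();
        s.push(1);
        s.push(2);             // top of stack is 2
        swap(s);               // top of stack is now 1
        dupX1(s);
        System.out.println(s); // prints [1, 2, 1], listed from the top down
    }
}
```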
All computers need to function as a calculator at some point.
The capability to do fundamental computations is inherent to all
computing devices, and the JVM is no exception. The following
instructions provide the JVM with arithmetic operations:
- iadd pops off the top
two integers on the stack and replaces them with the sum of the
two values.
- ladd pops off the top
two positions on the stack to create a long
value. Then the next two are popped off to create the second long
value. The sum of the two values is then pushed onto the stack.
- fadd pops off the top
two floats on the stack and replaces them with the sum of the
two values.
- dadd pops off the top
two positions on the stack to create a double
value. Then the next two are popped off to create the second double
value. The sum of the two values is then pushed onto the stack.
- isub pops off the top
two integers on the stack and replaces them with the first value
minus the second.
- lsub pops off the top
two positions on the stack to create a long
value. Then the next two are popped off to create the second long
value. The result of the first value minus the second is then
pushed onto the stack.
- fsub pops off the top
two floats from the stack
and replaces them with the first value minus the second.
- dsub pops off the top
two positions from the stack to create a double
value. Then the next two are popped off to create the second double
value. The result of the first value minus the second is then
pushed onto the stack.
- imul pops off the top
two integers from the stack and replaces them with the product
of the two values.
- lmul pops off the top
two positions on the stack to create a long
value. Then the next two are popped off to create the second long
value. The product of the two values is then pushed onto the stack.
- fmul pops off the top
two floats from the stack
and replaces them with the product of the two values.
- dmul pops off the top
two positions on the stack to create a double
value. Then the next two are popped off to create the second double
value. The product of the two values is then pushed onto the stack.
- idiv pops off the top
two integers from the stack and replaces them with the first value
divided by the second.
- ldiv pops off the top
two positions on the stack to create a long
value. Then the next two are popped off to create the second long
value. The result of the first value divided by the second is
then pushed onto the stack.
- fdiv pops off the top
two floats from the stack
and replaces them with the first value divided by the second.
- ddiv pops off the top
two positions on the stack to create a double
value. Then the next two are popped off to create the second double
value. The result of the first value divided by the second is
then pushed onto the stack.
- irem pops off the top
two integers from the stack and replaces them with the remainder of
the first value divided by the second (the modulus).
- lrem pops off the top
two positions on the stack to create a long
value. Then the next two are popped off to create the second long
value. The result of the first value modulus the second is then
pushed onto the stack.
- frem pops off the top
two floats from the stack
and replaces them with the first value modulus the second.
- drem pops off the top
two positions on the stack to create a double
value. Then the next two are popped off to create the second double
value. The result of the first value modulus the second is then
pushed onto the stack.
- ineg pops off the top
integer from the stack and replaces it with the negated value.
- lneg pops off the top
two positions on the stack to create a long
value. The negated value is pushed onto the stack.
- fneg pops off the top
float from the stack and
replaces it with a negated value.
- dneg pops off the top
two positions on the stack to create a double
value. The negated value is pushed onto the stack.
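The long forms of these instructions work on pairs of 32-bit stack words. A minimal sketch follows, with a hypothetical class LongArithmetic. It assumes the high-order word of each long is pushed first, so the low-order word sits on top, and that for subtraction the value pushed first is the one subtracted from.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical model of ladd and lsub on a stack of 32-bit words.
final class LongArithmetic {
    // Assumption: the high word is pushed first, so the low word ends up on top.
    static void pushLong(Deque<Integer> s, long value) {
        s.push((int) (value >>> 32));
        s.push((int) value);
    }

    // Pops two words and reassembles the long they represent.
    static long popLong(Deque<Integer> s) {
        int low = s.pop();
        int high = s.pop();
        return ((long) high << 32) | (low & 0xFFFFFFFFL);
    }

    // ladd: pops two longs (four words) and pushes their sum (two words).
    static void ladd(Deque<Integer> s) {
        long second = popLong(s);
        long first = popLong(s);
        pushLong(s, first + second);
    }

    // lsub: pushes the value pushed first minus the value pushed second.
    static void lsub(Deque<Integer> s) {
        long second = popLong(s);
        long first = popLong(s);
        pushLong(s, first - second);
    }

    public static void main(String[] args) {
        Deque<Integer> s = new ArrayDeque<>();
        pushLong(s, 5_000_000_000L);
        pushLong(s, 7L);
        ladd(s);
        System.out.println(popLong(s)); // prints 5000000007
    }
}
```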
The following instructions implement logical operations:
- ishl pops the shift amount from the top of the stack and the integer
value beneath it, and shifts the value to the left by the amount indicated by
the low 5 bits of the shift amount. The result is then placed
on the stack.
- ishr pops the shift amount from the top of the stack and the integer
value beneath it, and shifts the value to the right by the amount indicated by
the low 5 bits of the shift amount while retaining the sign
extension. The result is then placed on the stack.
- iushr pops the shift amount from the top of the stack and the integer
value beneath it, and shifts the value to the right by the amount indicated by
the low 5 bits of the shift amount without retaining the
sign extension. The result is then placed on the stack.
- lshl pops the shift amount from the top of the stack and assembles the next
two values on the stack to create a long
value. The value is then shifted left by the amount indicated
by the low 6 bits of the shift amount. The result is then
placed on the stack.
- lshr pops the shift amount from the top of the stack and assembles the next
two values on the stack to create a long
value. The value is then shifted right by the amount indicated
by the low 6 bits of the shift amount while retaining the
sign extension. The result is then placed on the stack.
- lushr pops the shift amount from the top of the stack and assembles the next
two values on the stack to create a long
value. The value is then shifted right by the amount indicated
by the low 6 bits of the shift amount without retaining the
sign extension. The result is then placed on the stack.
- iand performs a logical
AND of the top integer value
on the stack with the next value. The result is then pushed onto
the stack.
- land assembles the top
two values on the stack into a long
value, and then assembles the second two into a long
value. The result of a logical AND
of the two values is then pushed onto the stack.
- ior ORs
the top integer value on the stack with the next value. The result
is then pushed onto the stack.
- lor assembles the top
two values on the stack into a long
value, and then assembles the second two into a long
value. The result of a logical OR
of the two values is then pushed onto the stack.
- ixor exclusive ORs
the top integer value on the stack with the next value. The result
is then pushed onto the stack.
- lxor assembles the top
two values on the stack into a long
value, and then assembles the second two into a long
value. The result of a logical exclusive OR
of the two values is then pushed onto the stack.
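The use of only the low 5 bits (for integers) or low 6 bits (for long values) of the shift amount is easy to overlook. The sketch below uses the Java shift and bitwise operators, which follow the same rules, to show the effect. The class and method names are illustrative assumptions.

```java
// Illustrative sketch: names are assumptions, not JVM or library API.
class ShiftMaskSketch {
    static int shiftLeft(int value, int count) {
        return value << count;   // only the low 5 bits of count are used
    }

    static int shiftRight(int value, int count) {
        return value >> count;   // sign extension is retained
    }

    static int unsignedShiftRight(int value, int count) {
        return value >>> count;  // sign extension is not retained
    }

    static long longShiftLeft(long value, int count) {
        return value << count;   // only the low 6 bits of count are used
    }

    public static void main(String[] args) {
        System.out.println(shiftLeft(1, 33));            // 2, because 33 & 0x1f == 1
        System.out.println(shiftRight(-8, 1));           // -4
        System.out.println(unsignedShiftRight(-8, 28));  // 15
        System.out.println(longShiftLeft(1L, 65));       // 2, because 65 & 0x3f == 1
        System.out.println(0b1100 & 0b1010);             // 8  (AND)
        System.out.println(0b1100 | 0b1010);             // 14 (OR)
        System.out.println(0b1100 ^ 0b1010);             // 6  (exclusive OR)
    }
}
```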
The following instructions provide the capability to convert data
types:
- i2l converts the integer
value at the top of the stack into a long
value. The result is then pushed onto the stack.
- i2f converts the integer
value at the top of the stack into a float
value. The result is then pushed onto the stack.
- i2d converts the integer
value at the top of the stack into a double
value. The result is then pushed onto the stack.
- l2i assembles the top
two values on the stack into a long
value that is then converted to an integer value and pushed onto
the stack.
- l2f assembles the top
two values on the stack into a long
value that is then converted to a float
value and pushed onto the stack.
- l2d assembles the top
two values on the stack into a long
value that is then converted to a double
value and pushed onto the stack.
- f2i converts the float
value at the top of the stack into an integer value. The result
is then pushed onto the stack.
- f2l converts the float
value at the top of the stack into a long
value. The result is then pushed onto the stack.
- f2d converts the float
value at the top of the stack into a double
value. The result is then pushed onto the stack.
- d2i assembles the top
two values on the stack into a double
value that is then converted to an integer value and pushed onto
the stack.
- d2l assembles the top
two values on the stack into a double
value that is then converted to a long
value and pushed onto the stack.
- d2f assembles the top
two values on the stack into a double
value that is then converted to a float
value and pushed onto the stack.
- int2byte converts the
integer value at the top of the stack into a byte
value. The result is then pushed onto the stack.
- int2char converts the
integer value at the top of the stack into a char
value. The result is then pushed onto the stack.
- int2short converts the
integer value at the top of the stack into a short
value. The result is then pushed onto the stack.
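Several of these conversions lose information. The sketch below uses Java casts, which are the source-level counterparts of the conversion instructions, to show some representative results. The class and method names are illustrative assumptions.

```java
// Illustrative sketch: names are assumptions, not JVM or library API.
class ConversionSketch {
    static byte toByte(int value)   { return (byte) value; }   // keeps the low 8 bits
    static char toChar(int value)   { return (char) value; }   // keeps the low 16 bits, unsigned
    static short toShort(int value) { return (short) value; }  // keeps the low 16 bits, signed
    static int longToInt(long value)   { return (int) value; } // keeps the low 32 bits
    static int floatToInt(float value) { return (int) value; } // rounds toward zero
    static float intToFloat(int value) { return (float) value; } // may lose precision

    public static void main(String[] args) {
        System.out.println(toByte(200));                 // -56
        System.out.println(toChar(65));                  // A
        System.out.println(toShort(70000));              // 4464
        System.out.println(longToInt(0x100000001L));     // 1
        System.out.println(floatToInt(3.9f));            // 3
        System.out.println(floatToInt(Float.NaN));       // 0
        System.out.println(intToFloat(16777217) == 16777216f); // true
    }
}
```

Widening conversions such as integer to long are exact; the narrowing ones shown here simply discard the high-order bits or the fractional part.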
Conditional statements allow the computer to execute boolean
logic. In doing so, they give the computer the capability to make
simple decisions based on a true-or-false comparison. The following
instructions support conditional decisions and alter program flow
of control:
- ifeq-If the value at
the top of the stack is equal to 0, the two bytes following the
opcode are used to create a signed 16-bit offset from the current
instruction at which execution will proceed.
- iflt-If the value at
the top of the stack is less than 0, the two bytes following the
opcode are used to create a signed 16-bit offset from the current
instruction at which execution will proceed.
- ifle-If the value at
the top of the stack is less than or equal to 0, the two bytes
following the opcode are used to create a signed 16-bit offset from
the current instruction at which execution will proceed.
- ifne-If the value at
the top of the stack is not equal to 0, the two bytes following the
opcode are used to create a signed 16-bit offset from the current
instruction at which execution will proceed.
- ifgt-If the value at
the top of the stack is greater than 0, the two bytes following the
opcode are used to create a signed 16-bit offset from the current
instruction at which execution will proceed.
- ifge-If the value at
the top of the stack is greater than or equal to 0, the two bytes
following the opcode are used to create a signed 16-bit offset from
the current instruction at which execution will proceed.
- if_icmpeq-The two topmost
integer values are compared. If the values are equal, the two
bytes following the opcode are used to create a signed 16-bit
offset from the current instruction at which execution will proceed.
- if_icmpne-The two topmost
integer values are compared. If the values are not equal, the
two bytes following the opcode are used to create a signed 16-bit
offset from the current instruction at which execution will proceed.
- if_icmplt-The two topmost
integer values are compared. If the first value (the one pushed
earlier) is less than the second, the two bytes following the
opcode are used to create a signed 16-bit offset from the current
instruction at which execution will proceed.
- if_icmple-The two topmost
integer values are compared. If the first value is less than or
equal to the second, the two bytes following the opcode are used
to create a signed 16-bit offset from the current instruction at
which execution will proceed.
- if_icmpgt-The two topmost
integer values are compared. If the first value is greater than
the second, the two bytes following the opcode are used to create
a signed 16-bit offset from the current instruction at which
execution will proceed.
- if_icmpge-The two topmost
integer values are compared. If the first value is greater than
or equal to the second, the two bytes following the opcode are
used to create a signed 16-bit offset from the current instruction
at which execution will proceed.
- lcmp-The top two values
on the stack are assembled into a long
value that is compared with the next assembled long
value on the stack. If the first value is greater than the second,
the value of 1 is pushed
onto the stack. Otherwise, if the first value is equal to the
second, the value of 0 is
pushed onto the stack. Otherwise, the value of -1
is pushed onto the stack.
- fcmpl-The top two float
values on the stack are compared. If the first value is greater
than the second, the value of 1
is pushed onto the stack. Otherwise, if the first value is equal
to the second, the value of 0
is pushed onto the stack. Otherwise, the value of -1
is pushed onto the stack. If either value is NaN (not a number),
the values cannot be ordered, and a value of -1 is pushed onto the
stack.
- fcmpg-The same as fcmpl
except that if either value is NaN, a value of 1
is pushed onto the stack.
- dcmpl-The top two double
values on the stack are assembled and compared. If the first value
is greater than the second, the value of 1
is pushed onto the stack. If the first value is equal to the second,
the value of 0 is pushed
onto the stack. Otherwise, the value of -1
is pushed onto the stack. If either value is NaN (not a number),
the values cannot be ordered, and a value of -1 is pushed onto the
stack.
- dcmpg-The same as dcmpl
except that if either value is NaN, a value of 1
is pushed onto the stack.
- if_acmpeq-The two topmost
object handles are compared. If both handles refer to the same
object, the two bytes following the opcode are used to create a
signed 16-bit offset from the current instruction at which
execution will proceed. Otherwise, execution will continue with
the next instruction.
- if_acmpne-The two topmost
object handles are compared. If the handles do not refer to the
same object, the two bytes following the opcode are used to create
a signed 16-bit offset from the current instruction at which
execution will proceed. Otherwise, execution will continue with
the next instruction.
- goto-The two bytes following
the opcode are used to create a signed 16-bit offset from the
current instruction at which execution will proceed.
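- jsr-The two bytes following
the opcode are used to create a signed 16-bit offset from the
current instruction at which execution will proceed. The address
of the instruction immediately following the jsr
is pushed onto the stack so that execution can later return to it.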
- ret-The byte following
the opcode is used as an index into the local variables to retrieve
the address at which execution will continue.
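The branch and comparison rules above can be sketched in Java. This sketch assumes that the two values forming the branch offset are the two bytes that follow the opcode, stored in big-endian order like the rest of the .class file, and that "incompatible" values in the floating-point comparisons means that at least one operand is NaN. All class and method names are illustrative assumptions.

```java
// Illustrative sketch: names are assumptions, not JVM or library API.
class BranchSketch {
    // Combine two operand bytes (high byte first) into a signed 16-bit offset.
    static int branchOffset(int byte1, int byte2) {
        return (short) (((byte1 & 0xFF) << 8) | (byte2 & 0xFF));
    }

    // Three-way comparison of two long values: 1, 0, or -1.
    static int lcmp(long first, long second) {
        return first > second ? 1 : (first == second ? 0 : -1);
    }

    // Float comparison that yields -1 when either value is NaN.
    static int fcmpl(float first, float second) {
        if (Float.isNaN(first) || Float.isNaN(second)) {
            return -1;
        }
        return first > second ? 1 : (first == second ? 0 : -1);
    }

    // Float comparison that yields 1 when either value is NaN.
    static int fcmpg(float first, float second) {
        if (Float.isNaN(first) || Float.isNaN(second)) {
            return 1;
        }
        return first > second ? 1 : (first == second ? 0 : -1);
    }

    public static void main(String[] args) {
        System.out.println(branchOffset(0x00, 0x10));  // 16 (forward branch)
        System.out.println(branchOffset(0xFF, 0xFE));  // -2 (backward branch)
        System.out.println(lcmp(5L, 9L));              // -1
        System.out.println(fcmpl(1.0f, Float.NaN));    // -1
        System.out.println(fcmpg(1.0f, Float.NaN));    // 1
    }
}
```

Having two variants of each floating-point comparison lets a compiler pick the one whose NaN result makes the following conditional branch fail, so that any comparison involving NaN evaluates to false.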
The following instructions are used to return a value from a function
call:
- ireturn returns an integer
value from a function call.
- lreturn returns a long
value from a function call.
- freturn returns a float
value from a function call.
- dreturn returns a double
value from a function call.
- areturn returns an object
reference from a function call.
- return returns from a
function call without returning a value.
After the value has been returned, the JVM resumes execution at
the instruction following the function call. The value returned, if
any, is then the top element(s) of the caller's stack.
A jump table stores the branch offsets used when program
execution jumps to a non-sequential location, such as a case of a
switch statement. The program jump is achieved by adding the offset
selected from the table to the current pc value. The
following instructions provide the capability to jump to locations
in the table:
- tableswitch uses the
top integer value of the stack as a table index. If the index
is not in the current range of the jump table, the program will
jump by the default offset. If the index is in the valid range,
the offset is extracted from the table and is used to determine
the next instruction to be executed.
- lookupswitch functions
the same as tableswitch,
except the integer value at the top of the stack is the key value
to be found in the table, rather than the index.
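The difference between the two instructions can be sketched as follows. This is a simplified model rather than the JVM's actual implementation: the table layout (a low and high bound for tableswitch, parallel key and offset arrays for lookupswitch) and all names are assumptions made for illustration.

```java
// Illustrative sketch: the table layout and names are assumptions.
class SwitchSketch {
    // tableswitch: the stack value is an index into a dense table of offsets.
    static int tableSwitch(int index, int low, int high,
                           int[] offsets, int defaultOffset) {
        if (index < low || index > high) {
            return defaultOffset;       // index is outside the table's range
        }
        return offsets[index - low];
    }

    // lookupswitch: the stack value is a key searched for among the keys.
    static int lookupSwitch(int key, int[] keys,
                            int[] offsets, int defaultOffset) {
        for (int i = 0; i < keys.length; i++) {
            if (keys[i] == key) {
                return offsets[i];
            }
        }
        return defaultOffset;           // key was not found
    }

    // The jump itself: add the selected offset to the current pc value.
    static int nextPc(int pc, int offset) {
        return pc + offset;
    }

    public static void main(String[] args) {
        int[] dense = {10, 20, 30};     // offsets for indexes 1, 2, 3
        System.out.println(tableSwitch(2, 1, 3, dense, 99));   // 20
        System.out.println(tableSwitch(7, 1, 3, dense, 99));   // 99

        int[] keys = {1, 100, 1000};
        int[] sparse = {12, 24, 36};
        System.out.println(lookupSwitch(100, keys, sparse, 99)); // 24
        System.out.println(nextPc(40, 24));                      // 64
    }
}
```

A dense table suits case values that are close together, whereas a key lookup avoids a large, mostly empty table when the case values are sparse.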
The following instructions provide the capability to access and
modify members of an object:
- putfield byte1 byte2-The
values of byte1 and byte2
form an index into the constant_pool.
The indexed value holds the class and field name of the member
to change. From that information, the location of the member is
found and the value at the top of the stack is stored into that
location.
- getstatic byte1 byte2-The
values of byte1 and byte2
form an index into the constant_pool.
The indexed value holds the class and field name of the static
member to read. From that information, the member location is
found and the member's value is pushed onto the top of the stack.
- putstatic byte1 byte2-The
values of byte1 and byte2
form an index into the constant_pool.
The indexed value holds the class and field name of the static
member to change. From that information, the member location is
found and the value at the top of the stack is stored into that
location.
- getfield byte1 byte2-The
values of byte1 and byte2
form an index into the constant_pool.
The indexed value holds the class and field name of the member
to read. From that information, the member is located in the
object whose handle is at the top of the stack, and the member's
value is pushed onto the stack in place of the handle.
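The following sketch shows how two operand bytes can be combined into a constant_pool index, along with Java statements that correspond to each kind of field access. It assumes the index is unsigned with the high byte first; the Point class and all other names are hypothetical and exist only for illustration.

```java
// Illustrative sketch: Point and all method names are hypothetical.
class FieldAccessSketch {
    static class Point {
        int x;             // instance member
        static int count;  // static member
    }

    // Combine byte1 and byte2 (high byte first) into an unsigned 16-bit index.
    static int poolIndex(int byte1, int byte2) {
        return ((byte1 & 0xFF) << 8) | (byte2 & 0xFF);
    }

    static int demo() {
        Point p = new Point();
        p.x = 5;                   // store into an instance member (putfield)
        Point.count = 2;           // store into a static member (putstatic)
        int fromObject = p.x;      // read an instance member (getfield)
        int fromClass = Point.count; // read a static member (getstatic)
        return fromObject + fromClass;
    }

    public static void main(String[] args) {
        System.out.println(poolIndex(0x01, 0x2C));  // 300
        System.out.println(poolIndex(0xFF, 0xFF));  // 65535
        System.out.println(demo());                 // 7
    }
}
```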
The following instructions provide the capability to execute a
method of an object:
- invokevirtual byte1 byte2-The
values of byte1 and byte2
make an index into the constant_pool.
The referenced value is used to find the offset of the method
to execute. The stack is assumed to contain the object handle and
the arguments to be passed to the method.
- invokestatic byte1 byte2-The
values of byte1 and byte2
make an index into the constant_pool.
The referenced value is used to find the offset of the method
to execute. The method may be native or synchronized.
If the method is synchronized, the associated monitor is
entered before the method executes. The stack is assumed to
contain the arguments to be passed to the method.
- invokeinterface byte1 byte2-The
values of byte1 and byte2
make an index into the constant_pool.
The referenced value is used to find the offset of the interface
method to execute. The method may be native or synchronized.
If the method is synchronized, the associated monitor is
entered before the method executes. The stack is assumed to
contain the object handle and the arguments to be passed to the
method.
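As a rough guide to when each instruction applies, the sketch below shows the three kinds of call in Java source. The comments name the instruction a Java compiler typically emits for each call; the Shape and Square types and the method names are hypothetical.

```java
// Illustrative sketch: Shape, Square, and all method names are hypothetical.
class InvokeSketch {
    interface Shape {
        double area();
    }

    static class Square implements Shape {
        private final double side;

        Square(double side) {
            this.side = side;
        }

        public double area() {
            return side * side;
        }

        double perimeter() {
            return 4 * side;
        }

        static Square unit() {
            return new Square(1.0);
        }
    }

    static double demo() {
        Square square = Square.unit();         // static method: invokestatic
        double perimeter = square.perimeter(); // method called through a class type: invokevirtual
        Shape shape = square;
        double area = shape.area();            // method called through an interface type: invokeinterface
        return perimeter + area;
    }

    public static void main(String[] args) {
        System.out.println(demo());  // 5.0
    }
}
```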
The athrow instruction implements
Java exception handling capabilities:
- athrow-The top exception
object on the stack is thrown. The current method frame is searched
for the nearest catch. If none is found, the frames of the calling
methods are then searched for a handler. If none of the above is
found, the JVM default handler is executed.
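The search for a handler can be seen from Java source. In this minimal sketch, an exception thrown in one method passes through a caller that has no catch and is handled further up the call chain. All names are illustrative assumptions.

```java
// Illustrative sketch: names are assumptions, not JVM or library API.
class ThrowSketch {
    // Throws the exception; this method frame has no catch.
    static void inner() {
        throw new IllegalStateException("from inner");
    }

    // Calls inner(); this frame has no catch either, so the search continues.
    static void middle() {
        inner();
    }

    // The nearest matching catch is found here.
    static String outer() {
        try {
            middle();
            return "no exception";
        } catch (IllegalStateException e) {
            return "caught: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(outer());  // caught: from inner
    }
}
```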
The following instructions provide some object operations that
don't fall into any other category:
- new byte1 byte2
creates a new object of the type referenced by the position in
the constant_pool, indexed
by the 2-byte index constructed from byte1
and byte2.
- checkcast byte1 byte2
checks that the object handle at the top of the stack can be cast
to the type referenced by the position in the constant_pool,
indexed by the 2-byte index constructed from byte1 and byte2. If
the cast is valid, execution continues and the handle
remains on the stack. If it is not, a ClassCastException
is thrown.
- newfromname byte1 byte2
creates a new object of the type referenced by the position in
the constant_pool, indexed
by the 2-byte index constructed from byte1
and byte2.
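- instanceof byte1 byte2
checks to see if the object handle at the top of the stack is an
instance of the type referenced by the position in the
constant_pool, indexed by the 2-byte index constructed from byte1
and byte2. The handle is replaced with 1 if it is, and with 0
otherwise.
- verifystack checks to
see if the operand stack is empty. If it is not, the stack is
emptied. This instruction is generated only by a compiler that is
generating debug information.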
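The difference between testing a type and casting to it shows up directly in Java source. The following minimal sketch uses illustrative names; only ClassCastException and the instanceof operator come from Java itself.

```java
// Illustrative sketch: class and method names are assumptions.
class CastSketch {
    // A type test never throws; it only reports true or false.
    static boolean isString(Object value) {
        return value instanceof String;
    }

    // A cast either leaves the reference usable or throws ClassCastException.
    static String castOrReport(Object value) {
        try {
            String text = (String) value;
            return "ok: " + text;
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        System.out.println(isString("hello"));                 // true
        System.out.println(isString(Integer.valueOf(1)));      // false
        System.out.println(isString(null));                    // false
        System.out.println(castOrReport("hello"));             // ok: hello
        System.out.println(castOrReport(Integer.valueOf(1)));  // ClassCastException
    }
}
```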
Due to the multithreaded nature of the JVM, there is a great need
for a mechanism to control access to shared memory resources. The
following instructions provide the capability to lock and unlock a
memory object:
- monitorenter locks the
object whose handle is at the top of the stack until the current
thread releases it.
- monitorexit releases
the object handle at the top of the stack.
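In Java source, a synchronized block is the construct that a compiler translates into a monitorenter and monitorexit pair. The sketch below guards a shared counter with such a block; the class, its fields, and the thread counts are assumptions chosen for illustration.

```java
// Illustrative sketch: names and thread counts are assumptions.
class MonitorSketch {
    private final Object lock = new Object();
    private int count = 0;

    // The lock is acquired on entry to the block and released on exit.
    void increment() {
        synchronized (lock) {
            count++;
        }
    }

    int count() {
        synchronized (lock) {
            return count;
        }
    }

    // Start several threads that all update the same counter.
    static int run(int threads, int incrementsPerThread) {
        MonitorSketch shared = new MonitorSketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    shared.increment();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();  // wait for every worker to finish
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return shared.count();
    }

    public static void main(String[] args) {
        System.out.println(run(4, 1000));  // 4000
    }
}
```

Without the synchronized blocks, increments from different threads could interleave and the final count could fall short of the expected total.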
The breakpoint instruction
calls the breakpoint handler to notify the debugger of a breakpoint.
This chapter diagrams the internals of a .class
file, discusses the JVM architecture, and provides insight into
the JVM instruction set.