Introduction
Object-oriented programming (OOP for short) is the programming paradigm of the 1990s. Many of the programming languages in use today are either fundamentally object-oriented (Java, Eiffel, Smalltalk) or have been given object-oriented extensions over time (Basic, Pascal, Ada). Even some scripting languages allow access to (sometimes predefined) objects or have object-oriented properties (JavaScript, Python). Object-oriented programming was one of the "silver bullets" that were supposed to lead the software industry out of its crisis and produce programs that are more robust, less error-prone and easier to maintain.
So what are the secrets of object-oriented programming? What does the term mean, and what are its main concepts? We will first deal with the basic ideas of object-oriented programming and then explain step by step, in this and the following chapters, how they are implemented in Java.
Abstraction
One of the most important ideas of object-oriented programming is the separation between concept and implementation, for example between a component and its construction plan, a dish and the recipe required for preparation or a technical manual and the specific equipment that is described by it. This kind of distinction is very meaningful in the real world. If you know how to operate a single light switch, you can also operate other, similar switches. Anyone who has a recipe for a Sachertorte is able to bake it, even if they have no other cooking or baking skills. Anyone who has obtained a driver's license can drive a car without knowing in detail about the intricate inner workings of the car.
In object-oriented programming, this distinction manifests itself in the terms object and class. An object is an actually existing "thing" from the application world of the program. It does not matter whether it is the programmed implementation of a specific, existing object or whether "only" an abstract concept is being modeled. A "class", on the other hand, is the description of one or more similar objects. "Similar" means that a class only describes objects of a certain type. These do not have to be the same in every detail, but they have to agree in so many of them that a common description is appropriate. A class describes at least three important things:
- How is the object to be operated?
- What properties does the object have and how does it behave?
- How is the object made?
Similar to the way a recipe can be used to make hundreds of Sachertorte, a class allows in principle any number of objects to be created. Each one has its own identity and may differ from all others in certain details. Ultimately, however, the object is always an instance of the class according to which it was modeled. For example, there can be fifty light switch objects in a house. They are all instances of the "light switch" class, can be operated in a comparable manner and are constructed identically. Nevertheless, we do make a distinction between the light switch object that controls the corridor lighting and the one that illuminates the basement. And both in turn clearly differ from all other light switch instances in the house.
Incidentally, it is no coincidence that we use the terms "instance" and "object" interchangeably. This is very common in object-oriented programming (even if purists still see differences between the two terms), and we want to follow this usage.
Note
This distinction between objects and classes can be seen as an abstraction. It is the first important property of object-oriented languages. Abstraction helps ignore details, thereby reducing the complexity of the problem. The ability to abstract is one of the most important prerequisites for mastering complex apparatus and techniques and its importance cannot be overestimated.
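The light switch example can be sketched in Java roughly as follows. The class name LightSwitch, its members and the demo class are assumptions made for illustration and do not come from the text above. The sketch shows one class serving as the description for several objects, each with its own identity and state.

```java
// Hypothetical class for illustration: the description of all light switches.
class LightSwitch {
    private final String location; // which lighting this switch controls
    private boolean on;            // current state, separate for every instance

    // How the object is made
    LightSwitch(String location) {
        this.location = location;
        this.on = false;
    }

    // How the object is operated
    void toggle() {
        on = !on;
    }

    boolean isOn() {
        return on;
    }

    String getLocation() {
        return location;
    }
}

class LightSwitchDemo {
    public static void main(String[] args) {
        // Two objects (instances) created from the same class
        LightSwitch corridor = new LightSwitch("corridor");
        LightSwitch basement = new LightSwitch("basement");

        corridor.toggle(); // changes only the corridor switch

        System.out.println(corridor.getLocation() + ": " + corridor.isOn());
        System.out.println(basement.getLocation() + ": " + basement.isOn());
    }
}
```

Both switches are operated in the same way, yet toggling one leaves the other untouched.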
Encapsulation
In object-oriented programming languages, a class is defined by combining a set of data with the functions (now called methods) that operate on it. The data are represented by a set of variables that are created anew for each instantiated object (these are referred to as attributes, member variables, instance variables or instance characteristics). The methods exist only once in the executable program code, but each call operates on the data of one specific object: with every method call, the runtime system passes a reference to the set of instance variables on which the method is to work.
The instance variables represent the state of an object. They can be different for each instance of a class and change during its lifetime. The methods represent the behavior of the object. Apart from deliberate exceptions, in which variables are consciously made accessible from outside, they are the only way to communicate with the object and thus to gain information about its state or to change it. The behavior of the objects of a class is specified in its method definitions and depends on the program code contained therein and the current state of the object.
This combination of methods and variables into classes is called encapsulation. It represents the second important property of object-oriented programming languages. Above all, encapsulation helps to reduce the complexity of operating an object. In order to turn a lamp on, you don't need to know much about the internal structure of the light switch. However, it also reduces the complexity of the implementation, because undefined interactions with other components of the program are prevented or reduced.
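A minimal sketch of encapsulation, assuming a hypothetical Dimmer class that is not part of the original text: the instance variable is declared private, so the methods are the only way to read or change the state, and they can keep it within a valid range.

```java
// Hypothetical class for illustration.
class Dimmer {
    // Hidden state: brightness in percent, always kept within 0..100
    private int level;

    // The only way to change the state; invalid values are clamped.
    void setLevel(int newLevel) {
        level = Math.max(0, Math.min(100, newLevel));
    }

    // The only way to query the state
    int getLevel() {
        return level;
    }

    boolean isOn() {
        return level > 0;
    }
}
```

A caller can write dimmer.setLevel(150) without knowing how the value is stored; the object itself ensures that the level never exceeds 100.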
Reuse
Abstraction and encapsulation promote the reuse of program elements, the third important property of object-oriented programming languages. A simple example is collections, i.e. objects that hold sets of other objects and process them in a certain way. Collections are often very complex internally (typically to increase speed or reduce storage requirements), but usually have a simple interface. If they are implemented as a class and the complex details are "abstracted away" by encapsulating the code and data structures, they can be reused very easily. Whenever such a collection is required in the program, only an object of the appropriate class needs to be instantiated, and the program can access it via the easy-to-use interface. Reuse is an important key to increasing efficiency and reducing errors.
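As a small illustration, the following sketch reuses the collection class ArrayList from the java.util package, assuming a Java version that supports generics. The class ReuseDemo and the method buildIngredients are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class for illustration.
class ReuseDemo {
    // Builds a list of ingredients with the ready-made ArrayList class.
    static List<String> buildIngredients() {
        List<String> ingredients = new ArrayList<>(); // instantiate, nothing to implement
        ingredients.add("chocolate");
        ingredients.add("apricot jam");
        ingredients.add("flour");
        return ingredients;
    }

    public static void main(String[] args) {
        List<String> ingredients = buildIngredients();
        System.out.println(ingredients.size());
        System.out.println(ingredients.contains("flour"));
    }
}
```

The program never sees how the list grows or how it searches for an element; it only uses the simple interface (add, size, contains).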
Relationships
Objects and classes usually do not exist in isolation, but are related to each other. For example, a bicycle is similar to a motorcycle, but it also has something in common with a car. A car, in turn, is similar to a truck. A truck can have a trailer with a motorcycle on it. A ferry is also a means of transport and can hold many cars or trucks, as can a long freight train, which is pulled by a locomotive. A truck may or may not pull a trailer. A ferry does not require a tractor unit, and it can carry not only other means of transport, but also people, animals or food.
We want to shed some light on these relationships and show how they can be reduced to a few basic types in object-oriented programming languages:
- "is-a" relationships (generalization, specialization)
- "part-of" relationships (aggregation, composition)
- Usage or calling relationships
Generalization and specialization
First, let's look at the "is-a" relationship, which denotes the relationship between "similar" classes. A bicycle is not a motorcycle, but both are two-wheelers. A two-wheeler, and thus both a bicycle and a motorcycle, is a road vehicle, just like the car and the truck. All these classes represent means of transport, which also include ships and freight trains.
The "is-a" relationship between two classes A and B says that "B is an A", that is, B has all the properties of A and probably a few more. B is therefore a specialization of A. Viewed the other way round, A is a generalization of B.
"is-a" relationships are expressed in object-oriented programming languages through inheritance. A class is not defined from scratch, but derived from another class. It then inherits all the properties of that class and can add its own as desired. In our case, B would be derived from A. A is called the base class (sometimes also the parent class) and B is called the derived class.
Inheritance can be multilevel, i.e. a derived class can itself be the base class for other classes. In this way, multi-level inheritance hierarchies can arise that naturally represent the taxonomy (i.e. the structured system of concepts) of the application world to be modeled. Inheritance hierarchies are also known as derivation trees because of their tree structure. They are mostly drawn as graphs in which the derived classes are connected to their base classes by arrows and the base classes are placed above the derived classes.
Properties of the base class for means of transport could be, for example, acquisition costs, service life or transport speed. They apply to all derived classes. In the second level of derivation we differentiate according to the type of locomotion (we could just as well have differentiated according to color, purpose or any other characteristic). In the watercraft class, properties such as displacement, seaworthiness and required crew could now be recorded. Finally, the ferry adds its transport capacities for cars, trucks and people, specifies the number of cabins in the different categories and defines whether or not it can be loaded and unloaded using the RORO procedure.
In some object-oriented programming languages, a derived class can have more than one base class (e.g. in C++ or Eiffel). In this case one speaks of multiple inheritance. The inheritance hierarchy is then no longer necessarily a tree, but has to be generalized to a directed graph. In Java, however, there is no multiple inheritance of classes, so we do not want to go into the specifics of this technique.
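The derivation chain from means of transport via watercraft to ferry could look roughly like this in Java. It is only a sketch: the class names, the selection of properties and the constructor signatures are assumptions, and most of the properties mentioned above are omitted for brevity.

```java
// Base class: properties shared by all means of transport (hypothetical selection)
class MeansOfTransport {
    private final double acquisitionCost;

    MeansOfTransport(double acquisitionCost) {
        this.acquisitionCost = acquisitionCost;
    }

    double getAcquisitionCost() {
        return acquisitionCost;
    }
}

// Second level: differentiated by type of locomotion
class Watercraft extends MeansOfTransport {
    private final int requiredCrew;

    Watercraft(double acquisitionCost, int requiredCrew) {
        super(acquisitionCost); // the inherited part is initialized by the base class
        this.requiredCrew = requiredCrew;
    }

    int getRequiredCrew() {
        return requiredCrew;
    }
}

// Third level: a ferry is a watercraft and therefore also a means of transport
class Ferry extends Watercraft {
    private final int carCapacity;

    Ferry(double acquisitionCost, int requiredCrew, int carCapacity) {
        super(acquisitionCost, requiredCrew);
        this.carCapacity = carCapacity;
    }

    int getCarCapacity() {
        return carCapacity;
    }
}
```

A Ferry object offers getAcquisitionCost and getRequiredCrew without defining them itself; it inherits them from its base classes and only adds getCarCapacity.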
Aggregation and composition
The second type of relationship, the "part-of" relationship, describes how an object is composed of other objects (this is also known as composition). For example, the freight train consists of one (or sometimes two) locomotives and a large number of freight cars. The truck consists of the tractor unit and possibly a trailer. A bicycle consists of many individual parts. Object-oriented languages implement "part-of" relationships through instance variables that can hold objects. The freight train could therefore have one (or two) instance variables of the locomotive type and an array of instance variables of the freight car type.
"Part-of" relationships do not necessarily have to describe what an object is composed of. They can also describe the more general case of an object simply holding other objects (which is also known as aggregation). Although there is a "part-of" relationship between the truck trailer and the motorcycle standing on it, or between a ferry and the road vehicles it carries, the contained object is not essential for the existence of the containing object. The trailer exists even if no motorcycle is placed on it. And the ferry can also sail empty from Kiel to Oslo.
While a careful distinction is made between the two cases in object-oriented modeling (composition denotes the strict form of aggregation based on existential dependency), object-oriented programming languages treat them in principle in the same way. In both cases there are instance variables that can hold objects. If an optional object does not exist, this is expressed by assigning the special value null. The class itself is responsible for the semantic properties of the relationship.
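Both forms can be sketched with instance variables, as described above. All class and method names in this example (FreightTrain, Locomotive, FreightCar, TruckTrailer, Motorbike) are hypothetical. The language does not distinguish composition from aggregation; the difference lies only in how the class treats its instance variables.

```java
class Locomotive { }
class FreightCar { }
class Motorbike { }

// Composition: a freight train is made up of a locomotive and its freight cars.
class FreightTrain {
    private final Locomotive locomotive;
    private final FreightCar[] cars;

    FreightTrain(Locomotive locomotive, FreightCar[] cars) {
        this.locomotive = locomotive;
        this.cars = cars;
    }

    Locomotive getLocomotive() {
        return locomotive;
    }

    int getCarCount() {
        return cars.length;
    }
}

// Aggregation: the trailer exists with or without a motorbike on it.
class TruckTrailer {
    private Motorbike cargo; // null means that the trailer is empty

    void load(Motorbike motorbike) {
        cargo = motorbike;
    }

    void unload() {
        cargo = null;
    }

    boolean isEmpty() {
        return cargo == null;
    }
}
```

In this sketch the train requires its parts at construction time, while the trailer starts out empty and can be loaded and unloaded at any time.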
Usage and calling relationships
The third type of relationship between objects or classes has the most general character. For example, if a method uses a temporary object during its execution, there is a usage relationship between the two: object x uses an instance of class Y to carry out certain operations. If an object variable of class T appears in the argument list of a method, a similar relationship to T arises. This is not a "part-of" relationship, and whether the two classes are related by inheritance is irrelevant. At a minimum, the method must know the argument class and be able to call methods on the object or pass it on to third parties.
General usage or call relationships are reflected in object-oriented programming languages in that objects are used as local variables or method arguments. They are also referred to by the term associations.
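The following sketch shows both variants of a usage relationship in one method. The classes Truck and TollStation are invented for this example; StringBuilder is a class from the Java standard library.

```java
// Hypothetical classes for illustration.
class Truck {
    private final String plate;

    Truck(String plate) {
        this.plate = plate;
    }

    String getPlate() {
        return plate;
    }
}

class TollStation {
    // Truck appears only as a method argument: TollStation uses Truck,
    // but a truck is neither a part of the station nor related to it by inheritance.
    String issueReceipt(Truck truck, int euros) {
        // StringBuilder is used as a temporary local object.
        StringBuilder text = new StringBuilder();
        text.append("Toll for ").append(truck.getPlate());
        text.append(": ").append(euros).append(" EUR");
        return text.toString();
    }
}
```

TollStation holds no instance variable of type Truck or StringBuilder; both objects are known to it only for the duration of the method call.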
In the previous sections the terms member variable, instance variable and object variable were used several times. By object variable we always mean a variable that can hold an object, i.e. whose type is a class. The opposite of an object variable is a primitive variable. The terms member variable and instance variable are used synonymously. They designate a variable that has been defined within a class and is created anew with each instance.
Polymorphism
As the last important concept of object-oriented programming languages, we want to deal with polymorphism. Translated literally, polymorphism means something like "having many forms", and it describes first of all the ability of object variables to hold objects of different classes. However, this does not happen in an uncontrolled manner; an object variable of type X is limited to objects of class X or of a class derived from it.
An object variable of the road vehicle type can therefore hold not only objects of the road vehicle class, but also objects of the two-wheeler, four-wheeler, trailer, motorcycle, bicycle, car and truck classes. This flexibility, which is astonishing at first glance, corresponds exactly to the usual way of dealing with inheritance relationships. A two-wheeler is a road vehicle, has all the properties of a road vehicle and can therefore be represented by a variable that refers to a road vehicle. The compiler does not mind that the object may have a few additional properties. It only has to ensure that the properties of a road vehicle are completely available, because nothing more is made available to the program when it accesses a variable of this type. The inheritance hierarchy guarantees exactly that.
The other way around, polymorphism doesn't work. For example, if it were possible to assign an object of the type two-wheeler to a variable of the type motorcycle, the runtime system could run into difficulties. Whenever a property is used on the motorcycle variable that is not yet available in the two-wheeler base class, the behavior of the program would be undefined if it were not a motorcycle but an object from the base class stored in it at the time of execution.
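The two directions of assignment can be tried out with a small hypothetical hierarchy (RoadVehicle, TwoWheeler and Motorcycle, with members invented for this sketch). Assigning a derived object to a base class variable is accepted, whereas the reverse direction is rejected by the compiler unless an explicit cast is written.

```java
// Hypothetical hierarchy for illustration.
class RoadVehicle {
    String describe() {
        return "road vehicle";
    }
}

class TwoWheeler extends RoadVehicle {
    int wheels() {
        return 2;
    }
}

class Motorcycle extends TwoWheeler {
    int enginePowerKw() {
        return 35; // arbitrary example value
    }
}

class AssignmentDemo {
    // Returns the engine power if the object really is a motorcycle, otherwise -1.
    static int enginePowerOf(TwoWheeler vehicle) {
        if (vehicle instanceof Motorcycle) {
            Motorcycle m = (Motorcycle) vehicle; // explicit downcast, safe after the check
            return m.enginePowerKw();
        }
        return -1;
    }

    public static void main(String[] args) {
        RoadVehicle v = new Motorcycle();  // allowed: a motorcycle is a road vehicle
        System.out.println(v.describe());  // only RoadVehicle members are visible via v

        TwoWheeler t = new TwoWheeler();
        // Motorcycle m = t;               // does not compile: a two-wheeler need not be a motorcycle
        System.out.println(enginePowerOf(t));
        System.out.println(enginePowerOf(new Motorcycle()));
    }
}
```

In Java, an explicit cast without the preceding instanceof check would not lead to undefined behavior; the runtime system would reject it with a ClassCastException if the object is not a motorcycle.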
Polymorphism becomes interesting when the programming language also implements the concept of late binding. In contrast to early binding, the decision about which version of a certain method is to be called is not made at compile time, but only at runtime. If, for example, a method named f is to be called on an object variable of type X, the name of the method is in principle already known at compile time. Object-oriented programming languages, however, allow methods to be overridden in derived classes, and since, as mentioned above, an object variable of type X can also hold objects of all classes derived from X, f could have been overridden in one of these derived classes. Which concrete method has to be called can therefore only be decided at runtime. We will present a detailed application example in Section 8.4.
Now this behavior is in no way a hindrance or undesirable, but can be used very elegantly to make automatic type-based case distinctions. Let us look again at our hierarchy of means of transport. Suppose our company has a diverse fleet of vehicles from all parts of the derivation tree. As entrepreneurs, we are of course interested in the monthly costs of each means of transport, and we would define a method getMonatsKosten in the base class for means of transport. Obviously, the method cannot be implemented there in any meaningful way, because calculating the monthly costs of our ferry, for example, is much more difficult than calculating those of the three bicycles that are also in the vehicle pool.
Instead of working out the type of each object in complex case distinctions, we only have to implement this method in each derived class. If the program has an array of means-of-transport objects, it can simply iterate over the array and call getMonatsKosten for each element. The runtime system knows the specific type of each object and calls the correct method (that is, the one from the object's own class, not the one defined in the base class).
If the implementation in a certain class matches that of its base class, the method does not need to be overridden again. In this case, the runtime system uses the implementation from the closest parent class that provides one.
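The fleet example might be sketched as follows. Only the method name getMonatsKosten comes from the text; the class names, the cost figures and the use of an abstract base class are assumptions made for this illustration.

```java
// Base class: declares the method, but cannot implement it meaningfully.
abstract class Transport {
    abstract double getMonatsKosten();
}

class Bicycle extends Transport {
    @Override
    double getMonatsKosten() {
        return 10.0; // arbitrary example value
    }
}

// Does not override getMonatsKosten: the implementation of Bicycle is used.
class CargoBike extends Bicycle {
}

class CarFerry extends Transport {
    private final int crew;

    CarFerry(int crew) {
        this.crew = crew;
    }

    @Override
    double getMonatsKosten() {
        return 50000.0 + crew * 4000.0; // arbitrary example formula
    }
}

class FleetDemo {
    // Adds up the monthly costs without any case distinction on the type.
    static double totalMonthlyCosts(Transport[] fleet) {
        double total = 0.0;
        for (Transport t : fleet) {
            total += t.getMonatsKosten(); // the method is selected at runtime
        }
        return total;
    }

    public static void main(String[] args) {
        Transport[] fleet = { new Bicycle(), new CargoBike(), new CarFerry(10) };
        System.out.println(totalMonthlyCosts(fleet));
    }
}
```

The loop in totalMonthlyCosts only knows the type Transport. Adding a further derived class to the fleet requires no change to this method.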