Elements of Programming, in Java

by Steven J. Owens (unless otherwise attributed)

This is a work in progress. I keep adding to it, then realizing that what I added is too involved for this article, and moving it out to my Real Basics article. If you're totally new to programming, read that first.

This is a very simple introduction to basic concepts in programming, using Java as the programming language. The point is mainly to orient the reader, so they can then usefully go on and read the excellent in-depth tutorials and reference works on the web, like The Java Tutorial, or Bruce Eckel's Thinking In Java.

My original intent had been more for this to be a concise overview of the language, for an experienced programmer to skim. However, it appears I've strayed into trying to explain more basic concepts at some points. Hopefully it's still a quick read for an experienced programmer. It speeds up a bit after I get the bare bones (what is an expression, etc) down.

To Start

In the ideal, platonic form:

Usually things don't work out quite that cleanly, but let's start with this and muddy it up from there.

Expressions, Statements and Blocks

Expressions

Generally, anything that can be treated like a value is an expression. So a literal, like:

"foo"

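...is an expression. Declaring a variable is, strictly speaking, a statement rather than an expression, but it gives you a name to use in expressions: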

int someVariable

So is just referring to an existing variable:

someVariable

...or an operation:

someVariable + 1

1 + 1

Or an assignment:

someVariable = someOtherVariable + 1

Or a comparison:

someVariable == someOtherVariable

Or a method call that returns a value:

compareVariables(someVariable, someOtherVariable)

Statements

You can combine a couple of expressions at once, like we did up above in the assignment example. The whole phrase of expressions, lumped together, is called a statement. In java, you put a semicolon ; at the end of a statement.

int someVariable = someOtherVariable + 1 ;

Blocks

Statements are grouped together in blocks. Blocks are usually associated with conditional keywords (if, else, switch, case) or flow-control keywords (for, while).

Most blocks in java are defined by braces for opening { and closing }. These are sometimes called "curly braces", because it's easy to confuse them with the brackets, [ and ], which are on the same keys (the curly braces are the shifted versions).

if (someVariable == someOtherVariable) {
  someVariable = someOtherVariable + 1 ;
}

Most block keywords make the curly braces optional if the block body is only going to contain one statement:

if (someVariable == someOtherVariable)
  someVariable = someOtherVariable + 1 ;

I'm only mentioning this in case you run across it in code. Personally, I always, always use the braces. Not using the braces will save you typing two characters, but it's way too easy to typo and insert an extra line between the if and the single-line body. Always using the braces will cost you two extra characters of typing (don't laugh, there are programmers who actually get hung up over that) but will never, ever cause mysterious behavior. Plus, frankly it's an extra decision I don't have to worry about when coding, or when reading my own code.
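
To make the "mysterious behavior" concrete, here's a small sketch of how the missing braces can bite. The class and method names are made up for this illustration.

```java
// Hypothetical example; BracesDemo and its methods are invented for illustration.
class BracesDemo {

    // Without braces, only the first statement after the if belongs to it.
    static int withoutBraces(int someVariable, int someOtherVariable) {
        int count = 0;
        if (someVariable == someOtherVariable)
            count = count + 1;
            count = count + 10; // indented like the body, but it always runs
        return count;
    }

    // With braces, both statements are inside the block.
    static int withBraces(int someVariable, int someOtherVariable) {
        int count = 0;
        if (someVariable == someOtherVariable) {
            count = count + 1;
            count = count + 10;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(withoutBraces(1, 2)); // prints 10
        System.out.println(withBraces(1, 2));    // prints 0
    }
}
```

The indentation in the first method is a lie, and the compiler doesn't care about indentation.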

Identifiers

The contents of a source code file are words or punctuation. The words are either one of a limited set of about 52 keywords, or variable names, or class names. The punctuation are either operators or are used to define:

Not all operators are punctuation. new and instanceof are described as operators in the Java Tutorial. I'm just going to ignore that and call them keywords.

Keywords

Besides punctuation, you have words. There's a limited set of keywords. In the old days they used to call these "reserved words" - reserved for the use of the compiler.

Note: There are also a few keywords that aren't actually in use, yet, but are reserved for future use. There's an alphabetical listing at the end, in the section titled Java Keywords, Alphabetical.

Flow Control

You use flow control keywords to control the flow of execution in your program, i.e., what operations get executed when, and what happens next.

Conditionals:

The goto keyword is reserved but not implemented. Guess they just wanted to make sure nobody ever added goto to java.

Loops:

Loops can also be labeled, but I'm describing that under Branches since you only use labels when you're using branching keywords.

Branches:

The break keyword is used with the switch conditional and with loops (for, while, and do-while).

The continue keyword is just used with loops (for, while, and do-while).

A label isn't a keyword, but is an option on break or continue to name a level of the loop. You label a loop by putting a labelname: line just before the line with the loop keyword.

Then you can use the label name with break or continue to specify which loop you want flow of control to go to.
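
Here's a rough sketch of labels in action, using nested for loops over a two-dimensional array. The label name outer and the class and method names are just ones I picked for the example.

```java
// Hypothetical example; LabelDemo and the label name "outer" are invented.
class LabelDemo {

    // Returns the index of the first row containing target, or -1 if none does.
    static int findRow(int[][] grid, int target) {
        int foundRow = -1;
        outer:
        for (int row = 0; row < grid.length; row++) {
            for (int col = 0; col < grid[row].length; col++) {
                if (grid[row][col] == target) {
                    foundRow = row;
                    break outer; // leaves both loops, not just the inner one
                }
            }
        }
        return foundRow;
    }

    // Counts the rows that contain no negative numbers.
    static int countCleanRows(int[][] grid) {
        int clean = 0;
        outer:
        for (int row = 0; row < grid.length; row++) {
            for (int col = 0; col < grid[row].length; col++) {
                if (grid[row][col] < 0) {
                    continue outer; // give up on this row, move to the next one
                }
            }
            clean++;
        }
        return clean;
    }

    public static void main(String[] args) {
        int[][] grid = { {1, 2}, {3, -4}, {5, 6} };
        System.out.println(findRow(grid, 5));     // prints 2
        System.out.println(countCleanRows(grid)); // prints 2
    }
}
```

A plain break or continue, without the label, would only affect the innermost loop.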

Conditionals

The conditional keywords, if and else, switch and case and default, are used to make decisions. switch also gets to use the break keyword.

if (this condition is true) { do that ; }

if (this condition is true) { do that ;
} else { do the other thing; }

Some languages have an elseif, but java just handles that by embedding an if, with a space between the else and the if:

if (this condition is true) { do that ;
} else if (this other condition is true) { do the other thing ; }

if (this condition is true) { do that ;
} else if (this other condition is true) { do the other thing ;
} else { do some third thing ; }

The switch and case keywords work together. You can simulate a switch using if and else, but switch and case are easier to use for large conglomerations of if/else. BUT... if you're using a switch statement, it's a good clue to look more closely at the design of your program, because you're probably not using objects right (many uses of switch statements should be handled via polymorphism instead). But at the intro level, don't worry about that.
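
For reference, here's a minimal sketch of what a switch looks like, including the break keyword and the default case. The day-numbering scheme and the names are assumptions I made up for the example.

```java
// Hypothetical example; SwitchDemo and the day numbering are invented.
class SwitchDemo {

    static String describe(int dayNumber) {
        String name;
        switch (dayNumber) {
            case 1:
                name = "Monday";
                break; // without break, execution falls through into the next case
            case 6:
            case 7:    // two cases sharing one body
                name = "weekend";
                break;
            default:   // runs when no case matches
                name = "some other day";
                break;
        }
        return name;
    }

    public static void main(String[] args) {
        System.out.println(describe(1)); // prints Monday
        System.out.println(describe(7)); // prints weekend
        System.out.println(describe(3)); // prints some other day
    }
}
```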

Variables, Data Types, Primitives

Variables contain data. Data in java is strongly typed. This means that when you declare them, you also declare very specifically what kind of data is going inside them. If you then accidentally put code elsewhere that tries to put the wrong type of data in the variable, the compiler will yell at you.

Data in java comes in two flavors; primitives and objects.

Primitive Types

Primitives are what most people think of when you say data - numbers and letters, true or false. It's a little more complicated than that, of course. There are six primitive types for numbers (byte, short, int, long, float and double), and which one you use depends on what kind of mathematical stuff you're doing with the number. The simplest is an integer, int.

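One general rule of thumb is that primitive instance variables (and array elements) always have some default value; if you don't specify a value when you declare the variable, then it goes to a standard default value. The default for int is 0 (most of the numeric types default to some flavor of 0), the default for boolean is false, and so forth. Local variables declared inside a method are the exception: they get no default, and the compiler will refuse to let you read one before you've assigned it.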
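
A quick sketch of the defaults. One caveat worth labeling loudly: as far as I know, the defaults apply to instance variables (fields) and array elements, not to local variables inside a method. The class and field names here are invented.

```java
// Hypothetical example; DefaultsDemo and its fields are invented.
class DefaultsDemo {
    int count;      // defaults to 0
    boolean ready;  // defaults to false
    double ratio;   // defaults to 0.0
    String label;   // an object reference, so it defaults to null

    public static void main(String[] args) {
        DefaultsDemo demo = new DefaultsDemo();
        System.out.println(demo.count); // prints 0
        System.out.println(demo.ready); // prints false
        System.out.println(demo.ratio); // prints 0.0
        System.out.println(demo.label); // prints null

        // A local variable gets no default. Uncommenting the next two lines
        // should make the compiler complain that local is not initialized.
        // int local;
        // System.out.println(local);
    }
}
```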

Arrays

Arrays are a structure for holding a bunch of bits of data in a line.

Arrays are kind of weird; they get their own special syntax, like the primitives do, but an array is actually an object, not a primitive, which gives it a sort of Class-like flavor (every array has a length, for example). Arrays are strongly-typed; you specify what sort of thing an Array is going to hold when you declare the array: a specific primitive type, like int, or instance references to a specific class of object, like instances of File.

You use [square brackets] in both declaring an array and in getting at a specific element of an array.

int[] numbers = new int[20] ;

That declares an array with twenty elements of type int.

To get at the 10th element in the array you use the square brackets with an integer that is the index of the element you want. Non-programmers tend to assume that you'd use 10 to get the 10th element, but the index is actually an offset - how many elements down from the first. So the first element isn't offset at all - you use an index of 0. The 10th element is nine further down, so you use an offset of 9:

numbers[9]

or

int tenthnumber = numbers[9] ;
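
Putting the array pieces together, here's a small runnable sketch. The method firstSquares is something I invented just to have values to look at.

```java
// Hypothetical example; ArrayDemo and firstSquares are invented.
class ArrayDemo {

    // Fills an array so that element i holds the square of (i + 1).
    static int[] firstSquares(int size) {
        int[] numbers = new int[size];
        for (int i = 0; i < numbers.length; i++) {
            numbers[i] = (i + 1) * (i + 1);
        }
        return numbers;
    }

    public static void main(String[] args) {
        int[] numbers = firstSquares(20);
        int firstNumber = numbers[0]; // offset 0 is the 1st element
        int tenthNumber = numbers[9]; // offset 9 is the 10th element
        System.out.println(firstNumber);    // prints 1
        System.out.println(tenthNumber);    // prints 100
        System.out.println(numbers.length); // prints 20
        // numbers[20] would throw an ArrayIndexOutOfBoundsException,
        // because the last valid index is 19.
    }
}
```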

String

String isn't an official keyword, but it might as well be. It's a special case (I hate those :-). String is a standard Java class (java.lang.String) that gets special treatment from the language. Because it lives in the standard java.lang package, it's automatically imported, so you don't have to import it to use it. There are several useful classes in java.lang, but String is one you'll be using all the time.

true, false and null

These aren't actually called keywords. What they actually are is special reserved words to represent certain literal values. But for now just think of them as keywords.

boolean

Likewise, there are the words true and false to represent either true or false, a kind of value which is named "boolean" in the math world and therefore in the programming world.

null

There's a special not-really-a-keyword called:

null

I don't want to get too side-tracked here, but I should point out that null, especially, is a bit of a big idea. It's sort of like the concept of "zero", it means something isn't there.

With the concept of "zero" in math, there are some places where you just can't use zero. In math, if you try to divide 5 by 0, you just plain can't, and if you try to write a program that does that with integers, you get an error.

With null, unfortunately, that sort of problem situation happens in a whole lot more places. Having an explicit value for null is nicer than having unpredictable things happen when you haven't defined a variable yet, but a lot of programming errors cause a NullPointerException and you'll probably come to hate null a bit. Sorry.
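
Here's a small sketch of both problem situations: calling a method on a null reference, and dividing an integer by zero. The helper names are made up. Catching NullPointerException like this is only for demonstration; in real code you'd check for null first, as safeLength does.

```java
// Hypothetical example; NullDemo and its methods are invented.
class NullDemo {

    // Checks for null before touching the reference.
    static int safeLength(String text) {
        if (null == text) {
            return 0;
        }
        return text.length();
    }

    // Integer division by zero throws an ArithmeticException.
    static String divide(int top, int bottom) {
        try {
            return String.valueOf(top / bottom);
        } catch (ArithmeticException e) {
            return "cannot divide by zero";
        }
    }

    public static void main(String[] args) {
        String missing = null;
        System.out.println(safeLength(missing)); // prints 0
        System.out.println(safeLength("foo"));   // prints 3
        System.out.println(divide(5, 0));        // prints cannot divide by zero
        try {
            missing.length(); // there is no object here to ask for a length
        } catch (NullPointerException e) {
            System.out.println("NullPointerException");
        }
    }
}
```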

Object Types

Objects are complex types of data, defined either by the java API libraries, or by your own code.

To talk about objects, you have to talk about classes of objects and instances of objects.

A class is the definition, or blueprint of some sort of object. It's sort of like the platonic "ideal form" of the object, or a generic object. It defines all the stuff that's supposed to be in a particular type of object, all the things a particular type of object is supposed to be able to do.

In the real world, we have a whole lot of individual objects that each has its own properties, and we look at some subset of those objects, describe the common properties and qualities that the objects all have, and from that we invent this abstract idea, a generalization, a named category that we call a "class" of objects; a description of properties and qualities that a number of particular examples - instances - of the class have in common.

In the software world, not too surprisingly, we do it backwards - we define an abstract idea of a class and then we create individual instances of that class. You "instantiate" an object with the new keyword:

StringBuffer foo = new StringBuffer();

The variable foo doesn't hold the StringBuffer itself; it holds a reference to it. If you're coming at this from a C background, a reference is basically a smart pointer that won't let you play games with the underlying memory address value. It's also a pointer you don't have to worry about explicitly cleaning up after, because the java garbage collector (GC for short) takes care of that for you. This doesn't mean you get to ignore memory issues totally, but it simplifies things a lot, most of the time.

If you're not coming at this from a C background, it's time for a trip down memory lane...

In days of yore, in earlier languages where the whole object thing was sort of bolted on afterwards, they actually used addresses - memory addresses, to be specific. They had a pointer to a memory address which was where the stored data that made up the object started. This was an actual numeric value, and as you might guess, people started playing tricks with the numbers. As you also might guess, if you slipped up and got it wrong, Bad Things Happened. Often quite mysterious and cryptic Bad Things. And lastly, as you also might guess, people started playing tricks not just with how they wrote the program, but with how they used the program, looking for where somebody got either too sloppy or too tricky, or both, with the memory addresses, so they could trick the program into doing something it wasn't supposed to do.

In these new-fangled times, when we use programming languages based on concepts developed in the 1960s, we have references instead of pointers to memory addresses. References hide the underlying mechanics of memory addresses from the programmer. Like a pointer, a reference is a data value when you get right down to it; you can store a reference in a variable, or copy it into another variable. Unlike a pointer, you can't monkey around with the actual data of the reference.

References are always to instances of classes.

A variable that is declared for holding an object reference, but not yet set, holds the value null.
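
A short sketch of what "copying a reference" means in practice: two variables end up pointing at the same object, so a change made through one shows up through the other. The class and method names are invented for the example.

```java
// Hypothetical example; ReferenceDemo and appendThroughAlias are invented.
class ReferenceDemo {

    // Copies the reference into a second variable and changes the object through it.
    static String appendThroughAlias(StringBuffer original, String extra) {
        StringBuffer alias = original; // copies the reference, not the object
        alias.append(extra);           // changes the one shared object
        return original.toString();
    }

    public static void main(String[] args) {
        StringBuffer foo = new StringBuffer("abc");
        System.out.println(appendThroughAlias(foo, "def")); // prints abcdef

        StringBuffer bar = foo;
        System.out.println(foo == bar); // prints true: same object

        StringBuffer copy = new StringBuffer(foo.toString());
        System.out.println(foo == copy); // prints false: equal text, different object

        StringBuffer unset = null; // declared, but not referring to any object
        System.out.println(unset == null); // prints true
    }
}
```

Note that == on references asks "is this the same object?", not "do these objects contain the same thing?".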

Keywords for Organizing Your Code

Not really done yet, but here's an outline:

Method signatures also use the throws keyword, which we talk about a little further down in the exceptions section.

The new and instanceof keywords are actually officially referred to as operators in The Java Tutorial. For some reason, return and super aren't called operators, though they also do things.

this is an automatically defined and available variable; in any method, this points to the object that contains the method. You don't have to always use this to refer to other methods or instance variables on the object, but I like to make it a habit to do so, so I can tell, looking at my code, where I'm using instance variables and where I'm using local variables.
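
Here's a minimal sketch of that habit. The class Tally is invented for the example. The constructor shows the one spot where this is actually required, because the parameter has the same name as the instance variable.

```java
// Hypothetical example; Tally is an invented class.
class Tally {
    private int count; // instance variable

    Tally(int count) {
        // this.count is the instance variable; plain count is the parameter.
        this.count = count;
    }

    void increment() {
        int step = 1;                   // local variable, no "this"
        this.count = this.count + step; // instance variable, marked with "this"
    }

    int getCount() {
        return this.count;
    }

    public static void main(String[] args) {
        Tally tally = new Tally(5);
        tally.increment();
        System.out.println(tally.getCount()); // prints 6
    }
}
```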

Exceptions

The throw keyword starts an exception, which interrupts whatever your program is doing. You have the program throw an exception when it runs into some catastrophic error which means the program can't finish whatever it was doing. Exceptions are a bit complicated, so ignore them for a little while, but you're going to have to learn them soonish, because Exceptions are used a lot in the java API libraries.

You'll also see the throws keyword used in method signatures, to indicate that a method might use the throw keyword to throw an exception if it runs into a problem. throws usually lists the types of exceptions that a specific method can throw.
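
A bare-bones sketch of throw and throws, plus the try and catch keywords you use on the receiving end. The method names are made up, and I'm throwing a plain Exception to keep it short; real code would usually use a more specific exception class.

```java
// Hypothetical example; ExceptionDemo and its methods are invented.
class ExceptionDemo {

    // "throws Exception" warns callers that this method might throw.
    static int checkedDivide(int top, int bottom) throws Exception {
        if (bottom == 0) {
            throw new Exception("cannot divide by zero"); // interrupts the method here
        }
        return top / bottom;
    }

    // The caller has to either catch the exception or declare throws itself.
    static String tryDivide(int top, int bottom) {
        try {
            return String.valueOf(checkedDivide(top, bottom));
        } catch (Exception e) {
            return "error: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryDivide(6, 3)); // prints 2
        System.out.println(tryDivide(1, 0)); // prints error: cannot divide by zero
    }
}
```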

Access

These keywords go on class declarations, instance variable declarations and method declarations. They determine the rules for what other parts of the program get access to the class, instance variable or method.

There's also "package", which is what you get when you don't specify either private, protected or public.

Tricky Stuff

You're probably not going to use these very often, but you are going to run into them (particularly abstract, final and static). These keywords control some very tricky aspects of Java, and someday I'll write some useful information here for you.

Abstract is for when you have enough information to put some useful work into a class, but not enough to finish it. You put the work in and sketch out the remaining-to-be-done stuff with abstract declarations. Then some other programmer comes along and makes a descendant of your abstract class and finishes the job. Abstract tells the compiler that people shouldn't be trying to directly instantiate the class, and it tells the compiler what things people have to finish up before they can call their subclass finished.

Abstract is pretty kludgy-feeling, but it's definitely handy. It's for those in-between situations, when you have too much common stuff to put duplicates of it on each subclass but not enough to make a fully functional superclass.
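
To give a feel for it, here's a small sketch of an abstract class with the common work done and one method left for a subclass to finish. Shape and Square are names I made up for the example.

```java
// Hypothetical example; Shape, Square and AbstractDemo are invented.
abstract class Shape {
    private final String name;

    Shape(String name) {
        this.name = name;
    }

    // Left unfinished on purpose: every subclass has to supply this.
    abstract double area();

    // The common work, written once and shared by all subclasses.
    String describe() {
        return this.name + " with area " + this.area();
    }
}

class Square extends Shape {
    private final double side;

    Square(double side) {
        super("square");
        this.side = side;
    }

    @Override
    double area() {
        return this.side * this.side;
    }
}

class AbstractDemo {
    public static void main(String[] args) {
        // Shape shape = new Shape("blob"); // would not compile: Shape is abstract
        Shape shape = new Square(2.0);
        System.out.println(shape.describe()); // prints square with area 4.0
    }
}
```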

Final is morally the opposite of Abstract - not only is it done, it's final - no changes will be allowed, no subclasses will be allowed. java.lang.String is final. Final seems to be intended for situations where security or performance is an issue - you really don't want people able to monkey with the innards of String, not when you're trying to run code from several different sources in the same environment. There are also some performance benefits if the compiler can count on the class not being modified at run-time. However, it's not a good idea to declare classes final just in the hopes you might eke a few more CPU cycles out of it.

Static is for non-OO methods. The main() method of your startup class will be static. Its job will be to set up all of the object instances for the application and then push over the first domino. In practice, you should rarely be using static on method declarations.

Also, for some reason "static" is also used when you declare a class-wide variable; the resulting variable is shared between the different instances of the class, instead of each instance getting its own copy. I've never quite understood why they use the same word for both situations, but I've never been interested enough to try to find out.
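
A quick sketch of the difference between a static variable and an ordinary instance variable. Widget is an invented class; created is shared by every Widget, while each Widget has its own id.

```java
// Hypothetical example; Widget is an invented class.
class Widget {
    static int created = 0; // one copy, shared by the whole class
    final int id;           // one copy per instance

    Widget() {
        Widget.created = Widget.created + 1;
        this.id = Widget.created;
    }

    public static void main(String[] args) {
        Widget first = new Widget();
        Widget second = new Widget();
        System.out.println(first.id);       // prints 1
        System.out.println(second.id);      // prints 2
        System.out.println(Widget.created); // prints 2
    }
}
```

Notice that the static variable is reached through the class name, not through any particular instance.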

Organizing your code

Not done yet.

Objects

Not done yet.

Methods (aka procedures, subroutines, functions)

Not done yet.

instance variables, local variables

Not done yet.

Statics (stateless, pseudo-global variables)

Not done yet.

Constructors

Not done yet.

Classes versus Instances

One thing that folks tend to get hung up on is the whole classes and instances confusion, especially if they've learned some basics in procedural programming languages (like BASIC, perl, C, etc).

This is natural. Most people can understand the concept of a list of instructions. Most people can understand breaking a list down into discrete, labeled chunks. Object-oriented (OO) programs run a bit differently, and from my experience the OO world tends to sort of take this basic concept for granted.

Note: If you're a serious C programmer and this is your first exposure to OO, I recommend going and reading the first couple chapters (at least!) of Bruce Eckel's Thinking In Java. He talks about these topics in terms more natural to somebody who understands what, for example, a stack and a heap are.

Procedural programs run. Objects exist, and react to input.

In a procedural program, you tell the operating system to load and run the program. Depending on the procedural language, the operating system will just start at the first line, or it will evaluate the contents of the file, build some structure - a set of subroutines or functions, for example, and look for a set place to start - in the C programming language, the starting point is the "main" function.

In a purely object-oriented program, things happen differently. Something has to bring the objects into existence (this is called instantiating an object - creating an instance of an object - more on this in a moment), and then some input comes in (usually described as an event happening), and the objects react to that input - and they may react by sending input to other objects, and so forth.

Objects, Instances, Classes

The word "class" is generally used in the sense of a kind, or category of things. The American Heritage definition of it is:

A set, collection, group, or configuration containing members regarded as having certain attributes or traits in common; a kind or category.

In programming, the word class is used to mean pretty much the same thing, except that often people shorten phrases like "class definition" to just "class."

I think one reason this topic is confusing to people is that the general direction that OO programmers use these terms is more or less the opposite of the direction that normal people use them. Notice I said the direction is opposite, not the meaning.

Most people deal with individual things in their day-to-day life, and occasionally they have to look at the similarities of those individual things and think about them as a class of things. OO programmers, on the other hand, spend a lot of time thinking about the characteristics of a class of things, and defining them in detail, and then thinking about how the various instances of a class interact. Good OO programmers tend to bounce back and forth between thinking about how the instances of a class interact, and how the class is defined, refining their understanding of both as they go.

So in general OO usage of the term, a class is an object definition - if you're a philosophy type, you can think of it as the platonic ideal form of the object. The class file is what you create when you put together the source code and compile it. An object is an instance of a class that gets created (instantiated) by something else.

The word object and the word instance are used interchangeably a lot, almost as if they're the same thing, and they almost are. People usually use the word "instance" when they're talking about an object relative to a class (almost the same as you would say in normal conversation, "for instance...").

Start Me Up

But what gets the whole ball rolling? In the java world, any class can have a main() method. When you run the java interpreter, you give it the name of a class to start with. The java interpreter runs that class's main() method and the main method has to instantiate objects and do something.

However, it's also common - in fact, probably more common in java - for a given project to use a framework, like a web application server (a servlet engine), or an applet running in a browser. In that case, there's another application loading and running your code, and it'll look at some configuration to figure out what class to instantiate, and what methods on that class to call.

Note: Notice that the java interpreter does not instantiate an object of the class you named. If the main method wants to do anything with the class it's defined on, the first thing it has to do is instantiate an object of that class.
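
Here's a minimal sketch of a main() method that instantiates its own class and then hands off to the instance. Greeter is an invented class.

```java
// Hypothetical example; Greeter is an invented class.
class Greeter {
    private final String name;

    Greeter(String name) {
        this.name = name;
    }

    String greet() {
        return "Hello, " + this.name;
    }

    // main is static, so when it starts running no Greeter object exists yet.
    public static void main(String[] args) {
        Greeter greeter = new Greeter("world"); // instantiate an object...
        System.out.println(greeter.greet());    // ...then push over the first domino
        // prints Hello, world
    }
}
```

After compiling, you would start this with something like java Greeter, naming the class whose main() should run.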

Note: In some languages with procedural roots, the programmer has to write a big loop that encompasses the whole program. The program starts up, sets up all the objects, then loops forever, looking for input and feeding it to the right objects. I am, of course, simplifying, not least because I've never had to actually write OO code using that approach.

Semantics Are Half Your Job

People like to handwave away some things and say "that's just semantics" but semantic means, roughly, "meaning". The word itself comes from the greek word for significant, which itself comes from the word sign - so the semantics of something are the signs to meaning, the important clues to what's going on.

The point of this little diatribe is that choosing the right names for the pieces of your program - the classes, the methods, the instance variables - is a really important thing, and you should always be ready to change a name if you suddenly realize that it's not saying what it really should be (of course, you have to be careful to change it everywhere you use it). Having the right semantics is at least half the job!

Literal on the Left: avoiding assignment-by-typo bugs

A single equals sign = is assignment.

A double equals sign == is comparison.

It is very, very easy to typo and type = when you meant ==.

Always put the literal on the left, when doing a comparison with a literal (or with a literal stored in a final static variable). It's trivial, but it will never do weird things in your code, and the compiler will refuse to compile it if you make a typo.

A reminder: a literal is when you type an actual value in your code, like:

1
"foo"
1.5

As an example of what not to do:

int i ;
i = 5 ;
if (i = 3) {
  System.out.println("i does equal 3") ;
} else {
  System.out.println("i is " + i) ;
}

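In this example, you intend to check whether i is equal to the value 3, and print the first message, else print the second message. Trouble is, "i = 3" assigns the value 3 to the variable i instead of comparing. In C, code like this compiles and always prints the first message. Java is stricter: an if condition has to be a boolean, so the Java compiler rejects this particular typo with an int. The typo still bites in Java when the variable is a boolean, because something like "flag = true" is a perfectly legal boolean expression.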

Instead, put the literal value on the left side of the comparison. For example:

int i ;
i = 5 ;
if (3 = i) {
  System.out.println("i does equal 3") ;
} else {
  System.out.println("i is " + i) ;
}

This will cause the compiler to yell at you about trying to assign "i" to the literal 3. Also, the act of typing it in this sequence will make you think about it in a slightly different way. Your own brain looks at this and says "It's stupid to try to set 3 to some value!"
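
Here's a runnable sketch of the typo with a boolean variable, which, as far as I know, is the one case where Java will compile the mistake without an error. The class and method names are invented.

```java
// Hypothetical example; LiteralOnLeftDemo and its methods are invented.
class LiteralOnLeftDemo {

    // Buggy: the single = assigns true to ready, so the test always passes.
    static String buggy(boolean ready) {
        if (ready = true) {
            return "ready";
        }
        return "not ready";
    }

    // Literal on the left, with a proper ==.
    // Mistyping this as (true = ready) would not compile.
    static String fixed(boolean ready) {
        if (true == ready) {
            return "ready";
        }
        return "not ready";
    }

    public static void main(String[] args) {
        System.out.println(buggy(false)); // prints ready (wrong!)
        System.out.println(fixed(false)); // prints not ready
    }
}
```

With booleans you can also sidestep the whole issue by writing if (ready) and skipping the comparison entirely.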

API Libraries

There are a ton of API libraries. I'll try to get back to here and give you some basic idea of where to start.

Collections

The Java Collections API is a set of objects for organizing other objects. This is a really, really useful toolbox to know. Take a look at my Java Collections Tutorial.

Odds and Ends

Naming Rules and Conventions

In the long run, you should follow The Java Tutorial's rules for variable naming, and also learn Sun's java naming conventions.

Conventions aren't "rules", but a lot of people think they're really, really important. Following the naming conventions makes it a lot easier for other people to read your code, which makes it a lot easier for you to get them to help you (or hire you).

Literals, Constants and Conventions

Having literals scattered around your code is generally considered very unwise. A common convention is to assign the literal value to a variable (convention is to name that variable with all UPPERCASE) and use the variable elsewhere. That way, if you ever have to change the literal, you can just change it in one spot. It's also common, if you have a bunch of literals that are related, to put them all in one spot in the code.

It's also common to use the final and static keywords with those literals, to tell the compiler that you intend to never change those values after compiling the program (unless, of course, you change the source and compile the program again). In theory this makes the program slightly faster, but more importantly, it makes it really clear to the compiler what you intend. This also means that the compiler will yell at you if you typo a == comparison and end up with:

SOMELITERAL = somevariable
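
A small sketch of the convention, with the related literals gathered in one spot as static final variables. The constant names and values are made up for the example.

```java
// Hypothetical example; ConstantsDemo and its constants are invented.
class ConstantsDemo {
    // Related literals, declared once, in one place.
    static final int MAX_RETRIES = 3;
    static final String DEFAULT_NAME = "foo";

    // Constant on the left; typing MAX_RETRIES = attempt by mistake
    // would not compile, because a final variable can't be reassigned.
    static boolean isLastRetry(int attempt) {
        return MAX_RETRIES == attempt;
    }

    static String nameOrDefault(String name) {
        if (null == name) {
            return DEFAULT_NAME;
        }
        return name;
    }

    public static void main(String[] args) {
        System.out.println(isLastRetry(3));      // prints true
        System.out.println(isLastRetry(1));      // prints false
        System.out.println(nameOrDefault(null)); // prints foo
    }
}
```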

Java Keywords, Alphabetical

Check out The Java Tutorial's full list of java keywords.

It's the authoritative place to look for java keywords, but I don't really find that format easy to read, so here's a list of the java keywords in alphabetical order. Some of the keywords (label, true, false, null) are left out of the list at The Java Tutorial, so I have them in emphasized text in this list.


