The Java String data type can contain a sequence (string) of characters. Strings are how you work with
text in Java.
Conceptually, a Java String is a sequence of UTF-16 code units. UTF-16 uses 2 bytes for most characters and 4 bytes (a surrogate pair) for characters outside the Basic Multilingual Plane, such as many emoji. UTF-16 is an encoding of Unicode, a character set that covers characters from a large number of languages.
Creating a String
Strings in Java are objects, so you can use the new operator to create a new String object. Here is an example:
    String myString = new String("Hello World");
The text inside the quotes is the text the String object will contain.
Java has a shorter way of creating a new String:
    String myString = "Hello World";
This notation is shorter since it does not call the String constructor explicitly. A String object is still created behind the scenes.
If you use the same string (e.g. "Hello World") in other String variable declarations, the Java compiler may only create a single String instance and make the different variables initialized to that constant string point to the same String instance.
Here is an example:
    String myString1 = "Hello World";
    String myString2 = "Hello World";
In this case the Java compiler will make both myString1 and myString2 point to the same String object. If you do not want that, use the new operator, like this:
    String myString1 = new String("Hello World");
    String myString2 = new String("Hello World");
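The difference between shared literals and objects created with the new operator can be observed with the == operator, which compares object references rather than text. The following is a minimal sketch; the class name StringIdentityDemo and the helper sameObject() are illustrative names, not part of the Java API.

```java
class StringIdentityDemo {

    // Returns true only when both variables refer to the very same String object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal1 = "Hello World";
        String literal2 = "Hello World";
        String created1 = new String("Hello World");
        String created2 = new String("Hello World");

        // Identical literals share one String instance.
        System.out.println(sameObject(literal1, literal2));          // true

        // Each use of new creates a distinct object.
        System.out.println(sameObject(created1, created2));          // false

        // The text is still equal, which is what equals() compares.
        System.out.println(created1.equals(created2));               // true

        // intern() returns the shared instance for that text.
        System.out.println(sameObject(created1.intern(), literal1)); // true
    }
}
```

Because of this, comparing String content should normally be done with equals(), not with ==.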
Concatenating Strings
Concatenating Strings means appending one string to another. Strings in Java are immutable, meaning they cannot be changed once created. Therefore, when concatenating two String objects, the result is put into a third String object.
Here is a Java String concatenation example:
    String one = "Hello";
    String two = " World";
    String three = one + two;
The content of the String referenced by the variable three will be Hello World. The two other String objects are untouched.
String Concatenation Performance
When concatenating Strings you have to watch out for possible performance problems. Java compilers before Java 9 translate a concatenation of two Strings into something like this (newer compilers emit a different instruction sequence, but each concatenation still produces a new String):
    String one = "Hello";
    String two = " World";
    String three = new StringBuilder(one).append(two).toString();
As you can see, a new StringBuilder is created, passing along the first String to its constructor, and the second String to its append() method, before finally calling the toString() method. This code actually creates two objects: a StringBuilder instance and a new String instance returned from the toString() method.
When executed by itself as a single statement, this extra object creation overhead is insignificant. When executed inside a loop, however, it is a different story.
Here is a loop containing the above type of String concatenation:
    String[] strings = new String[]{"one", "two", "three", "four", "five"};
    String result = "";
    for (String string : strings) {
        result = result + string;
    }
This code will be compiled into something similar to this:
    String[] strings = new String[]{"one", "two", "three", "four", "five"};
    String result = "";
    for (String string : strings) {
        result = new StringBuilder(result).append(string).toString();
    }
Now, for every iteration in this loop a new StringBuilder is created. Additionally, a new String object is created by the toString() method. This results in a small object instantiation overhead per iteration. This is not the real performance killer though.
When the new StringBuilder(result) code is executed, the StringBuilder constructor copies all characters from the result String into the StringBuilder. The more iterations the loop has, the bigger the result String grows. The bigger the result String grows, the longer it takes to copy the characters from it into a new StringBuilder, and again copy the characters from the StringBuilder into the String created by the toString() method. In other words, the more iterations the loop has, the slower each iteration becomes.
The fastest way of concatenating Strings in a loop is to create a StringBuilder once, and reuse the same instance inside the loop. Here is how that looks:
    String[] strings = new String[]{"one", "two", "three", "four", "five"};
    StringBuilder temp = new StringBuilder();
    for (String string : strings) {
        temp.append(string);
    }
    String result = temp.toString();
This code avoids both the StringBuilder and String object instantiations inside the loop, and therefore also avoids copying the characters twice, first into the StringBuilder and then into a String again.
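As a rough sketch of the alternatives discussed above, the class below builds the same text in three ways: with the + operator in a loop, with a single reused StringBuilder, and with String.join() (available since Java 8). The class and method names are illustrative only, and no timing claims are made here; the point is only that all three produce the same result while allocating very differently.

```java
class ConcatenationDemo {

    // Creates a new String on every iteration, copying everything built so far.
    static String withPlus(String[] parts) {
        String result = "";
        for (String part : parts) {
            result = result + part;
        }
        return result;
    }

    // Appends into one StringBuilder and creates a single String at the end.
    static String withBuilder(String[] parts) {
        StringBuilder builder = new StringBuilder();
        for (String part : parts) {
            builder.append(part);
        }
        return builder.toString();
    }

    // Lets the standard library do the appending, using an empty delimiter.
    static String withJoin(String[] parts) {
        return String.join("", parts);
    }

    public static void main(String[] args) {
        String[] strings = {"one", "two", "three", "four", "five"};
        System.out.println(withPlus(strings));    // onetwothreefourfive
        System.out.println(withBuilder(strings)); // onetwothreefourfive
        System.out.println(withJoin(strings));    // onetwothreefourfive
    }
}
```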
String Length
You can obtain the length of a String using the length() method. The length of a String is the number of characters (char values, meaning UTF-16 code units) the String contains, not the number of bytes used to represent the String.
Here is an example:
    String string = "Hello World";
    int length = string.length();
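To make the distinction between characters and bytes concrete, here is a small sketch that reports three different sizes for the same String: the number of char values, the number of Unicode code points, and the number of bytes in a UTF-8 encoding. The class and helper names are made up for this illustration.

```java
import java.nio.charset.StandardCharsets;

class StringLengthDemo {

    // Number of char values (UTF-16 code units); this is what length() returns.
    static int charCount(String s) {
        return s.length();
    }

    // Number of Unicode code points; a surrogate pair counts as one.
    static int codePointCount(String s) {
        return s.codePointCount(0, s.length());
    }

    // Number of bytes needed to encode the String as UTF-8.
    static int utf8ByteCount(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String plain = "Hello World";
        System.out.println(charCount(plain));       // 11
        System.out.println(codePointCount(plain));  // 11
        System.out.println(utf8ByteCount(plain));   // 11

        // U+1F600 (an emoji) written as a surrogate pair.
        String smiley = "\uD83D\uDE00";
        System.out.println(charCount(smiley));      // 2
        System.out.println(codePointCount(smiley)); // 1
        System.out.println(utf8ByteCount(smiley));  // 4
    }
}
```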
Substrings
You can extract a part of a String. This is called a substring. You do so using the substring() method of the String class. Here is an example:
    String string1 = "Hello World";
    String substring = string1.substring(0, 5);
After this code is executed the substring variable will contain the string Hello.
The substring() method takes two parameters. The first is the character index of the first character to be included in the substring. The second is the index of the character after the last character to be included in the substring. In other words, the parameters mean "from (inclusive), to (exclusive)". This can be a little confusing until you memorize it.
The first character in a String has index 0, the second character has index 1 etc. The last character in the string has the index length() - 1.
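The "from inclusive, to exclusive" rule can be checked with a few calls. The sketch below also uses the one-argument substring(int) overload, which extracts everything from the given index to the end of the String. The class name and the head() and tail() helpers are illustrative.

```java
class SubstringDemo {

    // The first n characters: indexes 0 (inclusive) to n (exclusive).
    static String head(String s, int n) {
        return s.substring(0, n);
    }

    // Everything from index 'from' (inclusive) to the end of the String.
    static String tail(String s, int from) {
        return s.substring(from);
    }

    public static void main(String[] args) {
        String text = "Hello World";

        System.out.println(head(text, 5));                    // Hello
        System.out.println(tail(text, 6));                    // World
        System.out.println(text.substring(6, text.length())); // World

        // The resulting length is always end index minus start index.
        System.out.println(text.substring(2, 7).length());    // 5

        // The last character sits at index length() - 1.
        System.out.println(text.charAt(text.length() - 1));   // d
    }
}
```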
Searching in Strings
You can search for substrings in Strings using the indexOf() method. Here is an example:
    String string1 = "Hello World";
    int index = string1.indexOf("World");
The index variable will contain the value 6 after this code is executed. The indexOf() method returns the index of where the first character in the first matching substring is found. In this case the W of the matched substring World was found at index 6.
If the substring is not found within the string, the indexOf() method returns -1.
There is a version of the indexOf() method that takes an index from which the search is to start. That way you can search through a string to find more than one occurrence of a substring. Here is an example:
    String theString = "is this good or is this bad?";
    String substring = "is";
    int index = theString.indexOf(substring);
    while (index != -1) {
        System.out.println(index);
        index = theString.indexOf(substring, index + 1);
    }
This code searches through the string "is this good or is this bad?" for occurrences of the substring "is". It does so using the indexOf(substring, index) method. The index parameter tells what character index in the String to start the search from. In this example the search starts 1 character after the index where the previous occurrence was found. This makes sure that you do not just keep finding the same occurrence.
The output printed from this code would be:
    0
    5
    16
    21
The substring "is" is found in four places. Two times by itself, and two times inside the word "this".
The String class also has a lastIndexOf() method which finds the last occurrence of a substring. Here is an example:
    String theString = "is this good or is this bad?";
    String substring = "is";
    int index = theString.lastIndexOf(substring);
    System.out.println(index);
The output printed from this code would be 21, which is the index of the last occurrence of the substring "is".
Comparing Strings
Java Strings also have a set of methods used to compare Strings. These methods are:
- equals()
- equalsIgnoreCase()
- startsWith()
- endsWith()
- compareTo()
equals()
The equals() method tests if two Strings are exactly equal to each other. If they are, the equals() method returns true. If not, it returns false. Here is an example:
    String one = "abc";
    String two = "def";
    String three = "abc";
    String four = "ABC";
    System.out.println( one.equals(two) );
    System.out.println( one.equals(three) );
    System.out.println( one.equals(four) );
The two strings one and three are equal, but one is not equal to two or to four. The case of the characters must match exactly too, so lowercase characters are not equal to uppercase characters.
The output printed from the code above would be:
    false
    true
    false
equalsIgnoreCase()
The String class also has a method called equalsIgnoreCase() which compares two strings but ignores the case of the characters. Thus, uppercase characters are considered to be equal to their lowercase equivalents.
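Here is a short sketch contrasting equals() with equalsIgnoreCase(). The class name and the sameText() helper are illustrative; the helper additionally treats a null first argument as "not equal", which is an assumption made for this example rather than something the String class does for you.

```java
class EqualsIgnoreCaseDemo {

    // Case-insensitive comparison that also tolerates a null first argument.
    static boolean sameText(String a, String b) {
        return a != null && a.equalsIgnoreCase(b);
    }

    public static void main(String[] args) {
        String one = "abc";
        String four = "ABC";

        System.out.println(one.equals(four));            // false
        System.out.println(one.equalsIgnoreCase(four));  // true
        System.out.println(one.equalsIgnoreCase("abd")); // false
        System.out.println(sameText(null, four));        // false
    }
}
```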
startsWith() and endsWith()
The startsWith() and endsWith() methods check if the String starts or ends with a certain substring. Here are a few examples:
    String one = "This is a good day to code";
    System.out.println( one.startsWith("This") );
    System.out.println( one.startsWith("This", 5) );
    System.out.println( one.endsWith("code") );
    System.out.println( one.endsWith("shower") );
This example creates a String and checks if it starts and ends with various substrings.
The first line (after the String declaration) checks if the String starts with the substring "This". Since it does, the startsWith() method returns true.
The second line checks if the String starts with the substring "This" when starting the comparison from the character with index 5. The result is false, since the character at index 5 is "i".
The third line checks if the String ends with the substring "code". Since it does, the endsWith() method returns true.
The fourth line checks if the String ends with the substring "shower". Since it does not, the endsWith() method returns false.
compareTo()
The compareTo() method compares the String to another String and returns an int telling whether this String is smaller than, equal to or larger than the other String. If the String is earlier in sorting order than the other String, compareTo() returns a negative number. If the String is equal in sorting order to the other String, compareTo() returns 0. If the String is after the other String in sorting order, the compareTo() method returns a positive number.
Here is an example:
    String one = "abc";
    String two = "def";
    String three = "abd";
    System.out.println( one.compareTo(two) );
    System.out.println( one.compareTo(three) );
This example compares the one String to two other Strings. The output printed from this code would be:
    -3
    -1
The numbers are negative because the one String is earlier in sorting order than the two other Strings.
The compareTo() method actually belongs to the Comparable interface. This interface is described in more detail in my tutorial about Sorting.
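You should be aware that the compareTo() method may not sort Strings correctly in languages other than English, because it compares the numeric values of the characters rather than following the sorting rules of a language. To sort Strings correctly in a specific language, use a Collator.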
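The following sketch shows the difference on a small word list: the natural String ordering (based on compareTo()) places an uppercase word first and an accented word last, while a java.text.Collator for a given locale orders them the way a dictionary would. The class name, helper names and the sample words are assumptions chosen for illustration.

```java
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

class CollatorDemo {

    // Sorts a copy of the array using String.compareTo() (character values).
    static String[] sortedByCompareTo(String[] words) {
        String[] copy = Arrays.copyOf(words, words.length);
        Arrays.sort(copy);
        return copy;
    }

    // Sorts a copy of the array using the collation rules of the given locale.
    static String[] sortedByCollator(String[] words, Locale locale) {
        String[] copy = Arrays.copyOf(words, words.length);
        Arrays.sort(copy, Collator.getInstance(locale));
        return copy;
    }

    public static void main(String[] args) {
        // "\u00e9clair" is "eclair" with an acute accent on the first letter.
        String[] words = {"zebra", "\u00e9clair", "apple", "Cherry"};

        // Character values: 'C' < 'a' < 'z' < '\u00e9'
        System.out.println(Arrays.toString(sortedByCompareTo(words)));

        // Dictionary order: apple, Cherry, eclair (accented), zebra
        System.out.println(Arrays.toString(sortedByCollator(words, Locale.FRENCH)));
    }
}
```

A Collator implements Comparator, so it can be passed anywhere a Comparator is accepted, for example to Arrays.sort() as above or to List.sort().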
Getting Characters and Bytes
It is possible to get a character at a certain index in a String using the charAt() method. Here is an example:
    String theString = "This is a good day to code";
    System.out.println( theString.charAt(0) );
    System.out.println( theString.charAt(3) );
This code will print out:
    T
    s
since these are the characters located at index 0 and 3 in the String.
You can also get the byte representation of the String using the getBytes() method. Here are two examples (the Charset class is located in the java.nio.charset package):
    String theString = "This is a good day to code";
    byte[] bytes1 = theString.getBytes();
    byte[] bytes2 = theString.getBytes(Charset.forName("UTF-8"));
The first getBytes() call returns a byte representation of the String using the default character set encoding on the machine. What the default character set is depends on the machine on which the code is executed. Therefore it is generally better to explicitly specify a character set to use to create the byte representation (as in the next line).
The second getBytes() call returns a UTF-8 byte representation of the String.
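To illustrate why the character set matters, the sketch below encodes the same five-character String with three different character sets and prints the resulting byte counts. It uses the constants in java.nio.charset.StandardCharsets, which avoid looking a character set up by name. The class name, the byteLength() helper and the sample text are illustrative.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class GetBytesDemo {

    // Number of bytes the String occupies in the given character set.
    static int byteLength(String s, Charset charset) {
        return s.getBytes(charset).length;
    }

    public static void main(String[] args) {
        // "hello" with an accented e (U+00E9) as the second character.
        String text = "h\u00e9llo";

        System.out.println(text.length());                                // 5
        System.out.println(byteLength(text, StandardCharsets.ISO_8859_1)); // 5
        System.out.println(byteLength(text, StandardCharsets.UTF_8));      // 6
        System.out.println(byteLength(text, StandardCharsets.UTF_16BE));   // 10

        // Decoding with the same character set restores the original text.
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        String decoded = new String(utf8, StandardCharsets.UTF_8);
        System.out.println(decoded.equals(text));                         // true
    }
}
```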
Converting to Uppercase and Lowercase
You can convert Strings to uppercase and lowercase using the methods toUpperCase() and toLowerCase(). Here are two examples:
    String theString = "This IS a mix of UPPERcase and lowerCASE";
    String uppercase = theString.toUpperCase();
    String lowercase = theString.toLowerCase();
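The no-argument versions of these methods use the default locale of the machine, which can change the result for some letters. Both methods also have an overload that takes a java.util.Locale. The sketch below, with illustrative class and helper names, shows the commonly cited Turkish case, where a lowercase i is converted to a dotted capital I (U+0130), and the use of Locale.ROOT for locale-independent conversion.

```java
import java.util.Locale;

class CaseConversionDemo {

    // Uppercases using the rules of an explicitly chosen locale.
    static String upper(String s, Locale locale) {
        return s.toUpperCase(locale);
    }

    // Lowercases using the rules of an explicitly chosen locale.
    static String lower(String s, Locale locale) {
        return s.toLowerCase(locale);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr");

        // Locale-independent conversion.
        System.out.println(upper("title", Locale.ROOT));          // TITLE
        System.out.println(lower("TITLE", Locale.ROOT));          // title

        // In Turkish, 'i' uppercases to U+0130 (decimal 304), not to 'I'.
        String turkishUpper = upper("title", turkish);
        System.out.println((int) turkishUpper.charAt(1));         // 304
        System.out.println(turkishUpper.equals("TITLE"));         // false
    }
}
```

For text that is not shown to users, such as keys or protocol tokens, passing Locale.ROOT keeps the result the same on every machine.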