Just to add more detail to @Articuno's answer.
As @bigown noted in a comment, strings are immutable entities, and other comments link to more material on immutability. One important consequence of that immutability is optimization: substring operations in Java can be implemented very cheaply (source). Depending on the Java implementation, a substring can be represented as a contiguous slice of an existing character array. In the example from the source, on JDK 6 `"abcde".substring(1,3)` creates a new string object that carries the same character array `{'a', 'b', 'c', 'd', 'e'}` as the original string, but treats it as starting at position 1 (`b`) and ending just before position 3, that is, at position 2 (`c`).

So a Java implementation does not guarantee that a string is as long as the character array holding its data; the string may use only part of that array. If the string stores a start index and an end index, `length()` returns `fim - comeco` (end minus start). If it stores the `offset` of the first character and a `count`, `length()` returns `count` (this is how the source describes JDK 6). If the implementation instead copies the desired subarray (as the source says JDK 7 does), `length()` simply returns `vetorDeChar.length`, the length of the character array.
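The two strategies described above can be sketched roughly as follows. This is only an illustration, not the JDK's actual `String` source: the class names `SharedText` and `CopiedText` and the helper `backingArrayLength()` are hypothetical, and only the `offset`/`count` idea comes from the description above.

```java
import java.util.Arrays;

/** Hypothetical sketch of the JDK 6-style strategy: a substring shares the parent's array. */
final class SharedText {
    private final char[] value; // shared between parent and substrings, never modified
    private final int offset;   // index of the first character inside value
    private final int count;    // number of characters that belong to this string

    SharedText(String s) {
        this(s.toCharArray(), 0, s.length());
    }

    private SharedText(char[] value, int offset, int count) {
        this.value = value;
        this.offset = offset;
        this.count = count;
    }

    SharedText substring(int begin, int end) {
        if (begin < 0 || end > count || begin > end) {
            throw new StringIndexOutOfBoundsException();
        }
        // No copy: same array, different window.
        return new SharedText(value, offset + begin, end - begin);
    }

    int length() { return count; }

    int backingArrayLength() { return value.length; }

    @Override
    public String toString() { return new String(value, offset, count); }
}

/** Hypothetical sketch of the JDK 7-style strategy: a substring copies its characters. */
final class CopiedText {
    private final char[] value; // holds exactly the characters of this string

    CopiedText(String s) {
        this(s.toCharArray());
    }

    private CopiedText(char[] value) {
        this.value = value;
    }

    CopiedText substring(int begin, int end) {
        if (begin < 0 || end > value.length || begin > end) {
            throw new StringIndexOutOfBoundsException();
        }
        // Copy only the requested range into a fresh array.
        return new CopiedText(Arrays.copyOfRange(value, begin, end));
    }

    int length() { return value.length; }

    int backingArrayLength() { return value.length; }

    @Override
    public String toString() { return new String(value); }
}
```

With `"abcde"` and `substring(1, 3)`, both sketches yield `bc` with `length()` equal to 2, but the shared version still holds the 5-character array while the copied version holds only 2 characters.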
One advantage of exposing `string.length()` as a method is that the internal implementation can change completely (including which fields exist) without affecting programmers who use strings. In smaller applications, the choice of string implementation strategy is usually not noticeable at first.
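As a small illustration of that point, code written against the `length()` method works the same way for any `CharSequence`, regardless of how each implementation stores its characters. The class `LengthDemo` and the method `totalLength` are hypothetical names chosen for this sketch.

```java
/** Hypothetical helper: depends only on the length() method, never on internal fields. */
final class LengthDemo {
    // String, StringBuilder, and other CharSequence implementations keep their
    // characters in different internal structures, but the caller only sees length().
    static int totalLength(CharSequence... parts) {
        int total = 0;
        for (CharSequence part : parts) {
            total += part.length();
        }
        return total;
    }
}
```

For example, `LengthDemo.totalLength("abc", new StringBuilder("de"))` adds up the lengths of a `String` and a `StringBuilder` without knowing anything about their fields.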
@Maniero talks about properties with encapsulated access, which could resolve this Java idiosyncrasy.
An interesting point about arrays is that access to their length is optimized by the JVM. There is a bytecode dedicated to it: `0xBE`, with the mnemonic `arraylength`. Using this dedicated bytecode produces smaller code than the ordinary field-access bytecode, `getfield` (`0xB4`), which needs two additional bytes for the field index. Wikipedia has a page listing the bytecodes.
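To see the difference yourself, a minimal sketch like the one below can be compiled and inspected with `javap -c`. The class `LengthBytecode` and its nested `Holder` are hypothetical names for this illustration; the instruction sequences in the comments are what a typical `javac` is expected to emit, and the exact constant-pool index will vary.

```java
/** Hypothetical class for comparing array-length access with ordinary field access. */
final class LengthBytecode {
    // Plain object with an int field, used only for the comparison.
    static final class Holder {
        final int count;

        Holder(int count) {
            this.count = count;
        }
    }

    // Expected bytecode: aload_0, arraylength, ireturn
    // arraylength is a single byte (0xBE) with no operands.
    static int viaArray(char[] chars) {
        return chars.length;
    }

    // Expected bytecode: aload_0, getfield <index>, ireturn
    // getfield (0xB4) is followed by a two-byte constant-pool index.
    static int viaField(Holder holder) {
        return holder.count;
    }
}
```

After compiling, running `javap -c LengthBytecode` should list the instructions of both methods side by side.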