Let's look at this simpler case:
class TesteRegex {
    public static void main(String[] args) {
        String s = " A B C ";
        String[] d = s.split(" ", 5);
        System.out.println(d.length);
        for (int i = 0; i < d.length; i++) {
            System.out.println("[" + d[i] + "]");
        }
    }
}
It produces this output:
5
[]
[A]
[B]
[C]
[]
See this running on ideone.
The point is that every space is treated as a separator. The first space separates the start of the string from A, the second separates A from B, the third separates B from C, and the fourth separates C from the end of the string. That gives 5 resulting pieces: the (empty) beginning, A, B, C and the (empty) end.
However, if you remove , 5 from the code above, only the first four pieces are returned. The reason is in the javadoc of the split method, which says:
Trailing empty strings are therefore not included in the resulting array.
However, there is no corresponding rule that removes empty strings at the beginning (leading empty strings); the rule applies only to the empty strings at the end.
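The effect of the limit parameter on the same input can be checked with a small sketch like the one below. The class name SplitLimitDemo and the helper show are made up for this illustration; only String.split itself comes from the JDK.

```java
class SplitLimitDemo {
    // Hypothetical helper: wraps each piece in brackets so empty strings are visible.
    static String show(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (String p : parts) {
            sb.append('[').append(p).append(']');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = " A B C ";
        // No limit (same as limit 0): trailing empty strings are dropped,
        // but the leading one stays.
        System.out.println(show(s.split(" ")));     // [][A][B][C]
        // Positive limit: at most 5 pieces, trailing empty string kept.
        System.out.println(show(s.split(" ", 5)));  // [][A][B][C][]
        // Negative limit: no cap on the number of pieces, nothing dropped.
        System.out.println(show(s.split(" ", -1))); // [][A][B][C][]
    }
}
```

In all three calls the leading empty string is present; only the trailing one depends on the limit.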
Looking at the code of the split(String, int) method, empty strings at the end are removed only when limit (the second parameter of split) is zero. Before that step, the last segment is added to the list even if it is empty:
    // Add remaining segment
    if (!limited || list.size() < limit)
        list.add(substring(off, length()));
The analogous method in the java.util.regex.Pattern class does the same:
    // Add remaining segment
    if (!matchLimited || matchList.size() < limit)
        matchList.add(input.subSequence(index, input.length()).toString());
But neither method bothers to remove empty strings at the beginning.
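To make the asymmetry concrete, here is a minimal sketch of that trailing-only cleanup. It is not the JDK code: dropTrailingEmpty is a hypothetical helper written for this answer that walks backwards from the end of the list and stops at the first non-empty piece, so anything at the front is never touched.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TrailingEmptySketch {
    // Hypothetical helper, not JDK code: drops empty strings from the end of
    // the list only, similar in effect to what split does when limit is zero.
    static List<String> dropTrailingEmpty(List<String> pieces) {
        int size = pieces.size();
        while (size > 0 && pieces.get(size - 1).isEmpty()) {
            size--;
        }
        return new ArrayList<>(pieces.subList(0, size));
    }

    public static void main(String[] args) {
        // With a negative limit, split keeps every piece, including empty ones.
        List<String> all = Arrays.asList(" A B C ".split(" ", -1));
        System.out.println(all);                    // [, A, B, C, ]
        System.out.println(dropTrailingEmpty(all)); // [, A, B, C]
    }
}
```

The second line of output matches what split(" ") returns for the same input: the empty string at the front survives.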
I'm not sure what the reason for this behavior is. I think it might have something to do with this, but I'm not sure. Whatever the motivation, the behavior is deliberate, not accidental. It also cannot be changed now without breaking compatibility.
So the solution is either to call trim() before splitting, or to check whether the first element is empty and, if so, ignore or remove it.
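Both options could look roughly like this. The class and method names are invented for the example, and it assumes a single space as the separator, as in the code above.

```java
import java.util.Arrays;

class SplitWithoutLeadingEmpty {
    // Option 1: trim the input so no separator is left at either end.
    static String[] splitTrimmed(String s) {
        return s.trim().split(" ");
    }

    // Option 2: split first, then skip the first element if it is empty.
    static String[] splitSkippingFirstEmpty(String s) {
        String[] parts = s.split(" ");
        if (parts.length > 0 && parts[0].isEmpty()) {
            return Arrays.copyOfRange(parts, 1, parts.length);
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitTrimmed(" A B C ")));            // [A, B, C]
        System.out.println(Arrays.toString(splitSkippingFirstEmpty(" A B C "))); // [A, B, C]
    }
}
```

Keep in mind that trim() strips all leading and trailing whitespace, not just the separator, so the second option is the closer fit when the separator is something other than a space.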