More Java Pitfalls Share Reactor - Daconta M,C.

Daconta M,C. More Java Pitfalls Share Reactor - Wiley publishing, 2003. - 476 p.
ISBN: 0-471-23751-5

>java org.javapitfalls.util.mcd.i1.BadStringTokenizer
input: 123###4#5###678###hello###wo#rld###9
delim: ###
If '###' treated as a group delimiter expecting 6 tokens...
tok[0]: 123
tok[1]: 4
tok[2]: 5
tok[3]: 678
tok[4]: hello
tok[5]: wo
tok[6]: rld
tok[7]: 9
# of tokens: 8
As the listing above demonstrates, the developer expected six tokens but received more, because a single "#" character inside a token also splits it. The junior developer wanted the delimiter to be the group of three pound characters, not a single pound character. BadStringTokenizer.java in Listing 18.1 is the incorrect way to parse with a delimiter of "###".
8 This pitfall was first published by JavaWorld (www.javaworld.com) in the article "Steer clear of Java Pitfalls", September 2000 (http://www.javaworld.com/javaworld/jw-09-2000/jw-0922-javatraps-p2.html), and is reprinted here with permission. The pitfall has been updated from reader feedback.
Effective String Tokenizing 141
package org.javapitfalls.item18;

import java.util.*;

public class BadStringTokenizer
{
    // Thin wrapper around StringTokenizer. Each character of the delimiter
    // string acts as a separate delimiter; the string is not matched as a group.
    public static String [] tokenize(String input, String delimiter)
    {
        Vector v = new Vector();
        StringTokenizer t = new StringTokenizer(input, delimiter);
        String cmd[] = null;

        while (t.hasMoreTokens())
            v.addElement(t.nextToken());

        int cnt = v.size();
        if (cnt > 0)
        {
            cmd = new String[cnt];
            v.copyInto(cmd);
        }

        return cmd; // null when the input contains no tokens
    }

    public static void main(String args[])
    {
        try
        {
            String delim = "###";
            String input = "123###4#5###678###hello###wo#rld###9";
            System.out.println("input: " + input);
            System.out.println("delim: " + delim);
            System.out.println("If '###' treated as a group delimiter expecting 6 tokens...");
            String [] toks = tokenize(input, delim);
            for (int i=0; i < toks.length; i++)
                System.out.println("tok[" + i + "]: " + toks[i]);
            System.out.println("# of tokens: " + toks.length);
        } catch (Throwable t)
        {
            t.printStackTrace();
        }
    }
}
Listing 18.1 BadStringTokenizer.java
The tokenize() method is simply a wrapper for the StringTokenizer class. The StringTokenizer constructor takes two String arguments: one for the input and one for the delimiter. The junior developer incorrectly inferred that the delimiter parameter would be treated as a group of characters instead of a set of single characters. Is that such a poor assumption? I don't think so. With thousands of classes in the Java APIs, the burden of design simplicity rests on the designer's shoulders and not on the application developer's. It is not unreasonable to assume that a String would be treated as a single group. After all, that is its most common use: a String represents a related grouping of characters.
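To make the difference between a set of single-character delimiters and a group delimiter concrete, here is a minimal sketch that runs the same input through both interpretations. The class name DelimiterSemanticsDemo and its two helper methods are hypothetical and do not come from the book. The group version uses String.split with Pattern.quote, which is one possible alternative and not the approach the book develops. The sketch assumes Java 5 or later for generics.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the book's code.
class DelimiterSemanticsDemo {
    // StringTokenizer treats every character of delims as its own delimiter.
    static List<String> withStringTokenizer(String input, String delims) {
        List<String> out = new ArrayList<String>();
        StringTokenizer t = new StringTokenizer(input, delims);
        while (t.hasMoreTokens()) {
            out.add(t.nextToken());
        }
        return out;
    }

    // String.split takes a regular expression; Pattern.quote makes the
    // whole delimiter string match literally, as one group.
    static String[] withSplit(String input, String delim) {
        return input.split(Pattern.quote(delim));
    }

    public static void main(String[] args) {
        String input = "123###4#5###678###hello###wo#rld###9";
        // Set semantics: "4#5" and "wo#rld" are split as well (8 tokens).
        System.out.println(withStringTokenizer(input, "###"));
        // Group semantics: only "###" separates tokens (6 tokens).
        System.out.println(Arrays.toString(withSplit(input, "###")));
    }
}
```

Note that String.split differs from StringTokenizer in other ways too; for example, it keeps empty tokens between adjacent delimiters, so it is not a drop-in replacement.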
A better StringTokenizer constructor would require the developer to provide an array of characters, which would make it clear that the delimiters in the current implementation of StringTokenizer are only single characters, though you can specify more than one. This omission is an example of API laziness. The API designer was more concerned with rapidly developing the API implementation than with how intuitive the API would be. We have all been guilty of this, but it is something we should be vigilant against.
To fix the problem, we create two new static tokenize() methods: one that takes an array of characters as delimiters, the other that accepts a Boolean flag to signify whether the String delimiter should be regarded as a single group. The code for those two methods (and one additional utility method) is in the class GoodStringTokenizer:
package org.javapitfalls.item18;

import java.util.*;

public class GoodStringTokenizer
{
    // String tokenizer with current behavior
    public static String [] tokenize(String input, char [] delimiters)
    {
        return tokenize(input, new String(delimiters), false);
    }

    public static String [] tokenize(String input, String delimiters,
                                     boolean delimiterAsGroup)
    {
        String [] result = null;
        List l = toksToCollection(input, delimiters, delimiterAsGroup);
        if (l.size() > 0)
        {
            result = new String[l.size()];
            l.toArray(result);
        }
        return result;
    }

    public static List toksToCollection(String input, String delimiters,
                                        boolean delimiterAsGroup)
    {
        ArrayList l = new ArrayList();

        String cmd[] = null;

        if (!delimiterAsGroup)
        {
            StringTokenizer t = new StringTokenizer(input, delimiters);
            while (t.hasMoreTokens())
                l.add(t.nextToken());
        }
        else
        {
            int start = 0;
            int end = input.length();

            while (start < end)
Listing 18.2 GoodStringTokenizer.java
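Treating the delimiter as a group comes down to scanning for the whole delimiter string rather than for any one of its characters. The following is a minimal sketch of that idea using String.indexOf. The class GroupDelimiterSketch and the method tokenizeAsGroup are hypothetical names, and this is not necessarily how the book's listing proceeds. The sketch assumes Java 6 or later, and it assumes that empty tokens should be skipped, as StringTokenizer does.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the book's implementation.
class GroupDelimiterSketch {
    // Splits input on each occurrence of the whole delimiter string.
    // Empty tokens (leading, trailing, or between adjacent delimiters)
    // are skipped.
    static List<String> tokenizeAsGroup(String input, String delimiter) {
        if (delimiter.isEmpty()) {
            throw new IllegalArgumentException("delimiter must not be empty");
        }
        List<String> tokens = new ArrayList<String>();
        int start = 0;
        int end = input.length();
        while (start < end) {
            int idx = input.indexOf(delimiter, start);
            if (idx < 0) {
                idx = end; // no further delimiter: the rest is the last token
            }
            if (idx > start) {
                tokens.add(input.substring(start, idx));
            }
            start = idx + delimiter.length(); // resume after the delimiter
        }
        return tokens;
    }

    public static void main(String[] args) {
        String input = "123###4#5###678###hello###wo#rld###9";
        List<String> toks = tokenizeAsGroup(input, "###");
        for (int i = 0; i < toks.size(); i++) {
            System.out.println("tok[" + i + "]: " + toks.get(i));
        }
        System.out.println("# of tokens: " + toks.size());
    }
}
```

With the sample input from this item, the sketch yields the six tokens the junior developer expected, keeping "4#5" and "wo#rld" intact.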