11

I've been reviewing the Java regex library and was surprised to find that the Pattern class does not have a public constructor, something I had taken for granted for years.

One reason I suspected for using the static compile method instead of a constructor is that a constructor always returns a new object, while a static method might return a previously created (and cached) object when the pattern string is the same.

However, it is not the case as demonstrated by the following.

import java.util.regex.Pattern;

public class PatternCompiler {
    public static void main(String[] args) {
        // Compile the same regex twice and compare the two references.
        Pattern first = Pattern.compile(".");
        Pattern second = Pattern.compile(".");
        if (first == second) {
            System.out.println("The same object has been reused!");
        } else {
            System.out.println("Why not just use constructor?");
        }
    }
}

Are there any other strong rationales for using a static method rather than a constructor?

Edit: I found a related question here. None of the answers there convinced me either. Reading through all answers, I get a feeling that a static method has quite a few advantages over a public constructor regarding creating an object but not the other way around. Is that true? If so, I'm gonna create such static methods for each one of my classes and safely assume that it's both more readable and flexible.

Terry Li
  • Did you try to dive into the source code? – John Dvorak Dec 07 '12 at 07:30
  • Perhaps the reason is to ensure readability. – John Dvorak Dec 07 '12 at 07:35
  • @JanDvorak `new Pattern(".")` is as readable if not more so, isn't it :) – Terry Li Dec 07 '12 at 07:36
  • What if, hypothetically, there was another way to build a Pattern than from a regex string, that also happens to be a String (say, from a JSON-encoded DFA transition table)? – John Dvorak Dec 07 '12 at 07:38
  • @JanDvorak Well, you can overload a constructor as often as you can a static method. – Terry Li Dec 07 '12 at 07:40
  • Not if both overloads have the same signature. – John Dvorak Dec 07 '12 at 07:42
  • @JanDvorak Good point. I think I got what you meant. `compile` may potentially have a mechanism to decide which constructor to use depending on the actual input provided that multiple constructors are available. Feel free to list it as an answer and I'll vote for it :) – Terry Li Dec 07 '12 at 07:45
  • No, the point is you can have several methods with different names, but you can't have several constructors with different names. Example: `Pattern.compile(str)` and `Pattern.compileFromJson(str)` and `Pattern.compileFromXML(str)`. – Philipp Wendler Dec 07 '12 at 07:47
  • @PhilippWendler Thanks for correcting me. This is a good point that deserves to be an answer, I think. – Terry Li Dec 07 '12 at 07:50
  • What would be gained by adding a constructor? This is an honest question. – jahroy Dec 07 '12 at 07:51
  • This is related: http://stackoverflow.com/questions/855518/why-does-java-pattern-class-use-a-factory-method-rather-than-constructor – Leaf Shadow Dec 07 '12 at 14:30

6 Answers

15

Generally, a class won't have a public constructor for one of three reasons:

  • The class is a utility class and there is no reason to instantiate it (for example, java.lang.Math).
  • Instantiation can fail, and a constructor can't return null.
  • A static method clarifies the meaning behind what happens during instantiation.

In the case of Pattern, the third reason applies: the static compile method is used solely for clarity. Constructing a pattern via new Pattern(..) doesn't make sense from an explanatory point of view, because a sophisticated process goes into creating a new Pattern. The static method is named compile to convey that process, because the regex is essentially compiled to create the pattern.

In short, there is no programmatic purpose for making Pattern only constructable via a static method.

FThompson
  • Does the second apply here too? – Sanandrea Mar 07 '16 at 13:46
  • @Sanandrea Nope, there's no case in which `Pattern.compile` returns null. Instantiation can still fail, but it just throws an exception. – FThompson Mar 07 '16 at 18:15
  • Look at this gist please: https://gist.github.com/sanandrea/686a5a3d762e177d74fe It returns null when it fails to compile the pattern. If I did it correctly... – Sanandrea Mar 08 '16 at 08:34
  • `Pattern.compile` itself isn't returning null; you set `pattern` to null. And then nothing alters `pattern` if `Pattern.compile` throws an exception. – FThompson Mar 08 '16 at 09:45
  • A constructor can't return null, so it can be necessary to create a static construction wrapper if the goal is to either return an instance or return null. – FThompson Mar 08 '16 at 19:42
  • @Sanandrea Is this clear to you? I feel like I'm not getting my points across to you. What is unclear about the second point? – FThompson Mar 09 '16 at 21:47
  • Yes, I do not understand why the second point does not apply here. Could you please provide an example where the second point applies, if this is not the case? – Sanandrea Mar 10 '16 at 14:57
  • If you look at the [source code of the pattern class](http://www.docjar.com/html/api/java/util/regex/Pattern.java.html), you can see that `Pattern.compile` can't return null, because it simply returns what calling the constructor does, and constructors can't return null, so the second point does not apply here. I can't think of an example of the second point off the top of my head, but it's something I've seen before. – FThompson Mar 10 '16 at 20:24
  • So either the second point is not well defined or it applies also in this case. – Sanandrea Mar 11 '16 at 09:38
  • How does it apply in the case of pattern? The source code clearly cannot return null, so I'm not sure why you think this. Your example creates a use case where null is the result of a failed pattern compilation, but this doesn't mean that `Pattern.compile` itself returns null (because it doesn't!). – FThompson Mar 11 '16 at 19:18
  • `SplashScreen` is an example of the second point: http://docs.oracle.com/javase/8/docs/api/java/awt/SplashScreen.html – FThompson Mar 11 '16 at 19:28
  • Thank you for all the explanations in the first place. I want to remove the downvote, but your point still does not convince me. The `SplashScreen` example does not count either: *This class cannot be instantiated. Only a single instance of this class can exist, and it may be obtained by using the getSplashScreen() static method. In case the splash screen has not been created at application startup via the command line or manifest file option, the getSplashScreen method returns null.* So it is **not** an instantiation but a singleton pattern. – Sanandrea Mar 13 '16 at 22:58
9

One possible reason is that this way, caching can later be added into the method.
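As a rough sketch of what "adding caching later" could look like, here is a hypothetical wrapper whose static factory hands back a previously compiled Pattern when it sees the same regex string again. `CachedPatterns` and `CachedPatternsDemo` are invented names, not part of the JDK, and the real `Pattern.compile` does not cache. The point is only that callers of a static method would not notice such a change, whereas `new` always yields a fresh object.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.regex.Pattern;

// Hypothetical helper, not part of java.util.regex.
final class CachedPatterns {
    // Compiled patterns keyed by their regex string (unbounded, for brevity).
    private static final Map<String, Pattern> CACHE = new ConcurrentHashMap<>();

    private CachedPatterns() { } // static-only class, never instantiated

    // Returns the cached Pattern for this regex, compiling it on first use.
    static Pattern compile(String regex) {
        return CACHE.computeIfAbsent(regex, Pattern::compile);
    }
}

class CachedPatternsDemo {
    public static void main(String[] args) {
        Pattern first = CachedPatterns.compile(".");
        Pattern second = CachedPatterns.compile(".");
        System.out.println(first == second); // prints true: the instance is reused
    }
}
```

Sharing instances like this is only safe because Pattern objects are immutable.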

Another possible reason is readability. Consider this (often cited) object:

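class Point2d {
  private final double x, y;  // stored internally as Cartesian coordinates
  private Point2d(double x, double y) { this.x = x; this.y = y; }
  static Point2d fromCartesian(double x, double y) { return new Point2d(x, y); }
  static Point2d fromPolar(double abs, double arg) {
    return new Point2d(abs * Math.cos(arg), abs * Math.sin(arg));
  }
}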

Point2d.fromCartesian(1, 2) and Point2d.fromPolar(1, 2) are both perfectly readable and unambiguous (well... apart from the argument order).

Now, consider new Point2d(1, 2). Are the arguments cartesian coordinates, or polar coordinates? It's even worse if constructors with similar / compatible signatures have entirely different semantics (say, int, int is cartesian, double, double is polar).

This rationale applies to any object that can be constructed in multiple different ways that don't differ in just the argument type. While Pattern, currently, can only be compiled from a regex, different representations of a Pattern may come in the future (admittedly, compile would then be a bad method name).

Another possible reason, mentioned by @Vulcan, is that a constructor should not fail.

If Pattern.compile encounters an invalid pattern, it throws a PatternSyntaxException. Some people may consider it bad practice to throw an exception from a constructor. Admittedly, FileInputStream does exactly that. Similarly, if the design decision had been to return null from the compile method, this would not be possible with a constructor.
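To make the last point concrete, here is a minimal sketch of a factory that signals failure by returning null instead of throwing. `SafePatterns.compileOrNull` is a hypothetical name, not a JDK method; the sketch merely wraps the real `Pattern.compile` and catches its `PatternSyntaxException`. A constructor could not behave this way, because a `new` expression either yields an object or throws.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical helper, not part of java.util.regex.
final class SafePatterns {
    private SafePatterns() { } // static-only class, never instantiated

    // Returns the compiled Pattern, or null if the regex is syntactically invalid.
    static Pattern compileOrNull(String regex) {
        try {
            return Pattern.compile(regex);
        } catch (PatternSyntaxException e) {
            return null; // failure is reported through the return value
        }
    }
}

class SafePatternsDemo {
    public static void main(String[] args) {
        System.out.println(SafePatterns.compileOrNull("a*") != null); // valid regex
        System.out.println(SafePatterns.compileOrNull("(") == null);  // unclosed group
    }
}
```

Whether returning null is preferable to throwing is a separate design question; the sketch only shows that a static method leaves that choice open.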


In short, a constructor is not a good design choice if:

  • caching may take place, or
  • the constructor is semantically ambiguous, or
  • the creation may fail.
John Dvorak
  • +1, but I disagree that constructors are not supposed to throw exceptions. I nearly included the same argument about exception throwing in my answer, but then I considered the counter-point of `new FileInputStream(..)` which can throw an `IOException`. While I agree that throwing exceptions in constructors is messy, it's not an uncommon practice, especially in the `java.net` and `java.io` packages. – FThompson Dec 07 '12 at 08:08
  • Also, the `compile` method is very often called from a static initialization block (`static final Pattern p = Pattern.compile(".");`), so there is no difference here. – Philipp Wendler Dec 07 '12 at 08:12
  • @PhilippWendler au contraire. The constructors that are most often in a static initialisation block are the worst to throw an exception (but the same applies to other methods there, you are right). – John Dvorak Dec 07 '12 at 08:14
  • @Vulcan what about "might be considered a bad practice"? – John Dvorak Dec 07 '12 at 08:18
  • `Pattern.compile` is nothing more than a named constructor. It can and will be called anywhere and everywhere someone would have said `new Pattern` if they could -- including static init blocks -- and would cause all the same problems. So any advice about avoiding exceptions in constructors should apply to it as well. The rules don't change just because the name does. – cHao Dec 08 '12 at 04:08
  • @cHao I believe constructors are not _expected_ to fail. Static methods are. – John Dvorak Dec 08 '12 at 05:23
  • @JanDvorak: Do you have a link or something that says this? I've heard it said in C++, but there are reasons for it there that don't quite apply in Java. – cHao Dec 08 '12 at 10:13
  • @cHao I don't. Should I remove the statement from the answer? – John Dvorak Dec 08 '12 at 10:19
  • @JanDvorak: Personally, i would. But that's cause i disagree with it, so i'm a bit biased already. :) Keep it if you can back it up. – cHao Dec 08 '12 at 10:27
  • @cHao I have used the "may be considered" clause. Is it enough to say it might be a matter of opinion? – John Dvorak Dec 08 '12 at 10:33
  • @JanDvorak: As long as the "matter of opinion" part is clarified. Maybe "Some people might consider it a bad practice...", or something like that. Still a bit weasel-wordy, but at least it has less of a "this is so" feel. – cHao Dec 08 '12 at 10:49
6

This is just a design decision. In this case there is no "real" advantage. However, this design allows optimisation (caching for instance) without changing the API. See http://gbracha.blogspot.nl/2007/06/constructors-considered-harmful.html

rmuller
  • One more reason is that adding optimisation to a constructor later would be bad practice (it means passing this around while still inside the constructor). Doing the same in a static method is OK. – Peter Ivan Dec 07 '12 at 08:05
5

Factory methods have several advantages, some of which are already specified in other answers. The advice to consider factory methods instead of constructors is even the very first item in the great book "Effective Java" by Joshua Bloch (a must-read for every Java programmer).


One advantage is that you can have several factory methods which have the same parameter signatures but different names. This you can't achieve with constructors.

For example, one might want to create a Pattern from several input formats, all of which are just Strings:

class Pattern {
  static Pattern compile(String regexp) { ... }
  static Pattern compileFromJson(String json) { ... }
  static Pattern compileFromXML(String xml) { ... }
}

Even if you are not doing this when you create the class, factory methods give you the ability to add such methods later without causing weirdness.
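Here is a runnable sketch of the same idea, using two input formats the JDK can actually handle. Both factories take a single String, so they could not coexist as constructors. `TextMatcher`, `fromRegex` and `fromLiteral` are hypothetical names chosen for illustration; only `Pattern.compile` and `Pattern.quote` are real JDK calls.

```java
import java.util.regex.Pattern;

// Hypothetical class: two named factories share the signature (String).
final class TextMatcher {
    private final Pattern pattern;

    private TextMatcher(Pattern pattern) {
        this.pattern = pattern;
    }

    // Interprets the argument as a regular expression.
    static TextMatcher fromRegex(String regex) {
        return new TextMatcher(Pattern.compile(regex));
    }

    // Interprets the argument as literal text; Pattern.quote escapes metacharacters.
    static TextMatcher fromLiteral(String text) {
        return new TextMatcher(Pattern.compile(Pattern.quote(text)));
    }

    // True if the whole input matches.
    boolean matches(String input) {
        return pattern.matcher(input).matches();
    }
}

class TextMatcherDemo {
    public static void main(String[] args) {
        System.out.println(TextMatcher.fromRegex("a.c").matches("abc"));   // true: '.' is a wildcard
        System.out.println(TextMatcher.fromLiteral("a.c").matches("abc")); // false: '.' is a literal dot
    }
}
```

The method name carries the information that a constructor overload would have needed an extra dummy parameter to express.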

For example, I have seen classes where the need for a new constructor came later and a special meaningless second parameter had to be added to the second constructor in order to allow overloading. Obviously, this is very ugly:

class Ugly {
  Ugly(String str) { ... }

  /* This constructor interprets str in some other way.
   * The second parameter is ignored completely. */
  Ugly(String str, boolean ignored) { ... }
}

Unfortunately, I can't remember the name of such a class, but I think it even was in the Java API.


Another advantage which has not been mentioned before is that with factory methods in combination with package-private constructors you can prohibit sub-classing for others, but still use sub-classes yourself. In the case of Pattern, you might want to have private sub-classes like CompiledPattern, LazilyCompiledPattern, and InterpretedPattern, but still prohibit sub-classing to ensure immutability.

With a public constructor, you can either prohibit sub-classing for everybody, or not at all.
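A minimal sketch of this technique, assuming all three classes live in the same package. `Rule`, `LiteralRule` and `RegexRule` are hypothetical names; in a real library `Rule` would be declared public, and it is left package-private here only so the example fits in one file.

```java
import java.util.regex.Pattern;

// API type: code outside the package can use it but cannot subclass it,
// because its only constructor is package-private.
abstract class Rule {
    Rule() { } // package-private on purpose

    abstract boolean accepts(String input);

    // The factory picks an internal subclass; callers only ever see Rule.
    static Rule of(String regex) {
        // A regex made only of letters and digits matches exactly itself,
        // so a plain string comparison is enough.
        boolean plainText = regex.chars().allMatch(Character::isLetterOrDigit);
        return plainText ? new LiteralRule(regex) : new RegexRule(regex);
    }
}

// Internal subclass: cheap comparison, no regex engine involved.
final class LiteralRule extends Rule {
    private final String text;

    LiteralRule(String text) { this.text = text; }

    @Override
    boolean accepts(String input) { return text.equals(input); }
}

// Internal subclass: full regex matching.
final class RegexRule extends Rule {
    private final Pattern pattern;

    RegexRule(String regex) { this.pattern = Pattern.compile(regex); }

    @Override
    boolean accepts(String input) { return pattern.matcher(input).matches(); }
}

class RuleDemo {
    public static void main(String[] args) {
        System.out.println(Rule.of("abc").getClass().getSimpleName()); // LiteralRule
        System.out.println(Rule.of("a.c").getClass().getSimpleName()); // RegexRule
        System.out.println(Rule.of("a.c").accepts("abc"));             // true
    }
}
```

Because every subclass is under the author's control, the immutability of `Rule` instances can be guaranteed even though the class itself is not final.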

Philipp Wendler
2

If you really want to take the deep dive, plunge into the archives of JSR 51.

Regular expressions were introduced as part of JSR 51, so its archives are where you might still find the design decisions: http://jcp.org/en/jsr/detail?id=51

akuhn
1

It has a private constructor.

    /**
     * This private constructor is used to create all Patterns. The pattern
     * string and match flags are all that is needed to completely describe
     * a Pattern. An empty pattern string results in an object tree with
     * only a Start node and a LastNode node.
     */
    private Pattern(String p, int f) {

and the compile method calls into that.

public static Pattern compile(String regex) {
    return new Pattern(regex, 0);
}

Since you are using the == comparison, which compares references, the check will not succeed: each call to compile creates a new object.

The only reason I can think of for this behaviour is that the compile method, which acts as a factory method, defaults the match flags to zero.
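The following sketch illustrates both observations using only real `java.util.regex` calls: the single-argument `compile` yields a pattern whose flags are zero, the two-argument overload lets the caller choose flags, and two separately compiled patterns are distinct objects even when their pattern strings are equal. The class name `CompileFlagsDemo` and the helper `sameDefinition` are made up for the example.

```java
import java.util.regex.Pattern;

class CompileFlagsDemo {
    // Pattern does not override equals, so compare the defining parts instead.
    static boolean sameDefinition(Pattern a, Pattern b) {
        return a.pattern().equals(b.pattern()) && a.flags() == b.flags();
    }

    public static void main(String[] args) {
        Pattern plain = Pattern.compile("abc");                             // flags default to 0
        Pattern relaxed = Pattern.compile("abc", Pattern.CASE_INSENSITIVE); // explicit flags

        System.out.println(plain.flags());                    // 0
        System.out.println(plain.matcher("ABC").matches());   // false
        System.out.println(relaxed.matcher("ABC").matches()); // true

        Pattern again = Pattern.compile("abc");
        System.out.println(plain == again);                   // false: two distinct objects
        System.out.println(sameDefinition(plain, again));     // true: same regex and flags
    }
}
```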

Ajay George