2

Hi, I am trying to learn Java deeply, so I am digging into the JDK source code, starting from the following lines:

URL url = new URL("http://www.google.com");
URLConnection tmpConn = url.openConnection();

I attached the source code, set a breakpoint at the second line, and stepped into the code. I can see the code flow is: URL.openConnection() -> sun.net.www.protocol.http.Handler.openConnection(). I have two questions about this.

First, in URL.openConnection() the code is:

public URLConnection openConnection() throws java.io.IOException {
    return handler.openConnection(this);
}

handler is an object of type URLStreamHandler, defined as below:

transient URLStreamHandler handler;

But URLStreamHandler is an abstract class and the method openConnection() is not implemented in it, so when it is called on handler, the call has to go to a subclass that implements it, right? But there are a lot of classes that implement this method in sun.net.www.protocol (like http.Handler and ftp.Handler). How does the code know which openConnection() method it should call? In this example, handler.openConnection() goes into http.Handler, which is correct. (If I set the URL to ftp://www.google.com, it goes into ftp.Handler.) I cannot understand the mechanism.

Second, I have attached the source code so I can step into the JDK and see the variables, but for many classes, like sun.net.www.protocol.http.Handler, there is no source code in src.zip. I googled this class and there is source code online that I can get, but why did they not put it (and many other classes) in src.zip? Where can I find a comprehensive version of the source code?

Thanks!

user1722361

3 Answers

8

First the easy part:

... I googled this class and there is source code online I can get but why they did not put it (and many other classes) in the src.zip?

Two reasons:

  • In the old days when the Java code base was proprietary, this was treated as secret-ish ... and not included in the src.zip. When they relicensed Java 6 under the GPL, they didn't bother to change this. (Don't know why. Ask Oracle.)

  • Because any code in the sun.* tree is officially "an implementation detail subject to change without notice". Providing the code directly would make it easier for customers to ignore that advice. That could lead to more friction / bad press when customer code breaks as a result of an unannounced change to sun.* code.

Where can I find a comprehensive version of source code?

You can find it in the OpenJDK 6 / 7 / 8 repositories and associated download bundles.


Now for the part about "learning Java deeply".

First, I think you are probably going about this learning in a "suboptimal" fashion. Rather than reading the Java class library, I think you should be reading books on Java and design patterns, and writing code for yourself.

To the specifics:

But URLStreamHandler is an abstract class and the method openConnection() is not implemented in it, so when it is called on handler, the call has to go to a subclass that implements it, right?

At the point where that method is called on handler, it is being called on an instance of the subclass. So finding the right method is handled by the JVM ... just like any other polymorphic dispatch.
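To make the dispatch point concrete, here is a minimal sketch. The class names (SimpleHandler, HttpLikeHandler, FtpLikeHandler, DispatchDemo) are hypothetical stand-ins, not the JDK's classes; they only mirror the shape of URLStreamHandler and its protocol-specific subclasses.

```java
// Hypothetical stand-in for the abstract URLStreamHandler.
abstract class SimpleHandler {
    abstract String openConnection(String url);
}

// Hypothetical stand-ins for http.Handler and ftp.Handler.
class HttpLikeHandler extends SimpleHandler {
    @Override
    String openConnection(String url) {
        return "http connection to " + url;
    }
}

class FtpLikeHandler extends SimpleHandler {
    @Override
    String openConnection(String url) {
        return "ftp connection to " + url;
    }
}

class DispatchDemo {
    // The static type of 'handler' is the abstract class. Which method body
    // runs is decided by the runtime class of the object it refers to.
    static String open(SimpleHandler handler, String url) {
        return handler.openConnection(url);
    }

    public static void main(String[] args) {
        System.out.println(open(new HttpLikeHandler(), "www.google.com"));
        System.out.println(open(new FtpLikeHandler(), "www.google.com"));
    }
}
```

The open() method never mentions a subclass, which is the same situation as URL.openConnection() calling handler.openConnection(this).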

The tricky part is how you got the instance of the sun.net.www.protocol.* handler class. And that happens something like this:

  1. When a URL object is created, it calls getURLStreamHandler(protocol) to obtain a handler instance.

  2. The code for this method looks to see if the handler instance for the protocol already exists and returns that if it does.

  3. Otherwise, it sees if a protocol handler factory exists, and if it does it uses that to create the handler instance. (The protocol handler factory object can be set by an application.)

  4. Otherwise, it searches a configurable list of Java packages to find a class whose FQN is package + "." + protocol + "." + "Handler", loads it, and uses reflection to create an instance. (Configuration is via a System property.)

  5. The reference to handler is stored in the URL's handler field, and the URL construction continues.
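The steps above can be sketched roughly as follows. This is a simplified illustration, not the JDK's code: every name in it (SketchHandler, HandlerLookup, the PREFIXES array, the Sketch_http_Handler class) is made up, and because the sketch lives in a single file it builds the class name as prefix + "_" + protocol + "_Handler" instead of using real packages.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical stand-in for URLStreamHandler.
abstract class SketchHandler {
    abstract String openConnection(String url);
}

// Hypothetical handler, named so that the reflective search below can find it.
class Sketch_http_Handler extends SketchHandler {
    @Override
    String openConnection(String url) {
        return "http:" + url;
    }
}

class HandlerLookup {
    // Step 2: handlers that have already been created, keyed by protocol.
    private static final Map<String, SketchHandler> CACHE = new ConcurrentHashMap<>();

    // Step 3: optional factory; stays null unless the application sets one.
    static volatile Function<String, SketchHandler> factory;

    // Step 4: stands in for the configurable list of packages.
    static final String[] PREFIXES = {"Sketch"};

    static SketchHandler getHandler(String protocol) {
        SketchHandler handler = CACHE.get(protocol);
        if (handler != null) {
            return handler;
        }
        Function<String, SketchHandler> f = factory;
        if (f != null) {
            handler = f.apply(protocol);
        }
        if (handler == null) {
            for (String prefix : PREFIXES) {
                try {
                    // Build the class name from the protocol, load it, instantiate it.
                    Class<?> cls = Class.forName(prefix + "_" + protocol + "_Handler");
                    handler = (SketchHandler) cls.getDeclaredConstructor().newInstance();
                    break;
                } catch (ReflectiveOperationException | ClassCastException e) {
                    // No usable class under this prefix; try the next one.
                }
            }
        }
        if (handler != null) {
            // Keep whichever instance reached the cache first.
            SketchHandler previous = CACHE.putIfAbsent(protocol, handler);
            if (previous != null) {
                handler = previous;
            }
        }
        return handler; // null means "unknown protocol"
    }
}
```

A caller would store the result of HandlerLookup.getHandler("http") in a field at construction time, which corresponds to step 5.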

So, later on, when you call openConnection() on the URL object, the method uses the Handler instance that is specific to the protocol of the URL to create the connection object.

The purpose of this complicated process is to support URL connections for an open-ended set of protocols, to allow applications to provide handlers for new protocols, and to substitute their own handlers for existing protocols, both statically and dynamically. (And the code is more complicated than I've described above because it has to cope with multiple threads.)
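As an illustration of an application providing a handler for a new protocol, here is a small sketch using the public java.net API (URLStreamHandlerFactory and URL.setURLStreamHandlerFactory). The "demo" protocol and the DemoProtocol class with its nested classes are invented for this example. Note that the factory can be set at most once per JVM, so the sketch guards the call.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.net.URLStreamHandlerFactory;
import java.nio.charset.StandardCharsets;

class DemoProtocol {
    // Connection for the made-up "demo" protocol: serves a fixed string.
    static class DemoConnection extends URLConnection {
        DemoConnection(URL url) {
            super(url);
        }

        @Override
        public void connect() {
            connected = true;
        }

        @Override
        public InputStream getInputStream() {
            String body = "hello from " + url.getHost();
            return new ByteArrayInputStream(body.getBytes(StandardCharsets.UTF_8));
        }
    }

    // The protocol-specific subclass of the abstract URLStreamHandler.
    static class DemoHandler extends URLStreamHandler {
        @Override
        protected URLConnection openConnection(URL u) {
            return new DemoConnection(u);
        }
    }

    // Returning null tells URL to fall back to its built-in lookup.
    static class DemoFactory implements URLStreamHandlerFactory {
        @Override
        public URLStreamHandler createURLStreamHandler(String protocol) {
            return "demo".equals(protocol) ? new DemoHandler() : null;
        }
    }

    private static boolean installed;

    static synchronized void install() {
        if (!installed) {
            URL.setURLStreamHandlerFactory(new DemoFactory());
            installed = true;
        }
    }

    static String fetch(String spec) {
        install();
        try {
            URLConnection conn = new URL(spec).openConnection();
            try (InputStream in = conn.getInputStream()) {
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(fetch("demo://example/path"));
    }
}
```

The sketch uses InputStream.readAllBytes(), so it assumes Java 9 or later. On recent JDKs the URL(String) constructor is deprecated, which only produces a compiler warning here.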

This is making use of a number of design patterns (Caches, Adapters, Factory Objects, and so on) together with Java specific stuff such as the system properties and reflection. But if you haven't read about and understood those design patterns, etcetera, you are unlikely to recognize them, and as a result you are likely to find the code totally bamboozling. Hence my advice above: learn the basics first!!

Stephen C
0

But URLStreamHandler is an abstract class and the method openConnection() is not implemented in it, so when it is called on handler, the call has to go to a subclass that implements it, right?

The method has to be either declared abstract or implemented in URLStreamHandler. If you assign an instance of a class that extends URLStreamHandler to a variable of type URLStreamHandler and call openConnection(), the JVM runs the version overridden in that subclass, if there is one. If there is none, it runs the one inherited from URLStreamHandler, if it is implemented there. If the method is abstract, a concrete subclass that does not implement it will not compile, so the call can never land on a missing method body.

fonZ
  • Yes, the class with the overriding method will extend URLStreamHandler. What I mean is that there are a lot of classes that extend URLStreamHandler, and the type of handler is the parent class, so how does it know exactly which implementation it should execute? – user1722361 Oct 06 '12 at 23:33
  • It will do that automatically with methods you override. As long as you give the correct instance. – fonZ Oct 06 '12 at 23:35
  • That is just the point of an abstract class: you don't have to know the subclass, as long as the method is properly overridden in the subclass. The method is basically the same; the only difference is the body of the method. – fonZ Oct 06 '12 at 23:42
  • Thank you. I think I found the relevant line in the URL class: if (handler == null && (handler = getURLStreamHandler(protocol)) == null) { throw new MalformedURLException("unknown protocol: "+protocol); } which will return the correct handler instance. – user1722361 Oct 06 '12 at 23:46
  • Do you know why the source code for many of the library classes in the JDK is not in src.zip? – user1722361 Oct 06 '12 at 23:47
  • They are probably compiled in lib files. – fonZ Oct 06 '12 at 23:56
0

Take a look at URL.java. openConnection uses the URLStreamHandler that was previously set in the URL object itself.

The constructor calls getURLStreamHandler, which generates a class name dynamically, then loads and instantiates the appropriate class with the class loader.
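One way to observe this without a debugger is to print the runtime class of the connection object for URLs with different protocols. This is only a sketch: ConnectionClassDemo and connectionClassFor are names invented here, the file path is arbitrary, and the class names printed depend on the JDK you run it on. openConnection() does not touch the network or the file system, so it runs offline.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;

class ConnectionClassDemo {
    // Returns the runtime class name of the connection object that the
    // protocol-specific handler created for this URL.
    static String connectionClassFor(String spec) {
        try {
            return new URL(spec).openConnection().getClass().getName();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Different protocols end up with different implementation classes.
        System.out.println(connectionClassFor("http://www.google.com"));
        System.out.println(connectionClassFor("file:///tmp/example.txt"));
    }
}
```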

Jeremy Roman