I am doing some large-scale analysis of Java projects on GitHub. This involves cloning each project via git (nothing special here) and then statically reading the code for analysis.
My question: is there a way to programmatically recover the GitHub URL for each source file in the cloned repository? I want this so that I can link back to GitHub and view the original source.
For example, the following URL (what I want) points to a particular function: https://github.com/elastic/elasticsearch/blob/master/libs/elasticsearch-nio/src/main/java/org/elasticsearch/nio/BytesWriteOperation.java#L35. Of course, the directory structure of the cloned version doesn't match the URL path: the /blob/master/ segment, for instance, has no counterpart on disk and seems particular to this project and the structure of its repository.
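
To make the target concrete, the example URL appears to break down into the repository address, the literal segment `blob`, a ref (`master` here), the file's path relative to the repository root, and a `#L<line>` anchor. Below is a minimal Python sketch of assembling a URL from those parts. The function names `normalize_remote` and `github_url` are hypothetical, and the sketch assumes the remote is hosted on github.com and that the ref name contains no characters that need URL escaping.

```python
import re
from pathlib import PurePosixPath


def normalize_remote(remote_url: str) -> str:
    """Turn a git remote URL into an https://github.com/owner/repo base (assumed layout)."""
    url = remote_url.strip().rstrip("/")
    # SSH form, e.g. git@github.com:owner/repo.git
    match = re.match(r"^git@github\.com:(?P<path>.+)$", url)
    if match:
        url = "https://github.com/" + match.group("path")
    # Drop the optional ".git" suffix that clone URLs usually carry.
    if url.endswith(".git"):
        url = url[: -len(".git")]
    return url


def github_url(remote_url, ref, repo_root, file_path, line=None):
    """Build a link of the form <repo>/blob/<ref>/<relative path>[#L<line>]."""
    # Path of the file relative to the root of the clone.
    relative = PurePosixPath(file_path).relative_to(PurePosixPath(repo_root))
    url = f"{normalize_remote(remote_url)}/blob/{ref}/{relative.as_posix()}"
    if line is not None:
        url += f"#L{line}"
    return url
```

In a real clone, the remote could be read with `git remote get-url origin` and the ref with `git rev-parse HEAD`. Passing the commit hash instead of a branch name should keep the link pointing at the exact version that was analyzed, even after the branch moves on.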