I'm trying to make use of Roslyn in my application. I added the NuGet package for that (Microsoft.CodeAnalysis.CSharp). It turns out that this package depends on many others:

Most of them seem unused on the .NET Framework (e.g. System.Runtime which provides types such as Int32).

There is an incredible contrast between the size of those packages and what is actually being used. A lot of the data appear to be translations:

(I'm not saying someone did a bad job here. I'm not competent enough to make that determination. I just want to find out how to deal with this, see below.)

It seems like a bad idea to check all of that into source control. If I update those packages every few months, the repository will bloat by gigabytes from all those files over time. I'm also worried about IDE speed and, frankly, about confusing myself in this huge sea of packages.

Package restore cannot be used, both because of security concerns and because our source code would then rely on another company's ongoing service in order to build. By "other company" I mean that I need to trust Microsoft and the package authors to be available at all times, not send me a virus and not remove old versions causing my app to break. If I download just once, or every few months, the exposure is greatly reduced.

How would I best minimize the source control impact of using Roslyn?

I really hope I will be able to put the packages into source control. I hope there is a way to remove all the packages I don't need and restrict them to what ends up in the bin folder.

boot4life
  • You don't want to check in the binaries from Nuget. All that stuff can be downloaded automatically once you've declared the dependency. Ignore the `packages` directory. – Berin Loritsch Mar 27 '17 at 17:46
  • You can always set up [your own in-house package repository](https://learn.microsoft.com/en-us/nuget/hosting-packages/overview) – stuartd Mar 27 '17 at 17:46
  • I will investigate the in-house repo. By "other company" I mean that I need to trust Microsoft and the package author to be available at all times, not send me a virus and not remove old versions causing my app to break. If I download just once, or every few months, the exposure is greatly reduced. – boot4life Mar 27 '17 at 17:50
  • Does this boil down to [this question](http://stackoverflow.com/q/19951328/3063273)? – Matt Thomas Mar 27 '17 at 17:57
  • @MattThomas I tried that but all those packages appear "used" to the tools. R# reports "50 implicitly used references" which includes the dreaded System.Runtime. – boot4life Mar 27 '17 at 17:59
  • "It turns out it depends on all these packages." "I hope there is a way to remove all the packages I don't need..." By using Roslyn, you're requiring all the dependencies it tells you it needs. Don't try to remove those other packages. – krillgar Mar 27 '17 at 18:00
  • @krillgar these are not runtime dependencies for sure. The DLLs don't end up in the bin folder. They are not even referenced by the project. I understand this is just a byproduct of the NuGet package supporting all .NET platforms under the sun. Maybe I can get away with manually deleting the most egregious ones. System.Runtime is worth 80MB alone. – boot4life Mar 27 '17 at 18:01
  • @boot4life "I tried that but all those packages appear "used" to the tools" -- to have a reliable solution, it sounds like you might have to invent your own tool. If you have a manual way of determining which packages "are not even referenced by the project", then perhaps that can be automated? – Matt Thomas Mar 27 '17 at 18:03
  • I could write a little tool that crudely deletes packages that are not used in any bin folder. Then, I run that tool after each NuGet update session. I will investigate that. – boot4life Mar 27 '17 at 18:04
  • You need to do some reading about how NuGet works. Especially if you're getting first-party packages from Microsoft, the likelihood that NuGet will serve you a virus is minuscule. Microsoft is committed to 100% availability of those servers. They control the code that is served after it has been uploaded to the NuGet package repository. If you're that concerned about security, then you will need to download all of the .NET Source Code, test it is not a virus itself, put it into your own repository, and be responsible for keeping it up to date. – krillgar Mar 27 '17 at 18:05
  • @krillgar just want to limit exposure. What if some package author decides to pull the package? And surely there is a way to update package versions after the fact (even if this is not regularly done). Hack Jimmy Bogard and AutoMapper now contains a virus on 1000 production servers. That's my thinking. – boot4life Mar 27 '17 at 18:12
  • @boot4life interesting, I depend on AutoMapper in more than one version, and I just ran the antivirus and it found no risk. Do you mean that NuGet is serving files that contain a virus? – Zinov Mar 27 '17 at 18:16
  • @Zinov it's not going to be a "virus". It's custom code like `if (RunningOnServerOS && DateIs3DaysInFuture) { SendWebConfigToAttacker(); OpenShell(); }`. It's literally true that Jimmy Bogard has an access key to a few, random production servers that he does not own. (Not picking on that guy. Just a good example.) I recently consulted a finance company which has banned NuGet for that reason (banning in particular was wrong but the security concerns are true). – boot4life Mar 27 '17 at 18:17
  • @boot4life I don't know of any tool that can identify, based on a packages.config, which dependencies you really need in your dependency graph, for Roslyn for example. Most of the time I centralize all my dependencies in one folder and have all my projects point to it, instead of replicating them per project. And I save them in TFS. – Zinov Mar 27 '17 at 18:29
  • Preliminary testing on that little console app shows that it's possible to detect unused references and automatically kill them. I needed to find usages by looking at all bin folders and comparing the file contents. Also, I looked at file names in csproj-files. The tool deleted 218MB and the console app still builds (after clean) and runs. I'll see if this works on the real app. NuGet offers package restore. – boot4life Mar 27 '17 at 18:31
  • @Zinov my idea is to use the real build on the developer machine to detect what is needed. Anything that's needed *must* end up in the bin folder so that the app can run on a different machine. – boot4life Mar 27 '17 at 18:33
  • Your fears about a package being pulled [are not totally unfounded](https://arstechnica.com/information-technology/2016/03/rage-quit-coder-unpublished-17-lines-of-javascript-and-broke-the-internet/). However, that was in NPM, not NuGet. NPM has changed their policy to not allow this to happen again. I know that NuGet does not allow for easy removal of packages, also for the same reason, but they had that in place preemptively. – krillgar Mar 27 '17 at 18:39
  • @krillgar right. I just made the Jimmy Bogard plot even better. He could release the DLL himself to nuget.org and then claim he was hacked. Also, steal Bitcoin wallets from the developers or private pictures to dox them :) It would be really tragic, yet funny, if automapper.dll activated your webcam. – boot4life Mar 27 '17 at 18:43
  • If that's your fear, then don't use any 3rd party libraries and write everything yourself from scratch. If you don't trust the Microsoft developed libraries, then you shouldn't be using .NET. However, you'd find the same issues no matter what language you use. – krillgar Mar 27 '17 at 19:17
  • Well, my doors have locks although they can be broken. Just picking the right trade-off. – boot4life Mar 27 '17 at 19:40
  • @boot4life take a look at this discussion and see if these guys can help you: http://chat.stackoverflow.com/rooms/138741/discussion-between-citizenmatt-and-jon-skeet – Zinov Mar 27 '17 at 20:03
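
The cleanup tool described in the comments above (drop every package whose files never reach a bin folder and that no project file mentions) could look roughly like the sketch below. It is only one reading of that description, not the tool that was actually written: the function names are hypothetical, DLLs are compared by SHA-256 of their contents, and a package counts as "referenced" when its folder name appears anywhere in a project file. It only reports candidates and deletes nothing. Packages that ship no DLLs (analyzers, build targets) would also show up as unused unless a project file mentions them, so the result needs a manual review.

```python
import hashlib
from pathlib import Path


def file_hash(path):
    """SHA-256 of a file's bytes, used to compare DLLs by content."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def find_unused_packages(packages_dir, bin_dirs, project_files):
    """Return package folders that no bin folder and no project file uses."""
    # Hashes of every DLL that a build copied into an output folder.
    bin_hashes = {
        file_hash(dll)
        for bin_dir in bin_dirs
        for dll in Path(bin_dir).rglob("*.dll")
    }
    # Lower-cased text of all project files, to catch HintPath-style references.
    project_text = "\n".join(
        Path(f).read_text(encoding="utf-8", errors="ignore").lower()
        for f in project_files
    )
    unused = []
    for package in sorted(Path(packages_dir).iterdir()):
        if not package.is_dir():
            continue
        package_hashes = {file_hash(dll) for dll in package.rglob("*.dll")}
        used_in_bin = not package_hashes.isdisjoint(bin_hashes)
        used_in_project = package.name.lower() in project_text
        if not (used_in_bin or used_in_project):
            unused.append(package)
    return unused
```

Run it after a clean build, look over the returned folders, and only then remove them (for example with `shutil.rmtree`) and rebuild to confirm nothing broke.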

1 Answer

You shouldn't check the packages folder into source control; that'll make you sad. It sounds like your real concern is the reliability of nuget.org, or whether packages will continue to exist. If so, you might want to consider just storing the .nupkg files in a place that you control, where you can keep them around. You might want to take a look at these instructions on how you could run your own server, which can be as low tech as "have a network drive".

You could also check in the .nupkg files and then use that folder as your feed, which would at least reduce the number of files you need to manage. The .nupkg files are compressed, so they will also be smaller than the extracted package folders.
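
As a rough sketch of that second option: assuming a packages.config-style `packages` folder, where each package folder keeps its original `.nupkg` next to the extracted files, the archives can be gathered into one flat folder that is then checked in and registered as a package source. The function name and folder layout are hypothetical, and the feed folder is assumed to live outside the `packages` folder.

```python
import shutil
from pathlib import Path


def collect_nupkgs(packages_dir, feed_dir):
    """Copy every .nupkg under packages_dir into one flat feed_dir.

    Returns the names of the archives that were newly copied.
    """
    feed = Path(feed_dir)
    feed.mkdir(parents=True, exist_ok=True)
    copied = []
    for nupkg in sorted(Path(packages_dir).rglob("*.nupkg")):
        target = feed / nupkg.name
        # Leave archives that are already in the feed folder untouched.
        if not target.exists():
            shutil.copy2(nupkg, target)
            copied.append(nupkg.name)
    return copied
```

Because existing archives are never overwritten, re-running it after a package update only adds the new versions, so older versions stay available for building historical commits.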

Jason Malinowski
  • I will look into that. My objection to not checking in is that it makes the repo not self-contained. The repo alone cannot be used to build. Also, the guaranteed ability to build or understand historical versions gets lost. External dependencies should be checked in, in principle. – boot4life Mar 30 '17 at 18:43