12

I have written some antivirus software in Python, but I am unable to find virus signatures. The software works by dumping each file on the hard disk to hex, which gives its hex signature. Where can I get signatures for all known viruses?

nwalke
Zac Brown
  • Not all viruses have static signatures – Lasse V. Karlsen Apr 27 '10 at 04:06
  • How do I do heuristic analysis with Python? That is what I originally wanted to do, but I couldn't find any help for it. I think it is a more reliable way of detecting and removing viruses, worms, trojans and spyware. – Zac Brown Apr 27 '10 at 04:17
  • Why convert each file on the hard disk to hex? There's no point in doing that. Virus signatures are mainly created by companies that write antivirus software. You could use the signature database(s) from a specific antivirus vendor, but there's no point (besides learning) in writing a new antivirus that checks only the same signatures another one already does. Besides that, the "best" viruses/worms are updated frequently (sometimes more than once per day), making signatures nearly useless. For that (and polymorphic code), you can use heuristic analysis. PS: sorry, I had to update my comment. – jweyrich Apr 27 '10 at 04:17
  • jweyrich: Got any idea how to do this? I am a Python programmer, but that is about it. – Zac Brown Apr 27 '10 at 04:25
  • @Zachary yes, I have a good understanding of antivirus techniques, but I have to say it's a very long subject, and I'm unable to explain it in a few sentences. It also requires an excellent understanding of executable file formats, system internals, and so on. – jweyrich Apr 27 '10 at 04:30
  • Able to point me in the right direction? I will follow tutorials, read books etc. – Zac Brown Apr 27 '10 at 04:32
  • @Zachary I'd suggest first studying how viruses/worms work. Spend a lot of time reverse engineering them: how they hide themselves; how they spread; how they manipulate network traffic; how they manipulate syscalls; how they inject into or infect other processes; how they do privilege escalation; and the list doesn't end here. Once you know all this, you'll figure out for yourself how to write a "common" antivirus program, and why current AV software is pointless in many cases. Note: it's not my intention to discourage you from using AV software or from studying this, but it will be a long journey. – jweyrich Apr 27 '10 at 04:50
  • It seems like it would be much faster to convert the hex codes to binary once, when they're first ingested, and then search the byte-strings of your disk files for each of the signatures... (Byte-strings in Python can be specified by putting a `b` before the string, e.g. `b'hello'` or `b'\x68\x65\x6c\x6c\x6f'`, which are both equivalent to `'hello'` in Python 2.) – Sarah Messer Apr 03 '17 at 19:46
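
Sarah Messer's suggestion in the last comment could be sketched roughly as follows in Python 3. This is only an illustration: the signature names and hex strings are made up, the helper names (`load_signatures`, `scan_bytes`, `scan_file`) are hypothetical, and each file is read fully into memory for simplicity.

```python
def load_signatures(hex_signatures):
    """Convert {name: hex string} to {name: bytes} once, up front."""
    return {name: bytes.fromhex(hex_sig)
            for name, hex_sig in hex_signatures.items()}


def scan_bytes(data, signatures):
    """Return the sorted names of all signatures found in a byte string."""
    return sorted(name for name, sig in signatures.items() if sig in data)


def scan_file(path, signatures):
    """Read a file in binary mode and search it for every signature."""
    with open(path, "rb") as handle:
        return scan_bytes(handle.read(), signatures)


# Hypothetical signatures, for illustration only (not real malware patterns).
SIGNATURES = load_signatures({
    "Demo.Hello": "68656c6c6f",   # the bytes b"hello"
    "Demo.Other": "deadbeef",
})

print(scan_bytes(b"say hello world", SIGNATURES))
```

The files themselves are never converted to hex; only the signatures are decoded once, and the `in` operator does the substring search on raw bytes.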

3 Answers

15

There's ClamAV, the open source (GPL) antivirus. You can read its source code to see how it implements heuristics and other techniques. It's written in C, though.

You can download a virus database there as well. They're free and updated frequently.
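
As a rough sketch of how such a database might be used from Python: as far as I know, ClamAV's hash-based signatures (its `.hdb` files) are plain text lines of the form `MD5:size:name`, but that layout is an assumption here and should be checked against ClamAV's signature documentation. The downloaded database also has to be unpacked first, which this sketch does not cover. The function names are hypothetical.

```python
import hashlib


def parse_hash_signatures(lines):
    """Build {md5 hex digest: (size field, name)} from 'MD5:size:name' lines."""
    table = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split(":")
        if len(fields) < 3:
            continue  # not in the assumed three-field layout
        table[fields[0].lower()] = (fields[1], fields[2])
    return table


def match_bytes(data, table):
    """Return the signature name matching the data's MD5, or None."""
    entry = table.get(hashlib.md5(data).hexdigest())
    if entry is None:
        return None
    size, name = entry
    # Assumption: a size field of '*' means "any size".
    if size != "*" and int(size) != len(data):
        return None
    return name


def match_file(path, table):
    """Hash a whole file and look it up in the signature table."""
    with open(path, "rb") as handle:
        return match_bytes(handle.read(), table)
```

Hash signatures only catch byte-for-byte identical files, which is one reason the comments above point toward heuristics for anything that mutates.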

nosklo
6

I doubt such a list exists. Antivirus companies spend a lot of time and money building their databases, and it seems unlikely that any of them would release the data for free.

Also, as Lasse says, not all viruses have a static signature. The "good" ones (and I would assume that means the majority of viruses from this century) would all be self-mutating.

Dean Harding
  • OK, thanks for the responses. I am willing to rewrite the software to make it "good" and not pointless. I am just not sure I know how. I need the software to be written in Python. How would I go about making it good? – Zac Brown Apr 27 '10 at 04:15
  • @Zachary: Why do you want to write anti-virus software? What do you want to do that your competitors (McAfee, Symantec, AVG, Microsoft, etc) aren't doing, or aren't doing well? – Michael Petrotta Apr 27 '10 at 04:21
  • I want to provide top-quality antivirus software that updates automatically, at a reasonable price. I am also learning along the way. – Zac Brown Apr 27 '10 at 04:23
  • @codeka no, all antivirus vendors release this information. It just isn't in a form readable by any other software. But one could certainly reverse engineer it (disregarding the legal part). – jweyrich Apr 27 '10 at 04:25
  • @Zachary: good luck to you, and I hope you learn something. Note that Microsoft (to pick the AV vendor that I use) publishes high-quality AV software that updates automatically and frequently, for free (for most Windows SKUs). I don't mean to discourage you, but I hope you comprehend your market - it's saturated, and very difficult to develop for. – Michael Petrotta Apr 27 '10 at 04:27
  • OK, then. Do you have any ideas for useful software that is not in a saturated market? I am open to ideas. Just trying to use my skills for good stuff. – Zac Brown Apr 27 '10 at 04:29
  • @jweyrich: Of course they "release" it (otherwise the software wouldn't be able to work) but you can't just "disregard" the legality of reverse engineering the database, particularly when you want to release whatever software you develop from that action. – Dean Harding Apr 27 '10 at 04:37
  • @Zachary - ok, write some software that watches a webcam, and translates hand gestures into software actions (pans, zooms, movements). Sell it as a library for games, image processing apps, etc. – Michael Petrotta Apr 27 '10 at 04:44
  • @Michael Petrotta: I like that idea. Can you point me in the right direction for that? – Zac Brown May 04 '10 at 06:53
2

Comodo publishes a database of malware signatures in CSV format, which you can download from their site at comodo.com (the link there is labelled "Download Virus signature database").

It is quite a large file (about 432 MB), so it should contain a lot of signatures.

AVX-42