
I'm trying to use a small script to download a field from multiple pages. For one thing, I'm only able to get it from one page so far, but the real problem is that I don't know how to hand the output off to a database table. How can I take the output from curl/lynx | grep (which will be all the list items) and move it, list item by list item, into a table in my DB, or into a CSV that's ready for import into the DB?

#!/bin/bash

# Grab the page source, pull out the 8th quote-delimited field, and keep only the <li> lines
lynx --source "http://www.thewebsite.com" | cut -d\" -f8 | grep "<li>"

The database I would connect to is MySQL; we can call the dummy table "listTable". Please try to stick to bash if you can: I'm not allowed to compile anything on the server I'm using, and I can't get curl to work with PHP. Anyway, I'm thinking I need to capture the output in a variable and then systematically pass its contents to the database, right?
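Here's roughly what I have in mind, just as a sketch (the column name item, the credentials, and the database name are all made up):

#!/bin/bash
# Loop version: read the scraped list items one per line and insert each into MySQL.
# "item" is a made-up column name; dbuser/dbpass/mydb are placeholders.
lynx --source "http://www.thewebsite.com" | cut -d\" -f8 | grep "<li>" |
while read -r line; do
    # Strip the <li></li> tags, leaving just the text
    item=$(echo "$line" | sed -e 's/<[^>]*>//g')
    # No quote-escaping here, so this is only safe for trusted data
    mysql -u dbuser -pdbpass mydb -e "INSERT INTO listTable (item) VALUES ('$item');"
done

I suppose I could also redirect the cleaned-up lines to a .csv file instead and load that with LOAD DATA INFILE, if that's easier.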

Wolfpack'08

2 Answers


Use something like awk, sed or perl to create INSERT statements, then pipe that to your SQL client (psql or mysql).
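For example, something along these lines (just a sketch; the listTable/item names and the credentials are made up, and it assumes one clean value per line with no embedded quotes):

lynx --source "http://www.thewebsite.com" | cut -d\" -f8 | grep "<li>" \
  | sed -e 's/<[^>]*>//g' \
  | sed -e "s/.*/INSERT INTO listTable (item) VALUES ('&');/" \
  | mysql -u dbuser -p mydb

The first sed strips the HTML tags; the second wraps each remaining line in an INSERT statement, and mysql reads those statements from stdin.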

Tassos Bassoukos
  • Tassos, would you please give me some sample to show what the piping would look like or how it is formed? – Wolfpack'08 Jan 15 '11 at 22:45
  • Basically, like the other guy said: curl whatever.com | ./script_name, right? – Wolfpack'08 Jan 15 '11 at 22:47
  • Well, it really depends on what you want to do - could be something like `curl somesite.com | grep sed etc | sed -e 's/^\(.*\)$/INSERT INTO tableName (columnName) VALUES (\1);/' | psql dbname`. – Tassos Bassoukos Jan 15 '11 at 22:59
  • I'm going to mess around with this for a while. Very helpful, I think. A strong push in the right direction. Thank you. – Wolfpack'08 Jan 15 '11 at 23:15

Just write a Python script which reads everything from stdin and puts it into the database, then do something like:

curl http://www.google.com | ./put_to_db.py
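A minimal sketch of what put_to_db.py might look like, assuming the MySQLdb module is installed and the same made-up listTable/item schema from the question (credentials are placeholders):

#!/usr/bin/env python
# put_to_db.py -- read lines from stdin and insert each one into MySQL.
import sys
import MySQLdb

# Connection details are placeholders; change them for your server.
conn = MySQLdb.connect(host="localhost", user="dbuser", passwd="dbpass", db="mydb")
cur = conn.cursor()

for line in sys.stdin:
    line = line.strip()
    if line:
        # Parameterised query, so quoting is handled for us
        cur.execute("INSERT INTO listTable (item) VALUES (%s)", (line,))

conn.commit()
cur.close()
conn.close()

Make it executable with chmod +x put_to_db.py before piping into it.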

Elalfer