I want to fill a database table with certain items from the Steam Marketplace; at the moment, specifically guns from CSGO. I can't seem to find an existing database or list of all the gun names, skin names and skin qualities, which is what I want.

One way I thought of doing it is to go to the list of items I want, e.g. "Shotguns", save each item on the page into the database, and then go through each page of that search, e.g.: http://steamcommunity.com/market/search?appid=730&q=shotgun#p1_default_desc http://steamcommunity.com/market/search?appid=730&q=shotgun#p2_default_desc etc.

Firstly, I'm not exactly sure how I would do that, and secondly, I wanted to know if there would be an easier way.

I plan on using the names of items to later get the prices by substituting the names into this: http://steamcommunity.com/market/priceoverview/?currency=3&appid=730&market_hash_name=StatTrak%E2%84%A2%20P250%20%7C%20Steel%20Disruption%20%28Factory%20New%29

And updating the prices every hour or so by running that check for every item (probably at least a few thousand).
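
A minimal sketch of that substitution, assuming PHP. The function name `buildPriceUrl` is hypothetical; the `currency` and `appid` defaults are copied from the URL above, and `rawurlencode()` is used because it produces the same percent-encoding as that URL.

```php
<?php
// Hypothetical helper: builds the priceoverview URL for one item name.
// The defaults (currency 3, appid 730) are taken from the example URL above.
function buildPriceUrl(string $marketHashName, int $currency = 3, int $appId = 730): string
{
    return 'http://steamcommunity.com/market/priceoverview/'
        . '?currency=' . $currency
        . '&appid=' . $appId
        // rawurlencode() turns spaces into %20 and UTF-8 bytes into %XX
        . '&market_hash_name=' . rawurlencode($marketHashName);
}

echo buildPriceUrl("StatTrak\u{2122} P250 | Steel Disruption (Factory New)"), "\n";
```

Running the hourly update would then mean calling this once per stored item name and requesting each resulting URL.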

Mitch8910

1 Answer

The general gist of what you need to do boils down to:

  • Identify the urls you need to parse. In your case you'll notice that the results are loaded via ajax. Right-click the page, click 'inspect element' and go to the network tab. You'll see that the actual url is: http://steamcommunity.com/market/search/render/?query=&start=<STARTVALUE>&count=<NUMBEROFRESULTS>&search_descriptions=0&sort_column=quantity&sort_dir=desc&appid=730&category_730_ItemSet%5B%5D=any&category_730_TournamentTeam%5B%5D=any&category_730_Weapon%5B%5D=any&category_730_Type%5B%5D=tag_CSGO_Type_Pistol&category_730_Type%5B%5D=tag_CSGO_Type_SMG&category_730_Type%5B%5D=tag_CSGO_Type_Rifle&category_730_Type%5B%5D=tag_CSGO_Type_SniperRifle&category_730_Type%5B%5D=tag_CSGO_Type_Shotgun&category_730_Type%5B%5D=tag_CSGO_Type_Machinegun&category_730_Type%5B%5D=tag_CSGO_Type_Knife
  • Identify what the response type is. In this case it is JSON, and the data we want is inside an HTML snippet.
  • Find the tools required to parse it. You can use json_decode(...) to decode the JSON string. This question will give more information on how to parse HTML.
  • You can now feed these URLs to a function that loads the page. You can use file_get_contents(...) or the curl library.
  • Enter the values you parse from the response into your database. Make sure that the script does not get killed when it runs for too long. This question will give you more information about that.

You can use the following as a framework. You'll have to figure out the structure of the HTML yourself, and look up a tutorial for the HTML parser and MySQL library you want to use.

<?php
  //Prevent this script from being killed. Please note that if this script never
  //ends, you'll have to kill it manually
  set_time_limit( 0 );

  //The api does not allow for more than 100 results at a time
  $start = 0;
  $count = 100;
  $maxresults = PHP_INT_MAX;
  $baseurl = "http://steamcommunity.com/market/search/render/?query=&start=$1&count=$2&search_descriptions=0&sort_column=quantity&sort_dir=desc&appid=730&category_730_ItemSet%5B%5D=any&category_730_TournamentTeam%5B%5D=any&category_730_Weapon%5B%5D=any&category_730_Type%5B%5D=tag_CSGO_Type_Pistol&category_730_Type%5B%5D=tag_CSGO_Type_SMG&category_730_Type%5B%5D=tag_CSGO_Type_Rifle&category_730_Type%5B%5D=tag_CSGO_Type_SniperRifle&category_730_Type%5B%5D=tag_CSGO_Type_Shotgun&category_730_Type%5B%5D=tag_CSGO_Type_Machinegun&category_730_Type%5B%5D=tag_CSGO_Type_Knife";

  while( $start < $maxresults ) {
    //Constructing the next url
    $url = str_replace( "$1", $start, $baseurl );
    $url = str_replace( "$2", $count, $url );

    //Doing the request
    $ch = curl_init();
    curl_setopt( $ch, CURLOPT_URL, $url );
    curl_setopt( $ch, CURLOPT_RETURNTRANSFER, 1 );
    $result = json_decode( curl_exec( $ch ), TRUE );
    curl_close( $ch );

    //Doing things with the result
    //
    //First let's see if everything went according to plan
    if( $result == NULL || $result["success"] !== TRUE ) {
      echo "Something went horribly wrong. Please edit the script to take this error into account and rerun it.";
      exit( -1 );
    }

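    //Bookkeeping for the next url we have to fetch
    $count = (int) $result["pagesize"];
    $maxresults = (int) $result["total_count"];
    if( $count <= 0 ) {
      //An empty page would never advance $start, so stop instead of looping forever
      break;
    }
    $start += $count;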

    //This is the html we have to parse
    $html = $result["results_html"];

    //Look up an example how to parse html, and how to get data from it
    //Look up how to make a database connection and how to insert data into
    //your database
  }

  echo "And we are done!";
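
As a starting point for the parsing step, here is a rough sketch using PHP's bundled DOMDocument and DOMXPath classes. The function name `extractItemNames` is hypothetical, and the class name `market_listing_item_name` is an assumption about the markup, not something confirmed above; check the real `results_html` and adjust the XPath query to match.

```php
<?php
// Extracts item names from an HTML fragment such as $result["results_html"].
// ASSUMPTION: each name sits in an element whose class attribute contains
// "market_listing_item_name"; inspect the real response and adjust the query.
function extractItemNames(string $html): array
{
    if (trim($html) === '') {
        return [];
    }

    $doc = new DOMDocument();
    // Collect libxml warnings about imperfect markup instead of printing them.
    $previous = libxml_use_internal_errors(true);
    // The XML prolog is a common way to make loadHTML() treat the input as UTF-8.
    $doc->loadHTML('<?xml encoding="UTF-8">' . $html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Match the class as a whole word inside the class attribute.
    $nodes = $xpath->query(
        '//*[contains(concat(" ", normalize-space(@class), " "), " market_listing_item_name ")]'
    );

    $names = [];
    foreach ($nodes as $node) {
        $names[] = trim($node->textContent);
    }
    return $names;
}

// Made-up sample fragment, only to show the call.
$sample = '<div><span class="market_listing_item_name">Nova | Sand Dune (Field-Tested)</span></div>'
        . '<div><span class="market_listing_item_name">XM1014 | Blue Steel (Minimal Wear)</span></div>';
print_r(extractItemNames($sample));
```

Inside the loop you would call this on `$html` and insert each returned name into your table, preferably with a prepared statement.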
Sumurai8
  • Thanks for the answer. When I Inspect Element > Network, there is nothing there to show that this is the URL, though the info shown when I visit that URL looks like it could be what I need. But I'm still not sure how to access it. I can see that you have laid out what I should do, but I didn't expect there to be so much more to learn. As much as I would love to learn all the extra info to do this, this project is on a timeline and I don't have the time. Are there any other possible ways for me to do this, or could you possibly help me with some code to get me started? – Mitch8910 Apr 18 '15 at 12:55
  • @Sumurai8 I know I'm not supposed to do this, but thank you so much for the code! It helps a lot! One question though: you noted that if the script never ends, I'll have to kill it manually. Is there a reason it shouldn't end? When I'm testing the code it works perfectly, and I can't see where I should add a check for whether it doesn't end. – Mitch8910 Apr 19 '15 at 15:46
  • Badly written functions, or functions you have called might get stuck and never finish. I have not done a great deal of reading on the curl_* functions, so I am not sure if this is actually a problem with those functions. I know with sockets, in some cases a function can be unaware a socket closed, and continue waiting for something to happen on that socket. You might imagine that when a socket is closed, there will never be any data sent on that socket, so the function reading data on that socket will not do anything until something else wakes it up. – Sumurai8 Apr 19 '15 at 19:59
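
One way to bound the wait, sketched under the assumption that a stalled request is the concern: curl's `CURLOPT_CONNECTTIMEOUT` and `CURLOPT_TIMEOUT` options limit how long a single transfer may take. The wrapper name `fetchWithTimeout` and the default values of 10 and 30 seconds are hypothetical choices, not recommendations from the answer.

```php
<?php
// Hypothetical wrapper: fetches a URL but gives up after a bounded time,
// so one stalled connection cannot block the whole loop.
// Returns the response body, or null if the request failed or timed out.
function fetchWithTimeout(string $url, int $connectSeconds = 10, int $totalSeconds = 30): ?string
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    // Maximum time allowed for establishing the connection.
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $connectSeconds);
    // Maximum time allowed for the whole transfer.
    curl_setopt($ch, CURLOPT_TIMEOUT, $totalSeconds);

    $body = curl_exec($ch);
    // The handle is released when $ch goes out of scope.
    return $body === false ? null : $body;
}
```

In the script above, the result of `fetchWithTimeout( $url )` could be passed to `json_decode(...)` in place of the direct `curl_exec(...)` call, treating a `null` return as a failed request.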