I'm developing a web scraper for the Hoymiles monitoring system. One of the statistics I can retrieve is historical data, but it arrives in a strange format. After a lot of research and digging through the platform code, I found that the POST request uses, in addition to the headers and payload, the parameter responseType: "arraybuffer". After more research, I learned that an ArrayBuffer is "a data type used to represent a generic, fixed-size binary data buffer".

My code is as follows:

def plants_data_historycal(self, authorization):

    payload = '''
        {
            "mode":3,
            "date":"2022-01-20"
        }
    '''

    headers = {
        'Accept': 'application/json, text/plain, */*',
        'Accept-Encoding': 'gzip, deflate, br',
        'authorization': authorization,
        'Content-Type': 'application/json;charset=UTF-8',
        'Origin': 'https://global.hoymiles.com',
        'Referer': 'https://global.hoymiles.com/platform/login',
        'Cookie': cookie,
        'User-Agent': 'Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.71 Mobile Safari/537.36'
    }

    response = self.session.post(self.url+'/pvm-data/api/0/statistics/count_station_eq', headers=headers, data=payload)

    if response.status_code != 200:
        raise RuntimeError("The request failed: %s" % response)

    print(response.text)
    data = BeautifulSoup(response.text, 'html.parser')
    data = json.loads(data.text)

    return data

The response to my request looks like this:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20Y
pv_eqP��G�@H��[H�nH���H�0H��G���G�QH���G��H�JH�
�H`�H�1�H�ݗH@]sH��_H`j�H�!�H

The response.text before BeautifulSoup:

\n\x011\n\x012\n\x013\n\x014\n\x015\n\x016\n\x017\n\x018\n\x019\n\x0210\n\x0211\n\x0212\n\x0213\n\x0214\n\x0215\n\x0216\n\x0217\n\x0218\n\x0219\n\x0220\n\x0221\n\x0222\n\x0223\x12e\n\x05pv_eq\x12\\\x00��G\x00�@H��[H\x00�nH���H�\x080H\x00��G���G�Q\x00H���G��\x02H�J\x03H�\r�H`�H�1�H�ݗH@]sH��_H`j�H�!�H���H`z�H I�H
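A side note on the dump above: the � characters suggest that some bytes were lost when the body was decoded as text. The following is only a minimal sketch of that effect, assuming the body was decoded as UTF-8 with replacement of invalid bytes; the four sample bytes are taken from the byte list posted further down in this thread. In requests, response.content returns the undecoded bytes, which is probably the safer starting point for binary data.

```python
# Four raw bytes from the byte list posted later in the thread.
raw = bytes([0x00, 0xF4, 0xD6, 0x47])

# Decoding as text swaps every invalid byte for U+FFFD (shown as "�").
as_text = raw.decode("utf-8", errors="replace")

# Encoding the text again does not bring the original bytes back.
round_trip = as_text.encode("utf-8")

print(repr(as_text))
print(round_trip == raw)
```

Once the replacement has happened, the original values cannot be recovered from the string, so any decoding attempt would need to start from the raw bytes.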

To try to turn this string into something understandable, I looked at the code available through Chrome's inspect option on the Hoymiles home page (https://global.hoymiles.com/platform/home). There I found that they transform the arraybuffer with this function:

transformResponse: [
    function (e) {
        if ("string" == typeof e)
            try {
                e = JSON.parse(e);
            } catch (e) {}
        return e;
    },
],

But even with that, the arraybuffer comes back empty. So I converted the arraybuffer response into a Uint8Array, but I can't understand what the data means.

Uint8Array {
0: 123
1: 34
2: 115
3: 116
4: 97
5: 116
6: 117
7: 115
8: 34
9: 58
10: 34
11: 49
12: 48
13: 48
14: 34
15: 44
16: 34
17: 100
18: 97
19: 116
20: 97
21: 34
22: 58
23: 34
24: 34
25: 44
26: 34
27: 109
28: 101
29: 115
30: 115
31: 97
32: 103
33: 101
34: 34
35: 58
36: 34
37: 116
38: 111
39: 107
40: 101
41: 110
42: 32
43: 118
44: 101
45: 114
46: 105
47: 102
48: 121
49: 32
50: 101
51: 114
52: 114
53: 111
54: 114
55: 46
56: 34
57: 125
}

Does anyone know how to turn this into readable or understandable data?

  • This is probably not the answer you're looking for, but I have always found web scraping in JavaScript to be a much more enjoyable experience using the `puppeteer` library. In addition, you can convert from an arraybuffer to a string in JavaScript quite easily (https://stackoverflow.com/questions/6965107/converting-between-strings-and-arraybuffers). If you know Python, learning JavaScript will be very easy. – carlosdafield Jan 22 '22 at 03:07
  • Sorry for the late answer, but I found that this is actually the Python response already as a string and not as bytes. I tried to figure out how to turn the arraybuffer into something understandable in JavaScript, but I couldn't. I converted the arraybuffer into a Uint8Array, but I can't understand what the data means. I'll edit the question and add the array :) – Orrana Lhaynher Jan 24 '22 at 20:24
  • You have a token verify error. Read my post below and accept the answer if it helped you :) – carlosdafield Jan 24 '22 at 23:13
  • I made a post with what I received with your code, in fact the response I am getting from the request is in json format – Orrana Lhaynher Jan 25 '22 at 12:11

2 Answers

I don't know what data type you're expecting from that Uint8Array, but below are some insights that might help answer your question. The Uint8Array typed array represents an array of 8-bit unsigned integers, so there's ambiguity about what it is supposed to represent. It could be a string, a float, JSON, or an int.

I'm assuming the Uint8Array output you printed treats the keys as index positions in the array and the values as the values at those positions. However, index 50 is not a key in your output, so that's confusing.

Anyway, I made a Uint8Array with the values in the same order as your output below.


// Same array you printed out
const lst = [ 123, 34, 115, 116, 97, 116, 117, 115,  34,  58,  34,  49,  48, 
 48, 34,  44,  34,  100,  97,  116,  97,  34,  58,34,  34,  44,  34,  109,  
 101,  115,  115,  97,  103,  101,  34,  58,  34,  116,  111,  107,  101, 
 110,  32,  118,  101,  114,  105,  102,  121,  32,  undefined,  114,  114,  
  111,  114,  46,  34 ]

// Make the Uint8Array
var data = new Uint8Array(lst);

// Print as a base64-encoded string
let str = Buffer.from(data).toString('base64');
console.log(str); // output: eyJzdGF0dXMiOiIxMDAiLCJkYXRhIjoiIiwibWVzc2FnZSI6InRva2VuIHZlcmlmeSAAcnJvci4i


// Print as float 
const floatValue =new DataView(data.buffer).getFloat64(0);
console.log(floatValue) // output: 1.3718470458079746e+285

// Print the bytes as text (the content turns out to be JSON)
function printAsJson(arr) {
  let str = "";
  for (var i=0; i<arr.byteLength; i++) {
    str += String.fromCharCode(arr[i]);
  }

  var serializedData = JSON.stringify(str);
  let message = JSON.parse(serializedData);

  console.log(message)
}
printAsJson(data); // output: {"status":"100","data":"","message":"token verify rror."

So to sum it up, the JSON interpretation makes the most sense, and it seems like you have a token verify error.
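For the Python side of the scraper, the same check can be sketched as follows. This is only an illustration: byte_values is a name introduced here, and its contents are the values as they currently appear in the question's Uint8Array dump, read in numeric index order.

```python
import json

# Byte values copied from the question's Uint8Array dump, indexes 0 to 57.
byte_values = [
    123, 34, 115, 116, 97, 116, 117, 115, 34, 58, 34, 49, 48, 48, 34, 44,
    34, 100, 97, 116, 97, 34, 58, 34, 34, 44, 34, 109, 101, 115, 115, 97,
    103, 101, 34, 58, 34, 116, 111, 107, 101, 110, 32, 118, 101, 114, 105,
    102, 121, 32, 101, 114, 114, 111, 114, 46, 34, 125,
]

# The bytes are plain ASCII, so they can be decoded as text and parsed as JSON.
message = json.loads(bytes(byte_values).decode("utf-8"))
print(message["status"], message["message"])
```

Read this way, the bytes form a small JSON object with status, data, and message keys, which matches the token verify error described above.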

carlosdafield

Actually, I did have a token verification error, since I was using an outdated authorization token in the JavaScript code. So I ran your code again with the updated token.

Code

const lst = [10, 1, 49, 10, 1, 50, 10, 1, 51, 10, 1, 52, 10, 1, 53, 10, 1, 54, 10, 1, 55, 10, 1, 56, 10, 1, 57, 10, 2, 49, 48, 10, 2, 49, 49, 10, 2, 49 , 50, 10 , 2, 49 , 51, 10 , 2, 49 , 52 , 10 , 2 , 49 , 53 , 10 , 2 , 49 , 54 , 10 , 2 , 49 , 55 , 10 , 2 , 49 , 56 , 10 , 2 , 49 , 57 , 10 , 2 , 50 , 48 , 10 , 2 , 50 , 49 , 10 , 2 , 50 , 50 , 10 , 2 , 50 , 51 , 10 , 2 , 50 , 52 , 18 , 105 , 10 , 5 , 112 , 118 , 95 , 101 , 113 , 18 , 96 , 0 , 244 ,  214 ,  71 ,  0 ,  144 ,  64 ,  72 ,  128 ,  237 ,  91 ,  72 ,  0 ,  162 ,  110 ,  72 ,  128 ,  138 ,  130 ,  72 ,  128 ,  8 ,  48 ,  72 ,  0 ,  236 ,  252 ,  71 ,  128 ,  209 ,  205 ,  71 ,  192 ,  81 ,  0 ,  72 ,  128 ,  189 ,  253 ,  71 ,  128 ,  156 ,  2 ,  72 ,  192 ,  74 ,  3 ,  72 ,  128 ,  13 ,  128 ,  72 ,  96 ,  234 ,  150 ,  72 ,  192 ,  49 ,  161 ,  72 ,  224 ,  221 ,  151 ,  72 ,  64 ,  93 ,  115 ,  72 ,  192 ,  145 ,  95 ,  72 ,  96 ,  106 ,  129 ,  72 ,  160 ,  33 ,  164 ,  72 ,  160 ,  240 ,  143 ,  72 ,  96 ,  122 ,  147 ,  72 ,  32 ,  73 ,  158 ,  72 ,  160 ,  159 ,  139 ,  72]

var data = new Uint8Array(lst);

let str = Buffer.from(data).toString('base64');
console.log(str); 

const floatValue =new DataView(data.buffer).getFloat64(0);
console.log(floatValue) 

function printAsJson(arr) {
  let str = "";
  for (var i=0; i<arr.byteLength; i++) {
    str += String.fromCharCode(arr[i]);
  }

  var serializedData = JSON.stringify(str);
  let message = JSON.parse(serializedData);

  console.log(message)
}
printAsJson(data); 

This was the result:

str

CgExCgEyCgEzCgE0CgE1CgE2CgE3CgE4CgE5CgIxMAoCMTEKAjEyCgIxMwoCMTQKAjE1CgIxNgoCMTcKAjE4CgIxOQoCMjAKAjIxCgIyMgoCMjMKAjI0EmkKBXB2X2VxEmAA9NZHAJBASIDtW0gAom5IgIqCSIAIMEgA7PxHgNHNR8BRAEiAvf1HgJwCSMBKA0iADYBIYOqWSMAxoUjg3ZdIQF1zSMCRX0hgaoFIoCGkSKDwj0hgepNIIEmeSKCfi0g=

floatValue

1.7470648220442632e-260

json

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24i
pv_eq`

Unfortunately, I still haven't gotten data that can be interpreted.
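One possible reading, offered as a guess from the bytes alone: the layout looks consistent with the Protocol Buffers wire format. Byte 10 (0x0A) would be the key for field 1 with a length prefix, repeated for the labels "1" to "24". Byte 18 (0x12) would open field 2, a nested message holding the name "pv_eq" and 96 bytes that can be read as 24 little-endian 32-bit floats. The sketch below decodes the base64 string printed above under that assumption. The schema, the function names, and the meaning and unit of the values are all assumptions, not anything confirmed by Hoymiles.

```python
import base64
import struct

# Base64 string printed above, split into pieces only for readability.
PAYLOAD_B64 = (
    "CgExCgEyCgEzCgE0CgE1CgE2CgE3CgE4CgE5"
    "CgIxMAoCMTEKAjEyCgIxMwoCMTQKAjE1CgIxNgoCMTcKAjE4CgIxOQoCMjAKAjIxCgIyMgoCMjMKAjI0"
    "EmkKBXB2X2VxEmAA9NZHAJBASIDtW0gAom5IgIqCSIAIMEgA7PxHgNHNR8BRAEiAvf1HgJwCSMBKA0iA"
    "DYBIYOqWSMAxoUjg3ZdIQF1zSMCRX0hgaoFIoCGkSKDwj0hgepNIIEmeSKCfi0g="
)


def read_varint(buf, pos):
    """Read a base-128 varint starting at pos; return (value, next_pos)."""
    value, shift = 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


def iter_fields(buf):
    """Yield (field_number, payload) pairs; only length-delimited fields are handled."""
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type != 2:
            raise ValueError(f"unexpected wire type {wire_type}")
        length, pos = read_varint(buf, pos)
        yield field_number, buf[pos:pos + length]
        pos += length


def decode_history(raw):
    """Guessed schema: field 1 = label, field 2 = {1: series name, 2: packed float32}."""
    labels, series = [], {}
    for field_number, payload in iter_fields(raw):
        if field_number == 1:
            labels.append(payload.decode("ascii"))
        elif field_number == 2:
            name, values = None, []
            for inner_number, inner in iter_fields(payload):
                if inner_number == 1:
                    name = inner.decode("ascii")
                elif inner_number == 2:
                    values = list(struct.unpack(f"<{len(inner) // 4}f", inner))
            series[name] = values
    return labels, series


labels, series = decode_history(base64.b64decode(PAYLOAD_B64))
for label, value in zip(labels, series["pv_eq"]):
    print(label, value)
```

In the scraper itself, the same function could presumably be fed response.content instead of the base64 string. If the platform's JavaScript bundle contains a schema definition for this message, using it with a protobuf library would be more reliable than this hand-rolled reader.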