Python: Read HTTPResponse twice

Asked Aug 06 '21 at 06:31

Active Aug 06 '21 at 06:31

Viewed 232 times

I have a urllib3.response.HTTPResponse (a "file-like" object) and lxml.html.lxml_parse, which accepts only a file path, a URL, or a "file-like" object (given an HTTPResponse, it also extracts the URL). I also need the content of the response itself.

The answers to "Why can't I call read() twice on an open file?" do not fit this situation: the stream is consumed when it is read into a variable, and .seek is not defined for HTTPResponse.

copy.copy and copy.deepcopy do not work either.


Nick Vee


Read the content into a variable first and then pass that string to lxml.html.fromstring instead? – Iain Shelvington Aug 06 '21 at 06:36
@IainShelvington Not a bad idea. But it requires many changes in the third-party package. Nevertheless, you can post it as an answer. – Nick Vee Aug 06 '21 at 08:04
What's the third party package? If you have access to the lxml html object you could convert that to a string instead? – Iain Shelvington Aug 06 '21 at 08:07
I can trace the input path to one of its usages, but that does not mean the properties of the input can be used separately elsewhere. However, I was able to solve my problem using `Element` from `lxml` – Nick Vee Aug 07 '21 at 15:08
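
For readers with the same problem, the suggestion from the comments (read the content once, then reuse it) could be sketched roughly as follows. This is only an illustration under assumptions: `buffer_stream` and `OneShot` are hypothetical names, and `OneShot` merely stands in for a non-seekable response body so the snippet runs without a network call. Any object with a `.read()` method, such as the HTTPResponse from the question, would presumably be passed the same way.

```python
import io


def buffer_stream(stream):
    """Read a one-shot file-like object once and return (data, replay).

    `data` is the full body as bytes; `replay` is a seekable io.BytesIO
    copy that can be handed to a parser expecting a file-like object.
    (Hypothetical helper, not part of urllib3 or lxml.)
    """
    data = stream.read()  # consumes the original stream
    return data, io.BytesIO(data)


class OneShot:
    """Hypothetical stand-in for a response body that cannot be rewound."""

    def __init__(self, payload):
        self._payload = payload

    def read(self):
        # Return the payload once; later calls return empty bytes.
        payload, self._payload = self._payload, b""
        return payload


body, replay = buffer_stream(OneShot(b"<html><body><p>hi</p></body></html>"))

first = replay.read()   # first consumer, e.g. an HTML parser
replay.seek(0)          # rewind: BytesIO supports seek, unlike the response
second = replay.read()  # second consumer sees the same bytes
```

One trade-off of this approach: the in-memory copy no longer carries the URL that the parser would extract from the HTTPResponse, so the URL would probably have to be supplied separately (lxml's parse functions accept a `base_url` argument that may serve this purpose).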

0 Answers