
Basically, I have to make a script, and one part of it has to access a file containing data that I need, specifically the link and the name. The problem is that the file doesn't have a very specific format. File:

Package For IoT - 0.0.4 - 279 - https://package-server.something.eu-central-1.host.com/production/package_manager/
device/tlt_custom_pkg_package_iot_0.0.4_morestuff.extention - More text - tlt_custom_pkg_package_iot 
- 0.0.4
Package For IoT - 1.2.3.4 - 562 - https://package-server.something.eu-central-1.host.com/production/package_manager/
device/tlt_custom_pkg_package_iot_1.2.3.4_morestuff.extention - More text - tlt_custom_pkg_package_iot 
- 1.2.3.4
Package For IoT - 5.2.1 - 556 - https://package-server.something.eu-central-1.host.com/production/package_manager/
device/tlt_custom_pkg_package_iot_5.2.1_morestuff.extention - More text - tlt_custom_pkg_package_iot 
- 5.2.1

This is more or less the format the file follows. Of course, there are more words in the names, and the "More text" part is usually the description. To read the file I just do this:

        ssh.exec_command("gen list command")
        cat_stdin, cat_stdout, cat_stderr = ssh.exec_command("cat listpath")
        cat_output = cat_stdout.readlines()
        for line in cat_output:
            line_split = line.splitlines()

This gives me each whole line in the form of a list object:

[Package For IoT - 0.0.4 - 279 - https://package-server.something.eu-central-1.host.com/production/package_manager/
device/tlt_custom_pkg_package_iot_0.0.4_morestuff.extention - More text - tlt_custom_pkg_package_iot 
- 0.0.4]

Now here is the question: how do I split this list object into something I can use? Because the number of words differs from package to package, I can't split on whitespace or "-". The thing that doesn't change is the server address (at least for now; it might later), so the safe option would be to use the "https" part to split off the link (I need it, so I should store it in a variable) and then the "tlt_custom" part for the name (I also need it), as this specific word also stays the same. Then I could process that part further and adapt it for later use. But this is where I struggle, as I only recently started using Python and got stuck at this part. What I tried:

        for line in cat_output:
            line_split = line.splitlines()
            split_list = ["https", "tlt"]
            temp = zip(chain([0], line_split), chain(line_split, [None]))
            res = list(split_list[i] for i in temp)

            print(str(res))

This results in an error: list indices must be integers or slices, not tuple. I also wanted to try the solutions from here: How to split elements of a list? But I couldn't understand how to adapt them to my case.

TL;DR: I want to get the full link and the tlt_custom string from the list object, but I can't figure out how to do that.

UPDATE: I tried a few more things and I think I got what I need:

        ssh.exec_command("command")
        cat_stdin, cat_stdout, cat_stderr = ssh.exec_command("cat filepath")
        cat_output = cat_stdout.readlines()
        for line in cat_output:
            line_split = line.splitlines()
            temp = [i.split('https:')[1] for i in line_split]
            package_info = "https:"+str(temp[0])
            temp = package_info.split('.ipk')[0]
            package_link = str(temp)+".ipk"
            temp = package_info.split('tlt_')[2]
            package_name = "tlt_"+str(temp)

            print(package_link) 
            print(package_name) 

This way I get the link and the name. In theory it should work, but I feel like this is a bit convoluted, as I'm removing and re-adding parts of the strings. Is there a better solution for this?

Wowy
  • each element of `temp` is a `tuple` and you are using `tuple` as `list-index`, but `list-index` must be an `integer`. – nobleknight Apr 30 '21 at 07:52
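
To illustrate the comment above, here is a minimal sketch that reproduces the error outside of the SSH code. The `line_split` value is a made-up placeholder, not real output from the file; the point is only that `zip` yields tuples, and a tuple cannot be used as a list index.

```python
from itertools import chain

# Placeholder data standing in for one line of the file (an assumption).
line_split = ["some line"]
split_list = ["https", "tlt"]

# zip() pairs the elements up, so every item of `pairs` is a tuple.
pairs = list(zip(chain([0], line_split), chain(line_split, [None])))
print(pairs)

try:
    split_list[pairs[0]]  # indexing a list with a tuple
except TypeError as exc:
    message = str(exc)
    print(message)
```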

1 Answer


Hmm, if I read it correctly, the actual separator here is ' - '. So you should just split your lines on that:

ssh.exec_command("command")
cat_stdin, cat_stdout, cat_stderr = ssh.exec_command("cat filepath")

sep = ' - '              # DRY principle: write it once, use it many times
for line in cat_stdout:
    line_split = line.strip().split(sep)
    package_link = line_split[3]
    # joining last 2 fields will allow possible - chars in "More text"
    package_name = sep.join(line_split[-2:])
    print(package_link) 
    print(package_name) 
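
To try the splitting logic without an SSH connection, here is a small self-contained sketch. It assumes that each record sits on a single line in the real file (the sample in the question looks wrapped) and that the package title itself never contains ' - '. The helper name `parse_line` is hypothetical, and the sample string is just the first record from the question joined into one line.

```python
SEP = " - "

def parse_line(line):
    """Return (link, name) for one record line (hypothetical helper)."""
    fields = line.strip().split(SEP)
    link = fields[3]                # 4th field counted from the start
    name = SEP.join(fields[-2:])    # last two fields counted from the end
    return link, name

# First record from the question, assumed to be one physical line.
sample = (
    "Package For IoT - 0.0.4 - 279 - "
    "https://package-server.something.eu-central-1.host.com/production/package_manager/"
    "device/tlt_custom_pkg_package_iot_0.0.4_morestuff.extention"
    " - More text - tlt_custom_pkg_package_iot - 0.0.4\n"
)

link, name = parse_line(sample)
print(link)
print(name)
```

Because the name is taken with negative indices, extra ' - ' sequences inside the description do not shift it. The link is still taken by position from the start, so it would shift if the title ever contained the separator.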

You could even directly build a list of dict with a comprehension:

...
sep = ' - '
data = [{'link': line_split[3], 'name': sep.join(line_split[-2:])}
        for line in cat_stdout
        for line_split in (line.strip().split(sep),)]
Serge Ballesta
  • I get: ```'builtin_function_or_method' object is not subscriptable``` when trying the first option, do I have to add something for it to work or it shouldn't have happened at all? – Wowy Apr 30 '21 at 13:49
  • @Wowy: My bad. There was a typo: `line.split[-2]` instead of `line_split[-2]`. I know that manually copying code should be avoided, yet I sometimes do it... – Serge Ballesta Apr 30 '21 at 13:53
  • could you explain me what this ```line_split[-2:]``` part does? I'm especially unsure about ```:``` – Wowy May 03 '21 at 05:07
  • @Wowy: This is the slice starting at the element before the last one (so before last and last). This [post](https://stackoverflow.com/a/509295/3545273) contains a nice and detailed explanation about slices – Serge Ballesta May 03 '21 at 06:29
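
As a quick illustration of the slice discussed in the comments, here is a sketch using a made-up list of fields (the values are placeholders, not real data from the file):

```python
# Placeholder fields, roughly shaped like one split line (an assumption).
fields = ["Package For IoT", "0.0.4", "279", "link", "More text", "name", "0.0.4"]

second_to_last = fields[-2]   # a single element, counted from the end
last_two = fields[-2:]        # a slice: from the second-to-last element to the end

print(second_to_last)
print(last_two)
```

Without the colon you get one element; with it you get a new list holding every element from that position onward.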