Beginner programmer here. After a bunch of reading about 'variables', which don't exist in Python (I still don't get that), I've come to the opinion that I should be using lists in my data structure, but I'm not entirely sure.

I'm analysing a string which contains commands and normal words to be printed out. If a word is a command, which in this case means it is preceded by an '@', I want it to be in a separate list from the actual words. I will then process all this to be printed out afterwards, but I want the list to be ordered, so I can go through each element and test whether it has a word in it and, if not, do some action on the command.

So what I really want is a list with two indices (thank you!) (what do you call that?), like this:

arglist[x][y]

so I can go through arglist and check whether each entry contains a command or a word to be printed out. I want arglist[x] to contain words and arglist[y] to contain commands.

arglist = [] # not sure how to initialise this properly.
doh="my @command string is this bunch of words @blah with some commands contained in it"
for listindex, word in enumerate(doh):
    if word.startswith('@'):
        # its a command 
        arglist[listindex] = None
        arglist[listindex][listindex]=command
    else:
        # its a normal word
        arglist[listindex]=word
        arglist[listindex][listindex]=None

then i want to be able to go down the list and pick out commands, i guess that would be something like this:

# get the number of items in the list and go through them...
for listindex, woo in enumerate(len(arglist)):
    if arglist[listindex] is None:
        # there is no word here, so print command
        print arglist[listindex][listindex]
    else:
        # just print out word
        print arglist[listindex]

So, my question is: which data type/structure should I be using, and should I (and how do I) initialise it? Am I barking up the right tree here?

Edit: I just found this gem and now I'm even more unsure. I want the fastest lookup possible on my data, but I still want it to be ordered.

dict and set are fundamentally different from lists and tuples. They store the hash of their keys, allowing you to see if an item is in them very quickly, but require that the key be hashable. You don't get the same membership testing speed with linked lists or arrays.
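
To make the quoted distinction concrete, here is a small sketch (the names are made up for illustration). The speed difference in the quote concerns membership tests (`x in container`), not fetching an item by its position, which a plain list already does directly while staying ordered.

```python
words = ["my", "@command", "string", "is", "this"]

# Lookup by position: a list does this directly and keeps its order.
second = words[1]

# Membership test on a list: elements are compared one by one.
in_list = "@command" in words

# Membership test on a set: the item is hashed and checked directly.
word_set = set(words)
in_set = "@command" in word_set

print(second, in_list, in_set)
```

So if the goal is only to step through the words in order, membership speed may not matter at all.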

many thanks as usual for any help.

edit: eg my string from above should look something like this. doh="my @command string is this bunch of words @blah with some commands contained in it"

arglist[1] = 'my'
arglist[1][1] = None

arglist[2] = None
arglist[2][1] = command

arglist[3] = 'string'
arglist[3][1] = None

etc etc

this whole thing has left me a bit baffled i shall try and update this later.

EDIT: if anyone wanted to know what this was all about look here

  • There are *"variables"*, they are just not bound to a type... – Constantinius Dec 02 '11 at 13:29
  • Seems like a [code review](http://codereview.stackexchange.com) question. – kojiro Dec 02 '11 at 13:32
  • That's not technically true - variables are bound to a type, it's *names* that are not. It *is*, admittedly, kind of confusing, as one primarily deals with names, not with the actual variables - because of this, everyone calls names "variables", which makes it even worse. – Nate Dec 02 '11 at 13:32
  • Python has variables, [it just doesn't have static variables](http://stackoverflow.com/questions/592931/why-doesnt-python-have-static-variables). – kojiro Dec 02 '11 at 13:32
  • hokay, doesnt make it any clearer though, why have mutable and immutable things if nothing is bound to any type ? actually dont answer that it will just confuse me even more :) –  Dec 02 '11 at 13:32
  • @kojiro, absolutely not anywhere near a code review question, code review is for reviewing code, not asking questions. (as far as i knew) –  Dec 02 '11 at 13:33
  • *"a list with two indexes ... `arglist[x][y]` ... `arglist[x]` to contain words and `arglist[y]` to contain commands"*. Erm.. I think your concept of 2d arrays/list is a little off. That's not how multidim arrays work, python or other languages. – Shawn Chin Dec 02 '11 at 13:36
  • What do you want to do with this? Why is order important? An idea of what you're trying to achieve would greatly inform advice given. – MattH Dec 02 '11 at 13:36
  • Ask one question at a time. What is your desired output? The plural of index is indices ;) – Jochen Ritzel Dec 02 '11 at 13:37
  • order is important because i am reading it back one at a time to print out, except if it contains a command, which then should do some action or another. –  Dec 02 '11 at 13:40
  • So the output you'd expect is two separate lists, one for words and one for commands and they all appear in order? – Shawn Chin Dec 02 '11 at 13:44
  • yep :) each element should be a 'word' position in the string so to speak, so i can step through the words in the string (as written above) and check if it contains a word or a command. –  Dec 02 '11 at 13:48
  • ok, now you've lost me again. This sounds like an [XY problem](http://meta.stackexchange.com/q/66377/134327). How about you describe what you're trying to achieve rather than how you're trying to solve it? – Shawn Chin Dec 02 '11 at 13:52
  • i thought it was just a simple question on how to initialise lists! (with multiple *indices*) updated above... im not sure how else to describe it other than above. what ive written above is *exactly* what im trying to do. –  Dec 02 '11 at 13:55
  • The problem is your concept of multi-dimensional lists is not accurate. If `arglist[2] = None` then there won't be a `arglist[2][1]`. – Shawn Chin Dec 02 '11 at 13:57
  • maybe you could use a list of 2-tuples `[(command, None), (None, word), (command, word), ...]` – Facundo Casco Dec 02 '11 at 14:21
  • no idea mate, completely baffled now. –  Dec 02 '11 at 14:24

3 Answers

2

The problem here is you've misunderstood the concept of multidimensional lists (arrays). To annotate the expected output you appended to your question:

arglist[1] = 'my'
arglist[1][1] # equivalent to 'my'[1], so you get 'y'.
              # and you cannot assign anything to arglist[1][1] 
              # because 'str' is immutable

arglist[2] = None
arglist[2][1] # invalid, because arglist[2] (None) is not subscriptable
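
The two failures annotated above can be reproduced directly. This is only a minimal sketch; the list contents and flag names are illustrative.

```python
arglist = ["zero", "my", None]

# Indexing twice indexes into the element itself: 'my'[1] is 'y'.
second_char = arglist[1][1]
print(second_char)

assign_failed = False
index_failed = False

# Strings are immutable, so assigning to one of their positions fails.
try:
    arglist[1][1] = None
except TypeError as exc:
    assign_failed = True
    print("cannot assign:", exc)

# None is not a sequence, so it cannot be indexed at all.
try:
    arglist[2][1]
except TypeError as exc:
    index_failed = True
    print("cannot index:", exc)
```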

If you simply want to iterate through the words and perform different operations depending on whether it is a command (starts with @) or a word, then you can do:

for val in doh.split():
    if val.startswith("@"):  # this is a command
        do_commandy_stuff(val)
    else:   # this is a word
        do_wordy_stuff(val)

If what you want is to be able to quickly look up words using an index and determine if it is a command or not, then how about:

>>> lookup = [(w.startswith("@"), w) for w in doh.split()]
>>> lookup
[(False, 'my'), (True, '@command'), (False, 'string'), (False, 'is'), (False, 'this'), (False, 'bunch'), (False, 'of'), (False, 'words'), (True, '@blah'), (False, 'with'), (False, 'some'), (False, 'commands'), (False, 'contained'), (False, 'in'), (False, 'it')]

lookup is now a list of tuples. The first value in the tuple denotes if it is a command or not, the second stores the word.

Looking words up is simple:

is_command, word = lookup[1]   # is_command = True, word = "@command"

While this seems closer to what you're trying to achieve, I don't see an obvious benefit unless you need lots of random access to words.
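
For stepping through the list in order, something along these lines might work. It is a sketch under assumptions: `handle_command` is a hypothetical placeholder for whatever a command should actually do, and the output is simply collected into a list.

```python
doh = "my @command string is this bunch of words @blah with some commands contained in it"

# Same structure as above: one (is_command, word) pair per word, in order.
lookup = [(w.startswith("@"), w) for w in doh.split()]

def handle_command(name):
    # Hypothetical placeholder: a real handler might change the text colour.
    return "<%s>" % name

output = []
for is_command, word in lookup:
    if is_command:
        output.append(handle_command(word[1:]))  # drop the leading '@'
    else:
        output.append(word)

print(" ".join(output))
```

Because the pairs stay in the original word order, each command takes effect at the position where it appeared in the string.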

Shawn Chin
  • i think you misunderstood what i was asking i want a 'lookup table' type thing that already has all this stuff processed in. so yes i could use this function *before* i put it in my data structure. –  Dec 02 '11 at 14:14
  • Added another alternative, assuming I got you right this time. – Shawn Chin Dec 02 '11 at 14:22
  • @colonwq is that closer to what you're trying to achieve then? – Shawn Chin Dec 02 '11 at 14:33
  • ok, apologies for getting overwhelmed, this is nearly it i think, ill have to post back when ive got more a handle on it, cheers again... –  Dec 02 '11 at 17:42
  • thanks dude, do you think tuples are the fastest thing to use for a lookup like this? that might be another question, or it might be the same one :) –  Dec 02 '11 at 18:04
  • That would be a different question, but yes, if you do want to lookup items by index then `list`/`tuple` will be faster than `dict`. Of course, unless you're dealing with large amounts of data, the difference may not be noticeable. – Shawn Chin Dec 02 '11 at 18:13
  • Yup. Tuples are immutable. A list containing tuples as used above should be ok for your needs. – Shawn Chin Dec 03 '11 at 15:58
  • this is what i meant: http://pastebin.com/YfzMVj1i - and you helped solve it even though you didnt know what for, i really cant think of a better way of doing it than that, doing it 'inline' was messing with the timing too much. –  Dec 05 '11 at 06:03
  • Haven't got a chance to look at it in detail, but impressive result. Well done! – Shawn Chin Dec 05 '11 at 10:18
  • thankyou very much :) havent added the version with your name in it yet :) you helped break the back of the most annoying bit. ta! –  Dec 05 '11 at 13:44
1

If I'm guessing what you're trying to do correctly, you just need two lists. Something like:

>>> def separate_commands(sample):
...   cmds, words = [], []
...   for word in sample.split(' '):
...     if word.startswith('@'):
...       cmds.append(word)
...     else:
...       words.append(word)
...   return cmds, words
...
>>> cmds, words = separate_commands("my @command string is this bunch of words @blah with some commands contained in it")
>>> print cmds
['@command', '@blah']
>>> print words
['my', 'string', 'is', 'this', 'bunch', 'of', 'words', 'with', 'some', 'commands', 'contained', 'in', 'it']

Update

>>> COMMANDS = dict(
...   green = '^]GREEN;',
...   brown = '^]BROWN;',
...   blink = '^]BLINK;',
...   reset = '^]RESET;',
... )
>>>
>>> def process_string(sample):
...   ret = []
...   for word in sample.split(' '):
...     if word.startswith('@'):
...       ret.append(COMMANDS.get(word[1:],'<UNKNOWN COMMAND>'))
...     else:
...       ret.append(word)
...   return ' '.join(ret)
...
>>> print process_string("my @green string is this bunch of words @reset with some commands contained in it")
my ^]GREEN; string is this bunch of words ^]RESET; with some commands contained in it
MattH
  • thanks dude, how do i determine in which order those commands came though, i need to know where in the string they happened. say for example if one of the commands tells the text colour to change to blue and then prints a word then tells the text colour to change to green. see what i mean ? –  Dec 02 '11 at 14:00
  • Then you should process them inline! What would you do to the string with a green command? – MattH Dec 02 '11 at 14:02
  • what does 'process them inline' mean ? this is primarily all for a txtfx terminal class im trying to make. obv. with a green command the text would turn green etc... –  Dec 02 '11 at 14:04
  • What do you need to do to the input to create the desired output? – MattH Dec 02 '11 at 14:05
  • *process them inline*, what I mean is that you had the words and the commands interleaved in the first place. That's how they arrived, if you need to process the string and interpret the commands in the string, then look at the string word by word, create the output word by word and when you hit a command, modify your output accordingly. – MattH Dec 02 '11 at 14:07
  • the desired output is storing the data in the appropriate 'variable type' to deal with it in the way described. ie in order with an index being able to select between commands and words. –  Dec 02 '11 at 14:10
  • ah, i already have it worky doing it like that (you mean what i would call on the fly) probably worked in studios too much. yesh inline is how i did it but its interfering with the timing of how i want it printed out, so im rewriting it again but processing the string into words and commands and then having some kind of fast lookup on them makes it all a bit nicer and easier to time. –  Dec 02 '11 at 14:11
  • I'm having trouble understanding why you need to store the data in the appropriate data structure in order to deal with it in the way described, instead of just processing it in the way described. – MattH Dec 02 '11 at 14:13
  • @colonwq *"this is primarily all for a txtfx terminal class im trying to make. obv. with a green command the text would turn green etc."* <-- that's what I meant by describing what you're trying to achieve instead of how you're doing it – Shawn Chin Dec 02 '11 at 14:13
  • yeah, but that is irrelevant. –  Dec 02 '11 at 14:17
  • because it prints out multiple things at different speeds at the same time. i really just needed to know what data type to use, now im completely confused by all this! –  Dec 02 '11 at 14:22
  • You're not the only one confused. You can't be advised on how to store your data if people don't know how you're planning on using it. Did you spot the XY comment above? – MattH Dec 02 '11 at 14:34
  • why? thats completely illogical. if i asked how to write the program i could see how that would be applicable, but otherwise, no. i think i have worked out already what the best thing to do is because i been thinking about it for a week :) but i could ask that as another question. i rather thought that was more of a question for code review than here. but anyway apart from my pedantry, i really appreciate the help, without you guys i would be lost. –  Dec 02 '11 at 17:05
  • No, no it's not completely illogical. You're a self-admitted beginner, your knowledge of python and programming nomenclature is as to be expected for a beginner. Everyone here who's been trying to help you has had to **guess** at what you're trying to do, because the thing you asked to do didn't make much sense and no-one here wants to give bad advice. I'm glad that you've found some assistance here. – MattH Dec 02 '11 at 20:32
  • it IS illogical, but if you can think of a better way of organising this data, i will gladly say, OK you were right. http://pastebin.com/YfzMVj1i :) im not *admitting* im a beginner i *am* a beginner! however, im not stupid, im just bad at explaining stuff apparently. anyway forget all that, thanks for your help! –  Dec 05 '11 at 06:09
1

If you need to keep it ordered, then I suggest using a list of tuples. The code below is a generator function, which can be used to build a list or to process one item at a time.

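import re
from collections import OrderedDict

def parseCommand(in_string):
    pieces = re.split(r'\B@.*?\b', in_string) # split across words beginning with '@'
    # pieces[0] is the text before the first command (empty if the string
    # starts with '@'), so it never contains a command and is skipped.
    for piece in pieces[1:]:
        command_parts = piece.split(None, 1)
        if not command_parts:
            continue  # a bare '@' with nothing after it
        # A command with no following text gets an empty string.
        text = command_parts[1].strip() if len(command_parts) > 1 else ''
        yield (command_parts[0], text)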

command_list = list(parseCommand("my @command string is this bunch of words @blah with some commands contained in it"))

The best tool at this point is probably an ordered dict

commands = OrderedDict()
commands.update(command_list)

From which you can fetch individual commands by name:

blahcommand = commands['blah']

Or treat it as a stack or queue with the .popitem() method (whether it behaves as a stack or a queue depends on the boolean argument, last, passed to popitem()).
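
As a rough illustration of the two behaviours of popitem(), here is a small sketch. The dictionary contents are typed in by hand to mirror the example string, and the "extra" entry is made up.

```python
from collections import OrderedDict

commands = OrderedDict([
    ("command", "string is this bunch of words"),
    ("blah", "with some commands contained in it"),
])

# last=False removes the oldest entry first (queue, FIFO).
first = commands.popitem(last=False)

commands["extra"] = "added later"

# last=True (the default) removes the newest entry first (stack, LIFO).
newest = commands.popitem(last=True)

print(first)
print(newest)
print(list(commands))
```

One thing to keep in mind: a dict holds only one value per key, so a command name that appears twice in the string would keep only its last text.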

kojiro
  • that seems like what i might need. really?? i have to *import* something to do this !?!?! thankyou very much! –  Dec 02 '11 at 14:15
  • @colonwq The way you phrased that makes it sound like you think importing stuff is bad. [Import is your friend](http://effbot.org/zone/import-confusion.htm) – kojiro Dec 02 '11 at 15:58
  • unless there is an absolute need, i am tending to do as little as possible, which is why i didnt use curses for my text fx class, and also why it has become a huge pain :) –  Dec 02 '11 at 17:08