
I recently got into Stata coming from a procedural/OO/functional background, and am having trouble understanding the basic elements of the language.

For example, I discovered that there is a syntax command which "allows programs to interpret the arguments the user types according to a grammar, such as standard Stata syntax". I infer this is why some commands require a list of variables given as arguments to be separated by whitespace while others require a comma-separated list. But the idea of a program defining its own syntax, instead of the (parameter) syntax being enforced by the language, seems plain weird.

Another quite interesting construct is the syntax for macro definition and expansion (`macro') and the apparent absence of local variables as known in other languages.

Is there something like a "Stata for Java developers" document explaining the basic concepts of the language to people with my background?

PS: Apologies if this question seems unclear. Unfortunately, I can't formulate more concrete/clear questions at this point :(

blubb
  • Is your question just "Is there a good quickstart guide"? Or do you want to know something about the language features? In either case, you probably don't need these programming features, at least for now. – Marcin May 10 '11 at 14:33
  • @Marcin: I want to understand the language, rather than just be able to apply it to a problem. What makes you think I don't need these features? I am supposed to take over a code base from a coworker which makes use of these features, that's how I discovered them... – blubb May 10 '11 at 15:55
  • I think that because I'm familiar with other languages that use such features. There aren't going to be any computations that actively require them, unless your language is very, very funky. Learn the rest of the language before trying to deploy those features in new code. – Marcin May 10 '11 at 16:27

2 Answers


I'm not exactly sure what you are looking for... but here are a few related points. Writing Stata code is kind of like writing a Unix shell script or a Windows batch file. Each line executes a command, and the first word is the command name. By convention, most commands have the following structure:

command [varlist] [=exp] [if expression] [in range] [weight] [using filename] [, options]

Brackets [ ] mean that an element is optional (or unavailable, depending on the command). Some commands can be prefixed (for example with by:, xi:, or svy:). The syntax of commands written by StataCorp and experienced users is pretty consistent. But because Stata users also write commands, you occasionally see things that are wacky.
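To make the template concrete, here is a small sketch using the auto data set that ships with Stata. The particular commands and the new variable name price_k are just my choices for illustration:

```stata
sysuse auto, clear

* command, varlist, if qualifier, then options after the comma
summarize price mpg if foreign == 1, detail

* the in qualifier restricts a command to a range of observations
list make price in 1/5

* a prefix runs the command once per group
bysort foreign: summarize price

* the =exp element, combined with an if qualifier
generate price_k = price / 1000 if !missing(rep78)
```

Each line reads left to right in the same order as the template: command name first, then the variables, then the qualifiers, then the options.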

When Stata users write commands, they are saved in .ado files (not .do) and are defined using the program command. (See help program and the "Ado files" section of the manual.) Writing a command is akin to writing a function in other languages (e.g., MATLAB).

The syntax command is used to help you write your own command. When you execute a command, everything following the command's name (command above) is passed to the program in the local macro `0'. The syntax command parses this local macro, so that you can reference `varlist' or `if' and so on. In theory, you could parse `0' yourself, but the syntax command makes it much easier for you and your users (as long as you are following the conventional syntax). I put an example at the bottom.

I don't know exactly what you mean by "apparent absence of local variables as known in other languages." Macros store a single string or a single number in memory. Here's a comment I wrote about Stata's local/global macros. They are indeed a unique feature of Stata's programming language. As their names imply, "local" macros are only available within a specific program (command) or .do file, while "global" macros are available throughout a Stata session.

I found that, once I got used to macros in Stata, I started to miss them in other languages. They are pretty handy. In addition to (local/global) macros and the main data set, you can also store "things" in memory with the scalar and matrix commands (and one or two other obscure things).
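A rough sketch of these storage types, in case it helps. The names vars, n, outcome, ratio and A are made up for this example, and it assumes it is run as one do-file (local macros do not survive between separate runs):

```stata
sysuse auto, clear

* a local macro holds text that is pasted into the line before it runs
local vars price mpg
summarize `vars'

* with an equals sign, the expression is evaluated and the result is stored
local n = 2 + 3
display "n is `n'"

* a global macro is visible for the whole session and is expanded with a dollar sign
global outcome price
summarize $outcome

* scalars and matrices are separate named objects, not macros
scalar ratio = 10 / 4
matrix A = (1, 2 \ 3, 4)
display scalar(ratio)
matrix list A
```

The main difference from variables in other languages is that a macro is substituted as text: summarize `vars' is literally rewritten to summarize price mpg before Stata executes it.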

I hope that helps. Here's a list of resources that might help.

Example:

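capture program drop myprogram    // lets you re-run this example
program define myprogram
    // parse local macro 0 into the locals varlist, if, hello and yes
    syntax varlist [if], [hello(string) yes]
    // show the raw input and the parsed local macros
    macro list _0 _varlist _if _hello _yes
    summarize `varlist' `if'
    display "Here's the string in my hello option: `hello'"
    // the local yes contains "yes" when the option is given, else it is empty
    if !missing("`yes'") di "Yes is on"
    else                 di "Yes is off"
end

sysuse auto.dta
myprogram rep78 headroom if price > 5000 , hello("world") yes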
Keith
  • You nailed it. This plus your answer in the other question untied the knot. Thanks! – blubb May 10 '11 at 20:24
  • Lovely answer! One quibble: the `syntax` command is *optional*... for example, one can get by using the `args` command instead. – Alexis Feb 08 '22 at 06:50

A few books offer an "X for Y users" approach, but generally between stats software solutions. Regarding your question, I would recommend using instinct first.

I started reading (programming and markup) code about ten years ago, and even though I cannot code in a large number of languages, I can read a few languages rather easily. I found Stata easy because most of its core commands are straightforward, with recurrent optional statements like over, if or replace (to take a deliberately diverse set of statements) that are easy to understand and then apply.

When I teach Stata, I always have problems getting students to use the help pages as much as I do (and I love the fact they can be accessed so easily, just like in R). I explain the paradox by considering the fact that I can read the syntax indications straightaway. Syntax is very well covered by the previous reply to your question.

The extra mile consists of opening the [R], [U] and especially [P] handbooks that come with Stata in the utilities folder. There is a wealth of detail there, which will interest both programmers and statisticians in training. This is where I learnt to use macros and loops, beyond the obvious logic of commands like local/global and foreach/while (if I understand the term correctly, Stata is Turing-complete).
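For a flavour of the macro and loop commands mentioned above, here is a minimal sketch on the auto data set that ships with Stata. The loop macro names v and i and the local named label are arbitrary choices of mine:

```stata
sysuse auto, clear

* foreach walks over the words of a list; the local v holds the current word
foreach v in price mpg weight {
    quietly summarize `v'
    display "`v': mean = " r(mean)
}

* forvalues walks over a numeric range
forvalues i = 1/3 {
    display "iteration `i'"
}

* compound double quotes keep text intact when it contains double quotes itself
local label `"the "auto" data"'
display `"`label'"'
```

The last two lines show the compound form of double quotes, which is worth knowing about whenever the contents of a macro may themselves contain double quotes.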

Stata is sometimes a bit of a pain when it comes to using single/double quotes in macro loops, but it's pretty straightforward otherwise. Have fun!

Fr.
  • "I always have problems getting students to use the help pages as much as I do" **This so hard!** In more or less *every* lecture I try to model "Oh! Look at me": (1) looking up the basic syntax, (2) running the provided examples at the end of every single help file, and (3) opening up example data sets to see what the data structure requirements look like for a particular command. (I teach both R and Stata, and your point applies to both.) Finally, I tell them that proficient use does not mean remembering the syntax and options for every command, but knowing how to use the help files. – Alexis Feb 08 '22 at 06:55