Statistics & Analytics Consultants Group Blog

The Statistics & Analytics Consultants group is a network of over 9,000 members. Businesses have the ability to work with consulting firms and individual consultants and eliminate costs. There is also a job board where you can post statistics and analytics jobs. Our members offer a variety of courses to ensure that your company can compete on analytics. Courses range from basic applied understanding of statistical concepts and methods involved in carrying out and interpreting research to advanced modeling and programming.

This blog is a place where featured members are invited to share their expertise and opinions. Opinions are not necessarily the opinions of SACG.

Sunday, December 11, 2011

Alter Type: It’s Not What You Think, by Steven J. Fink, Evans Analytics

As I was reviewing a colleague’s SPSS syntax code the other day, I came across a command called “Alter Type.”  It sounded like a new scary movie, a psychiatric DSM code, or an abnormal personality attribute. 

I looked up this command in the Command Syntax Reference Manual (available through the Help menu), and there it was: a very useful command with many applications.


In brief, it does exactly what the name implies.  It changes the Variable Type (string or numeric) or Format of variables, including the Width of string variables.  As I read the explanation, it appeared to be a new and improved Formats command, a so-called Formats on steroids!
The Formats command is often used to change the width and decimals of numeric variables or the format of a date variable.  The Alter Type command goes further and changes the Variable Type of any variable in one short command. There is no need to write elaborate or unnecessary code, just one easy statement.

As an example, the dataset below comprises 3 variables and 2 lines of data.

DATA LIST FREE
/Numvar (F2)     StringVar (A5)   Datevar (Adate10).

BEGIN DATA
1 1234 10/28/2007
4 5678 10/28/2007
END DATA.

To change a numeric variable to a string (alphanumeric) variable, the command is:

Alter Type Numvar (A2).

To change a string (alphanumeric) variable to a numeric variable, the command is:

Alter Type Stringvar (F6.0).

To change a date variable to a string variable, the command is:

Alter Type Datevar (A10).
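
The same command also accepts several variables at once. As a rough sketch, assuming the three variables defined above are still in their original formats (the target formats here are illustrative choices only):

```spss
* Convert two variables in a single command.
ALTER TYPE Numvar (F5.2) Datevar (A10).

* Shrink every string variable to the narrowest width that still holds all of its values.
ALTER TYPE ALL (A=AMIN).
```

The ALL keyword with the A=AMIN specification can be handy after importing data that arrives with overly wide string columns.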

One note of caution: The Alter Type command converts the variable in place; it does not create a new version of the variable.  So you may want to save your data first or create a copy of the variable before converting it.
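
One way to act on that caution, sketched here with the example data above, is to copy a variable before converting it. The names NumvarOrig and StringVarOrig are hypothetical and chosen only for illustration:

```spss
* Copy the numeric variable, then convert the original to a string.
COMPUTE NumvarOrig = Numvar.
ALTER TYPE Numvar (A2).

* A string copy must be declared with STRING before it is assigned.
STRING StringVarOrig (A5).
COMPUTE StringVarOrig = StringVar.
ALTER TYPE StringVar (F6.0).
```

The copies keep the original type and values, so the converted variables can be checked against them.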

So, the next time you need to perform a calculation or merge data and a variable is not in the right Format or Type, use the Alter Type command.  After all, it’s free, you are not crazy, and it’s cool!

About Steven: Steven works as a Statistics & Analytics Consultant at Evans Analytics. He has developed or analyzed over 300 surveys including customer satisfaction, work environment, needs assessment, program evaluation, and compensation surveys for public and private sector customers. He has provided SPSS instruction to more than 3,000 analysts, covering a wide variety of topics, including questionnaire design/writing, sample design, data collection strategies, multivariate analyses, and presentation of tables/graphs. He can be reached at: steven@evansanalytics.com

Tuesday, December 6, 2011

What is a Scratch Variable in IBM SPSS Statistics Syntax? by Keith McCormick


This might seem an obscure topic, but it is easily grasped and has the potential to make your SPSS Syntax more readable. Readable is good.

A Scratch Variable is a Variable whose name begins with the # Symbol. It is available temporarily for an intermediate step in a Transformation calculation. Once a Procedure occurs, it is no longer available. If the distinction between Transformation and Procedure is new to you, you should put researching that on your to-do list. Start with Appendix B of the Syntax Reference Guide.
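
As a small sketch of that behavior (the data and the variables x, y, and #Double are made up for illustration):

```spss
DATA LIST FREE /x.
BEGIN DATA
1 2 3
END DATA.

* #Double is a scratch variable used only as an intermediate step.
COMPUTE #Double = x * 2.
COMPUTE y = #Double + 1.

* LIST is a procedure, so it triggers the data pass and #Double is then discarded.
LIST.
```

Only x and y remain in the active dataset after LIST runs.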

You can use the following lines to create a tiny data set.

DATA LIST /LocationName 1-50 (A) .
BEGIN DATA
Raleigh, North Carolina
Durham, North Carolina
Cary, North Carolina
END DATA.

Let’s say that you wanted to pull out just State from the three examples in the data set. The first step would be to identify the location of the comma because the last letter of the city name is always one character before the comma. The comma’s location is not a constant value because the city names are of variable length.

This bit of code will do it:

COMPUTE CommaLocation = INDEX(LocationName,',').

This next step would complete the process, but would also create a new variable that you don’t need.

STRING State (A50).
COMPUTE State = SUBSTR(LocationName,CommaLocation+2).

Warning: run the STRING command only once. Running it again for a variable that already exists produces an error.

Do we actually want to create this variable? What are we going to do with it after we complete the calculation? We could use DELETE VARIABLES once we are done, but we have two better options. In this example, DELETE VARIABLES is harmless, but it would be slower on large data sets, and therefore some programmers would consider it inelegant. It is noteworthy that for decades the language got by just fine without the fairly recent addition of DELETE VARIABLES.

(Note that I have not included EXECUTE commands in any of these code examples. Curious why? You really shouldn’t use EXECUTE if there will be any procedures later in the code, and there are always procedures later on in the code. That same Appendix B in the Syntax Reference Guide mentioned earlier is a good place to read more about this.)

We could put a function inside of a function:

STRING State(A50).
COMPUTE State = SUBSTR(LocationName,INDEX(LocationName,',')+2).

We could also use a Scratch variable:

COMPUTE #CommaLocation = INDEX(LocationName,',').
STRING State(A50).
COMPUTE State = SUBSTR(LocationName,#CommaLocation+2).

In an example as straightforward as this, the function inside of a function might be best. As the complexity grows, there will be opportunities to use the Scratch variable option to break up a calculation into two or more steps instead of a single very long, and potentially confusing, line of code.
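
As one illustration of that idea, the sketch below reuses the scratch variable to pull out both City and State from the data set above. The City variable is an addition for illustration only, and the sketch assumes that neither City nor State has been declared yet:

```spss
* Declare both string variables once.
STRING City State (A50).

* Find the comma once and reuse its position in both calculations.
COMPUTE #CommaLocation = INDEX(LocationName,',').
COMPUTE City = SUBSTR(LocationName,1,#CommaLocation-1).
COMPUTE State = SUBSTR(LocationName,#CommaLocation+2).
```

Nesting INDEX inside each SUBSTR call would work too, but the comma search would then be written out twice.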

And who doesn’t want more tools in their Syntax tool chest?

About Keith: Keith McCormick is an independent data mining professional who blogs at:  http://www.keithmccormick.com/