My data looks a bit like this:
sp_counts_1
1|2|4
1
4
5|7
8|9|2|4
I just need the first number and want to delete the rest. I have been running this syntax to extract the number before the vertical slash '|'. Whilst it works, it is crashing my computer due to the large dataset (500,000 cases).
Due to the high number of variables (thousands, but split into files), I prefer to replace the existing variables rather than make new ones.
It is getting stuck on the 'list' and never makes it through.
Any ideas on how I could simplify the syntax so it's happier to run?
Thanks in advance
STRING new1 to new10 (A10).
DO REPEAT old = sp_counts_1 TO sp_counts_10 / new = new1 TO new10.
COMPUTE #L = CHAR.INDEX(old,"|") - 1.
IF #L EQ -1 #L = LENGTH(RTRIM(old)).
COMPUTE new = CHAR.SUBSTR(old,1,#L).
END REPEAT.
LIST.
Delete variables sp_counts_1 TO sp_counts_10.
RENAME VARIABLES (new1 to new10 = sp_counts_1 TO sp_counts_10).
Execute.
On Wed, 29 Jul 2020 21:30:24 -0700 (PDT), Erin Holloway <erin.ps...@gmail.com> wrote:
> My data looks a bit like this:
> sp_counts_1
> 1|2|4
> 1
> 4
> 5|7
> 8|9|2|4
> I just need the first number and want to delete the rest. I have been running this syntax to extract the number before the vertical slash '|'. Whilst it works, it is crashing my computer due to the large dataset (500,000 cases).

Are you saying that yes, it does work when there
are few lines, or when you specify LIST CASES TO 100?
The first legitimate failure I can think of is running out of disk space.
A crash seems abnormal and wrong. 500 000 cases is no longer
too huge to process.
What are the symptoms of your crashes?

> Due to the high number of variables (thousands, but split into files), I prefer to replace the existing variables rather than make new ones.

Thousands of variables to list on one line? That might cause
some upset if you are writing a HUGE format across. The
default used to be to WRAP, which I learned to avoid.

> It is getting stuck on the 'list' and never makes it through.
> Any ideas on how I could simplify the syntax so it's happier to run?
> Thanks in advance
> STRING new1 to new10 (A10).

I wonder why you are showing us the Delete vars and Rename.

> DO REPEAT old = sp_counts_1 TO sp_counts_10 / new = new1 TO new10.
> COMPUTE #L = CHAR.INDEX(old,"|") - 1.
> IF #L EQ -1 #L = LENGTH(RTRIM(old)).
> COMPUTE new = CHAR.SUBSTR(old,1,#L).
> END REPEAT.
> LIST.
> Delete variables sp_counts_1 TO sp_counts_10.
> RENAME VARIABLES (new1 to new10 = sp_counts_1 TO sp_counts_10).
> Execute.

Once upon a time, LIST was not a procedure; it set a switch to LIST
after a procedure caused cases to be read. If your SPSS is that old,
then just running the syntax down through LIST will do nothing --
SPSS will sit there waiting for a procedure or EXE.
If your SPSS is that old, I don't know what happens when variables
are renamed between LIST and EXE, since LIST in that case would not
be "performed" in the order written in syntax.
For a newer SPSS, please describe the symptoms of "crash".
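As a quick check along those lines, LIST can be restricted to a few cases, so that only a small table is written to the output. This is a minimal sketch, assuming the new1 TO new10 names from the syntax above; the cutoff of 20 cases is arbitrary.

```spss
* Sketch only: print the first 20 cases of the new variables.
LIST VARIABLES = new1 TO new10
  /CASES = FROM 1 TO 20.
```

If that comes back promptly, it would suggest that the trouble is the full LIST of 500,000 cases rather than the DO REPEAT block.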
I find the variation of the syntax below easier to read.
* append | to the value so that one is always found.
STRING #temp(A11) , new1 to new10(A10).
DO REPEAT old = sp_counts_1 TO sp_counts_10 / new = new1 TO new10.
COMPUTE #temp= CONCAT( RTRIM(old), '|' ) .
COMPUTE #L = CHAR.INDEX(#temp,"|") - 1.
COMPUTE new = CHAR.SUBSTR(#temp,1,#L).
END REPEAT.
COMMENT + will concatenate in file names, maybe not in COMPUTE.
COMMENT If not, use the concatenation function to combine.
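Since the stated aim is to replace the existing variables, one option is to overwrite each string in place, which would avoid the new variables, the Delete and the Rename. This is an untested sketch, assuming sp_counts_1 TO sp_counts_10 are adjacent string variables in the file, as in the original syntax; the names v and #pos are only placeholders.

```spss
* Sketch only: cut each value off at the first bar, in place.
* Values without a bar give #pos = 0 and are left unchanged.
DO REPEAT v = sp_counts_1 TO sp_counts_10.
COMPUTE #pos = CHAR.INDEX(v, "|").
IF (#pos GT 0) v = CHAR.SUBSTR(v, 1, #pos - 1).
END REPEAT.
EXECUTE.
```

The EXECUTE at the end forces the data pass, so a LIST of every case is not needed just to make the transformations run.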
--
Rich Ulrich