Checking the duplicate obs dropped by proc sort
Posted: 19 February 2012 11:31 AM

/* nodupkey keeps one obs per key; dupout= collects the dropped obs */
proc sort data=work.data nodupkey
          dupout=work.sort_dropped;
  by key;
run;

“dupout” is the magic word !!!

Credits go to Joost!

 Signature 

Zenghui
A humble student of business

Posted: 19 February 2012 11:47 AM   [ # 1 ]

Thanks Zenghui,

Some additions:
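- if you want the unique keys (1 obs per key) written to a separate dataset, add “out = work.sort_unique”; without “out=”, the input dataset itself is overwritten with the deduplicated result
- the first obs for each key is kept, so it may be an idea to sort the dataset on some other properties first (so that you keep the best observation, instead of an arbitrary one)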
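
As a minimal sketch of the two additions combined with the original call (dataset and variable names are the ones used in this thread; nothing here has been run against real data), the input is left untouched, the kept obs go to one dataset and the dropped ones to another:

```sas
/* Sketch: work.data stays unchanged because out= is given. */
proc sort data=work.data
          out=work.sort_unique        /* one obs per key (the first one found) */
          dupout=work.sort_dropped    /* every obs that nodupkey removed */
          nodupkey;
  by key;
run;
```

The number of obs in work.sort_unique plus the number in work.sort_dropped should equal the number in work.data, which is a quick check that nothing was lost.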

best regards,

Joost

 Signature 

To reply/post new questions: Please use the group WRDS/SAS on Google Groups! http://groups.google.com/d/forum/wrdssas

Posted: 19 February 2012 11:59 AM   [ # 2 ]

Joost,

So by default the sort keeps a random obs out of those with the same key? That is good to know.

Great that with this option:
out = work.sort_unique

we can get only the first obs for each key. Is it possible to get the last obs with the same key instead?

Thanks,
Zenghui

Posted: 19 February 2012 12:19 PM   [ # 3 ]

Zenghui,

Maybe ‘random’ is not the right word. It will be the first observation for each ‘key’. (By the way, it also works with multiple variables, like “by gvkey fyear”, so that the first obs for each firm-year is kept).

I do not know how to keep the last obs with ‘nodupkey’ directly, but you could do an initial sort, so that whatever record you wish to keep comes first within its key.

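/* store the original row number */
data work.set2;
  set work.set1;
  counter = _N_;
run;

/* within each key, highest row number (the last obs) comes first */
proc sort data=work.set2;
  by key descending counter;
run;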

For each set of records for some key, the order in work.set2 will be flipped relative to work.set1.
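
Putting the steps together, a sketch of the full "keep the last obs per key" workflow could look like the following. The output name work.last_per_key is made up for illustration, and the final step relies on proc sort preserving the incoming order of obs within a key (the EQUALS behaviour, spelled out explicitly here), so treat it as an untested outline rather than a definitive recipe.

```sas
/* Step 1: record the original row order in a helper variable */
data work.set2;
  set work.set1;
  counter = _N_;
run;

/* Step 2: within each key, move the last original row to the front */
proc sort data=work.set2;
  by key descending counter;
run;

/* Step 3: keep the first row per key, which is now the original last row. */
/* work.last_per_key is a hypothetical name; counter is dropped on output. */
proc sort data=work.set2
          out=work.last_per_key (drop=counter)
          nodupkey equals;
  by key;
run;
```

Replacing "descending counter" in step 2 with any other ordering (for example a date or a quality flag) lets you choose which obs counts as the "best" one to keep.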

best regards,

Joost

 

 

Posted: 19 February 2012 01:29 PM   [ # 4 ]

Joost,

Those are very helpful suggestions.

Thanks again.

Zenghui
