Combining SAS Data Sets: Methods
Concatenating
Definition
Concatenating data sets is the combining of two or more data sets, one after the other, into a single data set. The number of observations in the new data set is the sum of the number of observations in the original data sets. The order of observations is sequential. All observations from the first data set are followed by all observations from the second data set, and so on.
In the simplest case, all input data sets contain the same variables. If the input data sets contain different variables, observations from one data set have missing values for variables defined only in other data sets. In either case, the variables in the new data set are the same as the variables in the old data sets.
Syntax
Use this form of the SET statement to concatenate data sets:
SET data-set(s);
where
- data-set
- specifies any valid SAS data set name.
For a complete description of the SET statement, see SAS Language Reference: Dictionary.
DATA Step Processing During Concatenation
- Compilation phase
- SAS reads the descriptor information of each data set that is named in the SET statement and then creates a program data vector that contains all the variables from all data sets as well as variables created by the DATA step.
- Execution -- Step 1
- SAS reads the first observation from the first data set into the program data vector. It processes the first observation and executes other statements in the DATA step. It then writes the contents of the program data vector to the new data set. The SET statement does not reset the values in the program data vector to missing, except for variables whose value is calculated or assigned during the DATA step.
- Execution -- Step 2
- SAS continues to read one observation at a time from the first data set until it finds an end-of-file indicator. The values of the variables in the program data vector are then set to missing, and SAS begins reading observations from the second data set, and so forth, until it reads all observations from all data sets.
Example 1: Concatenation Using the DATA Step
In this example, each data set contains the variables Common and Number, and the observations are arranged in the order of the values of Common. Generally, you concatenate SAS data sets that have the same variables. In this case, each data set also contains a unique variable to show the effects of combining data sets more clearly. The following shows the ANIMAL and the PLANT input data sets in the library that is referenced by the libref EXAMPLE:
              ANIMAL                                PLANT

   OBS   Common   Animal   Number      OBS   Common   Plant      Number
    1      a      Ant         5         1      g      Grape        69
    2      b      Bird        .         2      h      Hazelnut     55
    3      c      Cat        17         3      i      Indigo        .
    4      d      Dog         9         4      j      Jicama       14
    5      e      Eagle       .         5      k      Kale          5
    6      f      Frog       76         6      l      Lentil       77
The following program uses a SET statement to concatenate the data sets and then prints the results:
libname example 'SAS-data-library';
data example.concatenation;
   set example.animal example.plant;
run;
proc print data=example.concatenation;
   var Common Animal Plant Number;
   title 'Data Set CONCATENATION';
run;
Concatenated Data Sets (DATA Step)
              Data Set CONCATENATION                 1

   Obs   Common   Animal   Plant      Number
     1     a      Ant                    5
     2     b      Bird                   .
     3     c      Cat                   17
     4     d      Dog                    9
     5     e      Eagle                  .
     6     f      Frog                  76
     7     g               Grape        69
     8     h               Hazelnut     55
     9     i               Indigo        .
    10     j               Jicama       14
    11     k               Kale          5
    12     l               Lentil       77
The resulting data set CONCATENATION has 12 observations, which is the sum of the observations from the combined data sets. The program data vector contains all variables from all data sets. The values of variables found in one data set but not in another are set to missing.
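The padding behavior described here can be mimicked outside SAS. The following is a minimal Python sketch, not SAS code: it assumes each data set is modeled as a list of dictionaries, with None standing in for a SAS missing value. The function name concatenate and the shortened sample data are illustrative only.

```python
# Each data set is modeled as a list of dicts (one dict per observation).
# None stands in for a SAS missing value. This is a model of the result,
# not of how SAS reads files.
ANIMAL = [
    {"Common": "a", "Animal": "Ant", "Number": 5},
    {"Common": "b", "Animal": "Bird", "Number": None},
    {"Common": "c", "Animal": "Cat", "Number": 17},
]
PLANT = [
    {"Common": "g", "Plant": "Grape", "Number": 69},
    {"Common": "h", "Plant": "Hazelnut", "Number": 55},
]


def concatenate(*data_sets):
    """Stack data sets in order; pad variables an observation lacks with None."""
    # Collect every variable name in first-seen order, much as the
    # compilation phase builds one program data vector for all inputs.
    variables = []
    for data_set in data_sets:
        for obs in data_set:
            for name in obs:
                if name not in variables:
                    variables.append(name)
    # Copy observations sequentially: all of the first data set, then the next.
    result = []
    for data_set in data_sets:
        for obs in data_set:
            result.append({name: obs.get(name) for name in variables})
    return result


combined = concatenate(ANIMAL, PLANT)
for obs in combined:
    print(obs)
```

The observation count of the result is the sum of the input counts, and every observation carries all four variables, with None wherever the source data set did not define the variable.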
Example 2: Concatenation Using SQL
You can also use the SQL language to concatenate tables. In this example, SQL reads each row in both tables and creates a new table named COMBINED. The following shows the YEAR1 and YEAR2 input tables:
   YEAR1      YEAR2

   Date1      Date2
   1996       1997
   1997       1998
   1998       1999
   1999       2000
              2001
The following SQL code creates and prints the table COMBINED.
proc sql;
   title 'SQL Table COMBINED';
   create table combined as
      select * from year1
      outer union corr
      select * from year2;
   select * from combined;
quit;
Concatenated Tables (SQL)
   SQL Table COMBINED        1

       Year
   --------
       1996
       1997
       1998
       1999
       1997
       1998
       1999
       2000
       2001
Appending Files
Instead of concatenating data sets or tables, you can append them and produce the same results as concatenation. SAS concatenates data sets (DATA step) and tables (SQL) by reading each row of data to create a new file. To avoid reading all the records, you can append the second file to the first file by using the APPEND procedure:
proc append base=year1 data=year2;
run;
The YEAR1 file will contain all rows from both tables.
Note: You cannot use PROC APPEND to add observations to a SAS data set in a sequential library.
Efficiency
If no additional processing is necessary, using PROC APPEND or the APPEND statement in PROC DATASETS is more efficient than using a DATA step to concatenate data sets.
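The difference between the two approaches can be sketched by analogy in Python. This is only an illustration under stated assumptions: the lists year1 and year2, the shared variable name Year, and the functions concatenate and append are hypothetical stand-ins, and the sketch says nothing about how SAS stores files.

```python
# Hypothetical tables that share one variable, modeled as lists of dicts.
year1 = [{"Year": 1996}, {"Year": 1997}]
year2 = [{"Year": 1998}, {"Year": 1999}]


def concatenate(first, second):
    """Build a new table: every observation of both inputs is read and copied."""
    return [dict(obs) for obs in first] + [dict(obs) for obs in second]


def append(base, data):
    """Add the observations of data to base in place.

    Only the observations of data are read; the rows already in base
    are left where they are.
    """
    base.extend(dict(obs) for obs in data)
    return base


new_table = concatenate(year1, year2)   # year1 is unchanged at this point
same_table = append(year1, year2)       # year1 itself now holds all rows
print(new_table == year1)               # both approaches give the same rows
```

Both calls end with the same rows, but the append analogue touches only the second table, which is the intuition behind the efficiency advice above.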
Interleaving
Definition
Interleaving uses a SET statement and a BY statement to combine multiple data sets into one new data set. The number of observations in the new data set is the sum of the number of observations from the original data sets. However, the observations in the new data set are arranged by the values of the BY variable or variables and, within each BY group, by the order of the data sets in which they occur. You can interleave data sets either by using a BY variable or by using an index.
Syntax
Use this form of the SET statement to interleave data sets when you use a BY variable:
SET data-set(s);
BY variable(s);
where
- data-set
- specifies a one-level name, a two-level name, or one of the special SAS data set names.
- variable
- specifies each variable by which the data set is sorted. These variables are referred to as BY variables for the current DATA or PROC step.
Use this form of the SET statement to interleave data sets when you use an index:
SET data-set-1 . . . data-set-n KEY= index;
where
- data-set
- specifies a one-level name, a two-level name, or one of the special SAS data set names.
- index
- provides nonsequential access to observations in a SAS data set, which are based on the value of an index variable or key.
For a complete description of the SET statement, including SET with the KEY= option, see SAS Language Reference: Dictionary.
Sort Requirements
Before you can interleave data sets, the observations must be sorted or grouped by the same variable or variables that you use in the BY statement, or you must have an appropriate index for the data sets.
DATA Step Processing During Interleaving
- Compilation phase
- SAS reads the descriptor information of each data set that is named in the SET statement and then creates a program data vector that contains all the variables from all data sets as well as variables created by the DATA step.
- SAS creates the FIRST.variable and LAST.variable for each variable listed in the BY statement.
- Execution -- Step 1
- SAS compares the first observation from each data set that is named in the SET statement to determine which BY group should appear first in the new data set. It reads all observations from the first BY group from the selected data set. If this BY group appears in more than one data set, it reads from the data sets in the order in which they appear in the SET statement. The values of the variables in the program data vector are set to missing each time SAS starts to read a new data set and when the BY group changes.
- Execution -- Step 2
- SAS compares the next observations from each data set to determine the next BY group and then starts reading observations from the selected data set in the SET statement that contains observations for this BY group. SAS continues until it has read all observations from all data sets.
Example 1: Interleaving in the Simplest Case
In this example, each data set contains the BY variable Common, and the observations are arranged in order of the values of the BY variable. The following shows the ANIMAL and the PLANT input data sets in the library that is referenced by the libref EXAMPLE:
          ANIMAL                       PLANT

   OBS   Common   Animal      OBS   Common   Plant
    1      a      Ant          1      a      Apple
    2      b      Bird         2      b      Banana
    3      c      Cat          3      c      Coconut
    4      d      Dog          4      d      Dewberry
    5      e      Eagle        5      e      Eggplant
    6      f      Frog         6      f      Fig
The following program uses SET and BY statements to interleave the data sets, and prints the results:
data example.interleaving;
   set example.animal example.plant;
   by Common;
run;
proc print data=example.interleaving;
   title 'Data Set INTERLEAVING';
run;
Interleaved Data Sets
          Data Set INTERLEAVING              1

   Obs   Common   Animal   Plant
     1     a      Ant
     2     a               Apple
     3     b      Bird
     4     b               Banana
     5     c      Cat
     6     c               Coconut
     7     d      Dog
     8     d               Dewberry
     9     e      Eagle
    10     e               Eggplant
    11     f      Frog
    12     f               Fig
The resulting data set INTERLEAVING has 12 observations, which is the sum of the observations from the combined data sets. The new data set contains all variables from both data sets. The values of variables found in one data set but not in the other are set to missing, and the observations are arranged by the values of the BY variable.
Example 2: Interleaving with Duplicate Values of the BY Variable
If the data sets contain duplicate values of the BY variables, the observations are written to the new data set in the order in which they occur in the original data sets. This example contains duplicate values of the BY variable Common. The following shows the ANIMAL1 and PLANT1 input data sets:
          ANIMAL1                      PLANT1

   OBS   Common   Animal1     OBS   Common   Plant1
    1      a      Ant          1      a      Apple
    2      a      Ape          2      b      Banana
    3      b      Bird         3      c      Coconut
    4      c      Cat          4      c      Celery
    5      d      Dog          5      d      Dewberry
    6      e      Eagle        6      e      Eggplant
The following program uses SET and BY statements to interleave the data sets, and prints the results:
data example.interleaving2;
   set example.animal1 example.plant1;
   by Common;
run;
proc print data=example.interleaving2;
   title 'Data Set INTERLEAVING2: Duplicate BY Values';
run;
Interleaved Data Sets with Duplicate Values of the BY Variable
   Data Set INTERLEAVING2: Duplicate BY Values      1

   Obs   Common   Animal1   Plant1
     1     a      Ant
     2     a      Ape
     3     a                Apple
     4     b      Bird
     5     b                Banana
     6     c      Cat
     7     c                Coconut
     8     c                Celery
     9     d      Dog
    10     d                Dewberry
    11     e      Eagle
    12     e                Eggplant
The number of observations in the new data set is the sum of the observations in all the data sets. The observations are written to the new data set in the order in which they occur in the original data sets.
Example 3: Interleaving with Different BY Values in Each Data Set
The data sets ANIMAL2 and PLANT2 both contain BY values that are present in one data set but not in the other. The following shows the ANIMAL2 and the PLANT2 input data sets:
          ANIMAL2                      PLANT2

   OBS   Common   Animal2     OBS   Common   Plant2
    1      a      Ant          1      a      Apple
    2      c      Cat          2      b      Banana
    3      d      Dog          3      c      Coconut
    4      e      Eagle        4      e      Eggplant
                               5      f      Fig
This program uses SET and BY statements to interleave these data sets, and prints the results:
data example.interleaving3;
   set example.animal2 example.plant2;
   by Common;
run;
proc print data=example.interleaving3;
   title 'Data Set INTERLEAVING3: Different BY Values';
run;
Interleaving Data Sets with Different BY Values
   Data Set INTERLEAVING3: Different BY Values      1

   Obs   Common   Animal2   Plant2
     1     a      Ant
     2     a                Apple
     3     b                Banana
     4     c      Cat
     5     c                Coconut
     6     d      Dog
     7     e      Eagle
     8     e                Eggplant
     9     f                Fig
The resulting data set has nine observations arranged by the values of the BY variable.
Comments and Comparisons
- In other languages, the term merge is often used to mean interleave. SAS reserves the term merge for the operation in which observations from two or more data sets are combined into one observation. The observations in interleaved data sets are not combined; they are copied from the original data sets in the order of the values of the BY variable.
- If one table has multiple rows with the same BY value, the DATA step preserves the order of those rows in the result.
- To use the DATA step, the input tables must be appropriately sorted or indexed. SQL does not require the input tables to be in order.
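The ordering rules for interleaving (BY value first, then data set order, then original row order) can be reproduced with a stable sort. The sketch below is a Python model of the result, not of the SAS algorithm: SAS requires presorted or indexed input and does not sort, whereas this sketch leans on the documented stability of Python's sorted. The function name interleave is illustrative.

```python
from itertools import chain

# Data sets modeled as lists of dicts; None stands in for a missing value.
ANIMAL1 = [
    {"Common": "a", "Animal1": "Ant"},
    {"Common": "a", "Animal1": "Ape"},
    {"Common": "b", "Animal1": "Bird"},
    {"Common": "c", "Animal1": "Cat"},
    {"Common": "d", "Animal1": "Dog"},
    {"Common": "e", "Animal1": "Eagle"},
]
PLANT1 = [
    {"Common": "a", "Plant1": "Apple"},
    {"Common": "b", "Plant1": "Banana"},
    {"Common": "c", "Plant1": "Coconut"},
    {"Common": "c", "Plant1": "Celery"},
    {"Common": "d", "Plant1": "Dewberry"},
    {"Common": "e", "Plant1": "Eggplant"},
]


def interleave(by, *data_sets):
    """Return all observations ordered by the BY variable.

    Ties keep data set order and then original row order, because
    sorted() is stable and chain() lists the data sets in order.
    """
    variables = []
    for obs in chain(*data_sets):
        for name in obs:
            if name not in variables:
                variables.append(name)
    ordered = sorted(chain(*data_sets), key=lambda obs: obs[by])
    # Observations are copied, never combined; absent variables become None.
    return [{name: obs.get(name) for name in variables} for obs in ordered]


result = interleave("Common", ANIMAL1, PLANT1)
labels = [obs["Animal1"] or obs["Plant1"] for obs in result]
print(labels)
```

With the ANIMAL1 and PLANT1 data, the labels come out in the same order as the INTERLEAVING2 listing: the two ANIMAL1 rows for a precede the PLANT1 row for a, and Coconut precedes Celery.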
One-to-One Reading
Definition
One-to-one reading combines observations from two or more data sets into one observation by using two or more SET statements to read observations independently from each data set. This process is also called one-to-one matching. The new data set contains all the variables from all the input data sets. The number of observations in the new data set is the number of observations in the smallest original data set. If the data sets contain common variables, the values that are read in from the last data set replace the values that were read in from earlier data sets.
Syntax
Use this form of the SET statement for one-to-one reading:
SET data-set-1;
SET data-set-2;
where
- data-set-1
- specifies a one-level name, a two-level name, or one of the special SAS data set names. data-set-1 is the first file that the DATA step reads.
- data-set-2
- specifies a one-level name, a two-level name, or one of the special SAS data set names. data-set-2 is the second file that the DATA step reads.
- Caution:
- Use care when you combine data sets with multiple SET statements. Using multiple SET statements to combine observations can produce undesirable results. Test your program on representative samples of the data sets before using this method to combine them.
For a complete description of the SET statement, see SAS Language Reference: Dictionary.
DATA Step Processing During a One-to-One Reading
- Compilation phase
- SAS reads the descriptor information of each data set named in the SET statement and then creates a program data vector that contains all the variables from all data sets as well as variables created by the DATA step.
- Execution -- Step 1
- When SAS executes the first SET statement, SAS reads the first observation from the first data set into the program data vector. The second SET statement reads the first observation from the second data set into the program data vector. If both data sets contain the same variables, the values from the second data set replace the values from the first data set, even if the value is missing. After reading the first observation from the last data set and executing any other statements in the DATA step, SAS writes the contents of the program data vector to the new data set. The SET statement does not reset the values in the program data vector to missing, except for those variables that were created or assigned values during the DATA step.
- Execution -- Step 2
- SAS continues reading from one data set and then the other until it detects an end-of-file indicator in one of the data sets. SAS stops processing with the last observation of the shortest data set and does not read the remaining observations from the longer data set.
Example 1: One-to-One Reading: Processing an Equal Number of Observations
The SAS data sets ANIMAL and PLANT both contain the variable Common, and are arranged by the values of that variable. The following shows the ANIMAL and the PLANT input data sets:
          ANIMAL                       PLANT

   OBS   Common   Animal      OBS   Common   Plant
    1      a      Ant          1      a      Apple
    2      b      Bird         2      b      Banana
    3      c      Cat          3      c      Coconut
    4      d      Dog          4      d      Dewberry
    5      e      Eagle        5      e      Eggplant
    6      f      Frog         6      g      Fig
The following program uses two SET statements to combine observations from ANIMAL and PLANT, and prints the results:
data twosets;
   set animal;
   set plant;
run;
proc print data=twosets;
   title 'Data Set TWOSETS - Equal Number of Observations';
run;
Data Set Created from Two Data Sets That Have Equal Observations
   Data Set TWOSETS - Equal Number of Observations      1

   Obs   Common   Animal   Plant
     1     a      Ant      Apple
     2     b      Bird     Banana
     3     c      Cat      Coconut
     4     d      Dog      Dewberry
     5     e      Eagle    Eggplant
     6     g      Frog     Fig
Each observation in the new data set contains all the variables from all the data sets. Note, however, that the Common variable value in observation 6 contains a "g." The value of Common in observation 6 of the ANIMAL data set was overwritten by the value in PLANT, which was the data set that SAS read last.
Comments and Comparisons
- The results that are obtained by reading observations using two or more SET statements are similar to those that are obtained by using the MERGE statement with no BY statement. However, with one-to-one reading, SAS stops processing before all observations are read from all data sets if the number of observations in the data sets is not equal.
- Using multiple SET statements with other DATA step statements makes the following applications possible:
- merging one observation with many
- conditionally merging observations
- reading from the same data set twice.
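Two properties of one-to-one reading stand out: processing stops at the end of the shortest data set, and values from later data sets overwrite common variables. A rough Python analogue, offered as a sketch rather than a description of SAS internals, is zip combined with dict.update. The function name one_to_one_read is illustrative.

```python
# Data sets modeled as lists of dicts, one dict per observation.
ANIMAL = [
    {"Common": "a", "Animal": "Ant"},
    {"Common": "b", "Animal": "Bird"},
    {"Common": "c", "Animal": "Cat"},
    {"Common": "d", "Animal": "Dog"},
    {"Common": "e", "Animal": "Eagle"},
    {"Common": "f", "Animal": "Frog"},
]
PLANT = [
    {"Common": "a", "Plant": "Apple"},
    {"Common": "b", "Plant": "Banana"},
    {"Common": "c", "Plant": "Coconut"},
    {"Common": "d", "Plant": "Dewberry"},
    {"Common": "e", "Plant": "Eggplant"},
    {"Common": "g", "Plant": "Fig"},
]


def one_to_one_read(*data_sets):
    """Combine the i-th observations of every data set into one observation."""
    result = []
    # zip() stops as soon as the shortest input is exhausted, like
    # reaching end-of-file on one of the SET statements.
    for row in zip(*data_sets):
        combined = {}
        for obs in row:
            # Later data sets overwrite common variables, even with None.
            combined.update(obs)
        result.append(combined)
    return result


twosets = one_to_one_read(ANIMAL, PLANT)
shorter = one_to_one_read(ANIMAL, PLANT[:3])
print(twosets[5])
print(len(shorter))
```

With the full inputs, the sixth combined observation carries Common g from PLANT, matching the TWOSETS listing. Truncating PLANT to three observations yields only three combined observations.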
One-to-One Merging
Definition
One-to-one merging combines observations from two or more SAS data sets into a single observation in a new data set. To perform a one-to-one merge, use the MERGE statement without a BY statement. SAS combines the first observation from all data sets in the MERGE statement into the first observation in the new data set, the second observation from all data sets into the second observation in the new data set, and so on. In a one-to-one merge, the number of observations in the new data set equals the number of observations in the largest data set that was named in the MERGE statement.
If you use the MERGENOBY= SAS system option, you can control whether SAS issues a message when MERGE processing occurs without an associated BY statement.
Syntax
Use this form of the MERGE statement to merge SAS data sets:
MERGE data-set(s);
where
- data-set
- names at least two existing SAS data sets.
- Caution:
- Avoid using duplicate values or different values of common variables. One-to-one merging with data sets that contain duplicate values of common variables can produce undesirable results. If a variable exists in more than one data set, the value from the last data set that is read is the one that is written to the new data set. The variables are combined exactly as they are read from each data set. Using a one-to-one merge to combine data sets with different values of common variables can also produce undesirable results. If a variable exists in more than one data set, the value from the last data set read is the one that is written to the new data set even if the value is missing. Once SAS has processed all observations in a data set, all subsequent observations in the new data set have missing values for the variables that are unique to that data set.
For a complete description of the MERGE statement, see SAS Language Reference: Dictionary.
DATA Step Processing During One-to-One Merging
- Compilation phase
- SAS reads the descriptor information of each data set that is named in the MERGE statement and then creates a program data vector that contains all the variables from all data sets as well as variables created by the DATA step.
- Execution -- Step 1
- SAS reads the first observation from each data set into the program data vector, reading the data sets in the order in which they appear in the MERGE statement. If two data sets contain the same variables, the values from the second data set replace the values from the first data set. After reading the first observation from the last data set and executing any other statements in the DATA step, SAS writes the contents of the program data vector to the new data set. Only those variables that are created or assigned values during the DATA step are set to missing.
- Execution -- Step 2
- SAS continues until it has read all observations from all data sets.
Example 1: One-to-One Merging with an Equal Number of Observations
The SAS data sets ANIMAL and PLANT both contain the variable Common, and the observations are arranged by the values of Common. The following shows the ANIMAL and the PLANT input data sets:
          ANIMAL                       PLANT

   OBS   Common   Animal      OBS   Common   Plant
    1      a      Ant          1      a      Apple
    2      b      Bird         2      b      Banana
    3      c      Cat          3      c      Coconut
    4      d      Dog          4      d      Dewberry
    5      e      Eagle        5      e      Eggplant
    6      f      Frog         6      g      Fig
The following program merges these data sets and prints the results:
data combined;
   merge animal plant;
run;
proc print data=combined;
   title 'Data Set COMBINED';
run;
Merged Data Sets That Have an Equal Number of Observations
          Data Set COMBINED                 1

   Obs   Common   Animal   Plant
     1     a      Ant      Apple
     2     b      Bird     Banana
     3     c      Cat      Coconut
     4     d      Dog      Dewberry
     5     e      Eagle    Eggplant
     6     g      Frog     Fig
Each observation in the new data set contains all variables from all data sets. If two data sets contain the same variables, the values from the second data set replace the values from the first data set, as shown in observation 6.
Example 2: One-to-One Merging with an Unequal Number of Observations
The SAS data sets ANIMAL1 and PLANT1 both contain the variable Common, and the observations are arranged by the values of Common. The PLANT1 data set has fewer observations than the ANIMAL1 data set. The following shows the ANIMAL1 and the PLANT1 input data sets:
          ANIMAL1                      PLANT1

   OBS   Common   Animal      OBS   Common   Plant
    1      a      Ant          1      a      Apple
    2      b      Bird         2      b      Banana
    3      c      Cat          3      c      Coconut
    4      d      Dog
    5      e      Eagle
    6      f      Frog
The following program merges these unequal data sets and prints the results:
data combined1;
   merge animal1 plant1;
run;
proc print data=combined1;
   title 'Data Set COMBINED1';
run;
Merged Data Sets That Have an Unequal Number of Observations
          Data Set COMBINED1                1

   Obs   Common   Animal   Plant
     1     a      Ant      Apple
     2     b      Bird     Banana
     3     c      Cat      Coconut
     4     d      Dog
     5     e      Eagle
     6     f      Frog
Note that observations 4 through 6 contain missing values for the variable Plant.
Example 3: One-to-One Merging with Duplicate Values of Common Variables
The following example shows the undesirable results that you can obtain by using one-to-one merging with data sets that contain duplicate values of common variables. The value from the last data set that is read is the one that is written to the new data set. The variables are combined exactly as they are read from each data set. In the following example, the data sets ANIMAL1 and PLANT1 contain the variable Common, and each data set contains observations with duplicate values of Common. The following shows the ANIMAL1 and the PLANT1 input data sets:
          ANIMAL1                      PLANT1

   OBS   Common   Animal1     OBS   Common   Plant1
    1      a      Ant          1      a      Apple
    2      a      Ape          2      b      Banana
    3      b      Bird         3      c      Coconut
    4      c      Cat          4      c      Celery
    5      d      Dog          5      d      Dewberry
    6      e      Eagle        6      e      Eggplant
The following program produces the data set MERGE1 and prints the results:
/* This program illustrates undesirable results. */
data merge1;
   merge animal1 plant1;
run;
proc print data=merge1;
   title 'Data Set MERGE1';
run;
Undesirable Results with Duplicate Values of Common Variables
          Data Set MERGE1                   1

   Obs   Common   Animal1   Plant1
     1     a      Ant       Apple
     2     b      Ape       Banana
     3     c      Bird      Coconut
     4     c      Cat       Celery
     5     d      Dog       Dewberry
     6     e      Eagle     Eggplant
The number of observations in the new data set is six. Note that observations 2 and 3 contain undesirable values. SAS reads the second observation from data set ANIMAL1. It then reads the second observation from data set PLANT1 and replaces the values for the variables Common and Plant1. The third observation is created in the same way.
Example 4: One-to-One Merging with Different Values of Common Variables
The following example shows the undesirable results obtained from using the one-to-one merge to combine data sets with different values of common variables. If a variable exists in more than one data set, the value from the last data set that is read is the one that is written to the new data set even if the value is missing. Once SAS processes all observations in a data set, all subsequent observations in the new data set have missing values for the variables that are unique to that data set. In this example, the data sets ANIMAL2 and PLANT2 have different values of the Common variable. The following shows the ANIMAL2 and the PLANT2 input data sets:
          ANIMAL2                      PLANT2

   OBS   Common   Animal2     OBS   Common   Plant2
    1      a      Ant          1      a      Apple
    2      c      Cat          2      b      Banana
    3      d      Dog          3      c      Coconut
    4      e      Eagle        4      e      Eggplant
                               5      f      Fig
The following program produces the data set MERGE2 and prints the results:
/* This program illustrates undesirable results. */
data merge2;
   merge animal2 plant2;
run;
proc print data=merge2;
   title 'Data Set MERGE2';
run;
Undesirable Results with Different Values of Common Variables
          Data Set MERGE2                   1

   Obs   Common   Animal2   Plant2
     1     a      Ant       Apple
     2     b      Cat       Banana
     3     c      Dog       Coconut
     4     e      Eagle     Eggplant
     5     f                Fig
Comments and Comparisons
The results from a one-to-one merge are similar to the results obtained from using two or more SET statements to combine observations. However, with the one-to-one merge, SAS continues processing all observations in all data sets that were named in the MERGE statement.
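The contrast with one-to-one reading can be sketched in Python by swapping zip for itertools.zip_longest, which keeps going until the longest input is exhausted. This is a model of the documented result under simplifying assumptions (lists of dicts, None for missing), not SAS code, and the function name one_to_one_merge is illustrative.

```python
from itertools import zip_longest

# Data sets of unequal length, modeled as lists of dicts.
ANIMAL1 = [
    {"Common": "a", "Animal": "Ant"},
    {"Common": "b", "Animal": "Bird"},
    {"Common": "c", "Animal": "Cat"},
    {"Common": "d", "Animal": "Dog"},
    {"Common": "e", "Animal": "Eagle"},
    {"Common": "f", "Animal": "Frog"},
]
PLANT1 = [
    {"Common": "a", "Plant": "Apple"},
    {"Common": "b", "Plant": "Banana"},
    {"Common": "c", "Plant": "Coconut"},
]


def one_to_one_merge(*data_sets):
    """Pair observations by position, continuing to the longest data set."""
    variables = []
    for data_set in data_sets:
        for obs in data_set:
            for name in obs:
                if name not in variables:
                    variables.append(name)
    result = []
    # zip_longest() yields None for a data set that has run out of rows.
    for row in zip_longest(*data_sets):
        combined = dict.fromkeys(variables)  # every variable starts as None
        for obs in row:
            if obs is not None:
                # Later data sets overwrite common variables.
                combined.update(obs)
        result.append(combined)
    return result


combined1 = one_to_one_merge(ANIMAL1, PLANT1)
for obs in combined1:
    print(obs)
```

The result has six observations, the size of the larger input, and observations 4 through 6 have None for Plant, in line with the COMBINED1 listing.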
Match-Merging
Definition
Match-merging combines observations from two or more SAS data sets into a single observation in a new data set according to the values of a common variable. The number of observations in the new data set is the sum of the largest number of observations in each BY group in all data sets. To perform a match-merge, use the MERGE statement with a BY statement. Before you can perform a match-merge, all data sets must be sorted by the variables that you specify in the BY statement or they must have an index.
Syntax
Use this form of the MERGE statement to match-merge data sets:
MERGE data-set(s);
BY variable(s);
where
- data-set
- names at least two existing SAS data sets from which observations are read.
- variable
- names each variable by which the data set is sorted or indexed. These variables are referred to as BY variables.
For a complete description of the MERGE and the BY statements, see SAS Language Reference: Dictionary.
DATA Step Processing During Match-Merging
- Compilation phase
- SAS reads the descriptor information of each data set that is named in the MERGE statement and then creates a program data vector that contains all the variables from all data sets as well as variables created by the DATA step. SAS creates the FIRST.variable and LAST.variable for each variable that is listed in the BY statement.
- Execution -- Step 1
- SAS looks at the first BY group in each data set that is named in the MERGE statement to determine which BY group should appear first in the new data set. The DATA step reads into the program data vector the first observation in that BY group from each data set, reading the data sets in the order in which they appear in the MERGE statement. If a data set does not have observations in that BY group, the program data vector contains missing values for the variables unique to that data set.
- Execution -- Step 2
- After processing the first observation from the last data set and executing other statements, SAS writes the contents of the program data vector to the new data set. SAS retains the values of all variables in the program data vector except those variables that were created by the DATA step; SAS sets those values to missing. SAS continues to merge observations until it writes all observations from the first BY group to the new data set. When SAS has read all observations in a BY group from all data sets, it sets all variables in the program data vector to missing. SAS looks at the next BY group in each data set to determine which BY group should appear next in the new data set.
- Execution -- Step 3
- SAS repeats these steps until it reads all observations from all BY groups in all data sets.
Example 1: Combining Observations Based on a Criterion
The SAS data sets ANIMAL and PLANT each contain the BY variable Common, and the observations are arranged in order of the values of the BY variable. The following shows the ANIMAL and the PLANT input data sets:
          ANIMAL                       PLANT

   OBS   Common   Animal      OBS   Common   Plant
    1      a      Ant          1      a      Apple
    2      b      Bird         2      b      Banana
    3      c      Cat          3      c      Coconut
    4      d      Dog          4      d      Dewberry
    5      e      Eagle        5      e      Eggplant
    6      f      Frog         6      f      Fig
The following program merges the data sets according to the values of the BY variable Common, and prints the results:
data combined;
   merge animal plant;
   by Common;
run;
proc print data=combined;
   title 'Data Set COMBINED';
run;
Data Sets Combined by Match-Merging
          Data Set COMBINED                 1

   Obs   Common   Animal   Plant
     1     a      Ant      Apple
     2     b      Bird     Banana
     3     c      Cat      Coconut
     4     d      Dog      Dewberry
     5     e      Eagle    Eggplant
     6     f      Frog     Fig
Each observation in the new data set contains all the variables from all the data sets.
Example 2: Match-Merge with Duplicate Values of the BY Variable
When SAS reads the last observation from a BY group in one data set, SAS retains its values in the program data vector for all variables that are unique to that data set until all observations for that BY group have been read from all data sets. In the following example, the data sets ANIMAL1 and PLANT1 contain duplicate values of the BY variable Common. The following shows the ANIMAL1 and the PLANT1 input data sets:
          ANIMAL1                      PLANT1

   OBS   Common   Animal1     OBS   Common   Plant1
    1      a      Ant          1      a      Apple
    2      a      Ape          2      b      Banana
    3      b      Bird         3      c      Coconut
    4      c      Cat          4      c      Celery
    5      d      Dog          5      d      Dewberry
    6      e      Eagle        6      e      Eggplant
The following program produces the merged data set MATCH1, and prints the results:
data match1;
   merge animal1 plant1;
   by Common;
run;
proc print data=match1;
   title 'Data Set MATCH1';
run;
Match-Merged Data Set with Duplicate BY Values
          Data Set MATCH1                   1

   Obs   Common   Animal1   Plant1
     1     a      Ant       Apple
     2     a      Ape       Apple
     3     b      Bird      Banana
     4     c      Cat       Coconut
     5     c      Cat       Celery
     6     d      Dog       Dewberry
     7     e      Eagle     Eggplant
In observation 2 of the output, the value of the variable Plant1 is retained until all observations in the BY group are written to the new data set. Match-merging also produced duplicate values in Animal1 for observations 4 and 5.
Example 3: Match-Merge with Nonmatched Observations
When SAS performs a match-merge with nonmatched observations in the input data sets, SAS retains the values of all variables in the program data vector even if the value is missing. The data sets ANIMAL2 and PLANT2 do not contain all values of the BY variable Common. The following shows the ANIMAL2 and the PLANT2 input data sets:
          ANIMAL2                      PLANT2

   OBS   Common   Animal2     OBS   Common   Plant2
    1      a      Ant          1      a      Apple
    2      c      Cat          2      b      Banana
    3      d      Dog          3      c      Coconut
    4      e      Eagle        4      e      Eggplant
                               5      f      Fig
The following program produces the merged data set MATCH2, and prints the results:
data match2;
   merge animal2 plant2;
   by Common;
run;
proc print data=match2;
   title 'Data Set MATCH2';
run;
Match-Merged Data Set with Nonmatched Observations
          Data Set MATCH2                   1

   Obs   Common   Animal2   Plant2
     1     a      Ant       Apple
     2     b                Banana
     3     c      Cat       Coconut
     4     d      Dog
     5     e      Eagle     Eggplant
     6     f                Fig
As the output shows, all values of the variable Common are represented in the new data set, including missing values for the variables that are in one data set but not in the other.
Updating with the UPDATE and the MODIFY Statements
Definitions
Updating a data set refers to the process of applying changes to a master data set. To update data sets, you work with two input data sets. The data set containing the original information is the master data set, and the data set containing the new information is the transaction data set.
You can update data sets by using the UPDATE statement or the MODIFY statement:
- UPDATE
- uses observations from the transaction data set to change the values of corresponding observations from the master data set. You must use a BY statement with the UPDATE statement because all observations in the transaction data set are keyed to observations in the master data set according to the values of the BY variable.
- MODIFY
- can replace, delete, and append observations in an existing data set. Using the MODIFY statement can save disk space because it modifies data in place, without creating a copy of the data set.
The number of observations in the new data set is the sum of the number of observations in the master data set and the number of unmatched observations in the transaction data set.
For complete information about the UPDATE and the MODIFY statements, see "Statements" in SAS Language Reference: Dictionary.
Syntax of the UPDATE Statement
Use this form of the UPDATE statement to update a master data set:
UPDATE master-data-set transaction-data-set;
BY variable-list;
where
- master-data-set
- names the SAS data set that is used as the master file.
- transaction-data-set
- names the SAS data set that contains the changes to be applied to the master data set.
- variable-list
- specifies the variables by which observations are matched.
If the transaction data set contains duplicate values of the BY variable, SAS applies both transactions to the observation. The last values that are copied into the program data vector are written to the new data set. If your data is in this form, use the MODIFY statement instead of the UPDATE statement to process your data.
- CAUTION:
- Values of the BY variable must be unique for each observation in the master data set. If the master data set contains two observations with the same value of the BY variable, the first observation is updated and the second observation is ignored. SAS writes a warning message to the log when the DATA step executes.
For complete information about the UPDATE statement, see SAS Language Reference: Dictionary.
Syntax of the MODIFY Statement
This form of the MODIFY statement is used in the examples that follow:
MODIFY master-data-set;
BY variable-list;
where
- master-data-set
- specifies the SAS data set that you want to modify.
- variable-list
- names each variable by which the data set is ordered.
Note: The MODIFY statement does not support changing the descriptor portion of a SAS data set, such as adding a variable.
For complete information about the MODIFY statement, see SAS Language Reference: Dictionary.
DATA Step Processing with the UPDATE Statement
- Compilation phase
  - SAS reads the descriptor information of each data set that is named in the UPDATE statement and creates a program data vector that contains all the variables from all data sets as well as variables created by the DATA step.
  - SAS creates the FIRST.variable and LAST.variable for each variable that is listed in the BY statement.
- Execution - Step 1
- SAS looks at the first observation in each data set that is named in the UPDATE statement to determine which BY group should appear first. If the transaction BY value precedes the master BY value, SAS reads from the transaction data set only and sets the variables from the master data set to missing. If the master BY value precedes the transaction BY value, SAS reads from the master data set only and sets the unique variables from the transaction data set to missing. If the BY values in the master and transaction data sets are equal, it applies the first transaction by copying the nonmissing values into the program data vector.
- Execution - Step 2
- After completing the first transaction, SAS looks at the next observation in the transaction data set. If SAS finds one with the same BY value, it applies that transaction too. The first observation then contains the new values from both transactions. If no other transactions exist for that observation, SAS writes the observation to the new data set and sets the values in the program data vector to missing. SAS repeats these steps until it has read all observations from all BY groups in both data sets.
Updating with Nonmatched Observations, Missing Values, and New Variables
In the UPDATE statement, if an observation in the master data set does not have a corresponding observation in the transaction data set, SAS writes the observation to the new data set without modifying it. Any observation from the transaction data set that does not correspond to an observation in the master data set is written to the program data vector and becomes the basis for an observation in the new data set. The data in the program data vector can be modified by other transactions before it is written to the new data set. If a master data set observation does not need updating, the corresponding observation can be omitted from the transaction data set.
SAS does not replace existing values in the master data set with missing values if those values are coded as periods (for numeric variables) or blanks (for character variables) in the transaction data set. To replace existing values with missing values, you must either create a transaction data set in which missing values are coded with the special missing value characters, or use the UPDATEMODE=NOMISSINGCHECK statement option.
With UPDATE, the transaction data set can contain new variables to be added to all observations in the master data set.
To view a sample program, see Example 3: Using UPDATE for Processing Nonmatched Observations, Missing Values, and New Variables.
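The special missing value approach can be sketched for a numeric variable as follows. This is a minimal illustration, not one of the documented examples: the data sets PRICE_MASTER, PRICE_TRANS, and PRICE_NEW and the variable Price are hypothetical names. It assumes the MISSING statement is used so that an underscore in the raw data is read as a special missing value.

```sas
/* Hypothetical master data set (names are illustrative only). */
data price_master;
   input Common $ Price;
   datalines;
a 10
b 20
c 30
;

/* Hypothetical transaction data set. The MISSING statement lets INPUT
   read an underscore as a special missing value for numeric variables. */
data price_trans;
   missing _;
   input Common $ Price;
   datalines;
a 15
b _
c .
;

/* Both inputs are already in order by Common. */
data price_new;
   update price_master price_trans;
   by Common;
run;

proc print data=price_new;
   title 'Data Set PRICE_NEW';
run;
```

Under the behavior described above, the ordinary missing value for c should leave the master value in place, while the underscore for b should replace the master value with a missing value.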
Sort Requirements for the UPDATE Statement
If you do not use an index, both the master data set and the transaction data set must be sorted by the same variable or variables that you specify in the BY statement that accompanies the UPDATE statement. The values of the BY variable should be unique for each observation in the master data set. If you use more than one BY variable, the combination of values of all BY variables should be unique for each observation in the master data set. The BY variable or variables should be ones that you never need to update.
Note: The MODIFY statement does not require sorted files. However, sorting the data improves efficiency.
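The sort requirement for UPDATE can be sketched as follows. This assumes the MASTER and NEWPLANT data sets of Example 1 and assumes they are not already in order by Common. PROC SORT without an OUT= option replaces each data set with its sorted version.

```sas
/* Sort both inputs by the BY variable before the UPDATE step. */
proc sort data=master;
   by Common;
run;

proc sort data=newplant;
   by Common;
run;

/* Both inputs are now ordered by Common, as UPDATE requires. */
data update_file;
   update master newplant;
   by Common;
run;
```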
Using an Index with the MODIFY Statement
The MODIFY statement maintains the index. You do not have to rebuild the index like you do for the UPDATE statement.
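As a rough sketch of indexed access, the following creates a simple index and then uses it to locate observations. It assumes the MASTER and TRANSACTION data sets of Example 4 are in the WORK library. NewX is a hypothetical name for a renamed copy of the transaction variable VarX, and the UNIQUE option is included because TRANSACTION contains consecutive duplicate values of Year.

```sas
/* Create a simple index on Year (assumes MASTER is in the WORK library). */
proc datasets library=work nolist;
   modify master;
   index create Year;
quit;

/* Read each transaction, find the matching master observation through
   the index, and update it in place. NewX is a hypothetical name. */
data master;
   set transaction(keep=Year VarX rename=(VarX=NewX));
   modify master key=Year / unique;   /* restart the search for every lookup */
   if _iorc_ = %sysrc(_sok) then
      do;
         if NewX ne ' ' then VarX = NewX;   /* apply only nonmissing values */
         replace;
      end;
   else
      _error_ = 0;   /* no match found: clear the error flag, change nothing */
run;
```

Because the step updates MASTER in place, the index on Year remains usable afterward without being re-created.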
Choosing between UPDATE or MODIFY with BY
Using the UPDATE statement is comparable to using MODIFY with BY to apply transactions to a data set. While MODIFY is a more powerful tool with several other applications, UPDATE is still the tool of choice in some cases. The following table helps you choose whether to use UPDATE or MODIFY with BY.
Issue | MODIFY with BY | UPDATE |
---|---|---|
Disk space | saves disk space because it updates data in place | requires more disk space because it produces an updated copy of the data set |
Sort and index | sorted input data sets are not required, although for good performance, it is strongly recommended that both data sets be sorted and that the master data set be indexed | requires only that both data sets be sorted |
When to use | use only when you expect to process a SMALL portion of the data set | use if you expect to need to process most of the data set |
Where to specify the modified data set | specify the updated data set in both the DATA and the MODIFY statements | specify the updated data set in the DATA and the UPDATE statements |
Duplicate BY-values | allows duplicate BY-values in both the master and the transaction data sets | allows duplicate BY-values in the transaction data set only (If duplicates exist in the master data set, SAS issues a warning.) |
Scope of changes | cannot change the data set descriptor information, so changes such as adding or deleting variables, variable labels, and so on, are not valid | can make changes that require a change in the descriptor portion of a data set, such as adding new variables, and so on |
Error checking | has error-checking capabilities using the _IORC_ automatic variable and the SYSRC autocall macro | needs no error checking because transactions without a corresponding master record are not applied but are added to the data set |
Data set integrity | data may only be partially updated due to an abnormal task termination | no data loss occurs because UPDATE works on a copy of the data |
For more information about tools for combining SAS data sets, see Statements or Procedures for Combining SAS Data Sets.
Primary Uses of the MODIFY Statement
The MODIFY statement has three primary uses:
- modifying observations in a single SAS data set.
- modifying observations in a single SAS data set directly, either by observation number or by values in an index.
- modifying observations in a master data set, based on values in a transaction data set. MODIFY with BY is similar to using the UPDATE statement.
Several of the examples that follow demonstrate these uses.
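The first two uses can be sketched briefly. This is an illustration only: it assumes a data set named MASTER with character variables VarX and VarY, as in Example 4, and the temporary variable obsnum and the value 'y9' are hypothetical. The third use, MODIFY with BY, is the form shown in Example 4.

```sas
/* Use 1: change every observation of a data set in place. */
data master;
   modify master;
   VarX = upcase(VarX);   /* assumes VarX is a character variable */
run;

/* Use 2: go directly to one observation by observation number. */
data master;
   obsnum = 3;                   /* hypothetical: the observation to change */
   modify master point=obsnum;
   VarY = 'y9';                  /* hypothetical new value */
   replace;                      /* write the change back in place */
   stop;                         /* POINT= access never reaches end of file */
run;
```

In the second step, the explicit REPLACE is needed because STOP ends the step before the default action would occur.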
Example 1: Using UPDATE for Basic Updating
In this example, the data set MASTER contains original values of the variables Animal and Plant. The data set NEWPLANT is a transaction data set with new values of the variable Plant. The following shows the MASTER and the NEWPLANT input data sets:

                  MASTER                              NEWPLANT

    OBS    Common    Animal    Plant           OBS    Common    Plant

     1       a       Ant       Apple            1       a       Apricot
     2       b       Bird      Banana           2       b       Barley
     3       c       Cat       Coconut          3       c       Cactus
     4       d       Dog       Dewberry         4       d       Date
     5       e       Eagle     Eggplant         5       e       Escarole
     6       f       Frog      Fig              6       f       Fennel
The following program updates MASTER with the transactions in the data set NEWPLANT, writes the results to UPDATE_FILE, and prints the results:
    data update_file;
       update master newplant;
       by Common;
    run;

    proc print data=update_file;
       title 'Data Set Update_File';
    run;

Master Data Set Updated by Transaction Data Set

              Data Set Update_File              1

    Obs    Common    Animal    Plant

     1       a       Ant       Apricot
     2       b       Bird      Barley
     3       c       Cat       Cactus
     4       d       Dog       Date
     5       e       Eagle     Escarole
     6       f       Frog      Fennel

Each observation in the new data set contains a new value for the variable Plant.
Example 2: Using UPDATE with Duplicate Values of the BY Variable
If the master data set contains two observations with the same value of the BY variable, the first observation is updated and the second observation is ignored. SAS writes a warning message to the log. If the transaction data set contains duplicate values of the BY variable, SAS applies both transactions to the observation. The last values copied into the program data vector are written to the new data set. The following shows the MASTER1 and the DUPPLANT input data sets.

                  MASTER1                               DUPPLANT

    OBS    Common    Animal1    Plant1          OBS    Common    Plant1

     1       a       Ant        Apple            1       a       Apricot
     2       b       Bird       Banana           2       b       Barley
     3       b       Bird       Banana           3       c       Cactus
     4       c       Cat        Coconut          4       d       Date
     5       d       Dog        Dewberry         5       d       Dill
     6       e       Eagle      Eggplant         6       e       Escarole
     7       f       Frog       Fig              7       f       Fennel
The following program applies the transactions in DUPPLANT to MASTER1 and prints the results:
    data update1;
       update master1 dupplant;
       by Common;
    run;

    proc print data=update1;
       title 'Data Set Update1';
    run;

Updating Data Sets with Duplicate BY Values

                Data Set Update1                1

    Obs    Common    Animal1    Plant1

     1       a       Ant        Apricot
     2       b       Bird       Barley
     3       b       Bird       Banana
     4       c       Cat        Cactus
     5       d       Dog        Dill
     6       e       Eagle      Escarole
     7       f       Frog       Fennel
When this DATA step executes, SAS generates a warning message stating that there is more than one observation for a BY group. However, the DATA step continues to process, and the data set UPDATE1 is created.
The resulting data set has seven observations. Observations 2 and 3 have duplicate values of the BY variable Common. However, the value of the variable Plant1 was not updated in the second occurrence of the duplicate BY value.
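One way to find such duplicates before running UPDATE is to test the FIRST. and LAST. variables of the BY variable. The following is a minimal sketch, assuming MASTER1 is already sorted by Common. The data set name DUP_CHECK and the title text are hypothetical.

```sas
/* List master observations whose BY value occurs more than once. */
data dup_check;
   set master1;
   by Common;                               /* MASTER1 must be sorted by Common */
   if not (first.Common and last.Common);   /* keep only nonunique BY groups */
run;

proc print data=dup_check;
   title 'Duplicate BY Values in MASTER1';
run;
```

With MASTER1 as shown above, DUP_CHECK would contain the two observations for which Common is b.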
Example 3: Using UPDATE for Processing Nonmatched Observations, Missing Values, and New Variables
In this example, the data set MASTER2 is a master data set. It contains a missing value for the variable Plant2 in the first observation, and not all of the values of the BY variable Common are included. The transaction data set NONPLANT contains a new variable Mineral, a new value of the BY variable Common, and missing values for several observations. The following shows the MASTER2 and the NONPLANT input data sets:

                  MASTER2                                    NONPLANT

    OBS    Common    Animal2    Plant2          OBS    Common    Plant2    Mineral

     1       a       Ant                         1       a       Apricot   Amethyst
     2       c       Cat        Coconut          2       b       Barley    Beryl
     3       d       Dog        Dewberry         3       c       Cactus
     4       e       Eagle      Eggplant         4       e
     5       f       Frog       Fig              5       f       Fennel
                                                 6       g       Grape     Garnet
The following program updates the data set MASTER2 and prints the results:

    data update2_file;
       update master2 nonplant;
       by Common;
    run;

    proc print data=update2_file;
       title 'Data Set Update2_File';
    run;

Results of Updating with New Variables, Nonmatched Observations, and Missing Values

                 Data Set Update2_File                  1

    Obs    Common    Animal2    Plant2      Mineral

     1       a       Ant        Apricot     Amethyst
     2       b                  Barley      Beryl
     3       c       Cat        Cactus
     4       d       Dog        Dewberry
     5       e       Eagle      Eggplant
     6       f       Frog       Fennel
     7       g                  Grape       Garnet

As shown, all observations now include values for the variable Mineral. The value of Mineral is set to missing for some observations. Observations 2 and 6 in the transaction data set did not have corresponding observations in MASTER2, and they have become new observations. Observation 3 from the master data set was written to the new data set without change, and the value for Plant2 in observation 4 was not changed to missing. Three observations in the new data set have updated values for the variable Plant2.
The following program uses the UPDATEMODE statement option on the UPDATE statement, and prints the results:
    data update2_file;
       update master2 nonplant updatemode=nomissingcheck;
       by Common;
    run;

    proc print data=update2_file;
       title 'Data Set Update2_File - UPDATEMODE Option';
    run;

Results of Updating with the UPDATEMODE Option

       Data Set Update2_File - UPDATEMODE Option        1

    Obs    Common    Animal2    Plant2      Mineral

     1       a       Ant        Apricot     Amethyst
     2       b                  Barley      Beryl
     3       c       Cat        Cactus
     4       d       Dog        Dewberry
     5       e       Eagle
     6       f       Frog       Fennel
     7       g                  Grape       Garnet

The value of Plant2 in observation 5 is set to missing because the UPDATEMODE=NOMISSINGCHECK option is in effect.
For detailed examples for updating data sets, see Combining and Modifying SAS Data Sets: Examples.
Example 4: Updating a Master Data Set by Adding an Observation
If the transaction data set contains an observation that does not match an observation in the master data set, you must alter the program. The Year value in observation 5 of TRANSACTION has no match in MASTER. The following shows the MASTER and the TRANSACTION input data sets:

               MASTER                           TRANSACTION

    OBS    Year    VarX    VarY        OBS    Year    VarX    VarY

     1     1985    x1      y1           1     1991    x2
     2     1986    x1      y1           2     1992    x2      y2
     3     1987    x1      y1           3     1993    x2
     4     1988    x1      y1           4     1993            y2
     5     1989    x1      y1           5     1995    x2      y2
     6     1990    x1      y1
     7     1991    x1      y1
     8     1992    x1      y1
     9     1993    x1      y1
    10     1994    x1      y1
You must use an explicit OUTPUT statement to write a new observation to a master data set. (The default action for a DATA step using a MODIFY statement is REPLACE, not OUTPUT.) Once you specify an explicit OUTPUT statement, you must also specify a REPLACE statement. The following DATA step updates data set MASTER, based on values in TRANSACTION, and adds a new observation. This program also uses the _IORC_ automatic variable for error checking. (For more information about error checking, see Error Checking When Using Indexes to Randomly Access or Update Data.)

    data master;
       modify master transaction;
       by Year;
       /* A matching master observation was found: update it in place. */
       if _iorc_=%sysrc(_sok) then replace;
       /* No matching master observation: append the transaction. */
       else if _iorc_=%sysrc(_dsenmr) then
          do;
             output;
             _error_=0;
          end;
       /* Any other return code is unexpected: report it and stop. */
       else
          do;
             put "Unexpected error at Observation: " _n_;
             _error_=0;
             stop;
          end;
    run;

    proc print data=master;
       title 'Updated Master Data Set -- MODIFY';
       title2 'One Observation Added';
    run;
Modified Master Data Set

       Updated Master Data Set -- MODIFY        1
             One Observation Added

    Obs    Year    VarX    VarY

      1    1985    x1      y1
      2    1986    x1      y1
      3    1987    x1      y1
      4    1988    x1      y1
      5    1989    x1      y1
      6    1990    x1      y1
      7    1991    x2      y1
      8    1992    x2      y2
      9    1993    x2      y2
     10    1994    x1      y1
     11    1995    x2      y2

SAS added a new observation, observation 11, to the MASTER data set and updated observations 7, 8, and 9.
Copyright 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.
Source: https://v8doc.sas.com/sashtml/lrcon/z1081414.htm
Posted by: nelsonenterhad.blogspot.com