Rstudio: Difference between revisions

Consider the [research-datasets|public dataset] section or the [datasets|community dataset] section


=== Nominal Data (Categorical without Order) ===


'''Example dataset for locations (City, Country, Region):'''

{| class="wikitable sortable"
|-
! City
! Country
! Region
|-
| New York || United States || North America
|-
| Tokyo || Japan || Asia
|-
| Paris || France || Europe
|}
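
As a minimal sketch of how the location data could be represented in R, the snippet below builds a base R data frame and stores one nominal column as an unordered factor. The variable names <code>locations</code> and <code>region_counts</code> are illustrative assumptions, not names used elsewhere on this page.

```r
# Location data from the example table, built as a base R data frame.
# The object names here are hypothetical and chosen only for illustration.
locations <- data.frame(
  City    = c("New York", "Tokyo", "Paris"),
  Country = c("United States", "Japan", "France"),
  Region  = c("North America", "Asia", "Europe"),
  stringsAsFactors = FALSE
)

# Nominal variables are naturally stored as unordered factors.
locations$Region <- factor(locations$Region)

# Counting categories is meaningful for nominal data;
# ranking or averaging them is not.
region_counts <- table(locations$Region)
print(region_counts)

# An unordered factor reports that it carries no ordering.
print(is.ordered(locations$Region))
```

Frequency counts and the mode are the usual summaries for nominal data, which is why the sketch stops at <code>table()</code>.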
=== Ordinal Data (Categorical with Order) ===


'''Example dataset for survey responses (Satisfaction Level):''' It is useful for understanding the order of responses but not the magnitude of differences between them.

{| class="wikitable sortable"
|-
! RespondentID
! SatisfactionLevel
|-
| 1 || Satisfied
|-
| 2 || Neutral
|-
| 3 || Dissatisfied
|}
To understand the magnitude of differences between responses, you’d need to use interval or ratio data. Ordinal data can be treated as interval data by assigning numerical values to the categories (e.g., 1 for Dissatisfied, 2 for Neutral, 3 for Satisfied), although this assumes the categories are equally spaced.
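
The following sketch shows one way the survey responses might be encoded in R as an ordered factor and then mapped to the numeric codes mentioned above. The names <code>survey</code>, <code>above_neutral</code>, and <code>Score</code> are assumptions made for this example.

```r
# Survey responses from the example table (object names are hypothetical).
survey <- data.frame(
  RespondentID = 1:3,
  SatisfactionLevel = c("Satisfied", "Neutral", "Dissatisfied"),
  stringsAsFactors = FALSE
)

# Declare the category order explicitly, from lowest to highest.
survey$SatisfactionLevel <- factor(
  survey$SatisfactionLevel,
  levels  = c("Dissatisfied", "Neutral", "Satisfied"),
  ordered = TRUE
)

# Order comparisons are valid on an ordered factor.
above_neutral <- survey$SatisfactionLevel > "Neutral"

# Assign numeric codes: 1 = Dissatisfied, 2 = Neutral, 3 = Satisfied.
survey$Score <- as.integer(survey$SatisfactionLevel)
print(survey)
```

The integer codes preserve the order of the categories; whether the gaps between them are equal is a modelling assumption rather than something the data guarantees.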


Interval and ratio data are useful for understanding the magnitude of differences between responses: the mean and standard deviation can be calculated for them.


{| class="wikitable sortable"
|-
! RespondentID
! SatisfactionLevel
|-
| 1 || 7
|-
| 2 || 5
|-
| 3 || 3
|}
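
As a small illustration, the mean and standard deviation of the numeric satisfaction scores can be computed with base R. The object names <code>ratings</code>, <code>mean_rating</code>, and <code>sd_rating</code> are assumptions for this sketch.

```r
# Numeric satisfaction scores from the example table
# (object names are hypothetical).
ratings <- data.frame(
  RespondentID      = 1:3,
  SatisfactionLevel = c(7, 5, 3)
)

# Mean of the three scores.
mean_rating <- mean(ratings$SatisfactionLevel)

# Sample standard deviation (R's sd() divides by n - 1).
sd_rating <- sd(ratings$SatisfactionLevel)

print(mean_rating)  # 5
print(sd_rating)    # 2
```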
=== Interval and Ratio Data (Numeric Data) ===


'''Interval data example for temperature readings in Celsius (without a true zero point):''' Useful for understanding temperature changes over time.

{| class="wikitable sortable"
|-
! City
! MorningTemp
! NoonTemp
! EveningTemp
|-
| New York || 15 || 22 || 18
|-
| Tokyo || 20 || 28 || 25
|-
| Paris || 12 || 18 || 14
|}
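
The sketch below, with hypothetical object names such as <code>temps</code>, illustrates why differences are meaningful for interval data while ratios are not: the ratio of two temperatures changes when the same readings are expressed in Kelvin.

```r
# Temperature readings in Celsius from the example table
# (object names are hypothetical).
temps <- data.frame(
  City        = c("New York", "Tokyo", "Paris"),
  MorningTemp = c(15, 20, 12),
  NoonTemp    = c(22, 28, 18),
  EveningTemp = c(18, 25, 14)
)

# Differences are meaningful on an interval scale.
temps$MorningToNoon <- temps$NoonTemp - temps$MorningTemp
print(temps$MorningToNoon)  # 7 8 6

# Ratios are not: the same two readings give a different ratio
# once the zero point moves (Celsius versus Kelvin).
ratio_celsius <- temps$MorningTemp[2] / temps$MorningTemp[1]
ratio_kelvin  <- (temps$MorningTemp[2] + 273.15) /
                 (temps$MorningTemp[1] + 273.15)
print(c(ratio_celsius, ratio_kelvin))
```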
'''Ratio data example for population size (has a true zero point):''' Useful for understanding the population changes over time.

{| class="wikitable sortable"
|-
! City
! Population2010
! Population2020
|-
| New York || 8,175,133 || 8,336,817
|-
| Tokyo || 13,074,000 || 13,929,286
|-
| Paris || 2,243,833 || 2,148,271
|}
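
Because ratio data has a true zero, ratios and percentage changes are meaningful. A minimal sketch, assuming the illustrative names <code>population</code>, <code>GrowthRatio</code>, and <code>PercentChange</code>:

```r
# Population figures from the example table
# (object and column names added here are hypothetical).
population <- data.frame(
  City           = c("New York", "Tokyo", "Paris"),
  Population2010 = c(8175133, 13074000, 2243833),
  Population2020 = c(8336817, 13929286, 2148271)
)

# Ratio of 2020 to 2010 population: values above 1 indicate growth,
# values below 1 indicate decline.
population$GrowthRatio <- population$Population2020 / population$Population2010

# The same information expressed as a percentage change.
population$PercentChange <- (population$GrowthRatio - 1) * 100

print(population)
```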
=== Multivariable Datasets ===


'''Example with two or more tables required for dependent and independent variables:'''

''Table 1: Economic Data by Country''

{| class="wikitable sortable"
|-
! Country
! GDP2010 (in billions)
! GDP2020 (in billions)
|-
| United States || 14,964.4 || 21,427.7
|-
| Japan || 5,700.1 || 5,065.2
|-
| France || 2,649.0 || 2,715.5
|}
''Table 2: Education Data by Country''

{| class="wikitable sortable"
|-
! Country
! AvgYearsOfSchooling2010
! AvgYearsOfSchooling2020
|-
| United States || 12 || 13
|-
| Japan || 11 || 12
|-
| France || 11 || 12
|}


Note: In R, you’d typically handle these datasets as separate data frames or tibbles and use join operations to combine them on a common key (e.g., Country) for analysis.
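
A minimal sketch of such a join, using base R's <code>merge()</code> so that no extra packages are needed. The object names <code>gdp</code>, <code>education</code>, and <code>combined</code> are assumptions for this example; with tibbles, <code>dplyr::left_join()</code> would be a common alternative.

```r
# Table 1: economic data (GDP in billions), as a base R data frame.
# Object names are hypothetical.
gdp <- data.frame(
  Country = c("United States", "Japan", "France"),
  GDP2010 = c(14964.4, 5700.1, 2649.0),
  GDP2020 = c(21427.7, 5065.2, 2715.5),
  stringsAsFactors = FALSE
)

# Table 2: education data.
education <- data.frame(
  Country = c("United States", "Japan", "France"),
  AvgYearsOfSchooling2010 = c(12, 11, 11),
  AvgYearsOfSchooling2020 = c(13, 12, 12),
  stringsAsFactors = FALSE
)

# Inner join on the shared key column; merge() sorts rows by the key.
combined <- merge(gdp, education, by = "Country")
print(combined)
```

The combined data frame has one row per country and carries the columns of both tables, so the GDP and schooling variables can be analysed together.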