Rstudio: Difference between revisions

(One intermediate revision by one other user not shown)

@@ Line 51: / Line 51: @@
 Consider the [research-datasets|public dataset] section or the [datasets|community dataset] section
-== Nominal Data (Categorical without Order) ==
+=== Nominal Data (Categorical without Order) ===
-‘’‘Example dataset for locations (City, Country, Region):’‘’
+''' Example dataset for locations (City, Country, Region):
-{| class=“wikitable sortable”
+{| class="wikitable"
 |-
-! City
+! City !! Country !! Region
-! Country
-! Region
 |-
-| New York
+| New York || United States || North America
-| United States
+|-
+| Tokyo || Japan || Asia
+|-
+| Paris || France || Europe
+|}
-North America
-Tokyo
-Japan
-Asia
--
-Paris
-France
-Europe
-}
 === Ordinal Data (Categorical with Order) ===
-‘’‘Example dataset for survey responses (Satisfaction Level):’‘’ It is useful for understanding the order of responses but not the magnitude of differences between them.
+''' Example dataset for survey responses (Satisfaction Level): It is useful for understanding the order of responses but not the magnitude of differences between them
-{| class=“wikitable sortable”
+{| class="wikitable"
 |-
-! RespondentID
+! RespondentID !! SatisfactionLevel
-! SatisfactionLevel
 |-
-| 1
+| 1 || Satisfied
+|-
+| 2 || Neutral
+|-
+| 3 || Dissatisfied
+|}
-Satisfied
-2
-Neutral
--
-3
-Dissatisfied
-}
 To understand the magnitude of differences between responses, you’d need interval or ratio data. Ordinal data can be treated as interval data by assigning numerical values to the categories (e.g., 1 for Dissatisfied, 2 for Neutral, 3 for Satisfied), which assumes that the gaps between adjacent categories are equal.
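The conversion described above can be sketched in base R. This is a minimal illustration rather than the page's own method: the variable names (`responses`, `satisfaction`, `scores`) are assumptions introduced here, and the mean and standard deviation of the codes are only meaningful if the categories are assumed to be equally spaced.

```r
# Responses from the example table (RespondentID 1 to 3)
responses <- c("Satisfied", "Neutral", "Dissatisfied")

# An ordered factor keeps the ranking Dissatisfied < Neutral < Satisfied
satisfaction <- factor(
  responses,
  levels = c("Dissatisfied", "Neutral", "Satisfied"),
  ordered = TRUE
)

# Integer codes follow the level order (1 = Dissatisfied, 3 = Satisfied)
scores <- as.integer(satisfaction)

# The median relies only on order, so it is safe for ordinal data
median_score <- median(scores)

# Mean and standard deviation treat the coded gaps as equal (an assumption)
mean_score <- mean(scores)
sd_score <- sd(scores)

print(data.frame(RespondentID = 1:3,
                 SatisfactionLevel = satisfaction,
                 Score = scores))
```

Keeping the ordered factor alongside the integer codes lets order-based summaries (median, ranks) and code-based summaries (mean, standard deviation) be compared side by side.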
@@ Line 97: / Line 87: @@
 The mean and standard deviation can be calculated for interval and ratio data, which makes these data types useful for understanding the magnitude of differences between responses.
-{| class=“wikitable sortable”
+{| class="wikitable"
 |-
-! RespondentID
+! RespondentID !! SatisfactionLevel
-! SatisfactionLevel
 |-
-| 1
+| 1 || 7
+|-
+| 2 || 5
+|-
+| 3 || 3
+|}
-7
-2
-5
--
-3
-}
 === Interval and Ratio Data (Numeric Data) ===
-‘’‘Interval data example for temperature readings in Celsius (without a true zero point):’‘’ Useful for understanding temperature changes over time.
+''' Interval data example for temperature readings in Celsius (without a true zero point): Useful for understanding temperature changes over time.
-{| class=“wikitable sortable”
+{| class="wikitable"
+|-
+! City !! MorningTemp !! NoonTemp !! EveningTemp
+|-
+| New York || 15 || 22 || 18
 |-
-! City
+| Tokyo || 20 || 28 || 25
-! MorningTemp
-! NoonTemp
-! EveningTemp
 |-
-| New York
+| Paris || 12 || 18 || 14
-| 15
+|}
-| 22
-18
+''' Ratio data example for population size (has a true zero point): Useful for understanding the population changes over time.
-Tokyo
-20
-28
-25
--
-Paris
-12
-18
-14
-}
-‘’‘Ratio data example for population size (has a true zero point):’‘’ Useful for understanding the population changes over time.
-{| class=“wikitable sortable”
+{| class="wikitable"
 |-
-! City
+! City !! Population2010 !! Population2020
-! Population2010
-! Population2020
 |-
-| New York
+| New York || 8175133 || 8336817
-| 8,175,133
+|-
+| Tokyo || 13074000 || 13929286
+|-
+| Paris || 2243833 || 2148271
+|}
-8,336,817
-Tokyo
-13,074,000
-13,929,286
--
-Paris
-2,243,833
-2,148,271
-}
 === Multivariable Datasets ===
-‘’‘Example with two or more tables required for dependent and independent variables:’‘’
+''' Example with two or more tables required for dependent, independent variables:
-‘‘Table 1: Economic Data by Country’’
+'' Table 1: Economic Data by Country
-{| class=“wikitable sortable”
+{| class="wikitable"
+|-
+! Country !! GDP2010 (in billions) !! GDP2020 (in billions)
+|-
+| United States || 14964.4 || 21427.7
 |-
-! Country
+| Japan || 5700.1 || 5065.2
-! GDP2010 (in billions)
-! GDP2020 (in billions)
 |-
-| United States
+| France || 2649.0 || 2715.5
-| 14,964.4
+|}
-21,427.7
+'' Table 2: Education Data by Country
-Japan
-5,700.1
-5,065.2
--
-France
-2,649.0
-2,715.5
-}
-‘‘Table 2: Education Data by Country’’
-{| class=“wikitable sortable”
+{| class="wikitable"
+|-
+! Country !! AvgYearsOfSchooling2010 !! AvgYearsOfSchooling2020
+|-
+| United States || 12 || 13
 |-
-! Country
+| Japan || 11 || 12
-! AvgYearsOfSchooling2010
-! AvgYearsOfSchooling2020
 |-
-| United States
+| France || 11 || 12
-| 12
+|}
-13
-Japan
-11
-12
--
-France
-11
-12
-}
 Note: In R, you’d typically keep these datasets as separate data frames or tibbles and use join operations to combine them on a common key (e.g., Country) for analysis.
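The join described in the note can be sketched with base R's `merge()`. This is a minimal example under stated assumptions: the data frame names (`economic`, `education`, `combined`) and the derived `GDPGrowthPct` column are illustrative and do not appear on the original page. The values are copied from Table 1 and Table 2.

```r
# Table 1: Economic Data by Country (GDP in billions)
economic <- data.frame(
  Country = c("United States", "Japan", "France"),
  GDP2010 = c(14964.4, 5700.1, 2649.0),
  GDP2020 = c(21427.7, 5065.2, 2715.5)
)

# Table 2: Education Data by Country
education <- data.frame(
  Country = c("United States", "Japan", "France"),
  AvgYearsOfSchooling2010 = c(12, 11, 11),
  AvgYearsOfSchooling2020 = c(13, 12, 12)
)

# Inner join on the shared key; merge() sorts the result by Country
combined <- merge(economic, education, by = "Country")

# GDP is ratio data, so a percentage change is meaningful
combined$GDPGrowthPct <-
  (combined$GDP2020 - combined$GDP2010) / combined$GDP2010 * 100

print(combined)
```

With tibbles, `dplyr::left_join(economic, education, by = "Country")` would be a comparable choice. `merge()` is used here only to keep the sketch free of package dependencies.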
@@ Line 211: / Line 165: @@
 ''' ''''Pre- and Post-Campaign Survey Results''''': Collecting data on individuals’ attitudes, knowledge, or behaviors before and after exposure to a campaign allows for direct assessment of the campaign’s impact.
-''' ''''Sales Data''''': Sales data can be used to measure the impact of marketing campaigns on consumer behavior.
-''' ''''Event Attendance''''': Data on event attendance can be used to measure the impact of public health or social policy campaigns on participation in related activities.
+''' <nowiki>''</nowiki>''Sales Data''''': Sales data can be used to measure the impact of marketing campaigns on consumer behavior.
-''' ''''Economic Indicators/Purchasing Data''''': Marketing campaigns can use economic indicators and purchasing data to measure their impact on consumer behavior.
-''' ''''Engagement Metrics''''': Data on online campaigns might include website visits, time spent on the site, click-through rates, social media engagement (likes, shares), and more.
+''' <nowiki>''</nowiki>''Event Attendance''''': Data on event attendance can be used to measure the impact of public health or social policy campaigns on participation in related activities.
+''' <nowiki>''</nowiki>''Economic Indicators/Purchasing Data''''': Marketing campaigns can use economic indicators and purchasing data to measure their impact on consumer behavior.
+''' <nowiki>''</nowiki>''Engagement Metrics''''': Data on online campaigns might include website visits, time spent on the site, click-through rates, social media engagement (likes, shares), and more.
 ==== Example of Behavior Change That Can and Cannot Be Assessed ====
 ✅ Can Be Assessed: An increase/decrease in recycling rates following an environmental awareness campaign can be measured through surveys or municipal waste data.
 ✅ Can Be Assessed: Sales data can measure an increase or decrease in product sales following a marketing campaign.
 ✅ Can Be Assessed: An increase/decrease in attendance at a public health event following a public health campaign can be measured through event attendance data.
 ✅ Can Be Assessed: Web analytics can measure an increase or decrease in website visits following a digital marketing campaign.
 ✅ Can Be Assessed: Social media analytics can measure an increase or decrease in engagement following a social media campaign.
 ❌ Cannot Be Assessed Easily: Changes in attitudes or knowledge following a public awareness campaign may require pre- and post-campaign surveys to assess the campaign’s impact.
 ❌ Cannot Be Assessed Easily: Changes in deeply held beliefs or attitudes, such as political views, may not be immediately observable or directly translate into measurable behaviors.
 ❌ Cannot Be Assessed Easily: Changes in long-term health outcomes following a public health campaign may require long-term follow-up and control groups to assess the campaign’s impact.
 ❌ Cannot Be Assessed Easily: Changes in social norms or cultural attitudes may be difficult to measure directly and require more qualitative or indirect measures.
 ==== Common Errors and Pitfalls When Starting Research ====
 ''' ''''Selection Bias''''': Not adequately representing the target population in pre- and post-campaign surveys can lead to skewed results.
-''' ''''Confirmation Bias''''': Interpreting data to confirm preconceived notions about the campaign’s effectiveness without objectively considering all evidence.
-''' ''''Overlooking External Factors''''': Failing to account for external events or trends that may influence behavior independently of the campaign (e.g., a new law or cultural shift).
+'''<nowiki>''</nowiki>''Confirmation Bias''''': Interpreting data to confirm preconceived notions about the campaign’s effectiveness without objectively considering all evidence.
-''' ''''Insufficient Pre-Campaign Data''''': Starting data collection without establishing a baseline for comparison can make it challenging to attribute changes in behavior directly to the campaign.
-''' ''''Assuming Immediate Impact''''': Some campaigns have a delayed effect on behavior, so focusing only on immediate post-campaign data can lead to premature conclusions that they were ineffective.
+''' <nowiki>''</nowiki>''Overlooking External Factors''''': Failing to account for external events or trends that may influence behavior independently of the campaign (e.g., a new law or cultural shift).
+'''<nowiki>''</nowiki>''Insufficient Pre-Campaign Data''''': Starting data collection without establishing a baseline for comparison can make it challenging to attribute changes in behavior directly to the campaign.
+'''<nowiki>''</nowiki>''Assuming Immediate Impact''''': Some campaigns have a delayed effect on behavior, so focusing only on immediate post-campaign data can lead to premature conclusions that they were ineffective.
 By carefully planning research and being mindful of these considerations and potential pitfalls, analysts can more accurately assess the impact of messaging campaigns on behavior.
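A pre- and post-campaign comparison with an established baseline can be sketched in base R as follows. The survey scores below are hypothetical values invented purely for illustration, and the variable names are assumptions. A paired t-test is only one possible analysis, and it does not by itself rule out the external factors listed above.

```r
# Hypothetical 1-7 attitude scores for the same five respondents
# (illustrative values only, not data from the original page)
pre  <- c(3, 4, 2, 5, 3)   # baseline collected before the campaign
post <- c(4, 5, 4, 5, 4)   # collected after the campaign

# Per-respondent change relative to the baseline
change <- post - pre
mean_change <- mean(change)

# Paired t-test: is the mean change distinguishable from zero?
result <- t.test(post, pre, paired = TRUE)

print(change)
print(result)
```

Pairing each respondent's answers keeps the baseline explicit. A control group that was not exposed to the campaign would still be needed to separate the campaign's effect from outside events.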