Overview


Central California Travel Study: Unweighted Codebook and Dataset Guide

This document provides key information about all variables included in the Central California Travel Study dataset. This includes:

  • A codebook with all variable and value labels, searchable by data table.

  • Unweighted frequency tables with unweighted tabulations for all categorical variables in the dataset.

  • Information about data privacy, data preparation, and notes on joining tables.


Survey and Dataset Overview

The Central California Travel Study collected data from April 19, 2022 to March 7, 2023. Households were recruited into the study via address-based sampling (ABS), supplemented with non-probability sampling through housing agencies, transit lists, and outreach organizations. The study included two parts:

  1. Part one, also called the “recruit survey,” collected information about household composition, demographics, and typical travel behavior.

  2. Part two, also called the “travel diary,” required participants to record their travel during an assigned travel period.


“Complete” households met the following conditions:

  • The household completed the recruit survey in full.

  • The household completed a travel diary for all participating household members on at least one concurrent weekday.


Identifying participating household members

The household member who completed the recruit survey is referred to as “person 1.” All household members whose relationship to person 1 is spouse or unmarried partner, child, parent, sibling, or other relative reported their travel and are considered participating household members. These persons have a value of 1 in the “surveyable” column of the person data table. Roommates/friends, household help, and other nonrelatives living in the household did not report travel, have a value of 0 in the “surveyable” column, and are not considered participating household members. As described above, a household is considered “complete” when all of its participating members completed all travel diary surveys on at least one concurrent weekday.
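As a minimal sketch of the rule above, participating household members can be selected by filtering the person table on “surveyable.” The rows below are made-up illustrations, not survey data. Only the column names (hh_id, person_num, relationship, surveyable) come from this guide; the relationship labels are placeholders for the dataset's actual value codes.

```python
# Hypothetical person-table rows; real rows would be read from the person data table.
persons = [
    {"hh_id": 1, "person_num": 1, "relationship": "self", "surveyable": 1},
    {"hh_id": 1, "person_num": 2, "relationship": "spouse/partner", "surveyable": 1},
    {"hh_id": 1, "person_num": 3, "relationship": "roommate/friend", "surveyable": 0},
]


def participating_members(rows):
    """Keep only household members who reported travel (surveyable == 1)."""
    return [row for row in rows if row["surveyable"] == 1]


participants = participating_members(persons)
print([(p["hh_id"], p["person_num"]) for p in participants])
```

Filtering on “surveyable” rather than on the relationship labels keeps the selection independent of how relationship categories are coded.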

Dataset Composition

The final dataset includes six distinct data tables, listed below. These tables include all user-input survey variables, certain survey metadata (e.g., survey completion mode), and variables derived to support data analysis.

Table       Rows
Household       7,406  complete households
Person         19,084  people
Vehicle        13,186  vehicles
Day            42,567  days
Trip          150,012  trips
Location    2,077,228  points


Data Privacy

The survey dataset contains sensitive and confidential data, including personally identifiable information (PII). The table below lists specific variables that are likely to contain PII or sensitive information; this list is not exhaustive, and combinations of other variables may also constitute PII. Please follow your organization’s privacy policy, this survey’s privacy policy, and generally accepted best practices for data privacy and security.

RSG has removed any PII used solely for survey administration from the final dataset. Additionally, the survey access codes used by participants have been replaced with household IDs (hh_id) so that no information in the final dataset can be identified using the access codes (passwords) that participants used.


Sensitive and Personally Identifiable Information (PII)

Variable Name(s)                Data Level   Confidential Data Type
home_lat, home_lon              Household    PII
income_detailed, income_broad   Household    Sensitive
ethnicity                       Person       Sensitive
ethnicity_other                 Person       Sensitive and/or PII
race                            Person       Sensitive
race_other                      Person       Sensitive and/or PII
school_lat, school_lon          Person       PII
work_lat, work_lon              Person       PII
o_lat, o_lon                    Trip         PII
d_lat, d_lon                    Trip         PII
mode_other_specify              Trip         Sensitive and/or PII
other                           Vehicle      Sensitive and/or PII
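
One way to act on the table above is to drop the flagged columns before sharing an extract. The sketch below is illustrative only: the column names are taken from the table, while the FLAGGED mapping, the drop_flagged helper, and the sample row are hypothetical. Because the list is not exhaustive, dropping these columns alone should not be assumed to de-identify the data.

```python
# Variables flagged as PII or sensitive in the table above, grouped by data table.
FLAGGED = {
    "household": {"home_lat", "home_lon", "income_detailed", "income_broad"},
    "person": {
        "ethnicity", "ethnicity_other", "race", "race_other",
        "school_lat", "school_lon", "work_lat", "work_lon",
    },
    "trip": {"o_lat", "o_lon", "d_lat", "d_lon", "mode_other_specify"},
    "vehicle": {"other"},
}


def drop_flagged(row, table):
    """Return a copy of one row without the columns flagged for that table."""
    return {key: value for key, value in row.items() if key not in FLAGGED[table]}


# Made-up trip row for illustration; values are not survey data.
trip_row = {"trip_id": 1, "o_lat": 36.7, "o_lon": -119.8, "mode_type": 8}
print(drop_flagged(trip_row, "trip"))
```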

Codebook

Variable Descriptions

Household

sample_segment
Sample segment

participation_group
Participation group

home_county
Home location – County

home_state
Home location – State

home_in_region
Home is in the study region

home_puma_2012
Home location – 2012 Public Use Microdata Area

num_trips
Number of trips on complete travel day

num_days_complete_weekday
Number of complete household days on Monday, Tuesday, Wednesday, Thursday, and Friday

num_days_complete_weekend
Number of complete household days on Saturday and Sunday

num_complete_mon
Household is complete on Monday of travel day

num_complete_tue
Household is complete on Tuesday of travel day

num_complete_wed
Household is complete on Wednesday of travel day

num_complete_thu
Household is complete on Thursday of travel day

num_complete_fri
Household is complete on Friday of travel day

num_complete_sat
Household is complete on Saturday of travel day

num_complete_sun
Household is complete on Sunday of travel day

num_people
Number of people in household

num_participants
Number of participants

num_adults
Number of adults in household (age 18 and above)

num_kids
Number of children in household (age 0-17)

num_students
Number of adult students in household

num_workers
Number of workers in household (employed full-time, employed part-time, self-employed, or volunteer/unpaid intern)

num_vehicles
Number of vehicles in household

income_detailed
2021 household income (detailed categories)

income_followup
2021 household income (broad categories)
if replied ‘Prefer not to answer’ to income_detailed

income_broad
2021 household income (broad categories, combined responses of income_detailed and income_followup)

residence_rent_own
Current residence ownership
if rMove or (rMove for Web and person 1)

residence_type
Type of current residence
if rMove or (rMove for Web and person 1)

residence_term
Duration lived in current residence
if rMove or (rMove for Web and person 1)

residence_own_barriers
Barriers against home owning
if rMove or (rMove for Web and person 1) and residence_rent_own is not ‘own/buying (paying a mortgage)’

num_bicycles
Number of bicycles in household

bicycle_type
Bicycle types owned by household
if household has >= 1 bike

micromobility_devices
Micromobility used by household
if rMove or (rMove for Web and person 1)

Person

person_num
Person number within household

num_trips
Number of trips on complete travel day

is_participant
Active participant (age 18+)

surveyable
Survey participant (related to primary member)

has_proxy
Has a proxy

is_proxy
Assigned proxy reporter

relationship
Relationship to household person number 1

age
Age of household member

gender
Gender
if surveyable

education
Highest level of education completed
if participant

disability
Disability or illness that affects ability to travel
if age >= 18 and surveyable

race
Race
if age >= 18 and surveyable

ethnicity
Ethnicity
if age >= 18 and surveyable

can_drive
Household member drives
if age >= 16 and surveyable

student
Student status and location
if age >= 18 and surveyable

employment
Employment status
if age >= 16

num_jobs
Number of jobs
if employment = full/part/self/volunteer/furloughed and surveyable

job_type
Work location type
if employment = full/part/self/volunteer and surveyable

industry
Job industry
if age >= 18, employment = full/part/self/volunteer/furloughed, and surveyable

work_in_region
Work is in study region
if employment = full/part/self/volunteer and attends work in-person (some or all of the time)

work_county
Work location – County
if employment = full/part/self/volunteer and attends work in-person (some or all of the time)

work_state
Work location – State
if employment = full/part/self/volunteer and attends work in-person (some or all of the time)

work_puma_2012
Work location – 2012 Public Use Microdata Area
if employment = full/part/self/volunteer and attends work in-person (some or all of the time)

work_freq
Number of days typically worked each week
if participant and employment = full/part/self/volunteer

work_mode
Typical mode of travel to/from work
if employment = full/part/self/volunteer, age >= 18, and doesn’t work only from home

telework_freq
How often telecommutes
if employment = full/part/self/volunteer and age >= 18

telework_freq_pre_covid
How often telecommuted before covid
if employment = full/part/self/volunteer/furloughed and age >= 18

commute_subsidy
Commute Benefits Provided by Employer
if employment = full/part/self/volunteer/furloughed and age >= 18

commute_subsidy_use
Commute Subsidy Used
if employment = full/part/self/volunteer/furloughed, employer provides benefits, and surveyable

school_type
Type of school attends
if student and surveyable

school_in_region
School is in study region
if student who attends some or all in-person classes

school_county
School location – County
if student who attends some or all in-person classes

school_state
School location – State
if student who attends some or all in-person classes

school_puma_2012
School location – 2012 Public Use Microdata Area
if student who attends some or all in-person classes

school_attend
Structure of classes
if student and not cared for at home, attending daycare, or home schooled

school_freq
Frequency of travel to school
if student who attends some or all in-person classes

school_mode
Typical mode of travel to/from school
if student who attends some or all in-person classes

remote_class_freq
Remote class frequency
if child and student

second_home
Regularly spends the night at a second home (e.g., another parent or grandparent’s house, partner or spouse’s home, or a vacation home)
if surveyable

second_home_in_region
Second home is in study region
if has second home

second_home_county
Second home location – County
if has second home

second_home_state
Second home location – State
if has second home

second_home_puma_2012
Second home location – 2012 Public Use Microdata Area
if has second home

share
Share services
if age >= 18

tnc_freq
Frequency of app-based ride service
if uses Uber, Lyft, or other smartphone-app ride service

carshare_freq
Frequency of carshare use
if uses Carshare

peerrent_freq
Frequency of peer-to-peer car rental use
if uses Peer-to-peer car rental

bikeshare_freq
Frequency of bikeshare use
if uses bikeshare or bike rental service

vanpool_freq
Frequency of vanpool use
if uses vanpool

scootshare_freq
Frequency of scootershare use
if uses scooter share

mopedshare_freq
Frequency of mopedshare use
if uses moped share

transit_freq
Use frequency of transit
if age >= 18 and participant

vehicle
Vehicle driven the most
if household has >= 1 vehicle and person drives

ev_typical_charge
Typical electric vehicle charge location
if fuel type of primary vehicle driven is electric

ev_purchase
Considering purchasing a fully electric vehicle
if household has >= 1 vehicle and fuel type of primary vehicle driven is not electric

home_vehicle_park
Typical park location at home
if household has 1+ vehicles and person drives; if rMove or (rMove for Web and person 1)

home_vehicle_park_pay
Pays to park
if household has 1+ vehicles and person drives; if rMove or (rMove for Web and person 1)

barriers
Barrier
if age >= 18 and participant

bike_freq
Bicycle use frequency
if age >= 18 and participant

bike_store
Bicycle storage location
if household has >= 1 bike

physical_activity
Time spent exercising per week
if age >= 18 and participant

discrimination
Discrimination Faced
if rMove or (rMove for Web and person 1)

discrimination_basis
Basis of Discrimination
if rMove or (rMove for Web and person 1) and discrimination is not ‘No, I have not experienced discrimination in housing’

neighborhood_pro
Best thing about current neighborhood
if rMove or (rMove for Web and person 1)

neighborhood_con
Worst thing about current neighborhood
if rMove or (rMove for Web and person 1)

housing_planning_types
Desired housing type in community
if rMove or (rMove for Web and person 1)

language_at_home
Speaks a language other than English at home
if age >= 18 and surveyable

language_spoken
Language spoken at home
if language_at_home = ‘Yes’

phone_type
Type of phone
if age >= 18 and surveyable and (bMove or not person 1)

participate
Willingness to participate in future transportation surveys

Vehicle

vehicle_num
Vehicle number within household

make
Vehicle make

model
Vehicle model

year
Vehicle year

fuel_type
Fuel Type

vehicle_ownership
Ownership status of vehicle

toll_transponder
Vehicle has a toll transponder

Day

person_num
Person number within household

day_num
Day number within travel period

num_trips
Number of trips on complete travel day

is_participant
Active participant (age 18+)

surveyable
Survey participant (related to primary member)

travel_day
Day on which trips are reported (including child proxy day)

travel_dow
Day of week

hh_day_complete
Household day completion status

summary_complete
Summary is complete

proxy_complete
Proxy is complete
if has a proxy

num_complete_trip_surveys
Number of complete trip surveys

begin_day
Location at beginning of day

end_day
Location at end of day

made_travel
Made travel on day
if child with zero trips, did not travel to school/daycare, and begin_day = end_day

no_travel
Reason for no travel on travel day
if made zero trips

delivery
Delivery on travel day
if adult participant and (rMove or person 1)

attend_school
Travel to school on day
if age < 18 and attends school or daycare in-person and did not report trip with school purpose on travel day

attend_school_no
Didn’t travel to school on day
if age < 18 and did not attend school or daycare on travel day

telecommute_time
if teleworked

Trip

person_num
Person number within household

day_num
Day number within travel period

travel_dow
Day of week

hh_day_complete
Household day completion status

trip_survey_complete
Trip survey was completed

copied_from_proxy
Trip copied from proxy

depart_dow
Departure day of the week

arrive_dow
Arrival day of the week

o_in_region
Origin is in study region

o_county
Origin – County

o_state
Origin – State

o_puma_2012
Origin – 2012 Public Use Microdata Area

d_in_region
Destination is in study region

d_county
Destination – County

d_state
Destination – State

d_puma_2012
Destination – 2012 Public Use Microdata Area

speed_mph
Speed (mph)

distance_miles
Trip distance between collected location points (miles)

duration_minutes
Trip duration imputed (minutes)

mode_type
Mode type

mode_1
Trip mode 1

mode_2
Trip mode 2

mode_3
Trip mode 3

mode_4
Trip mode 4

num_travelers
Number of people in travel party

num_hh_travelers
Number of household members on trip

num_non_hh_travelers
Number of non-household member travelers

hh_member
Member on trip

o_purpose_category
Origin purpose category

o_purpose
Origin purpose

d_purpose_category
Destination purpose category

d_purpose
Destination purpose

driver
Driver of vehicle
if mode/transit_access/transit_egress = HH vehicle or other vehicle and (travel party = 2+ except if other travelers are household children under 16)

ev_charge_station
Electric vehicle charge station at trip destination
if used household electric vehicle on trip

ev_charge_station_decision
Decided to stop because of EV charging stations
if used EV charge station at destination and destination is not home/work/school location

ev_charge_station_level
EV charge stations at destination
if EV charge stations were at destination

park_location
Location of parked vehicle
if mode or transit_access or transit_egress = HH vehicle or other vehicle

park_type
Payment method for vehicle parking
if park_location = lot/garage, on-street, or park and ride lot

bike_park_loc
Location bike was parked on trip
if mode or transit_access or transit_egress = bicycle

scooter_park_location
Scooter park location for trip
if mode or transit_access or transit_egress = micromobility

transit_egress
Mode used to leave transit stop
if mode = bus or rail

transit_access
Mode used to access transit stop
if mode = bus or rail

transit_type
Payment method for transit
if mode = bus (except school bus) or rail

tnc_type
Type of TNC used on trip
if mode_taxi = Uber/Lyft

taxi_type
Who paid for trip
if mode or transit_access or transit_egress = taxi

taxi_pay
Knows amount paid for taxi trip
if taxi_type = I paid, employer paid, split/shared

taxi_cost
Amount paid for taxi trip
if taxi_pay = yes

unlinked_trip
Trip is unlinked

Using The Data

Data Preparation

This section summarizes the methods used to prepare the data. Because all data were collected in a “controlled” environment (e.g., survey answers were validated in real time), data preparation focused primarily on coding variables and deriving new fields to facilitate analysis.

Data cleaning included dropping trips with unreasonable speeds or distances for the reported mode of travel. Only households with reported home locations inside the survey region were included in the final data.


Joining Tables

Survey data tables can be joined to one another as follows:

Table name   Variable(s) to join to other survey data tables
Household    hh_id
Person       hh_id, person_id
Vehicle      hh_id
Day          hh_id, person_id, day_id
Trip         hh_id, person_id, day_id, trip_id
Location     trip_id
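
The join keys above can be applied as in the following sketch, which attaches person-level and household-level fields to each trip. All rows and ID values are invented for illustration, and index_by is a hypothetical helper; only the key names (hh_id, person_id, day_id, trip_id) and the person_num and num_vehicles columns come from this guide.

```python
# Made-up rows for illustration; real rows would be read from the data tables.
households = [{"hh_id": 10, "num_vehicles": 2}, {"hh_id": 11, "num_vehicles": 0}]
persons = [
    {"hh_id": 10, "person_id": 1001, "person_num": 1},
    {"hh_id": 10, "person_id": 1002, "person_num": 2},
    {"hh_id": 11, "person_id": 1101, "person_num": 1},
]
trips = [
    {"hh_id": 10, "person_id": 1001, "day_id": 100101, "trip_id": 1},
    {"hh_id": 10, "person_id": 1002, "day_id": 100201, "trip_id": 2},
    {"hh_id": 11, "person_id": 1101, "day_id": 110101, "trip_id": 3},
]


def index_by(rows, *keys):
    """Build a lookup from a tuple of key values to the matching row."""
    return {tuple(row[key] for key in keys): row for row in rows}


hh_index = index_by(households, "hh_id")
person_index = index_by(persons, "hh_id", "person_id")

joined = []
for trip in trips:
    person = person_index[(trip["hh_id"], trip["person_id"])]
    household = hh_index[(trip["hh_id"],)]
    # Copy only the fields needed, since some names (e.g., num_trips)
    # appear in more than one table and would otherwise collide.
    joined.append({
        **trip,
        "person_num": person["person_num"],
        "num_vehicles": household["num_vehicles"],
    })

print(joined[0])
```

Selecting columns explicitly, rather than merging whole rows, avoids silently overwriting same-named variables from different tables.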

Time and Location Standards

All timestamps are set to the local time at the time they were collected. All latitude and longitude values are referenced to the WGS84 datum.

All timestamps reflect the local time zone for the study region (Pacific Time), regardless of where the trip took place geographically (e.g., if a trip took place in another time zone, the timestamps for that trip are still in Pacific Time).

Missing Values

A survey data table cell may be missing data for one of four reasons:

1. Value or response is missing due to survey logic, participant non-response, or error.

Example: Participants who traveled by bus were not asked if they were the driver or passenger on the trip.

Coded as: 995 for categorical variables, blank/NA for continuous variables

2. A respondent indicated that the question was not applicable and skipped that question.

Example: Some participants did not share how they pay to park at work because they do not park at work (e.g., carpool).

Coded as: 996 (often labeled as “Not applicable”)

3. A respondent indicated that they didn’t know the answer and skipped that question.

Example: Some participants who made a vehicle trip and paid to park the vehicle may not remember the amount they paid.

Coded as: 998 (Don’t know)

4. A respondent indicated that they preferred not to answer a question and skipped that question.

Example: Some participants chose not to provide their household income.

Coded as: 999 (Prefer not to answer)

Other notes about missing survey data:

  • For a survey to be complete, all survey questions asked of the participant must have been answered in the app or online instrument.
  • Continuous variables (e.g., trip distance, trip duration) are not coded with missing value codes and are instead left empty when missing to avoid interfering with statistical calculations.
  • Due to the large size of the location table, missing values were left exactly as they were collected. Speed, heading, and accuracy can each contain missing values stored as -1, NA, or 0. Analyses of those fields should keep only values greater than zero.
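
The coding rules above could be applied before tabulating, roughly as in the following sketch. The function names and sample values are hypothetical; the codes 995, 996, 998, and 999 and the "greater than zero" rule for location fields come from this section.

```python
import math

# Missing-value codes for categorical variables, as listed above.
MISSING_CODES = {
    995: "Missing (survey logic, non-response, or error)",
    996: "Not applicable",
    998: "Don't know",
    999: "Prefer not to answer",
}


def is_missing_categorical(value):
    """True if a categorical cell is empty or holds a missing-value code."""
    return value is None or value in MISSING_CODES


def is_valid_location_value(value):
    """True for a usable speed/heading/accuracy reading (not NA, and > 0)."""
    if value is None:
        return False
    if isinstance(value, float) and math.isnan(value):
        return False
    return value > 0


# Hypothetical categorical responses and location-table speeds.
income_responses = [3, 999, 5, 995, 3]
valid_income = [v for v in income_responses if not is_missing_categorical(v)]

speeds = [-1, 0, None, float("nan"), 3.4, 12.0]
valid_speeds = [s for s in speeds if is_valid_location_value(s)]
```

Whether to drop coded responses or report them as their own category depends on the analysis; the sketch simply separates them from substantive values.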

Outliers

Continuous variables (e.g., trip distance, trip duration) in the dataset may contain outliers. Data users should be aware of these outliers when calculating summary statistics (e.g., mean) for these variables.
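
To illustrate why outliers matter for summary statistics, the sketch below compares the mean, the median, and a mean computed after dropping values above the 95th percentile. The distances and their units are made up for illustration, and the 95th-percentile cutoff is an arbitrary assumption rather than a recommendation from the study.

```python
import statistics

# Hypothetical trip distances with one extreme value.
trip_distances = [0.8, 1.2, 2.5, 3.1, 4.0, 5.6, 7.3, 250.0]

mean_distance = statistics.mean(trip_distances)
median_distance = statistics.median(trip_distances)

# 99 percentile cut points; index 94 is the 95th percentile.
cut_points = statistics.quantiles(trip_distances, n=100, method="inclusive")
p95 = cut_points[94]

# Keep only values at or below the cutoff, then recompute the mean.
trimmed = [d for d in trip_distances if d <= p95]
trimmed_mean = statistics.mean(trimmed)
```

In this toy example the single extreme trip pulls the mean far above the median, while the trimmed mean stays close to the median.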

Derived and Recoded Variables

This dataset includes a combination of variables that were actively collected via survey questions, passively collected via rMove or other metadata, implicitly assigned (e.g., administrative variables such as ID numbers), and derived or recoded (calculated from some combination of other variables). Key derived or recoded variables in this dataset are summarized below.

Household-level Derived Variables

  • Completion status
  • Home geographies (block group, zone)
  • Aggregate income (based on the initial and follow-up income questions)

Person-level Derived Variables

  • Completion status
  • Number of complete days
  • Work/school geographies (state, county, block group, zone)

Day-level Derived Variables

  • Completion status
  • Number of trips per day
  • Day completion status

Trip-level Derived Variables

  • Trip speed
  • Trip path distance (based on the GPS location data)
  • Trip origin and destination geographies (state, county, block group, PUMA)
  • Departure time (imputed in some cases)
  • Trip purpose (imputed in some cases)
  • Mode type and purpose categories

Imputation

Departure Time

In some cases, the rMove app may have detected the start of a trip after its true start time, which can yield invalid or extreme values for trip duration and speed. In these cases, the fields depart_date, depart_hour, and depart_minute were adjusted for “late pickup” conditions using the following approach:

  • Departure time was imputed using the median speed between all locations along the trip (excluding the origin point) and the distance between the origin and the next point on the trip. For trips with fewer than three recorded locations, the imputed departure time was set three minutes earlier than the original departure time to compensate for rMove’s 3-5-minute ping interval. Note that some trips created by splitting loop trips may have three or fewer points but use the departure time imputed before the loop trip was split, and thus may not follow this rule.
  • If the imputed departure time overlapped with the previous trip’s arrival time, the previous trip’s arrival time was used as the departure time instead.
  • Regardless of the number of locations along a trip, if the imputed departure time was later than the initially reported departure time, the original departure time was kept.
  • User-added trips and long-distance passenger mode trips also keep the original departure time. User-added trips are not subject to “late pickup” conditions, and long-distance passenger modes are often plane trips, where all collected traces contain speed information from other modes and are thus less reliable (rMove cannot collect locations when a phone is in “airplane mode”).

Duration and speed are calculated based on the imputed departure time.
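
The rules above can be sketched in code as follows. This is one possible reading of the description, not the study's actual implementation: the function name, the input layout, and the back-casting step (subtracting the first-leg travel time from the second point's timestamp) are assumptions.

```python
from datetime import datetime, timedelta
from statistics import median


def impute_departure(original_depart, points, prev_arrival=None, keep_original=False):
    """Return an adjusted departure time for a possible "late pickup" trip.

    points: list of (timestamp, meters_from_previous_point), origin first.
    keep_original: True for user-added or long-distance passenger mode trips.
    """
    if keep_original:
        return original_depart

    if len(points) < 3:
        # Too few locations to estimate a speed: shift back three minutes.
        imputed = original_depart - timedelta(minutes=3)
    else:
        # Speeds (m/s) on every segment that does not start at the origin.
        speeds = []
        for (t_prev, _), (t_curr, dist_m) in zip(points[1:], points[2:]):
            seconds = (t_curr - t_prev).total_seconds()
            if seconds > 0:
                speeds.append(dist_m / seconds)
        typical_speed = median(speeds) if speeds else 0.0
        if typical_speed <= 0:
            imputed = original_depart
        else:
            # Assumption: back-cast from the second point using the
            # origin-to-second-point distance and the median speed.
            first_leg_seconds = points[1][1] / typical_speed
            imputed = points[1][0] - timedelta(seconds=first_leg_seconds)

    # Never move the departure later than originally reported.
    imputed = min(imputed, original_depart)
    # Never overlap the previous trip's arrival.
    if prev_arrival is not None and imputed < prev_arrival:
        imputed = prev_arrival
    return imputed


# Hypothetical trip: a 900 m first leg recorded as one minute, then ~10 m/s.
day = datetime(2022, 5, 3)
original = day.replace(hour=8, minute=10)
points = [
    (day.replace(hour=8, minute=10), 0.0),
    (day.replace(hour=8, minute=11), 900.0),
    (day.replace(hour=8, minute=12), 600.0),
    (day.replace(hour=8, minute=13), 600.0),
    (day.replace(hour=8, minute=14), 620.0),
]
imputed = impute_departure(original, points)
clamped = impute_departure(original, points, prev_arrival=day.replace(hour=8, minute=9, second=45))
short_trip = impute_departure(original, points[:2])
```

In the example, the median speed after the origin is 10 m/s, so the 900 m first leg implies 90 seconds of travel and the departure moves from 08:10:00 to 08:09:30 unless the previous trip's arrival is later than that.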

Purpose

Respondents report the purpose of the trip destination in each trip survey. The origin purpose is derived from the destination purpose of the previous trip, except for the first trip in the travel period or where an rMove trip occurs after a trip with item non-response. For the first trip in the travel period, the origin purpose can be inferred from “begin_day” in the day table.
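
A minimal sketch of this derivation for one person's trips, in chronological order, is shown below. The field names o_purpose and d_purpose and the purpose labels are hypothetical; the 995 code follows the missing-value conventions in this guide.

```python
MISSING = 995  # categorical missing-value code used in this dataset


def derive_origin_purposes(trips, first_origin_purpose=MISSING):
    """Copy each trip's destination purpose to the next trip's origin purpose.

    trips: one person's trips in chronological order, each with "d_purpose".
    first_origin_purpose: origin purpose for the first trip, e.g. inferred
    from "begin_day" in the day table (left missing if unknown).
    """
    result = []
    previous_destination = first_origin_purpose
    for trip in trips:
        # If the previous trip had item non-response, the origin stays missing.
        result.append({**trip, "o_purpose": previous_destination})
        previous_destination = trip["d_purpose"]
    return result


trips = [
    {"trip_id": 1, "d_purpose": "work"},
    {"trip_id": 2, "d_purpose": MISSING},  # item non-response
    {"trip_id": 3, "d_purpose": "home"},
]
with_origins = derive_origin_purposes(trips, first_origin_purpose="home")
```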

When purpose was not asked because an analyst split a user-reported trip during data cleaning (creating a new destination along a trip), purpose values are derived where possible based on proximity (within 150 meters) to estimated home, work, or school locations. If the location is not proximate to home, work, or school locations, the purpose is set to “other.”
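
The proximity rule can be illustrated with a great-circle distance check. This is a sketch only: the haversine formula, the order in which locations are checked, the function names, and the coordinates are assumptions; the 150-meter threshold and the “other” fallback come from the text above.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius, in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two WGS84 lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def purpose_from_proximity(lat, lon, habitual_locations, threshold_m=150):
    """Return the first habitual location type within the threshold, else "other"."""
    for purpose, (h_lat, h_lon) in habitual_locations.items():
        if haversine_m(lat, lon, h_lat, h_lon) <= threshold_m:
            return purpose
    return "other"


# Hypothetical habitual locations for one person.
habitual = {
    "home": (36.7378, -119.7871),
    "work": (36.7500, -119.8000),
}
near_home = purpose_from_proximity(36.7380, -119.7872, habitual)
near_work = purpose_from_proximity(36.7510, -119.8000, habitual)
elsewhere = purpose_from_proximity(36.7600, -119.7000, habitual)
```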

The purpose category variables (o_purpose_category, d_purpose_category) contain aggregated purpose values based on the type of purpose at the origin/destination of each trip. Dataset users are welcome to perform their own recoding of the purpose categories as well.

Trip purposes have been imputed in cases where a purpose reported by the user is assumed to be inaccurate based on information about that person’s reported habitual locations and other trips (primarily to home, work, and school locations). The trip purpose imputation approach was applied to all rMove trips in person-days with at least 1 complete trip and no more than 10 incomplete trips. (“Incomplete” trips are trips for which the respondent did not answer the trip-specific survey questions about purpose, mode, etc. for the given trip.)
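
The eligibility rule for a person-day can be written as a simple check. The function name and input layout are hypothetical; the thresholds (at least 1 complete trip, no more than 10 incomplete trips) come from the paragraph above.

```python
def eligible_for_purpose_imputation(trip_is_complete):
    """Check whether a person-day qualifies for trip purpose imputation.

    trip_is_complete: list of booleans, one per rMove trip in the person-day.
    """
    complete = sum(1 for flag in trip_is_complete if flag)
    incomplete = len(trip_is_complete) - complete
    return complete >= 1 and incomplete <= 10


mixed_day = eligible_for_purpose_imputation([True, False, True])
no_complete_trips = eligible_for_purpose_imputation([False, False])
too_many_incomplete = eligible_for_purpose_imputation([True] + [False] * 11)
```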

The approach was to apply various “tests” in logical sequence to trips for which the stated purpose is not consistent with the location type based on the reported habitual locations. In general terms, the tests were designed to:

  • Check the respondent’s reported destination purpose when it conflicts with the destination location type. (The details of the tests depend on the trip purpose, with different criteria used for change-mode trips, escort trips, linked transit trips, trips with home destinations but other reported purposes, etc.)
  • Identify cases where respondents swapped the order of two or more trips when reporting their details.
  • Identify cases where respondents may have omitted a trip and shifted remaining reported trip details by one trip when reporting the rest of their trips.
  • Fill in missing data by sampling destination purposes from other trips made to the same locations, either by the same respondent or by other respondents.
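
The last step in the list above, sampling a purpose from other trips to the same location, might look like the following sketch. How the study defined "the same location" and how it weighted the sample are not stated here, so the location_key field, the preference for the respondent's own trips, and the uniform random draw are all assumptions.

```python
import random


def sample_purpose(location_key, person_id, reported_trips, rng):
    """Draw a destination purpose from reported trips to the same location.

    Prefers the same respondent's trips and falls back to other respondents.
    Returns None when no reported trip ends at that location.
    """
    same_place = [t for t in reported_trips if t["location_key"] == location_key]
    own = [t["d_purpose"] for t in same_place if t["person_id"] == person_id]
    others = [t["d_purpose"] for t in same_place if t["person_id"] != person_id]
    pool = own or others
    return rng.choice(pool) if pool else None


# Hypothetical trips with reported purposes.
reported = [
    {"person_id": 1, "location_key": "A", "d_purpose": "work"},
    {"person_id": 2, "location_key": "A", "d_purpose": "shop"},
    {"person_id": 2, "location_key": "B", "d_purpose": "school"},
]
rng = random.Random(0)  # seeded so the draw is reproducible
from_own_trips = sample_purpose("A", 1, reported, rng)
from_other_people = sample_purpose("B", 1, reported, rng)
no_donor = sample_purpose("C", 1, reported, rng)
```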

Mode type

Mode type (mode_type)

mode_type condenses mode_1 through mode_4 into a single, easier-to-use variable for analysis (so that data users do not need to reference every mode on a multimodal trip). The table below shows the full crosswalk from detailed modes to mode_type values. Higher values of mode_type take priority over lower values in the derivation. For example, transit trips (mode_type 13) take priority over walk trips (mode_type 1). When transit trips were unlinked using the Google API during cleaning, the non-transit legs of the trip were recoded using Google’s suggested mode (most frequently “walk” or “bike”) and therefore do not have a reported mode_1, mode_2, mode_3, or mode_4.
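
The priority rule (the highest mode_type wins) can be sketched as below. Only the values 1 for walk and 13 for transit come from the text; treating “bus” as a transit mode label, the mode labels themselves, and the function name are assumptions, and the crosswalk table remains the authoritative mapping.

```python
# Partial, illustrative crosswalk. Only walk -> 1 and transit -> 13 are
# stated in the text; the detailed mode labels here are hypothetical.
MODE_TO_TYPE = {
    "walk": 1,
    "bus": 13,  # assumed to be a transit mode
}


def derive_mode_type(modes, crosswalk):
    """Return the highest-priority mode_type across mode_1..mode_4.

    modes: reported detailed modes; None or unknown entries are skipped.
    Returns None when no reported mode maps to a mode_type.
    """
    mode_types = [crosswalk[m] for m in modes if m in crosswalk]
    return max(mode_types) if mode_types else None


walk_bus_walk = derive_mode_type(["walk", "bus", "walk", None], MODE_TO_TYPE)
walk_only = derive_mode_type(["walk", None, None, None], MODE_TO_TYPE)
nothing_reported = derive_mode_type([None, None, None, None], MODE_TO_TYPE)
```

A walk-bus-walk trip is therefore classified as transit (13), since 13 outranks 1.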