I'm stuck trying to subset columns, in their original order, from a very wide data frame: 1 row and a couple of thousand columns. The column names are repetitive, so they all get tagged with a suffix such as "...1", "...2", "...3" when the data is read in by R.

sample data:

dput(mydata)

structure(list(general_issue_code...1 = "MMM", general_issue_code_display...2 = "Medicare/Medicaid", 
    description...3 = "340B Drug Pricing Program; Medicare Shared Savings Program; Medicare Advantage; Alternative Payment Models; Graduate Medical Education;Centers for Medicare and Medicaid Innovation; Stark Law Reform/Modernization; Value-Based Purchasing Payments; Post-Acute and Home HealthCare Payments; Drug Pricing and Medicare Parts B and D; Medicare Regulatory Reform; Medicaid Funding; Health Care Innovation; Telehealth/Digital Health; Surprise\nBilling; Anti-Kickback Statute Reform; CREATES Act; COVID-19; Value in Health Care Act", 
    foreign_entity_issues...4 = "", id...5 = 71171L, first_name...6 = "JOYCE", 
    last_name...7 = "ROGERS", new...8 = FALSE, id...9 = 79365L, 
    first_name...10 = "THOMAS", last_name...11 = "MCDANIELS", 
    suffix...12 = "jr", suffix_display...13 = "JR", new...14 = FALSE, 
    id...15 = 58395L, first_name...16 = "MEGHAN", last_name...17 = "CLUNE WOLTMAN", 
    new...18 = TRUE, id...19 = 89715L, first_name...20 = "JILL", 
    last_name...21 = "DOWELL", new...22 = FALSE, id...23 = 93159L, 
    first_name...24 = "JOYCE", middle_name...25 = "A", last_name...26 = "ROGERS", 
    new...27 = FALSE, id...28 = 93160L, first_name...29 = "THOMAS", 
    last_name...30 = "MCDANIELS", new...31 = FALSE, id...32 = 93161L, 
    first_name...33 = "MEGHAN", last_name...34 = "WOLTMAN", new...35 = TRUE, 
    id...36 = 69176L, first_name...37 = "ANTHONY", last_name...38 = "CURRY", 
    new...39 = FALSE, id...40 = 93163L, first_name...41 = "JILL", 
    last_name...42 = "DOWELL - EFFECTIVE 2/9/21", new...43 = FALSE, 
    id...44 = 34L, name...45 = "Health & Human Services, Dept of (HHS)", 
    id...46 = 2L, name...47 = "HOUSE OF REPRESENTATIVES", id...48 = 1L, 
    name...49 = "SENATE", general_issue_code...50 = "HCR", general_issue_code_display...51 = "Health Issues", 
    description...52 = "340B Drug Pricing Program; Medicare Shared Savings Program; Medicare Advantage; Alternative Payment Models; Graduate Medical Education;Centers for Medicare and Medicaid Innovation; Stark Law Reform/Modernization; Value-Based Purchasing Payments; Post-Acute and Home HealthCare Payments; Drug Pricing and Medicare Parts B and D; Medicare Regulatory Reform; Medicaid Funding; Health Care Innovation; Telehealth/Digital Health; Surprise\nBilling; Anti-Kickback Statute Reform; CREATES Act; COVID-19; Value in Health Care Act", 
    foreign_entity_issues...53 = "", id...54 = 71171L, first_name...55 = "JOYCE", 
    last_name...56 = "ROGERS", new...57 = FALSE, id...58 = 79365L, 
    first_name...59 = "THOMAS", last_name...60 = "MCDANIELS", 
    suffix...61 = "jr", suffix_display...62 = "JR", new...63 = FALSE, 
    id...64 = 58395L, first_name...65 = "MEGHAN", last_name...66 = "CLUNE WOLTMAN", 
    new...67 = TRUE, id...68 = 89715L, first_name...69 = "JILL", 
    last_name...70 = "DOWELL", new...71 = FALSE, id...72 = 69176L, 
    first_name...73 = "ANTHONY", last_name...74 = "CURRY", new...75 = FALSE, 
    id...76 = 93159L, first_name...77 = "JOYCE", middle_name...78 = "A", 
    last_name...79 = "ROGERS", new...80 = FALSE, id...81 = 93160L, 
    first_name...82 = "THOMAS", last_name...83 = "MCDANIELS", 
    new...84 = FALSE, id...85 = 93161L, first_name...86 = "MEGHAN", 
    last_name...87 = "WOLTMAN", new...88 = TRUE, id...89 = 93163L, 
    first_name...90 = "JILL", last_name...91 = "DOWELL - EFFECTIVE 2/9/21", 
    new...92 = FALSE, id...93 = 34L, name...94 = "Health & Human Services, Dept of (HHS)", 
    id...95 = 2L, name...96 = "HOUSE OF REPRESENTATIVES", id...97 = 1L, 
    name...98 = "SENATE", general_issue_code...1.1 = "VET", general_issue_code_display...2.1 = "Veterans", 
    description...3.1 = "Implementation of MISSION Act (PL 115-182) and the problems with same as experienced by podiatric physicians and surgeons.", 
    foreign_entity_issues...4.1 = "", id...5.1 = 93368L, first_name...6.1 = "BENJAMIN", 
    last_name...7.1 = "WALLNER", new...8.1 = FALSE, id...9.1 = 53293L, 
    prefix...10 = "mr", prefix_display...11 = "MR.", first_name...12 = "BENJAMIN", 
    middle_name...13 = "J", last_name...14 = "WALLNER", new...15 = FALSE, 
    id...16 = 136L, name...17 = "Centers For Medicare and Medicaid Services (CMS)", 
    id...18 = 2L, name...19 = "HOUSE OF REPRESENTATIVES", id...20 = 1L, 
    name...21 = "SENATE", general_issue_code...22 = "EDU", general_issue_code_display...23 = "Education", 
    description...24 = "Medical student loan assistance; graduate medical education; higher education act reauthorization; student loan reform; student loan forgiveness for providers during public health emergency.", 
    foreign_entity_issues...25 = "", id...26 = 93368L, first_name...27 = "BENJAMIN", 
    last_name...28 = "WALLNER", new...29 = FALSE, id...30 = 53293L, 
    prefix...31 = "mr", prefix_display...32 = "MR.", first_name...33.1 = "BENJAMIN", 
    middle_name...34 = "J", last_name...35 = "WALLNER", new...36 = FALSE, 
    id...37 = 136L, name...38 = "Centers For Medicare and Medicaid Services (CMS)", 
    id...39 = 2L, name...40 = "HOUSE OF REPRESENTATIVES", id...41 = 1L, 
    name...42 = "SENATE", general_issue_code...43 = "HCR", general_issue_code_display...44 = "Health Issues", 
    description...45 = "Medicaid access to podiatry and program integrity; Medicare therapeutic shoe program for patients with diabetes; MACRA/MIPS implementation and Physician Fee Schedule proposed updates as pertains to physicians and providers. Seeking pandemic relief for providers.", 
    foreign_entity_issues...46 = "", id...47 = 93368L, first_name...48 = "BENJAMIN", 
    last_name...49 = "WALLNER", new...50 = FALSE, id...51 = 53293L, 
    prefix...52 = "mr", prefix_display...53 = "MR.", first_name...54 = "BENJAMIN", 
    middle_name...55 = "J", last_name...56.1 = "WALLNER", new...57.1 = FALSE, 
    id...58.1 = 136L, name...59 = "Centers For Medicare and Medicaid Services (CMS)", 
    id...60 = 2L, name...61 = "HOUSE OF REPRESENTATIVES", id...62 = 1L, 
    name...63 = "SENATE", general_issue_code...1.2 = "DEF", general_issue_code_display...2.2 = "Defense", 
    description...3.2 = "telemedicine", foreign_entity_issues...4.2 = "", 
    id...5.2 = 45748L, first_name...6.2 = "OLIVER", last_name...7.2 = "MEISSNER", 
    new...8.2 = FALSE, id...9.2 = 75684L, first_name...10.1 = "CHARLES", 
    last_name...11.1 = "PROSCH", new...12 = FALSE, id...13 = 93367L, 
    first_name...14 = "MICHAEL", last_name...15 = "TOROUNIAN", 
    new...16 = FALSE, id...17 = 80766L, first_name...18 = "BRANDON", 
    last_name...19 = "KIRBY", new...20 = TRUE, id...21 = 75303L, 
    first_name...22 = "MICHAEL", middle_name...23 = "WILLIAM", 
    last_name...24 = "TOROUNIAN", new...25 = FALSE, id...26.1 = 134L, 
    name...27 = "Centers For Disease Control & Prevention (CDC)", 
    id...28.1 = 25L, name...29 = "Defense, Dept of (DOD)", id...30.1 = 34L, 
    name...31 = "Health & Human Services, Dept of (HHS)", id...32.1 = 2L, 
    name...33 = "HOUSE OF REPRESENTATIVES", id...34 = 1L, name...35 = "SENATE", 
    id...36.1 = 39L, name...37 = "State, Dept of (DOS)", id...38 = 12L, 
    name...39 = "White House Office", general_issue_code...40 = "MMM", 
    general_issue_code_display...41 = "Medicare/Medicaid", description...42 = "telemedcine", 
    foreign_entity_issues...43 = "", id...44.1 = 93367L, first_name...45 = "MICHAEL", 
    last_name...46 = "TOROUNIAN", new...47 = FALSE, id...48.1 = 45748L, 
    first_name...49 = "OLIVER", last_name...50 = "MEISSNER", 
    new...51 = FALSE, id...52 = 75303L, first_name...53 = "MICHAEL", 
    middle_name...54 = "WILLIAM", last_name...55 = "TOROUNIAN", 
    new...56 = FALSE, id...57 = 75684L, first_name...58 = "CHARLES", 
    last_name...59 = "PROSCH", new...60 = FALSE, id...61 = 80766L, 
    first_name...62 = "BRANDON", last_name...63 = "KIRBY", new...64 = TRUE, 
    id...65 = 134L, name...66 = "Centers For Disease Control & Prevention (CDC)", 
    id...67 = 25L, name...68 = "Defense, Dept of (DOD)", id...69 = 34L, 
    name...70 = "Health & Human Services, Dept of (HHS)", id...71 = 2L, 
    name...72 = "HOUSE OF REPRESENTATIVES", id...73 = 1L, name...74 = "SENATE", 
    id...75 = 39L, name...76 = "State, Dept of (DOS)", id...77 = 12L, 
    name...78 = "White House Office", general_issue_code...1.3 = "HCR", 
    general_issue_code_display...2.3 = "Health Issues", description...3.3 = "H.R. 1439 The Expanded Genetic Screening Act of 2021: \"To amend title XIX of the Social Security Act to provide for coverage under the Medicaid program of non-invasive prenatal genetic screening\"", 
    foreign_entity_issues...4.3 = "", id...5.3 = 45748L, first_name...6.3 = "OLIVER", 
    last_name...7.3 = "MEISSNER", new...8.3 = FALSE, id...9.3 = 75684L, 
    first_name...10.2 = "CHARLES", last_name...11.2 = "PROSCH", 
    new...12.1 = FALSE, id...13.1 = 80766L, first_name...14.1 = "BRANDON", 
    last_name...15.1 = "KIRBY", new...16.1 = TRUE, id...17.1 = 93404L, 
    first_name...18.1 = "MICHAEL", last_name...19.1 = "TOROUNIAN", 
    new...20.1 = FALSE, id...21.1 = 75303L, first_name...22.1 = "MICHAEL", 
    middle_name...23.1 = "WILLIAM", last_name...24.1 = "TOROUNIAN", 
    new...25.1 = FALSE, id...26.2 = 34L, name...27.1 = "Health & Human Services, Dept of (HHS)", 
    id...28.2 = 2L, name...29.1 = "HOUSE OF REPRESENTATIVES", 
    id...30.2 = 1L, name...31.1 = "SENATE", general_issue_code...32 = "MMM", 
    general_issue_code_display...33 = "Medicare/Medicaid", description...34 = "H.R. 1439 The Expanded Genetic Screening Act of 2021: \"To amend title XIX of the Social Security Act to provide for coverage under the Medicaid program of non-invasive prenatal genetic screening\"", 
    foreign_entity_issues...35 = "", id...36.2 = 45748L, first_name...37.1 = "OLIVER", 
    last_name...38.1 = "MEISSNER", new...39.1 = FALSE, id...40.1 = 75684L, 
    first_name...41.1 = "CHARLES", last_name...42.1 = "PROSCH", 
    new...43.1 = FALSE, id...44.2 = 80766L, first_name...45.1 = "BRANDON", 
    last_name...46.1 = "KIRBY", new...47.1 = FALSE, id...48.2 = 93404L, 
    first_name...49.1 = "MICHAEL", last_name...50.1 = "TOROUNIAN", 
    new...51.1 = FALSE, id...52.1 = 75303L, first_name...53.1 = "MICHAEL", 
    middle_name...54.1 = "WILLIAM", last_name...55.1 = "TOROUNIAN", 
    new...56.1 = FALSE, id...57.1 = 34L, name...58 = "Health & Human Services, Dept of (HHS)", 
    id...59 = 2L, name...60 = "HOUSE OF REPRESENTATIVES", id...61.1 = 1L, 
    name...62 = "SENATE", general_issue_code = "VET", general_issue_code_display = "Veterans", 
    description = "Veteran Healthcare, Veteran Benefits, Military Quality of Life Programs.", 
    foreign_entity_issues = "", id...5.4 = 73532L, first_name...6.4 = "PAT", 
    last_name...7.4 = "MURRAY", new...8.4 = FALSE, id...9.4 = 87133L, 
    first_name...10.3 = "PATRICK", last_name...11.3 = "MURRAY", 
    new...12.2 = FALSE, id...13.2 = 25L, name...14 = "Defense, Dept of (DOD)", 
    id...15.1 = 2L, name...16 = "HOUSE OF REPRESENTATIVES", id...17.2 = 38L, 
    name...18 = "Labor, Dept of (DOL)", id...19.1 = 1L, name...20 = "SENATE", 
    id...21.2 = 90L, name...22 = "Small Business Administration (SBA)", 
    id...23.1 = 42L, name...24 = "Veterans Affairs, Dept of (VA)", 
    id...25 = 170L, name...26 = "Veterans Employment & Training Service", 
    general_issue_code.1 = "TEC", general_issue_code_display.1 = "Telecommunications", 
    description.1 = "Issues include the USA Telecommunications Act and overall spectrum and technology policy, with a focus on 5G.", 
    foreign_entity_issues.1 = "", id...5.5 = 93561L, first_name...6.5 = "DAVID", 
    last_name...7.5 = "MURRAY", new...8.5 = FALSE, id...9.5 = 79245L, 
    first_name...10.4 = "DAVID", middle_name = "THOMAS", last_name...12 = "MURRAY", 
    new...13 = FALSE, id...14 = 2L, name...15 = "HOUSE OF REPRESENTATIVES", 
    id...16.1 = 1L, name...17.1 = "SENATE", general_issue_code.2 = "TRA", 
    general_issue_code_display.2 = "Transportation", description.2 = "Agricultural and transportation issues", 
    foreign_entity_issues.2 = "", id...5.6 = 64951L, first_name = "BRITTON", 
    last_name = "CLARKE", covered_position = "Deputy Chief of Staff and press secretary for Congressman Henry Brown (SC-1) from 2001- 2002", 
    new = FALSE, id...10 = 23L, name...11 = "Agriculture, Dept of (USDA)", 
    id...12 = 2L, name...13 = "HOUSE OF REPRESENTATIVES", id...14.1 = 1L, 
    name...15.1 = "SENATE", general_issue_code.3 = "TRA", general_issue_code_display.3 = "Transportation", 
    description.3 = "Transportation issues", foreign_entity_issues.3 = "", 
    id...5.7 = 64951L, first_name.1 = "BRITTON", last_name.1 = "CLARKE", 
    covered_position.1 = "Deputy Chief of Staff and press secretary for Congressman Henry Brown (SC-1) from 2001- 2002", 
    new.1 = FALSE, id...10.1 = 201L, name...11.1 = "Homeland Security, Dept of (DHS)", 
    id...12.1 = 2L, name...13.1 = "HOUSE OF REPRESENTATIVES", 
    id...14.2 = 1L, name...15.2 = "SENATE", general_issue_code.4 = "TRA", 
    general_issue_code_display.4 = "Transportation", description.4 = "Transportation issues", 
    foreign_entity_issues.4 = "", id...5.8 = 64951L, first_name.2 = "BRITTON", 
    last_name.2 = "CLARKE", covered_position.2 = "Deputy Chief of Staff and press secretary for Congressman Henry Brown (SC-1) from 2001- 2002", 
    new.2 = FALSE, id...10.2 = 2L, name...11.2 = "HOUSE OF REPRESENTATIVES", 
    id...12.2 = 1L, name...13.2 = "SENATE", general_issue_code...1.4 = "TRD", 
    general_issue_code_display...2.4 = "Trade (domestic/foreign)", 
    description...3.4 = "Trade, border relations, port operations", 
    foreign_entity_issues...4.4 = "", id...5.9 = 64951L, first_name...6.6 = "BRITTON", 
    last_name...7.6 = "CLARKE", covered_position.3 = "Deputy Chief of Staff and press secretary for Congressman Henry Brown (SC-1) from 2001- 2002", 
    new...9 = FALSE, id...10.3 = 24L, name...11.3 = "Commerce, Dept of (DOC)", 
    id...12.3 = 2L, name...13.3 = "HOUSE OF REPRESENTATIVES", 
    id...14.3 = 1L, name...15.3 = "SENATE", id...16.2 = 39L, 
    name...17.2 = "State, Dept of (DOS)", general_issue_code...18 = "TRA", 
    general_issue_code_display...19 = "Transportation", description...20 = "Trade, border travel, ports of entry", 
    foreign_entity_issues...21 = "", id...22 = 64951L, first_name...23 = "BRITTON", 
    last_name...24.2 = "CLARKE", new...25.2 = TRUE, id...26.3 = 24L, 
    name...27.2 = "Commerce, Dept of (DOC)", id...28.3 = 2L, 
    name...29.2 = "HOUSE OF REPRESENTATIVES", id...30.3 = 1L, 
    name...31.2 = "SENATE", id...32.2 = 39L, name...33.1 = "State, Dept of (DOS)", 
    general_issue_code...1.5 = "TRD", general_issue_code_display...2.5 = "Trade (domestic/foreign)", 
    description...3.5 = "International trade", foreign_entity_issues...4.5 = "", 
    id...5.10 = 64951L, first_name...6.7 = "BRITTON", last_name...7.7 = "CLARKE", 
    covered_position...8 = "Deputy Chief of Staff and press secretary for Congressman Henry Brown (SC-1) from 2001- 2002", 
    new...9.1 = FALSE, id...10.4 = 2L, name...11.4 = "HOUSE OF REPRESENTATIVES", 
    id...12.4 = 1L, name...13.4 = "SENATE", general_issue_code...14 = "TRA", 
    general_issue_code_display...15 = "Transportation", description...16 = "Transportation issues; border travel", 
    foreign_entity_issues...17 = "", id...18.1 = 64951L, first_name...19 = "BRITTON", 
    last_name...20 = "CLARKE", covered_position...21 = "Deputy Chief of Staff and press secretary for Congressman Henry Brown (SC-1) from 2001- 2002", 
    new...22.1 = FALSE, id...23.2 = 2L, name...24.1 = "HOUSE OF REPRESENTATIVES", 
    id...25.1 = 1L, name...26.1 = "SENATE", general_issue_code.5 = "TRA", 
    general_issue_code_display.5 = "Transportation", description.5 = "Speed limiters", 
    foreign_entity_issues.5 = "", id...5.11 = 64951L, first_name.3 = "BRITTON", 
    last_name.3 = "CLARKE", covered_position.4 = "Deputy Chief of Staff and press secretary for Congressman Henry Brown (SC-1) from 2001- 2002", 
    new.3 = FALSE, id...10.5 = 2L, name...11.5 = "HOUSE OF REPRESENTATIVES", 
    id...12.5 = 1L, name...13.5 = "SENATE", general_issue_code.6 = "TRA", 
    general_issue_code_display.6 = "Transportation", description.6 = "Transportation issues", 
    foreign_entity_issues.6 = "", id...5.12 = 64951L, first_name.4 = "BRITTON", 
    last_name.4 = "CLARKE", covered_position.5 = "Deputy Chief of Staff and press secretary for Congressman Henry Brown (SC-1) from 2001- 2002", 
    new.4 = FALSE, id...10.6 = 2L, name...11.6 = "HOUSE OF REPRESENTATIVES", 
    id...12.6 = 1L, name...13.6 = "SENATE", general_issue_code.7 = "TRA", 
    general_issue_code_display.7 = "Transportation", description.7 = "Transportation issues", 
    foreign_entity_issues.7 = "", id...5.13 = 64951L, first_name.5 = "BRITTON", 
    last_name.5 = "CLARKE", covered_position.6 = "Deputy Chief of Staff and press secretary for Congressman Henry Brown (SC-1) from 2001- 2002", 
    new.5 = FALSE, id...10.7 = 2L, name...11.7 = "HOUSE OF REPRESENTATIVES", 
    id...12.7 = 1L, name...13.7 = "SENATE", id...14.4 = 40L, 
    name...15.4 = "Transportation, Dept of (DOT)", general_issue_code.8 = "TRA", 
    general_issue_code_display.8 = "Transportation", description.8 = "Transportation technologies", 
    foreign_entity_issues.8 = "", id...5.14 = 64951L, first_name.6 = "BRITTON", 
    last_name.6 = "CLARKE", covered_position.7 = "Deputy Chief of Staff and press secretary for Congressman Henry Brown (SC-1) from 2001- 2002", 
    new.6 = FALSE, id...10.8 = 2L, name...11.8 = "HOUSE OF REPRESENTATIVES", 
    id...12.8 = 1L, name...13.8 = "SENATE"), row.names = 1L, class = "data.frame")
> dput(q<-q[1,1:200])
structure(list(general_issue_code...1 = "MMM", general_issue_code_display...2 = "Medicare/Medicaid", 
    description...3 = "340B Drug Pricing Program; Medicare Shared Savings Program; Medicare Advantage; Alternative Payment Models; Graduate Medical Education;Centers for Medicare and Medicaid Innovation; Stark Law Reform/Modernization; Value-Based Purchasing Payments; Post-Acute and Home HealthCare Payments; Drug Pricing and Medicare Parts B and D; Medicare Regulatory Reform; Medicaid Funding; Health Care Innovation; Telehealth/Digital Health; Surprise\nBilling; Anti-Kickback Statute Reform; CREATES Act; COVID-19; Value in Health Care Act", 
    foreign_entity_issues...4 = "", id...5 = 71171L, first_name...6 = "JOYCE", 
    last_name...7 = "ROGERS", new...8 = FALSE, id...9 = 79365L, 
    first_name...10 = "THOMAS", last_name...11 = "MCDANIELS", 
    suffix...12 = "jr", suffix_display...13 = "JR", new...14 = FALSE, 
    id...15 = 58395L, first_name...16 = "MEGHAN", last_name...17 = "CLUNE WOLTMAN", 
    new...18 = TRUE, id...19 = 89715L, first_name...20 = "JILL", 
    last_name...21 = "DOWELL", new...22 = FALSE, id...23 = 93159L, 
    first_name...24 = "JOYCE", middle_name...25 = "A", last_name...26 = "ROGERS", 
    new...27 = FALSE, id...28 = 93160L, first_name...29 = "THOMAS", 
    last_name...30 = "MCDANIELS", new...31 = FALSE, id...32 = 93161L, 
    first_name...33 = "MEGHAN", last_name...34 = "WOLTMAN", new...35 = TRUE, 
    id...36 = 69176L, first_name...37 = "ANTHONY", last_name...38 = "CURRY", 
    new...39 = FALSE, id...40 = 93163L, first_name...41 = "JILL", 
    last_name...42 = "DOWELL - EFFECTIVE 2/9/21", new...43 = FALSE, 
    id...44 = 34L, name...45 = "Health & Human Services, Dept of (HHS)", 
    id...46 = 2L, name...47 = "HOUSE OF REPRESENTATIVES", id...48 = 1L, 
    name...49 = "SENATE", general_issue_code...50 = "HCR", general_issue_code_display...51 = "Health Issues", 
    description...52 = "340B Drug Pricing Program; Medicare Shared Savings Program; Medicare Advantage; Alternative Payment Models; Graduate Medical Education;Centers for Medicare and Medicaid Innovation; Stark Law Reform/Modernization; Value-Based Purchasing Payments; Post-Acute and Home HealthCare Payments; Drug Pricing and Medicare Parts B and D; Medicare Regulatory Reform; Medicaid Funding; Health Care Innovation; Telehealth/Digital Health; Surprise\nBilling; Anti-Kickback Statute Reform; CREATES Act; COVID-19; Value in Health Care Act", 
    foreign_entity_issues...53 = "", id...54 = 71171L, first_name...55 = "JOYCE", 
    last_name...56 = "ROGERS", new...57 = FALSE, id...58 = 79365L, 
    first_name...59 = "THOMAS", last_name...60 = "MCDANIELS", 
    suffix...61 = "jr", suffix_display...62 = "JR", new...63 = FALSE, 
    id...64 = 58395L, first_name...65 = "MEGHAN", last_name...66 = "CLUNE WOLTMAN", 
    new...67 = TRUE, id...68 = 89715L, first_name...69 = "JILL", 
    last_name...70 = "DOWELL", new...71 = FALSE, id...72 = 69176L, 
    first_name...73 = "ANTHONY", last_name...74 = "CURRY", new...75 = FALSE, 
    id...76 = 93159L, first_name...77 = "JOYCE", middle_name...78 = "A", 
    last_name...79 = "ROGERS", new...80 = FALSE, id...81 = 93160L, 
    first_name...82 = "THOMAS", last_name...83 = "MCDANIELS", 
    new...84 = FALSE, id...85 = 93161L, first_name...86 = "MEGHAN", 
    last_name...87 = "WOLTMAN", new...88 = TRUE, id...89 = 93163L, 
    first_name...90 = "JILL", last_name...91 = "DOWELL - EFFECTIVE 2/9/21", 
    new...92 = FALSE, id...93 = 34L, name...94 = "Health & Human Services, Dept of (HHS)", 
    id...95 = 2L, name...96 = "HOUSE OF REPRESENTATIVES", id...97 = 1L, 
    name...98 = "SENATE", general_issue_code...1.1 = "VET", general_issue_code_display...2.1 = "Veterans", 
    description...3.1 = "Implementation of MISSION Act (PL 115-182) and the problems with same as experienced by podiatric physicians and surgeons.", 
    foreign_entity_issues...4.1 = "", id...5.1 = 93368L, first_name...6.1 = "BENJAMIN", 
    last_name...7.1 = "WALLNER", new...8.1 = FALSE, id...9.1 = 53293L, 
    prefix...10 = "mr", prefix_display...11 = "MR.", first_name...12 = "BENJAMIN", 
    middle_name...13 = "J", last_name...14 = "WALLNER", new...15 = FALSE, 
    id...16 = 136L, name...17 = "Centers For Medicare and Medicaid Services (CMS)", 
    id...18 = 2L, name...19 = "HOUSE OF REPRESENTATIVES", id...20 = 1L, 
    name...21 = "SENATE", general_issue_code...22 = "EDU", general_issue_code_display...23 = "Education", 
    description...24 = "Medical student loan assistance; graduate medical education; higher education act reauthorization; student loan reform; student loan forgiveness for providers during public health emergency.", 
    foreign_entity_issues...25 = "", id...26 = 93368L, first_name...27 = "BENJAMIN", 
    last_name...28 = "WALLNER", new...29 = FALSE, id...30 = 53293L, 
    prefix...31 = "mr", prefix_display...32 = "MR.", first_name...33.1 = "BENJAMIN", 
    middle_name...34 = "J", last_name...35 = "WALLNER", new...36 = FALSE, 
    id...37 = 136L, name...38 = "Centers For Medicare and Medicaid Services (CMS)", 
    id...39 = 2L, name...40 = "HOUSE OF REPRESENTATIVES", id...41 = 1L, 
    name...42 = "SENATE", general_issue_code...43 = "HCR", general_issue_code_display...44 = "Health Issues", 
    description...45 = "Medicaid access to podiatry and program integrity; Medicare therapeutic shoe program for patients with diabetes; MACRA/MIPS implementation and Physician Fee Schedule proposed updates as pertains to physicians and providers. Seeking pandemic relief for providers.", 
    foreign_entity_issues...46 = "", id...47 = 93368L, first_name...48 = "BENJAMIN", 
    last_name...49 = "WALLNER", new...50 = FALSE, id...51 = 53293L, 
    prefix...52 = "mr", prefix_display...53 = "MR.", first_name...54 = "BENJAMIN", 
    middle_name...55 = "J", last_name...56.1 = "WALLNER", new...57.1 = FALSE, 
    id...58.1 = 136L, name...59 = "Centers For Medicare and Medicaid Services (CMS)", 
    id...60 = 2L, name...61 = "HOUSE OF REPRESENTATIVES", id...62 = 1L, 
    name...63 = "SENATE", general_issue_code...1.2 = "DEF", general_issue_code_display...2.2 = "Defense", 
    description...3.2 = "telemedicine", foreign_entity_issues...4.2 = "", 
    id...5.2 = 45748L, first_name...6.2 = "OLIVER", last_name...7.2 = "MEISSNER", 
    new...8.2 = FALSE, id...9.2 = 75684L, first_name...10.1 = "CHARLES", 
    last_name...11.1 = "PROSCH", new...12 = FALSE, id...13 = 93367L, 
    first_name...14 = "MICHAEL", last_name...15 = "TOROUNIAN", 
    new...16 = FALSE, id...17 = 80766L, first_name...18 = "BRANDON", 
    last_name...19 = "KIRBY", new...20 = TRUE, id...21 = 75303L, 
    first_name...22 = "MICHAEL", middle_name...23 = "WILLIAM", 
    last_name...24 = "TOROUNIAN", new...25 = FALSE, id...26.1 = 134L, 
    name...27 = "Centers For Disease Control & Prevention (CDC)", 
    id...28.1 = 25L, name...29 = "Defense, Dept of (DOD)", id...30.1 = 34L, 
    name...31 = "Health & Human Services, Dept of (HHS)", id...32.1 = 2L, 
    name...33 = "HOUSE OF REPRESENTATIVES", id...34 = 1L, name...35 = "SENATE", 
    id...36.1 = 39L, name...37 = "State, Dept of (DOS)", id...38 = 12L, 
    name...39 = "White House Office"), row.names = 1L, class = "data.frame")

As you can see, the columns repeat in segments. Each segment is basically one person: their first name, last name, and possibly a title. Then the next person's segment begins.

What I want is to subset the columns along these segments, so that I can then pivot the table and clean this mess up.

library(dplyr)

# Each starts_with() selects its whole group of columns before the next one runs
cv <- mydata %>%
  select(
    starts_with("first_name"),
    starts_with("last_name"),
    starts_with("covered_position")
  )

This was my original attempt. It pulls every column whose name starts with first_name, last_name or covered_position. The problem is that it lumps all the first_name columns together, then all the last_name columns, then all the covered_position columns.

That way I lose the segments. I want to extract the 3 kinds of columns in their original order:

for example:

"first_name...5", "last_name...5", "covered_position...1",
"first_name...9", "last_name...8", "covered_position...2"
...

Then I would join the 3 columns of each segment together into a single string, so that I can pivot the data to long form and clean it up.
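A minimal sketch of one way to keep the original column order, assuming it is enough to match all three prefixes with a single regular expression. The data frame `toy` and its values are made up for illustration and only imitate the layout of `mydata`:

```r
# Toy one-row data frame imitating the layout in the question (values are made up).
toy <- data.frame(
  id...1               = 1L,
  first_name...2       = "ANN",
  last_name...3        = "LEE",
  id...4               = 2L,
  first_name...5       = "BOB",
  last_name...6        = "RAY",
  covered_position...7 = "Staffer",
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# One pattern for all three prefixes: grep() returns the matching positions in
# ascending order, so the columns come out in their original order.
keep <- grep("^(first_name|last_name|covered_position)", names(toy))
cv <- toy[, keep, drop = FALSE]

names(cv)
# "first_name...2" "last_name...3" "first_name...5" "last_name...6" "covered_position...7"
```

With dplyr, `select(mydata, matches("^(first_name|last_name|covered_position)"))` should behave the same way, because a single selection helper returns its matches in data order. Note that this only preserves the order; it does not by itself say which covered_position belongs to which person.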

Thank you

ML33M

1 Answer

This is my attempt. First, however, note that the numbers of column names matching "^first_name", "^last_name" and "^covered_position" are not equal, possibly because the sample data is limited:

library(purrr)
library(dplyr)  # needed for rename_with() used below

list(grep("^first_name",       colnames(mydata)),
     grep("^last_name",        colnames(mydata)),
     grep("^covered_position", colnames(mydata))) |>
purrr::map_dbl(length)

[1] 59 59 10

Therefore, by considering only the first 10 matches:

purrr::pmap(list(grep("^first_name",       colnames(mydata))[1:10],
                 grep("^last_name",        colnames(mydata))[1:10],
                 grep("^covered_position", colnames(mydata))[1:10]),
            c) |>
    purrr::map(~mydata[,.x]) |>
    purrr::map_dfr(~rename_with(.x, ~gsub("\\..*$", "", .x)))


   first_name                 last_name
1       JOYCE                    ROGERS
2      THOMAS                 MCDANIELS
3      MEGHAN             CLUNE WOLTMAN
4        JILL                    DOWELL
5       JOYCE                    ROGERS
6      THOMAS                 MCDANIELS
7      MEGHAN                   WOLTMAN
8     ANTHONY                     CURRY
9        JILL DOWELL - EFFECTIVE 2/9/21
10      JOYCE                    ROGERS
                                                                               covered_position
1  Deputy Chief of Staff and press secretary for Congressman Henry Brown (SC-1) from 2001- 2002
2  Deputy Chief of Staff and press secretary for Congressman Henry Brown (SC-1) from 2001- 2002
3  Deputy Chief of Staff and press secretary for Congressman Henry Brown (SC-1) from 2001- 2002
4  Deputy Chief of Staff and press secretary for Congressman Henry Brown (SC-1) from 2001- 2002
5  Deputy Chief of Staff and press secretary for Congressman Henry Brown (SC-1) from 2001- 2002
6  Deputy Chief of Staff and press secretary for Congressman Henry Brown (SC-1) from 2001- 2002
7  Deputy Chief of Staff and press secretary for Congressman Henry Brown (SC-1) from 2001- 2002
8  Deputy Chief of Staff and press secretary for Congressman Henry Brown (SC-1) from 2001- 2002
9  Deputy Chief of Staff and press secretary for Congressman Henry Brown (SC-1) from 2001- 2002
10 Deputy Chief of Staff and press secretary for Congressman Henry Brown (SC-1) from 2001- 2002

Stefano Barbi
  • Thank you for the code. May I ask a favour and have you explain it a bit? First, I guess `|>` works like `%>%`? – ML33M Feb 11 '22 at 12:21
  • Next, I applied the code to the whole data set. The first step returned [1] 982 980 345, so I used 1:345 to subset the data. Now the real problem: the result differs from yours. For example, JOYCE ROGERS has only NA in the covered_position column, and so does everyone else in the data you used. Elsewhere the names are NA, but there is "Deputy Chief of Staff" under covered_position. I think the NAs are messing something up – ML33M Feb 11 '22 at 12:28
  • @ML33M `|>` is the new pipe operator that comes with R 4.1 and differs from the `magrittr` `%>%`, because it only accepts function calls as RHS and has no pronoun `.` – Stefano Barbi Feb 11 '22 at 13:09
  • Cheers man, I just found out about `|>` too. Cool addition. I'm still puzzled by the output. I think the NAs in the data might be throwing us off. Even in the test data you used, those 10 people do not all have covered positions, yet all of them came out with the same role/title – ML33M Feb 11 '22 at 13:13
  • @ML33M As for your data, I think you should split the sequence of key/value pairs (column header, value) into records by finding a suitable separator pattern, because most of the records do not have the same fields. I saw a recurring `id`, `first_name` sequence which could delimit the start of a record, but I cannot guarantee it. – Stefano Barbi Feb 11 '22 at 13:13
  • I would be happy to get the id, first_name and covered_position out too. How can I split them into records? – ML33M Feb 11 '22 at 13:26
  • @ML33M I would start by importing your original txt in the purest form possible. No column name mangling etc. Can you attach a sample of your txt in the question? – Stefano Barbi Feb 11 '22 at 13:35
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/241931/discussion-between-sbarbit-and-ml33m). – Stefano Barbi Feb 11 '22 at 13:46
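The record-splitting idea from the comments could be sketched roughly as follows. This is only an illustration under stated assumptions, not the answerer's method. It assumes that every person or agency block starts with an `id` column and every issue block starts with `general_issue_code`, which holds in the sample shown but may not hold in the full data. The data frame `toy` and its values are made up:

```r
# Toy one-row data frame imitating the layout in the question (values are made up).
toy <- data.frame(
  general_issue_code...1 = "TRA",
  id...2                 = 10L,
  first_name...3         = "ANN",
  last_name...4          = "LEE",
  new...5                = FALSE,
  id...6                 = 11L,
  first_name...7         = "BOB",
  last_name...8          = "RAY",
  covered_position...9   = "Staffer",
  new...10               = TRUE,
  id...11                = 2L,
  name...12              = "SENATE",
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Strip the de-duplication suffix ("...7", ".1", ...) to recover the field names.
field <- sub("\\..*$", "", names(toy))
value <- unname(vapply(toy, function(col) as.character(col[1]), character(1)))

# Assumption: a new record begins at every "id" or "general_issue_code" column.
record <- cumsum(field %in% c("id", "general_issue_code"))
long <- data.frame(record = record, field = field, value = value,
                   stringsAsFactors = FALSE)

# Keep only person records, i.e. those that contain a first_name field.
person_ids <- unique(long$record[long$field == "first_name"])
people_long <- long[long$record %in% person_ids, ]

# One row per person; a field that a person lacks becomes NA.
wanted <- c("id", "first_name", "last_name", "covered_position")
people <- do.call(rbind, lapply(split(people_long, people_long$record), function(r) {
  as.data.frame(as.list(setNames(r$value[match(wanted, r$field)], wanted)),
                stringsAsFactors = FALSE)
}))
rownames(people) <- NULL
people
```

Here `people` has one row for ANN LEE with `covered_position` set to NA and one row for BOB RAY with "Staffer". Because the fields are grouped by record before reshaping, a title stays attached to the person it belongs to instead of being paired by position.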