I have two csv files:
old file:
name size_bytes
air unknown
data/air/monitor
data/air/monitor/ambient-air-quality-oil-sands-region
data/air/monitor/ambient-air-quality-oil-sands-region/ecosystem-sites-speciated-mercury-preliminary-data-oil-sands-region
data/air/monitor/ambient-air-quality-oil-sands-region/ecosystem-sites-speciated-mercury-preliminary-data-oil-sands-region/2017-04-02
data/air/monitor/ambient-air-quality-oil-sands-region/ecosystem-sites-speciated-mercury-preliminary-data-oil-sands-region/2017-04-02/EN
data/air/monitor/ambient-air-quality-oil-sands-region/ecosystem-sites-speciated-mercury-preliminary-data-oil-sands-region/2017-04-02/EN/datapackage.json 886
data/air/monitor/ambient-air-quality-oil-sands-region/ecosystem-sites-speciated-mercury-preliminary-data-oil-sands-region/2017-04-02/EN/digest.txt 186
data/air/monitor/ambient-air-quality-oil-sands-region/ecosystem-sites-speciated-mercury-preliminary-data-oil-sands-region/2017-04-02/EN/JOSM_AMS13_SpecHg_AB_2017-04-02_EN.pdf 9033
data/air/monitor/ambient-air-quality-oil-sands-region/ecosystem-sites-speciated-mercury-preliminary-data-oil-sands-region/2017-04-02/FR
data/air/monitor/ambient-air-quality-oil-sands-region/ecosystem-sites-speciated-mercury-preliminary-data-oil-sands-region/2017-04-02/FR/datapackage.json 886
data/air/monitor/ambient-air-quality-oil-sands-region/ecosystem-sites-speciated-mercury-preliminary-data-oil-sands-region/2017-04-02/FR/digest.txt 186
...
new file:
name size_bytes
data 0
data/air 0
data/air/monitor 0
data/air/monitor/ambient-air-quality-oil-sands-region 0
data/air/monitor/ambient-air-quality-oil-sands-region/96c679c3-709e-4a42-89c6-09f09f2b7ffe.xml 65589
data/air/monitor/ambient-air-quality-oil-sands-region/datapackage.json 13152367
data/air/monitor/ambient-air-quality-oil-sands-region/digest.txt 188
data/air/monitor/ambient-air-quality-oil-sands-region/ecosystem-sites-speciated-mercury-preliminary-data-oil-sands-region 0
data/air/monitor/ambient-air-quality-oil-sands-region/ecosystem-sites-speciated-mercury-preliminary-data-oil-sands-region/2017-04-02 0
data/air/monitor/ambient-air-quality-oil-sands-region/ecosystem-sites-speciated-mercury-preliminary-data-oil-sands-region/2017-04-02/FR 0
data/air/monitor/ambient-air-quality-oil-sands-region/ecosystem-sites-speciated-mercury-preliminary-data-oil-sands-region/2017-04-02/FR/JOSM_AMS13_SpecHg_AB_2017-04-02_FR.pdf 9186
data/air/monitor/ambient-air-quality-oil-sands-region/ecosystem-sites-speciated-mercury-preliminary-data-oil-sands-region/2017-04-02/digest.txt 82
data/air/monitor/ambient-air-quality-oil-sands-region/ecosystem-sites-speciated-mercury-preliminary-data-oil-sands-region/2017-04-09 0
...
I want to compare the names from the "old file" to the names in the "new file" and get any missing names (folder or file paths).
Right now I have this:
with open('old_file.csv', 'r') as old_file:
old = set(row.split(',')[0].strip().lower() for row in old_file)
with open('new_file.csv','r') as new_file, open('compare.csv', 'w') as compare_files:
for line in new_file:
if line.split(',')[0].strip().lower() not in old:
compare_files.write(line)
This runs but the output is not correct, it prints out names that ARE in both files. Here is the output:
data 0
data/air 0
data/air/monitor/deposition-oil-sands-region/the-monitored-ambient-concentration-and-estimated-atmospheric-deposition-of-trace-elements-at-four-monitoring-sites-in-the-canadian-athabasca-oil-sands-region 0
data/air/monitor/deposition-oil-sands-region/the-monitored-ambient-concentration-and-estimated-atmospheric-deposition-of-trace-elements-at-four-monitoring-sites-in-the-canadian-athabasca-oil-sands-region/ElementConcentrationPM25_OSM_AMS-sites_2016-2017.csv 736737
data/air/monitor/deposition-oil-sands-region/the-monitored-ambient-concentration-and-estimated-atmospheric-deposition-of-trace-elements-at-four-monitoring-sites-in-the-canadian-athabasca-oil-sands-region/ElementConcentrationPM25to10_OSM_AMS-sites_2016-2017.csv 227513
data/air/monitor/deposition-oil-sands-region/the-monitored-ambient-concentration-and-estimated-atmospheric-deposition-of-trace-elements-at-four-monitoring-sites-in-the-canadian-athabasca-oil-sands-region/ElementFlux_OSM_AMS-sites_2016-2017.csv 691252
data/air/monitor/deposition-oil-sands-region/the-monitored-ambient-concentration-and-estimated-atmospheric-deposition-of-trace-elements-at-four-monitoring-sites-in-the-canadian-athabasca-oil-sands-region/ffeae500-ea0c-493f-9b24-5efbd16411fd.xml 41399
data/air/monitor/monitoring-of-atmospheric-precipitation-chemistry/major-ions/AtmosphericPrecipitationChemistry-MajorIons-APQMP-AllSites-2019.csv 169109
data/air/monitor/monitoring-of-atmospheric-precipitation-chemistry/major-ions/AtmosphericPrecipitationChemistry-MajorIons-APQMP-AllSites-2020.csv 150205
data/air/monitor/monitoring-of-atmospheric-precipitation-chemistry/major-ions/AtmosphericPrecipitationChemistry-MajorIons-CAPMoN-AllSites-2017.csv 4343972
data/air/monitor/monitoring-of-atmospheric-precipitation-chemistry/major-ions/AtmosphericPrecipitationChemistry-MajorIons-CAPMoN-AllSites-2018.csv 3782783
data/air/monitor/monitoring-of-combined-atmospheric-gases-and-particles/major-ions-and-acidifying-gases 0
data/air/monitor/monitoring-of-combined-atmospheric-gases-and-particles/major-ions-and-acidifying-gases/AtmosphericCombinedGasesParticles-FilterPack-CAPMoN-AllSites-2012.csv 1826690
data/air/monitor/monitoring-of-combined-atmospheric-gases-and-particles/major-ions-and-acidifying-gases/AtmosphericCombinedGasesParticles-FilterPack-CAPMoN-AllSites-2013.csv 1890761
data/air/monitor/monitoring-of-combined-atmospheric-gases-and-particles/major-ions-and-acidifying-gases/AtmosphericCombinedGasesParticles-FilterPack-CAPMoN-AllSites-2014.csv 1946788
data/air/monitor/monitoring-of-combined-atmospheric-gases-and-particles/major-ions-and-acidifying-gases/AtmosphericCombinedGasesParticles-FilterPack-CAPMoN-AllSites-2015.csv 2186536
data/air/monitor/monitoring-of-combined-atmospheric-gases-and-particles/major-ions-and-acidifying-gases/AtmosphericCombinedGasesParticles-FilterPack-CAPMoN-AllSites-2016.csv 2434692
data/air/monitor/monitoring-of-combined-atmospheric-gases-and-particles/major-ions-and-acidifying-gases/AtmosphericCombinedGasesParticles-FilterPack-CAPMoN-AllSites-2017.csv 2150499
data/air/monitor/monitoring-of-combined-atmospheric-gases-and-particles/major-ions-and-acidifying-gases/AtmosphericCombinedGasesParticles-FilterPack-CAPMoN-AllSites-2018.csv 2136853
...
Is there something wrong with my code? Is there a better way to do this? Maybe using pandas?