I have a plain-text file, inflation.txt, that looks something like this:

1950-1 0.0084490544865279
1950-2 −0.0050487986543660
1950-3 0.0038461526886055
1950-4 0.0214293914558992
1951-1 0.0232839389540449
1951-2 0.0299121323429455
1951-3 0.0379293285389640
1951-4 0.0212773984472849

From a previous Stack Overflow post, I learned how to import this file into R:

data <- read.table("inflation.txt", sep = "" , header = F ,
                   na.strings ="", stringsAsFactors= F, encoding = "UTF-8")

However, this code reads the second column as character. When I try to convert that column to numeric, all negative values are replaced with NA:

 b=as.numeric(data$V2)

Warning message:
In base::as.numeric(x) : NAs introduced by coercion

> head(b)
[1] 0.008449054          NA 0.003846153 0.021429391 0.023283939 0.029912132

Can someone please show me what I am doing wrong? Is it possible to save the inflation.txt file as a data.frame?

desertnaut
stats_noob

2 Answers

I would read the file using a space as the separator, then split the first column into separate year and quarter columns in R:

data <- read.table("inflation.txt", sep = " ", header = FALSE,
                   na.strings = "", stringsAsFactors = FALSE, encoding = "UTF-8")
names(data) <- c("yq", "vals")
# Text before the hyphen is the year; text after it is the quarter.
data$year <- as.numeric(sub("-.*$", "", data$yq))
data$quarter <- as.numeric(sub("^\\d+-", "", data$yq))
data <- data[, c("year", "quarter", "vals")]
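
As a quick check of the two regular expressions used above, here is a minimal sketch run on a small hand-made vector. The vector and the variable names (period, year, quarter) are hypothetical and only stand in for the first column of the file.

```r
# Hypothetical sample of the first column ("year-quarter" labels).
period <- c("1950-1", "1950-4", "1951-2")

# Drop the hyphen and everything after it to keep the year.
year <- as.numeric(sub("-.*$", "", period))

# Drop the leading digits and the hyphen to keep the quarter.
quarter <- as.numeric(sub("^\\d+-", "", period))

print(data.frame(period, year, quarter))
```

This relies on the labels using an ordinary ASCII hyphen between year and quarter, as they appear to in the sample.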
Tim Biegeleisen

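The issue is that the "−" in your data is not the ASCII hyphen-minus ("-") that R recognizes as a negative sign. It is a different, look-alike character (it appears to be the Unicode minus sign, U+2212), so R cannot parse those values as numbers and reads the column as character.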
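
To check which character a file actually contains, it can help to compare code points. The following is a minimal sketch, assuming the sign in the data is the Unicode minus sign (U+2212), which is what it appears to be here; the variable names are illustrative only.

```r
# Assumption: the sign in the file is U+2212 (Unicode MINUS SIGN).
unicode_minus <- "\u2212"
ascii_minus   <- "-"

# The two characters look alike but have different code points.
code_unicode <- utf8ToInt(unicode_minus)  # 8722 (0x2212)
code_ascii   <- utf8ToInt(ascii_minus)    # 45   (0x2D)

# as.numeric() parses only the ASCII form; the other yields NA with a warning.
parsed_ok  <- as.numeric(paste0(ascii_minus, "0.005"))
parsed_bad <- suppressWarnings(as.numeric(paste0(unicode_minus, "0.005")))

print(c(code_unicode, code_ascii))
print(c(parsed_ok, parsed_bad))
```

Running utf8ToInt() on a value copied from the real column shows whether the same character is involved.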

You have two options.

  1. Open the file in any text editor and replace every "−" with an ordinary hyphen-minus ("-"). After that, read.table works directly.
data <- read.table("inflation.txt")
  2. If you can't change the original file, replace the characters with sub after reading the data into R.
data$V2 <- as.numeric(sub('−', '-', data$V2, fixed = TRUE))
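
The second option can be tried end to end without touching the original file. This sketch is only an illustration: it feeds three of the sample rows through read.table's text argument instead of reading inflation.txt, assumes the sign is U+2212 and that R is running in a UTF-8 locale, and uses column names (period, inflation) that are not in the original.

```r
# Assumption: negative values in the file carry U+2212 instead of "-".
unicode_minus <- "\u2212"

# Three sample rows, rebuilt in memory rather than read from inflation.txt.
sample_lines <- c(
  "1950-1 0.0084490544865279",
  paste0("1950-2 ", unicode_minus, "0.0050487986543660"),
  "1950-3 0.0038461526886055"
)

# Hypothetical column names; the original code keeps the defaults V1 and V2.
raw <- read.table(text = sample_lines, header = FALSE,
                  col.names = c("period", "inflation"),
                  stringsAsFactors = FALSE)

# One unparseable value is enough for the whole column to stay character.
class_before <- class(raw$inflation)

# Swap the Unicode sign for the ASCII one, then convert.
raw$inflation <- as.numeric(sub(unicode_minus, "-", raw$inflation, fixed = TRUE))

print(class_before)
str(raw)
```

The result is already a data.frame, since that is what read.table returns.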
Ronak Shah