I'm having trouble excluding one specific country's phone numbers from a column. The problem is that the numbers are not all in the same format: some countries have a 3-digit prefix (e.g. "001") and others have a 4-digit prefix (e.g. "0098"). Sample:

00989121234567
009809121234567
989121234567
9121234567
09121234567   

First I need to convert all of those formats into a single format, and then exclude those numbers from the column. The output phone numbers must be in this format:

"989121234567"
MrFlick
Techmod
  • Welcome to stackoverflow! Your question is unclear, please read and edit your question according to [How to make a great R reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) so that other users could help you. Also, add expected output. – pogibas Feb 01 '19 at 15:28
  • If you know the final numbers of digits, you could try taking the last x digits with `substring` to have the right count – Clemsang Feb 01 '19 at 15:51
  • @Clemsang I've already tried `unq5 <- data.frame(substr(unq1$mobile, 3, 16))`, but it returns the phone numbers of all countries and also removes the country code from numbers in this format: "989121234567" – Techmod Feb 02 '19 at 01:54

2 Answers

1

You can use `startsWith` and `substr` (`gsub` would work as well) for this. First, though, you need a vector of prefixes:

# variables
country_codes <- c('1', '98')
prefix <- union(country_codes, paste0('00', country_codes))
numbers <- c('00989121234567','009809121234567','989121234567','9121234567','09121234567')

# get rid of prefix
new_numbers <- character(length(numbers))
for (k in seq_along(prefix)) {
  ind <- startsWith(numbers, prefix[k])
  new_numbers[ind] <- substr(numbers[ind], nchar(prefix[k]) + 1, nchar(numbers[ind]))
}
new_numbers[new_numbers == ""] <- numbers[new_numbers == ""]
# results
new_numbers
# [1] "9121234567"  "09121234567" "9121234567"  "9121234567"  "09121234567"

You can then add further country codes (e.g. 44, 31), and you could also add `paste0('+', country_codes)` to `prefix` to handle numbers of the form +1xxxx.
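
The question asks for the "989121234567" format rather than the bare national number, so here is a minimal sketch of that variant. It is not part of the code above: `normalise_ir` is a hypothetical helper, and it assumes that the target country code is 98, that national mobile numbers are 10 digits starting with 9, and that a number written without any country code belongs to that country. The last sample number is made up to show a non-matching case.

```r
# Hypothetical helper (an assumption, not from the answer above):
# rewrite every accepted spelling as "98" + 10-digit national number.
normalise_ir <- function(x) {
  x <- trimws(x)  # drop stray leading/trailing spaces
  # optional "+" or "00", optional "98", optional trunk "0",
  # then a 10-digit national number starting with 9
  sub("^(\\+|00)?(98)?0?(9[0-9]{9})$", "98\\3", x)
}

numbers <- c("00989121234567", "009809121234567", "989121234567",
             "9121234567", "09121234567   ", "0014155550123")

normalised <- normalise_ir(numbers)

# TRUE where the number ended up in the target format
is_ir <- grepl("^989[0-9]{9}$", normalised)

normalised[is_ir]   # the target country's numbers, in one format
numbers[!is_ir]     # the column with those numbers excluded
```

Anything that does not match the pattern is returned unchanged by `sub`, so numbers from other countries pass through untouched and end up in `numbers[!is_ir]`.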

niko
  • What happened when the number starts with a potential prefix like '189121234567' ? – Clemsang Feb 04 '19 at 08:10
  • It'll return `89121234567` if `1` is stored as a prefix. For such cases one needs exception handling or having prefixes starting with `00` or `+` or something like that. – niko Feb 04 '19 at 08:27
  • Thanks, your solution is great but it is indeed the remaining exception – Clemsang Feb 04 '19 at 08:29
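
One way to act on the suggestion in the comments is to strip a country code only when it follows an explicit international prefix. The following is a rough sketch under that assumption; the `pattern` variable and the sample numbers are illustrative and not taken from the answer.

```r
country_codes <- c("1", "98")

# One pattern: "+" or "00", followed by any of the known country codes.
pattern <- paste0("^(\\+|00)(", paste(country_codes, collapse = "|"), ")")

numbers <- c("00989121234567", "+14155550123", "189121234567")

# sub() leaves strings that do not match the pattern unchanged
stripped <- sub(pattern, "", numbers)
stripped
```

The trade-off is that a number written as a bare "989121234567" is no longer stripped either, so that spelling would need a separate rule, for example a check on the total number of digits.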
0

If you define the vector that contains the telephone numbers as numeric, the leading zeros are removed, and you are then free to remove the numbers that you don't want.

Using the numbers provided:

nr <- c(00989121234567,009809121234567,989121234567,9121234567,09121234567)
nr
[1] 9.891212e+11 9.809121e+12 9.891212e+11 9.121235e+09 9.121235e+09


subset(nr,!grepl("^98",nr))
[1] 9121234567 9121234567

EDIT: I see you added the requirement of returning a character vector. You can just use the as.character() function for that on the final vector.
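
A small sketch of that last step, assuming the column starts out as character data (the names `raw`, `kept` and `kept_chr` are made up for illustration, as is the last sample number). `as.character()` can switch to scientific notation for some values, such as round numbers, so `sprintf("%.0f", ...)` is used here to always write out the digits.

```r
raw <- c("00989121234567", "009809121234567", "989121234567",
         "9121234567", "09121234567", "0010000000000")

nr <- as.numeric(raw)  # leading zeros disappear in the conversion

# Match on a fixed-notation string so the "^98" test sees plain digits
kept <- nr[!grepl("^98", sprintf("%.0f", nr))]

# Back to character without scientific notation
kept_chr <- sprintf("%.0f", kept)
kept_chr
```

Doubles represent integers exactly up to 2^53, so phone numbers of up to 15 digits survive the round trip unchanged.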