
I have a database that contains IDs such as:

        ID
000000000000000000000PBC_1164321
00000000000000000000000RP_395954
00000000000000000000000MOP_395954
00000000000000000000000395954

I want to get only the data after the leading 0s, something like this:

ID
PBC_1164321
RP_395954
MOP_395954
395954

I am testing it on https://regexr.com. I tried `^(0*)`, but it selects all the leading 0s. I want the opposite: to select not the leading 0s but everything after them. Kindly help.

I am not using any language such as Python or R. I just want to use regex to select/match everything apart from the leading 0s.

SierraOscar
Kshitij Yadav
  • 2
    Try with `sub`. You can specify the start of the string (`^`) followed by 0 or more 0's and replace it with `""`: `sub("^0*", "", df1$ID)` – akrun Apr 08 '19 at 15:25
  • Can you try that on the link I used in the question? It doesn't recognize "sub". – Kshitij Yadav Apr 08 '19 at 15:26
  • There you need `^0*`. `sub` is an `R` function. – akrun Apr 08 '19 at 15:26
  • @akrun the `^0*` only selects the 0s. I want the opposite of that: select everything after the 0s. – Kshitij Yadav Apr 08 '19 at 15:29
  • 1
    You can try the pattern `"^0*"` at your link---all that link does is the regex matching. You might want to do a lot of things with regex matches--extract them, extract the whole string if there is a match, return a logical if there is match, replace the match, etc. `sub` is an R function that replaces (substitutes) the match. A nice way to remove 0s is to replace the 0s with nothing `""`. That is what akrun proposes. – Gregor Thomas Apr 08 '19 at 15:29
  • 1
    Kshitij, did you try it? `sub("^0*", "", "00000000000000000000000MOP_395954")` returns `"MOP_395954"`, exactly what you were requesting. – r2evans Apr 08 '19 at 15:30
  • How can I do this without using any Python or R functions such as "sub" and "Replace"? Can I use only regex for the whole thing? Is it possible? – Kshitij Yadav Apr 08 '19 at 15:30
  • Regex just matches. It doesn't *do* anything with the match. You want to do something, so you need more than just regex. – Gregor Thomas Apr 08 '19 at 15:31
  • 2
    `([^0].*)` should work for you @KshitijYadav – codelessbugging Apr 08 '19 at 15:32
  • @Gregor that's fine. How can I match anything except the leading 0s? That is actually all I want. – Kshitij Yadav Apr 08 '19 at 15:32
  • Kshitij, okay, fair question. Where are you trying to do this? Without R or Python or another programming language, you can do it in `notepad` or on a piece of paper ... which is obviously sarcasm, because I don't know more context. – r2evans Apr 08 '19 at 15:32
  • @CodelessBugging that's excellent. Please post this as an answer; I will upvote and close this question. – Kshitij Yadav Apr 08 '19 at 15:34
  • 1
    @Toto, please read the rest of the comments before closing this as a dupe. The OP specifically asked about doing it without python/R, and while it is not clear to me what that means, it does suggest that more details are needed before knowing that this is a duplicate. – r2evans Apr 08 '19 at 15:34
  • 1
    @KshitijYadav cool, glad I could help, but seems like I can't answer until the question is no longer flagged as a dupe – codelessbugging Apr 08 '19 at 15:35
  • @CodelessBugging I don't think it's a duplicate question. How can I un-flag this? Any ideas? – Kshitij Yadav Apr 08 '19 at 15:37
  • 1
    OK, done. But the comment that they don't use R or Python was said after I closed – Toto Apr 08 '19 at 15:40
  • @Toto Sure. Thank you. – Kshitij Yadav Apr 08 '19 at 15:41
  • @CodelessBugging please write the answer :) – Kshitij Yadav Apr 08 '19 at 15:41
  • 1
    @Toto, thanks for re-opening. For the record, though: *"without using any python or R fucntions"* at 15:30:49Z, your close was at 15:32:26Z. Perhaps it took a minute or so for it to register on the page ... yeah, that happens to me a lot. – r2evans Apr 08 '19 at 15:46

2 Answers


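You can match the text after the leading zeros using `([^0].*)`. `[^0]` matches the first character that is not a 0, and `.*` then matches the rest of the line.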
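As a rough illustration only: the question does not use a programming language, so Python is an assumption here, chosen just to show what the pattern matches when each ID is tested on its own. The helper name `strip_leading_zeros` is hypothetical.

```python
import re

# Sample IDs copied from the question.
ids = [
    "000000000000000000000PBC_1164321",
    "00000000000000000000000RP_395954",
    "00000000000000000000000MOP_395954",
    "00000000000000000000000395954",
]

# The pattern from this answer: first non-0 character, then the rest of the line.
pattern = re.compile(r"[^0].*")

def strip_leading_zeros(value):
    """Return the part of `value` matched by the pattern (hypothetical helper)."""
    match = pattern.search(value)
    # A string made only of 0s has no non-0 character, so there is no match.
    return match.group(0) if match else ""

results = [strip_leading_zeros(v) for v in ids]
print(results)
```

Note that a value consisting only of 0s produces no match at all, so the sketch returns an empty string in that case.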

codelessbugging

Use `/[A-Z1-9].*/gi`. `[A-Z1-9]` matches any letter or any digit from 1 to 9, which for these IDs is the first character that is not a 0. The dot matches any character, and the star repeats it zero or more times. The `g` and `i` flags make the match global and case-insensitive.

I initially attempted a solution similar to @Codelessbugging's but was unable to get the site to process it correctly.
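One possible explanation for the difference, sketched in Python as an assumption (the original test was done on the regex site, not in code): when several IDs are pasted as separate lines and matched globally, a line break is itself a character that is not a 0, so `[^0].*` can start a match at the line break and take the next line's zeros with it. The third pattern, `[^0\n].*`, is a hypothetical variant that is not from either answer.

```python
import re

# Shortened multi-line sample in the same shape as the question's data.
text = "000PBC_1164321\n000RP_395954\n000395954"

# Pattern from the other answer: the newline counts as "not a 0".
negated = re.findall(r"[^0].*", text)

# Pattern from this answer: only letters and the digits 1-9 can start a match.
class_based = re.findall(r"[A-Z1-9].*", text, flags=re.IGNORECASE)

# Hypothetical variant: also exclude the newline from the first character.
negated_no_newline = re.findall(r"[^0\n].*", text)

print(negated)
print(class_based)
print(negated_no_newline)
```

If the tool feeds the pattern one row at a time, there are no line breaks in the input and both patterns behave the same on these IDs.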

Link to this solution
Link to CodelessBugging solution

CT Hall
  • I think both codeless's answer and yours are right. Since the software I am using interprets one row at a time, it was able to parse the IDs with what codeless presented. – Kshitij Yadav Apr 08 '19 at 15:55