-2

I have the following string:

1,2,3,a,b,c,a,b,c,1,2,3,c,b,a,2,3,1,

I would like to get only the first occurrence of each character (digit or letter), without changing the order. The result should be:

1,2,3,a,b,c,

With this regex (found at https://stackoverflow.com/a/29480898/9307482) I can find the unique characters, but it matches only the last occurrence of each one, which changes the order.

(\w)(?!.*?\1) (https://regex101.com/r/3fqpu9/1)

It doesn't matter if the regex ignores the comma. The order is important.
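
To illustrate the problem, here is a minimal sketch of what the pattern above matches. JavaScript is used only as an assumption for demonstration (the question does not name a language), and the variable names are illustrative.

```javascript
// The input string from the question.
const input = '1,2,3,a,b,c,a,b,c,1,2,3,c,b,a,2,3,1,';

// (\w)(?!.*?\1) matches a word character only if the same character
// does NOT appear again later in the string, i.e. its last occurrence.
const lastOccurrences = input.match(/(\w)(?!.*?\1)/g);

console.log(lastOccurrences.join(',')); // c,b,a,2,3,1
```

The matches come from the tail of the string, so the order differs from the desired 1,2,3,a,b,c.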

Toto
Massimo

3 Answers

3

Regular expressions are not meant for that purpose. You will need to use an index filter or a Set on an array of characters.

Since you haven't specified a language, I assume you are using JavaScript.

Example modified from: https://stackoverflow.com/a/14438954/1456201

String.prototype.uniqueChars = function() {
    return [...new Set(this)];
}

var unique = "1,2,3,a,b,c,a,b,c,1,2,3,c,b,a,2,3,1,".split(",").join('').uniqueChars();
console.log(unique); // Array(6) [ "1", "2", "3", "a", "b", "c" ]
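
The "index filter" alternative mentioned above could look roughly like the sketch below. `firstOccurrencesByIndex` is a hypothetical helper name, not part of the original answer, and it avoids extending `String.prototype`.

```javascript
// Hypothetical helper: returns the distinct characters of `text`
// in order of first appearance, skipping the separator.
function firstOccurrencesByIndex(text, separator = ',') {
  return [...text]
    .filter((ch) => ch !== separator)
    // keep a character only at the position where it first appears
    .filter((ch, i, chars) => chars.indexOf(ch) === i);
}

console.log(firstOccurrencesByIndex('1,2,3,a,b,c,a,b,c,1,2,3,c,b,a,2,3,1,'));
// [ '1', '2', '3', 'a', 'b', 'c' ]
```

`indexOf` rescans the array for every element, so this is quadratic in the input length. For long strings the Set version is likely the better choice.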
Jim
  • That finds all unique chars, right? – The fourth bird Jan 11 '20 at 15:45
  • Yes, I updated it to show the example provided in the question. – Jim Jan 11 '20 at 15:50
  • 1
    `Array.from` is not necessary, as `this` is iterable. – trincot Jan 11 '20 at 15:50
  • I think it is about all the first unique characters. – The fourth bird Jan 11 '20 at 15:51
  • @Thefourthbird not sure what you mean – Jim Jan 11 '20 at 15:53
  • The pattern from the OP already finds the unique characters, but it finds the last occurrences. I think the question is about matching the first occurrences instead. – The fourth bird Jan 11 '20 at 15:55
  • The answer does that. A Set keeps the first occurrence of a character and ignores any later occurrences of the same character. It returns the exact result expected in the question. Even though the script does not handle order explicitly, it starts from the beginning and only includes unique characters from the beginning of the string, as is expected in the question. – Jim Jan 11 '20 at 16:01
  • 1
    @Jim I was not aware of the internal workings of a Set. In that case, if the language is JavaScript it is brilliant ;) +1 – The fourth bird Jan 11 '20 at 16:20
  • You can think of it like an object where each property (key) is a character from the string. Starting from the beginning of the string, set each character as an object property. When you overwrite an existing property, nothing new is added; the existing property is simply set to the value it already had. Then just output the property names and you get only the unique characters, in first-in, first-out order. – Jim Jan 11 '20 at 18:51
  • Thank you. This is a problem in AutoHotkey. I thought it was possible with a simple RegExMatch, so I didn't explicitly mention the language. Thank you for the answer. – Massimo Jan 11 '20 at 19:11
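
The object analogy in the comments above explains uniqueness well. As a hedged side note, first-seen order is something a Set provides but a plain object does not when the keys look like integers. A minimal sketch, with made-up sample data:

```javascript
const chars = ['b', '2', 'a', '1', 'b', '2'];

// A Set remembers the order in which values were first added.
const fromSet = [...new Set(chars)];

// A plain object lists integer-like keys first, in ascending numeric order,
// then the remaining string keys in insertion order.
const obj = {};
for (const ch of chars) obj[ch] = true;
const fromObject = Object.keys(obj);

console.log(fromSet);    // [ 'b', '2', 'a', '1' ]
console.log(fromObject); // [ '1', '2', 'b', 'a' ]
```

For the string in the question both approaches happen to agree, because the digits appear first and already in ascending order.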
0

I would use something like this:

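// each index represents one digit (0-9); the value stored there
// is the position of that digit's first occurrence in the string
const digits = new Array(10);

// make your string an array of characters
const arr = '123abcabc123cba231'.split('');

// test for a single digit
const reg = /^[0-9]$/;

arr.forEach((val, index) => {
  // record the position only the first time a digit is seen
  // (checking for undefined also works once positions reach 10 or more)
  if (reg.test(val) && digits[val] === undefined) {
    digits[val] = index;
  }
});

console.log(`occurrences: ${digits}`); // occurrences: ,0,1,2,,,,,,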

To interpret the digits array: index 0 is empty, so the digit 0 never occurs in the string. Index 1 holds 0, so the first 1 appears at the first character of the string (string index 0). The first 2 appears at string index 1, and so on. Note that this only tracks digits; letters are ignored.
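
The same idea can be extended to letters by keying on the character itself instead of using a fixed 10-slot array. The following is a sketch under that assumption; `firstIndexes` is a hypothetical helper name.

```javascript
// Hypothetical helper: maps each character to the index of its first occurrence.
function firstIndexes(text) {
  const firstSeen = new Map();
  [...text].forEach((ch, index) => {
    // only the first sighting of a character is recorded
    if (!firstSeen.has(ch)) firstSeen.set(ch, index);
  });
  return firstSeen;
}

const positions = firstIndexes('123abcabc123cba231');
console.log([...positions.keys()].join(',')); // 1,2,3,a,b,c
console.log(positions.get('a'));              // 3
```

A Map iterates its keys in insertion order, so the keys alone already give the first occurrences in their original order.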

ram
0

A Perl way to do the job:

use Modern::Perl;

my $in = '4,d,e,1,2,3,4,a,b,c,d,e,f,a,b,c,1,2,3,c,b,a,2,3,1,';
my (%h, @r);
for (split ',', $in) {
    # keep a value only the first time it is seen
    push @r, $_ unless exists $h{$_};
    $h{$_} = 1;
}
say join ',', @r;

Output:

4,d,e,1,2,3,a,b,c,f
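
For comparison, a rough JavaScript counterpart of the Perl loop above might look like this. It is only a sketch: `uniqueTokens` is a hypothetical name, and unlike Perl's `split` it has to skip the empty field produced by the trailing comma explicitly.

```javascript
// Hypothetical helper mirroring the Perl loop: split on commas, keep each
// token the first time it is seen, and join the result back together.
function uniqueTokens(csv) {
  const seen = new Set();
  const result = [];
  for (const token of csv.split(',')) {
    // skip empty fields (e.g. after the trailing comma) and repeated tokens
    if (token === '' || seen.has(token)) continue;
    seen.add(token);
    result.push(token);
  }
  return result.join(',');
}

console.log(uniqueTokens('4,d,e,1,2,3,4,a,b,c,d,e,f,a,b,c,1,2,3,c,b,a,2,3,1,'));
// 4,d,e,1,2,3,a,b,c,f
```

Because it works on comma-separated tokens rather than single characters, this version also handles multi-character values such as `10` or `ab`.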
Toto