-1

I have a string that can contain any characters, and I want the text between the first occurrence of one character and the last occurrence of another.

Example string:

"x{1{hello}}y}z"

I want the string between the first { and the last }

Expected result:

"1{hello}}y"
Adrian Rosca
  • 4
    Have you looked into using Regular Expressions for this? – ADyson Jun 24 '20 at 08:44
  • @ADyson It was not clear in the question, but I need this to be very performant. RegEx is not very fast. I will clarify the question. – Adrian Rosca Jun 24 '20 at 08:47
  • 1
    @AdrianRosca Some regex, if well formed and compiled, can be quite performant. We need a benchmark to really compare, but I'd suggest 1) to not rule out regex by principle, 2) provide some info about your performance requirements (size of input in number of lines / characters, and execution speed objective as order of magnitude) – Pac0 Jun 24 '20 at 08:50
  • @Pac0 Good point. 1) I kinda always rule out regex when I need performance. 2) The strings are ~40 KB and I need to process ~100-500 per second. – Adrian Rosca Jun 24 '20 at 08:55
  • @PavelAnikhouski I'm not sure RegEx and the 2 input strings will be performant enough. I can limit my case to 2 input chars. – Adrian Rosca Jun 24 '20 at 08:57
  • @AdrianRosca The mentioned duplicate has 3 different answers – Pavel Anikhouski Jun 24 '20 at 08:59
  • @AdrianRosca see my answer for benchmarks :) – Falco Alexander Jun 24 '20 at 10:12

4 Answers

5
string s = "asdasd{1{hello}}y}zsd";
int first = s.IndexOf('{') + 1;
int last = s.LastIndexOf('}');
// classic: Substring takes (startIndex, length)
var expectSubstring = s.Substring(first, last - first);
// C# 8 range syntax: the end index is exclusive
var expectRange = s[first..last];

Edit: added the C# 8 range syntax for slicing the string.
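
Note that both forms assume the string contains a { followed later by a }. If the { is missing, IndexOf returns -1 and the result silently starts at index 0; if the } is missing or comes first, the length becomes negative and Substring throws. A minimal sketch of a guarded version follows; the class name TextBetween and the method name Between are hypothetical and not part of the original answer:

```csharp
using System;

public static class TextBetween
{
    // Hypothetical helper: returns the text between the first `open`
    // and the last `close`, or null when no such pair exists.
    public static string Between(string s, char open, char close)
    {
        int first = s.IndexOf(open);
        if (first < 0) return null;       // no opening character

        int last = s.LastIndexOf(close);
        if (last <= first) return null;   // no closing character after it

        // Substring takes (startIndex, length).
        return s.Substring(first + 1, last - first - 1);
    }
}
```

For example, TextBetween.Between("x{1{hello}}y}z", '{', '}') returns "1{hello}}y", while an input without braces returns null. Returning null rather than throwing is a design choice made for this sketch; throwing may fit better if a missing delimiter indicates corrupt input.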

Bonus: judging from the comments on the question, the author is concerned about performance in high-throughput scenarios, so I was curious and ran some benchmarks.

Results with the BenchmarkDotNet run strategy set to Throughput:

Job=FastAndDirtyJob  IterationCount=2  LaunchCount=2  
RunStrategy=Throughput  WarmupCount=1  

|                  Method |            Mean |         Error |       StdDev |
|------------------------ |----------------:|--------------:|-------------:|
|           WithSubstring |        22.82 ns |      6.173 ns |     0.955 ns |
|               WithRange |        22.91 ns |      4.382 ns |     0.678 ns |
|               WithRegEx |     5,153.74 ns |    112.028 ns |    17.336 ns |
|       WithRegExCompiled | 2,903,170.70 ns | 32,123.041 ns | 4,971.076 ns |
|         WithRegExInited |     1,967.75 ns |    196.775 ns |    30.451 ns |
| WithRegExCompiledInited |       848.55 ns |    110.135 ns |    17.043 ns |

Throughput is the default RunStrategy and works perfectly for microbenchmarking. It automatically chooses the number of operations in the main iterations based on a set of pilot iterations. The number of iterations is also chosen automatically based on the accuracy job settings. A benchmark method should have a steady state.

and the code:

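using System.Text.RegularExpressions;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Engines;   // RunStrategy
using BenchmarkDotNet.Running;   // BenchmarkRunner

// LINQPad-style entry point; in a console project, put this in a static Main.
void Main()
{
    var summary = BenchmarkRunner.Run<SubstringBenchmarks>();
}

[SimpleJob(RunStrategy.Monitoring, launchCount: 5, warmupCount: 5, targetCount: 5, id: "FastAndDirtyJob")]
public class SubstringBenchmarks
{
    string s = "aasd{1{hello}}y}zsd";
    // Lookbehind for '{', greedy match, lookahead for '}'.
    string reg = @"(?<={).*(?=})";
    Regex rc;

    public SubstringBenchmarks()
    {
        // Compiled once, outside the measured methods.
        rc = new Regex(reg, RegexOptions.Compiled);
    }

    [Benchmark]
    public string WithSubstring()
    {
        int first = s.IndexOf('{') + 1;
        int last = s.LastIndexOf('}');
        return s.Substring(first, last - first);
    }

    [Benchmark]
    public string WithRange()
    {
        int first = s.IndexOf('{') + 1;
        int last = s.LastIndexOf('}');
        return s[first..last];
    }

    [Benchmark]
    public string WithRegEx()
    {
        // Constructs an interpreted regex on every call.
        var r = new Regex(reg);
        return r.Match(s).Value;
    }

    [Benchmark]
    public string WithRegExCompiled()
    {
        // Constructs and compiles the regex on every call,
        // so the measurement is dominated by compilation.
        var r = new Regex(reg, RegexOptions.Compiled);
        return r.Match(s).Value;
    }

    [Benchmark]
    public string WithRegExCompiledInited()
    {
        // Reuses the regex compiled in the constructor.
        return rc.Match(s).Value;
    }
}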
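
One caveat on the regex variants: by default, . does not match a newline, so on multi-line input the pattern above cannot match across line breaks and may return something different from the IndexOf version. A small sketch of the difference using RegexOptions.Singleline follows; the class and method names are hypothetical, and the multi-line sample input is an assumption for illustration, not taken from the question:

```csharp
using System.Text.RegularExpressions;

public static class RegexBetween
{
    // Same pattern as in the benchmark code.
    private const string Pattern = @"(?<={).*(?=})";

    // Default behaviour: '.' stops at '\n'.
    private static readonly Regex DefaultRegex =
        new Regex(Pattern, RegexOptions.Compiled);

    // Singleline: '.' also matches '\n'.
    private static readonly Regex SinglelineRegex =
        new Regex(Pattern, RegexOptions.Compiled | RegexOptions.Singleline);

    public static string MatchDefault(string s) => DefaultRegex.Match(s).Value;

    public static string MatchSingleline(string s) => SinglelineRegex.Match(s).Value;
}
```

For the input "x{a\nb}y", MatchDefault returns an empty string because no match is found, while MatchSingleline returns "a\nb", which is what the IndexOf/LastIndexOf approach would produce.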

Edit: in the first version of this answer I chose a different, less appropriate benchmark run strategy: Monitoring. According to the docs: "If a benchmark method takes at least 100ms, you can also use the Monitoring strategy. In this case, the pilot stage will be omitted, ..."

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.18363.900 (1909/November2018Update/19H2)

Intel Core i5-7300U CPU 2.60GHz (Kaby Lake), 1 CPU, 4 logical and 2 physical cores    
.NET Core SDK=3.1.301
[Host] : .NET Core 3.1.5 (CoreCLR 4.700.20.26901, CoreFX 4.700.20.27001), X64 RyuJIT

Job=FastAndDirtyJob  IterationCount=5  LaunchCount=5   RunStrategy=Monitoring  WarmupCount=5  

|                  Method |         Mean |      Error |     StdDev |       Median |
|------------------------ |-------------:|-----------:|-----------:|-------------:|
|           WithSubstring |     4.524 μs |   2.322 μs |   3.099 μs |     3.700 μs |
|               WithRange |     3.536 μs |   1.784 μs |   2.381 μs |     2.600 μs |
|               WithRegEx |    33.740 μs |   6.592 μs |   8.800 μs |    29.500 μs |
|       WithRegExCompiled | 3,337.984 μs | 380.808 μs | 508.367 μs | 3,214.100 μs |
| WithRegExCompiledInited |     8.780 μs |   2.551 μs |   3.406 μs |     7.700 μs |
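
Both IndexOf-based variants still allocate a new string for every result. If that matters at the volumes mentioned in the question, one option on .NET Core 2.1 or later is to return a ReadOnlySpan<char> slice of the original string instead. This is only a sketch under that assumption and was not part of the benchmarks above; SpanSlicer and BetweenSpan are hypothetical names:

```csharp
using System;

public static class SpanSlicer
{
    // Hypothetical helper: returns a view over `s` between the first `open`
    // and the last `close` without allocating a new string.
    // Returns an empty span when no such pair exists.
    public static ReadOnlySpan<char> BetweenSpan(string s, char open, char close)
    {
        ReadOnlySpan<char> span = s.AsSpan();
        int first = span.IndexOf(open);
        int last = span.LastIndexOf(close);
        if (first < 0 || last <= first) return ReadOnlySpan<char>.Empty;

        // Slice takes (start, length), like Substring.
        return span.Slice(first + 1, last - first - 1);
    }
}
```

Calling ToString() on the result produces a regular string when one is needed. An empty span here is ambiguous between "no delimiters found" and "nothing between the delimiters", so a caller that must distinguish the two would need a different return shape.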
Falco Alexander
2

The String.IndexOf method returns the index of the first occurrence of a specified Unicode character or string within this instance, and the String.LastIndexOf method returns the index of the last occurrence.

We can then use the String.Substring method, which retrieves a substring from this instance given a start index and a length:

string sourceString = "x{1{hello}}y}z";
string expectedResult = sourceString.Substring(sourceString.IndexOf("{") + 1, sourceString.LastIndexOf("}") - sourceString.IndexOf("{") - 1);

Console.WriteLine(expectedResult);
// Output:
// 1{hello}}y

With C# 8 we can use a range, string[start..end], instead of the String.Substring method. Note that the end of a range is an exclusive index, not a length:

string sourceString = "x{1{hello}}y}z";
string expectedResult = sourceString[(sourceString.IndexOf("{") + 1)..sourceString.LastIndexOf("}")];

Console.WriteLine(expectedResult);
// Output:
// 1{hello}}y
1

The following approach can be used to solve this problem:

string str = "x{1{hello}}y}z";
int start = str.IndexOf('{') + 1;
int end = str.LastIndexOf('}') - 1;
string substr = str.Substring(start, end - start + 1);
Iliar Turdushev
0
string str = "x{1{hello}}y}z";
string Res = str.Substring(str.IndexOf('{') + 1, str.LastIndexOf('}') - str.IndexOf('{') - 1);
return Res;
Mohamed Hasan