
I am trying to write a regex in Go to match S3 bucket URLs.

So far I have:

        "https://s3.amazonaws.com/(.+?)/",
        "http://s3.amazonaws.com/(.+?)/",
        "//s3-us-east-2.amazonaws.com/(.+?)/",
        "//s3-us-west-1.amazonaws.com/(.+?)/",
        "//s3-us-west-2.amazonaws.com/(.+?)/",
        "//s3.ca-central-1.amazonaws.com/(.+?)/",
        "//s3-ap-south-1.amazonaws.com/(.+?)/",
        "//s3-ap-northeast-2.amazonaws.com/(.+?)/",
        "//s3-ap-southeast-1.amazonaws.com/(.+?)/",
        "//s3-ap-northeast-1.amazonaws.com/(.+?)/",
        "//s3-eu-central-1.amazonaws.com/(.+?)/",
        "//s3-eu-west-1.amazonaws.com/(.+?)/",
        "//s3-eu-west-2.amazonaws.com/(.+?)/",
        "//s3-eu-west-3.amazonaws.com/(.+?)/",
        "//s3.sa-east-1.amazonaws.com/(.+?)/",
        "https://(.+?).s3.amazonaws.com",
        "//s3.amazonaws.com/([A-z0-9-]+)",
        "//s3-ap-southeast-2.amazonaws.com/(.+?)/",

But this is overkill, so I was looking at:

//s3.amazonaws.com/([A-z0-9-]+)

But this does not allow for the . in bucket names, and when I try //s3.amazonaws.com/([A-z0-9-]\.+) it does not match any of my test strings.

I am currently trying to match it against

//s3.amazonaws.com/bucket.name/ and //s3.amazonaws.com/bucket-name-here

Any suggestions?

txt3rob

2 Answers


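In your regex you use [A-z0-9-]. Note that [A-z] is different from [A-Za-z]: the range A-z also covers the ASCII characters that sit between Z and a, which are [, \, ], ^, _ and the backtick.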
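
To see the difference between the two ranges in Go, here is a minimal sketch. The variable names loose and strict and the sample strings are made up for illustration; only the standard regexp package is used.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// loose spans ASCII 'A' (65) through 'z' (122), so it also accepts
	// the six punctuation characters that sit between 'Z' and 'a'.
	loose = regexp.MustCompile(`^[A-z]+$`)
	// strict accepts ASCII letters only.
	strict = regexp.MustCompile(`^[A-Za-z]+$`)
)

func main() {
	// Sample inputs are hypothetical, chosen to include '_' and '^'.
	for _, s := range []string{"bucket", "my_bucket", "a^b"} {
		fmt.Printf("%-12q loose=%-5v strict=%v\n", s, loose.MatchString(s), strict.MatchString(s))
	}
}
```

If the stray punctuation is not wanted, [A-Za-z] is the safer spelling.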

To match a literal dot you could escape it: \.

The part ([A-z0-9-]\.+) in the regex //s3.amazonaws.com/([A-z0-9-]\.+) matches a single character from the character class followed by one or more dots, for example j....., which is why it cannot match bucket.name.
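
A quick way to check this behaviour is to run the attempted pattern against the two URLs from the question plus one input shaped like j...... This is only a sketch; the name attempted and the third input are made up for the demonstration.

```go
package main

import (
	"fmt"
	"regexp"
)

// attempted is the pattern from the question: exactly one character from
// the class, followed by one or more literal dots.
var attempted = regexp.MustCompile(`//s3.amazonaws.com/([A-z0-9-]\.+)`)

func main() {
	inputs := []string{
		"//s3.amazonaws.com/bucket.name/",
		"//s3.amazonaws.com/bucket-name-here",
		"//s3.amazonaws.com/j.....", // hypothetical input, only here to show what the pattern accepts
	}
	for _, in := range inputs {
		fmt.Printf("%-40s matched=%v\n", in, attempted.MatchString(in))
	}
}
```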

To fully match the two URLs from your example, you could add a dot to the character class and an optional forward slash at the end. You can also omit the capturing group (the parentheses around the character class) if you only want to match the full URL and do not need the captured bucket name afterwards:

//s3\.amazonaws\.com/[.A-z0-9-]+/?
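
As a rough check in Go, this pattern can be run against both example URLs with FindString, which returns the matched text. The variable name bucketURL is an assumption for this sketch.

```go
package main

import (
	"fmt"
	"regexp"
)

// bucketURL is the pattern above: dots escaped in the host, a dot added to
// the character class, and an optional trailing slash.
var bucketURL = regexp.MustCompile(`//s3\.amazonaws\.com/[.A-z0-9-]+/?`)

func main() {
	for _, in := range []string{
		"//s3.amazonaws.com/bucket.name/",
		"//s3.amazonaws.com/bucket-name-here",
	} {
		// FindString returns the leftmost match, or "" when nothing matches.
		fmt.Printf("%q -> %q\n", in, bucketURL.FindString(in))
	}
}
```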

Looking at the other URLs in your example, maybe this regex can help, and you can adapt it to your further requirements:

(?:https?:)?//[A-z0-9.-]+\.amazonaws\.com(?:/(?:[A-z0-9.-]*/?))?
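
Here is a sketch of how the broader pattern could replace the long list from the question. The name awsURL, the placeholder bucket names and the example.com URL are made up for the test; the pattern itself is copied as written above.

```go
package main

import (
	"fmt"
	"regexp"
)

// awsURL accepts an optional scheme, any host ending in .amazonaws.com,
// and an optional single path segment.
var awsURL = regexp.MustCompile(`(?:https?:)?//[A-z0-9.-]+\.amazonaws\.com(?:/(?:[A-z0-9.-]*/?))?`)

func main() {
	samples := []string{
		"https://s3.amazonaws.com/bucket/",
		"http://s3.amazonaws.com/bucket/",
		"//s3-us-west-2.amazonaws.com/bucket/",
		"//s3.ca-central-1.amazonaws.com/bucket.name/",
		"https://bucket.s3.amazonaws.com",
		"https://example.com/bucket/", // hypothetical non-S3 URL for contrast
	}
	for _, s := range samples {
		fmt.Printf("%-50s matched=%v\n", s, awsURL.MatchString(s))
	}
}
```

Note that the pattern is not anchored, so it finds a matching substring anywhere in the input.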

The fourth bird

Add dot to the character class:

//s3.amazonaws.com/([-A-z0-9.]+)
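
If the goal is to pull the bucket name out, the capture group can be read with FindStringSubmatch. A minimal sketch: bucketRe and bucketName are hypothetical names, and the pattern is used as written above, so its unescaped dots in the host still match any character.

```go
package main

import (
	"fmt"
	"regexp"
)

// bucketRe is the pattern above, with the dot added to the character class.
var bucketRe = regexp.MustCompile(`//s3.amazonaws.com/([-A-z0-9.]+)`)

// bucketName is a hypothetical helper: it returns the first capture group,
// or "" when the URL does not match.
func bucketName(url string) string {
	m := bucketRe.FindStringSubmatch(url)
	if m == nil {
		return ""
	}
	return m[1]
}

func main() {
	for _, u := range []string{
		"//s3.amazonaws.com/bucket.name/",
		"//s3.amazonaws.com/bucket-name-here",
	} {
		fmt.Printf("%s -> %q\n", u, bucketName(u))
	}
}
```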


mrzasa