0

Background

There is an application where users are required to enter information that will be stored in a DB. I then have an application that runs every 5 minute and gets the information that was entered by the user using the previous application. My app then grabs all the information from the database and then proceed to do create the given document and then places it in a server for the user to get. However users started having issues with a specific document, where certain functionalities were not executing correctly. So I identified the issue as being the string which a user entered in the entry application, in the title column they had "Jame's Bond Story" so my application creates the document and does not have any issue what so ever. So after debugging I identified the following problem.

Problem

Not sure how the specific user did what he did but the single quote ' was not really a single quote but some other type of weird character anomaly. I proved this by running the following code to see if I can remove it.

 string cleanTitle = BookRec.TitleName.Replace("'","");

However this did not work for me at all. I then broke the string into a character array and instead of getting the character I got a weird digit. So then I proceeded into using this regex code to clean every character and only allow numbers and letters.

string cleanTitle = Regex.Replace(BookRec.TitleName, "[^\\w\\. _]", "");

This has now become an issue because the users want the Title to contain special the following characters ( ) _ , - .

I am looking for a way to to filter out any characters including the type I ran into this week and only allow the 6 characters which the users have agreed to. I can up with the following regex formula bu I am getting an empty string.

Regex fomrula = new Regex(@"^[a-zA-Z0-9_\[])(,\-.'");

However I am getting an empty string when I am replacing the title. I am not a big fan of regex, I am also open to a a sub string approach to this as well.

Appended Information

I am not able to access the application that inserts the information to the given database. I am only able to read from the database and then preform actions.

EasyE
  • 560
  • 5
  • 26
  • 2
    See this question posted 20 mins ago https://stackoverflow.com/questions/45969065/passing-a-string-containing-special-characters-to-database – L.B Aug 30 '17 at 20:41
  • @L.B these questions **should not be closed**. They are not exact duplicates, and in fact almost entirely unrelated. – maccettura Aug 30 '17 at 20:45
  • @maccettura I think so, you can always vote to reopen it – L.B Aug 30 '17 at 20:46
  • 2
    @L.B OP may have a SQL vulnerability, but the core of their question has nothing to do with SQL Injection. OP wants to know how to filter specific characters since their document generation code fails when unexpected characters are found. – maccettura Aug 30 '17 at 20:48
  • 1
    @maccettura see the link in this question. it is about *Why do we always prefer using parameters in SQL statements* – L.B Aug 30 '17 at 20:49
  • @L.B I read through the the suggested question and it is totally unrelated. This question is complete unrelated to database entry. I think you may have misread the issue I was trying to relay in this question. – EasyE Aug 30 '17 at 20:49
  • @L.B its a great suggestion for the OP to read, but it is not a duplicate of this question. Neither was the last one you single handedly closed. – maccettura Aug 30 '17 at 20:49
  • @maccettura it is your opinion. As I said. I think so... if you disagree vote to reopen. – L.B Aug 30 '17 at 20:50
  • @EasyE This should help you to remove the invalid characters: https://pastebin.com/cSB5Aawc – Xiaoy312 Aug 30 '17 at 20:56
  • 2
    "I am **not able to** access the application that **inserts** the information to the given database. **I am only able to read** from the database and then preform actions." The problem isn't even about database or sql, the data is given as is. The problem is about fixing his replace method. – Xiaoy312 Aug 30 '17 at 21:20
  • @Xiaoy312 you you actually answered my question thank you. I am having trouble in one spec though, seems like that reqex formula removed the spaces between words as well. Any suggestions? Also if you can place your response as the answer it will give help to future questions like the one I asked. – EasyE Aug 30 '17 at 21:30
  • @EasyE you can add space to the `ValidSymbols` – Xiaoy312 Aug 31 '17 at 16:35
  • 1
    @Xiaoy312 yup you was finally got it working with that. Thanks the question was opened for a little while this morning but then was closed again. What a shame. – EasyE Aug 31 '17 at 16:37
  • 1
    @EasyE None of us can vote to reopen anymore because technically we already have. :/ – maccettura Aug 31 '17 at 19:32

1 Answers1

2

You may want to try something like this:

string cleanTitle = Regex.Replace(BookRec.TitleName, @"[^\u0000-\u007F]+", "");

This will replace any Unicode character that is not between those values. I'm not sure if those are the ones that are causing you problems but hopefully it may give you a hint in the right direction.

Andrew
  • 7,602
  • 2
  • 34
  • 42