0

I'm currently working with a database table that stores people's first and last names as a single string. This isn't a huge deal when it comes down to it, but I'd at least like to attempt to make it look nicer. Here's what my output currently looks like:

ID      PersonName
-------------------
0001    JohnSmith
0002    JaneDoe
0003    MikeJones

And if possible, I'd like it to look like one of the following:

ID      PersonName              ID      FName    LName
-------------------             -----------------------
0001    John Smith              0001    John     Smith
0002    Jane Doe                0002    Jane     Doe
0003    Mike Jones              0003    Mike     Jones

For the most part, all the names in the table have the same FirstLast format, with capital letters signifying the beginning of each name. However, I'm seeing a few issues, such as hyphenated or double last names that look like JaneDoe Smith with a space or JohnSmith Sr or they have an extra initial like C MichaelJones.

Even if there's no way to account for the ones with different formatting, I'm still curious as to whether there's actually a way to split the others. All the other forum posts I've read so far are only explaining how to split strings that are comma delimited or have some sort of other special mark/symbol that's easy to pick out.

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
EJF
  • 451
  • 3
  • 9
  • 28
  • 2
    https://stackoverflow.com/questions/23470794/split-words-with-a-capital-letter-in-sql – Dale K Jan 23 '19 at 20:42
  • 2
    Don't forget handling Malcolm McDowell, Conan O'Brien, Carl van der Walt and Jonathan Rhys-Davies. – jarlh Jan 23 '19 at 20:44
  • Ha! Yeah, it's probably going to end up being easier to just leave the formatting the way it is. It's still easy enough to read. I figured I might as well ask though. – EJF Jan 23 '19 at 20:46
  • An extra step you could take is if you know the first and last name is always together, find spaces and remove them before and after the longest string of characters. In the case of `JaneDoe Smith` this wouldn't work, but your other examples would. Seems like it would trim down the errors. – Icculus018 Jan 23 '19 at 20:47
  • do you have regex in sql-server? `'([a-z])([A-Z])', '$1 $2'` may this helps: https://stackoverflow.com/questions/21378193/regex-pattern-inside-sql-replace-function – SL5net Jan 23 '19 at 20:48
  • Your best bet may be to write a script in your favorite programming language to fix these. Another problem that may not have been thought of is that somehow this data got into the database in the first place, so an effort would also need to be done to ensure that this kind of data isn't put in the database after you run a cleanup. – dmoore1181 Jan 23 '19 at 20:56
  • @dmoore1181 the weird thing is that other tables containing data about these same people have their names split with a space, split into two columns, and sometimes even in a `LastName, FirstName Initial` format. Not sure why it varies. It's the back-end DB for a medical record system in a clinic so I have no control over how it decides to format everything. It's irritating. – EJF Jan 23 '19 at 21:10
  • @EJF that is scary. Guess that gives some security via Obscurity if the data in the different systems can't even be linked up in a normal way – dmoore1181 Jan 23 '19 at 21:15

0 Answers0