2

I know it's not the cleanest code to date, but I can't figure out why I can't get this one to work.

I'm looking to populate the field m.customersTemp with Customer Numbers from field c.ClientNumber. But only when a LIKE match from c.EmailAddress is found m.Emails... m.Emails is a field with a list of e-mails. Code Below.

UPDATE market m, customer c
SET m.customersTemp = CONCAT(m.customersTemp, c.ClientNumber)
WHERE m.Emails LIKE CONCAT('%', TRIM(c.EMailAddress), '%')
AND TRIM(c.EMailAddress)<>''

The result in field m.customersTemp only displays one value (customer number)... and I know there are many matches.

TABLE CUSTOMER
ClientNumber    | EMailAddress
1234              a@a.com
4567              b@b.com
2222              
1111              d@d.com

------------------------------------------------------------- 
|                        TABLE MARKET                       |
-------------------------------------------------------------
| ID            | Emails                     | customersTemp|
-------------------------------------------------------------
|1              | a@a.com, b@b.com, c@c.com  |              |
|2              | a@a.com, b@b.com, g@g.com  |              |
|3              | e@e.com                    |              |
|4              | f@f.com                    |              |
-------------------------------------------------------------

Result in customersTemp at ID 1 and 2 is only 1 ClientNumber. 4567

John K.
  • 23
  • 5
  • Maybe you should add some rows of data contained in the mentioned tables. – dhh Jun 30 '16 at 05:10
  • added simple sample data – John K. Jun 30 '16 at 05:26
  • John, you just changed the question entirely. I am rolling it back. Ask a new question and accept mine. – Drew Jun 30 '16 at 16:44
  • There, I rolled back the roll back. I hope you get an answer from someone. And please do yourself a favor: don't save CSV data in a column. See my info [here](http://stackoverflow.com/a/32620163) – Drew Jun 30 '16 at 16:46
  • Sorry Drew, it was a miscue on the Emails field requirement. – John K. Jun 30 '16 at 16:47
  • I will take half the blame. You had a comma before but I new line was depicted. It was like 2am for me. – Drew Jun 30 '16 at 16:48
  • So, the moral of the story is that to put data into a CSV (1,2,3) is easy with `group_concat()`. Getting data out would require `find_in_set()`, so I will spend a little time doing that, or someone else can and you can accept their answer. – Drew Jun 30 '16 at 16:56

1 Answers1

0

Don't forget to read the Warning at the bottom as to why you should NEVER save your data like this.

You can test this on a backup copy. I wouldn't run it against your main tables. Akin to someone saying: "Here, try this delete command, I think it will work."

-- drop table customer;
create table customer
(   ClientNumber int,
    EMailAddress varchar(100)
);
insert customer (ClientNumber,EMailAddress) values
(1234,'john@john.com'),
(4567,'joe@joe.com'),
(2222,''), 
(1111,'somone@someone.com'),
(5454,'john@john.com');

-- drop table market;
create table market
(   Emails varchar(100),
    customersTemp varchar(100)
);
insert market(Emails,customersTemp) values
('john@john.com',''),
('joe@joe.com',''),
('test@test.com',''),
('more@more.com','');

The Update statement:

UPDATE market
INNER JOIN 
(   SELECT c.EMailAddress as e,GROUP_CONCAT(c.ClientNumber ORDER BY c.ClientNumber) theList
    FROM customer c
    GROUP BY c.EMailAddress
) xDerived1
ON market.EMails = xDerived1.e
SET market.customersTemp = xDerived1.theList;

Results:

select * from market;
+---------------+---------------+
| Emails        | customersTemp |
+---------------+---------------+
| john@john.com | 1234,5454     |
| joe@joe.com   | 4567          |
| test@test.com |               |
| more@more.com |               |
+---------------+---------------+

Version2

drop table customer;
create table customer
(   ClientNumber int,
    EMailAddress varchar(100)
);
insert customer (ClientNumber,EMailAddress) values
(1234,'a@a.com'),
(4567,'b@b.com'),
(2222,''), 
(1111,'d@d.com'),
(8484,'g@g.com');
-- select * from customer;

drop table market;
create table market
(   id int auto_increment primary key,
    Emails varchar(100),
    customersTemp varchar(3000)
);
insert market(Emails,customersTemp) values
('a@a.com,b@b.com,c@c.com',''),
('a@a.com,b@b.com,g@g.com',''),
('e@e.com',''),
('f@f.com','');
-- select * from market;

drop table if exists marketHelper7;
create table marketHelper7
(   -- btw this might be the kind of table
    -- as an intersect/junction table that you
    -- should have to begin with
    -- and not have your CSV stuff 
    cid int not null,
    mid int not null
);

insert marketHelper7 (cid,mid)
select c.ClientNumber,m.id as MarketId
from customer c
join market m
on find_in_set(c.EMailAddress,m.Emails)>0;

update market set customersTemp=''; -- do a reset

UPDATE market m
join 
(   SELECT mh.mid as i,GROUP_CONCAT(mh.cid ORDER BY mh.cid) theList
    FROM marketHelper7 mh
    GROUP BY mh.mid
) xDerived1
ON m.id = xDerived1.i
SET m.customersTemp = xDerived1.theList;

drop table marketHelper7;

.

select * from market;
+----+-------------------------+----------------+
| id | Emails                  | customersTemp  |
+----+-------------------------+----------------+
|  1 | a@a.com,b@b.com,c@c.com | 1234,4567      |
|  2 | a@a.com,b@b.com,g@g.com | 1234,4567,8484 |
|  3 | e@e.com                 |                |
|  4 | f@f.com                 |                |
+----+-------------------------+----------------+

Version 2 above has the helper table.

Warning: By the way, never save your data like this. It is insane, and the performance is awful. Please see my answer here on Junction Tables (many-to-many) (similar to association tables or "item has" tables a.k.a. One-to-Many). They are all the same concept that utilized Data Normalization best practices and fast indexes during queries. Plus you stay happier not fighting with your data constantly or wondering if you blow the buffer size with group_concat.

Note that group_concat() has flexibility for its separator choice, and the order by, baked inside the function call.

The maximum length for the output of group_concat is subject to the system variable group_concat_max_len which probably defaults to 1K but can be set to at least 4GB.

The Percona article on group_concat(), and the manual pages for group_concat() and find_in_set().

Community
  • 1
  • 1
Drew
  • 24,851
  • 10
  • 43
  • 78
  • Thanks for the thoughts Drew... produces the same result however. – John K. Jun 30 '16 at 05:27
  • Thanks for the new suggestion Drew. It's definitely creating another way to think about this Query. I'll try some updates shortly! One update: The market.Emails field has to be a list of E-mail addresses (the database has been like this for some time) as opposed to one per row. I'm editing the data format above to be more clear. – John K. Jun 30 '16 at 16:22
  • Routine modified (Version 2). John, please get your organization to change the schema :p ... I can't imagine the facepalm that my peers will have (Strawberry) when they see this answer. I am basically The Enabler of Worst Practices by showing this. – Drew Jun 30 '16 at 18:08
  • In order for this to work, the find_in_set() must find no spaces after the comma for this list. You will note that in the strings for the table `market`. So that is comma-separated, not " space then comma separated". That is the best I can do. – Drew Jun 30 '16 at 18:12
  • Junction Tables definitely the way to go. I'll work on converting their CSV fields in the near future and get them on the path to better data storing procedures. Kudos Drew. – John K. Jun 30 '16 at 19:54