0

We have a business requirement to keep a repository of clients up to date in Salesforce, to that end it's been decided that we should implement a small rest API returning JSON that is able to be cached and paged that will be queried once a day-ish.

The current plan is to call a stored procedure inside SQL Server, to call upon a custom view.

However returning all the results would be very inefficient to both serve as JSON, page through, etc.

Luckily all the relevant rows in the database, have a LastUpdated timestamp, Unluckily it's fairly well normalized, so now I find myself in the position of merging upwards of 15 tables worth of LastUpdated dates in a max-like function, which appears fairly complex to do in SQL in a null safe manner.

I probably don't actually need the LastUpdated data in Salesforce, which makes me wonder, would it be better to Take the max of 15 LastModified columns in various nested queries, or compare them all to the input date in many different where clauses, and somehow join them all back to the root object (student in this example).

e.g. for a simple example of 3 tables (ignoring the fact my real world query has several sub queries already that I barely understand, (In reality students have 1 or more guardians, guardians have 1 or more phone numbers of varying types)

Student
-------------------------------------------------------------------------
StudentID   |   Student   |   SchoolID   |   GuardianID   | LastModified



Guardian
-------------------------------------------------------------------------
GuardianID   |   Name   |   Phone   |   LastModified



School
-------------------------------------------------------------------------
SchoolID   |   Name   |   Phone   |   LastModified

To end up with a view that would export

View
--------------------------------------------------------------------------------------------------------------------
StudentID   |   Student   |   SchoolName   |   SchoolPhone   |   GuardianName   |   GuardianPhone   |   LastModified

Where either the LastModified is the latest of all 3 tables, OR all entries in the view are somehow the most recent changes, perhaps using rowversions?

Ryan Leach
  • 4,262
  • 5
  • 34
  • 71
  • I don't believe the question deserves the unfocused close vote, just because I'm attempting to avoid an XY Problem. – Ryan Leach Aug 26 '21 at 01:38
  • I don't quite follow what you mean by "merging 15 tables worth of lastupdated dates". Could you provide a minimal example using 2 tables? – allmhuran Aug 26 '21 at 02:23
  • Sure, I've provided 3 table definitions to hopefully give a better idea. – Ryan Leach Aug 26 '21 at 02:43
  • Roger, so it looks like your idea is to return a denormalized set of rows to salesforce if any of the rows in the multiple joined tables have an updated date that is later than the last ETL date, is that right? ie, the naive but logically correct condition would be `where T1.lastmodified > @lastEtl or T2.lastmodified > @lastEtl or ...` etc? Or are these temporal tables with mulitple possible rows for a given studentId, guardianId, etc? – allmhuran Aug 26 '21 at 03:03
  • These are not temporal tables, Sorry I didn't consider that as a possibility when I first wrote the question. But yes I think you understand what I'm trying to convey. – Ryan Leach Aug 26 '21 at 03:20
  • 1
    Roger. You could do some fancy syntax with a table valued constructor https://stackoverflow.com/a/6871572/7165279 but honestly, I'd go with the naive solution of just combining a bunch of `or` conditions. Performance of lots of `or`'s can be horrible, but whether or not it actually is horrible in your case can only be discovered by trying it. – allmhuran Aug 26 '21 at 03:36
  • 1
    How do you deal with this complexity on Salesforce end? You forced Student to have just 1 guardian, 1 phone etc or is the data model somewhat similar? Maybe you can simplify it into separate syncs, even if each needs 5 tables... – eyescream Aug 26 '21 at 17:24
  • @eyescream I wish I knew. This task landed on my desk from an external team, with a deadline too close to argue about methodology. Hopefully when they have had time to evaluate, we can consider sane options. It's part of the reason I offered so much context, I don't work with Salesforce directly. And was hoping one of the answers would be "You are doing this completely wrong, salesforce has support for xyz." As a generalist, it pains me to see skills siloed away like this. – Ryan Leach Aug 27 '21 at 00:47

1 Answers1

1

You might want to rethink that one. And if "doesn't matter, deliver" comes from management - cover your a$$ in writing. If performance is an issue worst case you could load the view's result to some temporary table and serve that, right? Even if it takes 5 minutes run - as long as data's ready by the time the SF scheduled call comes it should be fine. (SF developer can use CRON expression for scheduling it and you could plan accordingly)

In no specific order...

I fear that even if you manage to pull this view off it will be useless for the Salesforce. 15-to-1 denormalisation into 1 flat table makes you wonder about quality of data in SF.

Has this ever been done before? Data was manually keyed in to SF or there was some data migration, ETL process you could reuse?

It's likely your data model in SF contains multiple tables including many-to-many relationships too (ERD, bit messy). Out of the box you have Account and Contacts and they can be related directly (by Contact.AccountId foreign key) and indirectly (by AccountContactRelation many-to-many table). Out of the box Contact has 5 phone fields (AssistantPhone, Fax, HomePhone, MobilePhone, OtherPhone, Phone). They might have used these for guardians or they might have gone the (fairly new) ContactPoint route and really have a related table with however many phones and addresses you need.

And then your company might have added custom tables, fields or decided to model it completely away from the standard tables. At the very least they could show you some diagrams from Setup -> Schema Builder to help match stuff.

If there are multiple tables - ideally each of them would have 1 or more field marked as "external id" (something unique on your end)

Another consideration is size. How many changes happen daily? If SF custom code will request JSON and receive something that (together with local variables) will tip it over 6 MB RAM limit it might be fun times. (Dev could use a nightly batch job to bump it to 12 but still). How will you track errors? If integration is down and you need to rerun 1 months' worth of data - godspeed?

Will you even make the service visible from outside? It can't be just on your local company network, it has to be an endpoint accessible from internet. How will you protect it (shared certificate maybe?), will you whitelist all SF data centre IPs... If not - you might have to push to SF instead of SF pulling it.

As to how else could you do it... Push vs pull techniques are very nicely described in Integration Patterns doc (html or pdf). Part of it is knowing what to google for and whether you already have some ETL tools. BizTalk? Informatica? Mulesoft? Azure Data Factory? Hell, there's a paid SQL Server plugin that hides the nasties as linked database, from what I remember you work with stored procedures and they abstract the SF API calls away. If you don't have any of these but still want to push - Salesforce "Data Loader" and "sfdx" developer utility can be run from commandline and upsert (bit like MS SQL merge statement) data. You could schedule them, wrap them in some simple utility script, use SF API libraries for .NET/PHP/Java/Python/what have you... Many ways to skin this cat.

If you'd expose a JSON service with Swagger there's even a no-code way to do it called "external services". And if you can produce OData compatible service there's something called Lightning Connect for realtime access (and your implementation of OData decides how much power you give, which tables, just readonly or CRUD, with search or not...)

eyescream
  • 18,088
  • 2
  • 34
  • 46
  • Thankyou this has been incredibly insightful, and has made it clear to me the work done so far is more than likely garbage due to miscommunication and simple assumptions between teams. About the only thing we had right was security, using known token authentication methods. – Ryan Leach Aug 28 '21 at 12:53
  • Swagger came automatically setup with the bare-bones web-api setup I inherited, but my experience with asp.net's (webapi 2) automatic swagger generation is it nearly always needs to be hand-held into producing the right results, that you may as well write the swagger spec yourself. For what it's worth, the word mulesoft was mentioned in passing, if you were morbidly curious. – Ryan Leach Aug 28 '21 at 12:56
  • 1
    I'm sorry I didn't have a ready solution for you. I do realise this reeks of "let's blow it up into a project, you need some business analysts on it and system/technical architect". Is contacting some consulting company ($$$) an option? Consider cross-posting to dedicated https://salesforce.stackexchange.com/ for visibility, maybe there's an awesome idea I'm not aware of. – eyescream Aug 28 '21 at 14:55
  • And (depending on your SF support plan) you could contact your SF account manager about the "Accelerators" program. https://www.salesforce.com/content/dam/web/en_us/www/documents/success/expert-coaching-catalog.pdf . It used to be once a quarter opportunity to work with SF team and improve your solution in 1 area. There's not much about integrations but if it feels like you need to relaunch SF, make some changes to data model... If customer comes and says "I'm unhappy with my implementation" they tend to dedicate some help or get you in touch with experts ;) – eyescream Aug 28 '21 at 14:57
  • We are close to the start, so it's probably not too late. However I'm one tiny piece of the puzzle who ended up with a specific task in my lap, and people generally don't appreciate cocky external contractors to say "This isn't going to work" without proposing a solution. It just pains me to see us wasting energy. Fortunately, we have a subject matter expert in salesforce joining us shortly, so the ship should hopefully righted shortly so I'm probably going to stay in my lane for now. – Ryan Leach Aug 28 '21 at 16:58
  • I think it was mostly a case of people too far removed from the original problem being asked to solve it by yesterday, with zero context. – Ryan Leach Aug 28 '21 at 16:59