Oh my god. It's full of code!

Posts tagged “Import

Salesforce Apex CSV Parsing to sObject

So for a recent project I’ve been working on, a person wants to be able to email in a small CSV file to an automated bot. The bot will extract the attachment, iterate through the data and import the data. So of course first I had to get together a CSV parser. Thankfully that was already handled by the awesome people over at the developer force wiki. So first, grab their Apex CSV parser. So that returns a list of a list of strings. Each list entry is a list of the values in that row. So now while this is nice, it’s still not idea for importing. You need to convert that data into an sObject, or a list of them. So I drafted this little function that works in tandem with the first one. You can pass it that data you get from the first parser, and the type of sObject you want to serialize the data into and it will give you back a list of sObjects. Some things to note about this function.

  • You must pass in the column headers as the first row
  • The column headers must match the API names of the fields. So your column headers must match the sObject field names
  • The parser can deal with invalid field names. It will simple not set them on the sObject. So you can include extra data, it will just be discarded
  • I havn’t tested it much. It may fail with some data types, but I’m not sure
//Name: parseCSV
//Description: Class for parsing and casting CSV content to sObjects
//Author: Daniel Llewellyn and blog post at http://www.preludeinteractive.com/2009/09/parsing-csv-files-in-apex/ (author unknown)
//Date: December 1 2011
public class parseCSV
{   
    //code borrowed from http://www.preludeinteractive.com/2009/09/parsing-csv-files-in-apex/
    public static List<List<String>> parseCSV(String contents,Boolean skipHeaders)
    {
        List<List<String>> allFields = new List<List<String>>();
    
        // replace instances where a double quote begins a field containing a comma
        // in this case you get a double quote followed by a doubled double quote
        // do this for beginning and end of a field
        contents = contents.replaceAll(',"""',',"DBLQT').replaceall('""",','DBLQT",');
        // now replace all remaining double quotes - we do this so that we can reconstruct
        // fields with commas inside assuming they begin and end with a double quote
        contents = contents.replaceAll('""','DBLQT');
        // we are not attempting to handle fields with a newline inside of them
        // so, split on newline to get the spreadsheet rows
        List<String> lines = new List<String>();
        try {
            lines = contents.split('\n');
        } catch (System.ListException e) {
            System.debug('Limits exceeded?' + e.getMessage());
        }
        Integer num = 0;
        for(String line : lines) {
            // check for blank CSV lines (only commas)
            if (line.replaceAll(',','').trim().length() == 0) break;
            
            List<String> fields = line.split(',');  
            List<String> cleanFields = new List<String>();
            String compositeField;
            Boolean makeCompositeField = false;
            for(String field : fields) {
                if (field.startsWith('"') && field.endsWith('"')) {
                    cleanFields.add(field.replaceAll('DBLQT','"'));
                } else if (field.startsWith('"')) {
                    makeCompositeField = true;
                    compositeField = field;
                } else if (field.endsWith('"')) {
                    compositeField += ',' + field;
                    cleanFields.add(compositeField.replaceAll('DBLQT','"'));
                    makeCompositeField = false;
                } else if (makeCompositeField) {
                    compositeField +=  ',' + field;
                } else {
                    cleanFields.add(field.replaceAll('DBLQT','"'));
                }
            }
            
            allFields.add(cleanFields);
        }
        if (skipHeaders) allFields.remove(0);
        return allFields;       
    }
 
    public static list<sObject> csvTosObject(List<List<String>> parsedCSV, string objectType)
    {
        Schema.sObjectType objectDef = Schema.getGlobalDescribe().get(objectType).getDescribe().getSObjectType();
        system.debug(objectDef);
        
        list<sObject> objects = new list<sObject>();
        list<string> headers = new list<string>();
        
        for(list<string> row : parsedCSV)
        {
            for(string col : row)
            {
                headers.add(col);
            }
            break;
        }
        system.debug('========================= File Column Headers');
        system.debug(headers);
            
        integer rowNumber = 0;
        for(list<string> row : parsedCSV)
        {
            system.debug('========================= Row Index' + rowNumber);
            if(rowNumber == 0)
            {
                rowNumber++;
                continue;
            }
            else
            {
                sObject thisObj = objectDef.newSobject();
                integer colIndex = 0;
                for(string col : row)
                {                   
                    string headerName = headers[colIndex].trim();
                    system.debug('========================= Column Name ' + headerName);
                    if(headerName.length() > 0)
                    {                  
                        try
                        {                       
                            thisObj.put(headerName,col.trim());
                        }
                        catch(exception e)
                        {
                            system.debug('============== Invalid field specified in header ' + headerName);                           
                        }
                        colIndex++;
                    }
                } 
                objects.add(thisObj);
                rowNumber++;
            }       
        }
        return objects;
    }
    @isTest
    public static void testParseCSV()
    {
        string csvContent = 'firstname,lastname\nTest,Guy';
        List<List<String>> parsedContent = parseCSV(csvContent,false);
        list<contact> contactList = (list<contact>) csvTosObject(parsedContent, 'contact');
        system.assertEquals(contactList.size(),1);
        system.assertEquals(contactList[0].firstname,'Test');
        system.assertEquals(contactList[0].lastname,'Guy');
    
    }    
}

If nothing else this should at least be a jumping off spot for anyone trying to work with CSV files in Apex. If you make any improvements, I’d love to see/hear them. Hope this helps someone!


While This is Awesome, There Must be a Better Way.

This is half an interesting post about how I solved a fairly complicated problem, and half me looking for a better way to do it. The problem I was solving in theory is fairly simple.

I want to import answers from an external survey software into our Salsforce org.

Pretty simple right? Standard import functionality. Nothing too complex. But now let’s build on this.

1) It should be automated. No manual triggering of the import.
2) It should be imported as it is created. Closer to real time than to ETL.
3) The place the data is coming from has no API. We only have direct ODBC access to it.
4) Since the process must be closer to real time, we should use a ‘push’ style mentality for import. The survey software should push the data into Salesforce as it is created, instead of Salesforce requesting it.
5) The survey software only gives me access to javascript to add functionality. No server side processing of any kind.
6) Solution should be as cloud based as possible. Ideally no code on our servers (we have a cold fusion server we use for a lot of data brokering, but we would like to get rid of it).

Suddenly got harder eh? Right off the bat, point 3 means we are going to need to use some kind of middle-ware (in this case our Coldfusion webserver that can talk to the database). Salesforce has no method to talk directly to a database, as it shouldn’t. Everything is done with API’s these days, nobody directly queries databases anymore. At least they shouldn’t. So my first task was to write an API that Salesforce could access to get it’s data. So I wrote up a simple ColdFusion webservice that queries the survey database, formats the data in a reliable way, and returns a JSON array. While not ideal because it does use our server, at least it’s fairly transparent. It’s just a webservice call that could possibly be replaced in the future if the survey software does release an API in the future.

With the webservice up and running, now I need some Apex code to call out to it, get the data and handle it. So I wrote a nice little Apex method that uses a simple HTTP get with some URL params to invoke my ColdFusion webservice. Now it can get the JSON data. Using the new awesome JSON parser in winter 12 I am able to parse the data into a list of custom objects. Hooray.

So now I need a Salesforce object to store this data right? Actually I am going to want a few. I ended up creating 3 in total. The first object ‘survey’, is just a simple container object with pretty much no fields. The second object, ‘survey entry’ is just an object that will contain all the answer objects for a person in a given survey. It of course has a lookup to the ‘survey’ object, as well as a lookup to a contact, and some other info (when they took the survey, etc). The third object ‘survey answer’ is where the real juicy data is. It has the ID of the question, the text of the question, the persons answer, and a lookup to the survey entry.

So now I modified my Apex class a bit to create the survey answers objects, and relate them to the proper ‘survey entry’ (and create one if one does not exist for this person in this survey yet). Boom, so now all the ‘hard’ work is done. I have a class that can be called that will import the survey data for a person into Salesforce. But wait, how do I actually call this thing? I don’t want this on a scheduler, I want it to get called when new data is available. So I need to make this class itself into a webservice of some kind.

I have a bit of experience with Apex REST so I decided that would be a fitting way to handle this. This class only needs the ID of the survey, and the person for whom it needs to import data. That information is easily included in the URL or in POST fields, so I quickly modified my class to be an Apex REST service. Now it was ready to begin being accessed from the outside world. The question now is, how do I invoke the service itself?

First I used the apigee app console to make sure it was working as required. Apigee handles the oAuth and lets you specify params so testing your Apex REST service couldn’t be easier. Once I had verified that it worked, I needed some method to allow the survey software to invoke it. Problem is of course, if you remember that the survey software only supports JavaScript. JavaScript is still subject to that cross domain security policy BS. Normally you could use the script injection technique to make a callout to a different domain, but I need to set headers and such in the request, as well as making it a post request, so that wasn’t going to fly. On top of that I have would have no idea how to let JavaScript start using oAuth or get a valid session ID. So here is where things get a little murky.

How could I allow a javascript only application invoke my Apex REST service? Looks like I would again have to turn to my ColdFusion middleware box. I wrote another webservice which can invoke the desired Apex method from ColdFusion. You can call Apex REST services using a session ID instead of having to deal with oAuth so I went that route. I already have integration between Salesforce and ColdFusion through use of the awesome CFC library provided at RIA Forge (I actually helped contribute a bit to that). So I just wrote up what basically amounts to be a wrapper. You can invoke it from a simple get request and it will wrap the request with the required headers (authorization, content-length, content-type) and send it off to Salesforce. ColdFusion web services have the awesome feature of being able to be called via a URL, instead of having to use a WSDL or whatever. Come to think of it, they are almost a forerunner for REST, but I digress.

So now I have a URL that when called (with some arguments/params) will invoke my Apex REST service that goes and gets the survey data and imports it. So now I still need to get the survey software to call this URL. Here I use the classic script injection technique to make my cross domain request (because the survey software and my ColdFusion box live on different domains) and it much to my surprise it all worked. If you are curious, the code to do that looks like this.


function loadJSON(url)
{
var headID = document.getElementsByTagName("head")[0];
var newScript = document.createElement('script');
newScript.type = 'text/javascript';
newScript.src = url;
headID.appendChild(newScript);
}
var survey = '12345';
var token = 'mkwcgvixvxskchn';
var contact = '003GASDFADFAFDAS';
newUrl = 'http://XXXXXXXXXXXXX/webservice.cfc?method=importData&survey='+survey+'&token&='+token+'&contact='+contact;
loadJSON(newUrl);

So in the end, this is the process I came up with.

1) User takes online survey from 3rd party site (lets call it survey.com)
2) survey.com invokes javascript which calls the ColdFusion webservice (which includes survey id and person id in the request)
3) ColdFusion receives the request, and ‘wraps’ it with the needed authorization information to make a valid HTTP request.
4) Salesforce custom Apex REST class receives the request (with survey id and person id still included)
5) Salesforce sends request BACK to a different ColdFusion webservice, which requests the actual question/answer data.
6) ColdFusion receives request. Queries survey database and encodes database info as an array of JSON encoded objects.
7) JSON encoded data is returned to the calling Apex, where it is parsed into custom objects and committed to the database.

Seems kind of obtuse eh? Yet I can’t think of any way to make it leaner. I really would like to eliminate the 2nd and 3rd step and just have the survey software invoke the Apex REST directly somehow, or at least make the call totally cloud based. I suppose I could host a visualforce page that does the same thing, and have the JavaScript call that…

So anyway, here you can see an interesting case study of integrating a lot of different crap using some fairly new methods and techniques. I am totally open to suggestions on how to refine this. Right now there are just so many points of failure that it makes me nervous but again it seems to be about the best I can do. Thoughts and feedback welcome in the comments.